The bulk of NoSQL technologies achieve scale-out by building their architecture around a simple, distributed-hash key-value store. This works well for partitioning simple data, but in reality your information models are not simple. As a result, you may have to build enormous layers of code to manage an explicit structure baked into the persistence tier. In this session, take a look at a NoSQL solution that lets you store naturally clustered, richly linked object networks beneath your key-partitioned roots. The result is that you do not have to write extensive code to deal with the physical structure in the persistence tier, even when dealing with complex information models such as predictive models, time series, recursive relations and compositions. We will explore how such an implementation works in practice through a case study of an advanced model analytics and visualization solution built on the clustered NoSQL database Versant Database Engine.
NoSQL: Beyond the Key:Value Store
By Robert Greene
Versant Corporation, U.S. Headquarters
255 Shoreline Dr., Suite 450, Redwood City, CA 94065
www.versant.com | 650-232-2400
#NoSQLVersant
Overview
► The Genesis of NoSQL
  The Sky is Falling
  NoSQL at its Core
► Shift in Architecture
  Shift Innovation
  Domain Models, Distribution, SOA
► Enterprise Needs and NoSQL
  Application Development with NoSQL
► NoSQL 2.0 - Leveraging the Knowledge Base
Genesis of NoSQL
► The Sky is Falling
  Early Web 2.0 social computing drives innovation
► End of the Hammer Era
  One relational tool for every data problem fails.
  Agility and cost usher in reason and innovation
NoSQL at its Core
An Increasingly Crowded Space
To “shift” is to be NoSQL
No “shift” Inside
Traditional DBMS Scale Architecture
INEFFICIENT: CPU-destroying mapping
EXPENSIVE: Repetitive data movement and JOIN calculation
NoSQL at its Core
A Shift In Application Architecture
UNIFIED: Application-driven schema
COMMODITY HW: Horizontal scale-out, distribution and partitioning
• Google – Soft-Schema
• IBM – Schema-Less
A Shift is Needed
► How Often do Relations Change?
  Blog : BlogEntry, Order : OrderItem, You : Friend
► Relations Rarely Change, Stop Recalculating Them
► Do you need ALL of your data in one place?
► You don’t. You can distribute it.
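One way to read "stop recalculating them" is sketched below in Python. The `Order` and `OrderItem` classes and both helper functions are hypothetical illustrations, not taken from the slides: the first helper recomputes the relation by key matching on every read, as a JOIN would, while the second follows references that were stored once at write time.

```python
from dataclasses import dataclass, field


@dataclass
class OrderItem:
    order_id: int   # foreign key, as a relational row would carry it
    sku: str


@dataclass
class Order:
    order_id: int
    items: list = field(default_factory=list)  # stored relation: direct references


def items_by_join(order_id, all_items):
    """Relational style: recompute the Order : OrderItem relation on every read."""
    return [item for item in all_items if item.order_id == order_id]


def items_by_reference(order):
    """Reference style: the relation was fixed at write time, so just follow it."""
    return order.items


all_items = [OrderItem(1, "book"), OrderItem(2, "pen"), OrderItem(1, "lamp")]
order = Order(1, [item for item in all_items if item.order_id == 1])
```

Both helpers return the same items; the difference is that the join scans every candidate row on each call, whereas the reference lookup does no matching at all.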
NoSQL
Innovation and the Shift
Domain Model Thinking
► Business Model is Schema
  Not Data Model under Entities
► Movement of Responsibility
  Soft-Schema (vs) Schema-less
► Enables changing Nature of Analytics
  SQL/MapReduce – “give me top 20 performers”
  NoSQL – “find 3 dimensional protein pattern match”
Distributed Thinking
► Scale-out, with fall out
► Partition Impact – Implementation, Algorithms
  Different design considerations
► Key Driven access impacts
► Embedded Models
► Enterprise Reference Data
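A minimal Python sketch of key-driven partitioning follows, to make the design impact concrete. The node count, the `node_for` and `place` helpers, and the choice of SHA-256 as a stable hash are all assumptions made for illustration; the slides do not describe a specific placement scheme.

```python
import hashlib

NODE_COUNT = 4  # hypothetical cluster size


def node_for(root_key: str, node_count: int = NODE_COUNT) -> int:
    """Map a root key to a node using a stable hash (not Python's salted hash())."""
    digest = hashlib.sha256(root_key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % node_count


def place(blog_key: str, article_ids: list) -> dict:
    """Place a Blog root and co-locate its Articles on the same node."""
    node = node_for(blog_key)
    return {"node": node, "root": blog_key, "members": list(article_ids)}
```

Because placement depends only on the root key, anything reachable from the root is found with a single node lookup, while a query that does not start from a key has to consult every node.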
SOA Thinking
► Business Processes and Service Orchestration
  The Drivers of Business Agility
► NoSQL enables increased speed of agility
► Faster Time to Market, Competitive Edge
► Raw Data Manipulation and Mining
  Typically done outside of day-to-day business
  ETL strategy essential
► Feedback loop for BPM/O layers
NoSQL and the Enterprise
Responsibly, taking advantage of the “Shift”
Embedded Models
NoSQL 1.0
► Document Store Characteristics
  Blogs have Articles
► Patterns of Access
  Only access sub-elements from root
  Good candidate for simple web system
► Query on Articles content to get similar Blogs
► Display Blogs and their Articles
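The embedded pattern can be sketched in Python with plain dictionaries standing in for a document store. The sample data and the `articles_of` and `blogs_mentioning` helpers are hypothetical; the sketch only shows why access from the root is cheap and why a content query falls back to a scan when nothing else is indexed.

```python
blogs = {
    "blog:1": {"title": "Scaling Out", "articles": [
        {"title": "Sharding basics", "body": "hash partitioning by key"},
        {"title": "Replication", "body": "copies across nodes"},
    ]},
    "blog:2": {"title": "Cooking", "articles": [
        {"title": "Hash browns", "body": "potatoes and a hot pan"},
    ]},
}


def articles_of(blog_key):
    """Key-driven access: reach Articles only through their Blog root."""
    return blogs[blog_key]["articles"]


def blogs_mentioning(term):
    """Content query: without an index, every document must be scanned."""
    term = term.lower()
    return sorted(
        key for key, blog in blogs.items()
        if any(term in a["body"].lower() for a in blog["articles"])
    )
```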
Enterprise Models
NoSQL 2.0
► Many to Many
  Blogs get Tags - search based on tag
  Tags weighted, Similarity Meta Data
► Faster algorithmic searching
  Narrow Blogs via back reference
► Sub queries on collection contents
  Can leverage Articles in addition to Blogs
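As an illustration of narrowing via back references, here is a small Python sketch of a many-to-many Blog/Tag link kept in both directions. The `TagIndex` class, its method names, and the similarity score (a sum of weight products over shared tags) are assumptions made for this example, not the scheme used in the talk.

```python
from collections import defaultdict


class TagIndex:
    """Many-to-many Blog <-> Tag links kept in both directions."""

    def __init__(self):
        self.tags_of_blog = defaultdict(dict)   # blog -> {tag: weight}
        self.blogs_of_tag = defaultdict(set)    # tag -> {blog}  (back reference)

    def tag(self, blog, tag, weight=1.0):
        self.tags_of_blog[blog][tag] = weight
        self.blogs_of_tag[tag].add(blog)

    def blogs_tagged(self, tag):
        """Narrow the candidate Blogs through the back reference, no full scan."""
        return set(self.blogs_of_tag.get(tag, set()))

    def similar_to(self, blog):
        """Rank other Blogs by summed weight of the tags they share with `blog`."""
        scores = defaultdict(float)
        for tag, weight in self.tags_of_blog.get(blog, {}).items():
            for other in self.blogs_of_tag[tag]:
                if other != blog:
                    scores[other] += weight * self.tags_of_blog[other][tag]
        return sorted(scores, key=lambda b: (-scores[b], b))
```

The similarity search only ever touches Blogs that share at least one Tag with the starting Blog, which is the narrowing effect the slide refers to.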
Operational Features
NoSQL 1.0
► Transactions – The 20:80 Rule (ACID:CAP)
  Most prevalent NoSQL 1.0 approach
► Give up transactions for better scalability
► Compensating application code needed
  Code Complexity, Manual Processes
  High Operational Cost
► Weak Transactions
  It’s a start, gets us to 20%, demonstrates the need
  From Key to Criteria Based Query
Enterprise Operational Features
NoSQL 2.0
► Transactions – The 80:20 Rule (ACID:CAP)
  Algorithm, Tagged Blogs via Tag
► No Transactions = lost Blog, no results from Algorithm
► Cascading Operations
  Network essential
► External Access
  JDBC/ODBC tooling support
Operating NoSQL 1.0
► DevOps – Dev builds it, Dev owns it.
  Schema-less implementation
► Evolution directly impacts application space (Development)
► Data Backup
  Largely file dumps, mostly systems off-line
► Custom tooling for out-of-band needs
  Operational need, write a custom access
  Non-centralized, scripted monitoring
Enterprise Operations
NoSQL 2.0
► DevOps – Dev builds it, IT owns it eventually.
  IT System Management
► Centralized monitoring
► Integrated with SNMP / system management
► Availability, Governance, Data Backup
  Enterprise point-in-time recovery, SOX, HIPAA, etc.
  Fault tolerant, globally replicated
  Online and distributed backup
► Cloud Enabled - utility efficiency
  Automated SLA-based Provisioning
  Mobility of Processes
Web Development
NoSQL 1.0
► Requires completely new skill set
► Lack of ecosystem integration
  IDE tooling
  Immature integration
  Non-standard connectivity
► Custom, custom and more custom
  Each 1st generation product unique / proprietary
Enterprise Development
NoSQL 2.0
► Leverages existing enterprise skill set
► Mature development platforms
  Tomcat, Spring, Hudson, Eclipse enabled
► Industry standard APIs
  Java – JPA (10 years of ORM experts)
  Ruby – OnRails, it’s the shift that matters
Application Development
The Things You Will Build
NoSQL 1.0
NoSQL 2.0
Need Proxy Pattern
NoSQL 1.0
► Avoid overhead of extraneous loading
  You want all Blog Articles to get 1 Article?
► Model must change to use References
  Blog:owner(User) becomes Blog:owner_id(long)
► Proxy pattern for long to User swizzle
  Object to Value, Value to Object
► Maybe Document store BasicDBObject
► Maybe Key:Value store BSON
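The swizzle from a stored id to a live object could look roughly like the Python sketch below. `UserProxy`, `User`, `Blog`, the in-memory `USERS` table and the `load_user` function are all hypothetical stand-ins for application classes and a data store; the point is only that the stored form is a plain id and the object is fetched on first use.

```python
class UserProxy:
    """Stands in for a User; holds only the id until an attribute is needed."""

    def __init__(self, user_id, loader):
        self._user_id = user_id
        self._loader = loader      # callable: id -> user object (value to object)
        self._target = None

    def __getattr__(self, name):
        # Called only for attributes not found on the proxy itself.
        if self._target is None:
            self._target = self._loader(self._user_id)
        return getattr(self._target, name)


class User:
    def __init__(self, user_id, name):
        self.user_id = user_id
        self.name = name


class Blog:
    def __init__(self, title, owner_id, loader):
        self.title = title
        self.owner_id = owner_id                   # stored form: a plain long
        self.owner = UserProxy(owner_id, loader)   # in-memory form: lazy User


USERS = {7: User(7, "robert")}   # hypothetical stand-in for the data store
LOADS = []                       # records each real fetch, for demonstration


def load_user(user_id):
    LOADS.append(user_id)
    return USERS[user_id]
```

Constructing a `Blog` does not touch the store; the first read of `blog.owner.name` triggers one load, and later reads reuse the cached target. This is code the application team has to write and maintain, which is the cost the slide is pointing at.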
Serializable
NoSQL 1.0
► You don’t write code in JSON or XML
  Programming models need transformation
► Non-Vendor transformation limits
  Create binary format value, cannot query it
► Not all programming structures are supported
  Map -- Need to break down programming model
  Lists -- Array needs Serializable
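The "binary value, cannot query it" limitation can be illustrated with a short Python sketch. Here `pickle` stands in for any application-side binary serializer and `store_can_match` is a hypothetical imitation of a store-side filter; neither is meant to represent a particular product.

```python
import json
import pickle

article = {"title": "Sharding basics", "tags": ["nosql", "scale"]}

opaque_value = pickle.dumps(article)   # binary value: the store sees only bytes
document_value = json.dumps(article)   # text document: structure stays visible


def store_can_match(stored, field, expected):
    """Mimic a store-side filter: it can only inspect formats it understands."""
    if isinstance(stored, bytes):
        return None                      # opaque blob, nothing to query
    return json.loads(stored).get(field) == expected
```

The application can still round-trip the binary value with `pickle.loads`, but only the application can; the store has no way to evaluate a criteria-based query against it.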
Reference System
NoSQL 1.0
► Avoid object duplicates
  Load a User’s Personal Blog, Search Tagged Blog
► Inconsistencies during runtime
► Materialization of bi-directional relations
  Need to avoid circular references
► Load Blog*, blog has an Owner:User
► Load User, user has a Personal Blog*
► …..repeat
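A reference system of this kind is often built as an identity map. The Python sketch below is one possible shape, assuming a hypothetical raw store in which a `(kind, id)` tuple marks a reference; the `Session` class and its field names are illustrative only. Registering each object before following its references is what stops the Blog, User, Blog loop from repeating.

```python
class Session:
    """Identity map: one in-memory object per (kind, id), even across cycles."""

    def __init__(self, rows):
        self.rows = rows          # hypothetical raw store: {(kind, id): dict}
        self.loaded = {}          # (kind, id) -> materialized object

    def load(self, kind, obj_id):
        key = (kind, obj_id)
        if key in self.loaded:                 # already materialized: reuse it
            return self.loaded[key]
        row = self.rows[key]
        obj = {"kind": kind, "id": obj_id}
        self.loaded[key] = obj                 # register BEFORE following refs
        for field, value in row.items():
            if isinstance(value, tuple):       # (kind, id) marks a reference
                obj[field] = self.load(*value)
            else:
                obj[field] = value
        return obj


rows = {
    ("Blog", 1): {"title": "Personal", "owner": ("User", 7)},
    ("User", 7): {"name": "robert", "personal_blog": ("Blog", 1)},
}
```

Loading the Blog and then the User yields exactly two objects that point at each other, so a change made through one path is visible through the other.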
Need Lifecycle Tracking
NoSQL 1.0
► New, Changed, Deleted
  On store, update: Slow overhead to replace all objects
► If not dirty, do not traverse and update
► If new, add to the reference system
► If null, delete underlying element
► Need to manage the reference system
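The three rules above can be sketched in Python as a small dirty-tracking layer. The `Tracked` base class and the `flush` function are hypothetical, and a dictionary stands in for the store; the sketch writes only new or changed objects and treats a null entry as a delete.

```python
class Tracked:
    """Base class that records which attributes changed since the last flush."""

    def __init__(self, **fields):
        object.__setattr__(self, "_dirty", set())
        object.__setattr__(self, "_is_new", True)
        for name, value in fields.items():
            object.__setattr__(self, name, value)

    def __setattr__(self, name, value):
        object.__setattr__(self, name, value)
        self._dirty.add(name)


def flush(objects, store):
    """Write only new or dirty objects; return how many writes were needed."""
    writes = 0
    for key, obj in objects.items():
        if obj is None:                       # null reference: delete element
            if store.pop(key, None) is not None:
                writes += 1
            continue
        if obj._is_new or obj._dirty:
            store[key] = {
                k: v for k, v in vars(obj).items() if not k.startswith("_")
            }
            object.__setattr__(obj, "_is_new", False)
            obj._dirty.clear()
            writes += 1
    return writes
```

A second `flush` with no intervening changes performs no writes, which is the saving over replacing every object on each store.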
NoSQL 1.0 (observations)
► Mapping layer is forming
  Why re-invent the wheel
► ‘O’RM – Object Relational Mapping
► ‘O’DM – Object Document Mapping
► ‘O’CM – Object Column Mapping
Software Industry knows where this leads
► Mapping Complexity, brittle code base, non-agility
► The ‘O’ is what matters, ‘O’bject Lifecycle Management
NoSQL 2.0
► Leverage NoSQL 1.0 architectural shift
  Scale out with performance
► Key partitioned data distribution
► The good stuff from NoSQL 1.0
► Eliminate mapping complexity
  Handle modern information models
► Eliminate domain model mapping
► Enable development agility
► Leverage existing enterprise skills
‘O’ in a standard (e.g. JPA), without RM, DM, CM
Verite Group Case Study
Verite Group
► Value Proposition
  Line Level I.P. Analytics
► Answers the question: What is happening?
  Not: What has happened?
Activity Correlation
► Capturing time related sequences of activity
  Not capturing discrete “product” on the wire
Verite Group
► Core netScope Use Case
Pipeline Monitor and capture
► In-flight I.P. traffic content
Apply target rules and populate meta models
► High network traffic, content, equipment variation
Present analyst visualization and alerts
► Customize new target rules
Insert into Pipeline and iterate
Verite Group
► Technology Adoption Process
IBM DB2 – Pure XML store
► Driver: fast ingestion, excellent reg_exp query support
► Failure: huge CPU issues pulling query results
  Analytic model too complex, need objects from results
Hibernate – Postgres, MySQL
► Driver: binary protocol to analytic model up front
  Soft-Schema driven, still supports reg_exp query
► Failure: data ingestion too slow, CPU max, high disk spin
Versant – NoSQL 2.0
► Driver: speed data ingestion
► Success: high speed data ingestion, low CPU, low disk spin
  Direct soft-schema storage, still supports reg_exp query
  Scale-out capability for large data analytics
Verite Group
► Discovered Value, Lessons Learned
Changing nature of analytics
► Model driven algorithmic, not iterative query
  E.g. eliminated many reg_exp queries and moved to model
► Significant increase in performance of analytic
Operational efficiencies
► Soft-Schema is database schema
  Faster analytic model evolution (less DBA)
  Lower CPU cost to marshal type systems (mapping)
  Less disk space and fast I/O (less duplication, disk seeking)
Q&A
Contact
Robert Greene
Vice President, Technology
NoSQL Now! – Booth #14