October 1, 2017 Sam Siewert
CS317 File and Database Systems
Lecture 6 – DBMS Development Lifecycle
http://dilbert.com/strips/comic/1998-03-23/
For Discussion: Software Engineering vs. DBMS Analysis and Design
1. Modern SWE Lifecycles Include Feedback – E.g. DevOps, Spiral, Extreme Programming in Agile, and even Waterfall with Feedback
2. Software Engineering Initiated by 1968 NATO Conference on the Software Crisis [Paper on Blackboard]
3. No Mention of Databases, Data Processing More So – Focus on Programming and Programming Languages [COBOL mentioned – CODASYL (Conference on Data Systems Languages, 1959) – Has DDL and DML]
4. SA/SD, Yourdon/DeMarco – Dataflow [Source, Sink, Store, Flow, Process]
5. ER Models (Chen) Useful for RDBMS - http://mysqlworkbench.org/, Relational part of UML OOD
Dataflow – Data Processing [SA/SD]
Simple Voice / IP – One End-Point Shown
Hardware Sources/Sinks, Stored Audio Buffers
Dataflow Between Source/Sink, Record, Playback, Streaming, Network Transport Interface
RTECS, Sam Siewert, 2006 - ISBN-13: 978-1584504689
Sakila ER Model - Logical Design
Tables, Views, SQL/PSM Routines, Triggers in ER Diagram
CASE Tools
Computer Aided Software Engineering - Schemas
– MySQL Workbench – DBMS Logical Design (Schema)
– Modelio UML & SysML - SE310
– Visio UML Templates - Design Edit Only!
Software Design Automation
– Requirements, Architecture, High-Level Design, Detailed-Design [Sometimes Executable Simulations with State Machines], Test Cases for Verification and Validation
– Rational Software Tools, Telelogic - Acquired by IBM
Discrete Event Simulation - MATLAB SimEvents, SimPy, SystemC [Hardware/System Oriented]
Embedded Databases
Mobile Devices that Synchronize with Cloud
Service Provisioning and Billing Systems – E.g. Telecommunications, Digital Cable, ISPs
E.g. Oracle Berkeley DB
E.g. http://sqlite.org/
Android SQLite - http://www.androidhive.info/2011/11/android-sqlite-database-tutorial/
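As a rough illustration of the embedded, in-process model, here is a minimal sketch using Python's standard sqlite3 module. The note table, its synced flag, and the idea of marking rows after an upload are hypothetical, chosen only to mirror the "mobile device that synchronizes with cloud" case; no real cloud call is made.

```python
import sqlite3

# An embedded database runs inside the application process: no server to install.
# ":memory:" keeps the sketch self-contained; a device app would use a file path.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE note ("
    " id INTEGER PRIMARY KEY,"
    " body TEXT NOT NULL,"
    " synced INTEGER NOT NULL DEFAULT 0)"  # hypothetical sync flag
)
conn.executemany("INSERT INTO note (body) VALUES (?)", [("buy milk",), ("call home",)])
conn.commit()

# Rows that have not yet been pushed to the (hypothetical) cloud service.
pending = conn.execute(
    "SELECT id, body FROM note WHERE synced = 0 ORDER BY id"
).fetchall()

# Mark the rows as synchronized, as if the upload had succeeded.
conn.executemany("UPDATE note SET synced = 1 WHERE id = ?", [(row[0],) for row in pending])
conn.commit()

remaining = conn.execute("SELECT COUNT(*) FROM note WHERE synced = 0").fetchone()[0]
```

The whole database lives in the application's address space, which is why embedded engines suit devices that are only intermittently connected.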
DBMS Design Lifecycle
Planning - Goals and Objectives
System Definition
– Customer interviews
– Requirements analysis and capture
– ER and Schema design prototyping with user feedback
Requirements review and refinement [SE300, SE310]
Logical DBE - Schema design, deployment, test data, normalization, referential integrity, client interfaces [views and connectors]
Physical DBE - Selection of DBMS, indexing, installation and scaling, performance
Data conversion and load
Testing - QA [SE420]
Maintain and improve
Strategy for DB Development
[Figure: lifecycle diagram. Stages: Planning; System Definition (Architecture, Views, IM); Requirements Refinement (SRS); Logical DBE (Schema - CASE); Physical DBE (DBMS physical storage, indexing, scaling); Load Data; Test; Maintain. Also shown: Client App Design; Select DBMS (RDBMS, OODBMS, NoSQL). Design levels: Conceptual, Logical, Physical.]
Planning
Outline Goals and Objectives
Basic Schedule
Detail Work Breakdown and Tasks
Cost - TCO = CAPEX + OPEX
Scaling, Disaster Recovery, Data center, Storage and Server Technologies
Security - Physical, Logical, and Best Practices
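The cost line above, TCO = CAPEX + OPEX, can be sketched as a small calculation. This assumes OPEX is a flat annual figure summed over the planning horizon, which is a simplification, and the dollar amounts are made up for illustration.

```python
def total_cost_of_ownership(capex: float, opex_per_year: float, years: int) -> float:
    """TCO = CAPEX + OPEX, with OPEX accumulated over the planning horizon.

    Assumes a constant yearly operating cost (no inflation or discounting).
    """
    if years < 0:
        raise ValueError("years must be non-negative")
    return capex + opex_per_year * years


# Hypothetical figures: servers and storage purchased up front,
# then power, hosting, licenses, and staff each year for five years.
tco = total_cost_of_ownership(capex=50_000, opex_per_year=12_000, years=5)
```

With these made-up numbers the operating cost over five years (60,000) exceeds the purchase cost, which is the kind of trade-off that informs a cloud versus private data center decision.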
System Definition
Hosting (Architecture)
– Public or Private Cloud
– Private SAN, NAS, or DAS - Scale-Up or Scale-Out
– Co-Location for Data center?
– Private Data center - 3-tier, 4-tier, N-tier
User Views
– Collect user views by function (engineering, accounting, management, manufacturing, sales, etc.)
– By job (sustaining engineer, data entry, point-of-sale, supervisor, etc.)
Views define Information Models
IM helps to define major requirements
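One way user views by function can later surface in an RDBMS is as SQL views over shared tables. The sketch below uses Python's sqlite3 module; the employee table, the two view names, and the sample rows are all hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE employee (
    emp_id     INTEGER PRIMARY KEY,
    name       TEXT NOT NULL,
    department TEXT NOT NULL,
    salary     REAL NOT NULL
);
INSERT INTO employee VALUES
    (1, 'Ada', 'engineering', 120000),
    (2, 'Bob', 'sales', 80000);

-- General staff view: names and departments only, salary hidden.
CREATE VIEW staff_directory AS
    SELECT emp_id, name, department FROM employee;

-- Accounting view: payroll totals per department.
CREATE VIEW payroll_by_department AS
    SELECT department, SUM(salary) AS total_salary
    FROM employee
    GROUP BY department;
""")

directory = conn.execute("SELECT * FROM staff_directory ORDER BY emp_id").fetchall()
payroll = conn.execute("SELECT * FROM payroll_by_department ORDER BY department").fetchall()
```

Each view exposes only the information one class of user needs, which is the same idea the information model captures at the conceptual level.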
Conceptual Design - Analysis
Stick to High Level Information Model
– ER (Traditional Relational approach), EER adds class hierarchy
– UML, SysML (OO approach) - Class diagram is EER+methods, but UML has many more diagrams and models
– Top down - define major entities and key relations or classes
Views (keyword in SQL), but also Use Cases
– Interview stakeholders
– Draft a User’s Guide
– Write a data dictionary (bottom up approach for data domain, attribute analysis for entities)
Conceptual IM should be easily discussed with all stakeholders
Requirements
Focus on Information Required by User
– Data descriptions (dictionaries)
– Data generation and ingest
– Data use (e.g. reports, documents, mobile query, analytics, decision support, compliance, meta-data for files, other?)
Client Application Requirements (Parallel Development)
File system requirements (Parallel Development)
Use Centralized (Single IM) or View Integration (IM per Use Case, to be integrated later)
Start Parallel Efforts - Select DBMS
Kick-off Traditional Software Analysis, Design, Development for Client Applications
Kick-off File systems to coordinate with RDBMS
Select DBMS by Type (Candidates)
– Relational (SQL)
– OODBMS (C++ or other OOPL + Data = Persistent Objects)
– NoSQL - Key / Value, Documents, No Schema (per se)
Logical Design
RDBMS - Relational Schema (ER/EER)
– Start work on Schema using High Level IM
– Domains, Attributes, Tables, Relations, Keys, etc.
OODBMS - Class Hierarchy and Object Interaction
– Use UML (SE310)
– Consider SysML (Extension to UML for Systems)
NoSQL - Key / Value Searches and Indexing
– Open research topic
– Columnar design
– E.g. Google BigQuery [https://cloud.google.com/bigquery]
– Web-based REST (Representational State Transfer)
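For the RDBMS branch, the step from domains, attributes, and keys to a relational schema can be sketched with Python's sqlite3 module. The customer and rental tables are a hypothetical, simplified pair loosely inspired by the Sakila model, not its actual schema; the 1 to 30 day range is an invented domain rule.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite does not enforce foreign keys by default
conn.executescript("""
CREATE TABLE customer (
    customer_id INTEGER PRIMARY KEY,          -- primary key
    email       TEXT NOT NULL UNIQUE          -- candidate key
);
CREATE TABLE rental (
    rental_id   INTEGER PRIMARY KEY,
    customer_id INTEGER NOT NULL REFERENCES customer(customer_id),  -- relation
    days        INTEGER NOT NULL CHECK (days BETWEEN 1 AND 30)      -- domain
);
""")


def rejected(sql: str) -> bool:
    """Return True if the DBMS refuses the statement on a constraint."""
    try:
        conn.execute(sql)
    except sqlite3.IntegrityError:
        return True
    return False


conn.execute("INSERT INTO customer (customer_id, email) VALUES (1, 'a@example.com')")
conn.execute("INSERT INTO rental (customer_id, days) VALUES (1, 3)")

bad_domain = rejected("INSERT INTO rental (customer_id, days) VALUES (1, 0)")
duplicate_key = rejected("INSERT INTO customer (customer_id, email) VALUES (2, 'a@example.com')")
```

Declaring domains and keys in the schema lets the DBMS reject bad rows itself instead of relying on every client application to check them.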
Physical Design [Part 2]
Choose Storage and Technology
– Battery Backed RAM
– Phase Change Memory
– Host-Bus Flash
– SSD
– HDD
Install on Block Storage Partitions
– SAN or DAS (No File system)
– Scale with SAN or DAS Host Channels (scale-up)
– Block RAID
Install on File system
– NAS, PNFS, GPFS, etc. - Scale out!
– Local (scale-up)
– File RAID
Indexing Method Selection [Part 2 of our course]
Selected DBMS Physical Features
DBMS Selection Criteria
Performance (Transactions/second, Latency, TPC)
Reliability, Availability, RPO/RTO – Recovery Point/Time Objective
Features (E.g. De-duplication, Logging, Import/Export, …)
Ease of Use (SQL Compliance, GUI, Installation)
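The performance criterion can be estimated on a candidate DBMS with a small custom micro-benchmark. The sketch below is not a TPC workload; it times single-row update transactions against an in-memory SQLite database, and the account table and the measure_tps function are hypothetical names. Absolute numbers depend entirely on the machine and storage.

```python
import sqlite3
import time


def measure_tps(n_transactions: int = 1000):
    """Run n single-row update transactions; return (tps, mean latency in s, final balance)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE account (id INTEGER PRIMARY KEY, balance INTEGER NOT NULL)")
    conn.execute("INSERT INTO account VALUES (1, 0)")
    conn.commit()

    latencies = []
    start = time.perf_counter()
    for _ in range(n_transactions):
        t0 = time.perf_counter()
        with conn:  # commits on success, rolls back on an exception
            conn.execute("UPDATE account SET balance = balance + 1 WHERE id = 1")
        latencies.append(time.perf_counter() - t0)
    elapsed = time.perf_counter() - start

    balance = conn.execute("SELECT balance FROM account WHERE id = 1").fetchone()[0]
    conn.close()
    return n_transactions / elapsed, sum(latencies) / len(latencies), balance


tps, mean_latency, final_balance = measure_tps(500)
```

Checking final_balance against the transaction count confirms that every timed transaction actually committed, so the throughput figure is not inflated by failed work.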
Load Data
Dump Existing Database to Files or SQL
For RDBMS to RDBMS, Migrate from Old Schema to New
For Files, NoSQL, or OODBMS to RDBMS
– Parse files (data processing), generate SQL DML inserts
– Data entry (customer driven, or enterprise driven)
Make before Break [Parallel, Shut-down, cut-over]
Big Bang [Re-build enterprise data]
Evolutionary [Migrate data to new from old over time]
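The "parse files, generate SQL DML inserts" path can be sketched as follows, assuming the legacy data arrives as CSV. The column names, the sample rows, and the load_csv helper are hypothetical; a real migration would also need validation and error reporting.

```python
import csv
import io
import sqlite3

# Stand-in for a file dumped from the old system (hypothetical contents).
LEGACY_CSV = """id,name,city
1,Ada,Boulder
2,Bob,Prescott
"""


def load_csv(conn: sqlite3.Connection, text: str) -> int:
    """Parse CSV text and insert every row in one transaction; return the row count."""
    reader = csv.DictReader(io.StringIO(text))
    rows = [(int(r["id"]), r["name"], r["city"]) for r in reader]
    with conn:  # all rows commit together, or none do
        conn.executemany(
            "INSERT INTO customer (id, name, city) VALUES (?, ?, ?)", rows
        )
    return len(rows)


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT NOT NULL, city TEXT NOT NULL)"
)
loaded = load_csv(conn, LEGACY_CSV)
count = conn.execute("SELECT COUNT(*) FROM customer").fetchone()[0]
```

Parameterized inserts are used instead of building SQL strings by hand, so quoting problems in the source data do not corrupt the generated DML.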
Testing
Data Integrity (Bad data entry prevention)
Referential Integrity (Update, Delete, Insert focus)
Performance (Transactions per second)
– Custom benchmarks
– SPC (RDBMS or NoSQL) or TPC (RDBMS, Big Data)
Resilience (Data loss protection)
– Single, Double, or Triple Fault
– Availability During Recovery (Hot, Warm, Cold Spares)
Disaster Recovery (loss of client connectivity or data center)
RPO, RTO - Recovery Point and Recovery Time Objectives
– Transactions lost that need restart (replay)
– Time until transactions can be serviced or re-run
Upgrades and scaling while in service
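A referential integrity test with the insert, update, and delete focus listed above might look like the sketch below. It uses SQLite with foreign key enforcement switched on; the department and employee tables and the violates helper are hypothetical.

```python
import sqlite3


def make_db() -> sqlite3.Connection:
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves enforcement off by default
    conn.executescript("""
    CREATE TABLE department (dept_id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE employee (
        emp_id  INTEGER PRIMARY KEY,
        dept_id INTEGER NOT NULL REFERENCES department(dept_id)
    );
    INSERT INTO department VALUES (10, 'engineering');
    INSERT INTO employee VALUES (1, 10);
    """)
    return conn


def violates(conn: sqlite3.Connection, sql: str) -> bool:
    """Return True if the statement is rejected on an integrity constraint."""
    try:
        with conn:  # rolls back the failed statement
            conn.execute(sql)
    except sqlite3.IntegrityError:
        return True
    return False


conn = make_db()
results = {
    "insert_orphan": violates(conn, "INSERT INTO employee VALUES (2, 99)"),
    "update_to_missing_parent": violates(conn, "UPDATE employee SET dept_id = 99 WHERE emp_id = 1"),
    "delete_referenced_parent": violates(conn, "DELETE FROM department WHERE dept_id = 10"),
    "valid_insert": violates(conn, "INSERT INTO employee VALUES (3, 10)"),
}
```

The first three statements should be refused and the last accepted; a test suite would assert exactly that for every foreign key in the schema.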