Upload
scott-w-ambler
View
2.988
Download
6
Embed Size (px)
Citation preview
Agile Data Warehousing and Business Intelligence:A Disciplined Approach
The Story I’m About to Tell
© Disciplined Agile Consortium
We have many challenges to overcome together
© Disciplined Agile Consortium
© Disciplined Agile Consortium
© Disciplined Agile Consortium
Modern development is evolutionary in nature
© Disciplined Agile Consortium
You need to be flexible
© Disciplined Agile Consortium
Collaboration is Crucial to Your Success
© Disciplined Agile Consortium
© Disciplined Agile Consortium
We face several challenges
© Disciplined Agile Consortium
Data Quality Capability
© Disciplined Agile Consortium
16%
25%
15%
32%
10%
16%
29%
25%
24%
3%
Very challenged
Challenged
Neutral
Capable
Very Capable
Prevent New Problems Fix Existing Problems
Source: 2015 Data Quality Survey
Yikes!
Cultural Impedance Mismatch
© Disciplined Agile Consortium
• Developers
• Have embraced evolutionary, and now agile and lean, strategies for over two decades
• Are generally weak on data skills
• Data professionals
• Just now starting to embrace agile strategies for DW/BI
• Are generally weak on development skills
Source: Agiledata.org/essays/culturalImpedanceMismatch.html
Traditional DW/BI
© Disciplined Agile Consortium
• Heavy modeling performed early in the lifecycle
• Heavy specification throughout lifecycle
• Requirements changes are avoided
• Solution released as a “big bang”
Disciplined Agile DW/BI
© Disciplined Agile Consortium
• High-level modeling performed early in the lifecycle
• Light-weight modeling throughout the lifecycle
• Comprehensive regression test suite(s) developed
• Solution is developed in thin, vertical slices
• Requirements changes are welcomed
© Disciplined Agile Consortium
© Disciplined Agile Consortium
The Disciplined Agile Delivery (DAD) Framework
Disciplined Agile Delivery (DAD) is a process decision framework
The key characteristics of DAD:– People-first– Goal-driven– Hybrid agile– Learning-oriented– Full delivery lifecycle– Solution focused– Risk-value lifecycle– Enterprise aware
© Disciplined Agile Consortium
© Disciplined Agile Consortium
Scrum
Extreme Programming
LeanKanban
DAD is a Hybrid Framework
© Disciplined Agile Consortium
Unified Process Agile Modeling
Agile Data“Traditional”Outside In Dev.
DevOps …and more
DAD leverages proven strategies from several sources,providing a decision framework to guide your adoption and
tailoring of them in a context-driven manner.
SAFe
© Disciplined Agile Consortium
Which Lifecycle for DW/BI?
• When the concept of DW/BI is completely new to your organization– The Exploratory lifecycle should be considered so that
you can identify a potential strategy for the DW/BI solution
• Initial release of a DW/BI solution– The Agile/Basic lifecycle is the typical choice– Straightforward approach to agile– Iterations force discipline onto a team that is likely new
to agile
• Future releases of a DW/BI solution– The Lean or Continuous Delivery lifecycles are the
typical choices– New requirements are often small and independent of
one another
© Disciplined Agile Consortium
Basic/Agile Lifecycle
A full Scrum-based agile delivery lifecycle.
© Disciplined Agile Consortium
Lean Lifecycle
A full lean delivery lifecycle
© Disciplined Agile Consortium
Lean Continuous Delivery Lifecycle
Your evolutionaryend goal?
© Disciplined Agile Consortium
Agile DW/BI Principles
© Disciplined Agile Consortium
Agile DW/BI is Collaborative
© Disciplined Agile Consortium
NOT
Agile DW/BI is Usage Driven
• Data-driven approaches are too narrowly focused • How people use, or work with, the data to fulfill business goals is the
true issue � focus on usage, not the underlying data structures• Work closely with your stakeholders to explore their intent
© Disciplined Agile Consortium
User stories drive developmentNOT
data models
Agile DW/BI is Evolutionary
© Disciplined Agile Consortium
NOT
Agile Teams Work in Short Iterations
© Disciplined Agile Consortium
Source: SA+A 2015 Agile State of the Art Survey
Disciplined agile DW/BI teamsdeliver new data or reports each iteration
1%
7%
15%
65%
13%
> 4 weeks
4 weeks
3 weeks
2 weeks
1 week or less
Agile DW/BI Teams Strive to Fix Data at the Source
• Extract Transform Load (ETL) is often a quality band-aid• Transformation is time consuming and risks defect injection
– Often needed due to semantics differences• Disciplined agilists strive to fix data problems at the source via
database refactoring– This is one aspect of paying down technical debt– Transformation is often dramatically reduced
© Disciplined Agile Consortium
Agile DW/BI is Pragmatic
• Sometimes:– You need to accept data inconsistencies– There is no “one truth”– You can’t get all of the data (at first) that you need for a report– Stakeholders can’t agree
• Perfection is the enemy of good enough
© Disciplined Agile Consortium
RobertSmith
Robert W. Smith
Robert William Smith
BobSmith
Modern development is evolutionary in nature
© Disciplined Agile Consortium
The Agile Database Techniques Stack
© Disciplined Agile Consortium
Vertical Slicing
Clean Architecture and Design
Agile Data Modeling
Database Refactoring
Database Regression Testing
Continuous Database Integration
Configuration Management
Vertical Slices of a DW/BI Solution
• Every iteration the disciplined agile DW/BI team produces a working solution
• Functionality is added in “vertical slices”
• Slicing strategies:– One new data element from a single data source– One new data element from several sources– A change to an existing report– A new report– A new reporting view– A new data mart table
© Disciplined Agile Consortium
High-Level Logical DW/BI Architecture
© Disciplined Agile Consortium
Clean Database Design
© Disciplined Agile Consortium
• The cleaner the design of your database, the easier it is to evolve
• Data Vault 2.0 is the best source of strategies for clean DW architecture and design
© Disciplined Agile Consortium
Agile Data Modeling
• Data modeling is the act of exploring data-oriented structures • Evolutionary data modeling is data modeling performed in an iterative and
incremental manner• Agile data modeling is evolutionary data modeling done in a collaborative
manner
© Disciplined Agile Consortium
© Disciplined Agile Consortium
Why Evolving Databases is Thought to be Hard
Production databases are often highly coupled to other systems, services, data sources, …
© Disciplined Agile Consortium
Database Refactoring
• A database refactoring is a simple change to a database schema that improves its design while retaining both its behavioral and informational semantics
• A database schema includes structural aspects such as table and view definitions; functional aspects such as stored procedures and triggers; and informational aspects such as the data itself
© Disciplined Agile Consortium
The Process of Database Refactoring
© Disciplined Agile Consortium
© Disciplined Agile Consortium
Database Regression Testing
• Agilists develop regression unit and acceptance test suites for their applications, why not for databases too?
• If database testing is done at all these days, it is at the black-box level. This isn’t enough
• You must be able to put the database into a known state, therefore you need test data (generation)
© Disciplined Agile Consortium
Test/Behavior Driven Database Development (TDDD/BDDD)
• An evolutionary approach to database design
• Driven by tests, not models• A practical approach to ensuring
database quality• Completely different than
traditional approaches to database design
• Really just one part of BDD, not a stand-alone activity
• May/June 2007 issue of IEEE Software
© Disciplined Agile Consortium
© Disciplined Agile Consortium
Continuous Integration and Databases
• Part of building the system is building the database (if it changed)
• Challenge: Tests SHOULD put the database back into a known state, but sometimes don’t• You will want to rebuild the (non-
production) database from scratch every so often
• Challenge: Database accesses take time• Some test suites will test against DB
mocks• You still need to test the actual database
occasionally
© Disciplined Agile Consortium
Configuration Management
• All work products should be stored in a versioned repository
• Maintains the integrity of the system and all supporting work products as it evolves
• Facilitates change in a controlled fashion
• All revisions kept and who made what changes to all work products
• Potential benefits include the ability to:– Manage versions of systems across environments– Rollback to previous versions– Modify work products in parallel and then merge– Find source of defects injected
© Disciplined Agile Consortium
The Story I Told
© Disciplined Agile Consortium
We have many challenges to overcome together
© Disciplined Agile Consortium
© Disciplined Agile Consortium
© Disciplined Agile Consortium
Modern development is evolutionary in nature
© Disciplined Agile Consortium
You need to be flexible
© Disciplined Agile Consortium
Collaboration is Crucial to Your Success
© Disciplined Agile Consortium
© Disciplined Agile Consortium
Interested in Learning More?
• We offer a hands-on workshop entitled Disciplined Agile Data Warehousing/Business Intelligence. Visit DisciplinedAgileConsortium.org/da210 for details.
• At DisciplinedAgileConsortium.org/posters there is a downloadable poster overviewing the Disciplined Agile DW/BI approach
• At DisciplinedAgileConsortium.org/webinars you can download a recorded presentation of this deck.
© Disciplined Agile Consortium
Thank You!scott [at] scottambler.com
@scottwambler
AgileModeling.comAgileData.orgAmbysoft.com
DisciplinedAgileConsortium.orgDisciplinedAgileDelivery.com
ScottAmbler.com
Disciplined Agile DeliveryDisciplined Agile Delivery
© Disciplined Agile Consortium