Connected Development Data
Self-aware Data Objects
Vision
Planning and performance data from development activities is connected
Vision: planning
• Who is planning to work in district X next year?
• Which communities, facilities or partners are others planning to work with?
• How can we identify and avoid potential duplicate activities?
• How can we identify opportunities for collaboration?
Vision: reporting
• Define what data you want to share and when
• Select who you want to share it with
• Creates a feed with stream of relevant data
• No more reports…
Vision: evaluation
• Joint evaluations focused on specific sectors or approaches
• Draw on data from multiple implementers
• Drill down to examine source data and evidence
• Identify implementers for interviews
Challenges
• These are not new ideas
• Many previous attempts highlight significant challenges:
– Developing data standards
– Creating mechanisms to link systems
– Data quality problems
– Complex data governance issues
Data standards
• Data standards ensure that data from different sources is based on the same definition
• Necessary for data to be comparable, but can be extremely time-consuming to develop
• Some success stories:
– International Aid Transparency Initiative
– HIV and AIDS indicator registry
– Humanitarian response indicator registry
Data standards
• Focus is typically on indicators and higher level data
• Less effort to create standards for activity level data
Mechanisms to link systems
• Migrating data from one system to another is complex, time-consuming and expensive
• If one system changes then the link often breaks
• Many different ways of linking systems means work is often duplicated
• Only worthwhile if working with large datasets
Data Quality
Can’t see the trees for the woods
• Focus on defining indicator level standards
• Therefore data often shared at this level too
• Connections and definitions that help understand and audit the data quality often missing
– How was data collected?
– What are the definitions inherent in the data?
Data Governance
• Connecting data makes it more useful but also increases the risk of malicious attacks
• Data protection issues
• Cross-border issues (health data?)
• Security risks with vulnerable populations
Time for a fresh approach?
• Seems like these problems are unsolvable
• Final slides showcase work that we have been doing over the last five years
• Shows a promising new approach to tackle these challenges
Principles for a new approach
Emergent (bottom-up) standards development
• Support the development of standards where there is interest and value to gain
• Ensure that each standard follows the same ‘design rules’
• Ensure that standards can be curated, shared and – where possible – merged over time
Principles for a new approach
De-couple data from applications
• Context of the data is tightly linked to the application in which it is created
• Ability to view and edit the data is also tightly linked to the application
• Data must be able to exist as a micro-application, aware of its context and able to function independently
Principles for a new approach
Focus on operational data
• Current standards tend to focus on indicators, but don’t include linkages to how the data was collected
• If a standard can include the full context, it is better to start with operational data and aggregate up
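As a rough illustration of aggregating up from operational data, the sketch below sums activity-level records into an indicator value while keeping the IDs of the source records and how they were collected. The record fields (`id`, `district`, `people_trained`, `collection_method`) are assumptions made for this example, not part of any published standard.

```python
from collections import defaultdict

# Hypothetical activity-level (operational) records; field names are illustrative only.
activities = [
    {"id": "a1", "district": "X", "people_trained": 12, "collection_method": "attendance register"},
    {"id": "a2", "district": "X", "people_trained": 8, "collection_method": "attendance register"},
    {"id": "a3", "district": "Y", "people_trained": 5, "collection_method": "phone survey"},
]


def aggregate_by_district(records):
    """Sum people trained per district, keeping links back to the source records."""
    totals = defaultdict(lambda: {"value": 0, "source_ids": [], "methods": set()})
    for rec in records:
        entry = totals[rec["district"]]
        entry["value"] += rec["people_trained"]
        entry["source_ids"].append(rec["id"])
        entry["methods"].add(rec["collection_method"])
    return dict(totals)


indicator = aggregate_by_district(activities)
```

Because each indicator value carries its `source_ids`, a reviewer can drill down from the aggregate to the records behind it.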
Principles for a new approach
Strong data governance
• Need strong mechanisms to manage privacy and security
• Share data only as required for a specific purpose
What are we trying?
• Kwantu has been working in this area for many years
• Some promising approaches to help tackle these problems
(1) Domain Specific Language
• A Domain Specific Language (DSL) is a computer language designed to be used by technical experts, not programmers
• Using a DSL provides a standard and comparable way of creating data standards
• Kwantu has developed and tested an open source DSL in many contexts
(1) Domain Specific Language
• DSL used to create ‘Self-aware Data Objects’ (SDO) that define the standard for any development data
• Doesn’t matter who creates each SDO definition. They can be linked and queried jointly
(2) Data context
• SDOs can define:
– Field names in any language
– Validations
– Help text
– Calculations
– Evidence
– Data taxonomies
– Hierarchies in and linkages to other data
• Enables us to embed the full context in the data definition
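To make the idea of embedding context in the definition more concrete, here is a minimal sketch in plain Python. It is not the Kwantu DSL, whose syntax is not shown in these slides. The definition `training_event`, its field names and the helper functions are all hypothetical.

```python
# Hypothetical definition expressed as a Python dict. This is NOT the Kwantu DSL,
# only an illustration of labels, help text, validations and calculations
# travelling together with the data definition.
def non_negative_int(value):
    return isinstance(value, int) and value >= 0


training_event = {
    "fields": {
        "women": {
            "labels": {"en": "Women attending", "fr": "Femmes présentes"},
            "help": "Count from the signed attendance register",
            "validate": non_negative_int,
        },
        "men": {
            "labels": {"en": "Men attending", "fr": "Hommes présents"},
            "help": "Count from the signed attendance register",
            "validate": non_negative_int,
        },
    },
    "calculations": {"total": lambda data: data["women"] + data["men"]},
}


def failed_fields(definition, data):
    """Return the names of fields whose value fails its validation."""
    return [
        name
        for name, spec in definition["fields"].items()
        if not spec["validate"](data.get(name))
    ]


def label(definition, field_name, lang):
    """Look up the label of a field in the requested language."""
    return definition["fields"][field_name]["labels"][lang]


record = {"women": 7, "men": 5}
errors = failed_fields(training_event, record)
total = training_event["calculations"]["total"](record)
```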
(3) Application independent
• SDOs offer a more efficient and decentralised application architecture
• SDO data includes its own view and edit model
• Means you can interact with it in a browser or other standard application
• SDO data is effectively a micro-application
(3) Application independent
• Legacy apps can transfer their data to and from the relevant SDO definition
• New apps (including BetterData) can use the view and edit models natively
• Simplifies the development of new applications
Self-aware Data Objects - Definition
[Diagram: an SDO made up of business context, data envelope, micro application and data]
SDOs use a domain specific language to define M&E or planning data and its business context.
This can be transformed into a micro-application that allows the data to be edited and viewed easily.
Self-aware Data Objects - Definition
Business context includes:
Data model that specifies:
– Fields
– Labels
– Help text
– Validations
– Languages
– Evidence (files or photos)
– Taxonomies
– Links to other SDO data
– Data can be expressed hierarchically
– Data can be contained in sets
It also includes a schema that validates the data saved in the data model
Self-aware Data Objects - Definition
Micro application
The micro application contains view and edit models needed to view or edit the data in a standard web browser.
Self-aware Data Objects - Definition
Existing applications can access the data directly via the Gatekeeper API
Transformer engines can be used to transform the data into the view and edit models used by the application
Self-aware Data Objects - Definition
Data envelope
Every SDO includes a data envelope. This contains data on:
• Who created the data and when
• Who last updated the data and when
• GIS coordinates
• Globally unique ID for the data
• Tags to code the data
• Flags to indicate if the data is periodic or ad-hoc
• Flags to indicate if it forms part of a series of data
• Linkages to other data
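One possible way to picture the envelope is as a small record type. The sketch below uses a Python dataclass. The class name `DataEnvelope`, the attribute names and the helper `new_envelope` are assumptions for illustration and do not come from the SDO specification.

```python
import uuid
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Optional


@dataclass
class DataEnvelope:
    """Hypothetical envelope mirroring the items listed on the slide."""
    created_by: str
    created_at: datetime
    updated_by: str
    updated_at: datetime
    coordinates: Optional[tuple[float, float]] = None  # (latitude, longitude)
    uid: str = field(default_factory=lambda: str(uuid.uuid4()))  # globally unique ID
    tags: list[str] = field(default_factory=list)
    periodic: bool = False        # False means ad-hoc
    part_of_series: bool = False
    links: list[str] = field(default_factory=list)  # IDs of other data


def new_envelope(user, **kwargs):
    """Create an envelope for freshly created data."""
    now = datetime.now(timezone.utc)
    return DataEnvelope(
        created_by=user, created_at=now, updated_by=user, updated_at=now, **kwargs
    )


envelope = new_envelope("alice", tags=["health"], coordinates=(-33.9, 18.4))
```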
Self-aware Data Objects - Data
Each SDO will have multiple data instances in the Collector from different data producers
[Diagram: the prototype collector system environment (Collector, Data registry, Gatekeeper API) and the M&E or planning system environment (existing system with API, BetterData), with flows labelled "Import data" and "Query data"]
(4) Data Registry
• Library of shared data definitions
• Data governance team manage:
– Who can share new SDO definitions
– Who can use SDO definitions
– Curate and review SDO definitions
– Identify opportunities to link or merge
• Provide advice on privacy
• Responsible for data security
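The governance role described above could be sketched as a registry that checks who may share and who may use a definition. The class `DataRegistry`, its methods and the organisation names are hypothetical. The slides do not describe how the real registry enforces these rules.

```python
class DataRegistry:
    """Hypothetical library of shared SDO definitions with simple access rules."""

    def __init__(self, publishers, users):
        self._publishers = set(publishers)  # who can share new definitions
        self._users = set(users)            # who can use definitions
        self._definitions = {}

    def share(self, who, name, definition):
        if who not in self._publishers:
            raise PermissionError(f"{who} may not share definitions")
        self._definitions[name] = definition

    def get(self, who, name):
        if who not in self._users:
            raise PermissionError(f"{who} may not use definitions")
        return self._definitions[name]


registry = DataRegistry(publishers={"org_a"}, users={"org_a", "org_b"})
registry.share("org_a", "planned_activity", {"fields": ["district", "year"]})
definition = registry.get("org_b", "planned_activity")
```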
(4) Data Registry
• Option for multiple registries
• Scope set by the group that manages it
• Provides for a more organic and incremental approach to developing standards
• While still allowing for separate data registries to coordinate and share
(5) Collector system
• Distributed database that is linked to the Data Registry
• Accessible only via an API that can:
– Validate SDO data against the schema held in the Data Registry
– Publish SDO data into the collector system
– Query data held in the collector system
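The three API operations can be sketched with an in-memory stand-in. The class `Collector`, its method signatures and the `planned_activity` schema are assumptions for illustration. The real Collector is described as a distributed database, and its API is not specified in these slides.

```python
class Collector:
    """In-memory stand-in for a collector that is reachable only through its API."""

    def __init__(self, registry):
        # registry maps an SDO definition name to {field name: expected type}
        self._registry = registry
        self._store = []

    def validate(self, sdo_name, data):
        """Check the data against the schema held in the registry."""
        schema = self._registry.get(sdo_name)
        if schema is None:
            return False
        return set(data) == set(schema) and all(
            isinstance(data[key], expected) for key, expected in schema.items()
        )

    def publish(self, sdo_name, data):
        """Store the data, but only if it validates."""
        if not self.validate(sdo_name, data):
            raise ValueError(f"data does not match the schema for {sdo_name!r}")
        self._store.append({"sdo": sdo_name, "data": dict(data)})

    def query(self, sdo_name, **filters):
        """Return stored data for one SDO definition that matches all filters."""
        return [
            item["data"]
            for item in self._store
            if item["sdo"] == sdo_name
            and all(item["data"].get(k) == v for k, v in filters.items())
        ]


registry = {"planned_activity": {"district": str, "year": int, "partner": str}}
collector = Collector(registry)
collector.publish("planned_activity", {"district": "X", "year": 2016, "partner": "Org A"})
collector.publish("planned_activity", {"district": "Y", "year": 2016, "partner": "Org B"})
in_district_x = collector.query("planned_activity", district="X")
```

A query such as the last line is the kind of call that could answer "who is planning to work in district X next year?".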
(6) Existing systems
• Simplify the process of integrating existing systems
• Single standard API to validate, publish and query data
• Systems must transform data into the SDO standard before publishing it
• Or can use SDO view and edit model to store data natively as an SDO
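For an existing system, the transform step might be as simple as renaming fields. The sketch below assumes a hypothetical legacy row and a hypothetical mapping `FIELD_MAP`. Real integrations would likely also need type conversion and validation.

```python
# Hypothetical mapping from a legacy system's column names to SDO field names.
FIELD_MAP = {"dist_name": "district", "yr": "year", "org": "partner"}


def to_sdo(legacy_row, field_map=FIELD_MAP):
    """Rename legacy columns to SDO field names, dropping unmapped columns."""
    missing = [src for src in field_map if src not in legacy_row]
    if missing:
        raise KeyError(f"legacy row is missing: {missing}")
    return {dst: legacy_row[src] for src, dst in field_map.items()}


def from_sdo(sdo_data, field_map=FIELD_MAP):
    """Rename SDO fields back to the legacy column names."""
    reverse = {dst: src for src, dst in field_map.items()}
    return {reverse[key]: value for key, value in sdo_data.items() if key in reverse}


row = {"dist_name": "X", "yr": 2016, "org": "Org A", "internal_id": 9}
sdo_data = to_sdo(row)
round_trip = from_sdo(sdo_data)
```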
(6) Existing systems
• Over time can create libraries to help speed up integration with API
• No other changes needed
(7) BetterData
• Open source M&E system
• Integrated with Collector API
• Integrated with Data Registry
• Browse Data Registry and download relevant SDOs
• Link SDOs into a workflow that incorporates business logic
• Store locally or publish to Collector system
Where are we now?
• DSL – completed
• SDO examples – many in active usage
• BetterData M&E – completed
• Data Registry – early 2016
• Collector System and API – early 2016
• Data Governance guidelines – consultation
What next?
• GIZ funded pilot with South African government
• Demonstrate working prototype in 2016
• Link and aggregate data from Municipal, Provincial and National levels
What next?
• Canvass interest in applying to other contexts?
– Who is interested?
– What new issues does this raise?
• Establish advisory group
– Assist with refinement of DSL and SDO specifications
– Assist with development of data governance guidelines
Thank you!
• Rob Worthington
• [email protected]
• www.kwantu.net
• @kwantu