Andy Jenkinson, EBI

An Introduction to DAS

Summary of Topics

• What is Data Integration?

• Problems in Data Integration

• An architectural overview of DAS

• Brief History of DAS

What is Data Integration?

All These are Data Integration

• Reading some papers so you can write a report

• Exploring some database websites so you can learn about a topic

• Downloading some data from different databases so you can analyse it

• Downloading some data from different databases so you can combine it with your own

Data Integration

• “Automatic” data integration
  • pulling in data from different locations
  • processing it
  • creating a resource derived from the data
  • done via computers, not humans

• e.g. creating/updating a data warehouse

[Diagram: data from PDB, Ensembl and UniProt loaded into a central warehouse]
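
The automatic pipeline on this slide (pull, process, derive a resource) can be sketched in a few lines of Python. This is an illustration only, not how any real warehouse is built: the three `fetch_*` functions are hypothetical stand-ins that return hard-coded placeholder records, and every field name and identifier in them is made up.

```python
from collections import defaultdict

# Hypothetical stand-ins for bulk downloads from each source.
# Each one deliberately uses a different key name for the same identifier,
# which is what makes the "processing" step necessary.
def fetch_pdb():
    return [{"protein": "ACC1", "structure_id": "STRUCT-A"}]

def fetch_ensembl():
    return [{"uniprot_acc": "ACC1", "gene": "GENE-X"}]

def fetch_uniprot():
    return [{"accession": "ACC1", "name": "Example protein"}]

def build_warehouse():
    """Pull from every source, normalise the keys, and merge into one store."""
    warehouse = defaultdict(dict)
    for rec in fetch_uniprot():
        warehouse[rec["accession"]]["name"] = rec["name"]
    for rec in fetch_ensembl():
        warehouse[rec["uniprot_acc"]]["gene"] = rec["gene"]
    for rec in fetch_pdb():
        warehouse[rec["protein"]]["structure"] = rec["structure_id"]
    return dict(warehouse)

warehouse = build_warehouse()
print(warehouse)
```

Note that the result is a snapshot: the merge code is tied to each source's field names, and the copy is only as current as the last run.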

Warehouse model

Data Integration: like herding cats

Databases are all different

Databases evolve

Data ages

Databases are big

Distributed Annotation System

• Distributed

• Client-Server architecture

• Federation

• RESTful web services
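
To make "RESTful web services" concrete, here is a small Python sketch that builds a DAS-style request URL. It assumes the DAS 1.x URL convention of `<server>/das/<source>/<command>` with an optional `segment=<reference>:<start>,<stop>` parameter; that layout is not stated on the slide, so check it against the specification before relying on it. The server `das.example.org` and the source name `example_source` are hypothetical, as is the helper `das_url`.

```python
from urllib.parse import urlencode

def das_url(server, source, command, segment=None, start=None, stop=None):
    """Build a DAS-style URL (assumed layout: /das/<source>/<command>)."""
    base = f"{server.rstrip('/')}/das/{source}/{command}"
    if segment is None:
        return base
    # A range is only appended when both ends are given.
    if start is not None and stop is not None:
        region = f"{segment}:{start},{stop}"
    else:
        region = str(segment)
    # Keep ':' and ',' readable instead of percent-encoding them.
    return base + "?" + urlencode({"segment": region}, safe=":,")

url = das_url("http://das.example.org", "example_source", "features", "1", 1000, 2000)
print(url)
```

Because the request is just a URL, any HTTP client can issue it, and the client asks only for the region it needs.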

Warehouse model

DAS model

Architectural Overview

DAS

• Databases are all different
  • DAS is a uniform facet of a database – always the same

• Databases change their structure
  • when the database changes, DAS stays the same

• Databases are updated
  • DAS data comes directly from the provider so is always fresh

• Databases are big
  • DAS uses real-time targeted queries
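
The four points above can be sketched as a toy federated client in Python. Everything here is hypothetical and runs in memory: the two provider classes stand in for remote servers, their internal storage differs on purpose, and the shared `features(segment, start, stop)` method plays the role of the uniform facet. It is not the DAS protocol itself, only the shape of the idea.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Feature:
    source: str
    segment: str
    start: int
    stop: int
    label: str

class DictBackedSource:
    """Hypothetical provider whose private storage is a dict keyed by segment."""
    name = "provider_a"

    def __init__(self):
        self._rows = {"1": [(100, 200, "feature A1"), (5000, 5100, "feature A2")]}

    def features(self, segment, start, stop):
        # Return only the features overlapping the requested range.
        return [Feature(self.name, segment, s, e, label)
                for s, e, label in self._rows.get(segment, [])
                if s <= stop and e >= start]

class ListBackedSource:
    """Hypothetical provider with a different internal layout, same interface."""
    name = "provider_b"

    def __init__(self):
        self._rows = [{"chr": "1", "from": 150, "to": 180, "note": "feature B1"}]

    def features(self, segment, start, stop):
        return [Feature(self.name, r["chr"], r["from"], r["to"], r["note"])
                for r in self._rows
                if r["chr"] == segment and r["from"] <= stop and r["to"] >= start]

def federated_query(sources, segment, start, stop):
    """Send the same targeted query to every source and merge the answers."""
    merged = []
    for src in sources:
        merged.extend(src.features(segment, start, stop))
    return sorted(merged, key=lambda f: (f.start, f.stop))

hits = federated_query([DictBackedSource(), ListBackedSource()], "1", 1, 1000)
for f in hits:
    print(f.source, f.start, f.stop, f.label)
```

The client never sees either provider's storage layout, so a provider can restructure its database without breaking the client, and nothing is copied ahead of time.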

History

Developed circa 1999 for sharing genome annotations

Expanded 2004 onwards
• more data types
• better metadata
• addition of Registry

DAS/2 project
• split from DAS, not backwards compatible
• inspired some DAS developments

To Summarise…

The Distributed Annotation System is…
• A network of biological data sources
• An example of federation
• A collection of REST web services

The DAS Protocol is…
• An integration platform
• A client-server protocol
• An agreed standard

Image Credits

• Flickr/muir.ceardach
• Flickr/Horia Varlan
• Flickr/Alessandro Pinna
• Fotopedia/Jean-Marie Hullot
• listicles.com/?p=3485
• Google Earth/Cnes/Spot Image
• Olivier H. Beauchesne