
  • © 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved.

    Unlocking Open Data in the Cloud

    Grischa Gundelsweiler
    Public Sector Account Manager, DACH
    Loft + Lab Munich
    11th November 2016

  • What this session is about

    1) Open Data: Concepts, Examples & Trends
    2) AWS as a Platform for Open Data
    3) Case Study: Provide Open Data on AWS
    4) Case Study: Use Open Data on AWS


  • Open Data: Concepts, Examples & Trends


  • “Open data is data that can be freely used, shared and built-on by anyone, anywhere, for any purpose.”

    Definition by Open Knowledge Foundation, 2013

  • The 8 Open Government Data Principles

    1. Complete
    2. Primary
    3. Timely
    4. Accessible
    5. Machine processable
    6. Non-discriminatory
    7. Non-proprietary
    8. License-free

    Source: OGD Principles

  • Why Open Data?

    1. Transparency

    2. Releasing social and commercial value

    3. Participation and engagement

  • McKinsey report from October 2013

  • EC study from November 2015: Creating Value through Open Data: Study on the Impact of Re-use of Public Data Resources

  • Open Data Portal of Deutsche Bahn


  • AWS as a Platform for Open Data


  • Why does AWS care about Open Data?

    Many of our commercial sector customers rely on quality open data as much as they rely on our cloud infrastructure services.

    Many of our public sector customers use AWS to make their data available to a global community of researchers, entrepreneurs, students, and fellow government agencies.

    Sharing data makes it accessible to a large and growing community of researchers, entrepreneurs, and enterprises.


  • The cloud allows users from anywhere to take their algorithms to data rather than downloading data to their computing resources.

    Data Acquisition in the Cloud


  • Open data as a platform

    Data Creation
    Data Enrichment
    Data at Rest (Object storage)
    Basic APIs
    Complex APIs
    Data Catalogs
    Focused data dashboards
    Lower cost of knowledge (Efficiency)


  • A Rich Set of Programmable Services


    Administration and Security: Access Control, Identity Management, Key Management and Storage, Monitoring and Logs, Resource and Usage Auditing

    Platform Services: Analytics, App Services, Developer Tools and Operations, Mobile Services
    (Real-Time Streaming Data, Application Lifecycle Management, Event-Driven Computing, Resource Templates, Identity, Mobile Analytics, Push Notifications, App Streaming, Queuing and Notifications)

    Core Services: CDN, Compute (VMs, Auto-Scaling and Load Balancing), Databases (Relational, NoSQL, and Caching), Networking (VPC, DX, and DNS), Storage (Object, Block, and Archival)

    Infrastructure: Availability Zones, Points of Presence

    Business Email, Sharing and Collaboration, Virtual Desktop

    Technical and Business Support, Security and Pricing Reports, Training and Certification

  • Providing Open Data on AWS


  • Case Study: Transport for London

    Graphics from TfL, October 2016

  • Why open data at TfL?

    Transparency
    Reach
    Optimal use of transport network
    Economic benefit
    Innovation
    …


  • Available Datasets

    The API supports all the data requirements of the TfL website. Every data-driven aspect of the website (including maps) is powered by the unified API.

    Some of the multi-modal core datasets included and available to developers are:

    Journey Planning (current and future)
    Status (current and future)
    Disruptions (current) and Planned works (future)
    Arrival/departure predictions (instant and websockets)
    Timetables
    Embarkation points and facilities
    Routes and lines (topology and geographical)
    Fares
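    As a rough illustration of how a developer might consume one of these datasets, the sketch below builds a request URL for line status and summarizes a JSON response. The base URL, the `/Line/Mode/{mode}/Status` path, and the field names `lineStatuses` and `statusSeverityDescription` are assumptions about the unified API rather than details from the slides; check them against the TfL developer documentation. The sample payload is hand-written, not real TfL data.

```python
import json
from urllib.parse import quote

# Assumed base URL of the TfL unified API; verify against the TfL developer docs.
TFL_API_BASE = "https://api.tfl.gov.uk"


def line_status_url(mode: str) -> str:
    """Build the (assumed) status endpoint URL for one transport mode."""
    return f"{TFL_API_BASE}/Line/Mode/{quote(mode, safe='')}/Status"


def summarize_status(payload: str) -> dict:
    """Map each line name to its list of status descriptions."""
    summary = {}
    for line in json.loads(payload):
        statuses = line.get("lineStatuses", [])
        summary[line["name"]] = [s["statusSeverityDescription"] for s in statuses]
    return summary


# Hand-written sample in the assumed response shape (hypothetical values).
SAMPLE = json.dumps([
    {"name": "Central", "lineStatuses": [{"statusSeverityDescription": "Good Service"}]},
    {"name": "Victoria", "lineStatuses": [{"statusSeverityDescription": "Minor Delays"}]},
])

if __name__ == "__main__":
    print(line_status_url("tube"))
    print(summarize_status(SAMPLE))
```

    Keeping URL construction separate from response parsing lets the parsing be exercised offline with a recorded payload before any live request is made.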


  • London



    Almost 500 apps produced.
    Playground for innovation.
    Improving transportation, collaboratively.

    Apps by public transportation authorities: MVV, MVG, DB. No information on how to access the data; documentation is lacking.

  • Graphic from TfL, October 2016

  • Outcomes and Cloud Benefits

    Outcomes:
    Customers save time, economic benefits
    New jobs and investments in startup and tech ecosystem
    Usage of data has since doubled
    Data consolidation and quality

    Cloud benefits:
    Pay for what you use
    Lower maintenance costs
    Elasticity
    Automation and consistency
    Blue/green deployment – zero downtime
    Highly secure

    Source: MWD Advisors case study

  • Solutions for providing Open Data on AWS

    Open data platforms
    Catalog
    Publish
    Discover
    Visualize
    Analyze
    Share
    …


  • Using Open Data on AWS


  • Public Data Sets on AWS

    Several high-value datasets are available for anyone to access for free on AWS. Examples include:

    Landsat on AWS
    3K Rice Genome
    NEXRAD on AWS


  • More available Public Datasets on AWS…

    GDELT: Over a quarter-billion records monitoring the world's broadcast, print, and web news from nearly every corner of every country, updated daily.
    IRS 990 Filings on AWS: Machine-readable data from certain electronic 990 forms filed with the IRS from 2011 to present
    Common Crawl Corpus: A corpus of web crawl data composed of over 5 billion web pages
    TCGA on AWS: Raw and processed genomic, transcriptomic, and epigenomic data from The Cancer Genome Atlas (TCGA) available to qualified researchers via the Cancer Genomics Cloud
    ICGC on AWS: Whole genome sequence data available to qualified researchers via The International Cancer Genome Consortium (ICGC)
    1000 Genomes Project: A detailed map of human genetic variation
    Multimedia Commons: A collection of nearly 100M images and videos with audio and visual features and annotations
    Google Books Ngrams: A dataset containing Google Books n-gram corpuses

    A list of other Public Datasets is available here.



  • Accessing and processing Landsat data

    What is Landsat on AWS?

    How to access Landsat on AWS?

    How to use Landsat on AWS?


  • Landsat on AWS

    We have committed to make up to 1 petabyte of Landsat imagery readily available as objects on Amazon S3.

    All Landsat 8 scenes from 2015 and 2016 are available, along with a selection of cloud-free scenes from 2013 and 2014.

    All new Landsat 8 scenes are made available each day (~700 per day), often within hours of production.


  • Landsat on AWS

    Landsat on AWS makes each band of each scene readily available as objects on Amazon S3. Data can be accessed programmatically via HTTP and quickly deployed to any of our products for analysis and processing.

    Users do not need to worry about local storage and have access to virtually unlimited computing power on demand.
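    To make "accessed programmatically via HTTP" concrete, here is a minimal sketch that derives the object URL for one band of one scene. The bucket hostname, the `L8/<path>/<row>/<scene id>/` key layout, and the 21-character scene ID format are assumptions based on how the dataset was commonly described around the time of this talk; the hosting arrangement may have changed since, so treat the URL pattern as illustrative.

```python
from typing import NamedTuple

# Assumed public HTTP endpoint and key prefix for Landsat 8 scenes.
LANDSAT_BASE = "https://landsat-pds.s3.amazonaws.com/L8"


class Scene(NamedTuple):
    path: str          # WRS-2 path, three digits
    row: str           # WRS-2 row, three digits
    year: int          # acquisition year
    day_of_year: int   # acquisition day of year


def parse_scene_id(scene_id: str) -> Scene:
    """Split an (assumed) Landsat 8 scene ID such as LC80030172015001LGN00."""
    if len(scene_id) != 21 or not scene_id.startswith("LC8"):
        raise ValueError(f"unexpected scene ID format: {scene_id!r}")
    return Scene(
        path=scene_id[3:6],
        row=scene_id[6:9],
        year=int(scene_id[9:13]),
        day_of_year=int(scene_id[13:16]),
    )


def band_url(scene_id: str, band: int) -> str:
    """Build the (assumed) URL of a single band GeoTIFF for a scene."""
    scene = parse_scene_id(scene_id)
    return f"{LANDSAT_BASE}/{scene.path}/{scene.row}/{scene_id}/{scene_id}_B{band}.TIF"


if __name__ == "__main__":
    print(band_url("LC80030172015001LGN00", 4))
```

    Because every band is its own object, a client that only needs one band for an analysis never has to transfer the rest of the scene.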






  • Undifferentiated heavy lifting

    We use GDAL to add “internal tiling” on each Landsat on AWS tiff, which allows developers to use HTTP range gets to access specific portions of each scene.

    This allows people to only access the data they need when they need it.

    [Figure: a standard TIFF object and an internally tiled TIFF, each divided into blocks numbered 1 to 36]
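    The idea behind internal tiling plus HTTP range requests can be sketched as follows: work out which tiles overlap the pixel window of interest, then ask the server only for those tiles' bytes. The 6 x 6 grid mirrors the 36 numbered blocks in the figure; the 512-pixel tile size, the example URL, and the byte offset and length are hypothetical. In a real file the offsets and lengths come from the TIFF's TileOffsets and TileByteCounts tags, which this sketch does not read.

```python
import urllib.request


def tiles_for_window(tiles_per_row, tile_size, col_off, row_off, width, height):
    """Return 1-based, row-major tile numbers that intersect a pixel window."""
    first_col = col_off // tile_size
    last_col = (col_off + width - 1) // tile_size
    first_row = row_off // tile_size
    last_row = (row_off + height - 1) // tile_size
    return [
        r * tiles_per_row + c + 1
        for r in range(first_row, last_row + 1)
        for c in range(first_col, last_col + 1)
    ]


def range_request(url, offset, length):
    """Prepare (but do not send) a GET for bytes offset .. offset+length-1."""
    last_byte = offset + length - 1
    return urllib.request.Request(url, headers={"Range": f"bytes={offset}-{last_byte}"})


if __name__ == "__main__":
    # A 500 x 500 pixel window starting at (600, 600) in a 6 x 6 grid of 512-pixel tiles.
    print(tiles_for_window(6, 512, 600, 600, 500, 500))
    # Hypothetical offset and length for one tile of a hypothetical object.
    req = range_request("https://example.com/scene_B4.TIF", 1024, 2048)
    print(req.get_header("Range"))
```

    With a tiled layout the window above touches only 4 of the 36 tiles, so only those byte ranges need to be requested.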


  • RGB: Visible light

    Shortwave infrared: Urban areas

    Think of URLs instead of copies

    Wellington, New Zealand
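    "URLs instead of copies" can be read as: a different view of the same scene is just a different set of band objects. The sketch below maps two composites to Landsat 8 band numbers and lists the band files each one needs. The 4-3-2 (natural color) and 7-6-4 (shortwave-infrared composite often used for urban areas) combinations are standard Landsat 8 conventions, but whether the slide's images used exactly these bands is an assumption, as are the file-name pattern and the example scene ID, which is unrelated to the Wellington image.

```python
# Landsat 8 band numbers per composite, in red, green, blue display order.
COMPOSITES = {
    "natural_color": (4, 3, 2),      # red, green, blue bands (visible light)
    "urban_false_color": (7, 6, 4),  # SWIR 2, SWIR 1, red
}


def composite_band_files(scene_id: str, composite: str) -> list:
    """List the (assumed) band file names needed to render one composite."""
    return [f"{scene_id}_B{band}.TIF" for band in COMPOSITES[composite]]


if __name__ == "__main__":
    scene = "LC80030172015001LGN00"  # example scene ID, chosen for illustration
    for name in COMPOSITES:
        print(name, composite_band_files(scene, name))
```

    Both composites share band 4, so a viewer switching between them only needs two additional band objects rather than a second copy of the scene.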

  • Using Landsat on S3

    Landsat on Amazon


    ArcGIS Server on Amazon EC2

    AWS US West Oregon Region

    reliable, performant data access


  • Landsat on AWS

    Usage in the first year:
    Over 400,000 scenes available
    Over 1 billion hits globally
    Used for new product development by:

    Sentinel-2 on AWS

    Small investment, big impact:
    Public dataset hosted in FRA
    Apps for agriculture, disaster relief, vegetation monitoring, property taxation, …
    Used for new product development by:

  • Next steps

    Depending on your role, your goals:
    Use open data in your projects / your organisation
    Provide open data from your organisation
    Build a new business on open data

    AWS offers:
    Technology platform that constantly evolves
    Enablement through workshops, training, ProServ
    Customer and partner ecosystem to connect and build


  • Thank you!

    [email protected]

    Grischa Gundelsweiler