Big Data. Remaking Science and Government - Clarivate (ips.clarivate.com/m/pdfs/fed-res/big_data_presentation_fnl.pdf)

  • Big Data

    Remaking Science and Government

  • So whattaya mean by big?

    1. There's a lot of it

    2. It's unstructured

  • What's changed?

    The Cloud

    Hadoop

  • "For a number of years we've worked really hard at transforming the information we were collecting into something that computers could understand. The next revolution that's starting to come is instead of spending a lot of energy turning data into something computers can understand, we can train computers to understand the data and information we humans understand."

    Sky Bristol, chief of the USGS's Science Information Services

  • Data-driven science

    "There's just a deluge of data. Rather than starting by developing your own hypothesis, now you can do the data analysis first and develop your hypotheses when you're deeper in."

    Suzi Iacono, NSF Deputy Assistant Director, CISE

  • Democratization

    "In the old days if you wanted to know what was going on in the Indian Ocean, you had to get a boat and get a crew. It was easier for men to do that. But big data democratizes things. Now we've got sensors on the whole floor of the Indian Ocean and you can look at that data every morning, afternoon and night. You could look at data from the last five years. That's a whole new ballgame for science and engineering."

    Suzi Iacono, NSF CISE

  • What this means to me

  • What does it mean for government?

    Fraud and abuse

  • March 2012: The government invested $200 million in big data research

  • NSF: Geo Deep Dive

    Watson for the geosciences

    A computer system that crawls the scanned science journals, videos, spreadsheets and other data to create a query-able database of all geological knowledge

  • What does it mean?

    Dark data would be uncovered

    Geologists could more easily rely on existing data

    Geologists could spend less time gathering data

    Geologists could ask bigger questions

  • "Some problems were kind of off limits. You couldn't really think about reasonably addressing them in a meaningful way in one lifetime. These new tools have that promise: to change the types of questions we're able to ask and the nature of answers we get."

    Prof. Shanan Peters, UW GeoSciences

  • NIH: 1,000 Genomes in the Cloud

    About 1,400 individual genomes placed inside Amazon's EC2 Cloud

    Anyone can look at the data for free.

    Anyone can do genomic research inside the cloud using a cloud fee structure

  • What does it mean?

    The barriers to entry for genomic research are much lower

    Genomic research is opened to small universities, nonprofits and graduate students

    Better treatments for diseases with a genetic component such as diabetes and breast cancer

  • "If you rewind seven years, the questions that scientists could ask were constrained by the resources available to them. Now we don't have to worry about arbitrary constraints so research is significantly accelerated. They don't have to live with the repercussions of making incorrect assumptions or of running an experiment that didn't play out."

    Matt Wood, AWS Principal Data Scientist

  • Defense: Clear Heart

    A system that scans hours of video to identify human motions that suggest malicious intent

  • What does it mean?

    Much smarter data coming back from drones and satellites

    Human analysts concentrated where they're most needed

    Smarter domestic security systems

  • "In a situation like Newtown, if you'd had a video camera connected with this system it could have given an early warning that someone was roaming the halls with a gun. That could have hooked into an alarm system. It would have been an early warning."

    Richard McNeight, President, Modus Operandi

  • Challenges

    Just because there's a lot of data doesn't mean it's the right data

    Quality matters

    A lot of data is proprietary