Data Sharing in the Next 5 Years Mercè Crosas, Ph.D. @mercecrosas Director of Data Science Institute for Quantitative Social Science (IQSS) Harvard University Northeastern Research Computing Charrette, Sept 9, 2015

Institute for Quantitative Social Science (IQSS) Director of Data ...scholar.harvard.edu/files/mercecrosas/files/2015-datasharingnext5ye… · What are we doing towards supporting

Page 1

Data Sharing in the Next 5 Years

Mercè Crosas, Ph.D. @mercecrosas

Director of Data Science

Institute for Quantitative Social Science (IQSS)

Harvard University

Northeastern Research Computing Charrette, Sept 9, 2015

Page 2

Sharing research data in a data repository enables reuse, extension and validation of previous research work

Page 3

A widely-used, open-source data repository framework for publishing data

Page 4

Research data sets are becoming larger, more sensitive, and more frequently updated

Page 5

Data Sharing with Dataverse

Now

No sensitive data

Datasets up to ~ GB

Seldom Versioning

The Next 5 Years

Highly-sensitive data

Datasets > GBs, TBs, PBs

Streaming data

Page 6

What are we doing towards supporting these new types of data?

Page 7

Sharing Sensitive Data with Confidence

Funded by

Page 8

Standardized Levels of Data Sensitivity

Page 9

Sensitive

Non-Sensitive

Sensitive

Page 10

Sharing and Preserving Large Structural Biology Data

Funded by

Page 11

Structural Biology Primary Data

1 dataset is 180-360 images of X-ray diffraction data, 3.5-7 GB; total up to 100 PBs

Integration with Dataverse:

● Long-term access

● Formal Data Citation

● Standard Metadata

● Data Exploration (OME)

● Preservation, with copies in multiple sites
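
The sizes quoted on this slide can be sanity-checked with a little arithmetic. The sketch below is illustrative only: it assumes decimal units (1 GB = 10^9 bytes, 1 PB = 10^15 bytes) and reads "100 PBs" as the total volume of all datasets. The function names are hypothetical and not part of Dataverse or SBGrid.

```python
# Back-of-the-envelope check of the dataset sizes quoted on the slide.
# Assumption: decimal units (the slide does not say decimal or binary).
MB = 10**6
GB = 10**9
PB = 10**15


def per_image_mb(dataset_gb: float, images: int) -> float:
    """Average size of one diffraction image, in MB."""
    return dataset_gb * GB / images / MB


def datasets_in(total_pb: float, dataset_gb: float) -> int:
    """How many whole datasets of the given size fit in the total volume."""
    return int(total_pb * PB // (dataset_gb * GB))


if __name__ == "__main__":
    # Both ends of the quoted range imply the same average image size.
    print(round(per_image_mb(3.5, 180), 1), "MB per image (small dataset)")
    print(round(per_image_mb(7.0, 360), 1), "MB per image (large dataset)")
    # Number of datasets that a 100 PB total would correspond to.
    print(datasets_in(100, 7.0), "datasets at 7 GB each")
    print(datasets_in(100, 3.5), "datasets at 3.5 GB each")
```

Under these assumptions each image averages roughly 19 MB, and a 100 PB total corresponds to somewhere between about 14 and 29 million datasets.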

Page 12

There is a need for closer integration of data repositories with research computing resources to support the new types of data

Page 13

Towards an Integrated Research Ecosystem to Support Data-Intensive Research

Research Workspace

Research Computing

Publish Research Data

Data Repository

Explore and Visualize Data

Research Computing

Page 14

Thanks

dataverse.org

datatags.org

sbgrid.org

scholar.harvard.edu/mercecrosas