"A Replicable Guide to Reproducible Research" Iddo Friedberg Iowa State University

Random Musings on Fixing Data Shambles in Science

NIH: Principles and Guidelines for Reporting Research

Rigorous statistical analysis

Transparency in reporting

Data & material sharing

Best practices

Interpretable statistical analysis

Effect size and significance

“EFFECT SIZE vs. SIGNIFICANCE: smoking increases your relative risk of lung cancer by 2,500 percent; eating two slices of bacon a day increases your relative risk for colorectal cancer by 18 percent. Given the frequency of colorectal cancer, that means your risk of getting colorectal cancer over your life goes from about 5 percent to 6 percent”
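
The arithmetic behind the quote can be sketched in a few lines. This is only an illustration: the function name `absolute_risk` is made up here, and the 5 percent baseline and 18 percent relative increase are taken directly from the quote rather than from any dataset.

```python
def absolute_risk(baseline_risk: float, relative_increase_pct: float) -> float:
    """Apply a relative risk increase (given in percent) to a baseline risk."""
    return baseline_risk * (1.0 + relative_increase_pct / 100.0)


# Figures from the quote: about 5 percent lifetime baseline risk of
# colorectal cancer, and an 18 percent relative increase.
baseline = 0.05
with_bacon = absolute_risk(baseline, 18)

# A relative increase of 18 percent moves the absolute risk by under
# one percentage point, which is the point of the slide.
print(f"{baseline:.1%} -> {with_bacon:.1%}")  # prints 5.0% -> 5.9%
```

The quote rounds the result to "about 6 percent"; the relative figure sounds large only because the baseline is small.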

Interpretable statistical analysis

Multiple hypothesis testing

“Either we have stumbled onto a rather amazing discovery in terms of post-mortem ichthyological cognition, or there is something a bit off with regard to our uncorrected statistical approach.”

Craig M. Bennett et al. (2010). Neural Correlates of Interspecies Perspective Taking in the Post-Mortem Atlantic Salmon: An Argument For Proper Multiple Comparisons Correction. Journal of Serendipitous and Unexpected Results, 1 (1), 1-5.
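
A small simulation shows why uncorrected testing misleads. This is a sketch under stated assumptions, not the method used in the salmon paper: the 10,000 tests, the seed, and the choice of Bonferroni and Benjamini-Hochberg as the corrections are all illustrative. Every simulated test is a true null, so any "hit" is a false positive.

```python
import random


def bonferroni_reject(p_values, alpha=0.05):
    """Reject only p-values at or below alpha divided by the number of tests."""
    m = len(p_values)
    return [p <= alpha / m for p in p_values]


def benjamini_hochberg_reject(p_values, alpha=0.05):
    """Benjamini-Hochberg step-up procedure controlling the false discovery rate."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank k whose sorted p-value is <= (k / m) * alpha.
    max_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            max_rank = rank
    reject = [False] * m
    # Reject every hypothesis ranked at or below that k.
    for rank, idx in enumerate(order, start=1):
        if rank <= max_rank:
            reject[idx] = True
    return reject


# Assumption: 10,000 independent tests with no real effect anywhere,
# so each p-value is uniform on [0, 1).
rng = random.Random(0)
null_p_values = [rng.random() for _ in range(10_000)]

uncorrected_hits = sum(p <= 0.05 for p in null_p_values)
bonferroni_hits = sum(bonferroni_reject(null_p_values))
bh_hits = sum(benjamini_hochberg_reject(null_p_values))

print("uncorrected:", uncorrected_hits)
print("Bonferroni:", bonferroni_hits)
print("Benjamini-Hochberg:", bh_hits)
```

Without correction, roughly 5 percent of the null tests are expected to pass the 0.05 threshold by chance alone; both corrections are designed to suppress most of those.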

Transparency in reporting

Standards

Replicates

Statistics

Randomizations

Blinding

Inclusion & exclusion criteria
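
Randomization and blinding are easier to report when they are scripted. The following is a minimal sketch, assuming a two-arm design; the function name `randomize_and_blind`, the sample labels, the code format, and the seed are all hypothetical and not from the talk.

```python
import random


def randomize_and_blind(sample_ids, groups=("treatment", "control"), seed=2015):
    """Assign samples to groups at random and give each one a blinded code.

    A fixed seed makes the assignment reproducible, so it can be reported
    and re-derived later.
    """
    rng = random.Random(seed)
    shuffled = list(sample_ids)
    rng.shuffle(shuffled)
    # Round-robin over the shuffled order keeps the group sizes balanced.
    assignment = {sid: groups[i % len(groups)] for i, sid in enumerate(shuffled)}
    # Blinded codes carry no group information; the key is kept apart
    # from whoever scores the outcomes.
    codes = [f"S{n:03d}" for n in range(1, len(shuffled) + 1)]
    rng.shuffle(codes)
    blind_key = dict(zip(shuffled, codes))
    return assignment, blind_key


# Hypothetical sample labels, for illustration only.
sample_ids = [f"mouse_{n}" for n in range(1, 9)]
assignment, blind_key = randomize_and_blind(sample_ids)
for sid in sample_ids:
    print(blind_key[sid], assignment[sid])
```

Reporting the seed and the script alongside the inclusion and exclusion criteria lets a reader reconstruct exactly which sample went where.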

MIAME (2001)

Minimum Information About a Microarray Experiment

Experimental conditions

Controlled vocabularies for organisms, gene functions

MIAME (2001)

A journal-enforced reporting standard enables:

Reproducibility

Comparing data

Comparing method effectiveness

Refutability

Some more standards

Standard                    Purpose
Gene Ontology               Protein function
Human Phenotype Ontology    Human disease
MIxS                        Genomic and metagenomic data

The Gene Ontology

Pingzhao Hu, Gary Bader, Dennis A. Wigle & Andrew Emili

Nature Reviews Cancer 7, 23-34 (January 2007)

The Gene Ontology

Consistent descriptions of genes

Mostly human, mouse, fly & yeast

Functional aspects:

Molecular function

Biological process

Cellular component
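
To make the three aspects concrete, here is a minimal sketch of annotation records grouped by aspect. The `Annotation` class, the `group_by_aspect` helper, and the gene label `geneX` are hypothetical; the GO identifiers are real terms, used here only as examples.

```python
from collections import defaultdict
from dataclasses import dataclass

# The three functional aspects named on the slide.
ASPECTS = ("molecular_function", "biological_process", "cellular_component")


@dataclass(frozen=True)
class Annotation:
    gene: str
    go_id: str
    term: str
    aspect: str

    def __post_init__(self):
        # A controlled vocabulary only helps if unknown values are rejected.
        if self.aspect not in ASPECTS:
            raise ValueError(f"unknown aspect: {self.aspect}")


def group_by_aspect(annotations):
    """Collect term names under each aspect."""
    grouped = defaultdict(list)
    for annotation in annotations:
        grouped[annotation.aspect].append(annotation.term)
    return dict(grouped)


# "geneX" is a made-up gene label for illustration.
annotations = [
    Annotation("geneX", "GO:0003677", "DNA binding", "molecular_function"),
    Annotation("geneX", "GO:0006355", "regulation of DNA-templated transcription", "biological_process"),
    Annotation("geneX", "GO:0005634", "nucleus", "cellular_component"),
]

for aspect, terms in group_by_aspect(annotations).items():
    print(aspect, terms)
```

Because every record uses the same identifiers and aspects, annotations from different labs or organisms can be compared directly.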

Data Sharing

Requires standards

Requires volume

Requires bandwidth

Shape of Things

Useful standards & best practices exist in many fields

Funders are beginning to understand the need for standards / best practices

Support for cyberinfrastructure

Education / awareness

Journal-mandated compliance

Funder-based compliance (?)

Good / Could be better

ISMB/ECCB 2015, July 10-14