Peter Carr 2/2/2013 Bioinformatics Challenge Day This work is sponsored by the Defense Threat Reduction Agency under Air Force Contract #FA8721-05-C-0002. Opinions, interpretations, recommendations and conclusions are those of the authors and are not necessarily endorsed by the United States Government.


Peter Carr

2/2/2013

Bioinformatics Challenge Day

This work is sponsored by the Defense Threat Reduction Agency under Air Force Contract #FA8721-05-C-0002.  Opinions, interpretations, recommendations and conclusions are those of the authors and are not necessarily endorsed by the United States Government.


Bioinformatics Challenge Days

Sponsor: Defense Threat Reduction Agency (DTRA)
Organizer: MIT Lincoln Laboratory (MIT LL)

• Approach: A one day hack-a-thon
  – Innovate: tackle huge challenges in bioinformatics
  – Educate: bring in specialists from diverse fields, participants in DoD bioinformatics interests
  – Investigate: what this short format can accomplish
  – Aggregate: bring people together

• The Challenges:
  – Can you determine the cause of an infection?
  – Can you invent a new way to visualize complex bioinformatics data?
  – Can you spot the signs of genetic engineering? Can you figure out what an engineered organism does?

The problem: drowning in complex data, very hard to make sense of it all

[Figure labels: DNA sequencing; MAGE engineering]


Cast of Characters

Darrell Ricke (MIT Lincoln Laboratory)
  – Bioinformatics

Peter Carr (MIT Lincoln Laboratory)
  – Synthetic Biology, Biochemistry

Anna Shcherbina (MIT Lincoln Laboratory)
  – Bioengineering, Electrical Engineering

Nancy Burgess (Defense Threat Reduction Agency)
  – Chemical and Biological Defense


Some Big Hammers

• Sequencing
  – Complete genome sequences
  – Mixed populations
  – Expression (RNA species)
  – Interaction (ChIP-seq)

• Mass spectrometry
  – Protein/peptide fingerprinting
  – Metabolites
  – Interaction (cross-linking)

• Other tools
  – Microarrays
  – High-throughput screening (e.g. fluorescence)


Now and Future

• Data galore: Omics approaches are generating massive amounts of increasingly complex measurement data

• How do we best make sense of this information?

• Some fundamental development areas
  – Processing
  – Visualizing/analyzing
  – Storing/accessing


The Challenges

1. Metagenomic Visual
   Developing visualization methods to facilitate analysis of metagenomic data with unknown numbers of genomes at varying concentrations

2. Genome Assembly for the Clinic
   Performing de novo assembly from clinical samples with an emphasis on pathogen identification

3. Genetic Engineering
   ID and interpret the signatures of genetic engineering


What can your efforts today produce?

• Analysis, answers to questions

• Heuristics, algorithms

• Specific software tools

• Roadmap for future work


What to get out of this?

• A deeper understanding of the field
  – Tools
  – Approaches
  – Concerns/challenges

• Ideas and experiences that may motivate future work

• Connection to others with similar interests


What We Hope to See From You

Creativity (innovative ideas and efforts)

Energy (intensity and focus)

Communication (results, feedback)


Theme: Flexibility

• You can work alone, come with a team, or team up on-site

• You can use any of the resources we have provided, and any others you have access to (including tools you code yourself ahead of time or today)

• You keep what you make (DTRA and MIT LL make no claims to what you produce)


Schedule

8:00 AM     Breakfast/check-in
9:00 AM     Welcome (Pete)
9:15 AM     Overview and logistics (Pete)
9:45 AM     The Challenges:
              1. Metagenomic Visual (Anna)
              2. Genome Assembly for the Clinic (Darrell)
              3. Genetic Engineering (Pete)
10:45 AM    Coffee/Break into project groups
12:30 PM    Lunch served (groups can continue to work)
3:30 PM     Snack (groups can continue to work)
6:30 PM     Progress updates ready by dinnertime
6:30 PM     Dinner and progress reports
8:00 PM+    Groups can continue to work


Getting Started

• On the USB sticks:
  – Data for the three challenges (FASTA, FASTQ, CSV)
  – Software (Mac, Windows, Linux)

• Local wifi access

• Teaming
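For anyone new to the read files on the USB sticks, here is a minimal sketch of reading FASTQ records with the Python standard library only. It assumes the common four-line record layout and a Phred+33 quality encoding; the actual challenge files are not described in the slides, so both are assumptions, and the example reads are made up.

```python
from typing import Iterator, List, Tuple


def parse_fastq(lines: List[str]) -> Iterator[Tuple[str, str, str]]:
    """Yield (read_id, sequence, quality) from 4-line FASTQ records."""
    for i in range(0, len(lines) - 3, 4):
        header, seq, plus, qual = (s.strip() for s in lines[i:i + 4])
        if not header.startswith("@") or not plus.startswith("+"):
            raise ValueError(f"malformed FASTQ record starting at line {i + 1}")
        yield header[1:], seq, qual


def mean_phred(qual: str, offset: int = 33) -> float:
    """Mean Phred score of a quality string (offset 33 is an assumption)."""
    return sum(ord(c) - offset for c in qual) / len(qual)


# Made-up records, for illustration only.
example = ["@read1", "ACGT", "+", "IIII", "@read2", "GGCA", "+", "!!II"]
records = list(parse_fastq(example))
for read_id, seq, qual in records:
    print(read_id, seq, mean_phred(qual))
```

For real files, pass `open(path).read().splitlines()` (path is yours to supply); an established parser such as Biopython's `SeqIO` is the safer choice for anything beyond a quick look.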


Questions?


Challenge 3: Genetic Engineering

• Background: a sample has been dug from the back of a lab freezer, and subjected to Ion Torrent sequencing

• We would like to know what it is:
  – Simple or complex?
  – Natural or engineered?
  – If engineered, how? (what techniques)
  – For what purpose?
  – Will the design work?

• [No surprise: yes, there is an (in silico) engineered component. Find it! And figure out as much as you can about it.]

• We have a lot of great questions, but may not have all the answers


What Do We Design For?

• Investigation (answer a biological question)

• Production (make a drug, a fuel)

• Serve a specialized role
  – Protect against infection
  – Detect dangerous chemicals
  – Environmental remediation

• Creatively explore an interesting design space


How Do We Produce These?


Getting DNA In

• Transformation/transfection can be via natural, chemical, or electrical methods


Old School: Conjugation

• Transfer “in vivo” protects fragile DNA

• An entire genome can be transferred

• Transfer to other species

• Requires an origin of replication, pilus protein

[Figure labels: donor (sender); recipient (receiver)]


Old School: Phage Transduction

• Phage/virus can replicate independently, or integrate into genome

• DNA or RNA, single- or double-stranded

• Examples:
  – Lentivirus (mammalian)
  – Lambda, T4, T7, P1, M13 (E. coli)


Old School: Mutagenesis

• Natural mutation rates (mutations accumulate slowly over time)

• Exposure to damaging effects (chemicals, radiation)

• Mutator strains: cells defective for one or more natural repair mechanisms


Revolution 1: Restriction Enzymes

• Specific sites: often 6 bp, but can be longer or shorter

• “Outside cutters” cut some distance away from recognition site

• Homing nucleases (longer ~30 bp sites, can be unique in a genome)

• Multiple Cloning Site (MCS) often engineered into cloning vector
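Because recognition sites are short, fixed strings, scanning a sequence for them is a simple first check when looking for cloning remnants such as a multiple cloning site. The sketch below is only an illustration: it looks for three well-known 6 bp palindromic sites (EcoRI, BamHI, HindIII), and the demo sequence is made up. Since these sites are palindromic, scanning one strand is enough; non-palindromic "outside cutter" sites would need the reverse complement checked as well.

```python
import re
from typing import Dict, List

# Three well-known 6 bp recognition sequences, written 5'->3'.
SITES: Dict[str, str] = {
    "EcoRI": "GAATTC",
    "BamHI": "GGATCC",
    "HindIII": "AAGCTT",
}


def find_sites(seq: str, sites: Dict[str, str] = SITES) -> Dict[str, List[int]]:
    """Return 0-based start positions of each recognition site in seq."""
    seq = seq.upper()
    hits: Dict[str, List[int]] = {}
    for name, motif in sites.items():
        # A lookahead pattern also reports overlapping occurrences.
        hits[name] = [m.start() for m in re.finditer(f"(?={motif})", seq)]
    return hits


# Made-up demo sequence containing two EcoRI sites and one BamHI site.
demo = "TTGAATTCAAGGATCCTTGAATTC"
site_hits = find_sites(demo)
print(site_hits)
```

A dense cluster of different sites within a few dozen bases is the kind of pattern an MCS tends to leave behind.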


Plasmids

• Circular

• Contain origin of replication
  – Single copy
  – Low to high copy (hundreds)

• Selection gene (1 or more)

• MCS and other features common

• Extension: BACs and YACs


Selection and Screening

• Almost all approaches give a mix of successes and failures

• Screening searches for what you want

• Selection kills off what you don’t want


Revolution 2: PCR

Polymerase Chain Reaction

• Simple scheme made it possible to manipulate DNA in new ways

• Used not just to make more DNA, but to modify it

• Dependent on oligonucleotide synthesis and enzyme (DNA polymerase)
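To make "not just to make more DNA, but to modify it" concrete, here is a string-level sketch of PCR. It assumes exact primer matches and ignores melting temperature, mispriming, and polymerase errors. The function names, the template, and the primers are all hypothetical. A 5' tail on the forward primer (here an EcoRI site) ends up in the product even though it is absent from the template.

```python
from typing import Optional

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq: str) -> str:
    """Reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def in_silico_pcr(template: str, fwd_anneal: str, rev: str,
                  fwd_tail: str = "") -> Optional[str]:
    """Top-strand product for exact-match primers, or None if either fails to bind.

    fwd_tail is an optional 5' extension on the forward primer that does not
    anneal to the template but is copied into the product.
    """
    start = template.find(fwd_anneal)
    if start == -1:
        return None
    rev_site = revcomp(rev)  # reverse primer binding site as read on the top strand
    end = template.find(rev_site, start + len(fwd_anneal))
    if end == -1:
        return None
    return fwd_tail + template[start:end + len(rev_site)]


# Made-up template: flank + forward site + insert + reverse site + flank.
template = "GGGG" + "ATGCATGC" + "TTTTTTTT" + "CCGGAATT" + "AAAA"
plain = in_silico_pcr(template, "ATGCATGC", "AATTCCGG")
tailed = in_silico_pcr(template, "ATGCATGC", "AATTCCGG", fwd_tail="GAATTC")
print(plain)
print(tailed)
```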


Site-Directed Mutagenesis

• Performed on DNA in vitro (higher background error rates than in vivo)

• Employs a synthetic oligo and an enzyme (polymerase)

• Users typically screen clones with PCR or restriction digest, then sequencing

• Rest of the plasmid typically not re-sequenced


Gibson Assembly

• Can bring together many pieces of DNA at once

• Based on identical sequence overlaps

• 3-enzyme reaction

• Intrinsically scar-less

• Often relies on PCR (& thus oligos) to produce each segment

http://www.youtube.com/watch?v=WCWjJFU1be8
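The overlap logic can be sketched at the string level: adjacent fragments share an identical end sequence, and the joined product contains that overlap exactly once, which is why no scar is left. This is only an illustration of the overlap bookkeeping, not of the enzymatic reaction. It assumes the fragments are already in order and on the same strand, the default minimum overlap of 20 is an assumption, and the demo fragments are made up.

```python
from typing import List


def join_by_overlap(left: str, right: str, min_overlap: int = 20) -> str:
    """Join two fragments whose ends share an identical overlap.

    Tries the longest candidate overlap first; the shared sequence appears
    once in the result.
    """
    for k in range(min(len(left), len(right)), min_overlap - 1, -1):
        if left[-k:] == right[:k]:
            return left + right[k:]
    raise ValueError("no sufficient overlap between fragments")


def assemble(fragments: List[str], min_overlap: int = 20) -> str:
    """Assemble ordered fragments left to right."""
    product = fragments[0]
    for frag in fragments[1:]:
        product = join_by_overlap(product, frag, min_overlap)
    return product


# Made-up short fragments; a 4-base overlap is used only to keep the demo readable.
parts = ["AAAACCCCGGGG", "GGGGTTTTACGT", "ACGTCATG"]
assembled = assemble(parts, min_overlap=4)
print(assembled)
```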


Golden Gate Assembly

• “Outside cutter” restriction enzymes

• Little or no scar at joining point

• Segments may or may not be produced by PCR


Recombination

• Site-specific
  – attB (Gateway)
  – Cre/lox

• Homologous
  – Natural (B. subtilis, RecA)
  – Engineered (lambda red)

• Directed by double-stranded break repair
  – Zn finger nucleases
  – TALENs
  – CRISPRs


DNA Synthesis to Genome Assembly

• Oligo synthesis (building blocks) using organic chemistry

• Assemble to genes using biochemistry (in vitro)

• Assemble to genomes (small ones for starters) using biology (in vivo)

• Each of these processes can carry its own error signature, but errors can also be counteracted by sequencing-based screening, post-repair, etc.


MAGE: Multiplex Automated Genome Engineering

Generation of genome edits at many targeted chromosomal locations

Much like site-directed mutagenesis, but on a chromosome

Wang, Isaacs, Carr et al. (2009) Nature 460(7257):894-8


MAGE

• A lot like site-directed mutagenesis, but on the genome of living cells

• Uses long oligos

• Does not require selection markers (but can use them)

• Other than the desired change (as small as a DNA base, as large as a multi-gene deletion) there is no obvious sign

• BUT there can be secondary signs:
  – Oligo-mediated defects within 50-100 bp of the edited site
  – Higher background mutation rates (mismatch repair deactivated)
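One way to look for the first secondary sign is to take a list of candidate edit positions and a list of observed variant positions (for example from a variant caller, which is assumed here and not specified in the slides) and flag unintended variants that fall close to an edit. The function name and the coordinates are hypothetical; the 100 bp default window follows the 50-100 bp range given above.

```python
from typing import Dict, List


def flag_nearby_variants(edit_sites: List[int], variants: List[int],
                         window: int = 100) -> Dict[int, List[int]]:
    """Map each intended edit position to unintended variants within +/- window bp."""
    intended = set(edit_sites)
    flagged: Dict[int, List[int]] = {}
    for site in edit_sites:
        flagged[site] = sorted(
            v for v in variants
            if v not in intended and abs(v - site) <= window
        )
    return flagged


# Made-up coordinates: two intended edits plus four other observed variants.
edits = [1000, 5000]
observed = [1000, 1040, 1200, 4950, 5000, 9000]
nearby = flag_nearby_variants(edits, observed)
print(nearby)
```

Variants far from any edit (1200 and 9000 in this toy input) are not flagged here; an excess of those across the genome would instead point to the second sign, an elevated background mutation rate.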


CAGE: Conjugative Assembly Genome Engineering

• Conjugation now employed with controlled precision

• But DNA crossover points not always perfectly defined

Isaacs, Carr, Wang, ... (2011) Science


Genetic Circuits: DNA Parts

• Make use of DNA “parts” libraries for constructing more advanced genetic designs

• Fundamental concept in synthetic biology, inspired by electrical engineering

• Basis of the iGEM competition (International Genetically Engineered Machine)


Genetic Circuits: Bacteria

• Repressilator: an early example of synthetic biology circuits

• Three inverters in series, connected in a ring, made a ring oscillator

Elowitz and Leibler (2000) Nature
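The ring-of-inverters idea can be illustrated with a toy simulation. The sketch below is a reduced, protein-only caricature of a three-repressor ring, not the model from the cited paper: each repressor is produced at a rate that falls as the previous one in the ring rises, and decays at unit rate. The parameter values (beta = 10, Hill coefficient n = 3), the initial conditions, and the forward-Euler step size are all illustrative assumptions.

```python
from typing import List, Tuple


def simulate_ring(beta: float = 10.0, n: float = 3.0, dt: float = 0.01,
                  steps: int = 5000,
                  x0: Tuple[float, float, float] = (1.0, 1.5, 2.0)
                  ) -> List[Tuple[float, float, float]]:
    """Forward-Euler integration of dx_i/dt = beta / (1 + x_j**n) - x_i.

    Repressor j = (i - 1) mod 3 inhibits production of repressor i,
    so the three inverters form a closed ring.
    """
    x = list(x0)
    traj = [tuple(x)]
    for _ in range(steps):
        dx = [beta / (1.0 + x[i - 1] ** n) - x[i] for i in range(3)]
        x = [x[i] + dt * dx[i] for i in range(3)]
        traj.append(tuple(x))
    return traj


def count_peaks(series: List[float]) -> int:
    """Count strict local maxima in a sampled time series."""
    return sum(1 for a, b, c in zip(series, series[1:], series[2:]) if a < b > c)


trajectory = simulate_ring()
# Look at the second half of the run, after the initial transient.
late = [point[0] for point in trajectory[len(trajectory) // 2:]]
print("peaks in second half:", count_peaks(late))
print("range in second half:", min(late), max(late))
```

With these assumed parameters the steady state is unstable, so the three levels keep cycling instead of settling; lowering beta or n far enough makes the oscillation die out.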


Genetic Circuits: Yeast

• Adapted a signaling system from plants

• Used to engineer communication between yeast cells

• Basic features can be installed in a variety of organisms

Chen and Weiss (2005) Nature Biotechnology


Genetic Circuits: Mammalian

Xie et al. (2011) Science (Weiss, Benenson labs)

[Figure: Genetic Circuits Overview. DNA for classifier circuit; cancer cell: match, cell death; normal cell: no match, no effect]

Concept: insert a DNA circuit into cells to identify cancer cells and/or kill them


Increasingly Alien

• Codon usage
  – Adapt how often codons are used to match target organism
  – New amino acids (Tirrell, Schultz)
  – New genetic codes (Church, Carr)

• Minimal life
  – Engineering by subtraction (Blattner)
  – Compose from the ground up (Forster/Church)

• New DNA bases
  – Alternate hydrogen-bonding (Benner)
  – Hydrophobic bases (Schultz)

• Mirror-image life
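Codon usage adaptation is easy to show in miniature: count how often each codon appears in a coding sequence, then swap selected codons for synonymous ones. In the sketch below the preference table is a hypothetical example chosen for illustration (leucine CTA/TTA to CTG, arginine AGG/AGA to CGT), not a measured usage table for any organism, and the short gene is made up. Skewed codon counts relative to a host's normal usage are also one possible signature of a recoded gene.

```python
from collections import Counter
from typing import Dict


def codon_counts(cds: str) -> Counter:
    """Count codons in an in-frame coding sequence."""
    cds = cds.upper()
    if len(cds) % 3:
        raise ValueError("coding sequence length is not a multiple of 3")
    return Counter(cds[i:i + 3] for i in range(0, len(cds), 3))


def recode(cds: str, preferred: Dict[str, str]) -> str:
    """Replace codons listed in `preferred` with their synonymous substitutes."""
    cds = cds.upper()
    return "".join(
        preferred.get(cds[i:i + 3], cds[i:i + 3]) for i in range(0, len(cds), 3)
    )


# Hypothetical preference table; every swap keeps the encoded amino acid.
PREFERRED = {"CTA": "CTG", "TTA": "CTG", "AGG": "CGT", "AGA": "CGT"}

gene = "ATGCTATTAAGGTAA"  # made-up CDS: Met-Leu-Leu-Arg-stop
recoded = recode(gene, PREFERRED)
print(codon_counts(gene))
print(recoded, codon_counts(recoded))
```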