From DNA to genomics: the rise of bioinformatics
Catherine Abbott
Monday 2 December, BioInfoSummer 2013
NB. Most images in this presentation are via Google Images.

From DNA to Genomics: The Rise of Bioinformatics - Catherine Abbott



Outline of talk

• Introduce bioinformatics
• Very, very basic introduction to
  – DNA and molecular biology
  – Genes
  – Genomes
• Human Genome Project
• Genomics
• The challenges and the future

What is Bioinformatics?

• Bioinformatics is the field of science in which biology meets informatics: computer science, information technology, mathematics, statistics and other sciences

Central paradigm of Molecular Biology

DNA → RNA → Protein → Phenotype

DNA bases: Guanine (G), Adenine (A), Thymine (T), Cytosine (C)

RNA bases: Guanine (G), Adenine (A), Uracil (U), Cytosine (C)

Protein: 20 amino acids, for example
G Glycine Gly
P Proline Pro
A Alanine Ala
V Valine Val
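As a rough illustration of the DNA to RNA step on this slide, here is a minimal Python sketch. It makes the simplifying assumption that the RNA copy reads like the DNA coding strand with thymine (T) replaced by uracil (U). The function name `transcribe` and the example sequence are invented for illustration and are not from the talk.

```python
# Minimal sketch (not from the slides): model transcription as copying the
# DNA coding strand and swapping thymine (T) for uracil (U).

DNA_BASES = {"G": "Guanine", "A": "Adenine", "T": "Thymine", "C": "Cytosine"}
RNA_BASES = {"G": "Guanine", "A": "Adenine", "U": "Uracil", "C": "Cytosine"}


def transcribe(dna: str) -> str:
    """Return the RNA version of a DNA coding-strand sequence."""
    dna = dna.upper()
    unknown = set(dna) - set(DNA_BASES)
    if unknown:
        raise ValueError(f"not DNA bases: {sorted(unknown)}")
    return dna.replace("T", "U")


if __name__ == "__main__":
    example = "ATGGGTCCTGCTGTTTAA"  # hypothetical sequence
    print(transcribe(example))
```

Real transcription works from the template strand and involves far more machinery; the sketch only shows how the two four-letter alphabets relate.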


What is a gene?

• The gene is the basic physical unit of inheritance

http://www.bbc.co.uk/schools/gcsebitesize/science/add_aqa_pre_2011/celldivision/celldivision1.shtml


DNA sequences: three bases and stop codons

http://www.genome.gov/EdKit/bio2b.html
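To make the idea of three-base codons and stop codons concrete, here is a minimal Python sketch. It assumes the standard genetic code, in which TAA, TAG and TGA are the stop codons (written in the DNA alphabet). The function names and the example sequence are hypothetical.

```python
# Minimal sketch: read a DNA sequence three bases at a time and stop at the
# first stop codon. Names and the example sequence are invented for illustration.
from __future__ import annotations

STOP_CODONS = {"TAA", "TAG", "TGA"}  # standard genetic code, DNA alphabet


def split_codons(dna: str) -> list[str]:
    """Split a sequence into complete three-base codons, dropping leftovers."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3
    return [dna[i:i + 3] for i in range(0, usable, 3)]


def codons_until_stop(dna: str) -> list[str]:
    """Return the codons up to, but not including, the first stop codon."""
    coding = []
    for codon in split_codons(dna):
        if codon in STOP_CODONS:
            break
        coding.append(codon)
    return coding


if __name__ == "__main__":
    print(codons_until_stop("ATGGGTCCTTAAGCT"))
```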

Reading frames

http://www.genome.gov/EdKit/bio2e.html
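Because codons are three bases long, the same strand can be read in three different ways depending on where reading starts. The following Python sketch, with a hypothetical function name and example sequence, lists the codons for each of the three forward frames.

```python
# Minimal sketch: the same strand read in its three forward reading frames.
# The function name and the example sequence are invented for illustration.
from __future__ import annotations


def forward_frames(dna: str) -> list[list[str]]:
    """Return the complete codons for frame offsets 0, 1 and 2."""
    dna = dna.upper()
    frames = []
    for offset in range(3):
        # Stop before any trailing bases that do not fill a whole codon.
        end = len(dna) - (len(dna) - offset) % 3
        frames.append([dna[i:i + 3] for i in range(offset, end, 3)])
    return frames


if __name__ == "__main__":
    for offset, codons in enumerate(forward_frames("ATGGGTCC")):
        print(offset, codons)
```

The complementary strand, read in the opposite direction, gives three more frames, for six in total; the sketch covers only the forward three.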

Exons and Introns

http://www.genome.gov/EdKit/bio2i.html


1977: Sanger Sequencing

• Used chemically altered "dideoxy" bases to terminate newly synthesized DNA fragments at specific bases (either A, C, T, or G).

• Sanger was awarded two Nobel Prizes, in 1958 and 1980 (the latter shared with Gilbert and Berg)

Evolution of Sequencing Technology

1865 : Mendel shows inheritance in peas
1953 : Watson and Crick structure of DNA
1977 : Era of sequencing begins
1980 : Shotgun sequencing coined
1982 : GenBank founded
1983 : Kary Mullis and colleagues develop PCR
1986 : First commercial ABI sequencer launched
1990 : BLAST algorithm developed at NCBI
1991 : EST strategy developed

1995 : Cycle sequencing by Amersham; Applied Biosystems releases capillary electrophoresis system Prism 310; output 5000 bases per day
1997 : MegaBACE 1000 capillaries; output 250,000-500,000 bases per day
1998 : Pyrosequencing developed
2001 : Draft human sequence
2005 : Launch of Genome Sequencer 20 System by 454 Life Sciences based on pyrosequencing technology; output 20 million bases per run

Filho JS, Breast Cancer Research 2009


Traditional Sequencing vs 454 Technology

GenBank:

Nucleic Acids Res. 2011 Jan;39(Database issue):D32-7.


What is Genomics?

• An organism's complete set of DNA is called its genome

• Genomics is the study of the entire genome of an organism

• It involves investigations into the structure and function of very large numbers of genes, undertaken simultaneously.

The Race

20th July 1969

26th June 2000

President Clinton 26th June 2000

• “We are here to celebrate the completion of the first survey of the entire human genome. Without a doubt, this is the most important, most wondrous map ever produced by humankind.”


Prime Minister Blair 26 June 2000

• “…a revolution in medical science whose implications far surpass even the discovery of antibiotics… And every so often in the history of human endeavor there comes a breakthrough that takes humankind across a frontier and into a new era. …a breakthrough that opens the way for massive advances in the treatment of cancer and hereditary diseases, and that is only the beginning.”

February 2001

$2.7 billion US

$300 million US

Cost of private effort: 13 years ago

• 300 machines running night and day for over a year
• $30,000,000 to buy
• $2 M a month in electricity
• $4 M a month in chemicals
• Fits on 5 CDs


Human Genome Project

• The biggest bioinformatics project of its time
• So what have we learned so far?
  – 3.2 billion bases in the human genome
  – Just over 20,000 protein-coding genes
  – Humans vary at 1 in 1000 bp
    • 3.2 million differences between non-relatives
    • Almost as much information as in the entire genome of E. coli (4.6 million bases)
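The arithmetic behind these figures can be checked in a few lines of Python. All three input numbers come from the slide; the variable names are invented for illustration.

```python
# Minimal sketch: reproduce the slide's arithmetic on human variation.
GENOME_SIZE_BP = 3_200_000_000  # 3.2 billion bases (from the slide)
BP_PER_DIFFERENCE = 1000        # humans vary at about 1 in 1000 bp (from the slide)
ECOLI_GENOME_BP = 4_600_000     # 4.6 million bases (from the slide)

# Expected number of differences between two unrelated people.
differences = GENOME_SIZE_BP // BP_PER_DIFFERENCE

# How that count compares with the size of the E. coli genome.
ratio_to_ecoli = differences / ECOLI_GENOME_BP

print(f"{differences:,} differences")
print(f"{ratio_to_ecoli:.0%} of the E. coli genome size")
```

The differences come to roughly 70% of the E. coli base count, which is the sense in which the slide calls them "almost as much information".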


Completed Human Genomes

• Craig Venter, 2001-2003
• James D Watson, 2007
• Bishop Desmond Tutu, 2010

James D. Watson

• June 2007
• 454 Sequencer
• Took 4 months
• Cost <$1 million

Richard Carson/Reuters


19 August 2011

• Baylor College of Medicine Human Genome Sequencing Center and the AGRF in Melbourne, Australia
• WGS and Sanger sequencing
• 2x coverage
• 5.9x coverage on ABI SOLiD
• 2,574 megabases

Renfree et al. Genome Biology 2011, 12:R81


Complete Genomes, Nov 2010

• http://www.ncbi.nlm.nih.gov/Genomes/
• There are now over 1000 complete prokaryotic genomes available in Entrez Genome
• All three main domains of life (bacteria, archaea and eukaryotes) are represented, as well as many viruses and organelles
• The genomes of humans, mice, rats, worms and flies have been completed

http://www.ncbi.nlm.nih.gov/PMGifs/Genomes/org.html

http://www.1000genomes.org/about

http://www.icgc.org/

DIY genomics

Summary and Challenges Ahead

• DNA sequencing is becoming faster and cheaper at a pace far outstripping Moore’s law (the rate at which computing gets faster and cheaper).

• The ability to determine DNA sequences is starting to outrun the ability of researchers to store, transmit and especially to analyze the data.

http://infoproc.blogspot.com/2011/11/dna-data-deluge.html


• Data handling is now the bottleneck

• It costs more to analyze a genome than to sequence a genome.

• The cost of sequencing a human genome — all three billion bases of DNA in a set of human chromosomes — plunged to $10,500 last July from $8.9 million in July 2007
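To put the cost figures next to Moore's law, here is a rough Python sketch that estimates how often the sequencing cost halved. Two inputs are assumptions rather than facts from the slide: "last July" is taken to mean July 2011 (the blog post cited earlier in this summary is dated November 2011), and Moore's law is read as a halving of cost every 24 months. The function name is invented for illustration.

```python
# Minimal sketch: estimate the halving time of sequencing cost.
import math

COST_START_USD = 8_900_000  # July 2007 (from the slide)
COST_END_USD = 10_500       # "last July" (from the slide)
MONTHS_ELAPSED = 48         # ASSUMPTION: "last July" means July 2011
MOORE_HALVING_MONTHS = 24   # ASSUMPTION: one common reading of Moore's law


def halving_time_months(start_cost: float, end_cost: float, months: float) -> float:
    """Months per halving, assuming a steady exponential decline."""
    halvings = math.log2(start_cost / end_cost)
    return months / halvings


sequencing_halving = halving_time_months(COST_START_USD, COST_END_USD, MONTHS_ELAPSED)
print(f"Sequencing cost halved about every {sequencing_halving:.1f} months")
print(f"Moore's law (assumed): every {MOORE_HALVING_MONTHS} months")
```

Under these assumptions the cost halves roughly every five months, several times faster than the assumed Moore's law pace. A different reading of "last July" would change the exact figure but not the overall picture.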


• Storage of and access to data cause issues
  – Not all data is in GenBank or in a format that can be easily accessed
• Demand from non-scientists for tools to visualize, understand and interpret their own genomic data

http://www.missionmassimo.com/?page_id=8

Personalized Medicine: the future

BioInfoSummer 2013 program

• Monday: Background to Biology and Statistics
• Tuesday: Evolution Biology
• Wednesday: Systems Biology
• Thursday: Next Generation Sequencing (NGS)
• Friday: Programming for Bioinformatics

• Thank you!