Gene Expression and Microarrays

Garrett M. Dancik, Ph.D.

Note: All images from slides 3-6 on are from Campbell Biology, 9th edition, © 2011 Pearson Education, Inc.


Overview of gene expression

• A gene is a unit of heredity (DNA) that makes a functional RNA or protein

• The human genome is 3 billion characters long

• The human genome contains ~25,000 genes

• Central Dogma of Molecular Biology: DNA (4-character alphabet: T, A, G, C) → transcription → RNA → translation → protein (20-character alphabet)


[Figure: gene expression in the cell. (1) Synthesis of mRNA from DNA in the nucleus. (2) Movement of mRNA into the cytoplasm. (3) Synthesis of protein: a ribosome assembles amino acids into a polypeptide.]

Overview of gene expression: DNA → RNA → Protein

• Genes are made of DNA, a nucleic acid made of monomers called nucleotides

• A gene is a unit of inheritance that codes for the amino acid sequence of a polypeptide


[Figure 5.26: (a) Polynucleotide, or nucleic acid, with a sugar-phosphate backbone running from the 5′ end to the 3′ end. (b) Nucleotide: a phosphate group, a sugar (pentose), and a nitrogenous base; the sugar and base together form a nucleoside. (c) Nucleoside components: nitrogenous bases, namely the pyrimidines cytosine (C), thymine (T, in DNA), and uracil (U, in RNA) and the purines adenine (A) and guanine (G); sugars, namely deoxyribose (in DNA) and ribose (in RNA).]

Nucleic Acids are made up of nucleotides

In DNA, the sugar is deoxyribose; in RNA, the sugar is ribose

Components of a nucleotide


[Figure 5.27: (a) DNA. Two sugar-phosphate backbones running in opposite directions (5′ to 3′ and 3′ to 5′), with base pairs joined by hydrogen bonding.]

• Complementary base pairing

– The nitrogenous bases in DNA pair up and form hydrogen bonds: adenine (A) always with thymine (T), and guanine (G) always with cytosine (C)

– Complementary pairing can also occur between two RNA molecules or between parts of the same molecule

• In RNA, thymine is replaced by uracil (U) so A and U pair
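The pairing rules above can be sketched in a few lines of Python. This is an illustration only; the function names and the example sequences are assumptions made here, not part of the slides.

```python
# Pairing rules from the bullets above: A-T and G-C in DNA, A-U and G-C in RNA.
DNA_PAIRS = {"A": "T", "T": "A", "G": "C", "C": "G"}
RNA_PAIRS = {"A": "U", "U": "A", "G": "C", "C": "G"}


def complement(seq, pairs=DNA_PAIRS):
    """Return the base-by-base partner of seq, in the same left-to-right order."""
    return "".join(pairs[base] for base in seq)


def reverse_complement(seq, pairs=DNA_PAIRS):
    """Return the partner strand written in its own 5'->3' direction."""
    return complement(seq, pairs)[::-1]


print(complement("ACTGA"))             # TGACT
print(reverse_complement("ACTGA"))     # TCAGT
print(complement("ACUGA", RNA_PAIRS))  # UGACU
```

Because the two strands run in opposite directions, the partner strand is usually reported as the reverse complement when both strands are written 5′ to 3′.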


[Figure: the triplet code. A DNA molecule contains Gene 1, Gene 2, and Gene 3. The DNA template strand of one gene is read 3′ to 5′ during TRANSCRIPTION to produce mRNA (5′ to 3′); during TRANSLATION each mRNA codon specifies one amino acid of the protein (Trp, Phe, Gly, Ser).]

• The genetic code is a triplet code where a 3-nucleotide DNA word codes for a 3-nucleotide mRNA word (a codon) which codes for an amino acid
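As a rough sketch of the triplet code, the snippet below transcribes a short template strand and translates the resulting codons. The 12-base template is an assumption chosen so that the result is Trp-Phe-Gly-Ser; the codon table holds only the four codons needed here (the full genetic code has 64), and the function names are made up for this example.

```python
# Partial codon table: only the four codons used in this example.
CODON_TABLE = {"UGG": "Trp", "UUU": "Phe", "GGC": "Gly", "UCA": "Ser"}

# Each DNA template base pairs with one RNA base (A-U, T-A, G-C, C-G).
DNA_TO_RNA = {"A": "U", "T": "A", "G": "C", "C": "G"}


def transcribe(template_3_to_5):
    """Build the mRNA (5'->3') that pairs with a template strand written 3'->5'."""
    return "".join(DNA_TO_RNA[base] for base in template_3_to_5)


def translate(mrna):
    """Read the mRNA three bases at a time and look up each complete codon."""
    return [CODON_TABLE[mrna[i:i + 3]] for i in range(0, len(mrna) - 2, 3)]


template = "ACCAAACCGAGT"  # assumed template strand, written 3'->5'
mrna = transcribe(template)
print(mrna)             # UGGUUUGGCUCA
print(translate(mrna))  # ['Trp', 'Phe', 'Gly', 'Ser']
```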


Mutations of one or a few nucleotides can affect protein structure and function

• Mutations are changes in the genetic material of a cell or virus

• Point mutations are chemical changes in just one base pair of a gene

– May or may not change the protein

• Insertions/deletions may cause frameshift mutations that have a disastrous effect on the protein
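The difference between these mutation types can be seen on a toy mRNA. The sequence below is made up for illustration, and the codon table lists only the codons this example touches; it is a sketch, not a full translation routine.

```python
# Partial codon table covering only the codons that appear in this example.
CODON_TABLE = {
    "GAA": "Glu", "GAG": "Glu", "GUA": "Val",
    "UUU": "Phe", "GGC": "Gly", "AAU": "Asn", "UUG": "Leu",
}


def translate(mrna):
    """Translate complete codons only; trailing bases are ignored."""
    return [CODON_TABLE[mrna[i:i + 3]] for i in range(0, len(mrna) - 2, 3)]


original = "GAAUUUGGC"    # hypothetical mRNA
silent = "GAGUUUGGC"      # point mutation A->G at position 3: same amino acid
missense = "GUAUUUGGC"    # point mutation A->U at position 2: Glu becomes Val
frameshift = "AAUUUGGC"   # first base deleted: every later codon is shifted

print(translate(original))    # ['Glu', 'Phe', 'Gly']
print(translate(silent))      # ['Glu', 'Phe', 'Gly']
print(translate(missense))    # ['Val', 'Phe', 'Gly']
print(translate(frameshift))  # ['Asn', 'Leu']
```

A point mutation changes at most one amino acid, while the single-base deletion changes every codon downstream of it.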


Sickle-Cell Disease: A Change in Primary Structure

• A slight change in the amino acid sequence (primary structure) can affect a protein’s structure and ability to function

– What causes a change in the primary structure?

• Sickle-cell disease, an inherited blood disorder, results from a single amino acid substitution in the protein hemoglobin


[Figure: Wild-type hemoglobin DNA → mRNA → normal hemoglobin (Glu). Mutant hemoglobin DNA → mRNA → sickle-cell hemoglobin (Val).]

Point mutation that causes sickle cell disease


[Figure 5.21: Normal vs. sickle-cell hemoglobin, compared by primary structure (amino acids 1-7), secondary and tertiary structures (β subunit), quaternary structure (two α and two β subunits), function, and red blood cell shape (scale bars: 10 µm). Normal hemoglobin: molecules do not associate with one another; each carries oxygen. Sickle-cell hemoglobin: an exposed hydrophobic region; molecules crystallize into a fiber; capacity to carry oxygen is reduced.]


----ACTGA----

----GAGAT----

Probe 1: TGACT

Probe 2: CTCTA

…

Probe 20000: TTTAG

----ACTGA----
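One way to read the probe example above: each probe captures the fragment that is its base-by-base complement, so counting captured fragments per probe gives a crude expression measurement. The sketch below follows the slide's convention of aligning probe and fragment position by position (strand direction is ignored); the function names and the sample list are assumptions for illustration.

```python
PAIRS = {"A": "T", "T": "A", "G": "C", "C": "G"}

# Probes named on the slide.
PROBES = {"Probe 1": "TGACT", "Probe 2": "CTCTA", "Probe 20000": "TTTAG"}


def binding_probes(fragment, probes=PROBES):
    """Return the names of probes whose complement occurs in the fragment."""
    hits = []
    for name, probe in probes.items():
        target = "".join(PAIRS[base] for base in probe)
        if target in fragment:
            hits.append(name)
    return hits


def probe_counts(fragments, probes=PROBES):
    """Count how many fragments each probe captures."""
    counts = {name: 0 for name in probes}
    for fragment in fragments:
        for name in binding_probes(fragment, probes):
            counts[name] += 1
    return counts


sample = ["ACTGA", "GAGAT", "ACTGA"]  # the three fragments shown on the slide
print(probe_counts(sample))
# {'Probe 1': 2, 'Probe 2': 1, 'Probe 20000': 0}
```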


Biomarkers and personalized medicine

[Figure: gene expression profiles (genes × samples) for two groups, A and B.]

Possible comparisons:

• Tumor vs. Normal

• High risk vs. Low risk

• Responder vs. Non-responder

Biomarker identification (gene or gene signature)

Diagnostic: predictive of a clinical variable

Prognostic: predictive of disease outcome

Predictive: predictive of therapeutic response

• Bioinformatics challenges

– Identification of genes or gene signature

– Choice of classification method or gene model
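A very simple way to approach gene identification is to rank genes by how much their average expression differs between two groups. The sketch below does this for a tumor vs. normal comparison; the gene names and log2 expression values are entirely hypothetical, and a real analysis would use a proper statistical test and many more samples.

```python
from statistics import mean

# Hypothetical log2 expression values, made up for illustration only.
expression = {
    "GENE_A": {"tumor": [8.0, 8.5, 9.0], "normal": [5.0, 5.5, 6.0]},
    "GENE_B": {"tumor": [6.0, 6.5, 5.5], "normal": [6.0, 6.0, 6.0]},
    "GENE_C": {"tumor": [3.0, 3.5, 4.0], "normal": [5.0, 5.5, 6.0]},
}


def log_fold_change(values):
    """Difference in mean log2 expression: tumor minus normal."""
    return mean(values["tumor"]) - mean(values["normal"])


# Rank genes by the size of the difference, largest first.
ranked = sorted(
    expression,
    key=lambda gene: abs(log_fold_change(expression[gene])),
    reverse=True,
)

for gene in ranked:
    print(gene, log_fold_change(expression[gene]))
# GENE_A 3.0
# GENE_C -2.0
# GENE_B 0.0
```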


Microarrays in more detail

http://www.oceanridgebio.com/images/system_rev_630.jpg


Microarray Analysis

• Analysis will be performed using several Bioconductor packages (http://bioconductor.org)

• Data is available from the Gene Expression Omnibus (GEO; http://www.ncbi.nlm.nih.gov/geo/)

– We will look at how to download raw and processed data from GEO


Gene Expression Omnibus (GEO)

• GEO (http://www.ncbi.nlm.nih.gov/geo/) is a public functional genomics data repository for gene expression (microarray) and sequence-based data.

• There are four kinds of records on GEO (http://www.ncbi.nlm.nih.gov/geo/info/overview.html)


Gene Expression Omnibus (GEO)

• A GEO sample (GSM*) describes an individual sample, including the experimental conditions in which it was collected, and the gene expression value for each element on the array.

• A GEO platform (GPL*) is a summary of the array used, and links the array probes to genes

• A GEO series (GSE*) links together a collection of samples with one or more platforms for a particular experiment or study (such as profiling gene expression from 100 patients with lung cancer)

• A GEO dataset (GDS*) is a curated collection of samples that allows for user-friendly analysis. Not all series exist as datasets.
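Since each record type has its own accession prefix, the type of a GEO record can be read off its accession. The helper below is a small sketch of that idea; the function name is an assumption, and the accession numbers in the usage lines are made-up placeholders, not references to real records.

```python
import re

# Accession prefixes for the four kinds of GEO records.
RECORD_TYPES = {
    "GSM": "sample",
    "GPL": "platform",
    "GSE": "series",
    "GDS": "dataset",
}


def geo_record_type(accession):
    """Return the GEO record type for an accession such as 'GSE' plus digits."""
    match = re.fullmatch(r"(GSM|GPL|GSE|GDS)(\d+)", accession.strip().upper())
    if match is None:
        raise ValueError(f"not a recognized GEO accession: {accession!r}")
    return RECORD_TYPES[match.group(1)]


# Placeholder accession numbers, for illustration only.
print(geo_record_type("GSM100"))  # sample
print(geo_record_type("GPL200"))  # platform
print(geo_record_type("GSE300"))  # series
print(geo_record_type("GDS400"))  # dataset
```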