Fatchiyah, PhD Dept Biology UB Fatchiyah.lecture.ub.ac.id


Why do we care about genetic variations?

1. Genetic variations underlie phenotypic differences among different individuals

2. Genetic variations determine our predisposition to complex diseases and responses to drugs and environmental factors

3. Genetic variations reveal clues of ancestral human migration history


Main Types of Genetic Variations

A. Single nucleotide mutation
   - Results in single nucleotide polymorphisms (SNPs)
   - Accounts for up to 90% of human genetic variations
   - The majority of SNPs do NOT directly or significantly contribute to any phenotype

B. Insertion or deletion of one or more nucleotide(s)
   1. Tandem repeat polymorphisms
      - Tandem repeats are genomic regions consisting of sequence motifs of variable length repeating in tandem with variable copy number.
      - Used as genetic markers for DNA fingerprinting (forensics, parentage testing)
      - Many cause genetic diseases
      - Microsatellites (Short Tandem Repeats): repeat unit 1-6 bases long
      - Minisatellites: repeat unit 11-100 bases long
   2. Insertion/Deletion (INDEL or DIP) polymorphisms
      - Often result from localized rearrangements between homologous tandem repeats.

C. Gross chromosomal aberration
   - Deletions, inversions, or translocations of large DNA fragments
   - Rare, but often cause serious genetic diseases


How many variations are present in the human genome?

SNPs appear once per 0.1-1 kb interval, or on average 1 per 300 bp. Given the size of the entire human genome (3.2 × 10^9 bp), the total number of SNPs is on the order of 11 million or more. Their high density and relatively easy assay make SNPs ideal genomic markers.
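The estimate above is simple division, and a short sketch makes the range explicit. The function name is illustrative; the genome size and spacings are the ones quoted on the slide. With the average spacing of 300 bp the result is close to the 11 million figure, while the 0.1-1 kb range gives a much wider bracket.

```python
# Back-of-the-envelope estimate of the number of SNPs in the human genome.
GENOME_SIZE_BP = 3.2e9  # genome size quoted on the slide


def expected_snp_count(genome_size_bp: float, bp_per_snp: float) -> float:
    """Return the genome size divided by the average spacing between SNPs."""
    if bp_per_snp <= 0:
        raise ValueError("bp_per_snp must be positive")
    return genome_size_bp / bp_per_snp


if __name__ == "__main__":
    # 0.1 kb, the 300 bp average, and 1 kb
    for spacing in (100, 300, 1000):
        count = expected_snp_count(GENOME_SIZE_BP, spacing)
        print(f"1 SNP per {spacing} bp: about {count / 1e6:.1f} million SNPs")
```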

In silico estimates of potentially polymorphic variable number tandem repeats (VNTRs) exceed 100,000 across the human genome.

Short insertions/deletions are very difficult to quantify; their number is likely to fall between those of SNPs and VNTRs.


Types of Single Base Substitutions

Transitions: change of one purine (A, G) for another purine, or of one pyrimidine (C, T) for another pyrimidine.

Transversions: change of a purine (A, G) for a pyrimidine (C, T), or vice versa.

The cytosine to thymine (C>T) transition accounts for approximately 2 out of every 3 SNPs in human genome.
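The two definitions translate directly into a small classifier. This is a minimal sketch; the function name `substitution_type` is illustrative and not taken from any library.

```python
PURINES = frozenset("AG")
PYRIMIDINES = frozenset("CT")


def substitution_type(ref: str, alt: str) -> str:
    """Classify a single-base substitution as a transition or a transversion."""
    ref, alt = ref.upper(), alt.upper()
    valid = PURINES | PYRIMIDINES
    if ref not in valid or alt not in valid:
        raise ValueError("bases must each be one of A, C, G, T")
    if ref == alt:
        raise ValueError("reference and alternate base are identical")
    # Same chemical class on both sides means a transition.
    same_class = {ref, alt} <= PURINES or {ref, alt} <= PYRIMIDINES
    return "transition" if same_class else "transversion"


if __name__ == "__main__":
    for ref, alt in [("C", "T"), ("A", "G"), ("A", "C"), ("G", "T")]:
        print(f"{ref}>{alt}: {substitution_type(ref, alt)}")
```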


SNP or Mutation?

Call it a SNP IF the single base change occurs in a population at a frequency of 1% or higher.

Call it a mutation IF the single base change occurs in less than 1% of a population.

A SNP is a polymorphic position where the point mutation has been fixed in the population.
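The 1% rule can be sketched as a one-line comparison on the observed allele frequency. The function name and the allele counts in the usage example are hypothetical; only the 1% cut-off comes from the slide.

```python
SNP_FREQUENCY_THRESHOLD = 0.01  # the 1% cut-off stated on the slide


def classify_variant(variant_allele_count: int, total_alleles: int) -> str:
    """Label an observed single-base change as 'SNP' or 'mutation' by frequency."""
    if total_alleles <= 0:
        raise ValueError("total_alleles must be positive")
    if not 0 < variant_allele_count <= total_alleles:
        raise ValueError("variant_allele_count must be between 1 and total_alleles")
    frequency = variant_allele_count / total_alleles
    return "SNP" if frequency >= SNP_FREQUENCY_THRESHOLD else "mutation"


if __name__ == "__main__":
    # Hypothetical sample of 1,000 diploid individuals (2,000 alleles).
    print(classify_variant(30, 2000))  # 1.5% of alleles carry the change
    print(classify_variant(5, 2000))   # 0.25% of alleles carry the change
```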


From a Mutation to a SNP


SNPs Classification

SNPs can occur anywhere in a genome; they are classified based on their locations.

- Intergenic region
- Gene region: can be further classified as promoter region and coding region (exonic, intronic, UTR, etc.)


Coding Region SNPs

- Synonymous
- Non-synonymous
  - Missense: amino acid change
  - Nonsense: changes an amino acid to a stop codon

Figure credit: Geospiza Green Arrow™ tutorial by Sandra Porter, Ph.D.
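The coding-region categories on this slide can be illustrated by translating the reference and variant codons and comparing the results. This is a minimal sketch assuming the standard genetic code; the function name is illustrative, and the "stop-loss" label for a change away from a stop codon is an addition that the slide does not cover.

```python
from itertools import product

# Standard genetic code, with codons enumerated in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify_coding_snp(ref_codon: str, alt_codon: str) -> str:
    """Classify a single-base codon change by its effect on the protein."""
    ref_codon, alt_codon = ref_codon.upper(), alt_codon.upper()
    if ref_codon not in CODON_TABLE or alt_codon not in CODON_TABLE:
        raise ValueError("codons must be three DNA bases (A, C, G, T)")
    if sum(a != b for a, b in zip(ref_codon, alt_codon)) != 1:
        raise ValueError("codons must differ at exactly one position")
    ref_aa, alt_aa = CODON_TABLE[ref_codon], CODON_TABLE[alt_codon]
    if ref_aa == alt_aa:
        return "synonymous"
    if ref_aa == "*":
        return "stop-loss"  # not covered on the slide
    if alt_aa == "*":
        return "nonsense"
    return "missense"


if __name__ == "__main__":
    print(classify_coding_snp("GAA", "GAG"))  # Glu -> Glu
    print(classify_coding_snp("GAG", "GTG"))  # Glu -> Val
    print(classify_coding_snp("TGG", "TGA"))  # Trp -> stop
```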


The Consequences of SNPs

The phenotypic consequence of a SNP is significantly affected by the location where it occurs, as well as by the nature of the mutation.

- No consequence
- Affect gene transcription quantitatively or qualitatively.
- Affect gene translation quantitatively or qualitatively.
- Change protein structure and functions.
- Change gene regulation at different steps.


Simple/Complex Genetic Diseases and SNPs

- Simple genetic diseases (Mendelian diseases) are often caused by mutations in a single gene, e.g. Huntington’s disease, cystic fibrosis, PKU.
- Many complex diseases result from mutations in multiple genes, the interactions among them, and their interactions with environmental factors, e.g. cancers, heart disease, Alzheimer's, diabetes, asthma.
- The majority of SNPs may not directly cause any disease.
- SNPs are ideal genomic markers (dense and easy to assay) for locating disease loci in association studies.
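One common way to use a SNP as a marker in an association study is an allelic test on a 2x2 table of allele counts in cases versus controls. The slides do not name a specific test, so the sketch below is an assumption: a Pearson chi-square test without continuity correction (1 degree of freedom) plus an odds ratio. The function name and all counts are hypothetical.

```python
import math


def allelic_association(case_variant: int, case_reference: int,
                        control_variant: int, control_reference: int):
    """Return (chi_square, p_value, odds_ratio) for a 2x2 allele-count table."""
    a, b, c, d = case_variant, case_reference, control_variant, control_reference
    n = a + b + c + d
    margins = (a + b) * (c + d) * (a + c) * (b + d)
    if margins == 0:
        raise ValueError("every row and column of the table needs a non-zero total")
    chi_square = n * (a * d - b * c) ** 2 / margins
    # Survival function of the chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi_square / 2))
    odds_ratio = (a * d) / (b * c) if b * c else math.inf
    return chi_square, p_value, odds_ratio


if __name__ == "__main__":
    # Hypothetical counts: 200 cases and 200 controls, two alleles each.
    chi2, p, odds = allelic_association(120, 280, 80, 320)
    print(f"chi-square = {chi2:.2f}, p = {p:.4f}, odds ratio = {odds:.2f}")
```

A small p-value only says that the marker allele frequency differs between the groups; as the slide notes, the SNP itself may not be the causal variant.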


Main Genetic Variation Resources

- NCBI dbSNP: http://www.ncbi.nlm.nih.gov/SNP/index.html
- NCBI Online Mendelian Inheritance in Man (OMIM): http://www.ncbi.nlm.nih.gov/sites/entrez?db=OMIM
- International HapMap Project: http://www.hapmap.org/
- Perlegen: http://genome.perlegen.com
- Genome Variation Server (Seattle SNPs): http://gvs.gs.washington.edu/GVS/


Where to Find Bioinformatics Resources for Genetic Variation Studies?

- OBRC: Online Bioinformatics Resources Collection (Univ. of Pittsburgh): http://www.hsls.pitt.edu/guides/genetics/obrc
  The most comprehensive annotated collection of bioinformatics databases and software tools on the Web, with over 200 resources relevant to genetic variation studies.
- HUGO Mutation Database Initiative: http://www.hgvs.org/dblist/dblist.html


NCBI dbSNP Database: Overview

URL: http://www.ncbi.nlm.nih.gov/SNP/index.html

The NCBI’s Single Nucleotide Polymorphism database (dbSNP) is the largest and primary public-domain archive for simple genetic variation data.

The polymorphism data in dbSNP include:
- Single-base nucleotide substitutions (SNPs)
- Small-scale multi-base deletion or insertion variations (also called deletion insertion polymorphisms, DIPs, or INDELs)
- Microsatellite tandem repeat variations (also called short tandem repeats or STRs)


dbSNP Data Stats (build 128, Oct 2007)
http://www.ncbi.nlm.nih.gov/SNP/snp_summary.cgi


dbSNP Data Types

dbSNP contains two classes of records:

- Submitted record: the original observations of sequence variation. Submitted SNP (SS) records start with "ss" (e.g. ss5586300).
- Computationally annotated record: generated during the dbSNP "build" cycle by computation based on the original submitted data. Reference SNP clusters (RefSNP) start with "rs" (e.g. rs4986582).
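Because the record class is encoded in the identifier prefix, the two classes can be told apart with a simple pattern match. A minimal sketch; the function name and the label strings are illustrative.

```python
import re

_ID_PATTERN = re.compile(r"^(ss|rs)(\d+)$", re.IGNORECASE)
RECORD_CLASSES = {
    "ss": "submitted record",
    "rs": "reference SNP cluster",
}


def parse_dbsnp_id(identifier: str) -> tuple[str, int]:
    """Split a dbSNP identifier into its record class and numeric part."""
    match = _ID_PATTERN.match(identifier.strip())
    if match is None:
        raise ValueError(f"not an ss/rs identifier: {identifier!r}")
    prefix, number = match.group(1).lower(), int(match.group(2))
    return RECORD_CLASSES[prefix], number


if __name__ == "__main__":
    for example in ("ss5586300", "rs4986582"):  # the two IDs shown on the slide
        print(example, parse_dbsnp_id(example))
```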


dbSNP Submitted Record

- Provides information on the SNP and the conditions under which it was collected.
- Provides links to collection methods (assay technique), submitter information (contact data, individual submitter), and variation data (frequencies, genotypes).

Example: ss5586300


From Submitted Record to Reference SNP Cluster

1. SNP records are submitted by researchers.
2. The SNP position is mapped to the reference genomic contigs.
3. The mapped position decides the cluster:
   - If the SNP position is unique, a new RS# is assigned.
   - If the SNP position is not unique, the record is assigned to the existing RefSNP cluster.
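The flow from submissions to clusters can be sketched as grouping by mapped position. This is a toy illustration of the idea only, not dbSNP's actual build pipeline: the function name, the ss identifiers, the positions, and the sequential rs numbering are all made up for the example.

```python
from itertools import count


def cluster_submissions(submissions):
    """Group (ss_id, chromosome, position) submissions into rs clusters.

    A new cluster is opened the first time a position is seen; later
    submissions at the same position join the existing cluster.
    """
    next_rs_number = count(1)  # hypothetical numbering scheme
    position_to_rs = {}
    clusters = {}
    for ss_id, chromosome, position in submissions:
        key = (chromosome, position)
        if key not in position_to_rs:
            rs_id = f"rs{next(next_rs_number)}"
            position_to_rs[key] = rs_id
            clusters[rs_id] = []
        clusters[position_to_rs[key]].append(ss_id)
    return clusters


if __name__ == "__main__":
    toy_submissions = [
        ("ss101", "7", 117_559_590),
        ("ss102", "4", 3_074_877),
        ("ss103", "7", 117_559_590),  # same position as ss101
    ]
    print(cluster_submissions(toy_submissions))
```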


Different Ways to Search SNPs in dbSNP

- dbSNP Web site (http://www.ncbi.nlm.nih.gov/SNP/index.html): direct search of SS records; batch search; allows SNP record submission; NO search limits
- Entrez SNP (http://www.ncbi.nlm.nih.gov/sites/entrez?db=Snp): search limit options allow precise retrieval
- Entrez Gene record’s SNP Links Out feature: direct links to corresponding SNP records; access to genotype and linkage disequilibrium data
- NCBI’s MapViewer: visualizes SNPs in the genomic context along with other types of genetic data


Search SNPs from dbSNP Web Page

dbSNP Web site: http://www.ncbi.nlm.nih.gov/SNP/index.html


Search SNPs from Entrez SNP Web Page

Entrez SNP: http://www.ncbi.nlm.nih.gov/sites/entrez?db=Snp

dbSNP is part of the Entrez integrated information retrieval system and may be searched using either qualifiers (aliases) or a combination of search limits from 14 different categories.
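Combining search limits amounts to joining `value[field]` qualifiers with AND. The sketch below only builds a query URL and makes no network call. It targets NCBI's E-utilities `esearch` endpoint rather than the web page shown on the slide, which is a substitution made here for illustration. The field names in the usage example are placeholders, not verified Entrez SNP qualifiers; check them against the Entrez SNP help pages before use.

```python
from urllib.parse import parse_qs, urlencode, urlparse

ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_snp_query(limits: dict[str, str], retmax: int = 20) -> str:
    """Build an esearch URL that ANDs together value[field] qualifiers."""
    if not limits:
        raise ValueError("at least one search limit is required")
    term = " AND ".join(f"{value}[{field}]" for field, value in limits.items())
    params = {"db": "snp", "term": term, "retmax": retmax}
    return f"{ESEARCH_URL}?{urlencode(params)}"


if __name__ == "__main__":
    # Placeholder field names: assumptions for illustration only.
    url = build_snp_query({"Organism": "human", "Chromosome": "7"})
    print(url)
    print(parse_qs(urlparse(url).query)["term"][0])
```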


Entrez SNP Search Limits

- Organisms
- Chromosome (including W and Z for non-mammals)
- Chromosome Ranges
- Map Weight (how many times in genome)
- Function Class (coding non-synonymous; intron; etc.)
- SNP Class (types of variations)
- Method Class (methods for determining the variations)
- Validation Status (if and how the data is validated)
- Variation Alleles (using IUPAC codes)
- Annotation (records with links to other NCBI databases)
- Heterozygosity (% of heterozygous genotype)
- Success Rate (likelihood that the SNP is real)
- Created Build ID
- Updated Build ID

http://www.ncbi.nlm.nih.gov/portal/query.fcgi?db=Snp http://www.ensembl.org/common/helpview?kw=snpview;ref=
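The "Variation Alleles" limit relies on IUPAC nucleotide ambiguity codes, where one letter stands for a set of possible bases. A minimal lookup sketch follows; the function names are illustrative, and the table is the standard IUPAC DNA code set.

```python
IUPAC_CODES = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG",
    "N": "ACGT",
}
_ALLELES_TO_CODE = {frozenset(bases): code for code, bases in IUPAC_CODES.items()}


def alleles_for(code: str) -> list[str]:
    """Expand an IUPAC code into the sorted list of bases it represents."""
    return sorted(IUPAC_CODES[code.upper()])


def code_for(alleles) -> str:
    """Return the IUPAC code for a collection of observed alleles."""
    key = frozenset(base.upper() for base in alleles)
    if key not in _ALLELES_TO_CODE:
        raise ValueError(f"no IUPAC code for alleles {sorted(key)}")
    return _ALLELES_TO_CODE[key]


if __name__ == "__main__":
    print(alleles_for("R"))      # an A/G SNP
    print(code_for(["C", "T"]))  # code for a C/T SNP
```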


Assessing Polymorphisms: Linkage Disequilibrium, Haplotype Block, and Tag SNPs

Adapted from Nature 426, 6968: 789-796 (2003)

Linkage Disequilibrium (LD): If two alleles tend to be inherited together more often than would be predicted by chance, the alleles are in linkage disequilibrium.
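"More often than predicted" can be quantified by comparing the observed haplotype frequency with the product of the allele frequencies. The sketch below computes three standard measures for two biallelic loci, D, |D'|, and r², from haplotype counts. The slide does not specify which measure it has in mind, and the function name and counts are hypothetical.

```python
def ld_measures(n_AB: int, n_Ab: int, n_aB: int, n_ab: int):
    """Return (D, |D'|, r^2) from counts of the four two-locus haplotypes."""
    total = n_AB + n_Ab + n_aB + n_ab
    if total <= 0:
        raise ValueError("at least one haplotype must be observed")
    p_A = (n_AB + n_Ab) / total  # frequency of allele A at locus 1
    p_B = (n_AB + n_aB) / total  # frequency of allele B at locus 2
    if p_A in (0.0, 1.0) or p_B in (0.0, 1.0):
        raise ValueError("both loci must be polymorphic")
    # D: observed AB frequency minus the frequency expected under independence.
    d = n_AB / total - p_A * p_B
    if d >= 0:
        d_max = min(p_A * (1 - p_B), (1 - p_A) * p_B)
    else:
        d_max = min(p_A * p_B, (1 - p_A) * (1 - p_B))
    d_prime = abs(d) / d_max
    r_squared = d * d / (p_A * (1 - p_A) * p_B * (1 - p_B))
    return d, d_prime, r_squared


if __name__ == "__main__":
    # Hypothetical sample of 100 haplotypes.
    d, d_prime, r2 = ld_measures(50, 10, 10, 30)
    print(f"D = {d:.3f}, |D'| = {d_prime:.3f}, r^2 = {r2:.3f}")
```

When the four haplotypes occur at exactly the frequencies expected from the allele frequencies, all three measures are zero.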

If most SNPs are highly correlated with one or more of their neighbors, these correlations can be used to generate haplotypes, which are excellent proxies for individual SNPs.

Because haplotypes may be identified by a much smaller number of SNPs (tag SNPs), assessing polymorphisms via haplotypes dramatically reduces genotyping work.
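One simple way to pick tag SNPs is a greedy pass over a pairwise r² matrix: repeatedly choose the SNP that is a good proxy for the most SNPs not yet covered. This is a sketch under assumptions, not the method used in the cited work. The 0.8 threshold, the SNP names, and the r² values are all hypothetical.

```python
def select_tag_snps(snp_ids, r2, threshold=0.8):
    """Greedily choose tag SNPs so every SNP has r^2 >= threshold with a tag.

    r2 is a symmetric matrix (list of lists) of pairwise r^2 values in the
    same order as snp_ids.
    """
    n = len(snp_ids)
    untagged = set(range(n))
    tags = []
    while untagged:
        def covered_by(i):
            # Untagged SNPs that SNP i would represent (including itself).
            return {j for j in untagged if j == i or r2[i][j] >= threshold}

        # Ties are broken by position: max() keeps the first best candidate.
        best = max(range(n), key=lambda i: len(covered_by(i)))
        tags.append(snp_ids[best])
        untagged -= covered_by(best)
    return tags


if __name__ == "__main__":
    ids = ["snp1", "snp2", "snp3", "snp4", "snp5"]
    # Toy matrix with two blocks of correlated SNPs: 1-3 and 4-5.
    toy_r2 = [
        [1.00, 0.90, 0.85, 0.10, 0.05],
        [0.90, 1.00, 0.95, 0.20, 0.10],
        [0.85, 0.95, 1.00, 0.15, 0.10],
        [0.10, 0.20, 0.15, 1.00, 0.90],
        [0.05, 0.10, 0.10, 0.90, 1.00],
    ]
    print(select_tag_snps(ids, toy_r2))
```

In the toy data, two tags stand in for five SNPs, which is the reduction in genotyping work the slide describes.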