Disease Models and Association Statistics. Nicolas Widman, CS 224: Computational Genetics

  • Disease Models and Association Statistics
      Nicolas Widman
      CS 224: Computational Genetics

  • Introduction
      • Certain SNPs within genes may be associated with a disease phenotype.
      • The statistical model used in class considers inheritance of only a single copy of a SNP location: the single chromosome (haploid) model.
      • Goal: expand the statistic to a diploid model and take into account the different expression patterns of a SNP.

  • Basic Statistic: Haploid Model
      • $\gamma$: relative risk
      • $p_A$: probability of the disease-associated allele
      • $F$: disease prevalence. For this project, $F$ is assumed to be very small.
      • $+/-$: disease state
      • Derivation of case ($p_A^+$) and control ($p_A^-$) frequencies:
          $P(A) = p_A$
          $p_A^+ = P(A \mid +)$
          $p_A^- = P(A \mid -)$
          $F = P(+)$
          $P(A \mid +) = P(+ \mid A)\,P(A) / P(+)$
          $P(+ \mid A) = \gamma\,P(+ \mid \neg A)$

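  • Derivation, Continued (with relative risk $\gamma = P(+ \mid A) / P(+ \mid \neg A)$)
      $P(+) = F = p_A\,P(+ \mid A) + (1 - p_A)\,P(+ \mid \neg A)$
      $P(+) = F = p_A\,P(+ \mid A) + (1 - p_A)\,P(+ \mid A)/\gamma$
      $P(+) = F = P(+ \mid A)\,(p_A + (1 - p_A)/\gamma) = P(+ \mid A)\,(p_A(\gamma - 1) + 1)/\gamma$
      $P(+ \mid A) = \gamma F / (p_A(\gamma - 1) + 1)$
      $P(A \mid +) = P(+ \mid A)\,P(A)/P(+) = P(+ \mid A)\,p_A / F = \gamma p_A / (p_A(\gamma - 1) + 1)$
      $P(- \mid A) = 1 - P(+ \mid A) = 1 - \gamma F / (p_A(\gamma - 1) + 1)$
      $P(A \mid -) = P(- \mid A)\,P(A)/P(-)$
      If $F$ is small, then $1 - F \approx 1$ and $P(- \mid A) \approx 1$, so $P(A \mid -) \approx P(A) = p_A$.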
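The derivation can be checked numerically. The following is a minimal Python sketch, assuming the relative risk is written `gamma`; the function name, parameter names, and example values are hypothetical and chosen only for illustration. It computes the case frequency from the closed form and the control frequency exactly, so that the small-prevalence approximation can be compared against the exact value.

```python
def case_control_frequencies(p_a: float, gamma: float, prevalence: float):
    """Return (P(A|+), P(A|-)) under the haploid relative-risk model.

    p_a        -- population frequency of the disease-associated allele
    gamma      -- relative risk, P(+|A) / P(+|not A)
    prevalence -- disease prevalence F = P(+)
    """
    denom = p_a * (gamma - 1.0) + 1.0
    risk_given_a = gamma * prevalence / denom            # P(+|A)
    p_case = gamma * p_a / denom                         # P(A|+)
    # Exact control frequency via Bayes' rule: P(-|A) P(A) / P(-)
    p_control = (1.0 - risk_given_a) * p_a / (1.0 - prevalence)
    return p_case, p_control


# Illustrative values only (not taken from the slides).
p_case, p_control = case_control_frequencies(p_a=0.2, gamma=2.0, prevalence=0.001)
print(f"case frequency    = {p_case:.4f}")
print(f"control frequency = {p_control:.4f}")
```

With a small prevalence the exact control frequency stays very close to `p_a`, which is what the approximation in the last derivation step relies on.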

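  • Haploid Model
    The relative risk formula: $p_A^+ = \gamma p_A / ((\gamma - 1)\,p_A + 1)$, where $\gamma$ is the relative risk and $p_A$ is the frequency of the disease-associated allele.

    Association Power: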
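One common way to approximate association power in a balanced case-control study is a two-sided normal approximation for the difference between case and control allele frequencies. The sketch below assumes that form, which may differ from the formula used in the slides. The function name, the `n_per_group` parameter (observations in each of the case and control groups), and the default significance level are all assumptions made for illustration.

```python
from math import sqrt
from statistics import NormalDist


def association_power(p_case: float, p_control: float,
                      n_per_group: int, alpha: float = 0.05) -> float:
    """Approximate two-sided power for a balanced case-control comparison.

    Assumes the test statistic is approximately normal with unit variance
    and a mean shift given by the frequency difference divided by the
    pooled standard error.
    """
    std_normal = NormalDist()
    p_bar = (p_case + p_control) / 2.0                     # pooled frequency
    std_err = sqrt(2.0 * p_bar * (1.0 - p_bar) / n_per_group)
    shift = (p_case - p_control) / std_err                 # non-centrality
    z_crit = std_normal.inv_cdf(1.0 - alpha / 2.0)
    # Probability of landing in either rejection tail.
    return std_normal.cdf(shift - z_crit) + std_normal.cdf(-shift - z_crit)


# Illustrative values only: case frequency 1/3, control frequency 0.2.
for n in (50, 200, 1000):
    print(n, round(association_power(1.0 / 3.0, 0.2, n), 3))
```

When the case and control frequencies are equal, the function returns `alpha`, as expected for a test with no true effect.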

  • Assumptions
      • Low disease prevalence: $F \approx 0$, which allows $p_A^- \approx p_A$
      • Uses the Hardy-Weinberg principle, with A the major allele and a the minor allele:
          $P(AA) = P(A)^2$
          $P(Aa) = 2\,P(A)\,(1 - P(A))$
          $P(aa) = (1 - P(A))^2$
      • Uses a balanced case-control study
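The Hardy-Weinberg genotype frequencies listed above can be written as a small Python helper. This is a sketch for illustration; the function name and the example frequency are hypothetical.

```python
def hardy_weinberg(p_major: float) -> dict:
    """Genotype frequencies for major-allele frequency P(A) = p_major."""
    q = 1.0 - p_major                      # minor-allele frequency P(a)
    return {
        "AA": p_major * p_major,           # homozygous major
        "Aa": 2.0 * p_major * q,           # heterozygous (Aa or aA)
        "aa": q * q,                       # homozygous minor
    }


genotypes = hardy_weinberg(0.7)            # illustrative value only
print(genotypes)
print("total =", sum(genotypes.values()))
```

The three frequencies always sum to 1, since $(P(A) + (1 - P(A)))^2 = 1$.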

  • Diploid Disease Models
      • When inheriting two copies of a SNP site, there are three common relationships between the major and minor alleles:
      • Dominant: a particular phenotype requires at least one major allele
      • Recessive: a particular phenotype requires both minor alleles
      • Additive: a particular phenotype varies based on whether there are one or two major alleles

  • Diploid Disease Models
      • AA: homozygous major
      • Aa, aA: heterozygous
      • aa: homozygous minor

  • Modifying the Calculation for Relative Risk
      • The previous relative risk formula only considered the haploid case of having a SNP or not having a SNP.
      • Approach: create a virtual SNP which replaces $p_A$ in the formula.

  • Virtual SNPs
      • Use the Hardy-Weinberg principle to calculate a new $p_A$, the virtual SNP, from the characteristics of the diploid disease models.
      • Recessive: $p_A = p_d^2$
      • Dominant: $p_A = p_d^2 + 2\,p_d\,(1 - p_d)$
      • Additive: $p_A = p_d^2 + c\,p_d\,(1 - p_d)$
      • $p_d$: probability of the disease-associated allele. In the calculations used to determine the association power, $c$ was set to $\sqrt{2}$.
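The three virtual SNP formulas can be collected into one Python function. This is a minimal sketch: the function name, the string labels for the models, and the example frequency are hypothetical, while the formulas and the default `c = sqrt(2)` follow the slide.

```python
from math import sqrt


def virtual_snp_frequency(p_d: float, model: str, c: float = sqrt(2.0)) -> float:
    """Virtual SNP frequency p_A for a diploid disease model.

    p_d   -- probability of the disease-associated allele
    model -- "recessive", "dominant", or "additive" (labels are assumptions)
    c     -- heterozygote weight used by the additive model
    """
    homozygous = p_d * p_d
    heterozygous = p_d * (1.0 - p_d)
    if model == "recessive":
        return homozygous
    if model == "dominant":
        return homozygous + 2.0 * heterozygous
    if model == "additive":
        return homozygous + c * heterozygous
    raise ValueError(f"unknown model: {model!r}")


for name in ("recessive", "additive", "dominant"):
    print(name, round(virtual_snp_frequency(0.3, name), 4))
```

The resulting value is substituted for $p_A$ in the haploid relative risk formula. For any allele probability strictly between 0 and 1, the additive value lies between the recessive and dominant values because $0 < c < 2$.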

  • Diploid Disease Models: [figures for relative risk $\gamma = 1.5$ (three slides), $\gamma = 2$ (three slides), and $\gamma = 3$ (one slide)]

  • Results: achieving significant association power with low relative risk SNPs (relative risk $\gamma = 1.5$)
      • A minimum of 200 cases and 200 controls is required to reach 80% power within the strongest $p_d$ intervals for each type of SNP.
      • At a sample size of 1000 cases and 1000 controls, dominant and additive SNPs show very significant power for almost all SNP probabilities below 50%.
      • It is difficult to obtain significant association for low-probability recessive SNPs regardless of sample size.

  • Results
      • SNP probability ranges for greatest association power:
          Dominant: 0.10 - 0.30
          Recessive: 0.45 - 0.70
          Additive: 0.15 - 0.40
      • Higher relative risk SNPs require fewer cases and controls to achieve the same power.
      • As the relative risk $\gamma$ approaches 1, the association power to detect a recessive allele with probability $p$ is the same as the power to detect a dominant allele with probability $1 - p$.
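The recessive/dominant symmetry can be traced back to the virtual SNP formulas. The short derivation below is a sketch: the superscripts "rec" and "dom" are labels introduced here, and $\gamma$ denotes the relative risk. It shows that a recessive allele with probability $p$ and a dominant allele with probability $1 - p$ give complementary virtual SNP frequencies, and that the case-control frequency difference depends on $p_A$ only through $p_A(1 - p_A)$ as $\gamma \to 1$.

```latex
\begin{align*}
p_A^{\mathrm{rec}}(p) &= p^2 \\
p_A^{\mathrm{dom}}(1-p) &= (1-p)^2 + 2(1-p)\,p = 1 - p^2 = 1 - p_A^{\mathrm{rec}}(p) \\
p_A^+ - p_A &= \frac{\gamma\,p_A}{(\gamma-1)\,p_A + 1} - p_A
             = \frac{(\gamma-1)\,p_A(1-p_A)}{(\gamma-1)\,p_A + 1}
             \approx (\gamma-1)\,p_A(1-p_A) \quad (\gamma \to 1)
\end{align*}
```

Because $p_A(1 - p_A)$ is unchanged when $p_A$ is replaced by $1 - p_A$, a two-sided test that depends on this difference and on a pooled variance of the same form would, under these assumptions, have matching power in the two cases.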

  • Results
      • Diseases with higher relative risk have their range of highest association power skewed toward lower-probability SNPs.
      • Challenges in obtaining high association power:
          Low-probability recessive SNPs
          Low relative risk diseases, especially with small sample sizes
          High-probability dominant SNPs; however, these are unlikely because of natural selection and because the majority of the population would be affected by such diseases.
