1. Differential Gene Expression: Ischemic vs. Nonischemic
- Jing Hu, Dongmei Li, Shuyan Wan, Richard Yamada, Jeong-Me Yoon
- Zailong Wang (Mentor)
2. Outline for Our Talk
- Introduction and summary of previous work (Richard)
- Exploratory Analysis of Data (Jeong-Mi)
- Statistical Methods (Shuyan)
- Selected Gene Analysis (Jing)
- Conclusions and Further Work (Dongmei)
3. Human Heart Function
4. Arteries
5. What is Ischemic Cardiomyopathy?
- Ischemic: lack of blood and oxygen
- Cardio: refers to the heart
- Myopathy: muscle-related disease
- Ischemic cardiomyopathy is a medical term that doctors use to
describe patients who have congestive heart failure that is a result
of coronary artery disease (the coronary arteries are blocked).
6. Ischemic Cardiac Myopathy
- Risk Factors: genetics, smoking, high fat diet, obesity, and
prior heart problems
- Incidence: 1 in 100, typically male, starting with middle
age
- Symptoms include: chest pain, shortness of breath,
irregular/rapid pulse, and sensation of feeling the heart beat
- Treatment Regimens: ACE inhibitors, beta blockers, angioplasty
(to improve blood flow to the damaged or weakened heart muscle),
and heart transplant (severe cases)
7. The Basic Scientific Question
- What changes in cardiac transcription profiles are brought about
by heart failure?
- Two ways to attack the question: molecular biology
(hypothesis based) vs. high-throughput techniques (i.e., microarrays
followed by confirmation of gene expression with qPCR)
8. Differential Expression between ischemic and non-ischemic
cardiomyopathy patients
- Gene expression analysis of ischemic and nonischemic
cardiomyopathy: shared and distinct genes in the development of
heart failure
- M. Kittleson, K. Minhas, R. Irizarry, S. Ye, G. Edness, E.
Breton, J. Conte, G. Tomaselli, J. Garcia, and J. Hare. Physiol.
Genomics, 21:299-307, 2005
9. Methods of Kittleson et al
- 31 cardiomyopathy vs. 6 normal patients (clinical
characteristics were reasonably similar within groups)
- Tissue taken from cardiomyopathy patients at the time of LVAD
(left ventricular assist device) implantation or cardiac
transplantation
- Identified differentially expressed genes in 2 comparisons:
NICM (nonischemic cardiomyopathy: hypertrophic, valvular, alcoholic)
vs. NF (nonfailing) hearts, and ICM (ischemic cardiomyopathy) vs. NF
hearts, using significance analysis of microarrays
- Identified genes with FDR < 5% and absolute fold change
greater than 2.0
10. Conclusions of Kittleson et al
- No prior hypothesis; the microarray experiment was used to
generate hypotheses
- Types of genes differentially expressed (41 total): cell
growth/maintenance (9), signal transduction (7), metabolism (3), cell
adhesion/cell communication (2), binding (2), catalytic
activity (2), nucleus (3), other (13)
11. Conclusions of Kittleson et al
- Predominance of fatty acid metabolism genes: the genesis of NICM
might be metabolic in nature
- Predominance of abnormalities in catalytic activity with ICM
(serine proteinase inhibitors)
- TNFRSF11B (member of TNF receptor subfamily) is significantly
downregulated in ICM
12. Experimental Procedure for Data that We are Using
- Collected myocardial samples from patients undergoing cardiac
transplantation whose failure arises from ischemic cardiomyopathy
and from "normal" organ donors whose hearts cannot be used for
transplants
- The transcriptional profile of the mRNA in these samples was
measured with gene array technology.
- Changes in transcriptional profiles can be correlated with the
physiologic profile of heart-failure hearts acquired at the time of
transplantation.
13. Working Hypothesis?
- Because of the results of Kittleson et al, we can generate a
simple working hypothesis:
- The genes we find to be differentially expressed, using our
methods of statistical analysis, should be roughly the same as those
Kittleson et al reported in their paper.
14. Exploratory Analysis of Data
- Goal: identify genes that are differentially expressed between
the Ischemic and Normal groups
- Affymetrix data with two populations (Ischemic and Normal)
- Expression is measured for 54,675 genes
15.
- Obtain only the expression measurements of the data (i.e., put
them into an exprSet) using the defaults of the justRMA method:
- normalized.method = quantiles
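The quantile normalization step named above can be illustrated with a short sketch. This is not the justRMA implementation; it is a minimal NumPy version of the general idea, in which every array is forced to share the same empirical distribution. The function name `quantile_normalize` and the toy matrix are hypothetical, and ties are broken arbitrarily rather than averaged.

```python
import numpy as np

def quantile_normalize(x):
    """Quantile-normalize the columns of a genes x arrays matrix (illustrative sketch)."""
    x = np.asarray(x, dtype=float)
    order = np.argsort(x, axis=0)                    # per-array ordering of the genes
    sorted_x = np.take_along_axis(x, order, axis=0)  # each column sorted ascending
    reference = sorted_x.mean(axis=1)                # mean of each quantile across arrays
    out = np.empty_like(x)
    # Write the reference quantiles back in each array's original gene order.
    np.put_along_axis(out, order, reference[:, None], axis=0)
    return out

# Hypothetical toy data: 4 genes (rows) measured on 3 arrays (columns).
toy = np.array([[5.0, 4.0, 3.0],
                [2.0, 1.0, 4.0],
                [3.0, 4.5, 6.0],
                [4.0, 2.0, 8.0]])
normalized = quantile_normalize(toy)
print(normalized)
```

After normalization every column contains the same set of values, while the rank of each gene within its array is unchanged.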
16.
- Histogram of Ischemic/Normal:
  - The distribution is skewed right.
  - The range is from 4 to 14.
  - Both histograms have similar shapes.
- Boxplot of Ischemic/Normal:
  - There are many outliers among the upper values.
  - Intensities in the Ischemic group are higher than in the Normal group.
- Histogram of MAD (Median Absolute Deviation):
  - We can filter out 675 of the 54,675 genes.
- A visual aid for identifying genes with unusual test statistics:
  - It shows a large deviation at the right tail.
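A MAD-based filter of the kind mentioned above could be sketched as follows. The slides do not state the filtering rule, so this example assumes that the genes with the smallest MAD across samples are the ones dropped; the helper names `mad_per_gene` and `drop_lowest_mad` and the toy matrix are hypothetical.

```python
import numpy as np

def mad_per_gene(expr):
    """Median absolute deviation of each gene (row) across samples (columns)."""
    expr = np.asarray(expr, dtype=float)
    med = np.median(expr, axis=1, keepdims=True)
    return np.median(np.abs(expr - med), axis=1)

def drop_lowest_mad(expr, n_drop):
    """Return the row indices kept after dropping the n_drop least variable genes.

    Assumption: 'filtering' means removing the lowest-MAD genes.
    """
    mad = mad_per_gene(expr)
    order = np.argsort(mad, kind="stable")  # least variable genes first
    return np.sort(order[n_drop:])

# Hypothetical toy data: 5 genes x 4 samples; genes 0 and 3 are nearly flat.
toy_expr = np.array([[7.0, 7.0, 7.0, 7.0],
                     [5.0, 9.0, 6.0, 10.0],
                     [4.0, 8.0, 12.0, 6.0],
                     [8.0, 8.0, 8.1, 8.0],
                     [3.0, 6.0, 9.0, 12.0]])
kept = drop_lowest_mad(toy_expr, n_drop=2)
print(mad_per_gene(toy_expr), kept)
```

In the setting of the slides, `n_drop` would correspond to the 675 filtered genes.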
17. 18. 19. 20. 21. 22.
- Mean difference between Ischemic and Normal
- We test 54,675 genes simultaneously, so we adjust for
multiple testing when assessing the statistical significance of the
observed associations, in order to control the false positive rate.
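The per-gene comparison behind these bullets can be sketched in a few lines. The slides do not name the test statistic used, so this example assumes a Welch two-sample t statistic computed gene by gene; the function name `welch_t`, the group labels, and the toy values are hypothetical.

```python
import numpy as np

def welch_t(expr, is_ischemic):
    """Per-gene mean difference (Ischemic - Normal) and Welch t statistic.

    Assumption: rows are genes, columns are samples, and is_ischemic marks
    the Ischemic columns.
    """
    expr = np.asarray(expr, dtype=float)
    mask = np.asarray(is_ischemic, dtype=bool)
    a, b = expr[:, mask], expr[:, ~mask]
    mean_diff = a.mean(axis=1) - b.mean(axis=1)
    # Standard error with unequal variances (Welch).
    se = np.sqrt(a.var(axis=1, ddof=1) / a.shape[1]
                 + b.var(axis=1, ddof=1) / b.shape[1])
    return mean_diff, mean_diff / se

# Hypothetical toy data: 2 genes, 3 Ischemic and 3 Normal samples.
toy_expr = np.array([[8.0, 9.0, 10.0, 5.0, 6.0, 7.0],
                     [6.0, 7.0, 8.0, 6.0, 7.0, 8.0]])
labels = [True, True, True, False, False, False]
mean_diff, t_stat = welch_t(toy_expr, labels)
print(mean_diff, t_stat)
```

Each gene yields one statistic, so a full array produces tens of thousands of tests, which is what motivates the multiple-testing adjustment.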
23. Multiple Hypothesis Testing
- Goal: identify as many differentially expressed genes as possible
while incurring a relatively low proportion of false
positives.
- H0: No differential gene expression (between Ischemic and
Normal groups)
- Large multiplicity problem: more than fifty thousand hypotheses
are tested simultaneously. At an unadjusted 0.05 level, about
54,675 x 0.05 ≈ 2,734 false positives would be expected if every null
hypothesis were true.
- How can we control the false positive rate genome-wide? FDR or
pFDR.
24. Table 1. Possible outcomes from thresholding m genes for
significance (m p-values with some cutoff point applied).

                                Called significant         Called not significant   Total
                                (reject H0)                (accept H0)
True null (H0 is true)          F (# of false positives)   m0 - F                   m0
True alternative (Ha is true)   T (# of true positives)    m1 - T                   m1
Total                           S (# of sign. features)    m - S                    m

25. False Discovery Rate
- FDR = E(F/S). To handle the case S = 0, it is defined as
E(F/S | S > 0) P(S > 0); equivalently, define F/S = 0 if S = 0.
- Alternatively, define pFDR = E(F/S | S > 0). When m is large,
P(S > 0) is approx. 1 and FDR is approx. equal to pFDR.
- FDR is a measure of the overall accuracy of a set of
significant features.
26. Linear Step-Up Procedure
27. Steps
- Select the desired limit q on E(FDR)
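Only the first step survives on the slide. In the standard Benjamini & Hochberg (1995) formulation, the remaining steps are: sort the p-values as p(1) <= ... <= p(m), find the largest k with p(k) <= (k/m) q, and reject the hypotheses with the k smallest p-values. The sketch below assumes that formulation; the function name `bh_reject` and the toy p-values are hypothetical.

```python
import numpy as np

def bh_reject(pvalues, q):
    """Linear step-up procedure: boolean mask of rejected hypotheses at level q."""
    p = np.asarray(pvalues, dtype=float)
    m = p.size
    order = np.argsort(p)                      # indices of p-values, ascending
    thresholds = q * np.arange(1, m + 1) / m   # (k/m) * q for k = 1..m
    below = np.nonzero(p[order] <= thresholds)[0]
    reject = np.zeros(m, dtype=bool)
    if below.size:
        k = below[-1]                          # largest rank meeting its threshold
        reject[order[:k + 1]] = True           # reject everything up to that rank
    return reject

# Hypothetical toy p-values, in no particular order.
toy_p = [0.9, 0.028, 0.012, 0.039, 0.019]
rejected = bh_reject(toy_p, q=0.05)
print(rejected)
```

In this toy case the smallest p-value (0.012) exceeds its own threshold of 0.01 but is still rejected, because a larger p-value further up the list meets its threshold. That is the "step-up" behavior.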
28. FDR Adjusted P-Values
- For an individual hypothesis, the FDR adjusted p-value is the
lowest level of FDR for which the hypothesis is first included in the
set of rejected hypotheses.
29. Data inter-dependencies
- Co-regulation
- Between measurement errors of expression levels:
  - spatial effects
  - pooled variability estimation
- Multiple testing of such data will produce correlated test
statistics!
30. Correlated Test Statistics
- The linear step-up procedure controls the FDR for positive
dependent test statistics (Benjamini & Yekutieli, 2001 and
Yekutieli, 2002).
- This condition is satisfied by:
  - positively correlated one-sided normal and t test
  statistics.
  - absolute values of normal and t test statistics, when all
  null hypotheses are true.
31. BH and BY procedure
- BH: adjusted p-values for the Benjamini & Hochberg (1995)
step-up FDR controlling procedure (independent and positive
regression dependent test statistics).
- BY: adjusted p-values for the Benjamini & Yekutieli (2001)
step-up FDR controlling procedure (general dependency
structures).
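The two adjustments can be compared side by side with a small sketch. It assumes the usual formulas: the BH adjusted p-value for the i-th smallest p-value is the minimum over k >= i of m p(k) / k, capped at 1, and BY multiplies the same quantity by the sum 1 + 1/2 + ... + 1/m. The function name `adjust_pvalues` and the toy p-values are hypothetical, and this is not the code used for the results in the slides.

```python
import numpy as np

def adjust_pvalues(pvalues, method="BH"):
    """Step-up FDR adjusted p-values (BH or BY), returned in the input order."""
    p = np.asarray(pvalues, dtype=float)
    m = p.size
    order = np.argsort(p)
    ranks = np.arange(1, m + 1)
    scaled = p[order] * m / ranks
    if method == "BY":
        scaled = scaled * np.sum(1.0 / ranks)  # extra factor for general dependency
    elif method != "BH":
        raise ValueError("method must be 'BH' or 'BY'")
    # Running minimum from the largest p-value downward keeps the result monotone.
    adj_sorted = np.minimum(1.0, np.minimum.accumulate(scaled[::-1])[::-1])
    adjusted = np.empty(m)
    adjusted[order] = adj_sorted
    return adjusted

# Hypothetical toy p-values.
toy_p = [0.9, 0.028, 0.012, 0.039, 0.019]
bh_adj = adjust_pvalues(toy_p, "BH")
by_adj = adjust_pvalues(toy_p, "BY")
print(bh_adj, by_adj)
```

BY adjusted p-values are never smaller than the BH ones, which reflects the price paid for validity under general dependency structures.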
32. Our Results
33. Plot of sorted adjusted p-values
34. Plot of adjusted p-values vs. test statistics
35. Gene Selection Analysis
- Further select genes based on the fold change between two
conditions (Ischemic vs. Normal)
- The fold change for each gene is calculated as the average
expression over all Ischemic samples divided by the average
expression over all normal samples.
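The fold change described above can be computed as in the sketch below. It assumes the input is on the raw intensity scale; RMA-type expression measures are on the log2 scale, so such values would first be transformed back (2**x) before forming a ratio of averages. The function name `log2_fold_change` and the toy intensities are hypothetical.

```python
import numpy as np

def log2_fold_change(intensity, is_ischemic):
    """log2 of (mean Ischemic intensity / mean Normal intensity) for each gene.

    Assumption: rows are genes, columns are samples, values are raw
    (non-log) intensities.
    """
    intensity = np.asarray(intensity, dtype=float)
    mask = np.asarray(is_ischemic, dtype=bool)
    fold = intensity[:, mask].mean(axis=1) / intensity[:, ~mask].mean(axis=1)
    return np.log2(fold)

# Hypothetical toy intensities: 3 genes, 2 Ischemic and 2 Normal samples.
toy_intensity = np.array([[400.0, 400.0, 100.0, 100.0],   # 4-fold up
                          [50.0, 50.0, 200.0, 200.0],     # 4-fold down
                          [150.0, 150.0, 150.0, 150.0]])  # unchanged
labels = [True, True, False, False]
lfc = log2_fold_change(toy_intensity, labels)
print(lfc)
```

On the log2 scale, up- and down-regulation are symmetric around zero, which is why cutoffs such as +1 and -1 are convenient.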
36. 37. Fold change cutoff value
- There are 1495 genes with Log2(fold change) > 1, and 26
genes with Log2(fold change) < -1
- There are only 43 genes with Log2(fold change) > 2, and 3
genes with Log2(fold change) < -2
- We choose the first option, |Log2(fold change)| > 1, which gives
1495 + 26 = 1521 genes
38. 39. Discussion
- Of the 54,675 mRNA transcripts present on the Affymetrix
microarray platform, 675 housekeeping genes were filtered out.
- Selecting genes with an adjusted p-value less than 0.0001 leaves
35,207 genes for the analysis of fold change.
- After fold change selection, 1521 genes are left for
further selection.
- Finally, 74 up-regulated genes and 26 down-regulated genes are
selected from the microarray analysis for further biological
verification and study.
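The first two selection steps in this discussion (an adjusted p-value cutoff followed by a fold change cutoff) can be sketched as below. The defaults mirror the thresholds quoted on the slides (0.0001 and a log2 fold change of 1); the function name `select_genes` and the toy inputs are hypothetical, and the final reduction to 100 genes is not covered because the slides do not describe how it was done.

```python
import numpy as np

def select_genes(adj_p, log2_fc, p_cut=0.0001, fc_cut=1.0):
    """Indices of up- and down-regulated genes passing both cutoffs."""
    adj_p = np.asarray(adj_p, dtype=float)
    log2_fc = np.asarray(log2_fc, dtype=float)
    significant = adj_p < p_cut                   # adjusted p-value filter
    up = significant & (log2_fc > fc_cut)         # at least 2-fold higher in Ischemic
    down = significant & (log2_fc < -fc_cut)      # at least 2-fold lower in Ischemic
    return np.flatnonzero(up), np.flatnonzero(down)

# Hypothetical toy inputs for 5 genes.
toy_adj_p = [0.00001, 0.00005, 0.5, 0.00002, 0.00003]
toy_lfc = [1.5, -1.2, 3.0, 0.4, -2.5]
up_idx, down_idx = select_genes(toy_adj_p, toy_lfc)
print(up_idx, down_idx)
```

Gene 2 in the toy data has a large fold change but fails the p-value cutoff, and gene 3 is significant but changes too little; both are excluded.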
40. Summary of the Selected Genes
- Of the 100 genes, there are 53 genes that have known biological
functions. The functions of the other 47 genes are unknown.
41. Gene Classification
- Based on the biological process of the genes, the 100 genes can
be classified in several categories.
42. Biological Function Classification
43. 44. 45.
46. Differentially Expressed Genes in ISC-Normal Comparisons
- Among the 100 genes that are differentially expressed between
ischemic and normal, the majority fell into cell adhesion, cell
growth and maintenance, signal transduction, muscle contraction and
development, immune response, and regulation of transcription.
- Most of the genes in these processes are up-regulated, except one
or two genes in cell growth and maintenance and in cell
adhesion.
- Few genes belong to metabolism, inflammatory response, acute
phase response, and oncogenesis.
47. An important gene for Ischemic Cardiomyopathy
- Serine proteinase inhibitors have an anti-ischemic protective
effect, previously observed in pigs subjected to
experimentally induced myocardial ischemia (Khan 2004): aprotinin
reduces reperfusion injury after regional ischemia and cardioplegic
arrest. Protease inhibition may represent a molecular strategy to
prevent postoperative myocardial injury after surgical
revascularization with cardiopulmonary bypass.
- It was hypothesized to be an important gene in Kittleson's
paper (Physiol. Genomics, 2005).
48. The significance of the results
- The differential expression analysis identifies genes that are
either up-regulated or down-regulated in ischemic patients. These
can be correlated with clinical parameters in heart failure patients
and support ongoing efforts to incorporate expression
profiling-based biomarkers in determining prognosis and response to
therapy in heart failure.
49. Comparison with Kittleson et al.'s Paper
- Only one gene is common to both analyses. This is understandable
given the differences in sample size, tissue, and statistical
analysis method.
- However, most of the genes identified in our analysis fell into
the same categories of biological function.
50. Limitation
- Because circumstances causing a donor heart to be ineligible
for cardiac transplantation, such as infection or prolonged
hypotension, can also affect gene expression, a normal functional
unused donor heart is not the same as a normal heart.
51. Future Work
- First, the gene expression profile of these 100 genes needs to
be verified by Northern blot or real-time RT-PCR (qPCR).
- After verification, some genes of unknown function with high fold
change can be selected for functional study by biologists.
52. Acknowledgements
- MBI (Prof. Friedman and staff)
- Professors Shili Lin and Joseph Verducci