Epidemiology modeling (Microarray, NGS & qRT-PCR). Theme: Transcriptional Program in the Response of Human Fibroblasts to Serum. Lab #2. Etienne Z. Gnimpieba, BRIN WS 2013, Mount Marty College – June 24th 2013


Page 1: Session ii g3 overview epidemiology modeling mmc

Epidemiology modeling (Microarray, NGS & qRT-PCR)

Theme: Transcriptional Program in the Response of Human Fibroblasts to Serum.

Lab #2

Etienne Z. Gnimpieba

BRIN WS 2013

Mount Marty College – June 24th 2013

Page 2: Session ii g3 overview epidemiology modeling mmc

Resolution Process

Context

Specification & Aims

Lab #2

Preprocessing Viewing Clustering Differential expression Classification Data mining


Statement of problem / Case study: The temporal program of gene expression during a model physiological response of human cells, the response of fibroblasts to serum, was explored with a complementary DNA microarray representing about 8600 different human genes. Genes could be clustered into groups on the basis of their temporal patterns of expression in this program. Many features of the transcriptional program appeared to be related to the physiology of wound repair, suggesting that fibroblasts play a larger and richer role in this complex multicellular response than had previously been appreciated.
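The idea of grouping genes by their temporal expression pattern can be illustrated with a minimal sketch. It assumes made-up log2 expression ratios for four hypothetical genes (`geneA` to `geneD`) and uses a simple greedy grouping on Pearson correlation with an arbitrary threshold of 0.9. This is a simplification for illustration only, not the clustering procedure used in the case study.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length expression profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def group_by_profile(profiles, threshold=0.9):
    """Greedy grouping: each gene joins the first group whose seed gene
    (the group's first member) it correlates with at or above `threshold`;
    otherwise it starts a new group."""
    groups = []
    for gene, profile in profiles.items():
        for group in groups:
            if pearson(profiles[group[0]], profile) >= threshold:
                group.append(gene)
                break
        else:
            groups.append([gene])
    return groups


# Hypothetical log2 expression ratios at five time points (illustrative data only).
profiles = {
    "geneA": [0.0, 1.0, 2.0, 2.5, 3.0],      # induced over time
    "geneB": [0.1, 0.9, 2.1, 2.4, 3.2],      # induced over time
    "geneC": [0.0, -1.0, -2.0, -2.2, -3.0],  # repressed over time
    "geneD": [0.2, -0.8, -1.9, -2.5, -2.9],  # repressed over time
}

groups = group_by_profile(profiles)
print(groups)
```

With these toy profiles the induced genes and the repressed genes end up in separate groups. Real analyses usually rely on hierarchical or k-means clustering over thousands of genes.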

Gene Expression Data Analysis

16 Vishwanath R. Iyer, Science, 1999

Conclusion: ?

Aim: The purpose of this lab is to introduce the gene expression data analysis process. We simulate its application on “Transcriptional Program in the Response of Human Fibroblasts to Serum”. By the end, you should understand how a researcher identifies significantly expressed genes from a microarray dataset.

T1. Gene expression overview

T1.1. Review of the place of genomics in the OMIC world
T1.2. Microarray data techniques and process
T1.3. Data analysis cycle and tools

T2. Excel used in genomics

Objective: use basic Excel functionality to meet some gene expression data analysis needs.

T2.1. Column manipulation, functions, anchors, copy with function, sort data, search and replace
T2.2. Experiment comparison: data pre-treatment
T2.3. Differentially expressed genes from replicate experiments (SAM)

T3. GEPAS: Gene Expression Pattern Analysis Suite

Objective: use the GEPAS suite to apply the whole microarray data analysis process to the fibroblast data.

http://www.transcriptome.ens.fr/gepas/index.html

Acquired skills
- Gene expression data overview
- Excel used for genomics
- Microarray data analysis using GEPAS

[Figure: microarray workflow panels: Target Preparation, Hybridization, Slide Scanning, Expression Profile Clustering]

Page 3: Session ii g3 overview epidemiology modeling mmc

Data manipulation Gene expression data analysis OMIC World

[Figure: OMIC world diagram. Entities: DNA, mRNA, E, S, P. Processes: Transcription, Translation, Degradation, Gene Repression, Catalyse. Layers: Genomics, Functional Genomics, Transcriptomics, Proteomics, Metabolomics]


Page 4: Session ii g3 overview epidemiology modeling mmc

Data manipulation Gene expression data analysis Epidemiology model


## WHAT IS IT?

This model is an extension of the basic epiDEM model (epiDEM is a curricular unit whose name stands for Epidemiology: Understanding Disease Dynamics and Emergence through Modeling). It simulates the spread of an infectious disease in a semi-closed population, with additional features such as travel, isolation, quarantine, inoculation, and links between individuals. We still assume that the virus does not mutate and that, upon recovery, an individual has perfect immunity.

Overall, this model helps users:

1) understand the emergent dynamics of disease spread in relation to changes in control measures, travel, and mobility
2) understand how the reproduction number, R_0, represents the threshold for an epidemic
3) understand the relationship between derivatives and integrals, represented simply as rates and cumulative numbers of cases
4) extend or change the model to include the properties of a disease that interest them most.
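The relationship between rates and cumulative counts can be sketched in a few lines of Python. The per-tick counts below are invented for illustration and are not output from the model: the running sum plays the role of the integral, and differencing the cumulative series recovers the per-tick rate.

```python
from itertools import accumulate

# Hypothetical number of new infections observed at each tick (illustrative only).
new_cases_per_tick = [1, 2, 4, 7, 5, 3, 1, 0]

# "Integral": cumulative number of cases up to and including each tick.
cumulative_cases = list(accumulate(new_cases_per_tick))

# "Derivative": difference between consecutive cumulative values,
# which gives back the per-tick rate.
recovered_rate = [
    later - earlier
    for earlier, later in zip([0] + cumulative_cases, cumulative_cases)
]

print(cumulative_cases)
print(recovered_rate)
```

The cumulative curve flattens exactly where the per-tick rate drops to zero, which is the pattern to look for when comparing the two plots in the model.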

Page 5: Session ii g3 overview epidemiology modeling mmc


## HOW IT WORKS

Individuals wander around the world in random motion. There are two groups of individuals, represented as either squares or circles, which are geographically divided by the yellow border. An individual who comes into contact with an infected person has a chance of contracting the illness. Depending on their tendencies, which are set by the user, sick individuals will either isolate themselves at "home," go to a hospital, be force-quarantined into a hospital by health officials, or just move about. An infected individual has a chance of recovery after the given recovery time has elapsed.

The presence of the virus in the population is represented by the colors of individuals. Four colors are used: white individuals are uninfected, red individuals are infected, green individuals are recovered, and blue individuals are inoculated. Once recovered, an individual is permanently immune to the virus. The yellow person symbolizes the health official or ambulance, who patrols the world in search of ill people. On coming into contact with an infected individual, the ambulance immediately delivers the infected person to the hospital within that person's region of residence.

The graph INFECTION AND RECOVERY RATES shows the rate of change of the cumulative infected and recovered in the population. It tracks the average number of secondary infections and recoveries per tick. The reproduction number is calculated under different assumptions than those of the Kermack-McKendrick (KM) model, as we allow for more than one infected individual in the population and introduce the aforementioned variables.

At the end of the simulation, R_0 reflects the estimate of the reproduction number, the final size relation that indicates whether there will be (or there was, in the model sense) an epidemic. This again closely follows the mathematical derivation that R_0 = beta*S(0)/gamma = N*ln(S(0)/S(t)) / (N - S(t)), where N is the total population, S(0) is the initial number of susceptibles, and S(t) is the total number of susceptibles at time t. In this model, the R_0 estimate is the number of secondary infections that arise from an average infected individual over the course of that person's infected period.
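As a worked illustration of the final size relation, the sketch below evaluates R_0 = N*ln(S(0)/S(t)) / (N - S(t)) in Python. The function name `estimate_r0` and the example population counts are assumptions made for this example; they are not taken from the model's code.

```python
from math import log


def estimate_r0(n_total, s_initial, s_now):
    """Estimate R_0 from the final size relation
    R_0 = N * ln(S(0) / S(t)) / (N - S(t)).

    n_total   -- N, total population
    s_initial -- S(0), initial number of susceptibles
    s_now     -- S(t), number of susceptibles at time t
    """
    if not (0 < s_now <= s_initial <= n_total):
        raise ValueError("expected 0 < S(t) <= S(0) <= N")
    if s_now == n_total:
        raise ValueError("S(t) must be smaller than N")
    return n_total * log(s_initial / s_now) / (n_total - s_now)


# Hypothetical run: 200 people, 190 initially susceptible, 40 still susceptible.
r0 = estimate_r0(n_total=200, s_initial=190, s_now=40)
print(round(r0, 2))
```

For these invented numbers the estimate is a little under 2, above the threshold of 1, which is consistent with most of the population having been infected.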

Page 6: Session ii g3 overview epidemiology modeling mmc


## HOW TO USE IT

The SETUP button creates individuals according to the parameter values chosen by the user. Each individual has a 5% chance of being initialized as infected. Once the simulation has been set up, push the GO button to run the model. GO starts the simulation and runs it continuously until GO is pushed again. Each time-step can be considered to be in hours, although any suitable time unit will do.

What follows is a summary of the sliders in the model.

- INITIAL-PEOPLE (initialized to vary between 50 - 400): The total number of individuals the simulation begins with.
- INFECTION-CHANCE (10 - 50): Probability of disease transmission from one individual to another.
- RECOVERY-CHANCE (10 - 100): Probability of an individual's recovery, after the average recovery time has elapsed.
- AVERAGE-RECOVERY-TIME (50 - 300): Time it takes for an individual to recover, on average. Each individual's actual recovery time is drawn from a normal distribution whose mean is the AVERAGE-RECOVERY-TIME and whose standard deviation is a quarter of the AVERAGE-RECOVERY-TIME.
- AVERAGE-ISOLATION-TENDENCY (0 - 50): Average tendency of individuals to isolate themselves, in which case they do not spread the disease. Once an infected person is identified as an "isolator," the individual will isolate himself or herself in the current location (as indicated by the grey patch) and will stay there until full recovery.
- AVERAGE-HOSPITAL-GOING-TENDENCY (0 - 50): Average tendency of individuals to go to a hospital when sick. If an infected person is identified as a "hospital goer," then he or she will go to the hospital and will recover in half the time of an average recovery period, due to better medication and rest.
- INITIAL-AMBULANCE (0 - 4): Number of health officials or ambulances that move about at random and force-quarantine sick individuals upon contact. The health officials are immune to the disease, and they themselves do not physically accompany the patient to the hospital. They move 5 times as fast as other individuals in the world and are not bounded by geographic region.
- INOCULATION-CHANCE (0 - 50): Probability of an individual getting vaccinated, and hence becoming immune to the virus.
- INTRA-MOBILITY (0 - 1): This indicates how "mobile" an individual is. Usually, an individual moves a distance of 1 at each time-step. In this model, the person moves the distance indicated by INTRA-MOBILITY at each time-step. Thus, the lower the intra-mobility level, the less individuals move. Individuals move randomly by this assigned value; ambulances always move 5 times faster than this assigned value.
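The recovery rules described by the AVERAGE-RECOVERY-TIME, RECOVERY-CHANCE and AVERAGE-HOSPITAL-GOING-TENDENCY sliders can be sketched in Python as follows. This is not the model's own code: the function names are hypothetical, and clamping negative draws to zero is an assumption added so that a recovery time is never negative.

```python
import random


def draw_recovery_time(average_recovery_time, rng):
    """Draw one individual's recovery time from a normal distribution with
    mean = average_recovery_time and SD = average_recovery_time / 4.
    Assumption: negative draws are clamped to 0."""
    drawn = rng.gauss(average_recovery_time, average_recovery_time / 4)
    return max(drawn, 0.0)


def try_recover(ticks_infected, recovery_time, recovery_chance, rng,
                hospital_goer=False):
    """Return True if the individual recovers at this tick.

    Recovery is only possible once the recovery time has elapsed
    (half that time for hospital goers); it then happens with
    probability recovery_chance, given in percent."""
    needed = recovery_time / 2 if hospital_goer else recovery_time
    if ticks_infected <= needed:
        return False
    return rng.random() * 100 < recovery_chance


rng = random.Random(42)  # fixed seed so the sketch is reproducible
times = [draw_recovery_time(100, rng) for _ in range(10_000)]
mean_time = sum(times) / len(times)
print(round(mean_time))
```

Here `try_recover(60, 100, 100, rng, hospital_goer=True)` succeeds because only half the recovery time is required, while the same call without `hospital_goer` does not.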

Page 7: Session ii g3 overview epidemiology modeling mmc

Plan
• Gene Expression Measurement
• Microarray Process
• Gene Expression Data Stores
• Data Mining / Querying
• Data Analysis
• Example: ATP13A2 Profile in Stress Conditions