dan-gaston
Bioc4010 Sample Questions:

1. A) What is the base call accuracy of a base in an Illumina-sequenced short read with a Q value of 20? B) Is this better or worse than a Q value of 10?
Answer: A) The probability of an incorrect call is 1 in 100, i.e. 99% call accuracy. B) Better. Q10 corresponds to an error probability of 1 in 10, i.e. 90% call accuracy.
Formula: Q = -10 log10 P, where P is the probability that the base call is wrong.

2. What two primary advantages does exome sequencing provide over whole genome sequencing?
Answer: Cost and data reduction. Exome capture limits the sequencing to known protein-coding genes and some miRNAs.

3. Split and sort the string ‘CAPTAINKIRK’ into its appropriate suffix array.
Answer:
AINKIRK
APTAINKIRK
CAPTAINKIRK
INKIRK
IRK
K
KIRK
NKIRK
PTAINKIRK
RK
TAINKIRK
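The arithmetic in question 1 and the sorted suffixes in question 3 can both be checked with a few lines of Python. This is only a minimal sketch: the function names `phred_to_error_prob` and `suffix_array` are illustrative and not part of the course material, and the suffix array is built the naive way (sort every suffix), which is fine for a short string but is not how genome-scale indexes are normally constructed.

```python
def phred_to_error_prob(q: float) -> float:
    """Invert Q = -10 * log10(P) to recover the error probability P."""
    return 10 ** (-q / 10)


def suffix_array(text: str) -> list[str]:
    """Return every suffix of `text` in lexicographic order (naive method)."""
    return sorted(text[i:] for i in range(len(text)))


# Question 1: compare the error probability and accuracy at Q10 and Q20.
for q in (10, 20):
    p = phred_to_error_prob(q)
    print(f"Q{q}: error probability {p:g}, accuracy {100 * (1 - p):g}%")

# Question 3: print the sorted suffixes, one per line.
for suffix in suffix_array("CAPTAINKIRK"):
    print(suffix)
```

Real suffix arrays usually store the start position of each suffix rather than the suffix text itself; the strings are printed here only so the output can be compared directly with the answer to question 3.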
4. Given a base-quality score threshold of Q30, the following short read alignment, and reference sequence, what is the genotype (two alleles, e.g. G/C) at the indicated position? The base quality at that position is listed beside each read. The last sequence is the reference, and the asterisk marks the position.
AGCTCCCAGGGTCCAG                 Q29
          GTCCAGTCTCGGTT         Q40
      CAGGGTCCAGTC               Q47
           TCCAGTCTCGGTTCCATC    Q35
    CCCAGGGCCCAG                 Q50
        GGGTCCAGTCTC             Q31
   TCCCAGGGCC                    Q10
       AGGGTCCAGT                Q45
 GCTCCCAGGGCCCAGTCT              Q46
  CTCCCAGGGCCC                   Q33
     CCAGGGTCCAGTC               Q38
 GCTCCCAGGGCCCAGTCTCGG           Q41
      CAGGGTCCAGTCTCG            Q15
AGCTCCCAGGGTCCAGTCTCGGTTCCATCTA
           *
Answer: Discard the reads whose base quality at the position is below Q30, then count the reference and alternate bases at the position among the remaining reads (T = 6, C = 4). The genotype called is therefore T/C (heterozygous).

5. Sort the following types of genetic variants into the categories: Potentially Disease Causing, Unlikely to be Disease Causing
  1. Splice Site
  2. Non-Synonymous
  3. Synonymous
  4. Frameshift Indel
  5. Stop Loss
  6. Stop Gain
  7. Intronic (Non-Splice Site)
  8. Intergenic
Answer: Disease: 1, 2, 4, 5, 6. Non-Disease: 3, 7, 8.
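The counting procedure in the answer to question 4 can be sketched in Python. Several things here are assumptions rather than part of the original question: the read offsets were reconstructed by matching each read against the reference, the starred position is taken to be the T at 0-based index 11 (the only site where reads disagree with the reference), and the "two most frequent alleles" rule is a deliberately naive stand-in for a real genotype caller. The name `call_genotype` is hypothetical.

```python
from collections import Counter

REFERENCE = "AGCTCCCAGGGTCCAGTCTCGGTTCCATCTA"
SITE = 11  # assumed 0-based index of the starred position (the T in GGGTCC)

# (assumed 0-based offset on the reference, read sequence, base quality at SITE)
READS = [
    (0, "AGCTCCCAGGGTCCAG", 29),
    (10, "GTCCAGTCTCGGTT", 40),
    (6, "CAGGGTCCAGTC", 47),
    (11, "TCCAGTCTCGGTTCCATC", 35),
    (4, "CCCAGGGCCCAG", 50),
    (8, "GGGTCCAGTCTC", 31),
    (3, "TCCCAGGGCC", 10),
    (7, "AGGGTCCAGT", 45),
    (1, "GCTCCCAGGGCCCAGTCT", 46),
    (2, "CTCCCAGGGCCC", 33),
    (5, "CCAGGGTCCAGTC", 38),
    (1, "GCTCCCAGGGCCCAGTCTCGG", 41),
    (6, "CAGGGTCCAGTCTCG", 15),
]


def call_genotype(reads, site, min_quality=30):
    """Count bases at `site` from reads passing the quality threshold.

    Returns (counts, genotype), where genotype is a naive call made from
    the one or two most frequently observed alleles, or None if no read
    passes the filter.
    """
    counts = Counter()
    for offset, seq, quality in reads:
        if quality < min_quality:
            continue  # discard low-quality base calls
        if offset <= site < offset + len(seq):
            counts[seq[site - offset]] += 1
    if not counts:
        return counts, None
    alleles = [base for base, _ in counts.most_common(2)]
    if len(alleles) == 1:
        alleles *= 2  # only one allele observed: call it homozygous
    return counts, "/".join(alleles)


counts, genotype = call_genotype(READS, SITE)
print(f"reference base: {REFERENCE[SITE]}")
print(dict(counts), genotype)
```

Three reads (Q29, Q10 and Q15) fall below the threshold and are dropped before counting, which is why the totals are based on 10 of the 13 reads.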
6. What is the primary motivation for using “next gen” sequencing methods and modern genomics approaches to diagnosing human genetic diseases?
Answer: Cost

7. What does the base quality of a sequencing read tell you?
Answer: The base quality encodes the probability that the base call is incorrect. (The base call accuracy is also an acceptable answer.)

8. What problem does binary search address?
Answer: Efficiently searching the index of a genome
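To illustrate the answer to question 8, the sketch below runs a binary search over a suffix array to find every occurrence of a pattern. It reuses the string from question 3 as a stand-in for a genome; the function names `build_suffix_array` and `find_occurrences` are illustrative, and the naive construction is an assumption made for brevity, not how a production aligner builds its index.

```python
def build_suffix_array(text: str) -> list[int]:
    """Return the start positions of all suffixes, sorted lexicographically."""
    return sorted(range(len(text)), key=lambda i: text[i:])


def find_occurrences(text: str, sa: list[int], pattern: str) -> list[int]:
    """Return sorted start positions of `pattern`, found by binary search.

    Suffixes sharing a prefix are adjacent in the suffix array, so two
    binary searches (lower and upper bound) delimit all the matches.
    """
    m = len(pattern)

    # Lower bound: first suffix whose first m characters are >= pattern.
    lo, hi = 0, len(sa)
    while lo < hi:
        mid = (lo + hi) // 2
        if text[sa[mid]:sa[mid] + m] < pattern:
            lo = mid + 1
        else:
            hi = mid
    start = lo

    # Upper bound: first suffix whose first m characters are > pattern.
    hi = len(sa)
    while lo < hi:
        mid = (lo + hi) // 2
        if text[sa[mid]:sa[mid] + m] <= pattern:
            lo = mid + 1
        else:
            hi = mid

    return sorted(sa[start:lo])


text = "CAPTAINKIRK"
sa = build_suffix_array(text)
for pattern in ("K", "A", "IRK", "SPOCK"):
    print(pattern, find_occurrences(text, sa, pattern))
```

Each search needs a number of comparisons proportional to the logarithm of the text length, instead of scanning every position, which is the efficiency gain the answer refers to.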