Searching for Efficacy Biomarker in Early Clinical Development – An Example of Using NGS in Early Oncology Development
Feng Gao, Jacob Zhang, Godwin Yung and Ray Liu
19th May 2015
Introduction
• Advances in next-generation sequencing (NGS) technologies provide a powerful tool for gaining deeper understanding of biological processes for diseases, but also provide new challenges for statistical analysis.
• In this study, we wanted to identify signatures of tumor somatic mutation that are associated with clinical efficacy.
• In the study:
– Sample: archived tumor samples that include 5 different solid tumor types. All patients are under the same treatment (single arm).
– NGS whole exome sequencing (WES): variant calling from tumor-germline pairs on coding regions (coding exons) to identify cancer somatic mutations.
– Clinical endpoints: tumor size change (%), and PFS.
– Sample size and power: started with ~120 samples, with adequate power for testing 20 candidate genes.
– Several tiers of NGS data: from WES, from cancer related gene list, from further reduced top candidate cancer genes.
– Types of statistical analyses: univariate analysis, multivariate analysis and pathway analysis.
Challenges
• Sample quality: archived, FFPE treated samples had high attrition rate in the NGS process. Samples from clinically non-evaluable subjects were also removed. As a result, sample size with available NGS data and clinical data is reduced to n=47.
• Challenges with small sample size:
– NGS data cannot be analyzed at the variant level because almost all variants are singletons. We have to aggregate the data to the gene level, which requires the assumption that all the variants within a gene have an effect in the same direction.
– Many genes were non-variant, or singletons, and have to be removed from univariate analysis, thus reducing the number of genes in the univariate analysis.
– With small sample size, the mutation patterns with several different genes could appear identical in our data, making it difficult to interpret the result.
– Data from different tumor types have to be pooled together, creating heterogeneity issues.
– Univariate analysis did not produce statistically significant findings from the top candidate cancer gene list after multiplicity adjustment. Top genes from univariate analysis on the WES list were heavily influenced by a single patient. The multivariate regression approach generated gene pairs that lack biological interpretation. The recursive partitioning approach did not work well with such a small sample size.
• Preliminary analysis did show that the data contain biological information. The challenge now is how to identify such useful features.
Overview of data
How we dealt with the challenges
• Using multivariate approach starting with biologically driven gene set/pathways, we created 2434 sets of genes by grouping 6410 genes together if they belong to the same pathway.
• A panel of markers may be more powerful than a single marker
– Sparse distribution of somatic mutations
– Low information content of a single mutation (binary, 0/1)
– Larger coverage of patients
• A panel of markers needs to be supported by biology
– Bigger confidence
– Better interpretability
• Known pathways or protein-protein interaction (PPI) networks have been tested as predictive marker panels
– A pathway = a bag of genes grouped by biology, giving a better chance to find biologically meaningful markers
Pathway DB                # of pathways
MetaCore™                 912
BIOCYC                    33
KEGG                      794
REACTOME                  1358
Wiki Pathways             225
Pathway interaction DB    183
Construction of gene-sets
[Figure: a pathway database (ex: KEGG) supplies two mappings. Each of the 2434 pathways is listed with its member genes, and each of the 6410 genes (g1, g2, ..., g6410) is listed with the pathways that contain it.]
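The two directions of the pathway/gene mapping can be illustrated with a minimal Python sketch. The function name `invert_gene_sets` and the toy pathway and gene labels are hypothetical and are not taken from the study; the sketch only assumes that each pathway is available as a collection of gene identifiers.

```python
from collections import defaultdict


def invert_gene_sets(pathway_to_genes):
    """Turn a pathway -> genes mapping into a gene -> pathways mapping."""
    gene_to_pathways = defaultdict(set)
    for pathway, genes in pathway_to_genes.items():
        for gene in genes:
            gene_to_pathways[gene].add(pathway)
    return dict(gene_to_pathways)


# Toy example (hypothetical labels, not study data).
toy_pathways = {
    "pathway_1": {"g1", "g3"},
    "pathway_2": {"g1", "g2"},
    "pathway_3": {"g3"},
}
toy_gene_index = invert_gene_sets(toy_pathways)
```

Keeping both mappings makes it easy to see which gene-sets share genes, which matters later when the tests are corrected for multiplicity.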
Statistical models used in the analysis
• Tumor size change (%) used as efficacy variable
• To increase statistical power, and to accommodate different modes of gene effect within a pathway/gene-set, we assume the following 3 models:

Model                                      Assumption for the effect of genes within a pathway
Sequence kernel association test (SKAT)    Effect sizes are normally distributed with mean 0
Counting-based burden test (cBT)           Effect sizes are the same
Threshold-based burden test (tBT)          Single effect when there are more than T mutations
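As a rough illustration of the two burden-type models, the sketch below derives the per-subject predictors they would use from a binary gene-by-subject mutation matrix. The function name `burden_scores`, the data layout, and the toy values are assumptions made for illustration; the slides do not specify how the predictors were coded.

```python
def burden_scores(mutations, threshold):
    """Per-subject predictors for the two burden tests.

    mutations: one list per subject of 0/1 indicators, one per gene in the set.
    threshold: the value T in "more than T mutations".
    """
    # cBT: every mutated gene contributes equally, so the predictor is a count.
    counts = [sum(row) for row in mutations]
    # tBT: a single effect that switches on once the count exceeds T.
    indicators = [1 if count > threshold else 0 for count in counts]
    return counts, indicators


# Toy data: 4 subjects, 3 genes in the gene-set (hypothetical values).
toy_mutations = [[0, 0, 0], [1, 0, 0], [1, 1, 0], [1, 1, 1]]
toy_counts, toy_indicators = burden_scores(toy_mutations, threshold=1)
```

SKAT is not covered here because it needs a variance-component score test rather than a single derived predictor.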
More details in statistical models
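Model   Assumption (effect of the gene-set on subject i)                 Test
SKAT    $\sum_j \beta_j g_{ij}$, with $\beta_j \sim N(0, \tau)$           Score
cBT     $\tau \sum_j g_{ij}$                                              Wald
tBT     $\tau \, 1\{\sum_j g_{ij} > \text{threshold}\}$                   Wald

Here $g_{ij}$ is the mutation status (0/1) of gene $j$ of the gene-set in subject $i$.
• For all three models, we can test the null hypothesis of no pathway association by testing a single parameter:
$H_0: \tau = 0$
• In practice, we may wish to reject H0 if at least one of the three tests is significant. Therefore, let us also consider the omnibus test (OT) whose p-value is the minimum of the p-values from SKAT, cBT, and tBT.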
More statistical challenges
• Multiplicity correction, the issues:
– Multiple models: multiple models used to test pathway association.
– Correlation: 2434 pathways are not independent. Many share the same genes (more detail in next slide).
• The solution for multiplicity correction in the presence of feature correlation:
– Resampling-based multiple testing can adjust p-values to account for multiple testing by incorporating correlation and other distributional characteristics.
Many gene-sets are correlated
• Many of the K=2434 gene-sets have overlapping genes. Some are even identical.
• For example, in 10 gene-sets, the following relation holds:
[Equation not recoverable: relation among the gene-sets G1, G2, ..., G10]
• It is of interest to efficiently test multiple hypotheses.
1. Diminish the number of gene-sets for association testing
2. Test all K gene-sets and then efficiently correct for multiple testing
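One hypothetical way to pursue option 1 is to collapse gene-sets that contain exactly the same genes, so that each distinct set is tested only once. The slides do not say how the number of gene-sets would be reduced, so the sketch below is only an assumption; `collapse_identical` and the toy labels are invented for illustration.

```python
def collapse_identical(gene_sets):
    """Group gene-set names that share exactly the same member genes."""
    groups = {}
    for name, genes in gene_sets.items():
        # frozenset ignores gene order and can be used as a dictionary key.
        groups.setdefault(frozenset(genes), []).append(name)
    return groups


# Toy example: G1 and G2 are identical, G3 overlaps but differs.
toy_sets = {
    "G1": ["g1", "g2"],
    "G2": ["g2", "g1"],
    "G3": ["g1", "g2", "g3"],
}
toy_groups = collapse_identical(toy_sets)
distinct_count = len(toy_groups)
```

Collapsing only removes exact duplicates; overlapping but non-identical gene-sets remain correlated, which is why option 2 is still needed.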
Review: Single-step methods for multiplicity adjustment
• Let us distinguish the random p-value from the experimentally observed p-value using capital and lowercase letters, respectively:
$P = (P_1, \ldots, P_K)$, $p = (p_1, \ldots, p_K)$
Also, denote the complete null hypothesis by
$H_0^C = \bigcap_{i=1}^{K} H_{0i}$
• Single-step methods are simultaneous test procedures that perform equivalent multiplicity adjustments for all tests. E.g.,
– Bonferroni: $\tilde{p}_i = \min(K p_i, 1)$
– Sidak: $\tilde{p}_i = 1 - (1 - p_i)^K$
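The two single-step adjustments are simple enough to write down directly. The following is a minimal Python sketch using the standard Bonferroni and Sidak formulas; the function names and the toy p-values are illustrative only.

```python
def bonferroni(p_values):
    """Bonferroni single-step adjustment: min(K * p_i, 1)."""
    k = len(p_values)
    return [min(k * p, 1.0) for p in p_values]


def sidak(p_values):
    """Sidak single-step adjustment: 1 - (1 - p_i)^K."""
    k = len(p_values)
    return [1.0 - (1.0 - p) ** k for p in p_values]


# Toy p-values for K = 2 tests (hypothetical values).
toy_p = [0.01, 0.5]
toy_bonferroni = bonferroni(toy_p)
toy_sidak = sidak(toy_p)
```

Both functions apply the same correction to every test, which is what makes them single-step methods. Neither looks at how the tests are related to one another.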
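Bonferroni and Sidak methods require certain assumptions to control the FWER.
• $\text{FWER} = \Pr(\min_{1 \le i \le K} \tilde{P}_i \le \alpha \mid H_0^C)$
• Bonferroni:
$\text{FWER} = \Pr(\min_{1 \le i \le K} K P_i \le \alpha \mid H_0^C)$
$\le \sum_{i=1}^{K} \Pr(P_i \le \alpha/K \mid H_0^C)$   (Bonferroni inequality)
$= K(\alpha/K) = \alpha$   ($P_i \mid H_{0i} \sim U(0,1)$ for all $i$)
• Sidak:
$\text{FWER} = \Pr(\min_{1 \le i \le K} [1 - (1 - P_i)^K] \le \alpha \mid H_0^C)$
$= 1 - \prod_{i=1}^{K} \Pr(1 - (1 - P_i)^K > \alpha \mid H_0^C)$   (Independence)
$= 1 - ((1 - \alpha)^{1/K})^K = \alpha$   ($P_i \mid H_{0i} \sim U(0,1)$ for all $i$)
Without independence, equality becomes ≤ for very general conditions.
• Bonferroni and Sidak methods fail to incorporate dependence and distributional characteristics of the observed p-values. Both limitations are concerns.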
Resampling-based multiple testing (RBMT)
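If we knew the joint distribution of $P \mid H_0^C$, then we could compute
$\tilde{p}_i = \Pr(\min_{1 \le j \le K} P_j \le p_i \mid H_0^C)$
so that
$\text{FWER} = \Pr(\min_{1 \le i \le K} \tilde{P}_i \le \alpha \mid H_0^C) \le \alpha$
In practice, we may not know the distribution of P. However, in many cases vectors P*, having the same distribution as $P \mid H_0^C$, may be simulated via resampling:
$P^* \sim P \mid H_0^C \;\Rightarrow\; \min_{1 \le j \le K} P_j^* \sim \min_{1 \le j \le K} P_j \mid H_0^C$
We can therefore compute
$\tilde{p}_i = \Pr(\min_{1 \le j \le K} P_j^* \le p_i)$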
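Illustration
Gene-set (i)     Observed p-value (p_i)   R resampled p-values (P_i*)
1                p_1                      p*_{1,1}   p*_{1,2}   …   p*_{1,R}
2                p_2                      p*_{2,1}   p*_{2,2}   …   p*_{2,R}
⁞
K                p_K                      p*_{K,1}   p*_{K,2}   …   p*_{K,R}
Column minimum                            min_i p*_{i,1}   min_i p*_{i,2}   …   min_i p*_{i,R}
• Each row of resampled p-values reflects the distribution of the p-value under H0.
• Each column reflects the dependence between tests.
• Comparing p_i with the column minima gives the (multiple testing) adjusted p-value.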
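The illustration can be turned into a small computation: take the minimum over gene-sets within each resample, then report the fraction of those minima that fall at or below each observed p-value. The sketch below is a minimal Python version of this minimum-p adjustment; the function name `minp_adjust` and the toy numbers are hypothetical, and a real analysis would use a much larger R.

```python
def minp_adjust(observed, resampled):
    """Single-step min-p adjusted p-values.

    observed:  list of K observed p-values.
    resampled: K lists, each holding R resampled p-values for one gene-set,
               where position r in every list comes from the same resample.
    """
    n_resamples = len(resampled[0])
    # Column minima: the smallest p-value across gene-sets in each resample.
    column_minima = [min(row[r] for row in resampled) for r in range(n_resamples)]
    # Adjusted p-value: share of resamples whose minimum is <= the observed p.
    return [sum(m <= p for m in column_minima) / n_resamples for p in observed]


# Toy example with K = 2 gene-sets and R = 4 resamples.
toy_observed = [0.01, 0.2]
toy_resampled = [
    [0.5, 0.02, 0.3, 0.9],
    [0.1, 0.6, 0.005, 0.4],
]
toy_adjusted = minp_adjust(toy_observed, toy_resampled)
```

Because the minimum is taken within each resample, any correlation between the gene-set tests carries over into the reference distribution automatically.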
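Our approach to simulating P*: “Simultaneous” bootstrap test
1. Estimate residuals under the null model:
$Y = X'\alpha + \varepsilon$, $\varepsilon \sim N(0, \sigma^2)$, giving $\hat{\alpha}$, $\hat{\varepsilon}$, $\hat{\sigma}^2$
2. Generate pseudo-data via parametric bootstrap:
$Y^* = X'\hat{\alpha} + \varepsilon^*$, $\varepsilon^* \sim N(0, \hat{\sigma}^2)$
3. Perform GSAT on pseudo-data, giving for each gene-set $k = 1, \ldots, K$:
$p^*_{k,r,\text{SKAT}}$, $p^*_{k,r,\text{cBT}}$, $p^*_{k,r,\text{tBT}}$, $p^*_{k,r,\text{OT}}$
4. Repeat R times ($r = 1, \ldots, R$).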
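The bootstrap loop can be sketched in a few lines of Python under strong simplifying assumptions, all of which are mine rather than the authors': the null model has an intercept only (no covariates), the only test run on the pseudo-data is a counting-based burden test, and its Wald p-value uses a normal approximation. The names `wald_p` and `bootstrap_null_pvalues` and the toy data are hypothetical.

```python
import math
import random
import statistics


def wald_p(y, x):
    """Two-sided Wald p-value (normal approximation) for the slope of y on x."""
    n = len(y)
    x_bar, y_bar = statistics.fmean(x), statistics.fmean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    if sxx == 0:  # burden does not vary across subjects: nothing to test
        return 1.0
    slope = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    rss = sum((yi - y_bar - slope * (xi - x_bar)) ** 2 for xi, yi in zip(x, y))
    se = math.sqrt(rss / (n - 2) / sxx)
    if se == 0:
        return 0.0
    return math.erfc(abs(slope / se) / math.sqrt(2))


def bootstrap_null_pvalues(y, burdens, n_resamples, seed=0):
    """Return one list of K p-values per resample, from the fitted null model."""
    rng = random.Random(seed)
    mu_hat = statistics.fmean(y)      # fitted mean of the intercept-only model
    sigma_hat = statistics.stdev(y)   # residual standard deviation under the null
    results = []
    for _ in range(n_resamples):
        # Pseudo-data drawn from the fitted null model (parametric bootstrap).
        y_star = [mu_hat + rng.gauss(0.0, sigma_hat) for _ in y]
        # The same pseudo-data is tested against every gene-set, so the
        # dependence between overlapping gene-sets is preserved.
        results.append([wald_p(y_star, x) for x in burdens])
    return results


# Toy data: 8 subjects, 3 gene-sets; the first two burdens are identical.
toy_y = [-12.0, 5.0, 30.0, -8.0, 2.0, 41.0, -20.0, 15.0]
toy_burdens = [
    [0, 1, 2, 0, 1, 3, 0, 2],
    [0, 1, 2, 0, 1, 3, 0, 2],
    [1, 0, 0, 1, 0, 0, 1, 1],
]
p_star = bootstrap_null_pvalues(toy_y, toy_burdens, n_resamples=20, seed=1)
```

Note that each resample here is one list of K p-values. Reusing one pseudo-outcome for all gene-sets is what makes the test "simultaneous": identical gene-sets receive identical resampled p-values.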
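Intuition: How does P* estimate P|H0C?
• Under H0C, the observed data follow the null model $Y = X'\alpha + \varepsilon$, $\varepsilon \sim N(0, \sigma^2)$, and yield the p-values $p_{k,\text{SKAT}}$, $p_{k,\text{cBT}}$, $p_{k,\text{tBT}}$, $p_{k,\text{OT}}$ for $k = 1, \ldots, K$.
• The pseudo-data follow the fitted null model $Y^* = X'\hat{\alpha} + \varepsilon^*$, $\varepsilon^* \sim N(0, \hat{\sigma}^2)$, and yield the p-values $p^*_{k,r,\text{SKAT}}$, $p^*_{k,r,\text{cBT}}$, $p^*_{k,r,\text{tBT}}$, $p^*_{k,r,\text{OT}}$ for $k = 1, \ldots, K$ and $r = 1, \ldots, R$.
• Hence $P^* \approx P \mid H_0^C$.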
NGS study results: Significant gene-sets
Adjusted p-value
GS   SKAT    cBT     tBT     OT
1    0.931   0.872   0.015   0.018
2    0.094   0.010   0.465   0.022
3    0.798   0.017   0.784   0.037
4    0.803   0.580   0.037   0.044
5    0.875   0.482   0.044   0.052
6    0.038   0.223   0.057   0.067
7    0.022   0.063   0.116   0.134
8    0.007   0.823   0.967   0.178
9    0.044   0.974   0.999   0.432
Values below 0.05 are significant at “α=.05”.
Efficiently correcting for multiple testing increases # of significant gene-sets from 0 to 4.
Top pathway from analysis
Resampling based multiplicity adjusted p-value=0.0184
How does resampling-based approach compare to other single-step methods for p-value adjustment
• Our approach: resampling-based approach for multiplicity adjustment.
• Other single-step methods for multiplicity adjustment:
– If we knew the effective number of independent tests Ke, then we could apply the Bonferroni or Sidak adjustments:
$\tilde{p}_i = \min(K_e p_i, 1)$, $\tilde{p}_i = 1 - (1 - p_i)^{K_e}$
– One can estimate Ke based on biology or from a purely statistical standpoint:
  • Linkage disequilibrium structure: Ke = number of major LD blocks
  • Principal components analysis (PCA) of P*: Ke = number of principal components that explain x% of the variation in Pi*’s
• Like PCA, resampling-based adjustments that use the minimum statistic rely on P*.
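The PCA-based rule for Ke reduces to counting how many leading eigenvalues are needed to reach x% of the total variation. The Python sketch below assumes the eigenvalues of the covariance (or correlation) matrix of the resampled p-values have already been computed elsewhere; the function name `effective_tests` and the toy eigenvalues are hypothetical.

```python
def effective_tests(eigenvalues, x_percent):
    """Smallest number of principal components explaining at least x% of variation."""
    total = sum(eigenvalues)
    cumulative = 0.0
    # Principal components are ordered from largest to smallest eigenvalue.
    for k, value in enumerate(sorted(eigenvalues, reverse=True), start=1):
        cumulative += value
        if 100.0 * cumulative / total >= x_percent:
            return k
    return len(eigenvalues)


# Toy eigenvalues (hypothetical): the first two components carry 80% of the variation.
toy_eigenvalues = [5.0, 3.0, 1.0, 1.0]
ke_80 = effective_tests(toy_eigenvalues, 80)
ke_90 = effective_tests(toy_eigenvalues, 90)
```

The resulting Ke depends on the chosen x%, which is a tuning decision; the minimum-statistic adjustment uses the same P* without requiring that choice.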
Simulation to compare resampling-based approach with other single-step methods for p-value adjustment
• Rather than consider all K=2434 gene-sets, we considered only 10 gene-sets. These gene-sets are highly related:
[Equation not recoverable: relation among the gene-sets G1, G2, ..., G9]
• Using parameter estimates from the fitted null model of the original data, we randomly simulated outcomes under a null model or a desired alternative model.
• For the first of 10,000 simulated studies, we performed RBMT by generating R=10,000 bootstrap resamples.
Simulation results: Effective number of independent tests (Ke) for OT
Method          Cutoff for p   Est. FWER   x%
Sidak, Ke=1     0.0500         0.363       29.8
Sidak, Ke=9     0.0057         0.051       83.4
Sidak, Ke=30    0.0017         0.016       100.0
RBMT            0.0056         0.050       −
Table. In order to control FWER ≤ 0.05, different methods propose different cutoffs for the p-values. We provide the estimated FWER for each proposed cutoff, as well as the x% of variation in Pi* explained by the first Ke PCs (slide 24). Notice how similar RBMT is to the Sidak correction with Ke=9.
RBMT: $\tilde{p} = \Pr(\min_{1 \le j \le 10} P_j^* \le p \mid H_0^C)$
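The Sidak cutoffs in the table follow from inverting the Sidak adjustment: a raw p-value is declared significant at level α when it is at most 1 - (1 - α)^(1/Ke). A short Python check, with `sidak_cutoff` as an illustrative name:

```python
def sidak_cutoff(alpha, k_effective):
    """Raw p-value cutoff that gives level alpha under a Sidak correction."""
    return 1.0 - (1.0 - alpha) ** (1.0 / k_effective)


# Cutoffs for the three values of Ke considered in the table, at alpha = 0.05.
cutoffs = {ke: round(sidak_cutoff(0.05, ke), 4) for ke in (1, 9, 30)}
```

Rounded to four decimals, these reproduce the 0.0500, 0.0057 and 0.0017 listed for Ke = 1, 9 and 30.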
Summary
• NGS attrition rate was high due to DNA quality and other issues.
• Univariate analysis did not have adequate power due to the reduced sample size.
• A multivariate approach based on pathways was considered. We used 4 types of test (SKAT, cBT, tBT, and OT) for the association between 2434 gene-sets and %TSCB.
• We used a resampling-based approach to correct for multiple testing.
• Our efforts led to the identification of a gene-set as a statistically significant and biologically interesting pathway.
References
• Elbers C.C., et al. (2009) Using genome-wide pathway analysis to unravel the etiology of complex diseases. Genet Epi 33, 419-431.
• Wang K., Li M., and Bucan M. (2007) Pathway-based approaches for analysis of genomewide association studies. Am J Hum Genet 81, 1278-1283.
• Yu K., et al. (2009) Pathway analysis by adaptive combination of p-values. Genet Epi 33, 700-709.
• Westfall P.H. and Young S.S. (1993) Resampling-based multiple testing: Examples and methods for p-value adjustment. John Wiley & Sons Inc, New York.
• Wu M.C., et al. (2011) Rare variant association testing for sequencing data with the sequence kernel association test. Am J Hum Genet 89, 82-93.