Supplementary Figure 1

Brain sample characteristics.

(a-b) Number of reads mapped to the hg19 reference genome (a) and mean read quality scores (b) for 131 ASD and 111 CTL brain tissue samples sequenced in this study. (c-e) Numbers of mapped reads, mean read quality scores, distribution of age, RIN, PMI, numbers of male and female individuals, numbers of prefrontal (FC) and temporal cortex (TC) samples, and numbers of samples from different brain banks (H, Harvard-ATP; N, NICHD-BTB) are shown for the ASD and CTL groups in 95 cortex samples used for the main DGE analysis (c), 109 cortex samples used for the main WGCNA analysis (d), and 47 cerebellum samples used for DGE analysis (e). Boxplot whiskers encompass data points within 1.5-fold interquartile ranges of the lower and upper quartiles.
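The whisker convention described above can be sketched as follows. This is a minimal illustration, not the authors' plotting code: the function name `whisker_range`, the choice of quartile method, and the example values are all assumptions.

```python
from statistics import quantiles

def whisker_range(values):
    """Return (lower whisker, upper whisker, outliers) under the 1.5 x IQR rule."""
    data = sorted(values)
    # Quartile definitions differ between tools; "inclusive" is an assumption here.
    q1, _median, q3 = quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    low_fence = q1 - 1.5 * iqr
    high_fence = q3 + 1.5 * iqr
    # Whiskers reach the most extreme data points that still lie inside the fences.
    inside = [v for v in data if low_fence <= v <= high_fence]
    outliers = [v for v in data if v < low_fence or v > high_fence]
    return inside[0], inside[-1], outliers

# Hypothetical values, not data from the study.
print(whisker_range([1, 2, 3, 4, 5, 6, 7, 8, 100]))
```

Points outside the fences are the ones a boxplot would draw individually.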

Nature Neuroscience: doi:10.1038/nn.4373

Supplementary Figure 2

Principal component analysis and hierarchical sample clustering.

(a) Pearson correlations between different covariates and principal components (PCs) 1-5 of miRNA expression data (normalized for library size) for all 216 samples that passed quality control. Pearson correlation coefficients (R) and P values are indicated. Diagnosis (ASD vs. CTL), sex (female vs. male), and library preparation method (RiboZero selection vs. total RNA) were treated as binary numeric variables. Region (frontal cortex, temporal cortex, or cerebellum) and brain bank were treated as multi-level factor variables and adjusted R² was calculated using a linear model between these two variables and other variables. Between region and brain bank, a chi-square test was performed and the P value was indicated in the heatmap (R was non-applicable). (b) Scatter plot of 216 samples based on PCs 1 and 2 of the miRNA expression data (normalized for library size). Dots are colored according to brain region. (c) Hierarchical clustering of all 216 samples using expression data (normalized for library size) of all expressed miRNAs. Information on diagnosis, age, sex, brain region, co-morbidity of seizures, psychiatric medication history, RIN, PMI, brain bank, and library preparation method is indicated with color bars below the dendrogram according to the legend on the right. (d) Pearson correlations between different covariates and PCs 1-5 of miRNA expression data (normalized for library size) for 95 cortex samples used for the main DGE analysis. Pearson correlation coefficients (R) and P values are indicated. Diagnosis (ASD vs. CTL), sex (female vs. male), region (frontal vs. temporal), and brain bank (Harvard-ATP vs. NICHD-BTB) were treated as binary numeric variables. (e) Scatter plot of 95 cortex samples based on PCs 1 and 2 of the miRNA expression data (normalized for library size). Dots are colored according to brain region. (f) Hierarchical clustering of 95 cortex samples using expression data (normalized for library size and other technical covariates) of all expressed miRNAs. Information on diagnosis, age, sex, brain region, co-morbidity of seizures, psychiatric medication history, RIN, PMI, and brain bank is indicated with color bars below the dendrogram according to the legend on the right.
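The two kinds of covariate association described in panels (a) and (d) can be sketched as below: a Pearson correlation for a binary covariate coded as 0/1, and an adjusted R² from a one-factor linear model for a multi-level covariate. The function names and the toy PC scores are hypothetical, and the sketch assumes the PC scores have already been computed elsewhere.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def adjusted_r2_factor(values, levels):
    """Adjusted R^2 of the linear model values ~ factor(levels)."""
    n = len(values)
    groups = {}
    for v, lev in zip(values, levels):
        groups.setdefault(lev, []).append(v)
    grand_mean = sum(values) / n
    ss_total = sum((v - grand_mean) ** 2 for v in values)
    # Residuals are deviations from each level's own mean.
    ss_resid = sum((v - sum(g) / len(g)) ** 2 for g in groups.values() for v in g)
    r2 = 1 - ss_resid / ss_total
    k = len(groups)  # number of factor levels, i.e. k - 1 predictors plus intercept
    return 1 - (1 - r2) * (n - 1) / (n - k)

# Hypothetical PC1 scores; diagnosis coded CTL = 0, ASD = 1 (assumed coding).
pc1 = [1.0, 2.0, 3.0, 4.0]
diagnosis = [0, 0, 1, 1]
print(pearson_r(pc1, diagnosis))

# Hypothetical PC scores against a three-level region factor.
scores = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
region = ["FC", "FC", "TC", "TC", "CB", "CB"]
print(adjusted_r2_factor(scores, region))
```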

Supplementary Figure 3

Robustness of DGE results.

(a) Comparison of miRNA fold changes in ASD vs. CTL between 10 rounds of random sampling of 70% of the samples and the original 95 cortex samples. Pearson correlation coefficients (R) and P values are indicated. Red line, y=x. (b-d) Comparison of miRNA fold changes in ASD vs. CTL between all 95 cortex samples and subsets of samples with RIN ≥ 5 (b), PMI ≤ 30 hrs (c), or no 15q duplication (d). Pearson correlation coefficients (R) and P values are indicated. Red line, y=x. (e) Comparison of miRNA fold changes in ASD vs. CTL between 95 cortex and 47 cerebellum samples. Red line, y=x; black line, linear regression between fold changes in the cerebellum and the cortex for miRNAs differentially expressed in the cortex (FDR < 0.05). Pearson’s R and P values are indicated. (f-g) Normalized log2(expression level) of down-regulated (f) and up-regulated (g) miRNAs in ASD (red) and control (blue) samples (n = 5 - 8) detected by qRT-PCR. Statistical significance was assessed using two-sided t-tests assuming unequal variance.
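A t-test "assuming unequal variance" is commonly known as Welch's t-test. The sketch below computes only the test statistic and the Welch-Satterthwaite degrees of freedom; the function name `welch_t` and the example values are assumptions, and the P value would still have to be read from a t distribution, for example via `scipy.stats.ttest_ind` with `equal_var=False`.

```python
from math import sqrt
from statistics import mean, variance

def welch_t(a, b):
    """Return (t statistic, Welch-Satterthwaite degrees of freedom)."""
    # Squared standard error of each group mean, using sample variances.
    va = variance(a) / len(a)
    vb = variance(b) / len(b)
    t = (mean(a) - mean(b)) / sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df

# Hypothetical normalized log2 expression values, not data from the study.
asd = [1.0, 2.0, 3.0, 4.0, 5.0]
ctl = [2.0, 4.0, 6.0, 8.0, 10.0]
print(welch_t(asd, ctl))
```

Unlike the pooled-variance t-test, the degrees of freedom here are generally non-integer and shrink when the two group variances differ strongly.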

Supplementary Figure 4

Robustness, age dependence, and preservation of miRNA co-expression modules.

(a) Construction of consensus modules by bootstrapping for 200 rounds. Module assignment based on the original 109 samples, consensus module assignment based on 200 rounds of bootstrapping, and module assignment in each of the 200 resampled networks are shown below the dendrogram. The three ASD-associated modules are largely stable with perturbations to the initial subject and regional identity. The fraction of times each gene was assigned to the same module as in the consensus module is reported in Supplementary Table 2. (b-c) Comparison of ASD vs. CTL fold changes between samples from younger (15 - 30 years) and older (> 30 years) individuals for miRNAs in the yellow (b) and magenta (c) modules. Red line, y=x; black line, regression line between fold changes in the younger individuals and the older individuals for miRNAs differentially expressed (P < 0.05) in the younger set (orange and magenta dots). (d-g) Preservation of modules defined in ASD (d), CTL (e), TC (f), or FC (g) samples only in CTL (d), ASD (e), FC (f), or TC samples (g). (h-i) Module preservation analysis in two independent datasets: 31 independent cortex samples sequenced in this study (h) (Methods) and 167 samples covering 16 brain regions (neocortex, subcortical regions, thalamus, and cerebellum) and different ages (4 months to 19 years old) from the BrainSpan project (i). A module is considered not preserved if preservation Zsummary < 2, moderately preserved if 2 ≤ Zsummary < 10, and highly preserved if Zsummary ≥ 10.
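The preservation thresholds stated in the caption amount to a simple three-way classification, sketched here. The function name and the example scores are hypothetical; the Zsummary values themselves would come from a module preservation analysis run elsewhere.

```python
def preservation_class(z_summary):
    """Classify a module by its preservation Zsummary using the caption's cutoffs."""
    if z_summary < 2:
        return "not preserved"
    if z_summary < 10:
        return "moderately preserved"
    return "highly preserved"

# Hypothetical Zsummary scores keyed by module color, for illustration only.
example_scores = {"yellow": 12.4, "magenta": 6.1, "grey": 0.8}
for module, z in example_scores.items():
    print(module, preservation_class(z))
```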

Supplementary Figure 5

Experimental validation of TargetScan predicted miRNA targets.

Distribution (left) and cumulative distribution (right) of mRNA log2(fold change) in response to over-expression of hsa-miR-21-5p in hNPCs. Orange line, the strongest mRNA targets predicted by TargetScan (summed context+ score ≤ -0.1); purple line, the most conserved mRNA targets predicted by TargetScan (branch length in the top 25%, context+ score ≤ -0.05); red line, all mRNAs predicted to be hsa-miR-21-5p targets by TargetScan (the above two categories combined); green line, hsa-miR-21-5p targets documented in miRTarBase (452 miRNAs); black line, mRNAs not predicted to be hsa-miR-21-5p targets by TargetScan. Statistical significance between target groups and non-targets was assessed using one-sided t-tests assuming unequal variance.

Supplementary Figure 6

Enrichment of ASD risk genes within the top targets of ASD-affected miRNAs and miRNA modules while controlling for 3’ UTR length.

(a) Heatmap showing enrichment of ASD SFARI genes, ID genes, ASD rare variants, FMRP targets, PSD genes, embryonically expressed genes, and chromatin modifiers, assessed using a logistic model that incorporates gene 3’ UTR length. P values were FDR corrected across 10 target groups for each gene list. (b) Heatmap showing enrichment of genes affected by multiple categories of de novo variants (DNVs), assessed using a logistic model that incorporates gene coding region length and gene 3’ UTR length. P values were FDR corrected across 10 target groups for each DNV category. (c) Heatmap showing enrichment of ASD-associated developmental gene co-expression modules in human cortex, assessed using a logistic model that incorporates gene 3’ UTR length. P values were FDR corrected across 10 target groups for each developmental module. Enrichment odds ratios (OR) and FDR corrected P values are shown for enrichments with FDR < 0.05.
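The caption does not name the FDR procedure. Assuming the widely used Benjamini-Hochberg method, correcting the P values of the 10 target groups within one gene list could look like the sketch below; the function name and the example P values are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted P values in the input order."""
    m = len(p_values)
    # Indices of the P values from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

# Hypothetical raw P values for 10 target groups tested against one gene list.
raw = [0.001, 0.2, 0.03, 0.5, 0.004, 0.8, 0.04, 0.6, 0.9, 0.01]
fdr = benjamini_hochberg(raw)
significant = [i for i, q in enumerate(fdr) if q < 0.05]
print(fdr)
print(significant)
```

The correction is applied separately to each gene list, DNV category, or developmental module, so each call receives exactly the 10 P values belonging to one row of tests.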

Supplementary Figure 7

GO analysis for the predicted targets of differentially expressed miRNAs and ASD-related miRNA modules.

(a-b) Top relevant gene ontology categories (GO-Elite software, uncorrected P < 0.01, number of enriched genes > 5) for the strongest (a) or the most conserved (b) targets of the down-regulated and up-regulated miRNAs and the ASD-related miRNA modules. Enrichment Z scores represent relative enrichment in the targets compared to the background (Methods), with the red line at Z = 2. Asterisks indicate GO terms with FDR-corrected P values < 0.10.

Supplementary Figure 8

Effect of miRNAs on mRNA expression changes.

(a-h) Comparison of log2(fold change) distribution of down-regulated (a,b,e,f) or up-regulated (c,d,g,h) mRNAs (FDR < 0.05) that are predicted to be the strongest (a-d) or the most conserved (e-h) targets of the down-regulated or up-regulated miRNAs (a,c,e,g) or miRNA modules (b,d,f,h) to log2(fold change) distribution of differentially expressed mRNAs that are not predicted targets. The number of mRNAs in each group is indicated in brackets. One-tailed Wilcoxon rank sum tests were performed to compare each target group to the non-targets. *P < 0.05, **P < 0.01, ***P < 0.001. mRNAs that are predicted targets of both down-regulated and up-regulated miRNAs and miRNA modules were not included in the analysis.

Supplementary Figure 9

Correlation between differentially expressed miRNAs and mRNAs.

(a-c) Correlations between the PC1s of differentially expressed miRNAs (FDR < 0.05, |log2(fold change)| ≥ 0.3) and differentially expressed mRNAs (FDR < 0.05) that are predicted targets. (a) All differentially expressed miRNAs vs. all differentially expressed mRNAs; (b) up-regulated miRNAs vs. down-regulated mRNAs; (c) down-regulated miRNAs vs. up-regulated mRNAs. Pearson correlation coefficients (R) and P values within 47 CTL or 54 ASD samples alone, or the combined samples together are shown below the plots.

Supplementary Figure 10

Further characterization of hsa-miR-21-3p and hsa_can_1002-m.

(a) miRNA expression fold changes in the cortex plotted against percentile rank of mean expression levels across 95 cortex samples, with hsa-miR-21-3p and hsa_can_1002-m highlighted. (b-c) Views from the UCSC Genome Browser (https://genome.ucsc.edu) showing the chromosome locations and pre-miRNA sequences of hsa-miR-21-3p (b) and hsa_can_1002-m (c). The mature sequences and seed regions are indicated with black lines and green rectangles, respectively. Within the seed region of hsa_can_1002-m, a single-nucleotide difference indicated with a red asterisk was observed between the human and all other primate sequences. Gene annotations, multiple alignments of corresponding regions in different vertebrates, and measurements of evolutionary conservation using phyloP are shown. (d-g) Boxplots showing expression patterns of hsa-miR-21-3p (d,e) and hsa_can_1002-m (f,g) in different human brain regions (d,f) and at different developmental stages (e,g). DFC, dorsolateral prefrontal cortex; VFC, ventrolateral prefrontal cortex; MFC, medial prefrontal cortex; OFC, orbital frontal cortex; M1C, primary motor cortex; S1C primary somatosensory cortex; IPC, posteroinferior parietal cortex; A1C, primary auditory cortex; STC, posterior superior temporal cortex; ITC, inferolateral temporal cortex; V1C, primary visual cortex; HIP, hippocampus; AMY, amygdaloid complex; STR, striatum; MD, mediodorsal nucleus of thalamus; CBC, cerebellar cortex. Infancy, 4 months - 1 year; early childhood, 2 - 4 years; late childhood, 8 - 13 years; adolescence, 15 - 19 years; adulthood, 21 - 40 years. Boxplot whiskers indicate data points within 1.5-fold interquartile ranges of the lower and upper quartiles.
