
Investigation of Myc-regulated long non-coding RNAs in cell cycle and Myc-dependent transformation

by

Matthew Steven MacDougall

A thesis submitted in conformity with the requirements for the degree of Master of Science

Graduate Department of Laboratory Medicine and Pathobiology University of Toronto

© Copyright by Matthew Steven MacDougall (2012)

Abstract

Myc deregulation critically contributes to many cancer etiologies. Recent work suggests that Myc and its direct interactors can confer a distinct epigenetic state. Our goal is to better understand the Myc-conferred epigenetic status of cells. We have previously identified the long non-coding RNA (lncRNA), H19, as a target of Myc regulation and shown it to be important for transformation in lung and breast cells. These results prompted further analysis to identify similarly important Myc-regulated lncRNAs. Myc-regulated lncRNAs associated with the cell cycle and transformation have been identified by microarray analysis. A small number of candidate lncRNAs that were differentially expressed in both the cell cycle and transformation have been validated. Given the increasing importance of lncRNAs and epigenetics to cancer biology, the discovery of Myc-induced, growth associated lncRNAs could provide insight into the mechanisms behind Myc-related epigenetic signatures in both normal and disease states.


Acknowledgments

To my supervisor, Dr. Linda Penn: thank you for your mentorship; you have truly taught me to

always challenge myself. Your guidance has been invaluable to my interest in pursuing a career

in academic medicine.

To my supervisor, Dr. Philip Marsden: your guidance goes far beyond the bench as you always

find new ways to challenge me to think critically. Your mentorship, from both a scientific and

clinical perspective, has helped me greatly.

To both of my supervisors: this opportunity to be co-supervised has been wonderful, thank you.

To my committee members: Dr. Cheryl Arrowsmith, Dr. Senthil Muthuswamy, and Dr. Rod

Bremner. Thanks for providing the knowledge and suggestions to keep me on track.

To the Penn lab members past and present: thank you for all of your support. In particular, I

thank Sam Kim, Christina Bros, and Romi Ponzielli for helpful discussions and suggestions

when it mattered most.

To the Marsden lab members past and present: thank you for all of your support. In particular, I

thank Jeff Man, Paul Turgeon, Matt Yan, and David Ho for intellectual discussions and

contributions. A big thank you goes to Maria Chalsev for her continued support in plasmid

preparation.

To my family and friends: thank you for all of your endless support and encouragement as I

pursue my goals.

Attribution of Work:

The work on the MCF-10As and the cell cycle could not have been achieved without the

optimization and experiments performed previously by Andrew Rust. Moreover, the work

completed using the MCF-10As and the 3D model of transformation was an extension of

Amanda Wasylishen’s work on Myc phosphorylation mutants in that model. Therefore, much of

the optimization and generation of stable expressing cell lines in that model were established

previously by her.


Table of Contents

Abstract
Acknowledgments
Table of Contents
List of Figures
Abbreviations

Chapter 1 Introduction
1.1 Myc Structure/Function
1.2 Myc Transcriptional Activity and the Cell Cycle
1.3 Myc and Epigenetics
1.4 H19 – a Myc-induced regulatory, non-coding RNA
1.5 Long non-coding RNA Function
1.6 Rationale for the investigation of Myc-regulated lncRNAs
1.7 Working Model
1.8 Assumptions
1.9 Hypothesis
1.10 Objective

Chapter 2 Results
2.1 MCF-10A Cell-based model development
    Introduction to MCF-10A cells
    Mitogen starved and stimulated MCF-10A cells synchronously enter the cell cycle
    Expression profiling of Myc and Myc targets in MCF-10A cells within the cell cycle
    MCF-10A cells, grown on Matrigel, can undergo Myc-dependent transformation
2.2 Global long non-coding gene expression profiling
    Identification of cell cycle associated lncRNAs
    Identification of Myc-induced transformation associated lncRNAs
    Identification of lncRNA genes common to Myc-induced transformation and the cell cycle
2.3 Application of Inclusion Criteria
2.4 Candidate Expression and Selection
2.5 Expression Validation
2.6 Validation of Publicly Available Myc-ChIP data in MCF-10A cells
2.7 Functional Validation and Expression Profiling of Candidates lncRNA-LY6E and lncRNA-FZD6

Chapter 3 Discussion
3.1 Candidate lncRNA profiling
3.2 Large scale lncRNA profiling
3.3 MCF-10A cell system
3.4 Future Directions
3.5 Conclusions and Implications

Chapter 4 Methods
4.1 Cell Culture
    Reagents
    MCF-10A cells
4.2 Immunoblotting
    Whole Cell Extracts
    SDS-PAGE
    Antibodies
4.3 Quantitative Real-time PCR
    RNA Isolation
    cDNA Synthesis
    Primer Design
    Relative and Absolute Quantification
4.4 Flow Cytometry
4.5 MCF-10A: Model of Cell Cycle Entry
    Seeding Density
4.6 MCF-10A: Myc-dependent Model of Transformation
4.7 MCF-10A: Model of Myc-dependent Gene Regulation in the absence of other stimuli
4.8 Gene Expression Array
    Sample Preparation
    Arraystar Microarray Analysis
4.9 Bioinformatic Analysis
    Preprocessing: Array Background Adjustment, raw mRNA data normalization, and low intensity filtering
    Inclusion Criteria
    Candidate Selection Criteria
4.10 Nuclear-Cytoplasmic Partitioning
4.11 ChIP-qRT-PCR
4.12 Statistical Analysis

References
Appendices


List of Figures

Figure 1: Pictorial representation and hypotheses of Myc’s role in the epigenome
Figure 2: MCF-10A cells respond rapidly to mitogen withdrawal and induction
Figure 3: MCF-10A cells synchronously entering the cell cycle show coordinated Myc and Myc target gene expression
Figure 4: MCF-10A cells acinar morphogenesis
Figure 5: MCF-10A cells stably over expressing Myc-T58A form transformed, multiacinar structures
Figure 6: Flow Chart of the strategy for systematic identification of cell cycle associated and Myc-induced transformation associated lncRNAs
Figure 7: Identification of cell cycle associated lncRNAs
Figure 8: Identification of Myc-induced transformation associated lncRNAs
Figure 9: Identification of lncRNAs common to G0/G1 to S phase progression of the cell cycle and Myc-induced transformation
Figure 10: Expression validation of 6 candidate lncRNAs
Figure 11: Myc binds the promoter of lncRNA-LY6E
Figure 12: lncRNA-LY6E is dynamically regulated in cell cycle and Myc-dependent transformation
Figure 13: Schematic working model of how Myc-repression contributes to epigenetic regulation
Figure 14: Distribution of differentially expressed genes by normalized array intensity

Table 1: Characteristics of candidate lncRNAs from the expression profiling union between cell cycle and transformation
Table 2: Primer List

Supplemental Figure 1: Cell cycle seeding density optimization
Supplemental Figure 2: Model of Myc-dependent Gene Regulation
Supplemental Figure 3: lncRNA-FZD6 is dynamically regulated in cell cycle and Myc-dependent transformation
Supplemental Figure 4: Candidate lncRNAs are not nuclear retained under asynchronous growing conditions
Supplemental Figure 5: 8 hours of Myc induction under starvation conditions does not lead to significant changes in candidate lncRNA expression


Abbreviations

AOMF         Advanced Optical Microscopy Facility
APC          adenomatous polyposis coli
bHLH-LZ      basic helix-loop-helix-leucine zipper
BLAST        basic local alignment search tool
CCNB1        cyclin B1
CCND1/D2     cyclin D1/D2
CCNE         cyclin E
cDNA         complementary DNA
ChIP         chromatin immunoprecipitation
c-Myc        cellular Myc
CNV          copy number variation
CoREST/REST  corepressor/repressor element-1 silencing transcription factor
COSMIC       Catalogue of Somatic Mutations in Cancer
CTCF         CCCTC-binding factor
CTD          C-terminal domain
DNMT3a       DNA methyltransferase 3a
EDTA         Ethylenediaminetetraacetic acid
EGF          epidermal growth factor
ENCODE       encyclopedia of DNA elements
ENSEMBL      EMBL-EBI and the Sanger Centre Genome Browser
EST          expressed sequence tag
EZH2         enhancer of zeste 2
Fos          c-Fos, FBJ murine osteosarcoma viral oncogene homolog
FZD6         frizzled 6
GADD45       growth arrest and DNA damage inducible protein, 45
GAS1         growth arrest specific protein 1
GCN5         general control of amino acid synthesis protein 5
GEO          Gene Expression Omnibus
GFP          green fluorescent protein
GSK3β        Glycogen Synthase Kinase 3 beta
H19          imprinted maternally expressed transcript
H2A.Z        histone family H2A, member Z
HAT          histone acetyl transferase
HOX          Homeobox
ICR          imprinting control region
Igf2         insulin-like growth factor 2
Jun          c-Jun, jun proto-oncogene
KDM5A/B      lysine demethylase 5A/5B, JARID 1A/1B
lncRNA       long non-coding RNA
LSD1         lysine demethylase 1
LY6E         lymphocyte antigen 6 complex, locus E
miRNA        microRNA
Miz1         Myc-interacting zinc finger protein 1
MTA1         metastasis associated 1
NCBI         National Center for Biotechnology Information
ncRNA        non-coding RNA
NTD          N-terminal domain
NuRD         nucleosome remodeling deacetylase
ORF          open reading frame
p21          cyclin-dependent kinase inhibitor 1A
p27          cyclin-dependent kinase inhibitor 1B
PBS          phosphate buffered saline
PCA3         prostate cancer antigen 3
PRC2         polycomb repressive complex 2
pTEFb        positive transcription elongation factor b
qRT-PCR      quantitative real-time polymerase chain reaction
SDS-PAGE     sodium dodecyl sulfate polyacrylamide gel electrophoresis
SNP          single nucleotide polymorphism
SUZ12        suppressor of zeste 12
TCF          T-cell factor
TRRAP        transformation/transcription domain-associated protein
UCSC         University of California at Santa Cruz Genome Browser
UHN          University Health Network
Wnt          wingless-type MMTV integration site family
Xist         X inactive-specific transcript


Chapter 1 Introduction

c-Myc (Myc) is a major player in many functions of the cell. As a proto-oncogene, Myc has

prominent roles in cellular functions ranging from cell cycle, growth, and metabolism to cell

death, differentiation, and angiogenesis (1, 2). Given that Myc is involved in such diverse

processes, it is also highly regulated within the cell. Myc is regulated at the transcriptional, post-

transcriptional, and post-translational levels (3, 4, 5). With its short mRNA and protein half-lives

as well as its tight transcriptional regulation, Myc is the convergence point of many cellular

signaling pathways and thus responds rapidly to cues sensed by the cell in the extracellular

milieu.

When the tight regulation of Myc is lost, cellular transformation often follows (2, 6). The loss of

Myc’s tight regulation, or its deregulation, can occur by several mechanisms that affect the level

and activity of Myc in different ways. Among these mechanisms can be genomic events that

include insertional mutagenesis near the Myc promoter (7, 8, 9), chromosomal translocation (10),

and amplification (11) leading to constitutive Myc transcription. Other mechanisms of

deregulation include upstream signaling pathway dysregulation. An example includes loss of

APC from the Wnt pathway resulting in Myc transcription induced by TCF and β-catenin (12).

The diverse means of Myc deregulation lead to a broad range of cancer types that originate from

many different cell types. The first example of the importance of Myc in human cancer came

with the discovery of Myc translocation in Burkitt Lymphoma (13, 14). As an extension of this,

Myc amplification is commonly seen in many human solid tumours (15). Aside from the genetic

events that directly implicate Myc, the deregulation that can occur at the level of upstream

signaling on Myc is also relevant to cancer. For example, APC/Wnt deregulation of Myc is


common to colon cancer (16) whereas dysregulation involving the NOTCH pathway is

prominent in T-cell acute lymphoblastic leukemia (17, 18, 19). Myc deregulation is therefore

commonly observed in many human cancers. Clinically, Myc’s inhibition could be an important

avenue for future therapeutic development. The feasibility of this approach has been

demonstrated using a dominant-negative inhibitor of Myc that, when expressed throughout the

whole animal in vivo, inhibited Myc’s activity and led to tumour eradication (20). Therefore the

study of Myc in both normal and disease states can provide a greater understanding of Myc’s

normal functions and their role in the genesis of human cancer.

1.1 Myc Structure/Function

The first member of the Myc family to be discovered was v-myc. It was one of the first transforming oncogenes isolated from avian tumour viruses that specifically led to a leukemia called myelocytomatosis (21, 22, 23). The MC29 viral genome where Myc was discovered transduced the gene from the host genome, which is now called cellular Myc (c-Myc or Myc) (24). The family of Myc also includes c-Myc, L-Myc, N-Myc, S-Myc, and B-Myc (25, 26, 27, 28, 29). The transforming family members, c-Myc, L-Myc, and N-Myc, are expressed during fetal development of organisms (2). c-Myc is the only transforming member that is also expressed in the normal adult making it the primary focus of this thesis. It is important to note that each of the Myc transforming family members is deregulated in cancer including N-Myc and L-Myc (25, 26, 27).

The MYC gene encodes the Myc protein. Myc regulates its broad cellular functions through its action as a transcription factor. It is therefore largely nuclear retained and localized to chromatin. From the classical transcription factor functional perspective, Myc has the ability to either directly activate or repress transcription (30). It achieves this as a member of the basic helix-loop-helix leucine zipper (bHLH-LZ) family of DNA-binding transcription factors (31, 32). The DNA-binding abilities of this family are dependent on dimerization with other family members. Myc dimerization mostly occurs with Max (33).

Together as a heterodimer, Myc and Max target DNA in a sequence specific manner at the

canonical E-box cis-regulatory element whose sequence is 5’-CACGTG-3’ (34, 35). Like most

proteins, Myc is modular and requires the heterodimerization and DNA-binding capabilities

within its C-terminal domain (CTD). This region contains the bHLH-LZ domain of Myc. The

helix-loop-helix-leucine zipper (HLH-LZ) is essential for dimerization with Max. The basic

region (b) of the bHLH-LZ is responsible for DNA binding in the major groove. Lastly, this

CTD region also contains Myc’s nuclear localization signal (36). Together, all of the modules

within the CTD of Myc are required for all of its functions (37).
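As an illustration of the sequence-specific recognition described above, the following Python sketch scans a DNA string for the canonical E-box, 5’-CACGTG-3’. The function names and the example sequence are hypothetical and chosen only for illustration; this is a minimal sketch of motif matching, not a model of Myc-Max binding, which also depends on factors beyond sequence.

```python
import re

# Canonical E-box sequence as given in the text (5'-CACGTG-3').
EBOX = "CACGTG"


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A/C/G/T/N only)."""
    complement = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}
    return "".join(complement[base] for base in reversed(seq.upper()))


def find_eboxes(seq, motif=EBOX):
    """Return 0-based start positions of every motif occurrence.

    A zero-width lookahead is used so that overlapping matches
    would also be reported.
    """
    pattern = "(?=" + re.escape(motif.upper()) + ")"
    return [match.start() for match in re.finditer(pattern, seq.upper())]


# Hypothetical sequence, invented for this example only.
example_promoter = "ttgaCACGTGccatCACGTGaa"
print(find_eboxes(example_promoter))
```

Because CACGTG is its own reverse complement, scanning a single strand is enough to find canonical E-boxes on both strands; `reverse_complement(EBOX) == EBOX` confirms this.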

As a transcription factor, Myc recruits cofactors and the basic transcriptional machinery for

proper function. Myc achieves this through its CTD and an N-terminal domain (NTD).

Within the NTD are several modules important for Myc function called Myc-homology boxes I-

IV (MBI, MBII, MBIII, MBIV). Some of these regions have been demonstrated to be important

for both transcriptional activation and repression (30, 38). Although this has not been

investigated in full, it is thought that these regions are functionally important due to their ability

to interact with other proteins (39). Some examples of Myc’s cofactors that bind to the NTD are

histone acetyltransferases (HATs) such as TRRAP-GCN5 (40). The NTD is not the only protein

interaction domain, though; cofactors such as the histone demethylase KDM5A/B can also bind

elsewhere in Myc, within the CTD (41).


1.2 Myc Transcriptional Activity and the Cell Cycle

Myc expression largely correlates with the growth state of the cell so an important measure of

Myc’s function in the cell is its ability to regulate the cell growth and division cycle. Sustained

and dysregulated signaling in the cell cycle is a hallmark of cancer and often implicates Myc;

therefore it is a primary focus of this thesis (42). In particular, cell cycle regulation by Myc

involves its transcription activating and repressing activities. Myc functions as an immediate-early

response gene by directly regulating genes that are important in cell growth during the G0/G1

to S transition (43, 44). This includes the activation of cyclin D genes (CCND1 and CCND2) and

cyclin dependent kinase 4 as well as the repression of growth arrest associated genes such as

p21, p27, GADD45, and GAS1 (45). Induction or repression of these genes may or may not be

mitogen dependent as it has also been suggested that Myc induction alone under mitogen

withdrawal conditions can induce cell cycle entry (46). Regardless, through the ability to regulate

transcription, Myc can orchestrate the necessary gene expression changes for cell cycle entry.

Myc function in the cell cycle context necessitates a description of how Myc can facilitate

transcription in greater detail. Myc transactivation activity first requires DNA binding.

Subsequent activation of transcription can then occur by several different pathways including

chromatin remodeling and polymerase-pause release. These functions are likely linked. It is

believed that Myc transcriptional activation involves, in part, its control of histone acetylation

through interaction with HATs (47). These transactivation mechanisms are best illustrated by an

example: the cell cycle gene CCND2. Mitogen stimulation leads to the

recruitment of TRRAP by Myc, which in turn recruits GCN5, a HAT, that leads to CCND2

promoter acetylation and activation (Figure 1A) (48, 49). Furthermore, Myc can also recruit the

cyclin dependent kinase complex, P-TEFb, which has been implicated in mediating the


phosphorylation of the CTD of RNA polymerase II and release of the stalled polymerase at the

CCND2 promoter (50).

In contrast to Myc transactivation, Myc transrepression does not necessarily require DNA binding

as Myc can indirectly access the DNA through protein-protein interactions (51, 52). Therefore, it

is predominantly E-box independent (53, 54). Like transactivation, repression can also occur

through several, albeit less well-characterized, mechanisms. In general these require Max as well

as Miz1, a zinc finger protein (55, 56). A primary cell cycle-related example is the

transrepression of the growth arrest gene, p21. Myc, in complex with Max and Miz1, can recruit

the DNA methyltransferase (DMT), DNMT3a, which leads to DNA methylation at the promoter

and repression of p21 transcription (57). Therefore, this mechanism may be particularly relevant

for Myc target genes with CpG island promoters. Similar to transactivation, chromatin

remodeling via alteration of histone modifications can occur at the p21 promoter. Once again the

complex of Myc and Miz1 can recruit KDM5B, which is a histone demethylase (KDM, Lysine

demethylase). Recruitment results in the demethylation of histone H3 at lysine 4 (H3K4) and the

subsequent down regulation of p21 transcription (Figure 1B) (58).

Taken together, transcriptional activation and repression by Myc occurs through diverse

mechanisms in order to control and orchestrate its many functions. In particular, emphasis has

been placed on Myc regulation of cell cycle associated genes. These genes represent an

important class of Myc-regulated target genes as their over-expression or repression can lead to

profound cellular transformation (59). The mechanisms of how Myc regulates the transcription

of its targets and the function of those downstream targets have been the focus of therapeutic

target development efforts as these could have broad clinical utility. Therefore, this thesis aims to

emphasize novel perspectives on Myc’s target gene function and regulation.


1.3 Myc and Epigenetics

A description of Myc-activated and Myc-repressed targets in the cell cycle yields the observation

that Myc-mediated transcriptional regulation requires the recruitment of many different

cofactors. The types of these cofactors vary by target genes, but a theme emerges in the types of

proteins that Myc can recruit. These are enriched for proteins that regulate chromatin structure

and histone modification. It is known from large scale genomic studies that Myc can bind large

numbers of genes, with some reports suggesting Myc binds 10-15% of the genome (1, 60).

Regardless of number, Myc binding is widespread (61). At these Myc-bound targets, Myc appears

to be a rather weak transcription factor in that it changes target gene expression only

modestly, by about two-fold (45, 60). These characteristics set it apart from many classical

transcription factors and those within the bHLH-LZ superfamily in that these proteins tend to

restrict their binding to a distinct set of genes and induce or repress their expression to great

magnitude (62, 63). Thus, it seems that Myc’s subtle effects on expression and its ability to act

globally set it at the interface between classical transcription factor and epigenetic gene

regulation.

Subtle but global effects on gene expression could have major implications for cell biology.

Myc may achieve this by dynamic interaction with and alteration of chromatin state (64). Because

Myc is a DNA-binding transcription factor, the accessibility of DNA is a critical determinant of its

genome binding. Open, accessible DNA constitutes euchromatin, whereas

heterochromatin is characterized as repressed and less accessible. The chromatin structure is

regulated, in part, by histone tail modifications. Myc DNA binding is associated with active

histone modifications such as H3K4me3 that open chromatin (65). Furthermore, it seems that

E-box sequence specificity is less important to Myc than chromatin context, as the

E-boxes in regions of repressed chromatin are not bound by Myc. In addition, within euchromatic

regions, Myc associates with DNA in the absence of E-boxes directly or indirectly through

protein complexes (Figure 1C) (66). Overall, Myc’s dynamic interaction with the chromatin

landscape may be a means for the cell to direct specific Myc gene expression programs in a

context dependent manner.

Once bound to the DNA, Myc can then alter histone modifications through its protein-protein

interactions (66, 67). It can achieve this locally by enhancing acetylation as a means of direct

transcriptional regulation, described previously (49). Myc can also act on chromatin more

globally; in particular, Myc induction is positively correlated with global euchromatic marks and

H2A.Z variant exchange and negatively correlated with heterochromatic marks (68). More

interestingly, some Myc-dependent chromatin marks can occur in the absence of local Myc

binding suggesting that Myc can alter chromatin state in an indirect fashion (69). There are

several postulates that could explain this phenomenon. Myc binding in a specific region could

mediate distal chromatin modification by chromosomal looping and other higher order DNA

structures. Additionally, Myc could be associated with DNA indirectly through interaction with

other DNA binding transcription factors. In this way, Myc could have access to chromatin

without being directly bound to DNA (64).

Myc’s ability to indirectly interact with and alter chromatin structure seems to be best explained

by the idea that Myc’s own target genes could function to directly alter chromatin. The primary

example of Myc’s role in the epigenome involves the induction of GCN5, an activating HAT

previously described as a Myc interactor (70). The induction of GCN5 leads to a strong

increase in overall, genome-wide acetylation (67). Since Myc can also directly bind GCN5, this

may be occurring through a feed forward mechanism (Figure 1E, left). Other induced target

genes that support this concept and demonstrate that this observation is not limited to chromatin

activating modifiers are CCCTC-Binding factor (CTCF) and Metastasis Associated 1 (MTA1).

CTCF is an example of an insulator protein. It has been implicated in blocking the spread of

heterochromatic, repressive histone marks. Myc’s induction of this gene could result in the

facilitation of Myc’s role in the maintenance of active chromatin context as well (71). Lastly,

MTA1 is part of the repressive NuRD complex, which functions to deacetylate and remodel

histones (Figure 1D, Left). Altogether, the most feasible explanation of Myc’s indirect regulation

of chromatin is that it does so through its target genes’ inherent epigenetic activating, repressing,

and maintenance functions (Figure 1D & E).

Thus far, Myc-induced transcriptional targets have been emphasized, but Myc mediated gene

repression may also play a role in this model. Myc has been shown to regulate regulatory, non-

coding RNAs (ncRNA) on a large scale (2, 6). In particular, Myc exhibits widespread repression

of micro RNA (miRNA) expression (72). These genes primarily exhibit post-transcriptional

control of gene expression through RNA interference and some reports have suggested

interaction with epigenetic pathways (73). Therefore, the repressed targets of c-Myc as well as its

ncRNA targets may also play a critical role in Myc’s regulation of the epigenome.

1.4 H19 – a Myc-induced regulatory, non-coding RNA

A finding that began to integrate the idea that Myc depended on epigenetic context for binding

and could regulate ncRNAs other than miRNAs came with the discovery that Myc could regulate

H19 (74). H19 is an imprinted regulatory RNA. As part of the Igf2-H19 locus, H19 is maternally

expressed in early development and only differentiated cardiac and skeletal muscle have

measurable expression after this time, in a fully developed organism (75). It has been

demonstrated that parent-of-origin specific expression of H19 is regulated epigenetically by

Figure 1: Pictorial representation and hypotheses of Myc’s role in the epigenome.

A schematic representation of Myc’s direct and known roles in regulating the epigenome in

transcriptional activation (A), transcriptional repression (B), and intergenic regulation (C). D &

E) Left, A schematic representation of Myc’s indirect roles in regulating the epigenome through

its target genes or direct interaction with its own target genes (feed forward). Right,

Demonstrates the overall hypothesis that Myc’s role in the epigenome may, in part, be regulated

by lncRNAs by similar mechanisms.

differential methylation at an imprinting control region (ICR). Specifically, this ICR is upstream

of the H19 promoter (76). Since Igf2 is further upstream of the ICR from H19, when the ICR is

unmethylated on the maternal allele CTCF can bind and allow enhancers, downstream of the

Igf2-H19 locus, to enhance H19 expression. Conversely, when the ICR is methylated, expression

of Igf2 is activated in lieu of H19 by the locus enhancers (77). Myc has been shown to bind and

induce H19 at the maternal allele under the unmethylated ICR conditions suggesting methylation

status as another means of Myc’s interaction with the epigenome (74). This is consistent with

later findings that Myc’s interaction with E-boxes is methylation dependent outside of imprinted

loci (78).

The function of H19 in normal biology is not well understood, but its expression is associated

with positive and negative regulators of growth. H19 has also been shown to be important to

tumour biology (reviewed in (77)). Knockdown of H19 leads to reduced colony formation of

cells grown in soft agar, a measure of transformation (74). Several theories of its function have been

suggested; most relevant to this thesis is the concept that H19 is a transcriptional regulator. The

discovery that Xist, another ncRNA, can coat the inactive X chromosome and mediate gene

repression by chromatin modification and DNA methylation led to the hypothesis that H19 could

regulate gene expression in a similarly epigenetic fashion (79). To date, the only piece of

evidence to support this is that the deletion of H19 leads to altered methylation at the Igf2 locus

(80). This would further support the idea that Myc could regulate epigenetic effector genes as a

means of controlling gene expression globally. Furthermore, it may suggest that Myc can

achieve this through ncRNA regulation and that this regulation could have important

implications for cancer biology.

1.5 Long non-coding RNA Function

Interestingly, H19 and Xist are examples of a newly recognized class of regulatory ncRNAs

called long non-coding RNAs (lncRNAs). For many years, this type of regulatory RNA was

thought to be rare, with H19 and Xist being unique examples. After their discovery in the early

1990s, they added evidence to an already ongoing debate over the function of the non-coding

genome. This debate largely stems from the conundrum that a mere 2-3% of the genome codes

for protein, once thought to be the primary effector molecule of the cell. Elucidating the role of

the remaining 97-98% of the genome has been fraught with many challenges generally focused

around limitations of current technologies (81). The ENCODE project has yielded evidence that

suggests that 93% of the genome is actively or “pervasively” transcribed (82). This has added to

the number of “dark matter” transcripts, those RNA molecules with unknown function (83), and

led to the hypothesis that there may be biological relevance to these transcripts. Recent evidence

using high throughput sequencing technology suggests that the complexity of the RNA

transcribed from these regions should not be underestimated (84). Counter arguments to the

existence of the transcription from the so called “junk DNA” regions include the concept that it

could be biological noise (81, 85).

The debate has begun to shift away from the existence of these transcripts towards a dispute over

how to define pervasive transcription of the genome and what it entails (86, 87, 88). Amidst the

shift in debate, a window has opened in which numerous groups have identified transcripts,

distinct from coding genes, with important functions in the cell, much like H19 and Xist

(reviewed in (89)). As such, this will function as the definition of lncRNAs in this thesis.

Early experiments that began to establish that there were many lncRNAs interspersed throughout

the genome utilized chromatin marks. In particular, since histone 3 trimethylation at lysine 4 and

lysine 36 (H3K4 and H3K36) marked active sites of protein coding gene transcription, it was

hypothesized that intergenic regions containing this mark would provide a means of

systematically identifying lncRNAs. Through the use of tiling arrays across these H3K4-K36

marked intergenic regions, over 1000 new lncRNAs were discovered (90). The discovery opened

the field to more widespread studies such as one particularly relevant to this work; the discovery

of extensive transcription of lncRNAs in cell cycle promoters. This work showed numerous

lncRNAs in the vicinity of known cell cycle gene promoters and demonstrated that some exhibit

temporal-specific expression in different phases of the cell cycle (91). The class of genes

identified by Hung et al. (91) represents a unique class of lncRNAs in cell cycle promoter

regions. The group does not address their mechanisms of action in the cell cycle or the idea that

more lncRNAs outside of cell cycle protein coding gene promoters could also be functional in

the cell cycle. As such, these experiments set the stage for this thesis. Overall, the lncRNA class

is broad reaching and many of its functions, ranging from development to cancer, have been

demonstrated through the use of knockdown and over expression genetic studies, which provides

evidence in favor of their relevance.

Advances in understanding these novel lncRNA genes have been made in spite of their poor

sequence conservation and low abundance when compared to their protein coding counterparts.

To this end, Ulitsky et al. (92) have suggested through studies of lncRNA genes in zebrafish that

conservation of synteny, or genomic co-localization with neighboring genes, is the most relevant

form of conservation for lncRNAs. This was achieved by showing that zebrafish lncRNA

knockout phenotypes could be rescued with the expression of their human or mouse counterparts

that were conserved in synteny, but minimally at the sequence level (92). Therefore, the lncRNA

genes may require some reconsideration of our criteria for evaluating a gene’s functional and

disease relevance. In light of these considerations, this thesis aims to explore the functions of

these lncRNAs and the role they play as Myc target genes.

lncRNAs have diverse cellular roles. As described with H19 and Xist, they can function in

imprinting and dosage compensation, respectively. Moreover, they can play important roles in

development and differentiation among many other functions (93, 94). They are differentially

expressed in human disease and, in particular, cancer and thus could provide novel insight in

cancer biology and as therapeutic targets (95). Excitingly, in these contexts they display highly

tissue specific expression (96, 97) (Marsden Lab, unpublished). The mechanisms through which

these lncRNAs act to regulate their diverse functions in distinct cellular and tissue compartments

seem to be primarily associated with transcription and chromatin regulation (98).

One such lncRNA that best illustrates the diverse epigenetic roles of this class of genes is called

HOTAIR. HOTAIR was discovered using high resolution tiling arrays in the HOX loci (99).

Rinn and colleagues demonstrated anatomic specific expression of the HOX genes and their

neighboring lncRNAs, including HOTAIR. More specifically, they demonstrated that HOTAIR,

which is expressed at the boundary of active and inactive chromatin in the HOXC locus, acts to

down-regulate, in trans, the HOXD locus through an interaction with SUZ12, a part of the

polycomb repressive complex 2 (PRC2). Consistent with this, knockdown of HOTAIR led to increased

expression of HOXD genes. HOTAIR also interacts with LSD1, a part of the CoREST/REST

lysine demethylase, repressor complex (100). Specifically, it seems that HOTAIR can act as a

scaffold between SUZ12-PRC2 and LSD1-CoREST/REST through HOTAIR’s 5’ and 3’ ends,

respectively. Therefore, it seems that HOTAIR acts to integrate the removal of active

methylation histone marks with the addition of repressive methylated histone marks.

Additionally, it seems that there may be some cell cycle dependence for the interaction of

HOTAIR with these protein complexes. Specifically, cell cycle specific phosphorylation of

EZH2, another component of PRC2, leads to increased HOTAIR binding (101). Taken together,

HOTAIR is an lncRNA that can integrate the writing and erasing functions of two important

chromatin modifiers. In turn this complex regulates a specific set of genes and may be

coordinated in the cell cycle by phosphorylation of PRC2 components (102).

Much like H19, HOTAIR also provides an example of an lncRNA that can be functionally

important in cancer biology. Given HOTAIR’s localization to the HOX locus and its potentially

cell cycle coordinated roles, it has been shown that HOTAIR deregulation can alter chromatin

states of breast tissue in such a way that promotes metastasis (103). Additionally, HOTAIR

overexpression and associated chromatin structure alterations have been observed as a

characteristic in many other cancers including colorectal and hepatocellular carcinoma (104,

105).

1.6 Rationale for the investigation of Myc-regulated lncRNAs

In conclusion, the lncRNA class has many functionally similar genes that can regulate gene

expression at the level of the epigenome by interacting with, directing, and integrating the

functions of chromatin modifying complexes genome wide like HOTAIR. This function can

have important implications in human diseases including cancer. Importantly, these genes are

regulated much like mRNAs and preliminary data from the ENCODE project suggests that Myc

can bind the regulatory regions of a large proportion of these genes much like H19. As Myc

induces chromatin modulating, protein coding genes, I believe that Myc can regulate the

expression of lncRNAs and that these lncRNAs can similarly contribute to Myc’s indirect

regulation of epigenetics (Figure 1D, right) or a feed forward mechanism wherein Myc induces

the transcription of, and binds to, a target lncRNA to regulate chromatin changes (Figure 1E,

right). Therefore, gaining an understanding of Myc-regulated lncRNAs could aid in better

understanding Myc’s role in the epigenome and may unveil novel therapeutic targets.

Given that Myc plays such a prominent role in many cancers, another challenging aspect of

studying Myc-regulated lncRNAs was selection of a model. To study the normal functions

of Myc and how their deregulation could lead to cancer, we sought to discover Myc-regulated

lncRNAs in a near-normal cell line model that, when Myc is overexpressed,

transforms into cells with the characteristics of cancer. Since Myc has been demonstrated to be

associated with breast cancer progression and late-stage aggressive breast tumours (reviewed in

(106)), a mammary epithelial cell line model called MCF-10A, previously characterized in

the lab, was selected for these purposes.

1.7 Working Model

Myc deregulation critically contributes to many cancer etiologies in part through its regulation of

cell cycle and cell growth related genes. Less known is Myc’s role in epigenetic gene regulation

both in the cell cycle and at large. Though this epigenetic gene regulation could occur through a

variety of different mechanisms, Myc’s ability to induce the expression of H19 suggests that the

long non-coding RNA genes could figure in prominently. As such, lncRNA biology could

contribute to our understanding of Myc’s normal and cancer epigenetic functions.

1.8 Assumptions

Several working assumptions have been made regarding this work. The first is that Myc

functions as an epigenetic regulator in the cell cycle and that it achieves this through an as yet

unknown group of lncRNAs. Based on the first assumption, it is then assumed that cell cycle

associated lncRNAs will have overlapping roles in the process of transformation founded upon

our knowledge of the dysregulation of the cell cycle as a hallmark of cancer.

1.9 Hypothesis

I hypothesize that Myc can regulate lncRNAs in the cell cycle and that these could be important

in Myc-dependent transformation.

1.10 Objective

Evaluation of this hypothesis will make use of gene expression microarray technology to discover

and profile gene expression of Myc-regulated lncRNAs. Using this technology, the objective is

to profile lncRNA expression in the cell cycle. lncRNA expression will then be profiled in a

Myc-dependent model of transformation, which will be used to inform and prioritize the cell

cycle lncRNAs that are potentially Myc and cancer associated. Functional characterization of

these lncRNAs will then ensue.

Chapter 2 Results

2.1 MCF-10A Cell-based model development

Introduction to MCF-10A cells

The MCF-10A cell line was isolated from a patient with benign, fibrocystic disease. Cells from

the mastectomy, initially characterized as cytogenetically normal, were cultured in low calcium

medium for an extended amount of time and gave rise to spontaneously immortalized, adherent

MCF-10A cells (107). The MCF-10A cells have been confirmed as mammary epithelial cells

that bear resemblance to basal cells and are estrogen receptor negative (108). These cells, though

viewed as normal breast epithelial cells, were later cytogenetically characterized as not normal.

They are near diploid with few genomic lesions (109, 110). These cells have a wild type p53

status and are non-transformed as is evidenced by their inability to form tumours in nude mice

xenograft experiments (107, 111). Importantly, Myc levels are elevated, but they remain highly

responsive to extracellular cues. Therefore, these cells provide an approximate model of non-

transformed mammary epithelial cells to study the role and regulation of Myc in the context of

breast tumourigenesis.

Mitogen starved and stimulated MCF-10A cells synchronously enter the cell

cycle

Gaining an understanding of Myc’s roles in normal cell biology is essential to better evaluating

its role in the deregulated state commonly observed in neoplasia and malignancy. One of Myc’s

prominent functions that can go awry is its role in the cell cycle. The nature of MCF-10A cells

being non-transformed means that they are still responsive to extracellular stimuli and a good fit

Figure 2: MCF-10A cells respond rapidly to mitogen withdrawal and induction

A) Fixed propidium iodide (PI) flow cytometry was conducted on MCF-10A cells that were

asynchronously growing, starved of mitogens for 24 hours, or starved for 24 hours and

subsequently exposed to mitogens for 8 to 24 hours. Bars represent the mean fraction of the cell

population with a given amount of PI stain for N = 2-6 where the error bars represent standard

deviation. Since PI stains DNA, ‘pre-G1’ represents <2N DNA content, ‘G1’ – 2N, ‘S’ – 2N to

4N, and ‘G2/M’ – 4N. B) MCF-10A cells were monitored for phenotypic changes by light

microscopy. MCF-10A cells that were asynchronously growing (left panel) and starved of

mitogens for 24 hours (right panel), as in A, are shown to highlight their qualitative differences.

for studying normal cellular functions like the cell cycle. Given that MCF-10A cells are

epithelial, they require and are highly responsive to mitogens like epidermal growth factor

(EGF) (107). Similarly, they maintain their ability to contact inhibit cell cycle progression

through the down-regulation of EGF-dependent signaling (112). Their EGF dependence for cell

cycle progression has been established (113). Therefore, the goal was to use the MCF-10A cells,

taking advantage of their mitogen and EGF dependence, to study Myc’s role in the cell cycle.

MCF-10A cells that were asynchronously growing at low density, 40-50% confluence (Figure 2A,

‘Growing’; Figure 2B, Left; Supp. Figure 1) were withdrawn from EGF, horse serum, cholera

toxin, and hydrocortisone, which are components of their normal growth medium (See Methods).

On average greater than 80% of the cells respond by arrest in a 2N, G1/G0 phase of the cell cycle

as measured by fixed propidium iodide flow cytometry (Figure 2A, ‘Starved’). Phenotypically,

this leads to changes in cellular morphology marked by a rounded appearance of the islands of

epithelial cells on a two dimensional surface (Figure 2B, Right) as compared to the

asynchronously growing cells (Figure 2B, Left). Subsequent release of these cells from arrest by

reintroducing all of the aforementioned mitogen media components leads to a concerted and

synchronous reentry into the cell cycle. In particular over the 8-24 hour time course, this entails

the decrease of the 2N, G1 peak to a minimum at 16 hours. Concurrently, the S phase and G2/M

populations approach their maximum values around 16-18 hours (Figure 2A, ‘8h-24h’).
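
As an illustration only, the assignment of cells to cell cycle phases by DNA content (pre-G1 below 2N, G1 at 2N, S between 2N and 4N, G2/M at 4N) could be sketched as follows. This is not the flow cytometry software used in this work; the function names, the peak positions, and the tolerance around each peak are hypothetical values chosen for the example.

```python
from collections import Counter

PHASES = ("pre-G1", "G1", "S", "G2/M")


def classify_phase(dna_content, g1_peak=2.0, g2_peak=4.0, tolerance=0.2):
    """Assign one cell to a phase from its DNA content, in units of N.

    g1_peak, g2_peak and tolerance are illustrative assumptions, not the
    gates applied to the PI histograms in the thesis.
    """
    if dna_content < g1_peak - tolerance:
        return "pre-G1"   # less than 2N DNA content
    if dna_content <= g1_peak + tolerance:
        return "G1"       # 2N peak
    if dna_content < g2_peak - tolerance:
        return "S"        # between 2N and 4N
    return "G2/M"         # 4N peak


def phase_fractions(dna_contents):
    """Return the fraction of the population falling in each phase."""
    counts = Counter(classify_phase(x) for x in dna_contents)
    total = len(dna_contents)
    return {phase: counts.get(phase, 0) / total for phase in PHASES}
```

Applied to each time point of the 8-24 hour release, such fractions would give the bar heights of the kind plotted in Figure 2A.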

Expression profiling of Myc and Myc targets in MCF-10A cells within the

cell cycle

In order to confirm that Myc is playing a role in the observed entry into the cell cycle, expression

profiling at the protein and transcript level was performed for Myc, its transcriptional targets and

Figure 3: MCF-10A cells synchronously entering the cell cycle show coordinated Myc and

Myc target gene expression

A) Immunoblotting with antibody targeting Myc and other indicated cell cycle markers was

performed on protein isolates from MCF-10A cells asynchronously grown (Asy), starved of

mitogens (Stv), or starved and subsequently exposed to mitogens to release the cells from arrest.

The cell cycle markers probed were Cyclin E (CCNE), c-Jun, and c-Fos. α-actin was used as a

loading control. A representative image of N=1 blots is shown. B) RNA was isolated from cells

identically treated to those in A. qRT-PCR quantification of Myc, Cyclin D2 (CCND2), p21, and

Cyclin B1 (CCNB1) relative transcript levels was conducted utilizing the ΔΔCt method with

ribosomal protein, large, P0 (RPLP0) as the endogenous control. Bar height represents the

average of N=2 biological replicates except for CCNB1 which is N=1. Error bars represent

standard deviation of those experiments with 2 biological replicates. It should be noted that these

data are only preliminary to demonstrate the aspects of MCF-10A cell cycle entry through

molecular markers.

other cell cycle markers. In the cell cycle experiment, lysates were harvested from similar time

points as previously described. As the cells prepare to divide and consistent with Myc being an

immediate early response gene, its protein level is detectable by about 1 hour post re-stimulation

of starved cells with mitogens as measured by immunoblot (Figure 3A). This is consistent with

other immediate early response genes, Fos and Jun (Figure 3A). Similarly, Myc mRNA is up-

regulated early by 8 hours, prior to the full release of cells from G1 arrest (Figure 3B, Top Left).

Cyclin D2 (CCND2) and Cyclin B1 (CCNB1) are important Myc-induced markers of G1/S

transition and G2/M, respectively. Consistent with the fixed propidium iodide flow cytometry

data, CCND2 mRNA is induced approximately 3 fold by 8 hours when the cells are preparing to

transition from G1 to S phase (Figure 3B, Top Right). Similarly, the protein levels of another

marker of G1/S transition, CCNE, increase until around 12 hours post stimulation (Figure 3A).

Conversely, CCNB1 mRNA level is upregulated at 24h with an approximately 30 fold induction

from starvation which is consistent with its role in the G2/M phase of the cell cycle (Figure 3B,

Bottom Right). Lastly, the mRNA of an important Myc-repressed marker of growth arrest, p21,

is at its maximum in the starved cells and is repressed approximately 2 fold in response to

mitogen induction by 8 hours (Figure 3B, Bottom Left). Given Myc’s well known role in

regulating the G1/S transition, these preliminary data suggest that, in the MCF-10A cells,

elevated functional levels of Myc occur by 8 hours of exposure to mitogen stimulation.
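
For readers unfamiliar with how fold inductions such as those above are derived by the ΔΔCt method, a minimal sketch follows. It assumes the standard formulation with 100% amplification efficiency, in which the target Ct is normalized to the endogenous control (here RPLP0) and then to a calibrator condition (here taken to be starved cells, as an assumption). The function name and the Ct values in the usage note are hypothetical and are not data from this thesis.

```python
def relative_expression(ct_target_sample, ct_control_sample,
                        ct_target_calibrator, ct_control_calibrator):
    """Relative transcript level by the ddCt method (2 ** -ddCt).

    Assumes perfect doubling per PCR cycle. 'control' is the endogenous
    control gene; 'calibrator' is the reference condition.
    """
    # Normalize the target gene to the endogenous control in each condition.
    delta_ct_sample = ct_target_sample - ct_control_sample
    delta_ct_calibrator = ct_target_calibrator - ct_control_calibrator
    # Compare the sample to the calibrator condition.
    delta_delta_ct = delta_ct_sample - delta_ct_calibrator
    return 2.0 ** (-delta_delta_ct)
```

With hypothetical Ct values of 24 (target) and 18 (control) in stimulated cells, and 26 and 18 in starved cells, `relative_expression(24.0, 18.0, 26.0, 18.0)` gives a 4-fold induction relative to starvation.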

MCF-10A cells, grown on Matrigel, can undergo Myc-dependent

transformation

The MCF-10A cell system is also responsive to over expression of known oncogenes. This

concept holds true for Myc over expression as well. Wildtype Myc can induce the transformation

Figure 4: MCF-10A cell acinar morphogenesis

Normal MCF-10A (top) cells plated on basement membrane mimicking extracellular membrane

substrate proliferate for approximately 8 days. Over this time they form spherical cell masses

with cells in contact with the basement membrane polarizing and initiating the signaling

dichotomy that occurs around day 8 with the initiation of cell death pathways of the inner cells

and the growth arrest of outer, polarized cells. Over the next 8 days, inner cells die and are

cleared to form the lumen of the highly ordered acinar structures. MCF-10A cells that over

express oncogenes (bottom), like Myc, form abnormal, transformed multiacinar structures that

do not undergo luminal clearing. Reprinted and Adapted by permission from Macmillan

Publishers Ltd: Nature Reviews Cancer, Debnath,J. and Brugge,J.S, 2005 (117).

of the MCF-10A cells, at least as measured by anchorage independent colony formation in soft

agar (114). Similarly, within the Myc protein, threonine 58 is a site that, when phosphorylated by

GSK3β, destabilizes Myc and targets it to proteasomal degradation (115, 116). Therefore, a Myc

phosphorylation mutant at threonine 58 (T58A) is predicted to stabilize Myc and give the protein

a dominant positive function when over expressed in MCF-10A cells. To this end, Myc-T58A

cells grown in soft agar form significantly more colonies than cells over expressing wildtype

Myc (114). Given that Myc and Myc-T58A over expression can lead to the transformation of

MCF-10A cells, a Myc-dependent model of transformation to complement the cell cycle was

needed.

The complementary model chosen was a 3D culture model that made use of the unique

characteristic of MCF-10A cells being an immortalized cell line that has maintained its ability to

form polarized structures and tight junction cell-to-cell interactions when grown on extracellular

matrix substrates (110). In fact, these glandular epithelial cells have retained their ability to form

the well-ordered, gland-like acinar structures when grown on Matrigel™ (Figure 4) (118). These

acinar structures mimic mammary gland architecture with hollowed lumens and apicobasal

polarity; some even progress to milk formation (110, 117, 119, 120). Thus, oncogenes that

interrupt the normal morphogenesis and architecture of these mammary epithelial acinar

structures could be important for understanding carcinoma formation from the early to late stages

(Figure 4).

In this model of MCF-10A acinar formation, over-expression of the potent oncogene Myc and its

more transforming mutant, Myc-T58A, disrupts normal acinar morphogenesis to

differing extents (Wasylishen AR and Penn LZ, In Preparation). If transformation in this model is

defined as the formation of disordered, multiacinar structures, then transformation can be

Figure 5: MCF-10A cells stably over expressing Myc-T58A form transformed, multiacinar

structures

A) MCF-10A cells stably expressing empty vector pMN-GFP, pMN-MYC, or pMN-T58A-MYC

grown on matrigel extracellular matrix form normal and transformed acinar structures. These

have been phenotyped and counted on day 8 of morphogenesis in a blinded fashion. The plot

shows transformation as a percentage of all acini counted (>100). The mean and the standard

deviation are shown for N=5 biological replicates; *** indicates p<0.001 by one-way ANOVA

with Bonferroni post-test for the T58A to GFP comparison. B) Representative light microscopy

images are shown for day 4 and day 8 acini. An example of a transformed, multiacinar structure

is delineated by the arrow.

measured relative to the total population of both normal and transformed acini (Figure 4,

Bottom). Using this single parameter as a measure, Myc over-expression results in a trend

towards increased transformation. More significantly, Myc-T58A over-expression leads to ~30%

transformation, which is about 3 fold over basal transformation that occurs in the empty vector

GFP cells (Figure 5A). Qualitatively, these differences are not apparent until day 8 when the

proliferative phase of morphogenesis is nearing completion (Figure 5B, ‘Day 8’). Therefore, the

working assumption is that by day 4 the necessary transcriptional programs may be in place to

establish the phenotypic differences observed on day 8 (Figure 5B, ‘Day 4’).
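
The single-parameter measure described above, transformed acini as a percentage of all acini counted, and its comparison to the empty vector control, can be written out as a short sketch. The function names and the counts in the usage note are hypothetical; they only illustrate the arithmetic and are not the counts from Figure 5A.

```python
def percent_transformed(n_transformed, n_total):
    """Transformed acini as a percentage of all acini counted."""
    if n_total <= 0:
        raise ValueError("at least one acinus must be counted")
    return 100.0 * n_transformed / n_total


def fold_over_control(percent_test, percent_control):
    """Fold change in percent transformation relative to the control line."""
    return percent_test / percent_control
```

For example, with hypothetical counts of 30 transformed acini out of 100 for Myc-T58A and 10 out of 100 for the GFP control, `fold_over_control(percent_transformed(30, 100), percent_transformed(10, 100))` evaluates to a 3-fold increase.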

2.2 Global long non-coding gene expression profiling

MCF-10A cells were used as the model system to study Myc and its ability to regulate lncRNAs

in both the normal and transformation disease context. Primarily, the global expression profiling

made use of a commercial expression microarray from Arraystar™ that has gene specific probes

that target both protein coding and lncRNAs. The lncRNAs contained on this array are

assembled from several different sources, as detailed in the Methods section of this thesis and

include both known and putative lncRNAs. The function or disease relevance of these lncRNAs

is not well known to date. In order to address this question and how Myc coordinates their

expression and function, the first step is to identify Myc lncRNA targets. Therefore, the

microarray experiment made use of both the cell cycle and Myc-induced transformation models

in MCF-10As. The rationale for profiling lncRNA gene expression in both models is that it may

provide a means of finding those genes commonly regulated in both processes (Figure 6).
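
The prioritization logic of comparing the two models can be sketched as a simple comparison of two candidate lists. The sketch below is an assumption about how such a comparison might be coded, not the procedure actually used here. Each list is represented as a mapping from a lncRNA identifier to its direction of change, and the identifiers in the usage note are invented. Requiring the same direction of change in both models is likewise an assumption made for this example.

```python
def shared_candidates(cell_cycle, transformation):
    """lncRNA IDs differentially expressed in both models, any direction."""
    return set(cell_cycle) & set(transformation)


def concordant_candidates(cell_cycle, transformation):
    """lncRNAs changed in the same direction ('up' or 'down') in both models.

    Both arguments map a lncRNA ID to its direction of change.
    """
    return {
        gene: direction
        for gene, direction in cell_cycle.items()
        if transformation.get(gene) == direction
    }
```

With hypothetical inputs `{"lncA": "down", "lncB": "up", "lncC": "down"}` for the cell cycle and `{"lncA": "down", "lncB": "down", "lncD": "up"}` for transformation, lncA and lncB are shared, but only lncA is regulated in the same direction in both models.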

Figure 6: Flow Chart of the strategy for systematic identification of cell cycle associated

and Myc-induced transformation associated lncRNAs

A) A flow chart that represents the overall strategy of identifying lncRNAs in the cell cycle and

prioritizing them through comparison with lncRNAs identified in transformation. B) The boxes in

the flow diagram represent each step of data creation or handling used for the analysis of

microarray data and candidate identification. Below each box indicates how each step selects for

a specific subset of genes targeted by probes on the array platform. Above several boxes, the

corresponding figure is indicated.

Identification of cell cycle associated lncRNAs

To profile the lncRNA gene expression changes in the cell cycle, the MCF-10A model of

synchronous cell cycle entry was used. To enrich for those lncRNAs regulated by Myc directly

in the cell cycle, it was hypothesized that, since Myc is an immediate-early response gene that

functions to regulate G1 to S transition, the prudent time for lncRNA profiling would be just

prior to cells beginning to enter S phase. This would be a time at which established Myc target

genes show regulated expression changes. The first noted increase in the S phase population after

mitogen induced release from arrest was at 10 hours and corresponded with changes in

expression of CCND2 and p21 (Figure 2A, Figure 3B). Therefore, the time point chosen for

microarray gene expression profiling was at 8 hours. The analysis of gene expression changes

was performed by Arraystar using a fold change cutoff of 1.5 fold and a p-value cutoff of 0.05

(Figure 7). When 8h mitogen induced cells were compared to cells in a starved state, there were

1,123 differentially expressed lncRNA genes (Figure 7). Of note, there seemed to be widespread

down-regulation (~70%) of lncRNA transcripts in response to mitogen stimulation. Importantly,

Myc-regulated internal control genes such as cyclin E and gadd45γ, were up- and down-

regulated, respectively, as expected.
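
The cutoff criteria described above (1.5 fold change in either direction and a p-value of 0.05) can be illustrated with a minimal sketch. The analysis itself was performed by Arraystar, so this is not their pipeline. The function names are hypothetical, fold change is assumed to be expressed as a ratio of stimulated to starved signal, and treating both cutoffs as inclusive is an assumption.

```python
import math


def passes_cutoffs(fold_change, p_value, fc_cutoff=1.5, p_cutoff=0.05):
    """True if a probe meets the fold change and p-value criteria.

    fold_change is a ratio greater than 0; values below 1 indicate
    down-regulation. Working on the log2 scale treats up- and
    down-regulation symmetrically.
    """
    meets_fold = abs(math.log2(fold_change)) >= math.log2(fc_cutoff)
    return meets_fold and p_value <= p_cutoff


def count_by_direction(probes):
    """Count significant probes by direction of change.

    probes is an iterable of (probe_id, fold_change, p_value) tuples.
    """
    counts = {"up": 0, "down": 0}
    for _probe_id, fold_change, p_value in probes:
        if passes_cutoffs(fold_change, p_value):
            counts["up" if fold_change > 1 else "down"] += 1
    return counts
```

Dividing the "down" count by the total number of significant probes would give the proportion of down-regulated lncRNAs of the kind reported above.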

Identification of Myc-induced transformation associated lncRNAs

To address the questions of disease relevance and how Myc coordinates these lncRNAs, the

MCF-10A model of Myc-dependent transformation was used. As mentioned previously, the

transformed phenotype of the MCF-10A cells is only morphologically measurable by day 8. It

was hypothesized that the transcriptional programs required for establishing any differences in

phenotype were set earlier in the proliferative phase of morphogenesis around day 4. Therefore,

Figure 7: Identification of cell cycle associated lncRNAs

A) The volcano plot of all significant and non-significant lncRNA expression changes that occur

between MCF-10A cells stimulated for 8 hours with mitogens post 24 hours of starvation and

MCF-10A cells starved of mitogens for 24 hours only. Each plotted point represents an lncRNA

gene targeting probe above the low intensity filtering threshold (B, ‘Present Probes’). The

horizontal line represents the p value cutoff of 0.05 and the two vertical lines represent the fold

change cutoff of 1.5 fold in either direction of change. lncRNA genes that fit the cutoff criteria

are in black and those that do not are in grey. B) A table summary of the important aspects of the

volcano plot in A.

Figure 8: Identification of Myc-induced transformation associated lncRNAs

A & C) The volcano plots of all significant and non-significant lncRNA expression changes that

occur between MCF-10A-Myc (A) or MCF-10A-Myc-T58A (C) and MCF-10A-GFP empty

vector cells grown on Matrigel™. Each plotted point represents an lncRNA targeting probe

above the low intensity filtering threshold (B and D, ‘Present Probes’, respectively). The

horizontal line in each plot represents the p value cutoff of 0.05 and the two vertical lines

represent the fold change cutoff of 1.5 fold in either direction of change. lncRNA genes that fit

the cutoff criteria are in black and those that do not are in grey. B & D) The table summaries of

the important aspects of the volcano plot in either A or C, respectively.

RNA for lncRNA gene expression analysis was isolated at the day 4 time point for MCF-10A

GFP (empty vector), MYC, and MYC-T58A cells. Gene expression changes would therefore be

a measure of population gene expression including both transformed and non-transformed acini.

Myc over-expressing MCF-10A cells form acini that differentially express only 89 lncRNAs

when compared with the GFP-expressing acini (Figure 8A & 8B). Consistent with more phenotypic

transformation (Figure 5), Myc-T58A over-expressing MCF-10A cells form acini that

differentially express 407 lncRNAs compared to GFP (Figure 8C). As was previously noted in

the cell cycle, these transformation associated lncRNA genes also seem to undergo widespread

down-regulation (Figure 8D). These will be identified as transformation associated lncRNAs

from this point forward.

Identification of lncRNA genes common to Myc-induced transformation and

the cell cycle

Since many Myc- and cell cycle-regulated coding genes have been shown to be key regulators of

cancer development (e.g., Cyclin D1), it was hypothesized that identifying cell cycle regulated

lncRNAs that were similarly regulated in the T58A-Myc transformation model would likely

identify Myc regulated lncRNAs important in cancer. Therefore, the differentially expressed

transformation and cell cycle associated lncRNAs have been compared for commonality. This

comparison begins to address the initially posed challenges of inferring lncRNA function,

disease relevance, and Myc’s role. As such, there were 33 lncRNAs common among those

differentially expressed genes identified (Figure 9A). An overlap of this size is larger than

would be expected by chance alone (p = 0.007, hypergeometric

distribution).
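
The overlap test can be sketched as an upper-tail hypergeometric probability. The sketch below is illustrative only: the function name `overlap_p_value` is hypothetical, and the size of the probe universe (the number of lncRNA probes passing the intensity filter) is not stated in this section, so `ASSUMED_UNIVERSE` is a placeholder and the result is not expected to reproduce the reported p = 0.007.

```python
from math import comb


def overlap_p_value(universe, size_a, size_b, overlap):
    """Upper-tail hypergeometric probability P(X >= overlap).

    X is the number of list-A genes found when size_b genes are drawn
    without replacement from a universe containing size_a list-A genes.
    """
    upper = min(size_a, size_b)
    favourable = sum(
        comb(size_a, k) * comb(universe - size_a, size_b - k)
        for k in range(overlap, upper + 1)
    )
    return favourable / comb(universe, size_b)


# Toy check: 10 genes, 4 in list A, 3 in list B, at least 2 shared.
# (C(4,2)*C(6,1) + C(4,3)*C(6,0)) / C(10,3) = 40 / 120
toy_p = overlap_p_value(10, 4, 3, 2)

# List sizes from the text (1,123 and 407 genes, 33 shared).
# The universe size below is a placeholder assumption, not a thesis value.
ASSUMED_UNIVERSE = 20000
thesis_like_p = overlap_p_value(ASSUMED_UNIVERSE, 1123, 407, 33)
print(toy_p, thesis_like_p)
```

Because the p-value depends strongly on the universe size, the number of probes present on the array would need to be substituted for the placeholder before comparing against the reported value.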

Figure 9: Identification of lncRNAs common to G0/G1 to S phase progression of the cell

cycle and Myc-induced transformation

A) Venn diagram that highlights the overlap between the cell cycle associated lncRNAs (1,123)

and the Myc-T58A transformation associated lncRNAs (407). B) Of the 33 genes identified in

the overlap of the Venn diagram in A, 20 fit the inclusion criteria highlighted in the text. B is a heat

map of the relative expression changes in those 20 candidate lncRNAs across the treatment

conditions indicated. The order starts with lncRNAs coordinately up-regulated in the conditions and

progresses to those coordinately down-regulated in the conditions.

2.3 Application of Inclusion Criteria

Given that the array contains predicted genes, a set of inclusion criteria was applied to yield a list

of 20 candidates. The following criteria were applied to better define the transcript candidates.

Primarily, application of these criteria made use of Ensembl (121) and University of California

Santa Cruz (UCSC) Genome Browser (122). These databases provide an interconnected interface

to easily view genomic data in non-coding regions of the genome including predicted transcript

and pseudogene information as well as data from the Encyclopedia of DNA Elements

(ENCODE) project (123). Information cited below that could be useful for understanding the

relevance of these candidate lncRNAs was gathered and collated in Table 1.

Gene prediction by Ensembl is achieved through the use of expressed sequence tag (EST)

information (124). First, the ESTs are aligned to the genome using the programs Exonerate

(125), BLAST (126), and EST2Genome (127). All redundant, overlapping regions are then

consolidated into a single transcript or a set of variants. GenomeWise

(128), a program that predicts all of the open reading frames (ORFs) in a given genomic

contig, is then used to assess the ORFs present in the predicted transcripts. The absence of a

significantly sized ORF results in the annotation as a non-coding processed transcript by

Ensembl. Probes on the microarray that directly overlap these predicted genes as well as those

annotated as lincRNAs and antisense RNAs were selected.

Evidence for Presence of a Transcript

These transcripts were then confirmed across databases in the UCSC Genome Browser using the

EST information deposited there. Potential spliced ESTs that spanned the full length of a known

or predicted transcript were annotated. Similarly, the UCSC Genome Browser is the depository

Table 1: Characteristics of candidate lncRNAs from the expression profiling union between cell cycle and transformation

| ID (1) | Coordinates (2) | Strand | Exons | Type (3) | Gene Upstream | Gene Downstream | Myc Bound (4) | Abundance (5) | Database | FC Absolute (mitogen) | Mitogen regulation | p-value (mitogen) | FC Absolute (3D T58A-GFP) | 3D T58A-GFP regulation | p-value (3D T58A-GFP) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| ENST00000423943 | 1:159931014-159948851 | + | 3 | Intergenic | SLAMF9 | PIGM | + | + | Ensembl | 1.590 | up | 0.017 | 1.908 | up | 0.018 |
| BC035759 | 16:3700637-3701704 | - | 1 | Antisense | TRAP1 | DNASEI | + | + | UCSC | 2.527 | up | 0.026 | 2.090 | up | 0.037 |
| ENST00000504601.1 | 17:48261457-48262313 | + | 2 | Antisense | HILS1 | COL1A | - | + | Ensembl | 1.823 | up | 0.038 | 2.887 | down | 0.044 |
| ENST00000526585 | 11:76491664-76495723 | - | 2 | Antisense | TSKU | LRRC32 | + | + | Ensembl | 5.443 | down | 0.003 | 1.717 | down | 0.003 |
| ENST00000522060 | 8:144063466-144099798 | - | 3 | Intergenic | LY6E | CYP11B2 | + | + | Ensembl | 10.334 | down | 0.002 | 1.584 | down | 0.044 |
| ENST00000443631 | 9:131486724-131495473 | + | 2 | Antisense | ZDHHC12 | ZER1 | + | + | Ensembl | 2.210 | down | 0.013 | 1.754 | down | 0.031 |
| ENST00000499203 | 5:139483881-139487940 | - | 3 | Antisense | PURA | NRG2 | + | - | Ensembl | 3.155 | down | 0.048 | 1.954 | down | 0.039 |
| AK055958 | 14:91314206-91316246 | - | 1 | Intergenic | RPS6KA5 | TTC7B | - | + | UCSC | 8.505 | down | 0.003 | 1.647 | down | 0.029 |
| ENST00000440574 | 2:209118795-209122854 | + | 2 | Antisense | IDH1 | PIKFYVE | + | + | Ensembl | 2.720 | down | 0.040 | 1.570 | down | 0.048 |
| ENST00000556904 | 15:91565852-91574370 | + | 4 | Intergenic | VPS33B | SV2B | + | - | Ensembl | 2.031 | down | 0.023 | 2.317 | down | 0.018 |
| BC037839 | 15:32828960-32872810 | + | 3 | Intergenic | FAM7A1 | ARGHAP11A | + | + | UCSC | 3.748 | down | 0.024 | 2.738 | down | 0.049 |
| AF147317 | 14:23456112-23456486 | + | 1 | Antisense | JUB | C14ORF93 | + | - | UCSC | 2.863 | down | 0.012 | 1.534 | down | 0.036 |
| AK026750 | 8:142363507-142365465 | - | 1 | Intergenic | GPR20 | SLC45A4 | - | - | UCSC | 3.794 | down | 0.005 | 2.093 | down | 0.007 |
| ENST00000522856 | 8:104258701-104310991 | - | 2 | Intergenic | FZD6 | BAALC | + | + | Ensembl | 1.879 | down | 0.001 | 1.667 | down | 0.013 |
| ENST00000425688 | 2:166649530-166653589 | + | 3 | Intergenic | GALNT3 | TTC21B | - | - | Ensembl | 1.988 | down | 0.010 | 1.688 | down | 0.002 |
| AL832163 | 8:117945114-117953474 | - | 1 | Antisense | C8ORF85 | RAD21 | - | + | UCSC | 2.655 | down | 0.024 | 2.716 | down | 0.038 |
| ENST00000429269 | 1:234765057-234770526 | + | 2 | Intergenic | IRF2BP2 | TOMM20 | - | - | Ensembl | 1.786 | down | 0.048 | 1.971 | down | 0.048 |
| AK027383 | 8:144840686-144842682 | + | 1 | Intergenic | MAPK15 | SCRIB | + | + | UCSC | 4.091 | down | 0.000 | 1.843 | down | 0.039 |
| ENST00000472943 | 3:156465135-156534823 | - | 3 | Intergenic | LEKR1 | TIPARP | + | + | Ensembl | 3.342 | down | 0.010 | 2.234 | down | 0.038 |
| ENST00000434627 | 9:34665662-34681295 | + | 3 | Antisense | RP11-195F19.5 | CCL16 | + | + | Ensembl | 6.598 | down | 0.000 | 2.221 | down | 0.012 |

1, Database Identification for the database indicated

2, Genomic Coordinates in the form of: Chr: Start - Finish

3, Type of transcript identified, where antisense represents any lncRNA that lies on the opposite strand of a known gene and where intergenic represents any lncRNA that does not overlap any other known gene

4, Myc bound annotation derived from an analysis of ENCODE Myc-ChIP seq experiment; + represents Myc bound in any human cell line assayed within 5kb of the lncRNA's annotated transcription start site

5, Abundance measurement using normalized hybridization intensity; + represents a lncRNA whose normalized hybridization intensity is greater than the median of the array experiment

for ENCODE data as well as other large-scale genomic experiments. The information provided by

the publicly available RNA-seq information can support the predicted gene structure of the gene

of interest. The chromatin mark information can provide information about regions of the gene

marked with promoter associated histone marks as well as gene body associated marks, which

provides further evidence for the transcript.

Transcripts that fit the above criteria were then subjected to classification based on their

orientation with respect to neighboring protein coding genes. Specifically, a transcript could be

classified in one of three ways, for simplicity. First, it could be defined as an antisense

transcript, which is a transcript that overlaps the gene body of a known protein coding gene, but

is located on the opposite strand. The second classification is bidirectional transcripts, which are

defined as transcripts oriented in a “head-to-head” fashion with a known protein coding gene and

thus share a common promoter region. The last classification is intergenic transcripts, which are

those transcripts that are not in the vicinity of any known protein coding gene. Importantly, those

unclassified transcripts that did not fit any of the previous criteria were removed. In particular,

these primarily included transcripts that were intragenic or interspersed within a known protein

coding gene body and oriented in the same direction of transcription.
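
The classification rules above can be sketched in Python as follows. This is an illustration under stated assumptions rather than the procedure actually used, which relied on manual inspection in Ensembl and the UCSC Genome Browser. The `Feature` record, the `classify` and `head_to_head` helpers, the example coordinates, and the 1,000 bp window used to call a shared promoter are all hypothetical; the text gives no distance threshold.

```python
from dataclasses import dataclass

# Maximum distance between start sites for calling a shared promoter.
# The thesis states no number; 1000 bp is an illustrative assumption.
SHARED_PROMOTER_WINDOW = 1000


@dataclass(frozen=True)
class Feature:
    name: str
    chrom: str
    start: int   # leftmost coordinate
    end: int     # rightmost coordinate
    strand: str  # "+" or "-"


def overlaps(a, b):
    """True when two features share at least one base on the same chromosome."""
    return a.chrom == b.chrom and a.start <= b.end and b.start <= a.end


def head_to_head(lnc, gene, window=SHARED_PROMOTER_WINDOW):
    """True when the features are divergently transcribed from nearby start sites."""
    if lnc.chrom != gene.chrom or lnc.strand == gene.strand:
        return False
    minus, plus = (lnc, gene) if lnc.strand == "-" else (gene, lnc)
    gap = plus.start - minus.end  # distance between the two start sites
    return 0 <= gap <= window


def classify(lnc, coding_genes):
    """Assign one of the three classes, or 'excluded' for same-strand overlap."""
    hits = [g for g in coding_genes if overlaps(lnc, g)]
    if any(g.strand == lnc.strand for g in hits):
        return "excluded"       # intragenic, same direction of transcription
    if hits:
        return "antisense"      # overlaps a gene body on the opposite strand
    if any(head_to_head(lnc, g) for g in coding_genes):
        return "bidirectional"  # head-to-head with a shared promoter region
    return "intergenic"


# Hypothetical coordinates for illustration only.
genes = [Feature("GENE1", "chr1", 1000, 5000, "+")]
examples = {
    "antisense": Feature("lncA", "chr1", 2000, 3000, "-"),
    "bidirectional": Feature("lncB", "chr1", 100, 800, "-"),
    "intergenic": Feature("lncC", "chr1", 50000, 52000, "+"),
    "excluded": Feature("lncD", "chr1", 1500, 1800, "+"),
}
results = {label: classify(f, genes) for label, f in examples.items()}
print(results)
```

Checking same-strand overlap first mirrors the removal of intragenic transcripts described above, so that a transcript is never reported as antisense or intergenic when it also lies within a coding gene in the same orientation.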

Gene Structure Selection

Additionally, those transcripts that were spliced were annotated by the number of exons they

contained, as spliced transcripts would be preferred in the candidate selection process. From this,

evidence that the cell is regulating the gene of interest as well as evidence for gene orientation

can be inferred.

As described above, intragenic transcripts were removed from analysis as this was considered

evidence that may confound the non-coding nature of a given transcript. This concept has been

applied as a primary means of the discovery of these lncRNA transcripts in the intergenic regions

of the genome.

Bioinformatic Evidence for Non-coding Nature of Transcript

Genes that were not parsed out of consideration in this manner were then subjected to translation

in all six reading frames using the ExPASy Translate Tool in order to predict any significant

open reading frames longer than 100 bases or ~30 amino acids.
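
The open reading frame screen can be approximated with a short Python sketch. It does not call the ExPASy Translate Tool, which is a web service; it only re-implements the length filter under the assumption that an ORF runs from an ATG to the first in-frame stop codon. The helper names and the example sequences are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}
MIN_ORF_BASES = 100  # threshold stated in the text (~30 amino acids)
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def orfs_in_frame(seq, offset):
    """Yield ORFs (ATG through the first in-frame stop) in a single frame."""
    start = None
    for i in range(offset, len(seq) - 2, 3):
        codon = seq[i:i + 3]
        if start is None and codon == "ATG":
            start = i
        elif start is not None and codon in STOP_CODONS:
            yield seq[start:i + 3]
            start = None


def significant_orfs(seq, min_bases=MIN_ORF_BASES):
    """List (strand, frame offset, length) for ORFs longer than min_bases."""
    seq = seq.upper()
    hits = []
    for strand, strand_seq in (("+", seq), ("-", reverse_complement(seq))):
        for offset in range(3):
            for orf in orfs_in_frame(strand_seq, offset):
                if len(orf) > min_bases:
                    hits.append((strand, offset, len(orf)))
    return hits


# Hypothetical sequences for illustration only.
long_orf_seq = "ATG" + "GCT" * 40 + "TAA"  # one 126-base ORF on the + strand
short_orf_seq = "ATGGCTTAA"                # 9-base ORF, below the threshold
long_hits = significant_orfs(long_orf_seq)
short_hits = significant_orfs(short_orf_seq)
print(long_hits, short_hits)
```

A transcript for which `significant_orfs` returns an empty list in all six frames would pass this part of the screen; the choice of ATG-initiated ORFs is a simplification and may differ from how the original tool output was read.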

Pseudogene Filtering

Pseudogenes are a class of lncRNAs that are similar to known protein coding genes. Recent

reports have demonstrated that there may be some functional importance to the presence of

pseudogenes, but that does not change the fact that pseudogenes have high sequence identity

with known protein coding genes (129). For this reason, this analysis has filtered them.

2.4 Candidate Expression and Selection

The 20 candidate lncRNAs remaining after application of the inclusion criteria show expression consistent

with the global trends in each of their parent datasets. Therefore, they are primarily coordinately

down-regulated in response to mitogen stimulation of MCF-10As or in response to stable

over-expression of Myc-T58A in MCF-10As grown on Matrigel™ (Figure 9B).

To prioritize the list of 20 candidate lncRNAs, candidate selection criteria were required. The

candidate selection criteria are highlighted in the methods section. Myc-binding status, the

relative abundance based on array hybridization intensity, and the splicing status of the candidate

Figure 10: Expression validation of 6 candidate lncRNAs

A) The fold expression changes in response to mitogen of the annotated lncRNAs on the array

are plotted against the fold change measured independently by qRT-PCR. The line of best fit is

plotted with the coefficient of determination noted. B) Similarly, fold expression changes in

response to over-expression of Myc-T58A are plotted for the array and independent qRT-PCR.

Annotations for both plots indicate the accession ID. The 6 digit numbers are to be preceded by

ENST00000 to complete their Ensembl ID. The qRT-PCR fold changes are the averages of 4

biological replicates consistent with the microarray. C & D) Represent the individual expression

changes in each of the treatments; 8h mitogen (C) and Myc/Myc-T58A overexpression (D).

Bars represent the average of 3 biological replicates with error bars indicating standard deviation.

The 8h mitogen-starved comparison utilizes a paired t-test, whereas the T58A comparison makes

use of a one-way ANOVA with a Bonferroni post-test. * is p<0.05, ** is p<0.01, *** is p<0.001.

genes were given the most weight because abundant, Myc-regulated lncRNAs are the goal of this

experiment. Foremost, 14 of 20 candidates were bound by Myc under growing conditions in any

one of the seven cell types where Myc ChIP-seq was performed as part of the ENCODE project

(82, 123). Of these 14, six were selected for expression verification.

2.5 Expression Validation

Expression validation within the mitogen responsive experiment yielded a positive trend with

respect to the comparison of array fold change and qRT-PCR fold change (Figure 10A). The

coefficient of determination (R²) was 0.7192. Given the small range of verified expression in the

Myc-T58A treatment condition, no positive trend was observed for the array expression profiling

and the qRT-PCR expression profiling (Figure 10B). In general, though, there was statistically

significant down-regulation of the candidate lncRNAs in response to stable Myc-T58A

over-expression when grown on Matrigel™ as measured by qRT-PCR. The down-regulation observed

by qRT-PCR is consistent with the array direction of regulation and is best observed in the

individual gene expression changes (Figure 10C & D). Therefore, increasing the number of

lncRNAs validated and the range of their fold changes would be expected to yield a correlation similar to that of the

mitogen responsive verifications.

2.6 Validation of Publicly Available Myc-ChIP data in MCF-10A cells

Publicly available Myc-ChIP-seq data from the ENCODE project were generated in diverse

human cell types that did not include MCF-10A cells. The potential for Myc to bind and regulate

a specific target gene may be cell-type specific, therefore validation of Myc binding was

performed in MCF-10A cells. Of the six coordinately downregulated lncRNAs selected for

validation, two were in regions of the genome that frequently undergo loss of heterozygosity in

Figure 11: Myc binds the promoter of lncRNA-LY6E

Chromatin immunoprecipitation using the antibody, N262, targeting Myc was performed using

crosslinked, asynchronously growing MCF-10A cells. The mean log2

ratios of N262 to IgG

were calculated for a non-Myc bound E-box negative control, CCND2 positive control, lncRNA-

FZD6, and lncRNA-LY6E. Paired t-tests were performed across N = 4 replicates; * represents p

< 0.05 and error bars represent standard deviation.

breast cancer cell lines as annotated in the COSMIC database (130). These candidate lncRNAs,

lncRNA-LY6E (ENST00000522060) and lncRNA-FZD6 (ENST00000521383), were both

downregulated in response to mitogen stimulation as well as Myc-T58A over-expression; thus, it

was hypothesized that these lncRNAs were targets of Myc repression. MCF-10A cells were

cross-linked and immunoprecipitated with Myc N262 antibody or paired rabbit IgG under

asynchronous growing conditions, when it was proposed that Myc would be actively repressing

these candidates. With a non-Myc-bound E-box as a negative control and the cyclin D2 promoter

as a positive control, there was a significant differential in Myc binding at the cyclin D2

promoter versus the negative control, suggesting that the experiment is valid. The same samples

were then used to evaluate the Myc-binding status of lncRNA-FZD6 and lncRNA-LY6E. Only

lncRNA-LY6E had significantly enriched Myc-binding at its promoter under asynchronously

growing conditions (Figure 11).
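
The enrichment measure in Figure 11 (log2 ratio of N262 to IgG signal, compared by paired t-test across four replicates) can be sketched as below. This is a hedged illustration: the signal values are invented, the helper names are hypothetical, and the pairing of each region against the negative control region within a replicate is an assumption based on the comparison described above. Only the t statistic is computed; it is compared against the standard two-sided 5% critical value for 3 degrees of freedom instead of deriving a p-value.

```python
from math import log2, sqrt
from statistics import mean, stdev

# Two-sided 5% critical value of Student's t with 3 degrees of freedom (N = 4).
T_CRITICAL_DF3 = 3.182


def log2_enrichment(ip_signal, igg_signal):
    """Per-replicate log2 ratio of specific antibody signal to IgG signal."""
    return [log2(ip / igg) for ip, igg in zip(ip_signal, igg_signal)]


def paired_t_statistic(sample_a, sample_b):
    """Paired t statistic for replicate-matched measurements."""
    diffs = [a - b for a, b in zip(sample_a, sample_b)]
    return mean(diffs) / (stdev(diffs) / sqrt(len(diffs)))


# Hypothetical qPCR signals for four replicates; not data from the thesis.
igg = [1.0, 2.0, 1.0, 2.0]
negative_region_ip = [1.0, 2.0, 1.0, 2.0]
target_region_ip = [4.0, 16.0, 8.0, 8.0]

negative_log2 = log2_enrichment(negative_region_ip, igg)
target_log2 = log2_enrichment(target_region_ip, igg)
t_stat = paired_t_statistic(target_log2, negative_log2)
is_enriched = abs(t_stat) > T_CRITICAL_DF3
print(target_log2, round(t_stat, 3), is_enriched)
```

Working on the log2 scale keeps replicate-to-replicate differences in IgG background from dominating the comparison, which is one reason ratios to IgG are commonly reported this way.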

2.7 Functional Validation and Expression Profiling of Candidates lncRNA-LY6E and lncRNA-FZD6

lncRNA-LY6E is located in a simple bidirectional locus with LY6E with Myc bound at the

bidirectional promoter region between the two genes (Figure 12A). This lncRNA is a 995bp,

poly-adenylated transcript with 2 exons. Its promoter contains both a canonical E-box and a

TATAA box. To develop a clearer view of the function and regulation of the

candidate lncRNAs, several experiments were performed. These experiments included the array

expression validation as well as expression profiling of lncRNA and pre-lncRNA species across

the cell cycle, after Myc induction, and in nuclear/cytoplasmic compartments. Array expression

validation experiments reveal that mitogen stimulation in 2D and Myc-T58A-overexpression in

3D yield significantly decreased mRNA levels consistent with array data (Figure 12B & 12C).

Figure 12: lncRNA-LY6E is dynamically regulated in cell cycle and Myc-dependent

transformation

A) Scale model of the lncRNA-LY6E locus with annotation of mRNA expression and Myc ChIP

primers. B) Verification of mitogen dependent repression of lncRNA-LY6E by qRT-PCR. ***

represents p < 0.001 by paired t-test. C) Verification of the decrease in expression of lncRNA-

LY6E in response to Myc-T58A overexpression in MCF-10A cells grown on Matrigel. **

represents p < 0.01 by one-way ANOVA and Bonferroni post-test. D) Expression profiling of

mature lncRNA-LY6E and pre-lncRNA-LY6E throughout MCF-10A cells synchronized by

mitogen starvation followed by induction. Error bars in B, C, D represent standard deviation.

Expression profiling in the cell cycle of mRNA revealed elevated transcript levels that were cell

cycle arrest-associated (Figure 12D, left). To evaluate whether these expression changes could be

attributed to transcriptional regulation of lncRNA-LY6E, primers targeting the heterogeneous

nuclear RNA or pre-lncRNA were designed within the intron of this lncRNA. Pre-lncRNA

expression profiling in the cell cycle mimicked the trends seen with the mature lncRNA-LY6E

levels being elevated in cell cycle arrest (Figure 12D, right). This supports the view that the

changes in expression are occurring due to changes in transcription and thus are consistent with

the hypothesis that Myc is acting to repress the transcription of lncRNA-LY6E.

Similar analysis was performed for lncRNA-FZD6, which is a larger lncRNA that is located in a

similarly bidirectional locus with the Wnt pathway receptor, FZD6 (Supplemental Figure 3A).

This lncRNA has multiple predicted splice variants of various sizes that are poly-adenylated

transcripts. Its promoter contains both a canonical E-box and a TATAA box. Though Myc is not

bound at this bidirectional promoter under growing conditions in MCF-10A cells, the array

expression changes still showed significantly decreased transcript levels under the two

conditions (Supp. Figure 3B & 3C). The cell cycle profiling of lncRNA-FZD6 mRNA showed a

similar trend, with a cell cycle arrest-associated increase in transcript levels (Supp. Figure 3D, Left).

Conversely, measurement of pre-lncRNA levels revealed no appreciable change across the time

points measured (Supp. Figure 3D, Right).

Nuclear/Cytoplasmic partitioning was performed for both candidate lncRNAs in MCF-10A cells

under asynchronous growing conditions to evaluate whether they are nuclear retained under

conditions where their expression is low. Upon evaluation of candidate lncRNAs, their

expression was not particularly enriched in any given cellular compartment under these

asynchronously growing conditions where their expression is at or below the level of detection

(Supp. Figure 4; note scale differences). Xist lncRNA was used as a positive control for the

nuclear compartment and RPLP0 was used as a characteristic protein coding gene. Xist was

enriched in the nuclear extracted RNA and RPLP0 was enriched in the cytoplasmic RNA, as

expected.

Lastly, Myc’s ability to repress these candidate lncRNAs under starvation conditions was

evaluated after 8 hours of Myc induction in the absence of mitogens. Cyclin D2, a Myc induction

control, was significantly induced under these conditions whereas p21, a Myc repression control,

was not significantly repressed (Supp. Figure 5A & 5B). Similar to p21, there was no significant

repression of mature lncRNA-LY6E and pre-lncRNA levels (Supp. Figure 5C & 5D). lncRNA-

FZD6 showed significant repression for its mature transcript, but not for its hnRNA (Supp.

Figure 5E & 5F). Altogether, this suggests that longer Myc-induction for this experiment may be

necessary to see the effects of Myc repression and additional positive controls would add weight

to this experiment and help with data interpretation.

Chapter 3 Discussion

3.1 Candidate lncRNA profiling

Selection and validation of Myc-bound candidate lncRNAs has revealed a promising lncRNA

that has been called lncRNA-LY6E for its nearest neighbour gene. LY6E stands for lymphocyte

antigen 6 complex, locus E with other aliases including retinoic acid inducible gene E protein

(RIG-E) and stem cell antigen 2 (SCA2). LY6E’s function is not well known, but it seems to be

broadly expressed and interferon responsive (131). Importantly, its loss of expression has been

implicated in hepatocellular carcinogenesis (132). Similarly, as previously stated, loss of

heterozygosity is observed in this locus in breast cancer cell lines. Therefore, lncRNA-LY6E

may act to coordinate the functions of LY6E and may also function as a tumour suppressor. An

interaction or interplay between lncRNA-LY6E and LY6E remains to be evaluated. Another

model that could be tested is the concept that lncRNA-LY6E is the important gene in the locus

for epigenetic regulation in cis or trans.

lncRNA-LY6E’s expression has been profiled in several different states. These data suggest that

it is a cell cycle arrest-associated transcript and that, upon induction with mitogen, may be

transcriptionally repressed by Myc. Two possible models could explain this. The first is that Myc

could be bound basally, as has been demonstrated, and actively repressing lncRNA-LY6E under

conditions permissive to cell growth. The second model is that Myc induction around 1-2 hours

post mitogen stimulation is what directly represses lncRNA-LY6E in the cell cycle.

Under growing conditions, lncRNA-LY6E is not nuclear retained while its localization under

starved conditions remains to be seen. Similarly, lncRNA-LY6E shows no significant change in

response to Myc induction alone for 8 hours, but a trend toward decreased expression is observed

at the mature lncRNA and pre-lncRNA levels suggesting that a longer Myc induction may be

warranted to observe repression by Myc in isolation in this model. Altogether, lncRNA-LY6E

may be a novel Myc-repressed lncRNA that could function, in part, to regulate growth arrest and

cell cycle progression.

The model of Myc’s indirect role in the epigenome through its target genes suggests that Myc’s

induced genes have chromatin modifying functions, but this model does not propose a role for

Myc-repressed genes. Recent work from Geiseler et al. (133) may suggest a potential model to

test in terms of how the transcriptional or post-transcriptional down-regulation of lncRNAs can

be important in epigenetic changes. Specifically they showed that the decapping and degradation

of lncRNA in the galactosidase locus of yeast in response to food source changes could lead to

the induction of neighboring galactosidase genes for metabolism of the newly acquired food

source (133). Therefore, we propose that Myc-mediated transcriptional repression of lncRNAs

can function to regulate inducible gene expression in a similar manner (Figure 13).

3.2 Large scale lncRNA profiling

Using the developed MCF-10A models, microarray lncRNA expression profiling was

undertaken. First, lncRNA expression changes were profiled in the cell cycle model comparing

the 8h time point to the starved control. Numerous gene expression changes occurred suggesting

that lncRNAs may play an as yet unknown, but critical role in the cell cycle. Also observed was

widespread downregulation of lncRNAs; this observation seems to be consistent with Myc’s

ability to downregulate a large number of miRNAs (6). This can occur in the cell cycle through a

dynamic interplay between Myc and miRNAs, for example through the repression of let-7 and

miR-34a (134).

Figure 13: Schematic working model of how Myc-repression contributes to epigenetic

regulation

A stimulus like growth induction or growth arrest can lead to the specific Myc functions at a

given cell-cycle associated lncRNA promoter. Growth induction favors the repression of the cell

cycle arrest-associated transcript which in turn leads to the activation of inducible cell cycle

genes.

In the 3D culture model of Myc-dependent transformation, fewer gene expression changes were

observed overall. In this assay, wildtype Myc may therefore require further signaling deficiencies

to potentiate transformation. Consistent with this, in MCF-10A Myc-T58A over expressing cells,

larger scale repression was also observed in the lncRNAs. This is again consistent with the

observed importance of Myc-mediated repression of miRNAs in tumourigenesis (72). These data

seem to suggest some consistency between Myc’s regulation of non-coding genes, but this claim

must be further addressed as this work is limited by array technology in that probes are designed

in a biased way targeting known or predicted lncRNAs.

3.3 MCF-10A cell system

Two models for analyzing Myc function were developed using the non-transformed MCF-10A cell

system. Both take advantage of their exquisite sensitivity to external stimuli. The first makes use

of mitogen withdrawal, specifically EGF, to arrest cells followed by mitogen add-back to release

the cells from arrest synchronously into the cell cycle. Since Myc is an immediate early response

gene, as confirmed in these cells, it was hypothesized that Myc begins its control of

transcriptional programs early, prior to G1-S transition. As such, 8h post mitogen add-back was

selected as a time point to assay for gene expression in the hope of enriching for Myc target

genes. Similarly, this time point was selected because it allowed for some inference of function

of differentially expressed genes as growth related. For these reasons, the MCF-10A cell cycle

model demonstrated here is an important model of human cell cycle regulation. It is important to

note that although there is arrest and synchronous cell cycle progression, each process isn’t

complete and represents only population level changes.

MCF-10A cells, epithelial in origin, also respond to extracellular matrix mimicking substrates in

a well defined manner. Normal, non-transformed MCF-10A cells grown on Matrigel form highly

ordered, polarized acinar structures. This was demonstrated here and shown previously (110,

117). As similarly shown, over expression of oncogenes that transform the MCF-10A cells yields a

loss of polarity and results in a phenotype change apparent in 3D culture. Typically, this

phenotypic change is the formation of multiacinar structures, which is consistent with our model

of Myc-dependent transformation in Myc-T58A over expressing cells. This 3D culture model

mimics breast ductal histology and pathology remarkably well.

A caveat to this breast acinar model is that the system to quantify the level of transformation of

MCF-10A cells is rudimentary and not complete because it does not take into account other

parameters such as the heterogeneity of acinar size. Therefore, it should only be used to indicate

population changes. Any gene expression changes between Myc-T58A and GFP over expressing

MCF-10A cells are considered in light of this.

3.4 Future Directions

A priority for future experiments would be to evaluate the function of lncRNA-LY6E through

knockdown or over expression experiments. First, the interplay between lncRNA-LY6E and its

neighbour LY6E would be addressed. This could then be extended to a larger scale by testing the

global gene expression changes in cells over expressing lncRNA-LY6E. If a global pattern of

gene expression changes was observed, a mechanism of gene regulation could be investigated by

evaluating chromatin marks on nearby genes with expression changes. Similarly, over-

expression experiments could be used to begin to evaluate the model that lncRNA-LY6E is a

tumour suppressor by over-expressing it in Myc-T58A transformed MCF-10A acini with the

hypothesis that lncRNA-LY6E would reduce Myc-T58A induced transformation. Other Myc-

dependent, breast cancer cell lines like MDA-MB-231 cells could be similarly used.

Upstream of lncRNA-LY6E, the direct regulation by Myc would need to be shown. This could

involve any of the following three experiments. The first would be the extension of the Myc-

inducible experiment shown in Supplemental Figure 5 to investigate if Myc induction, in the

absence of other signaling, can regulate lncRNA-LY6E. This could be achieved by extending the

Myc-induction time. The second experiment could be the use of a promoter-reporter construct

that contains the lncRNA-LY6E promoter in the presence or absence of Myc. Lastly, addressing

direct Myc repression in the cell cycle would require the knockdown of Myc expression during

synchronous cell cycle entry assaying for the changes in lncRNA-LY6E expression.

In general, these experiments provide the necessary rationale to pursue the hypothesis that Myc

is a global repressor of non-coding regulatory RNA, including lncRNAs. Specifically,

transcriptome-wide RNA-seq experiments would be a logical step to begin to address this

hypothesis as they are not limited by prediction based approaches and allow for de novo

transcript assembly (135).

3.5 Conclusions and Implications

In all, this thesis is the first attempt, to our knowledge, to profile Myc regulated lncRNAs on a

large scale. Given the challenges of Myc knockdown experiments and their effects on cell

viability, an approach was taken that utilized a model of cell cycle progression and a model of

Myc-dependent transformation. Due to the largely unknown functions of lncRNA genes, the cell

cycle model was aimed at inferring cell cycle related functions of a large group of lncRNAs. The

profiling of gene expression changes in the Myc dependent model of transformation was aimed

at adding Myc- and cancer-dependence to a set of lncRNAs. Together, these gene sets combined

with publicly available Myc-ChIP-seq data provided the first view of a novel set of Myc

regulated lncRNA candidates. One of the candidates, lncRNA-LY6E, provides promising

validation of this approach but its function and relevance to cancer biology remain to be

elucidated.

Indeed, as the details of these candidate lncRNAs begin to be elucidated they could be utilized in

the clinic as diagnostic tools. Much like the HOTAIR example, expression profiling of tumours

could aid in staging and treatment. Similarly, lncRNAs could be biomarkers in bodily fluids

(136). A strikingly relevant example is of the lncRNA, Prostate Cancer Antigen 3 (PCA3), which

can be found in the urine of men with prostate cancer (137). Therefore, the lncRNAs identified

could provide useful new diagnostic tools for management and treatment of cancer. Therapeutic

relevance is not limited to biomarkers. A prevailing therapeutic question with transcription

factors and Myc specifically is whether the transcription targets are “targetable” for disease

therapy (70). The lncRNAs identified in this study could provide a novel class of therapeutic

targets among Myc’s many target genes. Recent work has shown that cancers with specific

lncRNAs can be targeted by antisense-oligonucleotide photomolecular beacons for

photodynamic therapy (138). Overall, Myc regulated lncRNAs could not only provide a better

understanding of Myc’s roles in the epigenome, but could provide novel therapeutic targets and

tools for Myc-dependent cancers.

Chapter 4 Methods

4.1 Cell Culture

Reagents:

Media, supplemented with penicillin and streptomycin, and phosphate-buffered saline (PBS)

were supplied by the UHN Tissue Culture Media Facility. Trypsin-EDTA was purchased from

Gibco at 10x concentration and diluted to 1x in PBS.

MCF-10A cells:

MCF-10A cells were a kind gift from Senthil Muthuswamy, Cold Spring Harbor Laboratory,

Cold Spring Harbor, NY, USA (107). These cells were grown in 1:1 DMEM H21/HAM F12

growth media supplemented with 5% [v/v] horse serum (Gibco# 16050-122, Lot# 8178102),

20ng/mL epidermal growth factor (EGF) (Cedarlane# 236EG), 0.5µg/mL hydrocortisone,

0.1µg/mL cholera toxin, and 10µg/mL insulin (Sigma#19278).

4.2 Immunoblotting

Whole Cell Extracts

MCF-10A cells under various growth conditions were lysed by aspirating the media, washing with

PBS, and directly adding boiling 1x SDS loading buffer (1% SDS [w/v], 11% [v/v]

glycerol, 0.1M Tris-HCl pH 6.8). Lysed cells were collected with cell scrapers, and

10% β-mercaptoethanol plus loading dye were added. These extracts were then boiled for 5 minutes and

stored at -20°C.

SDS-PAGE

Sodium dodecyl sulfate-polyacrylamide gel electrophoresis (SDS-PAGE) was carried out using

8% polyacrylamide. Electrophoresis was performed at 120V for approximately one and a half

hours. Subsequently, proteins were transferred from the gels to nitrocellulose membranes (Whatman). Membranes

were blocked for 1 hour in 5% [w/v] milk dissolved in 1xPBS with 0.01% TWEEN-20 (Sigma,

cat. no.) at room temperature. Immunoblotting with primary antibody was achieved by overnight

incubation at 4°C in the aforementioned 5% milk solution. Washing was performed using 1xPBS

with 0.01% TWEEN-20 (PBS-T) for 5 minutes, three times. Secondary antibodies were applied

for one hour at room temperature in the 5% milk solution. As before, membranes were

washed for 5 minutes, three times in PBS-T, followed by one 5-minute wash in 1xPBS.

Antibodies

The primary antibodies used for overnight incubation were diluted as follows: 1/1,000 anti-Myc

(9E10 – monoclonal, mouse hybridoma), 1/3,000 anti-actin (Sigma, cat. no. A2066), 1/1,000

anti-Cyclin E (Santa Cruz, Sc-247), 1/1,000 anti-c-Jun (Cell Signaling, 60A8), 1/1,000 anti-c-

Fos (Santa Cruz, Sc-52). Secondary antibodies for fluorescent imaging using an Odyssey IR

Imaging system (LICOR) were used as 1/20,000 dilutions of IRDye 680 conjugated goat

(polyclonal) anti-rabbit (LICOR, cat. no. 926-32221) and IRDye 800 CW conjugated goat

(polyclonal) anti-mouse (LICOR, cat. no. 926-32210).

4.3 Quantitative Real-time PCR

RNA Isolation

MCF-10A cellular lysis for RNA isolation was achieved through the use of the TRIZOL®

reagent from Invitrogen (Cat. No. 12183-555). TRIZOL isolation is a guanidinium

thiocyanate-phenol-chloroform based method that exploits the solubility of RNA in the aqueous phase at a low

pH of 4 to perform a phase extraction (139). This phase extraction was combined with

silica-based column purification (Purelink™ RNA Mini Kit) for greater recovery of target RNA. While

RNA was bound to the column, an on-column deoxyribonuclease treatment was performed

using the Purelink™ DNase Set in order to remove contaminating genomic DNA. The

concentration of the isolated RNA was then measured using the NanoDrop-1000

spectrophotometer (ThermoScientific).

cDNA Synthesis

One microgram of RNA was subjected to first-strand cDNA synthesis using the SuperScript® III

First-Strand synthesis system from Invitrogen (Cat. No. 18080-051). The manufacturer’s

protocol was followed, using random hexamers for non-specific, unbiased priming of

DNA synthesis. The final reaction product of cDNA synthesis was diluted 5-fold for

quantification purposes.

Primer Design

Primer design for cDNA quantification was achieved through the use of OLIGO 7 Primer

Analysis Software from Molecular Biology Insights Inc (140). Specifically, gene specific

primers were designed across exon-exon boundaries with preference given to introns greater than

500bp in length. Exons for inclusion in primer design were screened for interspersed repeats or

low complexity DNA using the program RepeatMasker (www.repeatmasker.org) as these

regions were to be excluded from amplification. Primers were designed to have a 40-60%

GC content, an optimal annealing temperature of 50-60°C, and a melting temperature difference of

<1°C between the forward and reverse primers. More specifically, the 5’ end of each primer was designed to have greater internal stability

(lower ΔG) than the 3’ end in order to optimize priming specificity. Lastly, and more subtly, all

primer-dimer and -hairpin formations were screened bioinformatically for the most energetically

stable subsets. Those primers that did not contain significant secondary structure or dimer

stability were selected. BLAST was used to compare primer sequences to the genome to ensure

there was no non-specific primer binding. Successfully designed primers are annotated in Table 2.
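The numeric design constraints above (40-60% GC content and a melting temperature difference below 1°C) can be illustrated with a minimal sketch. This is not the OLIGO 7 procedure: the function names are hypothetical, and the Wallace rule (2°C per A/T, 4°C per G/C) is used only as a crude stand-in for the thermodynamic calculations such software performs. Internal stability, dimer, hairpin and BLAST checks are not modelled.

```python
def gc_fraction(seq: str) -> float:
    """Fraction of G and C bases in a primer sequence."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def wallace_tm(seq: str) -> float:
    """Rough melting temperature (Wallace rule); an assumption, not OLIGO 7's model."""
    seq = seq.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2.0 * at + 4.0 * gc


def passes_screen(fwd: str, rev: str, gc_range=(0.40, 0.60), max_tm_diff=1.0) -> bool:
    """True if both primers meet the GC window and their Tm values differ by < max_tm_diff."""
    gc_ok = all(gc_range[0] <= gc_fraction(p) <= gc_range[1] for p in (fwd, rev))
    tm_ok = abs(wallace_tm(fwd) - wallace_tm(rev)) < max_tm_diff
    return gc_ok and tm_ok


if __name__ == "__main__":
    # RPLP0 primer pair as listed in Table 2
    fwd, rev = "CAGATTGGCTACCCAACTGTT", "GGGAAGGTGTAATCCGTCTCC"
    print(round(gc_fraction(fwd), 3), round(gc_fraction(rev), 3))
    print(passes_screen(fwd, rev))
```

Because the Wallace rule is far coarser than a nearest-neighbour model, a pair that satisfied the original design criteria may not pass this simplified screen.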

Relative and Absolute Quantification

Quantitative real-time polymerase chain reaction (qRT-PCR) was performed using the ABI

PRISM® 7900HT Sequence Detection System. Reactions were carried out in the 384-well

format using MicroAmp® optical 384-Well reaction plates (Applied Biosystems, part no.

4343814) and MicroAmp® Optical Adhesive Film (Applied Biosystems, part no. 4311971).

Specifically, reactions used a 10µl total volume that included 2µl of cDNA and 8µl of each gene

specific master mix. The gene specific master mixes contained forward and reverse primers as

well as Power SYBR® Green PCR Master Mix (Applied Biosystems, part no. 4367659). The

cycling program used an initial denaturation step of 95°C for 10 minutes followed by 40 cycles

of 95°C for 15 seconds and 60°C for 1 minute. The program was completed with a dissociation

step for product analysis. 60S acidic ribosomal protein P0 (RPLP0) was used as an endogenous

control for normalization in all experiments. Therefore, relative quantification was achieved

using the ΔΔCt method, in which cDNA from genes of interest was measured relative to the

endogenous control and across the experimental states being compared. Fold change within this

method was calculated using $FC = 2^{-\Delta\Delta C_t}$, where $\Delta C_t = C_{t,\text{target}} - C_{t,\text{RPLP0}}$ within each sample and

$\Delta\Delta C_t$ is the difference in $\Delta C_t$ between the experimental and control states.

Table 2: Primer List

| Application | Target Gene | Forward Primer (F) | Reverse Primer (R) |
| --- | --- | --- | --- |
| qRT-PCR | RPLP0 | CAGATTGGCTACCCAACTGTT | GGGAAGGTGTAATCCGTCTCC |
| qRT-PCR | CCND2 | CTGTGTGCCACCGACTTTAAG | GATGGCTGCTCCCACACTTC |
| qRT-PCR | Myc | AGGGTCAAGTTGGACAGTGTC | TGGTGCATTTTCGGTTGTGG |
| qRT-PCR | CCNB1 | GAGGGAGCAGTGCGGGGTTT | AAGCAGAACACCGGAGGCCC |
| qRT-PCR | p21 | GGCGGTTGAATGAGAGGTTC | CCTCCGGGAGAGAGGAAAAG |
| qRT-PCR | lncRNA-LY6E | ACCGTCACTGACACCTGGA | TCAGCCCTGAGGCTTTGAT |
| qRT-PCR | lncRNA-FZD6 | CGGGGAGCCTGGTCACCAA | TGGGCTGCTCAGGGTTCCATC |
| qRT-PCR | hnRNA lncRNA-LY6E | CACTGTGTCAGGGGTGTGTA | CAGGAGGGCTCTGGAATGG |
| qRT-PCR | hnRNA lncRNA-FZD6 | GGCTGAGCAGAGGCATAGA | ACTGCTTCCTCCCAAGTTC |
| qRT-PCR | Luciferase | ACTCCTCTGGATCTACTGGTC | GTAATGAAGGCTCCTCA |
| qRT-PCR | ENST00000443631 | CTGTGGTGACAGCTTTACC | TCGTCGAACGACCTTGTTT |
| qRT-PCR | ENST00000499203 | GACAAGCTGAACCAAATGTA | ATGCAACAAAGTTCAATAGT |
| qRT-PCR | ENST00000440574 | AGAGCACCACAGAGTGTTT | GGTGGCTCACGCTTGTAAT |
| qRT-PCR | AK027383 | AGCCCTGCACCAGCAAATC | TGTGCAAGGGAAGCTCCTCAT |
| qRT-PCR | ENST00000434627 | CCTGCAGACGGCCTATTGTG | GTGGGTCACACAGCCATAC |
| ChIP-qPCR | CCND2 | CCTTGACTCAAGGATGCGTTAGA | GAGCCGACTGCGGTGAAGT |
| ChIP-qPCR | HNT1 Exonic E-box Neg. Control | CCAAACGCAGTACAGCATGG | GTTGTCTGTCTGCACCGAGC |
| ChIP-qPCR | lncRNA-LY6E | AGTGCTGCTACGTAAGAAGGA | CGAGGAGATGTCACAGAGATT |
| ChIP-qPCR | lncRNA-FZD6 | GCCCGCACCTGAGTTTCCTC | CCCGGCATCGCCTTCAGAG |

For absolute quantification, expressed sequence tag

(EST) clones for genes of interest were ordered. High purity plasmids of known copy number

were then used to create a standard curve. Experimental samples were then compared to the

standard curve of known copy number to establish the amount of transcript in a given amount of

cDNA. Additionally, for these experiments in vitro transcribed luciferase RNA (0.025ng,

$2.73 \times 10^{7}$ copies) was added to lysed TRIZOL samples prior to purification in order to control for

RNA isolation and first-strand synthesis efficiency.

4.4 Flow Cytometry

Flow cytometry was completed at the Flow Cytometry Facility of Ontario Cancer Institute.

Specifically, MCF-10A cells were placed in non-sterile 1x trypsin to create a suspension of both

adherent and floating cells and cellular fragments. Suspensions were spun at 2000rpm for 2

minutes. Pellets were isolated by decanting the supernatant and subsequently washed and

resuspended with non-sterile PBS. Suspensions were once again pelleted as before and isolated

by decanting the supernatant. Subsequently, cells were fixed by resuspension in cold 70%

ethanol and placed at -20°C for a minimum of 4 hours and a maximum of 14 days.

Fixed cells were then spun as before, the supernatant ethanol was discarded, and the pellets were

rehydrated in PBS. Cells were similarly pelleted and resuspended in a solution of DNase-free

RNase (Roche #11119915001) and placed at 37°C for 1 hour. Lastly, the cells were stained

with a solution of the nucleic acid stain, propidium iodide (PI) (Sigma #P4170), and allowed to

incubate for 15 minutes.

The fixed, PI-stained cells were assessed using the Becton Dickinson Biosciences FACSCalibur

flow cytometer. Ten thousand events were acquired as representative of the total cell population,

specifically gating for those single cellular events that were PI-positive. The distribution of PI

positive cells was then plotted in two dimensions with number of events on the y-axis and

relative PI-staining on the x-axis. This allowed for visualization of cell cycle profiles of cells

changing their DNA content from 2N to 4N.
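The histogram described above, with events counted against relative PI staining, can be sketched as follows. This is only an illustration and not the acquisition or analysis software used on the FACSCalibur. The bin width, the position of the 2N peak and the ±15% windows around the 2N and 4N peaks are all hypothetical parameters.

```python
from collections import Counter


def pi_histogram(intensities, bin_width=10.0):
    """Count events per PI-intensity bin; keys are the lower bin edges."""
    counts = Counter(int(x // bin_width) * bin_width for x in intensities)
    return dict(sorted(counts.items()))


def phase_fractions(intensities, g1_peak, tolerance=0.15):
    """Classify events by DNA content relative to an assumed 2N (G0/G1) peak position.

    Events outside all three windows (e.g. sub-2N debris) are left unclassified.
    """
    lo_2n, hi_2n = g1_peak * (1 - tolerance), g1_peak * (1 + tolerance)
    lo_4n, hi_4n = 2 * g1_peak * (1 - tolerance), 2 * g1_peak * (1 + tolerance)
    n = len(intensities)
    g1 = sum(lo_2n <= x <= hi_2n for x in intensities)
    s = sum(hi_2n < x < lo_4n for x in intensities)
    g2m = sum(lo_4n <= x <= hi_4n for x in intensities)
    return {"G0/G1 (2N)": g1 / n, "S": s / n, "G2/M (4N)": g2m / n}


if __name__ == "__main__":
    events = [100, 100, 100, 150, 200, 200]  # made-up PI intensities
    print(pi_histogram(events, bin_width=50.0))
    print(phase_fractions(events, g1_peak=100))
```

Dedicated cell cycle software fits overlapping distributions rather than applying hard windows, so the fractions from this sketch are only indicative.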

4.5 MCF-10A: Model of Cell Cycle Entry

MCF-10A cells grown in two dimensions are remarkably responsive to extracellular stimuli. In

order to study MCF-10A cells undergoing cell cycle entry, cells were first brought to a quiescent state.

This was achieved by depriving MCF-10A cells of mitogens through exposure to starvation

media that contained 0.05% horse serum [v/v] and 10µg/mL insulin. Most importantly, this

reduces MCF-10A exposure to EGF, a mitogen on which they are highly dependent. Cells in

starvation media arrest within 24 hours in G1/G0 of the cell cycle, as measured by the previously

described fixed propidium iodide flow cytometry. After arrest, cells are placed back into full

MCF-10A growth media. In response, these cells re-enter the cell cycle in near complete

synchrony through the first round of replication which begins occurring around 14 hours after

addition of full mitogen-containing media.

Seeding Density:

Due to the constantly changing cell number, seeding density was an important variable for

optimization in order to prevent any contact related inhibition of cell growth that could skew any

cell cycle related analysis. This optimization was achieved through seeding between 50,000 and

500,000 cells/10cm plate and harvesting for fixed PI flow cytometry at 24 and 48 hours. The

goal was to use the fewest cells so as to maintain a sub-confluent population of cells at the end of

the experiment. In the end, an initial seeding density of 300,000 cells/10cm plate was used

(Supplemental Figure 1).

4.6 MCF-10A: Myc-dependent Model of Transformation

MCF-10A cells can also be effectively grown on substrates that mimic the basement membrane.

For this model, cells were grown on Matrigel™ (BD Biosciences, cat no. 356230), which is a

solubilized basement membrane preparation derived from Engelbreth-Holm-Swarm mouse sarcoma.

Specifically, growth factor-reduced Matrigel was used. This protocol is adapted from

Debnath J. et al (110). Briefly, 25,000 cells/mL of MCF-10A cells in 2.5% Matrigel [v/v] were

seeded on a layer of solidified 100% Matrigel. Cells proliferate in the Matrigel

through about 10 days in culture while forming highly ordered acinar structures and beginning to

undergo luminal cell death around day 8 (Figure 4). In order to study transformation of the MCF-

10A cells in this context, stable cell lines of pMN-GFP, -Myc, and T58A-Myc were generated by

Amanda Wasylishen.

Imaging of MCF-10A cells grown in 3D on matrigel was performed every 4 days up to 16 days

using the Zeiss AxioObserver equipped with the Roper Scientific Coolsnap HQ camera for

image acquisition at the Advanced Optical Microscopy Facility (AOMF), Ontario Cancer

Institute, Princess Margaret Hospital, Toronto, ON, Canada. Images were acquired to monitor

acinar morphogenesis as well as for quantification of transformation at day 8. Transformation

was defined as disordered, multi-acinar structures and was quantified as percentage of the total

population of all sizes of acini (>100) by blinded counts of cells in 3D culture for 8 days.

4.7 MCF-10A: Model of Myc-dependent Gene Regulation in the absence of other stimuli

Stable inducible MCF-10A cell lines created by Amanda Wasylishen utilize a two plasmid, 4-

hydroxytamoxifen (4-OHT) inducible system. Briefly, the synthetic fusion GEV16 transcription

factor under the control of the ubiquitin promoter in the first plasmid is constitutively expressed

and localized to the cytoplasm. Upon addition of 4-OHT, GEV16 is translocated to the nucleus

where it binds to the Gal4 upstream activating sequences in the second plasmid, which in turn

activates the gene of interest (Supp. Figure 2) (141).

Using this system for inducible gene expression of Myc, MCF-10A cells were held in

starvation media in the absence of mitogens for 24 hours, after which Myc expression was induced.

Subsequently, RNA was harvested 8 hours after induction to analyze gene expression changes as

a result of Myc induction alone.

4.8 Gene Expression Array

Sample Preparation

Utilizing the model of cell cycle entry in the MCF-10A cells grown in 2D, RNA was harvested

from cells deprived of mitogens for 24 hours as well as cells deprived of mitogens for 24 hours

and subsequently reintroduced to them for 8 hours. As a control, cDNA was made from isolated

RNA and the relative changes of genes associated with early cell cycle events, p21 and CCND2,

were measured (Data not shown).
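Relative changes of this kind are presumably reported with the ΔΔCt calculation described in Section 4.3, with RPLP0 as the endogenous control. The following minimal sketch shows the arithmetic. The function names and all Ct values are hypothetical and do not come from the thesis data.

```python
def delta_ct(ct_target: float, ct_reference: float) -> float:
    """Ct of the gene of interest minus Ct of the endogenous control (e.g. RPLP0)."""
    return ct_target - ct_reference


def fold_change(ct_target_test, ct_ref_test, ct_target_ctrl, ct_ref_ctrl) -> float:
    """Fold change by the ddCt method: FC = 2 ** (-ddCt)."""
    ddct = delta_ct(ct_target_test, ct_ref_test) - delta_ct(ct_target_ctrl, ct_ref_ctrl)
    return 2.0 ** (-ddct)


if __name__ == "__main__":
    # Hypothetical Ct values: a target gene in mitogen-stimulated (test)
    # versus starved (control) cells, each with its RPLP0 measurement.
    fc = fold_change(ct_target_test=24.0, ct_ref_test=18.0,
                     ct_target_ctrl=26.0, ct_ref_ctrl=18.0)
    print(fc)  # ddCt = (24 - 18) - (26 - 18) = -2, so FC = 2 ** 2 = 4.0
```

The calculation assumes roughly 100% amplification efficiency for both primer pairs, which is the usual premise of the base-2 formula.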

Similarly, RNA was harvested from MCF-10A-GFP, -MYC, and -MYC-T58A cells grown in 3D

on Matrigel for 4 days. It should be noted that TRIZOL reagent for RNA isolation placed

directly on the cell and Matrigel mixture can effectively lyse cells.

Four replicates of each of the five aforementioned samples were obtained for a total of twenty

RNA samples for array analysis.

Arraystar Microarray Analysis

Microarray analysis employed Arraystar Inc., a fully integrated microarray service, and their

Human lncRNA Microarray V2.0. This custom lncRNA array utilized an Agilent array platform.

The sample preparation and microarray hybridization were performed based on the

manufacturer’s standard protocols with minor modifications. Briefly, mRNA was purified from 1

μg total RNA after removal of rRNA (mRNA-ONLY™ Eukaryotic mRNA Isolation Kit,

Epicentre). Then, each sample was amplified and transcribed into fluorescent cRNA along the

entire length of the transcripts without 3’ bias utilizing a random priming method. The labeled

cRNAs were hybridized onto the Human LncRNA Array v2.0 (8 x 60K, Arraystar). After having

washed the slides, the arrays were scanned by the Agilent Scanner G2505B.

Specifically, this platform is intended for parallel analysis of lncRNA and protein-coding gene

expression. There are probes targeting 33,045 lncRNAs and 30,215 coding transcripts. The

predicted and known lncRNA genes were collected from databases such as RefSeq, UCSC, and

Ensembl as well as literature sources.

4.9 Bioinformatic Analysis

Preprocessing: Array Background Adjustment, raw mRNA data normalization, and low intensity filtering

In order to ensure that observed differential expression between treatments is not a result of

systematic variation and artifacts, several preprocessing steps were utilized. The first involved

microarray spot selection, outlier pixel removal and identification of background pixels utilizing

Agilent Feature Extraction software (version 10.7.3.1) for array image processing and a set of

control spots on each microarray (including no-probe and endogenous controls). Briefly, the raw

signal intensities after spot selection are background subtracted and corrected for signal biases to

yield processed signal intensities corrected for optical noise and cross hybridization.

The processed signal intensities from each microarray must then be subjected to normalization

between arrays. Using the GeneSpring GX v11.5.1 software package (Agilent Technologies),

quantile normalization was performed; this assumes that the distribution of probe intensities is

the same across all arrays/samples and transforms the data so that the distributions are identical. Low intensity filtering was then

performed as follows. lncRNAs were called as ‘Present,’ ‘Marginal,’ or ‘Absent’ based on their

normalized intensity. lncRNAs in the 2D samples that have at least 4 out of 8 flags as Present or

Marginal were chosen for further data analysis. Similarly, lncRNAs and mRNAs in the 3D

samples that have at least 8 out of 12 flags as Present or Marginal were also chosen. Probe

intensities could then be compared directly between conditions, in a paired analysis within each

experiment, to yield fold changes and p-values for differentially expressed genes. Statistically

significant differentially expressed genes were then identified through Volcano Plot filtering.
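The three preprocessing ideas in this subsection (quantile normalization, flag-based low intensity filtering and volcano plot filtering) can be sketched in a few lines. This is a simplified illustration and not the GeneSpring GX implementation. Ties are not averaged in the normalization, and the volcano thresholds shown are hypothetical defaults, since the cut-offs actually used are not stated in this section.

```python
def quantile_normalize(arrays):
    """Force every array onto the same intensity distribution.

    arrays: list of equal-length lists, one list of probe intensities per array.
    The shared distribution is the mean intensity at each rank across arrays.
    """
    n_probes = len(arrays[0])
    sorted_arrays = [sorted(a) for a in arrays]
    rank_means = [
        sum(s[rank] for s in sorted_arrays) / len(arrays)
        for rank in range(n_probes)
    ]
    normalized = []
    for a in arrays:
        order = sorted(range(n_probes), key=lambda i: a[i])  # probe indices by rank
        out = [0.0] * n_probes
        for rank, probe_index in enumerate(order):
            out[probe_index] = rank_means[rank]
        normalized.append(out)
    return normalized


def passes_flag_filter(flags, min_detected):
    """Keep a probe if enough samples are flagged Present or Marginal
    (e.g. at least 4 of 8 in the 2D samples, 8 of 12 in the 3D samples)."""
    detected = sum(flag in ("Present", "Marginal") for flag in flags)
    return detected >= min_detected


def volcano_select(log2_fold_change, p_value, min_abs_log2_fc=1.0, max_p=0.05):
    """Volcano-style filter; both thresholds are assumptions for illustration."""
    return abs(log2_fold_change) >= min_abs_log2_fc and p_value <= max_p


if __name__ == "__main__":
    print(quantile_normalize([[5, 2, 3], [4, 1, 6]]))
    print(passes_flag_filter(["Present"] * 3 + ["Marginal"] + ["Absent"] * 4, 4))
    print(volcano_select(1.5, 0.01))
```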

Inclusion Criteria

The microarray approach makes use of a combination of probes targeting known

and predicted transcripts, which made it necessary to develop a means of selecting those transcripts that

unambiguously fit the criteria of a long non-coding RNA. A long non-coding RNA was defined

as any transcriptional unit that contained no evidence of a significant open reading frame and did

not coincide with the location of known protein coding genes or pseudogenes. The criteria are

detailed in the results section of this thesis.

Candidate Selection Criteria

Application of the inclusion criteria and annotating transcripts as described created a uniform list

of predicted and known lncRNAs. To establish their importance and relevance to this study on

Myc-regulated lncRNAs and their importance to cancer, the candidate selection criteria below

were applied. These criteria were aimed at creating a manageable and relevant list of candidate

lncRNAs for expression and functional validation.

Myc Binding Status

The ENCODE project contains information pertaining to transcription factor

binding obtained through the use of chromatin immunoprecipitation followed by

high-throughput, next-generation sequencing (ChIP-seq) (123). The sequencing read

peaks from Myc ChIP-seq experiments performed in cell lines of diverse origins

can be visualized in the UCSC Genome Browser. As such, the Myc binding status

of each lncRNA candidate could be evaluated with preference given to those that

were Myc-bound within 5kb of their annotated transcriptional start site.

EST expression status

The National Center for Biotechnology Information (NCBI) contains many

resources pertaining to genes and expression including one called Unigene (142).

Unigene clusters expressed sequence tags that seem to be transcribed from the same

locus (143). The tissue of origin of each EST taken relative to all ESTs in a given

cluster can be used to suggest the tissue type restricted expression of the cluster.

Candidates with restricted or enriched expression in mammary glands or mammary gland tumours

were given preference.

GEO evidence

The Gene Expression Omnibus (GEO) is also a part of NCBI and is interfaced with

the EST clusters as annotated by Unigene. NCBI GEO is an archive of expression

data acquired through both microarray and high-throughput sequencing

technologies (144). GEO Profiles, a portion of this database, provides the

information of a single gene or EST cluster across array experiments. The profiles

of individual EST clusters that represented candidate lncRNAs were used to provide

expression evidence relevant to the biological question.

Abundance

The abundance of lncRNA transcript species has been a debated issue, but in

general, lncRNA genes are expressed at lower steady-state levels than their protein-coding

gene counterparts in whole cell RNA isolates. A relative means of assessing

the abundance of these transcripts was needed in order to select those transcripts of

higher relative abundance in the array experiment. Therefore, the normalized

expression values, those values that were used to directly calculate fold change after

all pre-processing, of all probes targeting significantly changed lncRNAs in a given

experiment on the array were plotted as a function of their p-values in order to show

the distribution (Figure 14). The median value of the normalized expression in the

dataset was then plotted as a line. Any probes that had normalized expression

values in the compared conditions above the median values were considered

abundant by this analysis and given preference.

Figure 14: Distribution of differentially expressed genes by normalized array intensity

A representative distribution of significant, differentially expressed lncRNA genes’ normalized

intensity as a function of their p-value. The T58A and GFP normalized intensities of the 33

genes from Figure 9 in the union of T58A and mitogen are plotted. The median of 6.496 is

plotted on the y-axis. This plot was used to demonstrate the relative abundance of transcript in a

given comparison.

COSMIC CNV evidence

The Catalogue of Somatic Mutations in Cancer (COSMIC) is a database that allows

for the analysis of somatic mutations in cancer using a simple user interface (130).

Making use of this resource, candidate lncRNA loci were viewed and analyzed for

the presence of single nucleotide polymorphisms (SNPs) or larger scale copy

number variations (CNVs). Candidates that were mutated, gained, or lost in human

breast cancer were given preference.
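Two of the candidate selection criteria above are numeric and can be sketched directly: Myc binding within 5kb of the annotated transcriptional start site, and abundance above the median normalized expression. The code below is an illustration only. The function names and coordinates are hypothetical, peaks are assumed to lie on the same chromosome as the start site, and "abundant" is interpreted here as exceeding the median in both compared conditions, which is one reading of the criterion.

```python
from statistics import median


def myc_bound_near_tss(tss, peaks, window=5000):
    """True if any ChIP-seq peak (start, end) lies within `window` bp of the TSS.

    All coordinates are assumed to be on the same chromosome.
    """
    return any(start - window <= tss <= end + window for start, end in peaks)


def abundant_probes(expression):
    """Return probes whose normalized expression exceeds the dataset median
    in both compared conditions.

    expression: dict mapping probe name -> (value_condition_a, value_condition_b).
    """
    all_values = [v for pair in expression.values() for v in pair]
    cutoff = median(all_values)
    return [probe for probe, pair in expression.items() if all(v > cutoff for v in pair)]


if __name__ == "__main__":
    # Hypothetical TSS position and peak intervals
    print(myc_bound_near_tss(10_000, [(14_000, 14_500)]))
    # Hypothetical normalized intensities for three probes
    print(abundant_probes({"a": (8.0, 9.0), "b": (5.0, 6.0), "c": (7.0, 4.0)}))
```

In the thesis these criteria were applied as preferences during manual inspection rather than as hard filters, so a scripted version would only pre-rank candidates.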

4.10 Nuclear-Cytoplasmic Partitioning

This protocol was adapted from Fish et al (145). MCF-10A cells were grown to 70-80%

confluence, the media transferred to a separate tube and the cells were washed two times with

1xPBS warmed to 37°C. Cells were detached using 1mL of 1x trypsin incubated at 37°C for 20-

30 minutes. Residual trypsin was used to collect and suspend cells, which were then added to the

previously collected media to inactivate trypsin. Cells were centrifuged at 800g for 5 minutes at

4°C. After discarding the supernatant, cell pellets were then washed with cold 1xPBS and

centrifuged as above. Pellets were resuspended in 1mL of cold 1xPBS and transferred to a sterile

Eppendorf tube. Cells were once again centrifuged as above, and supernatants were discarded.

To the cell pellet, 175µl of fresh buffer RLN (50mM Tris-HCl, pH 8.0; 140mM NaCl; 1.5mM

MgCl2; 0.5% Nonidet P-40 or IGEPAL CA-630; 0.2units/ul RNaseOUT; 1mM dithiothreitol

[DTT]) was added and used to carefully resuspend cells. Cell suspension in buffer RLN was

incubated for 5 minutes on ice. Suspension was subjected to centrifugation at 300g for 2 minutes

at 4°C. The supernatant (cytoplasmic fraction) was removed and kept on ice. The nuclear pellet

was washed with 500µl of cold 1xPBS and centrifuged at 300g for 5 minutes at 4°C, and the supernatant

was discarded. Lastly, 1mL of TRIZOL was added to both fractions for RNA isolation. Subsequently,

RNA extraction, cDNA synthesis, and absolute quantification by qRT-PCR were performed as

described previously.
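The absolute quantification referred to here relies on a standard curve built from plasmid dilutions of known copy number (Section 4.3). A minimal sketch of that calculation follows, assuming the usual linear relationship between Ct and log10 of the copy number. The function names and the dilution series values are hypothetical.

```python
import math


def fit_standard_curve(copies, cts):
    """Least-squares fit of Ct = slope * log10(copies) + intercept."""
    xs = [math.log10(c) for c in copies]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(cts) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, cts))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def copies_from_ct(ct, slope, intercept):
    """Invert the standard curve to estimate copy number for a sample Ct."""
    return 10 ** ((ct - intercept) / slope)


if __name__ == "__main__":
    # Hypothetical ten-fold plasmid dilution series and measured Ct values
    standards = [1e3, 1e4, 1e5, 1e6]
    measured = [30.0, 26.68, 23.36, 20.04]
    slope, intercept = fit_standard_curve(standards, measured)
    print(slope, intercept)
    print(copies_from_ct(25.0, slope, intercept))
```

The same curve can be applied to the nuclear and cytoplasmic fractions separately, and the luciferase spike-in described in Section 4.3 offers a way to correct each fraction for recovery.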

4.11 ChIP-qRT-PCR

Sub-confluent MCF10A cells were cross-linked with 1% formaldehyde for 10 minutes at room

temperature. To quench the cross-linking reaction, 1M glycine was added to a final concentration

of 0.125M for 10 minutes at room temperature. Two washes with cold 1xPBS followed. One

milliliter of cold 1xPBS + protease inhibitors was added and cells were harvested with a cell

scraper. Cells were centrifuged at 425g and cell pellets were resuspended in 1ml nuclei lysis

buffer (1% SDS, 10mM EDTA and 50mM Tris, pH 8.1) with protease inhibitors per $2.0 \times 10^{7}$ cells.

Resuspended cells were incubated on ice for 10 minutes and then sonicated with 20 pulses

(setting high, 30s per pulse, 30s on ice between pulses) using a BioRuptor Sonicator (Diagenode,

BioRuptor 200, UCD-200 TM-EX) to generate fragments between 100 and 1000bp, as confirmed by

agarose gel electrophoresis.

Sonicated cells were centrifuged at 15000g at 4°C for 10 minutes and supernatants were removed

and pooled. Using IP dilution buffer (0.01% SDS, 1.1% Triton X-100, 1.2mM EDTA, 16.7mM

Tris-HCl, pH 8.1, 167mM NaCl), pooled supernatants were diluted ~10 times. Beforehand,

protein G agarose beads had been washed two times with sonication buffer and then blocked in

sonication buffer containing salmon sperm DNA (50µg/ml) overnight at 4°C. 60µl of prepared

protein G agarose beads were used to pre-clear the chromatin by incubating at 4°C for 1 hour

with rotation. Beads were then centrifuged at 5000g for 1 minute. The supernatant was collected

and aliquoted according to relative number of cells per reaction with the corresponding antibody

added to each. Reactions were incubated overnight at 4°C with rotation.

60µl of blocked protein G agarose was added to the reactions and incubated for 3 hours at 4°C

with rotation. The beads were pulled down by centrifugation at 5000g for 1 minute. The

supernatant was discarded except for 200µl from the IgG sample, which was kept on ice for use as a total

input control. The beads were then washed with 1ml of each cold buffer below by resuspending

the beads in the buffer, incubating for 3-5 minutes with rotation, and centrifuging at 5000g for 1

minute. The supernatant was removed prior to the addition of the next buffer.

1X low salt immune complex wash buffer (0.1% SDS, 1% Triton X-100, 2mM EDTA,

20mM Tris-HCl, pH 8.1, 150mM NaCl)

1X high salt immune complex wash buffer (0.1% SDS, 1% Triton X-100, 2mM EDTA,

20mM Tris-HCl, pH 8.1, 500mM NaCl)

1X lithium chloride wash buffer (0.25M LiCl, 1% IGEPAL CA630, 1% deoxycholic acid

[sodium salt], 1mM EDTA, 10mM Tris, pH 8.1)

2X Tris-EDTA (TE) buffer (10mM Tris-HCl, pH 8.0, 1mM EDTA)

After the washes, fresh elution buffer (1% SDS, 100mM NaHCO3) was prepared; 200µl is required per

tube. 100µl was added to each tube and mixed by gentle flicking, followed by incubation at 65°C for 15 minutes. Beads

were pulled down by centrifuging at 5000g for 1 minute, the supernatant was collected in new tubes, and the elution was

repeated with the remaining 100µl/tube, combining eluates. To all tubes, including the total input

control, 8µl of 5M NaCl was added, followed by overnight incubation to begin reverse cross-linking. RNase

A was then added and incubated for 30 minutes at 37°C, and reverse cross-linking was completed by adding 4µl of

500mM EDTA, 8µl of 1M Tris-HCl (pH 6.5), and 1µl of Proteinase K, followed by incubation

at 45°C for 1-2 hours.

5µl of 3M NaOAc was added to adjust pH for purification of the DNA which utilized the silica

column based QIAquick PCR Purification Kit (QIAGEN, cat. no. 28106) according to

manufacturer’s protocol. DNA was eluted from the columns with 60µl of RNase-, DNase-free

water and frozen at -20°C for further use.

Relative DNA amounts were measured by qRT-PCR as described previously with primers

designed around Myc ChIP-seq peaks as described in the Candidate Selection section.

4.12 Statistical Analysis

Statistical analysis was performed using GraphPad Prism software. In addition, calculation of the

hypergeometric distribution utilized the R statistical programming language.
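The hypergeometric calculation was carried out in R. As an illustration only, the same distribution can be written out in Python. The function names are hypothetical, and the example of testing whether two gene lists overlap more than expected by chance is an assumed use case rather than one stated in this section.

```python
from math import comb


def hypergeom_pmf(k, population, successes, draws):
    """P(X = k): exactly k successes when drawing `draws` items without
    replacement from `population` items, of which `successes` are successes."""
    return comb(successes, k) * comb(population - successes, draws - k) / comb(population, draws)


def hypergeom_upper_tail(k, population, successes, draws):
    """P(X >= k), the usual one-sided enrichment p-value."""
    upper = min(successes, draws)
    return sum(hypergeom_pmf(i, population, successes, draws) for i in range(k, upper + 1))


if __name__ == "__main__":
    # Hypothetical example: 10 genes in total, 4 in list A, 3 in list B,
    # and at least 1 shared between the two lists.
    print(hypergeom_upper_tail(1, population=10, successes=4, draws=3))
```

In R, the equivalent upper-tail probability is obtained with `phyper(k - 1, successes, population - successes, draws, lower.tail = FALSE)`.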

References

1. Eilers,M. and Eisenman,R.N. (2008) Myc's broad reach Genes Dev., 22, 2755-2766.

2. Meyer,N. and Penn,L.Z. (2008) Reflecting on 25 years with MYC. Nat. Rev. Cancer., 8, 976-

990.

3. Levens,D. (2010) You don't muck with MYC Genes Cancer., 1, 547-554.

4. Hann,S.R. and Eisenman,R.N. (1984) Proteins encoded by the human c-myc oncogene:

Differential expression in neoplastic cells Mol. Cell. Biol., 4, 2486-2497.

5. Dani,C., Blanchard,J.M., Piechaczyk,M., El Sabouty,S., Marty,L. and Jeanteur,P. (1984)

Extreme instability of myc mRNA in normal and transformed human cells Proc. Natl. Acad. Sci.

U. S. A., 81, 7046-7050.

6. Dang,C.V. (2012) MYC on the path to cancer Cell, 149, 22-35.

7. Payne,G.S., Bishop,J.M. and Varmus,H.E. (1982) Multiple arrangements of viral DNA and an

activated host oncogene in bursal lymphomas Nature, 295, 209-214.

8. Steffen,D. (1984) Proviruses are adjacent to c-myc in some murine leukemia virus-induced

lymphomas Proc. Natl. Acad. Sci. U. S. A., 81, 2097-2101.

9. Hayward,W.S., Neel,B.G. and Astrin,S.M. (1981) Activation of a cellular onc gene by

promoter insertion in ALV-induced lymphoid leukosis Nature, 290, 475-480.

10. de Klein,A., van Kessel,A.G., Grosveld,G., Bartram,C.R., Hagemeijer,A., Bootsma,D.,

Spurr,N.K., Heisterkamp,N., Groffen,J. and Stephenson,J.R. (1982) A cellular oncogene is

translocated to the philadelphia chromosome in chronic myelocytic leukaemia Nature, 300, 765-

767.

11. Collins,S. and Groudine,M. (1982) Amplification of endogenous myc-related DNA

sequences in a human myeloid leukaemia cell line Nature, 298, 679-681.

12. Wilkins,J.A. and Sansom,O.J. (2008) C-myc is a critical mediator of the phenotypes of apc

loss in the intestine Cancer Res., 68, 4963-4966.

13. Dalla-Favera,R., Bregni,M., Erikson,J., Patterson,D., Gallo,R.C. and Croce,C.M. (1982)

Human c-myc onc gene is located on the region of chromosome 8 that is translocated in burkitt

lymphoma cells Proc. Natl. Acad. Sci. U. S. A., 79, 7824-7827.

14. Taub,R., Kirsch,I., Morton,C., Lenoir,G., Swan,D., Tronick,S., Aaronson,S. and Leder,P.

(1982) Translocation of the c-myc gene into the immunoglobulin heavy chain locus in human

burkitt lymphoma and murine plasmacytoma cells Proc. Natl. Acad. Sci. U. S. A., 79, 7837-7841.

15. Beroukhim,R., Mermel,C.H., Porter,D., Wei,G., Raychaudhuri,S., Donovan,J., Barretina,J.,

Boehm,J.S., Dobson,J., Urashima,M., et al. (2010) The landscape of somatic copy-number

alteration across human cancers Nature, 463, 899-905.

16. He,T.C., Sparks,A.B., Rago,C., Hermeking,H., Zawel,L., da Costa,L.T., Morin,P.J.,

Vogelstein,B. and Kinzler,K.W. (1998) Identification of c-MYC as a target of the APC pathway

Science, 281, 1509-1512.

17. Palomero,T., Lim,W.K., Odom,D.T., Sulis,M.L., Real,P.J., Margolin,A., Barnes,K.C.,

O'Neil,J., Neuberg,D., Weng,A.P., et al. (2006) NOTCH1 directly regulates c-MYC and

activates a feed-forward-loop transcriptional network promoting leukemic cell growth Proc.

Natl. Acad. Sci. U. S. A., 103, 18261-18266.

18. Sharma,V.M., Calvo,J.A., Draheim,K.M., Cunningham,L.A., Hermance,N., Beverly,L.,

Krishnamoorthy,V., Bhasin,M., Capobianco,A.J. and Kelliher,M.A. (2006) Notch1 contributes to

mouse T-cell leukemia by directly inducing the expression of c-myc Mol. Cell. Biol., 26, 8022-

8031.

19. Weng,A.P., Millholland,J.M., Yashiro-Ohtani,Y., Arcangeli,M.L., Lau,A., Wai,C., Del

Bianco,C., Rodriguez,C.G., Sai,H., Tobias,J., et al. (2006) c-myc is an important direct target of

Notch1 in T-cell acute lymphoblastic leukemia/lymphoma Genes Dev., 20, 2096-2109.

20. Soucek,L., Whitfield,J., Martins,C.P., Finch,A.J., Murphy,D.J., Sodir,N.M., Karnezis,A.N.,

Swigart,L.B., Nasi,S. and Evan,G.I. (2008) Modelling myc inhibition as a cancer therapy Nature,

455, 679-683.

21. Duesberg,P.H. and Vogt,P.K. (1979) Avian acute leukemia viruses MC29 and MH2 share

specific RNA sequences: Evidence for a second class of transforming genes Proc. Natl. Acad.

Sci. U. S. A., 76, 1633-1637.

22. Hu,S.S., Lai,M.M. and Vogt,P.K. (1979) Genome of avian myelocytomatosis virus MC29:

Analysis by heteroduplex mapping Proc. Natl. Acad. Sci. U. S. A., 76, 1265-1268.

23. Sheiness,D. and Bishop,J.M. (1979) DNA and RNA from uninfected vertebrate cells contain

nucleotide sequences related to the putative transforming gene of avian myelocytomatosis virus

J. Virol., 31, 514-521.

24. Vennstrom,B., Sheiness,D., Zabielski,J. and Bishop,J.M. (1982) Isolation and

characterization of c-myc, a cellular homolog of the oncogene (v-myc) of avian

myelocytomatosis virus strain 29 J. Virol., 42, 773-779.

25. Nau,M.M., Brooks,B.J., Battey,J., Sausville,E., Gazdar,A.F., Kirsch,I.R., McBride,O.W.,

Bertness,V., Hollis,G.F. and Minna,J.D. (1985) L-myc, a new myc-related gene amplified and

expressed in human small cell lung cancer Nature, 318, 69-73.

26. Kohl,N.E., Gee,C.E. and Alt,F.W. (1984) Activated expression of the N-myc gene in human

neuroblastomas and related tumors Science, 226, 1335-1337.

27. Brodeur,G.M., Seeger,R.C., Schwab,M., Varmus,H.E. and Bishop,J.M. (1984) Amplification

of N-myc in untreated human neuroblastomas correlates with advanced disease stage Science,

224, 1121-1124.

28. Ingvarsson,S., Asker,C., Axelson,H., Klein,G. and Sumegi,J. (1988) Structure and expression

of B-myc, a new member of the myc gene family Mol. Cell. Biol., 8, 3168-3174.

29. Sugiyama,A., Kume,A., Nemoto,K., Lee,S.Y., Asami,Y., Nemoto,F., Nishimura,S. and

Kuchino,Y. (1989) Isolation and characterization of s-myc, a member of the rat myc gene family.

Proc. Natl. Acad. Sci. U. S. A., 86, 9144-9148.

30. Cowling,V.H. and Cole,M.D. (2006) Mechanism of transcriptional activation by the myc

oncoproteins Semin. Cancer Biol., 16, 242-252.

31. Murre,C., McCaw,P.S. and Baltimore,D. (1989) A new DNA binding and dimerization motif

in immunoglobulin enhancer binding, daughterless, MyoD, and myc proteins Cell, 56, 777-783.

32. Landschulz,W.H., Johnson,P.F. and McKnight,S.L. (1988) The leucine zipper: A

hypothetical structure common to a new class of DNA binding proteins Science, 240, 1759-1764.

33. Blackwood,E.M. and Eisenman,R.N. (1991) Max: A helix-loop-helix zipper protein that

forms a sequence-specific DNA-binding complex with myc Science, 251, 1211-1217.

34. Blackwell,T.K., Kretzner,L., Blackwood,E.M., Eisenman,R.N. and Weintraub,H. (1990)

Sequence-specific DNA binding by the c-myc protein Science, 250, 1149-1151.

35. Dang,C.V., Dolde,C., Gillison,M.L. and Kato,G.J. (1992) Discrimination between related

DNA sites by a single amino acid residue of myc-related basic-helix-loop-helix proteins Proc.

Natl. Acad. Sci. U. S. A., 89, 599-602.

36. Dang,C.V. and Lee,W.M. (1988) Identification of the human c-myc protein nuclear

translocation signal Mol. Cell. Biol., 8, 4048-4054.

37. Luscher,B. and Larsson,L.G. (1999) The basic region/helix-loop-helix/leucine zipper domain

of myc proto-oncoproteins: Function and regulation Oncogene, 18, 2955-2966.

38. Kato,G.J., Barrett,J., Villa-Garcia,M. and Dang,C.V. (1990) An amino-terminal c-myc

domain required for neoplastic transformation activates transcription Mol. Cell. Biol., 10, 5914-

5920.

39. Agrawal,P., Yu,K., Salomon,A.R. and Sedivy,J.M. (2010) Proteomic profiling of myc-

associated proteins Cell. Cycle, 9, 4908-4921.

40. McMahon,S.B., Van Buskirk,H.A., Dugan,K.A., Copeland,T.D. and Cole,M.D. (1998) The

novel ATM-related protein TRRAP is an essential cofactor for the c-myc and E2F oncoproteins

Cell, 94, 363-374.

41. Secombe,J., Li,L., Carlos,L. and Eisenman,R.N. (2007) The trithorax group protein lid is a

trimethyl histone H3K4 demethylase required for dMyc-induced cell growth Genes Dev., 21,

537-551.

42. Hanahan,D. and Weinberg,R.A. (2011) Hallmarks of cancer: The next generation Cell, 144,

646-674.

43. Kelly,K., Cochran,B.H., Stiles,C.D. and Leder,P. (1983) Cell-specific regulation of the c-

myc gene by lymphocyte mitogens and platelet-derived growth factor Cell, 35, 603-610.

44. Heikkila,R., Schwab,G., Wickstrom,E., Loke,S.L., Pluznik,D.H., Watt,R. and Neckers,L.M.

(1987) A c-myc antisense oligodeoxynucleotide inhibits entry into S phase but not progress from

G0 to G1 Nature, 328, 445-449.

45. Dang,C.V., O'Donnell,K.A., Zeller,K.I., Nguyen,T., Osthus,R.C. and Li,F. (2006) The c-myc

target gene network Semin. Cancer Biol., 16, 253-264.

46. Eilers,M., Picard,D., Yamamoto,K.R. and Bishop,J.M. (1989) Chimaeras of myc oncoprotein

and steroid receptors cause hormone-dependent transformation of cells Nature, 340, 66-68.

47. Luscher,B. and Vervoorts,J. (2012) Regulation of gene transcription by the oncoprotein

MYC. Gene, 494, 145-160.

48. Bouchard,C., Dittrich,O., Kiermaier,A., Dohmann,K., Menkel,A., Eilers,M. and Luscher,B.

(2001) Regulation of cyclin D2 gene expression by the Myc/Max/Mad network: Myc-dependent

TRRAP recruitment and histone acetylation at the cyclin D2 promoter Genes Dev., 15, 2042-

2047.

49. Frank,S.R., Schroeder,M., Fernandez,P., Taubert,S. and Amati,B. (2001) Binding of c-myc to

chromatin mediates mitogen-induced acetylation of histone H4 and gene activation. Genes Dev.,

15, 2069-2082.

50. Bouchard,C., Marquardt,J., Bras,A., Medema,R.H. and Eilers,M. (2004) Myc-induced

proliferation and transformation require akt-mediated phosphorylation of FoxO proteins EMBO

J., 23, 2830-2840.

51. Herkert,B. and Eilers,M. (2010) Transcriptional repression: The dark side of myc Genes

Cancer., 1, 580-586.

52. Ponzielli,R., Katz,S., Barsyte-Lovejoy,D. and Penn,L.Z. (2005) Cancer therapeutics:

Targeting the dark side of myc Eur. J. Cancer, 41, 2485-2501.

53. Oster,S.K., Marhin,W.W., Asker,C., Facchini,L.M., Dion,P.A., Funa,K., Post,M.,

Sedivy,J.M. and Penn,L.Z. (2000) Myc is an essential negative regulator of platelet-derived

growth factor beta receptor expression Mol. Cell. Biol., 20, 6768-6778.

54. Izumi,H., Molander,C., Penn,L.Z., Ishisaki,A., Kohno,K. and Funa,K. (2001) Mechanism for

the transcriptional repression by c-myc on PDGF beta-receptor J. Cell. Sci., 114, 1533-1544.

55. Mao,D.Y., Watson,J.D., Yan,P.S., Barsyte-Lovejoy,D., Khosravi,F., Wong,W.W.,

Farnham,P.J., Huang,T.H. and Penn,L.Z. (2003) Analysis of myc bound loci identified by CpG

island arrays shows that max is essential for myc-dependent repression Curr. Biol., 13, 882-886.

56. Herold,S., Wanzel,M., Beuger,V., Frohme,C., Beul,D., Hillukkala,T., Syvaoja,J., Saluz,H.P.,

Haenel,F. and Eilers,M. (2002) Negative regulation of the mammalian UV response by myc

through association with miz-1. Mol. Cell, 10, 509-521.

57. Brenner,C., Deplus,R., Didelot,C., Loriot,A., Vire,E., De Smet,C., Gutierrez,A., Danovi,D.,

Bernard,D., Boon,T., et al. (2005) Myc represses transcription through recruitment of DNA

methyltransferase corepressor EMBO J., 24, 336-346.

58. Wong,P.P., Miranda,F., Chan,K.V., Berlato,C., Hurst,H.C. and Scibetta,A.G. (2012) Histone

demethylase KDM5B collaborates with TFAP2C and myc to repress the cell cycle inhibitor

p21(cip) (CDKN1A) Mol. Cell. Biol., 32, 1633-1644.

59. Conzen,S.D., Gottlob,K., Kandel,E.S., Khanduri,P., Wagner,A.J., O'Leary,M. and Hay,N.

(2000) Induction of cell cycle progression and acceleration of apoptosis are two separable

functions of c-myc: Transrepression correlates with acceleration of apoptosis Mol. Cell. Biol.,

20, 6008-6018.

60. Patel,J.H., Loboda,A.P., Showe,M.K., Showe,L.C. and McMahon,S.B. (2004) Analysis of

genomic targets reveals complex functions of MYC. Nat. Rev. Cancer., 4, 562-568.

61. Bieda,M., Xu,X., Singer,M.A., Green,R. and Farnham,P.J. (2006) Unbiased location analysis

of E2F1-binding sites suggests a widespread role for E2F1 in the human genome Genome Res.,

16, 595-605.

62. Efroni,S., Duttagupta,R., Cheng,J., Dehghani,H., Hoeppner,D.J., Dash,C., Bazett-Jones,D.P.,

Le Grice,S., McKay,R.D., Buetow,K.H., et al. (2008) Global transcription in pluripotent

embryonic stem cells Cell. Stem Cell., 2, 437-447.

63. Wong,D.J., Liu,H., Ridky,T.W., Cassarino,D., Segal,E. and Chang,H.Y. (2008) Module map

of stem cell genes guides creation of epithelial cancer stem cells. Cell. Stem Cell., 2, 333-344.

64. Varlakhanova,N.V. and Knoepfler,P.S. (2009) Acting locally and globally: Myc's ever-

expanding roles on chromatin Cancer Res., 69, 7487-7490.

65. Amente,S., Lania,L. and Majello,B. (2011) Epigenetic reprogramming of myc target genes

Am. J. Cancer. Res., 1, 413-418.

66. Guccione,E., Martinato,F., Finocchiaro,G., Luzi,L., Tizzoni,L., Dall'Olio,V., Zardo,G.,

Nervi,C., Bernard,L. and Amati,B. (2006) Myc-binding-site recognition in the human genome is

determined by chromatin context Nat. Cell Biol., 8, 764-770.

67. Knoepfler,P.S., Zhang,X.Y., Cheng,P.F., Gafken,P.R., McMahon,S.B. and Eisenman,R.N.

(2006) Myc influences global chromatin structure EMBO J., 25, 2723-2734.

68. Martinato,F., Cesaroni,M., Amati,B. and Guccione,E. (2008) Analysis of myc-induced

histone modifications on target chromatin PLoS One, 3, e3650.

69. Cotterman,R., Jin,V.X., Krig,S.R., Lemen,J.M., Wey,A., Farnham,P.J. and Knoepfler,P.S.

(2008) N-myc regulates a widespread euchromatic program in the human genome partially

independent of its role as a classical transcription factor Cancer Res., 68, 9654-9662.

70. Van Dang,C. and McMahon,S.B. (2010) Emerging concepts in the analysis of transcriptional

targets of the MYC oncoprotein: Are the targets targetable? Genes Cancer., 1, 560-567.

71. Chau,C.M., Zhang,X.Y., McMahon,S.B. and Lieberman,P.M. (2006) Regulation of epstein-

barr virus latency type by the chromatin boundary factor CTCF J. Virol., 80, 5723-5732.

72. Chang,T.C., Yu,D., Lee,Y.S., Wentzel,E.A., Arking,D.E., West,K.M., Dang,C.V., Thomas-

Tikhonenko,A. and Mendell,J.T. (2008) Widespread microRNA repression by myc contributes

to tumorigenesis. Nat. Genet., 40, 43-50.

73. O'Donnell,K.A., Wentzel,E.A., Zeller,K.I., Dang,C.V. and Mendell,J.T. (2005) c-myc-

regulated microRNAs modulate E2F1 expression Nature, 435, 839-843.

74. Barsyte-Lovejoy,D., Lau,S.K., Boutros,P.C., Khosravi,F., Jurisica,I., Andrulis,I.L.,

Tsao,M.S. and Penn,L.Z. (2006) The c-myc oncogene directly induces the H19 noncoding RNA

by allele-specific binding to potentiate tumorigenesis Cancer Res., 66, 5330-5337.

75. Pachnis,V., Brannan,C.I. and Tilghman,S.M. (1988) The structure and expression of a novel

gene activated in early mouse embryogenesis EMBO J., 7, 673-681.

76. Yoo-Warren,H., Pachnis,V., Ingram,R.S. and Tilghman,S.M. (1988) Two regulatory domains

flank the mouse H19 gene Mol. Cell. Biol., 8, 4707-4715.

77. Gabory,A., Jammes,H. and Dandolo,L. (2010) The H19 locus: Role of an imprinted non-

coding RNA in growth and development Bioessays, 32, 473-480.

78. Perini,G., Diolaiti,D., Porro,A. and Della Valle,G. (2005) In vivo transcriptional regulation

of N-myc target genes is controlled by E-box methylation Proc. Natl. Acad. Sci. U. S. A., 102,

12117-12122.

79. Wutz,A. and Jaenisch,R. (2000) A shift from reversible to irreversible X inactivation is

triggered during ES cell differentiation Mol. Cell, 5, 695-705.

80. Forne,T., Oswald,J., Dean,W., Saam,J.R., Bailleul,B., Dandolo,L., Tilghman,S.M., Walter,J.

and Reik,W. (1997) Loss of the maternal H19 gene induces changes in Igf2 methylation in both

cis and trans Proc. Natl. Acad. Sci. U. S. A., 94, 10243-10248.

81. Kapranov,P. and St Laurent,G. (2012) Dark matter RNA: Existence, function, and

controversy Front. Genet., 3, 60.

82. ENCODE Project Consortium, Birney,E., Stamatoyannopoulos,J.A., Dutta,A., Guigo,R.,

Gingeras,T.R., Margulies,E.H., Weng,Z., Snyder,M., Dermitzakis,E.T., et al. (2007)

Identification and analysis of functional elements in 1% of the human genome by the ENCODE

pilot project Nature, 447, 799-816.

83. Johnson,J.M., Edwards,S., Shoemaker,D. and Schadt,E.E. (2005) Dark matter in the genome:

Evidence of widespread transcription detected by microarray tiling experiments. Trends Genet.,

21, 93-102.

84. Mercer,T.R., Gerhardt,D.J., Dinger,M.E., Crawford,J., Trapnell,C., Jeddeloh,J.A.,

Mattick,J.S. and Rinn,J.L. (2011) Targeted RNA sequencing reveals the deep complexity of the

human transcriptome Nat. Biotechnol., 30, 99-104.

85. Ponting,C.P. and Belgard,T.G. (2010) Transcribed dark matter: Meaning or myth? Hum.

Mol. Genet., 19, R162-8.

86. van Bakel,H., Nislow,C., Blencowe,B.J. and Hughes,T.R. (2010) Most "dark matter"

transcripts are associated with known genes PLoS Biol., 8, e1000371.

87. van Bakel,H., Nislow,C., Blencowe,B.J. and Hughes,T.R. (2011) Response to "The reality of

pervasive transcription". PLoS Biol, 9, Epub.

88. Clark,M.B., Amaral,P.P., Schlesinger,F.J., Dinger,M.E., Taft,R.J., Rinn,J.L., Ponting,C.P.,

Stadler,P.F., Morris,K.V., Morillon,A., et al. (2011) The reality of pervasive transcription. PLoS

Biol., 9, e1000625; discussion e1001102.

89. Mattick,J.S. (2009) The genetic signatures of noncoding RNAs PLoS Genet., 5, e1000459.

90. Guttman,M., Amit,I., Garber,M., French,C., Lin,M.F., Feldser,D., Huarte,M., Zuk,O.,

Carey,B.W., Cassady,J.P., et al. (2009) Chromatin signature reveals over a thousand highly

conserved large non-coding RNAs in mammals. Nature, 458, 223-227.

91. Hung,T., Wang,Y., Lin,M.F., Koegel,A.K., Kotake,Y., Grant,G.D., Horlings,H.M., Shah,N.,

Umbricht,C., Wang,P., et al. (2011) Extensive and coordinated transcription of noncoding RNAs

within cell-cycle promoters. Nat. Genet., 43, 621-629.

92. Ulitsky,I., Shkumatava,A., Jan,C.H., Sive,H. and Bartel,D.P. (2011) Conserved function of

lincRNAs in vertebrate embryonic development despite rapid sequence evolution Cell, 147,

1537-1550.

93. Mattick,J.S. (2011) The central role of RNA in human development and cognition FEBS

Lett., 585, 1600-1616.

94. Mercer,T.R., Dinger,M.E. and Mattick,J.S. (2009) Long non-coding RNAs: Insights into

functions Nat. Rev. Genet., 10, 155-159.

95. Gibb,E.A., Brown,C.J. and Lam,W.L. (2011) The functional role of long non-coding RNA in

human carcinomas. Mol. Cancer., 10, 38.

96. Ravasi,T., Suzuki,H., Pang,K.C., Katayama,S., Furuno,M., Okunishi,R., Fukuda,S., Ru,K.,

Frith,M.C., Gongora,M.M., et al. (2006) Experimental validation of the regulated expression of

large numbers of non-coding RNAs from the mouse genome Genome Res., 16, 11-19.

97. Zhang,X., Lian,Z., Padden,C., Gerstein,M.B., Rozowsky,J., Snyder,M., Gingeras,T.R.,

Kapranov,P., Weissman,S.M. and Newburger,P.E. (2009) A myelopoiesis-associated regulatory

intergenic noncoding RNA transcript within the human HOXA cluster. Blood, 113, 2526-2534.

98. Wang,K.C. and Chang,H.Y. (2011) Molecular mechanisms of long noncoding RNAs. Mol.

Cell, 43, 904-914.

99. Rinn,J.L., Kertesz,M., Wang,J.K., Squazzo,S.L., Xu,X., Brugmann,S.A., Goodnough,L.H.,

Helms,J.A., Farnham,P.J., Segal,E., et al. (2007) Functional demarcation of active and silent

chromatin domains in human HOX loci by noncoding RNAs Cell, 129, 1311-1323.

100. Tsai,M.C., Manor,O., Wan,Y., Mosammaparast,N., Wang,J.K., Lan,F., Shi,Y., Segal,E. and

Chang,H.Y. (2010) Long noncoding RNA as modular scaffold of histone modification

complexes Science, 329, 689-693.

101. Kaneko,S., Li,G., Son,J., Xu,C.F., Margueron,R., Neubert,T.A. and Reinberg,D. (2010)

Phosphorylation of the PRC2 component Ezh2 is cell cycle-regulated and up-regulates its

binding to ncRNA Genes Dev., 24, 2615-2620.

102. Zeng,X., Chen,S. and Huang,H. (2011) Phosphorylation of EZH2 by CDK1 and CDK2: A

possible regulatory mechanism of transmission of the H3K27me3 epigenetic mark through cell

divisions Cell. Cycle, 10, 579-583.

103. Gupta,R.A., Shah,N., Wang,K.C., Kim,J., Horlings,H.M., Wong,D.J., Tsai,M.C., Hung,T.,

Argani,P., Rinn,J.L., et al. (2010) Long non-coding RNA HOTAIR reprograms chromatin state

to promote cancer metastasis. Nature, 464, 1071-1076.

104. Kogo,R., Shimamura,T., Mimori,K., Kawahara,K., Imoto,S., Sudo,T., Tanaka,F.,

Shibata,K., Suzuki,A., Komune,S., et al. (2011) Long noncoding RNA HOTAIR regulates

polycomb-dependent chromatin modification and is associated with poor prognosis in colorectal

cancers Cancer Res., 71, 6320-6326.

105. Geng,Y.J., Xie,S.L., Li,Q., Ma,J. and Wang,G.Y. (2011) Large intervening non-coding

RNA HOTAIR is associated with hepatocellular carcinoma progression J. Int. Med. Res., 39,

2119-2128.

106. Xu,J., Chen,Y. and Olopade,O.I. (2010) MYC and breast cancer Genes Cancer., 1, 629-640.

107. Soule,H.D., Maloney,T.M., Wolman,S.R., Peterson,W.D.,Jr, Brenz,R., McGrath,C.M.,

Russo,J., Pauley,R.J., Jones,R.F. and Brooks,S.C. (1990) Isolation and characterization of a

spontaneously immortalized human breast epithelial cell line, MCF-10 Cancer Res., 50, 6075-

6086.

108. Tait,L., Soule,H.D. and Russo,J. (1990) Ultrastructural and immunocytochemical

characterization of an immortalized human breast epithelial cell line, MCF-10 Cancer Res., 50,

6087-6094.

109. Stampfer,M.R. and Yaswen,P. (2000) Culture models of human mammary epithelial cell

transformation J. Mammary Gland Biol. Neoplasia, 5, 365-378.

110. Debnath,J., Muthuswamy,S.K. and Brugge,J.S. (2003) Morphogenesis and oncogenesis of

MCF-10A mammary epithelial acini grown in three-dimensional basement membrane cultures

Methods, 30, 256-268.

111. Merlo,G.R., Basolo,F., Fiore,L., Duboc,L. and Hynes,N.E. (1995) p53-dependent and p53-

independent activation of apoptosis in mammary epithelial cells reveals a survival function of

EGF and insulin J. Cell Biol., 128, 1185-1196.

112. LeVea,C.M., Reeder,J.E. and Mooney,R.A. (2004) EGF-dependent cell cycle progression is

controlled by density-dependent regulation of akt activation Exp. Cell Res., 297, 272-284.

113. Chou,J.L., Fan,Z., DeBlasio,T., Koff,A., Rosen,N. and Mendelsohn,J. (1999) Constitutive

overexpression of cyclin D1 in human breast epithelial cells does not prevent G1 arrest induced

by deprivation of epidermal growth factor Breast Cancer Res. Treat., 55, 267-283.

114. Wasylishen,A.R., Stojanova,A., Oliveri,S., Rust,A.C., Schimmer,A.D. and Penn,L.Z.

(2011) New model systems provide insights into myc-induced transformation. Oncogene, 30,

3727-3734.

115. Sears,R., Nuckolls,F., Haura,E., Taya,Y., Tamai,K. and Nevins,J.R. (2000) Multiple ras-

dependent phosphorylation pathways regulate myc protein stability Genes Dev., 14, 2501-2514.

116. Sears,R.C. (2004) The life cycle of C-myc: From synthesis to degradation Cell. Cycle, 3,

1133-1137.

117. Debnath,J. and Brugge,J.S. (2005) Modelling glandular epithelial cancers in three-

dimensional cultures Nat. Rev. Cancer., 5, 675-688.

118. Kleinman,H.K., McGarvey,M.L., Hassell,J.R., Star,V.L., Cannon,F.B., Laurie,G.W. and

Martin,G.R. (1986) Basement membrane complexes with biological activity Biochemistry, 25,

312-318.

119. Weaver,V.M. and Bissell,M.J. (1999) Functional culture models to study mechanisms

governing apoptosis in normal and malignant mammary epithelial cells J. Mammary Gland Biol.

Neoplasia, 4, 193-201.

120. Petersen,O.W., Ronnov-Jessen,L., Howlett,A.R. and Bissell,M.J. (1992) Interaction with

basement membrane serves to rapidly distinguish growth and differentiation pattern of normal

and malignant human breast epithelial cells Proc. Natl. Acad. Sci. U. S. A., 89, 9064-9068.

121. Flicek,P., Amode,M.R., Barrell,D., Beal,K., Brent,S., Carvalho-Silva,D., Clapham,P.,

Coates,G., Fairley,S., Fitzgerald,S., et al. (2012) Ensembl 2012 Nucleic Acids Res., 40, D84-90.

122. Dreszer,T.R., Karolchik,D., Zweig,A.S., Hinrichs,A.S., Raney,B.J., Kuhn,R.M.,

Meyer,L.R., Wong,M., Sloan,C.A., Rosenbloom,K.R., et al. (2012) The UCSC genome browser

database: Extensions and updates 2011 Nucleic Acids Res., 40, D918-23.

123. Raney,B.J., Cline,M.S., Rosenbloom,K.R., Dreszer,T.R., Learned,K., Barber,G.P.,

Meyer,L.R., Sloan,C.A., Malladi,V.S., Roskin,K.M., et al. (2011) ENCODE whole-genome data

in the UCSC genome browser (2011 update) Nucleic Acids Res., 39, D871-5.

124. Eyras,E., Caccamo,M., Curwen,V. and Clamp,M. (2004) ESTGenes: Alternative splicing

from ESTs in ensembl Genome Res., 14, 976-987.

125. Slater,G.S. and Birney,E. (2005) Automated generation of heuristics for biological sequence

comparison BMC Bioinformatics, 6, 31.

126. Camacho,C., Coulouris,G., Avagyan,V., Ma,N., Papadopoulos,J., Bealer,K. and

Madden,T.L. (2009) BLAST+: Architecture and applications BMC Bioinformatics, 10, 421.

127. Mott,R. (1997) EST_GENOME: A program to align spliced DNA sequences to unspliced

genomic DNA Comput. Appl. Biosci., 13, 477-478.

128. Birney,E., Clamp,M. and Durbin,R. (2004) GeneWise and genomewise Genome Res., 14,

988-995.

129. Poliseno,L., Salmena,L., Zhang,J., Carver,B., Haveman,W.J. and Pandolfi,P.P. (2010) A

coding-independent function of gene and pseudogene mRNAs regulates tumour biology Nature,

465, 1033-1038.

130. Forbes,S.A., Tang,G., Bindal,N., Bamford,S., Dawson,E., Cole,C., Kok,C.Y., Jia,M.,

Ewing,R., Menzies,A., et al. (2010) COSMIC (the catalogue of somatic mutations in cancer): A

resource to investigate acquired mutations in human cancer Nucleic Acids Res., 38, D652-7.

131. Schoggins,J.W., Wilson,S.J., Panis,M., Murphy,M.Y., Jones,C.T., Bieniasz,P. and

Rice,C.M. (2011) A diverse range of gene products are effectors of the type I interferon antiviral

response Nature, 472, 481-485.

132. Kondoh,N., Wakatsuki,T., Ryo,A., Hada,A., Aihara,T., Horiuchi,S., Goseki,N.,

Matsubara,O., Takenaka,K., Shichita,M., et al. (1999) Identification and characterization of

genes associated with human hepatocellular carcinogenesis Cancer Res., 59, 4990-4996.

133. Geisler,S., Lojek,L., Khalil,A.M., Baker,K.E. and Coller,J. (2012) Decapping of long

noncoding RNAs regulates inducible genes Mol. Cell, 45, 279-291.

134. Bueno,M.J., Perez de Castro,I. and Malumbres,M. (2008) Control of cell proliferation

pathways by microRNAs Cell. Cycle, 7, 3143-3148.

135. Martin,J.A. and Wang,Z. (2011) Next-generation transcriptome assembly. Nat. Rev. Genet.,

12, 671-682.

136. Reis,E.M. and Verjovski-Almeida,S. (2012) Perspectives of long non-coding RNAs in

cancer diagnostics Front. Genet., 3, 32.

137. Lee,G.L., Dobi,A. and Srivastava,S. (2011) Prostate cancer: Diagnostic performance of the

PCA3 urine test Nat. Rev. Urol., 8, 123-124.

138. Chen,J., Lovell,J.F., Lo,P.C., Stefflova,K., Niedre,M., Wilson,B.C. and Zheng,G. (2008) A

tumor mRNA-triggered photodynamic molecular beacon based on oligonucleotide hairpin

control of singlet oxygen production Photochem. Photobiol. Sci., 7, 775-781.

139. Chomczynski,P. and Sacchi,N. (1987) Single-step method of RNA isolation by acid

guanidinium thiocyanate-phenol-chloroform extraction Anal. Biochem., 162, 156-159.

140. Rychlik,W. (2007) OLIGO 7 primer analysis software Methods Mol. Biol., 402, 35-60.

141. Callus,B.A., Ekert,P.G., Heraud,J.E., Jabbour,A.M., Kotevski,A., Vince,J.E., Silke,J. and

Vaux,D.L. (2008) Cytoplasmic p53 is not required for PUMA-induced apoptosis Cell Death

Differ., 15, 213-5; author reply 215-6.

142. Sayers,E.W., Barrett,T., Benson,D.A., Bolton,E., Bryant,S.H., Canese,K., Chetvernin,V.,

Church,D.M., Dicuccio,M., Federhen,S., et al. (2012) Database resources of the national center

for biotechnology information Nucleic Acids Res., 40, D13-25.

143. Schuler,G.D. (1997) Pieces of the puzzle: Expressed sequence tags and the catalog of

human genes J. Mol. Med. (Berl), 75, 694-698.

144. Barrett,T., Troup,D.B., Wilhite,S.E., Ledoux,P., Evangelista,C., Kim,I.F., Tomashevsky,M.,

Marshall,K.A., Phillippy,K.H., Sherman,P.M., et al. (2011) NCBI GEO: Archive for functional

genomics data sets--10 years on Nucleic Acids Res., 39, D1005-10.

145. Fish,J.E., Matouk,C.C., Yeboah,E., Bevan,S.C., Khan,M., Patil,K., Ohh,M. and

Marsden,P.A. (2007) Hypoxia-inducible expression of a natural cis-antisense transcript inhibits

endothelial nitric-oxide synthase J. Biol. Chem., 282, 15652-15666.

Appendices

Supplemental Figure 1: Cell cycle seeding density optimization

MCF-10A cells were seeded at the low densities indicated at left and allowed to recover for 24 or 48 hours in full serum. Cell cycle profiles obtained by flow cytometry of fixed, propidium iodide-stained cells are shown at right. The red box indicates the selected density and recovery time after seeding.

Supplemental Figure 2: Model of Myc-dependent Gene Regulation

A) Diagram of the Myc-inducible expression system, which uses the constitutively expressed, estrogen agonist-responsive GEV16 transcription factor. Upon induction with 4-hydroxytamoxifen (4HT), GEV16 localizes to the nucleus and binds the 5xUAS sequences upstream of Myc to induce transcription. Reprinted by permission from Macmillan Publishers Ltd: Cell Death and Differentiation, Callus, BA et al., 2008 (141). B) Immunoblot showing expression of Myc in MCF-10A pF-Myc cells after 8 hours of induction with 4HT, compared with empty vector (pF) and ethanol-treated vehicle controls. C) MCF-10A cells were starved for 24 hours, induced with 4HT and maintained under starvation conditions, then analyzed by flow cytometry of fixed, propidium iodide-stained cells. D) RNA was isolated from the cells in C and analyzed for changes in Myc, Myc-induced CCND2 and CCNB1, and Myc-repressed p21 mRNA. All data presented here are representative of 2 independent experiments.

Supplemental Figure 3: lncRNA-FZD6 is dynamically regulated in cell cycle and Myc-

dependent transformation

A) Scale model of the lncRNA-FZD6 locus with annotation of mRNA expression and Myc ChIP primers. B) Verification of mitogen-dependent repression of lncRNA-FZD6 by qRT-PCR. *** represents p < 0.001 by paired t-test of three replicates. C) Verification of the decrease in expression of lncRNA-FZD6 in response to Myc-T58A overexpression in MCF-10A cells grown on Matrigel. *** represents p < 0.001 by one-way ANOVA and Bonferroni post-test of three replicates. D) Expression profiling of mature lncRNA-FZD6 and pre-lncRNA-FZD6 over time in MCF-10A cells synchronized by mitogen starvation followed by induction. The mean relative expression of three replicates is shown. Error bars in B, C and D represent standard deviation.

Supplemental Figure 4: Candidate lncRNAs are not nuclear retained under asynchronous

growing conditions

Nuclear-cytoplasmic partitioning was performed by NP-40 lysis of asynchronously growing MCF-10A cells. Absolute quantification of RPLP0 (A), the Xist positive control (B), lncRNA-LY6E (C), and lncRNA-FZD6 (D) was achieved by qRT-PCR using standard curves generated from plasmids of known copy number for each respective target. Error bars represent standard deviation of four replicates. Please note differences in scale.
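
As an illustration of absolute quantification from a plasmid standard curve, the following is a minimal sketch and not the analysis actually used for this figure. It assumes the usual linear relationship between Ct and log10(copy number). The function names and the dilution-series values are hypothetical and chosen only to make the arithmetic easy to follow.

```python
import math

def fit_standard_curve(copies, ct_values):
    """Least-squares fit of Ct = slope * log10(copies) + intercept."""
    xs = [math.log10(c) for c in copies]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ct_values) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ct_values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def copies_from_ct(ct, slope, intercept):
    """Invert the standard curve to estimate copy number from a sample Ct."""
    return 10 ** ((ct - intercept) / slope)

def amplification_efficiency(slope):
    """Efficiency implied by the slope (1.0 means perfect doubling per cycle)."""
    return 10 ** (-1.0 / slope) - 1.0

# Hypothetical 10-fold plasmid dilution series (copies per reaction) and Ct values.
standard_copies = [1e3, 1e4, 1e5, 1e6, 1e7]
standard_ct = [30.00, 26.68, 23.36, 20.04, 16.72]

slope, intercept = fit_standard_curve(standard_copies, standard_ct)
sample_copies = copies_from_ct(23.36, slope, intercept)  # hypothetical sample Ct
efficiency = amplification_efficiency(slope)
```

Each target would need its own curve, since the caption states that a separate plasmid standard was used for each transcript.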

Supplemental Figure 5: 8 hours of Myc induction under starvation conditions does not lead

to significant changes in candidate lncRNA expression

Using the inducible system described in Supplemental Figure 2 with Myc and empty vector (pF), the expression of CCND2 (A), p21 (B), lncRNA-LY6E (C) and pre-lncRNA-LY6E (D), as well as lncRNA-FZD6 (E) and pre-lncRNA-FZD6 (F), was measured by qRT-PCR relative to starved cells. The primary comparison, between MCF-10A cells starved for 24 hours and then treated for 8 hours with either ethanol (vehicle) or 4-hydroxytamoxifen, was evaluated by two-way ANOVA. ** represents p < 0.01 and error bars represent standard deviation of three replicates.
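
One common way to express qRT-PCR data relative to a calibrator condition such as starved cells is the 2^-ΔΔCt calculation, sketched below. This is an assumption for illustration only: the caption does not state which relative quantification method was used, and the reference gene, function name and Ct values here are hypothetical.

```python
def relative_expression(ct_target, ct_ref, ct_target_cal, ct_ref_cal):
    """Fold change of a target versus the calibrator by the 2^-ddCt method.

    ct_target / ct_ref: Ct of the target and reference gene in the treated sample.
    ct_target_cal / ct_ref_cal: the same two Ct values in the calibrator
    (here, hypothetically, starved cells).
    Assumes roughly 100% amplification efficiency for both assays.
    """
    delta_sample = ct_target - ct_ref
    delta_calibrator = ct_target_cal - ct_ref_cal
    return 2 ** -(delta_sample - delta_calibrator)

# Hypothetical Ct values: the target crosses threshold two cycles earlier
# after treatment while the reference gene is unchanged.
fold_change = relative_expression(24.0, 18.0, 26.0, 18.0)
unchanged = relative_expression(25.0, 18.0, 25.0, 18.0)
```

By construction the calibrator itself has a relative expression of 1, so values above 1 indicate induction and values below 1 indicate repression relative to starved cells.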