A new class of covalently closed circular RNA molecules (circRNAs) has recently become the object of intensive study. Although first described as rare events1,2,3, circRNAs have since been shown to be commonly produced by thousands of genes from Archaea to mammals4,5,6. Interestingly, in higher eukaryotes they are highly expressed in neuronal tissues and enriched at synapses, suggesting a specific involvement in neuronal processes7,8. Moreover, circRNAs are more abundant than the linear mRNA isoforms of their host genes in the neuropil and dendrites, suggesting that they may regulate synaptic function and neuronal plasticity8.
So far, very little is known about their function: some can act as sponges for microRNAs and proteins9,10,11, while others can compete with linear RNA production, thereby regulating the accumulation of full-length mRNA12. CircRNAs also regulate transcription of their parental genes by association with the RNA polymerase II machinery13. Notably, emerging data point to a potential role of circRNAs in human diseases14,15, with clear evidence of tumor-promoting properties in in vivo models16. In the nervous system, the best-studied circRNA, CDR1, was found to be expressed in neocortical and hippocampal neurons and downregulated in Alzheimer disease17. Through its ability to sponge miR-7 (refs 9, 10), it could play a crucial role in nervous system diseases by deregulating targets with important functions17,18.
CircRNAs originate from a back-splicing reaction in which a downstream 5′ splice site interacts with an upstream 3′ splice site, leading to the formation of a covalently closed circRNA19. The mechanisms underlying these events are not fully understood; however, in mammals it has been shown that complementarity between inverted sequences inside flanking introns3,20,21,22 and the activity of RNA-binding proteins (RBPs)12,23 enhance the juxtaposition of the splice sites involved in the back-splicing reaction. Muscleblind, a splicing factor encoded by the Mbl gene, was the first example of an RBP controlling circRNA levels: it regulates the circRNA derived from the second exon of its own gene by binding both flanking introns12. Afterwards, Quaking (QKI), a splicing factor that promotes myelination and oligodendrocyte differentiation24,25, was also described as a circRNA regulator23. Finally, many hnRNPs as well as SR proteins are involved in circRNA production in flies26.
The RBP FUS has a well-characterized role in splicing regulation27, with several splicing factors identified as FUS interactors28,29,30,31. FUS functions are particularly interesting since several mutations have been causally linked to amyotrophic lateral sclerosis (ALS)32,33. Most ALS-linked FUS mutations cluster in the C-terminus of the protein, in or near the nuclear localization signal. This leads to the mislocalization of the protein to the cytoplasm, with a decrease of FUS levels in the nucleus and formation of abnormal cytoplasmic aggregates32,33,34. Aberrant RNA metabolism due to FUS mutations, by gain- and/or loss-of-function, has been proposed as a key mechanism in the pathogenesis of ALS and frontotemporal dementia35; moreover, deregulation of splicing has been linked to several neurological diseases32,36,37.
In this study, we identify circRNAs expressed in in vitro-derived motor neurons (MNs) and we analyse whether FUS may be involved in the control of back-splicing events leading to circRNA formation. We characterize several circRNAs that are affected by FUS depletion and by FUS mutations associated with familial forms of ALS. Notably, for selected circRNAs, we demonstrate the enrichment of FUS binding on circularizing exon–intron regions by cross-linking immunoprecipitation (CLIP) and the direct role of the protein in regulating back-splicing. Finally, most of these circRNAs are expressed in human MNs derived from induced pluripotent stem cells (iPSCs), and two of them undergo similar FUS-dependent regulation in an ALS-associated FUSP525L genetic background. Altogether, our data suggest a possible conserved function of this novel class of transcripts and provide an interesting link with the ALS pathology.
Identification of circRNAs in mESC-derived MNs
Mouse embryonic stem cells (mESCs), derived from wild-type (FUS+/+) or knockout (FUS−/−)38 FUS mice and expressing a green fluorescent protein (GFP) reporter under the control of the MN-specific Hb9 promoter (Hb9::GFP transgene)39, were differentiated into bona fide MNs according to Wichterle et al.40 (Supplementary Fig. 1a). In agreement with this procedure, the Pax6 and Olig2 transcription factors, responsible for establishing MN progenitors, were found in the Hb9::GFP− cells, while genes required for consolidation of MN identity (Hb9) and for development (Islet-1) and function (ChAT) of spinal MNs were highly enriched in Hb9::GFP+ cells. As expected, the markers for astrocytes (Gfap) and oligodendrocytes (Pdgfr-α) were almost undetectable in both cell populations, as were the V1, V2 (Bhlhe22) and V3 (Sim1) interneuron markers (Supplementary Fig. 1b,c). Total RNA from purified GFP+-FUS+/+ and GFP+-FUS−/− MNs was sequenced by ribo-Zero Next-Generation Sequencing from three biological replicates. A dedicated pipeline for in silico circRNA detection (find_circ)10 was then applied to identify circRNAs and to evaluate their expression levels. Briefly, reads mapping to ribosomal and other abundant non-coding RNAs were discarded (see Methods section), as were reads mapping contiguously to the reference genome, and the remaining unmapped reads were used as input for circRNA identification (Fig. 1a). Since no reference transcriptome is used in the procedure, the back-splicing sites of the identified circRNAs do not necessarily coincide with annotated splice sites. The number of reads mapping on back-splicing junctions and on the corresponding linear-splicing junctions was computed. In total, 3,980 circRNAs were identified, each having at least two unique reads mapping on its back-splicing junction in at least one sample (Table 1).
This number is similar to that obtained from sequencing experiments previously performed on other neuronal samples7, confirming the high abundance of circRNAs in neuronal tissues, now also including in vitro mESC-derived MNs. We identified 3,894 circRNAs within the body of 2,097 known genes, many of which host more than one circRNA. As shown in Fig. 1b, the vast majority of these genes are protein-coding. Analysing the localization of circRNAs within the body of protein-coding transcripts (Supplementary Fig. 1d), we found that most of them are fully included in the coding region, while the proportion spanning the 5′ untranslated region is higher than expected (22%, P value for chi-squared test=1.15e−28) (Fig. 1c and Supplementary Fig. 1e).
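The detection threshold used in this section (at least two unique reads on the back-splicing junction in at least one sample) and the tally of circRNAs per host gene can be illustrated with a minimal Python sketch. This is not the find_circ pipeline itself; the input layout, the identifiers (`circ_A`, `GeneX`, and so on), the function names and the read counts are all hypothetical toy values chosen only to show the logic.

```python
from collections import Counter

# Hypothetical toy input (NOT data from the study):
# circRNA id -> (host gene or None if intergenic,
#                unique back-splicing junction reads per sample)
toy_counts = {
    "circ_A": ("GeneX", [0, 3, 1]),
    "circ_B": ("GeneX", [2, 2, 0]),
    "circ_C": ("GeneY", [1, 1, 1]),
    "circ_D": (None, [5, 4, 6]),
}


def passes_threshold(reads_per_sample, min_reads=2):
    """True if at least one sample has >= min_reads unique junction reads."""
    return any(n >= min_reads for n in reads_per_sample)


def filter_circrnas(counts, min_reads=2):
    """Keep only circRNAs supported in at least one sample."""
    return {
        circ_id: record
        for circ_id, record in counts.items()
        if passes_threshold(record[1], min_reads)
    }


def circrnas_per_gene(counts):
    """Count retained circRNAs per annotated host gene (intergenic ones skipped)."""
    return Counter(gene for gene, _ in counts.values() if gene is not None)


kept = filter_circrnas(toy_counts)
per_gene = circrnas_per_gene(kept)
multi_circ_genes = [gene for gene, n in per_gene.items() if n > 1]

print(sorted(kept))       # retained circRNA ids
print(dict(per_gene))     # circRNAs per host gene
print(multi_circ_genes)   # genes hosting more than one circRNA
```

In this toy example, `circ_C` is dropped because no single sample reaches two junction reads, even though its reads sum to three across samples; the threshold is applied per sample, as described in the text.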