A detailed knowledge of the mapping between sequence and structure spaces in populations of RNA molecules is essential to better understand their present-day functional properties, to envisage a plausible early evolution of RNA in a prebiotic chemical environment, and to improve the design of in vitro evolution experiments, among other goals. Analyses of natural RNAs, as well as in vitro and computational studies, show that certain RNA structural motifs are much more abundant than others, pointing to a complex relation between sequence and structure. Within this framework, we have computationally investigated the structural properties of a large pool (10^8 molecules) of single-stranded, 35 nt-long, random RNA sequences. The secondary structures obtained are ranked and classified into structure families. The number of structures in the main families is analytically calculated and compared with the numerical results. This permits a quantification of the fraction of structure space covered by a large pool of sequences. We further show that the number of structural motifs and their frequencies are highly unbalanced with respect to the nucleotide composition: simple structures such as stem-loops and hairpins arise from sequences depleted in G, while more complex structures require an enrichment of G. In general, we observe a strong correlation between subfamilies (characterized by a fixed number of paired nucleotides) and nucleotide composition. Our results are compared to the structural repertoire obtained in a second pool where isolated base pairs are prohibited.
Bibliographical note
NOTICE: this is the author’s version of a work that was accepted for publication in Journal of Theoretical Biology. Changes resulting from the publishing process, such as peer review, editing, corrections, structural formatting, and other quality control mechanisms, may not be reflected in this document. Changes may have been made to this work since it was submitted for publication. A definitive version was subsequently published in Stich, M, Briones, C & Manrubia, SC, 'On the structural repertoire of pools of short, random RNA sequences', Journal of Theoretical Biology, vol. 252, no. 4 (2008), DOI http://dx.doi.org/10.1016/j.jtbi.2008.02.018
- RNA motif
- genotype–phenotype map
- RNA folding
- RNA world
- structural family
Stich, M., Briones, C., & Manrubia, S. C. (2008). On the structural repertoire of pools of short, random RNA sequences. Journal of Theoretical Biology, 252(4), 750-763. https://doi.org/10.1016/j.jtbi.2008.02.018