21 Sept 2020

RNAcentral release 16

 

We are pleased to announce that a new RNAcentral release is now live. Version 16 features 3 new databases (CRW, snoRNA Database, and ZFIN), detailed RNA type classification using Sequence Ontology, as well as 14 million secondary structures (up from 8 million in release 15). Read on to learn more or browse RNAcentral.

Welcome to new databases

With the import of ZFIN, RNAcentral has now integrated all of the Model Organism Databases that form the Alliance of Genome Resources. ZFIN is a model organism database that hosts a wide array of expertly curated, organized and cross-referenced research data for zebrafish (Danio rerio). We also imported CRW, with 5S, SSU, and LSU rRNA sequences and links to the secondary structure diagrams, and the snoRNA Database, a curated collection of archaeal snoRNAs maintained by the Lowe Lab at UC Santa Cruz. These additions bring the total number of imported databases to 44.


You can browse CRW, ZFIN, and snoRNA Database data in RNAcentral now.

Read the R2DT preprint on bioRxiv

The RNAcentral secondary structure diagrams are generated in standard, reproducible, and recognisable orientations using R2DT. R2DT is now described in a new preprint, where you can find the details of the method and its comprehensive validation. In release 16, R2DT was used to generate more than 14 million secondary structure diagrams, creating the world’s largest set of RNA secondary structures.



You can browse the secondary structures, try the R2DT web server, or check out the R2DT source code on GitHub.

RNA types powered by Sequence Ontology

To enable more precise searches and annotations, all sequences in RNAcentral have been annotated with Sequence Ontology (SO) terms. The SO terms are more precise than the INSDC RNA classification that was used before. For example, with SO terms it is possible to distinguish rRNA subtypes, such as 16S, 18S, or 23S rRNA, while the old classification grouped all rRNAs into a single RNA type.
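To illustrate why a hierarchical vocabulary helps, here is a minimal Python sketch of filtering sequences by RNA type. The term names and the parent table are a toy example made up for illustration, not an excerpt of the Sequence Ontology, and the helper functions are hypothetical rather than part of RNAcentral.

```python
# Toy "is_a" hierarchy. The term names are illustrative assumptions,
# not real Sequence Ontology identifiers.
PARENT = {
    "rRNA": "ncRNA",
    "16S_rRNA": "rRNA",
    "18S_rRNA": "rRNA",
    "23S_rRNA": "rRNA",
    "tRNA": "ncRNA",
}


def ancestors(term):
    """Return the chain of parent terms from `term` up to the root."""
    chain = []
    while term in PARENT:
        term = PARENT[term]
        chain.append(term)
    return chain


def matches(term, query):
    """True if `term` is the queried type or one of its subtypes."""
    return term == query or query in ancestors(term)


def filter_by_type(records, query):
    """Keep the names of records whose RNA type falls under `query`."""
    return [name for name, term in records if matches(term, query)]


# Hypothetical records: (sequence name, assigned RNA type)
records = [("seq_a", "16S_rRNA"), ("seq_b", "23S_rRNA"), ("seq_c", "tRNA")]
print(filter_by_type(records, "rRNA"))      # broad query
print(filter_by_type(records, "16S_rRNA"))  # precise query
```

With a flat classification only the broad query would be possible. With a hierarchy, the same annotations answer both the broad and the precise question.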



The details of the SO classification method will be described in the 2021 NAR Database Issue paper (currently in review), but you can already explore RNAcentral with SO terms using the new RNA types facet in the text search. Try it now and click the Feedback button to let us know what you think.

RNAcentral sequence search runs in miRBase and snoDB

The RNAcentral sequence similarity search is constantly updated with new features. For example, in release 15 we introduced a batch search mode that lets you search for up to 50 sequences at once. In addition, the results can be downloaded in a number of formats, such as JSON or plain text.
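If you have more than 50 query sequences, one option is to split them into batches before submitting. The Python sketch below shows only that client-side step. The 50-sequence limit comes from the text above, while the function name and its parameters are hypothetical and not part of any RNAcentral tool.

```python
MAX_BATCH = 50  # batch search limit mentioned in this post


def split_into_batches(sequences, max_batch=MAX_BATCH):
    """Split a list of sequences into consecutive batches of at most `max_batch`."""
    if max_batch < 1:
        raise ValueError("max_batch must be at least 1")
    return [
        sequences[start:start + max_batch]
        for start in range(0, len(sequences), max_batch)
    ]


# Example: 120 placeholder sequences become three batches
queries = ["ACGU" * 10 for _ in range(120)]
batches = split_into_batches(queries)
print([len(batch) for batch in batches])
```

Each batch can then be submitted as a separate batch search.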


The sequence search is now integrated with R2DT, so every query returns a secondary structure diagram in addition to similar sequences from RNAcentral and an Rfam classification (see panel A). The RNAcentral sequence search can be integrated into any website, and it is now running in Rfam, snoDB, and miRBase (see panels B-D).


 


If you would like to add the search to your website, the code and the documentation are available on GitHub. The widget is easy to integrate into any website with just 2 lines of code. It can be customised to match the appearance of the host website and to search all or just a subset of RNAcentral sequences.

Get in touch

The data can be freely accessed on the RNAcentral website, via the API, and in the FTP archive. The next release is scheduled for December 2020. In the meantime, please get in touch by email, on Twitter, or by submitting an issue on GitHub. We look forward to hearing from you!