30 Jun 2021

RNAcentral Release 18

We are pleased to announce that release 18 of RNAcentral is available! This release features improvements to RNA types and secondary structures, as well as updates to our member databases.

Additionally, we are hiring for the Project Leader role. If you are interested, please see the job description here.

Improved RNA types

We worked with the Sequence Ontology (SO) team to improve the rRNA terms. The Sequence Ontology now reflects the diversity of rRNA sequences with specific subtypes for cytosolic, mitochondrial, and plastid rRNAs. A summary of the changes is shown below.

The new SO terms make it easier and quicker to find specific subtypes of rRNAs. Thanks to everyone who made this possible, including the SO team as well as Anton S. Petrov (Georgia Tech) and Steven Marygold (FlyBase)!

We include these new terms, along with some other minor improvements, in our RNA type facet. Below is a comparison of the old facet (left) and the new facet (right).

You can browse the new RNA terms here and explore some of the new annotations, such as mitochondrial LSU, here. We plan to continue improving the precision and extent of our annotations in future releases.

Improvements to secondary structures

We recently published our method for drawing RNA secondary structures, R2DT. This approach is based on matching sequences to templates and then folding the sequences into a secondary structure that matches the template. For more details, you can read our paper here.

In this release, we have added a quality assurance step to R2DT. As a result, we now show fewer low-quality diagrams of rRNA sequences. With these changes and updates, we now have 25.9 million sequences with secondary structure diagrams. We plan to continue improving R2DT. If you have any feedback on our diagrams, please get in touch! You can browse the diagrams here.

Database updates

We have updated 17 databases, bringing the total number of sequences to 30.7 million. The updated databases are listed below.

  • ENA (snapshot as of 7 Jun 2021)

  • Ensembl (104)

  • Ensembl/GENCODE (human 38/mouse 27)

  • Ensembl Genomes (51)

  • FlyBase (fb_2021_03)

  • GeneCards (5.2)

  • HGNC (2021-05-17)

  • IntAct (2021-05-17)

  • Malacards (5.2)

  • PDB (2021-05-17)

  • PomBase (2021-05-17)

  • QuickGO (2021-05-12)

  • RefSeq (205)

  • SGD (2021-04-27)

  • SILVA (138.1)

  • ZFIN (2020-06-22)

  • ZWD (1.1)

Get in touch

The data can be freely accessed on the RNAcentral website, via the API, and in the FTP archive. The next release is scheduled for September 2021. In the meantime, please get in touch by email, on Twitter, or by submitting an issue on GitHub. We look forward to hearing from you!