9 Mar 2021

RNAcentral release 17

We are pleased to announce that a new RNAcentral release is now live. Version 17 features new RNase P secondary structure diagrams, a new API for running sequence similarity searches programmatically, and links to piRBase, a database of piRNAs. Read on to learn more or browse >30 million non-coding RNAs in RNAcentral.

You can also help improve RNAcentral by filling out a short user survey. Please let us know how to make RNAcentral more useful for your needs.

Welcome to new databases

piRBase is a database of piRNA sequences that covers 21 organisms and contains over 170 million sequences. We now provide cross-links from RNAcentral to piRBase. Here is an example sequence report showing links to piRBase annotations for a mouse piRNA:

Due to the large size of piRBase, we have limited the import to sequences that were already available in RNAcentral, which resulted in 219,000 annotated sequences. You can browse piRBase data in RNAcentral here.

Changes to ENA import

We have reduced the number of metagenomic rRNA fragments coming from ENA. In the last release, we found that these fragments were making up a growing fraction of RNAcentral data. To provide our users with high-quality datasets, we began analyzing sequences with ribovore and excluding partial metagenomic rRNA sequences (those matching less than 90% of the Rfam rRNA models). This excludes about 7 million sequences from RNAcentral. In the future, we will work on methods to ensure that RNAcentral remains a comprehensive but high-quality database. If you have any questions or comments about this approach, please reach out to us by email or on GitHub.
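The 90% threshold described above can be illustrated with a small coverage filter. This is only a sketch, not the RNAcentral import pipeline: the RrnaHit record, its field names, and the way coverage is computed (model positions spanned divided by model length) are assumptions made for illustration, and ribovore's real output format and pass/fail criteria are not reproduced here.

```python
from dataclasses import dataclass

# Threshold from the text: sequences matching less than 90% of the model are excluded.
MIN_MODEL_COVERAGE = 0.90


@dataclass
class RrnaHit:
    """Hypothetical record for one sequence aligned to an rRNA model."""
    sequence_id: str
    model_start: int   # first model position covered (1-based, inclusive)
    model_end: int     # last model position covered (1-based, inclusive)
    model_length: int  # total number of positions in the model


def model_coverage(hit: RrnaHit) -> float:
    """Return the fraction of the model spanned by the hit."""
    if hit.model_length <= 0:
        raise ValueError("model_length must be positive")
    covered = hit.model_end - hit.model_start + 1
    return max(covered, 0) / hit.model_length


def keep_sequence(hit: RrnaHit, is_metagenomic: bool) -> bool:
    """Keep every non-metagenomic hit; keep metagenomic hits only if near full length."""
    if not is_metagenomic:
        return True
    return model_coverage(hit) >= MIN_MODEL_COVERAGE
```

Under these assumptions, a metagenomic fragment spanning positions 100 to 900 of a 1,533-position model covers about 52% of the model and would be excluded, while the same fragment from a non-metagenomic source would be kept.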

RNAcentral sequence search now runs on GtRNAdb

GtRNAdb now joins Rfam, miRBase, and snoDB in using the embeddable RNAcentral sequence search widget. This widget provides nhmmer sequence searches, Rfam classification, and secondary structure prediction in a simple, easy-to-use interface.

If you would like to add the search to your website, the code and the documentation are available on GitHub. The widget is easy to integrate into any website with just 2 lines of code. It can be customised to match the appearance of the host website and to search all or just a subset of RNAcentral sequences.

Run sequence similarity searches programmatically 

The RNAcentral sequence similarity search can now be run programmatically. An API is now available, with Swagger documentation and example code at https://rnacentral.org/sequence-search/api. If you have any questions about the API, please get in touch.
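Programmatic searches of this kind are commonly run as a submit-then-poll loop, which might be sketched as follows. Everything here is hypothetical: run_search, the submit and get_status callables, and the status strings are placeholders rather than the RNAcentral API, so please take the real endpoints, payloads, and status values from the Swagger documentation. The network calls are passed in as functions so that the sketch makes no claims about URLs.

```python
import time
from typing import Callable


def run_search(
    sequence: str,
    submit: Callable[[str], str],
    get_status: Callable[[str], str],
    poll_interval: float = 2.0,
    max_polls: int = 150,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Submit a sequence, wait for the job to finish, and return its job id.

    `submit` and `get_status` are hypothetical wrappers around the HTTP calls
    described in the API documentation; the status strings are assumptions.
    """
    # Remove whitespace and line breaks so multi-line input becomes one sequence.
    cleaned = "".join(sequence.split()).upper()
    if not cleaned:
        raise ValueError("sequence is empty")

    job_id = submit(cleaned)
    for _ in range(max_polls):
        status = get_status(job_id)
        if status == "success":            # assumed "finished" status
            return job_id
        if status in {"error", "timeout"}:  # assumed failure statuses
            raise RuntimeError(f"job {job_id} ended with status {status!r}")
        sleep(poll_interval)               # wait before asking again
    raise TimeoutError(f"job {job_id} did not finish after {max_polls} polls")
```

Capping the number of polls keeps a stalled job from blocking a script indefinitely, and the injected sleep function makes the loop easy to test without waiting.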

New RNase P secondary structures

We have improved the R2DT software by adding 19 new RNase P templates that represent a wide range of organisms. These new templates provide a clearer and more consistent display of RNase P secondary structures. As an example of the improvement, here are the before (left) and after (right) diagrams for human ribonuclease P RNA component H1 (URS000013F331_9606):

You can browse the new structures here.

Get in touch

The data can be freely accessed on the RNAcentral website, via the API, and in the FTP archive. The next release is scheduled for May 2021. In the meantime, please get in touch by email, on Twitter, or by submitting an issue on GitHub. We look forward to hearing from you!