28 May 2020

RNAcentral release 15

We are pleased to announce that a new RNAcentral release is now live. Version 15 features intermolecular interactions from the IntAct database, an updated sequence similarity search, as well as new data from ENA, Ensembl, RefSeq, PDB and 17 other databases. Read on or browse 16.1 million ncRNA sequences in RNAcentral.

Welcome to IntAct

RNAcentral now integrates the interaction data annotated by the IntAct curators, who manually associate RNAcentral sequences with their interaction partners. For example, NEAT1 lncRNA interacts with Paraspeckle protein 1.

RNAcentral pages link to the interaction participants and the IntAct website where you can find additional information, such as supporting publications and experimental details.

So far there are 1,152 annotated interactions for 303 RNAs, with the majority of data points coming from human and yeast (168 and 114 annotated RNAs, respectively). We would like to thank Simona Panni (University of Calabria) for creating many of these entries. As curators continue to annotate additional interactions in IntAct, the new data will automatically flow into RNAcentral.

RNAcentral sequence search now runs in Rfam

In release 14, the RNAcentral sequence search gained a new interface that allows filtering the results with keywords and supports the same facets as the text search. With the new search you can focus on the matches from your favourite organism or find sequences from a particular member database.

Since the introduction of the new interface, the number of searches has doubled and we have received a lot of positive feedback. For example, one user reached out to say:

“Nowhere else can you do the type of searches you allow - it’s fast and has a great interface!”

We thought that other RNAcentral member databases could also benefit from the same technology so we converted the sequence search into a web component that can be added to any website.

Now the RNAcentral sequence search has been integrated into Rfam. When a user enters a query sequence, it is not only annotated with Rfam families using Infernal but also searched against a comprehensive set of sequences from RNAcentral.

This change also enables Rfam users to find matches in non-coding RNA sequences that are not supported by Rfam, such as lncRNA, or where an Rfam family has not been created yet.

The embeddable search widget is available on GitHub. It can be integrated into any website with two lines of code, and it is also highly customisable. You can select a subset of RNAcentral sequences to be searched or tweak the widget appearance to match your website.

Let us know if you have any feedback about the new search or would like to build it into your site.

Batch queries

One of the most requested features for the sequence search was the ability to use more than one sequence as a query. Now you can upload a FASTA file with up to 50 sequences and search them all at once.
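If your FASTA file holds more than 50 sequences, one option is to split it into smaller files before uploading. The snippet below is a minimal sketch of that idea, not part of RNAcentral itself: the function names (read_fasta, batches, to_fasta) are hypothetical helpers written for this example, and the only detail taken from the release is the limit of 50 sequences per query.

```python
from typing import Iterator, List, Tuple

# Limit stated in this release announcement; everything else here is illustrative.
MAX_BATCH = 50

Record = Tuple[str, str]  # (header without ">", sequence)


def read_fasta(text: str) -> List[Record]:
    """Parse FASTA text into (header, sequence) pairs, joining wrapped lines."""
    records: List[Record] = []
    header = None
    chunks: List[str] = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            if header is not None:
                records.append((header, "".join(chunks)))
            header = line[1:]
            chunks = []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def batches(records: List[Record], size: int = MAX_BATCH) -> Iterator[List[Record]]:
    """Yield consecutive groups of at most `size` records."""
    for start in range(0, len(records), size):
        yield records[start:start + size]


def to_fasta(records: List[Record]) -> str:
    """Serialise records back to FASTA text, one sequence per line."""
    return "".join(f">{header}\n{seq}\n" for header, seq in records)
```

Each group produced by batches() can be written to its own file with to_fasta() and uploaded as a separate query.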

Data updates

The following databases have also been updated:

  • ENA (143)

  • Ensembl (100) and Ensembl Genomes (47)

  • FlyBase (FB2020_02)

  • HGNC

  • PDB

  • RefSeq

First remote release

Release 15 is the first of a series of data releases produced remotely while EMBL-EBI staff are working from home. We thank everyone for their patience and look forward to supporting global RNA research during this challenging time.

Get in touch

As always, all data are freely available on the RNAcentral website, via the API, and in the FTP archive. The next release is scheduled for August 2020. In the meantime, if you have any feedback, please get in touch by email, on Twitter, or by submitting an issue on GitHub. We look forward to hearing from you!