We are happy to announce that the seventh release of RNAcentral is now available. The latest release includes FlyBase, Ensembl, and GENCODE as new Expert Databases as well as updates from ENA, RefSeq, snoPY, HGNC, and PDB. The data are available on the RNAcentral website, via the API, and in the FTP archive.
Model Organism Databases (MODs) perform an invaluable service for the community by annotating the genomes of key species, such as worm, yeast, and fly, with functional information. RNAcentral already links to four MODs (dictyBase, PomBase, SGD, and WormBase). Starting with this release we also integrate ncRNAs from FlyBase, the database for Drosophila genes and genomes. FlyBase contributed over 13 thousand ncRNA sequences from 12 Drosophila species, with the majority coming from D. melanogaster. You can browse FlyBase sequences or view the FlyBase summary page in RNAcentral.
Goodbye Vega, hello Ensembl
Since the first RNAcentral release, the Vega database has provided RNAcentral with high-quality annotations of the human and mouse genomes. Recently, the Vega website was archived, but the HAVANA team continues to produce annotations and makes them available in Ensembl and GENCODE. In this release we retire Vega and begin importing non-coding RNAs from Ensembl, including GENCODE.
Ensembl provides manually curated, experimentally verified gene annotations for human and mouse genomes, as well as comprehensive gene annotations for over 60 other vertebrate genomes. Ensembl releases are built off GENCODE annotations for human and mouse where possible. We have imported release 87 of Ensembl, which contains 346,509 ncRNA sequences from 66 organisms.
- Browse ncRNAs from Ensembl or view the Ensembl summary page in RNAcentral
- Browse ncRNAs from GENCODE or view the GENCODE summary page in RNAcentral
Better sequence descriptions
We made improvements to the descriptions that RNAcentral displays for each sequence. We try to select the most informative name from the available descriptions submitted by different sources. Here is an example description, where the new name is much more specific than the old one:
The selected names come from high-quality data sources such as GENCODE, HGNC, and miRBase. This work is ongoing, and we are always happy to get feedback on descriptions that could be improved.
Improved genome browser
RNAcentral features a genome browser that shows RNAcentral sequences alongside genes and transcripts from Ensembl or Ensembl Genomes. The browser now supports deep linking: the URL is updated continuously as you scroll around or switch between species, so you can bookmark your favorite view to come back to later or share the URL with anyone. For example, here is a link to the mouse Xist gene. The genome browser also received a fresh coat of paint and was updated to the latest version of Genoverse.
- Latest data from RefSeq, PDB, HGNC, and ENA
- Rfam families from release 12.2 and select cis-regulatory families (for example, here are SAM riboswitch sequences)
- Mouse sequences from snoPY
- Chicken, chimp, rat and cow sequences from NONCODE
- The GPI file now contains ncRNA types
- New FTP section for database identifiers. We have added database-specific mappings from RNAcentral URS identifiers to database ids in ftp://ftp.ebi.ac.uk/pub/databases/RNAcentral/releases/7.0/id_mapping/database_mapping. This is a preliminary release of the data and will be refined in the future. The format and contents may change. We are looking for feedback on the current files.
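
As a rough illustration of how the new identifier mappings might be consumed, here is a minimal Python sketch. It assumes a tab-separated layout in which the first three columns are the RNAcentral URS identifier, the database name, and the external id. That layout is an assumption made for illustration only; the files are preliminary and their format may change, so check the actual files before relying on it. The function name `parse_mapping` and the sample rows are hypothetical.

```python
import csv
import io
from collections import defaultdict


def parse_mapping(handle):
    """Group external database ids by RNAcentral URS identifier.

    Assumes tab-separated rows whose first three columns are:
    URS identifier, database name, external id (an assumed layout).
    """
    mapping = defaultdict(list)
    for row in csv.reader(handle, delimiter="\t"):
        if len(row) < 3:
            # Skip blank or incomplete lines rather than failing.
            continue
        urs, database, external_id = row[0], row[1], row[2]
        mapping[urs].append((database, external_id))
    return dict(mapping)


# Made-up sample rows in the assumed layout (not real RNAcentral data).
SAMPLE = (
    "URS0000000001\tEXAMPLE_DB\tEX0001\n"
    "URS0000000001\tOTHER_DB\tOT0042\n"
    "URS0000000002\tEXAMPLE_DB\tEX0002\n"
)

mapping = parse_mapping(io.StringIO(SAMPLE))
for urs, xrefs in mapping.items():
    print(urs, xrefs)
```

To use this with a downloaded file, pass an open file handle instead of the in-memory sample, for example `parse_mapping(open(path, newline=""))`, where `path` points to a local copy of one of the mapping files.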