Exercises on ENA


The ENA database

The European Bioinformatics Institute (EBI) hosts the European Nucleotide Archive (ENA). One part of ENA, called EMBL-bank, contains annotated primary sequence data. The other two parts, the Trace Archive and the Sequence Read Archive (SRA, formerly the Short Read Archive), contain batch-submitted primary sequence data.

EBI has multiple search portals:

  • ENA Browser to search in ENA
  • The fast search on the EBI home page and EBI Search, which perform a meta-search across all EBI databases (similar to Entrez at NCBI)
  • SRS to perform searches on selected databases
     Note: database content is fluid. Records change all the time as information is added and removed, so the screenshots may not be up to date.

    The ENA Browser

    Go to the ENA Browser. You see two text fields: the upper one for "Text search", the lower one for "Sequence search". We will concentrate on the text search; sequence searches are covered in Module 2. You can search using free text (e.g. species names, disease names, feature names) or using an accession number.
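
To make the difference between the two kinds of text query concrete, here is a minimal Python sketch that guesses whether a query string looks like an accession number or like free text. The function name `classify_query` and the pattern are assumptions made for illustration: the pattern only covers the classic accession layouts (one letter plus five digits, or two letters plus six digits, with an optional version suffix) and is not how the ENA Browser itself interprets queries.

```python
import re

# Assumed pattern: classic nucleotide accessions are 1 letter + 5 digits
# or 2 letters + 6 digits, optionally followed by a ".version" suffix.
# Other accession layouts exist and are deliberately not covered here.
ACCESSION_RE = re.compile(r"^(?:[A-Z]\d{5}|[A-Z]{2}\d{6})(?:\.\d+)?$")


def classify_query(query: str) -> str:
    """Return 'accession' if the query looks like an accession number, else 'free text'."""
    candidate = query.strip().upper()
    return "accession" if ACCESSION_RE.match(candidate) else "free text"


if __name__ == "__main__":
    for q in ["JX912275", "caspase complete cds", "kinase"]:
        print(f"{q!r}: {classify_query(q)}")
```

A query such as "JX912275" matches the two-letter, six-digit layout, whereas "caspase complete cds" does not match and is treated as free text.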

    Exercise 1: caspase

    Perform a search for 'caspase complete cds'.

    ENA.png

    The ENA search returns records from the EMBL-bank part of ENA, divided into "Update" and "Release". "Update" contains records that were recently updated. Clicking the "+" sign expands the corresponding section, revealing the individual search results.

    ENA2.png

    Each record can be further expanded by clicking on the "+" sign to see more details of the record. Do this for the first record of the "Update" list (JX912275 : Spodoptera frugiperda initiator caspase mRNA, complete cds).

    The most useful entries with the most relevant annotations are from the 'STD' (standard) data class. See more info on ENA database structure.

    Exercise 2: kinase

    The main advantage of the ENA Browser search is that the results are categorized by the part of ENA from which they originate. This becomes clear when you do a text search for "kinase".

    ENA5.png

    The results page groups the entries according to type of sequence.
    The text searches that you can perform using the ENA Browser are very 'crude'. For example, when you search for "kinase", every record that contains the word "kinase" anywhere is returned, including sequences that are not kinases, just as in GenBank. Be aware of this, because it is often not what you want!
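
The effect of such a crude match can be imitated in a few lines of Python. This is only a sketch of plain substring matching, not the ENA Browser's actual search logic. Apart from the caspase entry taken from Exercise 1, the record descriptions are invented for illustration, and the helper names `crude_search` and `exclude_phrases` are hypothetical.

```python
# Record descriptions: the first three are hypothetical examples,
# the last one is the caspase record from Exercise 1.
RECORDS = [
    "Homo sapiens protein kinase C alpha mRNA, complete cds",
    "Mus musculus kinase inhibitor protein mRNA, complete cds",
    "Oryza sativa kinase-associated protein phosphatase mRNA, complete cds",
    "Spodoptera frugiperda initiator caspase mRNA, complete cds",
]


def crude_search(term, records):
    """Return every record that contains the term anywhere, ignoring case."""
    needle = term.lower()
    return [r for r in records if needle in r.lower()]


def exclude_phrases(hits, phrases):
    """Drop hits that mention any of the given phrases, ignoring case."""
    lowered = [p.lower() for p in phrases]
    return [h for h in hits if not any(p in h.lower() for p in lowered)]


if __name__ == "__main__":
    hits = crude_search("kinase", RECORDS)
    print(len(hits), "crude hits")
    refined = exclude_phrases(hits, ["inhibitor", "phosphatase"])
    print(len(refined), "hits after excluding obvious non-kinases")
```

The crude search also returns the inhibitor and the phosphatase because their descriptions happen to contain the word "kinase". Excluding phrases by hand helps here, but it is itself a blunt tool, so search results always need to be checked.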

    EBI Search, cross-database search at EBI

    EB-eye is a cross-database search tool for EBI databases, similar to Entrez for NCBI databases. You can access it through EBI Search.

    Exercise 1: AF242735

    Search for "AF242735".

    ENA6.png

  • Click "Summary information is available for this gene"
  • Click "Dream"
    This redirects you to the EBI summary record for this gene.

    ENA7.png

    EBI provides very nice overview pages with links to many other databases, which makes them a good place to start.
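
A cross-database query like the one in this exercise can in principle also be written as a URL instead of being typed into the web form. The Python sketch below only builds such a URL and does not contact any server. The base address, the parameter names (`query`, `format`, `size`) and the layout of the path are assumptions about the EBI Search REST service that should be checked against its current documentation, and `example_domain` is a placeholder, not a real database name.

```python
from urllib.parse import quote, urlencode

# Assumption: base address of the EBI Search REST service; verify before use.
EBI_SEARCH_BASE = "https://www.ebi.ac.uk/ebisearch/ws/rest"


def build_search_url(domain: str, query: str, size: int = 10) -> str:
    """Build a search URL for one domain (database) without sending a request."""
    # urlencode escapes the query text, e.g. spaces become '+'.
    params = urlencode({"query": query, "format": "json", "size": size})
    return f"{EBI_SEARCH_BASE}/{quote(domain, safe='')}?{params}"


if __name__ == "__main__":
    # "example_domain" is a placeholder; replace it with a real domain name.
    print(build_search_url("example_domain", "AF242735"))
```

Keeping the URL construction separate from any network call makes it easy to inspect the query before running it.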