SnpEff and SnpSift


Add annotations to VCF variant files and filter them

Similar to: Annovar, vcfCodingSnps




SnpEff and SnpSift [1] are tools that predict variant effects at the protein level and filter the resulting annotated (VCF-formatted) calls. The most recent version of the software can be freely downloaded from https://snpeff.blob.core.windows.net/versions/snpEff_latest_core.zip

SnpEff is a variant annotation and effect prediction tool. It annotates and predicts the effects of variants on genes (such as amino acid changes).

Typical usage

Input: The inputs are predicted variants (SNPs, insertions, deletions and MNPs). The input file is usually obtained as a result of a sequencing experiment, and it is usually in variant call format (VCF).

Output: SnpEff analyzes the input variants. It annotates the variants and calculates the effects they produce on known genes (e.g. amino acid changes).
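
As a rough illustration of this input/output flow, the sketch below writes a two-record VCF file and passes it to SnpEff. The jar location, the file names (example.vcf, example.eff.vcf), the 4 GB memory setting and the example records are assumptions made for this illustration; GRCh37.75 is one of the database names listed further down this page and is assumed to be installed already.

```shell
# Hypothetical paths and names; adjust to your installation.
SNPEFF_JAR="${SNPEFF_JAR:-snpEff.jar}"   # assumed location of the SnpEff jar
GENOME="GRCh37.75"                       # database name, assumed to be installed

# Write a tiny illustrative VCF (example records, not real variant calls).
printf '##fileformat=VCFv4.1\n' > example.vcf
printf '#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n' >> example.vcf
printf '1\t69511\t.\tA\tG\t50\tPASS\t.\n' >> example.vcf
printf '1\t865628\t.\tG\tA\t12\tPASS\t.\n' >> example.vcf

# Annotate only when the jar is actually present.
if [ -f "$SNPEFF_JAR" ]; then
    # SnpEff writes the annotated VCF to standard output.
    java -Xmx4g -jar "$SNPEFF_JAR" "$GENOME" example.vcf > example.eff.vcf
else
    echo "SnpEff jar not found at $SNPEFF_JAR; skipping annotation" >&2
fi
```

The annotated file keeps the original records and adds the predicted effects to the INFO column, so it can be passed directly to SnpSift.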

SnpSift is a toolbox that allows you to filter and manipulate annotated files.

Once your genomic variants have been annotated, you need to filter them in order to find the "interesting / relevant" variants. Given the size of the data files, this is not a trivial task (e.g. you cannot load all the variants into an XLS spreadsheet). SnpSift performs the VCF file manipulation and filtering required at this stage of a data processing pipeline.
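
A minimal filtering sketch, assuming SnpSift.jar is in the current directory and that annotated.vcf (a hypothetical file name) is an annotated VCF. The quality threshold of 30 is an arbitrary example value, and count_variants is a helper defined here for illustration, not part of SnpSift.

```shell
# Hypothetical paths and names; adjust to your installation.
SNPSIFT_JAR="${SNPSIFT_JAR:-SnpSift.jar}"   # assumed location of the SnpSift jar
INPUT_VCF="${INPUT_VCF:-annotated.vcf}"     # hypothetical annotated input file
FILTER_EXPR="( QUAL >= 30 )"                # keep records with QUAL of at least 30

# Count data records (lines not starting with '#') in a VCF file.
count_variants() {
    grep -vc '^#' "$1"
}

# Filter only when both the jar and the input file are present.
if [ -f "$SNPSIFT_JAR" ] && [ -f "$INPUT_VCF" ]; then
    java -jar "$SNPSIFT_JAR" filter "$FILTER_EXPR" "$INPUT_VCF" > filtered.vcf
    echo "before: $(count_variants "$INPUT_VCF")  after: $(count_variants filtered.vcf)"
else
    echo "SnpSift jar or input VCF not found; nothing filtered" >&2
fi
```

Comparing the record counts before and after is a quick sanity check that the expression is neither discarding everything nor keeping everything.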

Documentation

  • protocols: a page full of examples is also available [2]
  • SnpEff manual: [3]
  • SnpSift manual: [4]

Databases

SnpEff relies on annotation databases that are available (at http://sourceforge.net/projects/snpeff/files/databases/v3_5/) for many species and genome builds. For human and mouse, the list of available databases is obtained with the following SnpEff commands:

>$ java -jar snpEff.jar databases | grep Homo_
GRCh37.64    Homo_sapiens    http://downloads.sourceforge.net/project/snpeff/databases/v3_5/snpEff_v3_5_GRCh37.64.zip
GRCh37.65    Homo_sapiens    http://downloads.sourceforge.net/project/snpeff/databases/v3_5/snpEff_v3_5_GRCh37.65.zip
GRCh37.66    Homo_sapiens    http://downloads.sourceforge.net/project/snpeff/databases/v3_5/snpEff_v3_5_GRCh37.66.zip
GRCh37.68    Homo_sapiens    http://downloads.sourceforge.net/project/snpeff/databases/v3_5/snpEff_v3_5_GRCh37.68.zip
GRCh37.69    Homo_sapiens    http://downloads.sourceforge.net/project/snpeff/databases/v3_5/snpEff_v3_5_GRCh37.69.zip
GRCh37.70    Homo_sapiens    http://downloads.sourceforge.net/project/snpeff/databases/v3_5/snpEff_v3_5_GRCh37.70.zip
GRCh37.71    Homo_sapiens    http://downloads.sourceforge.net/project/snpeff/databases/v3_5/snpEff_v3_5_GRCh37.71.zip
GRCh37.72    Homo_sapiens    http://downloads.sourceforge.net/project/snpeff/databases/v3_5/snpEff_v3_5_GRCh37.72.zip
GRCh37.73    Homo_sapiens    http://downloads.sourceforge.net/project/snpeff/databases/v3_5/snpEff_v3_5_GRCh37.73.zip
GRCh37.74    Homo_sapiens    http://downloads.sourceforge.net/project/snpeff/databases/v3_5/snpEff_v3_5_GRCh37.74.zip
GRCh37.75    Homo_sapiens    http://downloads.sourceforge.net/project/snpeff/databases/v3_5/snpEff_v3_5_GRCh37.75.zip
GRCh37.GTEX    Homo_sapiens, Gencode 12, GTEX project    http://downloads.sourceforge.net/project/snpeff/databases/v3_5/snpEff_v3_5_GRCh37.GTEX.zip
hg19    Homo_sapiens (USCS)    http://downloads.sourceforge.net/project/snpeff/databases/v3_5/snpEff_v3_5_hg19.zip
hg19kg    Homo_sapiens (UCSC KnownGenes)    http://downloads.sourceforge.net/project/snpeff/databases/v3_5/snpEff_v3_5_hg19kg.zip

>$ java -jar snpEff.jar databases | grep Mus_
GRCm38.68    Mus_musculus    http://downloads.sourceforge.net/project/snpeff/databases/v3_5/snpEff_v3_5_GRCm38.68.zip
GRCm38.69    Mus_musculus    http://downloads.sourceforge.net/project/snpeff/databases/v3_5/snpEff_v3_5_GRCm38.69.zip
GRCm38.70    Mus_musculus    http://downloads.sourceforge.net/project/snpeff/databases/v3_5/snpEff_v3_5_GRCm38.70.zip
GRCm38.71    Mus_musculus    http://downloads.sourceforge.net/project/snpeff/databases/v3_5/snpEff_v3_5_GRCm38.71.zip
GRCm38.72    Mus_musculus    http://downloads.sourceforge.net/project/snpeff/databases/v3_5/snpEff_v3_5_GRCm38.72.zip
GRCm38.73    Mus_musculus    http://downloads.sourceforge.net/project/snpeff/databases/v3_5/snpEff_v3_5_GRCm38.73.zip
GRCm38.74    Mus_musculus    http://downloads.sourceforge.net/project/snpeff/databases/v3_5/snpEff_v3_5_GRCm38.74.zip
GRCm38.75    Mus_musculus    http://downloads.sourceforge.net/project/snpeff/databases/v3_5/snpEff_v3_5_GRCm38.75.zip
NCBIM37.64    Mus_musculus    http://downloads.sourceforge.net/project/snpeff/databases/v3_5/snpEff_v3_5_NCBIM37.64.zip
NCBIM37.65    Mus_musculus    http://downloads.sourceforge.net/project/snpeff/databases/v3_5/snpEff_v3_5_NCBIM37.65.zip
NCBIM37.66    Mus_musculus    http://downloads.sourceforge.net/project/snpeff/databases/v3_5/snpEff_v3_5_NCBIM37.66.zip
testMm37.61    Mus_musculus    http://downloads.sourceforge.net/project/snpeff/databases/v3_5/snpEff_v3_5_testMm37.61.zip
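
The listings above can also be reduced to the database identifiers alone, and a chosen database can then be installed with SnpEff's download command. The sketch below assumes snpEff.jar is in the current directory; db_names is a helper written for this illustration (not a SnpEff feature), and GRCm38.75 is simply one of the mouse builds shown above.

```shell
SNPEFF_JAR="${SNPEFF_JAR:-snpEff.jar}"   # assumed location of the SnpEff jar

# Read a 'databases' listing on standard input and print the identifier
# (first column) of every line that mentions the given species name.
db_names() {
    grep "$1" | awk '{print $1}'
}

if [ -f "$SNPEFF_JAR" ]; then
    # Show only the identifiers of the mouse databases.
    java -jar "$SNPEFF_JAR" databases | db_names Mus_musculus
    # Fetch and install one of them.
    java -jar "$SNPEFF_JAR" download GRCm38.75
else
    echo "SnpEff jar not found at $SNPEFF_JAR; nothing listed or downloaded" >&2
fi
```

The identifier printed in the first column is the name that SnpEff expects as its genome argument when annotating a VCF file.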

References:
  1. Pablo Cingolani, Adrian Platts, Le Lily Wang, Melissa Coon, Tung Nguyen, Luan Wang, Susan J Land, Xiangyi Lu, Douglas M Ruden
    A program for annotating and predicting the effects of single nucleotide polymorphisms, SnpEff: SNPs in the genome of Drosophila melanogaster strain w1118; iso-2; iso-3.
    Fly (Austin): 2012, 6(2);80-92
    [PubMed:22728672]

    Pablo Cingolani, Viral M Patel, Melissa Coon, Tung Nguyen, Susan J Land, Douglas M Ruden, Xiangyi Lu
    Using Drosophila melanogaster as a Model for Genotoxic Chemical Mutational Studies with a New Program, SnpSift.
    Front Genet: 2012, 3;35
    [PubMed:22435069]

  2. http://snpeff.sourceforge.net/protocol.html
  3. http://snpeff.sourceforge.net/SnpEff_manual.html
  4. http://snpeff.sourceforge.net/SnpSift.html


