Seq crumbs

Remove contaminant adapter sequences from your reads prior to other NGS processing

See also: cutadapt, FastX_toolkit, PrinSeq, Trimmomatic

All Seq Crumbs try to share a consistent interface. By default, most Seq Crumbs read from standard input and write to standard output, so they can be easily combined using Unix pipes. Alternatively, several input sequence files can be provided as a list of arguments. Output can also be directed to a specific file with the -o (or --outfile) parameter.
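The pipe-based interface described above can also be driven from a script. The following is a minimal sketch, not part of seq_crumbs: `run_pipeline` is a hypothetical helper that connects any list of commands the way a shell pipe would, using only the Python standard library.

```python
import subprocess
import threading


def run_pipeline(commands, data):
    """Run commands as a pipe: each stdout feeds the next command's stdin.

    `commands` is a list of argument lists, `data` is the bytes sent to the
    first command. Returns the bytes written by the last command.
    """
    if not commands:
        raise ValueError("at least one command is required")

    procs = []
    upstream = subprocess.PIPE  # the first command reads from our own pipe
    for cmd in commands:
        proc = subprocess.Popen(cmd, stdin=upstream, stdout=subprocess.PIPE)
        if procs:
            # Close the parent's copy of the previous stdout so the previous
            # command notices if the next one exits early.
            procs[-1].stdout.close()
        procs.append(proc)
        upstream = proc.stdout

    def feed():
        # Write the input in a separate thread so that reading the final
        # output below cannot deadlock on full pipe buffers.
        try:
            procs[0].stdin.write(data)
            procs[0].stdin.close()
        except BrokenPipeError:
            pass  # the first command stopped reading before the end

    feeder = threading.Thread(target=feed)
    feeder.start()
    output = procs[-1].stdout.read()
    procs[-1].stdout.close()
    feeder.join()

    for proc, cmd in zip(procs, commands):
        if proc.wait() != 0:
            raise subprocess.CalledProcessError(proc.returncode, cmd)
    return output
```

With seq_crumbs installed, a call such as `run_pipeline([["seq_head"], ["count_seqs"]], fastq_bytes)` would be one way to chain two crumbs. That invocation is only an illustration: `fastq_bytes` is a placeholder, and the options each crumb requires are not covered here.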

seq_crumbs supports gzip, BGZF and bzip2 compressed files. Compressed input files are detected automatically, and the output can also be written compressed.
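Compressed input can be recognised by looking at the first bytes of a file. The sketch below shows one way to do this with the Python standard library. It is an illustration of the idea, not the code seq_crumbs uses, and the function names are hypothetical. BGZF files are made of gzip blocks, so they start with the gzip signature and can be read by a normal gzip reader.

```python
import bz2
import gzip

GZIP_MAGIC = b"\x1f\x8b"  # also the start of every BGZF file
BZIP2_MAGIC = b"BZh"


def detect_compression(header):
    """Return 'gzip', 'bzip2' or None from the first bytes of a file."""
    if header.startswith(GZIP_MAGIC):
        return "gzip"
    if header.startswith(BZIP2_MAGIC):
        return "bzip2"
    return None


def open_maybe_compressed(path):
    """Open a sequence file as text, decompressing it when needed."""
    with open(path, "rb") as raw:
        kind = detect_compression(raw.read(3))
    if kind == "gzip":
        return gzip.open(path, "rt")
    if kind == "bzip2":
        return bz2.open(path, "rt")
    return open(path, "rt")
```

Checking the content instead of the file extension means that a file named reads.fastq that is actually gzipped is still opened correctly.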

The sequence formats accepted by seq_crumbs are those supported by Biopython's SeqIO module. Only Sanger fastq, Illumina fastq and fasta are supported as output formats.
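The two fastq variants mentioned above differ in how quality scores are encoded as characters. The sketch below assumes the usual convention: Sanger fastq stores Phred scores with an ASCII offset of 33, and the older Illumina fastq (versions 1.3 to 1.7) uses an offset of 64. The keys follow the Biopython SeqIO format names ("fastq" and "fastq-illumina"); the function names are hypothetical.

```python
# ASCII offsets of the two fastq variants, keyed by Biopython format name.
PHRED_OFFSETS = {"fastq": 33, "fastq-illumina": 64}


def decode_qualities(quality_line, fmt="fastq"):
    """Turn a fastq quality string into a list of Phred scores."""
    offset = PHRED_OFFSETS[fmt]
    return [ord(char) - offset for char in quality_line]


def convert_quality_line(quality_line, in_fmt, out_fmt):
    """Re-encode a quality string from one fastq variant to the other."""
    shift = PHRED_OFFSETS[out_fmt] - PHRED_OFFSETS[in_fmt]
    return "".join(chr(ord(char) + shift) for char in quality_line)
```

This is why the fastq variant matters: the same quality string decodes to scores that differ by 31 depending on which variant is assumed.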

seq_crumbs can take advantage of multiprocessor computers by splitting the computational load into several processes.

The filtering seq crumbs can be made aware of paired reads and can filter both reads of pairs at once.
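Filtering pairs as a unit could look like the sketch below. It is not taken from seq_crumbs: `filter_pairs` and its `require_both` parameter are hypothetical, and whether a pair is dropped when only one of its reads fails is treated here as a choice left to the caller.

```python
def filter_pairs(pairs, keep_read, require_both=True):
    """Yield the read pairs that pass a per-read filter.

    `pairs` is an iterable of (read1, read2) tuples and `keep_read` is a
    function returning True for reads that pass. With require_both=True a
    pair is kept only if both reads pass; otherwise one passing read is
    enough. Either way the two reads are kept or dropped together.
    """
    for read1, read2 in pairs:
        passed = [keep_read(read1), keep_read(read2)]
        keep = all(passed) if require_both else any(passed)
        if keep:
            yield read1, read2
```

Keeping or dropping both reads together means that the output never contains orphaned reads, so the two ends stay in step for downstream tools.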

You can find more information about seq_crumbs on the seq_crumbs web site^[1].

Available Crumbs

sff_extract	Extracts reads from an SFF file used by 454 and Ion Torrent.
split_matepairs	Splits mate-pairs separated by an oligo sequence.
filter_by_quality	Filters sequences according to mean quality.
filter_by_length	Filters sequences according to maximum and minimum length thresholds.
filter_by_name	Filters sequences with a list of names given in a file.
filter_by_blast	Filters the sequences using BLAST.
filter_by_complexity	Filters sequences according to their complexity.
filter_by_bowtie2	Filters sequences using bowtie2.
trim_by_case	Trims sequences according to case.
trim_edges	Removes a fixed number of residues from sequence edges.
trim_quality	Removes, using a sliding window, regions of low quality in the edges.
trim_blast_short	Removes oligonucleotides by using the blast-short algorithm.
convert_format	Converts between the different supported sequence formats.
guess_seq_format	Guesses the format of a file, including Sanger and Illumina fastq formats.
cat_seqs	Concatenates one or several input sequence files, possibly in different formats, into one output.
seq_head	Outputs only the first sequences of the given input.
sample_seqs	Outputs a random sampling of the input sequences.
count_seqs	Counts the sequences in the input files.
change_case	Modifies the case of sequences. Case can be converted to lower or upper, or swapped.
pair_matcher	Filters out orphaned read pairs.
interleave_pairs	Interleaves two ordered paired read files.
deinterleave_pairs	Splits an ordered file of paired reads into two files, one for each end.
calculate_stats	Generates basic statistics for the given sequence files.
orientate_transcripts	Reverse complements transcripts according to polyA, ORF or BLAST hits.
fastqual_to_fastq	Converts fasta and qual files to a fastq format file.
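As an illustration of what interleave_pairs and deinterleave_pairs do, the sketch below shows the same idea on in-memory lists of reads. It is not the seq_crumbs implementation, and the function names are hypothetical.

```python
from itertools import zip_longest

_MISSING = object()  # marks the shorter input when the lengths differ


def interleave(reads1, reads2):
    """Yield reads alternately: first end, second end, first end, ..."""
    for read1, read2 in zip_longest(reads1, reads2, fillvalue=_MISSING):
        if read1 is _MISSING or read2 is _MISSING:
            raise ValueError("the two inputs have different numbers of reads")
        yield read1
        yield read2


def deinterleave(reads):
    """Split an interleaved sequence of reads into two lists, one per end."""
    reads = list(reads)
    if len(reads) % 2:
        raise ValueError("an interleaved input must hold an even number of reads")
    return reads[0::2], reads[1::2]
```

Both functions rely on the reads being ordered, so that the n-th read of one end is the mate of the n-th read of the other end.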

References:

[1] http://bioinf.comav.upv.es/seq_crumbs/
