ska annotate

SKA annotate

The annotate subcommand locates split kmers in a reference genome sequence and writes them, with annotation, to a VCF (v4.3) output file.

If the reference is supplied as a GFF file, split kmers matching CDS, tRNA or rRNA features are annotated with the following information where available. All of it is written to the INFO field of the VCF.

Feature ID
Feature type (CDS, tRNA or rRNA)
Strand
Position of base in feature

For CDS features, the following are also included where available:

Locus tag
Systematic ID
Gene name
Position of amino acid in feature
Position of base in codon
Reference amino acid
Alternate amino acids (comma separated list matching the alt bases in the 5th column of the vcf file)
Product (only output when the -p flag is used)
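
Because the annotations are stored in the INFO column, downstream scripts need to split that column apart. The following is a minimal sketch of how that might be done, assuming only the standard VCF layout (tab-separated columns, INFO in the 8th column, entries separated by semicolons, `.` for an empty field). The function names `parse_info` and `read_annotations` are hypothetical, and the sketch makes no assumption about the key names SKA actually writes.

```python
def parse_info(info_field):
    """Split a VCF INFO string into a dict; entries without '=' are flags set to True."""
    info = {}
    if info_field in ("", "."):
        # '.' is the VCF convention for a missing INFO field
        return info
    for entry in info_field.split(";"):
        if not entry:
            continue
        key, sep, value = entry.partition("=")
        info[key] = value if sep else True
    return info


def read_annotations(lines):
    """Yield (chrom, pos, ref, alts, info) for each data line of a VCF."""
    for line in lines:
        # Skip meta-information (##...), the header (#CHROM...) and blank lines
        if line.startswith("#") or not line.strip():
            continue
        cols = line.rstrip("\n").split("\t")
        chrom, pos, ref, alt, info = cols[0], int(cols[1]), cols[3], cols[4], cols[7]
        # ALT (5th column) is a comma-separated list of alternate bases
        yield chrom, pos, ref, alt.split(","), parse_info(info)
```

To use it, pass an open file handle, for example `for record in read_annotations(open("annotation.vcf")): ...`, where the file name is a placeholder. Values that are themselves comma-separated lists, such as the alternate amino acids, are left as strings so that they can be split and matched against the alt bases as needed.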

Usage

ska annotate [options] <kmer files>

Options:
-h		Print this help.
-f <file>	File of split kmer file names. These will be added to or 
		used as an alternative input to the list provided on the 
		command line.
-i		Include kmers in repetitive reference regions.
-o <file>	Prefix for output files. [Default = found]
-p		Include product in output.
-r <file>	Reference fasta/gff file name. [Required]
-v		Only output variant sites.
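
When ska annotate is run from a pipeline, it can help to assemble the command line programmatically. The sketch below builds an argument list using only the options documented above. The function name `build_annotate_command` and all file names are hypothetical placeholders, and the code assumes a `ska` executable is available on the PATH when the command is eventually run.

```python
import shlex


def build_annotate_command(reference, kmer_files=(), list_file=None, prefix=None,
                           variants_only=False, include_product=False,
                           include_repeats=False):
    """Return an argument list for 'ska annotate' built from the documented options."""
    if not kmer_files and list_file is None:
        raise ValueError("provide split kmer files, a file of file names (-f), or both")
    cmd = ["ska", "annotate", "-r", str(reference)]  # -r is required
    if list_file is not None:
        cmd += ["-f", str(list_file)]
    if include_repeats:
        cmd.append("-i")
    if prefix is not None:
        cmd += ["-o", str(prefix)]
    if include_product:
        cmd.append("-p")
    if variants_only:
        cmd.append("-v")
    # Split kmer files go last, after all options
    cmd += [str(f) for f in kmer_files]
    return cmd


if __name__ == "__main__":
    # Placeholder file names, for illustration only
    cmd = build_annotate_command("reference.gff", ["sample1.skf", "sample2.skf"],
                                 prefix="annotated", variants_only=True,
                                 include_product=True)
    print(shlex.join(cmd))
```

The returned list can be passed to `subprocess.run(cmd, check=True)`. Building a list rather than a single string avoids shell quoting problems with file names that contain spaces.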

Citation

SKA is currently only available as a preprint, so for now, if you use it, please cite: Harris SR. 2018. SKA: Split Kmer Analysis Toolkit for Bacterial Genomic Epidemiology. bioRxiv 453142 doi: https://doi.org/10.1101/453142
