
Releases: FelixKrueger/Bismark

v0.16.0

20 Apr 13:10

Bismark


  • File endings .fastq | .fq | .fastq.gz | .fq.gz are now removed from the output file (unless they were specified with --basename) in a bid to reduce the length of the already long file names.
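The ending removal described above can be sketched as follows. This is only an illustration of the rule stated in the note, not Bismark's own code; the helper name `strip_fastq_ending` and its `basename` parameter are hypothetical.

```python
# Endings named in the release note, longest first so that ".fastq.gz"
# is tried before ".fastq".
FASTQ_ENDINGS = (".fastq.gz", ".fq.gz", ".fastq", ".fq")


def strip_fastq_ending(filename, basename=None):
    """Return the stem used for output file names (hypothetical helper).

    If a basename was given explicitly it is returned untouched, mirroring
    the "unless they were specified with --basename" exception.
    """
    if basename is not None:
        return basename
    for ending in FASTQ_ENDINGS:
        if filename.endswith(ending):
            return filename[: -len(ending)]
    # Unknown ending: leave the name as it is.
    return filename


print(strip_fastq_ending("sample_R1.fastq.gz"))
print(strip_fastq_ending("sample_R1.fq", basename="my_run.fq"))
```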
  • Enabled the new option --dovetail (turned on by default for --pbat libraries), which allows dovetailing reads to be reported. For a more in-depth description see #14.
  • Changed the behaviour in corner cases where several non-directional alignments could exist for the very same position but on different strands, so that the best alignment now trumps the weaker one. As an example: if the alignment criteria were relaxed to allow ~60 mismatches for PE alignment, we found an alignment to the OT strand with a combined AS of -324, but there was also an alignment to the CTOB strand with an AS of 0 (perfect alignment). The CTOB alignment now trumps the OT alignment, and the methylation information is now reported for the bottom strand. Credits go to Sylvain Foret (ANU, Canberra) for bringing this to our attention!
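The "best alignment trumps the weaker one" rule can be illustrated with a minimal sketch. It assumes alignment scores behave as in the example above (0 is a perfect alignment, more negative is worse); the `Alignment` type and `best_alignment` function are hypothetical and not taken from the Bismark source. How ties are resolved is not stated in the note, so the sketch simply keeps the first candidate.

```python
from typing import NamedTuple


class Alignment(NamedTuple):
    strand: str    # e.g. "OT", "CTOT", "OB", "CTOB"
    position: int  # mapping position shared by the competing alignments
    score: int     # alignment score (AS); 0 is perfect, lower is worse


def best_alignment(candidates):
    """Pick the candidate with the highest alignment score.

    max() returns the first of several equally good candidates; the real
    tie-breaking behaviour is not described in the release note.
    """
    return max(candidates, key=lambda aln: aln.score)


candidates = [
    Alignment("OT", 1000, -324),
    Alignment("CTOB", 1000, 0),
]
winner = best_alignment(candidates)
print(winner.strand)
```

With the scores from the example, the CTOB alignment is selected, so the methylation call would be reported for the bottom strand.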

New module: bismark2summary


Bismark summary

New module: bam2nuc


  • The new Bismark module bam2nuc calculates the average mono- and di-nucleotide coverage of libraries and compares this to the genomic average composition. bam2nuc can be called straight from within Bismark (option --nucleotide_coverage) or run stand-alone. bam2nuc creates a ...nucleotide_stats.txt file that is also automatically detected by bismark2report and incorporated into the HTML report.
    (di-)nucleotide coverage
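To make the idea of mono- and di-nucleotide composition concrete, here is a minimal sketch that computes both for a single sequence and compares a sample against a reference. It is not bam2nuc's implementation: the function names are hypothetical, the input is a plain string rather than a BAM file, and skipping any k-mer containing a non-ACGT character is an assumption made for the example.

```python
from collections import Counter


def kmer_fractions(seq, k):
    """Fraction of each k-mer (k=1 mono-, k=2 di-nucleotide) in seq.

    k-mers containing anything other than A, C, G or T are skipped
    (an assumption of this sketch).
    """
    seq = seq.upper()
    counts = Counter(
        seq[i:i + k]
        for i in range(len(seq) - k + 1)
        if set(seq[i:i + k]) <= set("ACGT")
    )
    total = sum(counts.values())
    return {kmer: n / total for kmer, n in counts.items()} if total else {}


def compare_to_genome(sample, genome):
    """Ratio of sample fraction to genomic fraction for shared k-mers."""
    return {
        kmer: sample[kmer] / genome[kmer]
        for kmer in sample
        if genome.get(kmer)
    }


reads = kmer_fractions("ACGTACGT", 2)
genome = kmer_fractions("ACGTTTTTACGA", 2)
print(compare_to_genome(reads, genome))
```

A ratio above 1 means the k-mer is over-represented in the sample relative to the reference sequence, below 1 under-represented.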

bismark_sitrep.tpl


  • Removed an extra function call in bismark_sitrep.tpl so that the M-bias 2 plot is drawn once the M-bias 1 plot has finished drawing (with certain browsers and data, drawing both plots in parallel may have resulted in only a white placeholder being shown).

methylation extractor


  • Altering the file path handling of coverage2cytosine and bismark2bedGraph also required some changes in the methylation extractor.

bismark2bedGraph


  • Input file path handling has been completely reworked. The output file which can be specified as -o output.bedGraph now has to be a single file name and mustn't contain any path information. A particular output folder may be specified with -dir /any/path/.
  • Addressing the file path handling issue also fixed a similar issue with the option --remove_spaces when -o had been specified.
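The new rule for -o and -dir can be sketched as a small validation step. This is an illustration only, with a hypothetical function name; it uses POSIX path semantics and is not how bismark2bedGraph itself is written.

```python
import posixpath


def resolve_output(output_name, output_dir=""):
    """Combine a bare output file name with an optional output folder.

    Raises ValueError if output_name carries any path information,
    reflecting the rule that -o must be a single file name and the
    folder belongs in -dir.
    """
    if not output_name or posixpath.basename(output_name) != output_name:
        raise ValueError(
            f"output file name must not contain path information: {output_name!r}"
        )
    return posixpath.join(output_dir, output_name)


print(resolve_output("output.bedGraph", "/any/path/"))
```

Passing something like `/any/path/output.bedGraph` as the file name raises an error in this sketch, so the path has to be supplied separately.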

coverage2cytosine


  • Changed zcat for gunzip -c when reading a gzipped coverage file. This should avoid crashes on some Mac platforms, where zcat invariably requires the file name to end in .Z (which it doesn't...).
  • Changed the way in which the coverage input file is handed over from the methylation_extractor
    to coverage2cytosine (previously the path information might have been part of the file name, but
    it will now only be part of the -dir output_directory option).

v0.15.0

14 Jan 13:59

Bismark


  • Added option --se/--single_end <list>. This sets single-end mapping mode explicitly giving a
    list of file names as <list>. The filenames may be provided as a comma , or colon :-separated
    list.
  • Added option --genome_folder <path/to/genome> as alternative to supplying the genome as the
    first argument.
  • Added an option --rg_tag to print an @RG header line as well as to add an RG:Z: tag to each read.
    The ID and SAMPLE fields default to 'SAMPLE', but can be specified manually with --rg_id or
    --rg_sample.
  • Added new option --ambig_bam for Bowtie2-mode only, which writes out a single alignment for
    sequences with multiple alignments to a special file ending in .ambiguous.bam. The alignments
    are in Bowtie2 format and do not contain any Bismark specific entries such as the methylation
    call etc. These ambiguous BAM files are intended to be used as coverage estimators for variant
    callers. Works for single-end and paired-end alignments in single or multi-core mode.
  • Added the new options --cram and --cram_ref to Bismark for both paired- and single-end alignments
    in single or multi-core mode. This option requires Samtools version 1.2 or higher. A genome
    FastA reference may be supplied as a single file with the option --cram_ref; if this is not
    specified the file is derived from the reference FastA file(s) used for the Bismark run, and written
    to the file Bismark_genome_CRAM_reference.mfa into the output directory.
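As an illustration of the comma- or colon-separated file list accepted by --se/--single_end, the following sketch splits such a value into individual file names. The function name is hypothetical, and dropping empty entries (for example from a trailing separator) is an assumption of the sketch rather than documented behaviour.

```python
import re


def parse_file_list(value):
    """Split a comma- or colon-separated list of file names.

    Empty entries are dropped (an assumption of this sketch).
    """
    return [name for name in re.split(r"[,:]", value) if name]


print(parse_file_list("lane1.fq.gz,lane2.fq.gz"))
print(parse_file_list("lane1.fq.gz:lane2.fq.gz"))
```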

deduplicate_bismark


  • Added better handling of cases when the input file was empty (previously the script died during
    the percentage calculation instead of reporting N/A)
  • Added a note mentioning that Read1 and Read2 of paired-end files are expected to follow each
    other in two consecutive lines and possibly require name-sorting prior to deduplication. Also
    added a check that reads the first 100000 lines to see if the file appears to have been sorted
    (by position), and bails out if this is true.
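The empty-input case mentioned in the first item above comes down to guarding a division by zero. A minimal sketch, assuming a hypothetical helper and a two-decimal output format that is not taken from deduplicate_bismark itself:

```python
def percentage(part, total):
    """Format part/total as a percentage, or "N/A" when total is zero."""
    if total == 0:
        return "N/A"
    return f"{100 * part / total:.2f}"


# Empty input file: nothing was analysed, so no percentage can be computed.
print(percentage(0, 0))
# Regular case: 250 duplicates removed out of 1000 alignments.
print(percentage(250, 1000))
```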

methylation extractor


  • Added support for CRAM files (this option requires Samtools version 1.2 or higher)

bismark2bedGraph


  • Changed the way gzip compressed input files are handled when using the UNIX sort command (i.e. with
    --scaffolds/--gazillion or without --ample_memory)

coverage2cytosine


  • Added option --gzip to compress output files. This currently only works for the default CpG_report
    and CX_report output files (and thus not with the option --gc or --split_files). The option --gzip
    is now also passed on from the bismark_methylation_extractor.
  • Added a check to bail if no information was found in the coverage file, e.g. if a wrong file path for a .cov.gz file had been specified.

bismark_genome_preparation


  • Added process handling to the child processes.

Bismark v0.14.5

08 Nov 22:01

20-08-2015: 0.14.5 released - minor fix

  • deduplicate_bismark: Changed all literal calls of samtools to $samtools_path