strelka-2.9.0

@ctsa ctsa released this 08 Feb 20:26
· 60 commits to master since this release

Summary

This is a major update from v2.8.4. The most important change in this release is indirect: haplotype modeling and realignment have been improved such that, given the strelka2 germline VCF output of a trio at typical cWGS depth, false-positive de novo variant calls have been roughly cut in half. This is due to fixes for realignment artifacts that were too rare to noticeably impact germline call quality, but frequent relative to de novo variant rates. These changes should also accelerate the future transition to haplotype modeling for somatic variants. Many additional improvements to stability, error diagnostics, ease of use and accuracy are also included in this release, as enumerated below.

Changelog

Added

  • Add strand bias feature to germline indel scoring model (STREL-676)
    • Improves filtration for a small number of false positive indels in typical WGS analysis.
  • Add haplotyping constraints to the read alignment (STREL-743)
    • Phasing information from haplotyping is used to constrain combinations of variants within read alignments
    • Removes a rare artifact which could trigger false de novo calls from multi-sample germline variant output. Baseline false positive SNVs and indels are reduced to approximately half of the previous count.
  • Add new filter to make multi-sample germline variant output easier to interpret (STREL-819)
    • Locus filter 'NoPassedVariantGTs' added when no sample has a passing variant genotype.
    • This allows passing variants to be easily extracted with the FILTER field, without querying FORMAT/GT and FORMAT/FT.
  • Add new filter to prevent interference between forced indels and other indels (STREL-607)
    • Locus filter 'NotGenotyped' added to ForcedGT indels if they can possibly interfere with indels discovered by Strelka.
    • All complex alleles are also not genotyped and appear in the VCF output with this NotGenotyped filter.
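
As a rough illustration of the FILTER-based extraction described for STREL-819, the sketch below keeps only records whose FILTER column is PASS, without inspecting FORMAT/GT or FORMAT/FT. The function name passing_records and the example records are hypothetical and are not taken from Strelka output. It assumes an uncompressed, tab-delimited VCF.

```python
def passing_records(vcf_lines):
    """Yield VCF data lines whose FILTER column (7th field) is PASS."""
    for line in vcf_lines:
        if line.startswith("#"):
            continue  # skip meta-information and column header lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) > 6 and fields[6] == "PASS":
            yield line.rstrip("\n")


# Hypothetical two-sample records, for illustration only.
example = [
    "##fileformat=VCFv4.1",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tS1\tS2",
    "chr1\t100\t.\tA\tG\t50\tPASS\t.\tGT:FT\t0/1:PASS\t0/0:PASS",
    "chr1\t200\t.\tC\tT\t10\tNoPassedVariantGTs\t.\tGT:FT\t0/1:LowGQX\t0/0:PASS",
]

kept = list(passing_records(example))
print(len(kept))
```

In this example only the record at position 100 is kept. The second record carries the NoPassedVariantGTs locus filter because no sample has a passing variant genotype.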

Changed

  • Change default maximum indel size from 50 to 49 (STREL-811)
    • This change is made as part of an effort to better align manta with GIAB SV size range conventions, such that strelka and manta together provide complete, non-overlapping coverage over the full indel spectrum using default settings.
  • Remove preliminary step which counts the 'mappable' (non-N) size of the genome (STREL-772)
    • This step had a legacy use in identifying noisy alignments, and is now replaced with a simplified scheme.
  • Lower default local task memory requirement from 2 to 1.5 GB (STREL-802)
    • This enables all cores on an AWS c4.8xlarge with default configuration. Use the --callMemMb option to override for unusual cases.
  • Update LowDepth filter for somatic calls to include cases where the normal sample depth is below 2 (STREL-745)
  • Update htslib to incorporate CRAM file query fix (STREL-839/MANTA-1336)
    • This is expected to resolve possible issues with error parameter estimation from alignments in CRAM format.
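
The arithmetic behind the STREL-802 memory change can be sketched as follows. This assumes a c4.8xlarge provides 36 vCPUs and 60 GB of memory, and that the number of concurrent local tasks is limited by whichever of cores or memory runs out first. The helper max_parallel_tasks is hypothetical and is not part of Strelka; the actual scheduler logic may differ.

```python
def max_parallel_tasks(total_mem_gb, cores, mem_per_task_gb):
    """Return how many tasks fit, limited by core count and by memory."""
    by_memory = int(total_mem_gb // mem_per_task_gb)
    return min(cores, by_memory)


# Assumed instance resources for an AWS c4.8xlarge.
C4_8XLARGE_CORES = 36
C4_8XLARGE_MEM_GB = 60

old_limit = max_parallel_tasks(C4_8XLARGE_MEM_GB, C4_8XLARGE_CORES, 2.0)
new_limit = max_parallel_tasks(C4_8XLARGE_MEM_GB, C4_8XLARGE_CORES, 1.5)
print(old_limit, new_limit)  # 30 36
```

Under these assumptions, a 2 GB per-task requirement leaves 6 of the 36 cores idle, while 1.5 GB allows all of them to be used.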

Fixed

  • Fix empirical variant scoring (EVS) of complex somatic indels (STREL-774)
    • Previously the EVS model was overly pessimistic against complex somatic indels. This is now fixed by changing how EVS input features are computed for complex indels.
  • Fix default sample name used in the VCF output for germline analysis (STREL-737)
    • The default is used when the sample name cannot be parsed from the BAM header. It is now fixed to insert SAMPLE1, SAMPLE2, etc., as documented.
  • Fix rare instance where strand bias (FORMAT/SB) is 'inf' (STREL-741)
  • Provide clear error message when attempting to configure/run with python3 (STREL-762)
  • Fix python configure scripts to make maximum reported indel size configurable (STREL-763)
    • This can be done by configuring the maxIndelSize value inside the .ini file.
  • Fix realignment slow-down issue that occurs when reads overlap too many candidate SNVs (STREL-805)
  • Stop automatically clearing python environment variables (STREL-810)
    • This should allow python from certain module systems to be used, but may (rarely) cause instability due to conflicting content in a user's PYTHONPATH.
  • Standardize germline FORMAT/GQ VCF tag to Integer type (STREL-812)
  • Fix issue where the low depth filter was not applied to continuous variant frequency (e.g. mitochondrial) calls (STREL-803)
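
For the maxIndelSize setting mentioned under STREL-763, a minimal sketch of editing an .ini value with Python's standard configparser module is shown below. The section name StrelkaGermline and the inline file contents are assumptions made for illustration. Check the .ini file shipped with the configure script for the real section names and defaults.

```python
import configparser
import io

# Hypothetical .ini contents; the section name is an assumption.
ini_text = """
[StrelkaGermline]
maxIndelSize = 49
"""

config = configparser.ConfigParser()
config.optionxform = str  # preserve the case of option names such as maxIndelSize
config.read_string(ini_text)

# Lower the maximum reported indel size for this illustrative run.
config["StrelkaGermline"]["maxIndelSize"] = "30"

# Write the modified configuration to an in-memory buffer instead of a file.
out = io.StringIO()
config.write(out)
print(out.getvalue())
```

Setting optionxform before reading matters here, because configparser lowercases option names by default and a case-sensitive reader would then not find maxIndelSize.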

strelka-2.9.0.centos6_x86_64.tar.bz2 is a binary distribution for 64-bit linux. This is built on CentOS 6 with all dependencies except glibc statically linked. It is expected to run without modification on most linux distributions.

The strelka-2.9.0.release_src.tar.bz2 download is the full source code release. The “Source code” downloads are generated by GitHub and are incomplete as they are missing version numbers and build system changes to improve portability.