Releases: Illumina/strelka

strelka-2.9.0

08 Feb 20:26

Summary

This is a major update from v2.8.4. The most important change in this release is indirect: haplotype modeling and realignment have been improved such that, given the strelka2 germline VCF output of a trio at typical cWGS depth, false-positive de novo variant calls have been roughly cut in half. This is due to fixes for realignment artifacts that were too rare to noticeably impact germline call quality, but frequent relative to de novo variant rates. These changes should also accelerate the future transition to haplotype modeling for somatic variants. Many additional improvements to stability, error diagnostics, ease of use and accuracy are also included in this release, as enumerated below.

Changelog

Added

  • Add strand bias feature to germline indel scoring model (STREL-676)
    • Improves filtration for a small number of false positive indels in typical WGS analysis.
  • Add haplotyping constraints to the read alignment (STREL-743)
    • Phasing information from haplotyping is used to constrain combinations of variants within read alignments
    • Removes a rare artifact that could trigger false de novo calls from multi-sample germline variant output. Baseline false-positive SNVs and indels are reduced to approximately half of the previous count.
  • Add new filter to make multi-sample germline variant output easier to interpret (STREL-819)
    • Locus filter 'NoPassedVariantGTs' added when no sample has a passing variant genotype.
    • This allows passing variants to be easily extracted with the FILTER field, without querying FORMAT/GT and FORMAT/FT.
  • Add new filter to prevent interference between forced indels and other indels (STREL-607)
    • Locus filter 'NotGenotyped' added to ForcedGT indels if they can possibly interfere with indels discovered by Strelka.
    • All complex alleles are also not genotyped and appear in the VCF output with this NotGenotyped filter.
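
As an illustration of the 'NoPassedVariantGTs' behavior described above, the following minimal sketch keeps only the records whose FILTER column is PASS, without looking at FORMAT/GT or FORMAT/FT. The function name and the two example records are hypothetical and were made up for this illustration; they are not taken from real Strelka output.

```python
def passing_records(vcf_lines):
    """Yield VCF data lines whose FILTER column (7th field) is exactly PASS."""
    for line in vcf_lines:
        if line.startswith("#"):
            continue  # skip meta-information and header lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) > 6 and fields[6] == "PASS":
            yield line.rstrip("\n")


# Hypothetical two-sample records, invented for illustration only.
example = [
    "##fileformat=VCFv4.1",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tS1\tS2",
    "chr1\t100\t.\tA\tG\t50\tPASS\t.\tGT:FT\t0/1:PASS\t0/0:PASS",
    "chr1\t200\t.\tC\tT\t10\tNoPassedVariantGTs\t.\tGT:FT\t0/1:LowGQX\t0/0:PASS",
]

kept = list(passing_records(example))
for record in kept:
    print(record)
```

Only the record at position 100 is kept. The second record has a variant genotype in S1, but that genotype does not pass, so the locus-level filter removes the record without any per-sample inspection.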

Changed

  • Change default maximum indel size from 50 to 49 (STREL-811)
    • This change is made as part of an effort to better align manta with GIAB SV size range conventions, such that strelka and manta together provide complete, non-overlapping coverage over the full indel spectrum using default settings.
  • Remove preliminary step which counts the 'mappable' (non-N) size of the genome (STREL-772)
    • This step had a legacy use in identifying noisy alignments, and is now replaced with a simplified scheme.
  • Lower default local task memory requirement from 2 to 1.5 Gb (STREL-802)
    • This enables all cores on an AWS c4.8xlarge with the default configuration. Use the --callMemMb option to override this for unusual cases.
  • Update LowDepth filter for somatic calls to include cases where the normal sample depth is below 2 (STREL-745)
  • Update htslib to incorporate CRAM file query fix (STREL-839/MANTA-1336)
    • This is expected to resolve possible issues with error parameter estimation from alignments in CRAM format.
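
The effect of the lower per-task memory requirement (STREL-802) can be worked through with simple arithmetic. The sketch below assumes a simplified scheduler that caps concurrent tasks by both core count and memory, and it assumes a c4.8xlarge has 36 vCPUs and 60 GiB of memory (figures recalled from AWS's published instance specifications, not stated in these notes, and worth verifying). It also ignores memory reserved by the OS and any Gb/GiB distinction.

```python
def max_concurrent_tasks(total_mem_gb, cores, mem_per_task_gb):
    """Tasks that can run at once when each task reserves a fixed amount of memory."""
    by_memory = int(total_mem_gb // mem_per_task_gb)
    return min(cores, by_memory)


# Assumed instance figures (see the note above); not taken from the release notes.
CORES = 36
MEM_GB = 60

old_limit = max_concurrent_tasks(MEM_GB, CORES, 2.0)  # memory-bound
new_limit = max_concurrent_tasks(MEM_GB, CORES, 1.5)  # core-bound

print(old_limit, new_limit)
```

Under these assumptions the old 2 Gb default allows 30 concurrent tasks, leaving 6 cores idle, while the 1.5 Gb default allows all 36 cores to be used.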

Fixed

  • Fix empirical variant scoring (EVS) of complex somatic indels (STREL-774)
    • Previously the EVS model was overly pessimistic against complex somatic indels. This is now fixed by changing how EVS input features are computed for complex indels.
  • Fix default sample name used in the VCF output for germline analysis (STREL-737)
    • The default is used when the sample name cannot be parsed from the BAM header. It is now fixed to insert SAMPLE1, SAMPLE2, etc., as documented.
  • Fix rare instance where strand bias (FORMAT/SB) is 'inf' (STREL-741)
  • Provide clear error message when attempting to configure/run with python3 (STREL-762)
  • Fix python configure scripts to make maximum reported indel size configurable (STREL-763)
    • This can be done by configuring the maxIndelSize value inside the .ini file.
  • Fix realignment slow-down issue that occurs when reads overlap too many candidate SNVs (STREL-805)
  • Stop automatically clearing python environment variables (STREL-810)
    • This should allow python from certain module systems to be used, but may (rarely) cause instability due to conflicting content in a user's PYTHONPATH.
  • Standardize germline FORMAT/GQ VCF tag to Integer type (STREL-812)
  • Fix issue where the LowDepth filter was not applied to continuous variant frequency (e.g. mitochondrial) calls (STREL-803)
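
For the maxIndelSize setting mentioned under STREL-763, a rough sketch of what the .ini entry might look like, read back with Python's standard configparser, is shown below. The section name `[StrelkaGermline]` is an assumption made for illustration; check the .ini file shipped next to the configure script for the actual section name and layout. Only the `maxIndelSize` key itself comes from the notes above.

```python
import configparser

# Hypothetical .ini content; the section name is an assumption, not confirmed here.
INI_TEXT = """
[StrelkaGermline]
maxIndelSize = 49
"""

parser = configparser.ConfigParser()
parser.optionxform = str  # keep key case as written instead of lowercasing
parser.read_string(INI_TEXT)

max_indel_size = parser.getint("StrelkaGermline", "maxIndelSize")
print(max_indel_size)
```

The `optionxform` line matters because configparser lowercases option names by default, which would turn `maxIndelSize` into `maxindelsize` when the keys are listed or written back out.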

strelka-2.9.0.centos6_x86_64.tar.bz2 is a binary distribution for 64-bit linux. This is built on CentOS 6 with all dependencies except glibc statically linked. It is expected to run without modification on most linux distributions.

The strelka-2.9.0.release_src.tar.bz2 download is the full source code release. The “Source code” downloads are generated by GitHub and are incomplete as they are missing version numbers and build system changes to improve portability.

strelka-2.8.4

31 Oct 15:55

Summary

This is a major bugfix update from v2.8.3. The two most notable changes are:

  1. A nearly 10-fold reduction in memory usage during germline analysis
  2. A fix to the default variant recall level for male non-PAR chrX, or any region given a ploidy of 1 in the ploidy VCF file.

NOTE: Minimum supported OS change

The minimum supported linux OS has changed in this release from Centos 5 to Centos 6. The Strelka2 binary release is now built on Centos 6, and is expected to run without modification on any linux distro with glibc corresponding to Centos 6 or later (this includes Ubuntu 12.04). If building from source, the corresponding new package minimums are python 2.6+, cmake 2.8.12+ and boost 1.58.0+.

Changelog

Changed

  • Switch to RapidJSON library for all json parsing (STREL-696)
    • Reduces germline calling memory usage ~10-fold due to improved parse of random forest rescoring models.
  • Change active region detection method to create active regions shared by all samples (STREL-710)
  • Verify region/callRegion values at configuration time (STREL-724)
    • Chromosome labels in BED records and region arguments must be found in the reference.
  • Verify run directory has not already been configured (MANTA-1252/STREL-734)
  • Verify alignment file extension at configuration time (MANTA-886)
  • Update minimum supported linux OS from Centos 5 to Centos 6 (STREL-720)
  • Move changelog to markdown format (STREL-571)
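
The configuration-time check on chromosome labels (STREL-724) can be approximated ahead of time with a small script. This is only a sketch of the idea, not Strelka's implementation: it assumes the reference has a samtools-style .fai index whose first tab-separated column is the sequence name, and the function name is hypothetical.

```python
def unknown_bed_chroms(bed_lines, fai_lines):
    """Return BED chromosome labels (in first-seen order) absent from the .fai index."""
    ref_chroms = {line.split("\t")[0].strip() for line in fai_lines if line.strip()}
    missing = []
    for line in bed_lines:
        # Skip blank lines and the optional BED header lines.
        if not line.strip() or line.startswith(("#", "track", "browser")):
            continue
        chrom = line.split("\t")[0].strip()
        if chrom not in ref_chroms and chrom not in missing:
            missing.append(chrom)
    return missing


# Hypothetical inputs: the reference uses "chr"-prefixed names, one BED record does not.
fai = ["chr1\t1000\t6\t60\t61", "chr2\t800\t1030\t60\t61"]
bed = ["chr1\t0\t100", "1\t0\t100", "chr2\t50\t60"]

missing_labels = unknown_bed_chroms(bed, fai)
print(missing_labels)
```

A mismatch such as "1" versus "chr1" is the typical case this kind of check catches before any analysis starts.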

Fixed

  • Fix germline empirical variant scoring (EVS) for haploid regions (STREL-678)
    • Previously, EVS resulted in reduced recall for haploid regions such as non-PAR regions of chrX in male samples. After adding haploid training examples from NA12877 chrX, EVS performance for haploid regions is comparable to diploid.
  • Fix debug option to provide realigned reads in bam output (STREL-721/[#15])

strelka-2.8.4.centos6_x86_64.tar.bz2 is a binary distribution for 64-bit linux. This is built on CentOS 6 with all dependencies except glibc statically linked. It is expected to run without modification on most linux distributions.

The strelka-2.8.4.release_src.tar.bz2 download is the full source code release. The “Source code” downloads are generated by GitHub and are incomplete as they are missing version numbers and build system changes to improve portability.

strelka-2.8.3

22 Sep 20:59

This is a bugfix update from v2.8.2.

Updates

  • Minor correction to the non-error term used during adaptive indel error estimation (STREL-705)
  • Minor correction to somatic joint allele-frequency prior (STREL-632)
  • Improve somatic EVS feature consistency (STREL-652)
  • Improve CRAM reference handling (STREL-647)
    • The reference provided as input during workflow configuration is now prioritized over the URI in the CRAM header. This makes it easier to work with any CRAM file which contains a local file path in the header.

strelka-2.8.3.centos5_x86_64.tar.bz2 is a binary distribution for 64-bit linux. This is built on CentOS 5 with all dependencies except glibc statically linked. It is expected to run without modification on most linux distributions.

The strelka-2.8.3.release_src.tar.bz2 download is the full source code release. The “Source code” downloads are generated by GitHub and are incomplete as they are missing version numbers and build system changes to improve portability.

strelka-2.8.2

20 Aug 00:36

This is a minor bugfix update from v2.8.1

Updates:

  • Fix haplotype model issue occurring when contigs have no sequence coverage (STREL-653)

strelka-2.8.2.centos5_x86_64.tar.bz2 is a binary distribution for 64-bit linux. This is built on CentOS 5 with all dependencies except glibc statically linked. It is expected to run without modification on most linux distributions.

The strelka-2.8.2.release_src.tar.bz2 download is the full source code release. The “Source code” downloads are generated by GitHub and are incomplete as they are missing version numbers and build system changes to improve portability.

strelka-2.8.1

20 Aug 00:29

This is a minor bugfix update from v2.8.0.

Updates

  • Fix allele noise filtration to synchronize across multiple samples (STREL-650)
  • Fix minor inconsistency in sequence error counting genome segment bounds (STREL-651)
  • Update htslib/samtools to v1.5 for improved error detection/messages and CRAM support (STREL-633)

strelka-2.8.0

20 Aug 00:28

This is a major feature update from v2.7.1.

New Features

  • Germline
    • Improve haplotype model with assembly and new haplotype-based denoising steps
    • Expand read back phasing capability: Phasing is attempted between all variant types within 10 bases, and phasing is opportunistically attempted at longer range (STREL-178)
    • EVS Model improvements: new features, new treatment of unknown variants, updated truth sets for training.
    • Add adaptive indel error estimation to dynamically adjust indel error rates based on the noise signature of each input sample.
  • Somatic
    • EVS Model improvements: new features and feature normalization, new truth sets.
  • Shared
    • Added new --callRegions option to restrict germline or somatic calling to a bed file (STREL-356)
    • All VCF input REF fields are now validated against the input reference fasta (STREL-260)

Other Changes

  • Germline
    • gVCF compressed block depth values have been updated (STREL-478)
      • FORMAT/DP is changed from minimum to average DP from the block
      • FORMAT/DPF is changed from minimum to average DPF from the block
      • FORMAT/MIN_DP is added to provide the minimum DP from the block
    • Removed germline-only --targetRegions option. See new --callRegions option above
  • Shared
    • Size of demo data in strelka installation reduced from 75 Mb to < 200 Kb (STREL-576)
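
To make the gVCF block depth change (STREL-478) concrete, the sketch below summarizes a compressed block from hypothetical per-site depths. The function name and the depth values are invented for illustration, and the mean is left unrounded because these notes do not say how the reported value is rounded.

```python
def block_depth_fields(site_depths):
    """Summarize per-site depths of one compressed gVCF block.

    DP follows the new convention (block average); MIN_DP carries the block
    minimum, which is what DP itself reported before this change.
    """
    if not site_depths:
        raise ValueError("a block must contain at least one site")
    return {
        "DP": sum(site_depths) / len(site_depths),
        "MIN_DP": min(site_depths),
    }


# Hypothetical depths for a four-site block.
summary = block_depth_fields([30, 28, 32, 10])
print(summary)
```

For this block the old convention would have reported DP=10, driven entirely by the single low-coverage site. The new convention reports an average of 25 and keeps the value 10 available as MIN_DP.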

Issues

Issues affecting v2.7.1 which are closed in v2.8.0

  • Sites spanned by homozygous reference forcedGT deletions no longer have altered QUAL/GQ/GQX values and FILTER entries (STREL-612)
    • Previously, sites spanned by forcedGT deletions could be pessimistically filtered or given very low scores for QUAL/GQ/GQX.
  • Fix true indel calls being filtered out as IndelConflict entries when an upstream overlapping forcedGT deletion is present (STREL-609)
  • Add missing data to pooled indel calls (STREL-596)
    • All chrM continuous-model indel calls were given 0 for SAMPLE/DPI and INFO/MQ due to a bug introduced during multi-sample generalization. This is now fixed.
  • Filter low-depth PASS'd calls (STREL-564/STREL-597)
    • All germline calls with depth < 3 and somatic calls with tumor depth less than 1 are subject to an additional LowDepth filter.
  • Improve efficiency of task scaling for many 100s of cores (STREL-392)
    • Prior to pyflow 1.1.14, there were major inefficiencies in launching tasks as the task count scaled into many 100s of concurrent processes. This is especially important for rapid turnaround time (TAT) on trios on specialized servers.
  • Fix rare instances of invalid VCF output given complicated overlapping deletion at genome segment boundary (STREL-391)
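
The LowDepth thresholds described for STREL-564/STREL-597 can be written as two small predicates. This is a sketch of the stated rule only: which depth value Strelka actually compares against is not specified in these notes, so the plain integer arguments and the function names here are assumptions.

```python
GERMLINE_MIN_DEPTH = 3      # germline calls with depth < 3 get LowDepth
SOMATIC_MIN_TUMOR_DEPTH = 1  # somatic calls with tumor depth < 1 get LowDepth


def germline_low_depth(depth):
    """True if a germline call would receive the additional LowDepth filter."""
    return depth < GERMLINE_MIN_DEPTH


def somatic_low_depth(tumor_depth):
    """True if a somatic call would receive the additional LowDepth filter."""
    return tumor_depth < SOMATIC_MIN_TUMOR_DEPTH


print(germline_low_depth(2), germline_low_depth(3))
print(somatic_low_depth(0), somatic_low_depth(1))
```

Both comparisons are strict, so a germline call at depth 3 and a somatic call at tumor depth 1 are not affected by this filter.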

Full Changelog

  • STREL-608 Fix hang after error during adaptive estimation
  • STREL-610 Fix workflow resumption after interrupt
  • STREL-557 Retrain somatic EVS on updated alignments and truth sets
  • STREL-609 Fix genotyping error on forced non-variant indels
  • STREL-612 Fix sites overlapping forced non-variant indels
  • STREL-602 Retrain germline SNV EVS on updated alignments
  • STREL-597 Add low depth filter for both somatic and germline calls
  • STREL-596 Add missing data to pooled indel calls
  • STREL-586 Improve germline multi-sample calling runtime
  • STREL-576 Reduce demo installation size
  • STREL-579 Add filtered depth rates as somatic SNV EVS features
  • STREL-580 Retrain germline EVS
  • STREL-260 Add validation on all input vcf reference fields
  • STREL-566 Use indel error estimation by default for germline calling
  • STREL-577 Fix bug creating negative size active regions
  • STREL-553 Standardize somatic EVS features
  • STREL-564 Add filter preventing low depth PASS calls
  • STREL-567 Change readConfidentSupportThreshold to 0.51
  • STREL-524 Retrain germline EVS
  • STREL-519 Fix callRegions option thread utilization
  • STREL-459 Retain optimal soft-clipping for RNA analysis
  • STREL-451 Enable somatic indel EVS and retrain somatic SNV EVS
  • STREL-478 Change gVCF non-variant blocks to use mean depth
  • STREL-469 Add RNAseq EVS models
  • STREL-479 External candidate indels create active regions
  • STREL-462 Do not penalize candidate SNVs in alignment scoring
  • STREL-450 Add dinucs to indel error stats module
  • STREL-465 Filter germline haplotypes with phasing error signature
  • STREL-460 Do not assess indel candidacy in active regions if assembly fails
  • STREL-454 Penalize for non-candidate indels in alignment scoring
  • STREL-443 Remove old germline caller target BED file option
  • STREL-178 Replace codon phaser with variant phaser
  • STREL-356 Allow BED file to restrict variant calling regions
  • STREL-392 Fix overlap del handling across segments
  • STREL-405 Fix overcounting issue in error stats module
  • STREL-401 Remove read edge events from error stats module
  • STREL-342 Use assembly to generate haplotypes in long active regions
  • STREL-251 Enable automatic germline EVS calibration
  • STREL-357 Add separate threshold for homref calls

strelka-2.8.0.centos5_x86_64.tar.bz2 is a binary distribution for 64-bit linux. This is built on CentOS 5 with all dependencies except glibc statically linked. It is expected to run without modification on most linux distributions.

strelka-2.7.1

21 Nov 17:13

This is a major bugfix release from v2.7.0.

  • The most important change since v2.7.0 is reverting from the somatic indel empirical scoring (EVS) model to the classic strelka QSI-based scores with hard-filters. EVS is still the default scoring system used for somatic SNVs and is looking promising for indels but is being withdrawn for the time being to better understand generalization to Nano-prep and FFPE tumor samples.
  • The remaining fixes are predominantly improvements to edge-case VCF correctness and ease of interpretability. No stability issues have been identified thus far on 2.7.x.

Full Changelog:

  • STREL-336 Fix incorrect indel normalizations
  • STREL-332 Revert to core somatic indel scoring with adjusted threshold
  • STREL-331 left-shift indels inside of active regions
  • STREL-248 update somatic EVS to include isaac4 data in training
  • STREL-321 fix rare shared variant prefix issue in multi-sample analysis
  • STREL-322 fix VCF namespace conflict on INFO/EVS
  • STREL-200 Adjust mitochondrial strand-bias for consistency with autosomes
  • STREL-319 Sync RNA-seq EVS feature names with DNA changes
  • STREL-318 Accept VCFs with unknown ALT values as forced-call sites
  • STREL-275 REF and ALTs do not share a common prefix of size over 1
  • STREL-274 simplify variant filter logic in multi-sample germline gVCF(s)
  • STREL-269 prevent indels larger than the max indel size from becoming candidate
  • STREL-267 forced indel call does not appear in somatic output

strelka-2.7.1.centos5_x86_64.tar.bz2 is a binary distribution for 64-bit linux. This is built on CentOS 5 with all dependencies except glibc statically linked. It is expected to run without modification on most linux distributions.

The strelka-2.7.1.release_src.tar.bz2 download is the full source code release. The “Source code” downloads are generated by GitHub and are incomplete as they are missing version numbers and build system changes to improve portability.

strelka-2.7.0

28 Oct 15:21

Initial GPLv3 release


strelka-2.7.0.centos5_x86_64.tar.bz2 is a binary distribution for 64-bit linux. This is built on CentOS 5 with all dependencies except glibc statically linked. It is expected to run without modification on most linux distributions.

The strelka-2.7.0.release_src.tar.bz2 download is the full source code release. The “Source code” downloads are generated by GitHub and are incomplete as they are missing version numbers and build system changes to improve portability.