Skip to content

ONNX Runtime v1.19.2

Latest
Compare
Choose a tag to compare
@MaanavD MaanavD released this 04 Sep 19:33
· 251 commits to main since this release
ffceed9

Announcements

  • ORT 1.19.2 is a small patch release, fixing some broken workflows and introducing bug fixes.

Build System & Packages

  • Fixed the signing of native DLLs.
  • Disabled absl symbolize in Windows Release build to avoid dependency on dbghelp.dll.

Training

  • Restored support for CUDA compute capability 7.0 and 7.5 with CUDA 12, and 6.0 and 6.1 with CUDA 11.
  • Several fixes for training CI pipelines.

Mobile

  • Fixed ArgMaxOpBuilder::AddToModelBuilderImpl() nullptr Node access for CoreML EP.

Generative AI

  • Added CUDA kernel for Phi3 MoE.
  • Added smooth softmax support in CUDA and CPU kernels for the GroupQueryAttention operator.
  • Fixed number of splits calculations in GroupQueryAttention CUDA operator.
  • Enabled causal support in the MultiHeadAttention CUDA operator.

Contributors

@prathikr, @mszhanyi, @edgchen1, @tianleiwu, @wangyems, @aciddelgado, @mindest, @snnn, @baijumeswani, @MaanavD

Thanks to everyone who helped ship this release smoothly!

Full Changelog: v1.19.0...v1.19.2