Olive-ai 0.3.2
Examples
The following examples have been added:
- DirectML SDXL refiner #487
- Open Llama arc #582
- Enable Intel® Neural Compressor 4-bit weight-only quantization #614
- Add NCHW GroupNorm fusion to DirectML's SD examples #617
Passes (optimization techniques)
- QLoRA pass for torch model fine-tuning (see the config sketch after this list)
- Intel® Neural Compressor 4-bit weight-only quantization
- OnnxModelOptimizer
  - Insert a Cast operation for cases where ArgMax input isn't supported on the device
  - Fuse consecutive Reshape operations when the latter results in flattening
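For illustration, a pass from this release is enabled by adding an entry to the passes section of an Olive workflow config and running the workflow. The sketch below uses the QLoRA pass named above; the model name and the option names under config are assumptions for illustration, not values taken from this release.

```python
# Minimal sketch: registering the new QLoRA pass in an Olive workflow config.
# The model name and the options under "config" are illustrative assumptions;
# a training data config would also be needed in practice (omitted for brevity).
from olive.workflows import run as olive_run

workflow = {
    "input_model": {
        "type": "PyTorchModel",
        "config": {
            "hf_config": {
                "model_name": "openlm-research/open_llama_3b",  # hypothetical model
                "task": "text-generation",
            }
        },
    },
    "passes": {
        "qlora": {
            "type": "QLoRA",  # pass added in this release
            "config": {
                "lora_r": 64,      # assumed option name
                "lora_alpha": 16,  # assumed option name
            },
        }
    },
}

olive_run(workflow)
```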
Engine
- Summarize pass run history in a table (install tabulate for better preview)
- Support tuning and evaluating models across different execution providers managed by Olive-ai (see the sketch below).
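As a rough illustration, multi-EP tuning could be requested from the engine section of a workflow config; the placement and name of the execution_providers field below are assumptions about the 0.3.2 schema rather than a verified excerpt.

```python
# Sketch: asking Olive to tune and evaluate candidate models on more than one
# execution provider; Olive manages the per-provider runs. Field names and
# placement are assumptions for illustration.
engine_config = {
    "engine": {
        "search_strategy": {
            "execution_order": "joint",
            "search_algorithm": "exhaustive",
        },
        "execution_providers": ["CPUExecutionProvider", "CUDAExecutionProvider"],
    }
}
# Optional: `pip install tabulate` so the pass run history summary prints as a table.
```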
Model
- Add model_loading_args, load_model, and load_model_config to HFConfig (see the sketch after this list).
- Add adapter_path to PyTorchModel
- Introduce model_attributes, which can be used to simplify the user's input for transformer_optimization
- Add AML curated model support
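A hypothetical input_model config using the new fields might look like the sketch below; the model name, dtype, and adapter path are placeholders.

```python
# Sketch of an input_model config using the fields added in this release.
# All concrete values (model name, dtype, adapter path) are placeholders.
input_model = {
    "input_model": {
        "type": "PyTorchModel",
        "config": {
            "hf_config": {
                "model_name": "openlm-research/open_llama_3b",  # placeholder
                "task": "text-generation",
                # New: keyword arguments forwarded when loading the HF model
                "model_loading_args": {"torch_dtype": "float16"},
            },
            # New: path to fine-tuned adapter weights (e.g. from LoRA/QLoRA)
            "adapter_path": "models/open_llama_adapter",  # placeholder path
        },
    }
}
```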
Dataset
- Auto-insertion of the input model's data config (when the input is a PyTorch model with hf_config.dataset) into pass configs is removed. Use "input_model_data_config" if you want to use the input model's data config (see the sketch after this list).
- Support a second type of dataset, called pair, for text-generation tasks
- Support converting an Olive dataset to a huggingface datasets.Dataset
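Because the input model's data config is no longer auto-inserted, a pass that needs it now references it by name. A minimal sketch, where the pass type is only an example:

```python
# Sketch: explicitly referencing the input model's data config by name,
# since auto-insertion into pass configs was removed in this release.
# "OrtPerfTuning" is used here only as an example pass type.
passes = {
    "perf_tuning": {
        "type": "OrtPerfTuning",
        "config": {
            "data_config": "input_model_data_config",
        },
    }
}
```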
Known Issues
- #571 Whisper gpu does not consume gpu resources
- #573 Distinguish pass instance with name not cls name
Dependencies
- Support onnxruntime 1.16.1
- Drop Python 3.7 support. Python >= 3.8 is now required to run Olive-ai optimization.