Releases: microsoft/Olive
Olive-ai 0.2.1
Examples
The following examples have been added:
General
- Enable hardware accelerator support for Olive. This introduces new config options: `accelerators` in `systems` (for example, `CPU`, `GPU`, etc.) and `execution_providers` in `engine` (for example, `CPUExecutionProvider`, `CUDAExecutionProvider`, etc.).
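A minimal sketch of where these options might sit in a workflow config file, assuming the field placement described above; the surrounding structure and values are illustrative, not taken from the release notes:

```json
{
  "systems": {
    "local_system": {
      "type": "LocalSystem",
      "config": {
        "accelerators": ["GPU"]
      }
    }
  },
  "engine": {
    "execution_providers": ["CUDAExecutionProvider", "CPUExecutionProvider"]
  }
}
```

Presumably the engine pairs each accelerator with a compatible execution provider when running passes and evaluations.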
Evaluator
- Support for evaluating distributed ONNX models
Metrics
- Extend metrics' `sub_type` to accept a list so that multiple sub-metrics can be gathered in one evaluation job when possible, and add `sub_type_for_rank` to indicate which sub-metric the sort/search strategies should use.
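An illustrative metric entry under these changes; the metric and sub-metric names are hypothetical, chosen only to show the shape of the fields:

```json
{
  "name": "accuracy",
  "type": "accuracy",
  "sub_type": ["accuracy_score", "f1_score"],
  "sub_type_for_rank": "accuracy_score"
}
```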
Olive-ai 0.2.0
Examples
The following examples have been added:
- ResNet Optimization with Vitis-AI Quantization for CPU
- SqueezeNet Optimization with DirectML for GPU
- Stable Diffusion Optimization with DirectML for GPU
- MobileNet Optimization with QDQ Quantization for Qualcomm NPU
- Whisper Optimization for CPU
- BERT Optimization with Intel® Neural Compressor PTQ for CPU
General
- Simplify the data loading experience by adding transformers data config support. For transformer models, users can use `hf_config.dataset` to leverage online Hugging Face datasets.
- Ease environment setup: users can run `olive.workflows.run --config config.json --setup` to install the packages required by the configured passes.
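A hedged sketch of the `hf_config.dataset` usage described above; the exact nested field names (`data_name`, `subset`, `split`) and model values are assumptions for illustration:

```json
{
  "input_model": {
    "type": "PyTorchModel",
    "config": {
      "hf_config": {
        "model_name": "bert-base-uncased",
        "task": "text-classification",
        "dataset": {
          "data_name": "glue",
          "subset": "mrpc",
          "split": "validation"
        }
      }
    }
  }
}
```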
Passes (optimization techniques)
- Integrate Intel® Neural Compressor into Olive: introduce new passes IncStaticQuantization, IncDynamicQuantization, and IncQuantization.
- Integrate Vitis-AI into Olive: introduce the new pass VitisAIQuantization.
- Introduce OnnxFloatToFloat16: converts a model to float16. It is based on onnxconverter-common.convert_float_to_float16.
- Introduce OrtMixedPrecision: converts model to mixed precision to retain a certain level of accuracy.
- Introduce AppendPrePostProcessingOps: adds pre/post-processing nodes to the input model.
- Introduce InsertBeamSearch: chains two model components (for example, encoder and decoder) together by inserting a beam search op between them.
- Support external data for all ONNX passes.
- Enable transformer optimization fusion options in workflow file.
- Expose extra_options in ONNX quantization passes.
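As one example of the last point, `extra_options` in an ONNX quantization pass could forward options to the underlying onnxruntime quantizer. A sketch, where the pass name and the `ActivationSymmetric` option are assumptions for illustration:

```json
{
  "passes": {
    "quantization": {
      "type": "OnnxDynamicQuantization",
      "config": {
        "extra_options": {
          "ActivationSymmetric": true
        }
      }
    }
  }
}
```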
Models
- Introduce DistributedOnnxModel to support distributed inferencing
- Introduce CompositeOnnxModel to represent models with encoder and decoder subcomponents as individual OnnxModels.
- Add io_config to PytorchModel, including input_names, input_shapes, output_names and dynamic_axes
- Add MLFlow model loader
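An illustrative `io_config` on a PytorchModel using the four fields listed above; the model path and tensor names are hypothetical:

```json
{
  "input_model": {
    "type": "PyTorchModel",
    "config": {
      "model_path": "model.pt",
      "io_config": {
        "input_names": ["input_ids", "attention_mask"],
        "input_shapes": [[1, 128], [1, 128]],
        "output_names": ["logits"],
        "dynamic_axes": {
          "input_ids": {"0": "batch", "1": "seq_len"}
        }
      }
    }
  }
}
```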
Systems
- Introduce PythonEnvironmentSystem: a Python environment on the host machine. This system allows users to evaluate models using onnxruntime or packages installed in a different Python environment.
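A sketch of how such a system might be declared in the `systems` section; the type string and path field name are assumptions, not confirmed by the release notes:

```json
{
  "systems": {
    "python_env": {
      "type": "PythonEnvironment",
      "config": {
        "python_environment_path": "/path/to/other/env/bin"
      }
    }
  }
}
```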
Evaluator
- Remove target from the evaluator config.
- Introduce dummy dataloader for latency evaluation.
Metrics
- Introduce priority_rank: when multiple metrics are configured, users need to specify `"priority_rank": rank_num` for each metric. Olive uses the priority ranks of the metrics to determine the best model.
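For example, a configuration along these lines would tell Olive to prefer accuracy over latency when ranking candidate models (metric names and types are illustrative):

```json
{
  "metrics": [
    {"name": "accuracy", "type": "accuracy", "priority_rank": 1},
    {"name": "latency", "type": "latency", "priority_rank": 2}
  ]
}
```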
Engine
- Introduce Olive Footprint: generate report JSON files, including footprints.json and Pareto-frontier footprints, and dump the frontier to HTML/image.
- Introduce packaging of Olive artifacts: packages CandidateModels, SampleCode and ONNXRuntimePackages in the output_dir folder if packaging is configured in the Engine configuration.
- Introduce log_severity_level.
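A hedged sketch of an engine configuration touching these features; the shape and field names of `packaging_config` are assumptions for illustration:

```json
{
  "engine": {
    "log_severity_level": 0,
    "output_dir": "outputs",
    "packaging_config": {
      "type": "Zipfile",
      "name": "OutputModels"
    }
  }
}
```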
Olive-ai 0.1.0
This is the pre-release of the next version of Olive as a hardware-aware model optimization solution. It mainly includes:
- A unified optimization framework based on modular design. Details
- More optimizations integrated including ONNX Runtime transformer optimization, ONNX post training quantization with accuracy tuning, PyTorch quantization aware training, OpenVINO toolkit and SNPE toolkit. Details
- Easy-to-use interface for contributors to plug in new optimization innovations. Details
- Flexible model evaluation through both local devices and Azure Machine Learning. Details
olive1
This tag archives the old version of Olive.