Releases: microsoft/Olive
Olive-ai 0.2.1
Examples
The following examples have been added:
General
- Enable hardware accelerator support for Olive. This introduces new config options: `accelerators` in `systems` (for example, `CPU`, `GPU`, etc.) and `execution_providers` in `engine` (for example, `CPUExecutionProvider`, `CUDAExecutionProvider`, etc.).
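A minimal sketch of where these options might sit in a workflow config file, assuming the field placement described above; the surrounding structure and values are illustrative, not taken from the release notes:

```json
{
  "systems": {
    "local_system": {
      "type": "LocalSystem",
      "config": {
        "accelerators": ["GPU"]
      }
    }
  },
  "engine": {
    "execution_providers": ["CUDAExecutionProvider", "CPUExecutionProvider"]
  }
}
```

Presumably the engine pairs each accelerator with a compatible execution provider when running passes and evaluations.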
Evaluator
- Support for evaluating distributed ONNX models
Metrics
- Extend metrics' `sub_type` to accept a list so that multiple sub-metrics can be gathered in one evaluation job when possible, and add `sub_type_for_rank` to indicate which sub-metric the sort/search strategies should use.
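An illustrative metric entry under these changes; the metric and sub-metric names are hypothetical, chosen only to show the shape of the fields:

```json
{
  "name": "accuracy",
  "type": "accuracy",
  "sub_type": ["accuracy_score", "f1_score"],
  "sub_type_for_rank": "accuracy_score"
}
```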
Olive-ai 0.2.0
Examples
The following examples have been added:
- ResNet Optimization with Vitis-AI Quantization for CPU
- SqueezeNet Optimization with DirectML for GPU
- Stable Diffusion Optimization with DirectML for GPU
- MobileNet Optimization with QDQ Quantization for Qualcomm NPU
- Whisper Optimization for CPU
- BERT Optimization with Intel® Neural Compressor PTQ for CPU
General
- Simplify the data loading experience by adding transformers data config support. For transformer models, users can use `hf_config.dataset` to leverage online Hugging Face datasets.
- Ease environment setup: users can run `olive.workflows.run --config config.json --setup` to install the packages required by the configured passes.
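A hedged sketch of the `hf_config.dataset` usage described above; the exact nested field names (`data_name`, `subset`, `split`) and model values are assumptions for illustration:

```json
{
  "input_model": {
    "type": "PyTorchModel",
    "config": {
      "hf_config": {
        "model_name": "bert-base-uncased",
        "task": "text-classification",
        "dataset": {
          "data_name": "glue",
          "subset": "mrpc",
          "split": "validation"
        }
      }
    }
  }
}
```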
Passes (optimization techniques)
- Integrate Intel® Neural Compressor into Olive: introduce new passes IncStaticQuantization, IncDynamicQuantization, and IncQuantization.
- Integrate Vitis-AI into Olive: introduce the new pass VitisAIQuantization.
- Introduce OnnxFloatToFloat16: converts a model to float16. It is based on onnxconverter-common.convert_float_to_float16.
- Introduce OrtMixedPrecision: converts model to mixed precision to retain a certain level of accuracy.
- Introduce AppendPrePostProcessingOps: adds pre/post-processing nodes to the input model.
- Introduce InsertBeamSearch: chains two model components (for example, encoder and decoder) together by inserting a beam search op between them.
- Support external data for all ONNX passes.
- Enable transformer optimization fusion options in workflow file.
- Expose extra_options in ONNX quantization passes.
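As one example of the last point, `extra_options` in an ONNX quantization pass could forward options to the underlying onnxruntime quantizer. A sketch, where the pass name and the `ActivationSymmetric` option are assumptions for illustration:

```json
{
  "passes": {
    "quantization": {
      "type": "OnnxDynamicQuantization",
      "config": {
        "extra_options": {
          "ActivationSymmetric": true
        }
      }
    }
  }
}
```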
Models
- Introduce DistributedOnnxModel to support distributed inferencing
- Introduce CompositeOnnxModel to represent models with encoder and decoder subcomponents as individual OnnxModels.
- Add io_config to PytorchModel, including input_names, input_shapes, output_names and dynamic_axes
- Add MLFlow model loader
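An illustrative `io_config` on a PytorchModel using the four fields listed above; the model path and tensor names are hypothetical:

```json
{
  "input_model": {
    "type": "PyTorchModel",
    "config": {
      "model_path": "model.pt",
      "io_config": {
        "input_names": ["input_ids", "attention_mask"],
        "input_shapes": [[1, 128], [1, 128]],
        "output_names": ["logits"],
        "dynamic_axes": {
          "input_ids": {"0": "batch", "1": "seq_len"}
        }
      }
    }
  }
}
```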
Systems
- Introduce PythonEnvironmentSystem: a Python environment on the host machine. This system allows users to evaluate models using onnxruntime or packages installed in a different Python environment.
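A sketch of how such a system might be declared in the `systems` section; the type string and path field name are assumptions, not confirmed by the release notes:

```json
{
  "systems": {
    "python_env": {
      "type": "PythonEnvironment",
      "config": {
        "python_environment_path": "/path/to/other/env/bin"
      }
    }
  }
}
```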
Evaluator
- Remove target from the evaluator config.
- Introduce dummy dataloader for latency evaluation.
Metrics
- Introduce priority_rank: when multiple metrics are configured, users need to specify `"priority_rank": rank_num` for each metric. Olive uses the priority ranks of the metrics to determine the best model.
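For example, a configuration along these lines would tell Olive to prefer accuracy over latency when ranking candidate models (metric names and types are illustrative):

```json
{
  "metrics": [
    {"name": "accuracy", "type": "accuracy", "priority_rank": 1},
    {"name": "latency", "type": "latency", "priority_rank": 2}
  ]
}
```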
Engine
- Introduce Olive Footprint: generate report JSON files, including footprints.json and Pareto-frontier footprints, and dump the frontier to HTML/image.
- Introduce packaging of Olive artifacts: packages CandidateModels, SampleCode and ONNXRuntimePackages in the output_dir folder if packaging is configured in the Engine configuration.
- Introduce log_severity_level.
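A hedged sketch of an engine configuration touching these features; the shape and field names of `packaging_config` are assumptions for illustration:

```json
{
  "engine": {
    "log_severity_level": 0,
    "output_dir": "outputs",
    "packaging_config": {
      "type": "Zipfile",
      "name": "OutputModels"
    }
  }
}
```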
Olive-ai 0.1.0
This is the pre-release of the next version of Olive as a hardware-aware model optimization solution. It mainly includes:
- A unified optimization framework based on modular design. Details
- More optimizations integrated including ONNX Runtime transformer optimization, ONNX post training quantization with accuracy tuning, PyTorch quantization aware training, OpenVINO toolkit and SNPE toolkit. Details
- Easy-to-use interface for contributors to plug in new optimization innovations. Details
- Flexible model evaluation through both local devices and Azure Machine Learning. Details
olive1
This tag archives the old version of Olive.