🚘 Auto opt cli #1343

Open · wants to merge 13 commits into main
Conversation

@trajepl (Contributor) commented Sep 3, 2024

Describe your changes

Auto opt CLI.

E.g., for a BERT model, we can optimize the model from Hugging Face and produce an ONNX model with:

olive auto-opt --model Intel/bert-base-uncased-mrpc --data_config_path data_config.json --task text-classification

olive auto-opt --model Intel/bert-base-uncased-mrpc --data_config_path data_config.json --task text-classification --precision int4 --providers CPU

olive auto-opt --model Intel/bert-base-uncased-mrpc --data_config_path data_config.json --task text-classification --precision fp16 --providers CUDA

# use model builder
olive auto-opt --model microsoft/Phi-3-mini-4k-instruct --precision fp16 --providers CUDA --use_model_builder

Checklist before requesting a review

  • Add unit tests for this change.
  • Make sure all tests can pass.
  • Update documents if necessary.
  • Lint and apply fixes to your code by running lintrunner -a
  • Is this a user-facing change? If yes, give a description of this change to be included in the release notes.
  • Is this PR including examples changes? If yes, please remember to update example documentation in a follow-up PR.

(Optional) Issue link


search_strategy_group = sub_parser.add_argument_group("search strategy options")
search_strategy_group.add_argument(
    "--num-samples", type=int, default=5, help="Number of samples for search algorithm"
)
Contributor
Do we really need to expose this in the CLI?

Contributor Author
I was thinking that if the search takes a long time, the user can reduce num-samples to stop the search sooner.
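
(For context, a minimal sketch of how the flag could bound the search budget, assuming the search strategy config follows Olive's documented num_samples/seed shape; the exact wiring in this PR may differ.)

    # Illustrative sketch only: thread a --num-samples value into a search
    # strategy config so a smaller value stops the search sooner.
    def build_search_strategy(num_samples: int, seed: int = 0) -> dict:
        return {
            "execution_order": "joint",
            "search_algorithm": "tpe",
            "search_algorithm_config": {"num_samples": num_samples, "seed": seed},
        }

    # e.g. reduce the budget from the default of 5 to 3
    search_strategy = build_search_strategy(num_samples=3)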

@devang-ml (Contributor) left a comment

Let's add "auto-opt" needed packages list under extra dependencies so that user can do pip install olive-ai[auto-opt]

@trajepl (Contributor Author) commented Sep 4, 2024

Let's add "auto-opt" needed packages list under extra dependencies so that user can do pip install olive-ai[auto-opt]

I think it might be hard to define a unified extra-dependency list for auto-opt, since auto-opt may be used on different devices, which would cause conflicts between the different onnxruntime packages.

I think we have a feature to dynamically get_required_packages based on the given accelerators, which could be used in auto-opt.
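
(A hypothetical sketch of that idea, not Olive's actual get_required_packages helper: map the requested execution providers to the onnxruntime package that provides them, which is also why a single static extras list conflicts across devices.)

    # Hypothetical sketch, not Olive's actual helper: different execution
    # providers are served by different onnxruntime packages.
    EP_TO_ORT_PACKAGE = {
        "CPUExecutionProvider": "onnxruntime",
        "CUDAExecutionProvider": "onnxruntime-gpu",
        "TensorrtExecutionProvider": "onnxruntime-gpu",
        "DmlExecutionProvider": "onnxruntime-directml",
    }

    def required_ort_packages(providers):
        return {EP_TO_ORT_PACKAGE.get(p, "onnxruntime") for p in providers}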

@devang-ml (Contributor)

We can exclude OnnxRuntime and IHV toolkits. But we should be able to install other required packages using pip install olive-ai[auto-opt].

@trajepl (Contributor Author) commented Sep 5, 2024

We can exclude OnnxRuntime and IHV toolkits. But we should be able to install other required packages using pip install olive-ai[auto-opt].

Updated. Basically, pip install olive-ai is adequate to run the auto-opt CLI. Furthermore, to simplify the conversion, since Olive currently uses Optimum to get the ONNX model, I added optimum to olive-ai[auto-opt].

I also tested that, after installing olive-ai[auto-opt], the user only needs to install the corresponding onnxruntime or onnxruntime-genai package (the latter only if --use_model_builder is set) for their device (CPU/GPU, etc.), and the olive auto-opt CLI produces reasonable results.
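
(For reference, a hedged sketch of the kind of extras entry described above, in setup.py style; the actual package list in the PR may differ.)

    # Hedged sketch: an extras entry so `pip install olive-ai[auto-opt]`
    # also pulls in optimum; the PR's real list may contain more packages.
    extras_require = {
        "auto-opt": ["optimum"],
    }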

@samuel100 (Contributor)

What models will be supported by auto-opt? When I tried the auto optimizer with:

{
    "input_model":{
        "type": "HfModel",
        "model_path": "microsoft/phi-3.5-mini-instruct",
        "task": "text-generation"
    },
    "systems": {
        "local_system": {
            "type": "LocalSystem",
            "accelerators": [
                {
                    "device": "cpu",
                    "execution_providers": [
                        "CPUExecutionProvider"
                    ]
                }
            ]
        }
    },
    "auto_optimizer_config": {
        "opt_level": 0,
        "disable_auto_optimizer": false,
        "precision": "int4"
    },
    "host": "local_system",
    "target": "local_system",
    "cache_dir": "cache",
    "output_dir" : "models"
}

I get an error message from the OrtTransformersOptimization pass:

ValueError: Unsupported model type: phi3, please select one from [bart, bert, bert_tf, bert_keras, clip, gpt2, gpt2_tf, gpt_neox, swin, tnlr, t5, unet, vae, vit, conformer, phi] which need to be set under OrtTransformersOptimization.config

If these are the only models supported, then it is a little underwhelming because they are pretty outdated. It is also a bit odd because the optimizer works for Optimum models when I run olive finetune.

As a general rule, a user should be able to plug in:

  • A wide range of model architectures (for example, if Llama4 is released and it has the same architecture as Llama3, then I'd expect the tool to work).
  • A device target.
  • A precision.

The user should not need to worry about adding data.

@devang-ml (Contributor)

The list and the error message are from ONNX Runtime's transformer optimizer.
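
(For reference, the override the error message points to would look roughly like the sketch below in an Olive config. This is a hedged sketch following Olive's documented pass layout; setting model_type to "phi" may not cover phi3-specific fusions.)

    # Hedged sketch: explicitly set the transformer-optimization model type,
    # as the error message suggests; whether the "phi" fusions apply cleanly
    # to phi3 is a separate question.
    passes_override = {
        "passes": {
            "transformers_optimization": {
                "type": "OrtTransformersOptimization",
                "config": {"model_type": "phi"},
            }
        }
    }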


# output options
output_group = sub_parser.add_argument_group("output options")
output_group.add_argument(
Contributor

Not related to this PR, but we could gather this into base.py as well.

Contributor Author

Good call. But the arguments have different attributes: some are required, some are not, and they have different default values.

Can we update it in a follow-up PR?

device = (
    "gpu"
    if self.args.providers
    and any(p[: -(len("ExecutionProvider"))] in ["CUDA", "Tensorrt", "Dml"] for p in self.args.providers)
Contributor

        system_group.add_argument(
            "--providers",
            type=str,
            nargs="*",
            choices=["CPU", "CUDA", "Tensorrt", "Dml", "VitisAI", "Qnn"],
            help="List of execution providers to use for optimization",
        )

without ExecutionProvider?

Contributor

Should device be set to cpu if VitisAI or Qnn is provided here?

Contributor Author

The ExecutionProvider suffix is added automatically by this CLI.
[screenshot]
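
(Roughly what that normalization could look like; an illustrative sketch, not necessarily the PR's exact code.)

    # Illustrative sketch: append the ExecutionProvider suffix to the short
    # names accepted by --providers; the PR's actual implementation may differ.
    providers = [
        p if p.endswith("ExecutionProvider") else f"{p}ExecutionProvider"
        for p in (self.args.providers or [])
    ]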

@trajepl (Contributor Author) commented Sep 6, 2024

Should device be set to cpu if VitisAI or Qnn is provided here?

I think yes. For VitisAI/QNN, we can run quantization on CPU and then run inference with the corresponding EP.

    and any(p[: -(len("ExecutionProvider"))] in ["CUDA", "Tensorrt", "Dml"] for p in self.args.providers)
    else "cpu"
)
providers = self.args.providers or ["CPUExecutionProvider"] if device == "cpu" else ["CUDAExecutionProvider"]
Contributor

Same as above
