Mistral optimization (GPU) for a locally saved model: Failed to run Olive on gpu-cuda. #1341

tjinjin95 opened this issue Aug 31, 2024

Describe the bug
Failed to run Olive on gpu-cuda.

To Reproduce
1. Download https://huggingface.co/mistralai/Mistral-7B-v0.1/tree/main to the folder D:\windowsAI\HFModel\Mistral-7B-v01
2. Follow the readme: https://github.com/microsoft/Olive/tree/main/examples/mistral
3. Run: python mistral.py --optimize --config mistral_fp16_optimize.json --model_id D:\windowsAI\HFModel\Mistral-7B-v01

If this is not the right procedure, please list the correct steps. (A minimal reproduction of the failing step is sketched below.)
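
For reference, the failure in the logs below seems to come from optimum's task inference rather than from Olive itself. A minimal sketch that reproduces just that step (assuming the same optimum 1.21.4 environment as in the pip list; the Hub ID and local path mirror the repro above):

    from optimum.exporters.tasks import TasksManager

    # A Hub model ID works: the task can be inferred from Hub metadata.
    print(TasksManager.infer_task_from_model("mistralai/Mistral-7B-v0.1"))  # "text-generation"

    # A local directory raises the RuntimeError seen in the Olive logs:
    # "Cannot infer the task from a local directory yet, please specify the task manually (...)"
    print(TasksManager.infer_task_from_model(r"D:\windowsAI\HFModel\Mistral-7B-v01"))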

My virtual environment's pip list:
Package Version Editable project location


accelerate 0.33.0
aiohappyeyeballs 2.4.0
aiohttp 3.10.5
aiosignal 1.3.1
alembic 1.13.2
annotated-types 0.7.0
attrs 24.2.0
certifi 2024.7.4
charset-normalizer 3.3.2
colorama 0.4.6
coloredlogs 15.0.1
colorlog 6.8.2
contourpy 1.2.1
cycler 0.12.1
datasets 2.21.0
Deprecated 1.2.14
dill 0.3.8
evaluate 0.4.2
filelock 3.15.4
flatbuffers 24.3.25
fonttools 4.53.1
frozenlist 1.4.1
fsspec 2024.6.1
greenlet 3.0.3
huggingface-hub 0.24.6
humanfriendly 10.0
idna 3.8
inquirerpy 0.3.4
Jinja2 3.1.4
joblib 1.4.2
kiwisolver 1.4.5
lightning-utilities 0.11.6
Mako 1.3.5
MarkupSafe 2.1.5
matplotlib 3.9.2
mpmath 1.3.0
multidict 6.0.5
multiprocess 0.70.16
networkx 3.3
neural_compressor 3.0
numpy 1.26.4
olive-ai 0.7.0 D:\windowsAI\Olive
onnx 1.16.2
onnxconverter-common 1.14.0
onnxruntime-directml 1.19.0
onnxruntime_extensions 0.12.0
onnxruntime-gpu 1.19.0
opencv-python-headless 4.10.0.84
optimum 1.21.4
optuna 3.6.1
packaging 24.1
pandas 2.2.2
pfzy 0.3.4
pillow 10.4.0
pip 24.2
prettytable 3.11.0
prompt_toolkit 3.0.47
protobuf 3.20.2
psutil 6.0.0
py-cpuinfo 9.0.0
pyarrow 17.0.0
pycocotools 2.0.8
pydantic 2.8.2
pydantic_core 2.20.1
pyparsing 3.1.4
pyreadline3 3.4.1
python-dateutil 2.9.0.post0
pytz 2024.1
PyYAML 6.0.2
regex 2024.7.24
requests 2.32.3
safetensors 0.4.4
schema 0.7.7
scikit-learn 1.5.1
scipy 1.14.1
sentencepiece 0.2.0
setuptools 73.0.1
six 1.16.0
skl2onnx 1.17.0
SQLAlchemy 2.0.32
sympy 1.13.2
tabulate 0.9.0
tf2onnx 1.16.1
threadpoolctl 3.5.0
tokenizers 0.19.1
torch 2.4.0
torchaudio 2.4.0
torchmetrics 1.4.1
torchvision 0.19.0
tqdm 4.66.5
transformers 4.43.4
typing_extensions 4.12.2
tzdata 2024.1
urllib3 2.2.2
wcwidth 0.2.13
wrapt 1.16.0
xxhash 3.5.0
yarl 1.9.4
Expected behavior
Generate an optimized model.

Olive config
--config mistral_fp16_optimize.json

Olive logs
(mistral_env) D:\windowsAI\Olive\examples\mistral>python mistral.py --optimize --config mistral_fp16_optimize.json --model_id D:\windowsAI\HFModel\Mistral-7B-v01

optimized_model_dir is:D:\windowsAI\Olive\examples\mistral\models\convert-optimize-perf_tuning\mistral_fp16_gpu-cuda_model
Optimizing D:\windowsAI\HFModel\Mistral-7B-v01
[2024-08-31 17:50:42,659] [INFO] [run.py:138:run_engine] Running workflow default_workflow
[2024-08-31 17:50:42,704] [INFO] [cache.py:51:__init__] Using cache directory: D:\windowsAI\Olive\examples\mistral\cache\default_workflow
[2024-08-31 17:50:42,757] [INFO] [engine.py:1013:save_olive_config] Saved Olive config to D:\windowsAI\Olive\examples\mistral\cache\default_workflow\olive_config.json
[2024-08-31 17:50:42,846] [INFO] [accelerator_creator.py:224:create_accelerators] Running workflow on accelerator specs: gpu-cuda
[2024-08-31 17:50:42,888] [INFO] [engine.py:275:run] Running Olive on accelerator: gpu-cuda
[2024-08-31 17:50:42,888] [INFO] [engine.py:1110:_create_system] Creating target system ...
[2024-08-31 17:50:42,889] [INFO] [engine.py:1113:_create_system] Target system created in 0.000000 seconds
[2024-08-31 17:50:42,889] [INFO] [engine.py:1122:_create_system] Creating host system ...
[2024-08-31 17:50:42,891] [INFO] [engine.py:1125:_create_system] Host system created in 0.000000 seconds
passes is [('convert', {}), ('optimize', {}), ('perf_tuning', {})]
[2024-08-31 17:50:43,102] [INFO] [engine.py:877:_run_pass] Running pass convert:OptimumConversion
Framework not specified. Using pt to export the model.
[2024-08-31 17:50:54,785] [ERROR] [engine.py:976:_run_pass] Pass run failed.
Traceback (most recent call last):
File "D:\windowsAI\Olive\olive\engine\engine.py", line 964, in _run_pass
output_model_config = host.run_pass(p, input_model_config, output_model_path, pass_search_point)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "D:\windowsAI\Olive\olive\systems\local.py", line 30, in run_pass
output_model = the_pass.run(model, output_model_path, point)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "D:\windowsAI\Olive\olive\passes\olive_pass.py", line 206, in run
output_model = self._run_for_config(model, config, output_model_path)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "D:\windowsAI\Olive\olive\passes\onnx\optimum_conversion.py", line 96, in run_for_config
export_optimum_model(model.model_name_or_path, output_model_path, **extra_args)
File "D:\windowsAI\mistral_env\Lib\site-packages\optimum\exporters\onnx_main
.py", line 248, in main_export
task = TasksManager.infer_task_from_model(model_name_or_path)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "D:\windowsAI\mistral_env\Lib\site-packages\optimum\exporters\tasks.py", line 1680, in infer_task_from_model
task = cls._infer_task_from_model_name_or_path(model, subfolder=subfolder, revision=revision)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "D:\windowsAI\mistral_env\Lib\site-packages\optimum\exporters\tasks.py", line 1593, in _infer_task_from_model_name_or_path
raise RuntimeError(
RuntimeError: Cannot infer the task from a local directory yet, please specify the task manually (masked-im, automatic-speech-recognition, fill-mask, object-detection, text2text-generation, text-to-audio, image-to-image, audio-xvector, image-segmentation, mask-generation, zero-shot-object-detection, image-to-text, semantic-segmentation, question-answering, feature-extraction, conversational, token-classification, text-classification, audio-classification, depth-estimation, sentence-similarity, zero-shot-image-classification, audio-frame-classification, multiple-choice, text-generation, image-classification, stable-diffusion, stable-diffusion-xl).
[2024-08-31 17:50:55,193] [WARNING] [engine.py:370:run_accelerator] Failed to run Olive on gpu-cuda.
Traceback (most recent call last):
File "D:\windowsAI\Olive\olive\engine\engine.py", line 349, in run_accelerator
output_footprint = self.run_no_search(
^^^^^^^^^^^^^^^^^^^
File "D:\windowsAI\Olive\olive\engine\engine.py", line 441, in run_no_search
should_prune, signal, model_ids = self._run_passes(
^^^^^^^^^^^^^^^^^
File "D:\windowsAI\Olive\olive\engine\engine.py", line 814, in _run_passes
model_config, model_id, output_model_hash = self._run_pass(
^^^^^^^^^^^^^^^
File "D:\windowsAI\Olive\olive\engine\engine.py", line 964, in _run_pass
output_model_config = host.run_pass(p, input_model_config, output_model_path, pass_search_point)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "D:\windowsAI\Olive\olive\systems\local.py", line 30, in run_pass
output_model = the_pass.run(model, output_model_path, point)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "D:\windowsAI\Olive\olive\passes\olive_pass.py", line 206, in run
output_model = self._run_for_config(model, config, output_model_path)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "D:\windowsAI\Olive\olive\passes\onnx\optimum_conversion.py", line 96, in run_for_config
export_optimum_model(model.model_name_or_path, output_model_path, **extra_args)
File "D:\windowsAI\mistral_env\Lib\site-packages\optimum\exporters\onnx_main
.py", line 248, in main_export
task = TasksManager.infer_task_from_model(model_name_or_path)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "D:\windowsAI\mistral_env\Lib\site-packages\optimum\exporters\tasks.py", line 1680, in infer_task_from_model
task = cls._infer_task_from_model_name_or_path(model, subfolder=subfolder, revision=revision)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "D:\windowsAI\mistral_env\Lib\site-packages\optimum\exporters\tasks.py", line 1593, in _infer_task_from_model_name_or_path
raise RuntimeError(
RuntimeError: Cannot infer the task from a local directory yet, please specify the task manually (masked-im, automatic-speech-recognition, fill-mask, object-detection, text2text-generation, text-to-audio, image-to-image, audio-xvector, image-segmentation, mask-generation, zero-shot-object-detection, image-to-text, semantic-segmentation, question-answering, feature-extraction, conversational, token-classification, text-classification, audio-classification, depth-estimation, sentence-similarity, zero-shot-image-classification, audio-frame-classification, multiple-choice, text-generation, image-classification, stable-diffusion, stable-diffusion-xl).
[2024-08-31 17:50:55,199] [INFO] [engine.py:292:run] Run history for gpu-cuda:
[2024-08-31 17:50:55,347] [INFO] [engine.py:587:dump_run_history] run history:
+------------+-------------------+-------------+----------------+-----------+
| model_id | parent_model_id | from_pass | duration_sec | metrics |
+============+===================+=============+================+===========+
| d03e43d3 | | | | |
+------------+-------------------+-------------+----------------+-----------+
[2024-08-31 17:50:55,378] [INFO] [engine.py:307:run] No packaging config provided, skip packaging artifacts
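
Side note: the traceback shows that the convert pass forwards extra args straight into optimum's exporter (export_optimum_model(model.model_name_or_path, output_model_path, **extra_args)), and optimum's main_export accepts a task parameter. So a possible workaround, assuming the OptimumConversion pass exposes extra_args in its config as the traceback suggests, would be to pin the task explicitly in mistral_fp16_optimize.json instead of relying on inference:

    "convert": {
        "type": "OptimumConversion",
        "extra_args": { "task": "text-generation" }
    }

This sketch is untested here, but specifying the task manually is exactly what the RuntimeError message asks for.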

Other information

  • OS: Windows
  • Olive version: main (olive-ai 0.7.0, editable install)
  • ONNXRuntime package and version: onnxruntime-gpu 1.19.0 (onnxruntime-directml 1.19.0 also installed)
  • Transformers package version: transformers 4.43.4
  • GPU memory: 4 GB

Additional context
None


jambayk commented Sep 4, 2024

Looks like the optimum export is failing on the local model.

Could you try replacing the "convert" pass config with this?

{
    "type": "OnnxConversion",
    "target_opset": 17,
    "torch_dtype": "float32"
}
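
For context, the "convert" entry in the "passes" section of mistral_fp16_optimize.json would then look roughly like the sketch below, with the existing "optimize" and "perf_tuning" entries left unchanged (the run log above shows those three passes). OnnxConversion exports through torch.onnx directly, so it sidesteps optimum's task inference on local paths:

    "passes": {
        "convert": {
            "type": "OnnxConversion",
            "target_opset": 17,
            "torch_dtype": "float32"
        }
    }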
