
Whisper with DirectML: [ONNXRuntimeError] : 6 : RUNTIME_EXCEPTION : Non-zero status code returned while running WhisperBeamSearch node #1221

WA225 commented Jul 2, 2024

Describe the bug
Execution fails when I try to run Whisper on an AMD Radeon 780M iGPU using the DirectML EP, with the following error:

```
onnxruntime.capi.onnxruntime_pybind11_state.RuntimeException: [ONNXRuntimeError] : 6 : RUNTIME_EXCEPTION : Non-zero status code returned while running WhisperBeamSearch node. Name:'BeamSearch_node' Status Message: Non-zero status code returned while running Conv node. Name:'/whisper_encoder/encoder/conv1/Conv' Status Message: C:\a_work\1\s\onnxruntime\core\providers\dml\DmlExecutionProvider\src\MLOperatorAuthorImpl.cpp(2557)\onnxruntime_pybind11_state.pyd!00007FFC4E4A2689: (caller: 00007FFC4EBF5261) Exception(3) tid(1305c) 80070057 The parameter is incorrect.
```

To Reproduce
I run the following commands in this order:

```
olive run --config whisper_dml_fp32.json --setup
python -m pip install "onnxruntime-extensions>=0.9.0"
olive run --config whisper_dml_fp32.json 2> log.txt --tempdir .
```
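For reference, the failing evaluation step can likely be reproduced without Olive by running the final chained model directly with onnxruntime. This is a minimal sketch; the output model path, audio file, and input feed are assumptions based on the Olive whisper example, not values taken from the logs below:

```python
# Hypothetical direct repro, bypassing Olive. The model path, audio file, and
# input names below are assumptions; adjust them to the actual Olive output.
import numpy as np
import onnxruntime as ort

sess = ort.InferenceSession(
    "models/whisper_dml_fp32/model.onnx",  # assumed output path
    providers=["DmlExecutionProvider"],
)

audio = np.fromfile("data/sample.mp3", dtype=np.uint8)  # hypothetical audio clip
inputs = {
    "audio_stream": np.expand_dims(audio, axis=0),
    "max_length": np.array([200], dtype=np.int32),
    "min_length": np.array([0], dtype=np.int32),
    "num_beams": np.array([2], dtype=np.int32),
    "num_return_sequences": np.array([1], dtype=np.int32),
    "length_penalty": np.array([1.0], dtype=np.float32),
    "repetition_penalty": np.array([1.0], dtype=np.float32),
}
# With the DML EP, sess.run is where the WhisperBeamSearch/Conv error surfaces.
print(sess.run(None, inputs))
```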

Olive config
```json
{
  "input_model": {
    "type": "PyTorchModel",
    "config": {
      "model_script": "code/user_script.py",
      "script_dir": "code",
      "hf_config": {
        "model_class": "WhisperForConditionalGeneration",
        "model_name": "openai/whisper-tiny.en",
        "components": [
          {
            "name": "encoder_decoder_init",
            "io_config": "get_encdec_io_config",
            "component_func": "get_encoder_decoder_init",
            "dummy_inputs_func": "encoder_decoder_init_dummy_inputs"
          },
          {
            "name": "decoder",
            "io_config": "get_dec_io_config",
            "component_func": "get_decoder",
            "dummy_inputs_func": "decoder_dummy_inputs"
          }
        ],
        "from_pretrained_args": {
          "attn_implementation": "eager"
        }
      }
    }
  },
  "systems": {
    "local_system": {
      "type": "LocalSystem",
      "config": {
        "accelerators": [
          {
            "device": "gpu",
            "execution_providers": [
              "DmlExecutionProvider"
            ]
          }
        ]
      }
    }
  },
  "evaluators": {
    "common_evaluator": {
      "metrics": [
        {
          "name": "latency",
          "type": "latency",
          "sub_types": [
            {
              "name": "avg"
            }
          ],
          "user_config": {
            "user_script": "code/user_script.py",
            "script_dir": "code",
            "data_dir": "data",
            "dataloader_func": "whisper_dataloader",
            "func_kwargs": {
              "dataloader_func": {
                "model_name": "openai/whisper-tiny.en",
                "use_audio_decoder": true
              }
            },
            "batch_size": 1
          }
        }
      ]
    }
  },
  "passes": {
    "conversion": {
      "type": "OnnxConversion",
      "config": {
        "target_opset": 17,
        "save_as_external_data": true,
        "all_tensors_to_one_file": true
      }
    },
    "transformers_optimization": {
      "type": "OrtTransformersOptimization",
      "config": {
        "save_as_external_data": true,
        "all_tensors_to_one_file": true,
        "opt_level": 0,
        "optimization_options": {
          "enable_gelu": true,
          "enable_layer_norm": true,
          "enable_attention": true,
          "use_multi_head_attention": true,
          "enable_skip_layer_norm": false,
          "enable_embed_layer_norm": false,
          "enable_bias_skip_layer_norm": false,
          "enable_bias_gelu": false,
          "enable_gelu_approximation": false,
          "enable_qordered_matmul": false,
          "enable_shape_inference": true,
          "enable_gemm_fast_gelu": false,
          "enable_nhwc_conv": false,
          "enable_group_norm": false,
          "enable_bias_splitgelu": false,
          "enable_packed_qkv": true,
          "enable_packed_kv": true,
          "enable_bias_add": false,
          "enable_rotary_embeddings": true
        },
        "use_gpu": true
      }
    },
    "insert_beam_search": {
      "type": "InsertBeamSearch",
      "config": {
        "use_forced_decoder_ids": false,
        "use_logits_processor": false,
        "use_gpu": true
      }
    },
    "prepost": {
      "type": "AppendPrePostProcessingOps",
      "config": {
        "tool_command": "whisper",
        "tool_command_args": {
          "model_name": "openai/whisper-tiny.en",
          "use_audio_decoder": true
        },
        "target_opset": 17
      }
    }
  },
  "engine": {
    "search_strategy": {
      "execution_order": "joint",
      "search_algorithm": "exhaustive"
    },
    "ort_log_severity_level": 0,
    "log_severity_level": 0,
    "host": "local_system",
    "target": "local_system",
    "evaluator": "common_evaluator",
    "evaluate_input_model": false,
    "clean_cache": false,
    "cache_dir": "cache",
    "output_dir": "models",
    "output_name": "whisper_dml_fp32"
  }
}
```

Olive logs

```
[2024-07-02 10:46:11,124] [INFO] [run.py:138:run_engine] Running workflow default_workflow
[2024-07-02 10:46:11,132] [INFO] [engine.py:986:save_olive_config] Saved Olive config to cache\default_workflow\olive_config.json
[2024-07-02 10:46:11,132] [DEBUG] [run.py:179:run_engine] Registering pass OnnxConversion
[2024-07-02 10:46:11,136] [DEBUG] [run.py:179:run_engine] Registering pass OrtTransformersOptimization
[2024-07-02 10:46:11,137] [DEBUG] [run.py:179:run_engine] Registering pass InsertBeamSearch
[2024-07-02 10:46:11,138] [DEBUG] [run.py:179:run_engine] Registering pass AppendPrePostProcessingOps
[2024-07-02 10:46:11,146] [DEBUG] [accelerator_creator.py:130:_fill_accelerators] The accelerator device and execution providers are specified, skipping deduce.
[2024-07-02 10:46:11,146] [DEBUG] [accelerator_creator.py:169:_check_execution_providers] Supported execution providers for device gpu: ['DmlExecutionProvider', 'CPUExecutionProvider']
[2024-07-02 10:46:11,147] [DEBUG] [accelerator_creator.py:199:create_accelerators] Initial accelerators and execution providers: {'gpu': ['DmlExecutionProvider']}
[2024-07-02 10:46:11,147] [INFO] [accelerator_creator.py:224:create_accelerators] Running workflow on accelerator specs: gpu-dml
[2024-07-02 10:46:11,147] [DEBUG] [run.py:235:run_engine] Pass OnnxConversion already registered
[2024-07-02 10:46:11,147] [DEBUG] [run.py:235:run_engine] Pass OrtTransformersOptimization already registered
[2024-07-02 10:46:11,147] [DEBUG] [run.py:235:run_engine] Pass InsertBeamSearch already registered
[2024-07-02 10:46:11,148] [DEBUG] [run.py:235:run_engine] Pass AppendPrePostProcessingOps already registered
[2024-07-02 10:46:11,148] [INFO] [engine.py:109:initialize] Using cache directory: cache\default_workflow
[2024-07-02 10:46:11,161] [INFO] [engine.py:265:run] Running Olive on accelerator: gpu-dml
[2024-07-02 10:46:11,161] [INFO] [engine.py:1085:_create_system] Creating target system ...
[2024-07-02 10:46:11,161] [DEBUG] [engine.py:1081:create_system] create native OliveSystem SystemType.Local
[2024-07-02 10:46:11,162] [INFO] [engine.py:1088:_create_system] Target system created in 0.001005 seconds
[2024-07-02 10:46:11,162] [INFO] [engine.py:1097:_create_system] Creating host system ...
[2024-07-02 10:46:11,163] [DEBUG] [engine.py:1081:create_system] create native OliveSystem SystemType.Local
[2024-07-02 10:46:11,163] [INFO] [engine.py:1100:_create_system] Host system created in 0.000999 seconds
[2024-07-02 10:46:11,202] [DEBUG] [engine.py:711:_cache_model] Cached model df880b77 to cache\default_workflow\models\df880b77.json
[2024-07-02 10:46:11,203] [DEBUG] [engine.py:348:run_accelerator] Running Olive in search mode ...
[2024-07-02 10:46:11,203] [DEBUG] [engine.py:623:resolve_goals] Resolving goals: {'latency': {'avg': None}}
[2024-07-02 10:46:11,203] [DEBUG] [engine.py:642:resolve_goals] No baseline got as no goal is provided the the goal is threshold
[2024-07-02 10:46:11,204] [DEBUG] [engine.py:531:run_search] Step 1 with search point {'conversion': {}, 'transformers_optimization': {'only_onnxruntime': True}, 'insert_beam_search': {}, 'prepost': {}} ...
[2024-07-02 10:46:11,204] [INFO] [engine.py:867:_run_pass] Running pass conversion:OnnxConversion
[2024-07-02 10:46:11,207] [DEBUG] [resource_path.py:156:create_resource_path] Resource path code/user_script.py is inferred to be of type file.
[2024-07-02 10:46:11,209] [DEBUG] [resource_path.py:156:create_resource_path] Resource path code is inferred to be of type folder.
[2024-07-02 10:46:11,211] [DEBUG] [resource_path.py:156:create_resource_path] Resource path code is inferred to be of type folder.
[2024-07-02 10:46:11,212] [DEBUG] [resource_path.py:156:create_resource_path] Resource path code/user_script.py is inferred to be of type file.
[2024-07-02 10:46:11,449] [DEBUG] [resource_path.py:156:create_resource_path] Resource path C:\Olive-main\examples\whisper\code is inferred to be of type folder.
[2024-07-02 10:46:11,451] [DEBUG] [resource_path.py:156:create_resource_path] Resource path C:\Olive-main\examples\whisper\code\user_script.py is inferred to be of type file.
[2024-07-02 10:46:11,470] [INFO] [hf_config.py:112:load_hf_model] Loading Huggingface model from openai/whisper-tiny.en
[2024-07-02 10:46:12,330] [DEBUG] [resource_path.py:156:create_resource_path] Resource path C:\Olive-main\examples\whisper\code is inferred to be of type folder.
[2024-07-02 10:46:12,332] [DEBUG] [resource_path.py:156:create_resource_path] Resource path C:\Olive-main\examples\whisper\code\user_script.py is inferred to be of type file.
[2024-07-02 10:46:12,460] [DEBUG] [resource_path.py:156:create_resource_path] Resource path C:\Olive-main\examples\whisper\code is inferred to be of type folder.
[2024-07-02 10:46:12,462] [DEBUG] [resource_path.py:156:create_resource_path] Resource path C:\Olive-main\examples\whisper\code\user_script.py is inferred to be of type file.
[2024-07-02 10:46:12,466] [DEBUG] [dummy_inputs.py:45:get_dummy_inputs] Using dummy_inputs_func to get dummy inputs
[2024-07-02 10:46:12,583] [DEBUG] [pytorch.py:277:get_user_io_config] Calling get_encdec_io_config to get io_config
[2024-07-02 10:46:13,161] [DEBUG] [conversion.py:234:_export_pytorch_model] Converting model on device cpu with dtype None.
[2024-07-02 10:46:16,354] [DEBUG] [resource_path.py:156:create_resource_path] Resource path C:\Olive-main\examples\whisper\code is inferred to be of type folder.
[2024-07-02 10:46:16,355] [DEBUG] [resource_path.py:156:create_resource_path] Resource path C:\Olive-main\examples\whisper\code\user_script.py is inferred to be of type file.
[2024-07-02 10:46:16,471] [DEBUG] [resource_path.py:156:create_resource_path] Resource path C:\Olive-main\examples\whisper\code is inferred to be of type folder.
[2024-07-02 10:46:16,473] [DEBUG] [resource_path.py:156:create_resource_path] Resource path C:\Olive-main\examples\whisper\code\user_script.py is inferred to be of type file.
[2024-07-02 10:46:16,476] [DEBUG] [dummy_inputs.py:45:get_dummy_inputs] Using dummy_inputs_func to get dummy inputs
[2024-07-02 10:46:16,680] [DEBUG] [pytorch.py:277:get_user_io_config] Calling get_dec_io_config to get io_config
[2024-07-02 10:46:16,803] [DEBUG] [conversion.py:234:_export_pytorch_model] Converting model on device cpu with dtype None.
[2024-07-02 10:46:18,467] [INFO] [engine.py:954:_run_pass] Pass conversion:OnnxConversion finished in 7.256834 seconds
[2024-07-02 10:46:18,485] [DEBUG] [engine.py:711:_cache_model] Cached model 0_OnnxConversion-df880b77-673bf9e9 to cache\default_workflow\models\0_OnnxConversion-df880b77-673bf9e9.json
[2024-07-02 10:46:18,485] [DEBUG] [engine.py:794:_cache_run] Cached run for df880b77->0_OnnxConversion-df880b77-673bf9e9 into cache\default_workflow\runs\OnnxConversion-df880b77-673bf9e9.json
[2024-07-02 10:46:18,485] [INFO] [engine.py:867:_run_pass] Running pass transformers_optimization:OrtTransformersOptimization
[2024-07-02 10:46:18,493] [INFO] [transformer_optimization.py:178:validate_search_point] Please specify a positive value for opt_level when only_onnxruntime is True
[2024-07-02 10:46:18,493] [WARNING] [engine.py:873:_run_pass] Invalid search point, prune
[2024-07-02 10:46:18,493] [DEBUG] [engine.py:834:_run_passes] Pruned for pass transformers_optimization
[2024-07-02 10:46:18,493] [WARNING] [engine.py:850:_run_passes] Skipping evaluation as model was pruned
[2024-07-02 10:46:18,494] [DEBUG] [engine.py:531:run_search] Step 2 with search point {'conversion': {}, 'transformers_optimization': {'only_onnxruntime': False}, 'insert_beam_search': {}, 'prepost': {}} ...
[2024-07-02 10:46:18,494] [INFO] [engine.py:867:_run_pass] Running pass conversion:OnnxConversion
[2024-07-02 10:46:18,495] [DEBUG] [engine.py:886:_run_pass] Loading model from cache ...
[2024-07-02 10:46:18,497] [INFO] [engine.py:901:_run_pass] Loaded model from cache: 0_OnnxConversion-df880b77-673bf9e9 from cache\default_workflow\runs
[2024-07-02 10:46:18,499] [INFO] [engine.py:867:_run_pass] Running pass transformers_optimization:OrtTransformersOptimization
[2024-07-02 10:46:18,501] [DEBUG] [resource_path.py:156:create_resource_path] Resource path C:\Olive-main\examples\whisper\cache\default_workflow\models\0_OnnxConversion-df880b77-673bf9e9\output_model\encoder_decoder_init is inferred to be of type folder.
[2024-07-02 10:46:18,504] [DEBUG] [resource_path.py:156:create_resource_path] Resource path C:\Olive-main\examples\whisper\cache\default_workflow\models\0_OnnxConversion-df880b77-673bf9e9\output_model\decoder is inferred to be of type folder.
[2024-07-02 10:46:18,561] [DEBUG] [transformer_optimization.py:253:_run_for_config] model_type is set to bart from model attributes
[2024-07-02 10:46:18,561] [DEBUG] [transformer_optimization.py:259:_run_for_config] num_heads is set to 6 from model attributes
[2024-07-02 10:46:18,561] [DEBUG] [transformer_optimization.py:265:_run_for_config] hidden_size is set to 384 from model attributes
[2024-07-02 10:46:20,740] [DEBUG] [transformer_optimization.py:253:_run_for_config] model_type is set to bart from model attributes
[2024-07-02 10:46:20,740] [DEBUG] [transformer_optimization.py:259:_run_for_config] num_heads is set to 6 from model attributes
[2024-07-02 10:46:20,740] [DEBUG] [transformer_optimization.py:265:_run_for_config] hidden_size is set to 384 from model attributes
[2024-07-02 10:46:22,043] [INFO] [engine.py:954:_run_pass] Pass transformers_optimization:OrtTransformersOptimization finished in 3.542157 seconds
[2024-07-02 10:46:22,059] [DEBUG] [engine.py:711:_cache_model] Cached model 1_OrtTransformersOptimization-0-223aa855-gpu-dml to cache\default_workflow\models\1_OrtTransformersOptimization-0-223aa855-gpu-dml.json
[2024-07-02 10:46:22,065] [DEBUG] [engine.py:794:_cache_run] Cached run for 0_OnnxConversion-df880b77-673bf9e9->1_OrtTransformersOptimization-0-223aa855-gpu-dml into cache\default_workflow\runs\OrtTransformersOptimization-0-223aa855-gpu-dml.json
[2024-07-02 10:46:22,067] [INFO] [engine.py:867:_run_pass] Running pass insert_beam_search:InsertBeamSearch
[2024-07-02 10:46:22,069] [DEBUG] [resource_path.py:156:create_resource_path] Resource path C:\Olive-main\examples\whisper\cache\default_workflow\models\1_OrtTransformersOptimization-0-223aa855-gpu-dml\output_model\encoder_decoder_init is inferred to be of type folder.
[2024-07-02 10:46:22,073] [DEBUG] [resource_path.py:156:create_resource_path] Resource path C:\Olive-main\examples\whisper\cache\default_workflow\models\1_OrtTransformersOptimization-0-223aa855-gpu-dml\output_model\decoder is inferred to be of type folder.
[2024-07-02 10:46:22,566] [WARNING] [insert_beam_search.py:280:chain_model] DecoderMaskedMultiHeadAttention could not be applied to whisper decoder subgraph
[2024-07-02 10:46:23,084] [DEBUG] [insert_beam_search.py:302:chain_model] Using IR version 8 for chained model
[2024-07-02 10:46:24,877] [INFO] [engine.py:954:_run_pass] Pass insert_beam_search:InsertBeamSearch finished in 2.810152 seconds
[2024-07-02 10:46:24,880] [DEBUG] [engine.py:711:_cache_model] Cached model 2_InsertBeamSearch-1-e941a2d8 to cache\default_workflow\models\2_InsertBeamSearch-1-e941a2d8.json
[2024-07-02 10:46:24,881] [DEBUG] [engine.py:794:_cache_run] Cached run for 1_OrtTransformersOptimization-0-223aa855-gpu-dml->2_InsertBeamSearch-1-e941a2d8 into cache\default_workflow\runs\InsertBeamSearch-1-e941a2d8.json
[2024-07-02 10:46:24,883] [INFO] [engine.py:867:_run_pass] Running pass prepost:AppendPrePostProcessingOps
[2024-07-02 10:46:24,885] [DEBUG] [resource_path.py:156:create_resource_path] Resource path C:\Olive-main\examples\whisper\cache\default_workflow\models\2_InsertBeamSearch-1-e941a2d8\output_model\model_with_beam_search.onnx is inferred to be of type file.
[2024-07-02 10:46:24,886] [DEBUG] [resource_path.py:156:create_resource_path] Resource path C:\Olive-main\examples\whisper\cache\default_workflow\models\2_InsertBeamSearch-1-e941a2d8\output_model\model_with_beam_search.onnx is inferred to be of type file.
[2024-07-02 10:46:26,135] [INFO] [engine.py:954:_run_pass] Pass prepost:AppendPrePostProcessingOps finished in 1.248883 seconds
[2024-07-02 10:46:26,138] [DEBUG] [engine.py:711:_cache_model] Cached model 3_AppendPrePostProcessingOps-2-9e247843 to cache\default_workflow\models\3_AppendPrePostProcessingOps-2-9e247843.json
[2024-07-02 10:46:26,140] [DEBUG] [engine.py:794:_cache_run] Cached run for 2_InsertBeamSearch-1-e941a2d8->3_AppendPrePostProcessingOps-2-9e247843 into cache\default_workflow\runs\AppendPrePostProcessingOps-2-9e247843.json
[2024-07-02 10:46:26,142] [INFO] [engine.py:845:_run_passes] Run model evaluation for the final model...
[2024-07-02 10:46:26,142] [DEBUG] [engine.py:1026:_evaluate_model] Evaluating model ...
[2024-07-02 10:46:26,142] [DEBUG] [resource_path.py:156:create_resource_path] Resource path C:\Olive-main\examples\whisper\cache\default_workflow\models\3_AppendPrePostProcessingOps-2-9e247843\output_model\model_with_beam_search.onnx is inferred to be of type file.
[2024-07-02 10:46:26,144] [DEBUG] [resource_path.py:156:create_resource_path] Resource path C:\Olive-main\examples\whisper\cache\default_workflow\models\3_AppendPrePostProcessingOps-2-9e247843\output_model\model_with_beam_search.onnx is inferred to be of type file.
[2024-07-02 10:46:26,267] [DEBUG] [resource_path.py:156:create_resource_path] Resource path C:\Olive-main\examples\whisper\data is inferred to be of type folder.
[2024-07-02 10:46:27,265] [DEBUG] [ort_inference.py:72:get_ort_inference_session] inference_settings: {'execution_provider': ['DmlExecutionProvider'], 'provider_options': None}
[2024-07-02 10:46:27,265] [DEBUG] [ort_inference.py:111:get_ort_inference_session] Normalized providers: ['DmlExecutionProvider'], provider_options: [{}]
[2024-07-02 10:46:29,195] [WARNING] [engine.py:360:run_accelerator] Failed to run Olive on gpu-dml.
Traceback (most recent call last):
  File "C:\Olive-main\olive\engine\engine.py", line 349, in run_accelerator
    output_footprint = self.run_search(
  File "C:\Olive-main\olive\engine\engine.py", line 534, in run_search
    should_prune, signal, model_ids = self._run_passes(
  File "C:\Olive-main\olive\engine\engine.py", line 846, in _run_passes
    signal = self._evaluate_model(model_config, model_id, data_root, evaluator_config, accelerator_spec)
  File "C:\Olive-main\olive\engine\engine.py", line 1052, in _evaluate_model
    signal = self.target.evaluate_model(model_config, data_root, metrics, accelerator_spec)
  File "C:\Olive-main\olive\systems\local.py", line 47, in evaluate_model
    return evaluator.evaluate(model, data_root, metrics, device=device, execution_providers=execution_providers)
  File "C:\Olive-main\olive\evaluator\olive_evaluator.py", line 205, in evaluate
    metrics_res[metric.name] = self._evaluate_latency(
  File "C:\Olive-main\olive\evaluator\olive_evaluator.py", line 123, in _evaluate_latency
    latencies = self._evaluate_raw_latency(
  File "C:\Olive-main\olive\evaluator\olive_evaluator.py", line 763, in _evaluate_raw_latency
    return self._evaluate_onnx_latency(model, metric, dataloader, post_func, device, execution_providers)
  File "C:\Olive-main\olive\evaluator\olive_evaluator.py", line 544, in _evaluate_onnx_latency
    latencies = session.time_run(
  File "C:\Olive-main\olive\common\ort_inference.py", line 334, in time_run
    self.session.run(None, input_feed)
  File "C:\anaconda3\envs\whisper-test\lib\site-packages\onnxruntime\capi\onnxruntime_inference_collection.py", line 220, in run
    return self._sess.run(output_names, input_feed, run_options)
onnxruntime.capi.onnxruntime_pybind11_state.RuntimeException: [ONNXRuntimeError] : 6 : RUNTIME_EXCEPTION : Non-zero status code returned while running WhisperBeamSearch node. Name:'BeamSearch_node' Status Message: Non-zero status code returned while running Conv node. Name:'/whisper_encoder/encoder/conv1/Conv' Status Message: C:\a_work\1\s\onnxruntime\core\providers\dml\DmlExecutionProvider\src\MLOperatorAuthorImpl.cpp(2557)\onnxruntime_pybind11_state.pyd!00007FFC4E4A2689: (caller: 00007FFC4EBF5261) Exception(3) tid(1305c) 80070057 The parameter is incorrect.
```
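Since the DML Conv kernel is rejecting its parameters ("The parameter is incorrect"), dumping the failing Conv node and its weight shapes might help localize the problem. A sketch using the `onnx` package, assuming the chained model path; the Conv lives inside the WhisperBeamSearch subgraphs, so the search recurses into graph attributes:

```python
# Hedged sketch: find '/whisper_encoder/encoder/conv1/Conv' inside the chained
# beam-search model and print its attributes and initializer shapes.
# The model path is an assumption; point it at the actual Olive output.
import onnx

model = onnx.load("models/whisper_dml_fp32/model.onnx", load_external_data=False)

def find_node(graph, name):
    """Depth-first search through subgraph attributes (e.g. BeamSearch bodies)."""
    for node in graph.node:
        if node.name == name:
            return graph, node
        for attr in node.attribute:
            if attr.type == onnx.AttributeProto.GRAPH:
                found = find_node(attr.g, name)
                if found is not None:
                    return found
    return None

graph, conv = find_node(model.graph, "/whisper_encoder/encoder/conv1/Conv")
print(conv)  # kernel_shape, strides, pads, ...
initializers = {t.name: t for t in graph.initializer}
for name in conv.input:
    if name in initializers:
        print(name, list(initializers[name].dims))
```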

Other information

  • OS: Windows 11
  • Olive version: 0.7.0
  • ONNXRuntime package and version: onnxruntime-directml==1.18.0
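As a quick sanity check that this onnxruntime-directml build actually registers the DML EP (a sketch):

```python
import onnxruntime as ort

print(ort.__version__)                # expect 1.18.0
print(ort.get_available_providers())  # should include 'DmlExecutionProvider'
```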

Additional context
The end of the ort log file:

```
2024-07-02 10:46:29.0995099 [V:onnxruntime:, session_state.cc:126 onnxruntime::SessionState::CreateGraphInfo] SaveMLValueNameIndexMapping
2024-07-02 10:46:29.1003795 [V:onnxruntime:, session_state.cc:172 onnxruntime::SessionState::CreateGraphInfo] Done saving OrtValue mappings.
2024-07-02 10:46:29.1007495 [I:onnxruntime:, allocation_planner.cc:2442 onnxruntime::IGraphPartitioner::CreateGraphPartitioner] Use DeviceBasedPartition as default
2024-07-02 10:46:29.1062344 [I:onnxruntime:, session_state_utils.cc:201 onnxruntime::session_state_utils::SaveInitializedTensors] Saving initialized tensors.
2024-07-02 10:46:29.1529584 [I:onnxruntime:, session_state_utils.cc:345 onnxruntime::session_state_utils::SaveInitializedTensors] Done saving initialized tensors
2024-07-02 10:46:29.1543346 [I:onnxruntime:, inference_session.cc:2033 onnxruntime::InferenceSession::Initialize] Session successfully initialized.
2024-07-02 10:46:29.2071764 [E:onnxruntime:, sequential_executor.cc:516 onnxruntime::ExecuteKernel] Non-zero status code returned while running Conv node. Name:'/whisper_encoder/encoder/conv1/Conv' Status Message: C:\a_work\1\s\onnxruntime\core\providers\dml\DmlExecutionProvider\src\MLOperatorAuthorImpl.cpp(2557)\onnxruntime_pybind11_state.pyd!00007FFC4E4A2689: (caller: 00007FFC4EBF5261) Exception(3) tid(1305c) 80070057 The parameter is incorrect.
2024-07-02 10:46:29.2083647 [E:onnxruntime:, sequential_executor.cc:516 onnxruntime::ExecuteKernel] Non-zero status code returned while running WhisperBeamSearch node. Name:'BeamSearch_node' Status Message: Non-zero status code returned while running Conv node. Name:'/whisper_encoder/encoder/conv1/Conv' Status Message: C:\a_work\1\s\onnxruntime\core\providers\dml\DmlExecutionProvider\src\MLOperatorAuthorImpl.cpp(2557)\onnxruntime_pybind11_state.pyd!00007FFC4E4A2689: (caller: 00007FFC4EBF5261) Exception(3) tid(1305c) 80070057 The parameter is incorrect.
```

I checked the line reporting the error:
https://github.com/microsoft/onnxruntime/blob/7be1d4aad3f984ebe2c4fb0f7db0b9ca67cc8964/onnxruntime/core/providers/dml/DmlExecutionProvider/src/MLOperatorAuthorImpl.cpp#L2557

If I remove the "enable_skip_layer_norm": false option or set "use_multi_head_attention" to false, I get the error previously reported in #1213 (comment). If I instead set "use_gpu": false for InsertBeamSearch, the run fails silently: it aborts after logging "[I:onnxruntime:, session_state_utils.cc:345 onnxruntime::session_state_utils::SaveInitializedTensors] Done saving initialized tensors" and never reaches "[I:onnxruntime:, inference_session.cc:2033 onnxruntime::InferenceSession::Initialize] Session successfully initialized." in the ort output log.
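To separate a bad export from a DML-specific kernel issue, it may also be worth loading the same chained model on the CPU EP; if it transcribes correctly there, the Conv failure is specific to DmlExecutionProvider. A sketch, reusing the hypothetical path from the repro sketch above:

```python
# Hedged sketch: load the same model on the CPU EP and verify its inputs;
# rerunning the same input feed here should show whether the graph itself
# is sound. The model path is an assumption.
import onnxruntime as ort

sess = ort.InferenceSession(
    "models/whisper_dml_fp32/model.onnx",
    providers=["CPUExecutionProvider"],
)
print([i.name for i in sess.get_inputs()])  # confirm the expected input names
```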
