
[Web] Demucs model won't run in both WASM and WGPU #22031

Open
gianlourbano opened this issue Sep 9, 2024 · 6 comments
Labels
ep:WebGPU ort-web webgpu provider platform:web issues related to ONNX Runtime web; typically submitted using template

Comments

@gianlourbano

Describe the issue

I converted the model from PyTorch to ONNX as described here, with some issues. The model works with ONNX in Python, but in WASM/WebGPU the runtime dies without an error. The optimized version of the model runs in WASM, but not in WebGPU. I don't know whether this problem is related to the model conversion or to the runtime. I have tested with both @latest and @dev.

To reproduce

Here's a link to a sample repo; instructions are in the README.

Urgency

Urgent, as this project is related to my thesis.

ONNX Runtime Installation

Released Package

ONNX Runtime Version or Commit ID

1.19.2, 1.20.0-dev.20240907-ad9afbb042

Execution Provider

'wasm'/'cpu' (WebAssembly CPU), 'webgpu' (WebGPU)

@gianlourbano gianlourbano added the platform:web issues related to ONNX Runtime web; typically submitted using template label Sep 9, 2024
@github-actions github-actions bot added the ep:WebGPU ort-web webgpu provider label Sep 9, 2024
@gyagp
Contributor

gyagp commented Sep 10, 2024

For the WebGPU EP, the problem is related to the op Unsqueeze. According to the ONNX spec (https://onnx.ai/onnx/operators/onnx__Unsqueeze.html), the axes input of Unsqueeze is a list of integers, but in your model it's just a scalar "1".

@gianlourbano
Author

So the problem is related to the dynamo export of torch?

@fs-eire
Copy link
Contributor

fs-eire commented Sep 11, 2024

Technically, axes should always be a 1-D tensor. However, in practice the CPU implementation has loosened that restriction:

https://github.com/microsoft/onnxruntime/blob/main/onnxruntime/core/providers/cpu/tensor/unsqueeze.cc#L60-L62

Perhaps WebGPU should have the same behavior as the CPU EP.
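The CPU EP's leniency amounts to treating a rank-0 `axes` as a one-element list. A minimal NumPy sketch of that behavior (an illustration of the semantics, not the ORT code):

```python
# Sketch: emulate Unsqueeze while tolerating a scalar 'axes', the way the
# CPU EP does, by promoting rank-0 axes to a one-element list.
import numpy as np


def lenient_unsqueeze(data: np.ndarray, axes) -> np.ndarray:
    axes = np.atleast_1d(np.asarray(axes, dtype=np.int64))  # scalar -> [scalar]
    out_rank = data.ndim + axes.size
    # Normalize negative axes against the output rank, then insert in order.
    for a in sorted(int(a) % out_rank for a in axes):
        data = np.expand_dims(data, a)
    return data
```

With this leniency, `axes=1` and `axes=[1]` produce the same result, which is why the unpatched Demucs model runs on the CPU EP but not on WebGPU.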

#22054

gyagp added a commit to gyagp/onnxruntime that referenced this issue Sep 12, 2024
This is to fix issue microsoft#22031 to run model demucs.
For conv-transpose, outputPadding.length could be 1, while spatialRank
is 2. The fix is to append enough 0s to outputPadding.
For conv, the issue is similar. kernelShape.length sometimes could be 1,
while inputs[1].dims.length is 4. The fix is also to append enough 0s to
kernelShape.
fs-eire pushed a commit that referenced this issue Sep 17, 2024
This is to fix issue #22031 to run model demucs.
For conv-transpose, outputPadding.length could be 1, while spatialRank
is 2. The fix is to append enough 0s to outputPadding. For conv, the
issue is similar. kernelShape.length sometimes could be 1, while
inputs[1].dims.length is 4. The fix is also to append enough 0s to
kernelShape.
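The fix described in the commit message boils down to right-padding a too-short attribute list with zeros until it matches the spatial rank. A tiny sketch of that idea (a hypothetical helper, not the actual ort-web code):

```python
# Sketch of the zero-padding fix: extend a short attribute list (e.g.
# outputPadding or kernelShape) with zeros up to the expected length.
def pad_attr_with_zeros(attr: list[int], expected_len: int) -> list[int]:
    """Return attr right-padded with zeros to expected_len (no-op if long enough)."""
    return list(attr) + [0] * max(0, expected_len - len(attr))
```

For example, an `outputPadding` of `[0]` with a spatial rank of 2 becomes `[0, 0]`, so downstream per-dimension indexing no longer reads past the end of the list.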
@gianlourbano
Author

@gyagp With the latest 1.20.0-dev.20240917-afd642a194, which should include both fixes, I still cannot run the model in WebGPU; the runtime just aborts after displaying the WebGPU experimental warning.

@gyagp
Contributor

gyagp commented Sep 19, 2024

I also hit some issues with the latest code, and I will take a further look.
BTW, I manually modified the model to work around the Unsqueeze issue before, and that model seems to run. I uploaded it to https://huggingface.co/webai-community/models/tree/main (click "download file" next to demucs.onnx).

@gianlourbano
Author

Your model successfully runs with the latest @dev, with timings (60 s of audio in 10 s chunks):

wasm:
step 0: 12656 ms
step 1: 12864 ms
step 2: 13211 ms
step 3: 13164 ms
step 4: 13643 ms
step 5: 13687 ms

wgpu:
step 0: 10226 ms
step 1: 9612 ms
step 2: 9628 ms
step 3: 9647 ms
step 4: 9600 ms
step 5: 9562 ms

onnx python cpu:
step 0: 4.9 s
step 1: 4.9 s
step 2: 4.6 s
step 3: 4.9 s
step 4: 4.8 s
step 5: 4.6 s


3 participants