Generic "parent device lost" error/crash after #6092 #6279

Open
ArthurBrussee opened this issue Sep 15, 2024 · 3 comments
Labels
api: vulkan Issues with Vulkan platform: windows Issues with integration with windows type: bug Something isn't working
Comments

@ArthurBrussee

Description
After updating an egui + burn app to wgpu master, I observed random crashes with a generic "validation failed: parent device lost" error. Nothing in particular seems to cause the crash; even just drawing the app while firing off empty submit() calls was enough to trigger it.

After a bisection it seems to come down to this specific commit: ce9c9b7

It doesn't look that suspicious but does touch some raw pointers so idk... I definitely can't tell what's wrong anyway.

I can try to work on a smaller repro than "draw an egui app while another thread fires off submit()" but maybe this already gives enough hints.

Thanks for having a look!

Platform
Vulkan + Windows

@Wumpf
Member

Wumpf commented Sep 15, 2024

I think that's related to

which is about to get fixed by

However, after the above PR this is still a validation error, which it almost certainly shouldn't be; see the comment thread at #6253 (comment).

Anyways, I'm not certain all of this is the case, and device lost itself shouldn't happen in the first place, so if you have more information about your system and an as-minimal-as-reasonably-possible repro, that would be great!

EDIT: Didn't pay enough attention to the fact that this is on wgpu-master only and dismissed the bisect result too quickly. That's quite curious, maybe some object lifetime got messed up 🤔. cc: @teoxoy
Put it on v23 milestone to ensure this gets a look before the next release

@Wumpf Wumpf added type: bug Something isn't working api: vulkan Issues with Vulkan platform: windows Issues with integration with windows labels Sep 15, 2024
@Wumpf Wumpf added this to the v23 milestone Sep 15, 2024
@ArthurBrussee
Author

Thanks for the quick reply! I don't understand the internals here, so it's quite possible this is related to the linked issues, but it's not at startup, so I think it could be different. I didn't do a good job describing some other symptoms:

  • the app runs for a while before crashing
  • this "while" seems random each run
  • the more work I add to the submit queue on the background thread, the faster the crash happens
  • call stack just points to the submit call

So it seems like something race-y perhaps. Will try a single threaded setup at some point.

Otherwise, some more specs: a 4070 GPU (on Optimus, or whatever it's called now). I haven't tried other GPUs yet.

@ErichDonGubler ErichDonGubler changed the title Generice "parent device lost" error/crash after #6092 Generic "parent device lost" error/crash after #6092 Sep 16, 2024
@jimblandy
Member

Do we have steps to reproduce this?


3 participants