Hanging while computing principal_components #3331
Comments
Can you try with this?
Thanks for the suggestion — unfortunately, still freezes. Here are the results from some work I've done trying to narrow down the cause. It seems to me that something about My test script consisted of three steps:
I confirmed that this behavior is still the case with both these env vars set:
os.environ['NUMEXPR_MAX_THREADS'] = '1'
os.environ["OPENBLAS_NUM_THREADS"] = "1"
Here is the memory leak warning (I only get this with n_jobs=1 everywhere):
I poked around in the Let me know if you have other debugging ideas! Going to try the
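For anyone reproducing this: thread-limit environment variables like the two above are generally read when the native libraries are loaded, so they usually need to be set before numpy / scipy are imported. Below is a minimal sketch of that ordering. The helper name `limit_native_threads` is hypothetical, and `MKL_NUM_THREADS` / `OMP_NUM_THREADS` are additions that were not part of the original test.

```python
import os

# The first two names come from the comment above; the last two are common
# additions and an assumption about which backends might be present.
THREAD_ENV_VARS = (
    "NUMEXPR_MAX_THREADS",
    "OPENBLAS_NUM_THREADS",
    "MKL_NUM_THREADS",
    "OMP_NUM_THREADS",
)


def limit_native_threads(n_threads=1):
    """Hypothetical helper: cap native thread pools via environment variables."""
    for name in THREAD_ENV_VARS:
        os.environ[name] = str(n_threads)
    # Return what was set so it can be logged next to the run.
    return {name: os.environ[name] for name in THREAD_ENV_VARS}


# Call this at the very top of the script, before numpy / scipy /
# scikit-learn are imported anywhere.
settings = limit_native_threads(1)
print(settings)
```

If the hang persists with all of these set to 1, as reported here for the first two, that suggests the thread count alone may not be the cause.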
If it helps, this is what the traceback looks like when I keyboard interrupt the hung PCA: Traceback
Ah, and if I force
@alejoe91: we should use threadpool_limits in
Yes, but that is very tricky given the current implementation!
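For context on the `threadpool_limits` suggestion: this is the context manager from the threadpoolctl package, which caps BLAS / OpenMP thread pools at runtime instead of through environment variables. The sketch below only shows the general pattern. The wrapper `run_single_threaded` is hypothetical and is not how the library applies it internally; the fallback to a no-op context when threadpoolctl is not installed is also an assumption, made for illustration.

```python
from contextlib import nullcontext

try:
    from threadpoolctl import threadpool_limits
except ImportError:  # threadpoolctl not installed: fall back to a no-op
    threadpool_limits = None


def run_single_threaded(func, *args, **kwargs):
    """Hypothetical wrapper: call func with native thread pools capped at 1."""
    if threadpool_limits is not None:
        ctx = threadpool_limits(limits=1)
    else:
        ctx = nullcontext()
    with ctx:
        # Any BLAS/OpenMP-backed work inside func runs with at most one
        # thread per pool while the context is active.
        return func(*args, **kwargs)


# Stand-in workload; in practice this would be the per-unit PCA computation.
result = run_single_threaded(sum, [1, 2, 3])
print(result)
```

The limit has to be applied inside each worker process, since every process loads its own thread pools, which may be part of why wiring it into an existing parallel implementation is tricky.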
Similar to #2689, I'm having an issue where computing the principal_component quality metric hangs at 0% on Linux when run as part of a script. As in that issue, it seems to require multiple parallel computations to occur; if I quit the hung process and re-run it, the PCA gets computed no problem and everything runs smoothly from there.
Unlike that issue, prepending MKL_THREADING_LAYER=TBB to the call to my script didn't help (at least, not when passed through SLURM).
Attached is my conda env export. You can see that BLAS / MKL is there, but when I followed ChatGPT's advice to check whether it was being used in numpy / scipy, nothing came up:
output was:
I'm going to start debugging by cloning my conda env and trying to force the clone not to use MKL with
conda install nomkl numpy scipy scikit-learn numexpr
(again, h/t ChatGPT). If that doesn't work, I guess I could copy #2689 and try switching to joblib in certain parts of the code... Other suggestions and ideas welcome :) Thanks!
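A possible way to double-check which BLAS / OpenMP libraries are actually loaded at runtime (rather than just installed in the env) is `threadpool_info()` from threadpoolctl. This is a rough sketch and not the check that was run above; the function name `loaded_threadpool_libraries` is hypothetical, and it returns an empty list if threadpoolctl is not installed.

```python
def loaded_threadpool_libraries():
    """Hypothetical helper: list native thread-pool libraries in this process."""
    try:
        import numpy  # noqa: F401  imported only so its BLAS gets loaded
    except ImportError:
        pass
    try:
        from threadpoolctl import threadpool_info
    except ImportError:
        return []
    keys = ("user_api", "internal_api", "num_threads", "filepath")
    # threadpool_info() only reports libraries already loaded, hence the
    # numpy import above.
    return [{key: lib.get(key) for key in keys} for lib in threadpool_info()]


libraries = loaded_threadpool_libraries()
for lib in libraries:
    print(lib)

# "mkl" is the internal_api value threadpoolctl reports for Intel MKL.
uses_mkl = any(lib["internal_api"] == "mkl" for lib in libraries)
print("MKL loaded:", uses_mkl)
```

Running this in both the original env and the nomkl clone would show whether the clone really ends up on a different BLAS.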