
[PyTorch] Update to 2.4.0, add distributed #1510

Merged · 96 commits · Sep 1, 2024

Conversation

@HGuillemet (Collaborator) commented Jun 7, 2024

Included in this PR:

  • Update to PyTorch 2.4.0
  • Add distributed framework (backend Gloo)
  • Add a minimal binding for std::chrono, needed by distributed (TBC)
  • Add an adapter for intrusive_ptr, enabling transparent usage like for shared_ptr
  • Add an adapter for weak_ptr, enabling transparent usage (could be moved to JavaCPP?)
  • Add an optional Maven dependency on cuda. This allows the use of types defined in the cuda presets. Users calling a method of the PyTorch presets that takes one of these types should then add the cuda presets as a dependency. This also means the cuda libraries of the presets will be used, not the system ones (unless the user sets org.bytedeco.javacpp.pathsFirst and their system cuda has the same version as the cuda presets).
  • Merge functions packages to main package
  • Fix [PyTorch 2.2.2-1.5.11-SNAPSHOT] Training produces poor MNIST model on Windows #1503
  • Fix [PyTorch] Training is very slow on Linux. #1504
  • Fix support for OpenMP on macOS
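The optional cuda dependency described above can be declared in a project's pom.xml roughly as follows. This is a sketch: the cuda presets version string shown is illustrative, based on the usual JavaCPP versioning scheme, and is not taken from this PR.

```xml
<!-- Sketch: pulls the PyTorch presets plus the cuda presets so that
     CUDA types from org.bytedeco.cuda.* are resolvable at runtime.
     The cuda version string below is illustrative, not from this PR. -->
<dependency>
  <groupId>org.bytedeco</groupId>
  <artifactId>pytorch-platform-gpu</artifactId>
  <version>2.4.0-1.5.11-SNAPSHOT</version>
</dependency>
<dependency>
  <groupId>org.bytedeco</groupId>
  <artifactId>cuda-platform-redist</artifactId>
  <version>12.6-9.5-1.5.11-SNAPSHOT</version>
</dependency>
```

To prefer a system CUDA installation over the bundled libraries, pass -Dorg.bytedeco.javacpp.pathsFirst=true to the JVM, keeping in mind the version caveat above.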

Remains to be done:

  • Determine whether other distributed backends are needed (MPI or UCC; NCCL is not yet supported on Windows).

@HGuillemet HGuillemet marked this pull request as draft June 7, 2024 20:59
@HGuillemet (Collaborator, author) commented:

PyTorch uses CUPTI for profiling, but it's not provided by the cuda presets. Could we add it?

@saudet (Member) commented Aug 19, 2024

Sure, feel free to give it a try.

@HGuillemet HGuillemet marked this pull request as ready for review August 24, 2024 20:30
@HGuillemet (Collaborator, author) commented:

Ready on my side. We can look at an MPI or UCC distributed backend later if needed.
Artifacts for this PR are available on the usual snapshot repository as org.bytedeco:pytorch-platform-gpu:2.4.0-1.5.11-SNAPSHOT or org.bytedeco:pytorch-platform:2.4.0-1.5.11-SNAPSHOT.
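To resolve these snapshot artifacts, the snapshot repository has to be enabled in the build. A minimal Maven sketch follows; the repository URL is assumed from common JavaCPP snapshot practice and is not stated in this PR.

```xml
<!-- Sketch: enables snapshot resolution for the coordinates above.
     The repository URL is an assumption (usual JavaCPP snapshot host). -->
<repositories>
  <repository>
    <id>sonatype-snapshots</id>
    <url>https://oss.sonatype.org/content/repositories/snapshots/</url>
    <snapshots><enabled>true</enabled></snapshots>
  </repository>
</repositories>
<dependencies>
  <dependency>
    <groupId>org.bytedeco</groupId>
    <artifactId>pytorch-platform</artifactId>
    <version>2.4.0-1.5.11-SNAPSHOT</version>
  </dependency>
</dependencies>
```

Swap in pytorch-platform-gpu for the CUDA-enabled build.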

@saudet saudet merged commit dc8e6a5 into bytedeco:master Sep 1, 2024
7 checks passed