[WIP] A tensor parallel API for beginners #40

Merged
merged 13 commits into from
Oct 31, 2023

@siddharth9820 commented Oct 10, 2023

  • Attempting to create a tensor parallel API that requires minimal changes to the model definition.
  • Using the popular nanoGPT benchmark as a testbed: see "Add tensor parallelism" (nanoGPT#1).
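
For readers new to the idea, here is a minimal sketch of what a tensor parallel linear layer computes. It is not AxoNN's implementation: the "ranks" are simulated in a single process with plain Python lists, and every name (`shard_rows`, `parallel_linear`, `full_linear`) is hypothetical. It assumes the weight is split along its output features, so each rank holds a slice of the rows (and of the bias, if there is one), and the per-rank outputs are concatenated where a real implementation would use an all-gather.

```python
# Illustrative only: tensor-parallel "ranks" are simulated in one process.
# None of these names come from AxoNN.

def matvec(weight, x):
    """Return weight @ x for a row-major list-of-lists weight."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in weight]


def shard_rows(weight, bias, world_size):
    """Split output features (rows of weight, entries of bias) evenly across ranks."""
    out_features = len(weight)
    assert out_features % world_size == 0, "out_features must divide evenly"
    step = out_features // world_size
    shards = []
    for rank in range(world_size):
        lo, hi = rank * step, (rank + 1) * step
        local_bias = bias[lo:hi] if bias is not None else None
        shards.append((weight[lo:hi], local_bias))
    return shards


def parallel_linear(shards, x):
    """Each rank computes its slice of the output.

    Concatenating the slices stands in for an all-gather across ranks.
    """
    out = []
    for local_weight, local_bias in shards:
        local_out = matvec(local_weight, x)
        if local_bias is not None:
            local_out = [y + b for y, b in zip(local_out, local_bias)]
        out.extend(local_out)
    return out


def full_linear(weight, bias, x):
    """Reference: the same layer computed on a single device."""
    out = matvec(weight, x)
    if bias is not None:
        out = [y + b for y, b in zip(out, bias)]
    return out
```

With a 4x2 weight split over two simulated ranks, `parallel_linear(shard_rows(weight, bias, 2), x)` returns the same values as `full_linear(weight, bias, x)`, with or without a bias. Splitting along the input features instead would need a sum (all-reduce) of partial outputs rather than a concatenation.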

@siddharth9820
Collaborator Author

To-dos:

  • Add grad normalization support in axonn/intra_layer
  • Make nn.kaiming_init the default in axonn/intra_layer/fully_connected
  • Change Tensor_Parallel_Linear to Linear

@siddharth9820 added the WIP (Work in progress) label Oct 18, 2023
@siddharth9820 added this to the v0.2.0 milestone Oct 18, 2023
@siddharth9820
Collaborator Author

Added CI tests for:

  1. Easy tensor parallelism
  2. Gradient clipping
  3. Linear layers with bias=True and bias=False
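
As background on why gradient clipping needs its own test under tensor parallelism: each rank only sees its shard of the gradient, so the clipping norm has to be assembled from per-rank pieces. Below is a rough sketch of that idea, not the code in axonn/intra_layer. The function names are hypothetical, the shards are plain lists in one process, and the Python `sum` over ranks stands in for an all-reduce. It also assumes every parameter is sharded exactly once; parameters replicated across ranks would be double-counted by this version.

```python
import math

# Illustrative only: simulates per-rank gradient shards in one process.
# None of these names come from AxoNN.

def global_grad_norm(grad_shards):
    """L2 norm of the full gradient, computed from per-rank shards.

    Each rank computes the squared norm of its own shard; summing those
    values across ranks stands in for an all-reduce (sum).
    """
    local_sq = [sum(g * g for g in shard) for shard in grad_shards]
    return math.sqrt(sum(local_sq))


def clip_grad_shards(grad_shards, max_norm, eps=1e-6):
    """Scale every shard by the same factor so the global norm is at most max_norm."""
    total_norm = global_grad_norm(grad_shards)
    # Same convention as common clip-by-norm utilities: never scale up.
    scale = min(1.0, max_norm / (total_norm + eps))
    clipped = [[g * scale for g in shard] for shard in grad_shards]
    return clipped, total_norm
```

The point of the sketch is that all ranks must apply the same scale factor, derived from the global norm. Clipping each shard against its local norm would change the direction of the full gradient.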

@siddharth9820 merged commit 08d46d4 into develop Oct 31, 2023
6 checks passed
@siddharth9820 deleted the easy-tensor-parallelism branch October 31, 2023 16:16
Avuxon pushed a commit that referenced this pull request Jan 25, 2024
* Easy TP that works with hf models