🐛 Bug
When using DDP with Dynamo+Thunder, we get:
AttributeError: 'Float8Tensor' object has no attribute '_fp8_attrs'
This issue affects the following models:
'dolly-v2-3b', 'Mistral-7B-v0.1', 'tiny-llama-1.1b', 'stablecode-completion-alpha-3b', 'Phi-3-mini-4k-instruct', 'falcon-7b'
To Reproduce
Please use:
1 node with 8 GPUs.
Then execute:
torchrun --standalone --max-restarts=0 --no-python --nproc-per-node=8 python /opt/pytorch/lightning-thunder/thunder/benchmarks/benchmark_litgpt.py \
    --model_name Mistral-7B-v0.1 \
    --distributed_mode ddp \
    --shard_mode None \
    --compile dynamo_thunder \
    --checkpoint_activations False \
    --low_precision_mode fp8-delayed-te \
    --micro_batch_size 1
Expected behavior
We should not get an error.
Environment
system.device_product_name DGXH100
system.gpu_driver_version 535.129.03
libraries.cuda 12.6.1.006
libraries.pip.lightning 2.4.0.dev20240728
libraries.pip.lightning-thunder 0.2.0.dev0
libraries.pip.lightning-utilities 0.11.7
libraries.pip.litgpt 0.4.11
libraries.pip.nvfuser 0.2.10+git91997b3
libraries.pip.pytorch-lightning 2.4.0
libraries.pip.torch 2.5.0a0+git9902b34
libraries.pip.torchmetrics 1.4.1
libraries.pip.torchvision 0.19.0a0+d23a6e1
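To check all of the affected models in one pass, the reproduction command can be wrapped in a small script. This is only a sketch: the helper names (`build_command`, `hits_fp8_attrs_error`, `run_all`) are hypothetical, and it assumes that each model name in the list above is accepted as-is by `--model_name`, which the report only confirms for Mistral-7B-v0.1. The flags and the benchmark path are copied from the command in this report.

```python
import shlex
import subprocess

# Models listed in this report as affected.
AFFECTED_MODELS = [
    "dolly-v2-3b",
    "Mistral-7B-v0.1",
    "tiny-llama-1.1b",
    "stablecode-completion-alpha-3b",
    "Phi-3-mini-4k-instruct",
    "falcon-7b",
]

# Benchmark script path as given in the report.
BENCHMARK = "/opt/pytorch/lightning-thunder/thunder/benchmarks/benchmark_litgpt.py"


def build_command(model_name: str, nproc_per_node: int = 8) -> list[str]:
    """Return the torchrun argument list from the report for one model."""
    return [
        "torchrun", "--standalone", "--max-restarts=0", "--no-python",
        f"--nproc-per-node={nproc_per_node}",
        "python", BENCHMARK,
        "--model_name", model_name,  # assumption: list names are valid here
        "--distributed_mode", "ddp",
        "--shard_mode", "None",
        "--compile", "dynamo_thunder",
        "--checkpoint_activations", "False",
        "--low_precision_mode", "fp8-delayed-te",
        "--micro_batch_size", "1",
    ]


def hits_fp8_attrs_error(output: str) -> bool:
    """True if the captured output contains the AttributeError from this report."""
    return "has no attribute '_fp8_attrs'" in output


def run_all(models=AFFECTED_MODELS) -> dict[str, bool]:
    """Run the benchmark once per model; map model name -> error reproduced."""
    results = {}
    for model in models:
        proc = subprocess.run(build_command(model), capture_output=True, text=True)
        results[model] = hits_fp8_attrs_error(proc.stdout + proc.stderr)
    return results


if __name__ == "__main__":
    # Dry run: print the commands without launching them.
    for model in AFFECTED_MODELS:
        print(shlex.join(build_command(model)))
```

Running the file only prints the six commands. Calling `run_all()` on the 8-GPU node launches them one after another and reports which models still hit the error.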
rel: #1137