Enable Ascend NPU support #1758
base: main
Conversation
src/axolotl/utils/models.py (Outdated)
```diff
  max_memory = None
- model_kwargs["device_map"] = device_map
+ set_model_device(cfg, max_memory, model_config, model_kwargs, device_map)
```
The way Python passes `model_kwargs` by reference and updates it inside the function feels a bit awkward here. Not sure right now what a good solution would be to make this more obvious.
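To make the concern concrete, here is a minimal sketch of the in-place mutation pattern being discussed. The function body is an assumption for illustration, not axolotl's actual implementation; only the signature comes from the diff above.

```python
# Sketch of the pattern under review: the helper mutates the caller's dict
# in place, so nothing at the call site signals that model_kwargs changes.
def set_model_device(cfg, max_memory, model_config, model_kwargs, device_map):
    # Assumed body for illustration only.
    model_kwargs["device_map"] = device_map
    if max_memory is not None:
        model_kwargs["max_memory"] = max_memory

model_kwargs = {}
set_model_device(None, None, None, model_kwargs, "auto")
# model_kwargs now holds {"device_map": "auto"} even though the call site
# never assigns to it explicitly -- the "hidden mutation" being criticized.
```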
I think the simplest way is making `model_kwargs` the return value of the `set_model_device` function. Simple, but the effect would be much the same as now, and since `model_kwargs` is itself mutable, this probably makes little sense.

A more complicated solution would be to write a class `ModelKwargs`, with member functions like `__init__`, `update_model_device`, `update_dtype`, `update_attention`, `update_quantization`, and so on. These functions would be called in `load_model`, making `load_model` and the changes to `model_kwargs` clearer. However, this would bring a lot of changes to src/axolotl/utils/models.py and may introduce some issues; time is needed to validate it.
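A rough sketch of the `ModelKwargs` idea described above. All method bodies and names besides those mentioned in the comment are hypothetical, not axolotl's actual code:

```python
# Hypothetical ModelKwargs class: each update_* method owns one concern,
# so load_model reads as a sequence of explicit steps instead of scattered
# in-place dict mutations.
class ModelKwargs:
    def __init__(self):
        self._kwargs = {}

    def update_model_device(self, max_memory, device_map):
        # Decide where the model lives (illustrative logic).
        if max_memory is not None:
            self._kwargs["max_memory"] = max_memory
        self._kwargs["device_map"] = device_map

    def update_quantization(self, load_in_8bit):
        self._kwargs["load_in_8bit"] = bool(load_in_8bit)

    def as_dict(self):
        # Hand a copy to the model loader, e.g. from_pretrained(**kwargs).
        return dict(self._kwargs)
```

With this shape, `load_model` would call `update_model_device`, `update_dtype`, etc. in order, and the final `as_dict()` makes the handoff to the loader explicit.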
Good day! @winglian I tried to create a class as discussed; thus, I finally refactored the whole `load_model`. This brings a lot of changes, while making the model loading pipeline clearer. Moreover, the changes to member variables are now more explicit. Please review the latest code and give me some suggestions. Thanks a lot!
1. add Ascend NPU backend support
2. refactor func load_model in src/axolotl/utils/models.py
3. refactor load_in_8bit as a kwarg
Force-pushed from a9b5ca4 to 8d39332.
Looks like this commit includes two parts: the model loaders refactor and Ascend NPU support. Maybe we could split it into two PRs — the first one would be the model loaders refactor, and then we would rebase the Ascend NPU support PR on top of it. Or do you have any other suggestions? @winglian Please feel free to let us know if you have any more concerns. Thanks!
Description
Enable Ascend NPU backend for finetuning, inferencing and gradio webui.
Main changes:
- device
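As a hedged sketch of what NPU-aware device selection could look like: the helper below prefers Ascend NPU when the `torch_npu` adapter is importable, then falls back to CUDA and CPU. The function name and fallback order are illustrative assumptions, not the PR's actual implementation.

```python
import importlib.util

def get_device_type() -> str:
    """Pick a backend string, preferring Ascend NPU when torch_npu is present.

    Illustrative helper, not axolotl's actual API.
    """
    try:
        import torch
    except ImportError:
        # No torch at all: nothing to accelerate.
        return "cpu"
    if importlib.util.find_spec("torch_npu") is not None:
        # Importing torch_npu registers the "npu" device with torch.
        import torch_npu  # noqa: F401
        if torch.npu.is_available():
            return "npu"
    if torch.cuda.is_available():
        return "cuda"
    return "cpu"
```

A config's `device` field could then be resolved via `get_device_type()` so the same YAML runs on NPU, CUDA, or CPU hosts.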
Motivation and Context
There are two benefits:
Example
Screenshots
NPU supported CLI inference
NPU supported Gradio webui inference
Config
lora.yaml