Example of a config file for task_arithmetic 'negative' operation and a case for 'Task analogies' #400

Open
eunbin079 opened this issue Aug 19, 2024 · 1 comment

eunbin079 commented Aug 19, 2024

The Task Arithmetic paper referenced in the README describes two operations:

1. Applying a negative operation to the model weights to mitigate a specific behavior.
2. Using task analogy (Task Vector D ≈ Task Vector C + (Task Vector B − Task Vector A)).

Are there example configs for these?

Details

Negative Operation

  • I have a base model 'A' and a fine-tuned model 'a'.
  • 'A' and 'a' have the same model architecture.
  • How should I specify the 'negative' operation in the config below?
  • negative.yml
models:
  - model: A
  - model: a
merge_method: task_arithmetic
base_model: A
dtype: float16
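
To make the intended arithmetic concrete, here is a minimal Python sketch of the negation case on toy weights. It is not MergeKit's API: the helper names `task_vector` and `apply_task_vector` and the numbers are made up for illustration. It only shows that "negative" means adding the task vector ('a' − 'A') back to 'A' with weight −1.

```python
# Toy "models": parameter name -> list of floats (illustrative values only).
A = {"w": [1.0, 2.0, 3.0]}  # base model
a = {"w": [1.5, 1.0, 3.0]}  # model fine-tuned from A


def task_vector(finetuned, base):
    """Element-wise difference finetuned - base for every parameter."""
    return {k: [f - b for f, b in zip(finetuned[k], base[k])] for k in base}


def apply_task_vector(base, tv, weight):
    """Return base + weight * tv, element-wise."""
    return {k: [b + weight * t for b, t in zip(base[k], tv[k])] for k in base}


tv = task_vector(a, A)                            # a - A
negated = apply_task_vector(A, tv, weight=-1.0)   # A - (a - A)
restored = apply_task_vector(A, tv, weight=1.0)   # A + (a - A), i.e. a again
print(negated)
```

With weight +1 the same helper reproduces 'a', so the sign of the weight is the only difference between ordinary task addition and negation.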

Task Analogy

  • I have a base model 'A', a fine-tuned model 'a' (derived from 'A'), and another base model 'B'.
  • I want to obtain a fine-tuned model 'b' (= 'B' + ('a' − 'A')) by applying the task vector ('a' − 'A') to 'B'.
  • The relationship between 'A' and 'a' is similar to the relationship between 'B' and 'b'.
  • Do I need to create the 'a' − 'A' task vector myself, or does MergeKit generate it internally?
  • How should I represent the 'a' − 'A' task vector in the config?
  • How can the above be reflected in the YAML file?
  • analogy.yml
models:
  - model: B
  - model: ??
merge_method: task_arithmetic
base_model: B 
dtype: float16
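
For reference, the analogy I am after can be sketched in plain Python on toy weights. This is not MergeKit code: `add_task_vector` and the values are hypothetical, and the sketch assumes 'A', 'a', and 'B' all have identical parameter names and shapes, since the arithmetic is element-wise.

```python
# Toy "models": parameter name -> list of floats (illustrative values only).
A = {"w": [1.0, 2.0]}  # base model A
a = {"w": [1.5, 1.0]}  # fine-tuned from A
B = {"w": [4.0, 8.0]}  # a different base model with the same layout


def add_task_vector(target, finetuned, base, weight=1.0):
    """Return target + weight * (finetuned - base), element-wise."""
    if not (target.keys() == finetuned.keys() == base.keys()):
        raise ValueError("models must have the same parameter names")
    out = {}
    for k in target:
        if not (len(target[k]) == len(finetuned[k]) == len(base[k])):
            raise ValueError(f"shape mismatch for parameter {k!r}")
        out[k] = [t + weight * (f - b)
                  for t, f, b in zip(target[k], finetuned[k], base[k])]
    return out


b = add_task_vector(B, a, A)  # b = B + (a - A)
print(b)
```

The open question is how to express the same 'B' + ('a' − 'A') computation in a MergeKit config, where the task vector is taken relative to 'A' but applied on top of 'B'.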

Any help would be appreciated.
Thank you.

@NextGenOP

Have you tried mergekit-mega? As I understand it, it supports multiple merges at once.
