Generate Hugging Face model card #58

Merged
cg123 merged 8 commits into main from hf-card
Jan 3, 2024

Conversation

cg123
Collaborator

@cg123 cg123 commented Dec 30, 2023

WIP implementation for #41.

@fakerybakery

Hi, not sure if this is already implemented but FYI Hugging Face confirmed that models with the “merge” tag will be marked as merges.

@cg123 cg123 marked this pull request as ready for review January 2, 2024 20:17
@cg123
Collaborator Author

cg123 commented Jan 2, 2024

@davanstrien Happy new year! Whenever you get back into things, I have a first pass at card generation about ready. I'd appreciate your thoughts on it before I merge it in.

Here's what gets generated for examples/ties.yml:

  ---
  base_model:
  - TheBloke/Llama-2-13B-fp16
  - garage-bAInd/Platypus2-13B
  - psmathur/orca_mini_v3_13b
  - WizardLM/WizardMath-13B-V1.0
  tags:
  - mergekit
  - merge

  ---
  # ties-example

  This is a merge of pre-trained language models created using [mergekit](https://github.com/cg123/mergekit).

  ## Merge Details
  ### Merge Method

  This model was merged using the [TIES](https://arxiv.org/abs/2306.01708) merge method using [TheBloke/Llama-2-13B-fp16](https://huggingface.co/TheBloke/Llama-2-13B-fp16) as a base.

  ### Models Merged

  The following models were included in the merge:
  * [garage-bAInd/Platypus2-13B](https://huggingface.co/garage-bAInd/Platypus2-13B)
  * [psmathur/orca_mini_v3_13b](https://huggingface.co/psmathur/orca_mini_v3_13b)
  * [WizardLM/WizardMath-13B-V1.0](https://huggingface.co/WizardLM/WizardMath-13B-V1.0)

  ### Configuration

  The following YAML configuration was used to produce this model:

  ```yaml
  base_model: TheBloke/Llama-2-13B-fp16
  dtype: float16
  merge_method: ties
  models:
  - model: TheBloke/Llama-2-13B-fp16
  - model: psmathur/orca_mini_v3_13b
    parameters:
      density: [1.0, 0.7, 0.1]
      weight: 1.0
  - model: garage-bAInd/Platypus2-13B
    parameters:
      density: 0.5
      weight: [0.0, 0.3, 0.7, 1.0]
  - model: WizardLM/WizardMath-13B-V1.0
    parameters:
      density: 0.33
      weight:
      - filter: mlp
        value: 0.5
      - value: 0.0
  parameters:
    int8_mask: 1.0
    normalize: 1.0
  ```

Contributor

@davanstrien davanstrien left a comment

This is looking super nice; thanks for working on this! I think most of the obvious metadata fields are included in the template already. It might be possible to also infer some additional fields from the base models being merged, but I think it's probably better to keep it a bit simpler to start, especially as some merges may include many models.
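For readers following along, the front matter in the generated card above maps fairly directly onto the merge config. Here is a minimal sketch of that mapping, using only the standard library. `card_front_matter` is a hypothetical helper, not the function in mergekit/card.py, and the ordering it produces (base model first, then config order) is an assumption rather than what mergekit emits.

```python
def card_front_matter(config: dict) -> str:
    """Build model card front matter from a merge config dict (illustrative only)."""
    models = []
    base = config.get("base_model")
    if base:
        models.append(base)
    for entry in config.get("models", []):
        name = entry["model"]
        if name not in models:  # skip duplicates, e.g. the base model listed again
            models.append(name)

    lines = ["---"]
    if models:
        lines.append("base_model:")
        lines += [f"- {name}" for name in models]
    # The "merge" tag is what the Hub uses to mark a model as a merge (per the thread).
    lines += ["tags:", "- mergekit", "- merge", "---"]
    return "\n".join(lines) + "\n"


# Trimmed-down version of the examples/ties.yml config quoted above.
example = {
    "base_model": "TheBloke/Llama-2-13B-fp16",
    "merge_method": "ties",
    "models": [
        {"model": "TheBloke/Llama-2-13B-fp16"},
        {"model": "psmathur/orca_mini_v3_13b"},
        {"model": "garage-bAInd/Platypus2-13B"},
        {"model": "WizardLM/WizardMath-13B-V1.0"},
    ],
}
front_matter = card_front_matter(example)
print(front_matter)
```

Inferring further fields from the merged models, as suggested above, would slot in where the `lines` list is built.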

@davanstrien
Contributor

Does it also make sense to add a push_to_hub method (probably in a separate PR)? Happy to also help with this if useful :)


@julien-c julien-c left a comment

This is super cool @cg123! Generated model card looks quite nice already.

I've used this branch to push my first merge: https://huggingface.co/julien-c/Mistral-7B-Neural-Story-mix 🔥

Note that the Hub model page UI element is quite ugly for now (will improve next week) but we already link to the merged models in the UI:

image

A few questions/suggestions (with a wider scope than this PR):

  • could you also copy the input yml file to the output folder, using a conventional filename (maybe something like mergekit_config.yml?). That way people would consistently upload it and the Hub could parse them down the road and provide some cool features based on this metadata (stats, UI, etc)
  • (nit) if you wanted to let users programmatically upload their models like @davanstrien was suggesting you might want to use some model card helpers from huggingface_hub, for instance to "merge" with the remote model card rather than overwrite it (in my model i had set a license at repo creation and it was overwritten). It's just a detail at this point though.
  • we should do a nice icon for mergekit on the Hub so the tag stands out more, do you already have a logo or icon in mind? otherwise we'll come up with something
image

more generally any feature we could build that'd be useful to you, just let us know!
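The overwrite problem described in the second bullet comes down to choosing a merge policy for card metadata. Below is a rough sketch of one possible policy: keep whatever the remote card already sets (such as a license), add keys that are missing, and union list fields like `tags`. `merge_card_metadata` is a hypothetical name, and the policy itself is an assumption. Loading the remote card and pushing the result, for example with the huggingface_hub model card helpers, is deliberately left out.

```python
def merge_card_metadata(remote: dict, generated: dict) -> dict:
    """Combine generated card metadata with metadata already on the Hub.

    Assumed policy: remote scalar values win, missing keys are added,
    and list values are unioned with remote entries first.
    """
    merged = dict(remote)
    for key, value in generated.items():
        if key not in merged:
            merged[key] = value
        elif isinstance(merged[key], list) and isinstance(value, list):
            extra = [item for item in value if item not in merged[key]]
            merged[key] = merged[key] + extra
        # Otherwise keep the remote value, e.g. a license set at repo creation.
    return merged


remote = {"license": "apache-2.0", "tags": ["story"]}
generated = {
    "base_model": ["TheBloke/Llama-2-13B-fp16"],
    "tags": ["mergekit", "merge"],
}
merged = merge_card_metadata(remote, generated)
print(merged)
```

With this policy the license set at repo creation survives, while the `mergekit` and `merge` tags are still added.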

@cg123
Collaborator Author

cg123 commented Jan 3, 2024

Thanks for the comments @davanstrien @julien-c! I went ahead and added a copy of the original config YAML to the output directory as well. I think this is good to go for the first pass and will merge it shortly. Next step will be a separate PR for push_to_hub.
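For illustration, copying the config next to the merged weights could look roughly like the sketch below. The filename `mergekit_config.yml` is the one floated earlier in the thread and is an assumption here, as is the helper name `copy_config_to_output`; this is not the code that was merged.

```python
import shutil
import tempfile
from pathlib import Path

# Conventional filename suggested in the thread (assumed, not confirmed here).
CONFIG_FILENAME = "mergekit_config.yml"


def copy_config_to_output(config_path, out_dir) -> Path:
    """Copy the merge config into the output directory under a fixed name."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    dest = out_dir / CONFIG_FILENAME
    shutil.copyfile(config_path, dest)
    return dest


# Small demonstration in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    src = Path(tmp) / "ties.yml"
    src.write_text("merge_method: ties\n")
    dest = copy_config_to_output(src, Path(tmp) / "merged-model")
    copied_name = dest.name
    copied_text = dest.read_text()

print(copied_name)
```

A fixed filename matters more than the exact choice, since it lets the Hub find and parse the config consistently.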

As for an icon, I've been half-thinking about 🝢; it's easy to fit in a small space, speaks visually a bit to the idea of combination, and I like the idea of using the alchemical symbol for dissolution. I'm the opposite of a graphic designer, though, so I'm extremely open to ideas on that front.

Again thanks for the input!

@cg123 cg123 merged commit 519a868 into main Jan 3, 2024
8 checks passed
@cg123 cg123 deleted the hf-card branch January 3, 2024 21:33
@julien-c

julien-c commented Jan 4, 2024

> As for an icon, I've been half-thinking about 🝢; it's easy to fit in a small space, speaks visually a bit to the idea of combination, and I like the idea of using the alchemical symbol for dissolution.

cool idea, we'll try something with @gary149 and if you like it you can use it!

@gary149

gary149 commented Jan 8, 2024

With your idea, I think something simple like this could work well, using the "K" letter (feedback welcome):

image

@julien-c

WDYT @cg123? we're thinking of using that icon for mergekit on HF (in model filters, etc):

image

@cg123
Collaborator Author

cg123 commented Jan 13, 2024

@gary149 Sorry for not seeing this sooner! Thanks for making this icon - I think it looks great. I'd be happy to have it used for mergekit on HF.

@gary149

gary149 commented Jan 15, 2024

That's great to hear! Here is an SVG of it:

<svg xmlns="http://www.w3.org/2000/svg" width="32" height="32" fill="none"><path fill="#000" fill-rule="evenodd" d="M11.32 6.83a1.6 1.6 0 0 0-3.2 0v18.34a1.6 1.6 0 0 0 3.2 0V18.9L17.8 23v.01a3.04 3.04 0 1 0 1.69-2.72l-4.26-2.7h3.02a3.04 3.04 0 1 0 0-3.2h-3.02l4.26-2.7a3.04 3.04 0 1 0-1.69-2.72l-6.48 4.11V6.83Z" clip-rule="evenodd"/><circle cx="20.84" cy="16" r=".9" fill="#F5F5F5"/><circle cx="20.84" cy="8.98" r=".9" fill="#F5F5F5"/><circle cx="20.84" cy="23.02" r=".9" fill="#F5F5F5"/></svg>

@fakerybakery

FYI @gary149 the icon doesn't render correctly on dark mode:
Icon

@gary149

gary149 commented Jan 22, 2024

> FYI @gary149 the icon doesn't render correctly on dark mode:

Thanks, we are going to fix it.

5 participants