KeyError: 'content_image_file' #25

Open
ajie6666 opened this issue Aug 30, 2024 · 7 comments

Comments

@ajie6666

While training the second stage, I got the error [KeyError: 'content_image_file']. However, when I built the dataset following DATASET.md, it only required the format {"image_file": "", "content_prompt": "", ...}. What should "content_image_file" contain?
I would also like to ask how the results of stage-1 training are applied in the second stage, and how the generated .bin files are used.
Thanks.

@Jeoyal
Contributor

Jeoyal commented Aug 30, 2024

Hi @ajie6666, thank you for your interest in our work. For the second stage of training, we first process each raw image into a content image using the "Process content input" step described in DATASET.md. Then we add the content image path to the "content_image_file" field in the JSON file.
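
A quick way to catch this KeyError before launching training is to scan the dataset file for records that lack the field. The following is a minimal sketch, not code from the repository: it assumes the dataset is stored as one JSON object per line, and the helper name `find_incomplete_records` and the sample values are hypothetical. Only the three keys named in this thread are checked.

```python
import json

# Keys mentioned in this thread; the real dataset may require more fields.
REQUIRED_KEYS = ("image_file", "content_prompt", "content_image_file")


def find_incomplete_records(jsonl_lines, required=REQUIRED_KEYS):
    """Return (line_number, missing_keys) for each record lacking a required key."""
    problems = []
    for line_number, line in enumerate(jsonl_lines, start=1):
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        record = json.loads(line)
        missing = [key for key in required if key not in record]
        if missing:
            problems.append((line_number, missing))
    return problems


# Placeholder records: the first is a stage-1 style entry, the second is stage-2 ready.
sample = [
    '{"image_file": "a.png", "content_prompt": "a cat"}',
    '{"image_file": "b.png", "content_prompt": "a dog", "content_image_file": "content/b.png"}',
]
print(find_incomplete_records(sample))  # [(1, ['content_image_file'])]
```

To check a real file, pass an open file handle instead of the sample list, since iterating a text file yields its lines.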

In addition, after training stage 1, your checkpoint path structure might look like this:
output/checkpoint-xxxxx/pytorch_model.bin (ip_ckpt)
output/checkpoint-xxxxx/pytorch_model_1.bin (style_aware_encoder_path)
Simply load pytorch_model.bin as --pretrained_ip_adapter_path and pytorch_model_1.bin as --pretrained_style_encoder_path to train stage 2.
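
The mapping above can be sketched as a small helper that turns a stage-1 checkpoint directory into the two stage-2 flags and fails early if the .bin files are missing (for example, when the run produced other file formats). The function name `stage2_checkpoint_args` is hypothetical, and the demo directory is a throwaway stand-in for output/checkpoint-xxxxx; only the file names and flag names come from this thread.

```python
import tempfile
from pathlib import Path


def stage2_checkpoint_args(checkpoint_dir):
    """Build the stage-2 flags from a stage-1 checkpoint directory."""
    ckpt = Path(checkpoint_dir)
    ip_ckpt = ckpt / "pytorch_model.bin"        # IP-Adapter weights (ip_ckpt)
    style_ckpt = ckpt / "pytorch_model_1.bin"   # style-aware encoder weights
    for path in (ip_ckpt, style_ckpt):
        if not path.is_file():
            # List what is actually there so a format mismatch is easy to spot.
            found = sorted(p.name for p in ckpt.glob("*") if p.is_file())
            raise FileNotFoundError(f"{path} not found; files present: {found}")
    return [
        "--pretrained_ip_adapter_path", str(ip_ckpt),
        "--pretrained_style_encoder_path", str(style_ckpt),
    ]


# Demo with empty placeholder files in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    demo_dir = Path(tmp) / "checkpoint-demo"
    demo_dir.mkdir()
    (demo_dir / "pytorch_model.bin").touch()
    (demo_dir / "pytorch_model_1.bin").touch()
    demo_args = stage2_checkpoint_args(demo_dir)
    print(demo_args)
```

The returned list can be appended to whatever command launches stage-2 training.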

@ajie6666
Author

ajie6666 commented Sep 1, 2024

I really appreciate your detailed answers to my questions.
Probably because of my transformers version, stage 1 produced files in ".safetensors" format instead of ".bin".
This is why I could not find them at first. I then tried converting the .safetensors files to .bin, and also making the first stage generate .bin files, but in neither case could the second stage read them.
Finally, following https://huggingface.co/docs/safetensors/speed, I changed "style_aware_encoder.load_state_dict(torch.load(args.pretrained_style_encoder_path))" to "style_aware_encoder.load_state_dict(load_file(args.pretrained_style_encoder_path), strict=False)", and replaced "sd = torch.load(ckpt_path, map_location="cpu")" with "sd = load_file(ckpt_path, device="cpu")".
It is now working normally.

@Jeoyal
Contributor

Jeoyal commented Sep 1, 2024

I'm glad to hear that this issue has been resolved, and I hope you achieve the results you desire :).

@ajie6666
Author

ajie6666 commented Sep 3, 2024

I would like to correct a previous misconception of mine. Simply using load_file with strict=False does not resolve the issue; an error still occurs when the file is eventually read. Instead, one should call accelerator.save_state(save_path, safe_serialization=False) so that stage 1 generates .bin files.

Additionally, I have a question about integrating my own style of images into StyleShot, where the results have been unsatisfactory. In the first stage, I trained the model with images in our style and their corresponding JSONL files. For the second stage, I used the images from the "Stylebench.content40" file you provided, processed them through "Process content input", and put both the original and processed images into a JSONL file. I am wondering whether the poor results are due to the limited size of the dataset or to a problem with the data I supplied. I would appreciate your advice on this matter.

@Jeoyal
Contributor

Jeoyal commented Sep 3, 2024

Hi, for stage 2 you should process your own dataset from stage 1 rather than Stylebench.content40.
For example, for "style1.png" in your dataset, you should process it into a content image and add that content image's path to the "content_image_file" field in the JSON file.
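
As a rough illustration of that record update, the sketch below adds a "content_image_file" entry to a stage-1 record. It assumes the content image has already been produced by the DATASET.md processing step and saved under the same file name in a separate directory; that layout, the helper name `add_content_image_field`, and the prompt text are assumptions for illustration, not the repository's convention.

```python
import json
from pathlib import PurePosixPath


def add_content_image_field(record, content_dir):
    """Return a copy of a stage-1 record with "content_image_file" filled in."""
    updated = dict(record)  # leave the caller's record untouched
    # Assumption: the content image keeps the style image's file name.
    file_name = PurePosixPath(record["image_file"]).name
    updated["content_image_file"] = str(PurePosixPath(content_dir) / file_name)
    return updated


# Placeholder stage-1 record; the prompt is made up for the example.
stage1_record = {"image_file": "images/style1.png", "content_prompt": "a castle"}
stage2_record = add_content_image_field(stage1_record, "content")
print(json.dumps(stage2_record))
```

Applying this to every stage-1 record and writing the results out line by line would give a stage-2 file built from the same images used in stage 1.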

@Jeoyal
Contributor

Jeoyal commented Sep 3, 2024

In addition, what's your dataset scale?

@ajie6666
Author

ajie6666 commented Sep 3, 2024

Thanks for your reply. I understand your point now. I'll run it again. My dataset consists of 3,700 images, each with a size of 5000 x 2333.
