The float16 unet model of stable-diffusion-2-1 outputs NAN results #1223

Open
xhcao opened this issue Jul 8, 2024 · 3 comments

Comments

xhcao commented Jul 8, 2024

Describe the bug
After running the command "python stable_diffusion.py --provider cuda --optimize --model_id stabilityai/stable-diffusion-2-1" in the Olive/examples/stable_diffusion/ directory:
float32 models are generated in Olive/examples/stable_diffusion/models/unoptimized, and these models run correctly.
float16 models are generated in Olive/examples/stable_diffusion/models/optimized-cuda, but these models do not run correctly; the unet model outputs NaN results.

To Reproduce
Example Python code:
import onnxruntime as ort
from diffusers import OnnxStableDiffusionPipeline, DDIMScheduler

sess_options = ort.SessionOptions()
sess_options.enable_mem_pattern = False

batch_size = 1
image_size = 768
provider = "cuda"

hidden_batch_size = batch_size * 2  # classifier-free guidance doubles the unet batch
# Pin the unet's free (symbolic) dimensions to static values.
sess_options.add_free_dimension_override_by_name("unet_sample_batch", hidden_batch_size)
sess_options.add_free_dimension_override_by_name("unet_sample_channels", 4)
sess_options.add_free_dimension_override_by_name("unet_sample_height", image_size // 8)
sess_options.add_free_dimension_override_by_name("unet_sample_width", image_size // 8)
sess_options.add_free_dimension_override_by_name("unet_time_batch", 1)
sess_options.add_free_dimension_override_by_name("unet_hidden_batch", hidden_batch_size)
sess_options.add_free_dimension_override_by_name("unet_hidden_sequence", 77)

model_id = "C:\workspace\models\stable-diffusion-2-1\optimized-cuda"
provider_map = {
    "dml": "DmlExecutionProvider",
    "cuda": "CUDAExecutionProvider",
}

pipeline = OnnxStableDiffusionPipeline.from_pretrained(model_dir, provider=provider_map[provider], sess_options=sess_options)

prompt = "giant castle, mountains, sunrise, volumetric lighting"
pipeline.scheduler = DDIMScheduler.from_config(pipeline.scheduler.config)
image = pipeline(
    [prompt] * batch_size,
    num_inference_steps=20,
    callback=None,
    height=image_size,
    width=image_size,
    guidance_scale=7.5,
).images[0]

image.save("output.png")

Is there a way to convert the PyTorch stable-diffusion-2-1 float16 models to ONNX float16 models directly, rather than starting from the PyTorch float32 models? Thanks.

jambayk commented Jul 8, 2024

Hi, we haven't tested this example with the stable-diffusion-2-1 model. The NaN outputs are most likely from numerical instability at fp16 precision. Are you able to trace the source of the NaN to the unet model? Previously, we saw instability in the VAE for the sdxl model, but not in the unet.
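
If it helps, here is a minimal sketch for checking whether the unet alone emits NaN when run directly through onnxruntime (the model path, the concrete dimension values, and the dtype mapping are assumptions; adjust them for your layout):

import numpy as np
import onnxruntime as ort

# Hypothetical path; point this at the optimized fp16 unet.
sess = ort.InferenceSession(
    "models/optimized-cuda/unet/model.onnx",
    providers=["CUDAExecutionProvider"],
)

# Concrete values for the unet's symbolic dimensions (768 // 8 = 96).
dim_values = {
    "unet_sample_batch": 2, "unet_sample_channels": 4,
    "unet_sample_height": 96, "unet_sample_width": 96,
    "unet_time_batch": 1, "unet_hidden_batch": 2,
    "unet_hidden_sequence": 77,
}
ort_to_np = {
    "tensor(float16)": np.float16,
    "tensor(float)": np.float32,
    "tensor(int64)": np.int64,
}

# Build random feeds from the model's declared inputs.
feeds = {}
for inp in sess.get_inputs():
    shape = [d if isinstance(d, int) else dim_values.get(d, 1) for d in inp.shape]
    feeds[inp.name] = np.random.randn(*shape).astype(ort_to_np[inp.type])

# If an output already contains NaN on random inputs, the unet is the culprit.
for info, value in zip(sess.get_outputs(), sess.run(None, feeds)):
    print(info.name, "contains NaN:", bool(np.isnan(value).any()))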

ONNX conversion from a float16 model is not usually done, since many ONNX operators do not support float16 during export. So we usually convert the model in float32 precision and then convert it to float16 as part of the transformers optimization step.
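
As a rough sketch of that flow (the paths, model_type="unet" support in your onnxruntime version, and the exact op_block_list are assumptions to experiment with; ops in the block list are kept in fp32):

from onnxruntime.transformers import optimizer

# Hypothetical path to the fp32 unet from the unoptimized run.
opt_model = optimizer.optimize_model(
    "models/unoptimized/unet/model.onnx",
    model_type="unet",
    opt_level=0,
)
# Convert to fp16, but keep graph inputs/outputs and any
# numerically sensitive ops in fp32.
opt_model.convert_float_to_float16(
    keep_io_types=True,
    op_block_list=["Softmax", "LayerNormalization"],  # assumed candidates
)
opt_model.save_model_to_file("models/unet_fp16.onnx")

The op_block_list is worth iterating on: block the suspicious ops first, confirm the NaN disappears, then narrow the list down for speed.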

Do you have the model id of a float16-compatible stable-diffusion-2-1 model, like https://huggingface.co/madebyollin/sdxl-vae-fp16-fix? If so, it might be possible to run the example on that model. That's what we did for the sdxl example.

xhcao commented Jul 9, 2024

@jambayk, thanks for your reply.
Currently I don't have a way to trace which operator or node generates the NaN in the unet model, but I can try; a rough sketch of what I might do is below.
I do not have the model id of a float16-compatible stable-diffusion-2-1 model, so I used Olive to generate float16 models from stabilityai/stable-diffusion-2-1.
Do you have any plans to enable stabilityai/stable-diffusion-2-1 soon?
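
One approach I can try (a rough sketch; the path is assumed, exposing every intermediate tensor as an output is memory-hungry, and it relies on onnxruntime accepting outputs declared without type info):

import numpy as np
import onnx
import onnxruntime as ort

# Hypothetical path; point this at the optimized fp16 unet.
model = onnx.load("models/optimized-cuda/unet/model.onnx")

# Declare every intermediate tensor as a graph output so it can be inspected.
declared = {out.name for out in model.graph.output}
for node in model.graph.node:
    for name in node.output:
        if name and name not in declared:
            model.graph.output.append(onnx.ValueInfoProto(name=name))
            declared.add(name)
onnx.save(model, "unet_debug.onnx")

sess = ort.InferenceSession("unet_debug.onnx", providers=["CUDAExecutionProvider"])

# Random feeds, built the same way as in the snippet above.
dim_values = {
    "unet_sample_batch": 2, "unet_sample_channels": 4,
    "unet_sample_height": 96, "unet_sample_width": 96,
    "unet_time_batch": 1, "unet_hidden_batch": 2,
    "unet_hidden_sequence": 77,
}
ort_to_np = {
    "tensor(float16)": np.float16,
    "tensor(float)": np.float32,
    "tensor(int64)": np.int64,
}
feeds = {}
for inp in sess.get_inputs():
    shape = [d if isinstance(d, int) else dim_values.get(d, 1) for d in inp.shape]
    feeds[inp.name] = np.random.randn(*shape).astype(ort_to_np[inp.type])

values = dict(zip([out.name for out in sess.get_outputs()], sess.run(None, feeds)))

# Walk the nodes in (topological) graph order to find the first NaN producer.
found = False
for node in model.graph.node:
    if found:
        break
    for name in node.output:
        val = values.get(name)
        if val is not None and np.issubdtype(val.dtype, np.floating) and np.isnan(val).any():
            print("NaN first appears at node", node.name, "(" + node.op_type + "), tensor", name)
            found = True
            break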

jambayk commented Jul 12, 2024

Hi, there are no plans for it currently. If there are large architectural changes in the model, more work might be needed from the onnxruntime team to support it. They also haven't explored sd-2-1 yet.
