
Error while optimizing seq2seq model using optimum #1983

Open · 1 of 4 tasks
rafikg opened this issue Aug 6, 2024 · 1 comment
Labels: bug (Something isn't working)
rafikg commented Aug 6, 2024

System Info

transformers==4.42.4
torch==2.4.0+cpu
onnx==1.16.2
onnxruntime==1.18.1
optimum==1.21.2
ubuntu-22.04

Who can help?

@JingyaHuang @echar

Information

  • The official example scripts
  • My own modified scripts

Tasks

  • An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
  • My own task or dataset (give details below)

Reproduction (minimal, reproducible, runnable)

from transformers import AutoTokenizer
from optimum.onnxruntime import OptimizationConfig, ORTOptimizer, ORTModelForSeq2SeqLM
model_id = "sshleifer/distilbart-cnn-12-6"
save_dir = "distilbart_optimized"

# Load a PyTorch model and export it to the ONNX format
model = ORTModelForSeq2SeqLM.from_pretrained(model_id, export=True)

# Create the optimizer
optimizer = ORTOptimizer.from_pretrained(model)

# Define the optimization strategy by creating the appropriate configuration
optimization_config = OptimizationConfig(
    optimization_level=2,
    enable_transformers_specific_optimizations=True,
    optimize_for_gpu=False,
)

# Optimize the model
optimizer.optimize(save_dir=save_dir, optimization_config=optimization_config)
tokenizer = AutoTokenizer.from_pretrained(model_id)
optimized_model = ORTModelForSeq2SeqLM.from_pretrained(save_dir)
tokens = tokenizer("This is a sample input", return_tensors="pt")
outputs = optimized_model.generate(**tokens)

Expected behavior

The optimized_model should generate outputs as expected.

Instead, optimizer.optimize(save_dir=save_dir, optimization_config=optimization_config) fails with:

Optimizing model...
failed in shape inference <class 'Exception'>

and loading the optimized model with optimized_model = ORTModelForSeq2SeqLM.from_pretrained(save_dir) fails with:

sess = C.InferenceSession(session_options, self._model_path, True, self._read_config_from_model)
onnxruntime.capi.onnxruntime_pybind11_state.Fail: [ONNXRuntimeError] : 1 : FAIL : Load model from distilbart_optimized/decoder_model_optimized.onnx failed:/onnxruntime_src/onnxruntime/core/graph/graph.cc:1415 void onnxruntime::Graph::InitializeStateFromModelFileGraphProto() This is an invalid model. Graph output (present.0.decoder.key) does not exist in the graph.
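
One way to confirm the dangling output is to inspect the optimized decoder with the onnx package. This is a minimal sketch, not part of the original report; the file path is taken from the error message above:

import onnx

# Load the broken optimized decoder (path from the error message).
model = onnx.load("distilbart_optimized/decoder_model_optimized.onnx")

# Collect every tensor name actually produced by a node in the graph.
produced = {out for node in model.graph.node for out in node.output}

# Any declared graph output missing from that set is dangling,
# e.g. present.0.decoder.key in the error above.
for graph_output in model.graph.output:
    if graph_output.name not in produced:
        print("Dangling graph output:", graph_output.name)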

The save_dir contents are shown in a screenshot attached to the original issue.

The code works with optimization_level=1.
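
For reference, the working fallback would be a config like the following. Only optimization_level=1 is confirmed by the report; the other flags are assumed carried over unchanged from the failing config:

# Level-1 (basic) optimizations reportedly succeed where level 2 fails.
optimization_config = OptimizationConfig(
    optimization_level=1,
    enable_transformers_specific_optimizations=True,
    optimize_for_gpu=False,
)
optimizer.optimize(save_dir=save_dir, optimization_config=optimization_config)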

rafikg added the bug (Something isn't working) label on Aug 6, 2024
regisss (Contributor) commented Sep 16, 2024

@rafikg Can you try with

optimization_config = OptimizationConfig(
    optimization_level=2,
    enable_transformers_specific_optimizations=True,
    optimize_for_gpu=False,
    disable_skip_layer_norm_fusion=True,
)

and let me know if that works on your side?
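
Applied to the original reproduction, this workaround changes only the config; the rest of the script stays as posted. A sketch of the re-run:

# Re-run the failing step with the suggested config, everything else unchanged.
optimizer.optimize(save_dir=save_dir, optimization_config=optimization_config)
optimized_model = ORTModelForSeq2SeqLM.from_pretrained(save_dir)
outputs = optimized_model.generate(**tokens)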
