
Error while optimizing seq2seq model using optimum #1983

Open · 1 of 4 tasks
rafikg opened this issue Aug 6, 2024 · 1 comment
Labels: bug (Something isn't working)
rafikg commented Aug 6, 2024

System Info

transformers==4.42.4
torch==2.4.0+cpu
onnx==1.16.2
onnxruntime==1.18.1
optimum==1.21.2
ubuntu-22.04

Who can help?

@JingyaHuang @echar

Information

  • The official example scripts
  • My own modified scripts

Tasks

  • An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
  • My own task or dataset (give details below)

Reproduction (minimal, reproducible, runnable)

from transformers import AutoTokenizer
from optimum.onnxruntime import OptimizationConfig, ORTOptimizer, ORTModelForSeq2SeqLM
model_id = "sshleifer/distilbart-cnn-12-6"
save_dir = "distilbart_optimized"

# Load a PyTorch model and export it to the ONNX format
model = ORTModelForSeq2SeqLM.from_pretrained(model_id, export=True)

# Create the optimizer
optimizer = ORTOptimizer.from_pretrained(model)

# Define the optimization strategy by creating the appropriate configuration
optimization_config = OptimizationConfig(
    optimization_level=2,
    enable_transformers_specific_optimizations=True,
    optimize_for_gpu=False,
)

# Optimize the model
optimizer.optimize(save_dir=save_dir, optimization_config=optimization_config)
tokenizer = AutoTokenizer.from_pretrained(model_id)
optimized_model = ORTModelForSeq2SeqLM.from_pretrained(save_dir)
tokens = tokenizer("This is a sample input", return_tensors="pt")
outputs = optimized_model.generate(**tokens)

Expected behavior

The optimized_model should generate outputs as expected.

Instead, optimizer.optimize(save_dir=save_dir, optimization_config=optimization_config) fails with:

Optimizing model...
failed in shape inference <class 'Exception'>

and loading the optimized model with optimized_model = ORTModelForSeq2SeqLM.from_pretrained(save_dir) fails with:

sess = C.InferenceSession(session_options, self._model_path, True, self._read_config_from_model)
onnxruntime.capi.onnxruntime_pybind11_state.Fail: [ONNXRuntimeError] : 1 : FAIL : Load model from distilbart_optimized/decoder_model_optimized.onnx failed:/onnxruntime_src/onnxruntime/core/graph/graph.cc:1415 void onnxruntime::Graph::InitializeStateFromModelFileGraphProto() This is an invalid model. Graph output (present.0.decoder.key) does not exist in the graph.
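
One way to confirm the dangling output is to inspect the optimized decoder with the onnx package. This is a minimal sketch, not part of the original report; the file path is taken from the error message above:

import onnx

# Load the broken optimized decoder (path from the error message).
model = onnx.load("distilbart_optimized/decoder_model_optimized.onnx")

# Collect every tensor name actually produced by a node in the graph.
produced = {out for node in model.graph.node for out in node.output}

# Any declared graph output missing from that set is dangling,
# e.g. present.0.decoder.key in the error above.
for graph_output in model.graph.output:
    if graph_output.name not in produced:
        print("Dangling graph output:", graph_output.name)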

The save_dir contents are shown in a screenshot attached to the original issue.

The code works with optimization_level=1.
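
For reference, the working fallback would be a config like the following. Only optimization_level=1 is confirmed by the report; the other flags are assumed carried over unchanged from the failing config:

# Level-1 (basic) optimizations reportedly succeed where level 2 fails.
optimization_config = OptimizationConfig(
    optimization_level=1,
    enable_transformers_specific_optimizations=True,
    optimize_for_gpu=False,
)
optimizer.optimize(save_dir=save_dir, optimization_config=optimization_config)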

rafikg added the bug (Something isn't working) label on Aug 6, 2024
regisss (Contributor) commented Sep 16, 2024

@rafikg Can you try with

optimization_config = OptimizationConfig(
    optimization_level=2,
    enable_transformers_specific_optimizations=True,
    optimize_for_gpu=False,
    disable_skip_layer_norm_fusion=True,
)

and let me know if that works on your side?
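
Applied to the original reproduction, this workaround changes only the config; the rest of the script stays as posted. A sketch of the re-run:

# Re-run the failing step with the suggested config, everything else unchanged.
optimizer.optimize(save_dir=save_dir, optimization_config=optimization_config)
optimized_model = ORTModelForSeq2SeqLM.from_pretrained(save_dir)
outputs = optimized_model.generate(**tokens)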
