TensorRT EP could not deserialize engine from binary data #22139
Labels
api:CSharp (issues related to the C# API)
ep:TensorRT (issues related to the TensorRT execution provider)
performance (issues related to performance regressions)
Describe the issue
Hi,
I've wrapped a TensorRT engine in a _ctx.onnx file using the official Python script (https://github.com/microsoft/onnxruntime/blob/main/onnxruntime/python/tools/tensorrt/gen_trt_engine_wrapper_onnx_model.py#L156-L187).
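For context, as far as I understand it, the script packs the serialized engine into a single EPContext contrib node, roughly like this (a minimal sketch, not the actual script; the engine path, input/output names, and shapes are placeholders):

```python
import onnx
from onnx import TensorProto, helper

# Placeholder path to the serialized TensorRT engine.
with open("model_fp16.engine", "rb") as f:
    engine_bytes = f.read()

# A single EPContext node; embed_mode=1 means the engine bytes are
# embedded directly in the ep_cache_context attribute.
node = helper.make_node(
    "EPContext",
    inputs=["input"],
    outputs=["output"],
    domain="com.microsoft",
    embed_mode=1,
    ep_cache_context=engine_bytes,
)

# Graph I/O names, dtypes, and shapes have to match the engine bindings.
graph = helper.make_graph(
    [node],
    "trt_engine_wrapper",
    [helper.make_tensor_value_info("input", TensorProto.FLOAT, [1, "H", "W", 3])],
    [helper.make_tensor_value_info("output", TensorProto.FLOAT, None)],
)
model = helper.make_model(
    graph,
    opset_imports=[helper.make_opsetid("", 19), helper.make_opsetid("com.microsoft", 1)],
)
onnx.save(model, "EmbededTrtEngine_FP16_ctx.onnx")
```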
The problem is that loading the wrapped model fails with "TensorRT EP could not deserialize engine from binary data". The same engine runs fine through the TensorRT API directly, and since there is no further diagnostic output, I am stuck on why deserialization fails.
I've tried different ortTrtOptions settings, but to no avail. The error occurs while the inference session is being created, and both the FP16 and INT8 versions fail the same way.
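On my side the session creation looks like this, sketched with the Python API for brevity (my real code uses the C# API and its TensorRT provider options; the option values below are illustrative, not the exact combinations I tried):

```python
import onnxruntime as ort

# A few of the TensorRT EP options I varied; none changed the outcome.
trt_options = {
    "device_id": 0,
    "trt_fp16_enable": True,
    "trt_engine_cache_enable": False,
}

# The "could not deserialize engine" error is raised here, during
# session creation, before any inference is run.
session = ort.InferenceSession(
    "EmbededTrtEngine_FP16_ctx.onnx",
    providers=[
        ("TensorrtExecutionProvider", trt_options),
        "CUDAExecutionProvider",
    ],
)
```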
I've uploaded the FP16 version; it would be great if you could take a look.
Thanks!
Edit:
Graphics card: NVIDIA GeForce RTX 3090
The TensorRT engine was built with the following optimization profile shapes:
min: 1x1024x128x3
opt: 1x4096x640x3
max: 1x8000x1400x3
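In builder terms the profile was configured along these lines (a sketch using the standard TensorRT Python API; "input" is a placeholder for the actual binding name):

```python
import tensorrt as trt

logger = trt.Logger(trt.Logger.WARNING)
builder = trt.Builder(logger)
config = builder.create_builder_config()

# Optimization profile matching the shapes above (batch x H x W x C).
profile = builder.create_optimization_profile()
profile.set_shape(
    "input",
    (1, 1024, 128, 3),   # min
    (1, 4096, 640, 3),   # opt
    (1, 8000, 1400, 3),  # max
)
config.add_optimization_profile(profile)
```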
To reproduce
EmbededTrtEngine_FP16_ctx.zip
Urgency
Either a workaround or a fix would help.
Platform
Windows
OS Version
10
ONNX Runtime Installation
Released Package
ONNX Runtime Version or Commit ID
1.19.0
ONNX Runtime API
C#
Architecture
X64
Execution Provider
TensorRT
Execution Provider Library Version
CUDA 11.8, cuDNN 8.9.7.29, TensorRT 10.4.0.26 and 10.1.0.27
Model File
EmbededTrtEngine_FP16_ctx.zip
Is this a quantized model?
No