
About converting YOLOv7 QAT model to TensorRT engine(failed for dynamic-batch setting) #46

Open
YunghuiHsu opened this issue May 25, 2023 · 1 comment

YunghuiHsu commented May 25, 2023

When I follow yolo_deepstream/tree/main/tensorrt_yolov7 and use "yolov7QAT.engine" to run a multi-image (batch) detection task, the following error occurs:
./build/detect --engine=yolov7QAT.engine --img=./imgs/horses.jpg,./imgs/zidane.jpg

Error Message

input 2 images, paths: ./imgs/horses.jpg, ./imgs/zidane.jpg, 
--------------------------------------------------------
Yolov7 initialized from: /opt/nvidia/deepstream/deepstream/samples/models/tao_pretrained_models/yolov7/yolov7QAT.engine
input : images , shape : [ 1,3,640,640,]
output : outputs , shape : [ 1,25200,85,]
--------------------------------------------------------
preprocess start
error cv_img.size() in preProcess
 error: mImgPushed = 1 numImg = 1 mMaxBatchSize= 1, mImgPushed + numImg > mMaxBatchSize 
inference start
postprocessing start
detectec image written to: ./imgs/horses.jpgdetect0.jpg

Note

  • It works fine when running a single-image detection task with "yolov7QAT.engine".
  • "yolov7QAT.engine" was converted from "yolov7qat.onnx".
  • Whether "yolov7qat.onnx" is downloaded from here or self-trained (viewed in Netron, both show the same structure), running `./build/detect` produces the same error message.
  • It runs fine with the non-QAT "yolov7db4fp32.engine" or "yolov7db4fp16.engine".
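The `mMaxBatchSize= 1` in the error message suggests the QAT engine was serialized with a static batch dimension of 1. One way to confirm this (a sketch, assuming the Polygraphy tool that ships with TensorRT is available) is to inspect the engine's bindings:

```shell
# Inspect the serialized engine's input/output bindings.
# A static engine reports the "images" input as (1, 3, 640, 640);
# a dynamic-batch engine shows -1 (with min/opt/max profile shapes)
# in the batch dimension instead.
polygraphy inspect model yolov7QAT.engine
```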

Environment

    CUDA: 11.4.315
    cuDNN: 8.6.0.166
    TensorRT: 5.1
    Python: 3.8.10
    PyTorch: 1.12.0a0+2c916ef.nv22.3
Hardware
    Model: Jetson-AGX
    Module: NVIDIA Jetson AGX Xavier (32 GB ram)
    L4T: 35.2.1
    Jetpack: 5.1

YunghuiHsu commented May 31, 2023

Following https://github.com/NVIDIA-AI-IOT/yolo_deepstream/tree/main/tensorrt_yolov7#prepare-tensorrt-engines, explicitly specifying dynamic batch shapes when building the engine solved the problem!

Replace

# int8 QAT model, the onnx model with Q&DQ nodes
/usr/src/tensorrt/bin/trtexec --onnx=yolov7qat.onnx --saveEngine=yolov7QAT.engine --fp16 --int8

with

# int8 QAT model, the onnx model with Q&DQ nodes and dynamic-batch
/usr/src/tensorrt/bin/trtexec --onnx=yolov7qat.onnx \
        --minShapes=images:1x3x640x640 \
        --optShapes=images:12x3x640x640 \
        --maxShapes=images:16x3x640x640 \
        --saveEngine=yolov7QAT.engine --fp16 --int8

However, when testing performance with /usr/src/tensorrt/bin/trtexec --loadEngine=yourmodel.engine, the engine built with explicit dynamic-batch shapes performs much worse:

yolov7QAT.engine
=== Performance summary ===
[I] Throughput: 57.8406 qps
[I] Latency mean = 17.8946 ms

yolov7QAT.engine with dynamic batch (max=16)
=== Performance summary ===
[I] Throughput: 23.8396 qps
[I] Latency: mean = 42.046 ms
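Part of the gap may come from the shape trtexec picks when benchmarking a dynamic-shape engine. Pinning the input shape makes the comparison batch-for-batch fair against the static batch-1 engine (a sketch reusing the engine name from above; `--shapes` is a standard trtexec option for dynamic-shape engines):

```shell
# Benchmark the dynamic-batch engine at an explicit batch size of 1,
# so the numbers are directly comparable with the static engine above.
/usr/src/tensorrt/bin/trtexec --loadEngine=yolov7QAT.engine \
        --shapes=images:1x3x640x640
```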

@YunghuiHsu YunghuiHsu changed the title About converting YOLOv7 QAT model to TensorRT engine About converting YOLOv7 QAT model to TensorRT engine(failed for dynamic-batch setting) May 31, 2023