
About converting YOLOv7 QAT model to TensorRT engine(failed for dynamic-batch setting) #46

Open
YunghuiHsu opened this issue May 25, 2023 · 1 comment

YunghuiHsu commented May 25, 2023

When I follow yolo_deepstream/tree/main/tensorrt_yolov7 and use "yolov7QAT.engine" to run a multi-image (batch) detection task, the following error occurs:
./build/detect --engine=yolov7QAT.engine --img=./imgs/horses.jpg,./imgs/zidane.jpg

Error Message

input 2 images, paths: ./imgs/horses.jpg, ./imgs/zidane.jpg, 
--------------------------------------------------------
Yolov7 initialized from: /opt/nvidia/deepstream/deepstream/samples/models/tao_pretrained_models/yolov7/yolov7QAT.engine
input : images , shape : [ 1,3,640,640,]
output : outputs , shape : [ 1,25200,85,]
--------------------------------------------------------
preprocess start
error cv_img.size() in preProcess
 error: mImgPushed = 1 numImg = 1 mMaxBatchSize= 1, mImgPushed + numImg > mMaxBatchSize 
inference start
postprocessing start
detectec image written to: ./imgs/horses.jpgdetect0.jpg

Note

  • It works fine when running a single-image detection task with "yolov7QAT.engine".
  • "yolov7QAT.engine" was converted from "yolov7qat.onnx".
  • Whether "yolov7qat.onnx" is downloaded from here or self-trained (viewed in Netron, both show the same structure), running `./build/detect` produces the same error message.
  • It runs fine with the non-QAT "yolov7db4fp32.engine" or "yolov7db4fp16.engine".
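The `mMaxBatchSize= 1` in the error message suggests the QAT engine was serialized with a static batch dimension of 1. One way to confirm this (a sketch, assuming the Polygraphy tool that ships with TensorRT is available) is to inspect the engine's bindings:

```shell
# Inspect the serialized engine's input/output bindings.
# A static engine reports the "images" input as (1, 3, 640, 640);
# a dynamic-batch engine shows -1 (with min/opt/max profile shapes)
# in the batch dimension instead.
polygraphy inspect model yolov7QAT.engine
```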

Environment

    CUDA: 11.4.315
    cuDNN: 8.6.0.166
    TensorRT: 5.1
    Python: 3.8.10
    PyTorch: 1.12.0a0+2c916ef.nv22.3
Hardware
    Model: Jetson-AGX
    Module: NVIDIA Jetson AGX Xavier (32 GB ram)
    L4T: 35.2.1
    Jetpack: 5.1

YunghuiHsu commented May 31, 2023

Following https://github.com/NVIDIA-AI-IOT/yolo_deepstream/tree/main/tensorrt_yolov7#prepare-tensorrt-engines, explicitly specifying dynamic batch shapes when building the engine solved the problem!

Replace

# int8 QAT model, the onnx model with Q&DQ nodes
/usr/src/tensorrt/bin/trtexec --onnx=yolov7qat.onnx --saveEngine=yolov7QAT.engine --fp16 --int8

with

# int8 QAT model, the onnx model with Q&DQ nodes and dynamic-batch
/usr/src/tensorrt/bin/trtexec --onnx=yolov7qat.onnx \
        --minShapes=images:1x3x640x640 \
        --optShapes=images:12x3x640x640 \
        --maxShapes=images:16x3x640x640 \
        --saveEngine=yolov7QAT.engine --fp16 --int8

However, when testing performance with /usr/src/tensorrt/bin/trtexec --loadEngine=yourmodel.engine, the engine built with explicit dynamic-batch shapes performs much worse:

yolov7QAT.engine
=== Performance summary ===
[I] Throughput: 57.8406 qps
[I] Latency mean = 17.8946 ms

yolov7QAT.engine with dynamic batch (max=16)
=== Performance summary ===
[I] Throughput: 23.8396 qps
[I] Latency: mean = 42.046 ms
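Part of the gap may come from the shape trtexec picks when benchmarking a dynamic-shape engine. Pinning the input shape makes the comparison batch-for-batch fair against the static batch-1 engine (a sketch reusing the engine name from above; `--shapes` is a standard trtexec option for dynamic-shape engines):

```shell
# Benchmark the dynamic-batch engine at an explicit batch size of 1,
# so the numbers are directly comparable with the static engine above.
/usr/src/tensorrt/bin/trtexec --loadEngine=yolov7QAT.engine \
        --shapes=images:1x3x640x640
```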

@YunghuiHsu YunghuiHsu changed the title About converting YOLOv7 QAT model to TensorRT engine About converting YOLOv7 QAT model to TensorRT engine(failed for dynamic-batch setting) May 31, 2023