
feat(autoware_tensorrt_yolox): add GPU - CUDA device option #8245

Merged

Conversation

ismetatabay
Member

Description

This PR adds a parameter to specify which GPU is used for tensorrt_yolox inference on systems with multiple GPU devices.

Related links

Parent Issue:

How was this PR tested?

This PR has been tested with 8 cameras on a computer with 3 GPUs. Since GPU 0 is used for the LiDAR CenterPoint node and other workloads, the 8 cameras are divided between GPU 1 and GPU 2. The nvidia-smi output can be seen in the following image.

[Image: nvidia-smi output showing tensorrt_yolox GPU usage]

Notes for reviewers

None.

Interface changes

Change type | Parameter Name | Type | Default Value | Description
Added       | gpu_id         | int  | 0             | GPU ID to select the CUDA device
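As a usage illustration, the new parameter could be set from a ROS 2 parameter file; the snippet below is a hypothetical sketch (the file layout and the value 1 are assumptions for illustration, not taken from the PR):

```yaml
/**:
  ros__parameters:
    # Hypothetical example: run tensorrt_yolox inference on CUDA device 1
    # instead of the default device 0.
    gpu_id: 1
```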

Effects on system behavior

None.

@github-actions github-actions bot added the component:perception Advanced sensor data processing and environment understanding. (auto-assigned) label Jul 29, 2024

github-actions bot commented Jul 29, 2024

Thank you for contributing to the Autoware project!

🚧 If your pull request is in progress, switch it to draft mode.

Please ensure:

@ismetatabay ismetatabay marked this pull request as ready for review July 29, 2024 13:03
@kminoda kminoda added the tag:run-build-and-test-differential Mark to enable build-and-test-differential workflow. (used-by-ci) label Jul 29, 2024
@kminoda
Contributor

kminoda commented Jul 29, 2024

@ismetatabay Please make sure to submit the PR after all the CIs are passing (json schema check is failing)

@ismetatabay ismetatabay marked this pull request as draft July 29, 2024 13:11

codecov bot commented Jul 29, 2024

Codecov Report

Attention: Patch coverage is 0% with 22 lines in your changes missing coverage. Please review.

Project coverage is 23.99%. Comparing base (a64566e) to head (d3889b5).
Report is 3 commits behind head on main.

Files Patch % Lines
...ion/autoware_tensorrt_yolox/src/tensorrt_yolox.cpp 0.00% 11 Missing ⚠️
...utoware_tensorrt_yolox/src/tensorrt_yolox_node.cpp 0.00% 5 Missing ⚠️
..._detector/src/traffic_light_fine_detector_node.cpp 0.00% 5 Missing ⚠️
...include/autoware/tensorrt_yolox/tensorrt_yolox.hpp 0.00% 1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #8245      +/-   ##
==========================================
- Coverage   24.11%   23.99%   -0.13%     
==========================================
  Files        1399     1397       -2     
  Lines      102423   102224     -199     
  Branches    38926    38778     -148     
==========================================
- Hits        24702    24529     -173     
+ Misses      75223    75187      -36     
- Partials     2498     2508      +10     
Flag Coverage Δ *Carryforward flag
differential 0.00% <0.00%> (?)
total 24.00% <ø> (-0.12%) ⬇️ Carriedforward from 8f97a9d

*This pull request uses carry forward flags.


@ismetatabay ismetatabay marked this pull request as ready for review July 29, 2024 16:21
@ismetatabay ismetatabay linked an issue Jul 29, 2024 that may be closed by this pull request
@ismetatabay ismetatabay self-assigned this Jul 30, 2024
@@ -90,6 +91,10 @@ TrtYoloXNode::TrtYoloXNode(const rclcpp::NodeOptions & node_options)
const tensorrt_common::BatchConfig batch_config{1, 1, 1};
const size_t max_workspace_size = (1 << 30);

if (!setCudaDeviceId(gpu_id_)) {
Contributor
@ismetatabay
We appreciate your contribution. I think engine generation by TensorRT is better performed on the target device (GPU) on which inference will run. From this perspective, I wonder whether we need to set the target device before engine generation; I would appreciate it if you could consider that order of operations.

Member Author

@manato -san, thank you for your valuable comments. Since all three GPUs in our test setup are the same model, the engine file created on GPU 0 (the default) is also valid for the other two GPUs. If a GPU of a different model were present, it would be a problem. Therefore, we could update the tensorrt_common package to handle different GPU models as well. What do you think?

Contributor

@ismetatabay
Thank you for your consideration!
Yes, my main concern was exactly such environments, where GPUs of different models coexist. Regarding building the engine on a specific device, it appears to be enough to call cudaSetDevice() before the engine builder starts, according to the TensorRT FAQ ("Q: How do I use TensorRT on multiple GPUs?"). (You can confirm this behavior by monitoring which GPU is used with nvidia-smi during engine building.) If that is the case, we don't need to modify tensorrt_common; we only need to call cudaSetDevice() before constructing tensorrt_common::TrtCommon.

Sorry for asking repeatedly, but could you consider moving this CUDA-related code into tensorrt_yolox.cpp for further encapsulation, so that other nodes that use YOLOX as a module can also benefit from your improvement?

Member Author

Thank you @manato -san, you are right. I am okay with moving everything to tensorrt_yolox.cpp. With this update, traffic_light_fine_detector can also select a GPU, as you mentioned. However, if I move everything into the TrtYoloX class, I still get an illegal memory access error from CUDA unless I call cudaSetDevice before creating the trt_yolox_ object in the TrtYoloXNode constructor. Do you have any suggestions on that?

trt_yolox_ = std::make_unique<tensorrt_yolox::TrtYoloX>(
model_path, precision, label_map_.size(), score_threshold, nms_threshold, build_config,
preprocess_on_gpu, calibration_image_list_path, norm_factor, cache_dir, batch_config,
max_workspace_size, color_map_path);

Contributor

Thank you @ismetatabay -san for your thoughtful consideration!

I suspect the cause of the issue you observed is this line: it attempts to create a cudaStream on the default device (0) during construction of the instance, and that stream is later used for memory copies before/after inference. Could you please try moving the makeCudaStream() call to after the device is set in the class constructor and see if it makes a difference?

Member Author

Hello @manato -san, sorry for the late update. I was on leave and only had a chance to update it today. I have moved the changes into tensorrt_yolox. Could you please check it? Thank you 🙌

Contributor

Hi @ismetatabay -san, thank you for your continuous updates. The modifications look good to me!

@ismetatabay ismetatabay force-pushed the feat/yolox-add-device-option branch 2 times, most recently from 136a5ff to e581f64 Compare August 2, 2024 09:22
@ismetatabay ismetatabay marked this pull request as draft August 12, 2024 12:59
@ismetatabay ismetatabay force-pushed the feat/yolox-add-device-option branch 3 times, most recently from 941c3f9 to 6e136f5 Compare August 14, 2024 15:58
@ismetatabay ismetatabay marked this pull request as ready for review August 14, 2024 15:59
@Shin-kyoto
Contributor

/review

@Shin-kyoto
Contributor

/improve


PR Reviewer Guide 🔍

⏱️ Estimated effort to review: 3 🔵🔵🔵⚪⚪
🧪 No relevant tests
🔒 No security concerns identified
⚡ Key issues to review

Possible Bug
The setCudaDeviceId method returns a boolean, but the constructor throws an exception when it fails. This inconsistency in error handling should be addressed; consider having the method throw an exception instead of returning a boolean.

Logging
The use of std::cout for logging GPU selection is not ideal for production code. Consider using a proper logging framework.

@ismetatabay ismetatabay changed the title feat(tensorrt_yolox): add GPU - CUDA device option feat(autoware_tensorrt_yolox): add GPU - CUDA device option Aug 16, 2024
Contributor

@Shin-kyoto Shin-kyoto left a comment


Thank you for the changes.
Finally, could you please perform a functionality check with all the changes included, and document the verification method and results?

ismetatabay and others added 10 commits August 23, 2024 15:20
Signed-off-by: ismetatabay <[email protected]>

style(pre-commit): autofix
Signed-off-by: ismetatabay <[email protected]>
@github-actions github-actions bot added the type:documentation Creating or refining documentation. (auto-assigned) label Aug 23, 2024
@ismetatabay
Member Author

Hello @Shin-kyoto -san, @manato -san, @Owen-Liuyuxuan -san,

Thank you for your review and valuable comments. Today, I tested the latest version of the PR with our test vehicle, and everything worked as expected. The vehicle has 8 cameras and 3 GPUs; for the tests, I divided the 8 cameras between GPU 1 and GPU 2 (4 cameras each). The nvidia-smi output is shown in the following image.

[Image: nvidia-smi output from the vehicle test]

I used the following launch files to run the tensorrt_yolox package on our vehicle:

If it is okay for you, I think we can merge this PR.

@Owen-Liuyuxuan
Contributor

I have no further issue.

@Shin-kyoto Shin-kyoto self-requested a review August 27, 2024 03:15
Contributor

@Shin-kyoto Shin-kyoto left a comment


LGTM

  • I confirmed that the content of this PR is enough to merge.
  • I did NOT check this PR by running autoware.

Contributor

@manato manato left a comment


@ismetatabay
Thank you for your continuous updates to resolve my concerns. LGTM!
I confirmed that this modification causes no problems, at least on an x86 laptop with one GPU.

@ismetatabay ismetatabay merged commit e434372 into autowarefoundation:main Aug 27, 2024
29 of 31 checks passed
@ismetatabay ismetatabay deleted the feat/yolox-add-device-option branch August 27, 2024 09:16
a-maumau pushed a commit to a-maumau/autoware.universe that referenced this pull request Sep 2, 2024
…foundation#8245)

* init CUDA device option

Signed-off-by: ismetatabay <[email protected]>

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
batuhanbeytekin pushed a commit to batuhanbeytekin/autoware.universe that referenced this pull request Sep 2, 2024
Signed-off-by: Batuhan Beytekin <[email protected]>
ktro2828 pushed a commit to ktro2828/autoware.universe that referenced this pull request Sep 18, 2024
Labels
component:perception Advanced sensor data processing and environment understanding. (auto-assigned) tag:require-cuda-build-and-test tag:run-build-and-test-differential Mark to enable build-and-test-differential workflow. (used-by-ci) type:documentation Creating or refining documentation. (auto-assigned)
Projects
Status: Done
Development

Successfully merging this pull request may close these issues.

tensorrt_yolox node does not support GPU selection
5 participants