Fix DensePose vertex visualization. #5278

Open
wants to merge 1 commit into base: main

@PieroV PieroV commented May 7, 2024

This commit fixes a RuntimeError by explicitly copying an index array to the CPU:

Traceback (most recent call last):
  File "/home/piero/tmp/detectron2/projects/DensePose/apply_net.py", line 353, in <module>
    main()
  File "/home/piero/tmp/detectron2/projects/DensePose/apply_net.py", line 349, in main
    args.func(args)
  File "/home/piero/tmp/detectron2/projects/DensePose/apply_net.py", line 105, in execute
    cls.execute_on_outputs(context, {"file_name": file_name, "image": img}, outputs)
  File "/home/piero/tmp/detectron2/projects/DensePose/apply_net.py", line 284, in execute_on_outputs
    image_vis = visualizer.visualize(image, data)
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/piero/tmp/detectron2/projects/DensePose/densepose/vis/base.py", line 188, in visualize
    image = visualizer.visualize(image, data[i])
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/piero/tmp/detectron2/projects/DensePose/densepose/vis/densepose_outputs_vertex.py", line 93, in visualize
    vis = (embed_map[closest_vertices].clip(0, 1) * 255.0).cpu().numpy()
           ~~~~~~~~~^^^^^^^^^^^^^^^^^^
RuntimeError: indices should be either on cpu or on the same device as the indexed tensor (cpu)

My command line was:

python apply_net.py show configs/cse/densepose_rcnn_R_50_FPN_s1x.yaml .../model_final_c4ea5f.pkl .../00000.jpg dp_vertex,bbox -v
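The fix described above can be illustrated with a minimal sketch. It assumes, as the error message suggests, that the indexed tensor lives on the CPU while the index tensor may live on the GPU; the shapes and random values are made up for illustration, and only the names `embed_map` and `closest_vertices` come from the traceback.

```python
import torch

# Illustrative stand-ins for the tensors in densepose_outputs_vertex.py.
embed_map = torch.rand(100)  # per-vertex values, created on the CPU
index_device = "cuda" if torch.cuda.is_available() else "cpu"
closest_vertices = torch.randint(0, 100, (4, 5), device=index_device)

# Indexing a CPU tensor with CUDA indices raises the RuntimeError shown above.
# Copying the indices to the CPU first avoids it; when the indices are already
# on the CPU, .cpu() changes nothing, so the line is safe either way.
vis = (embed_map[closest_vertices.cpu()].clip(0, 1) * 255.0).cpu().numpy()
print(vis.shape)
```

The exact line changed by the commit may differ from this sketch; the point is only that the index tensor is moved to the device of the tensor it indexes.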

Fix a RuntimeError by explicitly copying an index array to the CPU.
@facebook-github-bot added the CLA Signed label May 7, 2024
Contributor

@Programmer-RD-AI Programmer-RD-AI left a comment

Doesn't this change have an effect when closest_vertices is stored on the GPU?

@PieroV
Author

PieroV commented May 28, 2024

I think I tested this change with the GPU (I'm running an Nvidia GPU on my Linux machine), but I'm not sure; I'm really not a PyTorch expert.

Do you have a suggestion on how I can force the script to run on the GPU? (I think it already was, though, since the CPU load was low and the GPU was actually making noise.)

@Programmer-RD-AI
Contributor

You could use something like torch.cuda.is_available() to check whether the GPU or the CPU should be used and act accordingly; making that change in the PR would be much better, in my opinion.
Also, Detectron2 runs on the GPU by default if one is available; I have not run it on a CPU, though.
Best regards,

@PieroV
Author

PieroV commented May 28, 2024

Ok, I tried to revert my patch and added

print("Is CUDA available?", torch.cuda.is_available())

Result:

/home/piero/tmp/venv-cv/lib/python3.11/site-packages/torch/functional.py:507: UserWarning: torch.meshgrid: in an upcoming release, it will be required to pass the indexing argument. (Triggered internally at ../aten/src/ATen/native/TensorShape.cpp:3549.)
  return _VF.meshgrid(tensors, **kwargs)  # type: ignore[attr-defined]
[05/28 14:51:37 apply_net]: Processing /media/edati/kinect/bosca1/rgb-orig/rgb_00000.jpg
Is CUDA available? True
Traceback (most recent call last):
  File "/home/piero/tmp/detectron2/projects/DensePose/apply_net.py", line 353, in <module>
    main()
  File "/home/piero/tmp/detectron2/projects/DensePose/apply_net.py", line 349, in main
    args.func(args)
  File "/home/piero/tmp/detectron2/projects/DensePose/apply_net.py", line 105, in execute
    cls.execute_on_outputs(context, {"file_name": file_name, "image": img}, outputs)
  File "/home/piero/tmp/detectron2/projects/DensePose/apply_net.py", line 284, in execute_on_outputs
    image_vis = visualizer.visualize(image, data)
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/piero/tmp/detectron2/projects/DensePose/densepose/vis/base.py", line 188, in visualize
    image = visualizer.visualize(image, data[i])
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/piero/tmp/detectron2/projects/DensePose/densepose/vis/densepose_outputs_vertex.py", line 94, in visualize
    vis = (embed_map[closest_vertices].clip(0, 1) * 255.0).cpu().numpy()
           ~~~~~~~~~^^^^^^^^^^^^^^^^^^
RuntimeError: indices should be either on cpu or on the same device as the indexed tensor (cpu)

The script doesn't run without CUDA at all:

CUDA_VISIBLE_DEVICES="" python apply_net.py show configs/cse/densepose_rcnn_R_50_FPN_s1x.yaml https://dl.fbaipublicfiles.com/densepose/cse/densepose_rcnn_R_50_FPN_s1x/251155172/model_final_c4ea5f.pkl ../rgb_00000.jpg --output .../out.png dp_vertex,bbox -v
[05/28 14:52:44 apply_net]: Loading config from configs/cse/densepose_rcnn_R_50_FPN_s1x.yaml
[05/28 14:52:44 apply_net]: Loading model from https://dl.fbaipublicfiles.com/densepose/cse/densepose_rcnn_R_50_FPN_s1x/251155172/model_final_c4ea5f.pkl
Traceback (most recent call last):
  File "/home/piero/tmp/detectron2/projects/DensePose/apply_net.py", line 353, in <module>
    main()
  File "/home/piero/tmp/detectron2/projects/DensePose/apply_net.py", line 349, in main
    args.func(args)
  File "/home/piero/tmp/detectron2/projects/DensePose/apply_net.py", line 94, in execute
    predictor = DefaultPredictor(cfg)
                ^^^^^^^^^^^^^^^^^^^^^
  File "/home/piero/tmp/venv-cv/lib/python3.11/site-packages/detectron2-0.6-py3.11-linux-x86_64.egg/detectron2/engine/defaults.py", line 282, in __init__
    self.model = build_model(self.cfg)
                 ^^^^^^^^^^^^^^^^^^^^^
  File "/home/piero/tmp/venv-cv/lib/python3.11/site-packages/detectron2-0.6-py3.11-linux-x86_64.egg/detectron2/modeling/meta_arch/build.py", line 23, in build_model
    model.to(torch.device(cfg.MODEL.DEVICE))
  File "/home/piero/tmp/venv-cv/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1152, in to
    return self._apply(convert)
           ^^^^^^^^^^^^^^^^^^^^
  File "/home/piero/tmp/venv-cv/lib/python3.11/site-packages/torch/nn/modules/module.py", line 802, in _apply
    module._apply(fn)
  File "/home/piero/tmp/venv-cv/lib/python3.11/site-packages/torch/nn/modules/module.py", line 802, in _apply
    module._apply(fn)
  File "/home/piero/tmp/venv-cv/lib/python3.11/site-packages/torch/nn/modules/module.py", line 825, in _apply
    param_applied = fn(param)
                    ^^^^^^^^^
  File "/home/piero/tmp/venv-cv/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1150, in convert
    return t.to(device, dtype if t.is_floating_point() or t.is_complex() else None, non_blocking)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/piero/tmp/venv-cv/lib/python3.11/site-packages/torch/cuda/__init__.py", line 302, in _lazy_init
    torch._C._cuda_init()
RuntimeError: No CUDA GPUs are available

Have you tried running the command I posted?
If so, are you having trouble reproducing the error? Could it be due to a mismatch in PyTorch versions?
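The last traceback fails at `model.to(torch.device(cfg.MODEL.DEVICE))`, which suggests the config still requests CUDA even though the GPUs were hidden. Below is a minimal sketch of a CPU fallback. The `cfg` object is a plain stand-in, not the real Detectron2 config, and `resolve_device` is a hypothetical helper introduced only for this illustration.

```python
from types import SimpleNamespace

import torch


def resolve_device(requested: str) -> torch.device:
    """Return the requested device, falling back to the CPU if CUDA is unusable."""
    device = torch.device(requested)
    if device.type == "cuda" and not torch.cuda.is_available():
        return torch.device("cpu")
    return device


# Stand-in config: only the MODEL.DEVICE field seen in the traceback is modeled.
cfg = SimpleNamespace(MODEL=SimpleNamespace(DEVICE="cuda"))
cfg.MODEL.DEVICE = str(resolve_device(cfg.MODEL.DEVICE))

model = torch.nn.Linear(4, 2)
model.to(torch.device(cfg.MODEL.DEVICE))  # same call pattern as in the traceback
print(cfg.MODEL.DEVICE)
```

With the real script, the equivalent step would presumably be setting `cfg.MODEL.DEVICE` to `"cpu"` before the predictor is built.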

@Programmer-RD-AI
Contributor

Hi,
In the solution you gave, the data ends up on the CPU regardless of whether a GPU is available. The error was originally caused by closest_vertices being on the CPU, so transferring it to the GPU instead would give better performance (I'm not sure exactly how much, but a significant amount).
Best regards,

@PieroV
Author

PieroV commented May 29, 2024

it would have better performance (not sure exactly how much, but a significant amount)

Would it though?
It seems to me this is only the visualization phase, it's fine if it's on CPU.
Hasn't the inference already happened by now?

@Programmer-RD-AI
Contributor

By ensuring closest_vertices is on the same device as embed_map, you avoid the RuntimeError without compromising performance significantly. This approach provides a balanced solution, maintaining the efficiency of GPU operations during inference while ensuring compatibility and simplicity during visualization.
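A device-agnostic variant of that idea could look like the sketch below: rather than hard-coding `.cpu()`, the indices are moved to whatever device `embed_map` is on. The helper name `gather_vertex_values` is hypothetical and the tensors are toy values.

```python
import torch


def gather_vertex_values(embed_map: torch.Tensor, closest_vertices: torch.Tensor) -> torch.Tensor:
    """Index embed_map after moving the indices to embed_map's device."""
    indices = closest_vertices.to(embed_map.device)
    return embed_map[indices]


embed_map = torch.linspace(0.0, 1.0, steps=5)  # toy per-vertex values on the CPU
closest_vertices = torch.tensor([[0, 4], [2, 2]])
values = gather_vertex_values(embed_map, closest_vertices)
print(values.device, values.shape)
```

This works whichever of the two tensors ends up on the GPU, at the cost of one possible device transfer for the index tensor.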

@PieroV
Author

PieroV commented May 29, 2024

print(embed_map.device)

--> cpu

@PieroV
Author

PieroV commented May 29, 2024

From what I can see, embed_map is already on the CPU, and vis is eventually going to be on the CPU as well (there's a .cpu() after the clipping and multiplication).
So that .cpu() is redundant (the code still works after removing it), but maybe it could be moved before applying the mask (even if embed_map were on the GPU, there's probably no real advantage in doing the clipping and multiplication there).

I'm not going to do any refactors to make sure embed_map stays on the GPU in the callers of the failing method in this PR. If needed, I'll let someone who knows DensePose better than I do handle it, and I can open an issue instead (but in that case I'd still like this PR to be merged, as the current state doesn't work for me).
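The redundancy claim can be checked with a small sketch, assuming both tensors are already on the CPU (toy shapes and random values): `Tensor.cpu()` returns the same tensor when no transfer is needed, and moving to the CPU before or after the clip and scale yields identical values.

```python
import torch

embed_map = torch.rand(50)  # already on the CPU
closest_vertices = torch.randint(0, 50, (3, 3))

# Tensor.cpu() performs no copy when the tensor is already in CPU memory.
same_object = embed_map.cpu() is embed_map

# Calling .cpu() after the arithmetic (as in the traceback) or before it
# gives the same array.
after = (embed_map[closest_vertices].clip(0, 1) * 255.0).cpu().numpy()
before = (embed_map.cpu()[closest_vertices.cpu()].clip(0, 1) * 255.0).numpy()
print(same_object, bool((after == before).all()))
```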

@Programmer-RD-AI
Contributor

@PieroV,
Thank you for your detailed clarification regarding the device placement of embed_map and closest_vertices.
We can follow up on "I'm not going to do any refactors to make sure embed_map stays on the GPU in the callers of the failing method in this PR." in a future PR.
Great Contributions :)

Best regards,
Ranuga Disansa

@matejsuchanek

The problem is actually in the get_xyz_vertex_embedding function. See #5003 (comment). When the mesh corresponds to smpl_27554, the result is never moved to the correct device.
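The kind of fix this points to can be sketched as follows. This is not the actual get_xyz_vertex_embedding code: `normalized_embedding` is a hypothetical stand-in, and it assumes the affected branch builds its tensor from data loaded on the CPU and then returns it without a device transfer.

```python
import numpy as np
import torch


def normalized_embedding(values: np.ndarray, device: torch.device) -> torch.Tensor:
    """Hypothetical stand-in for a branch that builds the embedding from a NumPy array."""
    embed_map = torch.tensor(values).float()  # tensors built from NumPy start on the CPU
    embed_map -= embed_map.min()
    embed_map /= embed_map.max()
    return embed_map.to(device)  # the transfer the comment above says is missing


device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
embed_map = normalized_embedding(np.array([2.0, 4.0, 6.0]), device)
print(embed_map.device)
```

Moving the result to the requested device inside the function would keep every caller, including the vertex visualizer, on a single device without per-call workarounds.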
