nvrtc: error: invalid value for --gpu-architecture (-arch) #263

Open
giandre opened this issue Jul 31, 2024 · 2 comments

Comments

giandre commented Jul 31, 2024

I am stuck on the following error and am trying to figure out what to do:

(video_retalking) H:\z-aitools\video-retalking>python inference.py --face examples/face/1.mp4 --audio examples/audio/1.wav --outfile results/1_1.mp4
[Info] Using cuda for inference.
[Step 0] Number of frames available for inference: 135
[Step 1] Landmarks Extraction in Video.
landmark Det:: 1%|▍ | 1/135 [00:14<31:21, 14.04s/it]nvrtc: error: invalid value for --gpu-architecture (-arch)

nvrtc compilation failed:

#define NAN __int_as_float(0x7fffffff)
#define POS_INFINITY __int_as_float(0x7f800000)
#define NEG_INFINITY __int_as_float(0xff800000)

template<typename T>
__device__ T maximum(T a, T b) {
return isnan(a) ? a : (a > b ? a : b);
}

template<typename T>
__device__ T minimum(T a, T b) {
return isnan(a) ? a : (a < b ? a : b);
}

extern "C" __global__
void fused_cat_cat(float* tinput0_42, float* tinput0_46, float* tout3_67, float* tinput0_60, float* tinput0_52, float* tout3_71, float* aten_cat, float* aten_cat_1) {
{
if (blockIdx.x<512 ? 1 : 0) {
aten_cat_1[512 * blockIdx.x + threadIdx.x] = ((((512 * blockIdx.x + threadIdx.x) / 1024) % 256<192 ? 1 : 0) ? ((((512 * blockIdx.x + threadIdx.x) / 1024) % 256<128 ? 1 : 0) ? __ldg(tinput0_60 + (512 * blockIdx.x + threadIdx.x) % 262144) : __ldg(tinput0_52 + (512 * blockIdx.x + threadIdx.x) % 262144 - 131072)) : __ldg(tout3_71 + (512 * blockIdx.x + threadIdx.x) % 262144 - 196608));
}
aten_cat[512 * blockIdx.x + threadIdx.x] = ((((512 * blockIdx.x + threadIdx.x) / 4096) % 256<192 ? 1 : 0) ? ((((512 * blockIdx.x + threadIdx.x) / 4096) % 256<128 ? 1 : 0) ? __ldg(tinput0_42 + (512 * blockIdx.x + threadIdx.x) % 1048576) : __ldg(tinput0_46 + (512 * blockIdx.x + threadIdx.x) % 1048576 - 524288)) : __ldg(tout3_67 + (512 * blockIdx.x + threadIdx.x) % 1048576 - 786432));
}
}

landmark Det:: 1%|▍ | 1/135 [00:14<31:57, 14.31s/it]
Traceback (most recent call last):
File "inference.py", line 347, in <module>
main()
File "inference.py", line 84, in main
lm = kp_extractor.extract_keypoint(frames_pil, './temp/'+base_name+'_landmarks.txt')
File "H:\z-aitools\video-retalking\third_part\face3d\extract_kp_videos.py", line 29, in extract_keypoint
current_kp = self.extract_keypoint(image)
File "H:\z-aitools\video-retalking\third_part\face3d\extract_kp_videos.py", line 57, in extract_keypoint
return keypoints
UnboundLocalError: local variable 'keypoints' referenced before assignment
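
The final UnboundLocalError is most likely a secondary symptom rather than the root cause. The nvrtc error is printed, yet the traceback ends at `return keypoints`, which suggests the original exception was caught and logged before the variable was ever assigned. A minimal sketch of that pattern follows. The names `extract_keypoint_sketch`, `failing_detector`, and `run_demo` are hypothetical and for illustration only; this is an assumption about the control flow, not the actual code in extract_kp_videos.py.

```python
def extract_keypoint_sketch(detect, image):
    """Hypothetical stand-in for a keypoint extractor that swallows errors."""
    try:
        keypoints = detect(image)
    except RuntimeError as err:
        # The real failure is only printed, so execution continues.
        print(err)
    # If detect() raised, `keypoints` was never bound at this point.
    return keypoints


def failing_detector(image):
    # Simulates the runtime compilation failure from the log above.
    raise RuntimeError(
        "nvrtc: error: invalid value for --gpu-architecture (-arch)"
    )


def run_demo():
    """Return the name of the exception that finally reaches the caller."""
    try:
        extract_keypoint_sketch(failing_detector, image=None)
    except UnboundLocalError as err:
        return type(err).__name__
    return "no error"


print(run_demo())
```

If the code has this shape, fixing the nvrtc failure should make the UnboundLocalError disappear as well, since the variable is then assigned normally.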

I did some troubleshooting with ChatGPT, and these are my current CUDA settings, since I suspect the problem is related to them. I am using an RTX 4090.
(video_retalking) H:\z-aitools\video-retalking>python test.py
Is CUDA available: True
CUDA device count: 1
CUDA current device: 0
CUDA device name: NVIDIA GeForce RTX 4090
CUDA arch list: ['sm_37', 'sm_50', 'sm_60', 'sm_61', 'sm_70', 'sm_75', 'sm_80', 'sm_86', 'compute_37']

I appreciate any help!

lordlxh commented Aug 4, 2024

Updating PyTorch and CUDA may help.
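
One way to judge whether an upgrade is needed is to compare the GPU's compute capability with the architectures the installed PyTorch build targets. The RTX 4090 has compute capability 8.9 (`sm_89`), which is absent from the arch list reported above. Precompiled `sm_86` kernels can generally still run on such a card, but kernels compiled at runtime through NVRTC appear to be built for the device's own architecture, which an older bundled CUDA toolkit may not recognize. A minimal sketch of the comparison, where `arch_is_listed` is a hypothetical helper name and the inputs are hard-coded from this issue:

```python
def arch_is_listed(capability, arch_list):
    """Return True if the exact sm_XY target for `capability` is in `arch_list`."""
    major, minor = capability
    return f"sm_{major}{minor}" in arch_list


# Arch list reported in this issue.
reported_arch_list = ['sm_37', 'sm_50', 'sm_60', 'sm_61', 'sm_70',
                      'sm_75', 'sm_80', 'sm_86', 'compute_37']

# Compute capability 8.9 corresponds to the RTX 4090.
print(arch_is_listed((8, 9), reported_arch_list))
# Compute capability 8.6 is shown for comparison.
print(arch_is_listed((8, 6), reported_arch_list))

# On a machine with PyTorch installed, the live values could be read with:
#   import torch
#   capability = torch.cuda.get_device_capability(0)
#   arch_list = torch.cuda.get_arch_list()
```

A missing entry does not prove that every operation will fail, but it is a reasonable hint that the installed build predates the GPU and that a newer PyTorch build with a newer CUDA runtime is worth trying.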

giandre commented Aug 4, 2024

I was able to fix it, though I do not recall how. However, this process takes far too long to complete, so I decided to continue trying with Dinet and Sadtalker instead.
