OutOfMemoryError #15

Open
Tefor opened this issue Aug 21, 2023 · 2 comments
Comments

@Tefor

Tefor commented Aug 21, 2023

Hi, your code is great, but when I bring your SMT module into my training I always run out of memory at the `attn = (q @ k.transpose(-2, -1)) * self.scale` statement while computing the attention, even with the batch size set to 1. Could you give me some ideas on how to modify it? I'm only using stage 3's structure.
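
For context, a minimal multi-head self-attention sketch (hypothetical shapes and head count, not the actual SMT code) shows where the memory goes: the `attn` tensor produced by that line has shape (B, heads, N, N) with N = H*W, so it grows quadratically with the number of tokens regardless of batch size.

```python
import torch
import torch.nn as nn

class NaiveAttention(nn.Module):
    """Standard multi-head self-attention; the (N x N) matrix is the OOM point."""
    def __init__(self, dim, num_heads=8):
        super().__init__()
        self.num_heads = num_heads
        self.scale = (dim // num_heads) ** -0.5
        self.qkv = nn.Linear(dim, dim * 3)
        self.proj = nn.Linear(dim, dim)

    def forward(self, x):  # x: (B, N, C)
        B, N, C = x.shape
        qkv = self.qkv(x).reshape(B, N, 3, self.num_heads, C // self.num_heads)
        q, k, v = qkv.permute(2, 0, 3, 1, 4).unbind(0)   # each: (B, heads, N, head_dim)
        attn = (q @ k.transpose(-2, -1)) * self.scale    # (B, heads, N, N) <- allocates N*N per head
        attn = attn.softmax(dim=-1)                      # softmax needs a second N*N-sized buffer
        out = (attn @ v).transpose(1, 2).reshape(B, N, C)
        return self.proj(out)
```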

@AFeng-x
Owner

AFeng-x commented Aug 22, 2023

Hi, here are two simple things you can try:
(1) reduce the number of channels (e.g. 256 -> 128)
(2) reduce the number of blocks (e.g. 12 -> 6)
Also, you need to confirm whether the resolution at stage 3 is too high for your own task; the rough estimate below illustrates why.
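
A back-of-the-envelope fp32 estimate (hypothetical head count) makes the point: the attention matrix alone has shape (B, heads, N, N) and grows quadratically in N = H*W, whereas fewer channels or blocks only reduce memory linearly.

```python
def attn_matrix_gib(batch, heads, h, w, bytes_per_el=4):
    # fp32 size of one (batch, heads, N, N) attention matrix, N = h * w
    n = h * w
    return batch * heads * n * n * bytes_per_el / 2**30

print(attn_matrix_gib(batch=1, heads=8, h=128, w=128))  # 8.0 GiB at a 128x128 feature map
print(attn_matrix_gib(batch=1, heads=8, h=64, w=64))    # 0.5 GiB after halving the resolution
```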

@Tefor
Author

Tefor commented Aug 24, 2023

Thank you for your answer; it is indeed a resolution problem. My input resolution is 128×128, so when the attention is computed, N = H*W = 16384, which is too big. May I ask why your attention calculation has to reshape the input x from (B, C, H, W) to (B, N, C)? Computing the attention this way takes up so much memory.
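
The reshape exists because global self-attention treats each spatial position as a token: the feature map has to be flattened to (B, N, C) before the N×N similarity matrix over those tokens can be formed. A minimal sketch of that flatten, plus one possible mitigation assuming PyTorch >= 2.0 (torch.nn.functional.scaled_dot_product_attention is a general PyTorch facility, not part of the SMT code):

```python
import torch
import torch.nn.functional as F

B, C, H, W = 1, 256, 128, 128
x = torch.randn(B, C, H, W)

# Flatten the spatial grid into a token sequence: (B, C, H, W) -> (B, N, C)
tokens = x.flatten(2).transpose(1, 2)  # N = H * W = 16384

# Single-head call, for illustration only. The fused kernel can dispatch to
# FlashAttention-style backends that never materialize the full (N x N) matrix.
q = k = v = tokens.unsqueeze(1)  # (B, 1, N, C)
out = F.scaled_dot_product_attention(q, k, v)  # (B, 1, N, C)
```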
