Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ParallelFor with BoxND #4052

Merged

Conversation

AlexanderSinn
Copy link
Member

@AlexanderSinn AlexanderSinn commented Jul 29, 2024

Summary

This PR adds the ability to use ParallelFor(RNG) and Loop(OnCpu) variants with BoxND.

Additional background

The functions supplied to ParallelFor etc. can have one of the following input parameter combinations when used with BoxND<dim>:

(int, int, ...) // dim times
(IntVectND<dim>)
(int, int, int) // if dim is 1 or 2, for backwards compatability

After that come additional parameters such as int n, Gpu::KernelInfo or RandomEngine.

I had to mark the constructor of GPU Handler as explicit to avoid ambiguity between (i, j, 0) and (i, j, Gpu::Handler).

Checklist

The proposed changes:

  • fix a bug or incorrect behavior in AMReX
  • add new capabilities to AMReX
  • changes answers in the test suite to more than roundoff level
  • are likely to significantly affect the results of downstream AMReX users
  • include documentation in the code and/or rst files, if appropriate

Src/Base/AMReX_GpuLaunchFunctsC.H Dismissed Show resolved Hide resolved
Src/Base/AMReX_Loop.H Fixed Show fixed Hide fixed
Src/Base/AMReX_Loop.H Fixed Show fixed Hide fixed
@WeiqunZhang
Copy link
Member

I am a little bit worried by loops like this

        for (int i2 = lo[2], h2 = hi[2]; i2 <= h2; ++i2) { iv[2] = i2;
        for (int i1 = lo[1], h1 = hi[1]; i1 <= h1; ++i1) { iv[1] = i1;
        for (int i0 = lo[0], h0 = hi[0]; i0 <= h0; ++i0) { iv[0] = i0;
            call_f_intvect(f,iv);
        }}}

where iv is IntVectND<dim>. The compilers I have tried seem to have figured out that iv[0] = i0 should not prevent vectorization even though iv is shared. But still, would it be better to avoid setting iv inside the loop?

@WeiqunZhang
Copy link
Member

Maybe my last comment is not an issue. I have not noticed any performance differences in my testing.

@WeiqunZhang WeiqunZhang merged commit e792933 into AMReX-Codes:development Aug 10, 2024
71 of 72 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants