Decrease number of guard cells allocated as a function of interpolation order #2336
base: development
Conversation
@EZoni @RemiLehe I will probably need some help from you on this PR (nothing urgent, it can definitely wait until after the milestone). In the end, I have a single CI test failing.
Hi @NeilZaim, thank you for this PR! I think it might be best to discuss your question and this PR in general in person; I will try to coordinate on Slack. In general, we might need to be careful about differences between even and odd shape factors. I think the nice plot taken from this chapter will help guide our discussion (it will help reviewers go through your PR as well). In particular, I would think that the guard cells that you set should rather be computed differently, based on the assumption that the shape factor should be centered on a given particle's position, if we think about the deposition of the particle's charge and current. I might be wrong about this, though, or I might be confusing something at the moment. I think it will be easier to discuss in person with you and Remi.
Hi @EZoni. Sure, it's a good idea to have a meeting to talk about this. I think I'll be available most mornings (your time) in the coming weeks.
Quick summary of an offline discussion with @EZoni and @NeilZaim: the failing tests with
So I think that this PR is ready now. I've had to modify a few benchmarks for the following reasons:
Trigger CI again
Ok, it looks like the macOS issue persists. @ax3l, in case I cannot reproduce the issue locally, do you know the best way to debug the segfault? Am I doomed to debugging in CI by adding print statements everywhere? Is there a way to access the backtraces generated during CI?
Ok, looks like the macOS issue is indeed fixed by #3294.
The NCI corrector benchmarks were also failing since they were added in #3252. This is due to machine precision differences when reducing the number of guard cells (same as above).
Force-pushed AllocateFewerGuardCells from ec3db2a to 76c4e33.
I pushed a few commits to update this old branch and see if we can still get these changes to work.
I'm wondering if we're not being overly cautious in the number of guard cells we allocate as a function of interpolation order. For instance, for a regular simulation with order-3 shape factors, we currently end up with 4 guard cells (because the count is later rounded up to the closest even integer), while I believe only 2 are necessary.
So I'm trying to reduce this number in this PR and see whether it affects our automated tests.
(This should not affect PSATD simulations, where the number of allocated guard cells is dominated by the needs of the field solver, not by the interpolation order.)
Edit: Updates after the CI tests. Some tests have been failing and I've had to make a few modifications:

- The benchmarks of `averaged_galilean_2d_psatd_hybrid` have changed. After investigating, this is because this PR changes the number of allocated guard cells from `(9,16)` to `(8,16)`. Indeed, the number of guard cells in the x direction was interestingly not determined by the field solver order, but by the high CFL used, which means that particles can move by 4 cells in the x direction within a timestep; this resulted in a computed value of `ng_alloc_Rho` of `(9,6)` (eventually the number of allocated guard cells is the same for all arrays, so all arrays ended up with `(9,16)` guard cells). However, the value of `9` was obtained by assuming that 3 guard cells are necessary for a shape factor of order 3, which is not the case (only 2 are needed). Hence, I think it makes sense that this test only needs 8 guard cells in the x direction and that we can reset the benchmarks (which I've done).
- `LaserAccelerationBoost` was off by ~1e-7. I've investigated this and it seems to be machine precision errors in the current deposition (which end up making 1e-7 errors in the ion momentum, which is practically 0 in the x direction). With the changes of this PR, the number of guard cells for the J field decreases from 4 to 3 in this test. This can affect the results (machine precision errors) via the calculation of `xyzmin` in `WarpX/Source/Particles/WarpXParticleContainer.cpp` (line 420 in 0512340). I've verified that changing the number of guard cells of the J array back from 3 to 4 recovers the previous benchmark, and that slightly perturbing the values of `xyzmin` with lines like `xyzmin[0] = xyzmin[0] + xyzmin[0]*5.e-15;` also modifies this checksum by ~1e-7. So I think that we can safely reset this benchmark as well.
- `PEC_particle`: same as above, except that this time the numerical errors came from both the current deposition and the field gathering. In the latter, it is the computation of `xyzmin` in `WarpX/Source/Particles/PhysicalParticleContainer.cpp` (line 2385 in b9fb50c) that is sensitive to machine precision errors and that was changed when reducing the number of guard cells for the E and B fields from 4 to 2. This resulted in very large relative changes (>1) for some checksums, but these were checksums with values extremely close to 0, so it is not surprising that they are sensitive to machine precision errors.