Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Thermostat for implicit midpoint #239

Merged
merged 20 commits into from
Feb 24, 2024
Merged

Thermostat for implicit midpoint #239

merged 20 commits into from
Feb 24, 2024

Conversation

ddahlbom
Copy link
Member

@ddahlbom ddahlbom commented Feb 19, 2024

This PR adds a thermostat to the implicit midpoint integrator. The interface changes are minimal: ImplicitMidpoint now optionally takes kT and λ. By default both these keywords are set to zero.

All the finite temperature sampling tests (against analytical solutions for anisotropies, spin chains, etc.) now also exercise the implicit midpoint integrator.

We may in the future wish to redo some naming. (Having Langevin specify a Heun integrator makes less sense now.) But these changes should not break anyone's code and they will hopefully be useful to @Lazersmoke's efforts.

@ddahlbom
Copy link
Member Author

Will update docstrings soon.

@Lazersmoke
Copy link
Contributor

Thanks David!! This will be really helpful :)

function rhs!(ΔZ, Z, ξ, HZ, integrator, sys::System{N}) where N
(; kT, λ, Δt) = integrator
for site in eachsite(sys)
ΔZ′ = -im*√(2*Δt*kT*λ)*ξ[site] - Δt*(im+λ)*HZ[site]
Copy link
Member

@kbarros kbarros Feb 19, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To reduce computational cost, it might make sense to break this up into two operations:

ΔZ′ = - Δt*(im+λ)*HZ[site]
if kT > 0
    ΔZ′ += -im*√(2*Δt*kT*λ)*ξ[site]
end

This will work well for hardware branch prediction, and avoids unnecessary operations in the case of zero noise. (I don't know how expensive this is relatively to calculating HZ -- might be worth checking?)

It's also possible that the square root might be an expensive operation in the case that there exists a noise term? It's the square root of a constant, so the compiler might be smart enough to move it outside the for loop. But probably better not to rely on compiler optimizations. Maybe define:

noise_magnitude = (2*Δt*kT*λ)

outside the for loop?

Maybe before applying these optimizations we should check whether any of our CI tests slow down with this PR?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point. Thanks. Will change.

@kbarros
Copy link
Member

kbarros commented Feb 19, 2024

Looks very nice to me. The only question I have is about whether this slows down some workflows -- ImplicitMidpoint timesteps will be the bottleneck for many calculations. If so, there is an optimization idea in comment above.

@ddahlbom
Copy link
Member Author

ddahlbom commented Feb 19, 2024

Yes, I was concerned about that too. I'll make the change you suggested and do some simple benchmarking. I guess if it is noticeably slower, we can go back to dispatching on the type level (so instead of relying on branch prediction we just do one function lookup). Maybe have an ImplicitMidpointLangevin integrator and revert back to the prior meaning for ImplicitMidpoint.

@@ -282,7 +289,7 @@ function step!(sys::System{0}, integrator::ImplicitMidpoint)
# improved midpoint estimator s̄′.
@. ŝ = normalize_dipole(s̄, sys.κs)
set_energy_grad_dipoles!(∇E, ŝ, sys)
@. s̄′ = s + 0.5 * Δt * rhs_dipole(ŝ, -∇E)
@. s̄′ = s + 0.5 * (Δt * rhs_dipole(ŝ, -∇E, λ) + √Δt * rhs_dipole(ŝ, ξ))
Copy link
Member

@kbarros kbarros Feb 19, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

New call to rhs_dipole(ŝ, ξ) might incur nontrivial computational cost. This could be broken into a separate block:

if kT > 0
    @. s̄′ += 0.5 * (Δt*2λ*kT) * rhs_dipole(ŝ, ξ))
end

And then remove the line ξ .*= √(2λ*kT) above. Note that this scaling of ξ as a pure Gaussian is more consistent with the SU(N) case.

It's not clear to me whether the compiler will precalculate √(Δt*2λ*kT). Could do this manually by defining

noise_magnitude = (2Δt*λ*kT)

outside the broadcast operation.

@ddahlbom
Copy link
Member Author

ddahlbom commented Feb 19, 2024

I took a simple system (S=1, one exchange, onsite anisotropy), thermalized it, and benched two versions of the step function: (1) The original code; and (3) the new code with suggested optimizations and λ and kT set to zero.

For dipole:

Reference           -- 46.667 μs
New Code            -- 48.666 μs

For SU(N)

Reference           -- 93.667 μs
New Code            -- 107.292 μs

We see a slight slowdown in dipole mode and a serious slowdown in SU(N) mode. Perhaps this makes it worthwhile to keep around the old ImplicitMidpoint (with no thermostat) and add an ImplicitMidpointLangevin.

Sound reasonable?

@ddahlbom
Copy link
Member Author

It turns out the main reason for the performance regression was the fact that the new approach added a call to proj in all cases, even when lambda and kT were zero.

The current approach includes a few more branching conditions. The big branching condition is based on whether lambda is nonzero.

These conditionals are kind of ugly, but the result is: no new type, and no performance regression.

@kbarros
Copy link
Member

kbarros commented Feb 20, 2024

I added some commits that fully unify the dipole and SU(N) integration schemes (both Heun and implicit midpoint). This implementation also avoids code redundancy.

Because the convergence check is slightly modified, it was necessary to recompute the reference data for "Sampled correlations reference".

According to the benchmark below, performance is preserved for implicit midpoint without Langevin thermostat.

using Sunny, BenchmarkTools

latvecs = lattice_vectors(1, 10, 10, 90, 90, 90)
cryst = Crystal(latvecs, [[0,0,0]])

# Modify for :dipole or :SUN
sys = System(cryst, (1000, 1, 1), [SpinInfo(1, g=2, S=1)], :SUN)
set_exchange!(sys, -1, Bond(1, 1, (1, 0, 0)))
randomize_spins!(sys)

integrator = ImplicitMidpoint(0.02; atol=1e-5)
# suggest_timestep(sys, integrator; tol=1e-3)

function f(sys, integrator)
    randomize_spins!(sys)
    for _ in 1:100
        step!(sys, integrator)
    end
end

integrator.λ = 0
integrator.kT = 0
@btime f(sys, integrator)
# :SUN with S=1
# (new) 14.4 ms <- 14.58 ms (old)
# :dipole
# (new) 6.27 ms <- 6.30 ms (old)

integrator.λ = 0.0001
integrator.kT = 0
@btime f(sys, integrator)
# :SUN with S=1 :: 16.95 ms
# :dipole       :: 6.825 ms

integrator.λ = 0.0001
integrator.kT = 0.0001
@btime f(sys, integrator)
# :SUN with S=1 :: 18.7 ms
# :dipole       :: 7.62 ms

Copy link
Member Author

@ddahlbom ddahlbom left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The reorg looks very nice to me.

I guess Github doesn't let you commend on unchanged code, but these lines should probably be changed. I don't think there's any reason to have different versions now.

function suggest_timestep(sys::System{N}, integrator::Langevin; tol) where N
    (; Δt, λ, kT) = integrator
    suggest_timestep_aux(sys; tol, Δt, λ, kT)
end
function suggest_timestep(sys::System{N}, integrator::ImplicitMidpoint; tol) where N
    (; Δt) = integrator
    suggest_timestep_aux(sys; tol, Δt, λ=0, kT=0)
end

src/Integrators.jl Outdated Show resolved Hide resolved
@kbarros
Copy link
Member

kbarros commented Feb 22, 2024

Thanks a lot for the nice suggestions @ddahlbom. Please let me know if everything is addressed with the recent commits.

@ddahlbom
Copy link
Member Author

Looks great to me.

@kbarros kbarros merged commit 95f641c into main Feb 24, 2024
4 checks passed
@kbarros kbarros deleted the langevin-implicit-midpoint branch February 24, 2024 14:39
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants