
Resolve gradient mismatches in benchmarks #340

Closed
seabbs opened this issue Jul 7, 2024 · 10 comments · Fixed by #388
Labels
bug Something isn't working EpiAware

Comments

@seabbs (Collaborator) commented Jul 7, 2024

For several utilities, benchmarking suggests that different backends give different gradients. This should be investigated, as it may indicate correctness or performance issues.

@SamuelBrand1 (Collaborator)

Is the difference across all options, or is Zygote the outlier? ReverseDiff with a compiled tape is known to have these problems when there are logical branches in the code.
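As an aside, this failure mode can be illustrated without any AD library at all: a compiled tape records the operations executed for the sample input, so whichever branch was taken at recording time is frozen into every replay. A minimal plain-Python sketch of the idea (hypothetical code, not ReverseDiff or EpiAware):

```python
# Sketch (plain Python, not ReverseDiff) of why a compiled tape
# mis-handles logical branches: "recording" the function at one input
# bakes that input's branch into the replayed program.

def f(x):
    # Piecewise function: x^2 for x > 0, 3x otherwise.
    return x * x if x > 0 else 3 * x

def record_tape(f, x0):
    # "Compile" f at x0: only the operations on x0's branch are recorded.
    if x0 > 0:
        return lambda x: x * x   # recorded ops for the x > 0 path
    else:
        return lambda x: 3 * x   # recorded ops for the x <= 0 path

tape = record_tape(f, x0=2.0)  # recorded on the positive branch

print(f(-1.0))     # true value: -3.0
print(tape(-1.0))  # replayed tape: 1.0 -- wrong branch, so wrong value
                   # (and hence wrong gradient) for x <= 0
```

The same mechanism explains why only branch-free model code is safe to run with a compiled tape.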

@seabbs (Collaborator, Author) commented Jul 8, 2024

Compiled tape has issues.

@SamuelBrand1 (Collaborator)

That makes sense... The improvement path here is to track down the "wrong" branch(es) that are getting compiled.

seabbs added the `bug` (Something isn't working) label Jul 18, 2024
@seabbs (Collaborator, Author) commented Jul 19, 2024

[Screenshot 2024-07-18 at 23 12 21: benchmark output]

This is what I see in our current benchmarks. We could really do with an improved warning message here to help localise the problem.

@seabbs (Collaborator, Author) commented Jul 22, 2024

[Screenshot 2024-07-22 at 18 39 17: benchmark output]

In the latest benchmarks in #392 I still see a single instance of this that needs resolving.

@SamuelBrand1 (Collaborator)

Where is this benchmark?

@seabbs (Collaborator, Author) commented Jul 25, 2024

I think we can also use instabilities in our benchmarks (i.e. #400 (comment)) to indicate where problems are. It would be useful to consider whether there is a more formal way of checking this that requires less tracking across PRs.

(i.e. here it suggests that observation error models are problematic.)
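One possible shape for such a formal check: compare each backend's gradient against a finite-difference reference and flag disagreements automatically, rather than tracking warnings across PRs by eye. A minimal Python sketch of the idea (illustrative only; `finite_diff_grad` and `grads_match` are hypothetical helpers, not EpiAware code):

```python
import math

def finite_diff_grad(f, x, h=1e-6):
    # Central-difference reference gradient, one coordinate at a time.
    g = []
    for i in range(len(x)):
        xp, xm = list(x), list(x)
        xp[i] += h
        xm[i] -= h
        g.append((f(xp) - f(xm)) / (2 * h))
    return g

def grads_match(grad_fn, f, x, atol=1e-4):
    # Flag a gradient implementation that disagrees with the reference.
    ref = finite_diff_grad(f, x)
    got = grad_fn(x)
    return all(math.isclose(a, b, abs_tol=atol) for a, b in zip(got, ref))

# Toy log-density, with a correct gradient and a "frozen branch" gradient:
f = lambda x: x[0] ** 2 + 3 * x[1]
good_grad = lambda x: [2 * x[0], 3.0]
bad_grad = lambda x: [2 * x[0], 0.0]  # e.g. a branch compiled away

print(grads_match(good_grad, f, [1.0, 2.0]))  # True
print(grads_match(bad_grad, f, [1.0, 2.0]))   # False
```

Run at a handful of random points per model, this kind of check could turn the current "watch the benchmark warnings" workflow into a pass/fail test.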

@seabbs (Collaborator, Author) commented Aug 1, 2024

#414 localised some issues to the NegativeBinomialError model. It would be good to investigate this further to see if it is common across all error models or specific to the negative binomial.

See: #414 (comment)

@seabbs (Collaborator, Author) commented Aug 1, 2024

Running some more repetitions, I see the following throwing gradient issues (due to compiled reverse diff):

| Type | Count |
| --- | --- |
| `Ascertainment{NegativeBinomialError}` | 3 |
| `NegativeBinomialError{HalfNormal{Float64}}` | 2 |
```
Warnings from Model{typeof(generate_observations), (:obs_model, :y_t, :Y_t), (), (), Tuple{Ascertainment{NegativeBinomialError{HalfNormal{Float64}}, AbstractTuringLatentModel, var"#88#89", String}, Vector{Int64}, Vector{Int64}}, Tuple{}, DefaultContext}(EpiAware.EpiAwareBase.generate_observations, (obs_model = Ascertainment{NegativeBinomialError{HalfNormal{Float64}}, AbstractTuringLatentModel, var"#88#89", String}(NegativeBinomialError{HalfNormal{Float64}}(HalfNormal{Float64}=0.01)), PrefixLatentModel{FixedIntercept{Float64}, String}(FixedIntercept{Float64}(0.1), "Ascertainment"), var"#88#89"(), "Ascertainment"), y_t = [100, 100, 100, 100, 100, 100, 100, 100, 100, 100], Y_t = [100, 100, 100, 100, 100, 100, 100, 100, 100, 100]), NamedTuple(), DefaultContext()):
┌ Warning: `ad.compile` where `ad` is `AutoReverseDiff` has been deprecated and will be removed in v2. Instead it is available as a compile-time constant as `AutoReverseDiff{true}` or `AutoReverseDiff{false}`
Warnings from Model{typeof(generate_observations), (:obs_model, :y_t, :Y_t), (), (), Tuple{Ascertainment{NegativeBinomialError{HalfNormal{Float64}}, AbstractTuringLatentModel, var"#82#83", String}, Vector{Int64}, Vector{Int64}}, Tuple{}, DefaultContext}(EpiAware.EpiAwareBase.generate_observations, (obs_model = Ascertainment{NegativeBinomialError{HalfNormal{Float64}}, AbstractTuringLatentModel, var"#82#83", String}(NegativeBinomialError{HalfNormal{Float64}}(HalfNormal{Float64}=0.01)), PrefixLatentModel{FixedIntercept{Float64}, String}(FixedIntercept{Float64}(0.1), "Ascertainment"), var"#82#83"(), "Ascertainment"), y_t = [100, 100, 100, 100, 100, 100, 100, 100, 100, 100], Y_t = [100, 100, 100, 100, 100, 100, 100, 100, 100, 100]), NamedTuple(), DefaultContext()):
Warnings from Model{typeof(generate_observations), (:obs_model, :y_t, :Y_t), (), (), Tuple{Ascertainment{NegativeBinomialError{HalfNormal{Float64}}, AbstractTuringLatentModel, var"#64#65", String}, Vector{Int64}, Vector{Int64}}, Tuple{}, DefaultContext}(EpiAware.EpiAwareBase.generate_observations, (obs_model = Ascertainment{NegativeBinomialError{HalfNormal{Float64}}, AbstractTuringLatentModel, var"#64#65", String}(NegativeBinomialError{HalfNormal{Float64}}(HalfNormal{Float64}=0.01)), PrefixLatentModel{FixedIntercept{Float64}, String}(FixedIntercept{Float64}(0.1), "Ascertainment"), var"#64#65"(), "Ascertainment"), y_t = [100, 100, 100, 100, 100, 100, 100, 100, 100, 100], Y_t = [100, 100, 100, 100, 100, 100, 100, 100, 100, 100]), NamedTuple(), DefaultContext())
Warnings from Model{typeof(generate_observations), (:obs_model, :y_t, :Y_t), (), (), Tuple{NegativeBinomialError{HalfNormal{Float64}}, Vector{Float64}, Vector{Float64}}, Tuple{}, DefaultContext}(EpiAware.EpiAwareBase.generate_observations, (obs_model = NegativeBinomialError{HalfNormal{Float64}}(HalfNormal{Float64}=0.01)), y_t = [10.0, 10.0, 10.0, 10.0, 10.0, 10.0, 10.0, 10.0, 10.0, 10.0], Y_t = [10.0, 10.0, 10.0, 10.0, 10.0, 10.0, 10.0, 10.0, 10.0, 10.0]), NamedTuple(), DefaultContext())
Warnings from Model{typeof(generate_observations), (:obs_model, :y_t, :Y_t), (), (), Tuple{NegativeBinomialError{HalfNormal{Float64}}, Vector{Float64}, Vector{Float64}}, Tuple{}, DefaultContext}(EpiAware.EpiAwareBase.generate_observations, (obs_model = NegativeBinomialError{HalfNormal{Float64}}(HalfNormal{Float64}=0.01)), y_t = [10.0, 10.0, 10.0, 10.0, 10.0, 10.0, 10.0, 10.0, 10.0, 10.0], Y_t = [10.0, 10.0, 10.0, 10.0, 10.0, 10.0, 10.0, 10.0, 10.0, 10.0]), NamedTuple(), DefaultContext())
```

Something of a pattern, I think!

@seabbs (Collaborator, Author) commented Aug 15, 2024

I think with #442 and #415 these are now all handled, so closing.

seabbs closed this as completed Aug 15, 2024