Ensure that strictly positive distribution parameter is > 0 after exp() transformation #2310
Are your priors informative enough? Sometimes the priors are to blame for these sorts of numerical issues.

MCMC algorithms shouldn't suggest steps that cause any of the parameters to become non-finite, but `exp()` already underflows to exactly zero for finite, large negative arguments:

```julia
julia> exp(-746)
0.0
```

To avoid largely negative values of the log-scale parameter, you could truncate its prior. As a side remark, instead of a `Normal` prior followed by `exp()`, you can place a `LogNormal` prior on the positive parameter directly.
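To make that side remark concrete, here is a minimal sketch (my addition; plain Distributions.jl, with the Float64 underflow thresholds being standard IEEE 754 behaviour):

```julia
using Distributions

# exp returns the smallest positive subnormal near x ≈ -745 and
# exactly 0.0 just below that, which Normal(μ, σ) then rejects.
exp(-745.0)   # 5.0e-324
exp(-746.0)   # 0.0

# If x ~ Normal(μ, σ), then exp(x) ~ LogNormal(μ, σ), so the positive
# parameter can be given a LogNormal prior directly and Turing will
# handle the unconstrained transform internally.
median(LogNormal(0, 3)) == exp(median(Normal(0, 3)))  # true: both equal 1.0
```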
Thanks, I didn't realize that; that must be the source of the problem (indeed, I was surprised that it would yield very large negative values given the fairly tight priors). It seems like `exp(x)` falls below the smallest representable positive Float64 once x drops below roughly -745.

Good to know that truncating is an option, though I'm worried about it then creating issues, as other unconstrained coefficients (the effects of some other variables) could still push the truncated parameter past its truncation point. For example (to illustrate):

```julia
σ ~ truncated(Normal(0, 3), -100, Inf)
σ_slope ~ Normal(0, 50)
y ~ Normal(mu, exp(σ + σ_slope))
```
Interesting, I didn't think of that. Though I'm not sure how to make it work with coefficients placed on them.

This would be the reference model:

```julia
@model function model(y, x)
    μ_intercept ~ Normal(0, 3)
    μ_slope ~ Normal(0, 3)
    σ_intercept ~ Normal(0, 3)
    σ_slope ~ Normal(0, 3)
    for i in eachindex(y)
        μ = μ_intercept + μ_slope * x[i]
        σ = σ_intercept + σ_slope * x[i]
        y[i] ~ Normal(μ, exp(σ))
    end
end
```

Reparametrized model for better handling of small values:

```julia
@model function model(y, x)
    μ_intercept ~ Normal(0, 3)
    μ_slope ~ Normal(0, 3)
    σ_intercept ~ LogNormal(0, 3)
    σ_slope ~ Normal(0, 3)
    for i in eachindex(y)
        μ = μ_intercept + μ_slope * x[i]
        σ = σ_intercept + exp(σ_slope * x[i])
        y[i] ~ Normal(μ, σ)
    end
end
```

The interpretation of the parameters gets a bit tricky though, especially since I would expect the sigma formula to change to be multiplicative (`σ_intercept * exp(σ_slope * x[i])`).

Here's a toy dataset:

```julia
y = rand(Normal(0, 2), 1000)
x = -y
y = y .+ [rand(Normal(0, abs(minimum(x)) + i), 1)[1] for i in x]

sample(model(y, x), NUTS(), 1000)
```

I guess in general my question is: how to best parametrize linear effects on (0, +∞) parameters. Although I don't think it's a Turing issue anymore (I'll close the issue), any tips are appreciated :)
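One general option, as a sketch (my addition, not something proposed in the thread): swap `exp` for a softplus link via `LogExpFunctions.log1pexp`. It also maps ℝ to (0, +∞), but grows only linearly for large inputs, so a wide prior on the slope is less explosive; for very negative inputs it still underflows like `exp`, hence the `eps()` guard below.

```julia
using Turing
using LogExpFunctions: log1pexp  # softplus: log(1 + exp(x))

@model function model_softplus(y, x)
    μ_intercept ~ Normal(0, 3)
    μ_slope ~ Normal(0, 3)
    σ_intercept ~ Normal(0, 3)
    σ_slope ~ Normal(0, 3)
    for i in eachindex(y)
        μ = μ_intercept + μ_slope * x[i]
        # softplus keeps σ positive without exponential blow-up for large
        # positive linear predictors; eps() guards the underflow end
        σ = log1pexp(σ_intercept + σ_slope * x[i]) + eps()
        y[i] ~ Normal(μ, σ)
    end
end
```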
The model with a multiplicative σ would be:

```julia
@model function model(y, x)
    μ_intercept ~ Normal(0, 3)
    μ_slope ~ Normal(0, 3)
    σ_intercept ~ LogNormal(0, 3)
    σ_slope ~ Normal(0, 3)
    for i in eachindex(y)
        μ = μ_intercept + μ_slope * x[i]
        σ = σ_intercept * exp(σ_slope * x[i])
        y[i] ~ Normal(μ, σ)
    end
end
```
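On the interpretation worry above: in this multiplicative form each unit increase in x scales σ by the constant factor `exp(σ_slope)`, for example (my addition):

```julia
# σ(x) = σ_intercept * exp(σ_slope * x) implies
# σ(x + 1) / σ(x) = exp(σ_slope), independent of x.
σ_intercept, σ_slope = 2.0, 0.1
σ(x) = σ_intercept * exp(σ_slope * x)
σ(1.0) / σ(0.0)  # ≈ 1.105, i.e. exp(0.1)
```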
I thought so too, but for some reason this model errors:

```julia
using Distributions
using Turing
using Random

@model function model(y, x)
    μ_intercept ~ Normal(0, 3)
    μ_slope ~ Normal(0, 3)
    σ_intercept ~ LogNormal(0, 3)
    σ_slope ~ Normal(0, 3)
    for i in eachindex(y)
        μ = μ_intercept + μ_slope * x[i]
        σ = σ_intercept * exp(σ_slope * x[i])
        y[i] ~ Normal(μ, σ)
    end
end

Random.seed!(123)
y = rand(Normal(0, 2), 1000)
x = -y
y = y .+ [rand(Normal(0, abs(minimum(x)) + i), 1)[1] for i in x]

sample(model(y, x), NUTS(), 1000)
```
(But with these combinations of parameters etc., personally I'd probably stick with […])
If you didn't already have NaN-safe mode enabled, the script will still use the NaN-unsafe mode: changing preferences after loading a package doesn't affect the current session.
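For reference, a sketch of how the preference is set (assuming the `nansafe_mode` key from ForwardDiff's documentation); it is written to LocalPreferences.toml and only takes effect in a fresh Julia session:

```julia
using Preferences, ForwardDiff

# Persist the preference; the currently loaded ForwardDiff is unaffected,
# so restart Julia afterwards for NaN-safe mode to actually kick in.
set_preferences!(ForwardDiff, "nansafe_mode" => true)
```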
(edit: forgot to refresh the session) Here's the error now:

```
ERROR: DomainError with ForwardDiff.Dual{ForwardDiff.Tag{DynamicPPL.DynamicPPLTag, Float64}, Float64, 4}(NaN,
Stacktrace:
[1] #371
@ C:\Users\domma\.julia\packages\Distributions\ji8PW\src\univariate\continuous\normal.jl:37 [inlined]
[2] check_args
@ C:\Users\domma\.julia\packages\Distributions\ji8PW\src\utils.jl:89 [inlined]
[3] #Normal#370
@ C:\Users\domma\.julia\packages\Distributions\ji8PW\src\univariate\continuous\normal.jl:37 [inlined]
[4] Normal
@ C:\Users\domma\.julia\packages\Distributions\ji8PW\src\univariate\continuous\normal.jl:36 [inlined]
[5] macro expansion
@ c:\Users\domma\Dropbox\RECHERCHE\Studies\DoggoNogo\study1\analysis\activate.jl:24 [inlined]
[6] macro expansion
@ C:\Users\domma\.julia\packages\DynamicPPL\rXg4T\src\compiler.jl:555 [inlined]
[7] model(__model__::DynamicPPL.Model{…}, __varinfo__::DynamicPPL.ThreadSafeVarInfo{…}, __context__::DynamicPPL.SamplingContext{…}, y::Vector{…}, x::Vector{…})
@ Main c:\Users\domma\Dropbox\RECHERCHE\Studies\DoggoNogo\study1\analysis\activate.jl:21
[8] _evaluate!!
@ C:\Users\domma\.julia\packages\DynamicPPL\rXg4T\src\model.jl:963 [inlined]
[9] evaluate_threadsafe!!
@ C:\Users\domma\.julia\packages\DynamicPPL\rXg4T\src\model.jl:952 [inlined]
[10] evaluate!!(model::DynamicPPL.Model{…}, varinfo::DynamicPPL.TypedVarInfo{…}, context::DynamicPPL.SamplingContext{…})
@ DynamicPPL C:\Users\domma\.julia\packages\DynamicPPL\rXg4T\src\model.jl:887
[11] logdensity
@ C:\Users\domma\.julia\packages\DynamicPPL\rXg4T\src\logdensityfunction.jl:94 [inlined]
[12] Fix1
@ .\operators.jl:1118 [inlined]
[13] vector_mode_dual_eval!
@ C:\Users\domma\.julia\packages\ForwardDiff\PcZ48\src\apiutils.jl:24 [inlined]
[14] vector_mode_gradient!(result::DiffResults.MutableDiffResult{…}, f::Base.Fix1{…}, x::Vector{…}, cfg::ForwardDiff.GradientConfig{…})
@ ForwardDiff C:\Users\domma\.julia\packages\ForwardDiff\PcZ48\src\gradient.jl:96
[15] gradient!
@ C:\Users\domma\.julia\packages\ForwardDiff\PcZ48\src\gradient.jl:37 [inlined]
[16] gradient!
@ C:\Users\domma\.julia\packages\ForwardDiff\PcZ48\src\gradient.jl:35 [inlined]
[17] logdensity_and_gradient
@ C:\Users\domma\.julia\packages\LogDensityProblemsAD\rBlLq\ext\LogDensityProblemsADForwardDiffExt.jl:118 [inlined]
[18] ∂logπ∂θ
@ C:\Users\domma\.julia\packages\Turing\hbbFY\src\mcmc\hmc.jl:159 [inlined]
[19] ∂H∂θ
@ C:\Users\domma\.julia\packages\AdvancedHMC\AlvV4\src\hamiltonian.jl:38 [inlined]
[20] step(lf::AdvancedHMC.Leapfrog{…}, h::AdvancedHMC.Hamiltonian{…}, z::AdvancedHMC.PhasePoint{…}, n_steps::Int64; fwd::Bool, full_trajectory::Val{…})
@ AdvancedHMC C:\Users\domma\.julia\packages\AdvancedHMC\AlvV4\src\integrator.jl:229
[21] step (repeats 2 times)
@ C:\Users\domma\.julia\packages\AdvancedHMC\AlvV4\src\integrator.jl:199 [inlined]
[22] A
@ C:\Users\domma\.julia\packages\AdvancedHMC\AlvV4\src\trajectory.jl:759 [inlined]
[23] find_good_stepsize(rng::TaskLocalRNG, h::AdvancedHMC.Hamiltonian{…}, θ::Vector{…}; max_n_iters::Int64)
@ AdvancedHMC C:\Users\domma\.julia\packages\AdvancedHMC\AlvV4\src\trajectory.jl:781
[24] find_good_stepsize(rng::TaskLocalRNG, h::AdvancedHMC.Hamiltonian{…}, θ::Vector{…})
@ AdvancedHMC C:\Users\domma\.julia\packages\AdvancedHMC\AlvV4\src\trajectory.jl:765
[25] initialstep(rng::TaskLocalRNG, model::DynamicPPL.Model{…}, spl::DynamicPPL.Sampler{…}, vi_original::DynamicPPL.TypedVarInfo{…}; initial_params::Nothing, nadapts::Int64, kwargs::@Kwargs{})
@ Turing.Inference C:\Users\domma\.julia\packages\Turing\hbbFY\src\mcmc\hmc.jl:190
[26] step(rng::TaskLocalRNG, model::DynamicPPL.Model{…}, spl::DynamicPPL.Sampler{…}; initial_params::Nothing, kwargs::@Kwargs{…})
@ DynamicPPL C:\Users\domma\.julia\packages\DynamicPPL\rXg4T\src\sampler.jl:116
[27] step
@ C:\Users\domma\.julia\packages\DynamicPPL\rXg4T\src\sampler.jl:99 [inlined]
[28] macro expansion
@ C:\Users\domma\.julia\packages\AbstractMCMC\YrmkI\src\sample.jl:130 [inlined]
[29] macro expansion
@ C:\Users\domma\.julia\packages\ProgressLogging\6KXlp\src\ProgressLogging.jl:328 [inlined]
[30] macro expansion
@ C:\Users\domma\.julia\packages\AbstractMCMC\YrmkI\src\logging.jl:9 [inlined]
[31] mcmcsample(rng::TaskLocalRNG, model::DynamicPPL.Model{…}, sampler::DynamicPPL.Sampler{…}, N::Int64; progress::Bool, progressname::String, callback::Nothing, discard_initial::Int64, thinning::Int64, chain_type::Type, initial_state::Nothing, kwargs::@Kwargs{…})
@ AbstractMCMC C:\Users\domma\.julia\packages\AbstractMCMC\YrmkI\src\sample.jl:120
[32] sample(rng::TaskLocalRNG, model::DynamicPPL.Model{…}, sampler::DynamicPPL.Sampler{…}, N::Int64; chain_type::Type, resume_from::Nothing, initial_state::Nothing, progress::Bool, nadapts::Int64, discard_adapt::Bool, discard_initial::Int64, kwargs::@Kwargs{})
@ Turing.Inference C:\Users\domma\.julia\packages\Turing\hbbFY\src\mcmc\hmc.jl:117
[33] sample
@ C:\Users\domma\.julia\packages\Turing\hbbFY\src\mcmc\hmc.jl:86 [inlined]
[34] #sample#3
@ C:\Users\domma\.julia\packages\Turing\hbbFY\src\mcmc\Inference.jl:219 [inlined]
[35] sample
@ C:\Users\domma\.julia\packages\Turing\hbbFY\src\mcmc\Inference.jl:212 [inlined]
[36] #sample#2
@ C:\Users\domma\.julia\packages\Turing\hbbFY\src\mcmc\Inference.jl:209 [inlined]
[37] sample(model::DynamicPPL.Model{…}, alg::NUTS{…}, N::Int64)
@ Turing.Inference C:\Users\domma\.julia\packages\Turing\hbbFY\src\mcmc\Inference.jl:203
[38] top-level scope
@ c:\Users\domma\Dropbox\RECHERCHE\Studies\DoggoNogo\study1\analysis\activate.jl:33

ERROR: MethodError: no method matching size(::ForwardDiff.Partials{4, Float64})
Closest candidates are:
size(::ForwardDiff.Partials{N}) where N (method too new to be called from this world context.)
@ ForwardDiff C:\Users\domma\.julia\packages\ForwardDiff\PcZ48\src\partials.jl:21
size(::AbstractArray{T, N}, ::Any) where {T, N}
@ Base abstractarray.jl:42
size(::DataStructures.DiBitVector) (method too new to be called from this world context.)
@ DataStructures C:\Users\domma\.julia\packages\DataStructures\95DJa\src\dibit_vector.jl:41
...
```
EDIT 2: new error when I re-run:

```
julia> sample(model(y, x), NUTS(), 1000)
ERROR: DomainError with Dual{ForwardDiff.Tag{DynamicPPL.DynamicPPLTag, Float64}}(NaN,NaN,NaN,NaN,NaN):
Normal: the condition σ >= zero(σ) is not satisfied.
Stacktrace:
[1] #371
@ C:\Users\domma\.julia\packages\Distributions\ji8PW\src\univariate\continuous\normal.jl:37 [inlined]
[2] check_args
@ C:\Users\domma\.julia\packages\Distributions\ji8PW\src\utils.jl:89 [inlined]
[3] #Normal#370
@ C:\Users\domma\.julia\packages\Distributions\ji8PW\src\univariate\continuous\normal.jl:37 [inlined]
[4] Normal
@ C:\Users\domma\.julia\packages\Distributions\ji8PW\src\univariate\continuous\normal.jl:36 [inlined]
[5] macro expansion
@ c:\Users\domma\Dropbox\RECHERCHE\Studies\DoggoNogo\study1\analysis\activate.jl:24 [inlined]
[6] macro expansion
@ C:\Users\domma\.julia\packages\DynamicPPL\rXg4T\src\compiler.jl:555 [inlined]
[7] model(__model__::DynamicPPL.Model{…}, __varinfo__::DynamicPPL.ThreadSafeVarInfo{…}, __context__::DynamicPPL.SamplingContext{…}, y::Vector{…}, x::Vector{…})
@ Main c:\Users\domma\Dropbox\RECHERCHE\Studies\DoggoNogo\study1\analysis\activate.jl:21
[8] _evaluate!!
@ C:\Users\domma\.julia\packages\DynamicPPL\rXg4T\src\model.jl:963 [inlined]
[9] evaluate_threadsafe!!
@ C:\Users\domma\.julia\packages\DynamicPPL\rXg4T\src\model.jl:952 [inlined]
[10] evaluate!!(model::DynamicPPL.Model{…}, varinfo::DynamicPPL.TypedVarInfo{…}, context::DynamicPPL.SamplingContext{…})
@ DynamicPPL C:\Users\domma\.julia\packages\DynamicPPL\rXg4T\src\model.jl:887
[11] logdensity
@ C:\Users\domma\.julia\packages\DynamicPPL\rXg4T\src\logdensityfunction.jl:94 [inlined]
[12] Fix1
@ .\operators.jl:1118 [inlined]
[13] vector_mode_dual_eval!
@ C:\Users\domma\.julia\packages\ForwardDiff\PcZ48\src\apiutils.jl:24 [inlined]
[14] vector_mode_gradient!(result::DiffResults.MutableDiffResult{…}, f::Base.Fix1{…}, x::Vector{…}, cfg::ForwardDiff.GradientConfig{…})
@ ForwardDiff C:\Users\domma\.julia\packages\ForwardDiff\PcZ48\src\gradient.jl:96
[15] gradient!
@ C:\Users\domma\.julia\packages\ForwardDiff\PcZ48\src\gradient.jl:37 [inlined]
[16] gradient!
@ C:\Users\domma\.julia\packages\ForwardDiff\PcZ48\src\gradient.jl:35 [inlined]
[17] logdensity_and_gradient
@ C:\Users\domma\.julia\packages\LogDensityProblemsAD\rBlLq\ext\LogDensityProblemsADForwardDiffExt.jl:118 [inlined]
[18] ∂logπ∂θ
@ C:\Users\domma\.julia\packages\Turing\hbbFY\src\mcmc\hmc.jl:159 [inlined]
[19] ∂H∂θ
@ C:\Users\domma\.julia\packages\AdvancedHMC\AlvV4\src\hamiltonian.jl:38 [inlined]
[20] step(lf::AdvancedHMC.Leapfrog{…}, h::AdvancedHMC.Hamiltonian{…}, z::AdvancedHMC.PhasePoint{…}, n_steps::Int64; fwd::Bool, full_trajectory::Val{…})
@ AdvancedHMC C:\Users\domma\.julia\packages\AdvancedHMC\AlvV4\src\integrator.jl:229
[21] step (repeats 2 times)
@ C:\Users\domma\.julia\packages\AdvancedHMC\AlvV4\src\integrator.jl:199 [inlined]
[22] A
@ C:\Users\domma\.julia\packages\AdvancedHMC\AlvV4\src\trajectory.jl:759 [inlined]
[23] find_good_stepsize(rng::TaskLocalRNG, h::AdvancedHMC.Hamiltonian{…}, θ::Vector{…}; max_n_iters::Int64)
@ AdvancedHMC C:\Users\domma\.julia\packages\AdvancedHMC\AlvV4\src\trajectory.jl:781
[24] find_good_stepsize(rng::TaskLocalRNG, h::AdvancedHMC.Hamiltonian{…}, θ::Vector{…})
@ AdvancedHMC C:\Users\domma\.julia\packages\AdvancedHMC\AlvV4\src\trajectory.jl:765
[25] initialstep(rng::TaskLocalRNG, model::DynamicPPL.Model{…}, spl::DynamicPPL.Sampler{…}, vi_original::DynamicPPL.TypedVarInfo{…}; initial_params::Nothing, nadapts::Int64, kwargs::@Kwargs{})
@ Turing.Inference C:\Users\domma\.julia\packages\Turing\hbbFY\src\mcmc\hmc.jl:190
[26] step(rng::TaskLocalRNG, model::DynamicPPL.Model{…}, spl::DynamicPPL.Sampler{…}; initial_params::Nothing, kwargs::@Kwargs{…})
@ DynamicPPL C:\Users\domma\.julia\packages\DynamicPPL\rXg4T\src\sampler.jl:116
[27] step
@ C:\Users\domma\.julia\packages\DynamicPPL\rXg4T\src\sampler.jl:99 [inlined]
[28] macro expansion
@ C:\Users\domma\.julia\packages\AbstractMCMC\YrmkI\src\sample.jl:130 [inlined]
[29] macro expansion
@ C:\Users\domma\.julia\packages\ProgressLogging\6KXlp\src\ProgressLogging.jl:328 [inlined]
[30] macro expansion
@ C:\Users\domma\.julia\packages\AbstractMCMC\YrmkI\src\logging.jl:9 [inlined]
[31] mcmcsample(rng::TaskLocalRNG, model::DynamicPPL.Model{…}, sampler::DynamicPPL.Sampler{…}, N::Int64; progress::Bool, progressname::String, callback::Nothing, discard_initial::Int64, thinning::Int64, chain_type::Type, initial_state::Nothing, kwargs::@Kwargs{…})
@ AbstractMCMC C:\Users\domma\.julia\packages\AbstractMCMC\YrmkI\src\sample.jl:120
[32] sample(rng::TaskLocalRNG, model::DynamicPPL.Model{…}, sampler::DynamicPPL.Sampler{…}, N::Int64; chain_type::Type, resume_from::Nothing, initial_state::Nothing, progress::Bool, nadapts::Int64, discard_adapt::Bool, discard_initial::Int64, kwargs::@Kwargs{})
@ Turing.Inference C:\Users\domma\.julia\packages\Turing\hbbFY\src\mcmc\hmc.jl:117
[33] sample
@ C:\Users\domma\.julia\packages\Turing\hbbFY\src\mcmc\hmc.jl:86 [inlined]
[34] #sample#3
@ C:\Users\domma\.julia\packages\Turing\hbbFY\src\mcmc\Inference.jl:219 [inlined]
[35] sample
@ C:\Users\domma\.julia\packages\Turing\hbbFY\src\mcmc\Inference.jl:212 [inlined]
[36] #sample#2
@ C:\Users\domma\.julia\packages\Turing\hbbFY\src\mcmc\Inference.jl:209 [inlined]
[37] sample(model::DynamicPPL.Model{…}, alg::NUTS{…}, N::Int64)
@ Turing.Inference C:\Users\domma\.julia\packages\Turing\hbbFY\src\mcmc\Inference.jl:203
[38] top-level scope
@ c:\Users\domma\Dropbox\RECHERCHE\Studies\DoggoNogo\study1\analysis\activate.jl:33
Some type information was truncated. Use `show(err)` to see complete types.
```
This does seem like something that should be solved by prior specifications. Reducing the prior scale of `σ_slope` works well for me:

```julia
@model function model(y, x)
    μ_intercept ~ Normal(0, 3)
    μ_slope ~ Normal(0, 3)
    σ_intercept ~ LogNormal(0, 3)
    σ_slope ~ Normal(0, 0.3)
    for i in eachindex(y)
        μ = μ_intercept + μ_slope * x[i]
        σ = σ_intercept * exp(σ_slope * x[i])
        y[i] ~ Normal(μ, σ)
    end
end
```

You should always be very careful about the scale of priors on parameters that pass through `exp()`.
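A quick prior-predictive sanity check makes the point (my addition; the numbers in the comments are approximate): under `Normal(0, 3)` the per-unit multiplier `exp(σ_slope)` already spans several orders of magnitude, while `Normal(0, 0.3)` keeps it tame.

```julia
using Distributions

# 99% prior interval of the per-unit-of-x multiplier exp(σ_slope):
exp.(quantile.(Normal(0, 3), [0.005, 0.995]))    # ≈ [0.00044, 2270.0]
exp.(quantile.(Normal(0, 0.3), [0.005, 0.995]))  # ≈ [0.46, 2.2]
```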
Setting a conservative initial stepsize and a high target acceptance rate seems to work well:

```julia
@model function model(y, x)
    μ_intercept ~ Normal(0, 3)
    μ_slope ~ Normal(0, 3)
    σ_intercept ~ LogNormal(0, 3)
    σ_slope ~ Normal(0, 3)
    for i in eachindex(y)
        μ = μ_intercept + μ_slope * x[i]
        σ = σ_intercept * exp(σ_slope * x[i])
        y[i] ~ Normal(μ, σ)
    end
end

sample(model(y, x), NUTS(128, 0.9; init_ϵ=0.0001), 1000)
```

It seems like the initial stepsize finding routine is trying out values that are numerically dangerous.
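Relatedly, the traces above show `initial_params::Nothing`, so another knob worth trying (my addition, untested here) is passing explicit starting values to `sample`, keeping the stepsize search away from fragile regions; check your Turing version's docs for whether these are expected on the constrained or unconstrained scale.

```julia
# Hypothetical starting point, in the model's declaration order:
# μ_intercept, μ_slope, σ_intercept, σ_slope.
sample(model(y, x), NUTS(), 1000; initial_params=[0.0, 0.0, 1.0, 0.0])
```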
Sorry to bother again, but I am using quite a lot of models with coefficients placed on bounded parameters, typically scale or shape parameters that must be > 0.

My typical approach is to express the priors on the log scale (e.g., `Normal(0, 3)`) and then inject them into the distribution with `exp(scale_param)`.

However, there have been quite a few occasions (fairly randomly) where MCMC sampling errors with a message like the `σ >= zero(σ) is not satisfied` DomainError shown in the comments above. The only reason I can think of is that for some reason the algorithm explores some very large negative values, which get turned into `-Inf`, resulting in `exp(-Inf) = 0` and erroring.

One of the "fixes" I've been using is to add a small constant, e.g., `Weibull(exp(α) + eps(), exp(θ) + eps())`, but it feels a bit hacky. Is there a better way to deal with that?