From 6ec8fd62f8d64b79bd4c1133f3f15bfb8d09d539 Mon Sep 17 00:00:00 2001 From: "Documenter.jl" Date: Sat, 31 Aug 2024 08:43:50 +0000 Subject: [PATCH] build based on a0f0c2c --- dev/.documenter-siteinfo.json | 2 +- dev/docstrings/index.html | 22 +++++++++++----------- dev/exponential/index.html | 2 +- dev/heat/index.html | 2 +- dev/index.html | 2 +- dev/interface/index.html | 2 +- dev/logistic/index.html | 2 +- dev/lotka/index.html | 2 +- dev/math/index.html | 2 +- dev/regression/index.html | 2 +- dev/stefan/index.html | 2 +- 11 files changed, 21 insertions(+), 21 deletions(-) diff --git a/dev/.documenter-siteinfo.json b/dev/.documenter-siteinfo.json index ddd98fb..b53905c 100644 --- a/dev/.documenter-siteinfo.json +++ b/dev/.documenter-siteinfo.json @@ -1 +1 @@ -{"documenter":{"julia_version":"1.10.5","generation_timestamp":"2024-08-31T07:32:40","documenter_version":"1.6.0"}} \ No newline at end of file +{"documenter":{"julia_version":"1.10.5","generation_timestamp":"2024-08-31T08:43:44","documenter_version":"1.6.0"}} \ No newline at end of file diff --git a/dev/docstrings/index.html b/dev/docstrings/index.html index 020e3d9..4a55dc9 100644 --- a/dev/docstrings/index.html +++ b/dev/docstrings/index.html @@ -1,5 +1,5 @@ -Docstrings · ProfileLikelihood.jl

Docstrings

Here we give some of the main docstrings.

LikelihoodProblem

ProfileLikelihood.LikelihoodProblemType
LikelihoodProblem{N,P,D,L,Θ,S} <: AbstractLikelihoodProblem

Struct representing a likelihood problem.

Fields

  • problem::P

The associated OptimizationProblem.

  • data::D

The argument p used in the log-likelihood function.

  • log_likelihood_function::L

The log-likelihood function, taking the form ℓ(θ, p).

  • θ₀::Θ

Initial estimates for the MLE θ.

  • syms::S

Variable names for the parameters. This is wrapped into a SymbolCache from SymbolicIndexingInterface.jl.

The extra parameter N is the number of parameters.

Constructors

Standard

LikelihoodProblem(loglik::Function, θ₀;
+Docstrings · ProfileLikelihood.jl

Docstrings

Here we give some of the main docstrings.

LikelihoodProblem

ProfileLikelihood.LikelihoodProblemType
LikelihoodProblem{N,P,D,L,Θ,S} <: AbstractLikelihoodProblem

Struct representing a likelihood problem.

Fields

  • problem::P

The associated OptimizationProblem.

  • data::D

The argument p used in the log-likelihood function.

  • log_likelihood_function::L

The log-likelihood function, taking the form ℓ(θ, p).

  • θ₀::Θ

Initial estimates for the MLE θ.

  • syms::S

Variable names for the parameters. This is wrapped into a SymbolCache from SymbolicIndexingInterface.jl.

The extra parameter N is the number of parameters.

Constructors

Standard

LikelihoodProblem(loglik::Function, θ₀;
     syms=eachindex(θ₀), data=SciMLBase.NullParameters(),
     f_kwargs=nothing, prob_kwargs=nothing)

Constructor for the LikelihoodProblem.

Arguments

  • loglik::Function: The log-likelihood function, taking the form ℓ(θ, p).
  • θ₀: The estimates estimates for the MLEs.

Keyword Arguments

  • syms=eachindex(θ₀): Names for each parameter.
  • data=SciMLBase.NullParameters(): The parameter p in the log-likelihood function.
  • f_kwargs=nothing: Keyword arguments, passed as a NamedTuple, for the OptimizationFunction.
  • prob_kwargs=nothing: Keyword arguments, passed as a NamedTuple, for the OptimizationProblem.

Outputs

Returns the LikelihoodProblem problem object.

With arguments for a differential equation problem

LikelihoodProblem(loglik::Function, θ₀,
     ode_function, u₀, tspan;
@@ -7,8 +7,8 @@
     ode_parameters=SciMLBase.NullParameters(), ode_alg,
     ode_kwargs=nothing, f_kwargs=nothing, prob_kwargs=nothing)

Constructor for the LikelihoodProblem for a differential equation problem.

Arguments

  • loglik::Function: The log-likelihood function, taking the form ℓ(θ, p, integrator).
  • θ₀: The estimates estimates for the MLEs.
  • ode_function: The function f(du, u, p, t) or f(u, p, t) for the differential equation.
  • u₀: The initial condition for the differential equation.
  • tspan: The time-span to solve the differential equation over.

Keyword Arguments

  • syms=eachindex(θ₀): Names for each parameter.
  • data=SciMLBase.NullParameters(): The parameter p in the log-likelihood function.
  • ode_parameters=SciMLBase.NullParameters(): The parameter p in ode_function.
  • ode_alg: The algorithm used for solving the differential equatios.
  • ode_kwargs=nothing: Extra keyword arguments, passed as a NamedTuple, to pass into the integrator; see construct_integrator.
  • f_kwargs=nothing: Keyword arguments, passed as a NamedTuple, for the OptimizationFunction.
  • prob_kwargs=nothing: Keyword arguments, passed as a NamedTuple, for the OptimizationProblem.

Outputs

Returns the LikelihoodProblem problem object.

With an integrator

LikelihoodProblem(loglik::Function, θ₀, integrator;
     syms=eachindex(θ₀), data=SciMLBase.NullParameters(),
-    f_kwargs=nothing, prob_kwargs=nothing)

Constructor for the LikelihoodProblem for a differential equation problem with associated integrator.

Arguments

  • loglik::Function: The log-likelihood function, taking the form ℓ(θ, p, integrator).
  • θ₀: The estimates estimates for the MLEs.
  • integrator: The integrator for the differential equation problem. See also construct_integrator.

Keyword Arguments

  • syms=eachindex(θ₀): Names for each parameter.
  • data=SciMLBase.NullParameters(): The parameter p in the log-likelihood function.
  • f_kwargs=nothing: Keyword arguments, passed as a NamedTuple, for the OptimizationFunction.
  • prob_kwargs=nothing: Keyword arguments, passed as a NamedTuple, for the OptimizationProblem.

Outputs

Returns the LikelihoodProblem problem object.

source

LikelihoodSolution

ProfileLikelihood.LikelihoodSolutionType
struct LikelihoodSolution{Θ,P,M,R,A} <: AbstractLikelihoodSolution

Struct for a solution to a LikelihoodProblem.

Fields

  • mle::Θ: The MLEs.
  • problem::P: The LikelihoodProblem.
  • optimiser::A: The algorithm used for solving the optimisation problem.
  • maximum::M: The maximum likelihood.
  • retcode::R: The SciMLBase.ReturnCode.

Constructors

LikelihoodSolution(sol::SciMLBase.OptimizationSolution, prob::AbstractLikelihoodProblem; alg=sol.alg)

Constructs the likelihood solution from a solution to an OptimizationProblem with a given LikelihoodProblem.

source
ProfileLikelihood.mleFunction
mle(prob::LikelihoodProblem, alg, args...; kwargs...)
-mle(prob::LikelihoodProblem, alg::Tuple, args...; kwargs...)

Given the likelihood problem prob and an optimiser alg, finds the MLEs and returns a LikelihoodSolution object. Extra arguments and keyword arguments for solve can be passed through args... and kwargs....

If alg is a Tuple, then the problem is re-optimised after each algorithm with the next element in alg, starting from alg[1], with initial estimate coming from the solution with the previous algorithm (starting with get_initial_estimate(prob)).

source

ProfileLikelihoodSolution

ProfileLikelihood.ProfileLikelihoodSolutionType
ProfileLikelihoodSolution{I,V,LP,LS,Spl,CT,CF,OM}

Struct for the normalised profile log-likelihood. See profile for a constructor.

Fields

  • parameter_values::Dict{I, V}

This is a dictionary such that parameter_values[i] gives the parameter values used for the normalised profile log-likelihood of the ith variable.

  • profile_values::Dict{I, V}

This is a dictionary such that profile_values[i] gives the values of the normalised profile log-likelihood function at the corresponding values in θ[i].

  • likelihood_problem::LP

The original LikelihoodProblem.

  • likelihood_solution::LS

The solution to the full problem.

  • splines::Dict{I, Spl}

This is a dictionary such that splines[i] is a spline through the data (parameter_values[i], profile_values[i]). This spline can be evaluated at a point ψ for the ith variable by calling an instance of the struct with arguments (ψ, i). See also spline_profile.

  • confidence_intervals::Dict{I,ConfidenceInterval{CT,CF}}

This is a dictonary such that confidence_intervals[i] is a confidence interval for the ith parameter.

  • other_mles::OM

This is a dictionary such that other_mles[i] gives the vector for the MLEs of the other parameters not being profiled, for each datum.

Spline evaluation

This struct is callable. We define the method

(prof::ProfileLikelihoodSolution)(θ, i)

that evaluates the spline through the ith profile at the point θ.

source
ProfileLikelihood.ConfidenceIntervalType
struct ConfidenceInterval{T, F}

Struct representing a confidence interval.

Fields

  • lower::T

The lower bound of the confidence interval.

  • upper::T

The upper bound of the confidence interval.

  • level::F

The level of the confidence interval.

source
ProfileLikelihood.profileFunction
profile(prob::LikelihoodProblem, sol::LikelihoodSolution, n=1:number_of_parameters(prob);
+    f_kwargs=nothing, prob_kwargs=nothing)

Constructor for the LikelihoodProblem for a differential equation problem with associated integrator.

Arguments

  • loglik::Function: The log-likelihood function, taking the form ℓ(θ, p, integrator).
  • θ₀: The estimates estimates for the MLEs.
  • integrator: The integrator for the differential equation problem. See also construct_integrator.

Keyword Arguments

  • syms=eachindex(θ₀): Names for each parameter.
  • data=SciMLBase.NullParameters(): The parameter p in the log-likelihood function.
  • f_kwargs=nothing: Keyword arguments, passed as a NamedTuple, for the OptimizationFunction.
  • prob_kwargs=nothing: Keyword arguments, passed as a NamedTuple, for the OptimizationProblem.

Outputs

Returns the LikelihoodProblem problem object.

source

LikelihoodSolution

ProfileLikelihood.LikelihoodSolutionType
struct LikelihoodSolution{Θ,P,M,R,A} <: AbstractLikelihoodSolution

Struct for a solution to a LikelihoodProblem.

Fields

  • mle::Θ: The MLEs.
  • problem::P: The LikelihoodProblem.
  • optimiser::A: The algorithm used for solving the optimisation problem.
  • maximum::M: The maximum likelihood.
  • retcode::R: The SciMLBase.ReturnCode.

Constructors

LikelihoodSolution(sol::SciMLBase.OptimizationSolution, prob::AbstractLikelihoodProblem; alg=sol.alg)

Constructs the likelihood solution from a solution to an OptimizationProblem with a given LikelihoodProblem.

source
ProfileLikelihood.mleFunction
mle(prob::LikelihoodProblem, alg, args...; kwargs...)
+mle(prob::LikelihoodProblem, alg::Tuple, args...; kwargs...)

Given the likelihood problem prob and an optimiser alg, finds the MLEs and returns a LikelihoodSolution object. Extra arguments and keyword arguments for solve can be passed through args... and kwargs....

If alg is a Tuple, then the problem is re-optimised after each algorithm with the next element in alg, starting from alg[1], with initial estimate coming from the solution with the previous algorithm (starting with get_initial_estimate(prob)).

source

ProfileLikelihoodSolution

ProfileLikelihood.ProfileLikelihoodSolutionType
ProfileLikelihoodSolution{I,V,LP,LS,Spl,CT,CF,OM}

Struct for the normalised profile log-likelihood. See profile for a constructor.

Fields

  • parameter_values::Dict{I, V}

This is a dictionary such that parameter_values[i] gives the parameter values used for the normalised profile log-likelihood of the ith variable.

  • profile_values::Dict{I, V}

This is a dictionary such that profile_values[i] gives the values of the normalised profile log-likelihood function at the corresponding values in θ[i].

  • likelihood_problem::LP

The original LikelihoodProblem.

  • likelihood_solution::LS

The solution to the full problem.

  • splines::Dict{I, Spl}

This is a dictionary such that splines[i] is a spline through the data (parameter_values[i], profile_values[i]). This spline can be evaluated at a point ψ for the ith variable by calling an instance of the struct with arguments (ψ, i). See also spline_profile.

  • confidence_intervals::Dict{I,ConfidenceInterval{CT,CF}}

This is a dictonary such that confidence_intervals[i] is a confidence interval for the ith parameter.

  • other_mles::OM

This is a dictionary such that other_mles[i] gives the vector for the MLEs of the other parameters not being profiled, for each datum.

Spline evaluation

This struct is callable. We define the method

(prof::ProfileLikelihoodSolution)(θ, i)

that evaluates the spline through the ith profile at the point θ.

source
ProfileLikelihood.ConfidenceIntervalType
struct ConfidenceInterval{T, F}

Struct representing a confidence interval.

Fields

  • lower::T

The lower bound of the confidence interval.

  • upper::T

The upper bound of the confidence interval.

  • level::F

The level of the confidence interval.

source
ProfileLikelihood.profileFunction
profile(prob::LikelihoodProblem, sol::LikelihoodSolution, n=1:number_of_parameters(prob);
     alg=get_optimiser(sol),
     conf_level::F=0.95,
     confidence_interval_method=:spline,
@@ -21,7 +21,7 @@
     extrap=Line,
     parallel=false,
     next_initial_estimate_method = :prev,
-    kwargs...)

Computes profile likelihoods for the parameters from a likelihood problem prob with MLEs sol.

See also replace_profile! which allows you to re-profile a parameter in case you are not satisfied with the results. For plotting, see the plot_profiles function (requires that you have loaded a backend of Makie.jl).

Arguments

  • prob::LikelihoodProblem: The LikelihoodProblem.
  • sol::LikelihoodSolution: The LikelihoodSolution. See also mle.
  • n=1:number_of_parameters(prob): The parameter indices to compute the profile likelihoods for.

Keyword Arguments

  • alg=get_optimiser(sol): The optimiser to use for solving each optimisation problem.
  • conf_level::F=0.95: The level to use for the ConfidenceIntervals.
  • confidence_interval_method=:spline: The method to use for computing the confidence intervals. See also get_confidence_intervals!. The default :spline uses rootfinding on the spline through the data, defining a continuous function, while the alternative :extrema simply takes the extrema of the values that exceed the threshold.
  • threshold=get_chisq_threshold(conf_level): The threshold to use for defining the confidence intervals.
  • resolution=200: The number of points to use for evaluating the profile likelihood in each direction starting from the MLE (giving a total of 2resolution points). - resolution=200: The number of points to use for defining grids below, giving the number of points to the left and right of each interest parameter. This can also be a vector, e.g. resolution = [20, 50, 60] will use 20 points for the first parameter, 50 for the second, and 60 for the third.
  • param_ranges=construct_profile_ranges(sol, get_lower_bounds(prob), get_upper_bounds(prob), resolution): The ranges to use for each parameter.
  • min_steps=10: The minimum number of steps to allow for the profile in each direction. If fewer than this number of steps are used before reaching the threshold, then the algorithm restarts and computes the profile likelihood a number min_steps of points in that direction. See also min_steps_fallback.
  • min_steps_fallback=:replace: Method to use for updating the profile when it does not reach the minimum number of steps, min_steps. See also reach_min_steps!. If :replace, then the profile is completely replaced and we use min_steps equally spaced points to replace it. If :refine, we just fill in some of the space in the grid so that a min_steps number of points are reached. Note that this latter option will mean that the spacing is no longer constant between parameter values. You can use :refine_parallel to apply :refine in parallel.
  • normalise::Bool=true: Whether to optimise the normalised profile log-likelihood or not.
  • spline_alg=FritschCarlsonMonotonicInterpolation: The interpolation algorithm to use for computing a spline from the profile data. See Interpolations.jl.
  • extrap=Line: The extrapolation algorithm to use for computing a spline from the profile data. See Interpolations.jl.
  • parallel=false: Whether to use multithreading. If true, will use multithreading so that multiple parameters are profiled at once, and the steps to the left and right are done at the same time.
  • next_initial_estimate_method = :prev: Method for selecting the next initial estimate when stepping forward when profiling. :prev simply uses the previous solution, but you can also use :interp to use linear interpolation. See also set_next_initial_estimate!.
  • kwargs...: Extra keyword arguments to pass into solve for solving the OptimizationProblem. See also the docs from Optimization.jl.

Output

Returns a ProfileLikelihoodSolution.

source
ProfileLikelihood.replace_profile!Function
replace_profile!(prof::ProfileLikelihoodSolution, n);
+    kwargs...)

Computes profile likelihoods for the parameters from a likelihood problem prob with MLEs sol.

See also replace_profile! which allows you to re-profile a parameter in case you are not satisfied with the results. For plotting, see the plot_profiles function (requires that you have loaded a backend of Makie.jl).

Arguments

  • prob::LikelihoodProblem: The LikelihoodProblem.
  • sol::LikelihoodSolution: The LikelihoodSolution. See also mle.
  • n=1:number_of_parameters(prob): The parameter indices to compute the profile likelihoods for.

Keyword Arguments

  • alg=get_optimiser(sol): The optimiser to use for solving each optimisation problem.
  • conf_level::F=0.95: The level to use for the ConfidenceIntervals.
  • confidence_interval_method=:spline: The method to use for computing the confidence intervals. See also get_confidence_intervals!. The default :spline uses rootfinding on the spline through the data, defining a continuous function, while the alternative :extrema simply takes the extrema of the values that exceed the threshold.
  • threshold=get_chisq_threshold(conf_level): The threshold to use for defining the confidence intervals.
  • resolution=200: The number of points to use for evaluating the profile likelihood in each direction starting from the MLE (giving a total of 2resolution points). - resolution=200: The number of points to use for defining grids below, giving the number of points to the left and right of each interest parameter. This can also be a vector, e.g. resolution = [20, 50, 60] will use 20 points for the first parameter, 50 for the second, and 60 for the third.
  • param_ranges=construct_profile_ranges(sol, get_lower_bounds(prob), get_upper_bounds(prob), resolution): The ranges to use for each parameter.
  • min_steps=10: The minimum number of steps to allow for the profile in each direction. If fewer than this number of steps are used before reaching the threshold, then the algorithm restarts and computes the profile likelihood a number min_steps of points in that direction. See also min_steps_fallback.
  • min_steps_fallback=:replace: Method to use for updating the profile when it does not reach the minimum number of steps, min_steps. See also reach_min_steps!. If :replace, then the profile is completely replaced and we use min_steps equally spaced points to replace it. If :refine, we just fill in some of the space in the grid so that a min_steps number of points are reached. Note that this latter option will mean that the spacing is no longer constant between parameter values. You can use :refine_parallel to apply :refine in parallel.
  • normalise::Bool=true: Whether to optimise the normalised profile log-likelihood or not.
  • spline_alg=FritschCarlsonMonotonicInterpolation: The interpolation algorithm to use for computing a spline from the profile data. See Interpolations.jl.
  • extrap=Line: The extrapolation algorithm to use for computing a spline from the profile data. See Interpolations.jl.
  • parallel=false: Whether to use multithreading. If true, will use multithreading so that multiple parameters are profiled at once, and the steps to the left and right are done at the same time.
  • next_initial_estimate_method = :prev: Method for selecting the next initial estimate when stepping forward when profiling. :prev simply uses the previous solution, but you can also use :interp to use linear interpolation. See also set_next_initial_estimate!.
  • kwargs...: Extra keyword arguments to pass into solve for solving the OptimizationProblem. See also the docs from Optimization.jl.

Output

Returns a ProfileLikelihoodSolution.

source
ProfileLikelihood.replace_profile!Function
replace_profile!(prof::ProfileLikelihoodSolution, n);
     alg=get_optimiser(prof.likelihood_solution),
     conf_level::F=0.95,
     confidence_interval_method=:spline,
@@ -35,7 +35,7 @@
     extrap=Line,
     parallel=false,
     next_initial_estimate_method=:prev,
-    kwargs...) where {F}

Given an existing prof::ProfileLikelihoodSolution, replaces the profile results for the parameters in n by re-profiling. The keyword arguments are the same as for profile.

source
ProfileLikelihood.refine_profile!Function
refine_profile!(prof::ProfileLikelihoodSolution, n;
+    kwargs...) where {F}

Given an existing prof::ProfileLikelihoodSolution, replaces the profile results for the parameters in n by re-profiling. The keyword arguments are the same as for profile.

source
ProfileLikelihood.refine_profile!Function
refine_profile!(prof::ProfileLikelihoodSolution, n;
     alg=get_optimiser(prof.likelihood_solution),
     conf_level::F=0.95,
     confidence_interval_method=:spline,
@@ -45,9 +45,9 @@
     spline_alg=FritschCarlsonMonotonicInterpolation,
     extrap=Line,
     parallel=false,
-    kwargs...) where {F}

Given an existing prof::ProfileLikelihoodSolution, refines the profile results for the parameters in n by adding more points. The keyword arguments are the same as for profile. target_number is the total number of points that should be included in the end (not how many more are added).

source
ProfileLikelihood.set_next_initial_estimate!Method
set_next_initial_estimate!(sub_cache, param_vals, other_mles, prob, θₙ; next_initial_estimate_method=Val(:prev))

Method for selecting the next initial estimate for the optimisers. sub_cache is the cache vector for placing the initial estimate into, param_vals is the current list of parameter values for the interest parameter, and other_mles is the corresponding list of previous optimisers. prob is the OptimizationProblem. The value θₙ is the next value of the interest parameter.

The available methods are:

  • next_initial_estimate_method = Val(:prev): If this is selected, simply use other_mles[end], i.e. the previous optimiser.
  • next_initial_estimate_method = Val(:interp): If this is selected, the next optimiser is determined via linear interpolation using the data (param_vals[end-1], other_mles[end-1]), (param_vals[end], other_mles[end]). If the new approximation is outside of the parameter bounds, falls back to next_initial_estimate_method = :prev.
source
ProfileLikelihood.get_confidence_intervals!Function
get_confidence_intervals!(confidence_intervals, method, n, param_vals, profile_vals, threshold, spline_alg, extrap, mles, conf_level)

Method for computing the confidence intervals.

Arguments

  • confidence_intervals: The dictionary storing the confidence intervals.

  • method: The method to use for computing the confidence interval. The available methods are:

    • method = Val(:spline): Fits a spline to (param_vals, profile_vals) and finds where the continuous spline equals threshold.
    • method = Val(:extrema): Takes the first and last values in param_vals whose corresponding value in profile_vals exceeds threshold.
  • n: The parameter being profiled.

  • param_vals: The parameter values.

  • profile_vals: The profile values.

  • threshold: The threshold for the confidence interval.

  • spline_alg: The algorithm to use for fitting a spline.

  • extrap: The extrapolation algorithm used for the spline.

  • mles: The MLEs.

  • conf_level: The confidence level for the confidence interval.

Outputs

There are no outputs - confidence_intervals[n] gets the ConfidenceInterval put into it.

source
ProfileLikelihood.reach_min_steps!Function
reach_min_steps!(param_vals, profile_vals, other_mles, param_range,
+    kwargs...) where {F}

Given an existing prof::ProfileLikelihoodSolution, refines the profile results for the parameters in n by adding more points. The keyword arguments are the same as for profile. target_number is the total number of points that should be included in the end (not how many more are added).

source
ProfileLikelihood.set_next_initial_estimate!Method
set_next_initial_estimate!(sub_cache, param_vals, other_mles, prob, θₙ; next_initial_estimate_method=Val(:prev))

Method for selecting the next initial estimate for the optimisers. sub_cache is the cache vector for placing the initial estimate into, param_vals is the current list of parameter values for the interest parameter, and other_mles is the corresponding list of previous optimisers. prob is the OptimizationProblem. The value θₙ is the next value of the interest parameter.

The available methods are:

  • next_initial_estimate_method = Val(:prev): If this is selected, simply use other_mles[end], i.e. the previous optimiser.
  • next_initial_estimate_method = Val(:interp): If this is selected, the next optimiser is determined via linear interpolation using the data (param_vals[end-1], other_mles[end-1]), (param_vals[end], other_mles[end]). If the new approximation is outside of the parameter bounds, falls back to next_initial_estimate_method = :prev.
source
ProfileLikelihood.get_confidence_intervals!Function
get_confidence_intervals!(confidence_intervals, method, n, param_vals, profile_vals, threshold, spline_alg, extrap, mles, conf_level)

Method for computing the confidence intervals.

Arguments

  • confidence_intervals: The dictionary storing the confidence intervals.

  • method: The method to use for computing the confidence interval. The available methods are:

    • method = Val(:spline): Fits a spline to (param_vals, profile_vals) and finds where the continuous spline equals threshold.
    • method = Val(:extrema): Takes the first and last values in param_vals whose corresponding value in profile_vals exceeds threshold.
  • n: The parameter being profiled.

  • param_vals: The parameter values.

  • profile_vals: The profile values.

  • threshold: The threshold for the confidence interval.

  • spline_alg: The algorithm to use for fitting a spline.

  • extrap: The extrapolation algorithm used for the spline.

  • mles: The MLEs.

  • conf_level: The confidence level for the confidence interval.

Outputs

There are no outputs - confidence_intervals[n] gets the ConfidenceInterval put into it.

source
ProfileLikelihood.reach_min_steps!Function
reach_min_steps!(param_vals, profile_vals, other_mles, param_range,
     restricted_prob, n, cache, alg, sub_cache, ℓmax, normalise,
-    threshold, min_steps, mles; min_steps_fallback=Val(:replace), next_initial_estimate_method=Val(:interp), kwargs...)

Updates the results from the side of a profile likelihood (e.g. left or right side, see find_endpoint!) to meet the minimum number of steps min_steps.

Arguments

  • param_vals: The parameter values.
  • profile_vals: The profile values.
  • other_mles: The other MLEs, i.e. the optimised parameters for the corresponding fixed parameter values in param_vals.
  • param_range: The vector of parameter values.
  • restricted_prob: The optimisation problem, restricted to the nth parameter.
  • n: The parameter being profiled.
  • cache: A cache for the complete parameter vector.
  • alg: The algorithm used for optimising.
  • sub_cache: A cache for the parameter vector excluding the nth parameter.
  • ℓmax: The maximum likelihood.
  • normalise: Whether the optimisation problem is normalised.
  • threshold: The threshold for the confidence interval.
  • min_steps: The minimum number of steps to reach.
  • mles: The MLEs.

Keyword Arguments

  • min_steps_fallback=Val(:interp): The method used for reaching the minimum number of steps. The available methods are:

    • min_steps_fallback = Val(:replace): This method completely replaces the profile, defining a grid from the MLE to the computed endpoint with min_steps points. No information is re-used.
    • min_steps_fallback = Val(:refine): This method just adds more points to the profile, filling in enough points so that the total number of points is min_steps. The initial estimates in this case come from a spline from other_mles.
    • min_steps_fallback = Val(:parallel_refine): This applies the method above, except in parallel.
  • next_initial_estimate_method=Val(:replace): The method used for obtaining initial estimates. See also set_next_initial_estimate!.

source

BivariateProfileLikelihoodSolution

ProfileLikelihood.BivariateProfileLikelihoodSolutionType
BivariateProfileLikelihoodSolution{I,V,LP,LS,Spl,CT,CF,OM}

Struct for the normalised bivariate profile log-likelihood. See bivariate_profile for a constructor.

Arguments

  • parameter_values::Dict{I, G}

Maps the tuple (i, j) to the grid values used for this parameter pair. The result is a Tuple, with the first element the grid for the ith parameter, and the second element the grid for the jth parameter. The grids are given as OffsetVectors, with the 0th index the MLE, negative indices to the left of the MLE, and positive indices to the right of the MLE.

  • profile_values::Dict{I, V}

Maps the tuple (i, j) to the matrix used for this parameter pair. The result is a OffsetMatrix, with the (k, ℓ) entry the profile at (parameter_values[(i, j)][1][k], parameter_values[(i, j)][2][k]), and particularly the (0, 0) entry is the profile at the MLEs.

  • likelihood_problem::LP

The original likelihood problem.

  • likelihood_solution::LS

The original likelihood solution.

  • interpolants::Dict{I,Spl}

Maps the tuple (i, j) to the interpolant for that parameter pair's profile. This interpolant also uses linear extrapolation.

  • confidence_regions::Dict{I,ConfidenceRegion{CT,CF}}

Maps the tuple (i, j) to the confidence region for that parameter pair's confidence region. See also ConfidenceRegion.

  • other_mles::OM

Maps the tuple (i, j) to an OffsetMatrix storing the solutions for the nuisance parameters at the corresponding grid values.

Interpolant evaluation

This struct is callable. We define the method

(prof::BivariateProfileLikelihoodSolution)(θ, ψ, i, j)

that evaluates the interpolant through the (i, j)th profile at the point (θ, ψ).

source
ProfileLikelihood.ConfidenceRegionType
struct ConfidenceRegion{T, F}

Struct representing a confidence region.

Fields

  • x::T

The x-coordinates for the region's boundary.

  • y::T

The y-coordinates for the region's boundary.

  • level::F

The level of the confidence region.

source
ProfileLikelihood.bivariate_profileFunction
bivariate_profile(prob::LikelihoodProblem, sol::LikelihoodSolution, n::NTuple{M,NTuple{2,Int}};
+    threshold, min_steps, mles; min_steps_fallback=Val(:replace), next_initial_estimate_method=Val(:interp), kwargs...)

Updates the results from the side of a profile likelihood (e.g. left or right side, see find_endpoint!) to meet the minimum number of steps min_steps.

Arguments

  • param_vals: The parameter values.
  • profile_vals: The profile values.
  • other_mles: The other MLEs, i.e. the optimised parameters for the corresponding fixed parameter values in param_vals.
  • param_range: The vector of parameter values.
  • restricted_prob: The optimisation problem, restricted to the nth parameter.
  • n: The parameter being profiled.
  • cache: A cache for the complete parameter vector.
  • alg: The algorithm used for optimising.
  • sub_cache: A cache for the parameter vector excluding the nth parameter.
  • ℓmax: The maximum likelihood.
  • normalise: Whether the optimisation problem is normalised.
  • threshold: The threshold for the confidence interval.
  • min_steps: The minimum number of steps to reach.
  • mles: The MLEs.

Keyword Arguments

  • min_steps_fallback=Val(:interp): The method used for reaching the minimum number of steps. The available methods are:

    • min_steps_fallback = Val(:replace): This method completely replaces the profile, defining a grid from the MLE to the computed endpoint with min_steps points. No information is re-used.
    • min_steps_fallback = Val(:refine): This method just adds more points to the profile, filling in enough points so that the total number of points is min_steps. The initial estimates in this case come from a spline from other_mles.
    • min_steps_fallback = Val(:parallel_refine): This applies the method above, except in parallel.
  • next_initial_estimate_method=Val(:replace): The method used for obtaining initial estimates. See also set_next_initial_estimate!.

source

BivariateProfileLikelihoodSolution

ProfileLikelihood.BivariateProfileLikelihoodSolutionType
BivariateProfileLikelihoodSolution{I,V,LP,LS,Spl,CT,CF,OM}

Struct for the normalised bivariate profile log-likelihood. See bivariate_profile for a constructor.

Arguments

  • parameter_values::Dict{I, G}

Maps the tuple (i, j) to the grid values used for this parameter pair. The result is a Tuple, with the first element the grid for the ith parameter, and the second element the grid for the jth parameter. The grids are given as OffsetVectors, with the 0th index the MLE, negative indices to the left of the MLE, and positive indices to the right of the MLE.

  • profile_values::Dict{I, V}

Maps the tuple (i, j) to the matrix used for this parameter pair. The result is a OffsetMatrix, with the (k, ℓ) entry the profile at (parameter_values[(i, j)][1][k], parameter_values[(i, j)][2][k]), and particularly the (0, 0) entry is the profile at the MLEs.

  • likelihood_problem::LP

The original likelihood problem.

  • likelihood_solution::LS

The original likelihood solution.

  • interpolants::Dict{I,Spl}

Maps the tuple (i, j) to the interpolant for that parameter pair's profile. This interpolant also uses linear extrapolation.

  • confidence_regions::Dict{I,ConfidenceRegion{CT,CF}}

Maps the tuple (i, j) to the confidence region for that parameter pair's confidence region. See also ConfidenceRegion.

  • other_mles::OM

Maps the tuple (i, j) to an OffsetMatrix storing the solutions for the nuisance parameters at the corresponding grid values.

Interpolant evaluation

This struct is callable. We define the method

(prof::BivariateProfileLikelihoodSolution)(θ, ψ, i, j)

that evaluates the interpolant through the (i, j)th profile at the point (θ, ψ).

source
ProfileLikelihood.ConfidenceRegionType
struct ConfidenceRegion{T, F}

Struct representing a confidence region.

Fields

  • x::T

The x-coordinates for the region's boundary.

  • y::T

The y-coordinates for the region's boundary.

  • level::F

The level of the confidence region.

source
ProfileLikelihood.bivariate_profileFunction
bivariate_profile(prob::LikelihoodProblem, sol::LikelihoodSolution, n::NTuple{M,NTuple{2,Int}};
     alg=get_optimiser(sol),
     conf_level::F=0.95,
     confidence_region_method=Val(:contour),
@@ -59,8 +59,8 @@
     normalise=Val(true),
     parallel=Val(false),
     next_initial_estimate_method=Val(:nearest),
-    kwargs...) where {M,F}

Computes bivariates profile likelihoods for the parameters from a likelihood problem prob with MLEs sol. You can also call this function using Symbols, e.g. if get_syms(prob) = [:λ, :K, :u₀], then calling bivariate_profile(prob, sol, ((:λ, :K), (:K, u₀))) is the same as calling bivariate_profile(prob, sol, ((1, 2), (2, 3))) (the integer coordinate representation is still used in the solution, though).

For plotting, see the plot_profiles function (requires that you have loaded a backend of Makie.jl).

Arguments

  • prob::LikelihoodProblem: The LikelihoodProblem.
  • sol::LikelihoodSolution: The LikelihoodSolution. See also mle.
  • n::NTuple{M,NTuple{2,Int}}: The parameter indices to compute the profile likelihoods for. These should be tuples of indices, e.g. n = ((1, 2),) will compute the bivariate profile between the parameters 1 and 2.

Keyword Arguments

  • alg=get_optimiser(sol): The optimiser to use for solving each optimisation problem.
  • conf_level::F=0.95: The level to use for the ConfidenceRegions.
  • confidence_region_method=:contour: The method to use for computing the confidence regions. See also get_confidence_regions!. The default, :contour, using Contour.jl to compute the boundary of the confidence region. An alternative option is :delaunay, which uses DelaunayTriangulation.jl and triangulation contouring to find the boundary. This latter option is only available if you have already done using DelaunayTriangulation.
  • threshold=get_chisq_threshold(conf_level, 2): The threshold to use for defining the confidence regions.
  • resolution=200: The number of points to use for defining grids below, giving the number of points to the left and right of each interest parameter. This can also be a vector, e.g. resolution = [20, 50, 60] will use 20 points for the first parameter, 50 for the second, and 60 for the third. When defining the grid between pairs of values, the maximum of the two resolutions is used (thus defining a square grid).
  • grids=construct_profile_grids(n, sol, get_lower_bounds(prob), get_upper_bounds(prob), resolution): The grids to use for each parameter pair.
  • min_layers=10: The minimum number of layers to allow for the profile away from the MLE. If fewer than this number of layers are used before reaching the threshold, then the algorithm restarts and computes the profile likelihood a number min_steps of points in that direction.
  • outer_layers=0: The number of layers to go out away from the bounding box of the confidence region.
  • normalise=true: Whether to optimise the normalised profile log-likelihood or not.
  • parallel=false: Whether to use multithreading. If true, will use multithreading so that multiple parameters are profiled at once, and the work done evaluating the solution at each node in a layer is distributed across each thread.
  • next_initial_estimate_method = :nearest: Method for selecting the next initial estimate when stepping onto the next layer when profiling. :nearest simply uses the solution at the nearest node from the previous layer, but you can also use :mle to reuse the MLE or :interp to use linear interpolation. See also set_next_initial_estimate!.
  • kwargs...: Extra keyword arguments to pass into solve for solving the OptimizationProblem. See also the docs from Optimization.jl.

Output

Returns a BivariateProfileLikelihoodSolution.

source
ProfileLikelihood.set_next_initial_estimate!Method
set_next_initial_estimate!(sub_cache, other_mles, I::CartesianIndex, fixed_vals, grid, layer, prob, next_initial_estimate_method::Val{M}) where {M}

Method for selecting the next initial estimate for the optimisers.

Arguments

  • sub_cache: Cache for the next initial estimate.
  • other_mles: Solutions to the optimisation problems found so far.
  • I::CartesianIndex: The coordinate of the node currently being considered.
  • fixed_vals: The current values for the parameters of interest.
  • grid: The grid for the parameters of interest.
  • layer: The current layer.
  • prob: The restricted optimisation problem.
  • next_initial_estimate_method::Val{M}: The method to use.

The methods currently available for next_initial_estimate_method are:

  • next_initial_estimate_method = Val(:mle): Simply sets sub_cache to be the MLE.
  • next_initial_estimate_method = Val(:nearest): Sets sub_cache to be other_mles[J], where J is the nearest node to I in the previous layer.
  • next_initial_estimate_method = Val(:interp): Uses linear interpolation from all the previous layers to extrapolate and compute a new sub_cache.

Outputs

There are no outputs.

source

Prediction intervals

ProfileLikelihood.get_prediction_intervalsFunction
get_prediction_intervals(q, prof::(Bivariate)ProfileLikelihoodSolution, data;
-    q_prototype=isinplace(q, 3) ? nothing : build_q_prototype(q, prof, data), resolution=250)

Obtain prediction intervals for the output of the prediction function q, assuming q returns (or operates in-place on) a vector.

Arguments

  • q: The prediction function, taking either the form (θ, data) or (cache, θ, data). The former version is an out-of-place version, returning the full vector, while the latter version is an in-place version, with the output being placed into cache. The argument θ is the same as the parameters used in the likelihood problem (from prof), and the data argument is the same data as in this function.
  • prof::(Bivariate)ProfileLikelihoodSolution: The profile likelihood results.
  • data: The argument data in q.

Keyword Arguments

  • q_prototype=isinplace(q, 3) ? nothing : build_q_prototype(q, prof, data): A prototype for the result of q. If you are using the q(θ, data) version of q, this can be inferred from build_q_prototype, but if you are using the in-place version then a build_q_prototype is needed. For example, if q returns a vector with eltype Float64 and has length 27, q_prototype could be zeros(27).
  • resolution::Integer=250: The amount of curves to evaluate for each parameter. This will be the same for each parameter. If prof isa BivariateProfileLikelihoodSolution, then resolution^2 points are defined inside a bounding box for the confidence region, and then we throw away all points outside of the actual confidence region.
  • parallel=false: Whether to use multithreading. Multithreading is used when building q_vals below.

Outputs

Four values are returned. In order:

  • individual_intervals: Prediction intervals for the output of q, relative to each parameter.
  • union_intervals: The union of the individual prediction intervals from individual_intervals.
  • q_vals: Values of q at each parameter considered. The output is a Dict, where the parameter index is mapped to a matrix where each column is an output from q, with the jth column corresponding to the parameter value at param_ranges[j].
  • param_ranges: Parameter values used for each prediction interval.
source

Plotting

ProfileLikelihood.plot_profilesFunction
plot_profiles(prof::ProfileLikelihoodSolution, vars = profiled_parameters(prof); 
+    kwargs...) where {M,F}

Computes bivariates profile likelihoods for the parameters from a likelihood problem prob with MLEs sol. You can also call this function using Symbols, e.g. if get_syms(prob) = [:λ, :K, :u₀], then calling bivariate_profile(prob, sol, ((:λ, :K), (:K, u₀))) is the same as calling bivariate_profile(prob, sol, ((1, 2), (2, 3))) (the integer coordinate representation is still used in the solution, though).

For plotting, see the plot_profiles function (requires that you have loaded a backend of Makie.jl).

Arguments

  • prob::LikelihoodProblem: The LikelihoodProblem.
  • sol::LikelihoodSolution: The LikelihoodSolution. See also mle.
  • n::NTuple{M,NTuple{2,Int}}: The parameter indices to compute the profile likelihoods for. These should be tuples of indices, e.g. n = ((1, 2),) will compute the bivariate profile between the parameters 1 and 2.

Keyword Arguments

  • alg=get_optimiser(sol): The optimiser to use for solving each optimisation problem.
  • conf_level::F=0.95: The level to use for the ConfidenceRegions.
  • confidence_region_method=:contour: The method to use for computing the confidence regions. See also get_confidence_regions!. The default, :contour, using Contour.jl to compute the boundary of the confidence region. An alternative option is :delaunay, which uses DelaunayTriangulation.jl and triangulation contouring to find the boundary. This latter option is only available if you have already done using DelaunayTriangulation.
  • threshold=get_chisq_threshold(conf_level, 2): The threshold to use for defining the confidence regions.
  • resolution=200: The number of points to use for defining grids below, giving the number of points to the left and right of each interest parameter. This can also be a vector, e.g. resolution = [20, 50, 60] will use 20 points for the first parameter, 50 for the second, and 60 for the third. When defining the grid between pairs of values, the maximum of the two resolutions is used (thus defining a square grid).
  • grids=construct_profile_grids(n, sol, get_lower_bounds(prob), get_upper_bounds(prob), resolution): The grids to use for each parameter pair.
  • min_layers=10: The minimum number of layers to allow for the profile away from the MLE. If fewer than this number of layers are used before reaching the threshold, then the algorithm restarts and computes the profile likelihood a number min_steps of points in that direction.
  • outer_layers=0: The number of layers to go out away from the bounding box of the confidence region.
  • normalise=true: Whether to optimise the normalised profile log-likelihood or not.
  • parallel=false: Whether to use multithreading. If true, will use multithreading so that multiple parameters are profiled at once, and the work done evaluating the solution at each node in a layer is distributed across each thread.
  • next_initial_estimate_method = :nearest: Method for selecting the next initial estimate when stepping onto the next layer when profiling. :nearest simply uses the solution at the nearest node from the previous layer, but you can also use :mle to reuse the MLE or :interp to use linear interpolation. See also set_next_initial_estimate!.
  • kwargs...: Extra keyword arguments to pass into solve for solving the OptimizationProblem. See also the docs from Optimization.jl.

Output

Returns a BivariateProfileLikelihoodSolution.

source
ProfileLikelihood.set_next_initial_estimate!Method
set_next_initial_estimate!(sub_cache, other_mles, I::CartesianIndex, fixed_vals, grid, layer, prob, next_initial_estimate_method::Val{M}) where {M}

Method for selecting the next initial estimate for the optimisers.

Arguments

  • sub_cache: Cache for the next initial estimate.
  • other_mles: Solutions to the optimisation problems found so far.
  • I::CartesianIndex: The coordinate of the node currently being considered.
  • fixed_vals: The current values for the parameters of interest.
  • grid: The grid for the parameters of interest.
  • layer: The current layer.
  • prob: The restricted optimisation problem.
  • next_initial_estimate_method::Val{M}: The method to use.

The methods currently available for next_initial_estimate_method are:

  • next_initial_estimate_method = Val(:mle): Simply sets sub_cache to be the MLE.
  • next_initial_estimate_method = Val(:nearest): Sets sub_cache to be other_mles[J], where J is the nearest node to I in the previous layer.
  • next_initial_estimate_method = Val(:interp): Uses linear interpolation from all the previous layers to extrapolate and compute a new sub_cache.

Outputs

There are no outputs.

source

Prediction intervals

ProfileLikelihood.get_prediction_intervalsFunction
get_prediction_intervals(q, prof::(Bivariate)ProfileLikelihoodSolution, data;
+    q_prototype=isinplace(q, 3) ? nothing : build_q_prototype(q, prof, data), resolution=250)

Obtain prediction intervals for the output of the prediction function q, assuming q returns (or operates in-place on) a vector.

Arguments

  • q: The prediction function, taking either the form (θ, data) or (cache, θ, data). The former version is an out-of-place version, returning the full vector, while the latter version is an in-place version, with the output being placed into cache. The argument θ is the same as the parameters used in the likelihood problem (from prof), and the data argument is the same data as in this function.
  • prof::(Bivariate)ProfileLikelihoodSolution: The profile likelihood results.
  • data: The argument data in q.

Keyword Arguments

  • q_prototype=isinplace(q, 3) ? nothing : build_q_prototype(q, prof, data): A prototype for the result of q. If you are using the q(θ, data) version of q, this can be inferred from build_q_prototype, but if you are using the in-place version then a build_q_prototype is needed. For example, if q returns a vector with eltype Float64 and has length 27, q_prototype could be zeros(27).
  • resolution::Integer=250: The amount of curves to evaluate for each parameter. This will be the same for each parameter. If prof isa BivariateProfileLikelihoodSolution, then resolution^2 points are defined inside a bounding box for the confidence region, and then we throw away all points outside of the actual confidence region.
  • parallel=false: Whether to use multithreading. Multithreading is used when building q_vals below.

Outputs

Four values are returned. In order:

  • individual_intervals: Prediction intervals for the output of q, relative to each parameter.
  • union_intervals: The union of the individual prediction intervals from individual_intervals.
  • q_vals: Values of q at each parameter considered. The output is a Dict, where the parameter index is mapped to a matrix where each column is an output from q, with the jth column corresponding to the parameter value at param_ranges[j].
  • param_ranges: Parameter values used for each prediction interval.
source

Plotting

ProfileLikelihood.plot_profilesFunction
plot_profiles(prof::ProfileLikelihoodSolution, vars = profiled_parameters(prof); 
     ncol=nothing, 
     nrow=nothing,
     true_vals=Dict(vars .=> nothing), 
@@ -81,7 +81,7 @@
     interpolation=false,
     smooth_confidence_boundary=false,
     close_contour=true,
-    latex_names=Dict(1:number_of_parameters(get_likelihood_problem(prof)) .=> get_syms(prof)))

Plot results from a bivariate profile likelihood solution prof. To use this function you you need to have done using CairoMakie (or any other Makie backend).

Arguments

  • prof::ProfileLikelihoodSolution: The profile likelihood solution from profile.
  • vars = profiled_parameters(prof): The parameters to plot.

Keyword Arguments

  • ncol=nothing: The number of columns to use. If nothing, chosen automatically via choose_grid_layout.
  • nrow=nothing: The number of rows to use. If nothing, chosen automatically via choose_grid_layout
  • true_vals=Dict(1:number_of_parameters(get_likelihood_problem(prof)) .=> nothing): A dictionary mapping parameter indices to their true values, if they exist. If nothing, nothing is plotted, otherwise a black dot is plotted at the true value on the bivariate profile's plot.
  • show_mles=true: Whether to put a red dot at the MLEs.
  • fig_kwargs=nothing: Extra keyword arguments for Figure (see the Makie docs).
  • axis_kwargs=nothing: Extra keyword arguments for Axis (see the Makie docs).
  • interpolation=false: Whether to plot the profile using the interpolant (true), or to use the data from prof directly (false).
  • smooth_confidence_boundary=false: Whether to smooth the confidence region boundary when plotting (true) or not (false). The smoothing is done with a spline.
  • close_contour=true: Whether to connect the last part of the confidence region boundary to the beginning (true) or not (false).
  • latex_names=Dict(1:number_of_parameters(get_likelihood_problem(prof)) .=> get_syms(prof))): LaTeX names to use for the parameters. Defaults to the syms names.
  • xlim_tuples=nothing: xlims to use for each plot. nothing if the xlims should be set automatically.
  • ylim_tuples=nothing: ylims to use for each plot. nothing if the ylims should be set automatically.

Output

The Figure() is returned.

source

GridSearch

Grid definitions

ProfileLikelihood.AbstractGridType
abstract type AbstractGrid{N,B,T}

Type representing a grid, where N is the number of parameters, B is the type for the bounds, and T is the number type.

source
ProfileLikelihood.RegularGridType
struct RegularGrid{N,B,R,S,T} <: AbstractGrid{N,B,T}

Struct for a grid in which each parameter is regularly spaced.

Fields

  • lower_bounds::B: Lower bounds for each parameter.
  • upper_bounds::B: Upper bounds for each parameter.
  • resolution::R: Number of grid points for each parameter. If R <: Number, then the same number of grid points is used for each parameter.
  • step_sizes::S: Grid spacing for each parameter.

Constructor

You can construct a RegularGrid using RegularGrid(lower_bounds, upper_bounds, resolution).

source
ProfileLikelihood.FusedRegularGridType
struct FusedRegularGrid{N,B,R,S,T,C,OR} <: AbstractGrid{N,B,T}

Struct representing the fusing of two grids.

Fields

  • positive_grid::RegularGrid{N,B,R,S,T}

This is the first part of the grid, indexed into by positive integers.

  • negative_grid::RegularGrid{N,B,R,S,T}

This is the second part of the grid, indexed into by negative integers.

  • centre::C

The two grids meet at a common centre, and this is that centre.

  • resolutions::R

This is the vector of resolutions provided (e.g. if store_original_resolutions=true in the constructor below), or the transformed version from get_resolution_tuples.

Constructor

You can construct a FusedRegularGrid using the method

FusedRegularGrid(lower_bounds::B, upper_bounds::B, centre::C, resolutions::R; store_original_resolutions=false) where {B,R,C}

For example, the following code creates fused as the fusion of grid_1 and grid_2:

lb = [2.0, 3.0, 1.0, 5.0]
+    latex_names=Dict(1:number_of_parameters(get_likelihood_problem(prof)) .=> get_syms(prof)))

Plot results from a bivariate profile likelihood solution prof. To use this function you you need to have done using CairoMakie (or any other Makie backend).

Arguments

  • prof::ProfileLikelihoodSolution: The profile likelihood solution from profile.
  • vars = profiled_parameters(prof): The parameters to plot.

Keyword Arguments

  • ncol=nothing: The number of columns to use. If nothing, chosen automatically via choose_grid_layout.
  • nrow=nothing: The number of rows to use. If nothing, chosen automatically via choose_grid_layout
  • true_vals=Dict(1:number_of_parameters(get_likelihood_problem(prof)) .=> nothing): A dictionary mapping parameter indices to their true values, if they exist. If nothing, nothing is plotted, otherwise a black dot is plotted at the true value on the bivariate profile's plot.
  • show_mles=true: Whether to put a red dot at the MLEs.
  • fig_kwargs=nothing: Extra keyword arguments for Figure (see the Makie docs).
  • axis_kwargs=nothing: Extra keyword arguments for Axis (see the Makie docs).
  • interpolation=false: Whether to plot the profile using the interpolant (true), or to use the data from prof directly (false).
  • smooth_confidence_boundary=false: Whether to smooth the confidence region boundary when plotting (true) or not (false). The smoothing is done with a spline.
  • close_contour=true: Whether to connect the last part of the confidence region boundary to the beginning (true) or not (false).
  • latex_names=Dict(1:number_of_parameters(get_likelihood_problem(prof)) .=> get_syms(prof))): LaTeX names to use for the parameters. Defaults to the syms names.
  • xlim_tuples=nothing: xlims to use for each plot. nothing if the xlims should be set automatically.
  • ylim_tuples=nothing: ylims to use for each plot. nothing if the ylims should be set automatically.

Output

The Figure() is returned.

source

GridSearch

Grid definitions

ProfileLikelihood.AbstractGridType
abstract type AbstractGrid{N,B,T}

Type representing a grid, where N is the number of parameters, B is the type for the bounds, and T is the number type.

source
ProfileLikelihood.RegularGridType
struct RegularGrid{N,B,R,S,T} <: AbstractGrid{N,B,T}

Struct for a grid in which each parameter is regularly spaced.

Fields

  • lower_bounds::B: Lower bounds for each parameter.
  • upper_bounds::B: Upper bounds for each parameter.
  • resolution::R: Number of grid points for each parameter. If R <: Number, then the same number of grid points is used for each parameter.
  • step_sizes::S: Grid spacing for each parameter.

Constructor

You can construct a RegularGrid using RegularGrid(lower_bounds, upper_bounds, resolution).

source
ProfileLikelihood.FusedRegularGridType
struct FusedRegularGrid{N,B,R,S,T,C,OR} <: AbstractGrid{N,B,T}

Struct representing the fusing of two grids.

Fields

  • positive_grid::RegularGrid{N,B,R,S,T}

This is the first part of the grid, indexed into by positive integers.

  • negative_grid::RegularGrid{N,B,R,S,T}

This is the second part of the grid, indexed into by negative integers.

  • centre::C

The two grids meet at a common centre, and this is that centre.

  • resolutions::R

This is the vector of resolutions provided (e.g. if store_original_resolutions=true in the constructor below), or the transformed version from get_resolution_tuples.

Constructor

You can construct a FusedRegularGrid using the method

FusedRegularGrid(lower_bounds::B, upper_bounds::B, centre::C, resolutions::R; store_original_resolutions=false) where {B,R,C}

For example, the following code creates fused as the fusion of grid_1 and grid_2:

lb = [2.0, 3.0, 1.0, 5.0]
 ub = [15.0, 13.0, 27.0, 10.0]
 centre = [7.3, 8.3, 2.3, 7.5]
 grid_1 = RegularGrid(centre .+ (ub .- centre) / 173, ub, 173)
@@ -92,4 +92,4 @@
 res = [(2, 11), (23, 25), (19, 21), (50, 51), (17, 99)]
 grid_1 = RegularGrid(centre .+ (ub .- centre) ./ [2, 23, 19, 50, 17], ub, [2, 23, 19, 50, 17])
 grid_2 = RegularGrid(centre .- (centre .- lb) ./ [11, 25, 21, 51, 99], lb, [11, 25, 21, 51, 99])
-fused = ProfileLikelihood.FusedRegularGrid(lb, ub, centre, res) # fused grid_1 and grid_2
source
ProfileLikelihood.IrregularGridType
struct IrregularGrid{N,B,R,S,T} <: AbstractGrid{N,B,T}

Struct for an irregular grid of parameters.

Fields

  • lower_bounds::B: Lower bounds for each parameter.
  • upper_bounds::B: Upper bounds for each parameter.
  • grid::G: The set of parameter values, e.g. a matrix where each column is the parameter vector.

Constructor

You can construct a IrregularGrid using IrregularGrid(lower_bounds, upper_bounds, grid).

source
ProfileLikelihood.GridSearchType
struct GridSearch{F,G}

Struct for a GridSearch.

Fields

  • f::F: The function to optimise, of the form f(x, p).
  • p::P: The arguments p in the function f.
  • grid::G: The grid, where G<:AbstractGrid. See also grid_search.
source
ProfileLikelihood.grid_searchFunction
grid_search(prob; save_vals=Val(false), minimise:=Val(false), parallel=Val(false))

Performs a grid search for the given grid search problem prob.

Arguments

  • prob::GridSearch{F, G}: The grid search problem.

Keyword Arguments

  • save_vals:=Val(false): Whether to return a array with the function values.
  • minimise:=Val(false): Whether to minimise or to maximise the function.
  • parallel:=Val(false): Whether to run the grid search with multithreading.

Outputs

  • f_opt: The optimal objective value.
  • x_argopt: The parameter that gave f_opt.
  • f_res: If save_vals==Val(true), then this is the array of function values.
source
grid_search(f, grid::AbstractGrid; save_vals=Val(false), minimise=Val(false), parallel=Val(false))

For a given grid and function f, performs a grid search.

Arguments

  • f: The function to optimise.
  • grid::AbstractGrid: The grid to use for optimising.

Keyword Arguments

  • save_vals=Val(false): Whether to return a array with the function values.
  • minimise=Val(false): Whether to minimise or to maximise the function.
  • parallel=Val(false): Whether to run the grid search with multithreading.
source
grid_search(prob::LikelihoodProblem, grid::AbstractGrid, parallel=Val(false); save_vals=Val(false))

Given a grid and a likelihood problem prob, maximises it over the grid using a grid search. If save_vals==Val(true), then the likelihood function values at each gridpoint are returned. Set parallel=Val(true) if you want multithreading.

source
+fused = ProfileLikelihood.FusedRegularGrid(lb, ub, centre, res) # fused grid_1 and grid_2
source
ProfileLikelihood.IrregularGridType
struct IrregularGrid{N,B,R,S,T} <: AbstractGrid{N,B,T}

Struct for an irregular grid of parameters.

Fields

  • lower_bounds::B: Lower bounds for each parameter.
  • upper_bounds::B: Upper bounds for each parameter.
  • grid::G: The set of parameter values, e.g. a matrix where each column is the parameter vector.

Constructor

You can construct a IrregularGrid using IrregularGrid(lower_bounds, upper_bounds, grid).

source
ProfileLikelihood.GridSearchType
struct GridSearch{F,G}

Struct for a GridSearch.

Fields

  • f::F: The function to optimise, of the form f(x, p).
  • p::P: The arguments p in the function f.
  • grid::G: The grid, where G<:AbstractGrid. See also grid_search.
source
ProfileLikelihood.grid_searchFunction
grid_search(prob; save_vals=Val(false), minimise:=Val(false), parallel=Val(false))

Performs a grid search for the given grid search problem prob.

Arguments

  • prob::GridSearch{F, G}: The grid search problem.

Keyword Arguments

  • save_vals:=Val(false): Whether to return a array with the function values.
  • minimise:=Val(false): Whether to minimise or to maximise the function.
  • parallel:=Val(false): Whether to run the grid search with multithreading.

Outputs

  • f_opt: The optimal objective value.
  • x_argopt: The parameter that gave f_opt.
  • f_res: If save_vals==Val(true), then this is the array of function values.
source
grid_search(f, grid::AbstractGrid; save_vals=Val(false), minimise=Val(false), parallel=Val(false))

For a given grid and function f, performs a grid search.

Arguments

  • f: The function to optimise.
  • grid::AbstractGrid: The grid to use for optimising.

Keyword Arguments

  • save_vals=Val(false): Whether to return a array with the function values.
  • minimise=Val(false): Whether to minimise or to maximise the function.
  • parallel=Val(false): Whether to run the grid search with multithreading.
source
grid_search(prob::LikelihoodProblem, grid::AbstractGrid, parallel=Val(false); save_vals=Val(false))

Given a grid and a likelihood problem prob, maximises it over the grid using a grid search. If save_vals==Val(true), then the likelihood function values at each gridpoint are returned. Set parallel=Val(true) if you want multithreading.

source
diff --git a/dev/exponential/index.html b/dev/exponential/index.html index c7544db..1bac735 100644 --- a/dev/exponential/index.html +++ b/dev/exponential/index.html @@ -97,4 +97,4 @@ axis_kwargs=(width=600, height=300)) resize_to_layout!(fig)

-
+ diff --git a/dev/heat/index.html b/dev/heat/index.html index 44fae06..df511d4 100644 --- a/dev/heat/index.html +++ b/dev/heat/index.html @@ -479,4 +479,4 @@ q_lwr = minimum(q_mat; dims=2) |> vec q_upr = maximum(q_mat; dims=2) |> vec lines!(ax, t_many_pts, q_lwr, color=:magenta) -lines!(ax, t_many_pts, q_upr, color=:magenta) +lines!(ax, t_many_pts, q_upr, color=:magenta) diff --git a/dev/index.html b/dev/index.html index a9cfbc1..3512941 100644 --- a/dev/index.html +++ b/dev/index.html @@ -1,2 +1,2 @@ -Home · ProfileLikelihood.jl

ProfileLikelihood

DOI

This package defines the routines required for computing maximum likelihood estimates and profile likelihoods. The optimisation routines are built around the Optimization.jl interface, allowing us to e.g. easily switch between algorithms, between finite differences and automatic differentiation, and it allows for constraints to be defined with ease. Below we list the definitions we are using for likelihoods and profile likelihoods. This code only works for scalar or bivariate parameters of interest (i.e. out of a vector $\boldsymbol \theta$, you can profile a single scalar parameter $\theta_i \in \boldsymbol\theta$ or a pair $(\theta_i,\theta_j) \in \boldsymbol\theta$) for now. We use the following definitions:

Definition: Likelihood function (see Casella & Berger, 2002): Let $f(\boldsymbol x \mid \boldsymbol \theta)$ denote the joint probability density function (PDF) of the sample $\boldsymbol X = (X_1,\ldots,X_n)^{\mathsf T}$, where $\boldsymbol \theta \in \Theta$ is some set of parameters and $\Theta$ is the parameter space. We define the likelihood function $\mathcal L \colon \Theta \to [0, \infty)$ by $\mathcal L(\boldsymbol \theta \mid \boldsymbol x) = f(\boldsymbol x \mid \boldsymbol \theta)$ for some realisation $\boldsymbol x = (x_1,\ldots,x_n)^{\mathsf T}$ of $\boldsymbol X$. The log-likelihood function $\ell\colon\Theta\to\mathbb R$ is defined by $\ell(\boldsymbol \theta \mid \boldsymbol x) = \log\mathcal L(\boldsymbol\theta \mid \boldsymbol x)$.The maximum likelihood estimate (MLE) $\hat{\boldsymbol\theta}$ is the parameter $\boldsymbol\theta$ that maximises the likelihood function, $\hat{\boldsymbol{\theta}} = argmax_{\boldsymbol{\theta} \in \Theta} \mathcal{L}(\boldsymbol{\theta} \mid \boldsymbol x) = argmax_{\boldsymbol\theta \in \Theta} \ell(\boldsymbol\theta \mid \boldsymbol x)$.

Definition: Profile likelihood function (see Pawitan, 2001): Suppose we have some parameters of interest, $\boldsymbol \theta \in \Theta$, and some nuisance parameters, $\boldsymbol \phi \in \Phi$, and some data $\boldsymbol x = (x_1,\ldots,x_n)^{\mathsf T}$, giving some joint likelihood $\mathcal L \colon \Theta \cup \Phi \to [0, \infty)$ defined by $\mathcal L(\boldsymbol\theta, \boldsymbol\phi \mid \boldsymbol x)$. We define the profile likelihood $\mathcal L_p \colon \Theta \to [0, \infty)$ of $\boldsymbol\theta$ by $\mathcal L_p(\boldsymbol\theta \mid \boldsymbol x) = \sup_{\boldsymbol \phi \in \Phi \mid \boldsymbol \theta} \mathcal L(\boldsymbol \theta, \boldsymbol \phi \mid \boldsymbol x)$. The profile log-likelihood $\ell_p \colon \Theta \to \mathbb R$ of $\boldsymbol\theta$ is defined by $\ell_p(\boldsymbol \theta \mid \boldsymbol x) = \log \mathcal L_p(\boldsymbol\theta \mid \boldsymbol x)$. The normalised profile likelihood is defined by $\hat{\mathcal L}_p(\boldsymbol\theta \mid \boldsymbol x) = \mathcal L_p(\boldsymbol \theta \mid \boldsymbol x) - \mathcal L_p(\hat{\boldsymbol\theta} \mid \boldsymbol x)$, where $\hat{\boldsymbol\theta}$ is the MLE of $\boldsymbol\theta$, and similarly for the normalised profile log-likelihood.

From Wilk's theorem, we know that $2\hat{\ell}\_p(\boldsymbol\theta \mid \boldsymbol x) \geq -\chi_{p, 1-\alpha}^2$ is an approximate $100(1-\alpha)\%$ confidence region for $\boldsymbol \theta$, and this enables us to obtain confidence intervals for parameters by considering only their profile likelihood, where $\chi_{p,1-\alpha}^2$ is the $1-\alpha$ quantile of the $\chi_p^2$ distribution and $p$ is the length of $\boldsymbol\theta$. For the case of a scalar parameter of interest, $-\chi_{1, 0.95}^2/2 \approx -1.92$, and for a bivariate parameter of interest we have $-\chi_{2,0.95}^2/2 \approx -3$.

We compute the profile log-likelihood in this package by starting at the MLE, and stepping left/right until we reach a given threshold. The code is iterative to not waste time in so much of the parameter space. In the bivariate case, we start at the MLE and expand outwards in layers. This implementation is described in the documentation.

More detail about the methods we use in this package is given in the sections in the sidebar, with extra detail in the tests. All of the examples in the sidebar use a Gaussian likelihood, but of course the tools here work for any likelihood problem (e.g. some good examples that might be good to show are the Poisson problems here).

+Home · ProfileLikelihood.jl

ProfileLikelihood

DOI

This package defines the routines required for computing maximum likelihood estimates and profile likelihoods. The optimisation routines are built around the Optimization.jl interface, allowing us to e.g. easily switch between algorithms, between finite differences and automatic differentiation, and it allows for constraints to be defined with ease. Below we list the definitions we are using for likelihoods and profile likelihoods. This code only works for scalar or bivariate parameters of interest (i.e. out of a vector $\boldsymbol \theta$, you can profile a single scalar parameter $\theta_i \in \boldsymbol\theta$ or a pair $(\theta_i,\theta_j) \in \boldsymbol\theta$) for now. We use the following definitions:

Definition: Likelihood function (see Casella & Berger, 2002): Let $f(\boldsymbol x \mid \boldsymbol \theta)$ denote the joint probability density function (PDF) of the sample $\boldsymbol X = (X_1,\ldots,X_n)^{\mathsf T}$, where $\boldsymbol \theta \in \Theta$ is some set of parameters and $\Theta$ is the parameter space. We define the likelihood function $\mathcal L \colon \Theta \to [0, \infty)$ by $\mathcal L(\boldsymbol \theta \mid \boldsymbol x) = f(\boldsymbol x \mid \boldsymbol \theta)$ for some realisation $\boldsymbol x = (x_1,\ldots,x_n)^{\mathsf T}$ of $\boldsymbol X$. The log-likelihood function $\ell\colon\Theta\to\mathbb R$ is defined by $\ell(\boldsymbol \theta \mid \boldsymbol x) = \log\mathcal L(\boldsymbol\theta \mid \boldsymbol x)$.The maximum likelihood estimate (MLE) $\hat{\boldsymbol\theta}$ is the parameter $\boldsymbol\theta$ that maximises the likelihood function, $\hat{\boldsymbol{\theta}} = argmax_{\boldsymbol{\theta} \in \Theta} \mathcal{L}(\boldsymbol{\theta} \mid \boldsymbol x) = argmax_{\boldsymbol\theta \in \Theta} \ell(\boldsymbol\theta \mid \boldsymbol x)$.

Definition: Profile likelihood function (see Pawitan, 2001): Suppose we have some parameters of interest, $\boldsymbol \theta \in \Theta$, and some nuisance parameters, $\boldsymbol \phi \in \Phi$, and some data $\boldsymbol x = (x_1,\ldots,x_n)^{\mathsf T}$, giving some joint likelihood $\mathcal L \colon \Theta \cup \Phi \to [0, \infty)$ defined by $\mathcal L(\boldsymbol\theta, \boldsymbol\phi \mid \boldsymbol x)$. We define the profile likelihood $\mathcal L_p \colon \Theta \to [0, \infty)$ of $\boldsymbol\theta$ by $\mathcal L_p(\boldsymbol\theta \mid \boldsymbol x) = \sup_{\boldsymbol \phi \in \Phi \mid \boldsymbol \theta} \mathcal L(\boldsymbol \theta, \boldsymbol \phi \mid \boldsymbol x)$. The profile log-likelihood $\ell_p \colon \Theta \to \mathbb R$ of $\boldsymbol\theta$ is defined by $\ell_p(\boldsymbol \theta \mid \boldsymbol x) = \log \mathcal L_p(\boldsymbol\theta \mid \boldsymbol x)$. The normalised profile likelihood is defined by $\hat{\mathcal L}_p(\boldsymbol\theta \mid \boldsymbol x) = \mathcal L_p(\boldsymbol \theta \mid \boldsymbol x) - \mathcal L_p(\hat{\boldsymbol\theta} \mid \boldsymbol x)$, where $\hat{\boldsymbol\theta}$ is the MLE of $\boldsymbol\theta$, and similarly for the normalised profile log-likelihood.

From Wilk's theorem, we know that $2\hat{\ell}\_p(\boldsymbol\theta \mid \boldsymbol x) \geq -\chi_{p, 1-\alpha}^2$ is an approximate $100(1-\alpha)\%$ confidence region for $\boldsymbol \theta$, and this enables us to obtain confidence intervals for parameters by considering only their profile likelihood, where $\chi_{p,1-\alpha}^2$ is the $1-\alpha$ quantile of the $\chi_p^2$ distribution and $p$ is the length of $\boldsymbol\theta$. For the case of a scalar parameter of interest, $-\chi_{1, 0.95}^2/2 \approx -1.92$, and for a bivariate parameter of interest we have $-\chi_{2,0.95}^2/2 \approx -3$.

We compute the profile log-likelihood in this package by starting at the MLE, and stepping left/right until we reach a given threshold. The code is iterative to not waste time in so much of the parameter space. In the bivariate case, we start at the MLE and expand outwards in layers. This implementation is described in the documentation.

More detail about the methods we use in this package is given in the sections in the sidebar, with extra detail in the tests. All of the examples in the sidebar use a Gaussian likelihood, but of course the tools here work for any likelihood problem (e.g. some good examples that might be good to show are the Poisson problems here).

diff --git a/dev/interface/index.html b/dev/interface/index.html index 71ce5de..01383b3 100644 --- a/dev/interface/index.html +++ b/dev/interface/index.html @@ -27,4 +27,4 @@ interpolants::Dict{I,Spl} confidence_regions::Dict{I,ConfidenceRegion{CT,CF}} other_mles::OM -end

The definitions are similar to the univariate case, although parameter_values now maps to Tuples of grids, making use of OffsetVectors to define grids relative to an MLE. Similarly, we use OffsetMatrixs to define the grid of profile values, given in profile_values. You can also call the resulting struct, e.g. if prof is a BivariateProfileLikelihoodSolution, then prof(0.3, 0.9, :λ, :K) computes the bivariate profile between $\lambda$ and $K$ at $(\lambda,K)=(0.3,0.9)$. See ?ProfileLikelihood.BivariateProfileLikelihoodSolution for more detail, or its docstring in the sidebar.

You can also index such a prof to look at only a specific parameter. For example, prof[1, 2] will return a BivariateProfileLikelihoodSolutionView for the profile between the first and second parameter. Similarly, prof[:λ, :K] is the profile between $\lambda$ and $K$.

Propagating uncertainty: Prediction intervals

The confidence intervals obtained from profiling can be used to obtain approximate prediction intervals via profile-wise profile likelihoods, as defined e.g. in Simpson and Maclaren (2022), for a prediction function $\boldsymbol q(\boldsymbol\theta)$. These intervals can be based on varying a single parameter, or by taking the union of individual prediction intervals. The main function for this is get_prediction_intervals. Rather than explain in full detail here, please refer to the second example (the logistic ODE example), where we reproduce the first case study of Simpson and Maclaren (2022).

The full docstring for get_prediction_intervals is given in the sidebar.

Plotting

We provide a function plot_profiles that can be useful for plotting profile likelihoods. It requires that you have done

using CairoMakie # (or any other Makie backend)

(else the function does not exist, thanks to Requires.jl and package extensions from Julia v1.9). A full description of this function is given in the corresponding docstring in the sidebar.

GridSearch

it can sometimes be useful to evaluate the likelihood function over many points prior to optimising it, e.g. to find a good initial estimate or to just obtain data at many points for the purpose of visualisation. We provide functions for this, based on either a RegularGrid or an IrregularGrid.

A RegularGrid is a grid in which the grid for each parameter is uniformly spaced, so that the values for all parameter values to try fall on a lattice. An IrregularGrid allows for the parameters to take on whatever values you want, with the requirement that the parameter values to evaluate at are provided as a matrix with each column a different parameter set.

The function grid_search, after having defined a grid, can then be used for performing the grid search. The main method of interest is:

grid_search(prob::LikelihoodProblem, grid::AbstractGrid; save_vals=Val(false), parallel=Val(false))

Here, grid could be either a RegularGrid or an IrregularGrid. You can set save_vals=Val(true) if you want an array with all the likelihood function values, Val(false) otherwise. To enable multithreading, allowing for the evaluation of the function across different points via multiple threads, set parallel=Val(true), otherwise leave it as Val(false). The result of this grid search, if save_vals=Val(true), will be (sol, f_vals), where sol is a likelihood solution giving the parameters that gave to the highest likelihood, and f_res is the array of likelihoods at the corresponding parameters. If save_vals=Val(false), only sol is returned.

More detail is given in the examples, and complete docstrings are provided in the sidebar.

+end

The definitions are similar to the univariate case, although parameter_values now maps to Tuples of grids, making use of OffsetVectors to define grids relative to an MLE. Similarly, we use OffsetMatrixs to define the grid of profile values, given in profile_values. You can also call the resulting struct, e.g. if prof is a BivariateProfileLikelihoodSolution, then prof(0.3, 0.9, :λ, :K) computes the bivariate profile between $\lambda$ and $K$ at $(\lambda,K)=(0.3,0.9)$. See ?ProfileLikelihood.BivariateProfileLikelihoodSolution for more detail, or its docstring in the sidebar.

You can also index such a prof to look at only a specific parameter. For example, prof[1, 2] will return a BivariateProfileLikelihoodSolutionView for the profile between the first and second parameter. Similarly, prof[:λ, :K] is the profile between $\lambda$ and $K$.

Propagating uncertainty: Prediction intervals

The confidence intervals obtained from profiling can be used to obtain approximate prediction intervals via profile-wise profile likelihoods, as defined e.g. in Simpson and Maclaren (2022), for a prediction function $\boldsymbol q(\boldsymbol\theta)$. These intervals can be based on varying a single parameter, or by taking the union of individual prediction intervals. The main function for this is get_prediction_intervals. Rather than explain in full detail here, please refer to the second example (the logistic ODE example), where we reproduce the first case study of Simpson and Maclaren (2022).

The full docstring for get_prediction_intervals is given in the sidebar.

Plotting

We provide a function plot_profiles that can be useful for plotting profile likelihoods. It requires that you have done

using CairoMakie # (or any other Makie backend)

(else the function does not exist, thanks to Requires.jl and package extensions from Julia v1.9). A full description of this function is given in the corresponding docstring in the sidebar.

GridSearch

it can sometimes be useful to evaluate the likelihood function over many points prior to optimising it, e.g. to find a good initial estimate or to just obtain data at many points for the purpose of visualisation. We provide functions for this, based on either a RegularGrid or an IrregularGrid.

A RegularGrid is a grid in which the grid for each parameter is uniformly spaced, so that the values for all parameter values to try fall on a lattice. An IrregularGrid allows for the parameters to take on whatever values you want, with the requirement that the parameter values to evaluate at are provided as a matrix with each column a different parameter set.

The function grid_search, after having defined a grid, can then be used for performing the grid search. The main method of interest is:

grid_search(prob::LikelihoodProblem, grid::AbstractGrid; save_vals=Val(false), parallel=Val(false))

Here, grid could be either a RegularGrid or an IrregularGrid. You can set save_vals=Val(true) if you want an array with all the likelihood function values, Val(false) otherwise. To enable multithreading, allowing for the evaluation of the function across different points via multiple threads, set parallel=Val(true), otherwise leave it as Val(false). The result of this grid search, if save_vals=Val(true), will be (sol, f_vals), where sol is a likelihood solution giving the parameters that gave to the highest likelihood, and f_res is the array of likelihoods at the corresponding parameters. If save_vals=Val(false), only sol is returned.

More detail is given in the examples, and complete docstrings are provided in the sidebar.

diff --git a/dev/logistic/index.html b/dev/logistic/index.html index 245a760..7e11e8b 100644 --- a/dev/logistic/index.html +++ b/dev/logistic/index.html @@ -121,4 +121,4 @@ lines!(ax, t_many_pts, q_upr, color=:magenta, linewidth=3) resize_to_layout!(fig)

-

The first plot shows that the profile-wise prediction interval for $\lambda$ is quite large when $t$ is small, and then small for large time. This makes sense since the large time solution is independent of $\lambda$ (the large time solution is $u_s(t)=K$). For $K$, we see that the profile-wise interval only becomes large for large time, which again makes sense. For $u_0$ we see similar behaviour as for $\lambda$. Finally, taking the union over all the intervals, as is done in (d), shows that we fully enclose the solution coming from the MLE, as well as the true curve. The magenta curve shows the results from the full likelihood function, and is reasonably close to the approximate interval obtained from the union.

+

The first plot shows that the profile-wise prediction interval for $\lambda$ is quite large when $t$ is small, and then small for large time. This makes sense since the large time solution is independent of $\lambda$ (the large time solution is $u_s(t)=K$). For $K$, we see that the profile-wise interval only becomes large for large time, which again makes sense. For $u_0$ we see similar behaviour as for $\lambda$. Finally, taking the union over all the intervals, as is done in (d), shows that we fully enclose the solution coming from the MLE, as well as the true curve. The magenta curve shows the results from the full likelihood function, and is reasonably close to the approximate interval obtained from the union.

diff --git a/dev/lotka/index.html b/dev/lotka/index.html index 04d5c51..1f28ed0 100644 --- a/dev/lotka/index.html +++ b/dev/lotka/index.html @@ -242,4 +242,4 @@ lines!(_ax[k], t_many_pts, q_upr[idx], color=:magenta, linewidth=3) end

-
+ diff --git a/dev/math/index.html b/dev/math/index.html index 8a83f30..570e9f8 100644 --- a/dev/math/index.html +++ b/dev/math/index.html @@ -1,2 +1,2 @@ -Mathematical and Implementation Details · ProfileLikelihood.jl

Mathematical and Implementation Details

We now give some of the mathematical and implementation details used in this package, namely for computing the profile likelihood function and for computing prediction intervals.

Computing the profile likelihood function

Let us start by giving a mathematical description of the method that we use for computing the profile log-likelihood function. Suppose that we have a parameter vector $\boldsymbol\theta$ that we partition as $\boldsymbol \theta = (\boldsymbol\psi, \boldsymbol \omega)$ - $\boldsymbol\psi$ is either a scalar, $\psi$, or a 2-vector, $(\psi, \varphi)$. We suppose that we have a likelihood function $\mathcal L(\boldsymbol \theta) \equiv \mathcal L(\boldsymbol\psi, \boldsymbol \omega)$ so that the normalised profile log-likelihood function for $\boldsymbol\psi$ is defined as

\[\hat\ell_p(\boldsymbol\psi) = \sup_{\boldsymbol \omega \in \Omega \mid \boldsymbol\psi} \left[\ell(\boldsymbol\psi, \boldsymbol\omega) - \ell^*\right],\]

where $\Omega$ is the parameter space for $\boldsymbol \omega$, $\ell(\boldsymbol\psi,\boldsymbol\omega) = \log \mathcal L(\boldsymbol\psi, \boldsymbol \omega)$, and $\ell^* = \ell(\hat{\boldsymbol \theta})$, where $\boldsymbol \theta$ are the MLEs for $\boldsymbol \theta$. This definition of $\hat\ell_p(\boldsymbol\psi)$ induces a function $\boldsymbol\omega^*(\boldsymbol\psi)$ depending on $\boldsymbol\psi$ that gives the values of $\boldsymbol \omega$ leading to the supremum above, i.e.

\[\ell(\boldsymbol\psi, \boldsymbol\omega^{\star}(\psi)) = \sup_{\boldsymbol \omega \in \Omega \mid \boldsymbol\psi} \left[\ell(\boldsymbol\psi, \boldsymbol\omega) - \ell^{\star}\right]. \]

To compute $\hat\ell_p(\boldsymbol\psi)$, then, requires a way to efficiently compute the $\omega^*(\boldsymbol\psi)$, and requires knowing where to stop computing. Where we stop computing the profile likelihood is simply when $\hat\ell_p(\psi) < -\chi_{k,1-\alpha}^2/2$, where $\alpha$ is the significance level and $k=1$ if $\boldsymbol\psi = \psi$ and $k=2$ if $\boldsymbol\psi = (\psi,\varphi)$. This motivates a iterative algorithm, where we start at the MLE and expand outwards.

Univariate profile likelihoods

We first describe our implementation for a univariate profile, in which case $\boldsymbol\psi = \psi$. The basic summary of this procedure is that we simply step to the left and to the right of the MLE, continuing until we reach the threshold.

We describe how we evaluate the function to the right of the MLE – the case of going to the left is identical. First, we define $\psi_1 = \hat\psi$, where $\hat\psi$ is the MLE for $\psi$. This defines $\boldsymbol{\omega}_{1} = \boldsymbol{\omega}^{\star}(\psi_{1})$, which in this case just gives the MLE $\hat{\boldsymbol\theta} = (\hat\psi, \boldsymbol\omega_1)$ by definition. The value of the normalised profile log-likelihood here is simply $\hat\ell_1 = \hat\ell(\psi_1) = 0$. Then, defining some step size $\Delta\psi$, we define $\psi_2 = \psi_1 + \Delta \psi$, and in general $\psi_{j+1} = \psi_j + \Delta \psi$, we need to estimate $\boldsymbol\omega_2 = \boldsymbol \omega^*(\psi_2)$. We do this by starting an optimiser at the initial estimate $\boldsymbol\omega_2 = \boldsymbol\omega_1$ and then using this initial estimate to produce a refined value of $\boldsymbol\omega_2$ that we take as its true value. In particular, each $\boldsymbol\omega_j$ comes from starting the optimiser at the previous $\boldsymbol\omega_{j-1}$, and the value for $\hat\ell_j = \hat\ell(\psi_j)$ comes from the value of the likelihood at $(\psi_j, \boldsymbol\omega_j)$. The same holds when going to the left except with $\psi_{j+1} = \psi_j - \Delta\psi$, and then rearranging the indices $j$ when combining the results to the left and to the right. At each step, we check if $\hat\ell_j < -\chi_{1,1-\alpha}^2/2$ and, if so, we terminate.

Once we have terminated the algorithm, we need to obtain the confidence intervals. To do this, we fit a spline to the data $(\psi_j, \hat\ell_j)$, and use a bisection algorithm over the two intervals $(\min_j \psi_j, \hat\psi)$ and $(\hat\psi, \max_j\psi_j)$, to find where $\hat\ell_j = -\chi_{1-\alpha}^2/2$. This leads to two solutions $(L, U)$ that we take together to give the confidence interval for $\psi$.

This is all done for each parameter.

Note that a better method for initialising the optimisation for $\boldsymbol\omega_j$ may be to use e.g. linear interpolation for the previous two values, $\boldsymbol\omega_{j-1}$ and $\boldsymbol\omega_{j-2}$ (with special care for the bounds of the parameters). We provide support for this, letting $\boldsymbol\omega_j = [\boldsymbol\omega_{j-2}(\psi_{j-1} - \psi_j) + \boldsymbol\omega_{j-1}(\psi_j - \psi_{j-2})] / (\psi_{j-1} - \psi_{j-2})$. See the next_initial_estimate_method option in ?profile.

Bivariate profile likelihoods

Now we describe the implementation for a bivariate profile. In this case, there are many possibilities as instead of only having to think about going to the left and to the right, we could think about how we step away from the MLE in each direction, and how we want to stop iterating. This package currently has a basic implementation that we describe below, where we simply step out from the MLE in layers. In what follows, we let $\boldsymbol\psi = (\psi, \varphi)$.

To start, we suppose that we are on some square grid with integer coordinates $\{(i, j) : i, j = -N, -N+1, \ldots, 0, \ldots, N-1, N\}$, and we suppose that $(0, 0)$ refers to the MLE. We call the coordinate $(0, 0)$ the zeroth layer, denoted $L_0 = \{(0, 0)\}$. The $j$th layer, $L_j$, is defined to wrap around $L_{j-1}$. For example, $L_1$ wraps around $\{(0, 0)\}$ so that $L_1 = \{(-1,-1),(0,-1),(1,-1),(1,0),(1,1),(0,1),(-1,1),(-1,0)\}$. Note that $|L_j| = 8j$. The idea here is that we can solve the required optimisation problems at each $(i, j) \in L_j$, accelerating the solutions by making use of information at $L_{j-1}$ to define initial estimates for each optimisation problem. Layers are implemented via ProfileLikelihood.LayerIterator.

We also need to prescribe the parameter values that are defined at each coordinate. We suppose that we have bounds $\psi_L \leq \psi \leq \psi_U$ and $\varphi_L \leq \varphi \leq \varphi_U$ for $\psi$ and $\varphi$, and that we have MLEs $\hat\psi$ and $\hat\varphi$ for $\psi$ and $\varphi$, respectively. We then define $\Delta\psi_L = (\hat\psi - \psi_L)/N$, $\Delta\psi_R = (\psi_U - \hat\psi)/N$, $\Delta\varphi_L = (\hat\varphi-\varphi_L)/N$, and $\Delta\varphi_R = (\varphi_U - \hat\varphi)/N$. With these definitions, we let

\[\psi_j = \begin{cases} \psi_0 + j\Delta\psi_R & j > 0, \\ \hat\psi & j = 0, \\ \psi_0 - |j|\Delta\psi_L & j < 0, \end{cases} \qquad \varphi_j = \begin{cases} \varphi_0 + j\Delta\varphi_R & j>0, \\ \hat\varphi & j = 0, \\ \varphi_0 - |j|\Delta\varphi_L & j < 0. \end{cases} \]

We thus associate the parameter values $(\psi_i, \varphi_j)$ with the coordinate $(i, j)$. We will similarly associate $\hat\ell_p(\psi_i, \varphi_j) = \hat\ell(\psi_i, \varphi_j, \boldsymbol\omega^*(\psi_i, \varphi_j))$ with the coordinate $(i, j)$, and lastly $\boldsymbol\omega_{ij} = \boldsymbol\omega^*(\psi_i, \varphi_j)$.

So, the procedure is as follows: First, we start at the layer $L_1$ and compute $\hat\ell_{p, ij} \equiv \hat\ell_p(\psi_i, \varphi_j)$ for each $(i, j) \in L_1$, starting each initial estimate $\boldsymbol\omega_{ij}$ at $\boldsymbol\omega_{00}$, which is just the MLE. For each $(i, j)$, the parameter values we use are $(\psi_i, \varphi_j)$. Once we have evaluated at each $(i, j) \in L_1$, we can move into $L_2$, making use of the same procedure. In this case, though, there are a few choices that we could make for choosing an initial value for $\boldsymbol\omega_{ij}$, $(i, j) \in L_2$. As defined in set_next_initial_estimate!, we currently have three options:

  • :mle: The simplest choice is to simply start $\boldsymbol\omega_{ij}$ at $\boldsymbol\omega_{00}$.
  • :nearest: An alternative choice is to simply start $\boldsymbol\omega_{ij}$ at $\boldsymbol\omega_{i'j'}$, where $(i', j') \in L_1$ is the nearest coordinate to $(i, j) \in L_2$. For example, if $(i, j) = (-2, 0)$ then $(i', j') = (-1, 0)$, and if $(i, j) = (2, 2)$ then $(i', j') = (1, 1)$.
  • :interp: An alternative choice, which is currently the slowest (but could be made faster in the future, it just needs some work), is to maintain a linear interpolant across the data from $L_0$ and $L_1$ (i.e., all the previous layers, so $L_j$ uses data from $L_0, L_1, \ldots, L_{j-1}$), and use extrapolation to obtain a new value for $\boldsymbol\omega_{ij}$ from the linear interpolant evaluated at $(\psi_i, \varphi_j)$.

This procedure allows us to easily solve our optimisation problems for any given $L_j$. To decide how to terminate, we will simply terminate when we find that $\hat\ell_{p, ij} < -\chi_{2, 1-\alpha}^2/2$ for all $(i, j)$ in some layer, i.e. we do not terminate if any $\hat\ell_{p, ij} > \chi_{2, 1-\alpha}^2/2$. This choice means that we stop once we have found a box, i.e. a layer, that bounds the confidence region.

Now having a box that bounds the confidence region (assuming it to be simply connected), we need to find the boundary of the confidence region. Note that we define the confidence region as $\mathcal C = \{(\psi_i, \varphi_j) : \hat\ell_p(\psi_i, \varphi_j) < -\chi_{2,1-\alpha}^2/2\}$, and we are trying to now find the boundary $\partial\mathcal C$. We currently provide two methods for this, as defined in get_confidence_regions:

  • :contour: Here we use Contours.jl, defining a contour at level $-\chi_{2,1-\alpha}^2/2$, to find the contour.
  • :delaunay: This method uses DelaunayTriangulation.jl, defining a Delaunay triangulation over the entire grid of $(\psi_i, \varphi_j)$ in the bounding box, making use of triangulation contouring to find the contours. In particular, we take a set of triangles $\mathcal T$ that triangulate the bounding box, and then we iterate over all edges $\mathcal E$ in the triangulation. Assuming that $\hat\ell_p$ is linear over each triangle, we can assume that $\hat\ell_p$ increases linearly over an edge. Thus, if the value of the profile at one vertex of an edge is below $-\chi_{2,1-\alpha}^2/2$ and the other is above $-\chi_{2,1-\alpha}^2/2$, then there must be a point where $\hat\ell_p = -\chi_{2,1-\alpha}^2/2$ on this edge (making use of our linearity assumption). This thus defines a point on the boundary of $\partial\mathcal C$. We do this for each edge, giving us a complete boundary.

These procedures give us the complete solution.

Computing prediction intervals

Our method for computing prediction intervals follows Simpson and Maclaren (2022), as does our description that follows. This method is nice as it provides a means for sensitivity analysis, enabling the attribution of components of uncertainty in some prediction function $q(\boldsymbol\psi, \boldsymbol \omega)$ (with $\boldsymbol\psi$ the interest parameter and $\boldsymbol\omega$ the nuisance parameters as above) to individual parameters (or pairs, in the case of a bivariate profile). The resulting intervals are called profile-wise intervals, with the predictions themselves called parameter-based, profile-wise predictions or profile-wise predictions.

The idea is to take a set of profile likelihoods and the confidence intervals obtained from each, and then pushing those into a prediction function that we then use to obtain prediction intervals, making heavy use of the transformation invariance property of MLEs.

So, let us start with some prediction function $q(\boldsymbol\psi, \boldsymbol \omega)$, and recall that the profile likelihood function for $\boldsymbol\psi$ induces a function $\boldsymbol\omega^{\star}(\boldsymbol\psi)$. The profile-wise likelihood for $q$, given the set of values $(\boldsymbol\psi, \boldsymbol\omega^{\star}(\boldsymbol\psi))$, is defined by

\[\hat\ell_p\left(q\left(\boldsymbol\psi, \boldsymbol\omega^{\star}(\boldsymbol\psi)\right) = q\right) = \sup_{\boldsymbol\psi \mid q\left(\boldsymbol\psi, \boldsymbol\omega^{\star}(\boldsymbol\psi)\right) = q} \hat\ell_p(\boldsymbol\psi). \]

Note that if $q(\boldsymbol\psi, \boldsymbol\omega^{\star}(\boldsymbol\psi))$ is injective, there is only one such $\boldsymbol\psi$ such that $q\left(\boldsymbol\psi, \boldsymbol\omega^{\star}(\psi)\right) = q'$ for any given $q'$ in which case the profile-wise likelihood for $q$ (based on $\boldsymbol\psi$) is simply $\hat\ell_p(\boldsymbol\psi)$. This definition is intuitive, recalling that the profile likelihood comes from a definition like the above except with the likelihood function on the right, so profile-wise likelihoods come from profile likelihoods. Using this definition, and using the transformation invariance property of the MLE, confidence sets for $\psi$ directly translate into confidence sets for $q$, in particular to find a $100(1-\alpha)\%$ prediction interval for $q$ we need only evaluate $q$ for $\psi$ inside its confidence interval.

Let us now describe the extra details involved in obtaining these prediction intervals, in particular what we are doing in the get_prediction_intervals function. For this, we imagine that $q$ is scalar valued, but the description below can be easily extended to the vector case (just apply the idea to each component – see the logistic ODE example). We also only explain this for a single parameter $\boldsymbol\psi$, but we describe how we use the results for each parameter to obtain a more conservative interval.

The first step is to evaluate the family of curves. Here we describe the evaluation for a scalar parameter of interest, $\boldsymbol\psi = \psi$, but note that the case of a bivariate parameter of interest is similar (just use a confidence region instead of a confidence interval). If we suppose that the confidence interval for $\psi$ is $(\psi_L, \psi_U)$, we define $\psi_j = \psi_L + (j-1)(\psi_U - \psi_L)/(n_\psi - 1)$, $j=1,\ldots,n_\psi$ – this is a set of $n_\psi$ equally spaced points between the interval limits. For each $\psi_j$ we need to then compute $\boldsymbol\omega^{\star}(\psi_j)$. Rather than re-optimise, we use the data from our profile likelihoods, where we have stored values for $(\psi, \boldsymbol\omega^{\star}(\psi))$ to define a continuous function $\boldsymbol\omega^{\star}(\psi)$ via linear interpolation. Using this linear interpolant we can thus compute $\boldsymbol\omega^{\star}(\psi_j)$ for each gridpoint $\psi_j$. We can therefore compute $\boldsymbol\theta_j = (\psi_j, \boldsymbol\omega^{\star}(\psi_j))$ so that we can evaluate the prediction function at each $\psi_j$, $q_j = q(\boldsymbol\theta_j)$.

We now have a sample ${q_1, \ldots, q_{n_\psi}}$. If we let $q_L = min_{j=1}^{n_\psi} q_j$ and $q_U = max_{j=1}^{n_\psi} q_j$, then our prediction interval is $(q_L, q_U)$. To be more specific, this is the profile-wise interval for $q$ given the basis $(\boldsymbol\psi, \boldsymbol\omega^{\star}(\boldsymbol\psi))$.

We have now described how prediction intervals are obtained based on a single parameter. Suppose we do this for a collection of parameters $\{\boldsymbol\psi^1, \ldots, \boldsymbol\psi^d\}$ (e.g. if $\boldsymbol\theta = (D, \lambda, K)$, then we might have computed profiles for $\psi^1=D$, $\psi^2=\lambda$, and $\psi^3=K$), giving $d$ different intervals for each $\boldsymbol\psi^i$, say ${(q_L^i, q_U^i)}_{i=1}^d$. We can take the union of these intervals to get a more conservative interval for the prediction, giving the new interval $(min_{i=1}^d q_L^i, max_{i=1}^d q_U^i)$.

+Mathematical and Implementation Details · ProfileLikelihood.jl

Mathematical and Implementation Details

We now give some of the mathematical and implementation details used in this package, namely for computing the profile likelihood function and for computing prediction intervals.

Computing the profile likelihood function

Let us start by giving a mathematical description of the method that we use for computing the profile log-likelihood function. Suppose that we have a parameter vector $\boldsymbol\theta$ that we partition as $\boldsymbol \theta = (\boldsymbol\psi, \boldsymbol \omega)$ - $\boldsymbol\psi$ is either a scalar, $\psi$, or a 2-vector, $(\psi, \varphi)$. We suppose that we have a likelihood function $\mathcal L(\boldsymbol \theta) \equiv \mathcal L(\boldsymbol\psi, \boldsymbol \omega)$ so that the normalised profile log-likelihood function for $\boldsymbol\psi$ is defined as

\[\hat\ell_p(\boldsymbol\psi) = \sup_{\boldsymbol \omega \in \Omega \mid \boldsymbol\psi} \left[\ell(\boldsymbol\psi, \boldsymbol\omega) - \ell^*\right],\]

where $\Omega$ is the parameter space for $\boldsymbol \omega$, $\ell(\boldsymbol\psi,\boldsymbol\omega) = \log \mathcal L(\boldsymbol\psi, \boldsymbol \omega)$, and $\ell^* = \ell(\hat{\boldsymbol \theta})$, where $\boldsymbol \theta$ are the MLEs for $\boldsymbol \theta$. This definition of $\hat\ell_p(\boldsymbol\psi)$ induces a function $\boldsymbol\omega^*(\boldsymbol\psi)$ depending on $\boldsymbol\psi$ that gives the values of $\boldsymbol \omega$ leading to the supremum above, i.e.

\[\ell(\boldsymbol\psi, \boldsymbol\omega^{\star}(\psi)) = \sup_{\boldsymbol \omega \in \Omega \mid \boldsymbol\psi} \left[\ell(\boldsymbol\psi, \boldsymbol\omega) - \ell^{\star}\right]. \]

To compute $\hat\ell_p(\boldsymbol\psi)$, then, requires a way to efficiently compute the $\omega^*(\boldsymbol\psi)$, and requires knowing where to stop computing. Where we stop computing the profile likelihood is simply when $\hat\ell_p(\psi) < -\chi_{k,1-\alpha}^2/2$, where $\alpha$ is the significance level and $k=1$ if $\boldsymbol\psi = \psi$ and $k=2$ if $\boldsymbol\psi = (\psi,\varphi)$. This motivates a iterative algorithm, where we start at the MLE and expand outwards.

Univariate profile likelihoods

We first describe our implementation for a univariate profile, in which case $\boldsymbol\psi = \psi$. The basic summary of this procedure is that we simply step to the left and to the right of the MLE, continuing until we reach the threshold.

We describe how we evaluate the function to the right of the MLE – the case of going to the left is identical. First, we define $\psi_1 = \hat\psi$, where $\hat\psi$ is the MLE for $\psi$. This defines $\boldsymbol{\omega}_{1} = \boldsymbol{\omega}^{\star}(\psi_{1})$, which in this case just gives the MLE $\hat{\boldsymbol\theta} = (\hat\psi, \boldsymbol\omega_1)$ by definition. The value of the normalised profile log-likelihood here is simply $\hat\ell_1 = \hat\ell(\psi_1) = 0$. Then, defining some step size $\Delta\psi$, we define $\psi_2 = \psi_1 + \Delta \psi$, and in general $\psi_{j+1} = \psi_j + \Delta \psi$, we need to estimate $\boldsymbol\omega_2 = \boldsymbol \omega^*(\psi_2)$. We do this by starting an optimiser at the initial estimate $\boldsymbol\omega_2 = \boldsymbol\omega_1$ and then using this initial estimate to produce a refined value of $\boldsymbol\omega_2$ that we take as its true value. In particular, each $\boldsymbol\omega_j$ comes from starting the optimiser at the previous $\boldsymbol\omega_{j-1}$, and the value for $\hat\ell_j = \hat\ell(\psi_j)$ comes from the value of the likelihood at $(\psi_j, \boldsymbol\omega_j)$. The same holds when going to the left except with $\psi_{j+1} = \psi_j - \Delta\psi$, and then rearranging the indices $j$ when combining the results to the left and to the right. At each step, we check if $\hat\ell_j < -\chi_{1,1-\alpha}^2/2$ and, if so, we terminate.

Once we have terminated the algorithm, we need to obtain the confidence intervals. To do this, we fit a spline to the data $(\psi_j, \hat\ell_j)$, and use a bisection algorithm over the two intervals $(\min_j \psi_j, \hat\psi)$ and $(\hat\psi, \max_j\psi_j)$, to find where $\hat\ell_j = -\chi_{1-\alpha}^2/2$. This leads to two solutions $(L, U)$ that we take together to give the confidence interval for $\psi$.

This is all done for each parameter.

Note that a better method for initialising the optimisation for $\boldsymbol\omega_j$ may be to use e.g. linear interpolation for the previous two values, $\boldsymbol\omega_{j-1}$ and $\boldsymbol\omega_{j-2}$ (with special care for the bounds of the parameters). We provide support for this, letting $\boldsymbol\omega_j = [\boldsymbol\omega_{j-2}(\psi_{j-1} - \psi_j) + \boldsymbol\omega_{j-1}(\psi_j - \psi_{j-2})] / (\psi_{j-1} - \psi_{j-2})$. See the next_initial_estimate_method option in ?profile.

Bivariate profile likelihoods

Now we describe the implementation for a bivariate profile. In this case, there are many possibilities as instead of only having to think about going to the left and to the right, we could think about how we step away from the MLE in each direction, and how we want to stop iterating. This package currently has a basic implementation that we describe below, where we simply step out from the MLE in layers. In what follows, we let $\boldsymbol\psi = (\psi, \varphi)$.

To start, we suppose that we are on some square grid with integer coordinates $\{(i, j) : i, j = -N, -N+1, \ldots, 0, \ldots, N-1, N\}$, and we suppose that $(0, 0)$ refers to the MLE. We call the coordinate $(0, 0)$ the zeroth layer, denoted $L_0 = \{(0, 0)\}$. The $j$th layer, $L_j$, is defined to wrap around $L_{j-1}$. For example, $L_1$ wraps around $\{(0, 0)\}$ so that $L_1 = \{(-1,-1),(0,-1),(1,-1),(1,0),(1,1),(0,1),(-1,1),(-1,0)\}$. Note that $|L_j| = 8j$. The idea here is that we can solve the required optimisation problems at each $(i, j) \in L_j$, accelerating the solutions by making use of information at $L_{j-1}$ to define initial estimates for each optimisation problem. Layers are implemented via ProfileLikelihood.LayerIterator.

We also need to prescribe the parameter values that are defined at each coordinate. We suppose that we have bounds $\psi_L \leq \psi \leq \psi_U$ and $\varphi_L \leq \varphi \leq \varphi_U$ for $\psi$ and $\varphi$, and that we have MLEs $\hat\psi$ and $\hat\varphi$ for $\psi$ and $\varphi$, respectively. We then define $\Delta\psi_L = (\hat\psi - \psi_L)/N$, $\Delta\psi_R = (\psi_U - \hat\psi)/N$, $\Delta\varphi_L = (\hat\varphi-\varphi_L)/N$, and $\Delta\varphi_R = (\varphi_U - \hat\varphi)/N$. With these definitions, we let

\[\psi_j = \begin{cases} \psi_0 + j\Delta\psi_R & j > 0, \\ \hat\psi & j = 0, \\ \psi_0 - |j|\Delta\psi_L & j < 0, \end{cases} \qquad \varphi_j = \begin{cases} \varphi_0 + j\Delta\varphi_R & j>0, \\ \hat\varphi & j = 0, \\ \varphi_0 - |j|\Delta\varphi_L & j < 0. \end{cases} \]

We thus associate the parameter values $(\psi_i, \varphi_j)$ with the coordinate $(i, j)$. We will similarly associate $\hat\ell_p(\psi_i, \varphi_j) = \hat\ell(\psi_i, \varphi_j, \boldsymbol\omega^*(\psi_i, \varphi_j))$ with the coordinate $(i, j)$, and lastly $\boldsymbol\omega_{ij} = \boldsymbol\omega^*(\psi_i, \varphi_j)$.

So, the procedure is as follows: First, we start at the layer $L_1$ and compute $\hat\ell_{p, ij} \equiv \hat\ell_p(\psi_i, \varphi_j)$ for each $(i, j) \in L_1$, starting each initial estimate $\boldsymbol\omega_{ij}$ at $\boldsymbol\omega_{00}$, which is just the MLE. For each $(i, j)$, the parameter values we use are $(\psi_i, \varphi_j)$. Once we have evaluated at each $(i, j) \in L_1$, we can move into $L_2$, making use of the same procedure. In this case, though, there are a few choices that we could make for choosing an initial value for $\boldsymbol\omega_{ij}$, $(i, j) \in L_2$. As defined in set_next_initial_estimate!, we currently have three options:

  • :mle: The simplest choice is to simply start $\boldsymbol\omega_{ij}$ at $\boldsymbol\omega_{00}$.
  • :nearest: An alternative choice is to simply start $\boldsymbol\omega_{ij}$ at $\boldsymbol\omega_{i'j'}$, where $(i', j') \in L_1$ is the nearest coordinate to $(i, j) \in L_2$. For example, if $(i, j) = (-2, 0)$ then $(i', j') = (-1, 0)$, and if $(i, j) = (2, 2)$ then $(i', j') = (1, 1)$.
  • :interp: An alternative choice, which is currently the slowest (but could be made faster in the future, it just needs some work), is to maintain a linear interpolant across the data from $L_0$ and $L_1$ (i.e., all the previous layers, so $L_j$ uses data from $L_0, L_1, \ldots, L_{j-1}$), and use extrapolation to obtain a new value for $\boldsymbol\omega_{ij}$ from the linear interpolant evaluated at $(\psi_i, \varphi_j)$.

This procedure allows us to easily solve our optimisation problems for any given $L_j$. To decide how to terminate, we will simply terminate when we find that $\hat\ell_{p, ij} < -\chi_{2, 1-\alpha}^2/2$ for all $(i, j)$ in some layer, i.e. we do not terminate if any $\hat\ell_{p, ij} > \chi_{2, 1-\alpha}^2/2$. This choice means that we stop once we have found a box, i.e. a layer, that bounds the confidence region.

Now having a box that bounds the confidence region (assuming it to be simply connected), we need to find the boundary of the confidence region. Note that we define the confidence region as $\mathcal C = \{(\psi_i, \varphi_j) : \hat\ell_p(\psi_i, \varphi_j) < -\chi_{2,1-\alpha}^2/2\}$, and we are trying to now find the boundary $\partial\mathcal C$. We currently provide two methods for this, as defined in get_confidence_regions:

  • :contour: Here we use Contours.jl, defining a contour at level $-\chi_{2,1-\alpha}^2/2$, to find the contour.
  • :delaunay: This method uses DelaunayTriangulation.jl, defining a Delaunay triangulation over the entire grid of $(\psi_i, \varphi_j)$ in the bounding box, making use of triangulation contouring to find the contours. In particular, we take a set of triangles $\mathcal T$ that triangulate the bounding box, and then we iterate over all edges $\mathcal E$ in the triangulation. Assuming that $\hat\ell_p$ is linear over each triangle, we can assume that $\hat\ell_p$ increases linearly over an edge. Thus, if the value of the profile at one vertex of an edge is below $-\chi_{2,1-\alpha}^2/2$ and the other is above $-\chi_{2,1-\alpha}^2/2$, then there must be a point where $\hat\ell_p = -\chi_{2,1-\alpha}^2/2$ on this edge (making use of our linearity assumption). This thus defines a point on the boundary of $\partial\mathcal C$. We do this for each edge, giving us a complete boundary.

These procedures give us the complete solution.

Computing prediction intervals

Our method for computing prediction intervals follows Simpson and Maclaren (2022), as does our description that follows. This method is nice as it provides a means for sensitivity analysis, enabling the attribution of components of uncertainty in some prediction function $q(\boldsymbol\psi, \boldsymbol \omega)$ (with $\boldsymbol\psi$ the interest parameter and $\boldsymbol\omega$ the nuisance parameters as above) to individual parameters (or pairs, in the case of a bivariate profile). The resulting intervals are called profile-wise intervals, with the predictions themselves called parameter-based, profile-wise predictions or profile-wise predictions.

The idea is to take a set of profile likelihoods and the confidence intervals obtained from each, and then pushing those into a prediction function that we then use to obtain prediction intervals, making heavy use of the transformation invariance property of MLEs.

So, let us start with some prediction function $q(\boldsymbol\psi, \boldsymbol \omega)$, and recall that the profile likelihood function for $\boldsymbol\psi$ induces a function $\boldsymbol\omega^{\star}(\boldsymbol\psi)$. The profile-wise likelihood for $q$, given the set of values $(\boldsymbol\psi, \boldsymbol\omega^{\star}(\boldsymbol\psi))$, is defined by

\[\hat\ell_p\left(q\left(\boldsymbol\psi, \boldsymbol\omega^{\star}(\boldsymbol\psi)\right) = q\right) = \sup_{\boldsymbol\psi \mid q\left(\boldsymbol\psi, \boldsymbol\omega^{\star}(\boldsymbol\psi)\right) = q} \hat\ell_p(\boldsymbol\psi). \]

Note that if $q(\boldsymbol\psi, \boldsymbol\omega^{\star}(\boldsymbol\psi))$ is injective, there is only one such $\boldsymbol\psi$ such that $q\left(\boldsymbol\psi, \boldsymbol\omega^{\star}(\psi)\right) = q'$ for any given $q'$ in which case the profile-wise likelihood for $q$ (based on $\boldsymbol\psi$) is simply $\hat\ell_p(\boldsymbol\psi)$. This definition is intuitive, recalling that the profile likelihood comes from a definition like the above except with the likelihood function on the right, so profile-wise likelihoods come from profile likelihoods. Using this definition, and using the transformation invariance property of the MLE, confidence sets for $\psi$ directly translate into confidence sets for $q$, in particular to find a $100(1-\alpha)\%$ prediction interval for $q$ we need only evaluate $q$ for $\psi$ inside its confidence interval.

Let us now describe the extra details involved in obtaining these prediction intervals, in particular what we are doing in the get_prediction_intervals function. For this, we imagine that $q$ is scalar valued, but the description below can be easily extended to the vector case (just apply the idea to each component – see the logistic ODE example). We also only explain this for a single parameter $\boldsymbol\psi$, but we describe how we use the results for each parameter to obtain a more conservative interval.

The first step is to evaluate the family of curves. Here we describe the evaluation for a scalar parameter of interest, $\boldsymbol\psi = \psi$, but note that the case of a bivariate parameter of interest is similar (just use a confidence region instead of a confidence interval). If we suppose that the confidence interval for $\psi$ is $(\psi_L, \psi_U)$, we define $\psi_j = \psi_L + (j-1)(\psi_U - \psi_L)/(n_\psi - 1)$, $j=1,\ldots,n_\psi$ – this is a set of $n_\psi$ equally spaced points between the interval limits. For each $\psi_j$ we need to then compute $\boldsymbol\omega^{\star}(\psi_j)$. Rather than re-optimise, we use the data from our profile likelihoods, where we have stored values for $(\psi, \boldsymbol\omega^{\star}(\psi))$ to define a continuous function $\boldsymbol\omega^{\star}(\psi)$ via linear interpolation. Using this linear interpolant we can thus compute $\boldsymbol\omega^{\star}(\psi_j)$ for each gridpoint $\psi_j$. We can therefore compute $\boldsymbol\theta_j = (\psi_j, \boldsymbol\omega^{\star}(\psi_j))$ so that we can evaluate the prediction function at each $\psi_j$, $q_j = q(\boldsymbol\theta_j)$.

We now have a sample ${q_1, \ldots, q_{n_\psi}}$. If we let $q_L = min_{j=1}^{n_\psi} q_j$ and $q_U = max_{j=1}^{n_\psi} q_j$, then our prediction interval is $(q_L, q_U)$. To be more specific, this is the profile-wise interval for $q$ given the basis $(\boldsymbol\psi, \boldsymbol\omega^{\star}(\boldsymbol\psi))$.

We have now described how prediction intervals are obtained based on a single parameter. Suppose we do this for a collection of parameters $\{\boldsymbol\psi^1, \ldots, \boldsymbol\psi^d\}$ (e.g. if $\boldsymbol\theta = (D, \lambda, K)$, then we might have computed profiles for $\psi^1=D$, $\psi^2=\lambda$, and $\psi^3=K$), giving $d$ different intervals for each $\boldsymbol\psi^i$, say ${(q_L^i, q_U^i)}_{i=1}^d$. We can take the union of these intervals to get a more conservative interval for the prediction, giving the new interval $(min_{i=1}^d q_L^i, max_{i=1}^d q_U^i)$.

diff --git a/dev/regression/index.html b/dev/regression/index.html index adab5e2..725f5cd 100644 --- a/dev/regression/index.html +++ b/dev/regression/index.html @@ -107,4 +107,4 @@

You could also plot individual or specific parameters:

plot_profiles(prof, [1, 3]) # plot σ and β₁
 plot_profiles(prof, [:σ, :β₁, :β₃]) # can use symbols 
 plot_profiles(prof, 1) # can just provide an integer 
-plot_profiles(prof, :β₂) # symbols work
+plot_profiles(prof, :β₂) # symbols work diff --git a/dev/stefan/index.html b/dev/stefan/index.html index 5bc7b15..faa883e 100644 --- a/dev/stefan/index.html +++ b/dev/stefan/index.html @@ -172,4 +172,4 @@ get_prediction_intervals(prediction_function!, prof, pred_data; parallel=true, q_prototype)

Now let's plot these results, including the MLE and exact curves.


-

We see that the uncertainty covers the exact curves. We could keep going and doing e.g. bivariate profiles and bivariate prediction intervals, but let us stop here (see the Lotka-Volterra example in Example IV if you want to see bivariate profiling).

+

We see that the uncertainty covers the exact curves. We could keep going and doing e.g. bivariate profiles and bivariate prediction intervals, but let us stop here (see the Lotka-Volterra example in Example IV if you want to see bivariate profiling).