Progress of Wrapping APIs of ReverseDiff #3

Closed
16 tasks done
Non-Contradiction opened this issue May 14, 2018 · 4 comments

Non-Contradiction commented May 14, 2018

The ReverseDiff API is documented at http://www.juliadiff.org/ReverseDiff.jl/api/

Following a similar discussion to #2, we omit everything ending in ! because ! is Julia's notation for a function that mutates its input, which common native R functions do not do.

Gradients of f(x::AbstractArray{<:Real}...)::Real

  • ReverseDiff.gradient(f, input, cfg::GradientConfig = GradientConfig(input))
    If input is an AbstractArray, assume f has the form f(::AbstractArray{<:Real})::Real and return ∇f(input).
    If input is a tuple of AbstractArrays, assume f has the form f(::AbstractArray{<:Real}...)::Real (such that it can be called as f(input...)) and return a Tuple where the ith element is the gradient of f w.r.t. input[i].
    Note that cfg can be preallocated and reused for subsequent calls.
    If possible, it is highly recommended to use ReverseDiff.GradientTape to prerecord f. Otherwise, this method will have to re-record f's execution trace for every subsequent call.
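
For example, a minimal sketch in Julia (f and g here are hypothetical test functions, not part of the API):

```julia
using ReverseDiff

f(x) = sum(x .^ 2)                 # f(::AbstractArray{<:Real})::Real
x = [1.0, 2.0, 3.0]
ReverseDiff.gradient(f, x)         # ∇f(x) = 2x, i.e. [2.0, 4.0, 6.0]

# Tuple input: f is called as f(input...) and a tuple of gradients is returned.
g(a, b) = sum(a .* b)
ReverseDiff.gradient(g, (x, x))    # (∇ₐg, ∇ᵦg) = (x, x) at this input
```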

Jacobians of f(x::AbstractArray{<:Real}...)::AbstractArray{<:Real}

  • ReverseDiff.jacobian(f, input, cfg::JacobianConfig = JacobianConfig(input))
    If input is an AbstractArray, assume f has the form f(::AbstractArray{<:Real})::AbstractArray{<:Real} and return J(f)(input).
    If input is a tuple of AbstractArrays, assume f has the form f(::AbstractArray{<:Real}...)::AbstractArray{<:Real} (such that it can be called as f(input...)) and return a Tuple where the ith element is the Jacobian of f w.r.t. input[i].
    Note that cfg can be preallocated and reused for subsequent calls.
    If possible, it is highly recommended to use ReverseDiff.JacobianTape to prerecord f. Otherwise, this method will have to re-record f's execution trace for every subsequent call.
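
A corresponding minimal sketch in Julia (h is a hypothetical test function):

```julia
using ReverseDiff

h(x) = x .^ 2                      # h(::AbstractArray{<:Real})::AbstractArray{<:Real}
x = [1.0, 2.0, 3.0]
ReverseDiff.jacobian(h, x)         # 3×3 diagonal Jacobian with 2x on the diagonal
```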

Hessians of f(x::AbstractArray{<:Real})::Real

  • ReverseDiff.hessian(f, input::AbstractArray, cfg::HessianConfig = HessianConfig(input))
    Given f(input::AbstractArray{<:Real})::Real, return f's Hessian w.r.t. the given input.
    Note that cfg can be preallocated and reused for subsequent calls.
    If possible, it is highly recommended to use ReverseDiff.HessianTape to prerecord f. Otherwise, this method will have to re-record f's execution trace for every subsequent call.
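
A minimal sketch in Julia (f is a hypothetical test function):

```julia
using ReverseDiff

f(x) = sum(x .^ 2) + prod(x)       # f(::AbstractArray{<:Real})::Real
x = [1.0, 2.0, 3.0]
ReverseDiff.hessian(f, x)          # 3×3 symmetric Hessian matrix
```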

The AbstractTape API

ReverseDiff works by recording the target function's execution trace to a "tape", then running the tape forwards and backwards to propagate new input values and derivative information.

In many cases, it is the recording phase of this process that consumes the most time and memory, while the forward and reverse execution passes are often fast and non-allocating. Luckily, ReverseDiff provides the AbstractTape family of types, which enable the user to pre-record a reusable tape for a given function and differentiation operation.

Note that pre-recording a tape can only capture the execution trace of the target function with the given input values. Therefore, re-running the tape (even with new input values) will only execute the paths that were recorded using the original input values. In other words, the tape cannot re-enact any branching behavior that depends on the input values. You can guarantee your own safety in this regard by never using the AbstractTape API with functions that contain control flow based on the input values.
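
A minimal sketch of this failure mode (h is a hypothetical function with input-dependent control flow; gradient! is the tape-replaying method described below):

```julia
using ReverseDiff

h(x) = x[1] > 0 ? sum(x) : sum(abs2, x)   # branch depends on the input value

tape = ReverseDiff.GradientTape(h, [1.0, 2.0])   # records only the x[1] > 0 branch
out = zeros(2)
ReverseDiff.gradient!(out, tape, [-1.0, 2.0])
# out == [1.0, 1.0], the gradient of sum(x): the tape silently replays the
# branch taken at recording time instead of differentiating sum(abs2, x).
```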

Similarly to the branching issue, a tape is not guaranteed to capture any side-effects caused or depended on by the target function.

  • ReverseDiff.GradientTape(f, input, cfg::GradientConfig = GradientConfig(input))
    Return a GradientTape instance containing a pre-recorded execution trace of f at the given input.
    This GradientTape can then be passed to ReverseDiff.gradient! to take gradients of the execution trace with new input values. Note that these new values must have the same element type and shape as input.
    See ReverseDiff.gradient for a description of acceptable types for input.
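
A minimal usage sketch in Julia (the wrapped R interface may differ; f is a hypothetical test function):

```julia
using ReverseDiff

f(x) = sum(x .^ 2)
tape = ReverseDiff.GradientTape(f, rand(3))        # record the trace once

out = zeros(3)
ReverseDiff.gradient!(out, tape, [1.0, 2.0, 3.0])  # replay with new input values
# out == [2.0, 4.0, 6.0]; no re-recording happens on subsequent calls
```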

  • ReverseDiff.JacobianTape(f, input, cfg::JacobianConfig = JacobianConfig(input))
    Return a JacobianTape instance containing a pre-recorded execution trace of f at the given input.
    This JacobianTape can then be passed to ReverseDiff.jacobian! to take Jacobians of the execution trace with new input values. Note that these new values must have the same element type and shape as input.
    See ReverseDiff.jacobian for a description of acceptable types for input.

  • ReverseDiff.HessianTape(f, input, cfg::HessianConfig = HessianConfig(input))
    Return a HessianTape instance containing a pre-recorded execution trace of f at the given input.
    This HessianTape can then be passed to ReverseDiff.hessian! to take Hessians of the execution trace with new input values. Note that these new values must have the same element type and shape as input.
    See ReverseDiff.hessian for a description of acceptable types for input.

  • ReverseDiff.compile(t::AbstractTape)
    Return a fully compiled representation of t of type CompiledTape. This object can be passed to any API methods that accept t (e.g. gradient!(result, t, input)).
    In many cases, compiling t can significantly speed up execution time. Note that the longer the tape, the more time compilation may take. Very long tapes (i.e. when length(t) is on the order of 10000 elements) can take a very long time to compile.
    Note that this function calls eval in the current_module() to generate functions from t. Thus, the returned CompiledTape will only be usable once the world-age counter has caught up with the world-age of the eval'd functions (i.e. once the call stack has bubbled up to top level).
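
A minimal sketch of compiling and reusing a tape (f is a hypothetical test function):

```julia
using ReverseDiff

f(x) = sum(x .^ 2)
tape  = ReverseDiff.GradientTape(f, rand(3))
ctape = ReverseDiff.compile(tape)                   # returns a CompiledTape

out = zeros(3)
ReverseDiff.gradient!(out, ctape, [1.0, 2.0, 3.0])  # same API, often much faster
```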

The AbstractConfig API

For the sake of convenience and performance, all "extra" information used by ReverseDiff's API methods is bundled up in the ReverseDiff.AbstractConfig family of types. These types allow the user to easily feed several different parameters to ReverseDiff's API methods, such as work buffers and tape configurations.
ReverseDiff's basic API methods will allocate these types automatically by default, but you can reduce memory usage and improve performance if you preallocate them yourself.

  • ReverseDiff.GradientConfig(input, tp::RawTape = RawTape())
    Return a GradientConfig instance containing the preallocated tape and work buffers used by the ReverseDiff.gradient/ReverseDiff.gradient! methods.
    Note that input is only used for type and shape information; it is not stored or modified in any way. It is assumed that the element type of input is the same as the element type of the target function's output.
    See ReverseDiff.gradient for a description of acceptable types for input.

  • ReverseDiff.GradientConfig(input, ::Type{D}, tp::RawTape = RawTape())
    Like GradientConfig(input, tp), except the provided type D is assumed to be the element type of the target function's output.
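
A minimal sketch of preallocating and reusing a config (the loop is hypothetical; the buffers are reused, but the trace is still re-recorded on every call):

```julia
using ReverseDiff

f(x) = sum(x .^ 2)
x = rand(3)
cfg = ReverseDiff.GradientConfig(x)    # preallocate the tape and work buffers

for _ in 1:100
    x .= rand(3)
    ReverseDiff.gradient(f, x, cfg)    # reuses cfg's buffers on each call
end
```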

  • ReverseDiff.JacobianConfig(input, tp::RawTape = RawTape())
    Return a JacobianConfig instance containing the preallocated tape and work buffers used by the ReverseDiff.jacobian/ReverseDiff.jacobian! methods.
    Note that input is only used for type and shape information; it is not stored or modified in any way. It is assumed that the element type of input is the same as the element type of the target function's output.
    See ReverseDiff.jacobian for a description of acceptable types for input.

  • ReverseDiff.JacobianConfig(input, ::Type{D}, tp::RawTape = RawTape())
    Like JacobianConfig(input, tp), except the provided type D is assumed to be the element type of the target function's output.

  • ReverseDiff.HessianConfig(input::AbstractArray, gtp::RawTape = RawTape(), jtp::RawTape = RawTape())
    Return a HessianConfig instance containing the preallocated tape and work buffers used by the ReverseDiff.hessian/ReverseDiff.hessian! methods. gtp is the tape used for the inner gradient calculation, while jtp is used for the outer Jacobian calculation.
    Note that input is only used for type and shape information; it is not stored or modified in any way. It is assumed that the element type of input is the same as the element type of the target function's output.

  • ReverseDiff.HessianConfig(input::AbstractArray, ::Type{D}, gtp::RawTape = RawTape(), jtp::RawTape = RawTape())
    Like HessianConfig(input, gtp, jtp), except the provided type D is assumed to be the element type of the target function's output.

Optimization Annotations

Update

Since we have finished all the basic implementations, the next steps will be:

  • Export the wrapper functions.
  • Finish the documentation.
  • Add tests.
Non-Contradiction added this to the First phase milestone on May 14, 2018.
Non-Contradiction commented:

It seems that if we are not allowing mutating functions, then the tape API is useless, so it will not be wrapped for now.

Non-Contradiction commented:

For the abstract config API, it seems that currently we don't have to use ::Type{D}, because in R we usually only deal with Float64, and for most R users (I guess?) functions like that would seem weird.

Non-Contradiction commented:

Finished the export and documentation of the wrapper functions.
Added some basic tests for the wrapper functions.
We need to adapt the tests at
https://github.com/JuliaDiff/ReverseDiff.jl/tree/master/test
https://github.com/JuliaDiff/DiffTests.jl/blob/master/src/DiffTests.jl
to close this issue.

Non-Contradiction commented:

Closing. To be continued in #17.
