Progress of Wrapping APIs of ReverseDiff #3

Closed
16 tasks done
Non-Contradiction opened this issue May 14, 2018 · 4 comments

Non-Contradiction commented May 14, 2018

The ReverseDiff API is documented at http://www.juliadiff.org/ReverseDiff.jl/api/

Following a similar discussion to #2, we omit everything ending in ! because ! is Julia's notation for a function that mutates its input, which common native R functions do not do.

Gradients of f(x::AbstractArray{<:Real}...)::Real

  • ReverseDiff.gradient(f, input, cfg::GradientConfig = GradientConfig(input))
    If input is an AbstractArray, assume f has the form f(::AbstractArray{<:Real})::Real and return ∇f(input).
    If input is a tuple of AbstractArrays, assume f has the form f(::AbstractArray{<:Real}...)::Real (such that it can be called as f(input...)) and return a Tuple where the ith element is the gradient of f w.r.t. input[i].
    Note that cfg can be preallocated and reused for subsequent calls.
    If possible, it is highly recommended to use ReverseDiff.GradientTape to prerecord f. Otherwise, this method will have to re-record f's execution trace for every subsequent call.
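
For example, a minimal sketch in Julia (f and g here are hypothetical test functions, not part of the API):

```julia
using ReverseDiff

f(x) = sum(x .^ 2)                 # f(::AbstractArray{<:Real})::Real
x = [1.0, 2.0, 3.0]
ReverseDiff.gradient(f, x)         # ∇f(x) = 2x, i.e. [2.0, 4.0, 6.0]

# Tuple input: f is called as f(input...) and a tuple of gradients is returned.
g(a, b) = sum(a .* b)
ReverseDiff.gradient(g, (x, x))    # (∇ₐg, ∇ᵦg) = (x, x) at this input
```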

Jacobians of f(x::AbstractArray{<:Real}...)::AbstractArray{<:Real}

  • ReverseDiff.jacobian(f, input, cfg::JacobianConfig = JacobianConfig(input))
    If input is an AbstractArray, assume f has the form f(::AbstractArray{<:Real})::AbstractArray{<:Real} and return J(f)(input).
    If input is a tuple of AbstractArrays, assume f has the form f(::AbstractArray{<:Real}...)::AbstractArray{<:Real} (such that it can be called as f(input...)) and return a Tuple where the ith element is the Jacobian of f w.r.t. input[i].
    Note that cfg can be preallocated and reused for subsequent calls.
    If possible, it is highly recommended to use ReverseDiff.JacobianTape to prerecord f. Otherwise, this method will have to re-record f's execution trace for every subsequent call.
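
A corresponding minimal sketch in Julia (h is a hypothetical test function):

```julia
using ReverseDiff

h(x) = x .^ 2                      # h(::AbstractArray{<:Real})::AbstractArray{<:Real}
x = [1.0, 2.0, 3.0]
ReverseDiff.jacobian(h, x)         # 3×3 diagonal Jacobian with 2x on the diagonal
```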

Hessians of f(x::AbstractArray{<:Real})::Real

  • ReverseDiff.hessian(f, input::AbstractArray, cfg::HessianConfig = HessianConfig(input))
    Given f(input::AbstractArray{<:Real})::Real, return f's Hessian w.r.t. the given input.
    Note that cfg can be preallocated and reused for subsequent calls.
    If possible, it is highly recommended to use ReverseDiff.HessianTape to prerecord f. Otherwise, this method will have to re-record f's execution trace for every subsequent call.
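
A minimal sketch in Julia (f is a hypothetical test function):

```julia
using ReverseDiff

f(x) = sum(x .^ 2) + prod(x)       # f(::AbstractArray{<:Real})::Real
x = [1.0, 2.0, 3.0]
ReverseDiff.hessian(f, x)          # 3×3 symmetric Hessian matrix
```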

The AbstractTape API

ReverseDiff works by recording the target function's execution trace to a "tape", then running the tape forwards and backwards to propagate new input values and derivative information.

In many cases, it is the recording phase of this process that consumes the most time and memory, while the forward and reverse execution passes are often fast and non-allocating. Luckily, ReverseDiff provides the AbstractTape family of types, which enable the user to pre-record a reusable tape for a given function and differentiation operation.

Note that pre-recording a tape can only capture the execution trace of the target function with the given input values. Therefore, re-running the tape (even with new input values) will only execute the paths that were recorded using the original input values. In other words, the tape cannot re-enact any branching behavior that depends on the input values. You can guarantee your own safety in this regard by never using the AbstractTape API with functions that contain control flow based on the input values.
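
A minimal sketch of this failure mode (h is a hypothetical function with input-dependent control flow; gradient! is the tape-replaying method described below):

```julia
using ReverseDiff

h(x) = x[1] > 0 ? sum(x) : sum(abs2, x)   # branch depends on the input value

tape = ReverseDiff.GradientTape(h, [1.0, 2.0])   # records only the x[1] > 0 branch
out = zeros(2)
ReverseDiff.gradient!(out, tape, [-1.0, 2.0])
# out == [1.0, 1.0], the gradient of sum(x): the tape silently replays the
# branch taken at recording time instead of differentiating sum(abs2, x).
```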

Similarly to the branching issue, a tape is not guaranteed to capture any side-effects caused or depended on by the target function.

  • ReverseDiff.GradientTape(f, input, cfg::GradientConfig = GradientConfig(input))
    Return a GradientTape instance containing a pre-recorded execution trace of f at the given input.
    This GradientTape can then be passed to ReverseDiff.gradient! to take gradients of the execution trace with new input values. Note that these new values must have the same element type and shape as input.
    See ReverseDiff.gradient for a description of acceptable types for input.
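
A minimal usage sketch in Julia (the wrapped R interface may differ; f is a hypothetical test function):

```julia
using ReverseDiff

f(x) = sum(x .^ 2)
tape = ReverseDiff.GradientTape(f, rand(3))        # record the trace once

out = zeros(3)
ReverseDiff.gradient!(out, tape, [1.0, 2.0, 3.0])  # replay with new input values
# out == [2.0, 4.0, 6.0]; no re-recording happens on subsequent calls
```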

  • ReverseDiff.JacobianTape(f, input, cfg::JacobianConfig = JacobianConfig(input))
    Return a JacobianTape instance containing a pre-recorded execution trace of f at the given input.
    This JacobianTape can then be passed to ReverseDiff.jacobian! to take Jacobians of the execution trace with new input values. Note that these new values must have the same element type and shape as input.
    See ReverseDiff.jacobian for a description of acceptable types for input.

  • ReverseDiff.HessianTape(f, input, cfg::HessianConfig = HessianConfig(input))
    Return a HessianTape instance containing a pre-recorded execution trace of f at the given input.
    This HessianTape can then be passed to ReverseDiff.hessian! to take Hessians of the execution trace with new input values. Note that these new values must have the same element type and shape as input.
    See ReverseDiff.hessian for a description of acceptable types for input.

  • ReverseDiff.compile(t::AbstractTape)
    Return a fully compiled representation of t of type CompiledTape. This object can be passed to any API methods that accept t (e.g. gradient!(result, t, input)).
    In many cases, compiling t can significantly speed up execution time. Note that the longer the tape, the more time compilation may take. Very long tapes (i.e. when length(t) is on the order of 10000 elements) can take a very long time to compile.
    Note that this function calls eval in the current_module() to generate functions from t. Thus, the returned CompiledTape will only be usable once the world-age counter has caught up with the world-age of the eval'd functions (i.e. once the call stack has bubbled up to top level).
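
A minimal sketch of compiling and reusing a tape (f is a hypothetical test function):

```julia
using ReverseDiff

f(x) = sum(x .^ 2)
tape  = ReverseDiff.GradientTape(f, rand(3))
ctape = ReverseDiff.compile(tape)                   # returns a CompiledTape

out = zeros(3)
ReverseDiff.gradient!(out, ctape, [1.0, 2.0, 3.0])  # same API, often much faster
```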

The AbstractConfig API

For the sake of convenience and performance, all "extra" information used by ReverseDiff's API methods is bundled up in the ReverseDiff.AbstractConfig family of types. These types allow the user to easily feed several different parameters to ReverseDiff's API methods, such as work buffers and tape configurations.
ReverseDiff's basic API methods will allocate these types automatically by default, but you can reduce memory usage and improve performance if you preallocate them yourself.

  • ReverseDiff.GradientConfig(input, tp::RawTape = RawTape())
    Return a GradientConfig instance containing the preallocated tape and work buffers used by the ReverseDiff.gradient/ReverseDiff.gradient! methods.
    Note that input is only used for type and shape information; it is not stored or modified in any way. It is assumed that the element type of input is the same as the element type of the target function's output.
    See ReverseDiff.gradient for a description of acceptable types for input.

  • ReverseDiff.GradientConfig(input, ::Type{D}, tp::RawTape = RawTape())
    Like GradientConfig(input, tp), except the provided type D is assumed to be the element type of the target function's output.
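
A minimal sketch of preallocating and reusing a config (the loop is hypothetical; the buffers are reused, but the trace is still re-recorded on every call):

```julia
using ReverseDiff

f(x) = sum(x .^ 2)
x = rand(3)
cfg = ReverseDiff.GradientConfig(x)    # preallocate the tape and work buffers

for _ in 1:100
    x .= rand(3)
    ReverseDiff.gradient(f, x, cfg)    # reuses cfg's buffers on each call
end
```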

  • ReverseDiff.JacobianConfig(input, tp::RawTape = RawTape())
    Return a JacobianConfig instance containing the preallocated tape and work buffers used by the ReverseDiff.jacobian/ReverseDiff.jacobian! methods.
    Note that input is only used for type and shape information; it is not stored or modified in any way. It is assumed that the element type of input is the same as the element type of the target function's output.
    See ReverseDiff.jacobian for a description of acceptable types for input.

  • ReverseDiff.JacobianConfig(input, ::Type{D}, tp::RawTape = RawTape())
    Like JacobianConfig(input, tp), except the provided type D is assumed to be the element type of the target function's output.

  • ReverseDiff.HessianConfig(input::AbstractArray, gtp::RawTape = RawTape(), jtp::RawTape = RawTape())
    Return a HessianConfig instance containing the preallocated tape and work buffers used by the ReverseDiff.hessian/ReverseDiff.hessian! methods. gtp is the tape used for the inner gradient calculation, while jtp is used for the outer Jacobian calculation.
    Note that input is only used for type and shape information; it is not stored or modified in any way. It is assumed that the element type of input is the same as the element type of the target function's output.

  • ReverseDiff.HessianConfig(input::AbstractArray, ::Type{D}, gtp::RawTape = RawTape(), jtp::RawTape = RawTape())
    Like HessianConfig(input, gtp, jtp), except the provided type D is assumed to be the element type of the target function's output.

Optimization Annotations

Update

Since we have finished all the basic implementations, the next steps will be:

  • Export the wrapper functions.
  • Finish the documentation.
  • Add tests.
Non-Contradiction added this to the First phase milestone on May 14, 2018.
Non-Contradiction commented:

It seems that if we are not allowing mutating functions, then the tape API is useless, so it will not be wrapped for now.

Non-Contradiction commented:

For the abstract config API, it seems that currently we don't have to use ::Type{D}, because in R we usually only deal with Float64, and for most R users (I guess?) functions like that would seem weird.

Non-Contradiction commented:

Finished the export and documentation of the wrapper functions.
Added some basic tests for the wrapper functions.
We need to adapt the tests at
https://github.com/JuliaDiff/ReverseDiff.jl/tree/master/test
https://github.com/JuliaDiff/DiffTests.jl/blob/master/src/DiffTests.jl
to close this issue.

Non-Contradiction commented:

Closing. To be continued in #17.
