API
DifferentiationInterface — Module
An interface to various automatic differentiation backends in Julia.
Exports
AutoChainRules
AutoDiffractor
AutoEnzyme
AutoFastDifferentiation
AutoFiniteDiff
AutoFiniteDifferences
AutoForwardDiff
AutoPolyesterForwardDiff
AutoReverseDiff
AutoSparse
AutoSymbolics
AutoTapir
AutoTracker
AutoZygote
DenseSparsityDetector
DifferentiateWith
GreedyColoringAlgorithm
SecondOrder
check_available
check_hessian
check_twoarg
derivative
derivative!
gradient
gradient!
hessian
hessian!
hvp
hvp!
jacobian
jacobian!
prepare_derivative
prepare_gradient
prepare_hessian
prepare_hvp
prepare_hvp_same_point
prepare_jacobian
prepare_pullback
prepare_pullback_same_point
prepare_pushforward
prepare_pushforward_same_point
prepare_second_derivative
pullback
pullback!
pushforward
pushforward!
second_derivative
second_derivative!
value_and_derivative
value_and_derivative!
value_and_gradient
value_and_gradient!
value_and_jacobian
value_and_jacobian!
value_and_pullback
value_and_pullback!
value_and_pushforward
value_and_pushforward!
value_derivative_and_second_derivative
value_derivative_and_second_derivative!
value_gradient_and_hessian
value_gradient_and_hessian!
First order
Pushforward
DifferentiationInterface.prepare_pushforward — Function
prepare_pushforward(f, backend, x, dx) -> extras
prepare_pushforward(f!, y, backend, x, dx) -> extras
Create an extras object that can be given to pushforward and its variants.
If the function changes in any way, the result of preparation will be invalidated, and you will need to run it again. In the two-argument case, y is mutated by f! during preparation.
DifferentiationInterface.prepare_pushforward_same_point — Function
prepare_pushforward_same_point(f, backend, x, dx) -> extras_same
prepare_pushforward_same_point(f!, y, backend, x, dx) -> extras_same
Create an extras_same object that can be given to pushforward and its variants if they are applied at the same point x.
If the function or the point changes in any way, the result of preparation will be invalidated, and you will need to run it again. In the two-argument case, y is mutated by f! during preparation.
DifferentiationInterface.pushforward — Function
pushforward(f, backend, x, dx, [extras]) -> dy
pushforward(f!, y, backend, x, dx, [extras]) -> dy
Compute the pushforward of the function f at point x with seed dx.
To improve performance via operator preparation, refer to prepare_pushforward and prepare_pushforward_same_point.
Pushforwards are also commonly called Jacobian-vector products or JVPs. This function could have been named jvp.
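For instance, a JVP can be computed as follows. This is a minimal sketch, assuming ForwardDiff is loaded; the test function f and the printed values are illustrative:
using DifferentiationInterface
import ForwardDiff
f(x) = [x[1]^2, x[1] * x[2]]
# the seed [1.0, 0.0] extracts the first column of the Jacobian
pushforward(f, AutoForwardDiff(), [2.0, 3.0], [1.0, 0.0])
# output
2-element Vector{Float64}:
 4.0
 3.0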
DifferentiationInterface.pushforward! — Function
pushforward!(f, dy, backend, x, dx, [extras]) -> dy
pushforward!(f!, y, dy, backend, x, dx, [extras]) -> dy
Compute the pushforward of the function f at point x with seed dx, overwriting dy.
To improve performance via operator preparation, refer to prepare_pushforward and prepare_pushforward_same_point.
Pushforwards are also commonly called Jacobian-vector products or JVPs. This function could have been named jvp!.
DifferentiationInterface.value_and_pushforward — Function
value_and_pushforward(f, backend, x, dx, [extras]) -> (y, dy)
value_and_pushforward(f!, y, backend, x, dx, [extras]) -> (y, dy)
Compute the value and the pushforward of the function f at point x with seed dx.
To improve performance via operator preparation, refer to prepare_pushforward and prepare_pushforward_same_point.
Pushforwards are also commonly called Jacobian-vector products or JVPs. This function could have been named value_and_jvp.
Required primitive for forward mode backends.
DifferentiationInterface.value_and_pushforward! — Function
value_and_pushforward!(f, dy, backend, x, dx, [extras]) -> (y, dy)
value_and_pushforward!(f!, y, dy, backend, x, dx, [extras]) -> (y, dy)
Compute the value and the pushforward of the function f at point x with seed dx, overwriting dy.
To improve performance via operator preparation, refer to prepare_pushforward and prepare_pushforward_same_point.
Pushforwards are also commonly called Jacobian-vector products or JVPs. This function could have been named value_and_jvp!.
Pullback
DifferentiationInterface.prepare_pullback — Function
prepare_pullback(f, backend, x, dy) -> extras
prepare_pullback(f!, y, backend, x, dy) -> extras
Create an extras object that can be given to pullback and its variants.
If the function changes in any way, the result of preparation will be invalidated, and you will need to run it again. In the two-argument case, y is mutated by f! during preparation.
DifferentiationInterface.prepare_pullback_same_point — Function
prepare_pullback_same_point(f, backend, x, dy) -> extras_same
prepare_pullback_same_point(f!, y, backend, x, dy) -> extras_same
Create an extras_same object that can be given to pullback and its variants if they are applied at the same point x.
If the function or the point changes in any way, the result of preparation will be invalidated, and you will need to run it again. In the two-argument case, y is mutated by f! during preparation.
DifferentiationInterface.pullback — Function
pullback(f, backend, x, dy, [extras]) -> dx
pullback(f!, y, backend, x, dy, [extras]) -> dx
Compute the pullback of the function f at point x with seed dy.
To improve performance via operator preparation, refer to prepare_pullback and prepare_pullback_same_point.
Pullbacks are also commonly called vector-Jacobian products or VJPs. This function could have been named vjp.
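For instance, a VJP can be computed as follows. This is a minimal sketch, assuming Zygote is loaded; the test function f and the printed values are illustrative:
using DifferentiationInterface
import Zygote
f(x) = [x[1]^2, x[1] * x[2]]
# the seed [1.0, 0.0] extracts the first row of the Jacobian
pullback(f, AutoZygote(), [2.0, 3.0], [1.0, 0.0])
# output
2-element Vector{Float64}:
 4.0
 0.0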
DifferentiationInterface.pullback! — Function
pullback!(f, dx, backend, x, dy, [extras]) -> dx
pullback!(f!, y, dx, backend, x, dy, [extras]) -> dx
Compute the pullback of the function f at point x with seed dy, overwriting dx.
To improve performance via operator preparation, refer to prepare_pullback and prepare_pullback_same_point.
Pullbacks are also commonly called vector-Jacobian products or VJPs. This function could have been named vjp!.
DifferentiationInterface.value_and_pullback — Function
value_and_pullback(f, backend, x, dy, [extras]) -> (y, dx)
value_and_pullback(f!, y, backend, x, dy, [extras]) -> (y, dx)
Compute the value and the pullback of the function f at point x with seed dy.
To improve performance via operator preparation, refer to prepare_pullback and prepare_pullback_same_point.
Pullbacks are also commonly called vector-Jacobian products or VJPs. This function could have been named value_and_vjp.
Required primitive for reverse mode backends.
DifferentiationInterface.value_and_pullback! — Function
value_and_pullback!(f, dx, backend, x, dy, [extras]) -> (y, dx)
value_and_pullback!(f!, y, dx, backend, x, dy, [extras]) -> (y, dx)
Compute the value and the pullback of the function f at point x with seed dy, overwriting dx.
To improve performance via operator preparation, refer to prepare_pullback and prepare_pullback_same_point.
Pullbacks are also commonly called vector-Jacobian products or VJPs. This function could have been named value_and_vjp!.
Derivative
DifferentiationInterface.prepare_derivative — Function
prepare_derivative(f, backend, x) -> extras
prepare_derivative(f!, y, backend, x) -> extras
Create an extras object that can be given to derivative and its variants.
If the function changes in any way, the result of preparation will be invalidated, and you will need to run it again. In the two-argument case, y is mutated by f! during preparation.
DifferentiationInterface.derivative — Function
derivative(f, backend, x, [extras]) -> der
derivative(f!, y, backend, x, [extras]) -> der
Compute the derivative of the function f at point x.
To improve performance via operator preparation, refer to prepare_derivative.
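For instance, a minimal sketch assuming ForwardDiff is loaded (the test function and printed value are illustrative):
using DifferentiationInterface
import ForwardDiff
derivative(x -> x^2, AutoForwardDiff(), 3.0)
# output
6.0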
DifferentiationInterface.derivative! — Function
derivative!(f, der, backend, x, [extras]) -> der
derivative!(f!, y, der, backend, x, [extras]) -> der
Compute the derivative of the function f at point x, overwriting der.
To improve performance via operator preparation, refer to prepare_derivative.
DifferentiationInterface.value_and_derivative — Function
value_and_derivative(f, backend, x, [extras]) -> (y, der)
value_and_derivative(f!, y, backend, x, [extras]) -> (y, der)
Compute the value and the derivative of the function f at point x.
To improve performance via operator preparation, refer to prepare_derivative.
DifferentiationInterface.value_and_derivative! — Function
value_and_derivative!(f, der, backend, x, [extras]) -> (y, der)
value_and_derivative!(f!, y, der, backend, x, [extras]) -> (y, der)
Compute the value and the derivative of the function f at point x, overwriting der.
To improve performance via operator preparation, refer to prepare_derivative.
Gradient
DifferentiationInterface.prepare_gradient — Function
prepare_gradient(f, backend, x) -> extras
Create an extras object that can be given to gradient and its variants.
If the function changes in any way, the result of preparation will be invalidated, and you will need to run it again.
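The typical preparation workflow looks as follows. This is a minimal sketch assuming ForwardDiff is loaded; the test function and printed values are illustrative:
using DifferentiationInterface
import ForwardDiff
f(x) = sum(abs2, x)
backend = AutoForwardDiff()
x = [1.0, 2.0, 3.0]
extras = prepare_gradient(f, backend, x)  # pay the setup cost once
gradient(f, backend, x, extras)           # then reuse extras across calls
# output
3-element Vector{Float64}:
 2.0
 4.0
 6.0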
DifferentiationInterface.gradient — Function
gradient(f, backend, x, [extras]) -> grad
Compute the gradient of the function f at point x.
To improve performance via operator preparation, refer to prepare_gradient.
DifferentiationInterface.gradient! — Function
gradient!(f, grad, backend, x, [extras]) -> grad
Compute the gradient of the function f at point x, overwriting grad.
To improve performance via operator preparation, refer to prepare_gradient.
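For instance, a minimal sketch of in-place usage with preallocated storage, assuming ForwardDiff is loaded (the test function and printed values are illustrative):
using DifferentiationInterface
import ForwardDiff
f(x) = sum(abs2, x)
grad = zeros(2)  # preallocated storage, overwritten in place
gradient!(f, grad, AutoForwardDiff(), [1.0, 2.0])
# output
2-element Vector{Float64}:
 2.0
 4.0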
DifferentiationInterface.value_and_gradient — Function
value_and_gradient(f, backend, x, [extras]) -> (y, grad)
Compute the value and the gradient of the function f at point x.
To improve performance via operator preparation, refer to prepare_gradient.
DifferentiationInterface.value_and_gradient! — Function
value_and_gradient!(f, grad, backend, x, [extras]) -> (y, grad)
Compute the value and the gradient of the function f at point x, overwriting grad.
To improve performance via operator preparation, refer to prepare_gradient.
Jacobian
DifferentiationInterface.prepare_jacobian — Function
prepare_jacobian(f, backend, x) -> extras
prepare_jacobian(f!, y, backend, x) -> extras
Create an extras object that can be given to jacobian and its variants.
If the function changes in any way, the result of preparation will be invalidated, and you will need to run it again. In the two-argument case, y is mutated by f! during preparation.
DifferentiationInterface.jacobian — Function
jacobian(f, backend, x, [extras]) -> jac
jacobian(f!, y, backend, x, [extras]) -> jac
Compute the Jacobian matrix of the function f at point x.
To improve performance via operator preparation, refer to prepare_jacobian.
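For instance, a minimal sketch assuming ForwardDiff is loaded (the test function and printed values are illustrative):
using DifferentiationInterface
import ForwardDiff
f(x) = [x[1]^2, x[1] * x[2]]
jacobian(f, AutoForwardDiff(), [2.0, 3.0])
# output
2×2 Matrix{Float64}:
 4.0  0.0
 3.0  2.0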
DifferentiationInterface.jacobian! — Function
jacobian!(f, jac, backend, x, [extras]) -> jac
jacobian!(f!, y, jac, backend, x, [extras]) -> jac
Compute the Jacobian matrix of the function f at point x, overwriting jac.
To improve performance via operator preparation, refer to prepare_jacobian.
DifferentiationInterface.value_and_jacobian — Function
value_and_jacobian(f, backend, x, [extras]) -> (y, jac)
value_and_jacobian(f!, y, backend, x, [extras]) -> (y, jac)
Compute the value and the Jacobian matrix of the function f at point x.
To improve performance via operator preparation, refer to prepare_jacobian.
DifferentiationInterface.value_and_jacobian! — Function
value_and_jacobian!(f, jac, backend, x, [extras]) -> (y, jac)
value_and_jacobian!(f!, y, jac, backend, x, [extras]) -> (y, jac)
Compute the value and the Jacobian matrix of the function f at point x, overwriting jac.
To improve performance via operator preparation, refer to prepare_jacobian.
Second order
DifferentiationInterface.SecondOrder — Type
SecondOrder
Combination of two backends for second-order differentiation.
SecondOrder backends do not support first-order operators.
Constructor
SecondOrder(outer_backend, inner_backend)
Fields
outer::ADTypes.AbstractADType: backend for the outer differentiation
inner::ADTypes.AbstractADType: backend for the inner differentiation
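Example
A common choice is forward over reverse. This is a minimal sketch, assuming ForwardDiff and Zygote are loaded; the test function and printed values are illustrative:
using DifferentiationInterface
import ForwardDiff, Zygote
# ForwardDiff (outer) differentiates the gradient computed by Zygote (inner)
backend = SecondOrder(AutoForwardDiff(), AutoZygote())
hessian(x -> sum(abs2, x), backend, [1.0, 2.0])
# output
2×2 Matrix{Float64}:
 2.0  0.0
 0.0  2.0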
Second derivative
DifferentiationInterface.prepare_second_derivative — Function
prepare_second_derivative(f, backend, x) -> extras
Create an extras object that can be given to second_derivative and its variants.
If the function changes in any way, the result of preparation will be invalidated, and you will need to run it again.
DifferentiationInterface.second_derivative — Function
second_derivative(f, backend, x, [extras]) -> der2
Compute the second derivative of the function f at point x.
To improve performance via operator preparation, refer to prepare_second_derivative.
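For instance, a minimal sketch assuming ForwardDiff is loaded (the test function and printed value are illustrative):
using DifferentiationInterface
import ForwardDiff
second_derivative(x -> x^3, AutoForwardDiff(), 2.0)  # second derivative of x^3 is 6x
# output
12.0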
DifferentiationInterface.second_derivative! — Function
second_derivative!(f, der2, backend, x, [extras]) -> der2
Compute the second derivative of the function f at point x, overwriting der2.
To improve performance via operator preparation, refer to prepare_second_derivative.
DifferentiationInterface.value_derivative_and_second_derivative — Function
value_derivative_and_second_derivative(f, backend, x, [extras]) -> (y, der, der2)
Compute the value, first derivative and second derivative of the function f at point x.
To improve performance via operator preparation, refer to prepare_second_derivative.
DifferentiationInterface.value_derivative_and_second_derivative! — Function
value_derivative_and_second_derivative!(f, der, der2, backend, x, [extras]) -> (y, der, der2)
Compute the value, first derivative and second derivative of the function f at point x, overwriting der and der2.
To improve performance via operator preparation, refer to prepare_second_derivative.
Hessian-vector product
DifferentiationInterface.prepare_hvp — Function
prepare_hvp(f, backend, x, dx) -> extras
Create an extras object that can be given to hvp and its variants.
If the function changes in any way, the result of preparation will be invalidated, and you will need to run it again.
DifferentiationInterface.prepare_hvp_same_point — Function
prepare_hvp_same_point(f, backend, x, dx) -> extras_same
Create an extras_same object that can be given to hvp and its variants if they are applied at the same point x.
If the function or the point changes in any way, the result of preparation will be invalidated, and you will need to run it again.
DifferentiationInterface.hvp — Function
hvp(f, backend, x, dx, [extras]) -> dg
Compute the Hessian-vector product of f at point x with seed dx.
To improve performance via operator preparation, refer to prepare_hvp and prepare_hvp_same_point.
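For instance, using a forward-over-reverse SecondOrder backend. This is a minimal sketch, assuming ForwardDiff and Zygote are loaded; the test function and printed values are illustrative:
using DifferentiationInterface
import ForwardDiff, Zygote
f(x) = sum(abs2, x)
backend = SecondOrder(AutoForwardDiff(), AutoZygote())
# the Hessian of f is 2I, so the HVP is twice the seed
hvp(f, backend, [1.0, 2.0], [1.0, 0.0])
# output
2-element Vector{Float64}:
 2.0
 0.0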
DifferentiationInterface.hvp! — Function
hvp!(f, dg, backend, x, dx, [extras]) -> dg
Compute the Hessian-vector product of f at point x with seed dx, overwriting dg.
To improve performance via operator preparation, refer to prepare_hvp and prepare_hvp_same_point.
Hessian
DifferentiationInterface.prepare_hessian — Function
prepare_hessian(f, backend, x) -> extras
Create an extras object that can be given to hessian and its variants.
If the function changes in any way, the result of preparation will be invalidated, and you will need to run it again.
DifferentiationInterface.hessian — Function
hessian(f, backend, x, [extras]) -> hess
Compute the Hessian matrix of the function f at point x.
To improve performance via operator preparation, refer to prepare_hessian.
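For instance, a minimal sketch assuming ForwardDiff is loaded (the test function and printed values are illustrative):
using DifferentiationInterface
import ForwardDiff
f(x) = x[1]^2 + x[1] * x[2]
hessian(f, AutoForwardDiff(), [1.0, 1.0])
# output
2×2 Matrix{Float64}:
 2.0  1.0
 1.0  0.0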
DifferentiationInterface.hessian! — Function
hessian!(f, hess, backend, x, [extras]) -> hess
Compute the Hessian matrix of the function f at point x, overwriting hess.
To improve performance via operator preparation, refer to prepare_hessian.
DifferentiationInterface.value_gradient_and_hessian — Function
value_gradient_and_hessian(f, backend, x, [extras]) -> (y, grad, hess)
Compute the value, gradient vector and Hessian matrix of the function f at point x.
To improve performance via operator preparation, refer to prepare_hessian.
DifferentiationInterface.value_gradient_and_hessian! — Function
value_gradient_and_hessian!(f, grad, hess, backend, x, [extras]) -> (y, grad, hess)
Compute the value, gradient vector and Hessian matrix of the function f at point x, overwriting grad and hess.
To improve performance via operator preparation, refer to prepare_hessian.
Utilities
Backend queries
DifferentiationInterface.check_available — Function
check_available(backend)
Check whether backend is available (i.e. whether the extension is loaded).
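For instance (a minimal sketch; the result is true here only because ForwardDiff has been imported, which loads the corresponding extension):
using DifferentiationInterface
import ForwardDiff
check_available(AutoForwardDiff())
# output
true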
DifferentiationInterface.check_twoarg — Function
check_twoarg(backend)
Check whether backend supports differentiation of two-argument functions.
DifferentiationInterface.check_hessian — Function
check_hessian(backend)
Check whether backend supports second-order differentiation by trying to compute a Hessian.
Might take a while due to compilation time.
DifferentiationInterface.outer — Function
outer(backend::SecondOrder)
Return the outer backend of a SecondOrder object, tasked with differentiation at the second order.
DifferentiationInterface.inner — Function
inner(backend::SecondOrder)
Return the inner backend of a SecondOrder object, tasked with differentiation at the first order.
Backend switch
DifferentiationInterface.DifferentiateWith — Type
DifferentiateWith
Callable function wrapper that enforces differentiation with a specified (inner) backend.
This works by defining new rules overriding the behavior of the outer backend that would normally be used.
This is an experimental functionality, whose API cannot yet be considered stable. At the moment, it only supports one-argument functions, and rules are only defined for ChainRules.jl-compatible outer backends.
Fields
f: the function in question
backend::AbstractADType: the inner backend to use for differentiation
Constructor
DifferentiateWith(f, backend)
Example
using DifferentiationInterface
import ForwardDiff, Zygote
function f(x)
    a = Vector{eltype(x)}(undef, 1)
    a[1] = sum(x) # mutation that breaks Zygote
    return a[1]
end
dw = DifferentiateWith(f, AutoForwardDiff());
gradient(dw, AutoZygote(), [2.0]) # calls ForwardDiff instead
# output
1-element Vector{Float64}:
1.0
Sparsity detection
DifferentiationInterface.DenseSparsityDetector — Type
DenseSparsityDetector
Sparsity pattern detector satisfying the detection API of ADTypes.jl.
The nonzeros in a Jacobian or Hessian are detected by computing the relevant matrix with dense AD, and thresholding the entries with a given tolerance (which can be numerically inaccurate).
This detector can be very slow, and should only be used if its output can be exploited multiple times to compute many sparse matrices.
In general, the sparsity pattern you obtain can depend on the provided input x. If you want to reuse the pattern, make sure that it is input-agnostic.
Fields
backend::AbstractADType is the dense AD backend used under the hood
atol::Float64 is the minimum magnitude of a matrix entry to be considered nonzero
Constructor
DenseSparsityDetector(backend; atol, method=:iterative)
The keyword argument method::Symbol can be either:
:iterative: compute the matrix in a sequence of matrix-vector products (memory-efficient)
:direct: compute the matrix all at once (memory-hungry but sometimes faster)
Note that the constructor is type-unstable because method ends up being a type parameter of the DenseSparsityDetector object (this is not part of the API and might change).
Examples
using ADTypes, DifferentiationInterface, SparseArrays
import ForwardDiff
detector = DenseSparsityDetector(AutoForwardDiff(); atol=1e-5, method=:direct)
ADTypes.jacobian_sparsity(diff, rand(5), detector)
# output
4×5 SparseMatrixCSC{Bool, Int64} with 8 stored entries:
1 1 ⋅ ⋅ ⋅
⋅ 1 1 ⋅ ⋅
⋅ ⋅ 1 1 ⋅
⋅ ⋅ ⋅ 1 1
Sometimes the sparsity pattern is input-dependent:
ADTypes.jacobian_sparsity(x -> [prod(x)], rand(2), detector)
# output
1×2 SparseMatrixCSC{Bool, Int64} with 2 stored entries:
1 1
ADTypes.jacobian_sparsity(x -> [prod(x)], [0, 1], detector)
# output
1×2 SparseMatrixCSC{Bool, Int64} with 1 stored entry:
1 ⋅
Internals
The following is not part of the public API.
DifferentiationInterface.AutoZeroForward — Type
AutoZeroForward <: ADTypes.AbstractADType
Trivial backend that sets all derivatives to zero. Used in testing and benchmarking.
DifferentiationInterface.AutoZeroReverse — Type
AutoZeroReverse <: ADTypes.AbstractADType
Trivial backend that sets all derivatives to zero. Used in testing and benchmarking.
DifferentiationInterface.DerivativeExtras — Type
DerivativeExtras
Abstract type for additional information needed by derivative and its variants.
DifferentiationInterface.ForwardOverForward — Type
ForwardOverForward
Trait identifying second-order backends that compute HVPs in forward over forward mode (inefficient).
DifferentiationInterface.ForwardOverReverse — Type
ForwardOverReverse
Trait identifying second-order backends that compute HVPs in forward over reverse mode.
DifferentiationInterface.Gradient — Type
Gradient
Functor computing the gradient of f with a fixed backend.
This type is not part of the public API.
Constructor
Gradient(f, backend, extras=nothing)
If extras is provided, the gradient closure will skip preparation.
Example
using DifferentiationInterface
import Zygote
g = DifferentiationInterface.Gradient(x -> sum(abs2, x), AutoZygote())
g([2.0, 3.0])
# output
2-element Vector{Float64}:
4.0
6.0
DifferentiationInterface.GradientExtras — Type
GradientExtras
Abstract type for additional information needed by gradient and its variants.
DifferentiationInterface.HVPExtras — Type
HVPExtras
Abstract type for additional information needed by hvp and its variants.
DifferentiationInterface.HessianExtras — Type
HessianExtras
Abstract type for additional information needed by hessian and its variants.
DifferentiationInterface.JacobianExtras — Type
JacobianExtras
Abstract type for additional information needed by jacobian and its variants.
DifferentiationInterface.PullbackExtras — Type
PullbackExtras
Abstract type for additional information needed by pullback and its variants.
DifferentiationInterface.PullbackFast — Type
PullbackFast
Trait identifying backends that support efficient pullbacks.
DifferentiationInterface.PullbackSlow — Type
PullbackSlow
Trait identifying backends that do not support efficient pullbacks.
DifferentiationInterface.PushforwardExtras — Type
PushforwardExtras
Abstract type for additional information needed by pushforward and its variants.
DifferentiationInterface.PushforwardFast — Type
PushforwardFast
Trait identifying backends that support efficient pushforwards.
DifferentiationInterface.PushforwardSlow — Type
PushforwardSlow
Trait identifying backends that do not support efficient pushforwards.
DifferentiationInterface.ReverseOverForward — Type
ReverseOverForward
Trait identifying second-order backends that compute HVPs in reverse over forward mode.
DifferentiationInterface.ReverseOverReverse — Type
ReverseOverReverse
Trait identifying second-order backends that compute HVPs in reverse over reverse mode.
DifferentiationInterface.SecondDerivativeExtras — Type
SecondDerivativeExtras
Abstract type for additional information needed by second_derivative and its variants.
DifferentiationInterface.Tangents — Type
Tangents{B}
Storage for B (co)tangents (an NTuple wrapper).
Tangents{B} with B > 1 can be used as a seed to trigger batched-mode pushforward, pullback and hvp.
Fields
d::NTuple{B}
DifferentiationInterface.TwoArgNotSupported — Type
TwoArgNotSupported
Trait identifying backends that do not support two-argument functions f!(y, x).
DifferentiationInterface.TwoArgSupported — Type
TwoArgSupported
Trait identifying backends that support two-argument functions f!(y, x).
ADTypes.mode — Method
mode(backend::SecondOrder)
Return the outer mode of the second-order backend.
DifferentiationInterface.basis — Method
basis(backend, a::AbstractArray, i::CartesianIndex)
Construct the i-th standard basis array in the vector space of a with element type eltype(a).
Note
If an AD backend benefits from a more specialized basis array implementation, this function can be extended on the backend type.
DifferentiationInterface.multibasis — Method
multibasis(backend, a::AbstractArray, inds::AbstractVector{<:CartesianIndex})
Construct the sum of the i-th standard basis arrays in the vector space of a with element type eltype(a), for all i ∈ inds.
Note
If an AD backend benefits from a more specialized basis array implementation, this function can be extended on the backend type.
DifferentiationInterface.nested — Method
nested(backend)
Return a possibly modified backend that can work while nested inside another differentiation procedure.
At the moment, this is only useful for Enzyme, which needs autodiff_deferred to be compatible with higher-order differentiation.
DifferentiationInterface.pick_batchsize — Method
pick_batchsize(backend::AbstractADType, dimension::Integer)
Pick a reasonable batch size for batched derivative evaluation with a given total dimension.
Returns 1 for backends which have not overloaded it.
DifferentiationInterface.pullback_performance — Method
pullback_performance(backend)
Return PullbackFast or PullbackSlow in a statically predictable way.
DifferentiationInterface.pushforward_performance — Method
pushforward_performance(backend)
Return PushforwardFast or PushforwardSlow in a statically predictable way.
DifferentiationInterface.twoarg_support — Method
twoarg_support(backend)
Return TwoArgSupported or TwoArgNotSupported in a statically predictable way.