Introduction

We explain the basics of our package on a simple function that is not amenable to naive automatic differentiation.

using ForwardDiff
using ImplicitDifferentiation
using LinearAlgebra
using Random
using Zygote

Random.seed!(63);

Why do we bother?

ForwardDiff.jl and Zygote.jl are two prominent packages for automatic differentiation in Julia. While they are very generic, there are simple language constructs that they cannot differentiate through.

function badsqrt(x::AbstractArray)
    a = [0.0]
    a[1] = first(x)
    return sqrt.(x)
end
badsqrt (generic function with 1 method)

This is essentially the componentwise square root function but with an additional twist: a::Vector{Float64} is created internally, and its only element is replaced with the first element of x. We can check that it does what it's supposed to do.

x = rand(2)
badsqrt(x) ≈ sqrt.(x)
true

Of course the Jacobian has an explicit formula.

J = Diagonal(0.5 ./ sqrt.(x))
2×2 LinearAlgebra.Diagonal{Float64, Vector{Float64}}:
 1.05545   ⋅ 
  ⋅       0.874736

However, things start to go wrong when we compute it with autodiff, due to the limitations of ForwardDiff.jl and those of Zygote.jl.

try
    ForwardDiff.jacobian(badsqrt, x)
catch e
    e
end
MethodError(Float64, (Dual{ForwardDiff.Tag{typeof(Main.badsqrt), Float64}}(0.22442135286865494,1.0,0.0),), 0x0000000000008309)

ForwardDiff.jl throws an error because it tries to call badsqrt with an array of dual numbers, and cannot use one of these numbers to fill a (which has element type Float64).

try
    Zygote.jacobian(badsqrt, x)
catch e
    e
end
ErrorException("Mutating arrays is not supported -- called setindex!(Vector{Float64}, ...)\nThis error occurs when you ask Zygote to differentiate operations that change\nthe elements of arrays in place (e.g. setting values with x .= ...)\n\nPossible fixes:\n- avoid mutating operations (preferred)\n- or read the documentation and solutions for this error\n  https://fluxml.ai/Zygote.jl/latest/limitations\n")

Zygote.jl also throws an error because it cannot handle mutation.

Implicit function

The first possible use of ImplicitDifferentiation.jl is to overcome the limitations of automatic differentiation packages by defining functions (and computing their derivatives) implicitly. An implicit function is a mapping

\[x \in \mathbb{R}^n \longmapsto y(x) \in \mathbb{R}^m\]

whose output is defined by conditions

\[F(x,y(x)) = 0 \in \mathbb{R}^m\]

We represent it using a type called ImplicitFunction, which you will see in action shortly.

First we define a forward mapping corresponding to the function we consider. It returns the actual output $y(x)$ of the function, and can be thought of as a black box solver. Importantly, this Julia callable doesn't need to be differentiable by automatic differentiation packages but the underlying function still needs to be mathematically differentiable.

forward(x) = badsqrt(x)
forward (generic function with 1 method)

Then we define conditions $c(x, y) = 0$ that the output $y(x)$ is supposed to satisfy. These conditions must be array-valued, with the same size as $y$. Unlike the forward mapping, the conditions need to be differentiable by automatic differentiation packages with respect to both $x$ and $y$. Here the conditions are very obvious: the square of the square root should be equal to the original value.

function conditions(x, y)
    c = y .^ 2 .- x
    return c
end
conditions (generic function with 1 method)

Finally, we construct a wrapper implicit around the previous objects. By default, forward is assumed to return a single output and conditions is assumed to accept 2 arguments.

implicit = ImplicitFunction(forward, conditions)
ImplicitFunction(forward, conditions, IterativeLinearSolver(true, false), nothing)

What does this wrapper do? When we call it as a function, it just falls back on first ∘ implicit.forward, so unsurprisingly we get the first output $y(x)$.

implicit(x) ≈ sqrt.(x)
true

And when we try to compute its Jacobian, the implicit function theorem is applied in the background to circumvent the lack of differentiability of the forward mapping.

Forward and reverse mode autodiff

Now ForwardDiff.jl works seamlessly.

ForwardDiff.jacobian(implicit, x) ≈ J
true

And so does Zygote.jl. Hurray!

Zygote.jacobian(implicit, x)[1] ≈ J
true

Second derivative

We can even go higher-order by mixing the two packages (forward-over-reverse mode). The only technical requirement is to switch the linear solver to something that can handle dual numbers:

implicit_higher_order = ImplicitFunction(
    forward, conditions; linear_solver=DirectLinearSolver()
)
ImplicitFunction(forward, conditions, DirectLinearSolver(true), nothing)

Then the Jacobian itself is differentiable.

h = rand(2)
J_Z(t) = Zygote.jacobian(implicit_higher_order, x .+ t .* h)[1]
ForwardDiff.derivative(J_Z, 0) ≈ Diagonal((-0.25 .* h) ./ (x .^ 1.5))
true

This page was generated using Literate.jl.