Gradients for everyone

A quick guide to autodiff in Julia

Guillaume Dalle

EPFL

Adrian Hill

TU Berlin

2024-07-11

Introduction

Motivation

What is a derivative?

A linear approximation of a function around a point.

Why do we care?

Derivatives of computer code are essential in optimization and machine learning.

What do I need to do?

Not much: with Automatic Differentiation (AD), derivatives are easy to compute!

Three types of AD users

  1. Package users want to differentiate through functions
  2. Package developers want to write differentiable functions
  3. Backend developers want to create new AD systems

Python vs. Julia: user experience

Python vs. Julia: developers

Understanding AD

Various flavors of differentiation

  • Manual: work out \(f'\) by hand
  • Numeric: \(f'(x) \approx \frac{f(x+\varepsilon) - f(x)}{\varepsilon}\)
  • Symbolic: code a formula for \(f\), get a formula for \(f'\)
  • Automatic: code a program for \(f\), get a value for \(f'(x)\)
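To make the numeric flavor concrete, here is a minimal sketch of a forward finite difference in plain Julia. The helper name `finite_difference`, the default step size and the test function `g` are illustrative choices, not part of any package.

```julia
# Forward finite difference: approximates f'(x) with one extra evaluation of f.
# `finite_difference` and its default step `ε` are illustrative choices.
finite_difference(f, x; ε=sqrt(eps(typeof(x)))) = (f(x + ε) - f(x)) / ε

g(x) = sin(x)^2                  # exact derivative: 2 sin(x) cos(x) = sin(2x)
x0 = 0.7

approx = finite_difference(g, x0)
exact = sin(2 * x0)
err = abs(approx - exact)        # small but nonzero: truncation plus rounding error
```

The result is only approximate, which is one reason AD is usually preferred when it works.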

Automatic differentiation

Three key ideas (Griewank and Walther 2008):

  1. Programs are composition chains (or DAGs) of many functions
  2. Jacobian of \(f = f_L \circ \dots \circ f_2 \circ f_1\) given by the chain rule: \[ J = J_L J_{L-1} \dots J_2 J_1 \]
  3. Avoid materializing full Jacobians with matrix-vector products: we only need \(Jv\) and \(v^\top J\)

Forward mode

Jacobian-Vector Products (JVPs), aka pushforwards, are naturally evaluated from layer \(1\) to layer \(L\), in the same order as the program itself: \[ J v = J_L (J_{L-1}(\dots J_2(J_1 v))) \]

For \(f: \mathbb{R}^n \rightarrow \mathbb{R}^m\), the \(m \times n\) Jacobian requires \(n\) JVPs: one per input dimension.

Special case

The derivative of \(f : \mathbb{R} \rightarrow \mathbb{R}^m\) requires just one JVP.
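As a small illustration in plain Julia, suppose each layer comes with a hand-written JVP. The functions `f1`, `f2`, `jvp1`, `jvp2` and `jvp` below are made up for this sketch. Pushing the \(n = 2\) basis vectors through the chain, layer 1 first, recovers the Jacobian column by column.

```julia
# Two illustrative layers with hand-written Jacobian-vector products.
f1(x) = [x[1] * x[2], x[1] + x[2]]                      # R^2 -> R^2
jvp1(x, v) = [x[2] * v[1] + x[1] * v[2], v[1] + v[2]]   # J1(x) * v
f2(y) = [sin(y[1]), y[1] * y[2], y[2]^2]                # R^2 -> R^3
jvp2(y, w) = [cos(y[1]) * w[1], y[2] * w[1] + y[1] * w[2], 2 * y[2] * w[2]]  # J2(y) * w

# JVP of f2 ∘ f1: tangents travel in the same order as the computation.
function jvp(x, v)
    y = f1(x)
    return jvp2(y, jvp1(x, v))
end

x = [1.0, 2.0]
e1, e2 = [1.0, 0.0], [0.0, 1.0]
J = hcat(jvp(x, e1), jvp(x, e2))   # n = 2 JVPs give the 3×2 Jacobian
```

No Jacobian matrix of an individual layer is ever stored: each layer only needs to know how to act on a tangent vector.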

Reverse mode

Vector-Jacobian Products (VJPs), aka pullbacks, are naturally evaluated from layer \(L\) back to layer \(1\), in the opposite order to the program: \[ v^\top J = (((v^\top J_L) J_{L-1}) \dots J_2)J_1 \]

For \(f: \mathbb{R}^n \rightarrow \mathbb{R}^m\), the \(m \times n\) Jacobian requires \(m\) VJPs: one per output dimension.

Special case

The gradient of \(f : \mathbb{R}^n \rightarrow \mathbb{R}\) requires just one VJP.
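The gradient case can be sketched in plain Julia with hand-written VJPs. The functions `f1`, `f2`, `vjp1`, `vjp2` and `vjp` are hypothetical names chosen for this example, which differentiates \(x \mapsto \sum_i \sin(x_i^2)\).

```julia
# Two illustrative layers with hand-written vector-Jacobian products.
f1(x) = x .^ 2                      # R^n -> R^n
vjp1(x, u) = 2 .* x .* u            # u' * J1(x), where J1(x) is diagonal with entries 2x
f2(y) = sum(sin, y)                 # R^n -> R
vjp2(y, u) = u .* cos.(y)           # u * J2(y), where u is a scalar

# VJP of f2 ∘ f1: cotangents travel in the opposite order to the computation,
# so the intermediate value y must be kept from the forward sweep.
function vjp(x, u)
    y = f1(x)                       # forward sweep
    return vjp1(x, vjp2(y, u))      # reverse sweep
end

x = [0.5, 1.0, 1.5]
grad = vjp(x, 1.0)                  # a single VJP gives the whole gradient
```

Forward mode would need three JVPs for the same result, one per input.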

Implementation details

Forward mode

Forward sweep only.

Often based on dual numbers.

Low memory cost.
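The dual number idea fits in a few lines of plain Julia. The `Dual` type and the helper `dual_derivative` below are toys written for this sketch, not ForwardDiff's implementation, and they only support the three operations used in the example.

```julia
# A minimal dual number: a value and its derivative, carried together.
struct Dual
    val::Float64   # primal value
    der::Float64   # derivative with respect to the input
end

# Each overloaded operation applies the matching differentiation rule.
Base.:+(a::Dual, b::Dual) = Dual(a.val + b.val, a.der + b.der)
Base.:*(a::Dual, b::Dual) = Dual(a.val * b.val, a.der * b.val + a.val * b.der)
Base.:*(c::Real, a::Dual) = Dual(c * a.val, c * a.der)
Base.sin(a::Dual) = Dual(sin(a.val), cos(a.val) * a.der)

# Seeding `der = 1` and running the program once yields f'(x) alongside f(x).
dual_derivative(f, x) = f(Dual(x, 1.0)).der

h(x) = 3 * x * x + sin(x)        # h'(x) = 6x + cos(x)
d = dual_derivative(h, 2.0)
```

Only one extra number travels with each value, which is why the memory overhead stays low.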

Reverse mode

Forward sweep + reverse sweep.

Often based on tapes.

High memory cost.
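A tape can be sketched in plain Julia as a list of closures recorded during the forward sweep and replayed backwards. The types `Tape` and `Tracked` and the function `tape_gradient` are hypothetical names for this toy example, which only supports `+`, `*` and `sin`.

```julia
# The tape stores one backward step per recorded operation.
struct Tape
    steps::Vector{Function}
end

# A tracked number: primal value, accumulated adjoint, and the tape it writes to.
mutable struct Tracked
    val::Float64
    adj::Float64
    tape::Tape
end

function Base.:+(a::Tracked, b::Tracked)
    out = Tracked(a.val + b.val, 0.0, a.tape)
    push!(a.tape.steps, () -> (a.adj += out.adj; b.adj += out.adj))
    return out
end

function Base.:*(a::Tracked, b::Tracked)
    out = Tracked(a.val * b.val, 0.0, a.tape)
    push!(a.tape.steps, () -> (a.adj += out.adj * b.val; b.adj += out.adj * a.val))
    return out
end

function Base.sin(a::Tracked)
    out = Tracked(sin(a.val), 0.0, a.tape)
    push!(a.tape.steps, () -> (a.adj += out.adj * cos(a.val)))
    return out
end

function tape_gradient(f, x::Vector{Float64})
    tape = Tape(Function[])
    inputs = [Tracked(xi, 0.0, tape) for xi in x]
    out = f(inputs)                                  # forward sweep: records the tape
    out.adj = 1.0                                    # seed the output adjoint
    foreach(step -> step(), reverse(tape.steps))     # reverse sweep: replay backwards
    return [t.adj for t in inputs]
end

k(x) = x[1] * x[2] + sin(x[1])
grad = tape_gradient(k, [1.0, 2.0])                  # [x2 + cos(x1), x1]
```

The tape grows with the number of operations executed, which is where the memory cost of reverse mode comes from.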

Using AD

Why so many backends?

  • Conflicting paradigms:
    • numeric vs. symbolic vs. algorithmic
    • operator overloading vs. source-to-source (which source?)
  • Cover varying subsets of the language
  • Historical reasons: developed by different people

Meaningful criteria

  • Does this AD backend execute without error?
  • Does it return the right derivative?
  • Does it run fast enough for me?

A simple decision tree

  1. Follow recommendations of high-level library (e.g. Flux).
  2. Otherwise, choose mode based on input and output dimensions.
  3. Try the most battle-tested backends: ForwardDiff or Enzyme in forward mode, Zygote or Enzyme in reverse mode.
  4. If nothing works, fall back on finite differences.
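Step 2 can be sketched as a rule of thumb that counts the passes each mode needs for a full Jacobian. The helper `suggest_mode` is hypothetical, and it deliberately ignores constant factors and memory use, so treat it as a starting point rather than a rule.

```julia
# Hypothetical helper: compare the number of passes needed for a full Jacobian.
function suggest_mode(n_inputs::Integer, n_outputs::Integer)
    # forward mode needs n_inputs JVPs, reverse mode needs n_outputs VJPs
    return n_inputs <= n_outputs ? :forward : :reverse
end

curve_mode = suggest_mode(1, 100)      # a curve R -> R^100
loss_mode = suggest_mode(10_000, 1)    # a loss function R^10000 -> R
```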

Enabling AD

Typical ForwardDiff failure

import ForwardDiff

badcopy(x) = copyto!(zeros(size(x)), x)

ForwardDiff.jacobian(badcopy, ones(2))
MethodError: no method matching Float64(::ForwardDiff.Dual{ForwardDiff.Tag{typeof(badcopy), Float64}, Float64, 2})

Closest candidates are:
  (::Type{T})(::Real, !Matched::RoundingMode) where T<:AbstractFloat
   @ Base rounding.jl:207
  (::Type{T})(::T) where T<:Number
   @ Core boot.jl:792
  Float64(!Matched::IrrationalConstants.Sqrt4π)
   @ IrrationalConstants ~/.julia/packages/IrrationalConstants/vp5v4/src/macro.jl:112
  ...

Stacktrace:
  [1] convert(::Type{Float64}, x::ForwardDiff.Dual{ForwardDiff.Tag{typeof(badcopy), Float64}, Float64, 2})
    @ Base ./number.jl:7
  [2] setindex!(A::Vector{Float64}, x::ForwardDiff.Dual{ForwardDiff.Tag{typeof(badcopy), Float64}, Float64, 2}, i1::Int64)
    @ Base ./array.jl:1021
  [3] _unsafe_copyto!(dest::Vector{Float64}, doffs::Int64, src::Vector{ForwardDiff.Dual{ForwardDiff.Tag{typeof(badcopy), Float64}, Float64, 2}}, soffs::Int64, n::Int64)
    @ Base ./array.jl:299
  [4] unsafe_copyto!
    @ ./array.jl:353 [inlined]
  [5] _copyto_impl!
    @ ./array.jl:376 [inlined]
  [6] copyto!
    @ ./array.jl:363 [inlined]
  [7] copyto!
    @ ./array.jl:385 [inlined]
  [8] badcopy(x::Vector{ForwardDiff.Dual{ForwardDiff.Tag{typeof(badcopy), Float64}, Float64, 2}})
    @ Main.Notebook ~/work/JuliaCon2024-AutoDiff/JuliaCon2024-AutoDiff/index.qmd:252
  [9] vector_mode_dual_eval!
    @ ~/.julia/packages/ForwardDiff/PcZ48/src/apiutils.jl:24 [inlined]
 [10] vector_mode_jacobian(f::typeof(badcopy), x::Vector{Float64}, cfg::ForwardDiff.JacobianConfig{ForwardDiff.Tag{typeof(badcopy), Float64}, Float64, 2, Vector{ForwardDiff.Dual{ForwardDiff.Tag{typeof(badcopy), Float64}, Float64, 2}}})
    @ ForwardDiff ~/.julia/packages/ForwardDiff/PcZ48/src/jacobian.jl:125
 [11] jacobian(f::Function, x::Vector{Float64}, cfg::ForwardDiff.JacobianConfig{ForwardDiff.Tag{typeof(badcopy), Float64}, Float64, 2, Vector{ForwardDiff.Dual{ForwardDiff.Tag{typeof(badcopy), Float64}, Float64, 2}}}, ::Val{true})
    @ ForwardDiff ~/.julia/packages/ForwardDiff/PcZ48/src/jacobian.jl:21
 [12] jacobian(f::Function, x::Vector{Float64}, cfg::ForwardDiff.JacobianConfig{ForwardDiff.Tag{typeof(badcopy), Float64}, Float64, 2, Vector{ForwardDiff.Dual{ForwardDiff.Tag{typeof(badcopy), Float64}, Float64, 2}}})
    @ ForwardDiff ~/.julia/packages/ForwardDiff/PcZ48/src/jacobian.jl:19
 [13] top-level scope
    @ ~/work/JuliaCon2024-AutoDiff/JuliaCon2024-AutoDiff/index.qmd:254

ForwardDiff troubleshooting

Allow numbers of type Dual in your functions: allocate outputs with the element type of the input instead of hard-coding Float64.

goodcopy(x::AbstractArray{<:Real}) = copyto!(zeros(eltype(x), size(x)), x)

ForwardDiff.jacobian(goodcopy, ones(2))
2×2 Matrix{Float64}:
 1.0  0.0
 0.0  1.0
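A variant of the same fix, sketched under the assumption that the output should have the same shape and element type as the input, lets `similar` pick the storage. The name `similarcopy` is made up for this example.

```julia
import ForwardDiff

# `similar(x)` allocates an array with the same element type and shape as `x`,
# so it can hold Dual numbers when ForwardDiff calls the function.
similarcopy(x) = copyto!(similar(x), x)

J = ForwardDiff.jacobian(similarcopy, ones(2))   # 2×2 identity matrix
```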

Typical Zygote failure

import Zygote

Zygote.jacobian(badcopy, ones(2))
Mutating arrays is not supported -- called copyto!(Vector{Float64}, ...)
This error occurs when you ask Zygote to differentiate operations that change
the elements of arrays in place (e.g. setting values with x .= ...)

Possible fixes:
- avoid mutating operations (preferred)
- or read the documentation and solutions for this error
  https://fluxml.ai/Zygote.jl/latest/limitations

Stacktrace:
  [1] error(s::String)
    @ Base ./error.jl:35
  [2] _throw_mutation_error(f::Function, args::Vector{Float64})
    @ Zygote ~/.julia/packages/Zygote/nsBv0/src/lib/array.jl:70
  [3] (::Zygote.var"#543#544"{Vector{Float64}})(::Vector{Float64})
    @ Zygote ~/.julia/packages/Zygote/nsBv0/src/lib/array.jl:85
  [4] (::Zygote.var"#2633#back#545"{Zygote.var"#543#544"{Vector{Float64}}})(Δ::Vector{Float64})
    @ Zygote ~/.julia/packages/ZygoteRules/M4xmc/src/adjoint.jl:72
  [5] badcopy
    @ ~/work/JuliaCon2024-AutoDiff/JuliaCon2024-AutoDiff/index.qmd:252 [inlined]
  [6] (::Zygote.Pullback{Tuple{typeof(badcopy), Vector{Float64}}, Tuple{Zygote.ZBack{Returns{Tuple{ChainRulesCore.NoTangent, ChainRulesCore.NoTangent}}}, Zygote.var"#2633#back#545"{Zygote.var"#543#544"{Vector{Float64}}}, Zygote.ZBack{Returns{Tuple{ChainRulesCore.NoTangent, ChainRulesCore.NoTangent}}}}})(Δ::Vector{Float64})
    @ Zygote ~/.julia/packages/Zygote/nsBv0/src/compiler/interface2.jl:0
  [7] (::Zygote.var"#291#292"{Tuple{Tuple{Nothing}}, Zygote.Pullback{Tuple{typeof(badcopy), Vector{Float64}}, Tuple{Zygote.ZBack{Returns{Tuple{ChainRulesCore.NoTangent, ChainRulesCore.NoTangent}}}, Zygote.var"#2633#back#545"{Zygote.var"#543#544"{Vector{Float64}}}, Zygote.ZBack{Returns{Tuple{ChainRulesCore.NoTangent, ChainRulesCore.NoTangent}}}}}})(Δ::Vector{Float64})
    @ Zygote ~/.julia/packages/Zygote/nsBv0/src/lib/lib.jl:206
  [8] (::Zygote.var"#2169#back#293"{Zygote.var"#291#292"{Tuple{Tuple{Nothing}}, Zygote.Pullback{Tuple{typeof(badcopy), Vector{Float64}}, Tuple{Zygote.ZBack{Returns{Tuple{ChainRulesCore.NoTangent, ChainRulesCore.NoTangent}}}, Zygote.var"#2633#back#545"{Zygote.var"#543#544"{Vector{Float64}}}, Zygote.ZBack{Returns{Tuple{ChainRulesCore.NoTangent, ChainRulesCore.NoTangent}}}}}}})(Δ::Vector{Float64})
    @ Zygote ~/.julia/packages/ZygoteRules/M4xmc/src/adjoint.jl:72
  [9] call_composed
    @ ./operators.jl:1045 [inlined]
 [10] (::Zygote.Pullback{Tuple{typeof(Base.call_composed), Tuple{typeof(badcopy)}, Tuple{Vector{Float64}}, @Kwargs{}}, Any})(Δ::Vector{Float64})
    @ Zygote ~/.julia/packages/Zygote/nsBv0/src/compiler/interface2.jl:0
 [11] call_composed
    @ ./operators.jl:1044 [inlined]
 [12] #_#103
    @ ./operators.jl:1041 [inlined]
 [13] (::Zygote.Pullback{Tuple{Base.var"##_#103", @Kwargs{}, ComposedFunction{typeof(Zygote._jvec), typeof(badcopy)}, Vector{Float64}}, Tuple{Zygote.Pullback{Tuple{typeof(Base.call_composed), Tuple{typeof(Zygote._jvec), typeof(badcopy)}, Tuple{Vector{Float64}}, @Kwargs{}}, Tuple{Zygote.var"#2141#back#281"{Zygote.var"#277#280"}, Zygote.Pullback{Tuple{typeof(Zygote._jvec), Vector{Float64}}, Tuple{Zygote.Pullback{Tuple{typeof(vec), Vector{Float64}}, Tuple{}}}}, Zygote.var"#2029#back#213"{Zygote.var"#back#211"{2, 1, Zygote.Context{false}, typeof(Zygote._jvec)}}, Zygote.Pullback{Tuple{typeof(Base.call_composed), Tuple{typeof(badcopy)}, Tuple{Vector{Float64}}, @Kwargs{}}, Any}}}, Zygote.Pullback{Tuple{typeof(Base.unwrap_composed), ComposedFunction{typeof(Zygote._jvec), typeof(badcopy)}}, Tuple{Zygote.Pullback{Tuple{typeof(Base.unwrap_composed), typeof(Zygote._jvec)}, Tuple{Zygote.var"#2013#back#204"{typeof(identity)}, Zygote.Pullback{Tuple{typeof(Base.maybeconstructor), typeof(Zygote._jvec)}, Tuple{}}}}, Zygote.Pullback{Tuple{typeof(Base.unwrap_composed), typeof(badcopy)}, Tuple{Zygote.var"#2013#back#204"{typeof(identity)}, Zygote.Pullback{Tuple{typeof(Base.maybeconstructor), typeof(badcopy)}, Tuple{}}}}, Zygote.var"#2180#back#303"{Zygote.var"#back#302"{:inner, Zygote.Context{false}, ComposedFunction{typeof(Zygote._jvec), typeof(badcopy)}, typeof(badcopy)}}, Zygote.var"#2169#back#293"{Zygote.var"#291#292"{Tuple{Tuple{Nothing}, Tuple{Nothing}}, Zygote.var"#2013#back#204"{typeof(identity)}}}, Zygote.var"#2180#back#303"{Zygote.var"#back#302"{:outer, Zygote.Context{false}, ComposedFunction{typeof(Zygote._jvec), typeof(badcopy)}, typeof(Zygote._jvec)}}}}}})(Δ::Vector{Float64})
    @ Zygote ~/.julia/packages/Zygote/nsBv0/src/compiler/interface2.jl:0
 [14] #291
    @ ~/.julia/packages/Zygote/nsBv0/src/lib/lib.jl:206 [inlined]
 [15] #2169#back
    @ ~/.julia/packages/ZygoteRules/M4xmc/src/adjoint.jl:72 [inlined]
 [16] ComposedFunction
    @ ./operators.jl:1041 [inlined]
 [17] (::Zygote.Pullback{Tuple{ComposedFunction{typeof(Zygote._jvec), typeof(badcopy)}, Vector{Float64}}, Tuple{Zygote.Pullback{Tuple{Type{NamedTuple}}, Tuple{}}, Zygote.var"#2169#back#293"{Zygote.var"#291#292"{Tuple{Tuple{Nothing, Nothing}, Tuple{Nothing}}, Zygote.Pullback{Tuple{Base.var"##_#103", @Kwargs{}, ComposedFunction{typeof(Zygote._jvec), typeof(badcopy)}, Vector{Float64}}, Tuple{Zygote.Pullback{Tuple{typeof(Base.call_composed), Tuple{typeof(Zygote._jvec), typeof(badcopy)}, Tuple{Vector{Float64}}, @Kwargs{}}, Tuple{Zygote.var"#2141#back#281"{Zygote.var"#277#280"}, Zygote.Pullback{Tuple{typeof(Zygote._jvec), Vector{Float64}}, Tuple{Zygote.Pullback{Tuple{typeof(vec), Vector{Float64}}, Tuple{}}}}, Zygote.var"#2029#back#213"{Zygote.var"#back#211"{2, 1, Zygote.Context{false}, typeof(Zygote._jvec)}}, Zygote.Pullback{Tuple{typeof(Base.call_composed), Tuple{typeof(badcopy)}, Tuple{Vector{Float64}}, @Kwargs{}}, Any}}}, Zygote.Pullback{Tuple{typeof(Base.unwrap_composed), ComposedFunction{typeof(Zygote._jvec), typeof(badcopy)}}, Tuple{Zygote.Pullback{Tuple{typeof(Base.unwrap_composed), typeof(Zygote._jvec)}, Tuple{Zygote.var"#2013#back#204"{typeof(identity)}, Zygote.Pullback{Tuple{typeof(Base.maybeconstructor), typeof(Zygote._jvec)}, Tuple{}}}}, Zygote.Pullback{Tuple{typeof(Base.unwrap_composed), typeof(badcopy)}, Tuple{Zygote.var"#2013#back#204"{typeof(identity)}, Zygote.Pullback{Tuple{typeof(Base.maybeconstructor), typeof(badcopy)}, Tuple{}}}}, Zygote.var"#2180#back#303"{Zygote.var"#back#302"{:inner, Zygote.Context{false}, ComposedFunction{typeof(Zygote._jvec), typeof(badcopy)}, typeof(badcopy)}}, Zygote.var"#2169#back#293"{Zygote.var"#291#292"{Tuple{Tuple{Nothing}, Tuple{Nothing}}, Zygote.var"#2013#back#204"{typeof(identity)}}}, Zygote.var"#2180#back#303"{Zygote.var"#back#302"{:outer, Zygote.Context{false}, ComposedFunction{typeof(Zygote._jvec), typeof(badcopy)}, typeof(Zygote._jvec)}}}}}}}}, 
Zygote.var"#2366#back#419"{Zygote.var"#pairs_namedtuple_pullback#418"{(), @NamedTuple{}}}, Zygote.var"#2013#back#204"{typeof(identity)}}})(Δ::Vector{Float64})
    @ Zygote ~/.julia/packages/Zygote/nsBv0/src/compiler/interface2.jl:0
 [18] (::Zygote.var"#75#76"{Zygote.Pullback{Tuple{ComposedFunction{typeof(Zygote._jvec), typeof(badcopy)}, Vector{Float64}}, Tuple{Zygote.Pullback{Tuple{Type{NamedTuple}}, Tuple{}}, Zygote.var"#2169#back#293"{Zygote.var"#291#292"{Tuple{Tuple{Nothing, Nothing}, Tuple{Nothing}}, Zygote.Pullback{Tuple{Base.var"##_#103", @Kwargs{}, ComposedFunction{typeof(Zygote._jvec), typeof(badcopy)}, Vector{Float64}}, Tuple{Zygote.Pullback{Tuple{typeof(Base.call_composed), Tuple{typeof(Zygote._jvec), typeof(badcopy)}, Tuple{Vector{Float64}}, @Kwargs{}}, Tuple{Zygote.var"#2141#back#281"{Zygote.var"#277#280"}, Zygote.Pullback{Tuple{typeof(Zygote._jvec), Vector{Float64}}, Tuple{Zygote.Pullback{Tuple{typeof(vec), Vector{Float64}}, Tuple{}}}}, Zygote.var"#2029#back#213"{Zygote.var"#back#211"{2, 1, Zygote.Context{false}, typeof(Zygote._jvec)}}, Zygote.Pullback{Tuple{typeof(Base.call_composed), Tuple{typeof(badcopy)}, Tuple{Vector{Float64}}, @Kwargs{}}, Any}}}, Zygote.Pullback{Tuple{typeof(Base.unwrap_composed), ComposedFunction{typeof(Zygote._jvec), typeof(badcopy)}}, Tuple{Zygote.Pullback{Tuple{typeof(Base.unwrap_composed), typeof(Zygote._jvec)}, Tuple{Zygote.var"#2013#back#204"{typeof(identity)}, Zygote.Pullback{Tuple{typeof(Base.maybeconstructor), typeof(Zygote._jvec)}, Tuple{}}}}, Zygote.Pullback{Tuple{typeof(Base.unwrap_composed), typeof(badcopy)}, Tuple{Zygote.var"#2013#back#204"{typeof(identity)}, Zygote.Pullback{Tuple{typeof(Base.maybeconstructor), typeof(badcopy)}, Tuple{}}}}, Zygote.var"#2180#back#303"{Zygote.var"#back#302"{:inner, Zygote.Context{false}, ComposedFunction{typeof(Zygote._jvec), typeof(badcopy)}, typeof(badcopy)}}, Zygote.var"#2169#back#293"{Zygote.var"#291#292"{Tuple{Tuple{Nothing}, Tuple{Nothing}}, Zygote.var"#2013#back#204"{typeof(identity)}}}, Zygote.var"#2180#back#303"{Zygote.var"#back#302"{:outer, Zygote.Context{false}, ComposedFunction{typeof(Zygote._jvec), typeof(badcopy)}, typeof(Zygote._jvec)}}}}}}}}, 
Zygote.var"#2366#back#419"{Zygote.var"#pairs_namedtuple_pullback#418"{(), @NamedTuple{}}}, Zygote.var"#2013#back#204"{typeof(identity)}}}})(Δ::Vector{Float64})
    @ Zygote ~/.julia/packages/Zygote/nsBv0/src/compiler/interface.jl:91
 [19] withjacobian(f::Function, args::Vector{Float64})
    @ Zygote ~/.julia/packages/Zygote/nsBv0/src/lib/grad.jl:150
 [20] jacobian(f::Function, args::Vector{Float64})
    @ Zygote ~/.julia/packages/Zygote/nsBv0/src/lib/grad.jl:128
 [21] top-level scope
    @ ~/work/JuliaCon2024-AutoDiff/JuliaCon2024-AutoDiff/index.qmd:280

Zygote troubleshooting

Define a custom rule with ChainRulesCore:

using ChainRulesCore, LinearAlgebra

badcopy2(x) = badcopy(x)

function ChainRulesCore.rrule(::typeof(badcopy2), x)
    @info "My rule is called"
    y = badcopy2(x)  # primal value
    function badcopy2_pullback(dy)
        @info "My pullback is called"
        df = NoTangent()
        dx = I' * dy  # Vector-Jacobian product
        return (df, dx)
    end
    return y, badcopy2_pullback
end

Zygote.jacobian(badcopy2, ones(2))
[ Info: My rule is called
[ Info: My pullback is called
[ Info: My pullback is called
([1.0 0.0; 0.0 1.0],)
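When you control the code, another route is to avoid mutating ordinary arrays. As a sketch, Zygote's `Zygote.Buffer` provides storage that can be written to during differentiation and then converted back with `copy`. The function name `buffercopy` is made up for this example.

```julia
import Zygote

function buffercopy(x)
    buf = Zygote.Buffer(x)      # writable storage that Zygote can differentiate through
    for i in eachindex(x)
        buf[i] = x[i]
    end
    return copy(buf)            # turn the buffer back into a regular array
end

J = Zygote.jacobian(buffercopy, ones(2))[1]   # jacobian returns one matrix per argument
```

Whether a buffer or a custom rule is the better choice depends on the function: a rule is usually preferable when the derivative is known in closed form.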

Typical Enzyme failure

import Enzyme

Enzyme.autodiff(
  Enzyme.Forward,
  badcopy,
  Enzyme.Active(ones(2))
)
Unsupported Active{Vector{Float64}}, consider Duplicated or Const
Stacktrace:
 [1] error(s::String)
   @ Base ./error.jl:35
 [2] EnzymeCore.Active(x::Vector{Float64})
   @ EnzymeCore ~/.julia/packages/EnzymeCore/a2poZ/src/EnzymeCore.jl:49
 [3] top-level scope
   @ ~/work/JuliaCon2024-AutoDiff/JuliaCon2024-AutoDiff/index.qmd:320

Enzyme troubleshooting

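Pay attention to type stability, temporary storage and activity annotations (see the Enzyme FAQ). Here, the array argument cannot be marked Active: it must be wrapped in Duplicated, together with a shadow array that holds its derivative.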

Enzyme.autodiff(
  Enzyme.Forward,
  badcopy,
  Enzyme.Duplicated(ones(2), zeros(2))
)

DifferentiationInterface

Goals

  • DifferentiationInterface (DI) offers a common syntax for all AD backends
  • AD users can compare correctness and performance without learning the API of each backend
  • AD developers get access to a wider user base

The fine print

DI may be slower than a direct call to the backend’s API (mostly with Enzyme).

Supported packages

Getting started with DI

Step 1: Load the necessary packages

using DifferentiationInterface
import ForwardDiff, Enzyme, Zygote

f(x) = sum(abs2, x)
x = [1.0, 2.0, 3.0, 4.0]

Step 2: Combine DI’s operators with a backend from ADTypes

value_and_gradient(f, AutoForwardDiff(), x)
(30.0, [2.0, 4.0, 6.0, 8.0])
value_and_gradient(f, AutoEnzyme(), x)
(30.0, [2.0, 4.0, 6.0, 8.0])
value_and_gradient(f, AutoZygote(), x)
(30.0, [2.0, 4.0, 6.0, 8.0])

Step 3: Increase performance via DI’s preparation mechanism
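A minimal sketch of preparation, assuming a recent DI release in which the preparation object is passed right after the function and its outputs. Releases from around the time of this talk passed it as the last argument instead, so check the documentation of your installed version. The variable names `prep` and `grad` are arbitrary.

```julia
using DifferentiationInterface
import ForwardDiff

f(x) = sum(abs2, x)
x = [1.0, 2.0, 3.0, 4.0]
backend = AutoForwardDiff()

# Pay the setup cost once (configs, caches, ...) for this function, backend and input type.
prep = prepare_gradient(f, backend, x)

# Reuse the preparation and a preallocated gradient across many calls.
grad = similar(x)
y, _ = value_and_gradient!(f, grad, prep, backend, x)
```

Preparation only pays off when the same operator is called repeatedly on inputs of the same type and size.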

Features of DI

  • Support for functions f(x) or f!(y, x) with scalar/array inputs & outputs
  • Eight standard operators: pushforward, pullback, derivative, gradient, jacobian, hvp, second_derivative, hessian
  • Out-of-place and in-place versions
  • Combine different backends using SecondOrder
  • Translate between backends using DifferentiateWith

DifferentiationInterfaceTest

  • Systematic tests for a variety of inputs and functions
  • Scenarios with weird arrays (static, GPU, sparse)
  • Type-stability checks
  • Automated benchmarks

Sparse AD ecosystem

Sparse AD with coloring (Gebremedhin, Manne, and Pothen 2005)
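The principle behind coloring can be sketched in plain Julia: columns of the Jacobian that never share a nonzero row receive the same color and are recovered from a single JVP. The function `color_columns` is a toy greedy coloring written for this example, not the algorithm of any package, and the JVPs are emulated with a dense matrix product purely so the result can be checked.

```julia
# Greedy column coloring: columns that share a nonzero row get different colors.
function color_columns(pattern::AbstractMatrix{Bool})
    n = size(pattern, 2)
    colors = zeros(Int, n)
    for j in 1:n
        # colors already taken by earlier columns that overlap with column j
        forbidden = Set(colors[k] for k in 1:(j - 1) if any(pattern[:, j] .& pattern[:, k]))
        c = 1
        while c in forbidden
            c += 1
        end
        colors[j] = c
    end
    return colors
end

n = 6
# Illustrative bidiagonal Jacobian: output i depends on inputs i and i + 1.
J = [j == i ? 2.0 * i : (j == i + 1 ? 1.0 : 0.0) for i in 1:n, j in 1:n]
pattern = J .!= 0

colors = color_columns(pattern)
ncolors = maximum(colors)              # far fewer colors than columns

# One JVP per color: the seed sums the basis vectors of same-colored columns.
seeds = [Float64.(colors .== c) for c in 1:ncolors]
compressed = [J * s for s in seeds]    # stand-in for JVPs computed by AD

# Decompression: entry (i, j) is read from the product for the color of column j.
J_rec = [pattern[i, j] ? compressed[colors[j]][i] : 0.0 for i in 1:n, j in 1:n]
```

In this example two JVPs replace six, and the saving grows with the size of the problem as long as the sparsity pattern stays banded.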

Conclusion

What’s next?

DI and its sparse AD ecosystem are brand new projects:

  • Try them out in your code
  • Report bugs or inefficiencies
  • Help us improve these packages!

Coming soon in DI (JuliaCon hackathon?)

Support for multiple arguments and non-array types.

More complex settings

More details in the book by Blondel and Roulet (2024).

Take-home message

Computing derivatives is easy, but each AD solution comes with its own limitations.

Learn to recognize and overcome them, either as a user or as a developer.

References

Blondel, Mathieu, and Vincent Roulet. 2024. “The Elements of Differentiable Programming.” arXiv. https://doi.org/10.48550/arXiv.2403.14606.
Gebremedhin, Assefaw Hadish, Fredrik Manne, and Alex Pothen. 2005. “What Color Is Your Jacobian? Graph Coloring for Computing Derivatives.” SIAM Review 47 (4): 629–705. https://doi.org/cmwds4.
Griewank, Andreas, and Andrea Walther. 2008. Evaluating Derivatives: Principles and Techniques of Algorithmic Differentiation. 2nd ed. Philadelphia, PA: Society for Industrial and Applied Mathematics.