Sparsity

We present sparsity handling with DifferentiationInterface.jl.

using BenchmarkTools
using DifferentiationInterface
import ForwardDiff, Zygote

Sparse AD is very useful when the Jacobian or Hessian matrix contains mostly zeros. Let us write functions whose derivatives satisfy this property.

f_sparse_vector(x::AbstractVector) = diff(x .^ 2) + diff(reverse(x .^ 2))
f_sparse_scalar(x::AbstractVector) = sum(f_sparse_vector(x) .^ 2)

Let's also pick a test vector.

x = float.(1:8)
8-element Vector{Float64}:
 1.0
 2.0
 3.0
 4.0
 5.0
 6.0
 7.0
 8.0

Dense backends

When we use the jacobian or hessian operator with a dense backend, we get a dense matrix with plenty of zeros.

dense_first_order_backend = AutoForwardDiff()
J_dense = jacobian(f_sparse_vector, dense_first_order_backend, x)
7×8 Matrix{Float64}:
 -2.0   4.0   0.0   0.0    0.0    0.0   14.0  -16.0
  0.0  -4.0   6.0   0.0    0.0   12.0  -14.0    0.0
  0.0   0.0  -6.0   8.0   10.0  -12.0    0.0    0.0
  0.0   0.0   0.0   0.0    0.0    0.0    0.0    0.0
  0.0   0.0   6.0  -8.0  -10.0   12.0    0.0    0.0
  0.0   4.0  -6.0   0.0    0.0  -12.0   14.0    0.0
  2.0  -4.0   0.0   0.0    0.0    0.0  -14.0   16.0
dense_second_order_backend = SecondOrder(AutoForwardDiff(), AutoZygote())
H_dense = hessian(f_sparse_scalar, dense_second_order_backend, x)
8×8 Matrix{Float64}:
  112.0   -32.0     0.0     0.0     0.0     0.0  -112.0   128.0
  -32.0    96.0   -96.0     0.0     0.0  -192.0   448.0  -256.0
    0.0   -96.0   256.0  -192.0  -240.0   576.0  -336.0     0.0
    0.0     0.0  -192.0   224.0   320.0  -384.0     0.0     0.0
    0.0     0.0  -240.0   320.0   368.0  -480.0     0.0     0.0
    0.0  -192.0   576.0  -384.0  -480.0  1120.0  -672.0     0.0
 -112.0   448.0  -336.0     0.0     0.0  -672.0  1536.0  -896.0
  128.0  -256.0     0.0     0.0     0.0     0.0  -896.0  1120.0

The results are correct, but the procedure is slow. With a sparse backend, the runtime scales with the number of nonzero elements instead of the total number of elements.

Sparse backends

To create a sparse backend, combine a dense backend, a sparsity detector and a compatible coloring algorithm inside AutoSparse. The following are reasonable defaults:

using SparseConnectivityTracer: TracerSparsityDetector
using SparseMatrixColorings: GreedyColoringAlgorithm

sparse_first_order_backend = AutoSparse(
    AutoForwardDiff();
    sparsity_detector=TracerSparsityDetector(),
    coloring_algorithm=GreedyColoringAlgorithm(),
)

sparse_second_order_backend = AutoSparse(
    SecondOrder(AutoForwardDiff(), AutoZygote());
    sparsity_detector=TracerSparsityDetector(),
    coloring_algorithm=GreedyColoringAlgorithm(),
)
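
Before differentiating, we can peek at the pattern the detector finds. The snippet below is a small sketch, assuming SparseConnectivityTracer's jacobian_sparsity function, which returns a Boolean sparsity pattern without evaluating any derivatives.

using SparseConnectivityTracer: jacobian_sparsity

# Detect the sparsity pattern of the Jacobian of f_sparse_vector at x.
# The result is a Boolean matrix: no derivative values are computed.
jacobian_sparsity(f_sparse_vector, x, TracerSparsityDetector())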

Now the resulting matrices are sparse:

jacobian(f_sparse_vector, sparse_first_order_backend, x)
7×8 SparseArrays.SparseMatrixCSC{Float64, Int64} with 26 stored entries:
 -2.0   4.0    ⋅     ⋅      ⋅      ⋅    14.0  -16.0
   ⋅   -4.0   6.0    ⋅      ⋅    12.0  -14.0     ⋅ 
   ⋅     ⋅   -6.0   8.0   10.0  -12.0     ⋅      ⋅ 
   ⋅     ⋅     ⋅    0.0    0.0     ⋅      ⋅      ⋅ 
   ⋅     ⋅    6.0  -8.0  -10.0   12.0     ⋅      ⋅ 
   ⋅    4.0  -6.0    ⋅      ⋅   -12.0   14.0     ⋅ 
  2.0  -4.0    ⋅     ⋅      ⋅      ⋅   -14.0   16.0
hessian(f_sparse_scalar, sparse_second_order_backend, x)
8×8 SparseArrays.SparseMatrixCSC{Float64, Int64} with 40 stored entries:
  112.0   -32.0      ⋅       ⋅       ⋅       ⋅   -112.0   128.0
  -32.0    96.0   -96.0      ⋅       ⋅   -192.0   448.0  -256.0
     ⋅    -96.0   256.0  -192.0  -240.0   576.0  -336.0      ⋅ 
     ⋅       ⋅   -192.0   224.0   320.0  -384.0      ⋅       ⋅ 
     ⋅       ⋅   -240.0   320.0   368.0  -480.0      ⋅       ⋅ 
     ⋅   -192.0   576.0  -384.0  -480.0  1120.0  -672.0      ⋅ 
 -112.0   448.0  -336.0      ⋅       ⋅   -672.0  1536.0  -896.0
  128.0  -256.0      ⋅       ⋅       ⋅       ⋅   -896.0  1120.0
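
As a sanity check (a short sketch, not part of the original outputs), we can verify that the sparse backends reproduce the dense results computed earlier:

J_sparse = jacobian(f_sparse_vector, sparse_first_order_backend, x)
H_sparse = hessian(f_sparse_scalar, sparse_second_order_backend, x)
J_sparse ≈ J_dense  # expected: true
H_sparse ≈ H_dense  # expected: true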

Sparse preparation

In the examples above, we didn't use preparation. Sparse preparation is more costly than dense preparation, but it is even more essential. Indeed, once preparation is done, sparse differentiation is much faster than dense differentiation, because it makes fewer calls to the underlying function. The speedup becomes very visible in large dimensions.

n = 1000
jac_extras_dense = prepare_jacobian(f_sparse_vector, dense_first_order_backend, zeros(n))
jac_extras_sparse = prepare_jacobian(f_sparse_vector, sparse_first_order_backend, zeros(n))
@benchmark jacobian($f_sparse_vector, $jac_extras_dense, $dense_first_order_backend, $(randn(n)))
BenchmarkTools.Trial: 840 samples with 1 evaluation.
 Range (min … max):  4.866 ms … 25.719 ms   GC (min … max):  0.00% … 3.92%
 Time  (median):     5.462 ms               GC (median):    12.39%
 Time  (mean ± σ):   5.947 ms ±  1.750 ms   GC (mean ± σ):  12.54% ± 3.17%

    ▁█                                                     
  ▄▁███▇▅▁▆▄▇▆▅▄▅▄▅▄▄▄▄▁▄▄▄▄▄▄▄▄▁▄▄▄▄▁▄▁▁▄▄▆▆▄▄▄▄▄▁▁▄▁▄▁▄▅ ▇
  4.87 ms      Histogram: log(frequency) by time     14.2 ms <

 Memory estimate: 57.62 MiB, allocs estimate: 1011.
@benchmark jacobian($f_sparse_vector, $jac_extras_sparse, $sparse_first_order_backend, $(randn(n)))
BenchmarkTools.Trial: 10000 samples with 1 evaluation.
 Range (min … max):  25.738 μs … 2.480 ms   GC (min … max):  0.00% … 95.53%
 Time  (median):     33.242 μs               GC (median):     0.00%
 Time  (mean ± σ):   41.056 μs ± 80.365 μs   GC (mean ± σ):  12.64% ±  6.54%

   ▅█▄▂                                                     ▂
  ▇█████▇▇▆▆▆▅▄▄▃▁▁▁▁▁▁▃▁▄▄▁▁▁▁▁▁▁▁▁▁▃▁▁▁▁▁▃▁▁▁▁▁▁▃▁▁▃▁▁▅██ █
  25.7 μs      Histogram: log(frequency) by time       172 μs <

 Memory estimate: 399.66 KiB, allocs estimate: 26.
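
To avoid even the remaining allocations, the prepared sparse backend can be combined with the in-place operator. The following is a minimal sketch, assuming DifferentiationInterface's jacobian! operator with the same argument order as above; the output buffer comes from one out-of-place call.

x_new = randn(n)
# Allocate a compatible sparse output once via an out-of-place call...
J_buffer = jacobian(f_sparse_vector, jac_extras_sparse, sparse_first_order_backend, x_new)
# ...then overwrite it in place for subsequent inputs of the same size.
jacobian!(f_sparse_vector, J_buffer, jac_extras_sparse, sparse_first_order_backend, x_new)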