Mixing data, optimization and decision

The role of automatic differentiation

Guillaume Dalle

LVMT

2025-06-13

A little about me

📜 CV

  • 2019-2022: PhD at CERMICS
  • 2022: visiting student at MIT
  • 2023-2024: postdoc at EPFL
  • 2025-????: researcher here!

💕 Passions

  • By day: applied mathematics, computer science
  • By night: board games, songwriting, musicals

Motivation

Road traffic

MATSim model of Paris (https://matsim.org/gallery/paris/)

Train routing

Flatland railway simulator

Mohanty et al. (2020)

Challenges

📊 Calibration

Fitting model parameters to explain observed data.

🏃 Acceleration

Replacing complicated models with simpler surrogates.

🗺️ Decision-making

Using these models to inform industrial or political choices.

We focus on simpler models

They are easier to analyze and optimize.

Path problems with congestion

Shortest paths

Most transportation problems live on a graph \(G = (V, E)\):

  • \(V\) is a set of vertices
  • \(E\) is a set of (weighted) edges

Shortest path = sequence of edges from \(a\) to \(b\) with minimum total cost.

  • Single agent: easy
  • Interacting agents: hard
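A minimal single-agent sketch in Python: Dijkstra's algorithm on a toy dictionary-of-dictionaries graph (the graph itself is made up).

    import heapq

    def dijkstra(graph, source, target):
        """Shortest path from source to target; graph[u][v] is the edge weight."""
        dist = {source: 0.0}
        parent = {}
        queue = [(0.0, source)]  # min-heap of (distance so far, vertex)
        while queue:
            d, u = heapq.heappop(queue)
            if u == target:
                break
            if d > dist.get(u, float("inf")):
                continue  # stale queue entry, already improved
            for v, w in graph[u].items():
                if d + w < dist.get(v, float("inf")):
                    dist[v] = d + w
                    parent[v] = u
                    heapq.heappush(queue, (dist[v], v))
        path, u = [target], target
        while u != source:  # walk parents back to the source
            u = parent[u]
            path.append(u)
        return list(reversed(path)), dist[target]

    graph = {"a": {"b": 1.0, "c": 4.0}, "b": {"c": 1.0}, "c": {}}
    print(dijkstra(graph, "a", "c"))  # (['a', 'b', 'c'], 2.0)

The single-agent case stays easy because this runs in near-linear time; the hard part begins when agents interact.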

Static traffic assignment (theory)

What if conflicts between agents cause slowdowns?

\[ t_e(f_e) = t_e^0 \left[1 + \alpha \left(\frac{f_e}{c_e}\right)^\beta \right] \enspace \text{with} \enspace \begin{cases} \text{$t_e$: travel time} \\ \text{$f_e$: edge flow} \\ \text{$c_e$: capacity}\end{cases}\]

Everyone behaves selfishly: find a Nash equilibrium.
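This is the classic BPR (Bureau of Public Roads) congestion curve. A minimal sketch with the customary coefficients \(\alpha = 0.15\) and \(\beta = 4\); the edge numbers are illustrative.

    def bpr_travel_time(flow, free_flow_time, capacity, alpha=0.15, beta=4.0):
        """Travel time on an edge as a function of the flow it carries."""
        return free_flow_time * (1.0 + alpha * (flow / capacity) ** beta)

    # an edge with a 10 min free-flow time and a capacity of 1000 vehicles/hour
    for flow in (0, 500, 1000, 2000):
        print(flow, round(bpr_travel_time(flow, 10.0, 1000.0), 2))
    # 0 -> 10.0, 500 -> 10.09, 1000 -> 11.5, 2000 -> 34.0: congestion bites past capacity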

Static traffic assignment (example)

Xu et al. (2024)

Static traffic assignment (calibration)

  • Model parameters: free flow times, street capacities
  • Input data: network structure, travel demand
  • Observed data: measured flows or speeds

Goal: estimate model parameters from the data.

Idea

During calibration, the TA problem is a subroutine.
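A minimal sketch of what this means, assuming a hypothetical solver solve_ta that maps capacities to equilibrium flows:

    import numpy as np

    def calibration_loss(capacities, observed_flows, solve_ta):
        """Least-squares mismatch between equilibrium and observed flows.

        solve_ta is a hypothetical TA solver: capacities -> equilibrium flows.
        Every evaluation of this loss requires a full equilibrium computation.
        """
        predicted_flows = solve_ta(capacities)
        return np.sum((predicted_flows - observed_flows) ** 2)

Minimizing this loss with gradient descent requires differentiating through solve_ta itself: that is the crux of what follows.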

Static traffic assignment (calibration)

Xu et al. (2024)

Can we do better than manual fine-tuning?

Grid search does not scale.

Multi-agent pathfinding (theory)

What if conflicts between agents are forbidden?

Everyone behaves selflessly: find a social optimum.

Wang et al. (2025)

Multi-agent pathfinding (example)

Stern et al. (2019)

Multi-agent pathfinding (acceleration)

MAPF is difficult: must we choose between optimality and speed?

\[ \xrightarrow[]{\text{Input}} \boxed{\text{Encoder}} \xrightarrow[]{\text{Guidance}} \boxed{\text{Fast solver}} \xrightarrow[]{\text{Solution}} \]

Goal: learn encoder parameters to guide the solver (sketched after the list), for instance through:

  • modified movement costs
  • custom priority ordering
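A minimal sketch of the guidance step, with a hypothetical linear encoder producing per-agent priority scores (all numbers are made up):

    import numpy as np

    rng = np.random.default_rng(0)
    features = rng.normal(size=(5, 3))    # 5 agents, 3 features each (toy data)
    weights = rng.normal(size=3)          # encoder parameters, to be learned

    scores = features @ weights           # encoder: features -> priority scores
    priority_order = np.argsort(-scores)  # guidance handed to a fast solver
    print(priority_order)

    # a prioritized planner would now route the agents one by one in this order;
    # learning `weights` end to end means differentiating through argsort,
    # a discrete operation -- hence the theoretical issues discussed below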

Idea

During encoder learning, the MAPF problem is a subroutine.

Both models (decision-making)

🚥 Static traffic assignment

Change the rules:

  • Set tolls
  • Close streets

🗺️ Multi-agent pathfinding

Change the rules:

  • Cancel trips
  • Adapt infrastructure

Idea

When we evaluate policy changes, the original problem is a subroutine.

Machine learning primer

Supervised learning

Imagine we have a dataset of image-label pairs \((x_i, y_i)\):

\[ (🐱, \texttt{cat}) \quad (🐶, \texttt{dog}) \quad (🦆, \texttt{duck}) \]

We want to recognize animals using a function

\[ f_p : x \longmapsto y \]

This function has parameters \(p\) which need to be set.

Losses and gradients

Parameters are set by minimizing a loss function

\[ \ell(p) = \sum_i \lvert f_p(x_i) - y_i \rvert^2 \]

The gradient \(\nabla \ell(p)\) points in the direction of steepest loss increase.

Taking small steps with \(p\) in the opposite direction = gradient descent.
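A minimal sketch on a one-parameter model \(f_p(x) = p \cdot x\), with the gradient worked out by hand (the dataset is made up):

    # toy dataset, roughly y = 2x
    xs = [1.0, 2.0, 3.0]
    ys = [2.1, 3.9, 6.2]

    def loss(p):
        return sum((p * x - y) ** 2 for x, y in zip(xs, ys))

    def grad(p):
        # d/dp of sum (p*x - y)^2 is sum 2*(p*x - y)*x
        return sum(2.0 * (p * x - y) * x for x, y in zip(xs, ys))

    p, step = 0.0, 0.01
    for _ in range(100):
        p -= step * grad(p)  # small step opposite the gradient
    print(round(p, 3))  # 2.036, the least-squares slope

Here the gradient fits on one line; deep learning automates this computation for millions of parameters.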

Deep learning

Neural networks are a flexible family of parametric functions.

Subroutines called layers can be assembled arbitrarily.

Gradient descent works because each individual layer can be differentiated automatically.

Vaswani et al. (2017)

The meaning of differentiation

Derivatives allow fast sensitivity analysis:

  • ➡️ for a given input, how much does it affect every output?
  • ⬅️ for a given output, how much is it affected by every input?

No need to evaluate small changes in every possible direction.
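A minimal sketch of the ➡️ direction using dual numbers, a toy version of forward-mode automatic differentiation (real AD systems are organized differently):

    import math
    from dataclasses import dataclass

    @dataclass
    class Dual:
        val: float  # the value
        dot: float  # its derivative along the chosen input direction

        def __add__(self, other):
            return Dual(self.val + other.val, self.dot + other.dot)

        def __mul__(self, other):
            # product rule: (uv)' = u'v + uv'
            return Dual(self.val * other.val,
                        self.dot * other.val + self.val * other.dot)

    def sin(x):
        return Dual(math.sin(x.val), math.cos(x.val) * x.dot)

    # f(x) = x * sin(x); exact derivative at x = 2 is sin(2) + 2 cos(2)
    x = Dual(2.0, 1.0)  # dot = 1: differentiate with respect to x
    y = x * sin(x)
    print(y.dot)                                # 0.07700...
    print(math.sin(2.0) + 2.0 * math.cos(2.0))  # matches to machine precision

One pass gives an exact directional sensitivity, with no step size to tune.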

Optimization as a layer

The gist

A strange concept

What if we could compute derivatives of transportation problems automatically?

  • Calibration: fit a large model with gradient descent
  • Acceleration: learn an encoder to improve fast solvers
  • Decision-making: solve bi-level optimization problems

Machine learning 🤝 constraint satisfaction

Parametric optimization problems

A parametric optimization problem has the form

\[ p \quad \longmapsto \quad \min_v c(v, p) \enspace \text{subject to} \enspace v \in \mathcal{C} \]

\(v\) is the decision variable, \(c\) the cost, \(v \in \mathcal{C}\) the constraints.

                     TA                  MAPF
Variable \(v\)       Traffic flows       Agent paths
Parameter \(p\)      Street capacities   Movement costs
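A minimal concrete instance of this map: \(v^*(p) = \arg\min_{0 \le v \le 1} \, (v - p)^2\), i.e. clipping \(p\) to \([0, 1]\). The sketch below compares its hand-derived derivative with finite differences, anticipating the next slide:

    def v_star(p):
        """Solution of min_{0 <= v <= 1} (v - p)^2: clip p to [0, 1]."""
        return min(max(p, 0.0), 1.0)

    def dv_star(p):
        """Hand-derived derivative: 1 inside the constraint set, 0 where it binds."""
        return 1.0 if 0.0 < p < 1.0 else 0.0

    eps = 1e-6
    for p in (-0.5, 0.3, 0.8, 1.5):
        fd = (v_star(p + eps) - v_star(p - eps)) / (2 * eps)
        print(p, dv_star(p), round(fd, 6))  # analytic and numerical agree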

Theoretical issues

                                                 TA             MAPF
Problem type                                     Continuous     Discrete
Derivative \(\frac{\partial v^*}{\partial p}\)   Well-defined   Ill-defined

Berthet et al. (2020)
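The perturbation trick of Berthet et al. (2020), sketched on the simplest discrete solver, a one-hot \(\arg\max\): averaging the solver's output over Gaussian perturbations of its input yields a smooth surrogate (Monte Carlo toy below).

    import numpy as np

    rng = np.random.default_rng(0)

    def argmax_onehot(scores):
        """A trivial discrete 'solver': piecewise constant in its input."""
        out = np.zeros_like(scores)
        out[np.argmax(scores)] = 1.0
        return out

    def perturbed_argmax(scores, sigma=1.0, n_samples=10_000):
        """Monte Carlo estimate of E[argmax_onehot(scores + sigma * Z)],
        a smooth function of the scores, amenable to gradient descent."""
        noise = rng.normal(size=(n_samples, scores.size))
        samples = [argmax_onehot(scores + sigma * z) for z in noise]
        return np.mean(samples, axis=0)

    scores = np.array([1.0, 1.2, 0.5])
    print(argmax_onehot(scores))     # [0. 1. 0.], jumps abruptly as scores move
    print(perturbed_argmax(scores))  # soft weights over the three choices

The same recipe applies when \(\arg\max\) is replaced by a combinatorial solver such as a shortest-path algorithm.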

Practical issues

👩‍🔬 Expertise

Can we reuse existing algorithms and make them differentiable?

💻 Hardware

Can we leverage modern parallel processors (GPUs) to speed things up?

📖 Open science

Can we do it all with open-source software and reproducible experiments?

Conclusion

Recent works

  • Solving large-scale transportation problems for logistics: Bouvier et al. (2023)
  • Unifying techniques for differentiable optimization layers: Dalle et al. (2022)
  • Developing software for automatic differentiation of complex programs: Dalle and Hill (2025), Hill and Dalle (2025), Montoison, Dalle, and Gebremedhin (2025)

Research perspectives

  • Methodological: combinatorial optimization, graph machine learning, automatic differentiation, game theory
  • Practical: large-scale transportation & logistics problems
  • Industrial partners: Renault, Michelin, Califrais, ART, SNCF

I want to know what you do!

Send me a paper you like at guillaume.dalle@enpc.fr

References

Berthet, Quentin, Mathieu Blondel, Olivier Teboul, Marco Cuturi, Jean-Philippe Vert, and Francis Bach. 2020. “Learning with Differentiable Perturbed Optimizers.” In Advances in Neural Information Processing Systems, 33:9508–19. Curran Associates, Inc. https://proceedings.neurips.cc/paper/2020/hash/6bb56208f672af0dd65451f869fedfd9-Abstract.html.
Bouvier, Louis, Guillaume Dalle, Axel Parmentier, and Thibaut Vidal. 2023. “Solving a Continent-Scale Inventory Routing Problem at Renault.” Transportation Science, October. https://doi.org/10.1287/trsc.2022.0342.
Boyles, Stephen D., Nicholas E. Lownes, and Avinash Unnikrishnan. 2025. “Transportation Network Analysis, Volume I: Static and Dynamic Traffic Assignment.” arXiv. https://doi.org/10.48550/arXiv.2502.05182.
Dalle, Guillaume, Léo Baty, Louis Bouvier, and Axel Parmentier. 2022. “Learning with Combinatorial Optimization Layers: A Probabilistic Approach.” arXiv. https://doi.org/10.48550/arXiv.2207.13513.
Dalle, Guillaume, and Adrian Hill. 2025. “A Common Interface for Automatic Differentiation.” arXiv. https://doi.org/10.48550/arXiv.2505.05542.
Hill, Adrian, and Guillaume Dalle. 2025. “Sparser, Better, Faster, Stronger: Sparsity Detection for Efficient Automatic Differentiation.” Transactions on Machine Learning Research. https://openreview.net/forum?id=GtXSN52nIW.
Mandi, Jayanta, James Kotary, Senne Berden, Maxime Mulamba, Victor Bucarey, Tias Guns, and Ferdinando Fioretto. 2024. “Decision-Focused Learning: Foundations, State of the Art, Benchmark and Future Opportunities.” Journal of Artificial Intelligence Research 80 (August): 1623–1701. https://doi.org/10.1613/jair.1.15320.
Mohanty, Sharada, Erik Nygren, Florian Laurent, Manuel Schneider, Christian Scheller, Nilabha Bhattacharya, Jeremy Watson, et al. 2020. “Flatland-RL: Multi-Agent Reinforcement Learning on Trains.” http://arxiv.org/abs/2012.05893.
Montoison, Alexis, Guillaume Dalle, and Assefaw Gebremedhin. 2025. “Revisiting Sparse Matrix Coloring and Bicoloring.” arXiv. https://doi.org/10.48550/arXiv.2505.07308.
Scardapane, Simone. 2024. “Alice’s Adventures in a Differentiable Wonderland, Volume I: A Tour of the Land.” arXiv. https://doi.org/10.48550/arXiv.2404.17625.
Stern, Roni, Nathan Sturtevant, Ariel Felner, Sven Koenig, Hang Ma, Thayne Walker, Jiaoyang Li, et al. 2019. “Multi-Agent Pathfinding: Definitions, Variants, and Benchmarks.” Proceedings of the International Symposium on Combinatorial Search 10 (1): 151–58. https://doi.org/10.1609/socs.v10i1.18510.
Vaswani, Ashish, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N Gomez, Łukasz Kaiser, and Illia Polosukhin. 2017. “Attention Is All You Need.” In Advances in Neural Information Processing Systems. Vol. 30. Curran Associates, Inc. https://proceedings.neurips.cc/paper_files/paper/2017/hash/3f5ee243547dee91fbd053c1c4a845aa-Abstract.html.
Wang, Shiyue, Haozheng Xu, Yuhan Zhang, Jingran Lin, Changhong Lu, Xiangfeng Wang, and Wenhao Li. 2025. “Where Paths Collide: A Comprehensive Survey of Classic and Learning-Based Multi-Agent Pathfinding.” arXiv. https://doi.org/10.48550/arXiv.2505.19219.
Xu, Xiaotong, Zhenjie Zheng, Zijian Hu, Kairui Feng, and Wei Ma. 2024. “A Unified Dataset for the City-Scale Traffic Assignment Model in 20 U.S. Cities.” Scientific Data 11 (1): 325. https://doi.org/10.1038/s41597-024-03149-8.