Simulating, Fast and Slow: Learning Policies for Black-Box Optimization

Abstract

In recent years, solving optimization problems involving black-box simulators has become a focus of the machine learning community, owing to the ubiquity of such simulators in science and engineering. A simulator describes a forward process from simulation parameters and input data to observations, and the goal of the optimization problem is to find parameters that minimize a desired loss function. Sophisticated optimization algorithms typically require gradients of the forward process with respect to the parameters. However, obtaining gradients from black-box simulators is often prohibitively expensive or, in some cases, impossible. Furthermore, in many applications practitioners aim to solve a set of related problems, so restarting the optimization from scratch each time is inefficient when the forward model is expensive to evaluate. To address these challenges, this paper introduces a novel method for solving classes of similar black-box optimization problems: it learns an active learning policy that guides the training of a differentiable surrogate, then uses the surrogate's gradients to optimize the simulation parameters with gradient descent. After the policy is trained, downstream optimization of problems involving black-box simulators requires up to ~90% fewer expensive simulator calls than baselines such as local surrogate-based approaches, numerical optimization, and Bayesian methods.
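To make the surrogate-gradient idea concrete, here is a minimal PyTorch sketch of the general pattern the abstract describes: query a black-box simulator, fit a differentiable surrogate to the queried pairs, and take gradient steps on the simulation parameters through the surrogate. The `simulator`, `loss_fn`, network sizes, and learning rates are illustrative stand-ins, and the paper's learned acquisition policy is replaced by a simple query-the-current-iterate rule (roughly the local-surrogate baseline), not the method itself.

```python
import torch
import torch.nn as nn

# Hypothetical black-box simulator: maps parameters psi to an observation.
def simulator(psi: torch.Tensor) -> torch.Tensor:
    return torch.sin(3.0 * psi) + 0.5 * psi ** 2  # stand-in forward process

def loss_fn(obs: torch.Tensor) -> torch.Tensor:
    return (obs - 1.0).pow(2).sum()  # stand-in downstream loss

# Differentiable surrogate of the forward process.
surrogate = nn.Sequential(nn.Linear(1, 64), nn.Tanh(), nn.Linear(64, 1))
surr_opt = torch.optim.Adam(surrogate.parameters(), lr=1e-2)

psi = torch.zeros(1, requires_grad=True)   # simulation parameters to optimize
psi_opt = torch.optim.SGD([psi], lr=1e-1)

data = []  # (parameter, observation) pairs gathered from the simulator

for step in range(50):
    # Acquisition: the paper's learned policy would propose the next query
    # here; as a placeholder we simply query the current iterate.
    query = psi.detach()
    data.append((query, simulator(query)))

    # Fit the surrogate on all pairs queried so far.
    for _ in range(20):
        surr_opt.zero_grad()
        xs = torch.stack([x for x, _ in data])
        ys = torch.stack([y for _, y in data])
        surr_loss = (surrogate(xs) - ys).pow(2).mean()
        surr_loss.backward()
        surr_opt.step()

    # Gradient step on psi through the differentiable surrogate.
    psi_opt.zero_grad()
    loss_fn(surrogate(psi.unsqueeze(0)).squeeze(0)).backward()
    psi_opt.step()
```

The design point the sketch highlights is that only the surrogate needs to be differentiable: every call to `simulator` is gradient-free, so the expensive simulator budget is spent only on the queries the acquisition step selects.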

Publication
arXiv preprint, 2024. Earlier workshop versions were accepted at the NeurIPS ReALML Workshop (2023) and the NeurIPS Deep Inverse Workshop (2023).
Tim Bakker
Senior machine learning researcher

My current research interests include AI safety, LLM reasoning, and reinforcement learning.
