Recap: Let the Model Pick the Next Experiment — Aryan Deshwal's Bayesian Optimization Talk at Upper Bound 2026

[Header image: an abstract landscape of pink and green vertical columns forming peaks and valleys, like a search space to be optimized]

Part of my Upper Bound 2026 series: write-ups of the talks I caught at Amii's AI conference in Edmonton that I wanted to carry home. I sat through this one front to back. It's the most "math on the slides" of the bunch, and also the one that quietly changed how I think about my own time.

The Setup: You Can't Run Every Experiment

Here's the pain point Aryan Deshwal opened on, and it's one I feel constantly even though I'm not a materials scientist: you have a giant space of things you could try, each one costs something to test, and you cannot test them all.

His domain is designing materials — think tens of thousands of candidate molecular structures, each of which needs an expensive simulation or a physical experiment to evaluate. But swap "candidate material" for "hyperparameter config," "prompt variant," "feature flag combination," or "which of forty ideas to ship this quarter," and it's the same shape. The search space is huge, feedback is expensive, and brute force is off the table.

So the question becomes: given everything you've learned so far, what's the single most valuable experiment to run next? That's the whole talk. That's Bayesian optimization.

Who He Is

Aryan Deshwal is an assistant professor in Computer Science & Engineering at the University of Minnesota (he joined in 2024, after a PhD at Washington State). His research line is exactly this: sample-efficient optimization and experiment design when each data point is costly. The frameworks below come straight from that body of work — these aren't napkin sketches, they're peer-reviewed.

The Core Loop

Bayesian optimization is a loop with three moving parts:

graph LR
    subgraph "The Bayesian Optimization Loop"
    M[Build Probabilistic Model] --> A[Optimize Acquisition Function]
    A --> E[Run Expensive Experiment]
    E --> M
    end

1. Build a probabilistic model of the thing you're optimizing: your current best guess of "input → outcome," with uncertainty attached. Usually a Gaussian process. The uncertainty is the important part; the model knows what it doesn't know.
2. Optimize an acquisition function: a cheap-to-evaluate score that ranks every candidate by "how worth-it is it to try this next?" You maximize this instead of the real (expensive) objective.
3. Run the experiment the acquisition function picked, observe the result, and fold it back into the model.

Then repeat. Each loop the model gets sharper and the next pick gets smarter. The magic is entirely in step 2 — how you decide what's "worth it."
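To make the loop concrete, here is a minimal sketch of my own, not code from the talk. It assumes a one-dimensional pool of 101 candidates, a made-up `expensive_experiment` with its peak at x = 0.7, a hand-rolled zero-mean Gaussian process with a squared-exponential kernel, and UCB-style scoring. The kernel length, jitter, and `beta` are illustrative guesses. It is dependency-free on purpose; for real work you would reach for a GP library instead.

```python
import math

def kernel(a, b, length=0.15):
    """Squared-exponential kernel: nearby inputs are assumed to behave alike."""
    return math.exp(-0.5 * ((a - b) / length) ** 2)

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - tail) / M[i][i]
    return x

def posterior(X, y, x_new, jitter=1e-4):
    """Step 1: zero-mean GP posterior mean and std at one candidate.

    Re-solving the system per candidate is wasteful but keeps the sketch short.
    """
    K = [[kernel(a, b) + (jitter if i == j else 0.0)
          for j, b in enumerate(X)] for i, a in enumerate(X)]
    k = [kernel(a, x_new) for a in X]
    mean = sum(ki * wi for ki, wi in zip(k, solve(K, y)))
    var = kernel(x_new, x_new) - sum(ki * vi for ki, vi in zip(k, solve(K, k)))
    return mean, math.sqrt(max(var, 0.0))

def expensive_experiment(x):
    """Hypothetical stand-in for a simulation or lab run; peak at x = 0.7."""
    return -(x - 0.7) ** 2

def bayes_opt(pool, n_rounds=8, beta=2.0):
    X = [pool[0], pool[len(pool) // 2], pool[-1]]  # a few seed experiments
    y = [expensive_experiment(x) for x in X]
    for _ in range(n_rounds):
        untested = [x for x in pool if x not in X]

        def score(x):
            # Step 2: cheap acquisition score (predicted value plus uncertainty bonus).
            mean, std = posterior(X, y, x)
            return mean + beta * std

        x_next = max(untested, key=score)
        # Step 3: pay for one real evaluation and fold it back into the data.
        X.append(x_next)
        y.append(expensive_experiment(x_next))
    best = max(range(len(X)), key=lambda i: y[i])
    return X, y, X[best], y[best]

pool = [i / 100 for i in range(101)]
X, y, best_x, best_y = bayes_opt(pool)
print(f"{len(X)} evaluations out of {len(pool)} candidates, best x = {best_x:.2f}")
```

The point of the sketch is the budget: eleven real evaluations out of a pool of 101, with every other candidate scored only by the cheap model.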

The Heart of It: Exploit + Explore

The acquisition function Deshwal used to make this concrete is UCB — Upper Confidence Bound:

AF(x) = μ(x) + β · σ(x)

Two terms, and the whole personality of the search lives in the tension between them:

μ(x) — exploitation. The model's predicted value at x. Chase this alone and you greedily re-test near your current best, never looking around.
σ(x) — exploration. The model's uncertainty at x. Chase this alone and you wander into the unknown, ignoring everything you've learned.
β — the dial. How much you reward uncertainty. Crank it up, you explore; turn it down, you exploit.

This is the explore/exploit tradeoff, and once you see it you see it everywhere — it's the same dilemma as the multi-armed bandit, as deciding whether to reorder your favorite dish or try the new one, as whether to deepen the codebase you know or go learn the one you don't. BO just makes the tradeoff explicit and tunable.
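A tiny sketch of the dial in action, using made-up numbers rather than anything from the talk: two hypothetical candidates, one close to what the model already knows (high predicted value, low uncertainty) and one in unexplored territory (lower predicted value, high uncertainty). The names and values are assumptions for illustration only.

```python
def ucb(mean, std, beta):
    """Upper Confidence Bound: predicted value plus a bonus for uncertainty."""
    return mean + beta * std

# Hypothetical model outputs for two candidates (invented numbers).
candidates = {
    "familiar": {"mean": 0.80, "std": 0.05},    # near the current best
    "unexplored": {"mean": 0.50, "std": 0.40},  # the model is unsure here
}

def pick(beta):
    """Return the name of the candidate with the highest UCB score."""
    return max(candidates, key=lambda name: ucb(beta=beta, **candidates[name]))

for beta in (0.0, 0.5, 1.0, 2.0):
    print(f"beta={beta}: pick the {pick(beta)} candidate")
```

With these numbers the two scores cross where 0.80 + 0.05β = 0.50 + 0.40β, a little below β = 0.86. Below that the search stays with the familiar candidate; above it, the uncertainty bonus wins.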

When You Want More Than One Thing

Real problems are rarely single-objective. You want a material that's cheap and strong and stable. Those trade off against each other, so there's no single winner — there's a Pareto frontier, the set of options where you can't improve one property without sacrificing another.
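For a feel of what "Pareto frontier" means in practice, here is a small sketch over five invented materials. The names and scores are hypothetical; cost is negated so that every objective is "bigger is better."

```python
def dominates(a, b):
    """True if a is at least as good as b everywhere and strictly better somewhere."""
    return (all(x >= y for x, y in zip(a, b))
            and any(x > y for x, y in zip(a, b)))

def pareto_front(points):
    """Keep only the candidates that no other candidate dominates."""
    return {name: p for name, p in points.items()
            if not any(dominates(q, p) for q in points.values())}

# Hypothetical materials scored as (-cost, strength, stability).
materials = {
    "A": (-2, 5, 6),
    "B": (-5, 9, 7),
    "C": (-3, 4, 5),  # worse than A on every axis
    "D": (-6, 8, 6),  # worse than B on every axis
    "E": (-4, 7, 9),
}

front = pareto_front(materials)
print(sorted(front))
```

C and D drop out because something else beats them on every axis. A, B, and E all survive: choosing among them is a judgment call about which property you are willing to give up.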

This is where Deshwal's own contributions come in. Instead of an acquisition function that asks "what's the highest predicted value," you use an information-theoretic one that asks: "which experiment would teach me the most about where the Pareto frontier actually is?" You're not optimizing the objective directly — you're optimizing your knowledge of the frontier, and letting good choices fall out of that. His framing that stuck with me: quantify your uncertainty about the unknown, then reason about the information value of each possible action.
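One hedged way to write that idea down, in the general form I understand the entropy-search papers listed further down to use (the notation here is mine, not copied from the slides): score a candidate x by how much observing its outcome y_x is expected to shrink your uncertainty about the Pareto front F*, given the data D collected so far.

```latex
\alpha(x) \;=\; I\bigl(y_x \,;\, \mathcal{F}^{*} \mid D\bigr)
        \;=\; H\bigl(y_x \mid D\bigr)
        \;-\; \mathbb{E}_{\mathcal{F}^{*}}\!\left[\, H\bigl(y_x \mid D, \mathcal{F}^{*}\bigr) \right]
```

The first term is how unsure the model is about the outcome at x right now. The second is how unsure it would still be if it already knew the front. A large gap means the experiment tells you a lot about where the frontier is.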

Does It Actually Win? Yes.

The result that sold it, as I remember it, was a nanoporous-materials design problem: searching tens of thousands of candidate structures for the ones with the best gas-storage capacity. He pitted Bayesian optimization against a standard evolutionary optimizer (CMA-ES) and against random search. BO found the good candidates in dramatically fewer expensive evaluations. When every evaluation is a simulation or a lab run, "fewer evaluations to the same answer" is the entire game.
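To get a feel for the baseline BO has to beat, here is a back-of-the-envelope sketch of random search. The pool size and the number of "good" candidates are numbers I made up, not figures from the talk or the paper. If you test candidates in random order without repeats, the expected number of evaluations before you hit any of the good ones is (N + 1) / (k + 1), and a quick simulation agrees.

```python
import random

def expected_random_evals(n_candidates, n_good):
    """Expected draws, without replacement, until random search hits a good candidate."""
    return (n_candidates + 1) / (n_good + 1)

def simulate_random_search(n_candidates, n_good, trials=2000, seed=0):
    """Monte Carlo check: average position of the first good candidate in a random order."""
    rng = random.Random(seed)
    total = 0
    for _ in range(trials):
        # 1-based positions where the good candidates land in a random test order.
        positions = rng.sample(range(1, n_candidates + 1), n_good)
        total += min(positions)
    return total / trials

# Hypothetical pool size and number of "good enough" structures.
N, GOOD = 50_000, 100
print(f"closed form: {expected_random_evals(N, GOOD):.0f} evaluations")
print(f"simulated:   {simulate_random_search(N, GOOD):.0f} evaluations")
```

With these invented numbers the closed form comes to about 495 evaluations just to land on the first top-100 candidate, which is the kind of bill a smarter picker is trying to avoid.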

That's the payoff of treating "what should I measure next" as a problem worth being smart about, rather than just gridding the space or trusting intuition.

Why It Stuck With Me

I write a lot about pairing with AI agents, and this talk reframed something for me. Most of us run our agent sessions like random search or pure exploitation: we either try whatever's in front of us, or we keep grinding on the approach we already trust. Bayesian optimization is a discipline for the messy middle — spend your next expensive action where it buys the most information.

Three things I'm keeping:

Uncertainty is a feature, not an embarrassment. The model is good because it tracks what it doesn't know. The honest "I'm not sure here" is what makes the next pick smart.
Optimize the decision, not the objective. The clever move is scoring candidate experiments, cheaply, instead of trying to shortcut the expensive thing itself.
Explore/exploit is a dial you can actually turn. β isn't just theory — it's a real knob, and naming it changes how you budget your own attention.

The Research Base

None of this is hand-waving — it's a well-developed corner of machine learning. Where to read more:

The friendly on-ramp: Peter Frazier, "A Tutorial on Bayesian Optimization" (arXiv:1807.02811, 2018). The clearest single intro.
The survey: Shahriari, Swersky, Wang, Adams & de Freitas, "Taking the Human Out of the Loop: A Review of Bayesian Optimization" (Proceedings of the IEEE, 2016).
The ML-tuning classic: Snoek, Larochelle & Adams, "Practical Bayesian Optimization of Machine Learning Algorithms" (NeurIPS 2012) — BO for hyperparameter search.
The UCB / explore-exploit lineage: Srinivas, Krause, Kakade & Seeger, "Gaussian Process Optimization in the Bandit Setting" (GP-UCB, ICML 2010), building on Auer, Cesa-Bianchi & Fischer's "Finite-time Analysis of the Multiarmed Bandit Problem" (UCB1, Machine Learning, 2002).
Deshwal's multi-objective work: Belakaria, Deshwal & Doppa, "Max-value Entropy Search for Multi-Objective Bayesian Optimization" (NeurIPS 2019); "Uncertainty-Aware Search Framework for Multi-Objective Bayesian Optimization" (AAAI 2020); and the journal version, "Output Space Entropy Search Framework for Multi-Objective Bayesian Optimization" (JAIR, 2021).
The materials result: Deshwal, Simon & Doppa, "Bayesian optimization of nanoporous materials" (Molecular Systems Design & Engineering, 2021).

The One-Line Takeaway

You will never have the budget to try everything. Bayesian optimization is the formal answer to "so what do I try next?" — and the answer is: whatever your current uncertainty says will teach you the most. Even if you never write the math, that instinct is worth stealing.


Header photo by Maxim Berg on Unsplash.

Content on this blog was created using human and AI-assisted workflows described here. Original ideas and editorial decisions by Justin Quaintance.