Daily Feed - 2026-03-16

1. Q-Learning for Controlled Diffusions

Bayraktar, Kara, Pradhan, Yuksel · Mar 2026

Develops a quantized Q-learning scheme for optimal control of diffusion processes, covering both discounted and ergodic cost criteria. Proves almost-sure convergence to near-optimal policies for finite MDP approximations, then shows these policies interpolate to near-optimal continuous-time controls. The ergodic result uses a vanishing-discount argument. The assumptions are mild (Lipschitz dynamics, non-degenerate noise, Lyapunov stability), and no knowledge of the system dynamics is required.

Why it matters: Rigorous bridge between tabular RL convergence theory and continuous-time stochastic control, with explicit near-optimality rates — the kind of theory that’s missing from most “RL for finance” papers.
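The overall recipe can be caricatured in a few lines: quantize the state space, simulate Euler–Maruyama transitions of the controlled diffusion, and run tabular Q-learning on the induced finite MDP. A minimal sketch, where the dynamics, cost, grid, and learning-rate schedule are all illustrative choices and not the paper's scheme:

```python
import numpy as np

# Illustrative 1-D controlled diffusion: dX_t = (a_t - X_t) dt + sigma dW_t,
# discounted quadratic running cost. All parameters are hypothetical.
rng = np.random.default_rng(0)
dt, sigma, beta = 0.1, 0.5, 0.5          # time step, noise level, discount rate
grid = np.linspace(-2.0, 2.0, 41)        # quantized state space
actions = np.array([-1.0, 0.0, 1.0])     # finite action set
gamma = np.exp(-beta * dt)               # per-step discount factor

def nearest(x):
    """Quantize a continuous state to the nearest grid index."""
    return int(np.abs(grid - x).argmin())

def step(x, a):
    """One Euler-Maruyama transition of the controlled diffusion."""
    return x + (a - x) * dt + sigma * np.sqrt(dt) * rng.standard_normal()

Q = np.zeros((len(grid), len(actions)))
x = 0.0
for n in range(200_000):
    i = nearest(x)
    # epsilon-greedy action selection on the quantized state
    j = rng.integers(len(actions)) if rng.random() < 0.1 else int(Q[i].argmin())
    cost = (x**2 + 0.1 * actions[j]**2) * dt
    x_next = step(x, actions[j])
    lr = 1.0 / (1.0 + 0.01 * n)          # decaying learning rate
    Q[i, j] += lr * (cost + gamma * Q[nearest(x_next)].min() - Q[i, j])
    x = x_next

policy = actions[Q.argmin(axis=1)]       # greedy policy on the grid
```

The paper's contribution is precisely that this kind of naive-looking loop, with the right quantization and step sizes, provably converges to near-optimal continuous-time controls.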


2. Microstructural Foundation of Rough Bergomi

Hager, Horst, Wagenhofer, Xu · Mar 2026

Derives the rough Bergomi model from first principles: a sequence of order-driven markets where Poisson-arriving orders have long-lasting volatility impact. Via C-tightness for càdlàg processes, the price-vol process converges weakly to a log-normal rough volatility model. Bonus: weak error rates via a Clark-Ocone formula for Poisson processes, turning the microstructure model into a viable alternative to classical rough vol simulation.

Why it matters: Answers “why is volatility rough?” directly from order flow mechanics — microstructure → rough vol, not the other way around.
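The scaling limit is the familiar log-normal rough volatility model. As a point of reference for what the microstructure sequence converges to, here is a naive direct simulation of a rough Bergomi variance path via a left-point Riemann-sum discretization of the Riemann–Liouville kernel; parameter values are illustrative, and the hybrid scheme of Bennedsen–Lunde–Pakkanen is the standard, more accurate alternative:

```python
import numpy as np

rng = np.random.default_rng(1)
H, eta, xi = 0.1, 1.5, 0.04        # Hurst index, vol-of-vol, base variance
n, T = 500, 1.0
dt = T / n
t = dt * np.arange(1, n + 1)

dW = np.sqrt(dt) * rng.standard_normal(n)
# kernel (t_i - s_j)^(H - 1/2) at left endpoints s_j = j*dt < t_i
K = np.zeros((n, n))
for i in range(n):
    K[i, : i + 1] = (t[i] - dt * np.arange(i + 1)) ** (H - 0.5)
W_tilde = np.sqrt(2 * H) * K @ dW  # Riemann-Liouville fractional BM path

# log-normal rough variance: v_t = xi * exp(eta*W_tilde_t - eta^2/2 * t^(2H))
v = xi * np.exp(eta * W_tilde - 0.5 * eta**2 * t ** (2 * H))
```

The weak error rates in the paper are what make the order-driven construction competitive with schemes like this one.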


3. Exact Equilibria in LQG Games with Endogenous Signals

Babichenko · Mar 2026

Gives the first exact equilibrium characterization of finite-player continuous-time LQG games where actions reshape opponents’ information. The key move: condition on primitive Brownian shocks (dynamic Harsanyi) instead of the physical state, collapsing the infinite belief hierarchy onto deterministic two-time kernels. This yields an explicit information wedge — a deterministic Volterra process that prices the marginal value of shifting opponents’ posteriors. The wedge vanishes iff signals are exogenous.

Why it matters: Four-decade-old problem. The Volterra information wedge gives a closed-form map from information structure to equilibrium — directly relevant to strategic trading with information leakage.
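The wedge is a deterministic Volterra process. Without the paper's exact kernels, a generic illustration of how such objects are computed is a linear Volterra equation of the second kind, w(t) = g(t) + ∫₀ᵗ K(t,s) w(s) ds, solved by simple time-stepping; the kernel and forcing below are hypothetical and chosen so the exact solution is w(t) = 1 + t:

```python
import numpy as np

def solve_volterra(g, K, ts):
    """Solve w(t) = g(t) + int_0^t K(t,s) w(s) ds by left-point quadrature.
    g: array of g(t_i); K(t, s): kernel accepting an array of s; ts: time grid."""
    n = len(ts)
    w = np.zeros(n)
    w[0] = g[0]                       # the integral term is empty at t_0
    for i in range(1, n):
        ds = np.diff(ts[: i + 1])     # step sizes on [0, t_i]
        ker = K(ts[i], ts[:i])        # K(t_i, s_j) at left endpoints
        w[i] = g[i] + np.sum(ker * w[:i] * ds)
    return w

# Hypothetical example: K(t,s) = exp(-(t-s)), g = 1  =>  w(t) = 1 + t exactly.
ts = np.linspace(0.0, 1.0, 1001)
w = solve_volterra(np.ones_like(ts), lambda t, s: np.exp(-(t - s)), ts)
```

Left-point quadrature converges at first order in the step size, which is ample for verifying a closed-form kernel like the one above.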


4. Transformers Trapped by Simplicity Bias on Boolean Functions

Peters et al. · ICLR 2026

Studies noise-robust learning: can transformers trained on noisy features recover the noiseless target? They succeed on sparse parities and majority, but systematically fail on random juntas — precisely when the optimal noisy-regime solution has lower Boolean sensitivity than the target. The failure is a simplicity bias trap: transformers converge to the low-sensitivity solution and cannot escape it. A sensitivity-penalizing loss term helps.

Why it matters: Sharp characterization of where transformers’ inductive bias actively hurts, framed through Boolean function analysis. Connects simplicity bias to noise stability in a falsifiable way.
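The sensitivity gap is easy to see concretely: average sensitivity — the expected number of single-bit flips that change a Boolean function's output — is maximal for parity and much lower for majority. A brute-force check on 7 bits:

```python
import itertools

def avg_sensitivity(f, n):
    """Average over all 2^n inputs of the number of bit flips that change f."""
    total = 0
    for x in itertools.product((0, 1), repeat=n):
        fx = f(x)
        for i in range(n):
            y = list(x)
            y[i] ^= 1                 # flip coordinate i
            if f(tuple(y)) != fx:
                total += 1
    return total / 2**n

n = 7
parity = lambda x: sum(x) % 2
majority = lambda x: int(sum(x) > n // 2)

s_par = avg_sensitivity(parity, n)    # = 7.0: every flip changes parity
s_maj = avg_sensitivity(majority, n)  # = 2.1875: only near-tied inputs are sensitive
```

Intuitively, a low-sensitivity fit is more stable under feature noise, which is exactly why the noisy-regime optimum can pull a transformer away from a high-sensitivity target.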


5. Tao & Davis: Mathematics Distillation Challenge

Tao, Davis · SAIR Foundation · Mar 2026

Terry Tao and Damek Davis launched the SAIR Foundation’s first competition: distill the 22 million true/false results from the Equational Theories Project into a 10KB “cheat sheet” that helps cheap open-source LLMs solve universal algebra problems. Without guidance, small models perform at coin-flip level; with the right prompt, they hit 55–60%. Stage 1 runs through April 20; Stage 2 will require proofs/counterexamples.

Why it matters: Operational test of whether mathematical knowledge can be compressed into prompts — essentially measuring the gap between “knows the answer” and “can explain the method.” Beautiful intersection of formalization, knowledge distillation, and AI reasoning.
