Snakamoto

Research Feed

  • 2026-03-23

    Superlinear SGD noise-curvature power law reframes implicit regularization, Bayesian optimality of selective SSMs over Transformers for in-context learning, SLAY's physics-inspired spherical linear attention via Bernstein's theorem, and Rigollet's mean-field PDE framework for transformer training dynamics.

  • 2026-03-16

    Q-learning for controlled diffusions with near-optimality rates, a microstructural derivation of rough Bergomi from order flow, exact LQG equilibria with endogenous signals and Volterra information wedges, transformers trapped by simplicity bias on Boolean functions, and Tao & Davis launch a mathematics distillation challenge.

  • 2026-03-09

    Softmax gradient flow polarization explains attention sinks, f-divergence policy gradients fix exponential blowup, a 524M-param foundation model for trade microstructure, and Riemannian geometry reveals the optimal AMM rebalancing path.

  • 2026-02-28

    Research highlights: A Model-Free Universal AI, Conformalized Neural Networks for Federated Uncertainty Quantification under Dual Heterogeneity, and Mean Estimation from Coarse Data: Characterizations and Efficient Algorithms (+2 more).

  • 2026-02-27

    A set of recent papers and talks on model agreement, risk-aware POMDP evaluation, and viscous HJB control, with direct implications for practical learning systems.

  • 2026-02-26

    Three transport-focused ML/finance papers on optimal transport, diffusion dynamics, and risk-adjusted prediction, plus two optimal-transport talks.

  • 2026-02-25

    This cycle highlights discrete diffusion control advances, distributionally robust online learning, and gap-dependent reinforcement learning guarantees, plus two transport-focused videos.
