Blogs
Markdown-backed posts grouped by topic.
Agentic Software
- Agentic Programming
What I learned about programming by trying to program agents — and why the OS analogy isn't a metaphor.
Film Reviews
- Hamnet
A review of Hamnet as a film where grief moves from memory to mechanism—bridging sparse history and the tragic interior pressure of Hamlet.
Machine Learning
- OT for generative modeling 3 — Diffusion as Maximum Likelihood Estimation
We derive the interpretation of physical diffusion as a Wasserstein gradient flow, a noise-spectrum decomposition of the KL divergence, diffusion models as MLE, and a first-principles analysis of flow-matching scalability. Honorable mentions to the Fokker-Planck equation, Anderson's theorem, the de Bruijn identity, and Tweedie's formula.
- OT for generative modeling 0 — the static perspective
Why we care about optimal transport (OT), the static (Kantorovich) definition of the Wasserstein distance, the linear-programming (Kantorovich-Rubinstein) dual formulation, and WGAN.
- OT for generative modeling 1 — the Wasserstein geometry
We construct the Wasserstein manifold from first principles: probability distributions as points, sample-space vector fields as tangent vectors, and the density-weighted inner product that endows optimal transport with a rich Riemannian geometry. Physics intuition for the Benamou-Brenier theorem, which unifies the static and Riemannian definitions.
- OT for generative modeling 2 — Wasserstein gradients and drifting models
We look at Kaiming Deng et al.'s Drifting Models, interpret the antisymmetric drifting field as a Wasserstein gradient flow on the reverse KL between kernel-smoothed distributions, and develop the connection to maximum likelihood estimation.
- Rollout likelihood generalization of maximum likelihood reinforcement learning
A highly principled, foundational RL paper with easily actionable changes; we derive the continuous generalization.