Fan Pu Zeng
ML Research Engineer at Jane Street based in New York

Hello! I am Fan Pu (曾繁朴), and I work on LLM research at Jane Street. I previously worked on training foundation models to be good at writing OCaml and at performing valuable trading tasks.
(2025-09-21) We are hiring - if this work sounds interesting to you and you have strong engineering skills, a solid machine learning background, and a keen interest in pushing the state of the art of LLMs/ML, please shoot me an email! It's currently an incredibly exciting time to be doing ML here.
I graduated with a B.S. (2022) and M.S. (2023) in Computer Science from Carnegie Mellon University.
At CMU, I was actively involved in the open-source programming assignment auto-grading platform Autolab from 2018-2023. I served as the Masters Student Liaison for the Singapore Students Association. I also used to play Capture-The-Flag (CTF) competitions with PPP. I previously interned at Jane Street, Meta, Asana, and Saleswhale (acquired 2022). I was a TA for 10-708 Probabilistic Graphical Models in the Spring 2023 semester.
My academic interests lie in understanding the theoretical foundations for the success of deep learning, particularly in optimization and generalization.
In my free time, I enjoy climbing, running, skiing, reading and learning new things, and writing things for my blog. I used to do sprint canoe competitively. If I have an extended break I enjoy traveling, especially hiking and exploring the great outdoors. Most of the banner pictures on my blog posts were taken during these hikes. My favorite classroom in CMU is GHC 4303.
I grew up in my hometown of Singapore before moving to the US for college and work. I try to go back and visit once a year.
Feel free to reach out to me at fzeng[at]alumni[dot]cmu[dot]edu. I am happy to chat and provide advice.
Regrettably, I am unable to provide referrals for people with whom I have not directly collaborated, as I cannot write you a meaningful recommendation.
I have a Technician amateur radio license, with callsign KC3UFE.
This blog was originally started on 24 June 2018, although it has taken many forms since then. All banner pictures on the blog are taken by yours truly!
Talks
Slides I developed for talks on various LLM-related topics. You are free to share, adapt, and reuse these materials, provided that you give appropriate credit.
- (2024-11-18) A Statistical Approach to Language Model Evaluations
- (2024-10-08) Advanced Retrieval Augmented Generation Techniques
- (2024-07-24) Superalignment, or how to train models smarter than us
- (2024-05-03) Rotary Positional Embeddings (RoPE)
- (2024-04-30) Parameter-Efficient Fine-Tuning
- (2024-03-01) Understanding Transformers
Starred Blog Posts
Some of my more popular posts:
Technical Posts
- Score-Based Diffusion Models
- Bounding Mixing Times of Markov Chains via the Spectral Gap
- An Intuitive Introduction to Gaussian Processes
- (Paper Summary) Zero-shot Image-to-Image Translation
- (Paper Summary) The Implicit Bias of Gradient Descent on Separable Data
- A Unified Framework for High-Dimensional Analysis of M-Estimators with Decomposable Regularizers: A Guided Walkthrough
- The Delightful Consequences of the Graph Minor Theorem
- Universal types, and your type checker doesn’t suck as much as you think
General
- The Art of LaTeX: Common Mistakes, and Advice for Typesetting Beautiful, Delightful Proofs
- Against Government Scholarships
- Notes On Founding A Startup To My Future Self
News
| Date | Update |
| --- | --- |
| May 10, 2025 | I thought the ReduNet paper was particularly novel after learning about it at ICLR and wrote a post about it: Neural Networks from Maximizing Rate Reduction |
| Apr 20, 2025 | I will be attending ICLR from 04/24-04/28. Let's talk if you're also in town! |
| Jan 21, 2025 | Completed a new post on what I think is an under-appreciated topic: An Intuitive Introduction to Gaussian Processes |
| Jan 12, 2025 | Wrote on the beautiful connection between how long a Markov Chain takes to mix (commonly used in MCMC methods in ML) and the spectral gap of its transition matrix: Bounding Mixing Times of Markov Chains via the Spectral Gap |
| Nov 29, 2024 | I will be at NeurIPS from 12/10-12/15. Let's chat if you're also there! |