Hello! I am Fan Pu, and I work with large language models and machine learning at Jane Street. I recently graduated with an M.S. (2023) and a B.S. (2022) in Computer Science from Carnegie Mellon University.

At CMU, I was actively involved in the open-source programming assignment auto-grading platform Autolab from 2018 to 2023. I served as the Masters Student Liaison for the Singapore Students Association. I also used to compete in Capture-The-Flag (CTF) competitions with PPP. I previously interned at Jane Street, Meta, Asana, and Saleswhale (acquired 2022). I was a TA for 10-708 Probabilistic Graphical Models in the Spring 2023 semester.

My interests currently lie in large language models and deep learning theory. It is a really exciting time to be alive, and I can’t wait to see what the next few years will bring in ML. My eventual goal is to build a machine learning startup for the betterment of humanity and the world.

In my free time, I enjoy bouldering, K-pop dancing, running, reading and learning new things, writing for my blog, and watching anime. I used to do sprint canoe competitively. When I have an extended break, I enjoy traveling, especially hiking and exploring the great outdoors. Most of the banner pictures on my blog posts were taken during these hikes. My favorite classroom at CMU is GHC 4303.

I grew up in Singapore, my hometown, before moving to the US for college and work. I try to go back and visit once a year.

Feel free to reach out to me at fzeng[at]alumni[dot]cmu[dot]edu. I am happy to chat and share any of my experiences. Unfortunately, I am unable to provide referrals for people I have not worked with, as I cannot give them a strong recommendation.

I have a Technician amateur radio license, and my callsign is KC3UFE.

This blog was started on 24 June 2018, although it has taken many forms since then. I write about things that I find interesting and that might be helpful to other people, with the goal that readers learn something new from each post. Whenever possible, I try to have my posts reviewed by friends who are knowledgeable in the area to ensure that what is presented is factually accurate. Do let me know if you have any article suggestions. All banner pictures on the blog are taken by yours truly!

Nov 20, 2023 I’m really excited to be joining the AI Assistants team at Jane Street to work on large language models!
Oct 22, 2023 Read a really interesting paper on image translation via diffusion models this weekend and wrote a more-detailed-than-usual summary of it: Zero-shot Image-to-Image Translation
Sep 9, 2023 Wrote a pretty interesting summary with high-level proof sketches for The Implicit Bias of Gradient Descent on Separable Data
Sep 2, 2023 Wrote a tutorial on setting up the Japanese arcade rhythm game Sound Voltex at home.
Sep 1, 2023 Wrote a post on creating Trackback requests manually for static sites, motivated by my own usage.
Aug 26, 2023 Wrote one of my more interesting paper summaries: Efficiently Modeling Long Sequences with Structured State Spaces
Aug 21, 2023 Started full-time as a Linux Engineer at Jane Street!