Zhichen Zeng


409, Paul Allen Center

185 E Stevens Way NE, Seattle, WA

I am Zhichen Zeng 「曾郅琛」, a second-year PhD student at the University of Washington, advised by Prof. Ang Li and Prof. Banghua Zhu. My research focuses on developing efficient system support for LLMs.

Before joining UW, I received a Bachelor of Science in Physics from the University of Science and Technology of China (USTC), where I was honored to receive the Guo Moruo Scholarship, the highest honor for USTC undergraduates.

Previously, I had an enjoyable internship at Microsoft Research Asia, where I worked with Dr. Shijie Cao on efficient systems for long-context LLMs. I also worked with Prof. Zhiru Zhang at Cornell on domain-specific compilers for accelerator design.

Feel free to connect with me! :smiley:

news

Jan 22, 2026 Two papers accepted to ICLR 2026: the Tactic paper and the Local Linear Attention paper! Thanks to all coauthors :tada:
Jul 16, 2025 Our paper on the MHE-TPE micro-architecture has been accepted to MICRO’25! Congrats to all the coauthors!
May 23, 2025 Excited to join ByteDance Seed-Infra-Training, working with Ziheng and Haibin! :rocket:
Nov 03, 2024 Our Tensor Processing Engines paper has been accepted to HPCA’25 :blush:
Sep 05, 2024 Thrilled to share that I’ve completed my six-month internship at MSRA with an amazing team and was honored with the Stars of Tomorrow award! :tada: :tada:

selected publications

  1. arXiv
    DisagMoE: Computation-Communication Overlapped MoE Training via Disaggregated AF-Pipe Parallelism
    Zhichen Zeng, Chi-Chih Chang, Jiayi Wang, Zezhou Wang, and 9 more authors
    arXiv preprint, 2026
  2. arXiv
    Parallax: Parameterized Local Linear Attention for Language Modeling
    Yifei Zuo, Dhruv Pai, Zhichen Zeng, Alec Dewulf, and 2 more authors
    arXiv preprint, 2026
  3. NeurIPS 2025
    SeerAttention: Learning Intrinsic Sparse Attention in Your LLMs
    Yizhao Gao*, Zhichen Zeng*, Dayou Du, Shijie Cao, and 4 more authors
    Annual Conference on Neural Information Processing Systems, 2025
  4. PLDI 2024
    Allo: A Programming Model for Composable Accelerator Design
    Hongzheng Chen*, Niansong Zhang*, Shaojie Xiang, Zhichen Zeng, and 2 more authors
    ACM SIGPLAN Conference on Programming Language Design and Implementation, 2024

service

  • Artifact Evaluation Committee - MLSys 2025, ASPLOS 2025, HPCA 2025, MICRO 2024
  • Conference Reviewer - ICLR 2025, ACL 2025, NeurIPS 2024
  • Teaching Assistant - CSE 469: Computer Architecture, Spring 2025, UW