Zhichen Zeng

409, Paul Allen Center

185 E Stevens Way NE, Seattle, WA

I am Zhichen Zeng 「曾郅琛」, a second-year PhD student at the University of Washington, advised by Prof. Ang Li and Prof. Banghua Zhu. My research focuses on developing efficient system support for LLMs.

Before joining UW, I got the Bachelor of Science in Physics from University of Science and Technology of China (USTC), where I was honored to receive the Guo Moruo Scholarship—the highest honor for USTC undergrads.

Previously, I had a enjoyable internship at Microsoft Research Asia, where I worked with Dr. Shijie Cao on efficient systems for long-context LLMs. I worked with Prof. Zhiru Zhang at Cornell on domain-specific compilers for accelerator design.

Feel free to connect with me!

news

Jul 16, 2025	Our paper MHE-TPE micro-architecture has been accepted to MICRO’25! Congrats to all the coauthors!
May 23, 2025	Excited to join ByteDance Seed-Infra-Training, working with Ziheng and Haibin!
Nov 03, 2024	Our Tensor Processing Engines paper has been accepted to HPCA’25
Sep 05, 2024	Thrilled to share that I’ve completed my six-month intern at MSRA with an amazing team and honored with the Stars of Tomorrow award!
Aug 01, 2024	Our EN-Tensorcore paper has been accepted to ICCD’24

selected publications

NeurIPS 2025

SeerAttention: Learning Intrinsic Sparse Attention in Your LLMs

Yizhao Gao* , Zhichen Zeng*, Dayou Du , Shijie Cao , and 4 more authors

The Thirty-Ninth Annual Conference on Neural Information Processing Systems, 2024

PDF
ISCA 2025

LUT Tensor Core: Lookup Table Enables Efficient Low-Bit LLM Inference Acceleration

Zhiwen Mo , Lei Wang , Jianyu Wei , Zhichen Zeng, and 7 more authors

IEEE/ACM Annual International Symposium on Computer Architecture, 2025

PDF
MICRO 2025

MHE-TPE: Multi-Operand High-Radix Encoder for Mixed-Precision Fixed-Point Tensor Processing Engines

Qizhe Wu , Jinyi Zhou , Zhanhe Hu , Zhichen Zeng, and 9 more authors

IEEE/ACM International Symposium on Microarchitecture, 2025
PLDI 2024

Allo: A Programming Model for Composable Accelerator Design

Hongzheng Chen* , Niansong Zhang* , Shaojie Xiang , Zhichen Zeng, and 2 more authors

ACM SIGPLAN Conference on Programming Language Design and Implementation, 2024

PDF Code
HPCA 2025

Exploring the Performance Improvement of Tensor Processing Engines through Transformation in the Bit-weight Dimension of MACs

Qizhe Wu , Huawen Liang , Yuchen Gui , Zhichen Zeng, and 6 more authors

IEEE International Symposium on High-Performance Computer Architecture, 2025

PDF
ICCD 2024

EN-T: Optimizing Tensor Computing Engines Performance via Encoder-Based Methodology

Qizhe Wu , Yuchen Gui , Zhichen Zeng, Xiaotian Wang , and 2 more authors

IEEE 42nd International Conference on Computer Design, 2024

PDF

service

Artifact Evaluation Committee - MLSys 2025, ASPLOS 2025, HPCA 2025, MICRO 2024
Conference Reviewer - ICLR 2025, ACL 2025, NeurIPS 2024
Teaching Assistent - CSE 469: Computer Architecture, Spring 2025, UW