Zhichen Zeng

409, Paul Allen Center

185 E Stevens Way NE, Seattle, WA

I am Zhichen Zeng 「曾郅琛」, a first-year PhD student at the University of Washington, advised by Prof. Ang Li and working closely with Prof. Baris Kasikci. Before joining UW, I received my bachelor's degree in Physics from USTC, where I was honored to receive the Guo Moruo Scholarship, the highest honor for USTC undergrads.

Previously, I had an enjoyable internship at Microsoft Research Asia, where I worked with Dr. Shijie Cao on efficient systems for long-context LLMs. I also worked with Prof. Zhiru Zhang at Cornell on domain-specific compilers for accelerator design.

Feel free to connect with me! :smiley:

news

Feb 23, 2025 Excited to join ByteDance Seed MLSys Team as a Research Scientist Intern :rocket:
Nov 03, 2024 Our Tensor Processing Engines paper has been accepted to HPCA’25 :blush:
Sep 05, 2024 Thrilled to share that I’ve completed my six-month internship at MSRA with an amazing team and was honored with the Stars of Tomorrow award! :tada: :tada:
Aug 01, 2024 Our EN-Tensorcore paper has been accepted to ICCD’24 :blush:
Apr 20, 2024 Awarded the 43rd Guo Moruo Scholarship (highest honor for USTC undergrads) :tada: :tada:

selected publications

  1. Under Review
    SeerAttention: Learning Intrinsic Sparse Attention in Your LLMs
    Yizhao Gao*, Zhichen Zeng*, Dayou Du, Shijie Cao, and 4 more authors
    arXiv, 2024
  2. Under Review
    Tactic: Adaptive Sparse Attention with Clustering and Distribution Fitting for Long-Context LLMs
    Kan Zhu, Tian Tang, Qinyu Xu, Yile Gu, and 6 more authors
    arXiv, 2025
  3. PLDI
    Allo: A Programming Model for Composable Accelerator Design
    Hongzheng Chen*, Niansong Zhang*, Shaojie Xiang, Zhichen Zeng, and 2 more authors
    2024 ACM SIGPLAN Conference on Programming Language Design and Implementation, 2024
  4. HPCA
    Exploring the Performance Improvement of Tensor Processing Engines through Transformation in the Bit-weight Dimension of MACs
    Qizhe Wu, Huawen Liang, Yuchen Gui, Zhichen Zeng, and 6 more authors
    2025 IEEE International Symposium on High-Performance Computer Architecture, 2025
  5. ICCD
    EN-T: Optimizing Tensor Computing Engines Performance via Encoder-Based Methodology
    Qizhe Wu, Yuchen Gui, Zhichen Zeng, Xiaotian Wang, and 2 more authors
    IEEE 42nd International Conference on Computer Design, 2024
  6. ISCA
    LUT Tensor Core: Lookup Table Enables Efficient Low-Bit LLM Inference Acceleration
    Zhiwen Mo, Lei Wang, Jianyu Wei, Zhichen Zeng, and 7 more authors
    2025 IEEE/ACM Annual International Symposium on Computer Architecture, 2025

service

  • Artifact Evaluation Committee - MLSys 2025, ASPLOS 2025, HPCA 2025
  • Conference Reviewer - ICLR 2025, NeurIPS 2024