publications

* indicates equal contribution

2026

ICLR 2026

Tactic: Adaptive Sparse Attention with Clustering and Distribution Fitting for Long-Context LLMs

Kan Zhu*, Tian Tang*, Qinyu Xu*, Yile Gu, and 6 more authors

International Conference on Learning Representations, 2026

PDF
ICLR 2026

Local Linear Attention: An Optimal Interpolation of Linear and Softmax Attention for Test-Time Regression

Yifei Zuo, Yutong Yin, Zhichen Zeng, Ang Li, and 2 more authors

International Conference on Learning Representations, 2026

PDF
arXiv

DisagMoE: Computation-Communication Overlapped MoE Training via Disaggregated AF-Pipe Parallelism

Zhichen Zeng, Chi-Chih Chang, Jiayi Wang, Zezhou Wang, and 9 more authors

arXiv preprint, 2026

PDF
arXiv

Parallax: Parameterized Local Linear Attention for Language Modeling

Yifei Zuo, Dhruv Pai, Zhichen Zeng, Alec Dewulf, and 2 more authors

arXiv preprint, 2026

PDF
ISCA 2026

DICE: Enabling Efficient General-Purpose SIMT Execution with Statically Scheduled Coarse-Grained Reconfigurable Arrays

Jiayi Wang, Ang Da Lu, Zhichen Zeng, and Ang Li

IEEE/ACM Annual International Symposium on Computer Architecture, 2026

PDF
Tech Report

Seed2.0 Model Card: Towards Intelligence Frontier for Real-World Complexity

ByteDance Seed

ByteDance Technical Report, 2026

2025

NeurIPS 2025

SeerAttention: Learning Intrinsic Sparse Attention in Your LLMs

Yizhao Gao*, Zhichen Zeng*, Dayou Du, Shijie Cao, and 4 more authors

Annual Conference on Neural Information Processing Systems, 2025

PDF
ISCA 2025

LUT Tensor Core: Lookup Table Enables Efficient Low-Bit LLM Inference Acceleration

Zhiwen Mo, Lei Wang, Jianyu Wei, Zhichen Zeng, and 7 more authors

IEEE/ACM Annual International Symposium on Computer Architecture, 2025

PDF
MICRO 2025

MHE-TPE: Multi-Operand High-Radix Encoder for Mixed-Precision Fixed-Point Tensor Processing Engines

Qizhe Wu, Jinyi Zhou, Zhanhe Hu, Zhichen Zeng, and 9 more authors

IEEE/ACM International Symposium on Microarchitecture, 2025
HPCA 2025

Exploring the Performance Improvement of Tensor Processing Engines through Transformation in the Bit-weight Dimension of MACs

Qizhe Wu, Huawen Liang, Yuchen Gui, Zhichen Zeng, and 6 more authors

IEEE International Symposium on High-Performance Computer Architecture, 2025

PDF
NeurIPS Workshop

SRT: Accelerating Reinforcement Learning via Speculative Rollout with Tree-Structured Cache

Chi-Chih Chang, Siqi Zhu, Zhichen Zeng, Haibin Lin, and 4 more authors

NeurIPS 2025 Workshop, 2025

PDF

2024

PLDI 2024

Allo: A Programming Model for Composable Accelerator Design

Hongzheng Chen*, Niansong Zhang*, Shaojie Xiang, Zhichen Zeng, and 2 more authors

ACM SIGPLAN Conference on Programming Language Design and Implementation, 2024

PDF Code
ICCD 2024

EN-T: Optimizing Tensor Computing Engines Performance via Encoder-Based Methodology

Qizhe Wu, Yuchen Gui, Zhichen Zeng, Xiaotian Wang, and 2 more authors

IEEE International Conference on Computer Design, 2024

PDF
J. Phys. D

Highly Stable and Fast Response Photodetector Based on Double Perovskite Cs2AgBiCl6 Crystals

Zhengyu Han*, Mengjia Dai*, Zhichen Zeng*, Chunhui Ye, and 4 more authors

Journal of Physics D: Applied Physics, 2024

PDF