publications
\* indicates equal contribution
2026
- ICLR 2026. Tactic: Adaptive Sparse Attention with Clustering and Distribution Fitting for Long-Context LLMs. International Conference on Learning Representations, 2026.
- ICLR 2026. Local Linear Attention: An Optimal Interpolation of Linear and Softmax Attention for Test-Time Regression. International Conference on Learning Representations, 2026.
- arXiv. DisagMoE: Computation-Communication Overlapped MoE Training via Disaggregated AF-Pipe Parallelism. arXiv preprint, 2026.
- ISCA 2026. DICE: Enabling Efficient General-Purpose SIMT Execution with Statically Scheduled Coarse-Grained Reconfigurable Arrays. IEEE/ACM Annual International Symposium on Computer Architecture, 2026.
- Tech Report. Seed2.0 Model Card: Towards Intelligence Frontier for Real-World Complexity. ByteDance Technical Report, 2026.
2025
- NeurIPS 2025. SeerAttention: Learning Intrinsic Sparse Attention in Your LLMs. Annual Conference on Neural Information Processing Systems, 2025.
- ISCA 2025. LUT Tensor Core: Lookup Table Enables Efficient Low-Bit LLM Inference Acceleration. IEEE/ACM Annual International Symposium on Computer Architecture, 2025.
- MICRO 2025. MHE-TPE: Multi-Operand High-Radix Encoder for Mixed-Precision Fixed-Point Tensor Processing Engines. IEEE/ACM International Symposium on Microarchitecture, 2025.
- HPCA 2025. Exploring the Performance Improvement of Tensor Processing Engines through Transformation in the Bit-weight Dimension of MACs. IEEE International Symposium on High-Performance Computer Architecture, 2025.
- NeurIPS Workshop. SRT: Accelerating Reinforcement Learning via Speculative Rollout with Tree-Structured Cache. NeurIPS 2025 Workshop, 2025.
2024
- ICCD 2024. EN-T: Optimizing Tensor Computing Engines Performance via Encoder-Based Methodology. IEEE International Conference on Computer Design, 2024.
- J. Phys. D. Highly Stable and Fast Response Photodetector Based on Double Perovskite Cs2AgBiCl6 Crystals. Journal of Physics D: Applied Physics, 2024.