About me

I’m currently a senior at the Department of Electronic Engineering, Tsinghua University. In Fall 2024, I will attend the University of Michigan to pursue a master’s degree in the Department of Electrical and Computer Engineering.

📖 Research

My past research mainly focuses on model compression and acceleration for large language models.

  • [09/2022~06/2024] Nanoscale Integrated Circuits and System Lab, Energy Efficient Computing Group (NICS-EFC). [Website]
  • [07/2023~08/2023] HKU Musketeers Foundation Institute of Data Science (HKU-IDS). [Website]

💻 Internship

  • [09/2023~01/2024] ByteDance Data-TnS, Algorithm Intern. [Website]
  • [02/2024~06/2024] Infinigence AI, Research Intern. [Website]

📝 Publications

  • [arXiv’24] A Survey on Efficient Inference for Large Language Models. Zixuan Zhou, Xuefei Ning, Ke Hong, Tianyu Fu, Jiaming Xu, Shiyao Li, Yuming Lou, Luning Wang, Zhihang Yuan, Xiuhong Li, Shengen Yan, Guohao Dai, Xiao-Ping Zhang, Yuhan Dong, Yu Wang. [pdf]
  • [ICML 2024] Evaluating Quantized Large Language Models. Shiyao Li, Xuefei Ning, Luning Wang, Tengxuan Liu, Xiangsheng Shi, Shengen Yan, Guohao Dai, Huazhong Yang, Yu Wang. [pdf] [github]
  • [ENLSP NeurIPS Workshop 2023] LLM-MQ: Mixed-precision Quantization for Efficient LLM Deployment. Shiyao Li, Xuefei Ning, Ke Hong, Tengxuan Liu, Luning Wang, Xiuhong Li, Kai Zhong, Guohao Dai, Huazhong Yang, Yu Wang. [pdf] [poster]