About me

Hi! I’m Luning Wang (王麓宁), currently a first-year master’s student majoring in electrical and computer engineering.

I’m actively looking for (research/engineering) internship opportunities in the fields of LLMs, MLSys, and potentially other AI & data-science related areas! Please see my internship section for details.

🎓 Education

  • [08/2024~05/2026] M.S. Department of Electrical and Computer Engineering, University of Michigan
  • [09/2020~06/2024] B.E. Department of Electronic Engineering, Tsinghua University

💻 Internship

I have worked at several organizations, in both academia and industry. See my CV for more details about my work.

  • [02/2024~06/2024] Infinigence AI, Algorithm Research Intern. [Website]
  • [09/2023~01/2024] Bytedance Data-TnS, Algorithm Research Intern. [Website]
  • [07/2023~08/2023] HKU-IDS, Research Assistant. [Website]

I’m actively looking for (research/engineering) internship opportunities in the fields of LLMs, MLSys, and potentially other AI & data-science related areas!

  • Prospective: summer 2025 (May ~ August), full-time, remote or on-site. I am open to positions in either China or the United States. Please contact me if there’s an opportunity!

📖 Research

My past research has mainly focused on efficient algorithms for large language models, including compression and acceleration techniques for LLMs. I’m currently branching out into multimodal models and diffusion models. See my publications to learn more about my work.

  • [09/2022~06/2024] NICS-EFC, Tsinghua University. [Website]

I’m open to research collaboration opportunities in the fields of LLMs, multimodal models, MLSys, and potentially other AI & data-science related areas.

📝 Publications

  • [ENLSP NeurIPS Workshop’24] CSKV: Training-Efficient Channel Shrinking for KV Cache in Long-Context Scenarios. Luning Wang, Shiyao Li, Xuefei Ning, Zhihang Yuan, Shengen Yan, Guohao Dai, Yu Wang. [pdf] [github]
  • [(Under review)] A Survey on Efficient Inference for Large Language Models. Zixuan Zhou*, Xuefei Ning*, Ke Hong*, Tianyu Fu, Jiaming Xu, Shiyao Li, Yuming Lou, Luning Wang, Zhihang Yuan, Xiuhong Li, Shengen Yan, Guohao Dai, Xiao-Ping Zhang, Yuhan Dong, Yu Wang. [pdf]
  • [ICML’24] Evaluating Quantized Large Language Models. Shiyao Li, Xuefei Ning, Luning Wang, Tengxuan Liu, Xiangsheng Shi, Shengen Yan, Guohao Dai, Huazhong Yang, Yu Wang. [pdf] [github]
  • [ENLSP NeurIPS Workshop’23] LLM-MQ: Mixed-precision Quantization for Efficient LLM Deployment. Shiyao Li, Xuefei Ning, Ke Hong, Tengxuan Liu, Luning Wang, Xiuhong Li, Kai Zhong, Guohao Dai, Huazhong Yang, Yu Wang. [pdf]