About me

Hi! I’m Luning Wang (王麓宁), currently a Master’s student at the University of Michigan. Before that, I received my Bachelor’s degree from Tsinghua University.

I’m actively looking for full-time research/engineering opportunities in LLMs, MLSys, and potentially other AI-related fields. I expect to graduate in 05/2026 and plan to work in Hong Kong / Singapore / Mainland China. Feel free to contact me via email if there’s an opportunity!

🎓 Education

[08/2024~05/2026] M.S. Department of Electrical and Computer Engineering, University of Michigan
[09/2020~06/2024] B.Eng. Department of Electronic Engineering, Tsinghua University

💻 Internship

I have worked at several organizations in both academia and industry. See my CV for more details.

[02/2024~06/2024] Infinigence AI, Algorithm Research Intern. [Website]
[09/2023~01/2024] ByteDance Data-TnS, Algorithm Research Intern. [Website]
[07/2023~08/2023] HKU-IDS, Research Assistant. [Website]

📖 Research

My past research mainly focused on efficient algorithms for large language models, including compression and acceleration techniques. See my publications to learn more about my work.

[09/2022~06/2024] NICS-EFC, Tsinghua University. [Website]

Beyond that, I also work on MLSys, LLM reasoning, LLM agents, biomedical LLMs, etc.

I’m open to discussion and collaboration, such as cooperating on academic papers or contributing to open-source projects. Feel free to drop me an email or send me a message on LinkedIn!

📝 Publications

[(Preprint, Under review)] MedPlan: A Two-Stage RAG-Based System for Personalized Medical Plan Generation. Hsin-Ling Hsu*, Cong-Tinh Dao*, Luning Wang, Zitao Shuai, Thao Nguyen Minh Phan, Jun-En Ding, Chun-Chieh Liao, Pengfei Hu, Xiaoxue Han, Chih-Ho Hsu, Dongsheng Luo, Wen-Chih Peng, Feng Liu, Fang-Ming Hung, Chenwei Wu. [pdf]
[ICLR’25] Dynamic Modeling of Patients, Modalities and Tasks via Multi-modal Multi-task Mixture of Experts. Chenwei Wu*, Zitao Shuai*, Zhengxu Tang*, Luning Wang, Liyue Shen. [pdf]
[NeurIPS ENLSP Workshop’24] CSKV: Training-Efficient Channel Shrinking for KV Cache in Long-Context Scenarios. Luning Wang, Shiyao Li, Xuefei Ning, Zhihang Yuan, Shengen Yan, Guohao Dai, Yu Wang. [pdf] [github]
[(Preprint, Under review)] A Survey on Efficient Inference for Large Language Models. Zixuan Zhou*, Xuefei Ning*, Ke Hong*, Tianyu Fu, Jiaming Xu, Shiyao Li, Yuming Lou, Luning Wang, Zhihang Yuan, Xiuhong Li, Shengen Yan, Guohao Dai, Xiao-Ping Zhang, Yuhan Dong, Yu Wang. [pdf]
[ICML’24] Evaluating Quantized Large Language Models. Shiyao Li, Xuefei Ning, Luning Wang, Tengxuan Liu, Xiangsheng Shi, Shengen Yan, Guohao Dai, Huazhong Yang, Yu Wang. [pdf] [github]
[NeurIPS ENLSP Workshop’23] LLM-MQ: Mixed-precision Quantization for Efficient LLM Deployment. Shiyao Li, Xuefei Ning, Ke Hong, Tengxuan Liu, Luning Wang, Xiuhong Li, Kai Zhong, Guohao Dai, Huazhong Yang, Yu Wang. [pdf]

⚙️ Service

[02/2025] Paper reviewer at the ICLR 2025 Workshop on Reasoning and Planning for LLMs.