About me
Hi! I’m Luning Wang (王麓宁), currently a Master’s student at the University of Michigan. Before that, I received my Bachelor’s degree from Tsinghua University, where I worked with the NICS-EFC group.
🎓 Education
- [08/2024~05/2026] M.S. Department of Electrical and Computer Engineering, University of Michigan
- [09/2020~06/2024] B.Eng. Department of Electronic Engineering, Tsinghua University
💻 Internships
- [04/2025~08/2025] Noah’s Ark Lab, Machine Learning Engineer Intern.
- [02/2024~06/2024] Infinigence AI, Machine Learning Research Intern.
- [09/2023~01/2024] ByteDance (TikTok), Machine Learning Engineer Intern.
See my CV for more details about my work.
📖 Research
My past research has mainly focused on infrastructure and efficiency optimization for Large Language Models (a.k.a. MLSys or AI-Infra), including compression and acceleration techniques for LLMs. Aside from that, I also have some experience with Biomedical LLMs and Multimodal LLMs.
I’m open to discussion and collaboration. Feel free to drop me an email or reach out on LinkedIn!
📝 Selected Publications
Here are some of my representative works:
- [ACL’25] MedPlan: A Two-Stage RAG-Based System for Personalized Medical Plan Generation. Hsin-Ling Hsu*, Cong-Tinh Dao*, Luning Wang, et al. [pdf]
- [NeurIPS ENLSP Workshop’24] CSKV: Training-Efficient Channel Shrinking for KV Cache in Long-Context Scenarios. Luning Wang, et al. [pdf] [github]
- [ICML’24] Evaluating Quantized Large Language Models. Shiyao Li, Xuefei Ning, Luning Wang, et al. [pdf] [github]
- [NeurIPS ENLSP Workshop’23] LLM-MQ: Mixed-precision Quantization for Efficient LLM Deployment. Shiyao Li, Xuefei Ning, Ke Hong, Tengxuan Liu, Luning Wang, et al. [pdf]
See my Google Scholar for the full list of my publications.
⚙️ Academic Services
- [09/2025~12/2025] Paper reviewer for the ICLR’26 main conference.
- [02/2025~03/2025] Paper reviewer for the ICLR’25 Workshop on Reasoning and Planning for LLMs.
