About me
Hi! I’m Luning Wang, currently a Master’s student at the University of Michigan. Before that, I did my undergrad at Tsinghua University, where I worked with the NICS-EFC lab.
My past work has mainly focused on the infrastructure and efficiency optimization of Large Language Models (a.k.a. MLSys or AI-Infra). Aside from that, I have also looked into other AI techniques such as RL and MLLMs. I casually post some of my study notes on new work in my Blog Posts, which I hope can serve as a reference if you’re also interested in related topics.
I’m open to discussion and collaboration. Feel free to drop me an Email or reach out on LinkedIn!
🎓 Education
- [08/2024~05/2026] M.S. Department of Electrical and Computer Engineering, University of Michigan
- [09/2020~06/2024] B.Eng. Department of Electronic Engineering, Tsinghua University
💻 Internship
- [04/2025~08/2025] Huawei Noah’s Ark Lab, Machine Learning Engineer Intern.
- [02/2024~06/2024] Infinigence AI, Machine Learning Research Intern.
- [09/2023~01/2024] ByteDance, Machine Learning Engineer Intern.
See my CV for more details of my work.
📝 Selected Publications
Here are some of my representative works:
- [ACL’25] MedPlan: A Two-Stage RAG-Based System for Personalized Medical Plan Generation. Hsin-Ling Hsu*, Cong-Tinh Dao*, Luning Wang, et al. [pdf]
- [NeurIPS ENLSP Workshop’24] CSKV: Training-Efficient Channel Shrinking for KV Cache in Long-Context Scenarios. Luning Wang, et al. [pdf] [github]
- [ICML’24] Evaluating Quantized Large Language Models. Shiyao Li, Xuefei Ning, Luning Wang, et al. [pdf] [github]
- [NeurIPS ENLSP Workshop’23] LLM-MQ: Mixed-precision Quantization for Efficient LLM Deployment. Shiyao Li, Xuefei Ning, Ke Hong, Tengxuan Liu, Luning Wang, et al. [pdf]
See my Google Scholar for the full list of my publications.
⚙️ Academic Services
- [09/2025~12/2025] Paper reviewer for the ICLR’26 main conference.
- [02/2025~03/2025] Paper reviewer for the ICLR’25 Workshop on Reasoning and Planning for LLMs.
