About me

Hi! This is the homepage of Luning Wang (王麓宁). I’m currently a full-time algorithm engineer at ByteDance Douyin Group, working on Large Recommendation Models & Generative Recommendation.

My past work focused mainly on the infrastructure and efficiency optimization of Large Language Models. I have experience optimizing LLMs with techniques such as quantization, KV-cache compression, and speculative decoding. I’ve also worked on LLM systems, where I gained experience with vLLM and parallel computing. In addition, I spent some time at UofM exploring applications of LLMs in biomedical engineering.

I occasionally post notes on new work and random thoughts on my blog; I hope they can serve as a reference if you’re interested in related topics. I’m open to discussion and collaboration, so feel free to drop me an email or reach out on LinkedIn!

🎓 Education

  • [08/2024~05/2026] M.S. Department of Electrical and Computer Engineering, University of Michigan
  • [09/2020~06/2024] B.Eng. Department of Electronic Engineering, Tsinghua University

💻 Work Experience

See my CV for more details about my work.

📚 Research Experience

📝 Selected Publications

Here are some of my representative works:

  • [ACL’25] MedPlan: A Two-Stage RAG-Based System for Personalized Medical Plan Generation. Hsin-Ling Hsu*, Cong-Tinh Dao*, Luning Wang, et al. [pdf]
  • [NeurIPS ENLSP Workshop’24] CSKV: Training-Efficient Channel Shrinking for KV Cache in Long-Context Scenarios. Luning Wang, et al. [pdf] [github]
  • [ICML’24] Evaluating Quantized Large Language Models. Shiyao Li, Xuefei Ning, Luning Wang, et al. [pdf] [github]
  • [NeurIPS ENLSP Workshop’23] LLM-MQ: Mixed-precision Quantization for Efficient LLM Deployment. Shiyao Li, Xuefei Ning, Ke Hong, Tengxuan Liu, Luning Wang, et al. [pdf]

See my Google Scholar for the full list of my publications.

⚙️ Academic Services

  • [09/2025~12/2025] Paper reviewer for the ICLR’26 main conference.
  • [02/2025~03/2025] Paper reviewer for the ICLR’25 Workshop on Reasoning and Planning for LLMs.