News
Mar 02, 2025 | The preprint of TeleRAG, our new paper on RAG inference acceleration, is out on arXiv! |
---|---|
Jan 22, 2025 | Our KV-Cache compression framework, Palu, has been accepted to ICLR 2025! You’re welcome to use our code to make your LLM more efficient. See you in Singapore! |
Jan 10, 2025 | Drop me an email if you have opportunities! |
Jan 02, 2025 | I’m visiting Taiwan for three weeks. Hit me up if you want to chat about research! |
Dec 14, 2024 | I’m attending NeurIPS 2024 in Vancouver! |
Oct 29, 2024 | Gave a talk on Palu KV-Cache compression at UW CSE Research Day. |