Chien-Yu Lin

About me

I’m a Research Scientist in the FAIR group at Meta Superintelligence Labs. My research interests lie in efficient machine learning algorithms and systems.

I received my Ph.D. in 2025 from the Paul G. Allen School of Computer Science & Engineering at the University of Washington. I was fortunate to be advised by Luis Ceze and to work closely with Baris Kasikci and Arvind Krishnamurthy. During my time at UW, I developed efficient methods for a range of ML workloads, including GNNs, NeRFs, and LLMs. Before UW, I earned my B.S. and M.S. in Electronics Engineering from National Yang Ming Chiao Tung University (formerly NCTU), where I worked on sparse CNN accelerator design under the guidance of Prof. Bo-Cheng Lai.

Beyond research, I enjoy outdoor activities such as running, cycling, and tennis. When time allows, I like spending time in the mountains. I have successfully summited and skied down four of the five major volcanoes in Washington (all except Glacier Peak). In the summer of 2025, I completed a 1,000-mile cycling trip from Seattle to the Bay Area.

News

Jan 2026	One paper (TeleRAG) was accepted to MLSys 2026, and two papers (Composer and UniQL) were accepted to ICLR 2026.
Aug 2025	Joined Meta in Menlo Park (MPK)!
Jul 2025	Completed a 1,000-mile bike trip from Seattle to Menlo Park! It’s a journey I’ll remember for the rest of my life.

Selected publications

* means equal contribution

TeleRAG: Efficient Retrieval-Augmented Generation Inference with Lookahead Retrieval

Chien-Yu Lin*, Keisuke Kamahori*, Yiyu Liu, and 11 more authors

In Proceedings of Machine Learning and Systems (MLSys), 2026

@inproceedings{lin2025telerag,
  title = {TeleRAG: Efficient Retrieval-Augmented Generation Inference with Lookahead Retrieval},
  author = {Lin, Chien-Yu and Kamahori, Keisuke and Liu, Yiyu and Shi, Xiaoxiang and Kashyap, Madhav and Gu, Yile and Shao, Rulin and Ye, Zihao and Zhu, Kan and Wang, Stephanie and Krishnamurthy, Arvind and Kadekodi, Rohan and Ceze, Luis and Kasikci, Baris},
  booktitle = {Proceedings of Machine Learning and Systems (MLSys)},
  venue_url = {https://arxiv.org/abs/2502.20969},
  year = {2026},
}

Palu: Compressing KV-Cache with Low-Rank Projection

Chi-Chih Chang*, Wei-Cheng Lin*, Chien-Yu Lin*, and 7 more authors

In Proceedings of the International Conference on Learning Representations (ICLR), 2025

@inproceedings{chang2024palu,
  author = {Chang, Chi-Chih and Lin, Wei-Cheng and Lin, Chien-Yu and Chen, Chong-Yan and Hu, Yu-Fang and Wang, Pei-Shuo and Huang, Ning-Chi and Ceze, Luis and Abdelfattah, Mohamed S. and Wu, Kai-Chiang},
  booktitle = {Proceedings of the International Conference on Learning Representations (ICLR)},
  title = {Palu: Compressing KV-Cache with Low-Rank Projection},
  year = {2025},
  venue_url = {https://arxiv.org/abs/2407.21118},
}

Atom: Low-Bit Quantization for Efficient and Accurate LLM Serving

Yilong Zhao, Chien-Yu Lin, Kan Zhu, and 7 more authors

In Proceedings of Machine Learning and Systems (MLSys), 2024

@inproceedings{zhao2024atom,
  author = {Zhao, Yilong and Lin, Chien-Yu and Zhu, Kan and Ye, Zihao and Chen, Lequn and Zheng, Size and Ceze, Luis and Krishnamurthy, Arvind and Chen, Tianqi and Kasikci, Baris},
  booktitle = {Proceedings of Machine Learning and Systems (MLSys)},
  pages = {196--209},
  title = {Atom: Low-Bit Quantization for Efficient and Accurate LLM Serving},
  url = {https://proceedings.mlsys.org/paper_files/paper/2024/file/5edb57c05c81d04beb716ef1d542fe9e-Paper-Conference.pdf},
  volume = {6},
  year = {2024},
}

FastSR-NeRF: Improving NeRF Efficiency on Consumer Devices with A Simple Super-Resolution Pipeline

Chien-Yu Lin, Qichen Fu, Thomas Merth, and 2 more authors

In Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV), Jan 2024

Oral (Top 2.6%)

@inproceedings{lin2024fastsrnerf,
  author = {Lin, Chien-Yu and Fu, Qichen and Merth, Thomas and Yang, Karren and Ranjan, Anurag},
  title = {FastSR-NeRF: Improving NeRF Efficiency on Consumer Devices with A Simple Super-Resolution Pipeline},
  booktitle = {Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV)},
  month = jan,
  year = {2024},
  pages = {2482--2491},
  venue_type = {Conference},
}

SPIN: An Empirical Evaluation on Sharing Parameters of Isotropic Networks

Chien-Yu Lin*, Anish Prabhu*, Thomas Merth, and 4 more authors

In Proceedings of the 17th European Conference on Computer Vision (ECCV), 2022

@inproceedings{lin2022spin,
  author = {Lin, Chien-Yu and Prabhu, Anish and Merth, Thomas and Mehta, Sachin and Ranjan, Anurag and Horton, Maxwell and Rastegari, Mohammad},
  title = {SPIN: An Empirical Evaluation on Sharing Parameters of Isotropic Networks},
  booktitle = {Proceedings of the 17th European Conference on Computer Vision (ECCV)},
  year = {2022},
  venue_type = {Conference},
}

Supporting compressed-sparse activations and weights on SIMD-like accelerator for sparse convolutional neural networks

Chien-Yu Lin and Bo-Cheng Lai

In Proceedings of the 23rd Asia and South Pacific Design Automation Conference (ASP-DAC), Jan 2018

@inproceedings{lin2018supporting,
  author = {Lin, Chien-Yu and Lai, Bo-Cheng},
  title = {Supporting compressed-sparse activations and weights on SIMD-like accelerator for sparse convolutional neural networks},
  booktitle = {Proceedings of the 23rd Asia and South Pacific Design Automation Conference (ASP-DAC)},
  year = {2018},
}