Publications

I'm thankful for all the collaborators who worked on the following papers with me. * denotes equal contribution.

2025

  1. TeleRAG: Efficient Retrieval-Augmented Generation Inference with Lookahead Retrieval
    Chien-Yu Lin*, Keisuke Kamahori*, Yiyu Liu, and 11 more authors
    2025
  2. Palu: Compressing KV-Cache with Low-Rank Projection
    Chi-Chih Chang*, Wei-Cheng Lin*, Chien-Yu Lin*, and 7 more authors
    In Proceedings of International Conference on Learning Representations (ICLR), 2025

2024

  1. NanoFlow: Towards Optimal Large Language Model Serving Throughput
    Kan Zhu, Yilong Zhao, Liangyu Zhao, and 12 more authors
    2024
  2. Efficient Encoder-Decoder Transformer Decoding for Decomposable Tasks
    Bo-Ru Lu, Nikita Haduong, Chien-Yu Lin, and 3 more authors
    arXiv preprint arXiv:2403.13112, 2024
  3. Atom: Low-Bit Quantization for Efficient and Accurate LLM Serving
    Yilong Zhao, Chien-Yu Lin, Kan Zhu, and 7 more authors
    In Proceedings of Machine Learning and Systems (MLSys), 2024
  4. FastSR-NeRF: Improving NeRF Efficiency on Consumer Devices with A Simple Super-Resolution Pipeline
    Chien-Yu Lin, Qichen Fu, Thomas Merth, and 2 more authors
    In Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV), Jan 2024
    Oral (Top 2.6%)

2022

  1. SPIN: An Empirical Evaluation on Sharing Parameters of Isotropic Networks
    Chien-Yu Lin*, Anish Prabhu*, Thomas Merth, and 4 more authors
    In Proceedings of the 17th European Conference on Computer Vision (ECCV), 2022

2021

  1. Accelerating SpMM Kernel with Cache-First Edge Sampling for Graph Neural Networks
    Chien-Yu Lin, Liang Luo, and Luis Ceze
    arXiv preprint arXiv:2104.10716, 2021

2019

  1. Enhancing Utilization of SIMD-Like Accelerator for Sparse Convolutional Neural Networks
    Bo-Cheng Lai, Jyun-Wei Pan, and Chien-Yu Lin
    IEEE Transactions on Very Large Scale Integration (VLSI) Systems, 2019

2018

  1. Supporting Compressed-Sparse Activations and Weights on SIMD-like Accelerator for Sparse Convolutional Neural Networks
    Chien-Yu Lin and Bo-Cheng Lai
    In Proceedings of the 23rd Asia and South Pacific Design Automation Conference (ASP-DAC), Jan 2018