Publications

I'm thankful to all the collaborators who worked on the following papers with me. * denotes equal contribution.

2025

  1. xKV: Cross-Layer SVD for KV-Cache Compression
    Chi-Chih Chang, Chien-Yu Lin, Yash Akhauri, and 4 more authors
    2025
  2. TeleRAG: Efficient Retrieval-Augmented Generation Inference with Lookahead Retrieval
    Chien-Yu Lin*, Keisuke Kamahori*, Yiyu Liu, and 11 more authors
    2025
  3. Palu: Compressing KV-Cache with Low-Rank Projection
    Chi-Chih Chang*, Wei-Cheng Lin*, Chien-Yu Lin*, and 7 more authors
    In Proceedings of International Conference on Learning Representations (ICLR), 2025
  4. NanoFlow: Towards Optimal Large Language Model Serving Throughput
    Kan Zhu, Yilong Zhao, Liangyu Zhao, and 12 more authors
    In 19th USENIX Symposium on Operating Systems Design and Implementation (OSDI), 2025

2024

  1. Efficient Encoder-Decoder Transformer Decoding for Decomposable Tasks
    Bo-Ru Lu, Nikita Haduong, Chien-Yu Lin, and 3 more authors
    arXiv preprint arXiv:2403.13112, 2024
  2. Atom: Low-Bit Quantization for Efficient and Accurate LLM Serving
    Yilong Zhao, Chien-Yu Lin, Kan Zhu, and 7 more authors
    In Proceedings of Machine Learning and Systems (MLSys), 2024
  3. FastSR-NeRF: Improving NeRF Efficiency on Consumer Devices with A Simple Super-Resolution Pipeline
    Chien-Yu Lin, Qichen Fu, Thomas Merth, and 2 more authors
    In Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV), Jan 2024
    Oral (Top 2.6%)

2022

  1. SPIN: An Empirical Evaluation on Sharing Parameters of Isotropic Networks
    Chien-Yu Lin*, Anish Prabhu*, Thomas Merth, and 4 more authors
    In Proceedings of the 17th European Conference on Computer Vision (ECCV), Jan 2022

2021

  1. Accelerating SpMM Kernel with Cache-First Edge Sampling for Graph Neural Networks
    Chien-Yu Lin, Liang Luo, and Luis Ceze
    arXiv preprint arXiv:2104.10716, Jan 2021

2019

  1. Enhancing Utilization of SIMD-Like Accelerator for Sparse Convolutional Neural Networks
    Bo-Cheng Lai, Jyun-Wei Pan, and Chien-Yu Lin
    IEEE Transactions on Very Large Scale Integration (VLSI) Systems, Jan 2019

2018

  1. Supporting compressed-sparse activations and weights on SIMD-like accelerator for sparse convolutional neural networks
    Chien-Yu Lin and Bo-Cheng Lai
    In Proceedings of the 23rd Asia and South Pacific Design Automation Conference (ASP-DAC), Jan 2018