Publications

You can also find my articles on my Google Scholar profile.

Conference Papers


ZipServ: Fast and Memory-Efficient LLM Inference with Hardware-Aware Lossless Compression

Published in Proceedings of the 31st ACM International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS), 2026

Ruibo Fan, Xiangrui Yu, Xinglin Pan, Zeyu Li, Weile Luo, Qiang Wang, Wei Wang, and Xiaowen Chu, “ZipServ: Fast and Memory-Efficient LLM Inference with Hardware-Aware Lossless Compression,” in Proceedings of the 31st ACM International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS ’26), Pittsburgh, PA, USA, March 2026.

Download Paper | Code | Download Bibtex

ROME: Maximizing GPU Efficiency for All-Pairs Shortest Path via Taming Fine-Grained Irregularities

Published in Proceedings of the 31st ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming (PPoPP), 2026

Weile Luo, Yuxin Chen, Xiangrui Yu, Qiang Wang, Ruibo Fan, Haibo Liu, et al., “ROME: Maximizing GPU Efficiency for All-Pairs Shortest Path via Taming Fine-Grained Irregularities,” in Proceedings of the 31st ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming (PPoPP), 2026.

Download Paper | Code | Download Bibtex

DTC-SpMM: Bridging the Gap in Accelerating General Sparse Matrix Multiplication with Tensor Cores

Published in Proceedings of the 29th ACM International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS), 2024

Ruibo Fan, Wei Wang, and Xiaowen Chu, “DTC-SpMM: Bridging the Gap in Accelerating General Sparse Matrix Multiplication with Tensor Cores,” in Proceedings of the 29th ACM International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS), 2024.

Download Paper | Code | Download Bibtex

Journal Articles