
Shiyi Cao 「曹诗怡」


I am a first-year Ph.D. student at UC Berkeley EECS, advised by Ion Stoica and Joseph Gonzalez, and affiliated with the Sky Computing Lab and BAIR. Previously, I was fortunate to work with Zhihao Jia at CMU on accelerating distributed training. I obtained my M.S. in Computer Science at ETH, working at SPCL with Torsten Hoefler. Before graduate school, I had a great time at Shanghai Jiao Tong University, where I earned my Bachelor's degree in Computer Science. I am mainly interested in accelerating and optimizing computations (especially ML workloads) on large-scale heterogeneous systems.

 ~  Email  |  CV  |  Github  |  Google Scholar  |  Twitter  |  INS  ~ 


Oct '23  

Released S-LoRA, a scalable system for serving thousands of LoRA adapters concurrently!

Aug '23  

Graduated from ETH and joined UC Berkeley EECS!

Sept '22  

Paper accepted at ExaMPI workshop, SC '22!

Sept '20

Started my master's studies in Computer Science at ETH Zürich.

Fairness in Serving Large Language Models (arXiv)
Ying Sheng, Shiyi Cao, Dacheng Li, Banghua Zhu, Zhuohan Li, Danyang Zhuo, Joseph E. Gonzalez, Ion Stoica
OSDI 2024.
LLM Serving; Fair Scheduling.

S-LoRA: Serving Thousands of Concurrent LoRA Adapters (arXiv, GitHub, Blog)
Ying Sheng*, Shiyi Cao*, Dacheng Li, Coleman Hooper, Nicholas Lee, Shuo Yang, Christopher Chou, Banghua Zhu, Lianmin Zheng, Kurt Keutzer, Joseph E. Gonzalez, Ion Stoica
MLSys 2024.
LLM Inference; LoRA; Adapters; Memory Management.

Accelerating Data Serialization/Deserialization Protocols with In-Network Compute (pdf, video)
Shiyi Cao, Salvatore Di Girolamo, Torsten Hoefler
Workshop on Exascale MPI, ExaMPI@SC, 2022.
SmartNICs; In-Network Compute; Data (De)serialization.

AdaM: An Adaptive Fine-Grained Scheme for Distributed Metadata Management
Shiyi Cao, Yuanning Gao, Xiaofeng Gao, Guihai Chen
International Conference on Parallel Processing (ICPP), 2019.
Distributed Systems; Metadata Management; Reinforcement Learning.


