I am Minhui Xie, a fifth-year Ph.D. student at Tsinghua University, advised by Professors Youyou Lu and Jiwu Shu.
I am a systems researcher. My research focuses on building efficient systems for at-scale machine learning with emerging hardware (e.g., persistent memory, modern GPUs).
I am excited about the intersection of ML and systems.
Department of Computer Science, Tsinghua University
Department of Computer Science, Nanjing University
Ranked 1st/160 (core courses), 3rd/160 (all courses)
PetPS - supporting huge embedding models with persistent memory
- PetPS is the first system that applies byte-addressable NVM technology to reduce the storage cost of huge embedding models. It addresses the key challenges posed by the poor performance of NVM hardware.
- It has been deployed in Kuaishou’s datacenters, where it sustains the load of over 26 billion video recommendations every day. It achieves 30% cost savings while maintaining service performance.
- It has been covered by industry and official media, including Tsinghua, Kuaishou, Intel, and People.cn.
Fleche - efficient GPU-resident embedding cache
- In this work, we identify the DRAM bandwidth scarcity problem and propose Fleche to address it. Fleche’s key idea is to absorb hot accesses with a lightweight GPU-resident embedding cache.
- Fleche achieves up to 4.0x higher end-to-end inference throughput than NVIDIA HugeCTR, a well-known, highly optimized industrial system.
Kraken - memory efficient continual learning for at-scale recommendation systems
- Kraken redesigns the age-old structure of embedding tables for continual learning and tailors the optimizer algorithm to make thrifty use of DRAM. It can cut memory usage to one third while preserving model performance.
- It has been cited and highly rated by companies including Facebook, Tencent, Alibaba, ByteDance, Kuaishou, and Huawei. It was also incorporated into OpenMLSys, a popular open-source MLSys book on GitHub.
Grants & Awards
Selected awards during Ph.D.
- Tsinghua First-class Scholarship
- Student Grant from USENIX FAST
Selected awards before Ph.D.
- Outstanding Graduate of Nanjing University
- Tung OOCL Scholarship (5%)
- National Scholarship (2%)
- National Second Prize, China Undergraduate Mathematical Contest in Modeling
- Meritorious Winner, MCM/ICM
- Tung OOCL Scholarship (5%)
- Excellent student at Nanjing University (5%)
- EuroSys 2023, Artifact reviewer
- SIGCOMM 2022, Artifact reviewer
- USENIX ATC 2022, Artifact reviewer
- OSDI 2022, Artifact reviewer
- IEEE Transactions on Parallel and Distributed Systems (TPDS), 2022, Reviewer
- EuroSys 2022, Artifact reviewer
- Long-term volunteer of ChinaSys
- Fleche - efficient GPU-resident embedding cache
- NVIDIA, Beijing, China - May 26, 2022
- EuroSys’22, Rennes, France - Apr 04, 2022
- Kraken - memory efficient continual learning for at-scale recommendation systems
- Huawei, Beijing, China - Mar 25, 2022
- Tsinghua, Beijing, China - Nov 19, 2020
- SC’20, San Diego, US - Nov 11, 2020
- TA, Computer Organization and Architecture, Tsinghua University, Spring 2022
- TA, Computer Organization and Architecture, Tsinghua University, Spring 2021
- TA, Computer Organization and Architecture, Tsinghua University, Spring 2020
- TA, Introduction to Computer Systems, Nanjing University, Fall 2017