publications

Publication list with open access.

2024

  1. HeteGen: Heterogeneous Parallel Inference for Large Language Models on Resource-Constrained Devices
    Xuanlei Zhao, Bin Jia, Haotian Zhou, Ziming Liu, Shenggan Cheng, and Yang You
    2024
  2. DSP: Dynamic Sequence Parallelism for Multi-Dimensional Transformers
    Xuanlei Zhao, Shenggan Cheng, Zangwei Zheng, Zheming Yang, Ziming Liu, and Yang You
    2024

2023

  1. Hanayo: Harnessing Wave-like Pipeline Parallelism for Enhanced Large Model Training Efficiency
    Ziming Liu, Shenggan Cheng, Haotian Zhou, and Yang You
    In SC ’23, Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis, 2023
  2. ATP: Adaptive Tensor Parallelism for Foundation Models
    Shenggan Cheng, Ziming Liu, Jiangsu Du, and Yang You
    2023

2022

  1. EnergonAI: An Inference System for 10-100 Billion Parameter Transformer Models
    Jiangsu Du, Ziming Liu, Jiarui Fang, Shenggui Li, Yongbin Li, Yutong Lu, and 1 more author
    2022