To appear at PPoPP '25. A novel pipeline parallelism scheme that communicates model weights rather than activations in long-sequence scenarios.
Nov 10, 2024
Accepted at SC '23. An efficient pipeline parallelism system for LLMs.
Nov 11, 2023