Efficient Sequence Parallelism System for Transformer model training.
Jun 30, 2024
Dynamic Sequence Parallelism for multi-dimensional Transformers.
Jan 1, 2024