Efficient Heterogeneous Parallel Inference System for LLMs on Resource-Constrained Devices
May 13, 2024