DeepSpeed-FastGen (1 post)

[Paper Review] DeepSpeed-FastGen: High-throughput Text Generation for LLMs via MII and DeepSpeed-Inference

A new method for LLM inference called DeepSpeed-FastGen has been released. As the paper's title suggests, it is the approach proposed in a paper from Microsoft's DeepSpeed team. https://arxiv.org/abs/2401.08671

DeepSpeed-FastGen: High-throughput Text Generation for LLMs via MII and DeepSpeed-Inference. The deployment and scaling of large language models (LLMs) have become critical as they permeate various applications, demanding high-throughput and low-latency..

2024. 1. 22.