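fastllm (ztxz16/fastllm) is an open-source AI project on GitHub. Repository summary: fastllm is a high-performance large language model inference library with no backend dependencies. It supports tensor-parallel inference for dense models and hybrid-mode inference for MoE models, and any GPU with 10 GB or more of memory can run the full-size DeepSeek model. On a dual-socket 9004/9005 server with a single GPU, the full-size, full-precision original DeepSeek model runs at 20 tps with a single concurrent request; the INT4-quantized model reaches 30 tps with a single request and 60+ tps with multiple concurrent requests. Its focus includes developer-centric engineering workflows, multi-agent orchestration, and workflow automation. It is suitable for extension, integration, and iterative delivery in real workflows.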
License
Apache-2.0
Stars
4,713
Features
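- Core capability: high-performance LLM inference with no backend dependencies; tensor-parallel inference for dense models and hybrid-mode inference for MoE models; full-size DeepSeek on any GPU with 10 GB or more of memory; 20 tps (full precision, single request), 30 tps (INT4, single request), and 60+ tps (INT4, multiple concurrent requests) on a dual-socket 9004/9005 server with a single GPU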
- Built for code generation, debugging, or engineering integration
- Supports multi-agent coordination and task decomposition
- Supports orchestrated automation flows and scheduling
- Repository: ztxz16/fastllm
- Primary language: C++
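To put the throughput figures quoted in the repository summary into practical terms, the following is a minimal sketch that converts tokens per second into generation time. It uses only the 20, 30, and 60 tps numbers from the summary. The function names, the 1,000-token response length, and the even split of aggregate throughput across streams are illustrative assumptions, not part of fastllm.

```python
# Throughput figures quoted in the repository summary (tokens per second).
# The multi-concurrency figure is reported as "60+", so 60 is used as a floor.
QUOTED_TPS = {
    "full_precision_single": 20.0,
    "int4_single": 30.0,
    "int4_multi_aggregate": 60.0,
}


def generation_seconds(num_tokens: int, tokens_per_second: float) -> float:
    """Time to decode num_tokens at a steady rate, ignoring prompt processing."""
    if tokens_per_second <= 0:
        raise ValueError("tokens_per_second must be positive")
    return num_tokens / tokens_per_second


def per_stream_tps(aggregate_tps: float, concurrent_streams: int) -> float:
    """Per-request rate, assuming aggregate throughput is split evenly (hypothetical)."""
    if concurrent_streams <= 0:
        raise ValueError("concurrent_streams must be positive")
    return aggregate_tps / concurrent_streams


if __name__ == "__main__":
    # 1,000 tokens is an arbitrary response length chosen for illustration.
    for name, tps in QUOTED_TPS.items():
        seconds = generation_seconds(1000, tps)
        print(f"{name}: 1000 tokens in {seconds:.1f} s")
```

The estimate covers decoding only. Prompt processing time and real scheduling behavior under concurrency are not modeled, so measured latency on actual hardware may differ.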
Use Cases
- Support AI engineering build-and-iterate workflows for dev teams
- Decompose and run complex tasks in parallel
- Automate cross-system processes and improve operations efficiency
- Build internal AI workflow prototypes with fastllm
- Validate fastllm in production-like engineering scenarios
- Build AI development workflows
FAQ
Teams should first define integration boundaries and call patterns, then map repository capabilities into concrete interfaces, parameters, and access rules. GitHub repository: https://github.com/ztxz16/fastllm. Community traction is about 4,700 stars. License: Apache-2.0.
It usually works as an execution component or capability layer. Common deployment fits include AI engineering build-and-iterate workflows for dev teams, decomposing and running complex tasks in parallel, and cross-system process automation for operations efficiency.