fastllm

Coding & Assistance

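fastllm (ztxz16/fastllm) is an open-source AI project on GitHub. Repository summary: fastllm is a high-performance large-model inference library with no backend dependencies. It supports tensor-parallel inference for dense models and mixed-mode inference for MoE models, and any GPU with 10 GB or more of memory can run the full-size DeepSeek model. On a dual-socket 9004/9005 server with a single GPU, the original full-precision, full-size DeepSeek model runs at 20 tps for a single request; an INT4-quantized model runs at 30 tps for a single request and reaches 60+ tps with multiple concurrent requests. The listing tags it for developer-centric engineering workflows, multi-agent orchestration, and workflow automation, and describes it as suitable for extension, integration, and iterative delivery in real workflows.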
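As a rough illustration of what the quoted throughput figures mean in practice, the sketch below converts tokens per second (tps) into wall-clock generation time. The 20, 30, and 60 tps values come from the repository summary above; the function name `generation_seconds` and the 1,200-token response length are illustrative assumptions, and the estimate ignores prompt processing time.

```python
def generation_seconds(num_tokens: int, tokens_per_second: float) -> float:
    """Estimate decode time for num_tokens at a steady tokens_per_second rate."""
    if tokens_per_second <= 0:
        raise ValueError("tokens_per_second must be positive")
    return num_tokens / tokens_per_second


# Throughput figures quoted in the repository summary (tokens per second).
# The summary gives the multi-request figure as "60+", so 60 is a lower bound.
QUOTED_TPS = {
    "full precision, single request": 20.0,
    "INT4, single request": 30.0,
    "INT4, aggregate over concurrent requests": 60.0,
}

RESPONSE_TOKENS = 1200  # hypothetical response length, chosen for illustration

for label, tps in QUOTED_TPS.items():
    seconds = generation_seconds(RESPONSE_TOKENS, tps)
    print(f"{label}: {seconds:.0f} s for {RESPONSE_TOKENS} tokens")
```

Note that the multi-request figure is an aggregate: it describes total tokens produced across all concurrent requests, not the speed seen by any single request.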

License

Apache-2.0

Stars

4,713

Features

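Core capability: a high-performance large-model inference library with no backend dependencies. Supports tensor-parallel inference for dense models and mixed-mode inference for MoE models; any GPU with 10 GB or more of memory can run the full-size DeepSeek model. A dual-socket 9004/9005 server with a single GPU runs the original full-precision DeepSeek model at 20 tps for a single request; an INT4-quantized model runs at 30 tps for a single request and 60+ tps with multiple concurrent requests.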
Built for code generation, debugging, or engineering integration
Supports multi-agent coordination and task decomposition
Supports orchestrated automation flows and scheduling
Repository: ztxz16/fastllm
Primary language: C++

Use Cases

Supporting AI engineering build-and-iterate workflows for development teams
Decomposing complex tasks and running them in parallel
Automating cross-system processes to improve operational efficiency
Building internal AI workflow prototypes with fastllm
Validating fastllm in production-like engineering scenarios
Building AI development workflows

FAQ

Teams should first define integration boundaries and call patterns, then map the repository's capabilities onto concrete interfaces, parameters, and access rules. GitHub repository: https://github.com/ztxz16/fastllm. Community traction is roughly 4,700 stars. License: Apache-2.0.
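One way to picture the advice above is a thin interface that callers depend on, with the inference engine hidden behind it. The sketch below is an assumption-laden illustration, not fastllm's API: `TextGenerator`, `GenerationParams`, `EchoBackend`, and `GuardedGenerator` are all hypothetical names, and the backend here is a stand-in that never calls a model.

```python
from dataclasses import dataclass
from typing import Protocol


@dataclass(frozen=True)
class GenerationParams:
    """Parameters the calling side is allowed to set (hypothetical)."""
    max_tokens: int = 256
    temperature: float = 0.7


class TextGenerator(Protocol):
    """The integration boundary: callers depend on this, not on a backend."""

    def generate(self, prompt: str, params: GenerationParams) -> str: ...


class EchoBackend:
    """Stand-in backend for tests. A real adapter would call the inference
    engine here; this one returns the prompt cut to max_tokens characters."""

    def generate(self, prompt: str, params: GenerationParams) -> str:
        return prompt[: params.max_tokens]


class GuardedGenerator:
    """Applies a simple access rule (a caller allow-list) before delegating."""

    def __init__(self, backend: TextGenerator, allowed_callers: set[str]) -> None:
        self._backend = backend
        self._allowed = allowed_callers

    def generate_for(self, caller: str, prompt: str, params: GenerationParams) -> str:
        if caller not in self._allowed:
            raise PermissionError(f"caller {caller!r} is not allowed")
        return self._backend.generate(prompt, params)


if __name__ == "__main__":
    gen = GuardedGenerator(EchoBackend(), allowed_callers={"ci-pipeline"})
    print(gen.generate_for("ci-pipeline", "hello world", GenerationParams(max_tokens=5)))
```

Keeping the backend behind a protocol means the stand-in can later be swapped for an adapter around the real engine without changing calling code.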

It usually works as an execution component or capability layer. Common deployment fits include AI engineering build-and-iterate workflows for development teams, decomposing complex tasks and running them in parallel, and cross-system process automation for operational efficiency.

Related Tools

GitHub Copilot

Code completion tool

Cursor

AI code editor

Claude Code

Fix bugs, edit code, run tests, and submit PRs in real codebases