promptfoo (promptfoo/promptfoo) is an open-source project on GitHub for testing prompts, agents, and RAGs. It also covers red teaming, pentesting, and vulnerability scanning for AI, and can compare the performance of GPT, Claude, Gemini, DeepSeek, and other models. Evaluations are described in simple declarative configs and run from the command line or in CI/CD. According to the repository summary, it is used by OpenAI and Anthropic. Its focus areas include retrieval-augmented generation, evaluation and observability, security and compliance automation, and workflow automation. It is intended to be extended and integrated into existing workflows.
License
MIT
Stars
21,667
Homepage
https://promptfoo.dev/
Features
- Core capability: testing prompts, agents, and RAGs, plus red teaming, pentesting, and vulnerability scanning for AI
- Compares the performance of GPT, Claude, Gemini, DeepSeek, and other models
- Uses simple declarative configs with command line and CI/CD integration
- Supports vector retrieval and retrieval-augmented reasoning
- Includes evaluation, tracing, or observability capabilities
- Covers security testing, risk detection, or compliance workflows
- Supports orchestrated automation flows and scheduling
- Repository: promptfoo/promptfoo
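
To make "simple declarative configs" concrete, here is a minimal sketch in Python that assembles a promptfoo-style config and writes it to disk. It assumes the commonly documented top-level keys (`prompts`, `providers`, `tests`) and the `icontains` assertion type, and it assumes promptfoo will read a JSON config file. The prompt text, the provider IDs, the test questions, and the helper names `build_config` and `write_config` are illustrative and do not come from the repository.

```python
import json
from pathlib import Path


def build_config() -> dict:
    """Return a promptfoo-style evaluation config as a plain dictionary.

    Key names follow the commonly documented config layout; treat them as
    assumptions and check them against the promptfoo documentation.
    """
    return {
        "description": "Regression check for a support-answer prompt",
        # {{question}} is a template variable filled from each test's vars.
        "prompts": ["Answer the customer question briefly: {{question}}"],
        # Hypothetical provider IDs; replace with the models you want to compare.
        "providers": [
            "openai:gpt-4o-mini",
            "anthropic:messages:claude-3-5-sonnet-20241022",
        ],
        "tests": [
            {
                "vars": {"question": "How do I reset my password?"},
                # Case-insensitive substring check on the model output.
                "assert": [{"type": "icontains", "value": "password"}],
            },
            {
                "vars": {"question": "Where can I download my invoice?"},
                "assert": [{"type": "icontains", "value": "invoice"}],
            },
        ],
    }


def write_config(path: Path) -> Path:
    """Serialize the config as JSON and return the path that was written."""
    path.write_text(json.dumps(build_config(), indent=2), encoding="utf-8")
    return path


if __name__ == "__main__":
    out = write_config(Path("promptfooconfig.json"))
    print(f"Wrote {out}")
```

The generated file could then be passed to the CLI, for example with `npx promptfoo eval -c promptfooconfig.json`, either locally or as a CI step. Generating the config from code is only one option; a hand-written YAML file with the same keys would serve the same purpose.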
Use Cases
- Evaluating enterprise knowledge Q&A and document retrieval systems
- AI quality monitoring and regression evaluation
- Security assessment and compliance automation
- Cross-system process automation and operations efficiency
- Prototyping internal AI workflows with promptfoo
- Validating promptfoo in production-like engineering scenarios
FAQ
Teams should first define integration boundaries and call patterns, then map the repository's capabilities onto concrete interfaces, parameters, and access rules. GitHub repository: https://github.com/promptfoo/promptfoo. The project has more than 21,000 GitHub stars and is released under the MIT license.
It usually works as an execution component or capability layer. Common deployment fits include enterprise knowledge Q&A and document retrieval systems, AI quality monitoring and regression evaluation, and security assessment and compliance automation.