promptfoo

Coding & Assistance

promptfoo (promptfoo/promptfoo) is an open-source tool on GitHub for testing prompts, agents, and RAGs, and for red teaming, pentesting, and vulnerability scanning of AI applications. It compares the performance of GPT, Claude, Gemini, DeepSeek, and other models, and is driven by simple declarative configs with command-line and CI/CD integration. According to the repository summary, it is used by OpenAI and Anthropic. Its focus areas are retrieval-augmented generation, evaluation and observability, security and compliance automation, and workflow automation.

License

MIT

Stars

21,667

Homepage

https://promptfoo.dev/

Features

Core capability: tests prompts, agents, and RAGs using simple declarative configs
Evaluates retrieval-augmented generation (RAG) pipelines
Compares the performance of GPT, Claude, Gemini, DeepSeek, and other models
Covers red teaming, pentesting, and vulnerability scanning for AI
Integrates with the command line and CI/CD pipelines
Repository: promptfoo/promptfoo
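
As a rough illustration of what declarative prompt testing amounts to, the sketch below crosses prompt templates with providers and test cases, then applies a simple "contains" assertion to each output. It is a conceptual model only and is not promptfoo's API or config format: echo_provider, shout_provider, EvalCase, and run_matrix are hypothetical names, and the providers are local stubs rather than real model calls.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


# Hypothetical stand-ins for model providers: each maps a prompt to an output.
def echo_provider(prompt: str) -> str:
    return prompt


def shout_provider(prompt: str) -> str:
    return prompt.upper()


@dataclass
class EvalCase:
    """One test case: template variables plus a substring the output must contain."""
    vars: Dict[str, str]
    must_contain: str


def run_matrix(
    prompts: List[str],
    providers: Dict[str, Callable[[str], str]],
    cases: List[EvalCase],
) -> List[dict]:
    """Evaluate every (prompt, provider, case) combination and record pass/fail."""
    rows = []
    for template in prompts:
        for name, provider in providers.items():
            for case in cases:
                output = provider(template.format(**case.vars))
                rows.append({
                    "prompt": template,
                    "provider": name,
                    "output": output,
                    "passed": case.must_contain in output,
                })
    return rows


# Illustrative data only; a real suite would live in a config file.
PROMPTS = ["Translate to French: {text}", "French version of: {text}"]
PROVIDERS = {"echo": echo_provider, "shout": shout_provider}
CASES = [EvalCase(vars={"text": "hello"}, must_contain="hello")]

results = run_matrix(PROMPTS, PROVIDERS, CASES)
for row in results:
    print(row["provider"], "PASS" if row["passed"] else "FAIL", "|", row["prompt"])
```

Keeping prompts, providers, and cases as plain data is what makes side-by-side model comparison cheap: adding a provider adds one column of results without touching the test cases.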

Use Cases

Testing enterprise knowledge Q&A and document retrieval systems
AI quality monitoring and regression evaluation
Security assessment and compliance automation
Automated evaluation across command-line and CI/CD workflows
Prototyping internal AI workflows with promptfoo
Validating promptfoo in production-like engineering scenarios
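
The regression-evaluation use case can be pictured as a gate in a CI job: compare the pass rate of a candidate run against a stored baseline and fail the build if it drops too far. The sketch below assumes each run is just a list of pass/fail outcomes; pass_rate, regression_gate, and the 0.02 tolerance are hypothetical and do not come from promptfoo.

```python
from typing import List


def pass_rate(outcomes: List[bool]) -> float:
    """Fraction of test cases that passed; an empty run counts as 0.0."""
    return sum(outcomes) / len(outcomes) if outcomes else 0.0


def regression_gate(
    baseline: List[bool],
    candidate: List[bool],
    tolerance: float = 0.02,  # assumed allowance for noise between runs
) -> bool:
    """Return True when the candidate is no worse than baseline minus tolerance."""
    return pass_rate(candidate) >= pass_rate(baseline) - tolerance


# Illustrative outcomes only, not real evaluation data.
baseline_run = [True, True, True, False]
candidate_run = [True, True, False, False]

gate_ok = regression_gate(baseline_run, candidate_run)
# A CI step would typically exit with this code to pass or fail the build.
exit_code = 0 if gate_ok else 1
print("gate passed" if gate_ok else "gate failed", "exit code:", exit_code)
```

A tolerance is used rather than strict equality because model outputs can vary between runs; the right value depends on the size of the test suite.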

FAQ

To adopt promptfoo, teams should first define integration boundaries and call patterns, then map the repository's capabilities onto concrete interfaces, parameters, and access rules. The source is at https://github.com/promptfoo/promptfoo, released under the MIT license, with more than 21,000 GitHub stars.

promptfoo usually works as an execution component or capability layer. Common deployment fits include enterprise knowledge Q&A and document retrieval systems, AI quality monitoring and regression evaluation, and security assessment and compliance automation.

Related Tools

GitHub Copilot

Code completion tool

Cursor

AI code editor

Claude Code

Fix bugs, edit code, run tests, and submit PRs in real codebases