
Working on promptfoo, an open-source (MIT) CLI and framework for evaluating and red-teaming LLM apps. Think of it like pytest but for prompts - you define test cases, run evals against any model (OpenAI, Anthropic, local models, whatever), and catch regressions before they hit prod.
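
For a sense of what that looks like, here's a minimal config sketch (the model names and provider IDs below are my assumptions - check the docs for current ones):

    # promptfooconfig.yaml - minimal sketch; model/provider IDs are assumptions
    prompts:
      - "Summarize in one sentence: {{text}}"
    providers:
      - openai:gpt-4o-mini
      - anthropic:messages:claude-3-5-sonnet-20241022
    tests:
      - vars:
          text: "promptfoo is an open-source framework for testing LLM apps."
        assert:
          # case-insensitive substring check on the model output
          - type: icontains
            value: "promptfoo"

Running "npx promptfoo@latest eval" then runs each test against every prompt and provider and reports pass/fail.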

Currently building out support for multi-agent evals, better tracing, voice, and static code analysis for AI security use cases. So many fun sub-problems in this space - LLM testing is deceptively hard.

If you end up checking it out and picking up an issue, I'll happily send swag. We're also hiring if you want to work on this stuff full-time.

https://github.com/promptfoo/promptfoo

