
Working on promptfoo, an open-source (MIT) CLI and framework for evaluating and red-teaming LLM apps. Think of it like pytest but for prompts - you define test cases, run evals against any model (OpenAI, Anthropic, local models, whatever), and catch regressions before they hit prod.
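
For a sense of what that looks like, here's a minimal config sketch (the model names and provider IDs below are my assumptions - check the docs for current ones):

    # promptfooconfig.yaml - minimal sketch; model/provider IDs are assumptions
    prompts:
      - "Summarize in one sentence: {{text}}"
    providers:
      - openai:gpt-4o-mini
      - anthropic:messages:claude-3-5-sonnet-20241022
    tests:
      - vars:
          text: "promptfoo is an open-source framework for testing LLM apps."
        assert:
          # case-insensitive substring check on the model output
          - type: icontains
            value: "promptfoo"

Running "npx promptfoo@latest eval" then runs each test against every prompt and provider and reports pass/fail.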

Currently building out support for multi-agent evals, better tracing, voice, and static code analysis for AI security use cases. So many fun sub-problems in this space - LLM testing is deceptively hard.

If you end up checking it out and picking up an issue, I'll happily send swag. We're also hiring if you want to work on this stuff full-time.

https://github.com/promptfoo/promptfoo

