Howdy, HN. Authors here. We got tired of text-to-image leaderboards that only focus on aesthetics, so we built our own benchmarks to test what matters for real work: fidelity to complex prompts, safety, bias, and IP infringement.

We analyzed 18 models and found that no single model is good at everything. For example, GPT-4o has the best safety guardrails but also a 98% IP infringement rate on celebrity likenesses. Google's Imagen 4 Ultra actively counters bias (e.g., 90% of its "CEOs" are female) but struggles with generating crowds. xAI's Grok 2 blocks almost nothing.

Lots more detail in the post. We'll be here all day to answer questions.

Really unique viewpoint. I can't stress enough how rare it is these days for tech startups and companies to emphasize social responsibility and, crucially, its potential to translate into profitability as well! Responsible AI isn't just a constraint on the field - controllability means quality and usability.
