Comment by og_kalu - Hacker Neue

og_kalu 2 days ago

Imagen is a text-to-image diffusion model. You write some text that describes your image, you get an image out, and that's it.

Flash Image is a large language model that predicts images (and text). In the same way that trained LLMs can manipulate and morph text, it can do that for images as well: things like style transfer, character consistency, etc.

You can communicate with it in a way you can't with Imagen, and it has a better overall world understanding.
