Text-to-Image Alignment
SmooSense allows users to visualize and evaluate text-to-image AI performance. Its intuitive UI utilizes word scores and visual masks to help pinpoint misalignments between the text and the image, simplifying model analysis.
This example shows human labelers' feedback on text-to-image generation. A model was given a text prompt and generated an image. Human labelers were then asked to highlight the image areas and the words that had alignment issues.
Data Source
Data is excerpted from: Rapidata/text-2-image-Rich-Human-Feedback-32k
Rapidata can help you get similar human labeling for your data.
Use SmooSense to visualize your data
Word scores
- Name the column such that it contains word_score.
- Cell values should be strings: JSON dumps of a list of word and score pairs. For example:
[["seven", 2.04], ["pixelated", 0.5219], ...]
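As a minimal sketch of producing such a cell value, the snippet below serializes word and score pairs with Python's standard json module. The helper name encode_word_scores is hypothetical and not part of SmooSense; only the output format follows the description above.

```python
import json


def encode_word_scores(pairs):
    """Serialize (word, score) pairs into a JSON string for a word_score column.

    Hypothetical helper for illustration; not a SmooSense API.
    """
    # Each pair becomes a two-element list: [word, score].
    return json.dumps([[str(word), float(score)] for word, score in pairs])


cell = encode_word_scores([("seven", 2.04), ("pixelated", 0.5219)])
print(cell)  # [["seven", 2.04], ["pixelated", 0.5219]]
```

The resulting string can be stored directly as the cell value in any column whose name contains word_score.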
Image mask
- Name your column such that it contains image_mask.
- Save mask data as a grayscale PNG file and store its URL in the column.
- Ensure there is a column named image_url containing the corresponding image.
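The steps above can be sketched as follows, using only the Python standard library to encode a small 8-bit grayscale PNG. The helper grayscale_png, the example.com URLs, and the choice of brighter pixels for flagged areas are all assumptions made for illustration; the source does not specify how mask intensities are interpreted, and uploading the file to obtain a URL is left out.

```python
import struct
import zlib


def _chunk(kind: bytes, data: bytes) -> bytes:
    """Build one PNG chunk: length, type, data, CRC over type + data."""
    crc = zlib.crc32(kind + data) & 0xFFFFFFFF
    return struct.pack(">I", len(data)) + kind + data + struct.pack(">I", crc)


def grayscale_png(rows):
    """Encode rows of 0-255 integers as an 8-bit grayscale PNG (hypothetical helper)."""
    height, width = len(rows), len(rows[0])
    if any(len(row) != width for row in rows):
        raise ValueError("all mask rows must have the same width")
    # IHDR: width, height, bit depth 8, color type 0 (grayscale), no interlace.
    ihdr = struct.pack(">IIBBBBB", width, height, 8, 0, 0, 0, 0)
    # Each scanline is prefixed with filter type 0 (no filtering).
    raw = b"".join(b"\x00" + bytes(row) for row in rows)
    return (
        b"\x89PNG\r\n\x1a\n"
        + _chunk(b"IHDR", ihdr)
        + _chunk(b"IDAT", zlib.compress(raw))
        + _chunk(b"IEND", b"")
    )


# A 4x4 mask with a 2x2 highlighted region in the center (assumed convention).
mask = [
    [0, 0, 0, 0],
    [0, 255, 255, 0],
    [0, 255, 255, 0],
    [0, 0, 0, 0],
]
png_bytes = grayscale_png(mask)

# After uploading png_bytes somewhere, a row might look like this.
# Both URLs are placeholders, not real data.
row = {
    "image_url": "https://example.com/images/0001.png",
    "image_mask": "https://example.com/masks/0001.png",
}
```

In practice an imaging library would usually write the PNG; the manual encoder here only keeps the sketch free of external dependencies.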
Data in this page