Text-to-Image Alignment
Analyze text-to-image alignment with word-level scoring and visual masks.
This example shows human labelers' feedback on text-to-image generation. A model was given a text prompt and generated an image. Human labelers were then asked to highlight the image areas and words that have alignment issues.
Data Source
Data is excerpted from: Rapidata/text-2-image-Rich-Human-Feedback-32k
Rapidata can help you get similar human labeling for your data.
Use SmooSense to visualize your data
Word scores
- Name the column so that its name contains word_score.
- Cell values should be strings: the JSON dump of a list of [word, score] pairs. For example:
[["seven", 2.04], ["pixelated", 0.5219], ...]
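A minimal sketch of producing such a cell value with Python's standard json module. The helper name encode_word_scores and the sample pairs are illustrative assumptions, not part of SmooSense:

```python
import json


def encode_word_scores(pairs):
    """Serialize (word, score) pairs into a JSON string of [word, score] lists."""
    # Hypothetical helper: each pair becomes a two-element list [word, score].
    return json.dumps([[word, float(score)] for word, score in pairs])


# Sample pairs taken from the example above.
cell = encode_word_scores([("seven", 2.04), ("pixelated", 0.5219)])
print(cell)  # [["seven", 2.04], ["pixelated", 0.5219]]

# Reading the cell back gives a list of [word, score] lists.
decoded = json.loads(cell)
```

Store the resulting string in a column whose name contains word_score, for example word_scores.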
Image mask
- Name the column so that its name contains image_mask.
- Save the mask data as a grayscale PNG file and store its URL in the column.
- Ensure there is a column named image_url containing the corresponding image.
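As a rough sketch of the mask steps above, the following encodes a small 2D list of intensities as an 8-bit grayscale PNG using only the standard library. In practice an imaging library would usually do this. The function name encode_grayscale_png, the sample mask values, and the example.com URLs are all assumptions made for illustration. How SmooSense interprets pixel intensities is not specified here.

```python
import struct
import zlib


def _chunk(tag: bytes, data: bytes) -> bytes:
    """Build one PNG chunk: length, type, data, CRC over type + data."""
    body = tag + data
    return struct.pack(">I", len(data)) + body + struct.pack(">I", zlib.crc32(body))


def encode_grayscale_png(rows):
    """Encode rows of 0-255 integers as an 8-bit grayscale PNG (hypothetical helper)."""
    height = len(rows)
    width = len(rows[0])
    if any(len(row) != width for row in rows):
        raise ValueError("all rows must have the same length")
    # Each scanline is prefixed with filter type 0 (no filtering).
    raw = b"".join(b"\x00" + bytes(row) for row in rows)
    # IHDR: width, height, bit depth 8, color type 0 (grayscale),
    # compression 0, filter 0, interlace 0.
    ihdr = struct.pack(">IIBBBBB", width, height, 8, 0, 0, 0, 0)
    return (
        b"\x89PNG\r\n\x1a\n"
        + _chunk(b"IHDR", ihdr)
        + _chunk(b"IDAT", zlib.compress(raw))
        + _chunk(b"IEND", b"")
    )


# Illustrative 4x4 mask; the values are made up.
mask = [
    [0, 0, 0, 0],
    [0, 255, 255, 0],
    [0, 128, 128, 0],
    [0, 0, 0, 0],
]
png_bytes = encode_grayscale_png(mask)

# After uploading the PNG somewhere reachable, store its URL next to the image URL.
# Both URLs below are placeholders.
row = {
    "image_url": "https://example.com/images/0001.png",
    "image_mask": "https://example.com/masks/0001.png",
}
```

Write png_bytes to a file or object store of your choice, then put the resulting URL in the image_mask column of the same row as the matching image_url.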