Text-to-Image Alignment

Analyze text-to-image alignment with word-level scoring and visual masks.

This example shows human labelers' feedback to text-to-image generation. Model was given text and generated image. Human labelers were asked to highlight image areas and words having alignment issues.

Data Source

Data is excerpted from: Rapidata/text-2-image-Rich-Human-Feedback-32k

Rapidata can help you get similar human labeling for your data.


Use SmooSense to visualize your data

Word scores

  • Name the column such that it contains word_score
  • Cell values should be string, json dumps of list of word and score pair. For example:
[["seven", 2.04], ["pixelated", 0.5219], ...]

Image mask

  • Name your column such that it contains image_mask.
  • Save mask data as a grayscale png file and store url in the column.
  • Ensure there is a column named image_url containing the corresponding image.

Data in this page

Loading...