- Built a scalable annotation pipeline for 50K+ MAdVerse ads, automatically enriching metadata with design attributes (color, font, shape, style) via OCR and vision models (Tesseract, OpenCV, ONNX).
- Enhanced ad generation quality by 36% in CLIP alignment by integrating aesthetic and branding insights, improving design consistency and marketing personalization.
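The CLIP-alignment improvement above can be understood as a change in the cosine similarity between image and text embeddings. A minimal sketch, assuming precomputed embeddings (the vectors below are hypothetical toy values; real CLIP embeddings have 512+ dimensions):

```python
import numpy as np

def clip_alignment(image_emb: np.ndarray, text_emb: np.ndarray) -> float:
    """Cosine similarity between L2-normalized embeddings.

    Higher values indicate better image-text alignment."""
    image_emb = image_emb / np.linalg.norm(image_emb)
    text_emb = text_emb / np.linalg.norm(text_emb)
    return float(np.dot(image_emb, text_emb))

# Hypothetical 4-dimensional embeddings, for illustration only.
img = np.array([0.9, 0.1, 0.3, 0.2])
txt = np.array([0.8, 0.2, 0.4, 0.1])
score = clip_alignment(img, txt)
print(round(score, 3))
```

An "X% improvement in CLIP alignment" then means this score, averaged over a held-out set of (prompt, generated ad) pairs, rose by that relative amount.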
Multi-modal models such as CLIP and VQ-VAE enable text-to-image generation for advertisements, but their outputs often fail to meet branding and aesthetic requirements (e.g., “minimalist yet vibrant”). No existing advertisement dataset annotates theme-related design elements such as color, font, and text. We therefore enriched the MAdVerse dataset so that models can better understand graphic design themes and produce higher-quality outputs.
We built our dataset on MAdVerse, an extensive multilingual dataset of over 50,000 ads spanning 10 domains, from shopping to baby products to financial institutions, providing a wide variety of advertisements to draw from.
Create a pipeline that, given any advertisement, extracts the information relevant to its graphic design theme.
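One stage of such a pipeline is dominant-color extraction. The sketch below uses simple pixel quantization as a lightweight stand-in for the k-means or OpenCV analysis the full pipeline would run; the function name and parameters are illustrative assumptions, not the project's actual code:

```python
from collections import Counter
import numpy as np

def dominant_colors(image: np.ndarray, n: int = 3, step: int = 32) -> list:
    """Snap RGB pixels to coarse bin centers, then return the n most
    common colors as (R, G, B) tuples."""
    quantized = (image // step) * step + step // 2
    pixels = quantized.reshape(-1, 3)
    counts = Counter(tuple(int(v) for v in p) for p in pixels)
    return [color for color, _ in counts.most_common(n)]

# Synthetic 4x4 "ad" that is mostly red with one white pixel.
img = np.full((4, 4, 3), [220, 30, 30], dtype=np.uint8)
img[0, 0] = [250, 250, 250]
print(dominant_colors(img, n=2))
```

In the real pipeline this color stage would run alongside OCR (Tesseract) for text and font cues and ONNX vision models for shape and style attributes.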
Our focus: automating dataset generation for online advertisements to fine-tune Claude 4.0, enabling entrepreneurs and founders to create high-quality advertising campaigns without the need for human capital.
We add annotations to the MAdVerse dataset for typography, color, shape, and design style, since these are recognized as the most important visual elements of graphic design. By improving a model's ability to produce better-aligned images, users can meet their aesthetic requirements, which in turn increases viewing time for the advertised product: time spent looking at a design is positively correlated with preference for it.
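To make the enrichment concrete, one annotated record might look like the following. The field names and values here are hypothetical, chosen only to illustrate the typography, color, shape, and style attributes described above, not the dataset's actual schema:

```python
import json

# Hypothetical enriched-annotation record for one MAdVerse ad.
annotation = {
    "ad_id": "example_0001",                      # illustrative identifier
    "domain": "shopping",
    "ocr_text": "SUMMER SALE 50% OFF",            # text extracted via OCR
    "typography": {"case": "uppercase", "weight": "bold"},
    "colors": {"dominant": "#D01C1C", "palette": ["#D01C1C", "#FFFFFF"]},
    "shapes": ["rectangle", "starburst"],
    "design_style": "minimalist yet vibrant",
}
print(json.dumps(annotation, indent=2))
```

Records in this shape give a fine-tuned model explicit, queryable design attributes rather than raw pixels alone.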