Introduction

You can choose which type of content to generate:

Images (Computer Vision)

Starting from a pre-trained HuggingFace Stable Diffusion model that generates artistic images from an input image and a prompt, we want to further train it on a niche dataset (in our case, images of cars in different locations). The goal is to generate high-quality car images in artistic styles from a text prompt.
We then want to use the generated car images for two further use cases:
- To replace the car in the image with a different one, using a generated mask and another Stable Diffusion model.
- To improve the image quality with the Stable Diffusion upscaling model.
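
The three image steps above (image-to-image generation, car replacement with a mask, and upscaling) can be sketched with the HuggingFace diffusers library. This is a minimal sketch under stated assumptions, not the challenge's reference implementation: the checkpoint names, the helper names (snap_size, stylise, replace_car, upscale) and the default parameter values are illustrative choices that do not come from this brief, and fine-tuning on the car dataset is not shown. Pass the path of your own fine-tuned checkpoint as model_id to stylise.

```python
# Assumed checkpoint names (not from the brief); swap in your own.
INPAINT_MODEL = "stabilityai/stable-diffusion-2-inpainting"
UPSCALE_MODEL = "stabilityai/stable-diffusion-x4-upscaler"


def snap_size(width, height, max_side=512, multiple=8):
    """Shrink (width, height) so the longer side is at most max_side,
    then round both sides down to a multiple of 8, which Stable
    Diffusion pipelines expect."""
    longest = max(width, height)
    if longest > max_side:
        width = width * max_side // longest
        height = height * max_side // longest
    width = max(multiple, width // multiple * multiple)
    height = max(multiple, height // multiple * multiple)
    return width, height


def _load(pipeline_name, model_id):
    """Load a diffusers pipeline by class name, on GPU when available."""
    # Imported lazily so snap_size can be used without the heavy libraries.
    import torch
    import diffusers

    device = "cuda" if torch.cuda.is_available() else "cpu"
    dtype = torch.float16 if device == "cuda" else torch.float32
    pipeline_cls = getattr(diffusers, pipeline_name)
    return pipeline_cls.from_pretrained(model_id, torch_dtype=dtype).to(device)


def stylise(init_image, prompt, model_id, strength=0.6):
    """Image-to-image: repaint a PIL image guided by a text prompt.
    model_id is your (fine-tuned) Stable Diffusion checkpoint."""
    pipe = _load("StableDiffusionImg2ImgPipeline", model_id)
    image = init_image.convert("RGB").resize(snap_size(*init_image.size))
    # strength near 0 keeps the input; near 1 mostly ignores it.
    return pipe(prompt=prompt, image=image, strength=strength).images[0]


def replace_car(image, mask, prompt, model_id=INPAINT_MODEL):
    """Inpainting: white mask pixels are regenerated from the prompt,
    black pixels are kept. How the mask is produced is up to you."""
    pipe = _load("StableDiffusionInpaintPipeline", model_id)
    width, height = snap_size(*image.size)
    return pipe(
        prompt=prompt,
        image=image.convert("RGB").resize((width, height)),
        mask_image=mask.convert("L").resize((width, height)),
        width=width,
        height=height,
    ).images[0]


def upscale(image, prompt, model_id=UPSCALE_MODEL):
    """4x upscaling; memory use grows quickly, so start from a small image."""
    pipe = _load("StableDiffusionUpscalePipeline", model_id)
    return pipe(prompt=prompt, image=image.convert("RGB")).images[0]
```

A typical run would open a photo with PIL, call stylise(photo, "a sports car in watercolour style", model_id="path/to/your-finetuned-model"), then pass the result to replace_car with a mask image and finally to upscale. The path shown is a placeholder. Each call reloads its pipeline for simplicity; in a notebook you would normally load each pipeline once and reuse it.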

Text (Natural Language Processing)

Using a pre-trained HuggingFace text generation model, we want to generate textual data from simple, intuitive input prompts. The output can be evaluated on how believable it is. Here are some examples of what can be generated:

- A news article on a specific current-events topic (e.g. space)
- An imaginary product based on reviews or product specifications (e.g. Amazon reviews or product data)
- Synthetic data that can be used to improve an existing machine learning algorithm on a specific use case
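
As a starting point for the text track, the sketch below builds a prompt for one of the example tasks above and samples continuations with the HuggingFace transformers text-generation pipeline. It is only an illustration under assumptions: the prompt templates, the function names (build_prompt, generate) and the sampling settings are invented for this example, and EleutherAI/gpt-neo-2.7B is used as one possible open-source model. Because GPT-Neo is a plain language model rather than an instruction-following one, the templates are written as text to be continued.

```python
# Assumed default model; any causal language model on the HuggingFace Hub works.
TEXT_MODEL = "EleutherAI/gpt-neo-2.7B"

# Hypothetical continuation-style templates, one per example task.
PROMPT_TEMPLATES = {
    "news": "News article\nTopic: {topic}\nHeadline:",
    "product": "Customer reviews:\n{topic}\n\nProduct description:",
    "synthetic": "Examples:\n{topic}\nAnother example in the same format:",
}


def build_prompt(kind, topic):
    """Fill the template for `kind` with the user's topic or seed text."""
    if kind not in PROMPT_TEMPLATES:
        raise ValueError(f"unknown kind {kind!r}; expected one of {sorted(PROMPT_TEMPLATES)}")
    return PROMPT_TEMPLATES[kind].format(topic=topic.strip())


def generate(prompt, model_id=TEXT_MODEL, max_new_tokens=200, n=1):
    """Sample n continuations of the prompt.
    Each returned string starts with the prompt itself."""
    # Imported lazily so build_prompt can be used without transformers installed.
    from transformers import pipeline

    generator = pipeline("text-generation", model=model_id)
    outputs = generator(
        prompt,
        max_new_tokens=max_new_tokens,
        do_sample=True,          # sample instead of greedy decoding
        temperature=0.9,         # assumed value; lower is more conservative
        top_p=0.95,              # nucleus sampling cutoff, also an assumption
        num_return_sequences=n,
    )
    return [out["generated_text"] for out in outputs]
```

For example, generate(build_prompt("news", "space"), n=3) would return three candidate articles to compare for believability. The 2.7B model needs several gigabytes of memory, so a smaller checkpoint may be more practical while iterating on prompts.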

Tools and Technologies

Development Environment

Google Colab Model Platform

APIs

HuggingFace, OpenAI

Models

Computer Vision

Stable Diffusion

Natural Language Processing

GPT-3 (OpenAI), GPT-Neo (2.7B) and GPT-NeoX (20B) - The choice of model for the Natural Language Processing area of the challenge is relatively flexible. Feel free to use any open-source large language model available on the HuggingFace platform.

Inspiration

Useful resources and example notebooks can be found at this Google Drive link.
https://drive.google.com/drive/u/1/folders/1R4boDg9kBaxKKLLbnuGQKszeGuKgPSs1