Visualent

Inspiration

We have always been fascinated by unsolved mysteries: we loved debating who Jack the Ripper really was, or whether the three men who broke out of Alcatraz actually got away. That interest in unsolved crimes led us to build a tool that could turn fleeting memories into tangible leads.

What it does

Visualent is a tool that prompts users to describe a perpetrator's face. Based on the description, it extracts information and categorizes it into attributes such as weight, gender, age, race, hair, eyes, facial hair, face shape, and any unique features. It then generates an image, which users can refine and edit to ensure the final visual representation aligns closely with their recollection.
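To make the attribute step concrete, here is a minimal sketch of how the extracted categories could be stored and turned into a text prompt for image generation. The `FaceDescription` class, the `build_prompt` function, and the prompt wording are hypothetical and are not taken from the Visualent codebase; only the category names come from the description above.

```python
from dataclasses import dataclass, replace
from typing import Optional, Tuple


@dataclass(frozen=True)
class FaceDescription:
    """Hypothetical container for the attribute categories listed above."""
    weight: Optional[str] = None
    gender: Optional[str] = None
    age: Optional[str] = None
    race: Optional[str] = None
    hair: Optional[str] = None
    eyes: Optional[str] = None
    facial_hair: Optional[str] = None
    face_shape: Optional[str] = None
    unique_features: Tuple[str, ...] = ()


def build_prompt(desc: FaceDescription) -> str:
    """Join every attribute that was actually provided into one prompt string."""
    labelled = [
        ("weight", desc.weight),
        ("gender", desc.gender),
        ("age", desc.age),
        ("race", desc.race),
        ("hair", desc.hair),
        ("eyes", desc.eyes),
        ("facial hair", desc.facial_hair),
        ("face shape", desc.face_shape),
    ]
    # Skip attributes the witness did not mention.
    parts = [f"{label}: {value}" for label, value in labelled if value]
    # Unique features are free text, so they are appended unchanged.
    parts.extend(desc.unique_features)
    return ", ".join(parts)


# A first description, then a refinement that changes only the hair.
first = FaceDescription(
    gender="male",
    age="around 40",
    hair="short grey",
    unique_features=("scar over left eyebrow",),
)
refined = replace(first, hair="short black")
print(build_prompt(first))
print(build_prompt(refined))
```

Keeping the description immutable and producing a new copy for each refinement is one simple way to model the refine-and-edit step: every revision can be regenerated or compared against the previous one.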

How we built it

Frontend Development: We built the user interface with React.

Backend Architecture: A Python backend built on the FastAPI framework handles requests from the frontend.

AI Model Deployment: We deployed our AI model on replicate.com, which hosts it and serves image generation requests from the backend.
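As an illustration of how a backend might hand a prompt to a model hosted on replicate.com, the sketch below builds (but does not send) a request to Replicate's HTTP predictions endpoint using only the standard library. The function name, the placeholder version ID and token, and the `prompt` input key are assumptions; the input fields a model accepts depend on how that model was published, and this is not the project's actual backend code.

```python
import json
import urllib.request

# Replicate's HTTP endpoint for creating a prediction.
REPLICATE_URL = "https://api.replicate.com/v1/predictions"


def build_prediction_request(
    prompt: str, model_version: str, api_token: str
) -> urllib.request.Request:
    """Build a POST request asking a hosted model to generate an image.

    `model_version` and `api_token` are placeholders supplied by the caller.
    The "prompt" input key is an assumption about the deployed model.
    """
    body = json.dumps(
        {"version": model_version, "input": {"prompt": prompt}}
    ).encode("utf-8")
    return urllib.request.Request(
        REPLICATE_URL,
        data=body,
        headers={
            "Authorization": f"Bearer {api_token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Example: build the request without sending it.
request = build_prediction_request(
    prompt="gender: male, hair: short grey",
    model_version="<model-version-id>",
    api_token="<api-token>",
)
print(request.get_method(), request.full_url)
```

Sending the request with `urllib.request.urlopen(request)` would return a JSON description of the prediction, which the backend would then poll or wait on before returning the generated image to the frontend.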

Data Collection and Model Training: Because of the project's specialized subject, we could not find an existing dataset that fit. Instead, we sourced a diverse range of images from the Cleveland Public Library Digital Gallery.

A custom AI model powers the core of Visualent. Starting from the realvisxl2 Stable Diffusion model, we fine-tuned it on our curated dataset using Hugging Face's diffusers and accelerate libraries. This let us adapt the model specifically to forensic-style facial imagery and detail.
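Fine-tuning a text-to-image model needs each training image paired with a caption. One common layout, used by the Hugging Face datasets image folder loader, is a `metadata.jsonl` file that sits next to the images and has a `file_name` field plus a caption column per line. The sketch below writes such a file; the file names, the captions, and the `text` column name are assumptions for illustration and do not describe the team's actual dataset.

```python
import json
from pathlib import Path
from typing import Dict, List


def build_metadata_lines(captions: Dict[str, str]) -> List[str]:
    """Return one JSON line per image, pairing a file name with its caption."""
    return [
        json.dumps({"file_name": file_name, "text": text})
        for file_name, text in sorted(captions.items())
    ]


def write_metadata(image_dir: Path, captions: Dict[str, str]) -> Path:
    """Write metadata.jsonl into the folder that holds the training images."""
    out_path = image_dir / "metadata.jsonl"
    out_path.write_text(
        "\n".join(build_metadata_lines(captions)) + "\n", encoding="utf-8"
    )
    return out_path


# Hypothetical captions keyed by image file name.
example_captions = {
    "0002.jpg": "mugshot of a woman, dark curly hair, round face",
    "0001.jpg": "mugshot of a man, short grey hair, square face",
}
for line in build_metadata_lines(example_captions):
    print(line)
```

Sorting by file name keeps the output deterministic, which makes it easier to review caption changes between dataset revisions.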

Challenges we ran into

We faced two main hurdles: acquiring a diverse dataset and fine-tuning our AI model with limited resources. We addressed the dataset problem by using the Cleveland Public Library's collection of mugshots, which gave us training material we could not find elsewhere. Fine-tuning the model despite the small dataset demanded ingenuity and persistence.

Accomplishments that we're proud of

We're proud of fine-tuning a Stable Diffusion model for forensic imagery and of building a sleek, intuitive front end. Together, these are a meaningful step toward combining AI with forensic artistry.

What we learned

For most of the group, this was our first hackathon, so we learned just how powerful it is to work in a team. Each of us worked to our strengths, and together we built the best version of Visualent we could.

What's next for Visualent

We would like to add voice input, so users can describe the perpetrator by speaking instead of typing. We also want to generate a 3D model of the person, which should help users visualize them better than 2D images alone. Most important of all is an interactive feedback loop that lets the user refine both the 2D and 3D renderings of the perpetrator. Finally, we would like a more accurate dataset of images of criminals.

Built With

html
huggingface
javascript
python
pytorch
react
tailwind

Updates

Mark Shteyn started this project — Feb 25, 2024 08:59 PM EST
