CaseConnect

Inspiration

CaseConnect was inspired by the need for an efficient way to search and identify missing persons cases within the NamUs database, helping law enforcement and the public to quickly find relevant cases.

What it does

CaseConnect is a semantic search engine for the NamUs missing persons database. It scrapes case data, downloads associated images, and embeds both text and images into a vector space. Users can search the database using text or images, retrieving similar cases.

How we built it

We first built a scraper to extract case data from the NamUs database. Text was then embedded with text-embedding-ada-002 and images with ViT-bigG-14. Search is performed with a nearest-neighbor lookup over the resulting embeddings.
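To illustrate the lookup step, here is a minimal sketch of a brute-force nearest-neighbor search over embeddings using cosine similarity. It is not the project's actual code: the function names, the toy 3-dimensional vectors, and the case IDs are all hypothetical, and it uses only the standard library so it runs anywhere. A real index would hold the model embeddings and would more likely rely on a library such as scikit-learn.

```python
import math
from typing import List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat a zero vector as similar to nothing
    return dot / (norm_a * norm_b)


def nearest_cases(
    query: Sequence[float],
    index: List[Tuple[str, Sequence[float]]],
    k: int = 3,
) -> List[Tuple[str, float]]:
    """Return the k (case_id, similarity) pairs most similar to the query."""
    scored = [(case_id, cosine_similarity(query, vec)) for case_id, vec in index]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Hypothetical toy index: (case ID, embedding) pairs.
toy_index = [
    ("case-A", [1.0, 0.0, 0.0]),
    ("case-B", [0.9, 0.1, 0.0]),
    ("case-C", [0.0, 1.0, 0.0]),
]

results = nearest_cases([1.0, 0.05, 0.0], toy_index, k=2)
for case_id, score in results:
    print(case_id, round(score, 4))
```

Brute force scans every stored vector on each query, which is simple and exact. For a larger index, an approximate nearest-neighbor structure may be worth considering.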

Challenges we ran into

The main challenges were efficiently scraping the database, embedding text and images, and implementing various search functionalities like text-to-text, image-to-image, and text-to-image searches.
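One reason these modes are awkward to combine is that vectors from different models are not comparable: a text-embedding-ada-002 vector cannot be matched against a ViT-bigG-14 image vector. The sketch below shows one way the three modes could be routed, assuming text-to-image queries are encoded with the CLIP model's own text encoder so they land in the same space as the image embeddings. The routing table, the `SearchRoute` type, and the index names are hypothetical and are not taken from the project.

```python
from typing import Dict, NamedTuple


class SearchRoute(NamedTuple):
    query_encoder: str  # model used to embed the query
    index: str          # which stored embeddings to search


# Hypothetical routing: each mode pairs a query encoder with an index
# built by a compatible model.
ROUTES: Dict[str, SearchRoute] = {
    "text-to-text": SearchRoute("text-embedding-ada-002", "text_index"),
    "image-to-image": SearchRoute("ViT-bigG-14 (image encoder)", "image_index"),
    "text-to-image": SearchRoute("ViT-bigG-14 (text encoder)", "image_index"),
}


def route_for(mode: str) -> SearchRoute:
    """Look up the encoder and index for a search mode."""
    try:
        return ROUTES[mode]
    except KeyError:
        raise ValueError(f"unsupported search mode: {mode!r}") from None


print(route_for("text-to-image"))
```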

Accomplishments that we're proud of

We successfully developed a scraper and embedded text and images, laying the foundation for an efficient semantic search engine for the NamUs database.

What we learned

We learned how to effectively extract and process data, and apply advanced embedding techniques to create a powerful search engine for missing persons cases.

What's next for CaseConnect

Future plans include building the search front end, setting up the embedding and front-end infrastructure, adding a chat feature, and implementing stretch goals such as converting sketches to images for searching and enabling live generations from sketches with semantic search.

Built With

ada-2
open-clip
openai
python
sklearn

Updates

Private user started this project — Mar 26, 2023 01:59 PM EDT