Inspiration

Every year, 5 million human trafficking victims are transported using counterfeit or forged passports and ID cards. These fake identity documents not only fuel human trafficking but also facilitate identity theft, allowing criminals to assume false identities, evade law enforcement, and expand their illegal operations. The problem is only getting worse—according to The New York Times, a new generation of 'unbeatable' fake IDs is making it increasingly difficult to distinguish real documents from fraudulent ones. To combat this growing crisis, we developed IdentityAI.

What it does

IdentityAI can detect even the most perfectly crafted false identity card. Given an ID where at least one field may be falsified—whether the photo, name, birthday, address, or all of the above—we leverage web scraping, machine learning, and AI agents to find the discrepancies. And if every field checks out, we can still verify the similarity of the ID photo to the face of its holder. We capture pictures of a person and their ID using Meta Ray-Ban glasses, process this information, and deploy agents that crawl the internet to verify the validity of the ID. A web UI lets the user view the results of our search, including a face-matching confidence score, address, age, and other identifying information. Our project uses a multitude of strategies to achieve this goal: we built a web scraper to perform reverse image searching, and we leverage Perplexity's Sonar and other online sources to validate the person's name, age, and address. At the end of this workflow, IdentityAI presents the results to the user, who then knows the validity of an ID regardless of its quality.

How we built it

We built the backend using Python and FastAPI. It exposes endpoints to OCR the ID, gather additional information from internet searches, and interact with large language models through tools and agents.

We leverage a YOLO model to detect the largest face in an image. This face is then sent to our reverse image search tool to find related articles, pictures, and websites that could contain correlating images. After receiving this corpus of information, we feed the acquired links into crawler instances to retrieve the data behind them. This data is then passed to an Anthropic LLM to aggregate it into meaningful leads.
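The "largest face" selection step can be sketched as below; the detector call appears only in a comment, since the exact YOLO weights and wrapper are specific to our setup:

```python
# Pick the largest detected face from a list of bounding boxes.
# Boxes are (x1, y1, x2, y2) tuples, e.g. obtained from a YOLO face model:
#   results = yolo_face_model(image)         # hypothetical call
#   boxes = results[0].boxes.xyxy.tolist()
def largest_face(boxes):
    if not boxes:
        return None
    def area(box):
        x1, y1, x2, y2 = box
        return max(0, x2 - x1) * max(0, y2 - y1)
    return max(boxes, key=area)

# Example: the second box covers the most pixels.
print(largest_face([(0, 0, 50, 50), (10, 10, 200, 220), (5, 5, 30, 40)]))
# -> (10, 10, 200, 220)
```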

The Meta Ray-Ban glasses do not expose an API for us to use, so we got creative in order to take advantage of this cool technology. We created a Facebook account to receive images sent from the glasses, and a Chrome extension monitors the Facebook Messenger DOM to forward those images to our server.

Additionally, we leveraged facial recognition to compare the ID holder's face with the photo on the ID: we measure similarity by taking the L2 norm between the two face encodings, then apply a formula to convert this norm into a percentage-based confidence score.
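The conversion can be sketched like this. The 0.6 distance threshold and the linear mapping are illustrative assumptions—the exact formula we used is not reproduced here:

```python
import numpy as np

def match_confidence(enc_a, enc_b, threshold=0.6):
    """Convert the L2 distance between two face encodings into a
    percentage confidence. `threshold` is the distance at which
    confidence bottoms out at 0 (illustrative value)."""
    d = np.linalg.norm(np.asarray(enc_a) - np.asarray(enc_b))
    return max(0.0, 1.0 - d / threshold) * 100.0

# Identical encodings give 100% confidence.
print(match_confidence([0.1] * 128, [0.1] * 128))  # -> 100.0
```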

We also queried Perplexity's Sonar API to gather more leads on whether the information on the ID card is falsified. Furthermore, we experimented with an agentic workflow that leverages all of these tools and makes qualitative decisions about what information is still needed to reach a conclusion about the authenticity of an ID. We built an orchestrator agent that interacts with OSINT agents and a decision agent, and we provided these agents with function calls to the various tools we built.
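One way to expose tools to agents as function calls is a simple name-to-callable registry that an orchestrator dispatches against; the tool names and signatures below are illustrative, not our exact interfaces:

```python
# Illustrative tool registry: an agent requests a tool by name and the
# orchestrator dispatches the call with the agent-supplied arguments.
def reverse_image_search(image_id):
    return {"links": []}  # placeholder for the real scraper

def sonar_lookup(name, address):
    return {"leads": []}  # placeholder for the Sonar API call

TOOLS = {
    "reverse_image_search": reverse_image_search,
    "sonar_lookup": sonar_lookup,
}

def dispatch(tool_name, **kwargs):
    if tool_name not in TOOLS:
        raise ValueError(f"unknown tool: {tool_name}")
    return TOOLS[tool_name](**kwargs)

print(dispatch("sonar_lookup", name="Jane Doe", address="123 Main St"))
# -> {'leads': []}
```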

Challenges we ran into

Our biggest challenge was creating agents that run together and deliberate with our dedicated tools to find the next best move: gather more information with a tool, or, if the information on hand is already good enough, determine the final output. Striking the right balance between exploration and resolution was tricky, because without clear stopping conditions we risked infinite loops or premature decisions with low confidence. To tackle this, we implemented a structured decision loop that iterates through agent deliberation while enforcing constraints on information retrieval. On each iteration, the DecisionAgent assesses the available data and determines whether to request additional OSINT queries or finalize its output.
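That decision loop can be sketched as follows; the iteration cap and the agents' call interfaces are assumptions made for illustration:

```python
# Bounded deliberation loop: each iteration the decision agent either
# asks for more OSINT data or finalizes. MAX_ITERS guards against the
# infinite loops described above. Interfaces are illustrative.
MAX_ITERS = 5

def run_decision_loop(decision_agent, osint_agent, evidence):
    for _ in range(MAX_ITERS):
        action = decision_agent(evidence)   # returns "gather" or "finalize"
        if action == "finalize":
            return {"verdict": "decided", "evidence": evidence}
        evidence = evidence + [osint_agent(evidence)]  # gather one more lead
    # Cap reached: force a decision with whatever we have.
    return {"verdict": "forced", "evidence": evidence}

# Toy agents: finalize once we hold at least three pieces of evidence.
decide = lambda ev: "finalize" if len(ev) >= 3 else "gather"
gather = lambda ev: f"lead-{len(ev)}"
print(run_decision_loop(decide, gather, []))
# -> {'verdict': 'decided', 'evidence': ['lead-0', 'lead-1', 'lead-2']}
```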

Another challenge we ran into was the image quality of the Meta glasses. In testing, phone cameras captured high-resolution images that we could manually enlarge and crop, while the glasses' photos gave us far less detail to work with.

We also encountered difficulties with web scraping. Websites we interacted with had bot detection such as reCAPTCHA, so we got creative combining available tools to bypass these measures (e.g., mocking a real user's Chrome profile and randomizing delays when interacting with the DOM)—all to allow us to continue our altruistic crawl of the internet!

Accomplishments that we're proud of

Based on just an individual's face, we can build a comprehensive profile by piecing together smaller bits of evidence from various sources on the web. We can then cross-reference this information with any ID card to validate identification with a high degree of accuracy. We truly embraced the hacker mentality, interacting with technologies and tools in ways beyond their originally intended purpose, which forced us to think outside the box to achieve our goals. Overall, we are proud of the extent of the functionality we were able to build for this project.

What we learned

Building complex agentic systems is challenging. Unlike standard software development, where logic is explicit, designing AI agents often feels like recursive programming, where you must construct the right scaffolding and then take a leap of faith, trusting the system to generalize and adapt dynamically.

One of the biggest lessons we learned is that orchestrating multiple AI agents requires a different mindset. Instead of writing step-by-step instructions, we had to think about how agents interact, how they handle uncertainty, and how to design feedback loops that improve performance over time and rebound from errors. We also learned a lot about the roles of prompt engineering and system design in shaping agent behavior: small changes in instructions, memory design, or action constraints could lead to vastly different outputs. For example, by adding a simple memory list of previous queries to one of our agents, it could effectively learn from its past queries and make better requests as a result.
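The query-memory trick amounts to a list the agent consults before issuing a new request; the interface below is an illustrative sketch, not our exact implementation:

```python
# Illustrative query memory: the agent records past queries and skips
# exact repeats, so each new OSINT call explores something new.
class QueryMemory:
    def __init__(self):
        self.past = []

    def should_run(self, query):
        return query not in self.past

    def record(self, query):
        self.past.append(query)

mem = QueryMemory()
for q in ["jane doe boston", "jane doe boston", "jane doe dob 1990"]:
    if mem.should_run(q):
        mem.record(q)
print(mem.past)  # -> ['jane doe boston', 'jane doe dob 1990']
```

In practice the memory list would also be injected into the agent's prompt so the model itself can reason about what it has already tried.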

What's next for Identity AI

A potential extension of our project would be to actively match faces against the FBI’s national kidnapping database. By integrating real-time facial recognition, we could help identify trafficking victims who need to be rescued and alert authorities in critical situations.
