DataBridge

✨Inspiration

We’ve all heard the rumors about how Audit and Assurance have more overworked interns than the Super Bowl staff at halftime - but in reality, the chaos isn’t caused by the coffee runs… it’s the data. When two financial institutions merge, their systems speak entirely different languages - columns don’t match, schemas collide, and context gets lost faster than last quarter’s reports.

That’s where DataBridge steps in. We were inspired to imagine a world where EY’s data teams no longer have to wrestle with endless CSVs or mismatched databases. Instead, AI-powered agents do the heavy lifting - understanding schemas, aligning columns, resolving conflicts, and ensuring perfect integrity - all automatically.

After all, imagine if you could have 10 overworked Gemini agents instead of 10 overworked interns. Would EY want that? Well… we’ll let you decide.

Our mission is simple - to build the bridge between raw, chaotic data and clean, intelligent insight - empowering EY to merge systems effortlessly, with AI precision and human oversight.

🚀 What it does

Prerequisites: Ideally, a company first uploads its schema, customer, transaction, and account files to its cloud data warehouse (Snowflake).

Steps: When a user logs into our website, they can trigger different actions through our multifaceted Gemini chat agent.

EXPLORE: The mapping agents generate a query and give you more context about the specific part of the file you are asking about (e.g. tell you more about userID: ae9ocs).
MERGE: Multiple agents work together to merge the companies' files and produce merged CSVs, one per file type (Transactions, Accounts, etc.).
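
To make the MERGE step concrete, here is a minimal sketch of aligning two differently named schemas onto one set of columns and combining the rows. It is an illustration only, not our production pipeline: the column maps, the sample rows, and the "first value wins" conflict rule are all assumptions, and in DataBridge the mapping would come from the schema-mapping agents rather than being hard-coded.

```python
from typing import Dict, List

Row = Dict[str, str]

def align_rows(rows: List[Row], column_map: Dict[str, str]) -> List[Row]:
    """Rename source columns to canonical names; unmapped columns are dropped."""
    return [
        {column_map[col]: value for col, value in row.items() if col in column_map}
        for row in rows
    ]

def merge_sources(a: List[Row], map_a: Dict[str, str],
                  b: List[Row], map_b: Dict[str, str], key: str) -> List[Row]:
    """Combine two sources on a shared key after aligning their columns."""
    merged: Dict[str, Row] = {}
    for row in align_rows(a, map_a) + align_rows(b, map_b):
        record = merged.setdefault(row[key], {})
        for col, value in row.items():
            # Assumed conflict rule: the first value seen for a column wins.
            record.setdefault(col, value)
    return list(merged.values())

# Hypothetical sample data: two banks naming the same fields differently.
BANK_A = [{"cust_id": "1", "full_name": "Ann"}]
BANK_B = [
    {"CustomerID": "1", "Email": "ann@example.com"},
    {"CustomerID": "2", "Email": "bob@example.com"},
]
MAP_A = {"cust_id": "customer_id", "full_name": "name"}
MAP_B = {"CustomerID": "customer_id", "Email": "email"}

MERGED = merge_sources(BANK_A, MAP_A, BANK_B, MAP_B, key="customer_id")
```

Here customer 1 appears in both banks, so the two partial records collapse into one row, while customer 2 is carried over as-is.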

While the merge is running (around a minute), you can follow its progress in real time right on the website and see which agents are being used in the Agents tab.

On Snowflake, data analytics such as row and column counts for both the merged and unmerged files are available, so you can confirm that everything went as expected once the merge is done.
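
The kind of post-merge check described above can be sketched in a few lines. This is a hedged illustration, not the Snowflake dashboard itself: `shape`, `merge_report`, and the rule that a merge should not output more rows than it received are all assumptions made for the example.

```python
from typing import Dict, List, Tuple

Row = Dict[str, str]

def shape(rows: List[Row]) -> Tuple[int, int]:
    """Return (row count, distinct column count) for a list of dict rows."""
    columns = {col for row in rows for col in row}
    return len(rows), len(columns)

def merge_report(sources: Dict[str, List[Row]], merged: List[Row]) -> Dict[str, object]:
    """Compare the shapes of the unmerged sources against the merged output."""
    source_shapes = {name: shape(rows) for name, rows in sources.items()}
    rows_in = sum(n_rows for n_rows, _ in source_shapes.values())
    merged_shape = shape(merged)
    return {
        "sources": source_shapes,
        "merged": merged_shape,
        "rows_in": rows_in,
        # Assumed sanity rule: merging never creates more rows than it was given.
        "row_count_ok": merged_shape[0] <= rows_in,
    }
```

A report like this gives a quick signal when a merge silently duplicated rows or lost columns, before anyone opens the files.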

On our website, users can also use the buttons in the navbar: Projects to see their merging projects, User to see statistics on the merges they have run, and Teams to talk to the rest of the team.

Simply put: DataBridge doesn’t just merge data - it merges intelligence.

🔧 How we built it

We built DataBridge like a modern orchestra - every instrument (or in our case, agent) plays its part in perfect harmony.

Each Gemini agent, from schema mapping to SQL generation, runs as an independent containerized microservice.
A Master Agent acts as the conductor, dynamically deciding how many agents to spawn depending on dataset complexity.
All communication flows through FastAPI and WebSocket channels for real-time coordination.
Snowflake APIs handle every data operation, ensuring enterprise-grade scalability and security.
The architecture is fully extensible - new AI agents can be added with just one configuration file (no onboarding forms required).
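
As a rough illustration of how a Master Agent might size its pool of workers, the sketch below scales the number of agents with the width of the dataset and clamps it to a ceiling. The `DatasetStats` type, the `columns_per_agent` and `max_agents` parameters, and the heuristic itself are hypothetical; the real conductor's decision logic is not shown here.

```python
import math
from dataclasses import dataclass

@dataclass(frozen=True)
class DatasetStats:
    """Hypothetical summary of a merge job used to estimate its complexity."""
    tables: int
    total_columns: int

def agents_to_spawn(stats: DatasetStats,
                    columns_per_agent: int = 25,
                    max_agents: int = 10) -> int:
    """Return how many mapping agents to start for a dataset.

    Assumed heuristic: one agent per `columns_per_agent` columns,
    never fewer than 1 and never more than `max_agents`.
    """
    needed = math.ceil(stats.total_columns / columns_per_agent)
    return max(1, min(max_agents, needed))
```

Capping the count keeps a very wide dataset from starting an unbounded number of containers, while the floor of one agent means even a tiny job still gets processed.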

The result? A lightweight, fault-tolerant, and cloud-native data ecosystem that scales like Kubernetes on espresso shots.

Challenges we ran into

Building something this modular was both exciting and mildly terrifying.

We wrestled with cross-agent communication, ensuring all our AI agents played nice without introducing latency or data drift. We fought to maintain fault tolerance while managing dozens of microservices doing parallel work. Balancing real-time AI inference with efficient resource allocation felt like juggling servers and snowflakes at the same time.

And of course, integrating multiple AI frameworks into one unified pipeline was no small feat - Gemini had opinions, Snowflake had syntax, and our logs had a lot to say. But every challenge pushed us to engineer smarter, code cleaner, and think more like a production-grade EY solution than a hackathon prototype.

🤯 Accomplishments that we're proud of

Designed a production-ready AI agent framework from scratch - no monoliths allowed.
Built a modular FastAPI backend that scales multiple AI services autonomously.
Achieved end-to-end data flow orchestration across Gemini, Snowflake, and custom merge agents.
Delivered a Kubernetes-ready architecture that looks and feels enterprise-grade.
And most proudly, transformed what’s usually a painful, manual data integration process into something elegant, automated, and EY-approved.

Because if your AI system can handle merging two banks’ datasets without breaking a sweat, it’s definitely worth bragging about.

🧠 What we learned

This project was our masterclass in combining software engineering precision with AI-driven creativity.

We learned how to design multi-agent systems that stay lightweight yet powerful, and how API standardization is the secret sauce behind scalable AI ecosystems. We sharpened our collaboration between data engineers, AI developers, and DevOps leads - because good data pipelines, like good teams, depend on seamless coordination.

And perhaps most importantly, we realized that solving real-world data challenges isn’t just about technology - it’s about vision, teamwork, and occasionally, convincing your AI agents to stop arguing about schema mismatches.

🚀 What's next for DataBridge

This is just the first bridge we’ve built, and we’re already planning the next ones.

We’re working on a visual data orchestration dashboard for live monitoring and analytics. The dashboard could also report how much bandwidth our agents use and plot it in fluid graphs, for example through software such as Datadog. We’re also integrating LLM-based natural language querying, so users can simply ask, “Show me overlapping customers between Bank A and Bank B” and get instant insights.

We also plan to launch an agent marketplace, allowing EY teams to build and plug in their own AI modules tailored to specific data domains. Finally, we’re preparing multi-cloud deployment on GCP, AWS, and Azure with auto-scaling and full compliance.

Ultimately, we envision DataBridge as EY’s own AI-powered data operating system - the definitive bridge between humans, data, and intelligence.

And the best part? It doesn’t need interns to make it work.

Built With

cloud-native
datadog
docker
fastapi
firestore
gemini
geminiapi
github
gitlab-ci
googlecloudplatform
helm
jwt
kagent
kubernetes
mcp
openaiapi
pandas
postgresql
pycharm
python
react
restapi
snowflake
toolservers
typescript
vscode
websocket
yaml

Submitted to

Hack the Valley X
- Winner Best Finance AI Eval

Created by

Krithika Kannan
CS undergrad building real-world impact through OOP, automation & AI - curious, driven, and always shipping solutions.
Varnit Sahu
Advitiya Sharma
Shaun Plassery
builder extraordinaire, ce @ mac