Credit Cat

Inspiration

Applying for credit is opaque and intimidating. An applicant gets a yes or no with no sense of who they are in the system, why, or what to do next — and the tools that exist are built for lenders to screen people, not for people to understand themselves. We flipped that. Our user is the applicant at the start of their credit journey, and we leaned straight into the datathon theme: in a world anxious about AI, build the thing that augments a person instead of judging them.

What it does

You answer a few quick questions, one at a time. Credit Cat matches you to one of four real credit segments — The Builder, The Quiet File, The Established, and The Veteran — and shows a dashboard: what your segment means in plain English, the typical credit-limit range and card types for a profile like yours, a map of where you sit among all applicants, how typical or borderline you are, and — front and center — realistic, actionable steps toward the next segment up. It never predicts approval and never gives a score. It's a mirror, not a gatekeeper.
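One way to express "how typical or borderline you are" is to compare a person's distance to their own segment centroid with their distance to the next-nearest one. The write-up does not say how Credit Cat computes this, so treat the following as a sketch under assumptions: the data is synthetic, and `borderline_margin` is a hypothetical helper, not a function from the project.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs

# Synthetic 2-D points standing in for preprocessed applicants.
X, _ = make_blobs(n_samples=300, centers=4, random_state=0)
km = KMeans(n_clusters=4, n_init=10, random_state=0).fit(X)

def borderline_margin(model, point):
    """Return (segment, margin); margin near 0 is borderline, near 1 is typical."""
    # KMeans.transform gives the distance from the point to every centroid.
    dists = model.transform(np.asarray(point, dtype=float).reshape(1, -1))[0]
    nearest, second = np.sort(dists)[:2]
    segment = int(np.argmin(dists))
    margin = 0.0 if second == 0 else float((second - nearest) / second)
    return segment, margin

# A point sitting on a centroid is as typical as it can be.
seg_c, margin_c = borderline_margin(km, km.cluster_centers_[0])

# The midpoint between two centroids is equidistant from both,
# so it is far less typical of either segment.
mid = (km.cluster_centers_[0] + km.cluster_centers_[1]) / 2
seg_m, margin_m = borderline_margin(km, mid)

print(seg_c, round(margin_c, 3), seg_m, round(margin_m, 3))
```

A relative margin like this is easy to show as a simple "typical to borderline" gauge without turning it into a score or an approval prediction.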

How we built it

We attended a workshop on specification-driven development ("vibe coding with specifications") and applied it before writing any app code: three specs in our docs folder (system requirements, backend, frontend) defined the system, and the build followed them — including a strict split between backend.py (all model/data logic) and app.py (UI only).

The model is K-Means on 689 anonymized applicants. We dropped the identifier, held out the approve/deny label entirely (the clustering is unsupervised), one-hot encoded the categorical codes, scaled the continuous features, and log-transformed the heavily right-skewed ones. The data forced that last step on us: our first k=4 run produced a one-person cluster, which we traced to extreme right-skew in A14 (skewness ≈ 13). After a log1p transform, k=4 produced four well-sized groups.
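The preprocessing steps described above can be sketched roughly as follows. This is a minimal illustration on synthetic data, not the project's backend.py: every column name except A14 (`id`, `A1`, `A2`, `approved`), the skewness threshold of 2.0, and the random seeds are assumptions made for the example.

```python
import numpy as np
import pandas as pd
from sklearn.cluster import KMeans
from sklearn.compose import ColumnTransformer
from sklearn.pipeline import Pipeline, make_pipeline
from sklearn.preprocessing import FunctionTransformer, OneHotEncoder, StandardScaler

rng = np.random.default_rng(0)
n = 300
# Synthetic stand-in for the applicant table (hypothetical columns except A14).
df = pd.DataFrame({
    "id": np.arange(n),
    "A1": rng.choice(["a", "b"], size=n),            # categorical code
    "A2": rng.normal(30, 8, size=n),                 # roughly symmetric
    "A14": rng.lognormal(mean=3, sigma=2, size=n),   # heavily right-skewed
    "approved": rng.integers(0, 2, size=n),          # held out, never fitted on
})

# Drop the identifier and hold out the label before any fitting.
X = df.drop(columns=["id", "approved"])
categorical = ["A1"]
numeric = ["A2", "A14"]

# Flag right-skewed numeric columns (the 2.0 threshold is an assumption).
skewed = [c for c in numeric if X[c].skew() > 2.0]
plain = [c for c in numeric if c not in skewed]

preprocess = ColumnTransformer([
    ("cat", OneHotEncoder(handle_unknown="ignore"), categorical),
    ("num", StandardScaler(), plain),
    # log1p first, then scale, so one extreme value cannot dominate distances.
    ("log", make_pipeline(FunctionTransformer(np.log1p), StandardScaler()), skewed),
])

model = Pipeline([
    ("prep", preprocess),
    ("kmeans", KMeans(n_clusters=4, n_init=10, random_state=0)),
])
labels = model.fit_predict(X)
cluster_sizes = np.bincount(labels, minlength=4)
print(skewed, cluster_sizes)
```

Checking `cluster_sizes` after each run is a cheap way to catch a one-person cluster like the one described above.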

Our key modeling call: the silhouette score favored k=2, and we report that as the strongest split. But a tool that only says "Type A or B" isn't useful to a person, so we deliberately chose k=4 for richer, navigable segments, knowingly trading a little statistical separation for human usefulness. The headline validation: the model never saw the approve/deny label, yet the four segments separate cleanly by historical approval (~20% / 40% / 80% / 90%). Our recommendation engine then uses the data to find which changeable features separate you from the next group (employment, tenure, income) and pairs them with proven credit habits. It never points to immutable traits.
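The comparison across k and the check against the held-out label can be sketched like this. The features come from `make_blobs` and the `approved` label is generated for illustration, so the printed numbers say nothing about the real dataset. The range of k and the variable names are assumptions.

```python
import numpy as np
import pandas as pd
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import silhouette_score

# Synthetic features standing in for the preprocessed applicant matrix.
X, _ = make_blobs(n_samples=400, centers=4, cluster_std=1.5, random_state=0)

# Silhouette score for each candidate k (higher means cleaner separation).
scores = {}
for k in range(2, 7):
    labels_k = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(X)
    scores[k] = silhouette_score(X, labels_k)
best_k = max(scores, key=scores.get)

# The product choice may differ from the statistically best k.
chosen_k = 4
labels = KMeans(n_clusters=chosen_k, n_init=10, random_state=0).fit_predict(X)

# Hypothetical held-out label: it depends on one feature and is only
# looked at AFTER clustering, never passed to KMeans.
rng = np.random.default_rng(0)
p_approve = 1.0 / (1.0 + np.exp(-X[:, 0] / 3.0))
approved = rng.random(len(X)) < p_approve

# Approval rate per segment, sorted from lowest to highest.
approval_by_segment = (
    pd.DataFrame({"segment": labels, "approved": approved})
    .groupby("segment")["approved"]
    .mean()
    .sort_values()
)
print(scores, best_k)
print(approval_by_segment)
```

Reporting both `best_k` and `chosen_k` keeps the trade-off visible: the statistically strongest split is stated, and the more useful one is what ships.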

The app is built in Streamlit and themed in a Capital One-inspired navy/red system with a custom cat logo. It has a multi-section landing page, a one-question-at-a-time flow with progress dots, a deliberately disabled demonstration paywall, and a results dashboard.

Challenges we ran into

The features are anonymized, so we inferred their meaning carefully and stayed honest about uncertainty. One flag even behaves opposite to its likely label, so we call that out rather than trust it. We also had to chase down the A14 outlier that hijacked an entire cluster. The harder challenge was design: translating abstract cluster math into advice an average person can actually act on.
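As a rough illustration of turning cluster math into advice, one option is to rank only the changeable features by how far the next segment's average sits above the current one. Everything here is an assumption for the sake of the sketch: the centroid values are invented, the column names (`employment_years`, `tenure`, `income_log`, `age`) are hypothetical, the segment order simply follows the order the segments are listed in, and `next_step_gaps` is not a function from the project.

```python
import pandas as pd

# Hypothetical segment means on standardized features (illustrative values only).
centroids = pd.DataFrame(
    {
        "employment_years": [-0.8, -0.1, 0.6, 1.1],
        "tenure": [-0.5, 0.0, 0.4, 0.9],
        "income_log": [-0.9, -0.3, 0.5, 1.0],
        "age": [-0.7, 0.1, 0.3, 1.2],  # immutable: never recommended
    },
    index=["The Builder", "The Quiet File", "The Established", "The Veteran"],
)
CHANGEABLE = ["employment_years", "tenure", "income_log"]

def next_step_gaps(current, centroids, changeable):
    """Rank changeable features by the gap to the next segment up."""
    order = list(centroids.index)
    pos = order.index(current)
    if pos == len(order) - 1:
        return pd.Series(dtype=float)  # already in the top segment
    nxt = order[pos + 1]
    gap = centroids.loc[nxt, changeable] - centroids.loc[current, changeable]
    # Keep only features where the next segment is higher, largest gap first.
    return gap[gap > 0].sort_values(ascending=False)

gaps = next_step_gaps("The Builder", centroids, CHANGEABLE)
print(gaps)
```

Restricting the ranking to an explicit `CHANGEABLE` list is what keeps immutable traits out of the advice, whatever the cluster geometry says.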

What's next

A fillable column-interpretation table, a "what-if" view that lets users see how a change might shift them, and richer per-segment guidance — all kept inside our no-prediction, no-gaming guardrails.

Built With

git
github
google-colab
joblib
k-means
matplotlib
numpy
pandas
pca
python
scikit-learn
streamlit

Updates

emmetgingerich Gingerich started this project — May 23, 2026 12:44 PM EDT
