MovieMap

TRY IT OUT: https://moviemap-frontend.onrender.com/

Inspiration

Our motivation for MovieMap came from the painful experience of choosing a movie to watch with other people. Whether it's family, partners, or friends, it's always a struggle. We aim to solve this in an enjoyable and effective way: each person gets an individual taste profile, projected onto a map of about 2,500 films, which can be paired with a partner's profile to find film recommendations that satisfy both of your tastes.

What it does

MovieMap turns the MovieLens 25M dataset into an interactive galaxy of 2,500 films. Each node is a movie, projected into 2D from a vector space with over 1,000 dimensions (similar in spirit to the embedding spaces used by large language models). Clusters form naturally: dark thrillers collect in one corner, animated family films in another, arthouse dramas somewhere in the middle.

You enter a Letterboxd username. Your rated, liked, and watched films light up across the map in amber. A second user loads in teal. The graph shows you where your tastes live, overlap, and diverge.

From there:

Solo mode computes a weighted taste centroid from your ratings and recommends films you have not seen yet, scored by cosine similarity in genome-tag space, with bonuses for films that resemble multiple things you loved (not just an average).
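As an illustration of the centroid idea, here is a minimal sketch in plain Python. It assumes the weight of each film is simply its rating and that each film has a small tag vector; the toy vectors, the function names, and the weighting scheme are assumptions for the example, and the multi-match bonus described above is left out.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def taste_centroid(ratings, vectors):
    """Rating-weighted mean of the tag vectors of the films a user rated."""
    dim = len(next(iter(vectors.values())))
    total = [0.0] * dim
    weight_sum = 0.0
    for film, rating in ratings.items():
        for i, v in enumerate(vectors[film]):
            total[i] += rating * v
        weight_sum += rating
    return [t / weight_sum for t in total]

def recommend_solo(ratings, vectors, k=5):
    """Rank films the user has not rated by cosine similarity to the centroid."""
    centroid = taste_centroid(ratings, vectors)
    unseen = [f for f in vectors if f not in ratings]
    return sorted(unseen, key=lambda f: cosine(centroid, vectors[f]), reverse=True)[:k]

# Toy 3-tag vectors: [dark, animated, romantic]. Values are made up.
vectors = {
    "Se7en": [0.9, 0.0, 0.1],
    "Zodiac": [0.8, 0.0, 0.1],
    "Toy Story": [0.0, 0.9, 0.2],
    "Prisoners": [0.85, 0.0, 0.05],
    "Up": [0.1, 0.9, 0.4],
}
ratings = {"Se7en": 5.0, "Zodiac": 4.5, "Toy Story": 2.0}
top = recommend_solo(ratings, vectors, k=2)
print(top)
```

Because the two highly rated films are dark thrillers, the centroid sits close to the "dark" axis and the unseen thriller ranks first.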

Duo mode overlays two profiles. A compatibility score combines centroid similarity, shared highly-rated films, and genre overlap. Recommendations are drawn from the intersection zone, films that appeal to both of you but neither has seen, with watchlist flags if a pick is already sitting on someone's list.
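A compatibility score of this shape could be sketched as follows. The weights (0.5 / 0.3 / 0.2), the use of Jaccard overlap for the shared-film and genre signals, and the 0 to 100 scale are all assumptions chosen for the example rather than the values MovieMap uses.

```python
def jaccard(a, b):
    """Size of the intersection over size of the union (0.0 for two empty sets)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def compatibility(centroid_sim, loved_a, loved_b, genres_a, genres_b,
                  weights=(0.5, 0.3, 0.2)):
    """Weighted mix of three signals, each in [0, 1], returned as a 0-100 score."""
    w_centroid, w_shared, w_genre = weights
    score = (w_centroid * max(centroid_sim, 0.0)   # clamp negative cosine to 0
             + w_shared * jaccard(loved_a, loved_b)
             + w_genre * jaccard(genres_a, genres_b))
    return round(100 * score / sum(weights), 1)

score = compatibility(
    centroid_sim=0.8,
    loved_a={"Heat", "Alien", "Arrival"},
    loved_b={"Alien", "Arrival", "Amelie"},
    genres_a={"thriller", "sci-fi"},
    genres_b={"sci-fi", "romance"},
)
print(score)
```

Dividing by the sum of the weights keeps the score on the same scale if the weights are later retuned.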

Content vs Taste is a slider that shifts recommendations between two signals. Content uses the descriptive tags attached to films to determine which movies suit you best, whereas Taste uses correlations between user ratings. Films such as Schindler's List and Interstellar could have a high Taste score because they appeal to a similar audience, despite very different content.
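One simple way to combine the two signals is a linear blend controlled by a value between 0 and 1. The sketch below shows that arithmetic in Python; the linear form, the function names, and the sample scores are assumptions for illustration.

```python
def blend_scores(content, taste, slider):
    """Linear blend: slider=0.0 is pure Content, slider=1.0 is pure Taste."""
    if not 0.0 <= slider <= 1.0:
        raise ValueError("slider must be between 0 and 1")
    films = content.keys() & taste.keys()
    return {f: (1 - slider) * content[f] + slider * taste[f] for f in films}

def rank(scores):
    """Film names ordered from highest to lowest score."""
    return sorted(scores, key=scores.get, reverse=True)

# Made-up scores for two candidate films.
content = {"Film A": 0.90, "Film B": 0.20}
taste = {"Film A": 0.40, "Film B": 0.85}

for slider in (0.0, 0.5, 1.0):
    print(slider, rank(blend_scores(content, taste, slider)))
```

Since both score tables are already in memory, moving the slider only re-runs this blend and re-sorts, which is why no further API calls are needed.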

The engine dampens popularity bias, reaches further from your centroid, and surfaces hidden gems with strong quality signals.

Taste Profiler is for people without Letterboxd. Answer 12 binary questions — "Do you like being scared?", "Should it feel grounded in reality?" — and the engine builds a genome-space taste vector from your answers and recommends films by cosine similarity. No account required.
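To make the idea concrete, here is a minimal sketch of turning yes/no answers into a tag vector. The question-to-tag mapping, the tag names, and the weights are hypothetical; the actual profiler uses 12 questions and the full genome tag space.

```python
# Hypothetical mapping from each question to the tags it nudges.
QUESTIONS = {
    "Do you like being scared?": {"scary": 1.0, "tense": 0.5},
    "Should it feel grounded in reality?": {"realistic": 1.0, "fantasy": -1.0},
}
TAGS = ["scary", "tense", "realistic", "fantasy"]

def profile_vector(answers):
    """A yes adds the question's tag weights, a no subtracts them."""
    vec = dict.fromkeys(TAGS, 0.0)
    for question, yes in answers.items():
        sign = 1.0 if yes else -1.0
        for tag, weight in QUESTIONS[question].items():
            vec[tag] += sign * weight
    return [vec[t] for t in TAGS]

vector = profile_vector({
    "Do you like being scared?": True,
    "Should it feel grounded in reality?": False,
})
print(vector)
```

The resulting vector can then be compared against film vectors with cosine similarity, in the same way as a centroid built from ratings.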

Film detail panel shows genome tags with relevance scores, similar films with percentages, average rating, and direct links to IMDb, TMDB, and JustWatch for where to actually watch it.

How we built it

Data pipeline. MovieLens ml-25m gives us 25M ratings and a 1,128-dimension genome tag relevance vector per film. We blend genre/theme cosine similarity with item-item collaborative filtering (60/40), run UMAP to get 2D coordinates, and apply K-means with TF-IDF cluster labelling to name each region.

Backend. FastAPI serves the pre-computed graph and recommendation engine. Letterboxd scraping uses cloudscraper for Cloudflare bypass, 6 concurrent workers to resolve film slugs to TMDB IDs, and a persistent disk cache so repeated lookups are instant.
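The caching and concurrency pattern can be sketched with the standard library alone. In the example below, `DiskCache`, `resolve_slugs`, and the `fetch_tmdb_id` callable are hypothetical names, the cache is a single JSON file, and a fake lookup stands in for the real network request (the cloudscraper part is not shown).

```python
import json
import tempfile
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path

class DiskCache:
    """Tiny JSON-file cache mapping a film slug to its TMDB ID."""
    def __init__(self, path):
        self.path = Path(path)
        self.data = json.loads(self.path.read_text()) if self.path.exists() else {}

    def save(self):
        self.path.write_text(json.dumps(self.data))

def resolve_slugs(slugs, fetch_tmdb_id, cache, workers=6):
    """Resolve slugs to TMDB IDs, fetching only cache misses, in parallel."""
    missing = [s for s in dict.fromkeys(slugs) if s not in cache.data]
    if missing:
        with ThreadPoolExecutor(max_workers=workers) as pool:
            # pool.map yields results in the same order as its input
            for slug, tmdb_id in zip(missing, pool.map(fetch_tmdb_id, missing)):
                cache.data[slug] = tmdb_id
        cache.save()
    return {s: cache.data[s] for s in slugs}

calls = []

def fake_fetch(slug):
    """Stand-in for the real lookup; records each call and returns a dummy ID."""
    calls.append(slug)
    return len(slug)

with tempfile.TemporaryDirectory() as tmp:
    cache = DiskCache(Path(tmp) / "tmdb_ids.json")
    first = resolve_slugs(["heat", "alien"], fake_fetch, cache)
    # A fresh cache object reloads from disk, so nothing is fetched again.
    second = resolve_slugs(["heat", "alien"], fake_fetch, DiskCache(cache.path))

print(first, second, len(calls))
```

The second call returns the same mapping without invoking the lookup, which is the behaviour that makes repeated profile loads fast.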

Frontend. D3 force simulation on Canvas (SVG can't handle 2,500 glowing nodes). The content/taste slider blends scores client-side via useMemo with no API calls after the initial load.

Challenges we ran into

Isaac deleted all of his code, so we had to redesign the data pipeline and frontend.

It took many attempts to figure out the most meaningful predictive signals in our data set.

Accomplishments that we're proud of

The content/taste slider updates recommendations in real time with zero API calls after load, because all blending happens client-side. The duo compatibility score is a weighted combination of taste centroid similarity, shared highly-rated films, and genre overlap. These signals are computed in the original space of over 1,000 dimensions, not on the 2D projection.

What we learned

UMAP's global structure preservation makes the layout genuinely readable without any manual category assignment. The TF-IDF cluster labelling (distinctive tags per cluster relative to the global mean) was a much better approach than naive top-tag labelling, which just names everything "suspense". Also: cache your Letterboxd scrapes aggressively.
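The difference between the two labelling approaches can be shown with a small sketch. It scores each tag by a lift ratio (cluster mean relevance divided by global mean relevance), which is a simplification in the spirit of TF-IDF rather than a full TF-IDF implementation; the function names, tags, and relevance values are made up for the example.

```python
def mean_vector(rows):
    """Column-wise mean of a list of equal-length vectors."""
    return [sum(col) / len(rows) for col in zip(*rows)]

def label_clusters(vectors, assignments, tags, relative=True, top_n=1, eps=1e-9):
    """Label each cluster by its top tags.

    relative=True ranks tags by cluster mean divided by global mean;
    relative=False is the naive approach that ranks by raw cluster mean.
    """
    global_mean = mean_vector(vectors)
    clusters = {}
    for vec, cluster in zip(vectors, assignments):
        clusters.setdefault(cluster, []).append(vec)
    labels = {}
    for cluster, rows in clusters.items():
        score = mean_vector(rows)
        if relative:
            score = [c / (g + eps) for c, g in zip(score, global_mean)]
        ranked = sorted(range(len(tags)), key=lambda i: score[i], reverse=True)
        labels[cluster] = [tags[i] for i in ranked[:top_n]]
    return labels

tags = ["suspense", "animation", "courtroom"]
vectors = [
    [0.9, 0.0, 0.6], [0.8, 0.1, 0.7],   # cluster 0
    [0.9, 0.8, 0.0], [0.9, 0.7, 0.0],   # cluster 1
]
assignments = [0, 0, 1, 1]

naive = label_clusters(vectors, assignments, tags, relative=False)
distinctive = label_clusters(vectors, assignments, tags, relative=True)
print(naive)
print(distinctive)
```

With these toy values, "suspense" scores highly everywhere, so the naive labels are identical for both clusters, while the relative score picks the tag that sets each cluster apart.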

What's next for MovieMap

We envision this being integrated into Letterboxd, giving users insights and functionality with minimal effort.
