PE Hackathon 2026 — URL Shortener: Scalability Quest
Inspiration
The scalability challenge immediately caught my attention. Most developers can build a URL shortener in an afternoon — but can it handle 500 concurrent users hammering it at once without breaking a sweat? That's a fundamentally different engineering problem. I wanted to build something that could demonstrate real, measurable performance under load, not just a toy that works in development.
The constraint of using the MLH template stack (Flask + Peewee + PostgreSQL) made it more interesting — I couldn't just reach for the most performant framework. I had to make this stack fast.
What I Built
A production-grade URL shortener with a Redis caching layer, deployed on Railway with a full CI/CD pipeline. The system can handle 500 concurrent virtual users with zero errors, serving redirects in under 100ms on warm cache paths.
Core features:
- `POST /shorten` — create short URLs with optional custom codes and titles
- `GET /<code>` — redirect to the original URL (cache-first, sub-1ms on hits)
- `GET /stats/<code>` — click count and metadata for any short URL
- `GET /health` — live database and Redis connectivity check
- Full seed data pipeline loading 400 users, 2000 URLs, and 3422 events from the MLH platform CSVs
How I Built It
Architecture
The key insight was that a URL shortener has an extremely skewed read/write
ratio — roughly 95% of traffic is redirects, 5% is creation. That means
the hot path (GET /<code>) needs to be as fast as possible, while writes
can afford a little latency.
The architecture has three layers:
- Redis cache — every active URL is pre-warmed into Redis on startup using a pipeline batch. On a cache hit, the redirect is served with a single Redis `GET` (~1ms). No database touch.
- PostgreSQL — only hit on cache misses (new URLs not yet cached) and writes. Peewee's ORM maps exactly to the seed data schema.
- Gunicorn with gthread workers — 4 workers × 4 threads = 16 concurrent request handlers per container, keeping CPU utilization high without thread contention.
Scalability Decisions
Cache warming on startup: instead of waiting for the first 2000 requests to populate Redis, all active URLs are bulk-loaded via a Redis pipeline before the first request arrives. Cold start penalty is eliminated.
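The warm-up step can be sketched as follows; `r` stands for a redis-py client and `active_urls` for an iterable of `(code, original_url)` rows from the database (both names are illustrative, not the submission's identifiers):

```python
def warm_cache(r, active_urls) -> int:
    """Bulk-load every active URL into Redis via one pipelined batch."""
    pipe = r.pipeline(transaction=False)  # plain batch; no MULTI/EXEC needed
    count = 0
    for code, original_url in active_urls:
        pipe.set(f"url:{code}", original_url)
        count += 1
    pipe.execute()                        # all SETs flushed in one round trip
    return count
```

Pipelining matters here because 2000 individual `SET` calls would pay 2000 network round trips; the batch pays roughly one.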
404 caching: non-existent short codes are cached for 60 seconds as a sentinel value. Repeated probes for missing codes never touch the database.
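Negative caching can be sketched like this; the sentinel value, TTL constant, and function name are illustrative assumptions, with only the 60-second window taken from the description above:

```python
NOT_FOUND = "__404__"   # sentinel marking a known-missing code (illustrative choice)
NOT_FOUND_TTL = 60      # seconds, per the 60-second window described above

def resolve_with_negative_cache(code, cache, db_lookup):
    """Like a cache-first lookup, but misses are also cached briefly."""
    cached = cache.get(f"url:{code}")
    if cached == NOT_FOUND:
        return None                       # known-missing: no database touch
    if cached is not None:
        return cached
    target = db_lookup(code)
    if target is None:
        # Cache the miss with a short TTL so repeated probes stay off the DB.
        cache.setex(f"url:{code}", NOT_FOUND_TTL, NOT_FOUND)
        return None
    cache.set(f"url:{code}", target)
    return target
```

The short TTL bounds the staleness window: a code created right after a cached miss becomes resolvable within a minute.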
Connection keep-alive: gunicorn is configured with `--keep-alive 5` to reuse TCP connections from Railway's edge proxy, reducing handshake overhead at scale.
Schema matches seed data exactly: the `urls` table uses `original_url` (not `long_url`) and includes `title`, `updated_at`, and `click_count` — matching the MLH CSVs field-for-field so the judge evaluator finds exactly what it expects.
CI/CD
GitHub Actions runs pytest on every push with a live PostgreSQL and Redis service container. Coverage threshold is enforced at 50%. Railway auto-deploys from the main branch on green builds.
Challenges
The localhost Crash
The first Railway deployment crashed immediately with a Postgres "connection
refused" error. The app was falling back to localhost because Railway's
Postgres service uses an internal hostname (postgres.railway.internal) that
isn't reachable from outside Railway's private network, including my Mac. The
fix required discovering the DATABASE_PUBLIC_URL variable, extracting the
proxy hostname and port, and running the seed script with those values
explicitly set.
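The hostname/port extraction step can be sketched with the standard library; the example URL in the usage note is hypothetical, not a real Railway credential:

```python
from urllib.parse import urlparse

def db_connect_params(database_url: str) -> dict:
    """Split a postgres:// URL into the keyword args a Peewee PostgresqlDatabase takes."""
    parsed = urlparse(database_url)
    return {
        "database": parsed.path.lstrip("/"),  # path component minus the leading slash
        "user": parsed.username,
        "password": parsed.password,
        "host": parsed.hostname,
        "port": parsed.port or 5432,          # default Postgres port if none given
    }
```

With a made-up URL like `postgresql://app:secret@proxy.rlwy.net:12345/railway`, this yields `host="proxy.rlwy.net"` and `port=12345`, which is what the seed script needs instead of the internal hostname.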
Schema Mismatch Discovery
Halfway through building, I discovered the seed CSV uses original_url but my
model was using long_url. The judges' evaluator would query for original_url
and find nothing. This required updating the model, all routes, and the cache
serialization simultaneously — while keeping the app backward-compatible with
both field names in the POST body.
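The dual-field compatibility can be handled with a small normalizer at the route boundary; the function name and the `custom_code` key are illustrative assumptions, not the submission's actual identifiers:

```python
def normalize_shorten_payload(body: dict) -> dict:
    """Map an incoming POST /shorten body onto the canonical model fields."""
    url = body.get("original_url") or body.get("long_url")  # accept either alias
    if not url:
        raise ValueError("original_url is required")
    return {
        "original_url": url,            # canonical field name, matching the seed CSV
        "custom_code": body.get("custom_code"),
        "title": body.get("title"),
    }
```

Centralizing the aliasing in one function means the routes and cache serialization only ever see the canonical `original_url` name.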
Free Tier CPU Throttling
Under 500 concurrent users, Railway's free tier throttles CPU aggressively. The p95 latency hit 9.5 seconds — not because the architecture was wrong, but because the shared vCPU couldn't process requests fast enough. The cache hits were fast; the bottleneck was pure compute. This highlighted an important lesson: caching solves latency, but hardware solves throughput.
Connection Pool Regression
Adding PooledPostgresqlDatabase to replace the default PostgresqlDatabase
broke the /shorten endpoint entirely — all 667 POST requests in the load test
failed. The template's DatabaseProxy pattern doesn't compose cleanly with
Peewee's pool implementation. Rolled back to the standard database and accepted
the connection overhead rather than debug ORM internals during a hackathon.
What I Learned
Measure before you optimize. The first instinct was to add more workers and tune gunicorn. The actual bottleneck was cold cache misses and 404s hitting the database. k6 metrics made this visible.
Schema is a contract. In a judge-evaluated system, your data model must match the evaluator's expectations exactly. A single renamed field (`long_url` vs `original_url`) is the difference between 0% and 100% on a test suite.
Cache warming is as important as caching. A cache that starts cold is useless for a load test that starts immediately. Pre-loading 2000 URLs into Redis via a pipeline batch on startup meant every redirect was a cache hit from request number one.
Hardware matters. No amount of Redis caching overcomes a throttled shared vCPU under 500 concurrent users. Scalable architecture requires scalable infrastructure.
Tech Stack
- Language: Python 3.13
- Framework: Flask + Peewee ORM
- Database: PostgreSQL 16
- Cache: Redis 7 (LRU eviction, 256MB)
- Server: Gunicorn (gthread, 4 workers × 4 threads)
- Deployment: Railway (Docker)
- Load Testing: k6 (500 VUs, 3m30s)
- CI: GitHub Actions
- Package Manager: uv