RateMyHackathons

A platform for rating, reviewing, and discovering hackathon events. Search by company, event, or user.

Tech Stack

| Layer | Technology | Rationale |
| --- | --- | --- |
| Backend | Rust + Actix Web | High-performance async REST API |
| Crawler | Python + Scrapling | Adaptive scraping with proxy rotation & stealth |
| Analytics | Rust + Actix Web + SvelteKit | Live dashboard with crawl stats, SSE feed |
| Database | PostgreSQL | Many-to-many relations, full-text search (tsvector), JSONB for crawler data |
| Query Layer | SQLx | Async-native, compile-time checked SQL |
| IDs | UUIDv7 | Time-ordered for efficient B-tree indexing |
| Frontend | SvelteKit + Svelte 5 + Tailwind v4 + cobe globe + GSAP | Editorial brutalist B&W design, Instrument Serif + Space Mono, WebGL globe, GSAP scroll animations |
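
To illustrate why UUIDv7 IDs are time-ordered, here is a minimal Python sketch of the UUIDv7 bit layout (48-bit millisecond timestamp first, then version, variant, and random bits). The helper name `uuid7_at` is hypothetical and is not taken from this repository; the backend presumably uses a Rust crate for this.

```python
import os
import uuid


def uuid7_at(unix_ms: int) -> uuid.UUID:
    """Build a UUIDv7-style value for a given Unix timestamp in milliseconds."""
    rand = int.from_bytes(os.urandom(10), "big")   # 80 random bits
    rand_a = rand >> 68                            # top 12 random bits
    rand_b = rand & ((1 << 62) - 1)                # low 62 random bits

    value = (unix_ms & ((1 << 48) - 1)) << 80      # timestamp fills the top 48 bits
    value |= 0x7 << 76                             # version nibble = 7
    value |= rand_a << 64
    value |= 0b10 << 62                            # RFC variant bits
    value |= rand_b
    return uuid.UUID(int=value)


earlier = uuid7_at(1_700_000_000_000)
later = uuid7_at(1_700_000_000_001)
# IDs created later compare greater, so new rows land at the right edge of a B-tree index.
print(earlier < later)
```

Because the timestamp sits in the most significant bits, sorting by ID approximates sorting by creation time; IDs created within the same millisecond are ordered only by their random bits.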

Data Schema

erDiagram
    EVENTS ||--o{ EVENT_COMPANIES : has
    COMPANIES ||--o{ EVENT_COMPANIES : sponsors
    USERS ||--o{ REVIEWS : writes
    EVENTS ||--o{ REVIEWS : receives
    EVENTS ||--o{ CRAWL_SOURCES : sourced_from
    REVIEWS ||--o{ REVIEW_VOTES : receives
    REVIEWS ||--o{ REVIEW_COMMENTS : has
    REVIEW_COMMENTS ||--o{ REVIEW_COMMENTS : replies_to

    EVENTS {
        uuid id PK
        text name
        text description
        text location
        text url
        date start_date
        date end_date
        text image_url
        float latitude
        float longitude
        tsvector search_vector
        timestamptz created_at
        timestamptz updated_at
    }

    COMPANIES {
        uuid id PK
        text name
        text logo_url
        text website
        text description
        tsvector search_vector
        timestamptz created_at
    }

    USERS {
        uuid id PK
        text username
        text email
        text display_name
        text bio
        int age
        text avatar_url
        text github
        text twitter
        text linkedin
        text website
        timestamptz created_at
    }

    EVENT_COMPANIES {
        uuid event_id FK
        uuid company_id FK
        text role
    }

    REVIEWS {
        uuid id PK
        uuid event_id FK
        uuid user_id FK
        int rating
        text title
        text body
        timestamptz created_at
    }

    REVIEW_VOTES {
        uuid id PK
        uuid review_id FK
        uuid user_id FK
        boolean helpful
        timestamptz created_at
    }

    REVIEW_COMMENTS {
        uuid id PK
        uuid review_id FK
        uuid user_id FK
        uuid parent_comment_id FK
        text body
        timestamptz created_at
    }

    CRAWL_SOURCES {
        uuid id PK
        uuid event_id FK
        text source_url
        text source_type
        text source_hash UK
        jsonb raw_data
        timestamptz crawled_at
    }

    SCRAPE_SOURCES {
        uuid id PK
        text name
        text source_type
        text base_url
        boolean enabled
        int poll_interval_hours
        timestamptz last_polled_at
    }

Documentation

| Component | README |
| --- | --- |
| Backend API | backend/README.md |
| Services | services/README.md |
| Crawler | services/crawler/README.md |
| Analytics | services/analytics/README.md |
| CV API Recon | services/crawler/cv/API_RECON.md |
| Luma API Recon | services/crawler/luma/API_RECON.md |
| MLH Recon | services/crawler/mlh/API_RECON.md |
| Hackiterate Recon | services/crawler/hackiterate/API_RECON.md |

Project Structure

ratemyhackathons/
├── frontend/              # SvelteKit frontend
│   ├── src/
│   │   ├── app.css       # Dark theme (Tailwind v4)
│   │   ├── lib/
│   │   │   ├── api.ts         # Typed API client
│   │   │   ├── types.ts       # TypeScript interfaces
│   │   │   ├── animations/gsap.ts  # ScrollTrigger actions
│   │   │   └── components/    # Globe, EventCard, ReviewCard, Nav, Footer
│   │   └── routes/
│   │       ├── +page.svelte   # Landing page (7 sections)
│   │       ├── about/         # About page (mission, stack, data sources)
│   │       ├── api/           # API documentation (interactive endpoint reference)
│   │       ├── events/        # Event list (sort/search/date-range filter/list-grid toggle) + detail
│   │       ├── companies/     # Company list (sort/search) + detail
│   │       ├── compare/       # Side-by-side comparison (inline search, entity chips, shareable URLs)
│   │       ├── users/[id]/    # User profiles
│   │       └── search/        # Tabbed search results
├── backend/               # Rust API server
│   ├── src/
│   │   ├── main.rs        # Server entry point + health check
│   │   ├── lib.rs         # Library crate (public modules for tests)
│   │   ├── config.rs      # Environment configuration
│   │   ├── db.rs          # Database pool setup
│   │   ├── errors.rs      # API error types
│   │   ├── models/        # Data models & DTOs
│   │   └── routes/        # API route handlers
│   ├── tests/             # Integration tests
│   └── migrations/        # PostgreSQL migrations
├── services/              # Standalone services
│   ├── crawler/           # Python event scraper
│   │   ├── main.py        # CLI: --once, --daemon, --dry-run
│   │   ├── dry_run.py     # Standalone spider test (no DB, all sources + dedup)
│   │   ├── db.py          # asyncpg database layer
│   │   ├── dedup.py       # Hash + fuzzy + cross-source URL deduplication
│   │   ├── proxy.py       # Proxy rotation setup
│   │   ├── company.py     # Best-effort company matching
│   │   ├── sponsors.py    # 4-strategy sponsor extraction
│   │   ├── llm.py         # OpenRouter LLM (free + paid fallback)
│   │   ├── spiders/       # Source-specific scrapers
│   │   │   ├── mlh.py             # MLH (HTML scraping)
│   │   │   ├── hackiterate.py     # Hackiterate (Playwright)
│   │   │   ├── cerebralvalley.py  # CV public API + host enrichment
│   │   │   └── luma.py            # Luma API (15-city geo sweep + keyword filter)
│   │   ├── cv/            # CV API recon
│   │   │   └── API_RECON.md
│   │   ├── luma/          # Luma API recon
│   │   │   └── API_RECON.md
│   │   ├── mlh/           # MLH scraping recon
│   │   │   └── API_RECON.md
│   │   └── hackiterate/   # Hackiterate scraping recon
│   │       └── API_RECON.md
│   └── analytics/         # Rust analytics API + SvelteKit admin dashboard
│       ├── src/            # Actix-web server (:8081)
│       │   ├── main.rs
│       │   ├── db.rs       # Analytics queries
│       │   └── routes/     # crawl, events, reviews, live SSE
│       └── dashboard/     # SvelteKit + Tailwind v4 admin dashboard
│           └── src/routes/ # overview, events, crawl, reviews pages
├── CHANGELOG.md
└── .env.example
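
The tree above describes dedup.py as "Hash + fuzzy + cross-source URL deduplication". The following is only a sketch of what those three checks could look like using the Python standard library. Every function name, the normalization rules, and the 0.9 similarity threshold are assumptions for illustration; the actual logic in dedup.py may differ.

```python
import hashlib
from difflib import SequenceMatcher
from urllib.parse import urlsplit


def normalize_url(url: str) -> str:
    """Reduce a URL to host + path so http/https and www variants compare equal."""
    parts = urlsplit(url.strip().lower())
    return parts.netloc.removeprefix("www.") + parts.path.rstrip("/")


def source_hash(name: str, start_date: str) -> str:
    """Hash of the whitespace-normalized, lowercased name plus the start date."""
    key = f"{' '.join(name.lower().split())}|{start_date}"
    return hashlib.sha256(key.encode("utf-8")).hexdigest()


def is_duplicate(a: dict, b: dict, threshold: float = 0.9) -> bool:
    # 1. Exact match on the content hash.
    if source_hash(a["name"], a["start_date"]) == source_hash(b["name"], b["start_date"]):
        return True
    # 2. Same event page reached through two different sources.
    if a.get("url") and b.get("url") and normalize_url(a["url"]) == normalize_url(b["url"]):
        return True
    # 3. Fuzzy name match, only considered when the start dates agree.
    ratio = SequenceMatcher(None, a["name"].lower(), b["name"].lower()).ratio()
    return a["start_date"] == b["start_date"] and ratio >= threshold
```

Running the cheap exact checks first keeps the slower fuzzy comparison for the cases where neither the hash nor the URL settles the question.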

Getting Started

Prerequisites

Setup

# 1. Clone and enter project
git clone <repo-url>
cd ratemyhackathons

# 2. Create the database
createdb ratemyhackathons

# 3. Configure environment
cp backend/.env.example backend/.env
# Edit backend/.env with your database credentials

# 4. Run the migrations
psql -d ratemyhackathons -f backend/migrations/20260313_initial_schema.sql
psql -d ratemyhackathons -f backend/migrations/20260313_review_votes_comments.sql
psql -d ratemyhackathons -f backend/migrations/20260313_user_profiles_event_slugs.sql
psql -d ratemyhackathons -f backend/migrations/20260313_crawl_registry.sql
psql -d ratemyhackathons -f backend/migrations/20260314_event_geocoding.sql
psql -d ratemyhackathons -f backend/migrations/20260314_rmp_ratings.sql

# 5. Start the server
cd backend
cargo run

# 6. (Optional) Run the crawler
cd ../services/crawler
pip install -r requirements.txt  # or: pip install scrapling asyncpg python-dotenv
cp .env.example .env   # add your DATABASE_URL and PROXY_URL
python main.py --dry-run   # preview without inserting
python main.py --once      # single crawl pass
python main.py --daemon    # continuous polling

# 7. (Optional) Run the analytics dashboard
cd ../analytics
cargo run                  # API on :8081
cd dashboard && bun install && bun dev   # dashboard on :5174

The API will be available at http://127.0.0.1:8080. The frontend will be at http://localhost:5173. The analytics dashboard will be at http://localhost:5174.

Frontend

cd frontend
bun install
bun dev                    # Dev server on :5173 (proxies /api → :8080)
bun run build              # Production build
bun run check              # TypeScript/Svelte type checking

API Endpoints & Schemas

Overview

| Method | Path | Description |
| --- | --- | --- |
| GET | /health | Health check (DB connectivity + version) |
| GET | /api/events | List events (paginated, filterable) |
| GET | /api/events/{id} | Event detail with companies & reviews |
| POST | /api/events | Create event |
| GET | /api/events/globe | Globe markers (events with lat/lng) |
| GET | /api/events/locations | Unique location strings (for autocomplete) |
| GET | /api/companies | List companies (paginated) |
| GET | /api/companies/{id} | Company detail with events |
| POST | /api/companies | Create company |
| GET | /api/users | List users (paginated) |
| GET | /api/users/{id} | User detail with reviews |
| POST | /api/users | Create user |
| POST | /api/users/{id}/reviews | Create review (10 category scores, auth required) |
| GET | /api/reviews/{id} | Review detail with votes & threaded comments |
| POST | /api/reviews/{id}/vote | Vote helpful/unhelpful (upsert) |
| POST | /api/reviews/{id}/comments | Add comment or reply |
| GET | /api/reviews/{id}/comments | Get threaded comment tree |
| GET | /api/search?q=&type= | Full-text search |
| GET | /api/tags | List all tags |
| GET | /api/tags/top?entity_type=&entity_id= | Top 5 tags for entity |
| POST | /api/tags | Create tag (returns existing if name matches) |
| GET | /api/compare?type=&ids= | Side-by-side entity comparison |

Events

GET /api/events

Query params: ?page=1&per_page=20&company_id=uuid

// Response 200:
{
  "data": [{
    "id": "uuid", "name": "HackMIT 2025", "description": "...",
    "location": "Cambridge, MA", "url": "https://...",
    "start_date": "2025-10-01", "end_date": "2025-10-02",
    "image_url": "https://...",
    "companies": [{ "id": "uuid", "name": "Google", "role": "sponsor" }],
    "avg_rating": 4.2, "review_count": 15,
    "created_at": "2025-01-01T00:00:00Z"
  }],
  "total": 100, "page": 1, "per_page": 20
}

GET /api/events/{id}

// Response 200:
{
  "id": "uuid", "name": "HackMIT 2025", "description": "...",
  "location": "Cambridge, MA", "url": "https://...",
  "start_date": "2025-10-01", "end_date": "2025-10-02",
  "image_url": "https://...",
  "companies": [{ "id": "uuid", "name": "Google", "role": "sponsor" }],
  "reviews": [{
    "id": "uuid", "user_id": "uuid", "username": "alice",
    "rating": 5, "title": "Amazing!", "body": "...", "created_at": "..."
  }],
  "avg_rating": 4.2, "review_count": 15
}

POST /api/events

// Request:
{
  "name": "HackMIT 2025",          // required
  "description": "Annual hackathon", // optional
  "location": "Cambridge, MA",       // optional
  "url": "https://hackmit.org",      // optional
  "start_date": "2025-10-01",        // optional
  "end_date": "2025-10-02",          // optional
  "image_url": "https://...",        // optional
  "company_ids": ["uuid", "uuid"]    // optional, attach companies
}
// Response 201: full event object

Companies

GET /api/companies

Query params: ?page=1&per_page=20&search=google

// Response 200:
{
  "data": [{
    "id": "uuid", "name": "Google", "logo_url": "https://...",
    "website": "https://google.com", "description": "...",
    "event_count": 12, "avg_rating": 4.1, "review_count": 23,
    "latest_event_date": "2025-10-01",
    "category_ratings": [
      { "category": "organization", "avg": 4.5 },
      { "category": "vibes", "avg": 4.2 }
    ],
    "created_at": "..."
  }],
  "total": 50, "page": 1, "per_page": 20
}

GET /api/companies/{id}

// Response 200:
{
  "id": "uuid", "name": "Google", "logo_url": "...",
  "website": "https://google.com", "description": "...",
  "events": [{
    "id": "uuid", "name": "HackMIT 2025", "role": "sponsor",
    "start_date": "2025-10-01", "avg_rating": 4.2
  }]
}

POST /api/companies

// Request:
{ "name": "Google", "logo_url": "...", "website": "...", "description": "..." }
// Response 201: full company object

Users

GET /api/users

Query params: ?page=1&per_page=20

// Response 200:
{
  "data": [{
    "id": "uuid", "username": "alice", "display_name": "Alice Chen",
    "avatar_url": "...", "review_count": 5, "created_at": "..."
  }],
  "total": 200, "page": 1, "per_page": 20
}

GET /api/users/{id} — Full profile

// Response 200:
{
  "id": "uuid", "username": "alice", "email": "alice@...",
  "display_name": "Alice Chen", "bio": "Full-stack developer",
  "age": 22, "avatar_url": "...",
  "socials": {
    "github": "alicechen",
    "twitter": "alice_dev",
    "linkedin": "alice-chen",
    "website": "https://alice.dev"
  },
  "reviews": [{
    "id": "uuid", "event_id": "uuid", "event_name": "HackMIT 2025",
    "rating": 5, "title": "Amazing!", "body": "...", "created_at": "..."
  }]
}

POST /api/users

// Request:
{
  "username": "alice",              // required
  "email": "alice@example.com",      // required
  "display_name": "Alice Chen",      // optional
  "bio": "Full-stack developer",     // optional
  "age": 22,                         // optional, 13-150
  "avatar_url": "https://...",       // optional
  "github": "alicechen",             // optional
  "twitter": "alice_dev",            // optional
  "linkedin": "alice-chen",          // optional
  "website": "https://alice.dev"     // optional
}
// Response 201: full user object

Reviews

POST /api/users/{user_id}/reviews — Create review

Requires authentication (Clerk JWT) in production. In dev mode, the user_id in the request body is used instead.

// Request:
{
  "event_id": "uuid",              // XOR with company_id
  "company_id": "uuid",           // XOR with event_id
  "title": "Amazing!",            // optional, max 200 chars
  "body": "Detailed review...",   // required, 350-5000 chars
  "would_return": true,           // optional
  "category_ratings": {           // required, all 10 categories
    "organization": 5,
    "prizes": 4,
    "mentorship": 5,
    "judging": 4,
    "venue": 3,
    "food": 4,
    "swag": 3,
    "networking": 5,
    "communication": 4,
    "vibes": 5
  },
  "tag_ids": ["uuid1", "uuid2"]   // optional
}
// Response 201: full review object with computed overall rating
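
The constraints in the request comments above (event_id XOR company_id, title and body length limits, all 10 categories) can be checked client-side before sending. The sketch below is one way to do that in Python. The 1 to 5 score range and the use of a plain mean for the overall rating are assumptions; the README only says the overall rating is "computed".

```python
CATEGORIES = (
    "organization", "prizes", "mentorship", "judging", "venue",
    "food", "swag", "networking", "communication", "vibes",
)


def validate_review(payload: dict) -> list[str]:
    """Return a list of problems; an empty list means the payload looks valid."""
    errors = []
    has_event = payload.get("event_id") is not None
    has_company = payload.get("company_id") is not None
    if has_event == has_company:  # both set or both missing
        errors.append("exactly one of event_id or company_id is required")

    title = payload.get("title")
    if title is not None and len(title) > 200:
        errors.append("title must be at most 200 characters")

    body = payload.get("body") or ""
    if not 350 <= len(body) <= 5000:
        errors.append("body must be 350-5000 characters")

    ratings = payload.get("category_ratings") or {}
    missing = [c for c in CATEGORIES if c not in ratings]
    if missing:
        errors.append("missing categories: " + ", ".join(missing))
    # Assumption: each score is an integer from 1 to 5.
    if any(not isinstance(v, int) or not 1 <= v <= 5 for v in ratings.values()):
        errors.append("category scores must be integers from 1 to 5")
    return errors


def overall_rating(ratings: dict) -> float:
    # Assumption: the overall rating is the plain mean of the 10 category scores.
    return sum(ratings[c] for c in CATEGORIES) / len(CATEGORIES)
```

Under the mean assumption, the ten example scores in the request above sum to 42, which would give an overall rating of 4.2.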

GET /api/reviews/{id} — Review detail with votes & comments

// Response 200:
{
  "id": "uuid", "event_id": "uuid", "user_id": "uuid",
  "rating": 5, "title": "Amazing!", "body": "...",
  "created_at": "...",
  "votes": { "helpful": 12, "unhelpful": 3 },
  "comments": [
    {
      "id": "uuid", "user_id": "uuid", "username": "bob",
      "body": "Great review!", "created_at": "...",
      "replies": [
        {
          "id": "uuid", "user_id": "uuid", "username": "alice",
          "body": "Thanks!", "created_at": "...",
          "replies": []
        }
      ]
    }
  ]
}

POST /api/reviews/{id}/vote — Vote helpful/unhelpful

One vote per user per review. Re-voting updates the existing vote (upsert).

// Request:
{ "user_id": "uuid", "helpful": true }
// Response 200: { "id": "uuid", "review_id": "uuid", "user_id": "uuid", "helpful": true, "created_at": "..." }
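
As an illustration of the upsert behaviour described above, the sketch below uses Python's built-in sqlite3 module, whose `ON CONFLICT ... DO UPDATE` clause mirrors the PostgreSQL syntax. The simplified table and the unique constraint on (review_id, user_id) are assumptions; the real migration may define the constraint differently.

```python
import sqlite3


def make_db() -> sqlite3.Connection:
    conn = sqlite3.connect(":memory:")
    # Assumption: one row per (review, user) pair is enforced by a unique constraint.
    conn.execute(
        """CREATE TABLE review_votes (
               review_id TEXT NOT NULL,
               user_id   TEXT NOT NULL,
               helpful   INTEGER NOT NULL,
               UNIQUE (review_id, user_id)
           )"""
    )
    return conn


def cast_vote(conn: sqlite3.Connection, review_id: str, user_id: str, helpful: bool) -> None:
    # Insert a new vote, or overwrite the existing vote from the same user.
    conn.execute(
        """INSERT INTO review_votes (review_id, user_id, helpful)
           VALUES (?, ?, ?)
           ON CONFLICT (review_id, user_id)
           DO UPDATE SET helpful = excluded.helpful""",
        (review_id, user_id, int(helpful)),
    )


def tally(conn: sqlite3.Connection, review_id: str) -> dict:
    row = conn.execute(
        """SELECT COALESCE(SUM(helpful), 0), COALESCE(SUM(1 - helpful), 0)
           FROM review_votes WHERE review_id = ?""",
        (review_id,),
    ).fetchone()
    return {"helpful": row[0], "unhelpful": row[1]}
```

Re-voting never adds a second row for the same user, so the helpful/unhelpful counts always reflect each user's latest choice.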

POST /api/reviews/{id}/comments — Add comment or reply

Pass parent_comment_id to reply to an existing comment (Reddit-style nesting). Omit for a top-level comment.

// Request (top-level comment):
{ "user_id": "uuid", "body": "Great review!" }

// Request (reply to another comment):
{ "user_id": "uuid", "parent_comment_id": "uuid", "body": "I agree!" }

// Response 201: full comment object

GET /api/reviews/{id}/comments — Get threaded comment tree

Returns a nested JSON tree: each comment has a replies array containing its children, recursively.

// Response 200:
[
  {
    "id": "uuid", "user_id": "uuid", "username": "bob",
    "body": "Great review!", "created_at": "...",
    "replies": [
      {
        "id": "uuid", "user_id": "uuid", "username": "alice",
        "body": "Thanks!", "created_at": "...",
        "replies": []
      }
    ]
  }
]
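
A nested tree like the one above can be assembled from flat rows that carry a parent_comment_id, as in the REVIEW_COMMENTS table. The Python sketch below shows one common single-pass approach; the function name is hypothetical, and the backend may build the tree differently (for example in SQL or in Rust).

```python
def build_comment_tree(rows: list[dict]) -> list[dict]:
    """Turn flat comment rows into a nested list using parent_comment_id."""
    # Index every comment by id, dropping the parent pointer and adding an empty replies list.
    nodes = {
        row["id"]: {**{k: v for k, v in row.items() if k != "parent_comment_id"}, "replies": []}
        for row in rows
    }
    roots = []
    for row in rows:
        node = nodes[row["id"]]
        parent = nodes.get(row.get("parent_comment_id"))
        if parent is None:
            # Top-level comment (or a reply whose parent is not in the result set).
            roots.append(node)
        else:
            parent["replies"].append(node)
    return roots


rows = [
    {"id": "c1", "parent_comment_id": None, "username": "bob", "body": "Great review!"},
    {"id": "c2", "parent_comment_id": "c1", "username": "alice", "body": "Thanks!"},
]
tree = build_comment_tree(rows)
```

Because each node is looked up by id, the rows can arrive in any order and the whole tree is built in linear time.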

Search

GET /api/search

Query params: ?q=hackathon&type=event|company|user&per_page=20

// Response 200:
{
  "events": [{ "id": "uuid", "name": "HackMIT", "rank": 0.95 }],
  "companies": [{ "id": "uuid", "name": "Google", "rank": 0.87 }],
  "users": [{ "id": "uuid", "name": "hackfan", "rank": 0.72 }],
  "total": 25
}
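
For scripting against the search endpoint, a small Python helper can build the query string from the parameters listed above. The base URL is the local address given in Getting Started; the helper name and its validation of the type values are assumptions for illustration.

```python
from typing import Optional
from urllib.parse import urlencode

BASE_URL = "http://127.0.0.1:8080"  # local API address from Getting Started
SEARCH_TYPES = {"event", "company", "user"}


def search_url(q: str, type: Optional[str] = None, per_page: int = 20) -> str:
    """Build a GET /api/search URL; `type` is omitted when not given."""
    if type is not None and type not in SEARCH_TYPES:
        raise ValueError(f"type must be one of {sorted(SEARCH_TYPES)}")
    params = {"q": q}
    if type is not None:
        params["type"] = type
    params["per_page"] = per_page
    # urlencode percent-escapes the values and encodes spaces as '+'.
    return f"{BASE_URL}/api/search?{urlencode(params)}"
```

The resulting URL can be passed to any HTTP client, such as `urllib.request.urlopen`, once the backend is running.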

Architecture: How Queries Work

All list endpoints use correlated subqueries instead of the N+1 pattern. Here's what that means:

Example (from list_companies)

SELECT c.id, c.name, c.logo_url, c.website, c.description,
       (SELECT COUNT(*) FROM event_companies WHERE company_id = c.id) as event_count,
       (SELECT AVG(rating)::float8 FROM reviews WHERE company_id = c.id) as avg_rating,
       (SELECT COUNT(*) FROM reviews WHERE company_id = c.id) as review_count,
       c.created_at
FROM companies c
ORDER BY c.name ASC
LIMIT $1 OFFSET $2

Line by line:

  • FROM companies c: read the companies table (aliased as c)
  • SELECT c.id, c.name, ...: pick which columns to return
  • (SELECT COUNT(*) ...): a correlated subquery. For each company row, it counts the matching rows in event_companies. It runs inside the main query, not as a separate call
  • (SELECT AVG(rating) ...): another correlated subquery that computes the average rating from reviews for this company
  • ORDER BY c.name ASC: sort A→Z
  • LIMIT $1 OFFSET $2: pagination ($1 = page size, $2 = how many rows to skip)
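
The `page` and `per_page` query parameters map onto the two bind values of `LIMIT $1 OFFSET $2`. A minimal Python sketch of that arithmetic follows; the clamping rules and the `max_per_page` cap of 100 are assumptions, not values taken from the backend.

```python
def page_to_limit_offset(page: int = 1, per_page: int = 20, max_per_page: int = 100) -> tuple[int, int]:
    """Convert 1-based page parameters into (LIMIT, OFFSET) bind values."""
    page = max(page, 1)                                # treat page 0 or negatives as page 1
    per_page = min(max(per_page, 1), max_per_page)     # keep the page size within bounds
    return per_page, (page - 1) * per_page


limit, offset = page_to_limit_offset(page=3, per_page=20)
# Page 3 with 20 rows per page skips the first 40 rows.
```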

Why Not Loop? (N+1 Problem)

| Companies | Subquery (our approach) | N+1 loop |
| --- | --- | --- |
| 3 | 1 query | 4 queries |
| 20 | 1 query | 21 queries |
| 100 | 1 query | 101 queries |

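Each query costs a network round trip (Rust → PostgreSQL → Rust). The N+1 loop pays that cost once for the list and once more per company, while the subquery approach pays it once per page regardless of how many companies are returned.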
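
The difference can be reproduced with a small, self-contained Python sketch that uses an in-memory SQLite database in place of PostgreSQL. The table contents are made up for illustration, and the query counter is maintained by hand; both functions return the same event counts, but one issues 4 statements for 3 companies and the other issues 1.

```python
import sqlite3


def seed() -> sqlite3.Connection:
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE companies (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE event_companies (event_id INTEGER, company_id INTEGER);
        INSERT INTO companies VALUES (1, 'Acme'), (2, 'Globex'), (3, 'Initech');
        INSERT INTO event_companies VALUES (10, 1), (11, 1), (12, 2);
        """
    )
    return conn


def counts_n_plus_one(conn: sqlite3.Connection) -> tuple[dict, int]:
    """One query for the list, then one extra query per company."""
    queries = 1
    result = {}
    for company_id, name in conn.execute("SELECT id, name FROM companies ORDER BY name").fetchall():
        count = conn.execute(
            "SELECT COUNT(*) FROM event_companies WHERE company_id = ?", (company_id,)
        ).fetchone()[0]
        queries += 1
        result[name] = count
    return result, queries


def counts_subquery(conn: sqlite3.Connection) -> tuple[dict, int]:
    """A single query with a correlated subquery per row."""
    rows = conn.execute(
        """SELECT c.name,
                  (SELECT COUNT(*) FROM event_companies WHERE company_id = c.id)
           FROM companies c
           ORDER BY c.name"""
    ).fetchall()
    return dict(rows), 1
```

With an in-process database the round-trip cost is negligible, so this sketch only demonstrates the query counts; the latency benefit appears when each statement has to cross the network to a database server.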

Testing

cd backend
cargo test

45 tests across 5 files covering error handling, pagination, model serialization, route existence, and UUIDv7 ordering.

Data Population

Data is populated via:

  1. Web crawler — 4 spiders (MLH, Hackiterate, Cerebral Valley, Luma) with --dry-run preview mode
  2. Manual entry through the API

License

See LICENSE for details.

About

Whine, Complain, Compliment. Most hackathons are rigged anyways lol, hope your demo doesn't bug. May the most memorable team win!
