Inspiration

AI systems today struggle with alignment — not because of a lack of models, but because of a lack of high-quality human preference data. Most approaches rely on static datasets or expensive RLHF pipelines, which fail to capture the nuance and diversity of real human values.

We were inspired by a simple question:
What if we could build a scalable system to collect structured human preferences directly from users — while giving them value in return?

At the same time, we wanted to create something people would actually enjoy using — not just a data collection tool. That led us to combine self-discovery, social interaction, and AI alignment into a single platform.


What it does

Align is a full-stack system that collects high-quality human preference data through adaptive quizzes and converts it into machine-readable personality representations.

  • Users answer dynamically generated questions
  • Responses are structured across multiple formats (ranking, scale, text, etc.)
  • A neural network converts responses into a 64-dimensional personality vector
  • Users receive:
    • Personality insights
    • Interpretable traits
    • The ability to compare with friends

At the same time, this creates a scalable pipeline for AI alignment data.
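To illustrate the "dual output" idea, here is a minimal sketch of how a 64-dimensional embedding could be collapsed into named, human-readable trait scores. The trait names and dimension groupings below are purely hypothetical stand-ins; our model's dimensions are learned and are not literally partitioned this way.

```python
import random

EMBED_DIM = 64

# Hypothetical mapping from trait names to slices of the embedding.
# This only illustrates "machine-readable vector -> human-readable traits".
TRAIT_SLICES = {
    "openness": slice(0, 16),
    "empathy": slice(16, 32),
    "risk_tolerance": slice(32, 48),
    "social_energy": slice(48, 64),
}

def traits_from_embedding(vec):
    """Average each assumed dimension group into one interpretable score."""
    assert len(vec) == EMBED_DIM
    return {name: sum(vec[s]) / (s.stop - s.start)
            for name, s in TRAIT_SLICES.items()}

# Random stand-in embedding for demonstration
vec = [random.uniform(-1.0, 1.0) for _ in range(EMBED_DIM)]
traits = traits_from_embedding(vec)
```

The same vector serves both audiences: the raw 64 numbers go to downstream ML, while the aggregated scores go to the user.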


How we built it

We built Align as a full-stack, cloud-deployed system:

Frontend

  • Next.js (hosted on Vercel)
  • Dynamic quiz interface with real-time interaction
  • Supports multiple response types

Backend

  • FastAPI + MySQL
  • Deployed on Google Cloud PaaS
  • Handles:
    • Quiz generation and delivery
    • Response storage
    • Profile + trait management
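One common way to store ranking, scale, and text answers in a single table is a type tag plus a JSON-serialized value column. The sketch below (field names are our illustration, not the exact production schema) shows the shape of that design using only the standard library:

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Union
import json

class ResponseType(str, Enum):
    RANKING = "ranking"   # ordered list of option ids
    SCALE = "scale"       # integer on a Likert-style scale
    TEXT = "text"         # free-form answer

@dataclass
class QuizResponse:
    """One stored answer; the shape of `value` depends on `response_type`."""
    user_id: int
    question_id: int
    response_type: ResponseType
    value: Union[List[int], int, str]

    def to_row(self) -> dict:
        # Serialize the polymorphic value into one JSON column so a single
        # MySQL table can hold all three formats side by side.
        return {
            "user_id": self.user_id,
            "question_id": self.question_id,
            "response_type": self.response_type.value,
            "value_json": json.dumps(self.value),
        }

row = QuizResponse(1, 42, ResponseType.RANKING, [3, 1, 2]).to_row()
```

The type tag lets the ML pipeline dispatch on format when it converts rows into model features.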

Machine Learning

  • PyTorch MLP models
  • Trained using Google Cloud
  • Two key components:
    • Personality embedding model (64-dim vector)
    • Question generation/structuring logic
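Our actual embedding model is a PyTorch MLP; as a dependency-free sketch of its shape, here is a plain-Python forward pass that maps structured-response features to a 64-dimensional vector. The layer sizes are assumptions for illustration, and the random weights stand in for trained parameters:

```python
import math
import random

random.seed(0)

def make_layer(n_in, n_out):
    # Random weights as stand-ins for trained parameters
    return [[random.gauss(0.0, 0.1) for _ in range(n_in)] for _ in range(n_out)]

def forward(x, layers):
    """MLP forward pass: matrix-vector products with tanh on hidden layers."""
    for i, W in enumerate(layers):
        x = [sum(w * xi for w, xi in zip(row, x)) for row in W]
        if i < len(layers) - 1:
            x = [math.tanh(v) for v in x]
    return x

# Assumed sizes: 128 response features -> 256 hidden -> 64-dim embedding
layers = [make_layer(128, 256), make_layer(256, 64)]
features = [random.uniform(-1.0, 1.0) for _ in range(128)]
embedding = forward(features, layers)
```

In the real system this forward pass runs in PyTorch, which also gives us the gradients needed for training.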

Key Innovation

Unlike traditional systems, Align:

  • Actively collects high-quality preference data (not passive scraping)
  • Uses structured responses, not just raw text
  • Produces dual outputs:
    • Human-readable traits
    • Machine-readable embeddings
  • Creates a feedback loop:
    • Better data → better questions → better alignment

We’re not just collecting data — we’re optimizing how alignment data is generated.


User Engagement & Trust

A key challenge in alignment is participation — users won’t contribute data unless they trust the system.

We address this by:

  • Providing immediate value:
    • Personality insights
    • Self-understanding
    • Social comparison with friends
  • Ensuring transparency:
    • Users know what data is collected
    • Clear control over public vs private traits
  • Designing for user ownership:
    • Users are collaborators, not products

Challenges we ran into

  • Designing a schema to support multiple response types (ranking, text, scale)
  • Training an MLP to produce meaningful embeddings from sparse user input
  • Generating high-signal questions that avoid redundancy and bias
  • Integrating frontend, backend, and ML systems into a seamless pipeline under hackathon time constraints
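On the redundancy problem: one approach we found natural is greedy selection, where the next question is the candidate least similar (by cosine similarity over question embeddings) to anything already asked. This is a simplified sketch of that idea with toy 2-dimensional embeddings; real question embeddings would come from the model:

```python
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def next_question(candidates, asked):
    """Pick the candidate whose embedding is least similar to any asked question."""
    if not asked:
        return next(iter(candidates))
    return min(candidates,
               key=lambda q: max(cosine(candidates[q], v) for v in asked.values()))

# Toy question embeddings (hypothetical ids and values)
candidates = {"q_values_2": [0.9, 0.1], "q_social": [0.0, 1.0]}
asked = {"q_values": [1.0, 0.0]}
pick = next_question(candidates, asked)  # q_social is least redundant
```

Minimizing the worst-case similarity pushes each new question toward unexplored regions of preference space, which raises the signal per answer.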

What we learned

  • The hardest part of AI systems isn’t always the model — it’s the data pipeline
  • Structured data is far more valuable than unstructured input for downstream ML tasks
  • User trust and engagement are critical when building systems that rely on human data
  • Full-stack + ML integration requires careful system design, even for a prototype

What’s next for Align

  • Compatibility matching using vector similarity
  • Using collected data for LLM fine-tuning / alignment
  • Personalized recommendation systems
  • Expanding to domain-specific alignment (health, ethics, social systems)
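The compatibility-matching idea above can be sketched directly: cosine similarity between two users' personality vectors, rescaled to a friendly 0-100 score. The 4-dimensional toy vectors stand in for the real 64-dimensional embeddings, and the rescaling is our illustration rather than a shipped formula:

```python
import math

def compatibility(u, v):
    """Map cosine similarity of two personality vectors to a 0-100 score."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    cos = dot / norm if norm else 0.0
    return round(50 * (cos + 1))  # [-1, 1] -> [0, 100]

# Toy 4-dim stand-ins for the real 64-dim embeddings
alice = [0.8, -0.2, 0.5, 0.1]
bob = [0.7, -0.1, 0.4, 0.2]
score = compatibility(alice, bob)
```

Because the embeddings already encode preferences, this comparison needs no extra model at match time, just a vector lookup and a dot product.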

Final Thoughts

Align demonstrates that AI alignment starts with better data — and better data starts with people.

We’re building the infrastructure to make that possible.
