Inspiration

AI systems today struggle with alignment — not because of a lack of models, but because of a lack of high-quality human preference data. Most approaches rely on static datasets or expensive RLHF pipelines, which fail to capture the nuance and diversity of real human values.

We were inspired by a simple question:
What if we could build a scalable system to collect structured human preferences directly from users — while giving them value in return?

At the same time, we wanted to create something people would actually enjoy using — not just a data collection tool. That led us to combine self-discovery, social interaction, and AI alignment into a single platform.


What it does

Align is a full-stack system that collects high-quality human preference data through adaptive quizzes and converts it into machine-readable personality representations.

  • Users answer dynamically generated questions
  • Responses are structured across multiple formats (ranking, scale, text, etc.)
  • A neural network converts responses into a 64-dimensional personality vector
  • Users receive:
    • Personality insights
    • Interpretable traits
    • The ability to compare with friends

At the same time, this creates a scalable pipeline for AI alignment data.
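To illustrate the "dual output" idea, here is a minimal sketch of how a 64-dimensional embedding could be collapsed into named, human-readable trait scores. The trait names and dimension groupings below are purely hypothetical stand-ins; our model's dimensions are learned and are not literally partitioned this way.

```python
import random

EMBED_DIM = 64

# Hypothetical mapping from trait names to slices of the embedding.
# This only illustrates "machine-readable vector -> human-readable traits".
TRAIT_SLICES = {
    "openness": slice(0, 16),
    "empathy": slice(16, 32),
    "risk_tolerance": slice(32, 48),
    "social_energy": slice(48, 64),
}

def traits_from_embedding(vec):
    """Average each assumed dimension group into one interpretable score."""
    assert len(vec) == EMBED_DIM
    return {name: sum(vec[s]) / (s.stop - s.start)
            for name, s in TRAIT_SLICES.items()}

# Random stand-in embedding for demonstration
vec = [random.uniform(-1.0, 1.0) for _ in range(EMBED_DIM)]
traits = traits_from_embedding(vec)
```

The same vector serves both audiences: the raw 64 numbers go to downstream ML, while the aggregated scores go to the user.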


How we built it

We built Align as a full-stack, cloud-deployed system:

Frontend

  • Next.js (hosted on Vercel)
  • Dynamic quiz interface with real-time interaction
  • Supports multiple response types

Backend

  • FastAPI + MySQL
  • Deployed on Google Cloud PaaS
  • Handles:
    • Quiz generation and delivery
    • Response storage
    • Profile + trait management
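One common way to store ranking, scale, and text answers in a single table is a type tag plus a JSON-serialized value column. The sketch below (field names are our illustration, not the exact production schema) shows the shape of that design using only the standard library:

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Union
import json

class ResponseType(str, Enum):
    RANKING = "ranking"   # ordered list of option ids
    SCALE = "scale"       # integer on a Likert-style scale
    TEXT = "text"         # free-form answer

@dataclass
class QuizResponse:
    """One stored answer; the shape of `value` depends on `response_type`."""
    user_id: int
    question_id: int
    response_type: ResponseType
    value: Union[List[int], int, str]

    def to_row(self) -> dict:
        # Serialize the polymorphic value into one JSON column so a single
        # MySQL table can hold all three formats side by side.
        return {
            "user_id": self.user_id,
            "question_id": self.question_id,
            "response_type": self.response_type.value,
            "value_json": json.dumps(self.value),
        }

row = QuizResponse(1, 42, ResponseType.RANKING, [3, 1, 2]).to_row()
```

The type tag lets the ML pipeline dispatch on format when it converts rows into model features.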

Machine Learning

  • PyTorch MLP models
  • Trained using Google Cloud
  • Two key components:
    • Personality embedding model (64-dim vector)
    • Question generation/structuring logic
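Our actual embedding model is a PyTorch MLP; as a dependency-free sketch of its shape, here is a plain-Python forward pass that maps structured-response features to a 64-dimensional vector. The layer sizes are assumptions for illustration, and the random weights stand in for trained parameters:

```python
import math
import random

random.seed(0)

def make_layer(n_in, n_out):
    # Random weights as stand-ins for trained parameters
    return [[random.gauss(0.0, 0.1) for _ in range(n_in)] for _ in range(n_out)]

def forward(x, layers):
    """MLP forward pass: matrix-vector products with tanh on hidden layers."""
    for i, W in enumerate(layers):
        x = [sum(w * xi for w, xi in zip(row, x)) for row in W]
        if i < len(layers) - 1:
            x = [math.tanh(v) for v in x]
    return x

# Assumed sizes: 128 response features -> 256 hidden -> 64-dim embedding
layers = [make_layer(128, 256), make_layer(256, 64)]
features = [random.uniform(-1.0, 1.0) for _ in range(128)]
embedding = forward(features, layers)
```

In the real system this forward pass runs in PyTorch, which also gives us the gradients needed for training.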

Key Innovation

Unlike traditional systems, Align:

  • Actively collects high-quality preference data (not passive scraping)
  • Uses structured responses, not just raw text
  • Produces dual outputs:
    • Human-readable traits
    • Machine-readable embeddings
  • Creates a feedback loop:
    • Better data → better questions → better alignment

We’re not just collecting data — we’re optimizing how alignment data is generated.


User Engagement & Trust

A key challenge in alignment is participation — users won’t contribute data unless they trust the system.

We address this by:

  • Providing immediate value:
    • Personality insights
    • Self-understanding
    • Social comparison with friends
  • Ensuring transparency:
    • Users know what data is collected
    • Clear control over public vs private traits
  • Designing for user ownership:
    • Users are collaborators, not products

Challenges we ran into

  • Designing a schema to support multiple response types (ranking, text, scale)
  • Training an MLP to produce meaningful embeddings from sparse user input
  • Generating high-signal questions that avoid redundancy and bias
  • Integrating frontend, backend, and ML systems into a seamless pipeline under hackathon time constraints
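On the redundancy problem: one approach we found natural is greedy selection, where the next question is the candidate least similar (by cosine similarity over question embeddings) to anything already asked. This is a simplified sketch of that idea with toy 2-dimensional embeddings; real question embeddings would come from the model:

```python
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def next_question(candidates, asked):
    """Pick the candidate whose embedding is least similar to any asked question."""
    if not asked:
        return next(iter(candidates))
    return min(candidates,
               key=lambda q: max(cosine(candidates[q], v) for v in asked.values()))

# Toy question embeddings (hypothetical ids and values)
candidates = {"q_values_2": [0.9, 0.1], "q_social": [0.0, 1.0]}
asked = {"q_values": [1.0, 0.0]}
pick = next_question(candidates, asked)  # q_social is least redundant
```

Minimizing the worst-case similarity pushes each new question toward unexplored regions of preference space, which raises the signal per answer.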

What we learned

  • The hardest part of AI systems isn’t always the model — it’s the data pipeline
  • Structured data is far more valuable than unstructured input for downstream ML tasks
  • User trust and engagement are critical when building systems that rely on human data
  • Full-stack + ML integration requires careful system design, even for a prototype

What’s next for Align

  • Compatibility matching using vector similarity
  • Using collected data for LLM fine-tuning / alignment
  • Personalized recommendation systems
  • Expanding to domain-specific alignment (health, ethics, social systems)
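The compatibility-matching idea above can be sketched directly: cosine similarity between two users' personality vectors, rescaled to a friendly 0-100 score. The 4-dimensional toy vectors stand in for the real 64-dimensional embeddings, and the rescaling is our illustration rather than a shipped formula:

```python
import math

def compatibility(u, v):
    """Map cosine similarity of two personality vectors to a 0-100 score."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    cos = dot / norm if norm else 0.0
    return round(50 * (cos + 1))  # [-1, 1] -> [0, 100]

# Toy 4-dim stand-ins for the real 64-dim embeddings
alice = [0.8, -0.2, 0.5, 0.1]
bob = [0.7, -0.1, 0.4, 0.2]
score = compatibility(alice, bob)
```

Because the embeddings already encode preferences, this comparison needs no extra model at match time, just a vector lookup and a dot product.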

Final Thoughts

Align demonstrates that AI alignment starts with better data — and better data starts with people.

We’re building the infrastructure to make that possible.
