Immigrant Voices

Immigrant Voices is a Next.js app that turns immigrant stories into reusable community knowledge. It collects first-person stories about common newcomer challenges in the US, structures those stories into normalized records, and then generates practical learnings by topic.

This project was built for the Immigration Hackathon NYC.

The live deployment is on Vercel: https://immigrant-voices.vercel.app

What the project does

The app focuses on topics that new immigrants often struggle with early on:

First credit card
Healthcare
Housing
Jobs
Banking
Legal paperwork

For each topic, the product does two things:

It shows the original stories.
It generates a topic-level rubric of community learnings grounded in those stories.

Users can also contribute their own stories through the app, and those contributions can feed the next round of rubric generation.

Tech stack

Next.js 14
React 18
TypeScript
Tailwind CSS
Tavily for web search and raw-content retrieval
Together AI for structured extraction and rubric generation
Local JSON files in data/ for persistence

Local setup

Create a .env.local file with:

TAVILY_API_KEY=your_tavily_key
TOGETHER_API_KEY=your_together_key
TOGETHER_MODEL=meta-llama/Llama-3.3-70B-Instruct-Turbo

Install dependencies:

npm install

Run the app locally:

npm run dev

Then open:

http://localhost:3000

Useful scripts

Run story ingestion for the default topic:

npm run ingest:stories

Run story ingestion for a specific topic:

npm run ingest:stories -- --domain housing

Limit how many search queries are used:

npm run ingest:stories -- --domain jobs --limit 3

Cap the number of stored stories for a topic:

npm run ingest:stories -- --domain banking --max-stories 10

Regenerate a rubric for a topic:

npm run extract:rubric -- --domain first-credit-card

Build for production:

npm run build

Start the production server:

npm run start
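
The ingestion flags shown above (--domain, --limit, --max-stories) could be parsed along these lines. This is a minimal sketch, not the code in scripts/ingest-stories.ts: the IngestOptions shape and the function names are hypothetical, and the default topic is left to the script.

```typescript
// Hypothetical option shape; the real script may name these differently.
interface IngestOptions {
  domain?: string; // undefined means "use the script's default topic"
  limit?: number; // maximum number of search queries to run
  maxStories?: number; // cap on stored stories for the topic
}

function toPositiveInt(flag: string, raw: string): number {
  const n = Number(raw);
  if (!Number.isInteger(n) || n <= 0) {
    throw new Error(`${flag} expects a positive integer, got "${raw}"`);
  }
  return n;
}

// Parses "--flag value" pairs, e.g. the arguments after "--" in an npm script call.
function parseIngestArgs(argv: string[]): IngestOptions {
  const options: IngestOptions = {};
  for (let i = 0; i < argv.length; i += 2) {
    const flag = argv[i];
    const value = argv[i + 1];
    if (flag === undefined || value === undefined) {
      throw new Error(`Missing value for ${flag}`);
    }
    switch (flag) {
      case "--domain":
        options.domain = value;
        break;
      case "--limit":
        options.limit = toPositiveInt(flag, value);
        break;
      case "--max-stories":
        options.maxStories = toPositiveInt(flag, value);
        break;
      default:
        throw new Error(`Unknown flag: ${flag}`);
    }
  }
  return options;
}
```

In a script this would typically be called as parseIngestArgs(process.argv.slice(2)).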

App flow

1. Story collection

Stories come from two places:

Web-sourced stories collected by the ingestion script
Direct community submissions through the contribute form

Web-sourced entries are marked as seeded data, while form submissions are stored as contributed stories.

2. Tavily extraction

The ingestion script in scripts/ingest-stories.ts starts with a set of topic-specific search queries from lib/domains.ts.

For each query, we use Tavily to:

Search domains that are likely to contain personal experience posts
Fetch raw page content
Save source metadata into data/sources.json

The project currently prefers sources like Reddit, Quora, Medium, expat forums, and similar sites where first-person accounts are common.
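
The three Tavily steps above can be sketched as a request builder plus a mapper into a source record. This assumes Tavily's REST search endpoint with bearer-token authentication and the include_domains, include_raw_content, and max_results parameters; the project may use Tavily's SDK instead. The domain list, the SourceRecord shape, and the function names are illustrative assumptions, not the contents of data/sources.json.

```typescript
// Assumed domains for the kinds of sites the README names; the real list may differ.
const PREFERRED_DOMAINS = ["reddit.com", "quora.com", "medium.com"];

interface TavilySearchRequest {
  url: string;
  init: { method: "POST"; headers: Record<string, string>; body: string };
}

// Builds the search request without sending it, so it can be inspected or tested.
function buildTavilySearch(query: string, apiKey: string, maxResults = 5): TavilySearchRequest {
  return {
    url: "https://api.tavily.com/search",
    init: {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
        Authorization: `Bearer ${apiKey}`,
      },
      body: JSON.stringify({
        query,
        include_domains: PREFERRED_DOMAINS,
        include_raw_content: true, // ask for full page text, not only snippets
        max_results: maxResults,
      }),
    },
  };
}

// Subset of the fields a Tavily search result carries.
interface TavilyResult {
  title: string;
  url: string;
  raw_content?: string | null;
}

// Hypothetical shape for one entry in data/sources.json.
interface SourceRecord {
  url: string;
  title: string;
  query: string;
  rawContent: string;
  fetchedAt: string;
}

// Skips results with no raw content, since there is nothing to extract from them.
function toSourceRecord(query: string, result: TavilyResult, fetchedAt: string): SourceRecord | null {
  if (!result.raw_content) return null;
  return { url: result.url, title: result.title, query, rawContent: result.raw_content, fetchedAt };
}

// Usage (not executed here):
//   const req = buildTavilySearch("first credit card as a new immigrant", apiKey);
//   const res = await fetch(req.url, req.init);
```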

3. Structured story extraction

After fetching raw content, the app sends each source to the Together model with a strict extraction prompt. That prompt asks the model to keep only valid first-person immigrant stories related to the selected topic.

For accepted stories, we extract:

Contributor name
Country of origin
Arrival year
Cleaned story text
Organizations mentioned
Products or services mentioned
Documents mentioned
Fees or amounts
Other key details

Those normalized records are saved into data/stories.json.
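
A normalized record with the fields listed above might look like the sketch below, together with a defensive parser for the model's JSON reply. The field names, the accepted flag, and parseExtraction are assumptions made for illustration; the actual schema in data/stories.json may differ.

```typescript
type StorySource = "seeded" | "contributed";

// Hypothetical record shape mirroring the extracted fields listed in this section.
interface StoryRecord {
  id: string;
  domain: string;
  source: StorySource;
  contributorName: string | null;
  countryOfOrigin: string | null;
  arrivalYear: number | null;
  story: string;
  organizations: string[];
  products: string[];
  documents: string[];
  fees: string[];
  keyDetails: string[];
}

function asString(value: unknown): string | null {
  return typeof value === "string" && value.trim() !== "" ? value.trim() : null;
}

function asStringArray(value: unknown): string[] {
  if (!Array.isArray(value)) return [];
  return value.filter((item): item is string => typeof item === "string");
}

// Turns the model's raw reply into a record, or null when the reply is
// malformed or the model rejected the source as not a valid story.
function parseExtraction(raw: string, id: string, domain: string): StoryRecord | null {
  let data: unknown;
  try {
    data = JSON.parse(raw);
  } catch {
    return null;
  }
  if (typeof data !== "object" || data === null) return null;
  const fields = data as Record<string, unknown>;
  const story = asString(fields["story"]);
  if (fields["accepted"] !== true || story === null) return null;
  const year = fields["arrivalYear"];
  return {
    id,
    domain,
    source: "seeded", // web-sourced entries; form submissions would use "contributed"
    contributorName: asString(fields["contributorName"]),
    countryOfOrigin: asString(fields["countryOfOrigin"]),
    arrivalYear: typeof year === "number" && Number.isInteger(year) ? year : null,
    story,
    organizations: asStringArray(fields["organizations"]),
    products: asStringArray(fields["products"]),
    documents: asStringArray(fields["documents"]),
    fees: asStringArray(fields["fees"]),
    keyDetails: asStringArray(fields["keyDetails"]),
  };
}
```

Treating every optional field as nullable or an empty list keeps a partially filled reply usable instead of discarding the whole story.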

4. Clustering into recurring themes

Once stories exist for a topic, we look across the whole set and group repeated signals into shared learnings. The clustering is done semantically by the language model reading the stories together, not by computing embeddings and clustering vectors:

All stories are already grouped by topic domain
The rubric generation step compares stories side by side
It identifies repeated actions, obstacles, and successful patterns mentioned by multiple contributors
It turns those repeated patterns into ordered rubric steps

Each generated step must be supported by multiple stories, and the saved rubric keeps the supporting story IDs so every learning is traceable back to the source stories.
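
The "supported by multiple stories" rule can be enforced after generation with a small filter. This is a sketch under assumptions: the RubricStep shape and keepSupportedSteps are hypothetical names, and "multiple" is read as at least two distinct stories.

```typescript
// Hypothetical shape for one generated rubric step.
interface RubricStep {
  title: string;
  why: string;
  supportingStoryIds: string[];
}

// Drops duplicate and unknown story IDs, then keeps only steps that are still
// backed by at least minSupport distinct stories.
function keepSupportedSteps(
  steps: RubricStep[],
  knownStoryIds: Set<string>,
  minSupport = 2,
): RubricStep[] {
  return steps
    .map((step) => ({
      ...step,
      supportingStoryIds: Array.from(new Set(step.supportingStoryIds)).filter((id) =>
        knownStoryIds.has(id),
      ),
    }))
    .filter((step) => step.supportingStoryIds.length >= minSupport);
}
```

Checking IDs against the stored story set also guards against a model citing a story that does not exist.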

5. How we came up with the learnings

The learnings are not written manually. They are synthesized from repeated patterns in the story set.

The rubric generation logic in lib/rubric.ts asks the model to:

Read all stories for one topic
Find only advice that is independently supported by multiple contributors
Explain why the step matters based on the evidence in the stories
Order the steps from prerequisite actions to the main path and then optimization

The output becomes a topic rubric saved in data/rubrics.json.
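
The four instructions above could be turned into a chat request roughly as follows. This sketch assumes Together AI's OpenAI-compatible chat completions endpoint; the prompt wording, the PromptStory shape, and both function names are illustrative and are not taken from lib/rubric.ts.

```typescript
interface PromptStory {
  id: string;
  story: string;
}

interface ChatMessage {
  role: "system" | "user";
  content: string;
}

// Builds the system and user messages for one topic's rubric generation.
function buildRubricMessages(domain: string, stories: PromptStory[]): ChatMessage[] {
  const system = [
    `You are given first-person immigrant stories about the topic "${domain}".`,
    "Return only advice that is independently supported by multiple contributors.",
    "For each step, explain why it matters using evidence from the stories.",
    "Order steps from prerequisites, to the main path, to optimization.",
    "For each step, list the IDs of the stories that support it.",
  ].join("\n");
  // Prefix each story with its ID so the model can cite supporting stories.
  const user = stories.map((s) => `[${s.id}] ${s.story}`).join("\n\n");
  return [
    { role: "system", content: system },
    { role: "user", content: user },
  ];
}

// Builds the HTTP request without sending it.
function buildTogetherRequest(model: string, apiKey: string, messages: ChatMessage[]) {
  return {
    url: "https://api.together.xyz/v1/chat/completions",
    init: {
      method: "POST" as const,
      headers: {
        "Content-Type": "application/json",
        Authorization: `Bearer ${apiKey}`,
      },
      body: JSON.stringify({ model, messages, temperature: 0 }),
    },
  };
}

// Usage (not executed here):
//   const req = buildTogetherRequest(model, apiKey, buildRubricMessages("housing", stories));
//   const res = await fetch(req.url, req.init);
```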

If model-based extraction fails, the app falls back to a heuristic rubric path for resilience.
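
The README does not describe the heuristic itself, so the sketch below only illustrates the fallback pattern. The heuristic shown (promote any document named in at least two stories to a step) is invented for this example, as are all type and function names.

```typescript
interface StoryLite {
  id: string;
  documents: string[];
}

interface FallbackStep {
  title: string;
  supportingStoryIds: string[];
}

// Illustrative heuristic: a document mentioned by several stories becomes a step.
function heuristicRubric(stories: StoryLite[], minSupport = 2): FallbackStep[] {
  const byDocument = new Map<string, Set<string>>();
  for (const story of stories) {
    for (const doc of story.documents) {
      const key = doc.trim().toLowerCase();
      if (key === "") continue;
      const ids = byDocument.get(key) ?? new Set<string>();
      ids.add(story.id);
      byDocument.set(key, ids);
    }
  }
  const steps: FallbackStep[] = [];
  byDocument.forEach((ids, doc) => {
    if (ids.size >= minSupport) {
      steps.push({ title: `Prepare: ${doc}`, supportingStoryIds: Array.from(ids).sort() });
    }
  });
  // Most widely supported steps first.
  return steps.sort((a, b) => b.supportingStoryIds.length - a.supportingStoryIds.length);
}

// Tries the model-backed generator first and falls back only when it throws.
async function generateRubric(
  stories: StoryLite[],
  modelRubric: (stories: StoryLite[]) => Promise<FallbackStep[]>,
): Promise<FallbackStep[]> {
  try {
    return await modelRubric(stories);
  } catch {
    return heuristicRubric(stories);
  }
}
```

Passing the model call in as a function keeps the fallback path testable without network access.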

Data files

data/sources.json: raw source metadata and fetched content
data/stories.json: cleaned, structured story records
data/rubrics.json: generated topic learnings and supporting story references

Main routes

/: landing page with topic overview and featured rubric
/topics/[domain]: topic page with rubric and supporting stories
/stories: browse all stored stories
/contribute: submit a new story

Deployment

This project is deployed on Vercel at: https://immigrant-voices.vercel.app

For Vercel deployment, the required environment variables are:

TAVILY_API_KEY
TOGETHER_API_KEY
TOGETHER_MODEL
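
A startup check like the following can catch a missing variable early, both locally and on Vercel. It is a sketch only: readRequiredEnv is a hypothetical helper, and the project may validate its configuration differently or not at all.

```typescript
const REQUIRED_ENV = ["TAVILY_API_KEY", "TOGETHER_API_KEY", "TOGETHER_MODEL"] as const;
type RequiredEnvName = (typeof REQUIRED_ENV)[number];

// Returns the trimmed values, or throws one error naming every missing variable.
function readRequiredEnv(
  env: Record<string, string | undefined>,
): Record<RequiredEnvName, string> {
  const values = {} as Record<RequiredEnvName, string>;
  const missing: string[] = [];
  for (const name of REQUIRED_ENV) {
    const value = env[name]?.trim();
    if (value) {
      values[name] = value;
    } else {
      missing.push(name); // unset, empty, or whitespace-only
    }
  }
  if (missing.length > 0) {
    throw new Error(`Missing environment variables: ${missing.join(", ")}`);
  }
  return values;
}

// Usage (not executed here): const config = readRequiredEnv(process.env);
```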

Project goal

The core idea is simple: immigrant stories are valuable, but raw stories are hard to reuse at scale. This project turns those stories into structured community knowledge so the next person can learn faster, avoid common mistakes, and start from what already worked for others.
