Classify and Label Rows
When you have hundreds of rows that each need consistent, structured labels across multiple categories, you need batched parallel evaluation with a fixed schema.
Here, we classify 200 job postings from Hacker News "Who's Hiring" into 9 role categories in under a minute.
| Metric | Value |
|---|---|
| Rows | 200 |
| Time | 1.0 minutes |
| Cost | $1.53 |
| Cost per row | $0.008 |
Add FutureSearch to Claude Code if you haven't already:
claude mcp add futuresearch --scope project --transport http https://mcp.futuresearch.ai/mcp
Download the dataset: hn_jobs_classify.csv (3,616 Hacker News "Who's Hiring" posts). With the CSV in your working directory, tell Claude:
Classify each row in hn_jobs_classify.csv by primary engineering role. Use these categories:
backend, frontend, fullstack, data, ml_ai, devops_sre, mobile, security, other.
For each row, return the category and a short reasoning.
Claude calls FutureSearch's agent MCP tool with your schema, then polls for progress:
Tool: futuresearch_agent
├─ task: "Classify this job posting by primary role: backend, frontend, fullstack..."
├─ input_csv: "/Users/you/hn_jobs_classify.csv"
└─ response_schema: {"category": "enum", "reasoning": "string"}
→ Submitted: 200 rows for processing.
Session: https://futuresearch.ai/sessions/4d82663b-c681-4b12-a48a-17533f162b51
Task ID: 4d82...
Tool: futuresearch_progress
└─ task_id: "4d82..."
→ Running: 0/200 complete, 200 running (10s elapsed)
Tool: futuresearch_progress
→ Running: 100/200 complete, 100 running (30s elapsed)
...
Tool: futuresearch_progress
→ Completed: 200/200 (0 failed) in 59s.
Tool: futuresearch_results
├─ task_id: "4d82..."
└─ output_path: "/Users/you/classified_jobs.csv"
→ Saved 200 rows to /Users/you/classified_jobs.csv
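The submit-then-poll pattern in the trace above can be sketched generically. This is only an illustration of the control flow, not FutureSearch's client code: `fetch_progress` is a hypothetical callable standing in for whatever reports the completed and total row counts, and the interval and timeout values are arbitrary assumptions.

```python
import time
from typing import Callable, Tuple


def poll_until_done(
    fetch_progress: Callable[[], Tuple[int, int]],  # hypothetical: returns (completed, total)
    interval_s: float = 10.0,
    timeout_s: float = 600.0,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Call fetch_progress until every row is complete; return the number of polls."""
    waited = 0.0  # total time spent sleeping between polls
    polls = 0
    while True:
        completed, total = fetch_progress()
        polls += 1
        print(f"{completed}/{total} complete")
        if completed >= total:
            return polls
        if waited >= timeout_s:
            raise TimeoutError(f"only {completed}/{total} rows finished after {waited:.0f}s")
        sleep(interval_s)
        waited += interval_s


# Simulated progress readings mirroring the trace above (0, 100, then 200 of 200).
readings = iter([(0, 200), (100, 200), (200, 200)])
polls = poll_until_done(lambda: next(readings), sleep=lambda _: None)
```

Passing `sleep` as a parameter keeps the loop testable without real waiting; in normal use the default `time.sleep` applies.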
Add the FutureSearch connector if you haven't already. Then upload hn_jobs_classify.csv and ask Claude:
Classify each row by primary engineering role. Use these categories: backend, frontend, fullstack, data, ml_ai, devops_sre, mobile, security, other. For each row, return the category and a short reasoning.
Go to futuresearch.ai/app, upload hn_jobs_classify.csv, and enter:
Classify each row by primary engineering role. Use these categories: backend, frontend, fullstack, data, ml_ai, devops_sre, mobile, security, other. For each row, return the category and a short reasoning.
pip install futuresearch
export FUTURESEARCH_API_KEY=your_key_here # Get one at futuresearch.ai
import asyncio
from typing import Literal

import pandas as pd
from pydantic import BaseModel, Field

from futuresearch.ops import agent_map


class JobClassification(BaseModel):
    category: Literal[
        "backend", "frontend", "fullstack", "data",
        "ml_ai", "devops_sre", "mobile", "security", "other"
    ] = Field(description="Primary role category")
    reasoning: str = Field(description="Why this category was chosen")


async def main():
    jobs = pd.read_csv("hn_jobs_classify.csv")
    result = await agent_map(
        task="""Classify this job posting by primary role:
        - backend: Server-side, API development
        - frontend: UI, web development
        - fullstack: Both frontend and backend
        - data: Data engineering, pipelines, analytics
        - ml_ai: Machine learning, AI, deep learning
        - devops_sre: Infrastructure, platform engineering
        - mobile: iOS, Android development
        - security: Security engineering
        - other: Product, design, management, etc.
        """,
        input=jobs,
        response_model=JobClassification,
    )
    print(result.data[["id", "category", "reasoning"]])


asyncio.run(main())
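The dataset holds 3,616 posts, while the run reported above covers 200 rows. The source does not say how those 200 were selected. One possible approach, shown here purely as an assumption, is to draw a reproducible random sample before submitting. The function name, output filename, and seed are illustrative.

```python
import csv
import random


def sample_rows(src_path: str, dst_path: str, n: int, seed: int = 0) -> int:
    """Write a reproducible random sample of n data rows (plus header) to dst_path."""
    with open(src_path, newline="", encoding="utf-8") as f:
        reader = csv.reader(f)
        header = next(reader)
        rows = list(reader)
    # Never ask for more rows than the file contains.
    chosen = random.Random(seed).sample(rows, min(n, len(rows)))
    with open(dst_path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow(header)
        writer.writerows(chosen)
    return len(chosen)


# Example (assumes hn_jobs_classify.csv is in the working directory):
# sample_rows("hn_jobs_classify.csv", "hn_jobs_sample.csv", n=200)
```

If the data is already loaded with pandas, `jobs.sample(n=200, random_state=0)` gives a comparable reproducible subset of the DataFrame.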
Use Python's Literal type to restrict classifications to a fixed set of values. The LLM is then constrained to return only values from that set. For multi-label cases, use list[str]:
class MultiLabel(BaseModel):
    tags: list[str] = Field(description="All applicable tags for this item")
    primary_tag: str = Field(description="The most relevant tag")
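Note that `list[str]` places no restriction on the tag vocabulary. If a fixed vocabulary matters, one option is to check labels after the fact. The sketch below uses only the standard library. The `Category` alias reuses the nine categories from this guide, and `invalid_labels` is a hypothetical helper, not part of FutureSearch.

```python
from typing import Iterable, List, Literal, get_args

Category = Literal[
    "backend", "frontend", "fullstack", "data",
    "ml_ai", "devops_sre", "mobile", "security", "other",
]

# get_args recovers the allowed values from the Literal type.
ALLOWED = frozenset(get_args(Category))


def invalid_labels(labels: Iterable[str]) -> List[str]:
    """Return the labels that are not in the allowed set, in order of appearance."""
    return [label for label in labels if label not in ALLOWED]


bad = invalid_labels(["backend", "ml_ai", "blockchain"])
```

In a Pydantic model, annotating the field as `list[Literal[...]]` instead of `list[str]` would express the same constraint in the schema. Whether that changes how FutureSearch constrains the model output is not stated here, so treat it as something to verify.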
Results
| Category | Count | % |
|---|---|---|
| fullstack | 88 | 44.0% |
| other | 32 | 16.0% |
| backend | 21 | 10.5% |
| ml_ai | 20 | 10.0% |
| data | 13 | 6.5% |
| mobile | 9 | 4.5% |
| devops_sre | 8 | 4.0% |
| frontend | 6 | 3.0% |
| security | 3 | 1.5% |
Fullstack dominates (44%), most likely because Hacker News "Who's Hiring" posts come predominantly from startups hiring generalist engineers.
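A breakdown like the table above can be computed from the output file with the standard library. This sketch assumes the output CSV has a `category` column, as the response schema suggests; the function name is illustrative.

```python
import csv
from collections import Counter
from typing import List, Tuple


def category_breakdown(csv_path: str) -> List[Tuple[str, int, float]]:
    """Return (category, count, percent) tuples, most common first."""
    with open(csv_path, newline="", encoding="utf-8") as f:
        counts = Counter(row["category"] for row in csv.DictReader(f))
    total = sum(counts.values())
    return [
        (category, count, round(100 * count / total, 1))
        for category, count in counts.most_common()
    ]


# Example (assumes classified_jobs.csv from the run above):
# for category, count, pct in category_breakdown("classified_jobs.csv"):
#     print(f"{category:<12} {count:>4} {pct:>5.1f}%")
```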
Cost scales roughly linearly with row count, while run time grows more slowly:
| Rows | Estimated Cost | Estimated Time |
|---|---|---|
| 100 | ~$1 | ~1 min |
| 1,000 | ~$9 | ~5 min |
| 10,000 | ~$90 | ~30 min |
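For a rough budget check, the measured per-row cost can be extrapolated directly. This is a back-of-the-envelope sketch that assumes cost stays proportional to row count at the rate observed in the 200-row run ($1.53 for 200 rows); actual cost will likely vary with the input. The rounded estimates in the table above come out somewhat higher than this naive projection.

```python
MEASURED_COST_USD = 1.53  # total cost of the 200-row run reported above
MEASURED_ROWS = 200


def estimate_cost(rows: int) -> float:
    """Linearly extrapolate cost in USD from the measured per-row rate."""
    return round(rows * MEASURED_COST_USD / MEASURED_ROWS, 2)


for n in (100, 1_000, 10_000):
    print(f"{n:>6} rows: ~${estimate_cost(n):.2f}")
```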
Built with FutureSearch. See the agent_map documentation for more options including response models and effort levels.