Inspiration
We've all been there! Scrolling through X, knowing someone in our network could help with a job lead, a warm intro, domain expertise, or just a night out, but having no idea where to start. Your social graph is one of your most valuable assets, yet it's completely opaque and tedious to navigate.
Many people know the idea of six degrees of separation (popularized as "Bacon's Law"), which holds that any two people on Earth are connected by six or fewer acquaintance links. However, we've realized that your 2nd-degree connections are just as powerful. When it comes to meeting new people, no one wants to be dropped into a room full of strangers. We want warm intros. We want mutual context. We want to know why this person matters and how we relate before we reach out.
We saw a gap in the X API ecosystem. The data exists: follower graphs, profile metadata, recent posts. But no one was combining graph traversal with modern AI to make networks searchable by intent. Want to find "ML engineers interested in startups in SF"? Maybe you're looking for "Engineers who are training for the marathon." Right now, queries like these return ambiguous or incomplete results on X.
neXus was born from frustration with the disconnect between how powerful our networks are and how primitive the tools to navigate them remain. We set out to build the missing layer of intelligence on top of X. One that makes your social graph fully searchable, context-aware, and relationship-driven. Instead of digging through profiles or relying on guesswork, neXus turns your network into a living, queryable map of opportunities.
What it does
neXus is an AI-powered intelligence layer for your social graph. It transforms your static following list into a dynamic, searchable knowledge base of human capital.
1. Deep Graph Traversal & Ingestion
The moment you authenticate via Twitter OAuth 2.0, neXus begins mapping your digital world. It doesn't just look at who you follow; it traverses the edges of your network to index your 1st-degree connections and discovers high-relevance 2nd-degree nodes (friends of friends). We extract rich metadata such as bios, locations, and most importantly, historical post content to build a comprehensive persona for every user in your orbit.
2. Semantic & Intent-Based Search
Standard keyword search fails when you need nuance. neXus utilizes RAG (Retrieval-Augmented Generation) and vector embeddings to understand the meaning behind your query.
- Input: "Founders in SF building in crypto."
- Process: The system doesn't just look for the string "crypto"; it understands related concepts like "web3," "blockchain," "tokens," and "decentralized."
- Output: A ranked list of profiles, filtered by relevance, even if they don't explicitly have the word "crypto" in their bio.
3. Network Intelligence & Visualization
We render your social graph in a 3D interactive environment, allowing you to visually explore clusters and outliers. On the individual level, every profile card provides immediate context:
- Degree Badges: Instantly know if someone is a direct connection (1st) or a warm intro away (2nd).
- Match Reasoning: AI-generated explanations of why this person appeared in your search.
- Mutuals: See exactly who bridges the gap between you and a potential opportunity.
4. AI-Generated Warm Intros
The "last mile" of networking is the cold DM. neXus eliminates writer's block. By analyzing your profile against their recent posts and interests, our Intro Generator crafts highly personalized, context-aware opening messages. It identifies shared topics, recent wins, or mutual connections to ensure your first interaction is warm, relevant, and human.
How we built it
neXus is architected as a modern two-tier application, leveraging the best-in-class tools for rapid iteration and high-performance data processing.
The Stack
- Frontend: Built with Next.js 16 (App Router) and React 19, styled with Tailwind CSS and shadcn/ui for a clean, professional aesthetic. We used Framer Motion for complex state transitions, and the 3D network visualization is powered by react-force-graph-3d and three.js.
- Backend: A robust FastAPI (Python) server handles our business logic. We chose Python for its superior ecosystem in data science and AI integration.
- Database: We use PostgreSQL via Supabase as our primary datastore, utilizing SQLAlchemy for ORM-based interactions.
- AI & Inference: xAI's Grok powers our core intelligence, including persona summarization and intro generation, chosen for its "real-time" understanding of the X ecosystem. Google Gemini handles the high-throughput vector embedding generation for our semantic search.
- Infrastructure: The entire stack is containerized and deployed, with Poetry managing strict dependency resolution for the backend.
The Pipeline: From API to Insight
Auth & Security: We implemented OAuth 2.0 with PKCE to securely authenticate users via the Twitter API v2. This grants us a user-context access token, allowing us to respect rate limits while fetching private social graph data.
The "Snowball" Scraper: Our custom-built scraping engine performs a breadth-first traversal of the user's graph. It fetches 1st-degree connections, then recursively fetches their connections to discover the 2nd-degree network. To handle Twitter's strict rate limits, we implemented a smart queuing system with exponential backoff and batch processing.
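In sketch form, the breadth-first traversal with exponential backoff looks roughly like this. The fetch_following stub stands in for the real X API v2 following endpoint, and the canned graph and backoff constants are purely illustrative:

```python
import asyncio
from collections import deque

async def fetch_following(user_id: str) -> list[str]:
    # Placeholder for the real X API v2 call (GET /2/users/:id/following).
    # A canned graph keeps the sketch runnable.
    graph = {"me": ["a", "b"], "a": ["b", "c"], "b": ["d"]}
    return graph.get(user_id, [])

async def snowball(seed: str, max_depth: int = 2) -> dict[str, int]:
    """BFS: depth 0 is the seed, 1 is direct follows, 2 is friends-of-friends."""
    seen = {seed: 0}
    queue = deque([(seed, 0)])
    backoff = 1.0
    while queue:
        user, depth = queue.popleft()
        if depth >= max_depth:
            continue  # never expand past the 2nd degree
        try:
            following = await fetch_following(user)
            backoff = 1.0  # reset after a successful call
        except RuntimeError:  # stand-in for a 429 rate-limit response
            await asyncio.sleep(backoff)
            backoff = min(backoff * 2, 60)  # exponential backoff, capped
            queue.appendleft((user, depth))  # retry the same node
            continue
        for handle in following:
            if handle not in seen:
                seen[handle] = depth + 1
                queue.append((handle, depth + 1))
    return seen
```

The real scraper batches these calls; the retry-with-backoff shape is the part that matters.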
RAG & Vectorization: Raw data is noisy. We pass scraped profiles and their last ~50 tweets through an ETL pipeline.
- Extraction: We pull text content, removing URLs and media.
- Summarization: Grok generates a dense "persona summary" for each user (e.g., "AI Researcher focused on LLMs, based in London").
- Embedding: These summaries are passed to Gemini to create high-dimensional vector embeddings, which are stored in our vector database.
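The three ETL steps above chain together as a small pipeline. The summarize and embed stubs below stand in for the Grok and Gemini calls, which we can't reproduce here; only the extraction step is real logic:

```python
import re

def extract_text(tweets: list[str]) -> str:
    """Extraction: strip URLs and collapse whitespace before summarization."""
    cleaned = [re.sub(r"https?://\S+", "", t) for t in tweets]
    return " ".join(" ".join(cleaned).split())

def summarize(text: str) -> str:
    # Stand-in for the Grok persona-summary call.
    return text[:100]

def embed(summary: str) -> list[float]:
    # Stand-in for the Gemini embedding call.
    return [float(ord(c)) for c in summary[:8]]

def profile_to_vector(tweets: list[str]) -> list[float]:
    """Extraction -> summarization -> embedding, in one pass."""
    return embed(summarize(extract_text(tweets)))
```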
Semantic Search Engine: When a user searches "Founders hiring engineers," we don't just grep the database. We embed the query into the same vector space and perform a cosine similarity search against our indexed profiles. This returns results that are semantically related, even without exact keyword matches.
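The core of that search step is a cosine similarity ranking over the embedded profiles. In production this is a vector-index query (e.g. via pgvector); the pure-Python sketch below shows the math, with tiny 2-D vectors standing in for real embeddings:

```python
import math

def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity: dot product over the product of norms."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def search(query_vec: list[float], index: dict[str, list[float]], k: int = 3) -> list[str]:
    """Rank every indexed profile by similarity to the query embedding."""
    ranked = sorted(index.items(), key=lambda kv: cosine(query_vec, kv[1]), reverse=True)
    return [handle for handle, _ in ranked[:k]]
```

Because the query and the persona summaries live in the same vector space, "founders hiring engineers" can land near a profile that never uses those exact words.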
Generative UX: The intro generator uses a few-shot prompting technique with Grok. We feed the model the target's recent tweets, the user's bio, and the "mutuals" context to dynamically generate a message that feels authentic, not automated.
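Assembling that few-shot prompt is mostly string plumbing. The template below is illustrative, not our production prompt; the field names and example format are assumptions:

```python
def build_intro_prompt(
    user_bio: str,
    target_tweets: list[str],
    mutuals: list[str],
    examples: list[tuple[str, str]],
) -> str:
    """Few-shot prompt: worked (context, intro) examples, then the live context."""
    shots = "\n\n".join(f"Context: {ctx}\nIntro: {intro}" for ctx, intro in examples)
    context = (
        f"My bio: {user_bio}\n"
        f"Their recent posts: {' | '.join(target_tweets[:5])}\n"
        f"Mutuals: {', '.join(mutuals) or 'none'}"
    )
    # The trailing "Intro:" cues the model to complete with a message.
    return f"{shots}\n\nContext: {context}\nIntro:"
```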
Challenges we ran into
The most significant technical hurdle was architecting around the X API's strict rate limits. Building a comprehensive social graph requires thousands of data points, but the API constraints forced us to rethink our entire data ingestion strategy. We couldn't simply "brute force" the graph traversal.
1. The Rate Limit Bottleneck
A fully populated graph is impossible to build with naive API calls. Every request is expensive. We initially hit rate limits almost immediately, which stalled development and broke the user experience.
- Solution: We moved away from real-time fetching for everything. Instead, we treat the X API as a "seeder" rather than a live stream. We implemented a "Snowball" sampling strategy: we fetch 100% of 1st-degree connections but intelligently sample the top N most relevant 2nd-degree connections based on follower weight and mutual overlap.
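The "top N most relevant" selection boils down to a scoring heuristic. The weights below (log-dampened follower count plus a mutual-overlap bonus) are illustrative, not our actual tuning:

```python
import math

def relevance(followers: int, mutual_overlap: int) -> float:
    # Log-dampen follower count so mega-accounts don't dominate,
    # then reward shared mutuals. Coefficients here are illustrative.
    return math.log10(1 + followers) + 2.0 * mutual_overlap

def sample_top_n(candidates: list[dict], n: int) -> list[dict]:
    """Keep only the n highest-scoring 2nd-degree candidates."""
    return sorted(
        candidates,
        key=lambda c: relevance(c["followers"], c["mutuals"]),
        reverse=True,
    )[:n]
```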
2. Database Persistence & Optimization
Because API calls are so precious, we couldn't afford to fetch the same profile twice. We realized that a user-centric database (where every user has their own "connections" table) would lead to massive data redundancy and wasted API credits.
- Solution: We designed a global, shared graph schema.
- Unified Profiles Table: If User A and User B both follow Elon Musk, we store Elon's profile once in a global x_profiles table.
- Edge-Based Graph: We store relationships in a lightweight x_follows table.
- Staleness Checks: We added a last_updated_at column. When our scraper encounters a profile, it checks this timestamp. We only trigger a fresh X API call if the data is older than 24 hours, significantly preserving our rate limit budget.
3. Deployment & Timeouts
Our initial plan was a simple Vercel deployment. However, our "Snowball" scraping and RAG pipeline—which involves fetching posts, generating summaries with Grok, and creating embeddings with Gemini—are long-running processes that often exceed serverless execution limits (e.g., Vercel's 10-second timeout).
- Solution: We decoupled the architecture.
- Frontend: Remains on Vercel for edge performance.
- Backend: Migrated to a dedicated FastAPI container capable of handling long-lived asynchronous tasks.
- Asynchronous Processing: We rewrote our scraper to use Python's asyncio and background tasks, allowing the API to return an immediate "Accepted" status to the frontend while the heavy lifting happens in the background.
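The fire-and-forget shape can be sketched without FastAPI specifics. In the real backend this is a route handler using background tasks; the plain-asyncio version below shows the "return 202 immediately, work continues" idea, with all names illustrative:

```python
import asyncio

JOBS: dict[str, str] = {}  # in-memory stand-in for real job state

async def scrape_network(user_id: str) -> None:
    """Stand-in for the long-running snowball + RAG pipeline."""
    JOBS[user_id] = "running"
    await asyncio.sleep(0)  # real work: API calls, summarization, embedding
    JOBS[user_id] = "done"

async def start_scrape(user_id: str) -> dict:
    """Endpoint-style handler: schedule the work, respond immediately."""
    asyncio.ensure_future(scrape_network(user_id))
    return {"status": "accepted", "user_id": user_id}  # i.e. HTTP 202
```

The frontend then polls job state instead of blocking on the scrape.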
4. Caching Layers
To further reduce latency and API dependency, we implemented aggressive caching.
- Session Caching: User auth tokens and basic profile data are cached to prevent re-fetching on every page load.
- Vector Caching: Once a profile is embedded, we store the vector. We only re-embed if the underlying profile summary changes, saving costs and time on the Gemini API.
Accomplishments that we're proud of
1. Architecting a "Global" Social Graph
One of our biggest technical wins was moving away from siloed user data. Instead of creating a separate list of connections for every user, we designed a unified, global schema in PostgreSQL (x_profiles and x_follows tables). This means if User A scrapes a popular profile, that data is instantly indexed and available for User B. This shared architecture drastically reduced our API consumption and storage overhead, turning every new user's signup into a net benefit for the entire platform's intelligence.
2. Solving the RAG "Statement Timeout"
We encountered severe database timeouts when attempting to generate embeddings for hundreds of profiles simultaneously. We are proud of the robust batch processing pipeline we engineered in our backend. By implementing a smart chunking strategy (processing profiles in batches of 50), adding calculated delays, and optimizing our SQLAlchemy queries to fetch only essential text fields, we successfully stabilized the pipeline. We can now ingest, summarize, and vector-embed massive datasets without crashing our Supabase instance or hitting API rate limits.
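The batching pattern described above is simple but was the fix. A minimal sketch (the per-batch Gemini call and bulk upsert are elided; batch size and delay are the knobs we tuned):

```python
import time

def chunked(items: list, size: int):
    """Yield fixed-size slices of a list."""
    for i in range(0, len(items), size):
        yield items[i : i + size]

def embed_profiles(profiles: list[str], batch_size: int = 50, delay_s: float = 0.0) -> int:
    """Process profiles in batches with a pause between them,
    so no single statement holds the database long enough to time out."""
    done = 0
    for batch in chunked(profiles, batch_size):
        # each batch would be one embedding call + one bulk UPSERT
        done += len(batch)
        time.sleep(delay_s)
    return done
```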
3. Full OAuth 2.0 PKCE Implementation
Authentication is often taken for granted, but implementing the Twitter OAuth 2.0 Authorization Code Flow with PKCE from scratch was a significant hurdle. We built a secure handshake between our Next.js frontend and FastAPI backend, ensuring that we not only sign users in but also capture and refresh their access tokens to perform actions on their behalf securely.
4. Smart "Snowball" Scraping
We built a scraper that doesn't just grab a list. It traverses. We implemented logic to fetch 1st-degree connections and then intelligently identify the most relevant 2nd-degree nodes. We combined this with a "last_scraped" timestamp check to ensure we never waste an API credit on a profile we have updated in the last 24 hours.
5. The Interactive 3D Network
We successfully integrated react-force-graph-3d with Next.js 16 to create a cute, interactive landing page. It is not just a pre-rendered video. It is a live simulation. We built custom logic to "push" nodes to the periphery of the screen so they don't obscure the main content, and we implemented performant hover effects that mix three.js lighting with standard DOM tooltips.
What we learned
1. The Cost of Data is Real
The most immediate lesson was the sheer expense of the X API, both in terms of rate limits and potential monetary cost. We learned that relying on a third-party platform's data firehose is not viable for a scalable product without a significant budget. This forced us to be clever engineering economists, optimizing every single query and caching aggressively to maximize the value of every token we spent.
2. The Power (and Pain) of Distributed Systems
Separating our frontend (Next.js) and backend (FastAPI) taught us the complexities of distributed state. We had to manage CORS policies, handle asynchronous handshakes, and debug connection timeouts between Vercel's edge network and our containerized Python server. We learned that "it works on localhost" is only 10% of the battle; the real challenge is making two distinct frameworks talk to each other reliably in production.
3. Graph Data vs. Relational Tables
We started by trying to force a social graph into standard relational tables. We quickly learned that querying "friends of friends" (2nd-degree connections) in SQL can become exponentially slow without the right indices and schema design. We learned how to structure our x_follows table to act closer to an edge list, allowing us to perform graph-like traversals within a Postgres environment.
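A friends-of-friends lookup over an edge-list table reduces to a self-join. The sketch below uses in-memory SQLite so it runs anywhere (our real store is Postgres), and the follower/followee column layout is an assumption about the schema:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE x_follows (follower TEXT, followee TEXT);
INSERT INTO x_follows VALUES
  ('me','a'), ('me','b'), ('a','c'), ('b','c'), ('b','d'), ('c','me');
""")

# Friends-of-friends via a self-join on the edge list,
# excluding the seed and anyone already followed directly.
SECOND_DEGREE = """
SELECT DISTINCT e2.followee
FROM x_follows e1
JOIN x_follows e2 ON e2.follower = e1.followee
WHERE e1.follower = ?
  AND e2.followee != ?
  AND e2.followee NOT IN (SELECT followee FROM x_follows WHERE follower = ?)
ORDER BY e2.followee;
"""

rows = [r[0] for r in conn.execute(SECOND_DEGREE, ("me", "me", "me"))]
```

With indices on (follower) and (followee), this stays fast; without them the join degrades quickly as the edge list grows.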
4. Vector Search is Not Magic
We learned that RAG is not just about dumping text into a vector database. The quality of our search results depended entirely on the quality of our inputs. We spent hours refining the "summarization" step, using Gemini to condense 50 tweets into a concise persona, before embedding it. We learned that garbage in equals garbage out, and that structured data preparation is the most critical part of any AI pipeline.
5. Asynchronous User Experience
We learned that users will not wait 60 seconds for a scraper to finish. We had to shift our mental model from "request-response" to "trigger-and-poll." Learning to provide immediate UI feedback (like "Scraping started...") while handling heavy compute tasks in the background was essential for keeping the application feeling snappy and responsive.
What's next for neXus
1. True Graph Database Migration
While Postgres has served us well for the MVP, we plan to migrate our connection data to a dedicated graph database like Neo4j. This would unlock ultra-fast 3rd- and 4th-degree traversals, allowing us to answer complex queries like "Who is the strongest path to introduce me to Sam Altman?" in milliseconds.
2. Autonomous "Network Agent"
We envision neXus evolving from a passive search tool into an active agent. Imagine a background worker that monitors your network 24/7. When a high-value connection posts about a topic you're interested in, or when a 2nd-degree connection moves to your city, neXus could proactively alert you and draft an intro message before you even know you need it.
3. The "Social CRM"
We want to build the ultimate relationship management tool. This includes features like automatic catch-up reminders based on interaction frequency, profile note-taking, and integration with calendar APIs to log when you actually meet these connections IRL.
4. Cross-Platform Intelligence
Your network doesn't just live on X. The long-term vision for neXus is to aggregate your social graph across LinkedIn, GitHub, and Substack, creating a unified identity layer that gives you a complete 360-degree view of your professional world.
neXus started as a hackathon project to solve a personal frustration, but it has grown into a proof of concept for a new way to network. We believe that in the age of AI, your social capital shouldn't be hidden in a black box. It should be open, intelligent, and working for you.
Thank you to X for all of those credits and a great weekend!
Built With
- amazon-web-services
- docker
- ecs
- fastapi
- framer
- grok
- next.js
- oauth
- pgvector
- poetry
- postgresql
- python
- react
- shadcn
- sqlalchemy
- supabase
- tailwind
- three.js
- typescript
- vercel
- x