RC.JS — Devpost Submission
Inspiration
We were frustrated by the same problem every student, researcher, and founder hits: you're deep into a complex topic (distributed systems, a product strategy, a legal brief), and your workflow is dumping PDFs into a chatbot and getting a wall of text back. You can't see how the ideas connect. You can't trace where a claim actually came from. And when you close the tab, that understanding disappears.
Using AI for your academic and personal workflows is amazing, but we wanted to push it further. What if you could see your knowledge? Not as a list of documents, but as a living, navigable 3D map of concepts, questions, tensions, and evidence — something you could walk through, interrogate, and share with others?
That question became RC.JS: a spatial thinking workspace where AI doesn't just answer questions — it builds you a knowledge graph to think with.
What it does
RC.JS turns any collection of sources (PDFs, notes, URLs, GitHub repositories, Slack exports) into an interactive 3D knowledge graph. Here's how it works:
Start with a prompt. You land on the page and tell RC.JS what you're thinking about: "Prepare for my distributed systems exam" or "Understand this codebase." It creates a workspace.
Add your sources. Upload a PDF, paste a note, connect a public GitHub repo. RC.JS processes everything in the background through a multi-stage AI pipeline.
Explore a knowledge graph. Your sources transform into a 3D constellation of typed nodes — concepts, entities, questions, tasks, insights — connected by meaningful edges (supports, contradicts, relates to). Node shapes encode type. Evidence strength rings show how well-supported each concept is. Cluster halos group related ideas.
Ask grounded questions. Type a question and get a detailed, citation-backed answer that references specific nodes in the graph as clickable pills. The evidence path lights up — you can see exactly which nodes and edges produced the answer. Every claim traces back to a quoted source excerpt.
Discover intelligently. The system detects contradictions between your sources, identifies unexplored areas of the graph, and tracks your exploration progress — like a research GPS that guides you toward the most important things you haven't looked at yet.
Publish and fork. Keep your workspace private by default. When you've built something valuable, publish it to the public gallery. Other users can browse, explore, ask their own questions, fork it into their own workspace, and even suggest additions back.
How we built it
Frontend: Next.js with TypeScript, Tailwind CSS, and shadcn/ui for the interface layer. The 3D graph is powered by react-force-graph-3d with custom Three.js rendering — each node type has a distinct geometry (icosahedrons for concepts, spheres for sources, tetrahedrons for questions, boxes for tasks), with UnrealBloomPass post-processing for the stellar aesthetic. A 5,000-star particle field and nebula dust create the space observatory feel. Zustand manages all client state. Framer Motion handles transitions.
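The node-type-to-geometry mapping above can be sketched as a small pure function. This is a hedged sketch: the shape assignments for concepts, sources, questions, and tasks come straight from the description, but the size formula and the `GeometrySpec` type are illustrative assumptions, not our exact rendering code.

```typescript
// Map each graph node type to the Three.js geometry it renders with.
// Shape names correspond to THREE.IcosahedronGeometry, SphereGeometry, etc.
type NodeType = "concept" | "source" | "question" | "task";

interface GeometrySpec {
  shape: "icosahedron" | "sphere" | "tetrahedron" | "box";
  size: number;
}

function geometryFor(type: NodeType, importance = 1): GeometrySpec {
  const shapes: Record<NodeType, GeometrySpec["shape"]> = {
    concept: "icosahedron",
    source: "sphere",
    question: "tetrahedron",
    task: "box",
  };
  // Illustrative sizing: scale by sqrt(importance) so central concepts
  // read larger without dwarfing the rest of the constellation.
  return { shape: shapes[type], size: 4 * Math.sqrt(importance) };
}
```

Keeping the mapping in one place makes it trivial to feed `nodeThreeObject` in react-force-graph-3d from a single source of truth.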
Backend: Next.js API route handlers backed by Supabase for auth, Postgres, pgvector, and file storage. Row-level security policies enforce private/public access. A background worker pipeline processes sources through normalize → chunk → embed → extract → cleanup → persist stages, with ingestion jobs tracked in Postgres for progress reporting.
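The staged worker pipeline can be sketched as an ordered list of async handlers over a shared job record. The stage names mirror the real pipeline; the `Job` shape and the progress callback are hypothetical stand-ins for the Postgres-backed ingestion jobs table.

```typescript
// Run an ingestion job through the fixed stage sequence, recording
// progress after each stage (in production this updates Postgres).
type Stage = "normalize" | "chunk" | "embed" | "extract" | "cleanup" | "persist";

interface Job {
  sourceId: string;
  stage?: Stage;
  data: Record<string, unknown>;
}

const STAGES: Stage[] = ["normalize", "chunk", "embed", "extract", "cleanup", "persist"];

async function runPipeline(
  job: Job,
  handlers: Record<Stage, (job: Job) => Promise<void>>,
  onProgress: (stage: Stage, done: number, total: number) => void,
): Promise<Job> {
  for (const [i, stage] of STAGES.entries()) {
    job.stage = stage; // persisted so the UI can report stage-level progress
    await handlers[stage](job);
    onProgress(stage, i + 1, STAGES.length);
  }
  return job;
}
```

Because each stage only sees the shared job context, stages stay independently testable and a failed job can resume from its recorded stage.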
AI Pipeline (Gemini throughout):
- Document understanding: Gemini Files API for PDFs and long-form source material with local text extraction as the primary path and Gemini OCR as fallback
- Graph extraction: Gemini 2.5 Flash with structured output schemas to extract typed nodes, edges, clusters, open questions, and recommended tasks as validated JSON
- Embeddings: gemini-embedding-001 for chunk-level semantic similarity, powering both the retrieval pipeline and semantic edge creation
- Grounded answering: Gemini 2.5 Pro with function calling (get_node_context, search_workspace) for tool-augmented Q&A that can pull additional evidence mid-generation. Answers include inline node references, 2–5 citations with quoted text, and follow-up suggestions
- Context caching: Workspace-level content caching for repeated queries against the same source bundle
- Graph commands: The answer pipeline can return optional graph actions (filter, highlight, focus, compare) enabling natural language navigation of the 3D graph
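The tool-augmented Q&A loop above can be sketched as a bounded function-calling round-trip. This is a simplified sketch: `ask` stands in for the Gemini SDK call, and the tool names match the ones listed (`get_node_context`, `search_workspace`); everything else is an illustrative assumption.

```typescript
// Generic two-round tool loop: let the model request evidence via tools,
// feed the results back, and force a final answer when the budget is spent.
interface ToolCall {
  name: "get_node_context" | "search_workspace";
  args: Record<string, string>;
}
interface ModelTurn { text?: string; toolCalls?: ToolCall[]; }

async function answerWithTools(
  ask: (history: string[]) => Promise<ModelTurn>,
  tools: Record<ToolCall["name"], (args: Record<string, string>) => Promise<string>>,
  question: string,
  maxRounds = 2,
): Promise<string> {
  const history = [question];
  for (let round = 0; round < maxRounds; round++) {
    const turn = await ask(history);
    if (!turn.toolCalls?.length) return turn.text ?? "";
    // Execute each requested tool and append the evidence to the context.
    for (const call of turn.toolCalls) {
      history.push(`${call.name} -> ${await tools[call.name](call.args)}`);
    }
  }
  return (await ask(history)).text ?? ""; // final forced answer
}
```

Capping the rounds keeps latency bounded while still letting the model pull extra evidence mid-generation.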
Data model: Workspaces contain sources, which produce source chunks (with embeddings), which feed graph nodes and graph edges. Conversations persist messages with full assistant metadata (citations, highlights, follow-ups). An ingestion jobs table tracks pipeline progress with stage-level reporting.
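The data model above can be sketched as TypeScript types. Field names here are illustrative rather than our exact column names; the key structural point is that embeddings live on chunks, while nodes carry references to the chunks that evidence them.

```typescript
// Simplified shape of the workspace data model described above.
interface SourceChunk { id: string; sourceId: string; text: string; embedding: number[]; }
interface GraphNode {
  id: string;
  type: "concept" | "entity" | "question" | "task" | "insight";
  label: string;
  evidenceChunkIds: string[]; // links every node back to source evidence
}
interface GraphEdge {
  id: string;
  from: string;
  to: string;
  relation: "supports" | "contradicts" | "relates_to";
}

// Illustrative helper: evidence-strength rings derive from how many
// chunks back a node (the thresholds here are assumed, not our real ones).
function evidenceStrength(node: GraphNode): "weak" | "moderate" | "strong" {
  const n = node.evidenceChunkIds.length;
  return n >= 3 ? "strong" : n >= 1 ? "moderate" : "weak";
}
```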
Challenges we ran into
Making the graph readable, not just impressive. Early versions produced graphs of 300+ nodes that were visually overwhelming. We had to build aggressive cleanup — exact and semantic label deduplication, edge fan-out caps, importance-based promotion — to keep graphs in the 50–250 node range, where a judge (or user) can grasp the structure in under 10 seconds.
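The label-deduplication pass can be sketched as a greedy merge over importance-ranked nodes: exact duplicates merge on a normalized label, near-duplicates merge on embedding cosine similarity. The 0.92 threshold and the `RawNode` shape are illustrative assumptions.

```typescript
// Greedy dedup: keep the most important node in each duplicate group.
interface RawNode { id: string; label: string; embedding: number[]; importance: number; }

const norm = (s: string) => s.toLowerCase().replace(/[^a-z0-9 ]/g, "").trim();

function cosine(a: number[], b: number[]): number {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return dot / (Math.sqrt(na) * Math.sqrt(nb) || 1);
}

function dedupe(nodes: RawNode[], threshold = 0.92): RawNode[] {
  const kept: RawNode[] = [];
  // Sort by importance so the duplicate that survives is the strongest one.
  for (const node of [...nodes].sort((a, b) => b.importance - a.importance)) {
    const dup = kept.find(
      k => norm(k.label) === norm(node.label) ||
           cosine(k.embedding, node.embedding) >= threshold,
    );
    if (!dup) kept.push(node);
  }
  return kept;
}
```

Running this before edge fan-out capping matters: merging labels first means edge counts are computed against the real concept set, not its duplicates.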
Grounding answers in real evidence. Getting Gemini to only cite actual chunk IDs and node IDs from the context — instead of hallucinating plausible-looking references — required careful prompt engineering, a multi-round function-calling pipeline, and a server-side validation layer that cross-checks every citation against the database before returning it to the client.
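The server-side validation layer can be sketched as a simple cross-check: every citation the model emits must name a chunk that exists in the workspace, and the quoted text must actually appear in that chunk. The types are simplified stand-ins for our real ones.

```typescript
// Drop any citation whose chunk id is unknown or whose quote is not
// actually present in the cited chunk (i.e., hallucinated references).
interface Citation { chunkId: string; quote: string; }

function validateCitations(
  citations: Citation[],
  knownChunks: Map<string, string>, // chunkId -> chunk text loaded from Postgres
): { valid: Citation[]; rejected: Citation[] } {
  const valid: Citation[] = [];
  const rejected: Citation[] = [];
  for (const c of citations) {
    const text = knownChunks.get(c.chunkId);
    if (text !== undefined && text.includes(c.quote)) valid.push(c);
    else rejected.push(c);
  }
  return { valid, rejected };
}
```

Only validated citations reach the client, so a clickable pill can never point at evidence that does not exist.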
3D rendering performance. Three.js with bloom post-processing, 5,000 stars, and 200+ nodes with custom geometries can easily tank frame rate. We had to remove several visual effects (additive-blending outer glows caused full-screen blue wash during hover), implement progressive graph reveal animations, and pass explicit container dimensions to prevent frustum calculation errors.
Balancing latency and quality. The full pipeline — embed query, retrieve chunks, expand graph neighbors, generate answer with function calls — can take several seconds. We implemented model cascading (Pro first, Flash fallback on rate limit), context caching for repeated queries, and demo-mode short-circuits for seeded workspaces to ensure the demo flow stays snappy.
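The model cascade can be sketched as an ordered try/fallback over model names. This is a hedged sketch: `generate` is a placeholder for the actual Gemini SDK call, and detecting rate limits by matching "429" in the error is an illustrative assumption.

```typescript
// Try Pro first; fall back to Flash only on rate-limit-style failures,
// so genuine errors still surface to the caller.
async function cascade(
  generate: (model: string, prompt: string) => Promise<string>,
  prompt: string,
  models = ["gemini-2.5-pro", "gemini-2.5-flash"],
): Promise<string> {
  let lastError: unknown;
  for (const model of models) {
    try {
      return await generate(model, prompt);
    } catch (err) {
      lastError = err;
      // Assumption: rate limits surface as errors mentioning HTTP 429.
      if (!String(err).includes("429")) throw err;
    }
  }
  throw lastError;
}
```

Combined with context caching, this keeps the common path on the strongest model while guaranteeing the demo never stalls on a quota error.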
Keeping the graph useful during processing. When a user adds a new source, we can't blank the existing graph while rebuilding. We built a freshness model (empty/stale/fresh) with non-destructive refresh — the prior graph stays interactive while the worker rebuilds in the background.
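The freshness model reduces to a small derivation over two timestamps: whether a graph has ever been built, and whether any source changed since. The function below is a minimal sketch of that state machine, with illustrative parameter names.

```typescript
// Derive graph freshness: "empty" if never built, "stale" if a source
// changed after the last build (prior graph stays interactive), else "fresh".
type Freshness = "empty" | "stale" | "fresh";

function freshness(graphBuiltAt: number | null, lastSourceChangeAt: number): Freshness {
  if (graphBuiltAt === null) return "empty";
  return graphBuiltAt >= lastSourceChangeAt ? "fresh" : "stale";
}
```

Because freshness is derived rather than stored, the worker never has to blank the old graph: it just rebuilds in the background and the state flips from "stale" to "fresh" atomically.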
Accomplishments that we're proud of
Gemini is genuinely central, not a wrapper. We use six distinct Gemini capabilities (document understanding, structured extraction, embeddings, function calling, grounded generation, and context caching) as integrated parts of the product, not bolt-on features.
The graph is useful, not decorative. Every node traces to source evidence. Evidence strength rings give at-a-glance trust signals. Contradiction detection surfaces intellectual tensions. The exploration tracker turns passive browsing into guided research. These aren't graph visualizations — they're thinking tools.
Answers reference the graph. When you ask a question, the answer text contains clickable node pills — inline references to specific concepts in the 3D graph. Click one and the camera flies to it. This is the bridge between conversational AI and spatial understanding that we couldn't find in any existing product.
The public knowledge model works. Private by default, publish intentionally, fork to build on others' work, suggest additions to contribute back. It's a complete knowledge circulation loop, not just a personal notebook.
It ships as a real product. Auth, persistence, background processing, error recovery, mobile-responsive layout, keyboard shortcuts, presentation mode. This isn't a demo hack — it's software you could hand to a real user.
What we learned
Source grounding changes everything. The moment you can click a claim and see the exact quote from page 12 of a PDF, the AI stops feeling like magic and starts feeling like a tool you can trust. Explicit insufficient-evidence states are just as important as good answers — they teach users the system's boundaries.
Spatial interfaces need progressive disclosure. Showing everything at once is worse than showing nothing. The cluster-first layout, importance-based label visibility, and hover-before-click tooltip hierarchy were essential to making the graph feel calm instead of chaotic.
Structured outputs are the bridge between LLMs and real applications. Gemini's structured output mode let us extract typed JSON (nodes, edges, clusters) reliably enough to persist directly into Postgres. Without that, the extraction pipeline would have been a fragile regex-and-prayer operation.
Function calling makes AI agents, not just chatbots. The difference between "answer this question" and "here are tools to find more evidence before you answer" is the difference between a chatbot and a research assistant. Two rounds of function calls dramatically improved answer quality for complex questions.
What's next for RC.JS
Workspace merging. Combine two workspaces on overlapping topics — semantic deduplication of shared concepts, surface contradictions between different source corpora, create super-graphs that synthesize multiple perspectives.
Temporal graph evolution. A timeline slider showing how the graph changed as sources were added. Track how concept importance shifts, when contradictions emerge, and which ideas persist across updates.
Gemini Live API integration. Voice-guided workspace tours where you can navigate the graph by speaking: "Show me the consensus algorithms cluster" or "What haven't I explored yet?" The spatial interface is a natural fit for multimodal interaction.
Collaborative presence. Multiple users exploring the same workspace in real-time — see each other's cursors on the graph, leave annotations on nodes, co-author questions. Turn knowledge graphs into shared thinking spaces.
Artifact generation. One-click exports: study guides from exam workspaces, architecture decision records from codebase graphs, executive briefs from research workspaces. The graph structure gives us the outline for free.
Built With
- gemini
- next.js
- pgvector
- postgresql
- react
- supabase
- tailwind
- three.js
- typescript
- zustand