Inflect-AI

Inflect — Project Story

Inspiration

A Bloomberg Terminal costs roughly $24,000 a year. TradingView assumes you already know what RSI means, and thinkorswim was built for professionals, not students learning to trade.

At the same time, retail investors are flooded with noise—tabs of prices, filings, dashboards, and opinions—while real answers like “What was gross margin in Q4?” or “Why is this stock moving?” remain scattered across multiple tools.

We built Inflect because financial research shouldn't require a finance degree or a six-figure budget. Every retail investor and finance student deserves the same depth of insight that institutional traders take for granted — delivered instantly, cited accurately, and explained clearly.

The name comes from the concept of an inflection point — the moment a trend changes direction. That’s exactly what we aim to help users identify before everyone else does.

What it does

Inflect is an AI-powered financial research and portfolio platform that synthesizes SEC filings, real-time market data, technical indicators, and news sentiment into instant, cited answers.

Users can ask anything about a public company — fundamentals, technicals, earnings, or price action — and Inflect returns a grounded response with citations pulled directly from SEC EDGAR filings and structured datasets.

Each response includes a structured Trade Thesis Card:

Fundamental signal — derived from 10-K / 10-Q / 8-K via RAG
Technical signal — RSI, MACD, Bollinger Bands (TA-Lib)
Sentiment signal — FinBERT scoring across recent headlines
Verdict — HOLD / WATCH / AVOID (never BUY/SELL — we are educators, not advisors)
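
As a rough illustration of how three signals could roll up into a card, here is a minimal sketch. The [-1, 1] signal scale, the averaging rule, and the 0.3 thresholds are assumptions made for this example, not Inflect's actual scoring logic.

```python
from dataclasses import dataclass
from enum import Enum


class Verdict(str, Enum):
    HOLD = "HOLD"
    WATCH = "WATCH"
    AVOID = "AVOID"


@dataclass(frozen=True)
class TradeThesisCard:
    """Hypothetical card: each signal is assumed normalized to [-1, 1]."""
    ticker: str
    fundamental: float
    technical: float
    sentiment: float

    def verdict(self) -> Verdict:
        # Assumed rule: average the three signals, then bucket the score.
        score = (self.fundamental + self.technical + self.sentiment) / 3
        if score >= 0.3:
            return Verdict.HOLD
        if score > -0.3:
            return Verdict.WATCH
        return Verdict.AVOID
```

The verdict enum deliberately has no BUY or SELL member, so the educational framing is enforced by the type rather than by convention.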

Beyond research, Inflect provides:

Portfolio dashboard with live market strip
Paper trading with simulated capital ($100K)
Voice-first interaction (ask → answer → spoken response)
Chat-based research (Perplexity-style threads)
Chart screenshot upload with AI-powered vision analysis
Seamless switching between voice and chat with full session context

Every answer is citation-grounded. A Validator Agent blocks any response that cannot be traced back to a retrieved source document — enforcing zero hallucination tolerance.
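
One way such a citation gate could work is sketched below. It assumes answers carry numeric markers like [1] placed before each sentence's closing punctuation, and that the retriever exposes the set of chunk IDs it returned; both conventions, and the function name, are hypothetical.

```python
import re
from dataclasses import dataclass
from typing import Set

CITATION_PATTERN = re.compile(r"\[(\d+)\]")


@dataclass(frozen=True)
class ValidationResult:
    ok: bool
    reason: str


def validate_citations(answer: str, retrieved_ids: Set[int]) -> ValidationResult:
    """Reject an answer unless every sentence cites a retrieved chunk."""
    # Split on whitespace that follows sentence-ending punctuation.
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", answer) if s.strip()]
    if not sentences:
        return ValidationResult(False, "empty answer")
    for sentence in sentences:
        cited = {int(m) for m in CITATION_PATTERN.findall(sentence)}
        if not cited:
            return ValidationResult(False, f"uncited sentence: {sentence!r}")
        unknown = cited - retrieved_ids
        if unknown:
            return ValidationResult(False, f"unknown sources: {sorted(unknown)}")
    return ValidationResult(True, "all sentences cite retrieved sources")
```

A check like this only proves that a citation exists and points at a retrieved chunk. Confirming that the chunk actually supports the sentence would need a separate faithfulness step.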

About the project

Inflect is designed as a full-stack financial copilot, not just a chatbot.

At its core, it is a system that:

Understands user intent (price check vs deep research vs trade)
Routes queries to the correct data pipeline
Retrieves structured + unstructured financial data
Generates grounded answers
Validates them against source documents
Returns results with optional charts, thesis summaries, and voice output

The experience is built around a unified research surface:

A live ticker strip powered by low-latency market data
A portfolio home for tracking positions
A research panel with citations, metrics, and chart actions
A dual-mode interface (voice + chat) with shared context

This turns fragmented financial workflows into a single, intelligent interface.

How we built it

Frontend

React 18 + TypeScript + Vite
Tailwind CSS + shadcn-style UI
Plotly.js for interactive charts
Zustand + TanStack Query for state/data management
Supabase Auth for authentication
Deployed via Lovable / custom domain

We built a dual-mode UI:

Voice Mode (mic-first interface)
Chat Mode (threaded research)

Both modes share full session memory (ticker, timeframe, context).

Backend

Python + FastAPI + Uvicorn
Deployed on Google Cloud Run / Railway

Core query pipeline:

User Query
   ↓
Intent Classification (Groq)
   ↓
Routing Layer:
   → Price check (Finnhub / Snowflake)
   → Research (RAG + LLM)
   → Portfolio / trade actions
   ↓
Validator Agent (citation enforcement)
   ↓
Response (text + charts + voice)
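
The classify-then-route step in the pipeline above can be sketched as a small dispatch table. The keyword classifier below is only a stand-in for the Groq-based intent model, and the handler names and return strings are hypothetical.

```python
from enum import Enum
from typing import Callable, Dict


class Intent(Enum):
    PRICE_CHECK = "price_check"
    RESEARCH = "research"
    TRADE = "trade"


def classify_intent(query: str) -> Intent:
    """Keyword stand-in for the LLM classifier (hypothetical rules)."""
    words = set(query.lower().replace("?", " ").split())
    if words & {"buy", "sell", "portfolio"}:
        return Intent.TRADE
    if words & {"price", "quote"}:
        return Intent.PRICE_CHECK
    return Intent.RESEARCH  # default to the deep-research path


def handle_price(query: str) -> str:
    return f"price pipeline: {query}"


def handle_research(query: str) -> str:
    return f"research pipeline: {query}"


def handle_trade(query: str) -> str:
    return f"trade pipeline: {query}"


HANDLERS: Dict[Intent, Callable[[str], str]] = {
    Intent.PRICE_CHECK: handle_price,
    Intent.RESEARCH: handle_research,
    Intent.TRADE: handle_trade,
}


def route(query: str) -> str:
    """Classify the query, then dispatch it to the matching pipeline."""
    return HANDLERS[classify_intent(query)](query)
```

Keeping the routing table separate from the classifier means a cheap price check never has to pay for retrieval or generation.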

The system uses a modular AI provider abstraction layer, allowing model swapping without code changes.
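
A provider abstraction of this kind is often a shared interface plus a registry keyed by configuration. The sketch below assumes an LLM_PROVIDER environment variable and uses an echo class in place of real SDK clients; every name and model string in it is illustrative.

```python
import os
from typing import Callable, Dict, Optional, Protocol


class ChatProvider(Protocol):
    def complete(self, prompt: str) -> str: ...


class EchoProvider:
    """Stand-in for a real client; it only echoes the prompt back."""

    def __init__(self, model: str) -> None:
        self.model = model

    def complete(self, prompt: str) -> str:
        return f"[{self.model}] {prompt}"


# Hypothetical registry: provider key -> factory building a client.
PROVIDERS: Dict[str, Callable[[], ChatProvider]] = {
    "groq": lambda: EchoProvider("llama-3.3-70b"),
    "gemini": lambda: EchoProvider("gemini"),
}


def get_provider(name: Optional[str] = None) -> ChatProvider:
    """Pick a provider by explicit name, else by the LLM_PROVIDER env var."""
    key = (name or os.environ.get("LLM_PROVIDER", "groq")).lower()
    try:
        factory = PROVIDERS[key]
    except KeyError:
        raise ValueError(f"unknown provider: {key}") from None
    return factory()
```

With this shape, switching models is a configuration change: callers depend only on complete(), never on a vendor SDK.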

Data & AI Pipeline

RAG System

540K+ SEC filing chunks (10-K, 10-Q, 8-K)
Embedded using BGE-small
Stored in Pinecone across multiple namespaces
Section-aware retrieval:
- Item 7 (MD&A)
- Item 8 (Financial Statements)
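
Section-aware retrieval can be pictured as a metadata filter applied before similarity ranking. The sketch below does this in memory with toy vectors. It does not call Pinecone, and the Chunk fields and section labels are assumptions.

```python
import math
from dataclasses import dataclass
from typing import List, Sequence, Set, Tuple


@dataclass(frozen=True)
class Chunk:
    text: str
    section: str  # e.g. "Item 7" or "Item 8" (assumed metadata field)
    embedding: Tuple[float, ...]


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def retrieve(
    query_embedding: Sequence[float],
    chunks: List[Chunk],
    sections: Set[str],
    top_k: int = 3,
) -> List[Chunk]:
    """Keep only chunks from the requested sections, then rank by cosine."""
    candidates = [c for c in chunks if c.section in sections]
    ranked = sorted(
        candidates,
        key=lambda c: cosine(query_embedding, c.embedding),
        reverse=True,
    )
    return ranked[:top_k]
```

Filtering first means a question about management commentary cannot be answered from a numerically similar chunk in the wrong section.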

Data Systems

Snowflake → prices, fundamentals, SEC tables
Finnhub → real-time market data
Cloudflare R2 → storage

AI Stack

Groq LLaMA 3.3 70B → reasoning, answer generation, thesis synthesis
Groq Whisper large-v3 → speech-to-text (~300ms)
ElevenLabs / Kokoro → text-to-speech
TA-Lib → technical indicators
FinBERT → sentiment scoring
moondream2 → chart image analysis
Wolfram Alpha → technical analysis

Chart Agent

A dedicated Chart Agent dynamically generates Plotly Python code:

import plotly.graph_objects as go
fig = go.Figure(data=[go.Candlestick(...)])
fig.show()

The generated code is executed inside a RestrictedPython sandbox
Its output is validated before rendering
The agent supports 9 chart types (candlestick, RSI, MACD, etc.)
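
Generated code can also be screened statically before it ever reaches a sandbox. The sketch below is not RestrictedPython. It is a standard-library ast pre-check that could sit in front of it, and the import allowlist and banned-name set are hypothetical.

```python
import ast
from typing import List

ALLOWED_IMPORTS = {"plotly", "math"}  # hypothetical allowlist
BANNED_NAMES = {"eval", "exec", "open", "compile", "__import__"}


def check_chart_code(source: str) -> List[str]:
    """Return a list of problems; an empty list means the check passed."""
    try:
        tree = ast.parse(source)
    except SyntaxError as exc:
        return [f"syntax error: {exc.msg}"]

    problems: List[str] = []
    for node in ast.walk(tree):
        if isinstance(node, ast.Import):
            for alias in node.names:
                if alias.name.split(".")[0] not in ALLOWED_IMPORTS:
                    problems.append(f"import not allowed: {alias.name}")
        elif isinstance(node, ast.ImportFrom):
            root = (node.module or "").split(".")[0]
            # Relative imports (level > 0) are rejected outright.
            if node.level or root not in ALLOWED_IMPORTS:
                problems.append(f"import not allowed: {node.module}")
        elif isinstance(node, ast.Name) and node.id in BANNED_NAMES:
            problems.append(f"name not allowed: {node.id}")
        elif isinstance(node, ast.Attribute) and node.attr.startswith("__"):
            problems.append(f"dunder attribute not allowed: {node.attr}")
    return problems
```

A static check like this is a first filter only. It does not replace executing the code in a restricted environment with limited builtins.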

Infrastructure

Backend → Google Cloud Run / Railway
Auth & DB → Supabase (Postgres)
Cache → Redis (Upstash)
Vector DB → Pinecone
Warehouse → Snowflake
Storage → Cloudflare R2

Entire stack runs at approximately $1–3/month at MVP scale.

Challenges we ran into

Hallucination prevention at scale: balancing strict citation enforcement with useful responses
Market data reliability: yfinance failed in cloud → replaced with Finnhub + Snowflake batching
Chart code sandboxing: required secure execution environment (RestrictedPython)
CORS & preview environments: handling dynamic Lovable preview domains with regex allowlists
Intent + ticker resolution: mapping natural language queries like “Tesla price” → TSLA
Latency constraints: achieving a <3s voice pipeline (Speech → STT → Intent → RAG → LLM → TTS) required parallelization and fast inference
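
For the ticker resolution challenge above, a first pass can be a plain alias lookup before any model is involved. The alias table below is a tiny illustrative sample, and the function name is hypothetical.

```python
import re
from typing import Optional

# Illustrative sample only; a real table would be far larger.
ALIASES = {"tesla": "TSLA", "apple": "AAPL", "microsoft": "MSFT"}
KNOWN_TICKERS = set(ALIASES.values())


def resolve_ticker(query: str) -> Optional[str]:
    """Map a natural-language query to a ticker symbol, if one is found."""
    for token in re.findall(r"[A-Za-z]+", query):
        if token.upper() in KNOWN_TICKERS:  # explicit symbol, e.g. "$AAPL"
            return token.upper()
        if token.lower() in ALIASES:  # company name, e.g. "Tesla"
            return ALIASES[token.lower()]
    return None
```

Queries that return None here would fall through to a slower resolver, such as a symbol search API or the LLM itself.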

Accomplishments that we're proud of

Fully cited, zero-hallucination RAG system over 540K+ SEC chunks
Chart Agent generating 9 types of interactive visualizations
Trade Thesis Engine combining 3 independent signals
Seamless voice + chat context switching
End-to-end voice pipeline under 3 seconds
Entire AI stack running at ~$1–3/month

What we learned

Speed matters more than model size for user experience
RAG faithfulness is harder than RAG accuracy
Data pipelines are more important than prompts in production systems
Evaluation tools (RAGAS, LangSmith) must be integrated early
Batching Snowflake reads and parallelizing Finnhub calls cut latency from ~30s to ~1s
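
The parallel-call half of that last lesson can be shown with asyncio.gather. The fetch function below only simulates network latency with a sleep and returns a placeholder price. It is not a Finnhub client, and the Snowflake batching side is not shown.

```python
import asyncio
import time
from typing import Awaitable, Callable, List, Tuple

Quote = Tuple[str, float]


async def fetch_quote(symbol: str, delay: float = 0.05) -> Quote:
    """Simulated quote fetch; the sleep stands in for a network round trip."""
    await asyncio.sleep(delay)
    return symbol, 0.0  # placeholder price, not real data


async def fetch_sequential(symbols: List[str]) -> List[Quote]:
    # One request at a time: total time grows with len(symbols).
    return [await fetch_quote(s) for s in symbols]


async def fetch_parallel(symbols: List[str]) -> List[Quote]:
    # All requests in flight at once; gather preserves input order.
    return list(await asyncio.gather(*(fetch_quote(s) for s in symbols)))


def timed(
    fn: Callable[[List[str]], Awaitable[List[Quote]]], symbols: List[str]
) -> Tuple[List[Quote], float]:
    """Run a fetch strategy and return its result with elapsed seconds."""
    start = time.perf_counter()
    result = asyncio.run(fn(symbols))
    return result, time.perf_counter() - start
```

Both strategies return the same quotes in the same order. Only the wall-clock time differs, which is why the change is safe to make late in a project.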

What's next for Inflect

Earnings transcript ingestion (8-K pipeline)
Vision-based chart analysis (moondream2 integration)
Historical backtesting
Mobile app with offline voice support
Real trading integration (pending regulatory review)

Built With

elevenlabs
fastapi
finnhub
gemini (optional)
google-cloud-run
groq
groq-whisper
pinecone
playwright
pydantic
python
react
snowflake
supabase
tailwind-css
tanstack-query
typescript
uvicorn
vite
vitest
wolfram-alpha (optional)
wolfram-technologies
zustand