Inspiration
Financial research is fundamentally broken. Quantitative analysts focus entirely on raw data — tracking academically grounded factors like momentum, volatility, dividend yield, and Fama-French size factor. Meanwhile, qualitative analysts spend their time reading news and gauging market sentiment, trying to decipher key narrative themes and separate hype from substance. These two disciplines rarely overlap effectively, and having a human synthesise both in real time is nearly impossible.
We also noticed that most AI tools for finance are simply chatbots waiting for a user query. We wanted to build a true autonomous multi-agent AI system that delegates tasks to specialised experts — just like an institutional research desk. Investigate is not a chatbot. There is no user query. You provide a ticker; the system does the rest.
What it does
Investigate is a four-phase parallel pipeline of specialised financial experts. You input a stock ticker, and the system autonomously investigates it.
Phase 1 — a pure Python orchestrator uses asyncio.gather() to fetch market data and social sentiment concurrently, then computes quantitative signals locally without touching an LLM: Wilder's 14-day RSI, Jegadeesh & Titman 12-1 month momentum, 20-day historical volatility, 14-day ATR, TTM dividend yield, and Fama-French SMB size factor.
Phase 2 — a Quant Agent and a Sentiment Agent run in parallel. The Quant Agent interprets the raw signal numbers; the Sentiment Agent analyses raw Reddit posts to extract structured themes, hype-vs-substance classification, and risk flags. Each agent operates in strict isolation — the Quant Agent knows nothing about sentiment, and the Sentiment Agent knows nothing about RSI.
Phase 3 — a Risk Agent receives both outputs and assesses risk profile, volatility regime, and position sizing.
Phase 4 — a Research Writer synthesises everything into a cohesive LLM-written research narrative with a specific conviction score.
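For concreteness, two of the Phase 1 signals above can be sketched in plain Python. This is an illustrative sketch, not the project's actual code — function names and the exact windowing conventions are assumptions:

```python
import math

def momentum_12_1(monthly_closes):
    """Jegadeesh & Titman 12-1 momentum: the return from 12 months ago
    to 1 month ago, skipping the most recent month to avoid
    short-term reversal. monthly_closes[-1] is the latest month-end."""
    return monthly_closes[-2] / monthly_closes[-13] - 1.0

def historical_volatility(daily_closes, window=20):
    """Annualised standard deviation of daily log returns
    over the last `window` trading days."""
    closes = daily_closes[-(window + 1):]
    rets = [math.log(b / a) for a, b in zip(closes, closes[1:])]
    mean = sum(rets) / len(rets)
    var = sum((r - mean) ** 2 for r in rets) / (len(rets) - 1)
    return math.sqrt(var) * math.sqrt(252)  # annualise: ~252 trading days
```

Because these are closed-form computations over price history, no LLM call is needed for Phase 1.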
The output is not just a report — it's a live activity feed where you can watch the agents think, see their exact findings, and follow their reasoning as it streams in real time.
How we built it
We architected Investigate around three principles: parallel execution, strict data contracts, and observability.
The Orchestrator: A pure Python orchestrator using asyncio calls all data fetchers and signal computations concurrently, with zero LLM tokens spent on planning. We have a fixed set of data sources, so dynamic planning would only add latency and cost.
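The fan-out pattern the orchestrator uses can be sketched as follows. The fetcher bodies here are stand-ins (the real project pulls from yfinance and the Reddit JSON API); only the asyncio.gather() structure is the point:

```python
import asyncio

async def fetch_market_data(ticker: str) -> dict:
    await asyncio.sleep(0.01)  # simulate network I/O
    return {"ticker": ticker, "closes": [101.2, 102.8, 101.9]}

async def fetch_reddit_posts(ticker: str) -> list:
    await asyncio.sleep(0.01)  # simulate network I/O
    return [f"{ticker} to the moon", f"bearish on {ticker}"]

async def orchestrate(ticker: str):
    # Both fetches run concurrently; total latency is roughly
    # the slower of the two, not their sum.
    return await asyncio.gather(
        fetch_market_data(ticker),
        fetch_reddit_posts(ticker),
    )

market, posts = asyncio.run(orchestrate("AAPL"))
```

No tokens are spent deciding what to fetch — the fan-out is fixed in code.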
Data Ingestion: We integrated yfinance for OHLCV data and corporate actions, and the public Reddit JSON API for qualitative social data across r/wallstreetbets, r/stocks, and r/investing.
Quantitative Computations: All signals — RSI, momentum, volatility, ATR, dividend yield, and Fama-French size factor — are computed locally from market data without an LLM.
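As an example of the local computation, Wilder's RSI can be implemented in a few lines (a sketch of the standard formula, not the project's exact code):

```python
def wilder_rsi(closes, period=14):
    """Wilder's RSI: seed with a simple average of the first `period`
    gains/losses, then apply Wilder's recursive smoothing."""
    gains, losses = [], []
    for prev, curr in zip(closes, closes[1:]):
        change = curr - prev
        gains.append(max(change, 0.0))
        losses.append(max(-change, 0.0))
    avg_gain = sum(gains[:period]) / period
    avg_loss = sum(losses[:period]) / period
    for g, l in zip(gains[period:], losses[period:]):
        avg_gain = (avg_gain * (period - 1) + g) / period
        avg_loss = (avg_loss * (period - 1) + l) / period
    if avg_loss == 0:
        return 100.0  # no down moves in the window
    rs = avg_gain / avg_loss
    return 100.0 - 100.0 / (1.0 + rs)
```

The Quant Agent then only interprets numbers like this one — it never computes them.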
Parallel LLM Agents: The Quant Agent interprets raw signal numbers; the Sentiment Agent extracts structured themes from raw social text. Both run concurrently via asyncio.gather().
Typed Contracts: Agents communicate exclusively via Pydantic v2 schemas. Every output is machine-validated before moving downstream — no free-text glue between LLM calls.
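A contract of this shape looks roughly like the following — field names here are illustrative, not the project's actual schema:

```python
from pydantic import BaseModel, Field

class SentimentFindings(BaseModel):
    """Illustrative output contract for the Sentiment Agent."""
    themes: list[str]
    hype_vs_substance: float = Field(ge=0.0, le=1.0)  # 0 = pure hype, 1 = substance
    risk_flags: list[str] = []

# The LLM's JSON output is validated before it moves downstream;
# malformed output raises a ValidationError instead of silently passing.
raw = '{"themes": ["earnings beat"], "hype_vs_substance": 0.7, "risk_flags": []}'
findings = SentimentFindings.model_validate_json(raw)
```

Because downstream agents consume validated models rather than free text, a bad LLM response fails loudly at the boundary instead of corrupting the final report.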
Observability Stream: We implemented Server-Sent Events via a FastAPI StreamingResponse, allowing the frontend to render a live activity feed of structured AgentEvents — each containing specific findings and reasoning, not just status updates.
Challenges we ran into
Building a parallel agentic pipeline presented several technical hurdles.
API Limitations: We discovered a frustrating quirk with yfinance where FastInfo.keys() returns camelCase, but attribute access strictly requires snake_case. We also encountered a 403 Forbidden error from Stocktwits, forcing us to rely on the Reddit public JSON API for social sentiment.
Separation of Concerns: Enforcing strict domain isolation between agents was harder than expected. The Quant Agent must know nothing about sentiment; the Sentiment Agent must know nothing about RSI. Achieving this cleanly required deliberate prompt and schema design.
Real-Time Events: Emitting structured AgentEvents in real time while multiple async tasks ran in parallel required careful architectural planning to ensure the Next.js frontend received a clean, ordered timeline via SSE.
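The pattern we converged on resembles an asyncio.Queue fan-in: concurrent agent tasks push events onto one queue, and a single consumer drains it into the ordered SSE timeline. A simplified sketch (agent bodies and delays are stand-ins):

```python
import asyncio

async def agent(name: str, queue: asyncio.Queue, delay: float):
    await asyncio.sleep(delay)  # simulate LLM latency
    await queue.put({"agent": name, "event": "finding"})

async def run_pipeline() -> list:
    queue: asyncio.Queue = asyncio.Queue()
    tasks = [
        asyncio.create_task(agent("quant", queue, 0.02)),
        asyncio.create_task(agent("sentiment", queue, 0.01)),
    ]
    events = []
    # The queue serialises concurrent writers into one ordered
    # timeline, which the SSE endpoint can stream as-is.
    for _ in tasks:
        events.append(await queue.get())
    await asyncio.gather(*tasks)
    return events

events = asyncio.run(run_pipeline())
```

Whichever agent finishes first appears first in the feed, so the timeline reflects real execution order rather than declaration order.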
Accomplishments that we're proud of
We are proud of achieving genuine agentic delegation — QuantAgent and SentimentAgent execute in parallel as independent domain experts, not as chained prompts. The typed Pydantic contracts mean the system behaves as a reliable, machine-validated software pipeline rather than a loose collection of LLM calls.
We are also proud of the live observability feed. Showing users exactly what each agent is finding and the reasoning behind its conclusions is a significant step forward for AI transparency — and a direct answer to the "black box" criticism of AI-generated financial analysis.
What we learned
The biggest lesson: true agentic architecture is built on strict data contracts. Forcing agents to communicate through Pydantic schemas rather than free text proved that AI can act as a reliable software pipeline. We also learned the immense value of observability — seeing the internal reasoning of agents builds trust in the final output far more than a polished report alone.
What's next for Investigate
Our immediate priority is implementing fundamentalsFetcher.py via Financial Modeling Prep to unlock Post-Earnings Announcement Drift (PEAD) signals, and integrating put/call ratios via CBOE daily CSV data. We also plan to implement filingsFetcher.py against the SEC EDGAR free API for earnings transcript context.
Looking further ahead, we are designing an Agentic Web Searcher — a tool-calling agent that actively searches for earnings call transcripts, analyst reports, and SEC filings, running as a Phase 2c in parallel with the Quant and Sentiment agents. It would produce a typed WebResearchContext schema feeding directly into the Risk Agent and Research Writer.
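The WebResearchContext schema is not designed yet; a hypothetical shape, with every field name an assumption, might look like:

```python
from pydantic import BaseModel

class WebResearchContext(BaseModel):
    """Hypothetical output contract for the planned Agentic Web Searcher."""
    ticker: str
    transcript_excerpts: list[str] = []  # earnings call quotes found via search
    analyst_views: list[str] = []        # summarised analyst commentary
    filing_highlights: list[str] = []    # notable SEC filing passages
    sources: list[str] = []              # URLs for every claim above

ctx = WebResearchContext(ticker="NVDA")
```

Keeping it typed means the Risk Agent and Research Writer can consume web research exactly the way they already consume quant and sentiment outputs.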
Built with
claude · fastapi · langchain · next.js · pydantic · python · react · typescript · yfinance