Inspiration

The real estate market hides value in plain sight, but only if you can process everything that moves it. From zoning changes buried in city filings, to climate risk projections, to subtle neighborhood sentiment shifts on social media, the signals exist. The problem? No human (or traditional tool) can ingest, connect, and act on them all at scale. I built Kiyosaki to be the final intelligence layer for real estate investing, surfacing asymmetric opportunities for investors.

What it does

Kiyosaki continuously ingests and analyzes millions of data points across dozens of sources, from NYC public filings to scraped neighborhood chatter, satellite imagery, and proprietary datasets. For our MVP, we’re focused on New York investments. The system:

Pulls high-signal data from public, commercial, and unconventional sources.

Uses web search APIs and scrapers to expand on promising leads in real time.

Runs sub-agents specialized in different domains (zoning, sentiment, risk) coordinated by a central orchestrator agent.

Uses a sliding window approach (shoutout leetcode) to handle massive token contexts, allowing it to “remember” and connect millions of words of input without hitting LLM context limits.

Surfaces actionable investment theses, complete with niche catalysts and overlooked risks that a human analyst could never piece together in time.
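The sub-agent coordination above can be sketched roughly like this. This is a toy illustration of the pattern, not the actual implementation; all function names and return shapes are made up:

```python
# Minimal sketch of the orchestrator pattern: domain sub-agents
# (zoning, sentiment, risk) each analyze a lead, and the central
# orchestrator fans the lead out and merges their findings.
# In the real system each agent would wrap LLM + tool calls.

def zoning_agent(lead):
    return {"domain": "zoning", "note": f"checked filings for {lead}"}

def sentiment_agent(lead):
    return {"domain": "sentiment", "note": f"scored chatter about {lead}"}

def risk_agent(lead):
    return {"domain": "risk", "note": f"assessed climate risk for {lead}"}

SUB_AGENTS = [zoning_agent, sentiment_agent, risk_agent]

def orchestrate(lead):
    """Fan a lead out to every sub-agent and collect their findings."""
    return {r["domain"]: r["note"] for r in (agent(lead) for agent in SUB_AGENTS)}

findings = orchestrate("123 Example Ave, Brooklyn")
```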

How I built it

Data ingestion: NYC Open Data APIs, public planning/zoning portals, FOIL-accessible filings, scraped social platforms, Google APIs for targeted expansions, and preprocessed climate/satellite datasets. Firecrawl wherever needed for deeper dives.
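For context, NYC Open Data is served through Socrata's SODA API, so a query is just a URL with SoQL parameters. A sketch of building one (the dataset ID here is a placeholder, not a real dataset; fetch the URL with any HTTP client):

```python
from urllib.parse import urlencode

# NYC Open Data datasets live under data.cityofnewyork.us and are
# queryable via Socrata's SODA API with SoQL parameters like
# $where and $limit. The dataset ID "abcd-1234" is a placeholder.
BASE = "https://data.cityofnewyork.us/resource"

def soda_url(dataset_id, where=None, limit=1000):
    """Build a SODA query URL for the given dataset."""
    params = {"$limit": limit}
    if where:
        params["$where"] = where  # SoQL filter, e.g. "borough = 'BROOKLYN'"
    return f"{BASE}/{dataset_id}.json?{urlencode(params)}"

url = soda_url("abcd-1234", where="borough = 'BROOKLYN'", limit=50)
```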

A central orchestrator agent maintains context across all sources using a sliding window and a structured memory store.
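The sliding window plus memory store combo boils down to: keep the most recent chunks verbatim, and fold anything that falls off the window into a running summary. A toy sketch, with `summarize` standing in for an LLM summarization call:

```python
from collections import deque

class SlidingContext:
    """Recent chunks stay verbatim; evicted chunks are folded into
    a running summary so old detail is compressed, not lost."""

    def __init__(self, window_size, summarize):
        self.window = deque(maxlen=window_size)
        self.summary = ""
        self.summarize = summarize  # stand-in for an LLM call

    def add(self, chunk):
        if len(self.window) == self.window.maxlen:
            evicted = self.window[0]  # about to fall out of the window
            self.summary = self.summarize(self.summary, evicted)
        self.window.append(chunk)

    def context(self):
        """What gets sent to the model: summary first, then recent chunks."""
        return [self.summary] + list(self.window)

ctx = SlidingContext(2, summarize=lambda s, c: (s + " " + c).strip())
for chunk in ["a", "b", "c"]:
    ctx.add(chunk)
```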

Python pipelines orchestrate the LLM calls, with custom chunking logic to enable high-recall, high-precision synthesis.
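The core of any such chunking logic is overlap: fixed-size windows that share a margin, so a fact straddling a boundary still appears whole in at least one chunk. A minimal sketch (sizes are illustrative; the real pipeline's parameters aren't specified):

```python
def chunk_tokens(tokens, size=512, overlap=64):
    """Split a token list into fixed-size chunks that overlap by
    `overlap` tokens, so boundary-spanning facts survive intact."""
    step = size - overlap
    return [tokens[i:i + size]
            for i in range(0, max(len(tokens) - overlap, 1), step)]

chunks = chunk_tokens(list(range(10)), size=4, overlap=1)
```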

Output: AI-generated investment memos with structured fields (property details, catalysts, risks, valuation delta) and data-backed recommendations.
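The structured memo fields listed above map naturally onto a small schema. A sketch using the field names from the description (the exact schema and values are illustrative, not the production format):

```python
from dataclasses import dataclass, field

@dataclass
class InvestmentMemo:
    """Structured output of the pipeline; field names follow the
    memo description above, values here are example data."""
    property_details: str
    catalysts: list = field(default_factory=list)
    risks: list = field(default_factory=list)
    valuation_delta: float = 0.0  # estimated % gap vs current valuation
    recommendation: str = ""

memo = InvestmentMemo(
    property_details="123 Example Ave, Brooklyn",
    catalysts=["pending rezoning"],
    risks=["flood zone exposure"],
    valuation_delta=0.18,
    recommendation="watchlist",
)
```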

Challenges we ran into

Massive context handling (the name of the game): Without sliding-window processing and intelligent summarization, even GPT-4-class models couldn't hold enough detail to make accurate connections.

Data variety: Zoning filings and social posts require very different parsing logic (some filings even needed OCR), so I had to build flexible ingestion pipelines.
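One common way to keep such pipelines flexible is a parser registry: each source type registers its own parser, and everything flows through a single ingest entry point. A sketch of that pattern (the parser bodies are stand-ins for the real OCR/scraping logic):

```python
# Registry pattern: map each source type to its parser so new
# sources plug in without touching the ingest path.
PARSERS = {}

def parser(source_type):
    def register(fn):
        PARSERS[source_type] = fn
        return fn
    return register

@parser("zoning_filing")
def parse_filing(raw):
    # stand-in for OCR/PDF extraction of a scanned filing
    return {"type": "zoning_filing", "text": raw.upper()}

@parser("social_post")
def parse_post(raw):
    # stand-in for cleaning scraped social chatter
    return {"type": "social_post", "text": raw.strip()}

def ingest(source_type, raw):
    return PARSERS[source_type](raw)
```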

Signal-to-noise ratio: Many "exciting" changes in filings don't actually impact value; I tuned filters to prioritize only signals with historical ROI correlation, drawing much of that from reading up on approaches that already worked.
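In its simplest form, that kind of filter is a threshold over a lookup table of historical correlations. A toy sketch; the correlation values below are made-up example data, not measured figures:

```python
# Keep only signal kinds whose historical correlation with ROI
# clears a threshold. The table below is illustrative only.
HISTORICAL_ROI_CORR = {
    "rezoning": 0.62,
    "new_cafe": 0.05,      # "exciting" but historically low-impact
    "school_shift": 0.41,
}

def high_signal(signals, threshold=0.3):
    """Drop signals with weak (or unknown) historical ROI correlation."""
    return [s for s in signals
            if HISTORICAL_ROI_CORR.get(s["kind"], 0.0) >= threshold]

kept = high_signal([
    {"kind": "rezoning"},
    {"kind": "new_cafe"},
    {"kind": "school_shift"},
])
```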

Accomplishments that I'm proud of

Built a working giga-context agent architecture capable of processing hundreds of millions of tokens' worth of property-related data in hours.

Identified real NYC properties with niche catalysts (e.g., under-publicized rezoning, hidden school district shifts) that could outperform in the next 12–24 months. Obviously not investment advice, but the system targets signals that are difficult to identify manually.
