Sentimentree

Inspiration

We were inspired by how fragmented and fast-moving public sentiment is across the internet. Platforms like X, Reddit, and news outlets all contribute to evolving narratives, but there is no unified way to understand how these narratives develop over time. At the same time, prediction markets such as Polymarket and Kalshi provide outcomes, but not the reasoning or discourse behind them. Sentimentree was built to bridge this gap by making the evolution of public sentiment visible and interpretable.

What it does

Sentimentree is a visual, agent-powered search engine that maps public sentiment as a dynamic tree. Starting from a root query, such as a prediction or major topic, the system builds out branches representing different narratives emerging across social platforms and news sources. Each node captures a moment in time, including its sentiment score, source, timestamp, and a summary of the discussion it represents.
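
As a rough illustration, a node carrying the fields described above might look like the sketch below. The class name, field names, and the [-1, 1] sentiment range are assumptions made for this example, not the project's actual schema. A list of parent ids is used so that a node can sit at a point where branches merge.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class SentimentNode:
    """One moment in a narrative (hypothetical schema)."""
    node_id: str
    summary: str                 # short summary of the discussion
    source: str                  # e.g. "x", "reddit", "news"
    timestamp: datetime
    sentiment: float             # assumed range: -1.0 (against) to 1.0 (for)
    parent_ids: list[str] = field(default_factory=list)  # >1 parent = merge point


def chronological(nodes: list[SentimentNode]) -> list[SentimentNode]:
    """Return nodes ordered oldest first, as a chronological layout would need."""
    return sorted(nodes, key=lambda n: n.timestamp)


root = SentimentNode("n0", "Root query", "user",
                     datetime(2024, 1, 1, tzinfo=timezone.utc), 0.0)
child = SentimentNode("n1", "Early skepticism", "reddit",
                      datetime(2024, 1, 2, tzinfo=timezone.utc), -0.4,
                      parent_ids=["n0"])
ordered = chronological([child, root])
```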

As new information appears, branches split when sub-narratives emerge and merge when perspectives begin to align. The resulting structure allows users to observe whether public opinion is fragmenting or converging. Prediction market data is overlaid as an additional signal, providing context rather than acting as the final output. Users can also interact with the system by inserting their own nodes, annotating narratives, and tracking how their personal predictions compare to the broader sentiment over time.
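
One possible numeric proxy for "fragmenting or converging" is the spread of per-branch sentiment at two points in time. The sketch below assumes each active branch has a single sentiment score in [-1, 1]. The function names and the tolerance value are illustrative choices, not part of the described system.

```python
from statistics import pstdev


def dispersion(branch_sentiments: list[float]) -> float:
    """Population standard deviation of branch sentiment; 0.0 for fewer than 2 branches."""
    return pstdev(branch_sentiments) if len(branch_sentiments) > 1 else 0.0


def trend(earlier: list[float], later: list[float], tolerance: float = 0.05) -> str:
    """Compare the spread of branch sentiment between two snapshots."""
    delta = dispersion(later) - dispersion(earlier)
    if delta > tolerance:
        return "fragmenting"
    if delta < -tolerance:
        return "converging"
    return "stable"
```

A growing spread suggests branches are pulling apart, and a shrinking one suggests they are aligning. Branch count over time would be a complementary signal.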

How we built it

We built Sentimentree as a multi-stage pipeline that transforms raw internet discourse into a structured, interactive visualization. The system begins with an agentic data collection layer powered by OpenClaw, where agents identify relevant queries and continuously scrape platforms such as X, Reddit, and news sources. This data is then passed into an embedding and tagging pipeline, where each item is converted into a semantic vector, filtered for relevance, and scored based on its directional sentiment relative to the original question.
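
The filtering and scoring step can be sketched as follows. This is a minimal example under stated assumptions: embeddings are plain lists of floats, relevance is cosine similarity to the query embedding, and each item already carries a `stance` value in [-1, 1] from some upstream classifier. The dictionary keys, the 0.5 threshold, and the stance-times-relevance score are all hypothetical.

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def filter_and_score(items: list[dict], query_vec: list[float],
                     min_relevance: float = 0.5) -> list[dict]:
    """Drop items that are not relevant to the query and score the rest.

    The directional score is the item's stance toward the question,
    scaled by how relevant the item is (an illustrative choice).
    """
    kept = []
    for item in items:
        relevance = cosine(item["embedding"], query_vec)
        if relevance < min_relevance:
            continue  # treated as noise
        kept.append({**item,
                     "relevance": relevance,
                     "directional": item["stance"] * relevance})
    return kept


query = [1.0, 0.0]
items = [
    {"id": "a", "embedding": [1.0, 0.0], "stance": 1.0},
    {"id": "b", "embedding": [0.0, 1.0], "stance": 1.0},   # off-topic
    {"id": "c", "embedding": [1.0, 1.0], "stance": -1.0},
]
scored = filter_and_score(items, query)
```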

Next, we developed a branching algorithm that clusters semantically similar content into narrative threads. This algorithm detects when narratives diverge enough to form new branches and when previously separate narratives begin to converge. Finally, we built an interactive frontend that renders the tree structure chronologically, allowing users to explore nodes, follow branches, and visualize how sentiment evolves over time. The entire system is connected through a real-time infrastructure that continuously updates the tree as new data is processed.
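
To make the split and merge idea concrete, here is a simplified sketch using greedy centroid clustering. It is not the project's actual algorithm. It assumes each item is an embedding vector, opens a new branch when an item is not similar enough to any existing branch centroid, and flags two branches as merge candidates when their centroids become very similar. The class name, function names, and both thresholds are hypothetical.

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class Branch:
    """A narrative thread summarized by the running mean of its embeddings."""

    def __init__(self, vec: list[float]):
        self.centroid = list(vec)
        self.count = 1

    def add(self, vec: list[float]) -> None:
        # Incremental mean update of the centroid.
        self.count += 1
        self.centroid = [c + (v - c) / self.count
                         for c, v in zip(self.centroid, vec)]


def assign(branches: list[Branch], vec: list[float],
           split_threshold: float = 0.7) -> int:
    """Attach vec to the most similar branch, or open a new branch (a split)."""
    best_index, best_sim = None, -1.0
    for i, branch in enumerate(branches):
        sim = cosine(branch.centroid, vec)
        if sim > best_sim:
            best_index, best_sim = i, sim
    if best_index is not None and best_sim >= split_threshold:
        branches[best_index].add(vec)
        return best_index
    branches.append(Branch(vec))
    return len(branches) - 1


def merge_candidates(branches: list[Branch],
                     merge_threshold: float = 0.9) -> list[tuple[int, int]]:
    """Pairs of branches whose centroids are similar enough to merge."""
    return [(i, j)
            for i in range(len(branches))
            for j in range(i + 1, len(branches))
            if cosine(branches[i].centroid, branches[j].centroid) >= merge_threshold]
```

Keeping the split threshold lower than the merge threshold gives some hysteresis, so a branch that has just split is not immediately merged back.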

Challenges we ran into

One of the biggest challenges was dealing with noisy and inconsistent data across platforms. Not all scraped content is equally relevant, so we had to design filtering mechanisms that preserved meaningful signals while discarding noise. Another challenge was defining what it actually means for a narrative to split or merge, which required careful tuning of embedding distances and clustering thresholds.

We also had difficulty maintaining real-time performance across a multi-stage pipeline, where a delay in one component could stall the entire system. On the frontend, visualizing a large and complex graph without overwhelming the user required careful design decisions around layout, interaction, and scalability. Additionally, normalizing data from different platforms surfaced inconsistencies that had to be resolved at multiple stages of the pipeline.

Accomplishments that we're proud of

We are most proud of building a fully end-to-end system that goes from raw internet data to an interpretable visualization within a short hackathon timeframe. In particular, we developed a novel way to represent sentiment evolution through a branching tree structure, which captures not just what people think, but how narratives form and change over time.

We also successfully integrated multiple complex components, including agent-based data collection, semantic embeddings, clustering algorithms, and an interactive frontend. Letting users contribute their own nodes and predictions adds a layer of depth, making the system participatory as well as analytical.

What we learned

Through this project, we learned that raw data alone is not enough; meaningful insights require structure, context, and interpretation. We also realized that sentiment analysis becomes far more powerful when it is framed relative to a specific question or outcome rather than treated as a generic positive or negative signal.

Another key takeaway was the importance of system design in multi-stage pipelines. Each component must be carefully aligned with the others, or small inconsistencies can propagate and cause larger issues downstream. Finally, we learned that visualization plays a critical role in making complex data understandable and actionable.

What's next for Sentimentree

Moving forward, we plan to improve the accuracy of our branching and clustering algorithms by incorporating more advanced temporal and semantic modeling techniques. We also want to expand our data sources to include platforms such as YouTube, blogs, and niche communities to capture a broader range of discourse.

In addition, we aim to introduce user reputation systems to better weight contributed content, and to enhance prediction tracking so that user insights can be compared against market outcomes over time. As the system scales, we will continue refining the infrastructure to support larger datasets and a wider range of real-time use cases. Ultimately, we see Sentimentree evolving into a powerful tool for understanding sentiment dynamics in areas such as finance, politics, and global news.
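
As one way reputation weighting could work, the sketch below takes a reputation-weighted average of user-contributed sentiment. This is purely illustrative of the planned feature. It assumes each contribution is a (sentiment, reputation) pair with non-negative reputation, and the function name is hypothetical.

```python
def weighted_sentiment(contributions: list[tuple[float, float]]) -> float:
    """Reputation-weighted mean of (sentiment, reputation) pairs.

    Returns 0.0 when there is no positive total reputation to weight by.
    """
    total = sum(reputation for _, reputation in contributions)
    if total <= 0:
        return 0.0
    return sum(sentiment * reputation
               for sentiment, reputation in contributions) / total


# A trusted optimistic user outweighs a low-reputation pessimistic one.
score = weighted_sentiment([(1.0, 3.0), (-1.0, 1.0)])
```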