Cenotium

Sandboxed web browsing agent activated (ReAct)
Architecture of the Browser
Knowledge Graph of User Context
Dynamic Port Spawning
Complete objective of Operator

Inspiration

The modern internet is not built for AI agents. It was designed for human interaction: clicking, searching, reading, and manually executing tasks. With the rise of autonomous AI systems, a new paradigm is needed, one where AI agents can navigate the web and execute tasks independently.

We saw a fundamental gap:
- Websites are not built for agents – Most web pages lack structured APIs, making them inaccessible to AI.
- Autonomous AI requires a new interaction model – AI agents need a structured web representation to engage with content.
- Security & trust are critical – Autonomous systems need verifiable, secure, and efficient execution to operate safely.

This inspired us to rethink the internet, not by replacing it, but by overlaying a structured schema that lets AI agents navigate the web autonomously. In effect, Cenotium is a new kind of browser, built for agents rather than for people.

🛠️ What We Built

A Browser for AI Agents

Cenotium is an Agentic Internet Browser that lets AI agents interact with websites and automate web tasks through a structured representation of each page.

A Web Schema Transformer

Extracts website HTML, images, and UI components
Uses Vision, Grounding, and Action models to structure pages for AI interaction
Stores structured data in a Supabase database for efficiency and future page recall
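
To make the idea of a structured page concrete, here is a minimal sketch of what one stored schema record could look like. The class names (`UIElement`, `PageSchema`), the field names, and the pixel-based bounding box layout are assumptions for illustration, not the actual Supabase table layout.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class UIElement:
    """One interactive element detected on a page (hypothetical shape)."""
    element_id: str
    role: str                         # e.g. "button", "input", "link"
    label: str                        # caption or visible text
    bbox: Tuple[int, int, int, int]   # (x_min, y_min, x_max, y_max) in pixels

    def center(self) -> Tuple[int, int]:
        """Return the click point in the middle of the bounding box."""
        x_min, y_min, x_max, y_max = self.bbox
        return ((x_min + x_max) // 2, (y_min + y_max) // 2)


@dataclass
class PageSchema:
    """Agent-readable view of one page: its URL plus its detected elements."""
    url: str
    elements: List[UIElement] = field(default_factory=list)

    def find(self, role: str, label_contains: str) -> Optional[UIElement]:
        """Return the first element with this role whose label contains the text."""
        needle = label_contains.lower()
        for element in self.elements:
            if element.role == role and needle in element.label.lower():
                return element
        return None


# Example: an agent looks up the sign-in button and gets a click target.
page = PageSchema(
    url="https://example.com/login",
    elements=[UIElement("e1", "button", "Sign in", (10, 20, 110, 60))],
)
target = page.find("button", "sign in")
print(target.center() if target else "not found")
```

Storing the bounding box alongside a role and a label means a recalled page can be acted on without re-running the vision models, which is the efficiency gain the last item above refers to.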

An AI-Powered Agent Manager

Uses LLMCompiler to refine user prompts with RAG over the user-preference knowledge graph
Divides tasks into single-action steps using LLMs
Delegates execution to autonomous AI agents
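
The delegation step can be sketched as a plan of single-action steps routed to named agents. Everything here is hypothetical: the `Step` shape, the `run_plan` function, and the stub agents stand in for the real LLM-produced plan and the real agents.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# Hypothetical agent signature: takes one action string, returns a result string.
AgentFn = Callable[[str], str]


@dataclass
class Step:
    agent: str    # name of the agent that should run this step
    action: str   # one single action, phrased for that agent


def run_plan(steps: List[Step], agents: Dict[str, AgentFn]) -> List[str]:
    """Execute each single-action step with the agent it names, in order."""
    results = []
    for step in steps:
        if step.agent not in agents:
            raise KeyError(f"no agent registered for {step.agent!r}")
        results.append(agents[step.agent](step.action))
    return results


# Stub agents standing in for the search and calling agents.
agents = {
    "search": lambda action: f"searched: {action}",
    "call": lambda action: f"called: {action}",
}
plan = [
    Step("search", "opening hours of the nearest pharmacy"),
    Step("call", "ask the pharmacy whether the prescription is ready"),
]
print(run_plan(plan, agents))
```

Keeping each step to one action makes a failure easy to localize: the manager knows exactly which agent and which action did not complete.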

A Network of AI Agents

Perplexity Search Agent → Queries real-time information from the web
Twilio Calling Agent → Places calls & sends messages autonomously
Browser Activation Agent → Uses E2B.dev containerized environments with OS-Atlas grounding and LLM planning to interact with web elements, executing tasks on a Poetry-managed Python stack that integrates PyQt5, Flask, OpenCV, and FFmpeg

Security & Trust Layer

Global & Local Trust Core → Implements EigenTrust + PageRank reputation scoring to decide which agents to trust
Fernet Symmetric Encryption → Ensures secure agent communication
Persistent Storage (Redis & Supabase) → Maintains execution history & user preferences
Custom Inter-Agent Communication Protocol (RabbitMQ/Kafka) → Handles message security, encryption, and authentication
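
As a rough illustration of the trust core, the sketch below runs an EigenTrust-style power iteration with a PageRank-like damping term. It is a generic textbook form, not Cenotium's actual implementation; the function name, the `alpha` value, and the example matrix are assumptions.

```python
from typing import List


def eigentrust(local_trust: List[List[float]],
               pretrusted: List[float],
               alpha: float = 0.15,
               iterations: int = 100) -> List[float]:
    """Compute global trust scores from pairwise local trust.

    local_trust[i][j] is how much agent i trusts agent j (non-negative).
    pretrusted is a probability distribution over agents that anchors the
    scores, playing the same role as the teleport vector in PageRank.
    """
    n = len(local_trust)
    # Row-normalise so each agent's outgoing trust sums to 1. An agent that
    # trusts nobody falls back to the pre-trusted distribution.
    normalised = []
    for row in local_trust:
        clipped = [max(value, 0.0) for value in row]
        total = sum(clipped)
        normalised.append([v / total for v in clipped] if total > 0 else list(pretrusted))

    trust = list(pretrusted)
    for _ in range(iterations):
        trust = [
            (1 - alpha) * sum(normalised[i][j] * trust[i] for i in range(n))
            + alpha * pretrusted[j]
            for j in range(n)
        ]
    return trust


# Agents 0 and 2 both vouch for agent 1; agent 1 vouches for agent 0.
scores = eigentrust(
    local_trust=[[0, 4, 0], [2, 0, 0], [0, 3, 0]],
    pretrusted=[1 / 3, 1 / 3, 1 / 3],
)
print(scores)  # agent 1 ends up with the highest global score
```

Local trust here is whatever one agent has observed about another; the iteration turns those pairwise observations into a single global ranking.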

This architecture allows any new AI agent to be integrated, making Cenotium a scalable, future-proof AI ecosystem for web automation.

💡 What We Learned

The Web is Not AI-Ready – Most websites hide UI elements in React/Angular front-ends, making AI interaction difficult.
Planning & Execution Require Dynamic Replanning – Tasks must be broken into single-action steps and replanned on failure for seamless execution.
Security is Paramount – AI agents must operate with encryption, trust mechanisms, and secure communication to prevent malicious actions.
Agent Collaboration is Key – By allowing agents to call other agents, we built a self-improving multi-agent system.
Lack of Web Search Optimization Dataset – There is no structured dataset optimized for AI-driven web search and interaction, forcing agents to learn web navigation heuristically rather than relying on standardized training data.
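
The replan-on-failure lesson above can be sketched as a small control loop. The function and parameter names (`execute_with_replanning`, `execute`, `replan`, `max_replans`) are hypothetical; in the real system the planner would be an LLM call and the executor an agent.

```python
from typing import Callable, List


def execute_with_replanning(
    plan: List[str],
    execute: Callable[[str], bool],
    replan: Callable[[str, List[str]], List[str]],
    max_replans: int = 3,
) -> List[str]:
    """Run single-action steps in order; on failure ask `replan` for a new tail."""
    completed: List[str] = []
    queue = list(plan)
    replans = 0
    while queue:
        step = queue.pop(0)
        if execute(step):
            completed.append(step)
            continue
        if replans >= max_replans:
            raise RuntimeError(f"gave up after {max_replans} replans at step {step!r}")
        replans += 1
        # The planner sees the failed step and the remaining steps,
        # and returns the list of steps to run from here on.
        queue = replan(step, queue)
    return completed


# Stub executor: a brittle selector fails, everything else succeeds.
def fake_execute(step: str) -> bool:
    return step != "click #buy"


# Stub planner: replace the failed step with two smaller ones.
def fake_replan(failed: str, remaining: List[str]) -> List[str]:
    return ["scroll down", "click Buy button"] + remaining


print(execute_with_replanning(
    ["open shop", "click #buy", "confirm order"], fake_execute, fake_replan
))
```

The cap on replans is a design choice in this sketch: without it, a step that can never succeed would loop forever.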

🚧 Challenges We Faced

Building an Agent-Readable Web Schema → Converting unstructured HTML into structured agent-compatible schemas was a major challenge. We solved this using Vision + Grounding + Action models.
Efficient Prompt Optimization → Ensuring that the LLM Compiler created the most contextually aware user queries required RAG-based knowledge retrieval and continuous prompt tuning.
Security & Trust → We had to implement EigenTrust-style reputation ranking, Fernet encryption, and a custom inter-agent security protocol to ensure safe & fair agent execution.
Scalability → The system had to support an arbitrary number of AI agents, allowing any new LangChain tool to become an agent. We addressed this with a modular, distributed execution model.
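
For the inter-agent security protocol mentioned above, the authentication half can be sketched with the Python standard library alone. This is an illustration under assumptions: the envelope layout (`body`, `tag`) and the function names are hypothetical, and the Fernet encryption step (from the `cryptography` package) is left out to keep the example dependency-free. Fernet tokens already carry their own HMAC-SHA256 tag, so a separate tag like this one is mainly useful for messages that are authenticated but not encrypted.

```python
import hashlib
import hmac
import json
from typing import Any, Dict


def sign_message(payload: Dict[str, Any], key: bytes) -> Dict[str, str]:
    """Serialise the payload deterministically and attach an HMAC-SHA256 tag."""
    body = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    tag = hmac.new(key, body.encode("utf-8"), hashlib.sha256).hexdigest()
    return {"body": body, "tag": tag}


def verify_message(envelope: Dict[str, str], key: bytes) -> Dict[str, Any]:
    """Recompute the tag and return the payload only if it matches."""
    expected = hmac.new(key, envelope["body"].encode("utf-8"), hashlib.sha256).hexdigest()
    # compare_digest avoids leaking how many characters matched via timing.
    if not hmac.compare_digest(expected, envelope["tag"]):
        raise ValueError("message authentication failed")
    return json.loads(envelope["body"])


# Example with a throwaway key; a real deployment would load it from a secret store.
shared_key = b"demo-key-not-for-production"
envelope = sign_message({"from": "manager", "to": "search", "action": "query"}, shared_key)
print(verify_message(envelope, shared_key))
```

A receiving agent that rejects any envelope failing `verify_message` cannot be driven by messages from a party that lacks the shared key.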

🛠️ Built With

Languages & Frameworks

Python → Backend development
JavaScript (Next.js, React, Tailwind) and Streamlit (Python) → Frontend & UI

AI Models & NLP

GPT-4o, DeepSeek-R1, Claude 3.5 Sonnet → Task decomposition, planning, execution
OS-Atlas deployed on Hugging Face → Bounding box detection for UI elements
OpenAI GPT-4o mini → Web image captioning
AWS Neptune → Knowledge graph generation
Gemini Flash → Runs the grounding model

Data & Storage

Supabase → Stores web schema & bounding boxes
Redis → Caches agent execution history

Security & Trust

Fernet Encryption + HMAC-SHA256 → Secure AI communications
EigenTrust + PageRank → AI agent reputation scoring

Agents & Tools

LangChain → AI agents built as modular tools
Selenium → Browser automation for web scraping
Twilio API → Automated phone calling & messaging
Perplexity AI API → Real-time web search agent
LLMCompiler → Helps the Agent Manager control and call agents

📸 Project Media

Agentic Internet

Cenotium is just the beginning: a paradigm shift in how AI interacts with the web. We are building a new internet, not for humans, but for agents. 🌎 🤖