Inspiration

The modern internet is not built for AI agents. It was designed for human interaction: clicking, searching, reading, and manually executing tasks. With the rise of autonomous AI systems, a new paradigm is needed, one where AI agents can interact, navigate, and execute tasks independently across the web.

We saw a fundamental gap:
- Websites are not built for agents – Most web pages lack structured APIs, making them inaccessible to AI.
- Autonomous AI requires a new interaction model – AI agents need a structured web representation to engage with content.
- Security & trust are critical – Autonomous systems need verifiable, secure, and efficient execution to operate safely.

This inspired us to redefine the internet, not by replacing it, but by overlaying a structured schema that lets AI agents navigate the web autonomously. In effect, we are building a new kind of browser.


🛠️ What We Built

A Browser for AI Agents

Cenotium is an Agentic Internet Browser that lets AI agents interact with websites and automate web tasks through a structured representation of each page.

A Web Schema Transformer

  • Extracts website HTML, images, and UI components
  • Uses Vision, Grounding, and Action models to structure pages for AI interaction
  • Stores structured data in a Supabase database for efficiency and future page recall
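
To make the idea of a stored page record concrete, here is a minimal Python sketch. The class and field names (`PageSchema`, `UIElement`, `bbox`, `role`) are assumptions made for this illustration, not Cenotium's actual schema; the write-up only states that structured page data and bounding boxes are stored.

```python
from dataclasses import dataclass, field, asdict
import json


@dataclass
class UIElement:
    """One interactive element found on a page (hypothetical shape)."""
    label: str                        # human-readable caption for the element
    role: str                         # e.g. "button", "input", "link"
    bbox: tuple[int, int, int, int]   # (x_min, y_min, x_max, y_max) in pixels

    def center(self) -> tuple[int, int]:
        """Point an agent could click to activate this element."""
        x0, y0, x1, y1 = self.bbox
        return ((x0 + x1) // 2, (y0 + y1) // 2)


@dataclass
class PageSchema:
    """Structured view of one page, suitable for storing as a JSON row."""
    url: str
    elements: list[UIElement] = field(default_factory=list)

    def find(self, role: str) -> list[UIElement]:
        """Return every element with the given role."""
        return [e for e in self.elements if e.role == role]

    def to_json(self) -> str:
        return json.dumps(asdict(self))


page = PageSchema(
    url="https://example.com/login",
    elements=[
        UIElement("Email field", "input", (100, 200, 400, 240)),
        UIElement("Sign in", "button", (100, 300, 220, 340)),
    ],
)
click_target = page.find("button")[0].center()
```

With a record like this, an agent that wants to press "Sign in" only needs the element's role and bounding box rather than the raw HTML.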

An AI-Powered Agent Manager

  • Uses LLMCompiler to refine user prompts with RAG over a knowledge graph of user preferences
  • Divides tasks into single-action steps using LLMs
  • Delegates execution to autonomous AI agents
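
The plan-then-delegate flow described above can be sketched roughly as follows. This is an illustration under stated assumptions: the `AGENTS` registry, the `Step` type, and the hard-coded `plan` function are hypothetical stand-ins, and in the real system the plan would come from an LLM rather than a fixed list.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Step:
    agent: str   # name of the agent that should run this step
    action: str  # one atomic instruction for that agent


# Hypothetical registry: each agent is a callable that takes one instruction.
AGENTS: dict[str, Callable[[str], str]] = {
    "search": lambda query: f"search results for {query!r}",
    "call": lambda message: f"placed call: {message}",
    "browser": lambda action: f"browser did: {action}",
}


def plan(prompt: str) -> list[Step]:
    """Stand-in for the LLM planner: returns a fixed, hard-coded plan."""
    return [
        Step("search", "best-rated Italian restaurant nearby"),
        Step("browser", "open the booking page"),
        Step("call", "confirm a table for two at 7pm"),
    ]


def run(prompt: str) -> list[str]:
    """Delegate each single-action step to the agent named in the plan."""
    return [AGENTS[step.agent](step.action) for step in plan(prompt)]


results = run("Book me dinner tonight")
```

Keeping each step to a single action means a failure can be traced to one agent call, and registering a new agent is just one more entry in the registry.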

A Network of AI Agents

  • Perplexity Search Agent → Queries real-time information from the web
  • Twilio Calling Agent → Places calls & sends messages autonomously
  • Browser Activation Agent → Uses E2B.dev containerized environments with OS-Atlas grounding and LLM planning to interact with web elements; tasks run on a Poetry-managed Python stack with PyQt5, Flask, OpenCV, and FFmpeg

Security & Trust Layer

  • Global & Local Trust Core → Uses EigenTrust + PageRank to score agent reputation for AI decision-making
  • Fernet Symmetric Encryption → Ensures secure agent communication
  • Persistent Storage (Redis & Supabase) → Maintains execution history & user preferences
  • Custom Inter-Agent Communication Protocol (RabbitMQ/Kafka) → Handles message security, encryption, and authentication
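
For readers unfamiliar with EigenTrust-style scoring, the sketch below shows the basic power iteration in plain Python. It is a generic textbook-style version, not Cenotium's implementation: the function name, the `alpha` value, and the example trust matrix are all assumptions, and how the project combines this with PageRank is not specified in the write-up. The `alpha` term plays a role similar to PageRank's damping factor.

```python
def eigentrust(local, pretrusted, alpha=0.15, iters=50):
    """Compute global trust scores by power iteration.

    local[i][j] : non-negative trust that agent i places in agent j
    pretrusted  : probability distribution over agents trusted a priori
    alpha       : weight pulled back toward the pre-trusted distribution
    """
    n = len(local)
    # Row-normalize local trust; an agent with no opinions defers to pretrusted.
    normalized = []
    for row in local:
        total = sum(row)
        normalized.append([v / total for v in row] if total > 0 else list(pretrusted))

    trust = list(pretrusted)
    for _ in range(iters):
        trust = [
            (1 - alpha) * sum(normalized[i][j] * trust[i] for i in range(n))
            + alpha * pretrusted[j]
            for j in range(n)
        ]
    return trust


# Three agents: agents 0 and 1 trust each other, nobody trusts agent 2.
local_trust = [
    [0, 4, 0],
    [5, 0, 0],
    [3, 3, 0],
]
scores = eigentrust(local_trust, pretrusted=[1 / 3, 1 / 3, 1 / 3])
```

In this example agent 2 ends up with the lowest score, since its only trust comes from the pre-trusted baseline.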

This architecture allows any new AI agent to be integrated, making Cenotium a scalable, future-proof AI ecosystem for web automation.


💡 What We Learned

  • The Web is Not AI-Ready – Many websites render their UI elements inside React/Angular front-ends, which makes AI interaction difficult.
  • Planning & Execution Require Dynamic Replanning – Tasks must be broken into single-action steps and replanned on failure for seamless execution.
  • Security is Paramount – AI agents must operate with encryption, trust mechanisms, and secure communication to prevent malicious actions.
  • Agent Collaboration is Key – By allowing agents to call other agents, we built a self-improving multi-agent system.
  • Lack of a Web Search Optimization Dataset – There is no structured dataset optimized for AI-driven web search and interaction, forcing agents to learn web navigation heuristically rather than relying on standardized training data.
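
The replan-on-failure idea from the list above can be sketched as a small execution loop. Everything here is illustrative: `execute_with_replanning`, `run_step`, and `replan` are hypothetical names, and the toy `replan` function stands in for what would be an LLM call in practice.

```python
from typing import Callable


def execute_with_replanning(
    steps: list[str],
    run_step: Callable[[str], bool],
    replan: Callable[[str, list[str]], list[str]],
    max_replans: int = 3,
) -> list[str]:
    """Run steps in order; on failure, swap the remaining plan for a new one.

    Returns the steps that completed successfully, in order.
    Raises RuntimeError if the replan budget is exhausted.
    """
    done: list[str] = []
    queue = list(steps)
    replans = 0
    while queue:
        step = queue.pop(0)
        if run_step(step):
            done.append(step)
            continue
        if replans >= max_replans:
            raise RuntimeError(f"gave up after {replans} replans at step {step!r}")
        replans += 1
        # The planner sees the failed step and whatever was still pending.
        queue = replan(step, queue)
    return done


def run_step(step: str) -> bool:
    """Toy executor: pretend the 'Buy' button is not on screen."""
    return step != "click 'Buy' button"


def replan(failed: str, remaining: list[str]) -> list[str]:
    """Toy planner: try a different route, then continue with the rest."""
    return ["scroll down", "click 'Add to cart' button"] + remaining


completed = execute_with_replanning(
    ["open product page", "click 'Buy' button", "open checkout"],
    run_step,
    replan,
)
```

The replan budget is a design choice for the sketch; without some cap, a step that can never succeed would loop forever.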

🚧 Challenges We Faced

  • Building an Agent-Readable Web Schema → Converting unstructured HTML into structured agent-compatible schemas was a major challenge. We solved this using Vision + Grounding + Action models.

  • Efficient Prompt Optimization → Getting LLMCompiler to produce the most contextually aware user queries required RAG-based knowledge retrieval and continuous prompt tuning.

  • Security & Trust → We had to implement EigenTrust-style reputation ranking, Fernet encryption, and a custom inter-agent security protocol to ensure safe & fair agent execution.

  • Scalability → The system had to support an unbounded number of AI agents, allowing any new LangChain tool to become an agent. We addressed this with a modular, distributed execution model.


🛠️ Built With

Languages & Frameworks

  • Python → Backend development
  • JavaScript (Next.js, React, Tailwind) and Streamlit (Python) → Frontend & UI

AI Models & NLP

  • GPT-4o, DeepSeek R1, Claude 3.5 Sonnet → Task decomposition, planning, execution
  • OS-Atlas deployed on HuggingFace → Bounding box detection for UI elements
  • OpenAI GPT-4o mini → Web image captioning
  • AWS Neptune → Knowledge graph generation
  • Gemini Flash → Runs the Grounding model

Data & Storage

  • Supabase → Stores web schema & bounding boxes
  • Redis → Caches agent execution history

Security & Trust

  • Fernet Encryption + HMAC-SHA256 → Secure AI communications
  • EigenTrust + PageRank → AI agent reputation scoring
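
As a small illustration of the HMAC-SHA256 part, the sketch below signs and verifies an inter-agent message using only the Python standard library. The envelope layout, function names, and demo key are assumptions for this example, not the project's protocol. It covers authentication only; Fernet (from the `cryptography` package) additionally encrypts the payload and carries its own HMAC-SHA256 tag.

```python
import hashlib
import hmac
import json


def sign(key: bytes, sender: str, payload: dict) -> dict:
    """Wrap a payload in an envelope carrying an HMAC-SHA256 tag."""
    body = json.dumps({"sender": sender, "payload": payload}, sort_keys=True)
    tag = hmac.new(key, body.encode(), hashlib.sha256).hexdigest()
    return {"body": body, "tag": tag}


def verify(key: bytes, envelope: dict) -> dict:
    """Return the decoded body if the tag matches; raise ValueError otherwise."""
    expected = hmac.new(key, envelope["body"].encode(), hashlib.sha256).hexdigest()
    # compare_digest avoids leaking the mismatch position through timing.
    if not hmac.compare_digest(expected, envelope["tag"]):
        raise ValueError("message authentication failed")
    return json.loads(envelope["body"])


key = b"shared-secret-for-demo-only"  # hypothetical key; never hard-code real keys
envelope = sign(key, "search-agent", {"query": "weather"})
message = verify(key, envelope)
```

If any byte of `body` changes in transit, `verify` raises instead of returning the message, so a receiving agent can reject tampered instructions.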

Agents & Tools

  • LangChain → AI agents built as modular tools
  • Selenium → Browser automation for web scraping
  • Twilio API → Automated phone calling & messaging
  • Perplexity AI API → Real-time web search agent
  • LLMCompiler → Helps the Agent Manager control and call agents

📸 Project Media

[Image: Agentic Internet]


Cenotium is just the beginning: a paradigm shift in how AI interacts with the web. We are building a new internet, not for humans but for agents. 🌎 🤖
