Inspiration
The modern internet is not built for AI agents. It was designed for human interaction: clicking, searching, reading, and manually executing tasks. With the rise of autonomous AI systems, a new paradigm is needed, one where AI agents can interact, navigate, and execute tasks independently across the web.
We saw a fundamental gap:
- Websites are not built for agents – Most web pages lack structured APIs, making them inaccessible to AI.
- Autonomous AI requires a new interaction model – AI agents need a structured web representation to engage with content.
- Security & trust are critical – Autonomous systems need verifiable, secure, and efficient execution to operate safely.
This inspired us to rethink the internet, not by replacing it, but by overlaying a structured schema that lets AI agents navigate the web autonomously. In effect, Cenotium is a new kind of browser.
🛠️ What We Built
A Browser for AI Agents
Cenotium is an Agentic Internet Browser that lets AI agents interact with websites and automate web tasks through a structured representation of each page.
A Web Schema Transformer
- Extracts website HTML, images, and UI components
- Uses Vision, Grounding, and Action models to structure pages for AI interaction
- Stores structured data in a Supabase database for efficiency and future page recall
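To make the idea of a structured page concrete, here is a minimal sketch of what one stored record could look like. The `UIElement` and `PageSchema` classes and their field names (`label`, `action`, `bbox`) are hypothetical, not Cenotium's actual schema, and a plain in-memory dict stands in for the Supabase table.

```python
from dataclasses import dataclass, field, asdict
import json


@dataclass
class UIElement:
    """One interactive element found on a page (hypothetical fields)."""
    label: str    # human-readable caption, e.g. produced by a vision model
    action: str   # what an agent may do with it: "click", "type", ...
    bbox: tuple   # (x_min, y_min, x_max, y_max) in pixels


@dataclass
class PageSchema:
    """Agent-readable view of one web page (hypothetical layout)."""
    url: str
    elements: list = field(default_factory=list)

    def find(self, action):
        """Return all elements that support the given action."""
        return [e for e in self.elements if e.action == action]


# In-memory stand-in for the database table used for page recall.
_store = {}


def save_page(page):
    """Serialize the schema to JSON and store it under its URL."""
    _store[page.url] = json.dumps(asdict(page))


def load_page(url):
    """Return the stored schema for url, or None if it was never saved."""
    raw = _store.get(url)
    if raw is None:
        return None
    data = json.loads(raw)
    elements = [
        UIElement(e["label"], e["action"], tuple(e["bbox"]))
        for e in data["elements"]
    ]
    return PageSchema(data["url"], elements)
```

With a record like this, an agent revisiting a page can call `load_page(url)` and go straight to `find("click")` instead of re-running the vision and grounding models.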
An AI-Powered Agent Manager
- Uses LLMCompiler to refine user prompts with RAG over a knowledge graph of user preferences
- Divides tasks into single-action steps using LLMs
- Delegates execution to autonomous AI agents
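The decompose-then-delegate flow above can be sketched as follows. This is an illustration under stated assumptions rather than the project's code: `Step`, `AgentManager`, and `stub_planner` are hypothetical names, and the stub returns a fixed plan where the real system would call an LLM.

```python
from dataclasses import dataclass


@dataclass
class Step:
    agent: str        # name of the agent that should run this step
    instruction: str  # exactly one action for that agent


def stub_planner(prompt):
    """Stand-in for the LLM planner: returns a fixed single-action plan."""
    return [
        Step("search", f"find: {prompt}"),
        Step("caller", "call the top result"),
    ]


class AgentManager:
    """Splits a prompt into steps and hands each step to a named agent."""

    def __init__(self, planner):
        self.planner = planner
        self.agents = {}  # agent name -> callable taking an instruction

    def register(self, name, handler):
        self.agents[name] = handler

    def run(self, prompt):
        results = []
        for step in self.planner(prompt):
            handler = self.agents.get(step.agent)
            if handler is None:
                raise KeyError(f"no agent registered for {step.agent!r}")
            results.append(handler(step.instruction))
        return results
```

Keeping each step to a single action means a failure can be traced to one agent and one instruction, which is what makes later replanning practical.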
A Network of AI Agents
- Perplexity Search Agent → Queries real-time information from the web
- Twilio Calling Agent → Places calls & sends messages autonomously
- Browser Activation Agent → Uses E2B.dev containerized environments with OS-Atlas grounding and LLM planning to interact with web elements. Tasks run on a Poetry-managed Python stack that integrates PyQt5, Flask, OpenCV, and FFmpeg.
Security & Trust Layer
- Global & Local Trust Core → Implements EigenTrust + PageRank for AI decision-making
- Fernet Symmetric Encryption → Ensures secure agent communication
- Persistent Storage (Redis & Supabase) → Maintains execution history & user preferences
- Custom Inter-Agent Communication Protocol (RabbitMQ/Kafka) → Handles message security, encryption, and authentication
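For readers unfamiliar with EigenTrust, the core idea is a power iteration over normalized local trust scores, damped toward a set of pre-trusted peers in the same way PageRank damps toward a teleport distribution. The sketch below is a simplified, centralized version; the function name, the `alpha` default, and the input layout are assumptions, and the write-up does not specify how Cenotium combines the two algorithms.

```python
def eigentrust(local_trust, pretrusted, alpha=0.15, tol=1e-9, max_iter=1000):
    """Compute global trust scores by power iteration.

    local_trust[i][j] is how much agent i trusts agent j (non-negative).
    pretrusted is a probability distribution over agents (sums to 1).
    """
    n = len(local_trust)

    # Normalize each row so an agent's outgoing trust sums to 1.
    # An agent with no opinions falls back to the pre-trusted distribution.
    c = []
    for row in local_trust:
        total = sum(row)
        c.append([v / total for v in row] if total > 0 else list(pretrusted))

    t = list(pretrusted)
    for _ in range(max_iter):
        # t_new[j] = (1 - alpha) * sum_i c[i][j] * t[i] + alpha * p[j]
        t_new = [
            (1 - alpha) * sum(c[i][j] * t[i] for i in range(n))
            + alpha * pretrusted[j]
            for j in range(n)
        ]
        # Stop once the scores no longer change (L1 distance below tol).
        if sum(abs(a - b) for a, b in zip(t_new, t)) < tol:
            return t_new
        t = t_new
    return t
```

An agent that nobody vouches for ends up with only the small `alpha` share of trust, so a manager can rank or filter agents by the returned scores before delegating work.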
This architecture allows any new AI agent to be integrated, making Cenotium a scalable, future-proof AI ecosystem for web automation.
💡 What We Learned
- The Web is Not AI-Ready – Most websites hide UI elements in React/Angular front-ends, making AI interaction difficult.
- Planning & Execution Require Dynamic Replanning – Tasks must be broken into single-action steps and replanned on failure for seamless execution.
- Security is Paramount – AI agents must operate with encryption, trust mechanisms, and secure communication to prevent malicious actions.
- Agent Collaboration is Key – By allowing agents to call other agents, we built a self-improving multi-agent system.
- Lack of a Web Search Optimization Dataset – There is no structured dataset optimized for AI-driven web search and interaction, so agents must learn web navigation heuristically rather than from standardized training data.
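The replan-on-failure lesson above can be illustrated with a small loop. This is a sketch under assumptions: `execute_with_replanning` and its callback signatures are hypothetical, and in practice `plan` would be backed by an LLM and `execute` by a browser or API agent.

```python
def execute_with_replanning(goal, plan, execute, max_replans=3):
    """Run single-action steps; ask for a fresh plan when a step fails.

    plan(goal, done, failed_step) -> list of remaining steps (strings)
    execute(step) -> True on success, False on failure
    """
    done = []
    steps = list(plan(goal, done, None))
    replans = 0
    while steps:
        step = steps.pop(0)
        if execute(step):
            done.append(step)
            continue
        if replans == max_replans:
            raise RuntimeError(f"giving up on {goal!r} after {replans} replans")
        replans += 1
        # Replan from the current state, telling the planner what failed.
        steps = list(plan(goal, done, step))
    return done
```

Passing the completed steps and the failed step back to the planner lets it route around the failure instead of restarting the whole task, and the `max_replans` cap keeps a broken page from looping forever.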
🚧 Challenges We Faced
Building an Agent-Readable Web Schema → Converting unstructured HTML into structured agent-compatible schemas was a major challenge. We solved this using Vision + Grounding + Action models.
Efficient Prompt Optimization → Ensuring that the LLM Compiler created the most contextually aware user queries required RAG-based knowledge retrieval and continuous prompt tuning.
Security & Trust → We had to implement EigenTrust-style reputation ranking, Fernet encryption, and a custom inter-agent security protocol to ensure safe & fair agent execution.
Scalability → The system had to support an arbitrary number of AI agents, so that any new LangChain tool can become an agent. We addressed this with a modular, distributed execution model.
🛠️ Built With
Languages & Frameworks
- Python → Backend development
- JavaScript (Next.js, React, Tailwind) and Streamlit → Frontend & UI
AI Models & NLP
- GPT-4o, DeepSeek R1, Claude 3.5 Sonnet → Task decomposition, planning, execution
- OS-Atlas deployed on HuggingFace → Bounding box detection for UI elements
- OpenAI GPT-4o mini → Web image captioning
- Amazon Neptune → Knowledge graph generation
- Gemini Flash → Runs the grounding model
Data & Storage
- Supabase → Stores web schema & bounding boxes
- Redis → Caches agent execution history
Security & Trust
- Fernet Encryption + HMAC-SHA256 → Secure AI communications
- EigenTrust + PageRank → AI agent reputation scoring
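As an illustration of the authentication half of this layer, the sketch below signs and verifies an inter-agent message with HMAC-SHA256 using only the Python standard library. It does not encrypt anything; Fernet tokens add encryption on top of a built-in HMAC-SHA256 check. The envelope layout and the function names are assumptions, not the project's actual protocol.

```python
import hashlib
import hmac
import json


def sign_message(key: bytes, sender: str, payload: dict) -> dict:
    """Wrap a payload in an envelope carrying an HMAC-SHA256 tag."""
    # sort_keys gives a stable byte representation for the same content.
    body = json.dumps({"sender": sender, "payload": payload}, sort_keys=True)
    tag = hmac.new(key, body.encode("utf-8"), hashlib.sha256).hexdigest()
    return {"body": body, "tag": tag}


def verify_message(key: bytes, envelope: dict) -> dict:
    """Return the decoded body if the tag is valid, else raise ValueError."""
    expected = hmac.new(
        key, envelope["body"].encode("utf-8"), hashlib.sha256
    ).hexdigest()
    # compare_digest avoids leaking information through timing differences.
    if not hmac.compare_digest(expected, envelope["tag"]):
        raise ValueError("message authentication failed")
    return json.loads(envelope["body"])
```

A receiving agent would call `verify_message` before acting on any instruction, so a message altered in transit or signed with the wrong key is rejected rather than executed.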
Agents & Tools
- LangChain → AI agents built as modular tools
- Selenium → Browser automation for web scraping
- Twilio API → Automated phone calling & messaging
- Perplexity AI API → Real-time web search agent
- LLMCompiler → Helps the Agent Manager control and call agents
📸 Project Media
Cenotium is just the beginning: a paradigm shift in how AI interacts with the web. We are building a new internet, not for humans but for agents. 🌎 🤖
Built With
- amazon-web-services
- chatgpt
- eigentrust
- encryption
- fernet
- huggingface
- javascript
- langchain
- llmcompiler
- next.js
- os-atlas
- pagerank
- perplexity
- python
- react
- redis
- selenium
- streamlit
- supabase
- tailwind
- twilio