validexisinfra/storyexplorer

StoryInfra — Infrastructure & AI Explorer for the Story Network 🚀

Unified infrastructure explorer and AI assistant for the Story ecosystem

Website: https://storyinfra.tech
AI Assistant API: POST /api/ai/query


🧩 Project Description

StoryInfra is an open infrastructure and analytics initiative focused on providing transparent, data-backed insights into the Story blockchain.

The project combines a network explorer, a structured data pipeline, and a production-grade AI assistant that answers questions strictly based on on-chain data and internal snapshots.

The goal of StoryInfra is to make the Story network easier to understand, analyze, and operate — without relying on opaque dashboards or generic AI chatbots.


🧱 Core Components

StoryInfra consists of three tightly integrated layers:

1️⃣ StoryInfra Explorer

A web-based infrastructure explorer providing:

  • Network-level statistics (validators, delegators, stake, price, APR)
  • Validator set visibility and ranking
  • Asset, license and derivation growth analytics
  • Infrastructure and decentralization insights (geo / providers)
  • Historical time-series snapshots

2️⃣ Story AI — Analytical Assistant

A specialized AI agent designed specifically for the Story blockchain, not a generic LLM interface.

It answers questions such as:

  • “What is the current state of the Story network?”
  • “Show top 5 validators by stake”
  • “How fast are assets and licenses growing?”
  • “Is the network becoming more decentralized?”

All answers are generated solely from a local SQLite database populated by verified snapshots.


3️⃣ Data Pipeline & Storage

A structured ingestion layer that:

  • Collects JSON snapshots from Story APIs and infrastructure tools
  • Normalizes and aggregates metrics
  • Stores them in a relational SQLite database
  • Exposes clean analytical context to the AI layer

The AI layer gets no raw SQL access, calls no hidden APIs, and reports no hallucinated numbers.
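
As a rough illustration of the ingestion steps above, the sketch below parses a JSON snapshot, normalizes a few fields, and writes the rows to SQLite. The snapshot shape, the column names (`address`, `moniker`, `stake`, `commission`) and the `import_validators` helper are assumptions made for this example; the actual `import_db.py` may be organized differently.

```python
import json
import sqlite3

# Hypothetical snapshot shape; the real snapshot files may differ.
SAMPLE_SNAPSHOT = json.dumps({
    "validators": [
        {"address": "val1", "moniker": " Alpha ", "tokens": "1500", "commission": "0.05"},
        {"address": "val2", "moniker": "Beta", "tokens": "900", "commission": None},
    ]
})

def import_validators(conn: sqlite3.Connection, raw_json: str) -> int:
    """Normalize a validators snapshot and upsert it into SQLite."""
    conn.execute(
        """CREATE TABLE IF NOT EXISTS validators (
               address    TEXT PRIMARY KEY,
               moniker    TEXT NOT NULL,
               stake      REAL NOT NULL,
               commission REAL
           )"""
    )
    rows = []
    for item in json.loads(raw_json)["validators"]:
        commission = item.get("commission")
        rows.append((
            item["address"],
            item["moniker"].strip(),          # trim stray whitespace
            float(item["tokens"]),            # string amounts become numbers
            float(commission) if commission is not None else None,
        ))
    # INSERT OR REPLACE keeps re-imports of the same snapshot idempotent.
    conn.executemany("INSERT OR REPLACE INTO validators VALUES (?, ?, ?, ?)", rows)
    conn.commit()
    return len(rows)

conn = sqlite3.connect(":memory:")
imported = import_validators(conn, SAMPLE_SNAPSHOT)
```

Missing values are stored as NULL rather than defaulted to zero, so the assistant can later report them as unavailable instead of presenting a made-up number.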


✨ Key Features

  • 🔗 Story-native focus — built exclusively for the Story network
  • 📊 Snapshot-based analytics — clear time windows and explicit limitations
  • 🧠 LLM with hard grounding — the AI can only use provided database context
  • 🧮 Validator ranking by stake — deterministic, SQL-based ordering
  • 📈 Asset & license growth tracking — minute / hour / day aggregation
  • 🌍 Infrastructure decentralization insights — countries and providers
  • 🧰 Command knowledge base — structured operational references
  • 🛡 Anti-hallucination design — unknown or insufficient data is explicitly stated
  • 🧪 Extensible architecture — new data sources and tools can be added cleanly
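
To make "deterministic, SQL-based ordering" concrete, here is a minimal sketch of a top-N ranking query. The table layout and the `top_validators` helper are hypothetical; the secondary sort on `address` is one possible tie-break, added so that validators with equal stake always come back in the same order.

```python
import sqlite3

def top_validators(conn: sqlite3.Connection, limit: int = 5):
    """Return the top-N validators by stake with a stable tie-break."""
    return conn.execute(
        "SELECT moniker, stake FROM validators "
        "ORDER BY stake DESC, address ASC LIMIT ?",
        (limit,),
    ).fetchall()

# Sample data for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE validators (address TEXT PRIMARY KEY, moniker TEXT, stake REAL)"
)
conn.executemany(
    "INSERT INTO validators VALUES (?, ?, ?)",
    [("val3", "Gamma", 900.0), ("val1", "Alpha", 1500.0), ("val2", "Beta", 900.0)],
)
ranking = top_validators(conn, limit=2)
```

Because the ordering is fully specified in SQL, the language model only has to describe the result, not compute it.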

👥 Usage Scenarios

| User Role | Typical Question / Goal | What StoryInfra Provides |
| --- | --- | --- |
| Validator / Node Operator | “How healthy is the current validator set?” | Snapshot-based overview of validators, bonded status, stake distribution, and infrastructure signals. |
| Validator / Node Operator | “Am I competitive compared to other validators?” | Ranking context, commission comparison, and relative stake positioning. |
| Delegator | “Who are the top validators by stake?” | Deterministic top-N validator ranking based on on-chain delegation data. |
| Delegator | “Is staking participation increasing?” | Delegator and staked-token growth dynamics over defined time windows. |
| Ecosystem Team / Researcher | “How is the Story network evolving over time?” | Time-series analytics for assets, licenses, derivations, staking, and price. |
| Ecosystem Team / Researcher | “Can this data be reproduced and verified?” | Snapshot-based SQLite storage with transparent aggregation logic. |
| Infrastructure / Decentralization Initiative | “How decentralized is the network infrastructure?” | Geographic and provider distribution of visible nodes. |
| Infrastructure / Decentralization Initiative | “Are there concentration risks?” | Country- and provider-level aggregation highlighting potential centralization patterns. |

The system is designed to translate raw on-chain and infrastructure data into clear, verifiable analytical answers for different roles across the Story ecosystem.
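
The country- and provider-level aggregation mentioned in the table could look roughly like the sketch below. The node records, the field names (`country`, `provider`) and the `concentration` helper are assumptions for illustration; the project's own aggregation logic is not shown in this README.

```python
from collections import Counter

def concentration(nodes, key):
    """Count visible nodes per group and compute each group's share."""
    counts = Counter(node[key] for node in nodes if node.get(key))
    total = sum(counts.values())
    # most_common() sorts by count, largest group first.
    return [(name, count, count / total) for name, count in counts.most_common()]

# Sample node list for illustration only.
nodes = [
    {"country": "DE", "provider": "ProviderA"},
    {"country": "DE", "provider": "ProviderA"},
    {"country": "FI", "provider": "ProviderB"},
    {"country": "US", "provider": None},  # unknown provider is skipped
]
by_provider = concentration(nodes, "provider")
by_country = concentration(nodes, "country")
```

A single group holding a large share of visible nodes is the kind of pattern this aggregation is meant to surface. Nodes with an unknown value are left out of the denominator here, which is a design choice worth stating alongside any reported percentage.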


🧠 Story AI Assistant Capabilities

Story AI is an analytical assistant, not a chatbot.

It follows strict rules:

  • uses only structured DB context
  • never invents or estimates values
  • explicitly states when data is missing
  • explains numbers in plain language
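
One way to enforce these rules in code is to refuse to call the model at all when the database context is empty. The following is a sketch under that assumption; `build_prompt`, `answer`, the prompt wording, and the `llm` callable are hypothetical names, not the project's actual `prompts.py` or `response.py`.

```python
from typing import Callable, Optional

INSUFFICIENT = "Insufficient data: no matching snapshot in the database."

def build_prompt(question: str, context: dict) -> Optional[str]:
    """Assemble a grounded prompt, or return None when no facts are available."""
    facts = {key: value for key, value in context.items() if value is not None}
    if not facts:
        return None
    lines = [f"- {key}: {value}" for key, value in sorted(facts.items())]
    return (
        "Answer using ONLY the facts below. "
        "If a value is not listed, say it is unavailable.\n"
        + "\n".join(lines)
        + f"\nQuestion: {question}"
    )

def answer(question: str, context: dict, llm: Callable[[str], str]) -> str:
    """Return a grounded answer, or an explicit 'missing data' message."""
    prompt = build_prompt(question, context)
    if prompt is None:
        return INSUFFICIENT  # never call the model without grounding
    return llm(prompt)
```

For example, `answer("What is the APR?", {"apr": None}, llm)` returns the fixed "insufficient data" message without invoking `llm`.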

What it can do today ✅

  • Provide network overview snapshots:

    • validators count
    • delegators count
    • total staked tokens
    • STORY price and APR
    • current block height
  • Rank top validators by stake:

    • deterministic SQL ordering
    • commission context where available
  • Analyze growth dynamics:

    • assets
    • licenses
    • derivations
    • over minute / hour / day windows
  • Explain economic trends:

    • staking growth
    • delegator growth
    • price dynamics (snapshot-based)
  • Summarize infrastructure distribution:

    • top countries
    • top hosting providers
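
The minute / hour / day growth windows can be pictured with the sketch below, which groups cumulative counters into fixed-size time buckets and reports the change between consecutive buckets. The sample format (Unix timestamp, cumulative count) and the helper names are assumptions; the project's real aggregation may work differently.

```python
WINDOW_SECONDS = {"m": 60, "h": 3600, "d": 86400}

def bucket_last(samples, window):
    """Keep the last cumulative value observed in each time bucket."""
    size = WINDOW_SECONDS[window]
    buckets = {}
    for ts, value in sorted(samples):
        buckets[ts - ts % size] = value  # later samples overwrite earlier ones
    return buckets

def growth_per_bucket(samples, window):
    """Change in the cumulative value between consecutive buckets."""
    items = sorted(bucket_last(samples, window).items())
    return [
        (start, value - previous)
        for (_, previous), (start, value) in zip(items, items[1:])
    ]

# (unix_timestamp, cumulative_asset_count); illustrative values only.
samples = [(0, 100), (30, 104), (3600, 110), (3660, 125), (7200, 140)]
hourly = growth_per_bucket(samples, "h")
```

The first bucket has no predecessor, so it produces no growth figure. That is one example of the explicit limitations a snapshot-based approach has to state.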

🔍 Data Sources

StoryInfra currently uses:

  • 🔌 Story network APIs

    • validators
    • staking metrics
    • chain statistics
  • 🗃 JSON snapshot files

    • assets & licenses history
    • chain economics history
    • infrastructure visibility data
  • 🧮 SQLite database

    • normalized validators table
    • time-series metric tables
    • infrastructure & commands tables

All data is processed before reaching the LLM.


🔄 Data Flow Overview

flowchart TD
    %% =====================
    %% External Sources
    %% =====================
    subgraph Sources["External Data Sources"]
        A1["Story RPC / APIs<br/>(validators, staking, chain stats)"]
        A2["Infrastructure Scanners<br/>(geo, providers, peers)"]
        A3["Asset & IP Indexers<br/>(assets, licenses, derivations)"]
    end

    %% =====================
    %% Raw Snapshots
    %% =====================
    subgraph Snapshots["Raw Snapshot Layer"]
        B1["nodestats.json"]
        B2["nodes.json"]
        B3["storychain.json"]
        B4["ipasset_snapshots.json"]
        B5["infrastructure_snapshot.json"]
        B6["commands.json"]
    end

    %% =====================
    %% Import & Processing
    %% =====================
    subgraph Import["Import & Normalization"]
        C1["import_db.py"]
        C2["Schema Validation"]
        C3["Metric Normalization"]
        C4["Time-series Aggregation<br/>(m / h / d buckets)"]
    end

    %% =====================
    %% Storage
    %% =====================
    subgraph Storage["Relational Storage"]
        D1["SQLite Database"]
        D2["validators table"]
        D3["chain table"]
        D4["assetstime table"]
        D5["chainstory table"]
        D6["location table"]
        D7["commands table"]
    end

    %% =====================
    %% Agent Runtime
    %% =====================
    subgraph Agent["Story AI Runtime"]
        E1["Intent Detection"]
        E2["SQL Context Builder"]
        E3["MCP Tool Resolver"]
        E4["Response Guardrails<br/>(no hallucinations)"]
    end

    %% =====================
    %% LLM
    %% =====================
    subgraph LLM["LLM Layer"]
        F1["Prompt Assembly"]
        F2["Gemini LLM"]
    end

    %% =====================
    %% Data Flow
    %% =====================
    A1 --> B1
    A1 --> B2
    A3 --> B4
    A2 --> B5
    A1 --> B3
    A3 --> B6

    B1 --> C1
    B2 --> C1
    B3 --> C1
    B4 --> C1
    B5 --> C1
    B6 --> C1

    C1 --> C2 --> C3 --> C4

    C4 --> D1
    D1 --> D2
    D1 --> D3
    D1 --> D4
    D1 --> D5
    D1 --> D6
    D1 --> D7

    D1 --> E1 --> E2
    E2 --> E3 --> E4
    E4 --> F1 --> F2

🧰 MCP & Tooling Layer

StoryInfra uses a Model Context Protocol (MCP)-style approach to expose internal data as callable tools.

Current tools include:

  • commands
    Returns structured CLI / RPC / operational references for the Story network.

This layer is intentionally designed to be:

  • reusable by other agents
  • callable by automation scripts
  • expandable with additional analytical tools
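
A registry of callable tools can be sketched in a few lines. Everything below is hypothetical: the `tool` decorator, the `call_tool` dispatcher, the return shape, and the placeholder command strings are illustrative only and are not taken from the project's `registry.py` or `tools.py`.

```python
from typing import Callable, Dict

TOOLS: Dict[str, Callable[..., dict]] = {}

def tool(name: str):
    """Register a function under a tool name."""
    def register(func: Callable[..., dict]) -> Callable[..., dict]:
        TOOLS[name] = func
        return func
    return register

@tool("commands")
def commands(topic: str = "") -> dict:
    # Placeholder references; not real Story CLI commands.
    references = [
        {"topic": "status", "command": "example-cli status"},
        {"topic": "peers", "command": "example-cli peers"},
    ]
    matches = [ref for ref in references if topic.lower() in ref["topic"]]
    return {"tool": "commands", "results": matches}

def call_tool(name: str, **kwargs) -> dict:
    """Dispatch by name; unknown tools yield an explicit error payload."""
    if name not in TOOLS:
        return {"error": f"unknown tool: {name}"}
    return TOOLS[name](**kwargs)

result = call_tool("commands", topic="status")
```

Keeping dispatch behind a single `call_tool` entry point is what would let other agents or automation scripts reuse the same tools without knowing how each one is implemented.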

🧱 Project Structure (Current)

storyai/
├── api/
│   └── main.py                # FastAPI application entrypoint
│
├── agent/
│   ├── agent.py               # Main orchestration logic
│   ├── intent.py              # Intent detection and routing
│   ├── context.py             # SQL context builder
│   ├── response.py            # Answer generation & guardrails
│   ├── prompts.py             # System prompt and response rules
│   ├── llm.py                 # Gemini LLM integration
│   ├── sql_tools.py           # SQLite query helpers
│   └── tools_proxy.py         # MCP tool proxy
│
├── mcp/
│   ├── server.py              # MCP FastAPI server
│   ├── registry.py            # MCP tool registry
│   └── tools.py               # MCP analytical tools
│
├── db/
│   └── story.db               # SQLite database (generated)
│
├── import_db.py               # Data ingestion & normalization script
├── run.sh                     # Production runner (uvicorn)
├── requirements.txt           # Python dependencies
├── config.py                  # Global configuration
├── .env                       # Environment variables (not committed)
└── README.md                  # Project documentation

🛠️ Getting Started (Local)

1️⃣ Requirements

  • Linux environment (Ubuntu/Debian recommended)
  • Python 3.10+
  • git
  • SQLite
  • Gemini API key (or compatible LLM provider)

2️⃣ Clone repository

git clone https://github.com/<your-org>/storyinfra.git
cd storyinfra

3️⃣ Setup virtual environment

python3 -m venv venv
source venv/bin/activate

4️⃣ Install dependencies

pip install -r requirements.txt

5️⃣ Environment variables (create .env)

GEMINI_API_KEY=YOUR_KEY

6️⃣ Import data into DB

python import_db.py

7️⃣ Run API

./run.sh

The API will be available at:

POST /api/ai/query

Example:

curl -X POST http://127.0.0.1:8004/api/ai/query \
  -H "Content-Type: application/json" \
  -d '{"query":"top 5 validators by stake"}'
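
The same request can be issued from Python using only the standard library. This is a sketch: the host and port mirror the curl example, while the assumption that the endpoint returns a JSON body is not confirmed by this README.

```python
import json
import urllib.request

def build_query_request(query: str, base_url: str = "http://127.0.0.1:8004"):
    """Build the POST request without sending it."""
    body = json.dumps({"query": query}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/api/ai/query",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

def ask(query: str, base_url: str = "http://127.0.0.1:8004", timeout: float = 30):
    """Send the query and decode the response (assumed to be JSON)."""
    request = build_query_request(query, base_url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))

request = build_query_request("top 5 validators by stake")
```

With the API running locally, `ask("top 5 validators by stake")` sends the request and returns the decoded response.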

🛡 Design Principles

StoryInfra is built around the following guarantees:

  • No hallucinations — answers are DB-backed
  • Explicit uncertainty — missing data is stated
  • Deterministic analytics — rankings are SQL-based
  • Explainability first — numbers are explained, not dumped
  • Public-good orientation — transparency over marketing

🔭 Roadmap

✅ Implemented

  • SQLite-backed AI assistant
  • Validator ranking by stake
  • Asset & license growth analytics
  • Infrastructure distribution insights
  • FastAPI production backend

🧪 Planned

  • Validator health scoring
  • Longer historical windows
  • Advanced decentralization metrics
  • Public dashboards powered by the same backend
  • Additional MCP analytical tools

📄 License

This project is licensed under the MIT License.
See the LICENSE file for details.
