research(orchestration): HiveMind OS-inspired scheduling for concurrent agent workloads — token budgets and rate-limit tracking #3299

@bug-ops

Description

arXiv:2604.17111 (April 2026) proposes HiveMind, an OS-inspired scheduling system for concurrent LLM agent workloads. It introduces primitives for admission control, provider-aware rate-limit tracking, and per-agent token budgets, all directly applicable to Zeph's multi-agent PILOT routing layer.

Key Findings

  • Admission control gate prevents overcommitting concurrent agents to a provider
  • Provider-aware rate-limit tracking integrates with the provider registry (maps 1:1 to Zeph's [[llm.providers]] entries)
  • Per-agent token budgets enforce cost ceilings per delegated subagent
  • Scheduling policies (FIFO, priority, fair-share) selectable at runtime
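
To make the rate-limit finding concrete, here is a minimal sketch of a provider-keyed tracker using a sliding window of request timestamps. It is not taken from the paper or from Zeph: `RateLimitTracker`, `ProviderLimit`, `register`, and `try_record` are hypothetical names, and the provider key is assumed to be the name of an [[llm.providers]] entry.

```rust
use std::collections::{HashMap, VecDeque};
use std::time::{Duration, Instant};

/// Hypothetical per-provider limit: at most `max_requests` per `window`.
pub struct ProviderLimit {
    pub max_requests: usize,
    pub window: Duration,
}

/// Sliding-window request tracker keyed by provider name (assumed design).
#[derive(Default)]
pub struct RateLimitTracker {
    limits: HashMap<String, ProviderLimit>,
    recent: HashMap<String, VecDeque<Instant>>,
}

impl RateLimitTracker {
    /// Registers (or replaces) the limit for one provider.
    pub fn register(&mut self, provider: &str, limit: ProviderLimit) {
        self.limits.insert(provider.to_string(), limit);
    }

    /// Records a request at `now` if the provider still has headroom.
    /// Returns false when the window is full or the provider is unknown.
    pub fn try_record(&mut self, provider: &str, now: Instant) -> bool {
        let Some(limit) = self.limits.get(provider) else {
            return false;
        };
        let queue = self.recent.entry(provider.to_string()).or_default();
        // Drop timestamps that have aged out of the window.
        while let Some(&oldest) = queue.front() {
            if now.duration_since(oldest) >= limit.window {
                queue.pop_front();
            } else {
                break;
            }
        }
        if queue.len() < limit.max_requests {
            queue.push_back(now);
            true
        } else {
            false
        }
    }
}
```

Passing `now` explicitly keeps the tracker deterministic under test. A real tracker would probably also count tokens per window and honor provider response headers, neither of which is modeled here.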

Relevance to Zeph

Zeph's PILOT LinUCB bandit routing already tracks provider latency and cost but lacks:

  1. Admission control: without it, concurrent subagents can saturate a single provider
  2. Per-agent budget enforcement: without it, subagents can collectively exceed the total turn budget

HiveMind primitives could be layered on top of PILOT without replacing the bandit policy.
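
As a rough illustration of that layering, the sketch below puts a per-provider concurrency gate in front of whatever provider the router picks. Everything in it is an assumption: `AdmissionGate`, `Permit`, and `try_admit` are hypothetical names, a single shared cap is used for all providers, and nothing here reflects PILOT's actual interfaces.

```rust
use std::collections::HashMap;
use std::sync::{Arc, Mutex};

/// Hypothetical gate: caps in-flight subagent calls per provider.
#[derive(Clone)]
pub struct AdmissionGate {
    max_in_flight: usize,
    in_flight: Arc<Mutex<HashMap<String, usize>>>,
}

/// RAII permit: dropping it frees the slot for the provider.
pub struct Permit {
    provider: String,
    in_flight: Arc<Mutex<HashMap<String, usize>>>,
}

impl AdmissionGate {
    pub fn new(max_in_flight: usize) -> Self {
        Self {
            max_in_flight,
            in_flight: Arc::new(Mutex::new(HashMap::new())),
        }
    }

    /// Returns a permit if the provider is below its cap, else None.
    /// The caller would invoke this after routing has chosen a provider.
    pub fn try_admit(&self, provider: &str) -> Option<Permit> {
        let mut map = self.in_flight.lock().unwrap();
        let count = map.entry(provider.to_string()).or_insert(0);
        if *count >= self.max_in_flight {
            return None;
        }
        *count += 1;
        Some(Permit {
            provider: provider.to_string(),
            in_flight: Arc::clone(&self.in_flight),
        })
    }

    /// Current number of admitted, unfinished calls for a provider.
    pub fn in_flight(&self, provider: &str) -> usize {
        self.in_flight
            .lock()
            .unwrap()
            .get(provider)
            .copied()
            .unwrap_or(0)
    }
}

impl Drop for Permit {
    fn drop(&mut self) {
        // Ignore a poisoned lock rather than panic inside drop.
        if let Ok(mut map) = self.in_flight.lock() {
            if let Some(count) = map.get_mut(&self.provider) {
                *count = count.saturating_sub(1);
            }
        }
    }
}
```

Because the gate only answers "may this call start now?", the bandit policy stays untouched. On a `None` the orchestrator could queue the subagent or ask the router for its next-best provider; which of these is appropriate is an open question for the prototype.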

Research Actions

  • Read full paper: https://arxiv.org/abs/2604.17111
  • Prototype admission control gate in zeph-orchestration
  • Assess whether per-agent token budgets can reuse cost_tracker already in zeph-core
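
For the budget question, a minimal sketch of what per-agent enforcement might look like is shown below. It deliberately does not use cost_tracker, whose API is not described in this issue; `TokenBudgets`, `BudgetError`, `grant`, and `charge` are hypothetical names, and budgets are counted in tokens rather than currency.

```rust
use std::collections::HashMap;

#[derive(Debug, PartialEq, Eq)]
pub enum BudgetError {
    /// No budget was granted to this agent id.
    UnknownAgent,
    /// The charge would push the agent past its ceiling.
    Exceeded { requested: u64, remaining: u64 },
}

/// Hypothetical ledger: agent id -> (limit, spent), both in tokens.
#[derive(Default)]
pub struct TokenBudgets {
    agents: HashMap<String, (u64, u64)>,
}

impl TokenBudgets {
    /// Grants (or resets) a token ceiling for one subagent.
    pub fn grant(&mut self, agent: &str, limit: u64) {
        self.agents.insert(agent.to_string(), (limit, 0));
    }

    /// Charges `tokens` against the agent's budget.
    /// Returns the remaining budget, or an error without charging anything.
    pub fn charge(&mut self, agent: &str, tokens: u64) -> Result<u64, BudgetError> {
        let (limit, spent) = self
            .agents
            .get_mut(agent)
            .ok_or(BudgetError::UnknownAgent)?;
        // `spent` never exceeds `limit`, so this subtraction cannot underflow.
        let remaining = *limit - *spent;
        if tokens > remaining {
            return Err(BudgetError::Exceeded {
                requested: tokens,
                remaining,
            });
        }
        *spent += tokens;
        Ok(*limit - *spent)
    }
}
```

If cost_tracker already records per-request usage, the ledger could become a thin per-agent view over it instead of a second source of truth. That is the point the assessment above would need to settle.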

Environment

  • Paper: arXiv:2604.17111
  • Area: zeph-orchestration, zeph-core

Metadata

Labels

  • P4: Long-term / exploratory
  • research: Research-driven improvement
