DRAFT feat(consensus): add multi-lineage consensus tool #4703

Merged
code-yeongyu merged 4 commits into code-yeongyu:dev from albertdbio:feat/consensus-tool
Jun 4, 2026
Conversation

@albertdbio

@albertdbio albertdbio commented Jun 3, 2026

Contributor

Summary

  • Adds a consensus tool: spawns N voters (default 3) from different model families (Anthropic / OpenAI / Google / open-source) in parallel, gives each the same question, and returns their positions for the calling agent to synthesize.
  • Restricted to the main agent (subagents cannot invoke it, to prevent recursion and cost blow-ups); config-gated and enabled by default.
  • Promoted in the Sisyphus prompt alongside Oracle, so the orchestrator reaches for it on high-stakes / hard-to-verify decisions.

Changes

The tool & core logic

  • src/tools/consensus/tool.ts — the consensus tool definition: description the agent sees, args (prompt, count, caller_model, exclude_lineages), main-agent-only gating, and the synthesizer guidance returned to the caller.
  • src/features/consensus/consensus-engine.ts — orchestration: fetches connected providers + available models, selects diverse voters (excluding the caller's own lineage), spawns them in parallel, aggregates positions, and flags advisoryOnly when fewer than two voters return a usable answer.
  • src/features/consensus/voter-resolver.ts — resolves each candidate to a connected provider + a promptable model id, with a stale-cache fallback so a just-connected provider still resolves.
  • src/features/consensus/voter-spawner.ts — spawns a one-shot voter session: framing (answer directly, do not delegate or wait), delegation/question tools disabled, configurable reasoning effort, polls to completion, and cleans up the temporary session.
  • src/shared/model-lineage.ts — the default voter pool (model families) and lineage-diversity selection.
  • src/features/consensus/types.ts — shared types (VoterPosition, ConsensusResult, ResolvedVoterCandidate).
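The lineage-diverse selection described above can be sketched roughly as follows. This is a minimal illustration, not the code in src/shared/model-lineage.ts: the `Lineage` union, the `VoterCandidate` shape, the prefix-based `lineageOf` helper, the `selectDiverseVoters` function, and the model id spellings in the sample pool are all assumptions made for the example. It picks at most one voter per lineage, in pool order, and skips the caller's own lineage plus any explicitly excluded ones.

```typescript
// Hypothetical shapes for illustration only; the PR's real definitions live in
// src/shared/model-lineage.ts and src/features/consensus/types.ts.
type Lineage = "anthropic" | "openai" | "google" | "open-source";

interface VoterCandidate {
  lineage: Lineage;
  model: string;
}

// Assumed lineage detection: a plain prefix match on the model id.
function lineageOf(model: string): Lineage | undefined {
  if (model.startsWith("claude")) return "anthropic";
  if (model.startsWith("gpt")) return "openai";
  if (model.startsWith("gemini")) return "google";
  if (model.startsWith("kimi")) return "open-source";
  return undefined;
}

// Pick up to `count` voters, at most one per lineage, never from the caller's
// lineage and never from an explicitly excluded lineage.
function selectDiverseVoters(
  pool: VoterCandidate[],
  count: number,
  callerModel?: string,
  excludeLineages: Lineage[] = [],
): VoterCandidate[] {
  const excluded = new Set<Lineage>(excludeLineages);
  const callerLineage = callerModel ? lineageOf(callerModel) : undefined;
  if (callerLineage) excluded.add(callerLineage);

  const selected: VoterCandidate[] = [];
  const seen = new Set<Lineage>();
  for (const candidate of pool) {
    if (selected.length >= count) break;
    if (excluded.has(candidate.lineage) || seen.has(candidate.lineage)) continue;
    seen.add(candidate.lineage);
    selected.push(candidate);
  }
  return selected;
}

// Illustrative pool; the id spellings are assumptions.
const pool: VoterCandidate[] = [
  { lineage: "anthropic", model: "claude-opus-4-8" },
  { lineage: "openai", model: "gpt-5.5" },
  { lineage: "openai", model: "gpt-5.4" },
  { lineage: "google", model: "gemini-3.1-pro" },
  { lineage: "open-source", model: "kimi-k2.6" },
];

const voters = selectDiverseVoters(pool, 3, "claude-opus-4-8");
console.log(voters.map((v) => v.model));
```

With a Claude caller, the sketch skips the Anthropic entry and the second OpenAI entry, which matches the shape of the end-to-end run reported under Testing (one voter each from OpenAI, Google, and an open-source family).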

Config

  • src/config/schema/consensus.ts — config block: voter count, lineages, per-voter timeout, reasoning effort, and the gate sub-configs.
  • src/config/schema/oh-my-opencode-config.ts — composes ConsensusConfigSchema into the root config schema.

Wiring

  • src/plugin/tool-registry.ts — registers the consensus tool, config-gated (default on).
  • src/tools/index.ts, src/tools/consensus/index.ts, src/features/consensus/index.ts — barrel exports.

Prompt promotion

  • src/agents/dynamic-agent-core-sections.ts — buildConsensusSection() (rendered only when the tool is registered), mirroring the existing Oracle section.
  • src/agents/dynamic-agent-prompt-builder.ts — exports the new section.
  • src/agents/sisyphus.ts, src/agents/sisyphus/{default,claude-opus-4-7,gpt-5-4,kimi-k2-6}.ts — inject the structured section next to Oracle; src/agents/sisyphus/gpt-5-5.ts — matching prose paragraph for the prose-style variant.

Generated

  • assets/oh-my-opencode.schema.json — regenerated from the updated config schema.

Testing

bun run typecheck
bun test

tsgo --noEmit passes clean. The tool was also exercised end-to-end with a Claude main agent (caller_model="claude-opus-4-8"), which resolved and collected three voters — GPT-5.5 (OpenAI), Gemini 3.1 Pro (Google Vertex), and Kimi K2.6 (OpenCode Zen) — all returning positions, with advisoryOnly correctly false.

Related Issues


Summary by cubic

Adds a consensus tool that runs parallel voters from different model families and returns their positions for the main agent to synthesize. Also relaxes a Windows installer test timeout to reduce flakiness.

  • New Features

    • consensus tool (main‑agent only): args prompt, count, caller_model, exclude_lineages; picks diverse voters (excludes caller lineage), resolves against connected providers with stale‑inventory fallback, spawns one‑shot voters with delegation disabled, aggregates results, flags advisoryOnly when <2 usable answers.
    • Config + wiring + schema: new consensus block (default voter count/lineages, per‑voter timeout, reasoning effort, optional pre‑question and post‑test gates); tool registered in src/plugin/tool-registry.ts (config‑gated, default on); regenerated assets/oh-my-opencode.schema.json.
    • Prompt promotion: Consensus usage section added to all Sisyphus variants.
  • Bug Fixes

    • Routed voter prompts through the shared internal prompt gate (dispatchInternalPrompt, mode: sync, queueBehavior: defer) and handled accepted/ambiguous/skipped outcomes before polling.
    • Removed a stale GPT voter fallback; GPT lineage now prefers current fallbacks (e.g., gpt-5.4(-mini)) and rejects Codex‑era matches.
    • Relaxed Windows cleanup installer test timeout to 15s to reduce flakiness.
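As a rough illustration of handling dispatch outcomes before polling, the sketch below models the three outcome names mentioned above as a discriminated union. Everything else is an assumption: the PR does not show what dispatchInternalPrompt returns, so the `DispatchOutcome` shape, the `reason` field, the `collectVote` helper, and the choice to treat both ambiguous and skipped as voter errors are hypothetical.

```typescript
// Hypothetical outcome type. Only the names "accepted", "ambiguous", and
// "skipped" come from the PR; the fields are invented for this sketch.
type DispatchOutcome =
  | { kind: "accepted" }
  | { kind: "ambiguous"; reason: string }
  | { kind: "skipped"; reason: string };

type VoterResult =
  | { status: "ok"; text: string }
  | { status: "error"; error: string };

// Dispatch the prompt, then poll only when the dispatch was accepted.
// Assumption: any other outcome means there is nothing reliable to poll for.
async function collectVote(
  dispatch: () => Promise<DispatchOutcome>,
  poll: () => Promise<string>,
): Promise<VoterResult> {
  const outcome = await dispatch();
  if (outcome.kind !== "accepted") {
    return { status: "error", error: `${outcome.kind}: ${outcome.reason}` };
  }
  return { status: "ok", text: await poll() };
}

// Usage with stand-in dispatch and poll functions.
collectVote(
  async () => ({ kind: "accepted" }),
  async () => "Prefer option A.",
).then((result) => console.log(result.status));
```

Checking the outcome first means a prompt that was never queued does not leave the spawner polling until the per-voter timeout.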

Written for commit f55f41c. Summary will update on new commits.


@github-actions

github-actions Bot commented Jun 3, 2026

Contributor

All contributors have signed the CLA. Thank you! ✅
Posted by the CLA Assistant Lite bot.

@albertdbio

Contributor Author

I have read the CLA Document and I hereby sign the CLA

github-actions Bot added a commit that referenced this pull request Jun 3, 2026

@cubic-dev-ai cubic-dev-ai Bot left a comment


No issues found across 22 files

Confidence score: 5/5

  • Automated review surfaced no issues in the provided summaries.
  • No files require special attention.

Requires human review: While the PR is well-structured and adds a useful feature, the scope (845 lines, 22 files) and the strict '100% sure no regressions' criterion make auto-approval risky without a deeper review.


@albertdbio

albertdbio commented Jun 3, 2026

Contributor Author

Codegraph-assisted review

Blocking / should resolve

  1. src/features/consensus/voter-spawner.ts:79 bypasses the established prompt dispatch path

    The voter spawner calls ctx.client.session.prompt(...) directly. The rest of the child-session paths I traced use the internal prompt dispatch/gate helpers or explicit facades. Locally, the prompt-route audit flags this file:

    bun test src/shared/prompt-async-route-audit.test.ts
    + [ "features/consensus/voter-spawner.ts" ]
    - []
    

    GitHub CI is currently green, so this is not a CI-status claim. The review concern is architectural: either route this through the established dispatch path, or explicitly document/allowlist why a fresh one-shot consensus voter session is exempt.

  2. No tests for the new consensus path

    The PR adds the core engine, voter resolver/spawner, tool wrapper, config schema, and prompt wiring, but no consensus-specific tests. The new code already has useful DI seams (RunConsensusDeps, ConsensusToolDeps), so candidate selection, caller-lineage exclusion, subagent rejection, result classification, and cleanup behavior should be pinned.

  3. Voter sessions do not inherit the parent workspace directory

    src/tools/consensus/tool.ts:60 calls runConsensus(...) without passing parentDirectory, while src/features/consensus/voter-spawner.ts:63 only sets query.directory if one is provided. Existing child-session flows resolve the parent session directory first. Consensus voters can otherwise run outside the user’s project context.

  4. Empty voter output is counted as a successful vote

    waitForResult can return an empty string, but spawnVoter still returns status: "ok", and okVoterCount treats that as a usable voter. Empty/whitespace responses should not count toward consensus.

  5. Config exposes unimplemented gates

    src/config/schema/consensus.ts defines pre_question_gate and post_test_gate, but repo search found no hook/runtime wiring for either. Users can configure these and see no behavior. Either implement the gates or mark/remove them until they exist.
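One way to address item 4 is to classify voter output before counting it. The sketch below is hypothetical and simplified: the `VoterPosition` and `ConsensusResult` shapes here are stand-ins for the PR's types, and the `"empty"` status, `classifyVoterOutput`, and `aggregate` are names introduced for illustration. It trims the raw text, refuses to mark whitespace-only output as `"ok"`, and derives advisoryOnly from the count of usable answers.

```typescript
// Hypothetical, simplified shapes; the real VoterPosition and ConsensusResult
// types in src/features/consensus/types.ts may differ.
type VoterStatus = "ok" | "empty" | "error";

interface VoterPosition {
  model: string;
  status: VoterStatus;
  position: string;
}

interface ConsensusResult {
  positions: VoterPosition[];
  okVoterCount: number;
  advisoryOnly: boolean;
}

// Whitespace-only output is not a usable answer, so it never gets status "ok".
function classifyVoterOutput(model: string, raw: string): VoterPosition {
  const position = raw.trim();
  return { model, status: position.length > 0 ? "ok" : "empty", position };
}

// advisoryOnly is set when fewer than two voters returned a usable answer.
function aggregate(positions: VoterPosition[]): ConsensusResult {
  const okVoterCount = positions.filter((p) => p.status === "ok").length;
  return { positions, okVoterCount, advisoryOnly: okVoterCount < 2 };
}

const result = aggregate([
  classifyVoterOutput("voter-a", "Prefer option A because it is reversible."),
  classifyVoterOutput("voter-b", "   \n"),
  classifyVoterOutput("voter-c", ""),
]);
console.log(result.okVoterCount, result.advisoryOnly); // prints: 1 true
```

Keeping the classification in a pure function also gives the missing tests from item 2 an easy target, since no session or provider needs to be mocked.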

How codegraph helped

I used codegraph-omo to trace the feature end-to-end: createConsensusTool -> runConsensus -> resolveVoterCandidate -> spawnVoter, then checked the impact path through createToolRegistry and the Sisyphus prompt builders. That call graph showed that the explicit tool is the only runtime caller today, the gate config is schema-only, and parentDirectory is dead plumbing from the tool layer into the spawner.

Local verification

  • GitHub PR checks are green.
  • bun run typecheck passes locally after installing dependencies.
  • bun test src/shared/model-availability.test.ts src/plugin/tool-registry.test.ts passes locally.
  • bun test src/shared/prompt-async-route-audit.test.ts locally flags features/consensus/voter-spawner.ts as above.

@cubic-dev-ai cubic-dev-ai Bot left a comment


0 issues found across 1 file (changes from recent commits).

Requires human review: This PR adds a large new multi-lineage consensus tool with 863 lines of changed code spanning new orchestration, voter spawning, model lineage resolution, configuration, and prompt integration, and while automated review found no issues, the strict requirement of zero regressions cannot be met with


@albertdbio albertdbio changed the title feat(consensus): add multi-lineage consensus tool DRAFT feat(consensus): add multi-lineage consensus tool Jun 3, 2026

@code-yeongyu code-yeongyu left a comment

Owner

Requesting changes for deprecated GPT model references in this PR.

The current dev branch has moved off gpt-5.2/gpt-5.3 and now uses the GPT-5.5-era IDs where applicable. This PR diff still adds stale references such as:

634:+ gpt: ["gpt-5.5", "gpt-5.4", "gpt-5.3-codex"],

Please update the PR so it does not introduce gpt-5.2/gpt-5.3 in runtime config, docs, fallback chains, snapshots, or ordinary test fixtures. If a deprecated ID is truly needed as a migration-input fixture, isolate it as an explicit historical/deprecated-input test and make sure the migrated/current output is GPT-5.5-era, not 5.2/5.3.

albertdbio and others added 3 commits June 4, 2026 12:44
Adds a `consensus` tool that spawns N voters (default 3) from different model
families (Anthropic / OpenAI / Google / open-source) in parallel, gives each the
same question, and returns their positions for the calling agent to synthesize.
Restricted to the main agent; subagents cannot invoke it.

- src/features/consensus: engine, voter-spawner (single-shot framing, disabled
  delegation tools, configurable reasoning effort), voter-resolver (live-provider
  resolution with stale-cache fallback), types.
- src/tools/consensus: the main-agent-only consensus tool, gated on session role.
- src/shared/model-lineage: voter pool and lineage diversity selection.
- src/config/schema/consensus: config block (voter count, lineages, timeout,
  reasoning effort), composed into the root config schema.
- Register the consensus tool in tool-registry, config-gated (default on).
- Promote consensus in all six Sisyphus prompt variants (structured section plus
  gpt-5-5 prose), mirroring the Oracle guidance section.

voter-spawner called ctx.client.session.prompt directly, bypassing the internal prompt dispatch gate and tripping the prompt-async-route audit. Route the voter prompt through dispatchInternalPrompt (mode: sync, queueBehavior: defer), matching the call-omo-agent sync executor, and handle accepted/ambiguous/skipped dispatch results before polling.

Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent)

Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
@code-yeongyu code-yeongyu force-pushed the feat/consensus-tool branch from 619a131 to 3bf6946 Compare June 4, 2026 03:47

@cubic-dev-ai cubic-dev-ai Bot left a comment


0 issues found across 1 file (changes from recent commits).

Requires human review: While the implementation appears well-structured and the AI review found no issues, the addition of a new multi-lineage consensus subsystem spanning over 1100 lines across 15+ files (including changes to core agent prompts, config schema, tool registry, and session handling) introduces inherent...


@code-yeongyu code-yeongyu left a comment

Owner

Approved after maintainer fixup.

Verified the PR diff no longer introduces gpt-5.2/gpt-5.3 or hyphenated stale variants, the consensus fixes address the prior deprecated-model/requested-change concern, local targeted Codex cleanup test and full test:codex passed, and CI/build/Codex compatibility are green on f55f41c. Cubic also reports 0 issues on the current head.

@code-yeongyu code-yeongyu merged commit 43c3a49 into code-yeongyu:dev Jun 4, 2026
18 of 19 checks passed