Skip to content

fix: simplify governance sync scheduling#1590

Closed
TheLastCicada wants to merge 2 commits into
developfrom
fix/governance-sync-v2-rc2
Closed

fix: simplify governance sync scheduling#1590
TheLastCicada wants to merge 2 commits into
developfrom
fix/governance-sync-v2-rc2

Conversation

@TheLastCicada

@TheLastCicada TheLastCicada commented Apr 20, 2026

Copy link
Copy Markdown
Contributor

Summary

  • make the scheduler the single owner of governance sync for both V1 and V2, removing nested default-org sync triggers and recursive governance retries
  • add a bootstrap cadence with bounded scheduler imports, then fall back to a 30-minute steady-state cadence once default-org governance data is present
  • harden fire-and-forget sync endpoints, align config/docs, and add focused V1/V2 integration coverage for cadence selection and V2 in-flight sync deduplication

Testing

  • eval \"$(fnm env)\" && fnm use && bash tests/run-tests.sh npx cross-env NODE_ENV=test USE_SIMULATOR=true CW_PORT=31311 mocha --loader node_modules/extensionless/src/register.js tests/v2/integration/enable-disable-versions.spec.js tests/v2/integration/sync-default-organizations-v2.spec.js tests/v2/integration/governance-v2.spec.js --reporter spec --exit --timeout 300000

Note

Medium Risk
Changes governance sync execution and timing for both V1 and V2, which can affect startup behavior and data freshness if cadence selection/promotion is wrong. Risk is mitigated by added integration tests and by keeping other sync tasks unchanged.

Overview
Centralizes governance sync ownership in the scheduler for both V1 and V2, replacing the previous long-interval job and removing ad-hoc governance sync triggers from default-organization sync tasks.

Adds a two-phase governance sync cadence: a fast bootstrap interval (GOVERNANCE_SYNC_BOOTSTRAP_TASK_INTERVAL, default 300s) until local orgList governance data exists, then automatically promotes to the steady-state interval (GOVERNANCE_SYNC_TASK_INTERVAL, default 1800s). Governance models now expose hasLocalGovernanceData() and dedupe concurrent syncs via an in-flight _syncPromise instead of recursive retry loops.

Hardens /governance/sync and /v2/governance/sync to be fire-and-forget with error logging, adds bounded-wait options to org import/subscription paths for scheduler-driven retries, updates config defaults/docs/docker examples, and expands integration tests to cover cadence selection and V2 concurrent sync deduplication.

Reviewed by Cursor Bugbot for commit 0342980. Bugbot is set up for automated code reviews on this repo. Configure here.

Move governance sync ownership into the scheduler so startup retries are
centralized instead of layered across tasks, loaders, and model methods.

Keep the sync path cheaper and safer by deduplicating in-flight work,
falling back to bounded scheduler imports, and promoting to a slower
steady-state cadence after bootstrap.
Comment thread src/tasks/sync-governance-body.js Outdated
Comment thread src/tasks/index.js
Remove unused default exports from sync-governance-body.js and
sync-governance-body-v2.js that instantiated dead SimpleIntervalJob
objects at module load time. Add schedulerStopped guard to prevent
promotion timer callbacks from creating orphan jobs after stopAll.

@cursor cursor Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cursor Bugbot has reviewed your changes and found 1 potential issue.

Fix All in Cursor

❌ Bugbot Autofix is OFF. To automatically fix reported issues with cloud agents, enable autofix in the Cursor dashboard.

Reviewed by Cursor Bugbot for commit 0342980. Configure here.

} seconds`,
error,
);
}

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Bootstrap cadence never promotes to steady-state without governance

Medium Severity

When no GOVERNANCE_BODY_ID is configured (or the node is the governance body itself), shouldSyncGovernance() returns false, sync is skipped, and hasLocalGovernanceData() always returns false since no governance data is ever written. The onBootstrapComplete callback never fires, so the job permanently stays on the 300-second bootstrap cadence instead of promoting to the 1800-second steady-state cadence. The old code used a single fixed interval. Nodes in this configuration now make unnecessary DataLayer/wallet RPC calls and a DB query every 5 minutes, forever.

Additional Locations (1)
Fix in Cursor Fix in Web

Reviewed by Cursor Bugbot for commit 0342980. Configure here.

Base automatically changed from v2-rc2 to develop April 23, 2026 18:01
TheLastCicada added a commit that referenced this pull request Apr 23, 2026
Adopt the orthogonal pieces of #1590 (which supersedes that PR's governance
sync rewrite with #1600's simpler single-attempt-per-run design while
keeping the #1600 5-minute cadence).

- tasks/sync-default-organizations{,-v2}.js: remove the nested
  Governance.sync() trigger. Governance sync is now owned exclusively by
  sync-governance-body{,-v2}.js on its scheduler cadence; this task just
  reads whatever the local DB has and no-ops if orgList hasn't landed
  yet. Fixes a race where the task could block on governance sync before
  doing its own work.
- utils/data-loaders{,-v2}.js: drop the 50-retry recursive wait loop in
  getDefaultOrganizationList{,V2}. On a fresh install where governance
  data hasn't been synced, return [] instead of retrying internally; the
  scheduler will re-run us on cadence. Also move the USE_SIMULATOR /
  USE_DEVELOPMENT_MODE destructure into the functions so config reloads
  aren't stale.
- controllers/governance{,-v2}.controller.js: harden POST
  /governance/sync and POST /v2/governance/sync as fire-and-forget. The
  model sync() methods already return cleanly on any failure (subscribe-
  first pattern, scheduler retries on cadence); attach .catch defensively
  so future exceptions are logged instead of becoming unhandled
  rejections.

Does NOT adopt #1590's two-tier bootstrap/steady cadence (keeping #1600's
flat 300s default), the src/tasks/index.js scheduler refactor (not needed
without the promotion machinery), or the importOrganization
{ allowLongWait } option (that belongs to #1603).

Closes #1590 by folding in its compatible pieces.
@TheLastCicada

Copy link
Copy Markdown
Contributor Author

Closing in favor of #1600, which takes a simpler and more defensive approach to the same problem:

  • Single 300s cadence (no two-tier bootstrap/steady promotion machinery).
  • subscribe-first pattern before getDataLayerStoreSyncStatus, which fixes the permanent-skip-loop failure mode Bugbot flagged here on commit 0342980 (and the equivalent "no GOVERNANCE_BODY_ID" edge case).
  • { skipOnUnsynced } option on reconcileOrganization with the default throwing to protect the POST /organizations/resync data-destruction path from silently succeeding on a transient DL hiccup.

The orthogonal pieces of this PR (scheduler-as-single-owner / nested-trigger removal, fire-and-forget governance controllers, retry-loop cleanup in data-loaders*.js) are folded into #1600 in d98198f.

Dropped intentionally: the two-tier cadence, the src/tasks/index.js scheduler refactor (infrastructure for the discarded promotion machinery), and the importOrganization { allowLongWait } changes (those belong to #1603).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant