fix: /api/v1/server?mode=cluster returns no ha section after Raft migration (#4261)#4266
Conversation
…ration (#4261) After the legacy HAServer was replaced by RaftHAPlugin (#3731), the ?mode=cluster endpoint stopped returning the ha section for clusters that worked under the old implementation. Three issues fixed: - PluginManager.isHAEnabled() and RaftHAPlugin.isRaftEnabled() now also return true when HA_SERVER_LIST is non-empty. Previously only HA_ENABLED was checked, so users who configured a server list but didn't know about the new opt-in flag got no HA plugin started. - GetServerHandler.exportCluster() no longer forwards to the leader over HTTP to obtain replica info. The old HA needed it because followers lacked the full peer list; with Raft, every node has the complete peer list from the group config. The forwarding also crashed with MalformedURLException when getLeaderAddress() returned null during leader election. - QueryEngineManager.getAvailableLanguages() returned "opencypher" twice because the "cypher" and "opencypher" keys point at the same factory instance. Returning implementations.keySet() preserves the distinct language aliases and eliminates the duplicate. RaftHTTP2ServersIT.serverInfo() now asserts the ha section structure to prevent the regression from going unnoticed again.
Up to standards ✅🟢 Issues
|
| Metric | Results |
|---|---|
| Complexity | 0 |
🟢 Coverage 100.00% diff coverage · -7.48% coverage variation
Metric Results Coverage variation ✅ -7.48% coverage variation Diff coverage ✅ 100.00% diff coverage Coverage variation details
Coverable lines Covered lines Coverage Common ancestor commit (9f73fdf) 127293 93874 73.75% Head commit (1ecc2f5) 158944 (+31651) 105325 (+11451) 66.27% (-7.48%) Coverage variation is the difference between the coverage for the head and common ancestor commits of the pull request branch:
<coverage of head commit> - <coverage of common ancestor commit>Diff coverage details
Coverable lines Covered lines Diff coverage Pull request (#4266) 9 9 100.00% Diff coverage is the percentage of lines that are covered by tests out of the coverable lines that the pull request added or modified:
<covered lines added or modified>/<coverable lines added or modified> * 100%
NEW Get contextual insights on your PRs based on Codacy's metrics, along with PR and Jira context, without leaving GitHub. Enable AI reviewer
TIP This summary will be updated as you push new changes.
Code ReviewOverall this is a clean, well-reasoned fix. The PR description clearly explains each change and the root causes are correctly identified. A few observations below. QueryEngineManager -
|
| Area | Severity | Action |
|---|---|---|
Stale Javadoc on RaftHAPlugin |
Minor | Update comment |
isEmpty() vs isBlank() |
Minor | Consider isBlank() |
| Duplicated HA-enable logic | Minor | Follow-up refactor to GlobalConfiguration |
getAvailableLanguages() return order |
Negligible | Document or use TreeSet if stable order needed |
| Missing unit test for language dedup | Minor | Add test |
Test: assert replicas non-empty |
Minor | Strengthen existing assertion |
The core fixes (implicit HA via HA_SERVER_LIST, removal of crash-prone follower forwarding, and language dedup) are all correct. Ready to merge after addressing the stale comment at minimum.
There was a problem hiding this comment.
Code Review
This pull request refactors the high availability (HA) detection logic to consider the server list configuration and simplifies the retrieval of available query languages. It also removes redundant leader-polling logic in the server handler and enhances integration tests to verify HA status in the server response. I have no feedback to provide.
- Centralize the implicit HA-enabled check in ContextConfiguration.isHAImplicitlyEnabled() so PluginManager (server) and RaftHAPlugin (ha-raft) share one source of truth and cannot drift if the detection logic changes again. - Use isBlank() instead of isEmpty() for HA_SERVER_LIST so accidental whitespace-only values do not silently enable HA. validateConfiguration() updated for coherence. - Refresh RaftHAPlugin class Javadoc to mention both triggers (HA_ENABLED=true OR non-blank HA_SERVER_LIST). - Switch QueryEngineManager.implementations to LinkedHashMap so getAvailableLanguages() returns aliases in insertion order rather than the non-deterministic HashMap iteration order callers had been getting. - New QueryEngineManagerLanguagesTest pins three contracts: no duplicates, built-in engines always present, and both 'cypher' and 'opencypher' aliases surface when OpenCypher is on the classpath. - Strengthen RaftHTTP2ServersIT.serverInfo() to assert the replicas array is non-empty (previously only existence was checked).
|
Review items addressed in f28f735:
On the suggestion to assert the test exercises the implicit-enable path: Local results: |
Code Review - PR #4266Good bug fix overall. The root cause analysis in the PR description is clear and the three separate fixes are well-scoped. Some observations below. Correctness
Follower forwarding removal - this is the right call. The old
Potential Issues
Style (per CLAUDE.md)Multi-line Javadoc on // Non-blank HA_SERVER_LIST implies HA intent even without HA_ENABLED=true
public boolean isHAImplicitlyEnabled() {Multi-line comments in Inline comment in Test Coverage
The one gap noted above (follower vs leader in Minor
Summary: The fix is correct and well-tested. The main actionable items are: (1) verify |
Three cases pinned: unset HA_SERVER_LIST → false, blank/whitespace-only → false, configured value → true. This is the boolean that gates the implicit-enable path; testing it directly removes the need to spin up a Raft server in tests just to exercise the discovery logic.
Code ReviewOverviewThis PR fixes three distinct bugs introduced after the
The root causes are correctly identified and the fixes are targeted and clean. Positive Aspects
Concerns and Suggestions1. Leaky abstraction in the engine module's Javadoc (minor)
The engine module should not know about Raft. The Javadoc describes a configuration convention; the detail about which plugin acts on it belongs in /**
* Returns {@code true} when HA is implicitly opted-in by a non-blank {@code HA_SERVER_LIST}.
* Callers (PluginManager, HA plugins) combine this with an explicit {@code HA_ENABLED=true}
* to decide whether to activate HA. Kept here as the single source of truth.
*/2. The old contract was "return each factory's canonical 3. Follower-awareness assumption in test comment The added comment in // With Raft, every node knows all peers from the group config, so the replicas array
// is populated even on followers without the old leader-forwarding round trip.This is a strong architectural claim. A pointer to the class/method in the Raft implementation that guarantees this (where the peer list is populated on followers) would help future maintainers verify it has not regressed. 4. Null check in
public boolean isHAImplicitlyEnabled() {
return !getValueAsString(GlobalConfiguration.HA_SERVER_LIST).isBlank();
}Keeping it is fine for defensive programming - just a cosmetic point. 5. Vacuous test guard (nit) if (languages.contains("opencypher"))
assertThat(languages).contains("cypher");If OpenCypher is absent from the classpath, the test passes without asserting anything. Consider adding a short comment explaining this is intentional for engine-only test runs without the opencypher dependency on the classpath. SummaryThe fix is correct, well-scoped, and the regression tests are solid. None of the concerns above block merging. The most actionable improvement is the Javadoc wording in point 1 - keeping Raft-specific language out of the |
Codecov Report❌ Patch coverage is Additional details and impacted files@@ Coverage Diff @@
## main #4266 +/- ##
==========================================
- Coverage 64.67% 64.67% -0.01%
==========================================
Files 1644 1644
Lines 127293 127266 -27
Branches 27277 27277
==========================================
- Hits 82328 82310 -18
+ Misses 33447 33433 -14
- Partials 11518 11523 +5 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
…r runs Before this PR's implicit-enable change, BaseGraphServerTest set HA_SERVER_LIST unconditionally and gated HA on HA_ENABLED=true. That was a no-op for single-server tests because the discovery logic only looked at HA_ENABLED. Once ContextConfiguration.isHAImplicitlyEnabled() started treating a non-blank HA_SERVER_LIST as an HA opt-in (the user-facing fix in this PR), every single-server test inheriting BaseGraphServerTest accidentally auto-started RaftHAPlugin, with a one-peer server list. Writes against a one-node Raft group time out / fail with "Group commit entry failed", surfacing as the gRPC test failures CI flagged on ArcadeDbGrpcServiceExtendedTest, GrpcServerIT, and Issue4214InsertStreamConflictIgnoreIT. The fix is to set HA_SERVER_LIST and HA_ENABLED only when getServerCount() > 1 so single-server tests behave as they did before the implicit-enable change.
Code Review - PR #4266The PR correctly identifies and fixes three independent regressions introduced after the CorrectnessFollower replica list (GetServerHandler) - Verified. Implicit HA enablement - Placing the logic in
Security improvement - The removed follower-forwarding code was forwarding the client's raw Suggestions1. Test only exercises one server endpoint for the HA assertion
2. Conditional assertion in if (languages.contains("opencypher"))
assertThat(languages).contains("cypher");The engine module always has OpenCypher on its classpath (it lives in assertThat(languages).contains("opencypher", "cypher");3. Minor: These are fine as-is per existing project style. One AssertJ alternative that gives a cleaner failure message: assertThat(ha.keySet()).contains("clusterName", "leader", "network");Not a blocker. BaseGraphServerTest fixThe guard added around SummaryThe core fixes are correct, safe, and well-tested. The main item worth adding before merge is a follower-side assertion in |
The 'opencypher' factory lives in engine/src/main and is always on the
classpath of this test, so the 'if (languages.contains("opencypher"))'
guard could only ever hide a regression where OpenCypher stopped being
registered. Asserting on both aliases unconditionally turns that into
a hard failure.
|
Items 2 and 3 addressed, item 1 is a misread of the test. 1. Follower-side assertion - already covered. 2. Conditional Cypher assertion - fixed in 1ecc2f5. The 3. Local: |
|
Code Review - PR #4266 This is a clean, well-scoped fix for three distinct but related regressions. The PR description is thorough and the root-cause analysis is correct. Overview The PR fixes three distinct issues:
Strengths
Issues 1. Multi-line comment blocks violate project style CLAUDE.md says: "Never write multi-paragraph docstrings or multi-line comment blocks - one short line max." The new Javadoc on 2. Inline comments inside test methods explain WHAT, not WHY The comment in 3. The old code returned 4. No integration test for the implicit-enable path itself The regression being fixed (HA_SERVER_LIST set but HA_ENABLED not explicitly true) is not exercised by any of the updated integration tests - BaseGraphServerTest now explicitly sets HA_ENABLED=true when getServerCount() > 1. The unit test in ContextConfigurationTest covers isHAImplicitlyEnabled() in isolation, but there is no end-to-end test that boots a 2-server cluster relying solely on HA_SERVER_LIST without HA_ENABLED=true. Consider adding at least one such test, or noting that this path is covered by the manual test plan. 5. ContextConfiguration coupling (minor) Placing HA-specific detection logic in ContextConfiguration (engine module) couples a low-level component to HA concepts. It is justified here as the single source of truth between server and ha-raft, and ContextConfiguration already owns HA_ENABLED/HA_SERVER_LIST values. Just worth keeping in mind if HA detection logic grows more complex over time. Summary The core logic is correct and the test coverage is good. The main actionable items are: trim the multi-line comment blocks to conform to project style, and consider whether the implicit-enable path needs an end-to-end integration test. Neither is a blocker. |
|
Hi there Command I used was:
|
Summary
Fixes #4261. After the legacy
HAServerwas replaced byRaftHAPlugin(#3731), the/api/v1/server?mode=clusterendpoint stopped returning thehasection for HA clusters.Root cause (missing
hasection): BothPluginManager.isHAEnabled()andRaftHAPlugin.isRaftEnabled()only checkedHA_ENABLED(defaultfalse). Users who configuredHA_SERVER_LISTbut didn't explicitly setHA_ENABLED=truegot no HA plugin started, soserver.getHA()returnednulland thehasection was omitted. A non-emptyHA_SERVER_LISTnow implicitly enables HA in both places.Follower forwarding removed:
GetServerHandler.exportCluster()used to call back to the leader over HTTP to fetch the replica list. The old HAServer needed this because followers lacked the full peer list; with Raft, every node already has the complete peer list from the group config. The forwarding also crashed withMalformedURLExceptionwhengetLeaderAddress()returnednullduring leader election.Duplicate
opencypherin languages:QueryEngineManager.getAvailableLanguages()iteratedimplementations.values().getLanguage(), but"cypher"and"opencypher"point at the sameOpenCypherQueryEngineFactoryinstance, sogetLanguage()returned"opencypher"twice. Returningimplementations.keySet()preserves distinct aliases and removes the duplicate.Test plan
RaftHTTP2ServersIT.serverInfo()extended with assertions onha.clusterName,ha.leader,ha.network, andha.network.replicasso the regression cannot return undetectedRaftHTTP2ServersIT(5/5) andRaftHAPluginTest(4/4) pass locallyHTTPDocumentIT#serverClusterInfo(single-server, no HA) still passes -hacorrectly absentservermodule tests pass (81/81)