Skip to content

Skip certain metric assertions on Windows#144933

Merged
mamazzol merged 6 commits intoelastic:mainfrom
mamazzol:ci-144167
Mar 31, 2026
Merged

Skip certain metric assertions on Windows#144933
mamazzol merged 6 commits intoelastic:mainfrom
mamazzol:ci-144167

Conversation

@mamazzol
Copy link
Copy Markdown
Contributor

@mamazzol mamazzol commented Mar 25, 2026

Make sure we don't check for metrics that are not enabled on Windows.

Closes: #144166
Closes: #144167
Closes: #144282
Closes: #144976
Closes: #145152

@elasticsearchmachine elasticsearchmachine added needs:triage Requires assignment of a team area label v9.4.0 labels Mar 25, 2026
@mamazzol mamazzol added Team:Core/Infra Meta label for core/infra team :Core/Infra/Metrics Metrics and metering infrastructure >test Issues or PRs that are addressing/adding tests and removed needs:triage Requires assignment of a team area label labels Mar 25, 2026
@elasticsearchmachine
Copy link
Copy Markdown
Collaborator

Pinging @elastic/es-core-infra (Team:Core/Infra)

@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai bot commented Mar 25, 2026

Caution

Review failed

Failed to post review comments

📝 Walkthrough

Walkthrough

This pull request updates infrastructure and vector operations across multiple areas. Pipeline configurations update GCP compute resources to use newer machine types (n4-standard/n4-custom variants) with hyperdisk-balanced storage. Gradle wrapper and native library versions are bumped to 9.4.1 and 1.0.70 respectively. Test infrastructure adds BFloat16 vector scoring implementations with accompanying benchmarks and test utilities. Several APM integration test cases are muted due to metric collection failures. Documentation expands with new ESQL functions (SPARKLINE), updated type support tables for histogram types transitioning to GA, and property support for flattened fields. Native C++ code adds BFloat16 vector operations and sparse bulk scoring primitives. Java libraries extend vector scoring with BFloat16 capability, consolidate heap-segment feature detection, and reorganize scorer implementations for cleaner architecture.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
⚔️ Resolve merge conflicts
  • Resolve merge conflict in branch ci-144167
  • 🛠️ Update Documentation: Commit on current branch
  • 🛠️ Update Documentation: Create PR

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In
`@server/src/test/java/org/elasticsearch/monitor/metrics/SystemMetricsTests.java`:
- Around line 52-64: The test currently uses a single fdSupported flag; instead
gate each assertion on its respective probe separately: compute boolean
fdUsedSupported = ProcessProbe.getOpenFileDescriptorCount() >= 0 and boolean
fdMaxSupported = ProcessProbe.getMaxFileDescriptorCount() >= 0, then assert
"jvm.fd.used" and "jvm.file_descriptor.count" against fdUsedSupported (and
fdUsedSupported && emitOTelMetrics for the OTel-name check), and assert
"jvm.fd.max" and "jvm.file_descriptor.limit" against fdMaxSupported (and
fdMaxSupported && emitOTelMetrics for the OTel-name check) so each metric is
validated independently of the other.

In
`@test/external-modules/apm-integration/src/javaRestTest/java/org/elasticsearch/test/apmintegration/MetricsApmIT.java`:
- Around line 78-103: The custom TestRule 'rule' must ensure
recordingApmServer.after() runs even if cluster.apply(...).evaluate() throws;
wrap the call to cluster.apply(s, description).evaluate() in an outer
try/finally that calls recordingApmServer.after() in the finally block, while
keeping the existing inner try/finally around base.evaluate() to preserve normal
shutdown ordering; update the lambda that creates the Statement so the outer
finally is the failure-path cleanup guard for recordingApmServer.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository YAML (base), Organization UI (inherited)

Review profile: CHILL

Plan: Pro

Run ID: aa9dc8ed-5016-4832-826a-c4a3d3bb64fa

📥 Commits

Reviewing files that changed from the base of the PR and between d0aea54 and 9b82fa0.

📒 Files selected for processing (3)
  • muted-tests.yml
  • server/src/test/java/org/elasticsearch/monitor/metrics/SystemMetricsTests.java
  • test/external-modules/apm-integration/src/javaRestTest/java/org/elasticsearch/test/apmintegration/MetricsApmIT.java
💤 Files with no reviewable changes (1)
  • muted-tests.yml

Comment on lines 52 to 64
boolean fdSupported = ProcessProbe.getOpenFileDescriptorCount() >= 0 && ProcessProbe.getMaxFileDescriptorCount() >= 0;
assertEquals("jvm.fd.used should be registered if supported", fdSupported, registeredGauges.contains("jvm.fd.used"));
assertEquals("jvm.fd.max should be registered if supported", fdSupported, registeredGauges.contains("jvm.fd.max"));
assertEquals(
"jvm.file_descriptor.count should be registered if emitting OTel metrics",
registeredGauges.contains("jvm.file_descriptor.count"),
emitOTelMetrics
fdSupported && emitOTelMetrics
);
assertEquals(
"jvm.file_descriptor.limit should be registered if emitting OTel metrics",
registeredGauges.contains("jvm.file_descriptor.limit"),
emitOTelMetrics
fdSupported && emitOTelMetrics
);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Gate each fd metric on its own probe result.

SystemMetrics registers jvm.fd.used/jvm.file_descriptor.count from getOpenFileDescriptorCount() and jvm.fd.max/jvm.file_descriptor.limit from getMaxFileDescriptorCount() independently. The new shared fdSupported boolean requires both probes to succeed, so this test can still fail or miss regressions when only one side returns -1.

Suggested fix
-            boolean fdSupported = ProcessProbe.getOpenFileDescriptorCount() >= 0 && ProcessProbe.getMaxFileDescriptorCount() >= 0;
-            assertEquals("jvm.fd.used should be registered if supported", fdSupported, registeredGauges.contains("jvm.fd.used"));
-            assertEquals("jvm.fd.max should be registered if supported", fdSupported, registeredGauges.contains("jvm.fd.max"));
+            boolean openFdSupported = ProcessProbe.getOpenFileDescriptorCount() >= 0;
+            boolean maxFdSupported = ProcessProbe.getMaxFileDescriptorCount() >= 0;
+            assertEquals("jvm.fd.used should be registered if supported", openFdSupported, registeredGauges.contains("jvm.fd.used"));
+            assertEquals("jvm.fd.max should be registered if supported", maxFdSupported, registeredGauges.contains("jvm.fd.max"));
             assertEquals(
                 "jvm.file_descriptor.count should be registered if emitting OTel metrics",
-                registeredGauges.contains("jvm.file_descriptor.count"),
-                fdSupported && emitOTelMetrics
+                openFdSupported && emitOTelMetrics,
+                registeredGauges.contains("jvm.file_descriptor.count")
             );
             assertEquals(
                 "jvm.file_descriptor.limit should be registered if emitting OTel metrics",
-                registeredGauges.contains("jvm.file_descriptor.limit"),
-                fdSupported && emitOTelMetrics
+                maxFdSupported && emitOTelMetrics,
+                registeredGauges.contains("jvm.file_descriptor.limit")
             );
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@server/src/test/java/org/elasticsearch/monitor/metrics/SystemMetricsTests.java`
around lines 52 - 64, The test currently uses a single fdSupported flag; instead
gate each assertion on its respective probe separately: compute boolean
fdUsedSupported = ProcessProbe.getOpenFileDescriptorCount() >= 0 and boolean
fdMaxSupported = ProcessProbe.getMaxFileDescriptorCount() >= 0, then assert
"jvm.fd.used" and "jvm.file_descriptor.count" against fdUsedSupported (and
fdUsedSupported && emitOTelMetrics for the OTel-name check), and assert
"jvm.fd.max" and "jvm.file_descriptor.limit" against fdMaxSupported (and
fdMaxSupported && emitOTelMetrics for the OTel-name check) so each metric is
validated independently of the other.

Comment on lines +78 to +103
@ClassRule
// Custom test rule which manually orders test fixtures. A RuleChain won't work here due to the specific order we require, which is:
// 1. Start the mock APM server
// 2. Start ES cluster
// 3. Run the test
// 4. Stop the mock APM server
// 5. Stop ES cluster
public static TestRule rule = (base, description) -> {
return new Statement() {
@Override
public void evaluate() throws Throwable {
recordingApmServer.before();
Statement s = new Statement() {
@Override
public void evaluate() throws Throwable {
try {
base.evaluate();
} finally {
recordingApmServer.after();
}
}
};
cluster.apply(s, description).evaluate();
}
};
};
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Add a fallback after() around the whole rule body.

If cluster.apply(...).evaluate() fails during cluster startup, the inner finally never runs because base.evaluate() is never entered. That leaves RecordingApmServer's server/thread alive for the rest of the suite. Since RecordingApmServer.after() is already idempotent, add an outer try/finally as a failure-path cleanup guard and keep the current inner finally to preserve the normal shutdown order.

Suggested fix
     public static TestRule rule = (base, description) -> {
         return new Statement() {
             `@Override`
             public void evaluate() throws Throwable {
-                recordingApmServer.before();
-                Statement s = new Statement() {
-                    `@Override`
-                    public void evaluate() throws Throwable {
-                        try {
-                            base.evaluate();
-                        } finally {
-                            recordingApmServer.after();
-                        }
-                    }
-                };
-                cluster.apply(s, description).evaluate();
+                try {
+                    recordingApmServer.before();
+                    Statement s = new Statement() {
+                        `@Override`
+                        public void evaluate() throws Throwable {
+                            try {
+                                base.evaluate();
+                            } finally {
+                                recordingApmServer.after();
+                            }
+                        }
+                    };
+                    cluster.apply(s, description).evaluate();
+                } finally {
+                    recordingApmServer.after();
+                }
             }
         };
     };
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
@ClassRule
// Custom test rule which manually orders test fixtures. A RuleChain won't work here due to the specific order we require, which is:
// 1. Start the mock APM server
// 2. Start ES cluster
// 3. Run the test
// 4. Stop the mock APM server
// 5. Stop ES cluster
public static TestRule rule = (base, description) -> {
return new Statement() {
@Override
public void evaluate() throws Throwable {
recordingApmServer.before();
Statement s = new Statement() {
@Override
public void evaluate() throws Throwable {
try {
base.evaluate();
} finally {
recordingApmServer.after();
}
}
};
cluster.apply(s, description).evaluate();
}
};
};
`@ClassRule`
// Custom test rule which manually orders test fixtures. A RuleChain won't work here due to the specific order we require, which is:
// 1. Start the mock APM server
// 2. Start ES cluster
// 3. Run the test
// 4. Stop the mock APM server
// 5. Stop ES cluster
public static TestRule rule = (base, description) -> {
return new Statement() {
`@Override`
public void evaluate() throws Throwable {
try {
recordingApmServer.before();
Statement s = new Statement() {
`@Override`
public void evaluate() throws Throwable {
try {
base.evaluate();
} finally {
recordingApmServer.after();
}
}
};
cluster.apply(s, description).evaluate();
} finally {
recordingApmServer.after();
}
}
};
};
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@test/external-modules/apm-integration/src/javaRestTest/java/org/elasticsearch/test/apmintegration/MetricsApmIT.java`
around lines 78 - 103, The custom TestRule 'rule' must ensure
recordingApmServer.after() runs even if cluster.apply(...).evaluate() throws;
wrap the call to cluster.apply(s, description).evaluate() in an outer
try/finally that calls recordingApmServer.after() in the finally block, while
keeping the existing inner try/finally around base.evaluate() to preserve normal
shutdown ordering; update the lambda that creates the Statement so the outer
finally is the failure-path cleanup guard for recordingApmServer.

@mamazzol mamazzol requested a review from a team March 25, 2026 14:56
// 3. Run the test
// 4. Stop the mock APM server
// 5. Stop ES cluster
public static TestRule rule = (base, description) -> {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we refactor this into a TestRule we can reuse in both places?

@prdoyle
Copy link
Copy Markdown
Contributor

prdoyle commented Mar 25, 2026

#143517 does alter how we do the class rules here, but not for the same reason as this PR.

Probably it makes sense to merge #143517 and then re-do these changes (which are relatively simple) on top of that.

@mamazzol
Copy link
Copy Markdown
Contributor Author

I'll hold off on this so it goes after #143517 if it's still necessary

if (Constants.WINDOWS) {
// JVM file descriptor metrics are not emitted on Windows.
valueAssertions.remove("jvm.fd.used");
valueAssertions.remove("jvm.fd.max");
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This might fix #144976.

@mamazzol mamazzol requested review from a team as code owners March 30, 2026 16:51
Gate SystemMetricsTests FD gauge expectations on ProcessProbe support and
fix assertEquals expected/actual order for OTel FD metrics.

Skip jvm.fd.used / jvm.fd.max assertions on Windows in AbstractMetricsIT.

Remove stale MetricsApmIT mute entries (superseded by ApmAgentMetricsIT
and OtelMetricsIT).

Closes elastic#144166
Closes elastic#144167
Closes elastic#144282

Made-with: Cursor
@elastic elastic deleted a comment from cla-checker-service bot Mar 30, 2026
@shainaraskas shainaraskas removed request for a team March 30, 2026 18:13
@mamazzol mamazzol changed the title Fix ordering between cluster and recording APM Agent in tests Skip certain metric assertion son Windows Mar 31, 2026
@mamazzol mamazzol changed the title Skip certain metric assertion son Windows Skip certain metric assertions on Windows Mar 31, 2026
Copy link
Copy Markdown
Contributor

@prdoyle prdoyle left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One question, but I'll just approve because this is a pretty low-consequence PR since it's only fixing tests.

)
);

if (Constants.WINDOWS) {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why does this check for Windows, while the other one checks ProcessProbe? I would have expected both of them to check the same thing.

@mamazzol mamazzol merged commit b16f84c into elastic:main Mar 31, 2026
35 checks passed
szybia added a commit to szybia/elasticsearch that referenced this pull request Mar 31, 2026
…rics

* upstream/main: (21 commits)
  Mute org.elasticsearch.xpack.esql.qa.mixed.MixedClusterEsqlSpecIT test {csv-spec:external-basic.topSnippetsFunction} elastic#145353
  Mute org.elasticsearch.xpack.esql.qa.mixed.MixedClusterEsqlSpecIT test {csv-spec:external-basic.scoreFunction} elastic#145352
  [DiskBBQ] Fix bug in NeighborQueue#popRawAndAddRaw (elastic#145324)
  Fix dense_vector default index options when using BFLOAT16 (elastic#145202)
  Use checked exceptions in entitlement constructor rules (elastic#145234)
  ESQL: DS: datasource file plugins should not return TEXT types (elastic#145334)
  Plumb DLM error store through to DlmFrozenTransition classes (elastic#145243)
  Make Settings.Builder.remove() fluent (elastic#145294)
  Add FLS tests for METRICS_INFO and TS_INFO (elastic#145211)
  Fix flaky SecurityFeatureResetTests (elastic#145063)
  [DOCS] Fix conflict markers in ESQL processing command list (elastic#145338)
  Skip certain metric assertions on Windows (elastic#144933)
  [ES|QL] Add schema reconciliation for multi-file external sources (elastic#145220)
  Simplify DiskBBQ dynamic visit ratio to linear (elastic#142784)
  ESQL: Disallow unmapped_fields=load with partial non-KEYWORD (elastic#144109)
  [Transform] Track Linked Projects (elastic#144399)
  Fix bulk scoring to process last batch instead of falling through to scalar tail (elastic#145316)
  Clean up TickerScheduleEngineTests (elastic#145303)
  [CI] ShardBulkInferenceActionFilterIT testRestart - Ensuring that secrets-inference index is available after full restart and unmuting test (elastic#145317)
  Add CRUD doc to the DistributedArchitectureGuide (elastic#144710)
  ...
seanzatzdev pushed a commit to seanzatzdev/elasticsearch that referenced this pull request Mar 31, 2026
seanzatzdev pushed a commit to seanzatzdev/elasticsearch that referenced this pull request Mar 31, 2026
ncordon pushed a commit to ncordon/elasticsearch that referenced this pull request Apr 1, 2026
mromaios pushed a commit to mromaios/elasticsearch that referenced this pull request Apr 9, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

:Core/Infra/Metrics Metrics and metering infrastructure Team:Core/Infra Meta label for core/infra team >test Issues or PRs that are addressing/adding tests v9.4.0

Projects

None yet

4 participants