Skip to content

Make warmup settings dynamic for pull-based ingestion#20936

Merged
varunbharadwaj merged 18 commits intoopensearch-project:mainfrom
kaustubhbutte17:feature/warmup-dynamic-settings
Mar 24, 2026
Merged

Make warmup settings dynamic for pull-based ingestion#20936
varunbharadwaj merged 18 commits intoopensearch-project:mainfrom
kaustubhbutte17:feature/warmup-dynamic-settings

Conversation

@kaustubhbutte17
Copy link
Copy Markdown
Contributor

@kaustubhbutte17 kaustubhbutte17 commented Mar 20, 2026

Description

Follow-up to #20526. Changes the two warmup settings from Final (immutable after index creation) to Dynamic (updatable at runtime via the update settings API).

This allows operators to adjust warmup behavior without recreating the index.

Settings Changed

Setting Before After
index.ingestion_source.warmup.timeout Final Dynamic
index.ingestion_source.warmup.lag_threshold Final Dynamic

Use Cases

  • Disable warmup mid-progress - emergency escape hatch if warmup is taking too long
  • Adjust lag threshold - tune without recreating the index
  • Change timeout - increase/decrease without recreating the index

Dynamic Update Behavior

Scenario Behavior
Disable warmup while in progress (timeout -> -1) Warmup completes immediately, shard starts serving
Change threshold while warmup in progress New threshold takes effect on next check cycle
Change timeout while warmup in progress New timeout takes effect on next check cycle
Enable warmup after shard is already STARTED Does NOT re-trigger warmup (would be disruptive)

Settings Propagation Path

Update Settings API -> IngestionEngine.registerDynamicIndexSettingsHandlers() -> DefaultStreamPoller.updateWarmupConfig()

Testing

  • testUpdateWarmupConfigDisableWhileInProgress - disabling warmup via config update
  • testUpdateWarmupConfigDisableWithRunningPoller - disabling mid-warmup with running poller, verifies updateWarmupStatus disabled branch
  • testUpdateWarmupConfigThresholdAndTimeoutWhileInProgress - threshold and timeout update during warmup
  • testUpdateWarmupConfigDoesNotReEnableAfterCompletion - no re-trigger after shard is serving
  • testWarmupSkippedWhenPollerStartsInPausedState - paused poller skips warmup
  • testDynamicWarmupSettingsUpdate (IT) - end-to-end dynamic settings update via API

@github-actions
Copy link
Copy Markdown
Contributor

github-actions bot commented Mar 20, 2026

PR Reviewer Guide 🔍

(Review updated until commit 9dcf919)

Here are some key observations to aid the review process:

🧪 PR contains tests
🔒 No security concerns identified
✅ No TODO sections
🔀 Multiple PR themes

Sub-PR theme: Core dynamic warmup settings implementation

Relevant files:

  • server/src/main/java/org/opensearch/cluster/metadata/IndexMetadata.java
  • server/src/main/java/org/opensearch/index/engine/IngestionEngine.java
  • server/src/main/java/org/opensearch/indices/pollingingest/DefaultStreamPoller.java
  • server/src/main/java/org/opensearch/indices/pollingingest/StreamPoller.java

Sub-PR theme: Tests for dynamic warmup settings

Relevant files:

  • server/src/test/java/org/opensearch/indices/pollingingest/DefaultStreamPollerTests.java
  • plugins/ingestion-kafka/src/internalClusterTest/java/org/opensearch/plugin/kafka/IngestFromKafkaIT.java

⚡ Recommended focus areas for review

Warmup Guard Removed

The polling loop condition changed from if (!warmupComplete && warmupConfig.isEnabled()) to just if (!warmupComplete). This means updateWarmupStatus() is now called on every poll cycle even when warmup is disabled at startup (before any dynamic update). When warmup is disabled, updateWarmupStatus() immediately sets warmupComplete = true and counts down the latch — which is the intended behavior — but it also means the latch is counted down inside the poll loop rather than at initialization. If awaitWarmupComplete is called before the poll loop runs even once, it will block until the first poll iteration. Previously, with warmupConfig.isEnabled() guarding the call, a disabled warmup would never enter updateWarmupStatus() at all, so the latch would never be counted down via that path. This is a behavioral change: callers of awaitWarmupComplete on a disabled-warmup poller now depend on the poll loop running at least once.

if (!warmupComplete) {
    updateWarmupStatus();
}
Race Condition on warmupConfig

warmupConfig is declared volatile, which ensures visibility of the reference but not atomicity of the read-then-use pattern in updateWarmupStatus(). If updateWarmupConfig() is called concurrently while updateWarmupStatus() is mid-execution (e.g., after reading warmupConfig.isEnabled() but before reading warmupConfig.lagThreshold()), the two reads could see different WarmupConfig instances. While WarmupConfig appears to be an immutable record, the volatile reference swap between two reads in the same method invocation is still a potential inconsistency. Assigning warmupConfig to a local variable at the start of updateWarmupStatus() would eliminate this.

if (paused || !warmupConfig.isEnabled()) {
    warmupComplete = true;
    warmupLatch.countDown();
    logger.info(
        "Warmup skipped for index {} shard {} - {}",
        indexName,
        shardId,
        paused ? "poller is paused" : "warmup is disabled"
    );
    return;
}
Misleading Test Assertion

In testUpdateWarmupConfigDisableWhileInProgress, the comment says "isWarmupComplete() checks warmupConfig.isEnabled(), so disabling makes it return true immediately". However, isWarmupComplete() returns the warmupComplete field, not a live check of warmupConfig.isEnabled(). The test passes only because the poller was never started, so warmupComplete remains false and the assertion assertTrue(warmupPoller.isWarmupComplete()) would actually fail unless isWarmupComplete() has special logic. If isWarmupComplete() does check warmupConfig.isEnabled() as a shortcut, this is undocumented behavior that could mislead future maintainers. The test should verify the actual contract clearly.

// isWarmupComplete() checks warmupConfig.isEnabled(), so disabling makes it return true immediately
assertTrue(warmupPoller.isWarmupComplete());

@github-actions
Copy link
Copy Markdown
Contributor

github-actions bot commented Mar 20, 2026

PR Code Suggestions ✨

Latest suggestions up to 9dcf919

Explore these optional code suggestions:

CategorySuggestion                                                                                                                                    Impact
Possible issue
Fix incorrect warmup completion assertion without starting poller

The test comment claims isWarmupComplete() checks warmupConfig.isEnabled(), but
looking at the production code, isWarmupComplete() likely returns the warmupComplete
volatile boolean field directly. Since the poller was never started and
updateWarmupStatus() was never called, warmupComplete would still be false. This
test assertion may be incorrect and could be testing a behavior that doesn't
actually exist, leading to a false-positive or test failure.

server/src/test/java/org/opensearch/indices/pollingingest/DefaultStreamPollerTests.java [1054-1055]

-// isWarmupComplete() checks warmupConfig.isEnabled(), so disabling makes it return true immediately
+// After disabling warmup, the poller loop will call updateWarmupStatus() which will
+// detect warmup is disabled and set warmupComplete=true. Since poller is not started,
+// we verify the config was updated but warmupComplete is still false until the loop runs.
+assertFalse(warmupPoller.isWarmupComplete());
+// Start poller and wait for warmup to complete via the poll loop
+warmupPoller.start();
+assertTrue("Warmup should complete after being dynamically disabled", warmupPoller.awaitWarmupComplete(5000));
 assertTrue(warmupPoller.isWarmupComplete());
Suggestion importance[1-10]: 7

__

Why: The test comment and assertion appear incorrect - isWarmupComplete() likely returns the warmupComplete boolean field directly, not checking warmupConfig.isEnabled(). Since the poller is never started in this test, warmupComplete would remain false, making the assertTrue assertion potentially wrong and the test unreliable.

Medium
Guard against null warmup config updates

The warmup check no longer guards against calling updateWarmupStatus() when warmup
is disabled, relying instead on the check inside updateWarmupStatus(). However,
since warmupConfig is now volatile and can be updated at any time, there's a
potential issue: if warmup was already complete (warmupComplete == true), the outer
check correctly skips the call. But if warmup was disabled via updateWarmupConfig
while warmupComplete is still false, the latch countdown happens inside
updateWarmupStatus() on the next iteration, which is correct. The real concern is
that warmupConfig could be set to null if updateWarmupConfig is called with a null
value, causing a NullPointerException. Add a null check in updateWarmupConfig.

server/src/main/java/org/opensearch/indices/pollingingest/DefaultStreamPoller.java [439-448]

-if (!warmupComplete) {
-    updateWarmupStatus();
+@Override
+public void updateWarmupConfig(IngestionSource.WarmupConfig newConfig) {
+    if (newConfig == null) {
+        throw new IllegalArgumentException("WarmupConfig must not be null");
+    }
+    this.warmupConfig = newConfig;
+    logger.info(
+        "Warmup config updated for index {} shard {}: timeout={}, lagThreshold={}",
+        indexName,
+        shardId,
+        newConfig.timeout(),
+        newConfig.lagThreshold()
+    );
 }
Suggestion importance[1-10]: 4

__

Why: Adding a null check in updateWarmupConfig is a reasonable defensive measure, but the method is only called from IngestionEngine.updateWarmupConfig which always constructs a new WarmupConfig object, making a null value unlikely in practice. The improvement is minor.

Low
General
Handle individual warmup setting updates independently

The addSettingsUpdateConsumer with two settings registers a bi-consumer that fires
only when both settings are updated together. If only one setting is updated (e.g.,
only lag_threshold), this handler may not be triggered, leaving the warmup config
partially stale. Consider registering separate consumers for each setting that read
the current value of the other setting from engineConfig.getIndexSettings().

server/src/main/java/org/opensearch/index/engine/IngestionEngine.java [493-497]

 private void updateWarmupConfig(TimeValue timeout, Long lagThreshold) {
     if (streamPoller != null) {
         streamPoller.updateWarmupConfig(new IngestionSource.WarmupConfig(timeout, lagThreshold));
     }
 }
 
+private void updateWarmupTimeout(TimeValue timeout) {
+    Long lagThreshold = engineConfig.getIndexSettings().getValue(IndexMetadata.INGESTION_SOURCE_WARMUP_LAG_THRESHOLD_SETTING);
+    updateWarmupConfig(timeout, lagThreshold);
+}
+
+private void updateWarmupLagThreshold(Long lagThreshold) {
+    TimeValue timeout = engineConfig.getIndexSettings().getValue(IndexMetadata.INGESTION_SOURCE_WARMUP_TIMEOUT_SETTING);
+    updateWarmupConfig(timeout, lagThreshold);
+}
+
Suggestion importance[1-10]: 7

__

Why: The addSettingsUpdateConsumer with two settings registers a bi-consumer that only fires when both settings are updated atomically. If a user updates only lag_threshold or only timeout, the handler won't be triggered, leaving the warmup config stale. The suggested fix of registering separate consumers for each setting is a valid approach to handle partial updates correctly.

Medium

Previous suggestions

Suggestions up to commit 678aa45
CategorySuggestion                                                                                                                                    Impact
Possible issue
Immediately complete warmup when dynamically disabled

When updateWarmupConfig is called with a disabled config (timeout=-1) after warmup
is already complete, the warmupComplete flag is true and the latch is already
counted down, so there is no issue. However, if warmup is already complete and a new
enabled config is set, the polling loop will never re-enter updateWarmupStatus()
because warmupComplete is true. But if warmup is in progress and the config is
updated to disabled, the latch countdown only happens on the next poll loop
iteration. To ensure immediate effect when disabling warmup, the latch should be
counted down directly in updateWarmupConfig when the new config is disabled and
warmup is not yet complete.

server/src/main/java/org/opensearch/indices/pollingingest/DefaultStreamPoller.java [438-448]

 @Override
 public void updateWarmupConfig(IngestionSource.WarmupConfig newConfig) {
     this.warmupConfig = newConfig;
     logger.info(
         "Warmup config updated for index {} shard {}: timeout={}, lagThreshold={}",
         indexName,
         shardId,
         newConfig.timeout(),
         newConfig.lagThreshold()
     );
+    if (!newConfig.isEnabled() && !warmupComplete) {
+        warmupComplete = true;
+        warmupLatch.countDown();
+        logger.info("Warmup immediately completed for index {} shard {} - warmup was dynamically disabled", indexName, shardId);
+    }
 }
Suggestion importance[1-10]: 6

__

Why: This is a valid improvement — when warmup is dynamically disabled via updateWarmupConfig, immediately counting down the latch ensures threads blocked on awaitWarmupComplete are unblocked without waiting for the next poll loop iteration, which is more responsive and correct behavior.

Low
General
Fix incorrect test assumption about warmup completion

The comment states that isWarmupComplete() checks warmupConfig.isEnabled(), but
looking at the production code, isWarmupComplete() returns warmupComplete field
directly (a boolean flag), not warmupConfig.isEnabled(). This test may be asserting
incorrect behavior — without starting the poller, updateWarmupStatus() is never
called, so warmupComplete remains false even after disabling the config. The test
assertion may pass only if isWarmupComplete() has special logic for disabled
configs, which should be verified.

server/src/test/java/org/opensearch/indices/pollingingest/DefaultStreamPollerTests.java [1026-1055]

 public void testUpdateWarmupConfigDisableWhileInProgress() throws Exception {
     ...
-    // isWarmupComplete() checks warmupConfig.isEnabled(), so disabling makes it return true immediately
+    // Dynamically disable warmup (timeout=-1) - should immediately complete warmup
+    IngestionSource.WarmupConfig disabledConfig = new IngestionSource.WarmupConfig(TimeValue.timeValueMillis(-1), 100L);
+    warmupPoller.updateWarmupConfig(disabledConfig);
+
+    // After disabling, warmup should be immediately marked complete
     assertTrue(warmupPoller.isWarmupComplete());
 
+    warmupPoller.close();
+}
+
Suggestion importance[1-10]: 5

__

Why: The suggestion raises a valid concern: if isWarmupComplete() returns the warmupComplete field directly and the poller hasn't started, disabling the config alone may not set warmupComplete=true. This could indicate a test that passes for the wrong reason, though the actual behavior depends on the isWarmupComplete() implementation not shown in the diff.

Low
Restore warmup enabled guard in polling loop

The warmup check no longer guards against calling updateWarmupStatus() when warmup
is disabled, relying instead on the check inside updateWarmupStatus(). However,
updateWarmupStatus() when warmup is disabled will set warmupComplete = true and call
warmupLatch.countDown() on every polling loop iteration until warmupComplete is set.
Since warmupComplete is only set inside updateWarmupStatus(), the outer guard should
be restored to also check warmupConfig.isEnabled() to avoid unnecessary method calls
after warmup is already skipped.

server/src/main/java/org/opensearch/indices/pollingingest/DefaultStreamPoller.java [240-242]

-if (!warmupComplete) {
+if (!warmupComplete && warmupConfig.isEnabled()) {
+    updateWarmupStatus();
+} else if (!warmupComplete && !warmupConfig.isEnabled()) {
     updateWarmupStatus();
 }
Suggestion importance[1-10]: 1

__

Why: The improved_code is logically equivalent to the existing code (both branches call updateWarmupStatus()), making this suggestion incorrect and unhelpful. The outer !warmupComplete guard already prevents repeated calls once warmup is done.

Low
Suggestions up to commit 775437a
CategorySuggestion                                                                                                                                    Impact
Possible issue
Prevent re-enabling warmup after it has completed

When updateWarmupConfig is called with a disabled config (timeout=-1) after warmup
is already complete, the warmupComplete flag is already true and the latch is
already counted down, so this is safe. However, if warmup is already complete and a
new enabled config is passed in, the warmupComplete flag remains true but
warmupConfig.isEnabled() now returns true, which could cause isWarmupComplete() to
return inconsistent results if it checks warmupConfig.isEnabled(). The method should
guard against re-enabling warmup once warmupComplete is already true.

server/src/main/java/org/opensearch/indices/pollingingest/DefaultStreamPoller.java [438-448]

 @Override
 public void updateWarmupConfig(IngestionSource.WarmupConfig newConfig) {
+    if (warmupComplete) {
+        logger.info(
+            "Warmup already complete for index {} shard {}, ignoring config update",
+            indexName,
+            shardId
+        );
+        return;
+    }
     this.warmupConfig = newConfig;
     logger.info(
         "Warmup config updated for index {} shard {}: timeout={}, lagThreshold={}",
         indexName,
         shardId,
         newConfig.timeout(),
         newConfig.lagThreshold()
     );
 }
Suggestion importance[1-10]: 6

__

Why: This is a valid concern - if updateWarmupConfig is called with an enabled config after warmupComplete is already true, it could cause inconsistency in isWarmupComplete() if that method checks warmupConfig.isEnabled(). The test testUpdateWarmupConfigDoesNotReEnableAfterCompletion validates this behavior, but the guard is not in the implementation itself.

Low
General
Guard latch countdown to prevent redundant calls

Calling warmupLatch.countDown() multiple times is safe (it won't go below zero), but
warmupComplete = true is set redundantly on every call when warmup is disabled. More
importantly, if updateWarmupConfig is called to disable warmup while the poller is
running, updateWarmupStatus() will be called on the next iteration and correctly
complete warmup. However, if warmup was previously enabled and warmupComplete is
already true, the outer if (!warmupComplete) guard prevents this code from running,
so there's no double-countDown issue. The logic is correct but could be made clearer
by checking warmupComplete before calling countDown().

server/src/main/java/org/opensearch/indices/pollingingest/DefaultStreamPoller.java [460-463]

 // Skip warmup if poller is paused or warmup is disabled
 if (paused || !warmupConfig.isEnabled()) {
     warmupComplete = true;
-    warmupLatch.countDown();
+    if (warmupLatch.getCount() > 0) {
+        warmupLatch.countDown();
+    }
Suggestion importance[1-10]: 2

__

Why: CountDownLatch.countDown() is already safe to call multiple times (it won't go below zero), and the outer if (!warmupComplete) guard in the polling loop prevents this code from being reached after warmupComplete is set to true. The suggestion adds unnecessary complexity for a non-issue.

Low
Avoid redundant warmup status checks after completion

The warmup check no longer guards against calling updateWarmupStatus() when warmup
is disabled, relying instead on the check inside updateWarmupStatus(). However,
updateWarmupStatus() when warmup is disabled will set warmupComplete = true and call
warmupLatch.countDown() on every iteration until warmupComplete is set. Since
warmupComplete is only set inside updateWarmupStatus(), the outer guard should be
restored to also check warmupConfig.isEnabled() to avoid unnecessary method calls
once warmup is disabled but before the flag is set.

server/src/main/java/org/opensearch/indices/pollingingest/DefaultStreamPoller.java [240-242]

-if (!warmupComplete) {
+if (!warmupComplete && warmupConfig.isEnabled()) {
+    updateWarmupStatus();
+} else if (!warmupComplete) {
     updateWarmupStatus();
 }
Suggestion importance[1-10]: 1

__

Why: The improved_code is logically equivalent to the existing code - both branches call updateWarmupStatus() when !warmupComplete is true, making the suggestion a no-op. The outer !warmupComplete guard already prevents redundant calls after completion.

Low
Suggestions up to commit 422d33c
CategorySuggestion                                                                                                                                    Impact
Possible issue
Fix incorrect assertion about warmup completion without running poller

The comment claims isWarmupComplete() checks warmupConfig.isEnabled(), but looking
at the production code, isWarmupComplete() returns the warmupComplete volatile field
directly. Disabling warmup via updateWarmupConfig only sets the new config — the
warmupComplete flag is only set to true inside updateWarmupStatus() which runs in
the poll loop. Since the poller is never started in this test, warmupComplete will
remain false, making this assertion incorrect.

server/src/test/java/org/opensearch/indices/pollingingest/DefaultStreamPollerTests.java [1054-1055]

-// isWarmupComplete() checks warmupConfig.isEnabled(), so disabling makes it return true immediately
-assertTrue(warmupPoller.isWarmupComplete());
+// updateWarmupConfig only updates the config; warmupComplete is set by the poll loop.
+// Without starting the poller, warmupComplete remains false.
+assertFalse(warmupPoller.isWarmupComplete());
Suggestion importance[1-10]: 7

__

Why: This is a valid and important catch. The test asserts assertTrue(warmupPoller.isWarmupComplete()) after calling updateWarmupConfig without starting the poller, but isWarmupComplete() returns the warmupComplete volatile field which is only set in the poll loop via updateWarmupStatus(). The assertion would fail at runtime, making this a real test correctness bug.

Medium
General
Guard warmup skip logic against redundant state changes

The warmupLatch.countDown() is called unconditionally even if warmupComplete is
already true and the latch has already been counted down. Since CountDownLatch
ignores extra countDown() calls when already at zero this is safe, but
warmupComplete = true is also set redundantly. More critically, if warmup was
previously enabled and in-progress, then the config is dynamically updated to
disable it, the latch will be counted down here — but only on the next poll loop
iteration. This is the intended behavior, but the warmupComplete flag should be
checked first to avoid redundant state changes.

server/src/main/java/org/opensearch/indices/pollingingest/DefaultStreamPoller.java [461-470]

 if (paused || !warmupConfig.isEnabled()) {
-    warmupComplete = true;
-    warmupLatch.countDown();
+    if (!warmupComplete) {
+        warmupComplete = true;
+        warmupLatch.countDown();
+        logger.info(
+            "Warmup skipped for index {} shard {} - {}",
+            indexName,
+            shardId,
+            paused ? "poller is paused" : "warmup is disabled"
+        );
+    }
+    return;
+}
Suggestion importance[1-10]: 4

__

Why: The suggestion is valid — adding a !warmupComplete guard avoids redundant assignments and log messages on every poll loop iteration when warmup is already done. However, since CountDownLatch safely handles extra countDown() calls and warmupComplete is a simple boolean write, this is a minor optimization rather than a critical fix.

Low
Confirm poll loop warmup check is consistent with new logic

The original guard warmupConfig.isEnabled() was removed from the poll loop check,
and the enabled check was moved inside updateWarmupStatus(). However,
updateWarmupStatus() now sets warmupComplete = true and counts down the latch when
warmup is disabled. This means if warmup starts enabled and later gets dynamically
disabled, the latch is correctly released. But the removal of the outer isEnabled()
guard means updateWarmupStatus() is called on every poll iteration until
warmupComplete is set — which is fine functionally but the original condition was
!warmupComplete && warmupConfig.isEnabled(). The new condition !warmupComplete is
correct since updateWarmupStatus handles the disabled case internally.

server/src/main/java/org/opensearch/indices/pollingingest/DefaultStreamPoller.java [240-242]

-if (!warmupComplete && warmupConfig.isEnabled()) {
+if (!warmupComplete) {
     updateWarmupStatus();
 }
Suggestion importance[1-10]: 2

__

Why: The suggestion's improved_code is identical to the existing_code already in the PR diff, so it adds no change. It merely confirms the existing implementation is correct, which warrants a very low score.

Low
Suggestions up to commit 714a8b5
CategorySuggestion                                                                                                                                    Impact
Possible issue
Fix test asserting warmup completion without starting poller

The test comment claims isWarmupComplete() checks warmupConfig.isEnabled(), but
looking at the code, isWarmupComplete() returns the warmupComplete field directly.
The poller was never started, so updateWarmupStatus() was never called to set
warmupComplete = true. This test may be asserting incorrect behavior — the
warmupComplete flag would still be false since the poller thread never ran. The test
should either start the poller and wait, or verify the actual behavior of
isWarmupComplete().

server/src/test/java/org/opensearch/indices/pollingingest/DefaultStreamPollerTests.java [1026-1055]

 public void testUpdateWarmupConfigDisableWhileInProgress() throws Exception {
-    ...
-    // isWarmupComplete() checks warmupConfig.isEnabled(), so disabling makes it return true immediately
+    // Create a poller with warmup enabled
+    IngestionSource.WarmupConfig enabledConfig = new IngestionSource.WarmupConfig(TimeValue.timeValueMinutes(5), 100L);
+    DefaultStreamPoller warmupPoller = new DefaultStreamPoller(
+        new FakeIngestionSource.FakeIngestionShardPointer(0),
+        fakeConsumerFactory,
+        "",
+        0,
+        partitionedBlockingQueueContainer,
+        StreamPoller.ResetState.NONE,
+        "",
+        errorStrategy,
+        StreamPoller.State.NONE,
+        1000,
+        1000,
+        10000,
+        indexSettings,
+        new DefaultIngestionMessageMapper(),
+        enabledConfig
+    );
+
+    // Warmup should not be complete yet
+    assertFalse(warmupPoller.isWarmupComplete());
+
+    // Dynamically disable warmup (timeout=-1)
+    IngestionSource.WarmupConfig disabledConfig = new IngestionSource.WarmupConfig(TimeValue.timeValueMillis(-1), 100L);
+    warmupPoller.updateWarmupConfig(disabledConfig);
+
+    // Start the poller so updateWarmupStatus() is called and warmupComplete is set
+    warmupPoller.start();
+
+    // Wait for warmup to complete after being disabled
+    assertTrue("Warmup should complete after being dynamically disabled", warmupPoller.awaitWarmupComplete(5000));
     assertTrue(warmupPoller.isWarmupComplete());
 
+    warmupPoller.close();
+}
+
Suggestion importance[1-10]: 7

__

Why: The suggestion raises a valid concern: isWarmupComplete() returns the warmupComplete field directly, and since the poller is never started in testUpdateWarmupConfigDisableWhileInProgress, updateWarmupStatus() is never called to set warmupComplete = true. The test assertion at line 1055 may be passing for the wrong reason or could be incorrect, making this a potentially significant test correctness issue.

Medium
General
Restore warmup enabled guard in polling loop

The warmup check no longer guards against calling updateWarmupStatus() when warmup
is disabled, relying instead on the check inside updateWarmupStatus(). However,
updateWarmupStatus() when warmup is disabled will set warmupComplete = true and call
warmupLatch.countDown() on every poll loop iteration until warmupComplete is set.
Since warmupComplete is only set inside updateWarmupStatus(), and the outer check
was removed, the first call will set it, but the condition should still
short-circuit. The original guard !warmupComplete is still present, so this is fine
— but if warmup is dynamically re-enabled after completion (warmupComplete=true),
the updateWarmupStatus() will never be called again. Consider restoring the
warmupConfig.isEnabled() check in the outer condition to avoid unnecessary calls
when warmup is disabled.

server/src/main/java/org/opensearch/indices/pollingingest/DefaultStreamPoller.java [240-242]

-if (!warmupComplete) {
+if (!warmupComplete && warmupConfig.isEnabled()) {
+    updateWarmupStatus();
+} else if (!warmupComplete) {
     updateWarmupStatus();
 }
Suggestion importance[1-10]: 2

__

Why: The suggestion's improved_code is logically equivalent to just if (!warmupComplete) { updateWarmupStatus(); } since both branches call updateWarmupStatus(). The actual concern about unnecessary calls is already handled by the !warmupComplete guard, and the re-enable scenario is intentionally unsupported per the test. The suggestion doesn't improve correctness.

Low
Suggestions up to commit 1256a3a
CategorySuggestion                                                                                                                                    Impact
General
Replace fragile sleep with reliable latch wait

Using Thread.sleep(2000) to wait for the poll loop to iterate is fragile and can
cause flaky tests in slow CI environments. Replace the fixed sleep with a polling
loop or use awaitWarmupComplete with a reasonable timeout to reliably wait for the
latch to be counted down.

server/src/test/java/org/opensearch/indices/pollingingest/DefaultStreamPollerTests.java [1090-1099]

 // Dynamically disable warmup - updateWarmupStatus will handle completion on next poll loop
 warmupPoller.updateWarmupConfig(new IngestionSource.WarmupConfig(TimeValue.timeValueMillis(-1), 100L));
 
-// isWarmupComplete() returns true immediately via config check, but we need to verify
-// the poll loop also calls updateWarmupStatus() which counts down the latch.
-// Wait briefly for the poll loop to iterate and hit the disabled branch.
-Thread.sleep(2000);
+// Wait for the poll loop to iterate and count down the latch via updateWarmupStatus
+assertTrue("Warmup latch should be counted down", warmupPoller.awaitWarmupComplete(5000));
 
-// Verify the latch was counted down by updateWarmupStatus (not just short-circuited by isWarmupComplete)
-assertTrue("Warmup latch should be counted down", warmupPoller.awaitWarmupComplete(0));
-
Suggestion importance[1-10]: 7

__

Why: Using Thread.sleep(2000) is a well-known source of flaky tests in CI environments. Replacing it with awaitWarmupComplete(5000) is a cleaner and more reliable approach that directly tests the intended behavior without an arbitrary fixed delay.

Medium
Guard against silently re-enabling completed warmup

When updateWarmupConfig is called with a disabled config (timeout=-1) after warmup
is already complete, the warmupComplete flag is true and the latch is already
counted down, so this is safe. However, if a new enabled config is set after warmup
was previously completed (re-enabling warmup), the warmupComplete flag remains true
and warmup will never re-run. The method should log a warning or explicitly
document/enforce that re-enabling warmup after completion has no effect, to prevent
silent misbehavior.

server/src/main/java/org/opensearch/indices/pollingingest/DefaultStreamPoller.java [438-448]

 @Override
 public void updateWarmupConfig(IngestionSource.WarmupConfig newConfig) {
+    if (warmupComplete && newConfig.isEnabled()) {
+        logger.warn(
+            "Warmup config update ignored for index {} shard {} - warmup already completed. Re-enabling warmup has no effect.",
+            indexName,
+            shardId
+        );
+        return;
+    }
     this.warmupConfig = newConfig;
     logger.info(
         "Warmup config updated for index {} shard {}: timeout={}, lagThreshold={}",
         indexName,
         shardId,
         newConfig.timeout(),
         newConfig.lagThreshold()
     );
 }
Suggestion importance[1-10]: 4

__

Why: The suggestion correctly identifies that re-enabling warmup after completion has no effect due to the warmupComplete flag, and adding an explicit warning log improves observability. The test testUpdateWarmupConfigDoesNotReEnableAfterCompletion already validates this behavior, but the warning log would make the silent no-op more visible to operators.

Low
Restore early exit when warmup is already complete

The warmup check no longer guards against calling updateWarmupStatus() when warmup
is disabled, since the warmupConfig.isEnabled() check was moved inside
updateWarmupStatus(). However, if warmupConfig is updated to a disabled state while
warmupComplete is already true, the latch has already been counted down and this is
fine. But if warmup was never started (disabled from the beginning),
updateWarmupStatus() will be called on every poll loop iteration unnecessarily until
it sets warmupComplete = true. Consider restoring the warmupConfig.isEnabled() guard
in the outer check to avoid redundant calls when warmup is disabled from the start.

server/src/main/java/org/opensearch/indices/pollingingest/DefaultStreamPoller.java [240-242]

-if (!warmupComplete) {
+if (!warmupComplete && warmupConfig.isEnabled()) {
+    updateWarmupStatus();
+} else if (!warmupComplete && !warmupConfig.isEnabled()) {
     updateWarmupStatus();
 }
Suggestion importance[1-10]: 1

__

Why: The improved_code is logically equivalent to the existing code - both branches call updateWarmupStatus() when !warmupComplete, making the suggestion a no-op. The concern about redundant calls is valid but minor since updateWarmupStatus() sets warmupComplete = true on the first call when disabled, preventing further calls.

Low

@github-actions
Copy link
Copy Markdown
Contributor

❌ Gradle check result for 97fffc2: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

@kaustubhbutte17 kaustubhbutte17 force-pushed the feature/warmup-dynamic-settings branch 2 times, most recently from d4f3eec to 397bc0b Compare March 20, 2026 04:52
@github-actions
Copy link
Copy Markdown
Contributor

Persistent review updated to latest commit 397bc0b

@github-actions
Copy link
Copy Markdown
Contributor

❌ Gradle check result for 397bc0b: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

@kaustubhbutte17 kaustubhbutte17 force-pushed the feature/warmup-dynamic-settings branch from 397bc0b to 2fdf911 Compare March 20, 2026 06:09
@github-actions
Copy link
Copy Markdown
Contributor

Persistent review updated to latest commit 2fdf911

Adds testUpdateWarmupConfigDisableWithRunningPoller which starts a poller
with high lag (warmup never completes on its own), then dynamically disables
warmup and verifies it completes via the updateWarmupStatus() disabled branch.

This covers the previously uncovered 'warmup is disabled' path in
updateWarmupStatus() and the updateWarmupConfig() method.

Signed-off-by: Kaustubh Butte <kaustubhbutte17@gmail.com>
- Combine testUpdateWarmupConfigThresholdWhileInProgress and
  testUpdateWarmupConfigTimeoutWhileInProgress into single test that
  updates both values (varunbharadwaj)
- Revert assertBusy in testWarmupPhase back to direct assertEquals
  since warmup guarantees docs are indexed before completion (varunbharadwaj)

Signed-off-by: Kaustubh Butte <kaustubhbutte17@gmail.com>
- Fix testWarmupPhase: check polled count metric (proves warmup processed
  all messages) then use waitForSearchableDocs (handles async refresh)
- Add testDynamicWarmupSettingsUpdate IT: verifies dynamic settings update
  via update settings API works end-to-end (covers IngestionEngine path)
- Add testWarmupSkippedWhenPollerStartsInPausedState: covers the paused
  branch in updateWarmupStatus (poller starts paused, warmup is skipped)

Signed-off-by: Kaustubh Butte <kaustubhbutte17@gmail.com>
Signed-off-by: Kaustubh Butte <kaustubhbutte17@gmail.com>
Signed-off-by: Kaustubh Butte <kaustubhbutte17@gmail.com>
Wait for poll loop to execute updateWarmupStatus() with disabled config,
covering the '!warmupConfig.isEnabled()' branch (lines 461-470).

Signed-off-by: Kaustubh Butte <kaustubhbutte17@gmail.com>
Signed-off-by: Kaustubh Butte <kaustubhbutte17@gmail.com>
Signed-off-by: Kaustubh Butte <kaustubhbutte17@gmail.com>
Signed-off-by: Kaustubh Butte <kaustubhbutte17@gmail.com>
Signed-off-by: Kaustubh Butte <kaustubhbutte17@gmail.com>
Signed-off-by: Kaustubh Butte <kaustubhbutte17@gmail.com>
totalProcessedCount is incremented in a separate processor thread and
may lag behind the poller state. Only assert totalPolledCount which is
synchronous with the poller state transition.

Signed-off-by: Kaustubh Butte <kaustubhbutte17@gmail.com>
@kaustubhbutte17 kaustubhbutte17 force-pushed the feature/warmup-dynamic-settings branch from 714a8b5 to 422d33c Compare March 24, 2026 01:10
@github-actions
Copy link
Copy Markdown
Contributor

Persistent review updated to latest commit 422d33c

@kaustubhbutte17 kaustubhbutte17 force-pushed the feature/warmup-dynamic-settings branch from 422d33c to 542fb18 Compare March 24, 2026 02:04
Signed-off-by: Kaustubh Butte <kaustubhbutte17@gmail.com>
@kaustubhbutte17 kaustubhbutte17 force-pushed the feature/warmup-dynamic-settings branch from 542fb18 to 775437a Compare March 24, 2026 02:04
@github-actions
Copy link
Copy Markdown
Contributor

Persistent review updated to latest commit 775437a

@github-actions
Copy link
Copy Markdown
Contributor

Persistent review updated to latest commit 678aa45

@kaustubhbutte17 kaustubhbutte17 force-pushed the feature/warmup-dynamic-settings branch from 678aa45 to fcf52b6 Compare March 24, 2026 02:16
Signed-off-by: Kaustubh Butte <kaustubhbutte17@gmail.com>
@kaustubhbutte17 kaustubhbutte17 force-pushed the feature/warmup-dynamic-settings branch from fcf52b6 to 9dcf919 Compare March 24, 2026 02:16
@github-actions
Copy link
Copy Markdown
Contributor

Persistent review updated to latest commit 9dcf919

Copy link
Copy Markdown
Contributor

@varunbharadwaj varunbharadwaj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me. Thanks!

@github-actions
Copy link
Copy Markdown
Contributor

❕ Gradle check result for 9dcf919: UNSTABLE

Please review all flaky tests that succeeded after retry and create an issue if one does not already exist to track the flaky failure.

@varunbharadwaj varunbharadwaj merged commit 6082a2a into opensearch-project:main Mar 24, 2026
41 of 51 checks passed
gagandhakrey pushed a commit to gagandhakrey/OpenSearch that referenced this pull request Apr 1, 2026
…ject#20936)

Make warmup settings dynamic for pull-based ingestion

---------

Signed-off-by: Kaustubh Butte <kaustubhbutte17@gmail.com>
Signed-off-by: Gagan Dhakrey <gagandhakrey@Gagans-MacBook-Pro.local>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants