fix(audit-log-stream): remove AbortSignal.timeout to fix retry failures and memory pressure #5474

Merged
victorvhs017 merged 1 commit into main from
fix/audit-log-streams-memory-leak
Feb 18, 2026

@victorvhs017 (Contributor) commented Feb 13, 2026

Context

Audit log streaming to external providers (Splunk, Datadog, Azure, Cribl, Custom) was causing memory pressure and ERR_CANCELED errors when downstream services returned errors (e.g., 503s).

Root cause: AbortSignal.timeout() was used alongside axios's timeout option. When axios-retry retried a failed request, it reused the same AbortSignal, whose countdown had already started (or already expired), so the retry attempt was cancelled immediately.

Before: Retries failed with ERR_CANCELED, and accumulated timer handles caused memory pressure.

After: Retries work correctly with fresh timeouts per attempt. The axios timeout option is sufficient for request timeouts.
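
The difference between a shared deadline and a per-attempt deadline can be sketched as a small simulation. This is not the project's code: `attempt`, `retryWithSharedSignal`, and `retryWithFreshDeadline` are hypothetical names, no HTTP request is made, and an `AbortController` that is aborted by hand stands in for a timeout that has elapsed.

```typescript
// Simulation only: no HTTP is performed and no real timers are used.
type Outcome = "sent" | "ERR_CANCELED";

// One attempt: a client refuses to send once its signal has fired.
function attempt(signal: AbortSignal): Outcome {
  return signal.aborted ? "ERR_CANCELED" : "sent";
}

// Buggy pattern: one signal is created up front and reused by every retry.
function retryWithSharedSignal(attempts: number): Outcome[] {
  const shared = new AbortController(); // stands in for a single AbortSignal.timeout(...)
  const outcomes: Outcome[] = [];
  for (let i = 0; i < attempts; i++) {
    outcomes.push(attempt(shared.signal));
    shared.abort(); // the shared deadline elapses while this attempt is in flight
  }
  return outcomes;
}

// Fixed pattern: every attempt gets its own deadline, as a per-request timeout does.
function retryWithFreshDeadline(attempts: number): Outcome[] {
  const outcomes: Outcome[] = [];
  for (let i = 0; i < attempts; i++) {
    const perAttempt = new AbortController();
    outcomes.push(attempt(perAttempt.signal));
    perAttempt.abort(); // this deadline also elapses, but it is discarded with the attempt
  }
  return outcomes;
}

console.log(retryWithSharedSignal(3));  // [ 'sent', 'ERR_CANCELED', 'ERR_CANCELED' ]
console.log(retryWithFreshDeadline(3)); // [ 'sent', 'sent', 'sent' ]
```

With the shared signal, only the first attempt is ever sent; every retry sees an already-aborted signal, which matches the ERR_CANCELED symptom described above.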

Steps to verify the change

  1. Configure an audit log stream (e.g., Splunk)
  2. Simulate a failing endpoint (503 responses)
  3. Verify retries complete without ERR_CANCELED errors
  4. Monitor memory usage under load; it should remain stable
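
For step 2, one way to simulate a failing endpoint locally is a small Node server that answers every request with a 503. This is a minimal sketch rather than part of the PR: the port `8088` and the names `alwaysUnavailable`, `startFailingEndpoint`, and `requestCount` are assumptions made for illustration.

```typescript
import * as http from "node:http";

// Hypothetical local port; any free port works.
const PORT = 8088;

// Counts incoming requests so that retries can be observed.
let requestCount = 0;

// Responds to every request with 503 Service Unavailable.
function alwaysUnavailable(_req: http.IncomingMessage, res: http.ServerResponse): void {
  requestCount += 1;
  res.statusCode = 503;
  res.end("service unavailable");
}

// Starts the failing endpoint. Call this explicitly; nothing listens on import.
function startFailingEndpoint(port: number = PORT): http.Server {
  return http.createServer(alwaysUnavailable).listen(port);
}
```

Calling `startFailingEndpoint()` and pointing a Custom audit log stream at `http://localhost:8088` should cause each delivery to fail with a 503, and `requestCount` should grow by more than one per log event if retries are being sent.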

Type

  • Fix
  • Feature
  • Improvement
  • Breaking
  • Docs
  • Chore

Checklist

  • Title follows the conventional commit format: type(scope): short description (scope is optional, e.g., fix: prevent crash on sync or fix(api): handle null response).
  • Tested locally
  • Updated docs (if needed)
  • Read the contributing guide

@maidul98 (Collaborator)

Snyk checks have passed. No issues have been found so far.

| Status | Scanner | Critical | High | Medium | Low | Total (0) |
| --- | --- | --- | --- | --- | --- | --- |
|  | Open Source Security | 0 | 0 | 0 | 0 | 0 issues |


@greptile-apps bot (Contributor) commented Feb 13, 2026

Greptile Overview

Greptile Summary

This PR fixes a critical issue with audit log streaming where retry attempts were failing due to conflicting timeout mechanisms. The change removes AbortSignal.timeout() from all audit log stream providers (Splunk, Azure, Datadog, Cribl, Custom), keeping only the axios timeout option.

Problem identified: When axios-retry retried a failed request, it reused the same AbortSignal, whose countdown had already started or expired, so the retry was cancelled immediately with ERR_CANCELED. The leftover timer handles also accumulated and created memory pressure.

Changes made:

  • Removed signal: AbortSignal.timeout(AUDIT_LOG_STREAM_TIMEOUT) from all provider factories
  • Kept timeout: AUDIT_LOG_STREAM_TIMEOUT (5 seconds), which is sufficient for request timeouts
  • Applied consistently across 5 provider implementations
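
The shape of the change can be sketched as follows. The provider code itself is not shown on this page, so `StreamRequestOptions` and `buildStreamRequestOptions` are hypothetical names, the headers are illustrative, and the value `5_000` is inferred from the "5 seconds" stated above.

```typescript
// Assumed value: the summary describes AUDIT_LOG_STREAM_TIMEOUT as 5 seconds.
const AUDIT_LOG_STREAM_TIMEOUT = 5_000;

// Hypothetical subset of the request options a provider passes to its HTTP client.
interface StreamRequestOptions {
  headers: Record<string, string>;
  timeout: number;
  signal?: AbortSignal;
}

// After the change: only the per-request timeout is set.
// Before the change, the options also carried
//   signal: AbortSignal.timeout(AUDIT_LOG_STREAM_TIMEOUT)
// which was created once and then reused by every retry.
function buildStreamRequestOptions(headers: Record<string, string>): StreamRequestOptions {
  return {
    headers: { "Content-Type": "application/json", ...headers },
    timeout: AUDIT_LOG_STREAM_TIMEOUT,
  };
}
```

Because no `signal` is attached, nothing outlives a single attempt: each retry is bounded only by the `timeout` value applied to that attempt.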

Impact: This fix ensures retries work correctly with fresh timeouts per attempt, resolving memory pressure issues and allowing proper retry behavior when downstream services return errors like 503s.

Confidence Score: 5/5

  • This PR is safe to merge with minimal risk: it fixes a legitimate bug in retry behavior
  • The change removes a redundant timeout mechanism that was causing retry failures. The axios timeout option alone provides sufficient timeout protection, and this fix resolves the memory pressure and ERR_CANCELED errors. The change is consistent across all providers, maintains backward compatibility (no API changes), and aligns with axios-retry best practices. No security concerns are introduced.
  • No files require special attention

Important Files Changed

| Filename | Overview |
| --- | --- |
| backend/src/ee/services/audit-log-stream/splunk/splunk-provider-factory.ts | Removed duplicate AbortSignal.timeout() that conflicted with axios retry mechanism, keeping only axios timeout option |
| backend/src/ee/services/audit-log-stream/azure/azure-provider-factory.ts | Removed duplicate AbortSignal.timeout() that conflicted with axios retry mechanism, keeping only axios timeout option |
| backend/src/ee/services/audit-log-stream/datadog/datadog-provider-factory.ts | Removed duplicate AbortSignal.timeout() that conflicted with axios retry mechanism, keeping only axios timeout option |
| backend/src/ee/services/audit-log-stream/cribl/cribl-provider-factory.ts | Removed duplicate AbortSignal.timeout() that conflicted with axios retry mechanism, keeping only axios timeout option |
| backend/src/ee/services/audit-log-stream/custom/custom-provider-factory.ts | Removed duplicate AbortSignal.timeout() that conflicted with axios retry mechanism, keeping only axios timeout option |

Last reviewed commit: a75dfb9

@victorvhs017 merged commit d75d9b0 into main on Feb 18, 2026
14 of 15 checks passed
3 participants