[S3-SQS Source] Make shutdown timeout configurable to prevent message loss during scale-down #6442

@Davidding4718

Is your feature request related to a problem? Please describe.
When an OpenSearch Ingestion pipeline that uses the S3-SQS source scales down, its OCUs (OpenSearch Compute Units) are terminated with a hardcoded 30-second shutdown timeout. If an S3 object cannot be fully processed within this window, processing of the in-flight SQS message is interrupted, causing:

  • Message visibility timeout expiration
  • Messages being re-queued as "old messages" (age: 3,500-3,600 seconds)
  • Duplicate processing and data inconsistencies
  • Increased processing costs due to repeated work

This is particularly problematic for pipelines processing large S3 objects or complex transformations requiring more than 30 seconds per message.

The hardcoded timeout is defined in SqsService.java.

Describe the solution you'd like
Add a configurable shutdown_timeout parameter to the S3-SQS source configuration:

source:
  s3:
    sqs:
      queue_url: "https://sqs.region.amazonaws.com/account/queue"
      shutdown_timeout: 300  # seconds, default: 30

This allows users to configure the timeout based on their workload characteristics.
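
To illustrate the idea, here is a minimal Java sketch of how a configured timeout could replace a fixed 30-second wait during shutdown. It is not the actual SqsService.java code: the class name ShutdownTimeoutSketch and the methods resolveShutdownTimeout and shutdownGracefully are hypothetical, and the sketch assumes the workers run on a standard ExecutorService.

```java
import java.time.Duration;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch; names here are illustrative and not taken from Data Prepper.
final class ShutdownTimeoutSketch {
    // Default mirrors the 30-second value described in this issue.
    static final Duration DEFAULT_SHUTDOWN_TIMEOUT = Duration.ofSeconds(30);

    // Returns the configured timeout, or the default when shutdown_timeout is not set.
    static Duration resolveShutdownTimeout(Integer configuredSeconds) {
        if (configuredSeconds == null) {
            return DEFAULT_SHUTDOWN_TIMEOUT;
        }
        if (configuredSeconds <= 0) {
            throw new IllegalArgumentException("shutdown_timeout must be positive");
        }
        return Duration.ofSeconds(configuredSeconds);
    }

    // Stops accepting new work and waits up to the timeout for in-flight tasks.
    // Returns true if all tasks finished, false if they had to be interrupted.
    static boolean shutdownGracefully(ExecutorService workers, Duration timeout) {
        workers.shutdown();
        try {
            if (workers.awaitTermination(timeout.toMillis(), TimeUnit.MILLISECONDS)) {
                return true;
            }
            workers.shutdownNow(); // timeout elapsed: interrupt remaining tasks
            return false;
        } catch (InterruptedException e) {
            workers.shutdownNow();
            Thread.currentThread().interrupt(); // preserve the interrupt status
            return false;
        }
    }

    public static void main(String[] args) {
        ExecutorService workers = Executors.newFixedThreadPool(1);
        // Stand-in for processing one S3 object referenced by an SQS message.
        workers.submit(() -> {
            try {
                Thread.sleep(200);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        boolean drained = shutdownGracefully(workers, resolveShutdownTimeout(300));
        System.out.println("drained=" + drained);
    }
}
```

With a sketch like this, leaving shutdown_timeout unset keeps the current 30-second behavior, so existing pipelines would not change unless they opt in.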

Describe alternatives you've considered (Optional)
N/A

Additional context
N/A
