
TBS: surface pubsub errors and handle errors gracefully #17117

@carsonip

APM Server version (apm-server version): 9.0, but affects all versions

Description of the problem including expected versus actual behavior:

When a _refresh request fails, e.g. by returning 400, pubsub logs the error only at Debug level. The failure also stops pubsub from getting the latest remote sampling decisions.

Steps to reproduce:

Mock Elasticsearch to return 400 for _refresh with the following response body:

{
  "_shards": {
    "total": 52,
    "successful": 31,
    "failed": 21,
    "failures": [
      {
        "shard": 0,
        "index": "partial-.ds-traces-apm.sampled-default-2025.05.30-001751",
        "status": "BAD_REQUEST",
        "reason": {
          "type": "illegal_argument_exception",
          "reason": "Validation Failed: 1: segment generation is unknown;"
        }
      },
...

Provide logs (if relevant):
