TBS: Log pubsub errors at error or warn level #17135

Merged
carsonip merged 9 commits into elastic:main from carsonip:tbs-surface-pubsub-error
Jun 10, 2025

carsonip (Member) commented Jun 9, 2025

Motivation/summary

  • Surface pubsub subscriber errors by logging them at Warn level for HTTP 429 responses and at Error level for all other errors, instead of at Debug level. Context-canceled errors are not logged. This may add noise to the logs, e.g. when ES returns 429, but at worst it logs one line per search interval (default = 1m/2 = 30s), which is acceptable.
  • Also change publisher logs from Debug level to Error level.
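
The level selection described above can be sketched roughly as follows. This is a minimal illustration and not the code merged in this PR: `statusError`, `logLevel`, `levelFor`, and `sampleLevels` are hypothetical names, and the way apm-server actually represents Elasticsearch response errors is an assumption here.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"net/http"
)

// statusError is a hypothetical error type carrying an HTTP status code.
// The real apm-server code may represent Elasticsearch errors differently.
type statusError struct {
	StatusCode int
}

func (e *statusError) Error() string {
	return fmt.Sprintf("unexpected status code %d", e.StatusCode)
}

// logLevel names the level at which an error should be logged.
type logLevel string

const (
	levelNone  logLevel = "" // do not log at all
	levelWarn  logLevel = "warn"
	levelError logLevel = "error"
)

// levelFor picks a log level for a pubsub error:
// nothing for nil or context cancellation, warn for HTTP 429,
// and error for everything else.
func levelFor(err error) logLevel {
	if err == nil || errors.Is(err, context.Canceled) {
		return levelNone
	}
	var se *statusError
	if errors.As(err, &se) && se.StatusCode == http.StatusTooManyRequests {
		return levelWarn
	}
	return levelError
}

// sampleLevels runs levelFor over a few representative (made-up) errors.
func sampleLevels() []logLevel {
	errs := []error{
		fmt.Errorf("search failed: %w", &statusError{StatusCode: http.StatusTooManyRequests}),
		fmt.Errorf("search failed: %w", &statusError{StatusCode: http.StatusInternalServerError}),
		fmt.Errorf("search failed: %w", context.Canceled),
		errors.New("connection refused"),
	}
	levels := make([]logLevel, 0, len(errs))
	for _, err := range errs {
		levels = append(levels, levelFor(err))
	}
	return levels
}
```

Using `errors.Is` and `errors.As` rather than direct comparison keeps the classification working when the error has been wrapped with `%w` on its way up from the search call.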

Checklist

  • Update CHANGELOG.asciidoc: to be done on release
  • Documentation has been updated

For functional changes, consider:

  • Is it observable through the addition of either logging or metrics?
  • Is its use being published in telemetry to enable product improvement?
  • Have system tests been added to avoid regression?

How to test these changes

Covered by unit tests

Related issues

Fixes #17117

github-actions bot (Contributor) commented Jun 9, 2025

🤖 GitHub comments

Just comment with:

  • run docs-build : Re-trigger the docs validation. (use unformatted text in the comment!)

carsonip added the backport-active-9 and backport-active-8 labels Jun 9, 2025
carsonip changed the title from "TBS: Surface pubsub errors" to "TBS: Surface pubsub subscriber errors" Jun 9, 2025
carsonip changed the title from "TBS: Surface pubsub subscriber errors" to "TBS: Log pubsub subscriber errors at error level" Jun 9, 2025
carsonip marked this pull request as ready for review June 9, 2025 15:19
carsonip requested a review from a team as a code owner June 9, 2025 15:19
isaacaflores2 previously approved these changes Jun 9, 2025
axw previously approved these changes Jun 10, 2025
rubvs previously approved these changes Jun 10, 2025
carsonip dismissed stale reviews from rubvs, axw, and isaacaflores2 via 2ae9c78 June 10, 2025 14:53
carsonip changed the title from "TBS: Log pubsub subscriber errors at error level" to "TBS: Log pubsub errors at error level" Jun 10, 2025
carsonip changed the title from "TBS: Log pubsub errors at error level" to "TBS: Log pubsub errors at error or warn level" Jun 10, 2025
carsonip enabled auto-merge (squash) June 10, 2025 20:24
carsonip merged commit 6d41432 into elastic:main Jun 10, 2025
17 checks passed
@github-actions (Contributor)

@Mergifyio backport 8.17 8.18 8.19 9.0

mergify bot (Contributor) commented Jun 10, 2025

backport 8.17 8.18 8.19 9.0

✅ Backports have been created

mergify bot pushed a commit that referenced this pull request Jun 10, 2025
* Surface pubsub subscriber errors by logging at Warn level for 429 and Error level for others, instead of Debug level. Do not log context canceled. This may introduce noise to logs, e.g. when ES returns 429, but at worst it will only log one line every search interval (default = 1m/2 = 30s) which is acceptable.
* Also change publisher logs from debug level to error level.

(cherry picked from commit 6d41432)

# Conflicts:
#	x-pack/apm-server/sampling/pubsub/checkpoints.go
#	x-pack/apm-server/sampling/pubsub/pubsub.go
#	x-pack/apm-server/sampling/pubsub/pubsub_test.go
mergify bot pushed a commit that referenced this pull request Jun 10, 2025
(cherry picked from commit 6d41432)

# Conflicts:
#	x-pack/apm-server/sampling/pubsub/checkpoints.go
#	x-pack/apm-server/sampling/pubsub/pubsub.go
#	x-pack/apm-server/sampling/pubsub/pubsub_test.go
mergify bot pushed a commit that referenced this pull request Jun 10, 2025
(cherry picked from commit 6d41432)

# Conflicts:
#	x-pack/apm-server/sampling/pubsub/pubsub_test.go
mergify bot pushed a commit that referenced this pull request Jun 10, 2025
(cherry picked from commit 6d41432)

# Conflicts:
#	x-pack/apm-server/sampling/pubsub/checkpoints.go
#	x-pack/apm-server/sampling/pubsub/pubsub.go
#	x-pack/apm-server/sampling/pubsub/pubsub_test.go
mergify bot added a commit that referenced this pull request Jun 10, 2025
…#17180)

* TBS: Log pubsub errors at error or warn level (#17135)

(cherry picked from commit 6d41432)

# Conflicts:
#	x-pack/apm-server/sampling/pubsub/checkpoints.go
#	x-pack/apm-server/sampling/pubsub/pubsub.go
#	x-pack/apm-server/sampling/pubsub/pubsub_test.go

* Fix conflicts

* Fix test conflict

---------

Co-authored-by: Carson Ip <carsonip@users.noreply.github.com>
Co-authored-by: Carson Ip <carson.ip@elastic.co>
mergify bot added a commit that referenced this pull request Jun 10, 2025
#17178)

* TBS: Log pubsub errors at error or warn level (#17135)

(cherry picked from commit 6d41432)

# Conflicts:
#	x-pack/apm-server/sampling/pubsub/checkpoints.go
#	x-pack/apm-server/sampling/pubsub/pubsub.go
#	x-pack/apm-server/sampling/pubsub/pubsub_test.go

* Fix conflicts

* Fix test conflict

---------

Co-authored-by: Carson Ip <carsonip@users.noreply.github.com>
Co-authored-by: Carson Ip <carson.ip@elastic.co>
mergify bot added a commit that referenced this pull request Jun 10, 2025
#17177)

* TBS: Log pubsub errors at error or warn level (#17135)

(cherry picked from commit 6d41432)

# Conflicts:
#	x-pack/apm-server/sampling/pubsub/checkpoints.go
#	x-pack/apm-server/sampling/pubsub/pubsub.go
#	x-pack/apm-server/sampling/pubsub/pubsub_test.go

* Fix conflicts

* Fix test conflict

* Do not use logptest.NewTestingLogger

---------

Co-authored-by: Carson Ip <carsonip@users.noreply.github.com>
Co-authored-by: Carson Ip <carson.ip@elastic.co>
mergify bot added a commit that referenced this pull request Jun 11, 2025
#17179)

* TBS: Log pubsub errors at error or warn level (#17135)

(cherry picked from commit 6d41432)

# Conflicts:
#	x-pack/apm-server/sampling/pubsub/pubsub_test.go

* Fix merge conflict

* Bump elastic-agent-libs to fix failing test

* make notice

---------

Co-authored-by: Carson Ip <carsonip@users.noreply.github.com>
Co-authored-by: Carson Ip <carson.ip@elastic.co>
Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com>
Labels

  • backport-active-8: Automated backport with mergify to all the active 8.[0-9]+ branches
  • backport-active-9: Automated backport with mergify to all the active 9.[0-9]+ branches
  • v8.17.8
  • v8.18.3
  • v9.0.3

Development

Successfully merging this pull request may close these issues.

TBS: surface pubsub errors and handle errors gracefully

5 participants