Skip to content

fix(telemetry): app-extended-heartbeat event not correctly reporting entire config state#4633

Merged
gh-worker-dd-mergequeue-cf854d[bot] merged 2 commits into
mainfrom
ayan.khan/fix-extended-heartbeat-config-change
Apr 8, 2026
Merged

fix(telemetry): app-extended-heartbeat event not correctly reporting entire config state#4633
gh-worker-dd-mergequeue-cf854d[bot] merged 2 commits into
mainfrom
ayan.khan/fix-extended-heartbeat-config-change

Conversation

@khanayan123

@khanayan123 khanayan123 commented Apr 2, 2026

Copy link
Copy Markdown
Contributor

What does this PR do?

Replace the push-based config accumulation in heartbeatEnricher with a pull model that reads the full config state on demand from the configuration data source.

Problem: heartbeatEnricher had an AppStarted case to capture startup configs, but it was unreachable because appStartedReducer consumed the AppStarted payload before heartbeatEnricher could see it. The extended heartbeat was missing all startup configurations.

Solution: The configuration data source now retains all registered configs (using a pending set to track deltas for Payload()) and exposes an All() method. heartbeatEnricher calls getConfigs() when emitting the extended heartbeat to get the full current state directly from the source of truth.

Changes:

  • configuration.go — Single map with pending set. Add() writes to config and marks pending. Payload() extracts only pending keys. New All() returns full accumulated state.
  • internal/mapper/default.goNewDefaultMapper accepts func() []transport.ConfKeyValue. heartbeatEnricher pulls configs on demand instead of accumulating from pipeline. Removed AppStarted case.
  • client.go — Passes configuration.All to NewDefaultMapper.
  • client_test.go — Added integration tests for extended heartbeat config inclusion (single, multiple, dedup).

Motivation

Discovered while adding app-extended-heartbeat system tests in DataDog/system-tests#6338. The extended heartbeat was missing startup configs because appStartedReducer consumed them before heartbeatEnricher could see them. Rather than work around the pipeline ordering issue, this PR fixes the root cause by having heartbeatEnricher read config state directly from the source of truth.

API spec for reference: https://github.com/DataDog/instrumentation-telemetry-api-docs/blob/main/GeneratedDocumentation/ApiDocs/v2/SchemaDocumentation/Schemas/app_extended_heartbeat.md

@codecov

codecov Bot commented Apr 2, 2026

Copy link
Copy Markdown

Codecov Report

❌ Patch coverage is 79.31034% with 6 lines in your changes missing coverage. Please review.
✅ Project coverage is 60.70%. Comparing base (0e06d9c) to head (45d9d7d).
⚠️ Report is 2 commits behind head on main.

Files with missing lines Patch % Lines
internal/telemetry/internal/mapper/default.go 0.00% 4 Missing ⚠️
internal/telemetry/configuration.go 91.66% 1 Missing and 1 partial ⚠️
Additional details and impacted files
Files with missing lines Coverage Δ
internal/telemetry/client.go 82.14% <100.00%> (ø)
internal/telemetry/configuration.go 69.60% <91.66%> (ø)
internal/telemetry/internal/mapper/default.go 0.00% <0.00%> (ø)

... and 437 files with indirect coverage changes

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

@datadog-official

datadog-official Bot commented Apr 2, 2026

Copy link
Copy Markdown
Contributor

⚠️ Tests

Fix all issues with BitsAI or with Cursor

⚠️ Other Violations

❄️ 1 New flaky test detected

TestClientFlush/extended-heartbeat-config-dedup from github.com/DataDog/dd-trace-go/v2/internal/telemetry   View in Datadog   (Fix with Cursor)
Failed

=== RUN   TestClientFlush/extended-heartbeat-config-dedup
=== PAUSE TestClientFlush/extended-heartbeat-config-dedup
=== CONT  TestClientFlush/extended-heartbeat-config-dedup
    client_test.go:249: 
        	Error Trace:	/home/runner/work/dd-trace-go/dd-trace-go/internal/telemetry/client_test.go:249
        	            				/home/runner/work/dd-trace-go/dd-trace-go/internal/telemetry/client_test.go:1148
        	            				/home/runner/work/dd-trace-go/dd-trace-go/internal/synctest/synctest.go:15
        	Error:      	Not equal: 
...

ℹ️ Info

No other issues found (see more)

🧪 All tests passed

🎯 Code Coverage (details)
Patch Coverage: 80.77%
Overall Coverage: 59.99%

This comment will be updated automatically if new data arrives.
🔗 Commit SHA: 45d9d7d | Docs | Datadog PR Page | Was this helpful? React with 👍/👎 or give us feedback!

@khanayan123 khanayan123 force-pushed the ayan.khan/fix-extended-heartbeat-config-change branch 5 times, most recently from a432849 to 6e55261 Compare April 2, 2026 17:20
@khanayan123 khanayan123 marked this pull request as ready for review April 2, 2026 17:28
@khanayan123 khanayan123 requested a review from a team as a code owner April 2, 2026 17:28
@khanayan123 khanayan123 changed the title fix(telemetry): include config-change configs in extended heartbeat fix(telemetry): include config-change & app-started configs in extended heartbeat Apr 2, 2026
@khanayan123 khanayan123 force-pushed the ayan.khan/fix-extended-heartbeat-config-change branch from 6e55261 to da56468 Compare April 2, 2026 17:31

@genesor genesor left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did not gave a deep review of the entire flow but I left two questions.

Also there is no test modifications in internal/mapper is this expected ?

Comment thread internal/telemetry/internal/mapper/default.go Outdated
Comment thread internal/telemetry/internal/mapper/app_started.go Outdated
@khanayan123 khanayan123 added AI Generated Largely based on code generated by an AI or LLM. This label is the same across all dd-trace-* repos AI Assisted AI/LLM assistance used in this PR (partially or fully) labels Apr 3, 2026
@khanayan123 khanayan123 force-pushed the ayan.khan/fix-extended-heartbeat-config-change branch from da56468 to 87324f8 Compare April 3, 2026 15:37
@pr-commenter

pr-commenter Bot commented Apr 3, 2026

Copy link
Copy Markdown

Benchmarks

Benchmark execution time: 2026-04-08 18:52:01

Comparing candidate commit 45d9d7d in PR branch ayan.khan/fix-extended-heartbeat-config-change with baseline commit 0e06d9c in branch main.

Found 0 performance improvements and 0 performance regressions! Performance is the same for 216 metrics, 8 unstable metrics.

Explanation

This is an A/B test comparing a candidate commit's performance against that of a baseline commit. Performance changes are noted in the tables below as:

  • 🟩 = significantly better candidate vs. baseline
  • 🟥 = significantly worse candidate vs. baseline

We compute a confidence interval (CI) over the relative difference of means between metrics from the candidate and baseline commits, considering the baseline as the reference.

If the CI is entirely outside the configured SIGNIFICANT_IMPACT_THRESHOLD (or the deprecated UNCONFIDENCE_THRESHOLD), the change is considered significant.

Feel free to reach out to #apm-benchmarking-platform on Slack if you have any questions.

More details about the CI and significant changes

You can imagine this CI as a range of values that is likely to contain the true difference of means between the candidate and baseline commits.

CIs of the difference of means are often centered around 0%, because often changes are not that big:

---------------------------------(------|---^--------)-------------------------------->
                              -0.6%    0%  0.3%     +1.2%
                                 |          |        |
         lower bound of the CI --'          |        |
sample mean (center of the CI) -------------'        |
         upper bound of the CI ----------------------'

As described above, a change is considered significant if the CI is entirely outside the configured SIGNIFICANT_IMPACT_THRESHOLD (or the deprecated UNCONFIDENCE_THRESHOLD).

For instance, for an execution time metric, this confidence interval indicates a significantly worse performance:

----------------------------------------|---------|---(---------^---------)---------->
                                       0%        1%  1.3%      2.2%      3.1%
                                                  |   |         |         |
       significant impact threshold --------------'   |         |         |
                      lower bound of CI --------------'         |         |
       sample mean (center of the CI) --------------------------'         |
                      upper bound of CI ----------------------------------'

@khanayan123 khanayan123 changed the title fix(telemetry): include config-change & app-started configs in extended heartbeat refactor(telemetry): pull config state for extended heartbeat instead of pipeline accumulation Apr 3, 2026
@khanayan123 khanayan123 force-pushed the ayan.khan/fix-extended-heartbeat-config-change branch 4 times, most recently from 08369d3 to 32ffc6a Compare April 3, 2026 18:09
@khanayan123

Copy link
Copy Markdown
Contributor Author

Hey @genesor I ended up scrapping the re-injection approach entirely.

The updated design has heartbeatEnricher pull the full config state directly from the configuration data source via a getConfigs() function passed at construction to be called before reporting extended heartbeat, no pipeline workarounds needed

The config store now retains all registered configs (with a pending set for deltas) and exposes an All() snapshot method

@khanayan123 khanayan123 changed the title refactor(telemetry): pull config state for extended heartbeat instead of pipeline accumulation fix(telemetry): app-extended-heartbeat event not correctly reporting entire config state Apr 3, 2026
@khanayan123 khanayan123 requested a review from genesor April 3, 2026 18:25
@khanayan123 khanayan123 force-pushed the ayan.khan/fix-extended-heartbeat-config-change branch from 32ffc6a to c2e19b6 Compare April 6, 2026 13:57

@genesor genesor left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice work Ayan, I think we should have better comments in the code explaining the flow and how it will now work.

Before we were draining a struct and be done with it. It's not the case anymore, the flow should be explained somewhere so we don't have to re-understand it everytime 👍

Comment thread internal/telemetry/configuration.go Outdated
Comment on lines +22 to +23
config map[configKey]transport.ConfKeyValue
pending map[configKey]struct{}

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be great to have comments on config and pending to explain how they differ and how they are used.

Comment thread internal/telemetry/configuration.go Outdated

configs := make([]transport.ConfKeyValue, 0, len(c.config))
for _, conf := range c.config {
conf.Value = SanitizeConfigValue(conf.Value)

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it useful to have the same default handling for SeqID and Origin here as well ? Payload sets some values, depending on the calling order could we face some issues ?

Comment thread internal/telemetry/configuration.go Outdated
}
}

// All returns a sanitized snapshot of all accumulated configs.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This comment should explain how this function is used.

@khanayan123 khanayan123 force-pushed the ayan.khan/fix-extended-heartbeat-config-change branch 2 times, most recently from cee6f48 to adf39b0 Compare April 8, 2026 16:24
@khanayan123 khanayan123 requested a review from genesor April 8, 2026 16:30
@khanayan123 khanayan123 force-pushed the ayan.khan/fix-extended-heartbeat-config-change branch from 828d315 to ff8ff8f Compare April 8, 2026 16:44
@khanayan123 khanayan123 force-pushed the ayan.khan/fix-extended-heartbeat-config-change branch from ff8ff8f to 5bbf1ab Compare April 8, 2026 16:54
… of pipeline accumulation

Replace the push-based config accumulation in heartbeatEnricher with a
pull model that reads the full config state on demand from the
configuration data source.

Previously, heartbeatEnricher accumulated configs by watching
AppClientConfigurationChange and AppStarted payloads flow through the
pipeline. This was broken because appStartedReducer consumed the initial
config payload before heartbeatEnricher could see it.

Now, the configuration data source retains all registered configs (using
a pending set to track deltas for Payload()) and exposes an All() method.
heartbeatEnricher calls getConfigs() when emitting the extended heartbeat
to get the full current state directly from the source of truth.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@khanayan123 khanayan123 force-pushed the ayan.khan/fix-extended-heartbeat-config-change branch from 5bbf1ab to a7cfb7a Compare April 8, 2026 17:07

@genesor genesor left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! Thanks the comments

@khanayan123

Copy link
Copy Markdown
Contributor Author

/merge

@gh-worker-devflow-routing-ef8351

gh-worker-devflow-routing-ef8351 Bot commented Apr 8, 2026

Copy link
Copy Markdown

View all feedbacks in Devflow UI.

2026-04-08 18:37:05 UTC ℹ️ Start processing command /merge


2026-04-08 18:37:13 UTC ℹ️ MergeQueue: waiting for PR to be ready

This pull request is not mergeable according to GitHub. Common reasons include pending required checks, missing approvals, or merge conflicts — but it could also be blocked by other repository rules or settings.
It will be added to the queue as soon as checks pass and/or get approvals. View in MergeQueue UI.
Note: if you pushed new commits since the last approval, you may need additional approval.
You can remove it from the waiting list with /remove command.


2026-04-08 18:54:13 UTC ℹ️ MergeQueue: merge request added to the queue

The expected merge time in main is approximately 26m (p90).


2026-04-08 19:08:16 UTC ℹ️ MergeQueue: This merge request was merged

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

AI Assisted AI/LLM assistance used in this PR (partially or fully) AI Generated Largely based on code generated by an AI or LLM. This label is the same across all dd-trace-* repos mergequeue-status: done

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants