[Telemetry] Add an atomic flag to ensure telemetry event is not recurrently called by kadupoornima · Pull Request #5705 · GoogleCloudPlatform/cluster-toolkit

kadupoornima · 2026-05-25T13:13:28Z

Problem
Currently, the logging.FatalHook unconditionally triggers telemetry collection whenever the CLI encounters a fatal error. This introduces two significant risks:

Re-entrancy/Deadlocks: If the telemetry collector itself encounters an error while gathering metrics (for instance, failing to rename or read a packer module), it can trigger another logging.Fatal() call. This recursive loop can lead to deadlocks or redundant executions.
Concurrent Duplicates: If multiple goroutines experience a fatal error simultaneously, they will both trigger the FatalHook at the same time, leading to duplicate payload generation and concurrent uploads to Clearcut.

Solution
This PR introduces a thread-safe atomic flag to guarantee that telemetry is flushed exactly once per CLI execution, regardless of concurrency or re-entrant fatal calls.

Changes

cmd/root.go: Added telemetryFlushed atomic.Bool to serve as a 2-state flag for whether the telemetry event is flushed or not.
cmd/root.go: Wrapped the telemetry execution inside the logging.FatalHook with if telemetryFlushed.CompareAndSwap(false, true).
cmd/root_test.go: Added the TestFatalHook_ConcurrencyAndReentrancy unit test. This test spawns multiple concurrent goroutines to simulate race conditions and verifies that the telemetry hook behaves safely under load.

Rationale
Using an atomic.Bool with CompareAndSwap is superior to using standard locking mechanisms (like sync.Mutex or sync.Once) for this specific issue. Atomic operations are non-blocking; if a recursive call or a concurrent goroutine reaches the hook, it immediately evaluates to false and skips the block without waiting, safely proceeding to termination.

Advantages

Prevents Duplicate Events: Ensures Clearcut analytics only receive one payload per execution.
Safe CLI Termination: Eliminates the risk of deadlocks from infinite recursive logging.Fatal loops during failure handling.
Ergonomic Types: Utilizing atomic.Bool expresses the intent clearly and avoids the error-prone usage of generic integer-based atomics.

gemini-code-assist · 2026-05-25T13:15:52Z

Summary of Changes

Hello, I'm Gemini Code Assist¹! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request improves the reliability of the telemetry system by preventing redundant event reporting during application crashes. By implementing an atomic guard, the system now guarantees that telemetry flushing logic is executed only once, even if multiple fatal events occur simultaneously or recursively.

Highlights

Telemetry Guard: Introduced an atomic boolean flag, telemetryFlushed, to ensure that telemetry events are processed exactly once during fatal errors.
Concurrency Safety: Updated the logging.FatalHook to use CompareAndSwap on the atomic flag, preventing re-entrancy and duplicate telemetry submissions in concurrent or recursive scenarios.
Test Coverage: Added a new test case, TestFatalHook_ConcurrencyAndReentrancy, to verify that multiple concurrent calls to the fatal hook do not trigger redundant telemetry execution.

Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

Feature	Command	Description
Code Review	`/gemini review`	Performs a code review for the current pull request in its current state.
Pull Request Summary	`/gemini summary`	Provides a summary of the current pull request in its current state.
Comment	@gemini-code-assist	Responds in comments when explicitly tagged, both in pull request comments and review comments.
Help	`/gemini help`	Displays a list of available commands.

Customization

To customize the Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counterproductive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for GitHub and other Google products, sign up here.

Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution. ↩

gemini-code-assist

Code Review

This pull request introduces an atomic guard using sync/atomic to ensure telemetry is executed exactly once during fatal errors, preventing re-entrancy and duplicate events. A new test case, TestFatalHook_ConcurrencyAndReentrancy, was added to verify the thread safety of this implementation. Feedback suggests extending this atomic protection to the main Execute() function to handle normal program exits consistently and implementing a 10-second timeout for telemetry calls to prevent blocking the CLI.

base

f79ead3

kadupoornima requested a review from SwarnaBharathiMantena May 25, 2026 13:13

kadupoornima requested a review from a team as a code owner May 25, 2026 13:13

kadupoornima added release-improvements Added to release notes under the "Improvements" heading. release-bugfix Added to release notes under the "Bug fixes" heading. labels May 25, 2026

kadupoornima enabled auto-merge (squash) May 25, 2026 13:13

gemini-code-assist Bot reviewed May 25, 2026

View reviewed changes

Comment thread cmd/root.go

kadupoornima added 2 commits May 25, 2026 13:21

.

f4175b2

comments

d362686

parulbajaj01 approved these changes May 27, 2026

View reviewed changes

kadupoornima merged commit 74710aa into GoogleCloudPlatform:develop May 27, 2026
12 of 74 checks passed

kadupoornima deleted the atomic-flag branch May 27, 2026 06:23

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[Telemetry] Add an atomic flag to ensure telemetry event is not recurrently called#5705

[Telemetry] Add an atomic flag to ensure telemetry event is not recurrently called#5705
kadupoornima merged 3 commits into
GoogleCloudPlatform:developfrom
kadupoornima:atomic-flag

kadupoornima commented May 25, 2026 •

edited

Loading

Uh oh!

gemini-code-assist Bot commented May 25, 2026

Uh oh!

gemini-code-assist Bot left a comment

Uh oh!

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

kadupoornima commented May 25, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

gemini-code-assist Bot commented May 25, 2026

Summary of Changes

Highlights

Footnotes

Uh oh!

gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

Code Review

Uh oh!

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

kadupoornima commented May 25, 2026 •

edited

Loading