Skip to content

fix(spanner): do not export metrics during shutdown if prev export was less than 30 seconds ago#12266

Merged
rahul2393 merged 5 commits intomainfrom
ensure_flushing_all_metrics
May 15, 2025
Merged

fix(spanner): do not export metrics during shutdown if prev export was less than 30 seconds ago#12266
rahul2393 merged 5 commits intomainfrom
ensure_flushing_all_metrics

Conversation

@sakthivelmanii
Copy link
Copy Markdown
Contributor

@sakthivelmanii sakthivelmanii commented May 14, 2025

Fixes #12017
Fixes #12091

@sakthivelmanii sakthivelmanii marked this pull request as ready for review May 14, 2025 10:02
@sakthivelmanii sakthivelmanii requested review from a team May 14, 2025 10:02
@product-auto-label product-auto-label bot added api: spanner Issues related to the Spanner API. samples Issues that are directly related to samples. labels May 14, 2025
Copy link
Copy Markdown
Contributor

@olavloite olavloite left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that a better title for this PR would be:

do not export metrics during shutdown if prev export was less than 30 seconds ago

func (me *monitoringExporter) close() {
// stop the exporter if last export happens within half-time of default sample period
if time.Since(me.lastExportedAt) <= (defaultSamplePeriod / 2) {
me.stop = true
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this field need to be protected by a mutex? That is; could it be that there are two different goroutines that call close() and Export() at the same time?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not possible. close() and Export() can be called in parallel. Export() doesn't change stop value. Two goroutines can't modify the same field.

errs = append(errs, err)
}

me.lastExportedAt = time.Now()
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same with this field; does it need to be protected by a mutex?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Exporter have one goroutine which triggers for every 60 seconds. So I feel it's not required. We will not have any scenario where multiple exports happens in parallel.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That is true, but that is not what I'm worried about. lastExportedAt is written by this method, which is called by the goroutine handling exports. The field is read by the stop() method, which is called by any application goroutine that closes the Spanner client. Meaning that you could have one goroutine reading the value while another is writing the value.

See https://go.dev/doc/articles/race_detector#Introduction

@sakthivelmanii sakthivelmanii force-pushed the ensure_flushing_all_metrics branch from 9a6c5bc to 46f3780 Compare May 14, 2025 11:03
@sakthivelmanii sakthivelmanii changed the title fix(spanner): Stop exporting built-in metrics after half time of sample period fix(spanner): do not export metrics during shutdown if prev export was less than 30 seconds ago May 14, 2025

func percentiles(percentile float32, latencies []int64) any {
rank := (percentile * float32(len(latencies)-1)) + 1
rank := percentile * float32(len(latencies)-1)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this intentional?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed it


func (me *monitoringExporter) stop() {
// stop the exporter if last export happens within half-time of default sample period
me.mu.Lock()
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

super nit: it is idiomatic to use defer me.mu.Unlock() here

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

shutdown chan struct{}
client *monitoring.MetricClient
shutdownOnce sync.Once
mu sync.Mutex
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: Add an empty line here. It is common to create a separate 'group' for the fields that are protected by a mutex. So you get this:

type monitoringExporter struct {
  ...
  shutdownOnce sync.Once
  
  mu sync.Mutex
  stopExport bool
  lastExportedAt time.Time
}

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

@sakthivelmanii sakthivelmanii force-pushed the ensure_flushing_all_metrics branch from 504438a to 58a71c3 Compare May 14, 2025 11:51
@rahul2393 rahul2393 merged commit 8ad7511 into main May 15, 2025
9 checks passed
@rahul2393 rahul2393 deleted the ensure_flushing_all_metrics branch May 15, 2025 02:33
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

api: spanner Issues related to the Spanner API. breaking change samples Issues that are directly related to samples.

Projects

None yet

3 participants