Skip to content

Use CLOCK_BOOTTIME instead of CLOCK_REALTIME in published_at_ns in process context C/C++ impl#79

Merged
felixge merged 4 commits into
open-telemetry:mainfrom
ivoanjo:ivoanjo/monotonic-timestamp
Mar 13, 2026
Merged

Use CLOCK_BOOTTIME instead of CLOCK_REALTIME in published_at_ns in process context C/C++ impl#79
felixge merged 4 commits into
open-telemetry:mainfrom
ivoanjo:ivoanjo/monotonic-timestamp

Conversation

@ivoanjo

@ivoanjo ivoanjo commented Mar 10, 2026

Copy link
Copy Markdown
Contributor

What does this PR do?

This PR updates the process context C/C++ implementation to match the change in the upstream specification in
open-telemetry/opentelemetry-specification@15256ee to adopt a monotonic clock.

Motivation:

Using a monotonic clock means that readers can't get confused by clocks going backward, and we can trivially about the ABA issue (because if the clock can't ever go back, we can always add 1 to a previous read to ensure no two timestamps are ever the same).

Additional Notes:

In #66 (comment) we discussed that this could be implemented by directly using the CLOCK_MONOTONIC, or alternatively we could map the monotonic clock back to a regular ns from epoch.

I had this commit already ready and it takes the first approach, but let's discuss? ;)

How to test the change?

In practice readers don't really need to change unless they were trying to interpret the timestamp and pretty-print it (as the bash script was).

…` in process context C/C++ impl

**What does this PR do?**

This PR updates the process context C/C++ implementation to match
the change in the upstream specification in
open-telemetry/opentelemetry-specification@15256ee
to adopt a monotonic clock.

**Motivation:**

Using a monotonic clock means that readers can't get confused
by clocks going backward, and we can trivially about the ABA
issue (because if the clock can't ever go back, we can always
add 1 to a previous read to ensure no two timestamps are ever
the same).

**Additional Notes:**

In open-telemetry#66 (comment)
we discussed that this could be implemented by directly using
the `CLOCK_MONOTONIC`, or alternatively we could map the monotonic
clock back to a regular ns from epoch.

I had this commit already ready and it takes the first approach,
but let's discuss? ;)

**How to test the change?**

In practice readers don't really need to change unless they were trying
to interpret the timestamp and pretty-print it (as the bash script was).
@ivoanjo

ivoanjo commented Mar 10, 2026

Copy link
Copy Markdown
Contributor Author

Btw @christos68k now that I think of it, I think if we went with option 2:

  1. Calculate a delta between monotonic and realtime clock (once, on startup, fixed for lifetime of process) and add it to each subsequent monotonic measurement (from CLOCK_BOOTTIME or CLOCK_MONOTONIC).

And looking at the manpage https://manpages.opensuse.org/Tumbleweed/man-pages/clock_gettime.2.en.html#CLOCK_MONOTONIC

CLOCK_MONOTONIC
[...] This clock does not count time that the system is suspended.

I think we could start getting weird values (either in the past or in the future) in the presence of updates to system time and VM suspension if we always used the same delta... 🤔

@christos68k

christos68k commented Mar 10, 2026

Copy link
Copy Markdown
Member

Btw @christos68k now that I think of it, I think if we went with option 2:

  1. Calculate a delta between monotonic and realtime clock (once, on startup, fixed for lifetime of process) and add it to each subsequent monotonic measurement (from CLOCK_BOOTTIME or CLOCK_MONOTONIC).

And looking at the manpage https://manpages.opensuse.org/Tumbleweed/man-pages/clock_gettime.2.en.html#CLOCK_MONOTONIC

CLOCK_MONOTONIC
[...] This clock does not count time that the system is suspended.

I think we could start getting weird values (either in the past or in the future) in the presence of updates to system time and VM suspension if we always used the same delta... 🤔

CLOCK_BOOTTIME does count time that the system is suspended, but yes, in general this processing scheme doesn't guarantee an accurate correlation with the real time clock (as it's immune to drift, clock skew and NTP updates). It guarantees that all values can be interpreted as nanoseconds-since-the-epoch and will track the realtime clock with some rough degree of precision whilst staying monotonic.

If we update the delta more than once for the lifetime of the process, we lose the "can't go backwards" guarantee.

@ivoanjo

ivoanjo commented Mar 10, 2026

Copy link
Copy Markdown
Contributor Author

Ahh yeah I read your comment and completely glossed over the CLOCK_BOOTTIME. Yeah that one would get us closer to the "it's fine as long as system time didn't go backwards" 👀

I guess in the end, if we really wanted to, rather than trying to fight the delta and the different clocks, we could bite the bullet and have the two timestamps (or one update counter and one timestamp), but it does seem slight overkill...

@christos68k

christos68k commented Mar 10, 2026

Copy link
Copy Markdown
Member

Ahh yeah I read your comment and completely glossed over the CLOCK_BOOTTIME. Yeah that one would get us closer to the "it's fine as long as system time didn't go backwards" 👀

I guess in the end, if we really wanted to, rather than trying to fight the delta and the different clocks, we could bite the bullet and have the two timestamps (or one update counter and one timestamp), but it does seem slight overkill...

Even if system time goes backwards, the calculated delta won't be affected (as it's only calculated once for the life of the process) and thus the resulting delta+monotonic values also won't be affected. The only affected part will be the tight correlation with the real time clock (which we don't really care about). In essence, this scheme trades that tight-synchronization-with-realtime guarantee for monotonicity and precision of relative measurements with respect to each other and the 'base' which remains unaffected and accurate to the nanosecond (also something we don't really need here but it's there for free).

I think we can be flexible and don't need to overspecify the implementation (either here or in the OTEP). Instead, if we specify the clock used in the OTEP (e.g. CLOCK_BOOTTIME or CLOCK_MONOTONIC), then the consumer of such values can decide how to interpret them:

  1. Purely as monotonic counters
  2. Correlated with realtime clock if the consumer performs the delta calculation

@ivoanjo

ivoanjo commented Mar 10, 2026

Copy link
Copy Markdown
Contributor Author

Even if system time goes backwards, the calculated delta won't be affected (as it's only calculated once for the life of the process) and thus the resulting delta+monotonic values also won't be affected. The only affected part will be the tight correlation with the real time clock (which we don't really care about). In essence, this scheme trades that tight-synchronization-with-realtime guarantee for monotonicity and precision of relative measurements with respect to each other and the 'base' which remains unaffected and accurate to the nanosecond (also something we don't really need here but it's there for free).

Yes! My "hmm not amazing" spider sense is more of the -- if the writer does the delta and puts as a timestamp a cooked value that includes the delta change, the semantics of "most of the time this is a timestamp that makes sense but sometimes it won't!" aren't great -- especially if not all writers agree on exactly that meaning.

Which brings us to...

I think we can be flexible and don't need to overspecify the implementation (either here or in the OTEP). Instead, if we specify the clock used in the OTEP (e.g. CLOCK_BOOTTIME), then the consumer of such values can decide how to interpret them:

  1. Purely as monotonic counters
  2. Correlated with realtime clock if the consumer performs the delta calculation

...if we put in CLOCK_BOOTTIME and calculating deltas is up to the reader (not the writer), then the reader gets the raw value and can choose to interpret the data, given the caveats, or not.

I quite like that!

I'll go ahead and update this PR and push a tiny tweak to the spec to just say "use CLOCK_BOOTTIME" instead.

@ivoanjo ivoanjo changed the title Use CLOCK_MONOTONIC instead of CLOCK_REALTIME in published_at_ns in process context C/C++ impl Use CLOCK_BOOTTIME instead of CLOCK_REALTIME in published_at_ns in process context C/C++ impl Mar 10, 2026
ivoanjo added a commit to ivoanjo/opentelemetry-specification that referenced this pull request Mar 10, 2026
@ivoanjo ivoanjo requested a review from christos68k March 10, 2026 16:45
@ivoanjo

ivoanjo commented Mar 10, 2026

Copy link
Copy Markdown
Contributor Author

@christos68k I've pushed open-telemetry/opentelemetry-specification@0c6caa7 + updated this PR.

All still good on your side?

ivoanjo added a commit to DataDog/java-profiler that referenced this pull request Mar 11, 2026
…of reference library

**What does this PR do?**

This PR updates the vendored version of the OTel process context
support with the latest code from
open-telemetry/sig-profiling#79 (which will soon
be merged upstream).

**Motivation:**

This latest version of the library fixes a number of small issues, but
more importantly includes support for the thread context configuration
keys which we need for #347
(so we will be able to shed a bunch of code once we rebase on top of
this one).

**Additional Notes:**

From the Java API consumer side, not a lot changes with this version.

**How to test the change?**

Validate tests still pass!
ivoanjo added 2 commits March 11, 2026 17:30
...they don't always have a `#define` for this flag.
…ild`

In the `otel_process_ctx_read()` code used for testing, if somehow
there were repeated keys in the process context data, then we could
leak the existing value while overwriting it with the latter one.

(The key missing bit was `if (*field != NULL) { free(value); return
false; }`).
ivoanjo added a commit to DataDog/java-profiler that referenced this pull request Mar 12, 2026
TL;DR if there were repeated keys when parsing the data in the testing
code we could leak their values.

This matches the latest version upstream in
open-telemetry/sig-profiling#79 .
@felixge felixge merged commit 650085a into open-telemetry:main Mar 13, 2026
1 check passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants