JEP Draft: Unbiased Stack-Walk JVMTI extension API

Owner	Roman Kennke
Type	Feature
Scope	JDK
Status	Draft
Component	hotspot / jvmti
Effort	M
Duration	M
Created	2026/03/30 13:45
Updated	2026/03/31 10:37
Issue	8381322

Summary

Provide an API for use by external tools to request and receive unbiased (aka asynchronous) stack-traces.

Goals

Provide a JVMTI extension API to allow external tools to obtain a stack-trace for a given Java thread.
The stack-trace is delivered by calling agent-provided callback-functions.
The API is safe to use from signal handlers, so that profiling tools can call it from profiling signals, e.g., perf counter overflow or CPU timer signals.
The API reports information about Java frames, e.g., method/class-name, byte-code-index.
The API can obtain stack information that is free from safepoint bias.

Non-Goals

It is currently not a goal to report native frames. (This feature may be added in a follow-up improvement.)
It is not a goal to change or extend the JVMTI specification. The feature is delivered as a JVMTI extension.

Motivation

One of the great advantages of Java and the JVM is its vast ecosystem of tools that make the lives of developers easier. Profilers are one such category of tools. The JVM comes with built-in facilities for profiling (JFR), and there are various external tools, both open source (e.g., async-profiler [0]) and commercial, that provide a wider and sometimes more useful set of features. One key feature that profiling solutions require from the JVM is the ability to obtain stack-traces. This allows profiling tools to tell the user exactly where potential performance bottlenecks or various other problems (e.g., cache-misses) may be located in their code, and how the execution of the program got there.

There are currently several ways for external profiling tools to obtain stack-traces, and all have drawbacks which make them an insufficient solution:

There is a family of official JVMTI APIs to get stack-traces, namely GetStackTrace, GetAllStackTraces and GetThreadListStackTraces. Those functions are fundamentally not signal-handler-safe: they may allocate, they transition the calling thread from native to VM, which may block, and they perform a handshake or even bring all threads to a safepoint and wait for it to complete. The stack-traces are only obtained when the thread arrives at a safepoint, which leads to the so-called safepoint bias. Safepoint bias is a problem because it skews profiling results such that they only ever point to the safepoints (e.g., method call sites, loop back-edges, etc.) as hot areas, even when the real problem is somewhere else.
There is an unofficial API that is used by many profiling tools, called AsyncGetCallTrace. That API is a JVM-internal API that is not exposed in a header or through any other means. While it is signal-handler-safe and avoids the safepoint bias problem, it has limitations that still make it insufficient. It walks the stack outside of a safepoint, but uses functions that were designed to do that only at a safepoint, which can potentially crash the JVM. Also, frames are reported as jmethodIDs. Tools need to collect them in a signal handler but use them later, outside of the signal handler (to avoid allocations and VM transitions while trying to resolve the jmethodIDs). This is undefined behaviour per the JVMTI spec and can lead to crashes, e.g., when the method/class has been unloaded by the time the jmethodID is used. Both HotSpot and the external tools try hard to avoid that problem, but they can only ever make it 'very unlikely', which is not sufficient for a stable profiling solution.
Some profiling tools hook into vmStructs directly to walk the stack by their own means. This is inherently dangerous and unspecified, and can change with every JVM release. Crashing is almost the best scenario for this 'solution' - it can also lead to 'silent', more subtle failures, which could be more catastrophic and harder to debug than crashes.

An example of what is possible using the new API is shown below. This is a flame-graph that represents ~27000 samples of cache-misses obtained using a small profiling agent running with one of the Renaissance benchmark workloads. The profiling agent obtains the samples by setting up a signal handler on Linux perf counter overflow on the hardware cache-misses counter, and requesting stack-traces whenever that signal fires.

Description

A new API is added as a JVMTI extension. Calling that API requests a stack-trace from the current or specified thread to be reported via the provided callback functions. The API has the following signature:

jvmtiError RequestStackTrace(jvmtiEnv* env, jthread* thread, void* ucontext, jvmtiStackTraceCallbacks callbacks, void* user_data)

Where the method arguments are:

thread: the Java thread for which a stack-trace is requested. Accepts NULL for the current thread. For threads other than the current thread, the stack-trace will be biased.
ucontext: the thread context (e.g., as passed from POSIX signal-handlers). Accepts NULL (e.g., when not available, or when calling from a system that doesn't pass thread context). When NULL is passed, the stack-trace will be biased.
callbacks: A struct of callback functions; the requested stack-trace will be reported by calling those functions.
user_data: arbitrary data passed by the caller. The data is passed back through the callback functions. The profiling agent can typically use this to associate the reported stack-trace with agent-internal data structures.

The function returns an error code:

JVMTI_ERROR_NOT_AVAILABLE when the functionality is not available (e.g., due to JFR not being present)
JVMTI_ERROR_INVALID_THREAD when the passed-in thread is invalid
JVMTI_ERROR_NONE if the call succeeded

After calling the API, the JVM will call the agent-provided callback functions to report the stack-trace.
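As an illustration of the call described above, the following is a minimal sketch of how an agent might issue a request from a POSIX signal handler. It is not taken from the implementation. The type declarations are simplified stand-ins for those in jvmti.h and for the callback struct of this draft, so that the sketch compiles on its own. The names g_env, g_request_stack_trace, g_callbacks, g_dropped_samples and install_profiling_handler are hypothetical, and how the agent obtains the function pointer (presumably through the JVMTI extension-function lookup) is deliberately not shown.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <stddef.h>
#include <string.h>

/* Simplified stand-ins (assumption: a real agent includes <jvmti.h>). */
typedef void* jthread;
typedef int jvmtiError;
typedef struct jvmtiEnv_ jvmtiEnv;          /* opaque in this sketch */
#define JVMTI_ERROR_NONE 0

/* Placeholder for the draft's callback struct; the real member types
 * are the callback typedefs given in this document. */
typedef void (*sketch_callback)(void);
typedef struct {
  sketch_callback beginStackTrace;
  sketch_callback endStackTrace;
  sketch_callback stackFrame;
  sketch_callback failure;
} jvmtiStackTraceCallbacks;

/* Function-pointer type mirroring the RequestStackTrace signature above. */
typedef jvmtiError (*RequestStackTraceFn)(jvmtiEnv* env, jthread* thread,
                                          void* ucontext,
                                          jvmtiStackTraceCallbacks callbacks,
                                          void* user_data);

/* Hypothetical agent state, filled in once at agent start-up. */
static jvmtiEnv* g_env;
static RequestStackTraceFn g_request_stack_trace;
static jvmtiStackTraceCallbacks g_callbacks;
static volatile sig_atomic_t g_dropped_samples;

/* Signal handler: no allocation, no locks, only the request itself. */
static void on_profiling_signal(int signo, siginfo_t* info, void* ucontext) {
  (void)signo;
  (void)info;
  if (g_request_stack_trace == NULL) {
    g_dropped_samples++;                    /* API not looked up yet */
    return;
  }
  /* thread == NULL requests the current thread; forwarding ucontext is
   * what allows the resulting stack-trace to be unbiased. */
  jvmtiError err = g_request_stack_trace(g_env, NULL, ucontext,
                                         g_callbacks, NULL);
  if (err != JVMTI_ERROR_NONE) {
    g_dropped_samples++;
  }
}

/* Installs the handler with SA_SIGINFO so that ucontext is delivered.
 * Returns 0 on success, -1 on failure (as sigaction does). */
static int install_profiling_handler(int signo) {
  struct sigaction sa;
  memset(&sa, 0, sizeof sa);
  sa.sa_sigaction = on_profiling_signal;
  sa.sa_flags = SA_SIGINFO | SA_RESTART;
  sigemptyset(&sa.sa_mask);
  return sigaction(signo, &sa, NULL);
}
```

In this sketch the handler only counts samples it could not request; the stack-trace itself arrives later through the callbacks, not as a return value of the call.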

jvmtiStackTraceCallbacks is a struct that contains the callback functions:

typedef struct {
  jvmtiBeginStackTraceCallback beginStackTrace;
  jvmtiEndStackTraceCallback endStackTrace;
  jvmtiStackFrameCallback stackFrame;
  jvmtiStackTraceFailureCallback failure;
} jvmtiStackTraceCallbacks;

beginStackTrace: The callback is called to start a stack-trace, before any stack-frames are reported.
endStackTrace: The callback is called to end a stack-trace, after all stack-frames have been reported. It is safe to deallocate the callbacks structure and callbacks just before returning from this callback.
stackFrame: The callback is called to report a single stack-frame; stack-frames are reported beginning with the currently executed stack-frame towards the frame that started the thread.
failure: The callback is called whenever a failure is encountered that prevents a stack-trace from being reported.

typedef void (JNICALL *jvmtiBeginStackTraceCallback)
    (jthread thread,
     jboolean biased,
     void* user_data);

thread: The thread from which the stack-trace is taken
biased: Whether or not the stack-trace is biased towards a safepoint
user_data: The user-data as the agent passed into RequestStackTrace

typedef void (JNICALL *jvmtiEndStackTraceCallback)
    (jthread thread,
     void* user_data);

thread: The thread from which the stack-trace is taken
user_data: The user-data as the agent passed into RequestStackTrace

typedef jint (JNICALL *jvmtiStackFrameCallback)
    (jvmtiFrameType frameType,
     jmethodID methodId,
     jlocation location,
     void* user_data);

frameType: The type of the frame (interpreter, JIT, inline or native).
methodId: The method of the frame. Only guaranteed to be valid until endStackTrace has been called.
location: The location within the method.
user_data: The user-data as the agent passed into RequestStackTrace

typedef void (JNICALL *jvmtiStackTraceFailureCallback)
    (jthread thread,
     void* user_data);

thread: The thread from which the stack-trace is taken
user_data: The user-data as the agent passed into RequestStackTrace
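To make the callback protocol more concrete, here is a minimal sketch of agent-side callbacks that collect one stack-trace into a fixed-size, pre-allocated buffer passed through user_data. The JNI/JVMTI types are simplified stand-ins so that the sketch compiles without jvmti.h. Sample, SampleFrame, MAX_FRAMES and the on_* function names are hypothetical. The draft does not specify the meaning of the frame callback's jint return value or the constants of jvmtiFrameType, so returning 0 and treating the frame type as a plain integer are assumptions.

```c
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-ins (assumption: a real agent includes <jvmti.h>). */
#define JNICALL
typedef void* jthread;
typedef struct _jmethodID* jmethodID;
typedef int64_t jlocation;
typedef unsigned char jboolean;
typedef int32_t jint;
typedef int jvmtiFrameType;      /* enum constants not given in the draft */

#define MAX_FRAMES 128           /* hypothetical agent-side limit */

typedef struct {
  jvmtiFrameType type;
  jmethodID method;              /* only valid until endStackTrace */
  jlocation location;
} SampleFrame;

typedef struct {
  SampleFrame frames[MAX_FRAMES];
  size_t depth;                  /* frames stored so far */
  size_t truncated;              /* frames dropped because the buffer was full */
  jboolean biased;
  int complete;                  /* set when endStackTrace is called */
  int failed;                    /* set when the failure callback is called */
} Sample;

static void JNICALL on_begin(jthread thread, jboolean biased, void* user_data) {
  Sample* s = (Sample*)user_data;
  (void)thread;
  s->depth = 0;
  s->truncated = 0;
  s->biased = biased;
  s->complete = 0;
  s->failed = 0;
}

/* Called once per frame, innermost frame first. */
static jint JNICALL on_frame(jvmtiFrameType type, jmethodID method,
                             jlocation location, void* user_data) {
  Sample* s = (Sample*)user_data;
  if (s->depth == MAX_FRAMES) {
    s->truncated++;
    return 0;                    /* assumption: return value semantics unknown */
  }
  s->frames[s->depth].type = type;
  s->frames[s->depth].method = method;
  s->frames[s->depth].location = location;
  s->depth++;
  return 0;
}

static void JNICALL on_end(jthread thread, void* user_data) {
  Sample* s = (Sample*)user_data;
  (void)thread;
  /* A real agent would resolve the stored jmethodIDs before returning,
   * since they are only guaranteed to be valid until this point. */
  s->complete = 1;
}

static void JNICALL on_failure(jthread thread, void* user_data) {
  Sample* s = (Sample*)user_data;
  (void)thread;
  s->failed = 1;
}
```

An agent following this sketch would pass a pointer to a Sample as user_data and place the four functions into the jvmtiStackTraceCallbacks struct.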

The functionality needs to be enabled before use by calling the following function:

jvmtiError EnableRequestStackTrace(jvmtiEnv* env)

This will typically be called from JVMTI's Agent_OnLoad to globally enable the functionality. However, it is also possible to call the function later.

The functionality can be disabled by calling the following function:

jvmtiError DisableRequestStackTrace(jvmtiEnv* env)

This could be called from JVMTI's Agent_OnUnload or at any earlier time to disable the functionality.

Implementation

Much of the functionality that is required for this feature has already been implemented for JEP 509: JFR CPU-Time Profiling (Experimental). The mechanism for asynchronous stack-walking that has been implemented for the CPU-time sampler is generalized and re-used for the Unbiased Stack-Walk API.

In short, the stack-walker works as follows:

The signal-handler (or any other trigger) calls into RequestStackTrace.
RequestStackTrace records the thread's current PC, BCI and SP, and places a stack-walk-request with that information on a queue.
It then arms the thread for safepoint-polling (aka handshaking).
As soon as the thread arrives at the next safepoint-poll, it stops and starts processing all enqueued requests.
For each request, the stack-walker fetches the PC, BCI and SP, and reconstructs the top frame information from that. Notice that we only need to reconstruct the top frame (plus possibly inlined frames), but never any frames below that, because method returns would always run into a safepoint poll.
Once we have the top-frame, the thread walks the stack down by the usual mechanisms.
While visiting frames, the stack-walker calls into the agent-provided callback functions to report the frames.
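The steps above can be illustrated with a small toy model of the deferred-request mechanism. This is not HotSpot code: ToyThread, WalkRequest and the toy_* functions are hypothetical names, the model is single-threaded, and it omits the signal-safe, lock-free queueing that a real implementation would need. It only shows the division of work: the request path records PC, SP and BCI and arms the poll, and the actual walking happens later at the safepoint poll.

```c
#include <stddef.h>
#include <stdint.h>

#define QUEUE_CAPACITY 8         /* hypothetical per-thread queue size */

/* What the request path records about the interrupted top frame. */
typedef struct {
  uintptr_t pc;
  uintptr_t sp;
  int bci;
  void* user_data;
} WalkRequest;

typedef struct {
  WalkRequest queue[QUEUE_CAPACITY];
  size_t count;
  int poll_armed;                /* stands in for arming the safepoint poll */
} ToyThread;

/* Called for each drained request; stands in for reconstructing the top
 * frame and walking the stack from there. */
typedef void (*ToyWalker)(const WalkRequest* request);

/* Request path: record PC/SP/BCI, enqueue, arm the poll.
 * Returns 0 on success, -1 if the queue is full. */
static int toy_request_stack_trace(ToyThread* t, uintptr_t pc, uintptr_t sp,
                                   int bci, void* user_data) {
  if (t->count == QUEUE_CAPACITY) {
    return -1;
  }
  WalkRequest* r = &t->queue[t->count++];
  r->pc = pc;
  r->sp = sp;
  r->bci = bci;
  r->user_data = user_data;
  t->poll_armed = 1;
  return 0;
}

/* Safepoint-poll path: if armed, process all enqueued requests in order.
 * Returns the number of requests processed. */
static size_t toy_safepoint_poll(ToyThread* t, ToyWalker walk) {
  size_t processed = 0;
  if (!t->poll_armed) {
    return 0;
  }
  for (size_t i = 0; i < t->count; i++) {
    walk(&t->queue[i]);
    processed++;
  }
  t->count = 0;
  t->poll_armed = 0;
  return processed;
}

/* Example walker: counts walks in an int that user_data points to. */
static void toy_counting_walker(const WalkRequest* request) {
  int* walks = (int*)request->user_data;
  (*walks)++;
}
```

Because the requests are drained by the sampled thread itself, the callbacks in this model run on that thread at its next poll, not inside the code that issued the request.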

Alternatives

The following alternatives have been considered:

Implement all possible functionality in JFR. This would be impossible: 1. There are too many different scenarios. For example, JFR events could be provided for various different Linux perf event overflows. 2. Many scenarios may be very platform-specific (e.g., Linux perf event overflows). Note that something like this has already been attempted in JEP 509, and while it works well for what it aims to do, it only covers one very special scenario, on one particular platform.
Change the implementation of the JVMTI GetStackTrace family of functions to provide the desired functionality (signal-safety and avoiding the safepoint bias). While this may be possible in principle, it would require a change of the signature to also accept a void* ucontext, and it would also represent a significant change in behaviour, which is undesirable. It would also suffer from the same problems as AsyncGetCallTrace, in that the caller would have to pre-allocate the stack-trace structure and also deal with broken (undefined-behaviour) jmethodIDs.
Fix AsyncGetCallTrace. As it currently is, ASGCT suffers from various fundamental problems in its design (see discussion above). In fact, this JEP is an attempt at fixing it, by providing a more sustainable alternative.
Provide a similar new JVMTI extension API that emits a JFR event instead of calling JVMTI callbacks. That approach is documented in JEP Draft: Unbiased Stack-Walk JFR event trigger.

Testing

Several new jtreg tests are added to verify that the new functionality works as specified.

Risks and Assumptions

TBD