Skip to content

parca debug-info upload #4648

Draft
korniltsev-grafanista wants to merge 34 commits intomainfrom
debuginfo-upload
Draft

parca debug-info upload #4648
korniltsev-grafanista wants to merge 34 commits intomainfrom
debuginfo-upload

Conversation

@korniltsev-grafanista
Copy link
Contributor

@korniltsev-grafanista korniltsev-grafanista commented Nov 28, 2025

This PR copies the code from https://github.com/parca-dev/parca/tree/main/pkg/debuginfo into pyroscope and integrates with the existing distributors and symbolizer

The parca's debuginfo uploads debug info binaries to a separate bucket subdir.
The symbolizer later uses the new subdir as the source for debug info binaries as an alternative to debuginfod.

@korniltsev-grafanista korniltsev-grafanista force-pushed the debuginfo-upload branch 5 times, most recently from d202551 to 8577f33 Compare January 16, 2026 07:36
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
@korniltsev-grafanista korniltsev-grafanista changed the title WIP parca debug-info upload parca debug-info upload Jan 16, 2026
@korniltsev-grafanista korniltsev-grafanista marked this pull request as ready for review January 16, 2026 08:05
korniltsev-grafanista and others added 5 commits January 19, 2026 05:45
Add debug-level logging to help troubleshoot debuginfo upload issues:
- Log ShouldInitiateUpload results (build_id, decision, reason)
- Log Upload gRPC results (success with size, or failure with error)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Debuginfod fetching is now handled in the symbolizer instead of in the
parca debuginfo store. This removes the debuginfod client implementation,
related tests, and test fixtures from the parca/debuginfo package.

Also removes tenant ID handling from the debuginfo store since it's not
needed for this use case.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
…test

Refactor the store test to use an HTTP/2 server with h2c and the
util.AuthenticateUser middleware instead of a direct gRPC server.
This better reflects the production setup where requests go through
HTTP middleware for tenant authentication.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

t := noop.Tracer{}
l = log.With(l, "component", "debug-info")
bucket = objstore.NewPrefixedBucket(bucket, symbolizer.BucketPrefixParcaDebugInfo)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we don't have a cleanup mechanism to control unbounded growth (neither for symbolizer path). Could it be an issue?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's fine to keep lidia's indefinitely. And preferably to remove the debug information files once they are converted to lidia. All of this is hard to implement in a stateless manner in distributors, so I propose to postpone these decision until we have a proper place/way to implement this https://github.com/grafana/pyroscope-squad/issues/687

korniltsev-grafanista and others added 5 commits January 23, 2026 15:28
Test creates a lidia table from /proc/self/exe using only gopclntab
(symtab disabled), then resolves the test function's own address and
verifies the name matches.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
korniltsev-grafanista added a commit to grafana/alloy that referenced this pull request Jan 26, 2026
<!--

CONTRIBUTORS GUIDE:
https://github.com/grafana/alloy/blob/main/docs/developer/contributing.md#updating-the-changelog

If this is your first PR or you have not contributed in a while, we
recommend
taking the time to review the guide. It gives helpful instructions for
contributors around things like how to update the changelog.

-->

#### PR Description
This PR introduces debug information upload using parca's API
implemented in grafana/pyroscope#4648 on the
pyroscope

This feature is implemented in two places: the profiler and the proxy

```go
type debuginfo.Appender interface {
    Upload(j UploadJob) // for the profiler: fanout to all the nested children 
    Client() debuginfogrpc.DebuginfoServiceClient // for the proxy - retrieve the first grpc client available and proxy data to the client
}
```
The profile uploader actually lives in the `pyroscope.write` component
to support per-write component uploads(ugh who needs this? no-one!) ,
but this is an implementation detail and it's hidden from the users.
The proxy only can proxy data to the one client - it can not fan-out.

The `parca` package is copied from the parca's repo with some
adjustments (`logrus` -> `logkit`, the grpc client is passed as an
argument to the upload routine instead of creating and owning the grpc
client inside the uploader)

#### Which issue(s) this PR fixes

<!-- Uncomment the following line if you want that GitHub issue gets
automatically closed after merging the PR -->
<!-- Fixes #issue_id -->

#### Notes to the Reviewer

#### PR Checklist

<!-- Remove items that do not apply. For completed items, change [ ] to
[x]. -->

- [ ] CHANGELOG.md updated
- [x] Documentation added
- [x] Tests updated
- [ ] Config converters updated

---------

Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Copilot Autofix powered by AI <62310815+github-advanced-security[bot]@users.noreply.github.com>
Copy link
Contributor

@simonswine simonswine left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have not tired this end-to-end but I think this looks mostly ready. I have a concern about the exposing of the service. Wonder how you feel about it.

pkg/api/api.go Outdated
Comment on lines +193 to +207
func (a *API) RegisterDebugInfo(d *grpc.Server, limits *validation.Overrides) {
const (
DebuginfoService_Upload_FullMethodName = "/parca.debuginfo.v1alpha1.DebuginfoService/Upload"
DebuginfoService_ShouldInitiateUpload_FullMethodName = "/parca.debuginfo.v1alpha1.DebuginfoService/ShouldInitiateUpload"
DebuginfoService_InitiateUpload_FullMethodName = "/parca.debuginfo.v1alpha1.DebuginfoService/InitiateUpload"
DebuginfoService_MarkUploadFinished_FullMethodName = "/parca.debuginfo.v1alpha1.DebuginfoService/MarkUploadFinished"
)
debugInfoGRPCOptions := []RegisterOption{
a.WithAuthMiddleware(),
}
a.RegisterRoute(DebuginfoService_Upload_FullMethodName, d, debugInfoGRPCOptions...)
a.RegisterRoute(DebuginfoService_ShouldInitiateUpload_FullMethodName, d, debugInfoGRPCOptions...)
a.RegisterRoute(DebuginfoService_InitiateUpload_FullMethodName, d, debugInfoGRPCOptions...)
a.RegisterRoute(DebuginfoService_MarkUploadFinished_FullMethodName, d, debugInfoGRPCOptions...)
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am a bit worried of our mix of GRPC and Connect-Go. None of this really your fault in this PR, but we have currently:

  • Cell internal comms which are a mix of GRPC (From Pyroscope v2 project) and connect-go (before v2)
  • Public endpoints are all connect-go (accepting GRPC) or Pyroscope legacy and we do have a endpoint GRPC endpoint for compatibility with OTLP.

So after your PR we would require a client that send both connect-go for Push and GRPC for the uploads. This feels making the above situation worse.

How do you/others feel about this? Is it worth copying the parca proto into api/debuginfo and generate our own handlers clients?

If we would do that we can reuse the connect handlers and the client would be complete within the api/ package. The endpoints would also work with curl json/grpc/connect and we would be able to reuse e.g. the API docs generation.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At first I was against it because I wanted to preserve as much parca code as possible in genuine state, so that it is diffed and updated in case needed easily. I later had to modify quite a lot of code, so now I am actually in favor of
Yes, let's do this. The only problem is that I've already merged client code =/
I will try to either update the client and backport tomorrow, or disable the new functionality and aim for the next release.

@korniltsev-grafanista korniltsev-grafanista requested a review from a team as a code owner February 9, 2026 09:54
Copy link
Contributor

@simonswine simonswine left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for working on this.

I think this is good enough for an initial MVP

Limits Limits
}

func newSymbolizerTest(t *testing.T, inp *symbolizerInputs) (*Symbolizer, *mocksymbolizer.MockDebuginfodClient, *mockobjstore.MockBucket) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This needs to be rebased before merge. Maybe you just forget to push?

return s.fetchFromDebuginfod(ctx, buildID)
}

func (s *Symbolizer) fetchFromParca(ctx context.Context, buildID string) (io.ReadCloser, error) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe (or something similar)?

Suggested change
func (s *Symbolizer) fetchFromParca(ctx context.Context, buildID string) (io.ReadCloser, error) {
func (s *Symbolizer) fetchFromUploadedDebugInfo(ctx context.Context, buildID string) (io.ReadCloser, error) {

if err := s.bucket.Upload(ctx, ObjectPath(tenantID, id), r); err != nil {
return status.Error(codes.Internal, fmt.Errorf("upload debuginfo: %w", err).Error())
}
md.FinishedAt = timestamppb.New(time.Now())
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we update the state to ObjectMetadata_STATE_UPLOADED around here?

return nil, fmt.Errorf("first message expected to be init")
}
if init.File == nil {
return nil, fmt.Errorf("init.FileData == nil")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
return nil, fmt.Errorf("init.FileData == nil")
return nil, fmt.Errorf("init.File == nil")


r := debuginforeader.New(ctx, readChunksFromStream(stream))
if err := s.bucket.Upload(ctx, ObjectPath(tenantID, id), r); err != nil {
return status.Error(codes.Internal, fmt.Errorf("upload debuginfo: %w", err).Error())
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems mixed grpc and connect errors.

Comment on lines +22 to +36
message FileMetadata {
enum DebuginfoType {
DEBUGINFO_TYPE_EXECUTABLE_FULL = 0;
DEBUGINFO_TYPE_EXECUTABLE_NO_TEXT = 1;
// DEBUGINFO_TYPE_LIDIA = 1;
// DEBUGINFO_TYPE_BLAZESYM = 2;
}

string GNU = 1;
string Golang = 2;
// optional libpf.FileID rom the otel profiler
string OpenTelemetry = 3;
string Name = 4;
DebuginfoType type = 5;
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
message FileMetadata {
enum DebuginfoType {
DEBUGINFO_TYPE_EXECUTABLE_FULL = 0;
DEBUGINFO_TYPE_EXECUTABLE_NO_TEXT = 1;
// DEBUGINFO_TYPE_LIDIA = 1;
// DEBUGINFO_TYPE_BLAZESYM = 2;
}
string GNU = 1;
string Golang = 2;
// optional libpf.FileID rom the otel profiler
string OpenTelemetry = 3;
string Name = 4;
DebuginfoType type = 5;
}
message FileMetadata {
enum Type {
TYPE_UNSPECIFIED = 0; // several times used as a good practise on enums, feel free to omit it
TYPE_EXECUTABLE_FULL = 1;
TYPE_EXECUTABLE_NO_TEXT = 2;
}
// GNU build ID.
string gnu_build_id = 1;
// Go build ID.
string go_build_id = 2;
// OpenTelemetry profiler file ID (libpf.FileID).
string otel_file_id = 3;
// Original binary file name.
string name = 4;
// Type of the debug info file.
Type type = 5;
}

@korniltsev-grafanista korniltsev-grafanista marked this pull request as draft February 16, 2026 07:25
@korniltsev-grafanista
Copy link
Contributor Author

Just wanted to clarify, the PR was not ready for review, only the new protobufs and overall intention to use the new protobufs.

I also think we should have an idea how to move forward with infinite data growth (both lidia and debug info)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants