Add tests for shipper output and add support for partial results#31558
Add tests for shipper output and add support for partial results#31558rdner merged 7 commits intoelastic:mainfrom
Conversation
An event batch can be partially accepted by the server. Added test cases for: * An event batch is published from an input to the shipper gRPC server * An event batch is not dropped when the gRPC server is not available but starts later * An event batch is not dropped when ResourceExhausted code is returned from the gRPC
|
This pull request is now in conflicts. Could you fix it? 🙏 |
|
Pinging @elastic/elastic-agent-data-plane (Team:Elastic-Agent-Data-Plane) |
| }, | ||
| }) | ||
|
|
||
| nonDroppedEvents = append(nonDroppedEvents, e) |
There was a problem hiding this comment.
The publisher.Event type is a struct, not a reference:
beats/libbeat/publisher/event.go
Line 50 in 47777ec
We want to make sure we aren't making a copy of every event here unnecessarily. Could you keep a reference, or just the index in the array instead?
| batch.Retry() | ||
| } | ||
| return fmt.Errorf("failed to publish the batch to the shipper: %w", err) | ||
| if status.Code(err) != codes.OK || resp == nil { |
There was a problem hiding this comment.
If the shipper isn't available briefly, because of an upgrade or a crash for example, what happens to the events in this case. The batch is cancelled, but does the rest of the pipeline retry?
There was a problem hiding this comment.
Good question, I assumed that the pipeline would try again in any case if the batch is cancelled. That's what I understood from the function's doc here:
beats/libbeat/publisher/event.go
Lines 44 to 45 in dc4933d
) An event batch can be partially accepted by the server. Added test cases for: * An event batch is published from an input to the shipper gRPC server * An event batch is not dropped when the gRPC server is not available but starts later * An event batch is not dropped when ResourceExhausted code is returned from the gRPC
An event batch can be partially accepted by the server.
Added test cases for:
What does this PR do?
Why is it important?
Checklist
- [ ] I have made corresponding changes to the documentation- [ ] I have made corresponding change to the default configuration files- [ ] I have added tests that prove my fix is effective or that my feature works- [ ] I have added an entry inCHANGELOG.next.asciidocorCHANGELOG-developer.next.asciidoc.Author's Checklist
How to test this PR locally
Related issues