ACK handling is being implemented in the shipper and its clients. This means Beats inputs that feed into the shipper output don't receive their acknowledgment callbacks until the shipper has sent the events upstream.
This means that currently, pending events are enqueued both in Beats and in the shipper. Neither one can safely remove them, because it isn't known until the response comes back whether it will need to retry.
There isn't a real need for Beats to do that in this case, though, because as soon as the event has been handed off to the shipper it's already in a local queue with proper retry behavior. The Beats pipeline should accommodate this case by separating producer acknowledgments (which are how inputs track progress) from queue allocations.
One way to do this that would probably require minimal modification of existing APIs is to create a proxy object implementing queue.Queue and used only for the shipper output, that instead of genuinely queueing would just synchronously Publish as batches are assembled, discarding the event data after a successful handoff, but still propagating event acknowledgment back to its producers afterwards. (This is essentially the same pattern we expect standalone shipper clients to follow.)
One complication is that batch assembly typically preserves pointers to the events inside the batch, so even if it isn't explicitly enqueued, the memory can still be used. That will probably require some extra handling or special call in the shipper output to separate freeing the memory from acknowledgment.
ACK handling is being implemented in the shipper and its clients. This means Beats inputs that feed into the shipper output don't receive their acknowledgment callbacks until the shipper has sent the events upstream.
This means that currently, pending events are enqueued both in Beats and in the shipper. Neither one can safely remove them, because it isn't known until the response comes back whether it will need to retry.
There isn't a real need for Beats to do that in this case, though, because as soon as the event has been handed off to the shipper it's already in a local queue with proper retry behavior. The Beats pipeline should accommodate this case by separating producer acknowledgments (which are how inputs track progress) from queue allocations.
One way to do this that would probably require minimal modification of existing APIs is to create a proxy object implementing
queue.Queueand used only for the shipper output, that instead of genuinely queueing would just synchronouslyPublishas batches are assembled, discarding the event data after a successful handoff, but still propagating event acknowledgment back to its producers afterwards. (This is essentially the same pattern we expect standalone shipper clients to follow.)One complication is that batch assembly typically preserves pointers to the events inside the batch, so even if it isn't explicitly enqueued, the memory can still be used. That will probably require some extra handling or special call in the shipper output to separate freeing the memory from acknowledgment.