BP-51: BookKeeper client memory limits #3231

@zymap

Motivation

If one bookie is slow (not down, just slow), the BK client will acknowledge to the user that the entries are written after the first 2 acks. In the meantime, it keeps waiting for the 3rd bookie to respond. If that bookie responds within the timeout, the entries can be dropped from memory; otherwise the write times out internally and is replayed to a new bookie.

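In both cases, the amount of memory used in the client peaks at roughly "throughput" * "timeout". For illustration, at 100 MB/s with a 30-second timeout, up to about 3 GB of entries can be pending. This can be a large amount of memory and can easily cause OOM errors.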

Part of the problem is that this cannot be solved from outside the BK client: a caller has no visibility into which entries have 2 or 3 acks, so it cannot apply back pressure itself. Instead, the BK client needs its own back-pressure mechanism to prevent this kind of issue.

Proposed Change

We propose a memory counter for the bookie client. The counter tracks only write memory usage, and it holds an entry's size until the request is finished. Users can register a WritableListener to be notified of write-state changes.

The memory accounting happens in PendingAddOp.

When the client wants to add an entry, it creates a PendingAddOp object, which then carries out the send. When sending starts, the entry buffer reference is transferred into the PendingAddOp's toSend, and a new buffer is allocated to hold the entry metadata and entry content. Because the metadata is small, the memory counter ignores it. Once the PendingAddOp is created, the entry content size is recorded with the memory counter.

WriteMemoryCounter controls the writable state. When the content bytes of pending add requests exceed the high water mark, the listener is notified that the write state has changed to false; when the pending size drops below the low water mark, the listener is notified that the write state has changed to true.

We will decrement the write memory count before the PendingAddOp object is recycled.
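The accounting described above can be sketched roughly as follows. This is a minimal illustration, not the code from the proposed PR: the names WaterMarkCounterSketch, WriteStateListener, increment, decrement, and addListener are all assumptions made for the example, and the sketch does not try to order notifications under heavy concurrent updates.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical listener type for this sketch (same shape as WritableListener).
interface WriteStateListener {
    void onWriteStateChanged(boolean writable);
}

// Hypothetical sketch of water-mark accounting; not the code from the proposed PR.
final class WaterMarkCounterSketch {
    private final long low;
    private final long high;
    private final AtomicLong pendingBytes = new AtomicLong();
    private final AtomicBoolean writable = new AtomicBoolean(true);
    private final List<WriteStateListener> listeners = new CopyOnWriteArrayList<>();

    WaterMarkCounterSketch(long low, long high) {
        if (low < 0 || low > high) {
            throw new IllegalArgumentException("need 0 <= low <= high");
        }
        this.low = low;
        this.high = high;
    }

    void addListener(WriteStateListener listener) {
        listeners.add(listener);
    }

    // Assumed to be called when a pending add is created, with the entry content size.
    void increment(long bytes) {
        long now = pendingBytes.addAndGet(bytes);
        // Flip to "not writable" only once, on crossing above the high water mark.
        if (now > high && writable.compareAndSet(true, false)) {
            fire(false);
        }
    }

    // Assumed to be called just before the pending add is recycled.
    void decrement(long bytes) {
        long now = pendingBytes.addAndGet(-bytes);
        // Flip back to "writable" only once, on dropping below the low water mark.
        if (now < low && writable.compareAndSet(false, true)) {
            fire(true);
        }
    }

    boolean isWritable() {
        return writable.get();
    }

    long pending() {
        return pendingBytes.get();
    }

    private void fire(boolean state) {
        for (WriteStateListener listener : listeners) {
            listener.onWriteStateChanged(state);
        }
    }
}
```

Keeping the two marks apart gives hysteresis: between the low and high marks the state stays whatever it last was, so a client hovering near a single threshold does not flood listeners with alternating events.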

Public Interfaces

Introduce a WriteWaterMark configuration to represent the high memory limit and the low memory limit.

```java
public class WriteWaterMark {
    private static final int DEFAULT_LOW_WATER_MARK = 1;
    private static final int DEFAULT_HIGH_WATER_MARK = 1;

    private final int low;
    private final int high;

    public WriteWaterMark(int low, int high) {
        this.low = low;
        this.high = high;
    }

    public int low() {
        return low;
    }

    public int high() {
        return high;
    }
}
```

Introduce WriteMemoryCounter to count the add request bytes pending in clients.

```java
public class WriteMemoryCounter {
    public void incrementPendingWriteBytes() {}

    public void decrementPendingWriteBytes() {}

    private void setWritable(boolean state) {}
}
```

Introduce a WritableListener to listen for write-state changes.

```java
public interface WritableListener {
    void onWriteStateChanged(boolean writable);
}
```
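As a rough illustration of how an application might react to these events, the sketch below implements the listener as a non-blocking gate. The class WriteGateSketch and its tryWrite, submit, and acceptedCount methods are hypothetical, the interface is redeclared only so the example compiles on its own, and how a listener gets registered with the client is not shown because the proposal text does not specify it.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Same shape as the proposed WritableListener, redeclared so the sketch is self-contained.
interface WritableListener {
    void onWriteStateChanged(boolean writable);
}

// Hypothetical application-side gate; not part of the BookKeeper API.
final class WriteGateSketch implements WritableListener {
    private volatile boolean writable = true;
    private final AtomicInteger accepted = new AtomicInteger();

    @Override
    public void onWriteStateChanged(boolean writable) {
        // Remember the latest state reported by the client.
        this.writable = writable;
    }

    // Returns false instead of blocking when the client reports it is not writable.
    boolean tryWrite(byte[] entry) {
        if (!writable) {
            return false;
        }
        submit(entry);
        return true;
    }

    private void submit(byte[] entry) {
        // Placeholder for the real add-entry call; here we only count accepted entries.
        accepted.incrementAndGet();
    }

    int acceptedCount() {
        return accepted.get();
    }
}
```

A caller that gets false back can stop reading from its own upstream source until the state returns to true, which pushes the back pressure outward without blocking any thread.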

For more detailed information, here is the proposed PR: #3139.

Migration Plan and Compatibility

This won't introduce any breaking changes. The effect depends on how the client handles the write-state change events.

Rejected Alternatives

Using server-side back pressure to control memory usage

Configuring server-side back pressure won't resolve the client-side OOM issue. When WQ > AQ (write quorum larger than ack quorum), the slowest bookie doesn't hold up add-entry requests: the client keeps adding entries because it still receives 2 successful responses from the other servers. Meanwhile, entries waiting for a response from the slowest bookie accumulate, so the client's memory grows quickly and it eventually hits OOM.

Using a memory limiter to control the memory usage

We introduced a memory limit controller in PR [#2710](#2710), so we could easily apply it in the bookie client to control the client memory usage: acquire memory when an add-entry request arrives, and release it when the request has been sent successfully from the client.

The problem is that a memory limiter has to block the operation, and we don't want to block any operation in the client application.
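To make the blocking concern concrete, here is a minimal sketch of a limiter built on java.util.concurrent.Semaphore. It is not the memory limit controller from PR #2710; the class BlockingLimiterSketch and its method names are assumptions chosen for the example.

```java
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;

// Hypothetical blocking limiter; one permit stands for one byte of budget.
final class BlockingLimiterSketch {
    private final Semaphore permits;

    BlockingLimiterSketch(int limitBytes) {
        this.permits = new Semaphore(limitBytes);
    }

    // Blocks the calling thread until enough bytes are free.
    void reserve(int bytes) throws InterruptedException {
        permits.acquire(bytes);
    }

    // Bounded variant: waits up to timeoutMillis, then gives up and returns false.
    boolean tryReserve(int bytes, long timeoutMillis) {
        try {
            return permits.tryAcquire(bytes, timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    // Returns bytes to the budget once the request is done.
    void release(int bytes) {
        permits.release(bytes);
    }

    int availableBytes() {
        return permits.availablePermits();
    }
}
```

With a 100-byte budget and 80 bytes already reserved, a call to reserve(50) parks the application thread until some other request releases memory. A listener-based design avoids that stall by telling the application the state and letting it decide what to do.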
