
Conversation


@mattsta mattsta commented Mar 18, 2014

This may be more of a philosophical difference than an objective argument, but should we assume maxmemory covers primarily our dataset size, or all Redis memory usage, including transient allocations (temporary buffers)?

If maxmemory is all memory usage, then read-only clients can forcibly evict keys by growing large pipeline output buffers (large output buffer = large total memory usage = over maxmemory limit = evict keys).

The patches in this PR:

  • allow us to see the aggregate output buffer size in INFO output
  • ignore output buffer size when determining if we are over memory limits (just like we do with Replica buffers and AOF buffers).

mattsta added 3 commits March 18, 2014 14:47
It's useful to be able to see how much memory is used by
large output or input buffers.
If one (or multiple) clients have large pipeline requests
running (example: 100,000 GETs), now the user can see why
Redis is using more memory than expected.

I experimented with listing the total input buffer size as well,
but Redis consumes input buffers quickly enough that I never
saw them grow beyond a few KB (and Redis already reports the
largest current input buffer as clients.client_biggest_input_buf).

New output for INFO MEMORY:
    used_memory:35054272
    used_memory_human:33.43M
    used_memory_rss:80175104
    used_memory_peak:2830033808
    used_memory_peak_human:2.64G
    used_memory_lua:33792
    used_memory_output_buffers:20516792
    used_memory_output_buffers_human:19.57M
    mem_fragmentation_ratio:2.29
    mem_allocator:jemalloc-3.2.0
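For illustration, here is a minimal sketch (not the literal patch) of how such an aggregate could be computed with the 2.8-era internals: walk server.clients and sum each client's pending reply memory. The wrapper name getClientsOutputBufferMemoryUsage() is invented for this sketch; getClientOutputBufferMemoryUsage() and the adlist iteration helpers are the existing ones.

    /* Hypothetical aggregate helper: sum the memory attributed to the
     * pending replies of every connected client.  O(N) over clients. */
    unsigned long getClientsOutputBufferMemoryUsage(void) {
        unsigned long total = 0;
        listIter li;
        listNode *ln;

        listRewind(server.clients, &li);
        while ((ln = listNext(&li)) != NULL) {
            redisClient *c = listNodeValue(ln);
            total += getClientOutputBufferMemoryUsage(c);
        }
        return total; /* surfaced as used_memory_output_buffers in INFO MEMORY */
    }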
Treat client buffers like we treat Replica buffers
and the AOF buffer by subtracting the total size of waiting output
from the size of the used memory.

The maxmemory directive has a note saying it doesn't take
into account any Replica or AOF buffers (they get subtracted
from the total Redis memory usage calculation for purposes
of determining "maxmemory" for eviction).
But the eviction process was still counting the output buffers
of clients building results to return.

If Redis is configured with an eviction policy and we include
client output buffers in the total memory usage calculation,
any client can potentially empty all volatile keys by issuing
large pipeline requests (a sketch follows the before/after
examples below).

Behavior before this commit:
  - maxmemory = 50 MB
  - maxmemory-policy = allkeys-lru or allkeys-random
  - Add 50 MB of data
  - Run a pipeline request of 100,000 GET requests.  The
    output buffers will grow to 200 MB, Redis will
    see the memory usage is now 250 MB, and Redis will start
    evicting keys.  Redis will quickly evict *all* keys because
    evicting keys isn't shrinking the output buffer usage
    in the Redis memory usage calculation.
  - Now the user is left wondering why a series of GET requests
    caused all their keys to be evicted.

Behavior after this commit:
  - maxmemory = 50 MB
  - maxmemory-policy = allkeys-lru or allkeys-random
  - Add 50 MB of data
  - Issue a pipeline of 100,000 GET requests
  - Redis grows the result set into a 200 MB output buffer,
    but the extra 200 MB isn't included in the eviction calculation,
    so no keys get evicted.
  - All keys remain in place and Redis shrinks back to 50 MB after
    the output buffer is freed.
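To make the intended change concrete, here is a rough sketch of where the subtraction would sit, mirroring the shape of the 2.8-era freeMemoryIfNeeded(); the aggregate helper is the hypothetical one sketched earlier, and the elided parts stand in for the existing replica/AOF handling and eviction loop.

    int freeMemoryIfNeeded(void) {
        size_t mem_used = zmalloc_used_memory();

        /* ... existing code: subtract replica output buffers and the
         * AOF buffers from mem_used, as the maxmemory docs describe ... */

        /* Proposed addition: also exclude normal client output buffers,
         * so a large read-only pipeline cannot push mem_used past
         * maxmemory and trigger evictions. */
        mem_used -= getClientsOutputBufferMemoryUsage();

        if (mem_used <= server.maxmemory) return REDIS_OK;

        /* ... existing eviction loop: select keys according to
         * maxmemory-policy and delete until enough memory is freed ... */
        return REDIS_OK;
    }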

antirez commented Mar 21, 2014

There are a few things preventing me from merging. The biggest one is that the O(N) function summing the clients' buffers is called too often: on every call to freeMemoryIfNeeded(). Even in the cron, with N=10k (which is uncommon), it is a bit of an issue... we now do even client timeouts in an incremental way for the same reason.

The second problem is a conceptual one: even if I agree with you that in theory we want maxmemory to be "data" memory, users often set it to the maximum memory the server has. So going far over this figure because clients are doing silly things is a big problem: either the system will kill the process via the OOM killer, or it will start paging out to disk. Both are worse scenarios than evicting keys.

So IMHO, all considered, my feeling is that it is better to avoid this change, even if there are technical solutions to compute the memory used by buffers incrementally: keep a global variable that we increment/decrement together with every change to a client's buffer usage, and finalize it in freeClient().
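For reference, a rough sketch of the incremental accounting described above; every name here is illustrative, not taken from any actual patch. The idea is a server-wide counter adjusted whenever a client's output buffer grows or shrinks, and settled in freeClient(), so the eviction path can read it in O(1).

    /* Illustrative field, e.g. in struct redisServer:
     *     unsigned long long clients_obuf_bytes; */

    /* Call with a positive delta when bytes are queued in a client's
     * reply, and a negative delta when they are written to the socket. */
    void adjustClientsObufBytes(long long delta) {
        server.clients_obuf_bytes += delta;
    }

    /* In freeClient(), drop whatever the dying client still had queued
     * so the counter never drifts. */
    void clientObufAccountingOnFree(redisClient *c) {
        adjustClientsObufBytes(-(long long)getClientOutputBufferMemoryUsage(c));
    }

    /* freeMemoryIfNeeded() can then read server.clients_obuf_bytes
     * directly instead of walking the whole client list. */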


mattsta commented Mar 24, 2014

Good points all around.

What about allowing a DEBUG BUFFERS command to show the memory usage of the client buffers? It can hide under the DEBUG command so people know it may impact performance if run while the server has many clients connected.
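Such a subcommand was never added; purely as an illustration, a handler for it could be as small as the following, assuming the hypothetical aggregate helper sketched earlier and the 2.8-era reply API.

    /* Hypothetical handler for a DEBUG BUFFERS subcommand: it would be
     * wired into debugCommand()'s dispatch chain and simply report the
     * aggregate output buffer memory of all connected clients. */
    void debugBuffersCommand(redisClient *c) {
        addReplyLongLong(c, (long long) getClientsOutputBufferMemoryUsage());
    }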


antirez commented Mar 24, 2014

A way to know the memory used by clients is a good idea, but we probably have good options already: the CLIENT LIST command shows the output buffer memory used by every client (the omem field), while INFO CLIENTS can show you the worst client:

# Clients
connected_clients:1
client_longest_output_list:0
client_biggest_input_buf:0
blocked_clients:0

However this is bad... it means that INFO calls getClientsMaxBuffers(), which is O(N), on every invocation. Not cool... maybe this is why the variable where this value is computed in INFO is called lol (longest output list).

@antirez antirez closed this Mar 24, 2014

mattsta commented Mar 24, 2014

it means that it calls getClientsMaxBuffers() which is O(N) in the context of INFO.

Yeah, since it was already happening, I thought it was fast enough to be acceptable :) Though I added a new O(N) loop instead of reusing the existing one. I'll move it around.

Looks like an error: the O(N) iteration over the clients is being done for every INFO call, but it should only happen when the "clients" section is requested. We can move the getClientsMaxBuffers() call into the "clients" section so that someone requesting, say, INFO STATS doesn't have to iterate over the client list (see the sketch below).

If we do keep the full iteration over the client list, it can't hurt to add one more summation since we're already in the loop.
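A minimal sketch of that move, assuming the shape of the 2.8-era genRedisInfoString() (variable and field names are best-effort guesses at that code, not the final patch): the O(N) getClientsMaxBuffers() call lives inside the "clients" section guard instead of running unconditionally.

    /* Inside genRedisInfoString(section): only pay the O(N) walk over
     * the client list when the "clients" section is actually requested. */
    if (allsections || defaults || !strcasecmp(section,"clients")) {
        unsigned long lol, bib;

        getClientsMaxBuffers(&lol,&bib);   /* O(N) over connected clients */
        info = sdscatprintf(info,
            "# Clients\r\n"
            "connected_clients:%lu\r\n"
            "client_longest_output_list:%lu\r\n"
            "client_biggest_input_buf:%lu\r\n"
            "blocked_clients:%d\r\n",
            listLength(server.clients)-listLength(server.slaves),
            lol, bib,
            server.bpop_blocked_clients);
    }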
