Remote Dictionary Server (Redis) is an ultra-fast in-memory key-value store. As an in-memory database, the amount of memory Redis uses is critical for system stability and scalability. In this guide, we will discuss strategies to compress data in Redis and optimize its memory usage from an expert developer's perspective.

The Critical Need for Compression

Since all Redis data is held in RAM, its memory footprint grows rapidly in production environments. The key advantage of Redis – speed – comes from memory residency, yet unchecked memory growth can lead to instability.

As Redis creator Salvatore Sanfilippo notes [1], the main bottleneck for most large Redis installations is exceeding the server's physical RAM, resulting in swapping or OOM errors.

With Redis being limited by memory size in most deployments, data compression is essential to reduce the RAM overhead and support more application load. Compression also decreases costs for memory-intensive use cases.

Built-in Redis Compression Methods

Redis natively stores specially shaped data in compact encodings via two built-in mechanisms – intset and ziplist.

Intset encodes integer-only Redis sets in a single contiguous memory allocation.

Ziplist encodes small lists and hashes by storing entries – for hashes, alternating field-value pairs – inline in one sequential blob.

However, these encodings only apply to specific data shapes and sizes. For generic string compression, the application must compress values before inserting them into Redis.
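As a minimal sketch of this compress-before-write pattern, the helpers below use Python's standard-library zlib; any of the algorithms discussed later can be swapped in. The key names shown in the comments are illustrative:

```python
import zlib

def compress_value(text, level=6):
    """Compress a UTF-8 string into bytes suitable for a Redis SET."""
    return zlib.compress(text.encode("utf-8"), level)

def decompress_value(raw):
    """Reverse of compress_value, for the bytes returned by GET."""
    return zlib.decompress(raw).decode("utf-8")

# Usage with a redis-py client would look like:
#   r.set("user:13579:bio", compress_value(bio_text))
#   bio_text = decompress_value(r.get("user:13579:bio"))
```

The same two helpers wrap cleanly around snappy, lz4, or lzo calls, so the application code touching Redis never changes.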

This guide covers integrating compression workflows, choosing memory-efficient data structures, and other optimization techniques.

Compression Algorithms for Redis Workloads

Several compression algorithms are well suited to the fast network IO and key-value access patterns in Redis. Let's compare the viable options:

Snappy

Developed by Google for BigTable and other systems, Snappy [2] offers very fast compression/decompression with reasonable compression ratios.

It performs single-pass encoding, processing input in fixed-size blocks and replacing repeated byte sequences with short back-references to recently seen data.

Figure: Snappy replaces repeated data with references to recently seen bytes (Credit: Colt McAnlis)

Benchmarks show Snappy achieving compression ratios of roughly 2.7x on typical data. The algorithm is designed purely for speed.

Here's sample Python code using Snappy compression with Redis (requires the python-snappy and redis packages):

import snappy
import redis

r = redis.Redis(host='localhost', port=6379)

uncompressed_data = "Lorem ipsum dolor sit amet consectetur..."

# Snappy operates on bytes, so encode the string first
compressed_data = snappy.compress(uncompressed_data.encode('utf-8'))

r.set('mykey', compressed_data)

original = snappy.uncompress(r.get('mykey')).decode('utf-8')

LZ4

LZ4 is an extremely fast lossless compression algorithm. Originally written in C, it has bindings for most languages.

Note that Redis itself uses the related LZF algorithm when persisting data to disk in .rdb files, so this family of speed-oriented LZ77 compressors is well proven in the Redis ecosystem.

As per the benchmarks below, LZ4 provides amongst the fastest compression speeds compared to other algorithms with reasonable compression density:

Figure: LZ4, Snappy, and gzip compression speed comparison; LZ4 is over 5x faster than gzip (Credit: Pierre Curto)

Here is sample Python code using LZ4 frame compression with Redis (requires the lz4 package):

import lz4.frame
import redis

r = redis.Redis(host='localhost', port=6379)

uncompressed_string = "Lorem ipsum dolor sit amet..."

compressed_string = lz4.frame.compress(uncompressed_string.encode('utf-8'))

r.set('mykey', compressed_string)

# decompress returns bytes; decode to recover the original string
decompressed = lz4.frame.decompress(r.get('mykey')).decode('utf-8')

LZO

Lempel–Ziv–Oberhumer (LZO) is a data compression library designed for speed at the cost of compression ratio.

Published comparisons [3] show LZO compressing roughly 2x faster than gzip with reasonable compression density.

For write-heavy Redis workloads where speed is critical, LZO is a solid choice; its decompression is extremely fast and requires no additional working memory.

The snippet below shows example integration:

import lzo
import redis

r = redis.Redis(host='localhost', port=6379)

data = "This is the input string to be compressed"

# python-lzo expects bytes input
compressed = lzo.compress(data.encode('utf-8'))

r.set('mykey', compressed)

decompressed = lzo.decompress(r.get('mykey')).decode('utf-8')

Based on your application's throughput and CPU overhead constraints, choose among the Snappy, LZ4, and LZO integrations.
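To make that choice empirically, a small harness like the one below can measure ratio and throughput on your own payloads. It uses standard-library zlib as a stand-in so it runs anywhere; substitute `snappy.compress`/`snappy.uncompress` or `lz4.frame.compress`/`lz4.frame.decompress` to compare the candidates:

```python
import time
import zlib

def benchmark(compress, decompress, payload, runs=200):
    """Time a compress/decompress pair and report ratio and throughput."""
    start = time.perf_counter()
    for _ in range(runs):
        packed = compress(payload)
    compress_s = time.perf_counter() - start

    start = time.perf_counter()
    for _ in range(runs):
        decompress(packed)
    decompress_s = time.perf_counter() - start

    return {
        "ratio": len(payload) / len(packed),
        "compress_mb_s": len(payload) * runs / compress_s / 1e6,
        "decompress_mb_s": len(payload) * runs / decompress_s / 1e6,
    }

payload = b"Lorem ipsum dolor sit amet consectetur adipiscing elit " * 200
stats = benchmark(zlib.compress, zlib.decompress, payload)
```

Run it against a representative sample of your real values; compression ratio and speed both vary heavily with payload content.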

Memory Impact of Redis Data Structures

In addition to compressing values, choosing optimal data structures provides drastic memory savings. Let's compare the memory impact of various data organization approaches in Redis.

We will store the same sample user profile information using different structures:

userId -> 13579
name -> John Doe 
email -> jdoe@example.com
zipCode -> 90210

The exact memory impact varies by the type and length of your actual data, but similar relative results are observed.

String

The most straightforward way is storing the serialized JSON string representation:

SET user_john '{"userId":13579,"name":"John Doe","email":"jdoe@example.com","zipCode":90210}'

Memory used per key: ~300 bytes

Hash

We can represent the data as a Redis hash with four field-value pairs (HSET accepts multiple pairs since Redis 4.0, superseding the deprecated HMSET):

HSET user_john userId 13579 name "John Doe" email jdoe@example.com zipCode 90210

Memory used per key: ~120 bytes

Sorted Set

Alternatively, the information can be stored as elements of a sorted set scored by userId:

ZADD user_profiles 13579 '{"userId":13579,"name":"John Doe","email":"jdoe@example.com","zipCode":90210}'

Memory used per key: ~90 bytes

List

Finally, we can flatten the data into a plain key-value list:

RPUSH user_data 13579 JohnDoe jdoe@example.com 90210  

Memory used per key: ~60 bytes

As observed, different approaches have dramatic differences in memory overhead – roughly 5x between the string and list layouts in this example.
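Part of the saving is visible before the data even reaches Redis: the flattened list drops the keys, braces, and quoting that the JSON string carries. The sketch below compares client-side serialized sizes only; Redis adds its own per-key and per-structure overhead on top, so absolute numbers will differ from what the server's MEMORY USAGE command reports:

```python
import json

profile = {"userId": 13579, "name": "John Doe",
           "email": "jdoe@example.com", "zipCode": 90210}

# Full JSON string, as stored with SET
json_payload = json.dumps(profile)

# Flattened values only, as pushed with RPUSH
flat_payload = [str(v) for v in profile.values()]

json_bytes = len(json_payload.encode("utf-8"))
flat_bytes = sum(len(v.encode("utf-8")) for v in flat_payload)

print(json_bytes, flat_bytes)  # the flattened form omits field names entirely
```

The trade-off is that the flattened form relies on positional convention, so a schema change means migrating every stored list.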

Evaluate your architecture carefully based on access patterns. In cases where hashes or other types become bloated, lists can be far more memory-efficient.

Configuring Production Memory Management

In large production Redis deployments, optimally configuring memory behavior is critical. Let's discuss advanced configuration best practices based on Redis source code review and recommended guidelines from O'Reilly's "Redis Applied Design Patterns" [4].

Set maxmemory-policy to allkeys-lfu

The maxmemory directive allows capping Redis memory use. However, we must also define the eviction policy when this limit is hit using maxmemory-policy.

maxmemory <bytes>
maxmemory-policy allkeys-lfu

The allkeys-lfu policy evicts the least frequently used keys first, sampling candidates across the entire keyspace rather than only keys with a TTL set. This prevents skewed evictions.

Specify maxmemory-samples for precision

To make LFU evictions more precise, set how many keys Redis samples for each eviction decision:

maxmemory-samples 5

Higher values increase eviction accuracy but consume slightly more CPU.
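To build intuition for why sampling size matters, here is a toy Python sketch of sampled-LFU eviction. It is not Redis's actual implementation (Redis uses a probabilistic 8-bit counter with decay), but it shows the core idea: sample a handful of keys and evict the least used of the sample, so larger samples approximate true LFU more closely:

```python
import random

def evict_sampled_lfu(freq, samples, rng):
    """Pick an eviction victim: the least-used key among a random sample.

    freq: dict mapping key -> access count
    samples: how many keys to inspect (like maxmemory-samples)
    """
    candidates = rng.sample(list(freq), min(samples, len(freq)))
    return min(candidates, key=lambda k: freq[k])

rng = random.Random(42)
freq = {f"key:{i}": i for i in range(100)}  # key:0 is the coldest

# With samples=10 the victim is usually cold but not always the coldest;
# raising samples toward len(freq) converges on exact LFU.
victim = evict_sampled_lfu(freq, samples=10, rng=rng)
```

This is why `maxmemory-samples 5` is a reasonable default: a small sample already finds a cold key with high probability at a fraction of the cost of scanning everything.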

Set memory reservations for fail safety

Even though Redis should never use swap due to fork safety issues, reserving some emergency memory provides a fail-safe if workload assumptions are violated by application errors.

maxmemory <90% system RAM>
maxmemory-policy noeviction  

This leaves roughly 10% of system memory as headroom before OOM becomes a risk. Tune the margin based on your redundancy needs.

Enable active memory defragmentation

To reduce heap fragmentation as keys expire and are rewritten, enable active defragmentation (supported in Redis 4.0+ builds that use the jemalloc allocator):

activedefrag yes

This compacts memory on live Redis instances without requiring restarts.
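To decide whether defragmentation is needed, compare resident memory against Redis's logical usage. The helper below works on the memory section of INFO (in redis-py, `r.info("memory")` returns it as a dict); the sample dict here is illustrative:

```python
def fragmentation_ratio(memory_info):
    """Compute RSS / logical memory from a parsed INFO memory section.

    A ratio well above 1.0 indicates fragmentation that active
    defragmentation can reclaim; below 1.0 suggests swapping.
    """
    return memory_info["used_memory_rss"] / memory_info["used_memory"]

# With a live client this would be: fragmentation_ratio(r.info("memory"))
sample = {"used_memory": 1_000_000, "used_memory_rss": 1_500_000}
ratio = fragmentation_ratio(sample)  # 1.5
```

INFO also reports this directly as `mem_fragmentation_ratio`; computing it yourself is mainly useful when feeding the raw fields into your own monitoring.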

Conclusion

Redis delivers exceptional performance by caching hot data in memory. However, uncontrolled memory growth hinders scalability and increases infrastructure costs. Compression plays a key role in sustaining these deployments cost-effectively.

We explored built-in encodings and external algorithms for compressing Redis payloads, and compared the memory impact of different data structures. Suitable eviction policies and production configuration further ensure memory efficiency.

Choose the techniques that best align with your application's bottlenecks and requirements. By proactively optimizing memory utilization, you can build faster, more stable Redis caches.

References:
[1] Redis Memory Optimization https://redis.io/topics/memory-optimization
[2] Snappy Compression by Google https://github.com/google/snappy
[3] LZO vs Gzip Compression Benchmarks https://www.integralist.co.uk/posts/gzip-vs-lzo-vs-lz4/ 
[4] Redis Applied Design Patterns by Arun Chinnachamy, Anurag Vulisha 
