Fingel/rrredis
RRRedis: Really Rusty Redis

This is an implementation of the Redis server written in Rust with a few notable properties. This is purely a learning endeavor, not for production use. This implementation passes the Codecrafters.io Redis challenge integration test suite.

This is a vibe-free project: all the incorrect, incomplete, and horrible slop can be credited to my own lack of skill (minus some tab-completion where it was trivial). Spelling and grammatical errors in this README are my own fault.

Architecture

Unlike the official Redis server, this implementation is multi-threaded. Each client command runs on its own Tokio task. In theory this means better multi-core utilization in the case of many concurrent commands. In reality it probably means worse performance due to synchronization overhead.

To lessen the impact of resource coordination, the main data store uses dashmap, which means that in some cases multiple commands can read/write to the db with true parallelism.

Bookmarks and Patterns

This project required a lot of neat patterns and solutions, some of which I'll undoubtedly want to reference in the future.

Using tokio_util::codec to decode a TCP stream into frames.

This would seem to be generally useful anytime you have a stream of async data, like a TCP connection, and want to pull "chunks" off the stream to operate on. This was probably the moment when I realized that the Tokio project is much, much more than simply an async runtime.

```rust
let mut transport = RespParser.framed(stream);
while let Some(redis_value) = transport.next().await {
    // do something based on redis_value and write/read to the transport.
}
```

The decoder is defined in parser.rs.

Taking over a connection for a replication loop

This was one of the hardest parts to get right - it involved using the tokio::select! macro to listen on both the main command connection and the replication connection. What I found challenging was writing code that is a client to itself, and keeping in mind the context in which the code I was looking at was running: are we in the replica or the main context? The code is shared between the two, minus a few critical branching points, which made it all the more difficult to reason about.

rrredis/src/main.rs

Lines 75 to 80 in f5147d5

```rust
RedisCommand::Psync(id_in, offset_in) => {
    println!("Master - Got replication request");
    let response = psync_preamble(&db, id_in, offset_in).await;
    transport.send(response).await.unwrap();
    let (tx, mut rx) = tokio::sync::mpsc::channel::<RedisCommand>(1024);
    let replica_id = uuid::Uuid::new_v4().to_string();
```

Pubsub pattern using BroadcastStream from tokio_stream

Implementing pubsub meant waiting on messages from multiple receivers at the same time. Another win for the select! macro. However, I had trouble doing this while simultaneously holding a mutable reference to the data structure containing the receivers, which was necessary so pubsub channels could be added and removed at runtime.

One solution I found was to use separate tokio tasks per channel and a single mpsc::channel to coordinate them all for a single connection. This seemed like a lot of boilerplate.

Luckily, there is the BroadcastStream type from tokio_stream that is designed for this very purpose. It made the pattern feel pretty natural.

```rust
type Subscriptions = StreamMap<String, BroadcastStream<RedisValueRef>>;
```

rrredis/src/pubsub.rs

Lines 24 to 39 in f5147d5

```rust
loop {
    tokio::select! {
        Some((channel, result)) = subscriptions.next() => {
            match result {
                Ok(message) => {
                    transport.send(RArray(vec![
                        RString("message"),
                        RString(channel),
                        message,
                    ])).await.unwrap();
                }
                Err(err) => {
                    transport.send(RString(format!("Error: {}", err))).await.unwrap();
                }
            }
        }
```

Using a DashMap for lock-free concurrent access

Traditional Redis uses a single thread for its command queue. RRRedis uses a tokio task per connection and all of that connection's commands. So in order to share data structures between tasks, some kind of locking mechanism is required: Mutex, RwLock, etc.

Fighting over locks for the main data store in Redis seems less than ideal. So I decided to try out DashMap (Distributed Hashmap). It still uses locking mechanisms internally, but it also distributes its keyspace across shards, so read/write operations on different keys can, in most cases, happen concurrently. Very cool.

This meant I didn't have to reach for Mutex or RwLock for the main GET, SET, etc. commands. But the API is a little awkward.

```rust
pub dict: DashMap<String, RedisValue>,
```

rrredis/src/lib.rs

Lines 276 to 281 in f5147d5

```rust
pub async fn get(db: &Db, key: String) -> RedisValueRef {
    match db.get_if_valid(&key) {
        Some(value) => RedisValueRef::from(&*value),
        None => RNull(),
    }
}
```

Using Criterion to benchmark changes

Especially early on in the project, I wanted to be able to measure the real difference between architectural decisions. The two main ones were using DashMap vs. RwLock<HashMap>, and using a VecDeque vs. a Vec for some other internal structures.

Turns out when people say benchmarking is hard, they aren't lying. It took me a while to get tests that actually isolated what I wanted to measure. Once I did, and saw the timings change dramatically between implementations, it was all worth it.

```rust
group.bench_with_input(BenchmarkId::new("rpush", list_size), &list_size, |b, _| {
    b.to_async(&rt).iter(|| async {
        lists::rpush(
            black_box(db_rpush),
            black_box("bench_key".to_string()),
            black_box(vec!["new_item".to_string()]),
        )
        .await
    });
});
```

Writing a parser for the RDB binary format

This was a good opportunity to try out the nom "parser combinator framework". I don't have a lot of experience writing parsers, but the ones I have written have all been much more procedural than what I ended up with using nom.

rrredis/src/rdb.rs

Lines 190 to 196 in d5cde6d

```rust
fn database_value(i: &[u8]) -> IResult<&[u8], KeyValue> {
    let delim: &[u8] = &[0x00];
    let (i, _) = tag(delim)(i)?;
    let (i, key) = encoded_value(i)?;
    let (i, value) = encoded_value(i)?;
    Ok((i, KeyValue { key, value }))
}
```

Using a SkipList to implement a scored set (zset)

Supposedly, real Redis also uses a skiplist to implement zsets, so I tried that here. I did consider implementing the skiplist myself, but decided the more interesting part would be the higher-order ZSet data structure, so I opted for a library. I might go back and implement it from scratch.

rrredis/src/zset.rs

Lines 25 to 30 in d5cde6d

```rust
#[derive(Debug)]
pub struct ZSet {
    map: HashMap<String, Score>,
    list: OrderedSkipList<ListNode>,
}
```
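The interesting part of this two-sided structure is keeping the score map and the ordered list in sync on every write. A self-contained sketch of that bookkeeping, using a BTreeSet as a stand-in for the skiplist (it plays the same ordered-set role) and an integer score for simplicity:

```rust
use std::collections::{BTreeSet, HashMap};

// The map answers ZSCORE in O(1); the ordered set answers ZRANGE/ZRANK
// by iteration order. Both must be updated together.
#[derive(Default)]
struct ZSet {
    map: HashMap<String, i64>,
    list: BTreeSet<(i64, String)>, // ordered by (score, member)
}

impl ZSet {
    // Returns true if the member was newly added (ZADD-style semantics).
    fn zadd(&mut self, member: String, score: i64) -> bool {
        if let Some(old) = self.map.insert(member.clone(), score) {
            // Re-score: remove the stale (old_score, member) entry first,
            // otherwise the member would appear twice in the ordering.
            self.list.remove(&(old, member.clone()));
            self.list.insert((score, member));
            false
        } else {
            self.list.insert((score, member));
            true
        }
    }

    fn zscore(&self, member: &str) -> Option<i64> {
        self.map.get(member).copied()
    }
}
```

The duplication of the member string between the two halves is exactly the memory cost mentioned in the improvement list below.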

Encoding spherical lat/long pairs as scores to store in a scored set

Redis takes a very interesting approach to storing geographical coordinates: it normalizes lat/longs as large integers, then interleaves the binary forms of those integers into one large integer, then stores that in a scored set such that closer objects are scored similarly. I've normally just used something like PostGIS for this kind of thing without looking under the hood, so I don't know if this is actually a novel approach, but I thought it was really cool.

The geo.rs file contains the implementation of various geo commands.
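The normalize-then-interleave step can be sketched as follows. This is a simplified illustration, not the project's actual geo.rs code; it assumes 26 bits per coordinate (as real Redis uses, so the 52-bit result fits in a double-precision score):

```rust
const STEP: u32 = 26;

// Normalize a coordinate from its valid range onto a 26-bit integer.
fn normalize(value: f64, min: f64, max: f64) -> u64 {
    (((value - min) / (max - min)) * (1u64 << STEP) as f64) as u64
}

// Interleave the bits of two 26-bit integers into one 52-bit integer
// (a Morton / Z-order code): points that are close in both dimensions
// get numerically close codes.
fn interleave(x: u64, y: u64) -> u64 {
    let mut out = 0u64;
    for i in 0..STEP as u64 {
        out |= ((x >> i) & 1) << (2 * i);
        out |= ((y >> i) & 1) << (2 * i + 1);
    }
    out
}

fn geo_score(longitude: f64, latitude: f64) -> u64 {
    let lon = normalize(longitude, -180.0, 180.0);
    let lat = normalize(latitude, -85.05112878, 85.05112878);
    interleave(lon, lat)
}
```

With this encoding, Palermo and Catania (two nearby Sicilian cities from the Redis GEOADD docs) end up with much closer scores than either does to a city on another continent, which is what lets range queries on the scored set approximate spatial searches.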

Areas for further improvement

If I were interested in making this project production grade, here are the areas I'd want to clean up:

  1. interpreter.rs helps with conversion between the Redis wire format RedisValueRef and the friendlier RedisCommand enum. This module is full of repetitive boilerplate. I would probably invest some time in cleaning this up, perhaps by leveraging macros or just re-organizing it completely.
  2. The RDB parser in rdb.rs still uses some good-ol match and if branching where sub-parsers would probably be more idiomatic.
  3. The zset implementation duplicates values between the skiplist and the hash map. A proper shared data structure here would probably save some memory, but that's a rabbit hole I had no intention of going down at the time - especially in Rust.
  4. Replication seems to work, but it still feels... fragile? This feels like one of those inherently hard-to-get-right problems - an area where additional rigour would need to be applied.

Running

If for some reason you want to run this project, all you should need is cargo and a Rust compiler. This was built using stable Rust; no nightly required.

```sh
cargo run
```

You can then interact with the server using redis-cli:

```sh
redis-cli PING
PONG
```

The following section lists the available commands.

Commands

Ping Echo Set SetEx Get Rpush Lpush Lrange LLen LPop BLPop Type XAdd XRange XRead Incr Multi Exec Discard Info ReplConf Psync RdbPayload Wait Config Keys Subscribe Unsubscribe PSubscribe PUnsubscribe Publish ZAdd ZRank ZRange ZCard ZScore ZRem GeoAdd GeoPos GeoDist GeoSearch AclWhoami AclGetUser AclSetUser Auth
