Redis: A Practical Guide: Data structures, commands, and patterns for building high-performance applications with Redis
Redis is an open source, in-memory data structure store that can be used as a database, cache, and message broker. It supports various kinds of data structures, such as strings, lists, hashes, sets, sorted sets, streams, and more. Redis is fast, reliable, and scalable, making it a popular choice for many applications that require high performance and low latency.
In this article, we will cover the basics of Redis, how to install it, and how to use it for data modeling, caching, and messaging. We will also provide some examples and best practices to help you get started with Redis.
What is Redis?
Redis stands for Remote Dictionary Server. It was created by Salvatore Sanfilippo in 2009 as a solution for scaling his real-time web log analyzer. He later open sourced it and it gained traction among the Ruby community, with GitHub and Instagram being among the first adopters.
Redis is different from other database systems in that it stores data in memory rather than on disk or SSD. This means that it can access data much faster than traditional databases, but also that its capacity is bounded by available RAM. Redis can persist data to disk using snapshots (RDB) or append-only files (AOF), but this involves a trade-off between performance and durability guarantees.
Redis is also different from other key-value stores in that it supports various data structures that can be manipulated with atomic commands. For example, you can append to a string, push or pop from a list, add or remove from a set, increment or decrement a counter, and so on. These commands allow you to perform complex operations on your data without having to read or write the whole value.
Why use Redis?
Redis has many use cases and advantages that make it a versatile and powerful tool for developers. Some of the reasons to use Redis are:
Speed: Redis is extremely fast because it operates on in-memory data. It can handle millions of operations per second with sub-millisecond latency.
Reliability: Redis is designed to be resilient and fault-tolerant. It supports replication, clustering, high availability, and backup mechanisms to ensure data safety and availability.
Scalability: Redis can scale horizontally by partitioning data across multiple nodes using hash-based sharding. It can also scale vertically by adding more memory or CPU resources to a single node.
Simplicity: Redis is easy to install and use. It has a simple and consistent syntax for commands and data structures. It also has a rich set of client libraries for various programming languages and frameworks.
Flexibility: Redis can be used for a wide range of applications and scenarios, such as caching, session management, real-time analytics, messaging, geospatial indexing, machine learning, gaming, social media, and more.
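As a quick illustration of the hash-based sharding mentioned above: Redis Cluster maps every key to one of 16384 hash slots using CRC16, and each node owns a range of slots. The sketch below approximates this in pure Python; the even slot split and node names are illustrative assumptions, not how a real cluster assigns ranges.

```python
import binascii

NUM_SLOTS = 16384  # Redis Cluster divides the key space into 16384 hash slots

def key_slot(key):
    # Redis Cluster uses CRC16/XMODEM (poly 0x1021, init 0) mod 16384;
    # binascii.crc_hqx implements the same polynomial.
    return binascii.crc_hqx(key.encode(), 0) % NUM_SLOTS

def shard_for(key, shards):
    # Illustrative: split the slot range evenly across shards.
    slots_per_shard = NUM_SLOTS // len(shards)
    index = min(key_slot(key) // slots_per_shard, len(shards) - 1)
    return shards[index]

shards = ["node-a", "node-b", "node-c"]
print(shard_for("user:1:name", shards))
```

The same key always hashes to the same slot, so clients can route commands deterministically without asking a coordinator.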
How to install Redis?
The easiest way to install Redis is to use a package manager for your operating system. For example, on Ubuntu Linux you can run the following commands:
sudo apt update
sudo apt install redis-server
This will install Redis and start it as a service. You can check the status of the service with:
sudo systemctl status redis-server
You can also interact with Redis using the command-line interface (CLI) called redis-cli. To launch it, simply type:
redis-cli
This will connect you to the local Redis server running on the default port 6379. You can then execute Redis commands and see the responses. For example:
redis> PING
PONG
redis> SET hello world
OK
redis> GET hello
"world"
To exit the CLI, type:
redis> QUIT
Data modeling with Redis
One of the most important aspects of using Redis is data modeling. Data modeling is the process of designing how to store and access your data in a way that meets your application requirements and optimizes performance. Data modeling with Redis involves choosing the right data structures and commands for your use case, as well as applying some common patterns and best practices.
Redis supports six main data types: strings, lists, hashes, sets, sorted sets, and streams. Each data type has its own characteristics, advantages, and limitations. Let's take a look at each one in more detail.
Strings
Strings are the simplest and most basic data type in Redis. They are binary safe, meaning they can store any kind of data, such as text, numbers, images, or serialized objects. Strings can be up to 512 MB in size.
You can use strings to store simple values, such as user names, email addresses, session tokens, counters, flags, etc. You can also use strings to store complex values, such as JSON documents, compressed data, encrypted data, etc.
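Storing a complex value as a string usually means serializing it first. A minimal sketch of the JSON case, using a plain dict as a stand-in for Redis (with redis-py these would be `SET`/`GET` calls against a server):

```python
import json

store = {}  # stand-in for Redis; the key names below are illustrative

def set_json(key, obj):
    # Serialize the document to a string before storing it
    store[key] = json.dumps(obj)

def get_json(key):
    raw = store.get(key)
    return json.loads(raw) if raw is not None else None

set_json("user:1", {"name": "Alice", "email": "alice@example.com"})
print(get_json("user:1")["name"])  # Alice
```

The trade-off is that the whole document must be rewritten on every change; for field-level access, a hash is often a better fit.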
To create or update a string value, you can use the SET command. For example:
redis> SET user:1:name Alice
OK
redis> SET user:1:email alice@example.com
OK
To read a string value, you can use the GET command. For example:
redis> GET user:1:name
"Alice"
redis> GET user:1:email
"alice@example.com"
To delete a string value, you can use the DEL command. For example:
redis> DEL user:1:name
(integer) 1
redis> GET user:1:name
(nil)
Redis also provides many other commands to manipulate strings, such as APPEND, INCR, DECR, MSET, MGET, GETSET, STRLEN, and more. You can find the full list of string commands in the official documentation.
Lists
Lists are ordered collections of strings, conceptually linked lists. Lists can store up to 2^32 − 1 (about 4.3 billion) elements and support fast insertion and deletion at both ends.
You can use lists to store sequences of values, such as comments, messages, tasks, logs, etc. You can also use lists to implement stacks or queues.
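The stack and queue patterns map directly onto the push/pop commands. A sketch using Python's deque, whose operations mirror the Redis list commands noted in the comments:

```python
from collections import deque

# appendleft/popleft behave like LPUSH/LPOP; append/pop like RPUSH/RPOP.
tasks = deque()

# Queue (FIFO): LPUSH to produce, RPOP to consume
tasks.appendleft("task-1")   # LPUSH tasks task-1
tasks.appendleft("task-2")   # LPUSH tasks task-2
first = tasks.pop()          # RPOP tasks -> "task-1"

# Stack (LIFO): LPUSH to push, LPOP to pop
tasks.appendleft("task-3")   # LPUSH tasks task-3
top = tasks.popleft()        # LPOP tasks -> "task-3"
```

Pushing and popping from opposite ends gives a queue; using the same end gives a stack.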
To create or update a list value, you can use the LPUSH or RPUSH commands to insert elements at the left or right end of the list. For example:
redis> LPUSH comments "This is awesome!"
(integer) 1
redis> LPUSH comments "I agree!"
(integer) 2
redis> RPUSH comments "Me too!"
(integer) 3
To read a list value, you can use the LRANGE command to get a range of elements from the list. For example:
redis> LRANGE comments 0 -1
1) "I agree!"
2) "This is awesome!"
3) "Me too!"
To delete a list value, you can use the DEL command as before. You can also use the LPOP or RPOP commands to remove and return elements from the left or right end of the list. For example:
redis> LPOP comments
"I agree!"
redis> RPOP comments
"Me too!"
Redis also provides many other commands to manipulate lists, such as LLEN, LINDEX, LINSERT, LREM, LSET, LTRIM, and more. You can find the full list of list commands in the official documentation.
Hashes
Hashes are maps of field-value pairs, ideal for representing objects such as users, products, or sessions. Individual fields can be read or written without touching the rest of the object.
To create or update a hash value, you can use the HSET command. For example:
redis> HSET user:1 name Alice email alice@example.com
(integer) 2
To read a hash value, you can use the HGET or HGETALL commands. For example:
redis> HGET user:1 name
"Alice"
redis> HGETALL user:1
1) "name"
2) "Alice"
3) "email"
4) "alice@example.com"
To delete a hash value, you can use the DEL command as before. You can also use the HDEL command to remove individual fields. For example:
redis> HDEL user:1 email
(integer) 1
Redis also provides many other commands to manipulate hashes, such as HMGET, HINCRBY, HEXISTS, HKEYS, HVALS, HLEN, and more. You can find the full list of hash commands in the official documentation.
Sets
Sets are unordered collections of unique strings that support fast membership testing and set operations. Sets can store up to 2^32 − 1 (about 4.3 billion) elements and offer O(1) average time for adding, removing, and checking elements.
You can use sets to store values that are distinct and do not have any order, such as tags, categories, followers, likes, etc. You can also use sets to perform operations such as union, intersection, difference, and random selection.
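Python's built-in set operators mirror the Redis set commands, which makes the semantics easy to see in isolation (the follower data below is made up for illustration):

```python
# | is SUNION, & is SINTER, - is SDIFF
followers_alice = {"bob", "carol", "dave"}
followers_bob = {"carol", "erin"}

both = followers_alice & followers_bob         # SINTER: mutual followers
either = followers_alice | followers_bob       # SUNION: anyone following either
only_alice = followers_alice - followers_bob   # SDIFF: follows Alice but not Bob

print(sorted(both))  # ['carol']
```

In Redis these operations run server-side, so you avoid shipping both sets over the network just to compare them.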
To create or update a set value, you can use the SADD command to add one or more elements to the set. For example:
redis> SADD tags redis nosql database
(integer) 3
redis> SADD tags cache memory
(integer) 2
To read a set value, you can use the SMEMBERS command to get all the elements in the set. For example:
redis> SMEMBERS tags
1) "database"
2) "cache"
3) "memory"
4) "nosql"
5) "redis"
To delete a set value, you can use the DEL command as before. You can also use the SREM command to remove one or more elements from the set. For example:
redis> SREM tags cache memory
(integer) 2
redis> SMEMBERS tags
1) "database"
2) "nosql"
3) "redis"
Redis also provides many other commands to manipulate sets, such as SCARD, SISMEMBER, SPOP, SRANDMEMBER, SUNION, SINTER, SDIFF, and more. You can find the full list of set commands in the official documentation.
Sorted sets
Sorted sets are similar to sets, but they also have a score associated with each element. The elements are ordered by their score in ascending order. Sorted sets can store up to 2^32 − 1 (about 4.3 billion) elements and support fast insertion, deletion, and range queries.
You can use sorted sets to store values that have an order and a rank, such as leaderboards, scores, ratings, timestamps, etc. You can also use sorted sets to perform operations such as aggregation, filtering, and pagination.
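The leaderboard case is the canonical example. A sketch of the query side in plain Python, sorting a score map on demand (Redis keeps this ordering incrementally, so it does not re-sort on every read; the player names are illustrative):

```python
scores = {"Alice": 100, "Bob": 90, "Charlie": 80, "David": 95}

def top_n(scores, n):
    # Equivalent to ZREVRANGE leaderboard 0 n-1 WITHSCORES
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)[:n]

print(top_n(scores, 3))  # [('Alice', 100), ('David', 95), ('Bob', 90)]
```

Pagination follows naturally: page k of size n is the slice [k*n : (k+1)*n] of the ranked order, which is what ZREVRANGE's start and stop indices express.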
To create or update a sorted set value, you can use the ZADD command to add one or more elements with their scores to the sorted set. For example:
redis> ZADD leaderboard 100 Alice 90 Bob 80 Charlie
(integer) 3
redis> ZADD leaderboard 95 David
(integer) 1
To read a sorted set value, you can use the ZRANGE or ZREVRANGE commands to get a range of elements by their rank (index). You can also use the ZRANGEBYSCORE or ZREVRANGEBYSCORE commands to get a range of elements by their score. For example:
redis> ZRANGE leaderboard 0 -1 WITHSCORES
1) "Charlie"
2) "80"
3) "Bob"
4) "90"
5) "David"
6) "95"
7) "Alice"
8) "100"
redis> ZREVRANGEBYSCORE leaderboard +inf -inf WITHSCORES LIMIT 0 3
1) "Alice"
2) "100"
3) "David"
4) "95"
5) "Bob"
6) "90"
To delete a sorted set value, you can use the DEL command as before. You can also use the ZREM command to remove one or more elements from the sorted set. For example:
redis> ZREM leaderboard Alice Bob
(integer) 2
redis> ZRANGE leaderboard 0 -1 WITHSCORES
1) "Charlie"
2) "80"
3) "David"
4) "95"
Redis also provides many other commands to manipulate sorted sets, such as ZCARD, ZSCORE, ZRANK, ZINCRBY, ZCOUNT, and more. You can find the full list of sorted set commands in the official documentation.
Streams
Streams are append-only collections of entries that are ordered by a unique identifier. A stream can hold as many entries as available memory allows and supports fast insertion and consumption operations.
You can use streams to store sequences of events, such as logs, messages, transactions, sensor readings, etc. You can also use streams to implement pub/sub, queues, and stream processing.
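The key idea behind stream entries is their ID format: `<milliseconds>-<sequence>`, which Redis generates automatically when XADD is called with `*`. A minimal in-memory sketch of that scheme (the function names mimic the commands but are illustrative, not redis-py):

```python
import itertools

_seq = itertools.count()

def xadd(stream, fields, ms=1627987200000):
    # Redis uses the current Unix time in ms; we fix it here for determinism,
    # so the sequence part disambiguates entries within the same millisecond.
    entry_id = f"{ms}-{next(_seq)}"
    stream.append((entry_id, fields))
    return entry_id

def xrange(stream, start="-", end="+"):
    # XRANGE stream - +  returns all entries in ID (insertion) order
    return [e for e in stream
            if (start == "-" or e[0] >= start) and (end == "+" or e[0] <= end)]

events = []
first_id = xadd(events, {"type": "login", "user": "Alice"})
xadd(events, {"type": "logout", "user": "Bob"})
```

Because IDs are monotonically increasing, a consumer can remember the last ID it saw and resume reading exactly where it left off.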
To create or update a stream value, you can use the XADD command to append one or more entries to the stream. Each entry consists of a unique ID and a list of field-value pairs. For example:
redis> XADD events * type login user Alice
"1627987200000-0"
redis> XADD events * type logout user Bob
"1627987200001-0"
redis> XADD events * type purchase user Charlie item book amount 10
"1627987200002-0"
To read a stream value, you can use the XRANGE or XREVRANGE commands to get a range of entries by their ID. You can also use the XREAD command to read from one or more streams with blocking or non-blocking options. For example:
redis> XRANGE events - +
1) 1) "1627987200000-0"
   2) 1) "type"
      2) "login"
      3) "user"
      4) "Alice"
2) 1) "1627987200001-0"
   2) 1) "type"
      2) "logout"
      3) "user"
      4) "Bob"
3) 1) "1627987200002-0"
   2) 1) "type"
      2) "purchase"
      3) "user"
      4) "Charlie"
      5) "item"
      6) "book"
      7) "amount"
      8) "10"
redis> XREAD COUNT 2 STREAMS events $
(nil)
To delete a stream value, you can use the DEL command as before. You can also use the XDEL command to delete one or more entries from the stream. For example:
redis> XDEL events 1627987200000-0
(integer) 1
redis> XRANGE events - +
1) 1) "1627987200001-0"
   2) 1) "type"
      2) "logout"
      3) "user"
      4) "Bob"
2) 1) "1627987200002-0"
   2) 1) "type"
      2) "purchase"
      3) "user"
      4) "Charlie"
      5) "item"
      6) "book"
      7) "amount"
      8) "10"
Redis also provides many other commands to manipulate streams, such as XLEN, XGROUP, XACK, XCLAIM, XPENDING, XTRIM, and more. You can find the full list of stream commands in the official documentation.
Caching with Redis
Caching is one of the most common and popular use cases for Redis. Caching is the process of storing frequently accessed or expensive data in memory for faster retrieval and reduced load on the backend systems. Caching can improve the performance, scalability, and reliability of your applications.
In this section, we will cover some of the benefits and challenges of caching with Redis, as well as some strategies and patterns to implement caching effectively.
Benefits of caching
Caching with Redis can bring many benefits to your applications, such as:
Faster response time: Redis can serve cached data in microseconds, which is much faster than querying a database or an external service.
Lower latency: Redis can reduce the network latency between your application and your data source by storing data closer to where it is needed.
Higher throughput: Redis can handle more requests per second than most databases or services by offloading some of the workload from them.
Better availability: Redis can provide cached data even if the backend systems are down or slow, improving the availability and resilience of your applications.
Lower cost: Redis can reduce the operational cost and complexity of your backend systems by reducing the load and resource consumption on them.
Cache eviction policies
One of the challenges of caching with Redis is managing the memory usage. Since Redis stores data in memory, it has a limited capacity and can run out of space if you store too much data. When this happens, Redis needs to evict some data to make room for new data.
Redis supports eight different cache eviction policies that determine how to select which data to evict when the memory is full. You can configure the cache eviction policy using the maxmemory-policy configuration option. The default policy is noeviction, which means that Redis will not evict any data and will return an error on writes when the memory limit is reached.
The other cache eviction policies are:
allkeys-lru: Evict the least recently used (LRU) key out of all keys.
allkeys-lfu: Evict the least frequently used (LFU) key out of all keys.
allkeys-random: Evict a random key out of all keys.
volatile-lru: Evict the least recently used (LRU) key out of the keys with an expiry set.
volatile-lfu: Evict the least frequently used (LFU) key out of the keys with an expiry set.
volatile-random: Evict a random key out of the keys with an expiry set.
volatile-ttl: Evict the key with the nearest expiry time out of the keys with an expiry set.
The best cache eviction policy depends on your use case and data access pattern. You should choose a policy that minimizes the impact of cache eviction on your application performance and consistency. For example, if your cached data has high temporal locality, meaning that recently accessed data is likely to be accessed again soon, an LRU-based policy is a good fit. If some keys are accessed far more often than others over time, an LFU-based policy keeps those hot keys resident. If your cached data has a natural lifetime, such as session tokens or time-limited results, volatile-ttl evicts the keys that are closest to expiring anyway.
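To make the LRU policy concrete, here is a self-contained sketch of allkeys-lru behavior, tracking recency with an OrderedDict. Note that real Redis approximates LRU by sampling a few keys rather than maintaining an exact ordering like this:

```python
from collections import OrderedDict

class LRUCache:
    """Sketch of allkeys-lru: evict the least recently used key on overflow."""
    def __init__(self, maxsize):
        self.maxsize = maxsize
        self.data = OrderedDict()

    def get(self, key):
        if key not in self.data:
            return None
        self.data.move_to_end(key)  # reads mark the key as most recently used
        return self.data[key]

    def set(self, key, value):
        self.data[key] = value
        self.data.move_to_end(key)  # writes do too
        if len(self.data) > self.maxsize:
            self.data.popitem(last=False)  # evict the least recently used key

cache = LRUCache(2)
cache.set("a", 1)
cache.set("b", 2)
cache.get("a")     # touch "a", so "b" is now the least recently used
cache.set("c", 3)  # over capacity: evicts "b"
```

The same skeleton becomes LFU by tracking an access count per key and evicting the minimum instead of the oldest.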
Cache expiration strategies
Another challenge of caching with Redis is managing the data freshness. Since cached data may become stale or invalid over time, you need to update or remove it periodically to ensure that your application gets the correct and consistent data. There are two main strategies for cache expiration: passive expiration and active expiration.
Passive expiration is when you set an expiry time for each cached key, either with the EXPIRE or PEXPIRE commands or atomically with the EX/PX options of SET. For example:
redis> SET user:1:name Alice EX 3600
OK
redis> TTL user:1:name
(integer) 3599
This will make the key user:1:name expire after 3600 seconds (one hour). When you try to access an expired key, Redis will delete it and return nil. For example:
redis> GET user:1:name
"Alice"
redis> -- wait for one hour --
redis> GET user:1:name
(nil)
The advantage of passive expiration is that it is simple and efficient. You don't need to worry about updating or deleting expired keys manually. The disadvantage of passive expiration is that it may cause inconsistency or latency issues. For example, if your cached data depends on other data sources that change frequently, you may get stale or incorrect data until the expiry time is reached. Also, if you have a large number of expired keys that are not accessed, they may consume memory unnecessarily and trigger cache eviction.
Active expiration is when you update or delete cached keys proactively based on some events or triggers. For example, you can use a pub/sub mechanism to notify your application when some data changes in the backend system, and then update or delete the corresponding cached keys accordingly. Alternatively, you can use a cron job or a scheduled task to scan and remove expired keys periodically.
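A sketch of the event-driven variant of active expiration: the backend publishes a change event, and a subscriber deletes the affected cache keys. With Redis this is typically built on PUBLISH/SUBSCRIBE; a plain in-process callback stands in here, and the key names and event shape are illustrative:

```python
cache = {"user:1:name": "Alice", "user:2:name": "Bob"}
subscribers = []

def subscribe(handler):
    subscribers.append(handler)

def publish(event):
    # Deliver the event to every registered handler
    for handler in subscribers:
        handler(event)

# When a user row changes in the backend, drop its cached keys
subscribe(lambda event: cache.pop(f"user:{event['id']}:name", None))

publish({"table": "users", "id": 1})  # a backend write triggers invalidation
```

After the event fires, only the stale key is gone; unrelated cached data stays warm, which is the advantage over blanket TTLs.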
Cache invalidation patterns
Cache invalidation is the process of updating or deleting cached data when it becomes stale or invalid. Cache invalidation can be challenging because it involves coordination and synchronization between multiple components and systems. There are many cache invalidation patterns that can be used to address different scenarios and requirements. Here are some of the common cache invalidation patterns:
Write-through: This pattern involves writing data to both the cache and the backend system at the same time. This way, the cache always reflects the latest state of the data and there is no need for cache expiration or invalidation. The drawback of this pattern is that it may increase the write latency and reduce the write throughput, as every write operation has to wait for both the cache and the backend system to complete.
Write-around: This pattern involves writing data directly to the backend system and bypassing the cache. This way, the cache does not store any stale or invalid data and there is no need for cache expiration or invalidation. The drawback of this pattern is that it may increase the read latency and reduce the read throughput, as every read operation has to fetch data from the backend system if it is not in the cache.
Write-back: This pattern involves writing data to the cache first and then asynchronously writing it to the backend system later. This way, the cache can provide fast write response and high write throughput, as every write operation only has to wait for the cache to complete. The drawback of this pattern is that it may cause data inconsistency or loss, as there may be a delay or a failure in propagating data from the cache to the backend system.
Cache-aside: This pattern involves reading data from the cache first and then fetching it from the backend system if it is not in the cache. When fetching data from the backend system, it also updates the cache with the latest data. This way, the cache can provide fast read response and high read throughput, as most read operations can be served from the cache. The drawback of this pattern is that it may cause race conditions or concurrency issues, as multiple clients may miss the cache at the same time and fetch the same data in parallel, or write stale data back to the cache after a concurrent update.
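The cache-aside read path can be sketched in a few lines. The `backend` dict below is a hypothetical stand-in for a database query, and the miss counter just makes the behavior observable:

```python
cache = {}
backend = {"user:1": {"name": "Alice"}}  # stand-in for a database
misses = 0

def get_user(key):
    global misses
    value = cache.get(key)       # 1. try the cache first
    if value is None:
        misses += 1
        value = backend[key]     # 2. on a miss, fetch from the backend
        cache[key] = value       # 3. populate the cache for next time
    return value

get_user("user:1")  # first call misses and hits the backend
get_user("user:1")  # second call is served from the cache
```

In production this step would also set a TTL on the cached key, and the race between steps 2 and 3 across concurrent clients is exactly the drawback noted above.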