Description

Redis is a popular key-value cache store. This solution describes a high availability enterprise cluster with N+1 redundancy, which tolerates the complete failure of 1 component or host.

Solution

This walkthrough describes the steps required to deploy the redis service to 3 servers, resulting in redis and sentinel running to a 2+1 high availability specification. This solution is written for CentOS / RHEL; update paths and the package manager for other distributions. I've used 3 KVM virtual machines, each with 2 CPUs, 2GB RAM and 10GB storage. There is a final step showing how to collect metrics for prometheus monitoring.

Prerequisites

Configure 3 Linux hosts on the same subnet. For speed in this walkthrough deployment, put SELinux into permissive mode and flush the iptables rules.

  • setenforce 0
  • iptables -F

Install Redis packages

For CentOS 7 we can get redis from EPEL. Redis Sentinel is part of the standard redis package. Install on all 3 servers.

yum install -y epel-release
yum install -y redis

At the time of writing, redis 3.2 was installed.

$ rpm -qa redis
redis-3.2.12-2.el7.x86_64

Prerequisite configuration files

Edit your /etc/redis.conf file on each host.

By default redis only listens on localhost. As we need connectivity from the 2 other hosts, comment out the bind option so that redis listens on all interfaces.

#bind 127.0.0.1

Protected mode is enabled by default; disable it to allow connections from other hosts.

protected-mode no
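
The two edits can also be scripted. This is a minimal sketch, assuming the stock config file contains the lines "bind 127.0.0.1" and "protected-mode yes"; the helper name open_redis_conf is hypothetical and not part of redis.

```shell
#!/bin/sh
# Hypothetical helper: apply both edits to a redis.conf-style file.
# Assumes the file contains "bind 127.0.0.1" and "protected-mode yes" lines.
open_redis_conf() {
    conf="$1"
    # Keep a backup (<file>.bak), comment out the bind line and
    # switch protected mode off.
    sed -i.bak \
        -e 's/^bind 127\.0\.0\.1/#bind 127.0.0.1/' \
        -e 's/^protected-mode yes/protected-mode no/' \
        "$conf"
}

# Example (run as root on each host):
# open_redis_conf /etc/redis.conf
```

The backup copy makes it easy to diff the result against the original before starting the daemon.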

Configure master redis instance 6379 (redis-1 host)

Start redis daemon.

[root@redis-1 ~]# systemctl start redis
[root@redis-1 ~]# systemctl status redis
● redis.service - Redis persistent key-value database
   Loaded: loaded (/usr/lib/systemd/system/redis.service; disabled; vendor preset: disabled)
  Drop-In: /etc/systemd/system/redis.service.d
           └─limit.conf
   Active: active (running) since Sat 2020-03-21 16:54:24 GMT; 2s ago
 Main PID: 1495 (redis-server)
   CGroup: /system.slice/redis.service
           └─1495 /usr/bin/redis-server 127.0.0.1:6379

Mar 21 16:54:24 redis-1.home.lan systemd[1]: Starting Redis persistent key-value database...
Mar 21 16:54:24 redis-1.home.lan systemd[1]: Started Redis persistent key-value database.

Get current redis daemon status.

[root@redis-1 ~]# redis-cli -p 6379 info server
# Server
redis_version:3.2.12
redis_git_sha1:00000000
redis_git_dirty:0
redis_build_id:7897e7d0e13773f
redis_mode:standalone
os:Linux 3.10.0-1062.12.1.el7.x86_64 x86_64
arch_bits:64
multiplexing_api:epoll
gcc_version:4.8.5
process_id:1495
run_id:bfb674d8ad1fbb15a280ee68c51da5d6ac391701
tcp_port:6379
uptime_in_seconds:103
uptime_in_days:0
hz:10
lru_clock:7751463
executable:/usr/bin/redis-server
config_file:/etc/redis.conf

Get replication info showing this is a master.

[root@redis-1 ~]# redis-cli -p 6379 info replication
# Replication
role:master
connected_slaves:0
master_repl_offset:0
repl_backlog_active:0
repl_backlog_size:1048576
repl_backlog_first_byte_offset:0
repl_backlog_histlen:0
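
When scripting these checks it helps to pull a single field out of the info output. A minimal sketch, assuming the "key:value" layout shown above; the helper name info_field is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print the value of one "key:value" field from
# `redis-cli info` output read on stdin.
info_field() {
    awk -v key="$1" '
        index($0, key ":") == 1 {
            value = substr($0, length(key) + 2)
            sub(/\r$/, "", value)   # strip a trailing carriage return if present
            print value
            exit
        }'
}

# Example:
# redis-cli -p 6379 info replication | info_field role
```

On the host above this would print master, matching the role line in the output.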

Configure slave instances (redis-2 and redis-3)

With the master redis instance online on host redis-1, we can now configure slave instances 2 and 3 and join them to the cluster. They will initially be masters until replication is set up. See the official redis replication documentation.

Redis-2

Start redis daemon.

[root@redis-2 ~]# systemctl start redis
[root@redis-2 ~]# systemctl status redis
● redis.service - Redis persistent key-value database
   Loaded: loaded (/usr/lib/systemd/system/redis.service; disabled; vendor preset: disabled)
  Drop-In: /etc/systemd/system/redis.service.d
           └─limit.conf
   Active: active (running) since Sat 2020-03-21 16:59:23 GMT; 8s ago
 Main PID: 1517 (redis-server)
   CGroup: /system.slice/redis.service
           └─1517 /usr/bin/redis-server 127.0.0.1:6379

Mar 21 16:59:23 redis-2.home.lan systemd[1]: Starting Redis persistent key-value database...
Mar 21 16:59:23 redis-2.home.lan systemd[1]: Started Redis persistent key-value database.

Get replication info showing this is a new master. We will then configure it as a slave.

[root@redis-2 ~]# redis-cli -p 6379 info replication
# Replication
role:master
connected_slaves:0
master_repl_offset:0
repl_backlog_active:0
repl_backlog_size:1048576
repl_backlog_first_byte_offset:0
repl_backlog_histlen:0

Issue the command to make redis-2 a slave of redis-1. You will need the IP address of your redis-1 host.

[root@redis-2 ~]# redis-cli -p 6379 slaveof 192.168.0.61 6379
OK

Confirm this is now a slave of redis-1. Notice the master host IP address and status is up.

[root@redis-2 ~]# redis-cli -p 6379 info replication 
# Replication
role:slave
master_host:192.168.0.61
master_port:6379
master_link_status:up
master_last_io_seconds_ago:0
master_sync_in_progress:0
slave_repl_offset:126519
slave_priority:100
slave_read_only:1
connected_slaves:0
master_repl_offset:0
repl_backlog_active:0
repl_backlog_size:1048576
repl_backlog_first_byte_offset:0
repl_backlog_histlen:0

Redis-3

Start redis daemon.

[root@redis-3 ~]# systemctl start redis
[root@redis-3 ~]# systemctl status redis
● redis.service - Redis persistent key-value database
   Loaded: loaded (/usr/lib/systemd/system/redis.service; disabled; vendor preset: disabled)
  Drop-In: /etc/systemd/system/redis.service.d
           └─limit.conf
   Active: active (running) since Sat 2020-03-21 16:59:24 GMT; 8min ago
 Main PID: 1490 (redis-server)
   CGroup: /system.slice/redis.service
           └─1490 /usr/bin/redis-server 127.0.0.1:6379

Mar 21 16:59:24 redis-3.home.lan systemd[1]: Starting Redis persistent key-value database...
Mar 21 16:59:24 redis-3.home.lan systemd[1]: Started Redis persistent key-value database.

Get replication info showing this is a new master. We will then configure it as a slave.

[root@redis-3 ~]# redis-cli -p 6379 info replication
# Replication
role:master
connected_slaves:0
master_repl_offset:0
repl_backlog_active:0
repl_backlog_size:1048576
repl_backlog_first_byte_offset:0
repl_backlog_histlen:0

Issue the command to make redis-3 a slave of redis-1.

[root@redis-3 ~]# redis-cli -p 6379 slaveof 192.168.0.61 6379
OK

Confirm replication status with redis-1.

[root@redis-3 ~]# redis-cli -p 6379 info replication
# Replication
role:slave
master_host:192.168.0.61
master_port:6379
master_link_status:up
master_last_io_seconds_ago:2
master_sync_in_progress:0
slave_repl_offset:124580
slave_priority:100
slave_read_only:1
connected_slaves:0
master_repl_offset:0
repl_backlog_active:0
repl_backlog_size:1048576
repl_backlog_first_byte_offset:0
repl_backlog_histlen:0
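
A slaveof issued through redis-cli changes the running instance; as far as I know it is not written back to /etc/redis.conf on its own. Below is a minimal sketch for making the setting survive a restart, assuming the redis 3.2 "slaveof <ip> <port>" config directive; the helper name persist_slaveof is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: add a "slaveof <ip> <port>" directive to a config
# file unless one is already present.
persist_slaveof() {
    conf="$1"; master_ip="$2"; master_port="$3"
    if grep -q '^slaveof ' "$conf"; then
        echo "slaveof already set in $conf"
    else
        echo "slaveof $master_ip $master_port" >> "$conf"
    fi
}

# Example (on redis-2 and redis-3 only):
# persist_slaveof /etc/redis.conf 192.168.0.61 6379
```

Running redis-cli -p 6379 config rewrite after the slaveof command is another route, since CONFIG REWRITE writes the running configuration back to the file.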

Configure Redis Sentinels

Here we will remove the default redis master monitor ‘mymaster’. We then issue a command to start monitoring our master on its IP address. We give the cluster a new group name ‘app-cache’ and configure it with a quorum of 2.

The quorum is the number of sentinels that must be in agreement to trigger a failover, thereby promoting a slave to new master. The full command specification is as follows:

sentinel monitor <master-group-name> <ip> <port> <quorum>

For full documentation on redis sentinel see the Official Redis Sentinel Documentation.
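
The quorum of 2 used in this walkthrough is a simple majority of the 3 sentinels. A short sketch of that arithmetic and of filling in the command specification; a majority quorum is a common choice rather than something sentinel requires, and the helper names majority_quorum and monitor_command are hypothetical.

```shell
#!/bin/sh
# Hypothetical helpers; a majority quorum is an assumption, not a
# sentinel requirement.
majority_quorum() {
    # Smallest number of sentinels that forms a majority of $1.
    echo $(( $1 / 2 + 1 ))
}

monitor_command() {
    # Args: <master-group-name> <ip> <port> <sentinel-count>
    echo "sentinel monitor $1 $2 $3 $(majority_quorum "$4")"
}

# Example:
# monitor_command app-cache 192.168.0.61 6379 3
```

With 3 sentinels this yields a quorum of 2, so agreement can still be reached after losing any single host.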

Redis-1

Start the sentinel daemon.

[root@redis-1 ~]# systemctl start redis-sentinel
[root@redis-1 ~]# systemctl status redis-sentinel
● redis-sentinel.service - Redis Sentinel
   Loaded: loaded (/usr/lib/systemd/system/redis-sentinel.service; disabled; vendor preset: disabled)
  Drop-In: /etc/systemd/system/redis-sentinel.service.d
           └─limit.conf
   Active: active (running) since Sat 2020-03-21 17:11:11 GMT; 3s ago
 Main PID: 1528 (redis-sentinel)
   CGroup: /system.slice/redis-sentinel.service
           └─1528 /usr/bin/redis-sentinel *:26379 [sentinel]

Mar 21 17:11:11 redis-1.home.lan systemd[1]: Starting Redis Sentinel...
Mar 21 17:11:11 redis-1.home.lan systemd[1]: Started Redis Sentinel.

Review initial configuration.

[root@redis-1 ~]# redis-cli -p 26379 info sentinel
# Sentinel
sentinel_masters:1
sentinel_tilt:0
sentinel_running_scripts:0
sentinel_scripts_queue_length:0
sentinel_simulate_failure_flags:0
master0:name=mymaster,status=ok,address=127.0.0.1:6379,slaves=0,sentinels=1

Remove the default master monitor ‘mymaster’. We will issue a new monitoring command for our cluster using the master IP address, giving it a service name specific to our use case.

[root@redis-1 ~]#  redis-cli -p 26379 sentinel remove mymaster
OK
[root@redis-1 ~]#  redis-cli -p 26379 info sentinel
# Sentinel
sentinel_masters:0
sentinel_tilt:0
sentinel_running_scripts:0
sentinel_scripts_queue_length:0
sentinel_simulate_failure_flags:0

Issue the command to start monitoring the new redis master. Here we name the redis cache app-cache, with a quorum of 2.

[root@redis-1 ~]# redis-cli -p 26379 sentinel monitor app-cache 192.168.0.61 6379 2
OK

Confirm the new cluster is being monitored. Notice the number of slaves and sentinels.

[root@redis-1 ~]# redis-cli -p 26379 info sentinel
# Sentinel
sentinel_masters:1
sentinel_tilt:0
sentinel_running_scripts:0
sentinel_scripts_queue_length:0
sentinel_simulate_failure_flags:0
master0:name=app-cache,status=ok,address=192.168.0.61:6379,slaves=2,sentinels=1

Redis-2

Start the sentinel daemon.

[root@redis-2 ~]# systemctl start redis-sentinel
[root@redis-2 ~]# systemctl status redis-sentinel
● redis-sentinel.service - Redis Sentinel
   Loaded: loaded (/usr/lib/systemd/system/redis-sentinel.service; disabled; vendor preset: disabled)
  Drop-In: /etc/systemd/system/redis-sentinel.service.d
           └─limit.conf
   Active: active (running) since Sat 2020-03-21 18:27:10 GMT; 5s ago
 Main PID: 7189 (redis-sentinel)
   CGroup: /system.slice/redis-sentinel.service
           └─7189 /usr/bin/redis-sentinel *:26379 [sentinel]

Mar 21 18:27:10 redis-2.home.lan systemd[1]: Starting Redis Sentinel...
Mar 21 18:27:10 redis-2.home.lan systemd[1]: Started Redis Sentinel.

Review initial configuration.

[root@redis-2 ~]# redis-cli -p 26379 info sentinel
# Sentinel
sentinel_masters:1
sentinel_tilt:0
sentinel_running_scripts:0
sentinel_scripts_queue_length:0
sentinel_simulate_failure_flags:0
master0:name=mymaster,status=ok,address=127.0.0.1:6379,slaves=0,sentinels=2

Remove the default master monitor ‘mymaster’.

[root@redis-2 ~]# redis-cli -p 26379 sentinel remove mymaster
OK
[root@redis-2 ~]# redis-cli -p 26379 info sentinel
# Sentinel
sentinel_masters:0
sentinel_tilt:0
sentinel_running_scripts:0
sentinel_scripts_queue_length:0
sentinel_simulate_failure_flags:0

Issue the command to start monitoring the redis master on the redis-1 host.

[root@redis-2 ~]# redis-cli -p 26379 sentinel monitor app-cache 192.168.0.61 6379 2
OK

Confirm a new sentinel is monitoring the cluster. Notice the number of slaves and the extra sentinel (2).

[root@redis-2 ~]# redis-cli -p 26379 info sentinel
# Sentinel
sentinel_masters:1
sentinel_tilt:0
sentinel_running_scripts:0
sentinel_scripts_queue_length:0
sentinel_simulate_failure_flags:0
master0:name=app-cache,status=ok,address=192.168.0.61:6379,slaves=0,sentinels=2

Redis-3

Start the sentinel daemon.

[root@redis-3 ~]# systemctl start redis-sentinel
[root@redis-3 ~]#  systemctl status redis-sentinel
● redis-sentinel.service - Redis Sentinel
   Loaded: loaded (/usr/lib/systemd/system/redis-sentinel.service; disabled; vendor preset: disabled)
  Drop-In: /etc/systemd/system/redis-sentinel.service.d
           └─limit.conf
   Active: active (running) since Sat 2020-03-21 18:28:50 GMT; 3s ago
 Main PID: 7121 (redis-sentinel)
   CGroup: /system.slice/redis-sentinel.service
           └─7121 /usr/bin/redis-sentinel *:26379 [sentinel]

Mar 21 18:28:50 redis-3.home.lan systemd[1]: Starting Redis Sentinel...
Mar 21 18:28:50 redis-3.home.lan systemd[1]: Started Redis Sentinel.

Review initial configuration.

[root@redis-3 ~]# redis-cli -p 26379 info sentinel
# Sentinel
sentinel_masters:1
sentinel_tilt:0
sentinel_running_scripts:0
sentinel_scripts_queue_length:0
sentinel_simulate_failure_flags:0
master0:name=mymaster,status=ok,address=127.0.0.1:6379,slaves=0,sentinels=2

Remove the default master monitor ‘mymaster’.

[root@redis-3 ~]# redis-cli -p 26379 sentinel remove mymaster
OK
[root@redis-3 ~]# redis-cli -p 26379 info sentinel
# Sentinel
sentinel_masters:0
sentinel_tilt:0
sentinel_running_scripts:0
sentinel_scripts_queue_length:0
sentinel_simulate_failure_flags:0

Issue the command to start monitoring the redis master on the redis-1 host.

[root@redis-3 ~]# redis-cli -p 26379 sentinel monitor app-cache 192.168.0.61 6379 2
OK

Confirm a new sentinel is monitoring the cluster. Notice the number of slaves and the extra sentinel (3).

[root@redis-3 ~]# redis-cli -p 26379 info sentinel
# Sentinel
sentinel_masters:1
sentinel_tilt:0
sentinel_running_scripts:0
sentinel_scripts_queue_length:0
sentinel_simulate_failure_flags:0
master0:name=app-cache,status=ok,address=192.168.0.61:6379,slaves=0,sentinels=3
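
The master0 line carries everything needed for a quick health check. A minimal sketch for extracting one of its fields, assuming the comma-separated "key=value" layout shown above; the helper name sentinel_master_field is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: read `info sentinel` output on stdin and print one
# field (name, status, address, slaves, sentinels) for a named master group.
sentinel_master_field() {
    group="$1"; field="$2"
    tr -d '\r' |
        grep "^master[0-9]*:name=$group," |
        sed 's/^master[0-9]*://' |
        tr ',' '\n' |
        awk -F= -v f="$field" '$1 == f { print $2 }'
}

# Example:
# redis-cli -p 26379 info sentinel | sentinel_master_field app-cache sentinels
```

Once all three sentinels have joined, the sentinels field should read 3 on every host.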

Testing

Confirmation of slaves connected on master

[root@redis-1 ~]# redis-cli -p 6379 info replication 
# Replication
role:master
connected_slaves:2
slave0:ip=192.168.0.63,port=6379,state=online,offset=193884,lag=0
slave1:ip=192.168.0.62,port=6379,state=online,offset=193884,lag=0
master_repl_offset:193884
repl_backlog_active:1
repl_backlog_size:1048576
repl_backlog_first_byte_offset:2
repl_backlog_histlen:193883

Write a key to the master and confirm replication to the slaves

[root@redis-1 ~]# redis-cli -p 6379 set my_value 100
OK
[root@redis-1 ~]# redis-cli -p 6379 get my_value
"100"

[root@redis-2 ~]# redis-cli -p 6379 get my_value
"100"

[root@redis-3 ~]# redis-cli -p 6379 get my_value
"100"

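The same replication check can be looped over all hosts. A minimal sketch, assuming redis listens on port 6379 on each host; the helper name check_key and the REDIS_CLI override variable are hypothetical, the latter only there so the client can be swapped for a stub.

```shell
#!/bin/sh
# Hypothetical helper: read a key from several hosts and report mismatches.
# REDIS_CLI can be overridden (for example with a stub) and defaults to
# redis-cli.
check_key() {
    key="$1"; expected="$2"; shift 2
    rc=0
    for host in "$@"; do
        actual=$("${REDIS_CLI:-redis-cli}" -h "$host" -p 6379 get "$key")
        if [ "$actual" = "$expected" ]; then
            echo "$host ok"
        else
            echo "$host mismatch: got '$actual'"
            rc=1
        fi
    done
    return "$rc"
}

# Example:
# check_key my_value 100 192.168.0.61 192.168.0.62 192.168.0.63
```

The function returns non-zero if any host disagrees, which makes it usable in a larger test script.
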
Force failover with sentinel

View master status.

[root@redis-2 redis]# redis-cli -p 26379 info sentinel
# Sentinel
sentinel_masters:1
sentinel_tilt:0
sentinel_running_scripts:0
sentinel_scripts_queue_length:0
sentinel_simulate_failure_flags:0
master0:name=app-cache,status=ok,address=192.168.0.61:6379,slaves=2,sentinels=3

Force failover.

[root@redis-2 redis]# redis-cli -p 26379 sentinel failover app-cache
OK

Confirm new master IP address has changed in sentinel monitor.

[root@redis-2 redis]# redis-cli -p 26379 info sentinel
# Sentinel
sentinel_masters:1
sentinel_tilt:0
sentinel_running_scripts:0
sentinel_scripts_queue_length:0
sentinel_simulate_failure_flags:0
master0:name=app-cache,status=ok,address=192.168.0.62:6379,slaves=2,sentinels=3
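
Clients and scripts can ask a sentinel directly for the current master instead of reading the info output. A minimal sketch using the sentinel get-master-addr-by-name command, assuming the sentinel listens on port 26379; the helper name current_master and the REDIS_CLI override variable are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: ask a sentinel for the current master of a group
# and print it as ip:port. REDIS_CLI may be overridden; it defaults to
# redis-cli.
current_master() {
    group="$1"; sentinel_host="${2:-127.0.0.1}"
    "${REDIS_CLI:-redis-cli}" -h "$sentinel_host" -p 26379 \
        sentinel get-master-addr-by-name "$group" |
        paste -s -d: -
}

# Example:
# current_master app-cache
```

The command replies with the IP address and port on separate lines, which the helper joins with a colon.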

Confirm new master redis instance is master and has slaves attached.

[root@redis-2 redis]# redis-cli -p 6379 info replication
# Replication
role:master
connected_slaves:2
slave0:ip=192.168.0.61,port=6379,state=online,offset=891455,lag=1
slave1:ip=192.168.0.63,port=6379,state=online,offset=891597,lag=1
master_repl_offset:891739
repl_backlog_active:1
repl_backlog_size:1048576
repl_backlog_first_byte_offset:886270
repl_backlog_histlen:5470

Monitoring with grafana and prometheus

I use prometheus for metrics. In order for prometheus to collect metrics it needs to scrape an exporter agent; here I use redis_exporter, available from the oliver006/redis_exporter project on GitHub.

Setup redis exporter

Download the release package.

[root@redis-1 ~]# curl -L -O https://github.com/oliver006/redis_exporter/releases/download/v1.5.2/redis_exporter-v1.5.2.linux-amd64.tar.gz
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100   630  100   630    0     0   1536      0 --:--:-- --:--:-- --:--:--  1544
100 3281k  100 3281k    0     0  1223k      0  0:00:02  0:00:02 --:--:-- 1638k

Extract the contents.

[root@redis-1 ~]# tar xvf redis_exporter-v1.5.2.linux-amd64.tar.gz
redis_exporter-v1.5.2.linux-amd64/
redis_exporter-v1.5.2.linux-amd64/redis_exporter
redis_exporter-v1.5.2.linux-amd64/LICENSE
redis_exporter-v1.5.2.linux-amd64/README.md

Copy the binary to bin folder.

[root@redis-1 ~]# cd redis_exporter-v1.5.2.linux-amd64

[root@redis-1 redis_exporter-v1.5.2.linux-amd64]# cp redis_exporter /usr/local/bin/

Setup systemd file.

[root@redis-1 redis_exporter-v1.5.2.linux-amd64]# cat > /etc/systemd/system/redis_exporter.service << EOF
[Unit]
Description=NodeExporter

[Service]
TimeoutStartSec=0
ExecStart=/usr/local/bin/redis_exporter -redis.addr "redis://localhost:6379"

[Install]
WantedBy=multi-user.target
EOF

Reload systemd and start the exporter.

[root@redis-1 redis_exporter-v1.5.2.linux-amd64]# systemctl daemon-reload
[root@redis-1 redis_exporter-v1.5.2.linux-amd64]# systemctl start redis_exporter 
[root@redis-1 redis_exporter-v1.5.2.linux-amd64]# systemctl status redis_exporter 
● redis_exporter.service - NodeExporter
   Loaded: loaded (/etc/systemd/system/redis_exporter.service; disabled; vendor preset: disabled)
   Active: active (running) since Sat 2020-03-21 19:03:51 GMT; 3s ago
 Main PID: 7289 (redis_exporter)
   CGroup: /system.slice/redis_exporter.service
           └─7289 /usr/local/bin/redis_exporter -redis.addr redis://localhost:6379

Mar 21 19:03:51 redis-1.home.lan systemd[1]: Started NodeExporter.
Mar 21 19:03:51 redis-1.home.lan redis_exporter[7289]: time="2020-03-21T19:03:51Z" level=info msg="Redis Metrics Exporter v1.5.2    build date: 2020-03-16-20:52:04    sha1: 18080da36b0a9759...ARCH: amd64"
Mar 21 19:03:51 redis-1.home.lan redis_exporter[7289]: time="2020-03-21T19:03:51Z" level=info msg="Providing metrics at :9121/metrics"
Hint: Some lines were ellipsized, use -l to show in full.
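
Before pointing prometheus at the exporter, it is worth confirming that it can reach redis. A minimal sketch, assuming the exporter exposes a redis_up metric without labels on the :9121/metrics endpoint shown in the log above; the helper name metric_value is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print the value of an unlabelled metric from
# Prometheus text-format output read on stdin.
metric_value() {
    awk -v name="$1" '$1 == name { print $2; exit }'
}

# Example:
# curl -s http://localhost:9121/metrics | metric_value redis_up
```

A value of 1 would indicate the exporter connected to the local redis instance.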

Configure prometheus

I assume you already have a prometheus server installed and configured for your network. Adding config for the redis exporter is straightforward. The config for prometheus in /etc/prometheus/prometheus.yml would be as follows for redis-1, redis-2 and redis-3.

  - job_name: redis_exporter
    static_configs:
    - targets: ['192.168.0.61:9121']
    - targets: ['192.168.0.62:9121']
    - targets: ['192.168.0.63:9121']
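
The three hosts can also sit in a single targets list, which prometheus accepts within static_configs. A small sketch that generates such a block from a host list, assuming the same indentation as the config above; the helper name prom_job is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print a scrape job with one targets list.
# Args: <job-name> <port> <host>...
prom_job() {
    job="$1"; port="$2"; shift 2
    printf '  - job_name: %s\n' "$job"
    printf '    static_configs:\n'
    printf '    - targets:\n'
    for host in "$@"; do
        printf "      - '%s:%s'\n" "$host" "$port"
    done
}

# Example:
# prom_job redis_exporter 9121 192.168.0.61 192.168.0.62 192.168.0.63
```

The output is meant to be pasted under scrape_configs in prometheus.yml, followed by a reload of the prometheus service.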

Add dashboard to grafana

Again it is assumed you are using grafana to visualise prometheus metrics. The dashboard to use is available for import here:

https://grafana.com/grafana/dashboards/763

Benchmark redis

Redis comes with a benchmarking tool, redis-benchmark. The tool has a number of use cases; here I create 1 million keys in the master instance.

[root@redis-1 ~]# redis-benchmark -t set -n 1000000 -r 100000000 -h 192.168.0.62 -p 6379
====== SET ======
  1000000 requests completed in 124.90 seconds
  50 parallel clients
  3 bytes payload
  keep alive: 1

0.00% <= 1 milliseconds
0.02% <= 2 milliseconds
39.94% <= 3 milliseconds
92.74% <= 4 milliseconds
99.38% <= 5 milliseconds
99.66% <= 6 milliseconds
99.75% <= 7 milliseconds
99.81% <= 8 milliseconds
99.85% <= 9 milliseconds
99.90% <= 10 milliseconds
99.92% <= 11 milliseconds
99.94% <= 12 milliseconds
99.95% <= 13 milliseconds
99.96% <= 14 milliseconds
99.97% <= 15 milliseconds
99.97% <= 16 milliseconds
99.97% <= 17 milliseconds
99.97% <= 18 milliseconds
99.97% <= 19 milliseconds
99.97% <= 20 milliseconds
99.98% <= 21 milliseconds
99.98% <= 22 milliseconds
99.98% <= 23 milliseconds
99.99% <= 24 milliseconds
99.99% <= 25 milliseconds
99.99% <= 26 milliseconds
99.99% <= 27 milliseconds
99.99% <= 43 milliseconds
99.99% <= 44 milliseconds
99.99% <= 45 milliseconds
99.99% <= 46 milliseconds
99.99% <= 47 milliseconds
99.99% <= 48 milliseconds
99.99% <= 49 milliseconds
99.99% <= 50 milliseconds
100.00% <= 51 milliseconds
100.00% <= 52 milliseconds
100.00% <= 53 milliseconds
100.00% <= 54 milliseconds
100.00% <= 55 milliseconds
100.00% <= 56 milliseconds
100.00% <= 57 milliseconds
100.00% <= 57 milliseconds
8006.28 requests per second
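
The latency distribution is long, so a small filter helps when comparing runs. A minimal sketch, assuming the "<percent>% <= <n> milliseconds" line format printed above; the helper name latency_at is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: from redis-benchmark output on stdin, print the
# first latency bucket (in ms) that covers at least the requested
# percentage of requests.
latency_at() {
    awk -v want="$1" '
        $2 == "<=" && $4 == "milliseconds" {
            pct = $1
            sub(/%/, "", pct)
            if (pct + 0 >= want + 0) { print $3; exit }
        }'
}

# Example:
# redis-benchmark -t set -n 1000000 -r 100000000 -h 192.168.0.62 -p 6379 | latency_at 99
```

For the run above, asking for 99 would give 5, since 99.38% of requests completed within 5 milliseconds.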

Results of the metric collection visualised while running the benchmarking test to input 1 million keys.

[Figure: grafana dashboard]

Conclusion

Redis is one of the most popular in-memory databases. It is mature and maintained, and the sentinel daemon provides reliable resilience for the service.

Slave replication allows reads to be balanced across the slaves rather than served only by the master, and provides hot standbys in the event of master failure.

This post demonstrates the key features of a redis and sentinel setup in an N+1 configuration. Here N is 2, as we need 2 sentinels for quorum in the event of master failure.

There are many ways to get metrics for redis; the prometheus solution is up to date and, paired with grafana, provides excellent dashboards that can be customised.

References

https://redis.io/commands/config-get

https://redis.io/topics/replication

https://redis.io/topics/sentinel