Deploying a Redis Cluster on Kubernetes
A worked example of deploying a Redis cluster in Kubernetes.
Persistent storage is provided by NFS through dynamically provisioned StorageClass volumes, which requires an NFS provisioner.
Topology: 3 masters, 3 replicas.

I. Resource definition files
Note:
The manifests below only create a new namespace and deploy a Redis cluster for testing.
Adjust the configuration to match your own environment.
1. redis-namespace.yml

apiVersion: v1
kind: Namespace
metadata:
  name: redis-test

2. nfs-rbac.yml
If your namespace is not redis-test, adjust it accordingly.
Upstream project:
https://github.com/kubernetes-sigs/nfs-subdir-external-provisioner
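If you do deploy to a different namespace, a quick way to retarget all of the manifests is a bulk substitution; a minimal sketch, assuming the manifests live in the current directory (the namespace my-ns is hypothetical):

# Replace every namespace reference in the manifests (keep a backup or review with diff first)
sed -i 's/namespace: redis-test/namespace: my-ns/g' *.yml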
apiVersion: v1
kind: ServiceAccount
metadata:
  name: nfs-client-provisioner
  # replace with namespace where provisioner is deployed
  namespace: redis-test
---
kind: ClusterRole
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: nfs-client-provisioner-runner
rules:
  - apiGroups: [""]
    resources: ["nodes"]
    verbs: ["get", "list", "watch"]
  - apiGroups: [""]
    resources: ["persistentvolumes"]
    verbs: ["get", "list", "watch", "create", "delete"]
  - apiGroups: [""]
    resources: ["persistentvolumeclaims"]
    verbs: ["get", "list", "watch", "update"]
  - apiGroups: ["storage.k8s.io"]
    resources: ["storageclasses"]
    verbs: ["get", "list", "watch"]
  - apiGroups: [""]
    resources: ["events"]
    verbs: ["create", "update", "patch"]
---
kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: run-nfs-client-provisioner
subjects:
  - kind: ServiceAccount
    name: nfs-client-provisioner
    # replace with namespace where provisioner is deployed
    namespace: redis-test
roleRef:
  kind: ClusterRole
  name: nfs-client-provisioner-runner
  apiGroup: rbac.authorization.k8s.io
---
kind: Role
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: leader-locking-nfs-client-provisioner
  # replace with namespace where provisioner is deployed
  namespace: redis-test
rules:
  - apiGroups: [""]
    resources: ["endpoints"]
    verbs: ["get", "list", "watch", "create", "update", "patch"]
---
kind: RoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: leader-locking-nfs-client-provisioner
  # replace with namespace where provisioner is deployed
  namespace: redis-test
subjects:
  - kind: ServiceAccount
    name: nfs-client-provisioner
    # replace with namespace where provisioner is deployed
    namespace: redis-test
roleRef:
  kind: Role
  name: leader-locking-nfs-client-provisioner
  apiGroup: rbac.authorization.k8s.io

3. nfs-deployment.yml
Pay attention to the settings below:
the NFS server address, the export path, and PROVISIONER_NAME (the StorageClass created later must reference this exact name).
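Before deploying, it is worth confirming that the NFS export is actually reachable from the cluster nodes; a minimal check run on a node, assuming the NFS client utilities are installed:

# List the exports offered by the NFS server
showmount -e 192.168.0.151
# Optionally test-mount the export (temporary mount point, unmounted right after)
mount -t nfs 192.168.0.151:/home/nfs/eck-log /mnt && umount /mnt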
kind: Deployment
apiVersion: apps/v1
metadata:
  name: nfs-client-provisioner
  namespace: redis-test
  labels:
    app: nfs-client-provisioner
spec:
  replicas: 1
  selector:
    matchLabels:
      app: nfs-client-provisioner
  strategy:
    type: Recreate
  template:
    metadata:
      labels:
        app: nfs-client-provisioner
    spec:
      serviceAccountName: nfs-client-provisioner
      containers:
        - name: nfs-client-provisioner
          image: k8s.gcr.io/sig-storage/nfs-subdir-external-provisioner:v4.0.2
          volumeMounts:
            - name: nfs-client-root
              mountPath: /persistentvolumes
          env:
            - name: PROVISIONER_NAME
              value: nfs/redis
            - name: NFS_SERVER
              value: 192.168.0.151
            - name: NFS_PATH
              value: /home/nfs/eck-log
      volumes:
        - name: nfs-client-root
          nfs:
            server: 192.168.0.151
            path: /home/nfs/eck-log
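After applying this Deployment (section II below), you can confirm the provisioner came up cleanly; a quick check using the label defined above:

kubectl -n redis-test get pods -l app=nfs-client-provisioner
kubectl -n redis-test logs deploy/nfs-client-provisioner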
4. redis-storageclass.yml

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: redis-data
  # Note: StorageClass is cluster-scoped, so this namespace field is ignored
  namespace: redis-test
provisioner: nfs/redis
parameters:
  # pathPattern: "${.PVC.namespace}/${.PVC.annotations.nfs.io/storage-path}" # waits for the nfs.io/storage-path annotation; treated as an empty string if not specified
  # pathPattern: "${.PVC.namespace}-${.PVC.name}"
  # onDelete: delete
  pathPattern: "${.PVC.namespace}/${.PVC.name}"
  onDelete: retain
reclaimPolicy: Retain
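With this pathPattern, each PVC gets a directory named <namespace>/<pvc-name> under the NFS export, which is why the recovery steps later in this article can read files such as redis-test/data-redis-cluster-0/nodes.conf directly on the NFS server. A quick way to see this once the cluster is running, assuming the export configured above:

# On the NFS server: one directory per PVC created by the StatefulSet
ls /home/nfs/eck-log/redis-test/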
5. redis-configmap.yml

apiVersion: v1
kind: ConfigMap
metadata:
  name: redis-cluster
  namespace: redis-test
data:
  update-node.sh: |
    #!/bin/sh
    REDIS_NODES="/data/nodes.conf"
    sed -i -e "/myself/ s/[0-9]\{1,3\}\.[0-9]\{1,3\}\.[0-9]\{1,3\}\.[0-9]\{1,3\}/${POD_IP}/" ${REDIS_NODES}
    exec "$@"
  # redis.conf: |+
  #   cluster-enabled yes
  #   cluster-require-full-coverage no
  #   cluster-node-timeout 15000
  #   cluster-config-file /data/nodes.conf
  #   cluster-migration-barrier 1
  #   appendonly yes
  #   protected-mode no
  redis.conf: |
    cluster-enabled yes
    cluster-config-file /data/nodes.conf
    cluster-node-timeout 15000
    protected-mode no
    daemonize no
    pidfile /var/run/redis.pid
    port 6379
    tcp-backlog 511
    bind 0.0.0.0
    timeout 3600
    tcp-keepalive 1
    loglevel verbose
    logfile /data/redis.log
    databases 64
    save 900 1
    save 300 10
    save 60 10000
    stop-writes-on-bgsave-error yes
    rdbcompression yes
    rdbchecksum yes
    dbfilename dump.rdb
    dir /data
    requirepass c8rGIL7d
    appendonly yes
    appendfilename "appendonly.aof"
    appendfsync everysec
    no-appendfsync-on-rewrite no
    auto-aof-rewrite-percentage 100
    auto-aof-rewrite-min-size 64mb
    lua-time-limit 20000
    slowlog-log-slower-than 10000
    slowlog-max-len 128
    latency-monitor-threshold 0
    notify-keyspace-events ""
    hash-max-ziplist-entries 512
    hash-max-ziplist-value 64
    list-max-ziplist-entries 512
    list-max-ziplist-value 64
    set-max-intset-entries 512
    zset-max-ziplist-entries 128
    zset-max-ziplist-value 64
    hll-sparse-max-bytes 3000
    activerehashing yes
    client-output-buffer-limit normal 0 0 0
    client-output-buffer-limit slave 256mb 64mb 60
    client-output-buffer-limit pubsub 32mb 8mb 60
    hz 10
    aof-rewrite-incremental-fsync yes

The update-node.sh script handles the case where a Redis pod is recreated and comes back with a different IP: it rewrites the old Pod IP to the new one on the myself line of /data/nodes.conf. Without it, the cluster would break after such restarts.
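The substitution only touches the line containing "myself" (the entry for the local node) and swaps the first IPv4 address on it for the pod's current IP. A standalone illustration of the same sed expression, using a hypothetical sample line and IP:

# Given a nodes.conf line for the local node... (truncated hypothetical node ID)
echo '355553cd... 10.244.1.100:6379@16379 myself,master - 0 0 1 connected 0-5460' > /tmp/nodes.conf
# ...the script's substitution rewrites only the IP on the myself line
POD_IP=10.244.1.127
sed -i -e "/myself/ s/[0-9]\{1,3\}\.[0-9]\{1,3\}\.[0-9]\{1,3\}\.[0-9]\{1,3\}/${POD_IP}/" /tmp/nodes.conf
cat /tmp/nodes.conf   # now shows 10.244.1.127:6379@16379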
6. redis-statefulset.yml

apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: redis-cluster
  namespace: redis-test
spec:
  serviceName: redis-cluster
  replicas: 6
  selector:
    matchLabels:
      app: redis-cluster
  template:
    metadata:
      labels:
        app: redis-cluster
    spec:
      containers:
        - name: redis
          image: redis:5.0.5-alpine
          ports:
            - containerPort: 6379
              name: client
            - containerPort: 16379
              name: gossip
          command: ["/conf/update-node.sh", "redis-server", "/conf/redis.conf"]
          env:
            - name: POD_IP
              valueFrom:
                fieldRef:
                  fieldPath: status.podIP
          volumeMounts:
            - name: conf
              mountPath: /conf
              readOnly: false
            - name: data
              mountPath: /data
              readOnly: false
      volumes:
        - name: conf
          configMap:
            name: redis-cluster
            defaultMode: 0755
  volumeClaimTemplates:
    - metadata:
        name: data
      spec:
        accessModes: [ "ReadWriteOnce" ]
        resources:
          requests:
            storage: 5Gi
        storageClassName: redis-data
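Note that the StatefulSet sets serviceName: redis-cluster, but no Service manifest appears in the file set above. StatefulSets normally pair with a headless Service to give each pod a stable DNS identity, so you may want to add one; a minimal sketch (the manifest itself is an assumption, not part of the original files):

apiVersion: v1
kind: Service
metadata:
  name: redis-cluster
  namespace: redis-test
spec:
  clusterIP: None            # headless: no virtual IP, DNS resolves to the pod IPs
  selector:
    app: redis-cluster
  ports:
    - name: client
      port: 6379
    - name: gossip
      port: 16379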
II. Deploying the services

Create the namespace:
kubectl apply -f redis-namespace.yml

Set up the RBAC authorization:
kubectl apply -f nfs-rbac.yml

Deploy the nfs-subdir-external-provisioner:
kubectl apply -f nfs-deployment.yml

Create the StorageClass used by Redis:
kubectl apply -f redis-storageclass.yml

Create the Redis configuration ConfigMap:
kubectl apply -f redis-configmap.yml

Create the Redis cluster:
kubectl apply -f redis-statefulset.yml
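Before initializing the cluster, wait until all six redis-cluster pods (and the provisioner) are Running; an easy way to watch them come up:

kubectl -n redis-test get pods -w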
III. Initialization
1. Get the cluster pod IPs

kubectl -n redis-test get pods -l app=redis-cluster -o jsonpath='{range .items[*]}{.status.podIP}:6379 {end}'
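This prints a space-separated list such as "10.244.1.193:6379 10.244.2.81:6379 ..." (the pod IPs used in the examples below), which can be fed straight into redis-cli --cluster create.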
2. Create the cluster
If a password is configured, pass it with -a; otherwise drop the -a option.

Option 1: collect the Redis pod IPs in the redis-test namespace automatically and create the cluster in one shot:

kubectl -n redis-test exec -it redis-cluster-0 -- redis-cli --cluster create $(kubectl -n redis-test get pods -l app=redis-cluster -o jsonpath='{range .items[*]}{.status.podIP}:6379 {end}') --cluster-replicas 1 -a c8rGIL7d

Option 2: exec into a pod and run the command by hand with the IPs gathered above.
If a password is configured, append -a <password> (the requirepass value from redis.conf).
Type yes when prompted with "Can I set the above configuration? (type 'yes' to accept)":

/data # redis-cli --cluster create 10.244.1.193:6379 10.244.2.81:6379 10.244.1.195:6379 10.244.2.82:6379 10.244.1.196:6379 10.244.2.83:6379 --cluster-replicas 1
>>> Performing hash slots allocation on 6 nodes...
Master[0] -> Slots 0 - 5460
Master[1] -> Slots 5461 - 10922
Master[2] -> Slots 10923 - 16383
Adding replica 10.244.1.196:6379 to 10.244.1.193:6379
Adding replica 10.244.2.83:6379 to 10.244.2.81:6379
Adding replica 10.244.2.82:6379 to 10.244.1.195:6379
M: fab9c85e1ab2395524d85c374501b9d88f2ce671 10.244.1.193:6379
slots:[0-5460] (5461 slots) master
M: 3c27a9ab042c0c94e68bb869a377e4d358678cff 10.244.2.81:6379
slots:[5461-10922] (5462 slots) master
M: df157ff8a828b7436da566cd306234d933603ad7 10.244.1.195:6379
slots:[10923-16383] (5461 slots) master
S: 1cd40acd376f34965727214a0a7c7d6b8a657a09 10.244.2.82:6379
replicates df157ff8a828b7436da566cd306234d933603ad7
S: 86bf46bb3bd198228a0cd3294147253277466f6e 10.244.1.196:6379
replicates fab9c85e1ab2395524d85c374501b9d88f2ce671
S: 99fac7719c0e034b018af1bd456d4d53a47e99c5 10.244.2.83:6379
replicates 3c27a9ab042c0c94e68bb869a377e4d358678cff
Can I set the above configuration? (type 'yes' to accept): yes
>>> Nodes configuration updated
>>> Assign a different config epoch to each node
>>> Sending CLUSTER MEET messages to join the cluster
Waiting for the cluster to join
....
>>> Performing Cluster Check (using node 10.244.1.193:6379)
M: fab9c85e1ab2395524d85c374501b9d88f2ce671 10.244.1.193:6379
slots:[0-5460] (5461 slots) master
1 additional replica(s)
S: 99fac7719c0e034b018af1bd456d4d53a47e99c5 10.244.2.83:6379
slots: (0 slots) slave
replicates 3c27a9ab042c0c94e68bb869a377e4d358678cff
M: 3c27a9ab042c0c94e68bb869a377e4d358678cff 10.244.2.81:6379
slots:[5461-10922] (5462 slots) master
1 additional replica(s)
S: 1cd40acd376f34965727214a0a7c7d6b8a657a09 10.244.2.82:6379
slots: (0 slots) slave
replicates df157ff8a828b7436da566cd306234d933603ad7
M: df157ff8a828b7436da566cd306234d933603ad7 10.244.1.195:6379
slots:[10923-16383] (5461 slots) master
1 additional replica(s)
S: 86bf46bb3bd198228a0cd3294147253277466f6e 10.244.1.196:6379
slots: (0 slots) slave
replicates fab9c85e1ab2395524d85c374501b9d88f2ce671
[OK] All nodes agree about slots configuration.
>>> Check for open slots...
>>> Check slots coverage...
[OK] All 16384 slots covered.
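At this point the cluster should be healthy; a quick sanity check from any pod, using the password from redis.conf:

kubectl -n redis-test exec -it redis-cluster-0 -- redis-cli -a c8rGIL7d cluster info | head -1
# expected: cluster_state:ok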
Problems
1. A k8s restart changes every Redis node IP and breaks the cluster
How do you recover when the k8s cluster restarts, all Redis pod IPs change, and the Redis cluster ends up in an abnormal state?
As others have explained:
After a pod restarts, its IP has changed, but the IPs recorded in the Redis nodes.conf have not. If a single node goes down and restarts, the cluster can recover on its own; but if more than half of the nodes go down at once, for example when the whole k8s cluster restarts, the Redis cluster cannot recover.
Redis cluster nodes rely on the node information in nodes.conf to talk to each other, so it is enough to make sure nodes.conf gets updated with the new pod IPs.
Problem description
A Redis cluster (3 masters, 3 replicas) was deployed on k8s. After a k8s restart, every Redis node IP changed and the cluster became abnormal.
The data directory is mounted on NFS storage.
Cluster information
[root@k8s-master1 data-redis-cluster-0]# kubectl -n redis-test get pods
NAME READY STATUS RESTARTS AGE
nfs-client-provisioner-65d79fdd58-f9qzt 1/1 Running 0 3h9m
redis-cluster-0 1/1 Running 0 9m6s
redis-cluster-1 1/1 Running 0 35m
redis-cluster-2 1/1 Running 0 35m
redis-cluster-3 1/1 Running 0 35m
redis-cluster-4 1/1 Running 0 35m
redis-cluster-5 1/1 Running 0 35m

Checking the Redis state on the redis-cluster-0 node shows cluster_state:fail; the other nodes report fail as well.
Connect with redis-cli -c and check the cluster info:
/data # redis-cli -c -a c8rGIL7d
Warning: Using a password with '-a' or '-u' option on the command line interface may not be safe.
127.0.0.1:6379> CLUSTER INFO
cluster_state:fail # this line indicates the failure
cluster_slots_assigned:16384
cluster_slots_ok:5462
cluster_slots_pfail:10922
cluster_slots_fail:0
cluster_known_nodes:6
cluster_size:3
cluster_current_epoch:6
cluster_my_epoch:2
cluster_stats_messages_ping_sent:55
cluster_stats_messages_sent:55
cluster_stats_messages_received:0
127.0.0.1:6379> CLUSTER nodes
1de9084a9cf839ba4118a0367298b9ebc1496d4b 10.244.2.104:6379@16379 slave,fail? 9fdca29d3f4b1ed005108933383c6146f9737949 1648706058196 1648706058183 6 connected
06ae395178f69c2f4a059b59874f3f0eb1c9eead 10.244.1.116:6379@16379 slave,fail? 355553cdbb35058241cbf45592bdeea1e46abea5 1648706058196 1648706058183 1 connected
9fdca29d3f4b1ed005108933383c6146f9737949 10.244.2.105:6379@16379 myself,master - 0 1648706058183 2 connected 5461-10922
d093d30917e625ab4ffab13dd03cc38f2290798e 10.244.1.74:6379@16379 master,fail? - 1648706058196 1648706058182 3 connected 10923-16383
355553cdbb35058241cbf45592bdeea1e46abea5 10.244.1.100:6379@16379 master,fail? - 1648706058196 1648706058183 1 connected 0-5460
c511b683a0570b0096918e524e99176b88130a26 10.244.2.103:6379@16379 slave,fail? d093d30917e625ab4ffab13dd03cc38f2290798e 1648706058196 1648706058183 4 connected

Checking the remaining nodes shows the same problem.
Solution
Fix the cluster by editing the nodes.conf configuration files.
1. Inspect each node's nodes.conf, cross-reference by node ID, and build an old-IP to new-IP mapping
Look at the nodes.conf file on every node and cross-compare: for each node ID, the line flagged myself holds that node's current (new) IP, while the other files still record the old one.
Take any Redis node's nodes.conf, for example redis-cluster-0's:
[root@k8s-master1 redis-test]# cat data-redis-cluster-0/nodes.conf
c511b683a0570b0096918e524e99176b88130a26 10.244.2.103:6379@16379 slave d093d30917e625ab4ffab13dd03cc38f2290798e 0 1648705557230 4 connected
355553cdbb35058241cbf45592bdeea1e46abea5 10.244.1.127:6379@16379 myself,master - 0 1648705556000 1 connected 0-5460
1de9084a9cf839ba4118a0367298b9ebc1496d4b 10.244.2.104:6379@16379 slave 9fdca29d3f4b1ed005108933383c6146f9737949 0 1648705558034 6 connected
9fdca29d3f4b1ed005108933383c6146f9737949 10.244.2.100:6379@16379 master - 0 1648705557000 2 connected 5461-10922
d093d30917e625ab4ffab13dd03cc38f2290798e 10.244.1.74:6379@16379 master - 0 1648705556000 3 connected 10923-16383
06ae395178f69c2f4a059b59874f3f0eb1c9eead 10.244.1.116:6379@16379 slave 355553cdbb35058241cbf45592bdeea1e46abea5 0 1648705556227 5 connected
vars currentEpoch 6 lastVoteEpoch 0

All of the myself lines:
[root@k8s-master1 redis-test]# grep myself data-redis-cluster-*/nodes.conf
data-redis-cluster-0/nodes.conf:355553cdbb35058241cbf45592bdeea1e46abea5 10.244.1.127:6379@16379 myself,master - 0 1648705556000 1 connected 0-5460
data-redis-cluster-1/nodes.conf:9fdca29d3f4b1ed005108933383c6146f9737949 10.244.2.105:6379@16379 myself,master - 0 1648705556000 2 connected 5461-10922
data-redis-cluster-2/nodes.conf:d093d30917e625ab4ffab13dd03cc38f2290798e 10.244.1.129:6379@16379 myself,master - 0 1648705555000 3 connected 10923-16383
data-redis-cluster-3/nodes.conf:c511b683a0570b0096918e524e99176b88130a26 10.244.2.106:6379@16379 myself,slave d093d30917e625ab4ffab13dd03cc38f2290798e 0 1648705556000 4 connected
data-redis-cluster-4/nodes.conf:06ae395178f69c2f4a059b59874f3f0eb1c9eead 10.244.1.130:6379@16379 myself,slave 355553cdbb35058241cbf45592bdeea1e46abea5 0 1648705556000 5 connected
data-redis-cluster-5/nodes.conf:1de9084a9cf839ba4118a0367298b9ebc1496d4b 10.244.2.107:6379@16379 myself,slave 9fdca29d3f4b1ed005108933383c6146f9737949 0 1648705557000 6 connected

Match the old and new IP for each node ID, then replace:
| Redis node ID | flags | Old IP | New IP |
|---|---|---|---|
| c511b683a0570b0096918e524e99176b88130a26 | slave | 10.244.2.103 | 10.244.2.106 |
| 355553cdbb35058241cbf45592bdeea1e46abea5 | master | 10.244.1.100 | 10.244.1.127 |
| 1de9084a9cf839ba4118a0367298b9ebc1496d4b | slave | 10.244.2.104 | 10.244.2.107 |
| 9fdca29d3f4b1ed005108933383c6146f9737949 | master | 10.244.2.100 | 10.244.2.105 |
| d093d30917e625ab4ffab13dd03cc38f2290798e | master | 10.244.1.74 | 10.244.1.129 |
| 06ae395178f69c2f4a059b59874f3f0eb1c9eead | slave | 10.244.1.116 | 10.244.1.130 |
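Building this table by hand is error-prone; under the same directory layout as above (data-redis-cluster-*/nodes.conf on the NFS server), a rough sketch that prints the node-ID-to-current-IP pairs from the myself lines, which gives you the "New IP" column directly:

# Each file's myself line carries that node's new IP
grep -h myself data-redis-cluster-*/nodes.conf | awk '{split($2, a, ":"); print $1, a[1]}'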
Replace every node IP in one node's nodes.conf with the new addresses. Testing shows that fixing a single node's file (here redis-cluster-0's) and restarting that pod is enough for the cluster to recover.

# redis-cluster-0
sed -i 's/10.244.2.103/10.244.2.106/g;s/10.244.1.100/10.244.1.127/g;s/10.244.2.104/10.244.2.107/g;s/10.244.2.100/10.244.2.105/g;s/10.244.1.74/10.244.1.129/g;s/10.244.1.116/10.244.1.130/g' nodes.conf

Restart redis-cluster-0:
The following command hangs for unknown reasons, but the pod does get restarted:
kubectl -n redis-test get pods redis-cluster-0 -o yaml | kubectl replace --force -f -
# or simply delete the redis-cluster-0 pod
kubectl -n redis-test delete pods redis-cluster-0

Verification
Connect to Redis:
redis-cli -c -a <password>

Check the cluster info:
cluster info

If the first line shows cluster_state:ok, the node has recovered.
As verified: after replacing all node IPs in one node's nodes.conf and restarting that pod, the cluster returns to normal.
Official documentation
CLUSTER INFO
https://redis.io/commands/cluster-info/
CLUSTER INFO provides INFO-style information about Redis Cluster vital parameters. Below is a sample output, followed by a description of each reported field.
cluster_state:ok
cluster_slots_assigned:16384
cluster_slots_ok:16384
cluster_slots_pfail:0
cluster_slots_fail:0
cluster_known_nodes:6
cluster_size:3
cluster_current_epoch:6
cluster_my_epoch:2
cluster_stats_messages_sent:1483972
cluster_stats_messages_received:1483968
total_cluster_links_buffer_limit_exceeded:0
cluster_state: ok if the node is able to receive queries; fail if there is at least one hash slot that is unbound (has no associated node), is in an error state (the node serving it is flagged FAIL), or if this node cannot reach the majority of masters.
cluster_slots_assigned: Number of slots associated with some node (i.e. not unbound). This number should be 16384 for the node to work properly, meaning every hash slot is mapped to a node.
cluster_slots_ok: Number of hash slots mapping to nodes that are not in FAIL or PFAIL state.
cluster_slots_pfail: Number of hash slots mapping to nodes in PFAIL state. These hash slots still work correctly as long as the failure detection algorithm does not promote PFAIL to FAIL; PFAIL only means we are currently unable to talk to the node, which may be a transient error.
cluster_slots_fail: Number of hash slots mapping to nodes in FAIL state. If this number is not zero, the node is unable to serve queries unless cluster-require-full-coverage is set to no in the configuration.
cluster_known_nodes: Total number of nodes known to the cluster, including nodes in HANDSHAKE state that may not currently be proper members of the cluster.
cluster_size: Number of master nodes serving at least one hash slot in the cluster.
cluster_current_epoch: The local Current Epoch variable. It is used to create unique, increasing version numbers during failovers.
cluster_my_epoch: The Config Epoch of the node we are talking to, i.e. the current configuration version assigned to this node.
cluster_stats_messages_sent: Number of messages sent via the cluster node-to-node binary bus.
cluster_stats_messages_received: Number of messages received via the cluster node-to-node binary bus.
total_cluster_links_buffer_limit_exceeded: Accumulated count of cluster links freed due to exceeding the cluster-link-sendbuf-limit configuration.

CLUSTER NODES
https://redis.io/commands/cluster-nodes/
Serialization format
The output of the command is just a space-separated CSV string, where each line represents a node in the cluster. Example output:
07c37dfeb235213a872192d90877d0cd55635b91 127.0.0.1:30004@31004 slave e7d1eecce10fd6bb5eb35b9f99a514335d9ba9ca 0 1426238317239 4 connected
67ed2db8d677e59ec4a4cefb06858cf2a1a89fa1 127.0.0.1:30002@31002 master - 0 1426238316232 2 connected 5461-10922
292f8b365bb7edb5e285caf0b7e6ddc7265d2f4f 127.0.0.1:30003@31003 master - 0 1426238318243 3 connected 10923-16383
6ec23923021cf3ffec47632106199cb7f496ce01 127.0.0.1:30005@31005 slave 67ed2db8d677e59ec4a4cefb06858cf2a1a89fa1 0 1426238316232 5 connected
824fe116063bc5fcf9f4ffd895bc17aee7731ac3 127.0.0.1:30006@31006 slave 292f8b365bb7edb5e285caf0b7e6ddc7265d2f4f 0 1426238317741 6 connected
e7d1eecce10fd6bb5eb35b9f99a514335d9ba9ca 127.0.0.1:30001@31001 myself,master - 0 0 1 connected 0-5460
Each line is composed of the following fields:
<id> <ip:port@cport> <flags> <master> <ping-sent> <pong-recv> <config-epoch> <link-state> <slot> <slot> ... <slot>
The meaning of each field is as follows:
id: The node ID, a 40-character random string generated when a node is created and never changed again (unless CLUSTER RESET HARD is used).
ip:port@cport: The node address that clients should contact to run queries, plus the cluster bus port.
flags: A comma-separated list of flags: myself, master, slave, fail?, fail, handshake, noaddr, nofailover, noflags. Flags are explained in detail in the next section.
master: If the node is a replica and the master is known, the master's node ID; otherwise the "-" character.
ping-sent: Unix time in milliseconds at which the currently active ping was sent, or zero if there is no pending ping.
pong-recv: Unix time in milliseconds at which the last pong was received.
config-epoch: The configuration epoch (or version) of the current node (or of the current master, if the node is a replica). Each time a failover occurs, a new, unique, monotonically increasing configuration epoch is created. If multiple nodes claim to serve the same hash slots, the one with the higher configuration epoch wins.
link-state: The state of the link used for the node-to-node cluster bus, through which we communicate with the node. It can be connected or disconnected.
slot: A hash slot number or range. Starting from argument number 9, there may be up to 16384 entries in total (the limit is never reached). This is the list of hash slots served by this node. If an entry is just a number, it is parsed as such; if it is a range, it is in the form start-end, meaning the node is responsible for all hash slots from start to end, both endpoints included.
Meaning of the flags (field number 3):
myself: The node you are contacting.
master: The node is a master.
slave: The node is a replica.
fail?: The node is in PFAIL state. It is not reachable from the node you are contacting, but is still logically reachable (not in FAIL state).
fail: The node is in FAIL state. Multiple nodes that were unable to reach it promoted the PFAIL state to FAIL.
handshake: An untrusted node we are handshaking with.
noaddr: No known address for this node.
nofailover: The replica will not try to fail over.
noflags: No flags at all.
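A small example tying this format back to the troubleshooting above: since the flags live in field 3 and the address in field 2, unhealthy nodes can be listed straight from the CLUSTER NODES output. A sketch, assuming the password from redis.conf:

# Print id, address, and flags of every node currently marked fail or fail?
redis-cli -c -a c8rGIL7d cluster nodes | awk '$3 ~ /(^|,)fail/ {print $1, $2, $3}'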