Deploying a Redis Cluster on Kubernetes: A Worked Example

NFS provides the persistent storage, via a dynamically provisioned StorageClass; this requires the NFS subdir external provisioner.

The cluster runs 3 masters and 3 replicas.

I. Resource definition files

Note:

The following manifests only create a new namespace and deploy a Redis cluster into it for testing.

Adjust them to match your own environment.

1.redis-namespace.yml

apiVersion: v1
kind: Namespace
metadata:
  name: redis-test

2.nfs-rbac.yml

If your namespace is not redis-test, change it accordingly.

https://github.com/kubernetes-sigs/nfs-subdir-external-provisioner

apiVersion: v1
kind: ServiceAccount
metadata:
  name: nfs-client-provisioner
  # replace with namespace where provisioner is deployed
  namespace: redis-test
---
kind: ClusterRole
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: nfs-client-provisioner-runner
rules:
  - apiGroups: [""]
    resources: ["nodes"]
    verbs: ["get", "list", "watch"]
  - apiGroups: [""]
    resources: ["persistentvolumes"]
    verbs: ["get", "list", "watch", "create", "delete"]
  - apiGroups: [""]
    resources: ["persistentvolumeclaims"]
    verbs: ["get", "list", "watch", "update"]
  - apiGroups: ["storage.k8s.io"]
    resources: ["storageclasses"]
    verbs: ["get", "list", "watch"]
  - apiGroups: [""]
    resources: ["events"]
    verbs: ["create", "update", "patch"]
---
kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: run-nfs-client-provisioner
subjects:
  - kind: ServiceAccount
    name: nfs-client-provisioner
    # replace with namespace where provisioner is deployed
    namespace: redis-test
roleRef:
  kind: ClusterRole
  name: nfs-client-provisioner-runner
  apiGroup: rbac.authorization.k8s.io
---
kind: Role
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: leader-locking-nfs-client-provisioner
  # replace with namespace where provisioner is deployed
  namespace: redis-test
rules:
  - apiGroups: [""]
    resources: ["endpoints"]
    verbs: ["get", "list", "watch", "create", "update", "patch"]
---
kind: RoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: leader-locking-nfs-client-provisioner
  # replace with namespace where provisioner is deployed
  namespace: redis-test
subjects:
  - kind: ServiceAccount
    name: nfs-client-provisioner
    # replace with namespace where provisioner is deployed
    namespace: redis-test
roleRef:
  kind: Role
  name: leader-locking-nfs-client-provisioner
  apiGroup: rbac.authorization.k8s.io

3.nfs-deployment.yml

Be sure to update the following settings:

the NFS server address, the export path, and PROVISIONER_NAME (the StorageClass created later must reference this exact name).

kind: Deployment
apiVersion: apps/v1
metadata:
  name: nfs-client-provisioner
  namespace: redis-test
  labels:
    app: nfs-client-provisioner
spec:
  replicas: 1
  selector:
    matchLabels:
      app: nfs-client-provisioner
  strategy:
    type: Recreate
  template:
    metadata:
      labels:
        app: nfs-client-provisioner
    spec:
      serviceAccountName: nfs-client-provisioner
      containers:
        - name: nfs-client-provisioner
          image: k8s.gcr.io/sig-storage/nfs-subdir-external-provisioner:v4.0.2
          volumeMounts:
            - name: nfs-client-root
              mountPath: /persistentvolumes
          env:
            - name: PROVISIONER_NAME
              value: nfs/redis
            - name: NFS_SERVER
              value: 192.168.0.151
            - name: NFS_PATH
              value: /home/nfs/eck-log
      volumes:
        - name: nfs-client-root
          nfs:
            server: 192.168.0.151
            path: /home/nfs/eck-log
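
Before continuing, it is worth confirming the provisioner pod came up and is logging normally; a hedged check (names taken from the manifest above):

kubectl -n redis-test get pods -l app=nfs-client-provisioner
kubectl -n redis-test logs deployment/nfs-client-provisioner | head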

4.redis-storageclass.yml

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: redis-data
  namespace: redis-test
provisioner: nfs/redis
parameters:
  #pathPattern: "${.PVC.namespace}/${.PVC.annotations.nfs.io/storage-path}" # waits for nfs.io/storage-path annotation, if not specified will accept as empty string.
  #pathPattern: "${.PVC.namespace}-${.PVC.name}" # waits for nfs.io/storage-path annotation, if not specified will accept as empty string.
  #onDelete: delete
  pathPattern: "${.PVC.namespace}/${.PVC.name}"
  onDelete: retain
reclaimPolicy: Retain
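
With pathPattern set to "${.PVC.namespace}/${.PVC.name}", each PVC gets its own subdirectory under the NFS export. A hypothetical check on the NFS server (paths follow the deployment above; the directories appear once the PVCs are bound):

ls /home/nfs/eck-log/redis-test/
# data-redis-cluster-0  data-redis-cluster-1  ...  data-redis-cluster-5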

5.redis-configmap.yml

apiVersion: v1
kind: ConfigMap
metadata:
  name: redis-cluster
  namespace: redis-test
data:
  update-node.sh: |
    #!/bin/sh
    REDIS_NODES="/data/nodes.conf"
    sed -i -e "/myself/ s/[0-9]\{1,3\}\.[0-9]\{1,3\}\.[0-9]\{1,3\}\.[0-9]\{1,3\}/${POD_IP}/" ${REDIS_NODES}
    exec "$@"
#  redis.conf: |+
#    cluster-enabled yes
#    cluster-require-full-coverage no
#    cluster-node-timeout 15000
#    cluster-config-file /data/nodes.conf
#    cluster-migration-barrier 1
#    appendonly yes
#    protected-mode no
  redis.conf: |
    cluster-enabled yes
    cluster-config-file /data/nodes.conf
    cluster-node-timeout 15000
    protected-mode no
    daemonize no
    pidfile /var/run/redis.pid
    port 6379
    tcp-backlog 511
    bind 0.0.0.0
    timeout 3600
    tcp-keepalive 1
    loglevel verbose
    logfile /data/redis.log
    databases 64
    save 900 1
    save 300 10
    save 60 10000
    stop-writes-on-bgsave-error yes
    rdbcompression yes
    rdbchecksum yes
    dbfilename dump.rdb
    dir /data
    requirepass c8rGIL7d
    appendonly yes
    appendfilename "appendonly.aof"
    appendfsync everysec
    no-appendfsync-on-rewrite no
    auto-aof-rewrite-percentage 100
    auto-aof-rewrite-min-size 64mb
    lua-time-limit 20000
    slowlog-log-slower-than 10000
    slowlog-max-len 128
    
    latency-monitor-threshold 0
    notify-keyspace-events ""
    hash-max-ziplist-entries 512
    hash-max-ziplist-value 64
    list-max-ziplist-entries 512
    list-max-ziplist-value 64
    set-max-intset-entries 512
    zset-max-ziplist-entries 128
    zset-max-ziplist-value 64
    hll-sparse-max-bytes 3000
    activerehashing yes
    client-output-buffer-limit normal 0 0 0
    client-output-buffer-limit slave 256mb 64mb 60
    client-output-buffer-limit pubsub 32mb 8mb 60
    hz 10
    aof-rewrite-incremental-fsync yes

The update-node.sh script handles the case where a cluster pod is recreated and comes back with a new Pod IP: before starting redis-server, it rewrites the old IP on the myself line of /data/nodes.conf with the new ${POD_IP}. Without this, the cluster breaks.
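
A minimal standalone sketch of what that sed does (the node ID and IPs below are illustrative, not from a real cluster):

POD_IP=10.244.1.200
echo 'abc123 10.244.1.127:6379@16379 myself,master - 0 0 1 connected 0-5460' > /tmp/nodes.conf
# only the line flagged "myself" is touched; its first IPv4 address is rewritten
sed -i -e "/myself/ s/[0-9]\{1,3\}\.[0-9]\{1,3\}\.[0-9]\{1,3\}\.[0-9]\{1,3\}/${POD_IP}/" /tmp/nodes.conf
cat /tmp/nodes.conf   # the myself line now carries 10.244.1.200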

6.redis-statefulset.yml

apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: redis-cluster
  namespace: redis-test
spec:
  serviceName: redis-cluster
  replicas: 6
  selector:
    matchLabels:
      app: redis-cluster
  template:
    metadata:
      labels:
        app: redis-cluster
    spec:
      containers:
      - name: redis
        image: redis:5.0.5-alpine
        ports:
        - containerPort: 6379
          name: client
        - containerPort: 16379
          name: gossip
        command: ["/conf/update-node.sh", "redis-server", "/conf/redis.conf"]
        env:
        - name: POD_IP
          valueFrom:
            fieldRef:
              fieldPath: status.podIP
        volumeMounts:
        - name: conf
          mountPath: /conf
          readOnly: false
        - name: data
          mountPath: /data
          readOnly: false
      volumes:
      - name: conf
        configMap:
          name: redis-cluster
          defaultMode: 0755
  volumeClaimTemplates:
  - metadata:
      name: data
    spec:
      accessModes: [ "ReadWriteOnce" ]
      resources:
        requests:
          storage: 5Gi
      storageClassName: redis-data
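
Note that the StatefulSet references serviceName: redis-cluster, but no such headless Service is defined above. A minimal sketch of what it could look like (an assumed addition, not part of the original manifests; the example cluster still forms via pod IPs without it):

apiVersion: v1
kind: Service
metadata:
  name: redis-cluster
  namespace: redis-test
spec:
  clusterIP: None        # headless: gives each pod a stable DNS record
  selector:
    app: redis-cluster
  ports:
  - name: client
    port: 6379
  - name: gossip
    port: 16379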

II. Deploying the services

Create the namespace

kubectl apply -f redis-namespace.yml

Set up the RBAC authorization

kubectl apply -f nfs-rbac.yml

Deploy the NFS subdir external provisioner

kubectl apply -f nfs-deployment.yml

Create the StorageClass used by Redis

kubectl apply -f redis-storageclass.yml

Create the ConfigMap holding the Redis configuration

kubectl apply -f redis-configmap.yml

Create the Redis cluster

kubectl apply -f redis-statefulset.yml
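
Once applied, a quick hedged check that all six pods come up and their PVCs bind:

kubectl -n redis-test get pods -l app=redis-cluster
kubectl -n redis-test get pvc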

III. Initialization

1. Get the cluster IPs

kubectl -n redis-test get pods -l app=redis-cluster -o jsonpath='{range .items[*]}{.status.podIP}:6379 {end}'

2. Create the cluster

If a password is set, pass it with -a; otherwise drop the -a argument.

1. Gather the Redis pod IPs in the redis-test namespace automatically and create the cluster in one go:

kubectl -n redis-test exec -it redis-cluster-0 -- redis-cli --cluster create $(kubectl -n redis-test get pods -l app=redis-cluster -o jsonpath='{range .items[*]}{.status.podIP}:6379 {end}') --cluster-replicas 1 -a c8rGIL7d

2. Alternatively, collect the IPs by hand, exec into a pod, and run the command manually.

If a password is set, add -a <password> (the requirepass value from redis.conf).

redis-cli --cluster create 10.244.1.193:6379 10.244.2.81:6379 10.244.1.195:6379 10.244.2.82:6379 10.244.1.196:6379 10.244.2.83:6379 --cluster-replicas 1

Type yes at the prompt:

Can I set the above configuration? (type 'yes' to accept): yes
/data # redis-cli --cluster create 10.244.1.193:6379 10.244.2.81:6379 10.244.1.195:6379 10.244.2.82:6379 10.244.1.196:6379 10.244.2.83:6379 --cluster-replicas 1
>>> Performing hash slots allocation on 6 nodes...
Master[0] -> Slots 0 - 5460
Master[1] -> Slots 5461 - 10922
Master[2] -> Slots 10923 - 16383
Adding replica 10.244.1.196:6379 to 10.244.1.193:6379
Adding replica 10.244.2.83:6379 to 10.244.2.81:6379
Adding replica 10.244.2.82:6379 to 10.244.1.195:6379
M: fab9c85e1ab2395524d85c374501b9d88f2ce671 10.244.1.193:6379
   slots:[0-5460] (5461 slots) master
M: 3c27a9ab042c0c94e68bb869a377e4d358678cff 10.244.2.81:6379
   slots:[5461-10922] (5462 slots) master
M: df157ff8a828b7436da566cd306234d933603ad7 10.244.1.195:6379
   slots:[10923-16383] (5461 slots) master
S: 1cd40acd376f34965727214a0a7c7d6b8a657a09 10.244.2.82:6379
   replicates df157ff8a828b7436da566cd306234d933603ad7
S: 86bf46bb3bd198228a0cd3294147253277466f6e 10.244.1.196:6379
   replicates fab9c85e1ab2395524d85c374501b9d88f2ce671
S: 99fac7719c0e034b018af1bd456d4d53a47e99c5 10.244.2.83:6379
   replicates 3c27a9ab042c0c94e68bb869a377e4d358678cff
Can I set the above configuration? (type 'yes' to accept): yes
>>> Nodes configuration updated
>>> Assign a different config epoch to each node
>>> Sending CLUSTER MEET messages to join the cluster
Waiting for the cluster to join
....
>>> Performing Cluster Check (using node 10.244.1.193:6379)
M: fab9c85e1ab2395524d85c374501b9d88f2ce671 10.244.1.193:6379
   slots:[0-5460] (5461 slots) master
   1 additional replica(s)
S: 99fac7719c0e034b018af1bd456d4d53a47e99c5 10.244.2.83:6379
   slots: (0 slots) slave
   replicates 3c27a9ab042c0c94e68bb869a377e4d358678cff
M: 3c27a9ab042c0c94e68bb869a377e4d358678cff 10.244.2.81:6379
   slots:[5461-10922] (5462 slots) master
   1 additional replica(s)
S: 1cd40acd376f34965727214a0a7c7d6b8a657a09 10.244.2.82:6379
   slots: (0 slots) slave
   replicates df157ff8a828b7436da566cd306234d933603ad7
M: df157ff8a828b7436da566cd306234d933603ad7 10.244.1.195:6379
   slots:[10923-16383] (5461 slots) master
   1 additional replica(s)
S: 86bf46bb3bd198228a0cd3294147253277466f6e 10.244.1.196:6379
   slots: (0 slots) slave
   replicates fab9c85e1ab2395524d85c374501b9d88f2ce671
[OK] All nodes agree about slots configuration.
>>> Check for open slots...
>>> Check slots coverage...
[OK] All 16384 slots covered.
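
After creation, a hedged sanity check from any pod (drop -a if no password is set):

kubectl -n redis-test exec -it redis-cluster-0 -- redis-cli -a c8rGIL7d cluster info | grep cluster_state
# cluster_state:ok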

Problems

1. A Kubernetes restart changes every Redis node IP and the cluster fails

How to recover when the Kubernetes cluster restarts, every Redis pod IP changes, and the Redis cluster becomes abnormal.

As others have explained:

After a pod restarts, its IP has changed, but the IPs recorded in nodes.conf have not. If a single node goes down and restarts, the cluster can recover; if more than half the nodes go down at once, as happens when the whole Kubernetes cluster restarts, the Redis cluster cannot recover on its own.

Redis cluster nodes rely on the node information in nodes.conf to communicate with each other, so it is enough to make sure nodes.conf is updated with the new pod IPs.

Problem description

A 3-master/3-replica Redis cluster is deployed on Kubernetes. Kubernetes restarted, every Redis node IP changed, and the cluster went into an abnormal state.

The data directories are mounted on NFS storage.

Cluster status

[root@k8s-master1 data-redis-cluster-0]# kubectl -n redis-test get pods
NAME                                      READY   STATUS    RESTARTS   AGE
nfs-client-provisioner-65d79fdd58-f9qzt   1/1     Running   0          3h9m
redis-cluster-0                           1/1     Running   0          9m6s
redis-cluster-1                           1/1     Running   0          35m
redis-cluster-2                           1/1     Running   0          35m
redis-cluster-3                           1/1     Running   0          35m
redis-cluster-4                           1/1     Running   0          35m
redis-cluster-5                           1/1     Running   0          35m

Checking the Redis status on redis-cluster-0 shows cluster_state:fail; the remaining nodes report fail as well.

redis-cli -c -a c8rGIL7d

Check the cluster info:

/data # redis-cli -c -a c8rGIL7d
Warning: Using a password with '-a' or '-u' option on the command line interface may not be safe.
127.0.0.1:6379> CLUSTER INFO
cluster_state:fail        # <-- this line indicates the failure
cluster_slots_assigned:16384
cluster_slots_ok:5462
cluster_slots_pfail:10922
cluster_slots_fail:0
cluster_known_nodes:6
cluster_size:3
cluster_current_epoch:6
cluster_my_epoch:2
cluster_stats_messages_ping_sent:55
cluster_stats_messages_sent:55
cluster_stats_messages_received:0
127.0.0.1:6379> CLUSTER nodes
1de9084a9cf839ba4118a0367298b9ebc1496d4b 10.244.2.104:6379@16379 slave,fail? 9fdca29d3f4b1ed005108933383c6146f9737949 1648706058196 1648706058183 6 connected
06ae395178f69c2f4a059b59874f3f0eb1c9eead 10.244.1.116:6379@16379 slave,fail? 355553cdbb35058241cbf45592bdeea1e46abea5 1648706058196 1648706058183 1 connected
9fdca29d3f4b1ed005108933383c6146f9737949 10.244.2.105:6379@16379 myself,master - 0 1648706058183 2 connected 5461-10922
d093d30917e625ab4ffab13dd03cc38f2290798e 10.244.1.74:6379@16379 master,fail? - 1648706058196 1648706058182 3 connected 10923-16383
355553cdbb35058241cbf45592bdeea1e46abea5 10.244.1.100:6379@16379 master,fail? - 1648706058196 1648706058183 1 connected 0-5460
c511b683a0570b0096918e524e99176b88130a26 10.244.2.103:6379@16379 slave,fail? d093d30917e625ab4ffab13dd03cc38f2290798e 1648706058196 1648706058183 4 connected

Checking the other nodes shows the same problem.

Solution

Fixed by editing the nodes.conf configuration files.

1. Inspect each node's nodes.conf, cross-reference the entries by node ID, and build the old IP → new IP mapping.

In each file, the line flagged myself is the node the file belongs to (there the ID and IP match); comparing the myself lines across files against the stale entries reveals the mapping.

Take any node's nodes.conf, for example redis-cluster-0's:

[root@k8s-master1 redis-test]# cat data-redis-cluster-0/nodes.conf 
c511b683a0570b0096918e524e99176b88130a26 10.244.2.103:6379@16379 slave d093d30917e625ab4ffab13dd03cc38f2290798e 0 1648705557230 4 connected
355553cdbb35058241cbf45592bdeea1e46abea5 10.244.1.127:6379@16379 myself,master - 0 1648705556000 1 connected 0-5460
1de9084a9cf839ba4118a0367298b9ebc1496d4b 10.244.2.104:6379@16379 slave 9fdca29d3f4b1ed005108933383c6146f9737949 0 1648705558034 6 connected
9fdca29d3f4b1ed005108933383c6146f9737949 10.244.2.100:6379@16379 master - 0 1648705557000 2 connected 5461-10922
d093d30917e625ab4ffab13dd03cc38f2290798e 10.244.1.74:6379@16379 master - 0 1648705556000 3 connected 10923-16383
06ae395178f69c2f4a059b59874f3f0eb1c9eead 10.244.1.116:6379@16379 slave 355553cdbb35058241cbf45592bdeea1e46abea5 0 1648705556227 5 connected
vars currentEpoch 6 lastVoteEpoch 0

All the myself lines:

[root@k8s-master1 redis-test]# grep myself data-redis-cluster-*/nodes.conf 
data-redis-cluster-0/nodes.conf:355553cdbb35058241cbf45592bdeea1e46abea5 10.244.1.127:6379@16379 myself,master - 0 1648705556000 1 connected 0-5460
data-redis-cluster-1/nodes.conf:9fdca29d3f4b1ed005108933383c6146f9737949 10.244.2.105:6379@16379 myself,master - 0 1648705556000 2 connected 5461-10922
data-redis-cluster-2/nodes.conf:d093d30917e625ab4ffab13dd03cc38f2290798e 10.244.1.129:6379@16379 myself,master - 0 1648705555000 3 connected 10923-16383
data-redis-cluster-3/nodes.conf:c511b683a0570b0096918e524e99176b88130a26 10.244.2.106:6379@16379 myself,slave d093d30917e625ab4ffab13dd03cc38f2290798e 0 1648705556000 4 connected
data-redis-cluster-4/nodes.conf:06ae395178f69c2f4a059b59874f3f0eb1c9eead 10.244.1.130:6379@16379 myself,slave 355553cdbb35058241cbf45592bdeea1e46abea5 0 1648705556000 5 connected
data-redis-cluster-5/nodes.conf:1de9084a9cf839ba4118a0367298b9ebc1496d4b 10.244.2.107:6379@16379 myself,slave 9fdca29d3f4b1ed005108933383c6146f9737949 0 1648705557000 6 connected

Match old and new IPs by node ID, then replace:

Redis node ID                              flags    old IP          new IP
c511b683a0570b0096918e524e99176b88130a26   slave    10.244.2.103    10.244.2.106
355553cdbb35058241cbf45592bdeea1e46abea5   master   10.244.1.100    10.244.1.127
1de9084a9cf839ba4118a0367298b9ebc1496d4b   slave    10.244.2.104    10.244.2.107
9fdca29d3f4b1ed005108933383c6146f9737949   master   10.244.2.100    10.244.2.105
d093d30917e625ab4ffab13dd03cc38f2290798e   master   10.244.1.74     10.244.1.129
06ae395178f69c2f4a059b59874f3f0eb1c9eead   slave    10.244.1.116    10.244.1.130

Replace every node IP in the nodes.conf file with its new value. Only redis-cluster-0's file is changed here, and then that pod is restarted.

Testing confirms that fixing a single node's nodes.conf (redis-cluster-0's in this case) and restarting that pod is enough to bring the whole cluster back to normal.

# redis-cluster-0

sed 's/10.244.2.103/10.244.2.106/g;s/10.244.1.100/10.244.1.127/g;s/10.244.2.104/10.244.2.107/g;s/10.244.2.100/10.244.2.105/g;s/10.244.1.74/10.244.1.129/g;s/10.244.1.116/10.244.1.130/g' nodes.conf -i
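
For larger clusters, a hedged sketch that derives the old → new mapping automatically instead of building the sed expression by hand (an assumed helper, not part of the original fix). It relies on the layout shown above: each node's own myself line carries its current IP (update-node.sh rewrote it on restart), while redis-cluster-0's file still holds the stale IPs for the other nodes. Run it from the directory containing the data-redis-cluster-*/ folders and review the expression before applying:

EXPR=""
for f in data-redis-cluster-*/nodes.conf; do
  id=$(awk '/myself/{print $1}' "$f")                 # node ID from the myself line
  new=$(awk '/myself/{split($2,a,":"); print a[1]}' "$f")   # its current (new) IP
  old=$(awk -v id="$id" '$1==id{split($2,a,":"); print a[1]}' data-redis-cluster-0/nodes.conf)
  [ "$old" != "$new" ] && EXPR="${EXPR}s/${old}/${new}/g;"  # skip entries already current
done
echo "$EXPR"                                          # inspect the substitutions first
sed -i "$EXPR" data-redis-cluster-0/nodes.conf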

Restart redis-cluster-0.

For some reason the following command hangs, but the pod is in fact recreated:

kubectl -n redis-test get pods redis-cluster-0 -o yaml | kubectl replace --force -f -

# or simply delete the redis-cluster-0 pod
kubectl -n redis-test delete pods redis-cluster-0

Verification

Connect to Redis:

redis-cli -c -a <password>

Check the cluster info:

cluster info (cluster_state:ok on the first line means the node has recovered)

cluster_state:ok

Verified: after replacing the node IPs in one node's nodes.conf and restarting that pod, the cluster returned to normal.
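
A quick hedged loop to confirm every node reports ok (pod names follow the StatefulSet above; drop -a if no password is set):

for i in 0 1 2 3 4 5; do
  kubectl -n redis-test exec redis-cluster-$i -- redis-cli -a c8rGIL7d cluster info | grep cluster_state
done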

Official documentation

CLUSTER INFO

https://redis.io/commands/cluster-info/

CLUSTER INFO provides INFO-style information about Redis Cluster vital parameters. Below is a sample output, followed by a description of each field reported.

cluster_state:ok
cluster_slots_assigned:16384
cluster_slots_ok:16384
cluster_slots_pfail:0
cluster_slots_fail:0
cluster_known_nodes:6
cluster_size:3
cluster_current_epoch:6
cluster_my_epoch:2
cluster_stats_messages_sent:1483972
cluster_stats_messages_received:1483968
total_cluster_links_buffer_limit_exceeded:0
cluster_state: State is ok if the node is able to receive queries. fail if there is at least one hash slot which is unbound (no node associated), in error state (a node serving it is flagged with the FAIL flag), or if the node is not able to reach a majority of masters.
cluster_slots_assigned: Number of slots which are associated to some node (not unbound). This number should be 16384 for the node to work properly, which means that each hash slot should be mapped to a node.
cluster_slots_ok: Number of hash slots mapping to a node not in FAIL or PFAIL state.
cluster_slots_pfail: Number of hash slots mapping to a node in PFAIL state. Note that those hash slots still work correctly, as long as the failure detection algorithm does not promote PFAIL to FAIL. PFAIL only means that we are currently not able to talk with the node, which may be just a transient error.
cluster_slots_fail: Number of hash slots mapping to a node in FAIL state. If this number is not zero the node is not able to serve queries, unless cluster-require-full-coverage is set to no in the configuration.
cluster_known_nodes: The total number of known nodes in the cluster, including nodes in HANDSHAKE state that may not currently be proper members of the cluster.
cluster_size: The number of master nodes serving at least one hash slot in the cluster.
cluster_current_epoch: The local Current Epoch variable. This is used in order to create unique increasing version numbers during failovers.
cluster_my_epoch: The Config Epoch of the node we are talking with. This is the current configuration version assigned to this node.
cluster_stats_messages_sent: Number of messages sent via the cluster node-to-node binary bus.
cluster_stats_messages_received: Number of messages received via the cluster node-to-node binary bus.
total_cluster_links_buffer_limit_exceeded: Accumulated count of cluster links freed due to exceeding the cluster-link-sendbuf-limit configuration.

CLUSTER NODES

https://redis.io/commands/cluster-nodes/

Serialization format

The output of the command is just a space-separated CSV string, where each line represents a node in the cluster. The following is an example of output:

07c37dfeb235213a872192d90877d0cd55635b91 127.0.0.1:30004@31004 slave e7d1eecce10fd6bb5eb35b9f99a514335d9ba9ca 0 1426238317239 4 connected
67ed2db8d677e59ec4a4cefb06858cf2a1a89fa1 127.0.0.1:30002@31002 master - 0 1426238316232 2 connected 5461-10922
292f8b365bb7edb5e285caf0b7e6ddc7265d2f4f 127.0.0.1:30003@31003 master - 0 1426238318243 3 connected 10923-16383
6ec23923021cf3ffec47632106199cb7f496ce01 127.0.0.1:30005@31005 slave 67ed2db8d677e59ec4a4cefb06858cf2a1a89fa1 0 1426238316232 5 connected
824fe116063bc5fcf9f4ffd895bc17aee7731ac3 127.0.0.1:30006@31006 slave 292f8b365bb7edb5e285caf0b7e6ddc7265d2f4f 0 1426238317741 6 connected
e7d1eecce10fd6bb5eb35b9f99a514335d9ba9ca 127.0.0.1:30001@31001 myself,master - 0 0 1 connected 0-5460
Each line is composed of the following fields:

<id> <ip:port@cport> <flags> <master> <ping-sent> <pong-recv> <config-epoch> <link-state> <slot> <slot> ... <slot>
The meaning of each field is the following:

id: The node ID, a 40-character random string generated when a node is created and never changed again (unless CLUSTER RESET HARD is used).
ip:port@cport: The node address that clients should contact to run queries.
flags: A list of comma-separated flags: myself, master, slave, fail?, fail, handshake, noaddr, nofailover, noflags. Flags are explained in detail in the next section.
master: If the node is a replica, and the master is known, the master node ID, otherwise the "-" character.
ping-sent: Milliseconds unix time at which the currently active ping was sent, or zero if there are no pending pings.
pong-recv: Milliseconds unix time the last pong was received.
config-epoch: The configuration epoch (or version) of the current node (or of the current master, if the node is a replica). Each time there is a failover, a new, unique, monotonically increasing configuration epoch is created. If multiple nodes claim to serve the same hash slots, the one with the higher configuration epoch wins.
link-state: The state of the link used for the node-to-node cluster bus. We use this link to communicate with the node. Can be connected or disconnected.
slot: A hash slot number or range. Starting from argument number 9, there may be up to 16384 entries in total (the limit is never reached). This is the list of hash slots served by this node. If the entry is just a number, it is parsed as such. If it is a range, it is in the form start-end, and means that the node is responsible for all the hash slots from start to end, including the start and end values.
Meaning of the flags (field number 3):

myself: The node you are contacting.
master: Node is a master.
slave: Node is a replica.
fail?: Node is in PFAIL state. It is not reachable for the node you are contacting, but is still logically reachable (not in FAIL state).
fail: Node is in FAIL state. It was not reachable for multiple nodes, which promoted the PFAIL state to FAIL.
handshake: Untrusted node, we are handshaking with it.
noaddr: No address is known for this node.
nofailover: Replica will not try to fail over.
noflags: No flags at all.

Tags: Kubernetes, Redis
