As every Kubernetes user knows, etcd is the core of a k8s cluster: it records the state of every Pod, which makes it critically important.

etcd is one of the most important services in a Kubernetes cluster. It stores all of the cluster's data, including the state of Namespaces, Pods, Services, routes and more. If the etcd cluster suffers a disaster or loses its data, recovery of the Kubernetes cluster's data is at risk. Backing up etcd to build a disaster-recovery setup for Kubernetes is therefore essential.

Make it a habit: back up anything important, then back it up again.

ETCDCTL_API=3 /opt/kube/bin/etcdctl  --endpoints="https://172.17.1.20:2379,https://172.17.1.21:2379,https://172.17.1.22:2379" \
--cacert=/etc/kubernetes/ssl/ca.pem \
--cert=/etc/kubernetes/ssl/etcd.pem \
--key=/etc/kubernetes/ssl/etcd-key.pem \
endpoint status --write-out=table  # endpoint status

(swap the last line for one of the following as needed)
member list --write-out=table      # member list
endpoint health --write-out=table  # health check
[root@master01 ~]#  ETCDCTL_API=3 /opt/kube/bin/etcdctl  --endpoints="https://172.17.1.20:2379,https://172.17.1.21:2379,https://172.17.1.22:2379" --cacert=/etc/kubernetes/ssl/ca.pem --cert=/etc/kubernetes/ssl/etcd.pem --key=/etc/kubernetes/ssl/etcd-key.pem endpoint status --write-out=table
+--------------------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+
|         ENDPOINT         |        ID        | VERSION | DB SIZE | IS LEADER | IS LEARNER | RAFT TERM | RAFT INDEX | RAFT APPLIED INDEX | ERRORS |
+--------------------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+
| https://172.17.1.20:2379 | 20eec45319ea02e1 |  3.4.13 |  4.3 MB |     false |      false |         5 |      14715 |              14715 |        |
| https://172.17.1.21:2379 |  395af18f6bbef1a |  3.4.13 |  4.3 MB |     false |      false |         5 |      14715 |              14715 |        |
| https://172.17.1.22:2379 | fc24d224af6a71d9 |  3.4.13 |  4.3 MB |      true |      false |         5 |      14715 |              14715 |        |
+--------------------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+

I. Backing up the etcd cluster

The etcdctl command line differs somewhat between etcd versions, but the usage is broadly the same. Here we take a snapshot backup with "snapshot save".

A few points to note:

  • The backup only needs to be run on one node of the etcd cluster.
  • The etcd v3 API is used here: starting with Kubernetes 1.13, k8s no longer supports etcd's v2 API, so all cluster data lives in the v3 store. Accordingly, the backup only covers data written via the v3 API; data written via the v2 API is not backed up.
  • This walkthrough uses a binary-deployed Kubernetes v1.20.2 + Calico environment ("ETCDCTL_API=3 etcdctl" in the commands below is equivalent to "etcdctl").
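To avoid repeating the long flag list on every invocation, etcdctl (v3) can also read its connection settings from environment variables. A sketch, using this article's certificate paths and endpoints:

```shell
# etcdctl v3 reads these variables, so the long flag lists can be dropped.
# Certificate paths and endpoints are the ones used throughout this article.
export ETCDCTL_API=3
export ETCDCTL_ENDPOINTS="https://172.17.1.20:2379,https://172.17.1.21:2379,https://172.17.1.22:2379"
export ETCDCTL_CACERT=/etc/kubernetes/ssl/ca.pem
export ETCDCTL_CERT=/etc/kubernetes/ssl/etcd.pem
export ETCDCTL_KEY=/etc/kubernetes/ssl/etcd-key.pem
# Afterwards, e.g.:  /opt/kube/bin/etcdctl endpoint status --write-out=table
```

Putting the exports in the shell profile of the node you back up from keeps the later commands short.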

1) Before backing up, take a look at the etcd data

[root@master01 ~]# cat /etc/systemd/system/etcd.service
[Unit]
Description=Etcd Server
After=network.target
After=network-online.target
Wants=network-online.target
Documentation=https://github.com/coreos

[Service]
Type=notify
WorkingDirectory=/var/lib/etcd/
ExecStart=/opt/kube/bin/etcd \
  --name=etcd-172.17.1.20 \
  --cert-file=/etc/kubernetes/ssl/etcd.pem \
  --key-file=/etc/kubernetes/ssl/etcd-key.pem \
  --peer-cert-file=/etc/kubernetes/ssl/etcd.pem \
  --peer-key-file=/etc/kubernetes/ssl/etcd-key.pem \
  --trusted-ca-file=/etc/kubernetes/ssl/ca.pem \
  --peer-trusted-ca-file=/etc/kubernetes/ssl/ca.pem \
  --initial-advertise-peer-urls=https://172.17.1.20:2380 \
  --listen-peer-urls=https://172.17.1.20:2380 \
  --listen-client-urls=https://172.17.1.20:2379,http://127.0.0.1:2379 \
  --advertise-client-urls=https://172.17.1.20:2379 \
  --initial-cluster-token=etcd-cluster-0 \
  --initial-cluster=etcd-172.17.1.20=https://172.17.1.20:2380,etcd-172.17.1.21=https://172.17.1.21:2380,etcd-172.17.1.22=https://172.17.1.22:2380 \
  --initial-cluster-state=new \
  --data-dir=/var/lib/etcd \
  --snapshot-count=50000 \
  --auto-compaction-retention=1 \
  --max-request-bytes=10485760 \
  --auto-compaction-mode=periodic \
  --quota-backend-bytes=8589934592
Restart=always
RestartSec=15
LimitNOFILE=65536
OOMScoreAdjust=-999

[root@master01 ~]# tree /var/lib/etcd
/var/lib/etcd
└── member
    ├── snap
    │   └── db
    └── wal
        ├── 0000000000000000-0000000000000000.wal
        └── 0.tmp

3 directories, 3 files

2) Back up the etcd cluster data

Run the backup on one of the etcd cluster's nodes, then copy the backup file to the other nodes.

First create the backup directory on every etcd node:

mkdir -p /data/etcd_backup_dir

Run the backup on one of the etcd nodes (here, k8s-master01):

ETCDCTL_API=3 /opt/kube/bin/etcdctl  \
snapshot  save /data/etcd_backup_dir/etcd-snapshot-`date +%Y%m%d`.db \
--endpoints=https://172.17.1.20:2379 \
--cacert=/etc/kubernetes/ssl/ca.pem \
--cert=/etc/kubernetes/ssl/etcd.pem \
--key=/etc/kubernetes/ssl/etcd-key.pem
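Before the snapshot is copied anywhere, it is worth checking that the file actually exists and is non-empty. A small helper sketch (my own convention, not from the original runbook; the deeper `etcdctl snapshot status` check is shown as a comment since it needs the etcd cluster's binary):

```shell
# Sanity-check a snapshot file before shipping it off-node (sketch).
check_snapshot() {
  local f="$1"
  if [ -s "$f" ]; then
    echo "ok: $f"
  else
    echo "missing or empty: $f" >&2
    return 1
  fi
}
# Usage right after the backup command above:
# check_snapshot "/data/etcd_backup_dir/etcd-snapshot-$(date +%Y%m%d).db"
# A deeper integrity check prints the snapshot's hash, revision and key count:
# ETCDCTL_API=3 /opt/kube/bin/etcdctl snapshot status <snapshot.db> --write-out=table
```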

Copy the backup file to the other etcd nodes:

[root@master01 ~]# scp /data/etcd_backup_dir/etcd-snapshot-20220310.db 172.17.1.21:/data/etcd_backup_dir/
etcd-snapshot-20220310.db                                                                     100% 4208KB 158.6MB/s   00:00
[root@master01 ~]# scp /data/etcd_backup_dir/etcd-snapshot-20220310.db 172.17.1.22:/data/etcd_backup_dir/
etcd-snapshot-20220310.db                                                                     100% 4208KB 158.3MB/s   00:00
[root@master01 ~]#

The backup command above can be put into a script on the k8s-master01 node and scheduled with crontab:

[root@k8s-master01 ~]# cat /data/etcd_backup_dir/etcd_backup.sh
#!/usr/bin/bash
date;
ETCDCTL_API=3 /opt/kube/bin/etcdctl  \
snapshot  save /data/etcd_backup_dir/etcd-snapshot-`date +%Y%m%d`.db \
--endpoints=https://172.17.1.20:2379 \
--cacert=/etc/kubernetes/ssl/ca.pem \
--cert=/etc/kubernetes/ssl/etcd.pem \
--key=/etc/kubernetes/ssl/etcd-key.pem

# Keep backups for 30 days
find /data/etcd_backup_dir/ -name "*.db" -mtime +30 -exec rm -f {} \;

# Sync to the other two etcd nodes
scp /data/etcd_backup_dir/etcd-snapshot-`date +%Y%m%d`.db 172.17.1.21:/data/etcd_backup_dir/
scp /data/etcd_backup_dir/etcd-snapshot-`date +%Y%m%d`.db 172.17.1.22:/data/etcd_backup_dir/
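The retention step in the script relies on `find -mtime +30`. Its effect can be verified in isolation; the example below fakes an old timestamp with GNU `touch -d` (an assumption about the local toolchain) in a throwaway directory:

```shell
# Demonstrate the script's retention rule: *.db files older than 30 days are removed.
dir=$(mktemp -d)
touch -d "40 days ago" "$dir/etcd-snapshot-20220101.db"   # stale backup
touch "$dir/etcd-snapshot-20220310.db"                    # fresh backup
find "$dir" -name "*.db" -mtime +30 -exec rm -f {} \;
ls "$dir"    # only the fresh backup remains
```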

Set up a crontab job to run the backup every day at 5:00 AM:

[root@master01 ~]# chmod 755 /data/etcd_backup_dir/etcd_backup.sh
[root@master01 ~]# crontab -e
crontab: installing new crontab
[root@master01 ~]# crontab -l
*/10 * * * * ntpdate time.windows.com
0 5 * * * /bin/bash -x /data/etcd_backup_dir/etcd_backup.sh > /dev/null 2>&1

II. Restoring the etcd cluster

The etcd backup only needs to be performed on one of the etcd nodes, with the backup file then copied to the others.
The restore, however, must be performed on every etcd node!

1) Simulate etcd data loss
Delete the data on all three etcd nodes (or remove the data directory outright):

[root@master01 ~]# ls /var/lib/etcd/
member
[root@master01 ~]# ansible master -m shell -a "mv /var/lib/etcd/member /var/lib/etcd/member_bak"
172.17.1.21 | CHANGED | rc=0 >>
172.17.1.22 | CHANGED | rc=0 >>
172.17.1.20 | CHANGED | rc=0 >>
[root@master01 ~]# ls /var/lib/etcd/
member_bak

Check the Kubernetes cluster status:

Because the etcd service is still running on all three nodes, the cluster status shows healthy again after a short while:

[root@master01 ~]# kubectl get cs
Warning: v1 ComponentStatus is deprecated in v1.19+
NAME                 STATUS    MESSAGE             ERROR
controller-manager   Healthy   ok
scheduler            Healthy   ok
etcd-1               Healthy   {"health":"true"}
etcd-0               Healthy   {"health":"true"}
etcd-2               Healthy   {"health":"true"}
[root@master01 ~]#  ETCDCTL_API=3 /opt/kube/bin/etcdctl  --endpoints="https://172.17.1.20:2379,https://172.17.1.21:2379,https://172.17.1.22:2379" \
> --cacert=/etc/kubernetes/ssl/ca.pem \
> --cert=/etc/kubernetes/ssl/etcd.pem \
> --key=/etc/kubernetes/ssl/etcd-key.pem \
> endpoint status --write-out=table  # status
+--------------------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+
|         ENDPOINT         |        ID        | VERSION | DB SIZE | IS LEADER | IS LEARNER | RAFT TERM | RAFT INDEX | RAFT APPLIED INDEX | ERRORS |
+--------------------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+
| https://172.17.1.20:2379 | 20eec45319ea02e1 |  3.4.13 |  143 kB |     false |      false |         7 |        191 |                191 |        |
| https://172.17.1.21:2379 |  395af18f6bbef1a |  3.4.13 |  143 kB |      true |      false |         7 |        191 |                191 |        |
| https://172.17.1.22:2379 | fc24d224af6a71d9 |  3.4.13 |  152 kB |     false |      false |         7 |        191 |                191 |        |
+--------------------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+
[root@master01 ~]# kubectl get svc,pod -A
NAMESPACE   NAME                 TYPE        CLUSTER-IP   EXTERNAL-IP   PORT(S)   AGE
default     service/kubernetes   ClusterIP   10.68.0.1    <none>        443/TCP   18s

But the Kubernetes cluster's data is in fact gone: the Pods and other resources in the namespaces have disappeared. This is where the etcd backup comes in, i.e. restoring from the snapshot file taken above.

[root@master01 ~]# kubectl get ns
NAME              STATUS   AGE
default           Active   2m20s
kube-node-lease   Active   2m20s
kube-public       Active   2m20s
kube-system       Active   2m20s
[root@master01 ~]# kubectl get pod -n kube-system
No resources found in kube-system namespace.

2) Restore the etcd cluster data, and with it the Kubernetes cluster data

Before restoring the etcd data, stop the kube-apiserver service on all master nodes and then the etcd service on all etcd nodes:

[root@master01 ~]# ansible master -m shell -a "systemctl stop kube-apiserver && systemctl stop etcd"
172.17.1.21 | CHANGED | rc=0 >>
172.17.1.22 | CHANGED | rc=0 >>
172.17.1.20 | CHANGED | rc=0 >>

Special note: before restoring the etcd cluster data, be sure to remove the old data and wal working directories on all etcd nodes. Otherwise the restore may fail (the restore command errors out because the data directory already exists).
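Rather than deleting the old directory outright, it can be moved aside, which leaves a fallback if the restore itself goes wrong. A sketch (the timestamped suffix is my own convention, not from the original procedure):

```shell
# Move an etcd data directory aside before restoring (sketch).
archive_data_dir() {
  local d="$1"
  [ -d "$d" ] || return 0                      # nothing to archive
  mv "$d" "${d}.bak-$(date +%Y%m%d%H%M%S)"
}
# Run on every etcd node, e.g.:  archive_data_dir /var/lib/etcd
```

With ansible, the equivalent one-liner across the `master` group would mirror the `mv /var/lib/etcd/member ...` command used earlier in this article.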

Run the restore on each etcd node:

master01

[root@master01 ~]#
[root@master01 ~]# ETCDCTL_API=3 /opt/kube/bin/etcdctl \
> --name=etcd-172.17.1.20  \
> --endpoints="https://172.17.1.20:2379" \
> --cacert=/etc/kubernetes/ssl/ca.pem \
> --cert=/etc/kubernetes/ssl/etcd.pem \
> --key=/etc/kubernetes/ssl/etcd-key.pem \
> --initial-cluster-token=etcd-cluster-0  \
> --initial-advertise-peer-urls=https://172.17.1.20:2380  \
> --initial-cluster=etcd-172.17.1.20=https://172.17.1.20:2380,etcd-172.17.1.21=https://172.17.1.21:2380,etcd-172.17.1.22=https://172.17.1.22:2380  \
>  --data-dir=/var/lib/etcd \
> snapshot restore /data/etcd_backup_dir/etcd-snapshot-20220310.db

master02

ETCDCTL_API=3 /opt/kube/bin/etcdctl \
--name=etcd-172.17.1.21  \
--endpoints="https://172.17.1.21:2379" \
--cacert=/etc/kubernetes/ssl/ca.pem \
--cert=/etc/kubernetes/ssl/etcd.pem \
--key=/etc/kubernetes/ssl/etcd-key.pem \
--initial-cluster-token=etcd-cluster-0  \
--initial-advertise-peer-urls=https://172.17.1.21:2380  \
--initial-cluster=etcd-172.17.1.20=https://172.17.1.20:2380,etcd-172.17.1.21=https://172.17.1.21:2380,etcd-172.17.1.22=https://172.17.1.22:2380  \
--data-dir=/var/lib/etcd \
snapshot restore /data/etcd_backup_dir/etcd-snapshot-20220310.db

master03

ETCDCTL_API=3 /opt/kube/bin/etcdctl \
--name=etcd-172.17.1.22  \
--endpoints="https://172.17.1.22:2379" \
--cacert=/etc/kubernetes/ssl/ca.pem \
--cert=/etc/kubernetes/ssl/etcd.pem \
--key=/etc/kubernetes/ssl/etcd-key.pem \
--initial-cluster-token=etcd-cluster-0  \
--initial-advertise-peer-urls=https://172.17.1.22:2380  \
--initial-cluster=etcd-172.17.1.20=https://172.17.1.20:2380,etcd-172.17.1.21=https://172.17.1.21:2380,etcd-172.17.1.22=https://172.17.1.22:2380  \
--data-dir=/var/lib/etcd \
snapshot restore /data/etcd_backup_dir/etcd-snapshot-20220310.db
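The three invocations above differ only in `--name` and `--initial-advertise-peer-urls`. A helper that prints the per-node command avoids copy-paste mistakes (a sketch using the same paths and cluster membership as above; the TLS flags from the article's commands are omitted here, add them back if your workflow includes them):

```shell
# Print the restore command for a given node IP (sketch; cluster layout as above).
restore_cmd() {
  local ip="$1" snap="$2"
  echo "ETCDCTL_API=3 /opt/kube/bin/etcdctl" \
       "--name=etcd-${ip}" \
       "--initial-cluster-token=etcd-cluster-0" \
       "--initial-advertise-peer-urls=https://${ip}:2380" \
       "--initial-cluster=etcd-172.17.1.20=https://172.17.1.20:2380,etcd-172.17.1.21=https://172.17.1.21:2380,etcd-172.17.1.22=https://172.17.1.22:2380" \
       "--data-dir=/var/lib/etcd" \
       "snapshot restore ${snap}"
}
restore_cmd 172.17.1.21 /data/etcd_backup_dir/etcd-snapshot-20220310.db
```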

Start the etcd service on every etcd node, one after another:

# systemctl start etcd
# systemctl status etcd
[root@master01 ~]# ansible master -m shell -a "systemctl start etcd"
172.17.1.22 | CHANGED | rc=0 >>
172.17.1.21 | CHANGED | rc=0 >>
172.17.1.20 | CHANGED | rc=0 >>
[root@master01 ~]# ansible master -m shell -a "systemctl status etcd"
172.17.1.21 | CHANGED | rc=0 >>
● etcd.service - Etcd Server
   Loaded: loaded (/etc/systemd/system/etcd.service; enabled; vendor preset: disabled)
   Active: active (running) since Thu 2022-03-10 14:43:41 CST; 8s ago
     Docs: https://github.com/coreos
 Main PID: 81855 (etcd)
    Tasks: 13
   Memory: 23.4M
   CGroup: /system.slice/etcd.service
           └─81855 /opt/kube/bin/etcd --name=etcd-172.17.1.21 --cert-file=/etc/kubernetes/ssl/etcd.pem --key-file=/etc/kubernetes/ssl/etcd-key.pem --peer-cert-file=/etc/kubernetes/ssl/etcd.pem --peer-key-file=/etc/kubernetes/ssl/etcd-key.pem --trusted-ca-file=/etc/kubernetes/ssl/ca.pem --peer-trusted-ca-file=/etc/kubernetes/ssl/ca.pem --initial-advertise-peer-urls=https://172.17.1.21:2380 --listen-peer-urls=https://172.17.1.21:2380 --listen-client-urls=https://172.17.1.21:2379,http://127.0.0.1:2379 --advertise-client-urls=https://172.17.1.21:2379 --initial-cluster-token=etcd-cluster-0 --initial-cluster=etcd-172.17.1.20=https://172.17.1.20:2380,etcd-172.17.1.21=https://172.17.1.21:2380,etcd-172.17.1.22=https://172.17.1.22:2380 --initial-cluster-state=new --data-dir=/var/lib/etcd --snapshot-count=50000 --auto-compaction-retention=1 --max-request-bytes=10485760 --auto-compaction-mode=periodic --quota-backend-bytes=8589934592

Mar 10 14:43:41 master02 etcd[81855]: published {Name:etcd-172.17.1.21 ClientURLs:[https://172.17.1.21:2379]} to cluster 948e35f16fc80e25
Mar 10 14:43:41 master02 etcd[81855]: ready to serve client requests
Mar 10 14:43:41 master02 etcd[81855]: ready to serve client requests
Mar 10 14:43:41 master02 systemd[1]: Started Etcd Server.
Mar 10 14:43:41 master02 etcd[81855]: serving insecure client requests on 127.0.0.1:2379, this is strongly discouraged!
Mar 10 14:43:41 master02 etcd[81855]: serving client requests on 172.17.1.21:2379
Mar 10 14:43:41 master02 etcd[81855]: set the initial cluster version to 3.4
Mar 10 14:43:41 master02 etcd[81855]: enabled capabilities for version 3.4
Mar 10 14:43:41 master02 etcd[81855]: established a TCP streaming connection with peer 20eec45319ea02e1 (stream Message reader)
Mar 10 14:43:41 master02 etcd[81855]: established a TCP streaming connection with peer 20eec45319ea02e1 (stream MsgApp v2 reader)
172.17.1.22 | CHANGED | rc=0 >>
● etcd.service - Etcd Server
   Loaded: loaded (/etc/systemd/system/etcd.service; enabled; vendor preset: disabled)
   Active: active (running) since Thu 2022-03-10 14:43:41 CST; 8s ago
     Docs: https://github.com/coreos
 Main PID: 81803 (etcd)
    Tasks: 13
   Memory: 20.7M
   CGroup: /system.slice/etcd.service
           └─81803 /opt/kube/bin/etcd --name=etcd-172.17.1.22 --cert-file=/etc/kubernetes/ssl/etcd.pem --key-file=/etc/kubernetes/ssl/etcd-key.pem --peer-cert-file=/etc/kubernetes/ssl/etcd.pem --peer-key-file=/etc/kubernetes/ssl/etcd-key.pem --trusted-ca-file=/etc/kubernetes/ssl/ca.pem --peer-trusted-ca-file=/etc/kubernetes/ssl/ca.pem --initial-advertise-peer-urls=https://172.17.1.22:2380 --listen-peer-urls=https://172.17.1.22:2380 --listen-client-urls=https://172.17.1.22:2379,http://127.0.0.1:2379 --advertise-client-urls=https://172.17.1.22:2379 --initial-cluster-token=etcd-cluster-0 --initial-cluster=etcd-172.17.1.20=https://172.17.1.20:2380,etcd-172.17.1.21=https://172.17.1.21:2380,etcd-172.17.1.22=https://172.17.1.22:2380 --initial-cluster-state=new --data-dir=/var/lib/etcd --snapshot-count=50000 --auto-compaction-retention=1 --max-request-bytes=10485760 --auto-compaction-mode=periodic --quota-backend-bytes=8589934592

Mar 10 14:43:41 master03 etcd[81803]: ready to serve client requests
Mar 10 14:43:41 master03 etcd[81803]: ready to serve client requests
Mar 10 14:43:41 master03 systemd[1]: Started Etcd Server.
Mar 10 14:43:41 master03 etcd[81803]: serving insecure client requests on 127.0.0.1:2379, this is strongly discouraged!
Mar 10 14:43:41 master03 etcd[81803]: serving client requests on 172.17.1.22:2379
Mar 10 14:43:41 master03 etcd[81803]: setting up the initial cluster version to 3.4
Mar 10 14:43:41 master03 etcd[81803]: set the initial cluster version to 3.4
Mar 10 14:43:41 master03 etcd[81803]: enabled capabilities for version 3.4
Mar 10 14:43:41 master03 etcd[81803]: established a TCP streaming connection with peer 20eec45319ea02e1 (stream Message reader)
Mar 10 14:43:41 master03 etcd[81803]: established a TCP streaming connection with peer 20eec45319ea02e1 (stream MsgApp v2 reader)
172.17.1.20 | CHANGED | rc=0 >>
● etcd.service - Etcd Server
   Loaded: loaded (/etc/systemd/system/etcd.service; enabled; vendor preset: disabled)
   Active: active (running) since Thu 2022-03-10 14:43:41 CST; 9s ago
     Docs: https://github.com/coreos
 Main PID: 103123 (etcd)
    Tasks: 13
   Memory: 10.7M
   CGroup: /system.slice/etcd.service
           └─103123 /opt/kube/bin/etcd --name=etcd-172.17.1.20 --cert-file=/etc/kubernetes/ssl/etcd.pem --key-file=/etc/kubernetes/ssl/etcd-key.pem --peer-cert-file=/etc/kubernetes/ssl/etcd.pem --peer-key-file=/etc/kubernetes/ssl/etcd-key.pem --trusted-ca-file=/etc/kubernetes/ssl/ca.pem --peer-trusted-ca-file=/etc/kubernetes/ssl/ca.pem --initial-advertise-peer-urls=https://172.17.1.20:2380 --listen-peer-urls=https://172.17.1.20:2380 --listen-client-urls=https://172.17.1.20:2379,http://127.0.0.1:2379 --advertise-client-urls=https://172.17.1.20:2379 --initial-cluster-token=etcd-cluster-0 --initial-cluster=etcd-172.17.1.20=https://172.17.1.20:2380,etcd-172.17.1.21=https://172.17.1.21:2380,etcd-172.17.1.22=https://172.17.1.22:2380 --initial-cluster-state=new --data-dir=/var/lib/etcd --snapshot-count=50000 --auto-compaction-retention=1 --max-request-bytes=10485760 --auto-compaction-mode=periodic --quota-backend-bytes=8589934592

Mar 10 14:43:41 master01 etcd[103123]: serving insecure client requests on 127.0.0.1:2379, this is strongly discouraged!
Mar 10 14:43:41 master01 etcd[103123]: serving client requests on 172.17.1.20:2379
Mar 10 14:43:41 master01 systemd[1]: Started Etcd Server.
Mar 10 14:43:41 master01 etcd[103123]: 20eec45319ea02e1 initialized peer connection; fast-forwarding 8 ticks (election ticks 10) with 2 active peer(s)
Mar 10 14:43:41 master01 etcd[103123]: set the initial cluster version to 3.4
Mar 10 14:43:41 master01 etcd[103123]: enabled capabilities for version 3.4
Mar 10 14:43:41 master01 etcd[103123]: established a TCP streaming connection with peer fc24d224af6a71d9 (stream Message writer)
Mar 10 14:43:41 master01 etcd[103123]: established a TCP streaming connection with peer fc24d224af6a71d9 (stream MsgApp v2 writer)
Mar 10 14:43:41 master01 etcd[103123]: established a TCP streaming connection with peer 395af18f6bbef1a (stream Message writer)
Mar 10 14:43:41 master01 etcd[103123]: established a TCP streaming connection with peer 395af18f6bbef1a (stream MsgApp v2 writer)

Checking the etcd cluster status shows that the cluster has successfully elected a leader again.

Then start the kube-apiserver service on each master node in turn:

# systemctl start kube-apiserver
# systemctl status kube-apiserver
[root@master01 ~]# ansible master -m shell -a "systemctl start kube-apiserver"
172.17.1.22 | CHANGED | rc=0 >>
172.17.1.21 | CHANGED | rc=0 >>
172.17.1.20 | CHANGED | rc=0 >>

Check the Kubernetes cluster status:

[root@master01 ~]# kubectl get cs
Warning: v1 ComponentStatus is deprecated in v1.19+
NAME                 STATUS    MESSAGE             ERROR
scheduler            Healthy   ok
controller-manager   Healthy   ok
etcd-1               Healthy   {"health":"true"}
etcd-2               Healthy   {"health":"true"}
etcd-0               Healthy   {"health":"true"}

Check the Kubernetes resources:

[root@master01 ~]# kubectl get pod -A
NAMESPACE     NAME                                      READY   STATUS    RESTARTS   AGE
kube-system   calico-kube-controllers-5677ffd49-t7dbx   1/1     Running   0          3h27m
kube-system   calico-node-2hrhv                         1/1     Running   0          3h27m
kube-system   calico-node-5kc79                         1/1     Running   0          3h27m
kube-system   calico-node-6v294                         1/1     Running   0          3h27m
kube-system   calico-node-hkm8l                         1/1     Running   0          3h27m
kube-system   calico-node-kkpmd                         1/1     Running   0          3h27m
kube-system   calico-node-tsmwx                         1/1     Running   0          3h27m
kube-system   coredns-5787695b7f-8ql8q                  1/1     Running   0          3h27m
kube-system   metrics-server-8568cf894b-ztcgf           1/1     Running   1          3h27m
kube-system   node-local-dns-5mbzg                      1/1     Running   0          3h27m
kube-system   node-local-dns-c5d9j                      1/1     Running   0          3h27m
kube-system   node-local-dns-nqnjw                      1/1     Running   0          3h27m
kube-system   node-local-dns-rz565                      1/1     Running   0          3h27m
kube-system   node-local-dns-skmzk                      1/1     Running   0          3h27m
kube-system   node-local-dns-zcncq                      1/1     Running   0          3h27m
kube-system   traefik-79f5f7879c-nwhlq                  1/1     Running   0          3h27m

After the etcd cluster data is restored, the Pods gradually return to the Running state. The entire Kubernetes cluster has now been recovered from the etcd backup.

III. Summary

Backing up a Kubernetes cluster chiefly means backing up the etcd cluster. When restoring, the order of operations is the main thing to get right:

Stop kube-apiserver --> stop etcd --> restore the data --> start etcd --> start kube-apiserver
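The order above can be kept handy as a printable checklist (a sketch; the ansible group name "master" follows this article's convention, and each step must complete on all nodes before the next begins):

```shell
# Print the recovery runbook in order.
recovery_steps() {
cat <<'EOF'
1. ansible master -m shell -a "systemctl stop kube-apiserver"
2. ansible master -m shell -a "systemctl stop etcd"
3. on each etcd node: move /var/lib/etcd aside, then run etcdctl snapshot restore
4. ansible master -m shell -a "systemctl start etcd"
5. ansible master -m shell -a "systemctl start kube-apiserver"
EOF
}
recovery_steps
```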

Key points:

  • When backing up the etcd cluster, take the backup on a single node, then sync it to the other nodes.
  • When restoring, the backup taken on that one node is used to restore every node.

Reference

K8S Cluster Disaster-Recovery Deployment (散尽浮华, 博客园)
https://www.cnblogs.com/kevingrace/p/14616824.html
