
Aug 10 15:43:53 hetu-fr-2d kubelet[1586]: E0810 15:43:53.135619    1586 controller.go:178] failed to update node lease, error: Put https://kapi-xxx-xx:6443/apis/coordination.k8s.io/v1/namespaces/kube-node-lease/leases/node02?timeout=10s: read tcp> read: connection reset by peer




Investigating this error: "watch chan error: etcdserver: mvcc: required revision has been compacted". It may be related to etcd compaction, and disk performance may be involved.

etcd is the Kubernetes backing store for all cluster data. It has a history compaction mechanism to avoid performance degradation and eventual exhaustion of storage space; the etcd repository has documentation on this. "watch chan error: etcdserver: mvcc: required revision has been compacted" literally means that the watched revision has been compacted. This is working as designed: an attempt to re-establish a watch from a resourceVersion that is no longer available prompts the caller to re-list the objects and obtain a new, current resourceVersion to watch from. As for your comment that "kubectl get cs" shows "unknown" for etcd, kube-controller-manager, and kube-scheduler: what does that mean? Could you paste the output of "kubectl get cs" here?
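The re-list behavior described above can be illustrated with a small self-contained sketch. TinyStore is a hypothetical in-memory stand-in written only for this example; it is not the etcd or Kubernetes client API, and its compaction rule is a simplification.

```python
class CompactedError(Exception):
    """Raised when a watch starts from a revision that was compacted away."""


class TinyStore:
    """Toy in-memory stand-in for an MVCC store (hypothetical, not etcd's API)."""

    def __init__(self):
        self.revision = 0    # latest revision written
        self.compacted = 0   # history up to and including this revision is gone
        self.events = []     # retained history: (revision, key, value)
        self.state = {}      # current key/value state

    def put(self, key, value):
        self.revision += 1
        self.state[key] = value
        self.events.append((self.revision, key, value))

    def compact(self, revision):
        """Discard history up to and including `revision`."""
        self.compacted = revision
        self.events = [e for e in self.events if e[0] > revision]

    def list(self):
        """Return a snapshot of the current state and the revision it reflects."""
        return dict(self.state), self.revision

    def events_since(self, revision):
        """Return events after `revision`, or fail if that history is gone."""
        if revision < self.compacted:
            raise CompactedError("required revision has been compacted")
        return [e for e in self.events if e[0] > revision]


def sync(store, cache, last_seen):
    """Bring `cache` up to date; fall back to a full re-list after compaction."""
    try:
        for rev, key, value in store.events_since(last_seen):
            cache[key] = value
            last_seen = rev
    except CompactedError:
        # The watch revision is no longer available: re-list and resume
        # from the revision of the fresh snapshot.
        snapshot, last_seen = store.list()
        cache.clear()
        cache.update(snapshot)
    return last_seen
```

In this sketch the compaction error is not a failure of the store; the caller simply recovers by listing again, which matches the "working as designed" explanation above.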

The etcd log shows: "msg":"leader failed to send out heartbeat on time; took too long, leader is overloaded likely from slow disk"

Aug 11 18:00:02 etcd-1 etcd[1592]: {"level":"warn","ts":"2021-08-11T18:00:02.888+0800","caller":"etcdserver/raft.go:390","msg":"leader failed to send out heartbeat on time; took too long, leader is overloaded likely from slow disk","to":"566bb4bcdc8f3e93","heartbeat-interval":"100ms","expected-duration":"200ms","exceeded-duration":"18.959836ms"}
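To compare several such warnings, the timing fields can be pulled out of the JSON payload. The following is a minimal sketch that assumes the journald prefix contains no "{" and that durations use a single simple unit as in the line above; the function names are made up for this example.

```python
import json
import re

# Conversion factors from a unit suffix to milliseconds.
_UNITS_MS = {"ns": 1e-6, "us": 1e-3, "µs": 1e-3, "ms": 1.0, "s": 1000.0}

SAMPLE = (
    'Aug 11 18:00:02 etcd-1 etcd[1592]: {"level":"warn","ts":"2021-08-11T18:00:02.888+0800",'
    '"caller":"etcdserver/raft.go:390",'
    '"msg":"leader failed to send out heartbeat on time; took too long, '
    'leader is overloaded likely from slow disk",'
    '"to":"566bb4bcdc8f3e93","heartbeat-interval":"100ms",'
    '"expected-duration":"200ms","exceeded-duration":"18.959836ms"}'
)


def duration_ms(text):
    """Convert a single-unit duration such as '18.959836ms' to milliseconds."""
    match = re.fullmatch(r"([0-9.]+)(ns|us|µs|ms|s)", text)
    if match is None:
        raise ValueError(f"unsupported duration: {text!r}")
    return float(match.group(1)) * _UNITS_MS[match.group(2)]


def parse_heartbeat_warning(line):
    """Extract the timing fields from a log line, or return None if absent."""
    start = line.find("{")
    if start < 0:
        return None
    entry = json.loads(line[start:])
    if "heartbeat-interval" not in entry:
        return None
    return {
        "to": entry["to"],
        "heartbeat_ms": duration_ms(entry["heartbeat-interval"]),
        "expected_ms": duration_ms(entry["expected-duration"]),
        "exceeded_ms": duration_ms(entry["exceeded-duration"]),
    }


if __name__ == "__main__":
    print(parse_heartbeat_warning(SAMPLE))
```

For the line above, the expected duration (200ms) is twice the heartbeat interval (100ms), and the leader overshot it by roughly 19ms.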



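  • etcd uses the Raft algorithm. The leader periodically sends a heartbeat to every follower; if the leader fails to send a heartbeat for two consecutive heartbeat intervals, etcd prints this log line as a warning. The usual cause is a slow disk: the leader typically attaches some metadata to the heartbeat, and it has to persist that data to disk before it can send. The disk write may be competing with other applications, or the disk may be slow because it is virtualized or a SATA drive, in which case only better, faster disk hardware solves the problem. The metric wal_fsync_duration_seconds (full name etcd_disk_wal_fsync_duration_seconds) that etcd exposes to Prometheus reports how long WAL fsync calls take; it should normally stay below 10 ms (the etcd documentation states this target for the 99th percentile).

  • The second possible cause is insufficient CPU capacity. If monitoring confirms that CPU utilization really is high, move etcd to a better machine, then use cgroups to give the etcd process dedicated cores, or raise the priority of the etcd process.

  • The third possible cause is a slow network. If Prometheus shows poor network quality, such as high latency or a high packet loss rate, moving etcd to an uncongested network solves the problem. If etcd is deployed across data centers, however, long latency is unavoidable; in that case heartbeat-interval has to be tuned to the RTT between the data centers, and election-timeout must be at least 5 times heartbeat-interval.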
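To check the disk explanation against the 10 ms guideline, a percentile can be estimated from the cumulative histogram buckets of etcd_disk_wal_fsync_duration_seconds. The sketch below uses linear interpolation inside a bucket, similar in spirit to Prometheus' histogram_quantile(); the bucket counts are made-up illustration data, not measurements from this cluster.

```python
def histogram_quantile(q, buckets):
    """Estimate the q-quantile from cumulative (upper_bound, count) buckets.

    `buckets` must include a final bucket with upper bound float("inf").
    """
    buckets = sorted(buckets)
    total = buckets[-1][1]
    if total == 0:
        return float("nan")
    rank = q * total
    prev_bound, prev_count = 0.0, 0
    for bound, count in buckets:
        if count >= rank:
            if bound == float("inf"):
                # Cannot interpolate into the open-ended bucket.
                return prev_bound
            in_bucket = count - prev_count
            if in_bucket == 0:
                return bound
            fraction = (rank - prev_count) / in_bucket
            return prev_bound + (bound - prev_bound) * fraction
        prev_bound, prev_count = bound, count
    return prev_bound


# Hypothetical cumulative counts; upper bounds are in seconds.
EXAMPLE_BUCKETS = [
    (0.001, 0),
    (0.002, 100),
    (0.004, 600),
    (0.008, 900),
    (0.016, 980),
    (0.032, 1000),
    (float("inf"), 1000),
]

if __name__ == "__main__":
    p99 = histogram_quantile(0.99, EXAMPLE_BUCKETS)
    print(f"estimated p99 WAL fsync: {p99 * 1000:.1f} ms")
```

With the illustration data the estimated 99th percentile lands well above 10 ms, which is the kind of result that would point at the disk rather than at CPU or network.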


So the plan is to adjust the following etcd configuration parameters and restart etcd to see whether that helps:

heartbeat-interval: 1000
election-timeout: 10000
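Before restarting, the chosen values can be sanity-checked against the rules of thumb. This is a minimal sketch: the 5x ratio comes from the notes above, while the 50000 ms upper limit and the "at least 10x the round-trip time" rule are taken from the etcd tuning guide as I recall it and should be verified for the etcd version in use. The RTT value in the usage line is hypothetical.

```python
# Upper limit for election-timeout, per the etcd tuning guide (assumption).
MAX_ELECTION_TIMEOUT_MS = 50_000


def check_raft_timing(heartbeat_ms, election_ms, rtt_ms=None):
    """Return a list of human-readable problems; an empty list means OK."""
    problems = []
    if election_ms < 5 * heartbeat_ms:
        problems.append("election-timeout should be at least 5x heartbeat-interval")
    if election_ms > MAX_ELECTION_TIMEOUT_MS:
        problems.append("election-timeout exceeds 50000 ms")
    if rtt_ms is not None and election_ms < 10 * rtt_ms:
        problems.append("election-timeout should be at least 10x the peer RTT")
    return problems


if __name__ == "__main__":
    # Values from the notes above; no problems are reported for them.
    print(check_raft_timing(1000, 10000))
    # Hypothetical cross-data-center RTT of 150 ms with the default timings.
    print(check_raft_timing(100, 1000, rtt_ms=150))
```

Note that a larger election-timeout also means the cluster takes longer to notice a genuinely failed leader, so the values are a trade-off rather than a pure fix.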

Appendix: recommended etcd configuration, with notes

Note: in every "key: value" pair there must be a space between the ":" after the key and the value.

#[Member]
# This is the configuration file for the etcd server.

# Human-readable name for this member.
#      With static bootstrapping, this must match the key used in the initial-cluster setting. With discovery, every member must have a unique name.
#      Using the hostname or machine-id is recommended.
name: 'ptest01'

# Path to the data directory.
data-dir: /data/db/etcd

# Path to the dedicated wal directory.
wal-dir: /data/log/etcd

# Number of committed transactions to trigger a snapshot to disk.
snapshot-count: 10000

# Time (in milliseconds) of a heartbeat interval.
# Purpose: how often the leader sends a heartbeat to the followers.
heartbeat-interval: 1000

# Time (in milliseconds) for an election to timeout.
# Purpose: if a follower receives no heartbeat within this interval, it triggers a new election. Defaults to 1000 ms.
election-timeout: 10000

# Raise alarms when backend size exceeds the given quota. 0 means use the
# default quota.
quota-backend-bytes: 0

# List of comma separated URLs to listen on for peer traffic.
# Purpose: the URLs on which to listen for other etcd members.
listen-peer-urls:

# List of comma separated URLs to listen on for client traffic.
listen-client-urls: ","

# Maximum number of snapshot files to retain (0 is unlimited).
max-snapshots: 0

# Maximum number of wal files to retain (0 is unlimited).
max-wals: 0

# Comma-separated white list of origins for CORS (cross-origin resource sharing).
cors:

# List of this member's peer URLs to advertise to the rest of the cluster.
# The URLs need to be a comma-separated list.
initial-advertise-peer-urls:

# List of this member's client URLs to advertise to the public.
# The URLs need to be a comma-separated list.
advertise-client-urls:

# Discovery URL used to bootstrap the cluster.
discovery:

# Valid values include 'exit', 'proxy'
# Meaning: the expected behavior when the discovery service fails ('exit' or 'proxy'). 'proxy' supports only the v2 API.
discovery-fallback: 'proxy'

# HTTP proxy to use for traffic to discovery service.
discovery-proxy:

# DNS domain used to bootstrap initial cluster.
# Meaning: the DNS SRV domain used to bootstrap the cluster.
discovery-srv:

# Initial cluster configuration for bootstrapping.
initial-cluster: "ptest01=,ptest02=,ptest03="

# Initial cluster token for the etcd cluster during bootstrap.
# Meaning: the token used when creating the cluster; keep this value unique per cluster.
# Purpose: with a distinct token, re-creating a cluster generates new cluster and member UUIDs even if the rest of the configuration is unchanged; otherwise multiple clusters can conflict with each other and cause unpredictable errors.
initial-cluster-token: 'p-etcd-cluster'

# Initial cluster state ('new' or 'existing').
initial-cluster-state: 'new'

# Reject reconfiguration requests that would cause quorum loss.
strict-reconfig-check: false

# Accept etcd V2 client requests
enable-v2: true

# Enable runtime profiling data via HTTP server
enable-pprof: true

# Proxy-related flags
# Note: --proxy configures etcd to run in proxy mode; "proxy" supports only the v2 API.
# Valid values include 'on', 'readonly', 'off'
# Meaning: the proxy mode setting ("off", "readonly" or "on").
proxy: 'off'

# Time (in milliseconds) an endpoint will be held in a failed state.
# Meaning: how long (in milliseconds) an endpoint stays in the failed state before it is reconsidered for proxied requests.
proxy-failure-wait: 5000

# Time (in milliseconds) of the endpoints refresh interval.
proxy-refresh-interval: 30000

# Time (in milliseconds) for a dial to timeout.
proxy-dial-timeout: 1000

# Time (in milliseconds) for a write to timeout.
proxy-write-timeout: 5000

# Time (in milliseconds) for a read to timeout.
proxy-read-timeout: 0

#[Security]
client-transport-security:
  # Path to the client server TLS cert file.
  # Default: (empty). Environment variable: ETCD_CERT_FILE
  cert-file:

  # Path to the client server TLS key file.
  # Default: (empty). Environment variable: ETCD_KEY_FILE
  key-file:

  # Enable client cert authentication.
  # Default: false. Environment variable: ETCD_CLIENT_CERT_AUTH
  client-cert-auth: false

  # Path to the client server TLS trusted CA cert file.
  # Default: (empty). Environment variable: ETCD_TRUSTED_CA_FILE
  trusted-ca-file:

  # Client TLS using generated certificates
  # Default: false. Environment variable: ETCD_AUTO_TLS
  auto-tls: false

peer-transport-security:
  # Path to the peer server TLS cert file.
  # Meaning: this is the certificate for peer traffic, used for both server and client.
  # Default: (empty). Environment variable: ETCD_PEER_CERT_FILE
  cert-file:

  # Path to the peer server TLS key file.
  # Meaning: this is the key for peer traffic, used for both server and client.
  # Default: (empty). Environment variable: ETCD_PEER_KEY_FILE
  key-file:

  # Enable peer client cert authentication.
  # Default: false. Environment variable: ETCD_PEER_CLIENT_CERT_AUTH
  client-cert-auth: false

  # Path to the peer server TLS trusted CA cert file.
  # Default: (empty). Environment variable: ETCD_PEER_TRUSTED_CA_FILE
  trusted-ca-file:

  # Peer TLS using generated certificates.
  # Default: false. Environment variable: ETCD_PEER_AUTO_TLS
  auto-tls: false

# Enable debug-level logging for etcd.
debug: false

# Meaning: set individual etcd subpackages to specific log levels. An example is etcdserver=WARNING,security=DEBUG
log-package-levels: 'etcdserver=ERROR,security=DEBUG,auth=ERROR'

# Meaning: specify 'zap' or 'capnslog' for structured logging.
logger: zap

# Specify 'stdout' or 'stderr' to skip journald logging even when running under systemd.
log-outputs: [stderr]

# Force to create a new one member cluster.
force-new-cluster: false

# Meaning: how auto-compaction-retention is interpreted. There are two modes: 'periodic' (time-based retention) and 'revision' (revision-based retention).
auto-compaction-mode: periodic

# Meaning: auto-compaction retention for the mvcc key-value store, in hours. 0 disables auto-compaction.
auto-compaction-retention: "1"

# Auth flags
auth-token-ttl: 3600000
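The note about the mandatory space after ":" is easy to violate when editing by hand. A rough pre-flight check could look like the sketch below; it is a naive line-based heuristic written for this example, not a YAML parser, and the function name is made up.

```python
import re

# A key immediately followed by ':' and a non-space character, e.g. "name:'x'".
_MISSING_SPACE = re.compile(r"^\s*[A-Za-z0-9_-]+:\S")


def lines_missing_space(text):
    """Return 1-based line numbers where a key's ':' is not followed by a space."""
    bad = []
    for number, line in enumerate(text.splitlines(), start=1):
        if line.lstrip().startswith("#"):
            continue  # comments are ignored
        if _MISSING_SPACE.match(line):
            bad.append(number)
    return bad


if __name__ == "__main__":
    sample = "name: 'ptest01'\nsnapshot-count:10000\ndiscovery:\n"
    print(lines_missing_space(sample))
```

Keys with an empty value, such as "discovery:", are not flagged because nothing follows the colon.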

