Fix: a stale node in a k3s cluster's etcd member list breaks etcd and blocks new nodes from joining
emengweb    2023-12-08 15:05:22    k3s k8s rancher etcd

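Identifying the problem

A k3s cluster deployed by Rancher could not accept new etcd nodes. The logs showed an error: the node prepaid-de, which should already have been deleted, was still present in the etcd group. The node itself had been removed, but its entry was wrongly left in the etcd member list, which prevented the etcd service from validating correctly.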

Approach

Remove the stale member prepaid-de to restore the availability of the etcd cluster.

Solution

Install the etcdctl tool on a control-plane (server) node of the k3s cluster.

1. Install etcdctl

Method 1: install it directly with apt install

apt install etcd-client

Method 2: download the etcdctl build that matches your etcd version and install it manually

First, query the etcd version:

curl -L --cacert /var/lib/rancher/k3s/server/tls/etcd/server-ca.crt --cert /var/lib/rancher/k3s/server/tls/etcd/server-client.crt --key /var/lib/rancher/k3s/server/tls/etcd/server-client.key https://127.0.0.1:2379/version
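
The endpoint answers with a small JSON object, which on etcd 3.x looks like {"etcdserver":"3.5.0","etcdcluster":"3.5.0"}. As a minimal sketch, assuming that response shape, the helper below turns it into the v-prefixed string used for ETCD_VER in the next step. The name etcd_ver_from_json is made up for this example and is not part of k3s or etcd.

```shell
# Hypothetical helper: read the /version JSON on stdin and print "v<etcdserver>".
# Assumes the response contains a field of the form "etcdserver":"3.5.0".
etcd_ver_from_json() {
  sed -n 's/.*"etcdserver":"\([^"]*\)".*/v\1/p'
}

# Example with a canned response; prints v3.5.0
echo '{"etcdserver":"3.5.0","etcdcluster":"3.5.0"}' | etcd_ver_from_json
```

On a live node, pipe the output of the curl command above (with -s added to silence the progress meter) into etcd_ver_from_json instead of the canned string.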

Change the version number after ETCD_VER=v to the one obtained above, then run the following commands to install the matching etcdctl:

ETCD_VER=v3.5.0
# choose either URL
GOOGLE_URL=https://storage.googleapis.com/etcd
GITHUB_URL=https://github.com/etcd-io/etcd/releases/download
DOWNLOAD_URL=${GOOGLE_URL}
rm -f /tmp/etcd-${ETCD_VER}-linux-amd64.tar.gz
rm -rf /tmp/etcd-download-test && mkdir -p /tmp/etcd-download-test
curl -L ${DOWNLOAD_URL}/${ETCD_VER}/etcd-${ETCD_VER}-linux-amd64.tar.gz -o /tmp/etcd-${ETCD_VER}-linux-amd64.tar.gz
tar xzvf /tmp/etcd-${ETCD_VER}-linux-amd64.tar.gz -C /usr/local/bin --strip-components=1
rm -f /tmp/etcd-${ETCD_VER}-linux-amd64.tar.gz
etcd --version
etcdctl version

2. List the etcd members and get the member ID

Run etcdctl member list to list the etcd members and find the ID of the stale member prepaid-de:

ETCDCTL_ENDPOINTS='https://127.0.0.1:2379' ETCDCTL_CACERT='/var/lib/rancher/k3s/server/tls/etcd/server-ca.crt' ETCDCTL_CERT='/var/lib/rancher/k3s/server/tls/etcd/server-client.crt' ETCDCTL_KEY='/var/lib/rancher/k3s/server/tls/etcd/server-client.key' ETCDCTL_API=3 etcdctl member list
55a46072d57ff780, started, prepaid-de-e09366eb, https://173.249.x.x:2380, https://173.249.x.x:2379, true
81eee0d08a87f11f, started, dal-120fff0b, https://185.241.x.x:2380, https://185.241.x.x:2379, false
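
In this output the member ID is the first comma-separated field and the member name is the third. As a rough sketch, assuming the default comma-separated output format shown here, the ID can be picked out by name prefix instead of by eye. The function name member_id_by_name is hypothetical.

```shell
# Hypothetical helper: print the ID (first column) of every member whose
# name (third column) starts with the given prefix.
# Reads `etcdctl member list` output in its default format on stdin.
member_id_by_name() {
  awk -F', ' -v name="$1" 'index($3, name) == 1 { print $1 }'
}

# Example with the two members from this article; prints 55a46072d57ff780
member_id_by_name prepaid-de <<'EOF'
55a46072d57ff780, started, prepaid-de-e09366eb, https://173.249.x.x:2380, https://173.249.x.x:2379, true
81eee0d08a87f11f, started, dal-120fff0b, https://185.241.x.x:2380, https://185.241.x.x:2379, false
EOF
```

Matching on a prefix rather than the full name is deliberate, since the names in the list carry a generated suffix (prepaid-de-e09366eb). If the prefix matches more than one member, every matching ID is printed, so check the result before removing anything.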

3. Remove the stale etcd member

Run etcdctl member remove to remove the stale etcd member:

ETCDCTL_ENDPOINTS='https://127.0.0.1:2379' ETCDCTL_CACERT='/var/lib/rancher/k3s/server/tls/etcd/server-ca.crt' ETCDCTL_CERT='/var/lib/rancher/k3s/server/tls/etcd/server-client.crt' ETCDCTL_KEY='/var/lib/rancher/k3s/server/tls/etcd/server-client.key' ETCDCTL_API=3 etcdctl member remove 55a46072d57ff780
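
Because member remove takes effect immediately, it can be worth guarding against pasting a node name or a truncated string where the ID belongs. The following is only a sketch: is_member_id and remove_member are hypothetical names, and the wrapper assumes the ETCDCTL_* variables from the command above have already been exported in the current shell.

```shell
# Hypothetical guard: succeed only if the argument looks like an etcd member ID,
# i.e. 1 to 16 lowercase hexadecimal characters as printed by `etcdctl member list`.
is_member_id() {
  case "$1" in
    ''|*[!0-9a-f]*) return 1 ;;
  esac
  [ "${#1}" -le 16 ]
}

# Hypothetical wrapper: validate the ID first, then run the same
# `etcdctl member remove` command as above.
# Assumes ETCDCTL_ENDPOINTS, ETCDCTL_CACERT, ETCDCTL_CERT and ETCDCTL_KEY are exported.
remove_member() {
  if ! is_member_id "$1"; then
    echo "not a member ID: $1" >&2
    return 1
  fi
  etcdctl member remove "$1"
}

# Example of the guard alone; prints "looks like a member ID"
is_member_id 55a46072d57ff780 && echo "looks like a member ID"
```

With the variables exported, remove_member 55a46072d57ff780 does the same as the one-line command, while remove_member prepaid-de is rejected before etcdctl is ever called.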

At this point the etcd service is back to normal and new members can join.
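
To confirm the result rather than assume it, the member list can be checked once more. The sketch below, with the hypothetical name member_absent, succeeds only when no member name starts with the given prefix. It again assumes the default comma-separated member list format.

```shell
# Hypothetical check: succeed if no member name (third column) starts with
# the given prefix. Reads `etcdctl member list` output on stdin.
member_absent() {
  ! awk -F', ' -v name="$1" 'index($3, name) == 1 { found = 1 } END { exit !found }'
}

# Example with the list as it should look after the removal;
# prints "prepaid-de is gone"
echo '81eee0d08a87f11f, started, dal-120fff0b, https://185.241.x.x:2380, https://185.241.x.x:2379, false' \
  | member_absent prepaid-de && echo "prepaid-de is gone"
```

On a live node, pipe the etcdctl member list command from step 2 into member_absent prepaid-de. Running etcdctl endpoint health with the same ETCDCTL_* variables additionally reports whether the local endpoint is healthy.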
