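Problem diagnosis
A k3s cluster deployed by Rancher could not add a new etcd node. The logs showed an error: the etcd member list still contained the node prepaid-de, which had already been deleted from the cluster. Because this stale entry was left behind in the etcd member list, the etcd service could not validate its membership properly.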
Approach
Remove the stale node prepaid-de from the member list to restore the availability of the etcd cluster.
Solution
Install the etcdctl tool on the k3s control-plane (server) node.
1. Install etcdctl
Method 1: install it directly with apt install
apt install etcd-client
Method 2: download the etcdctl build that matches your etcd version and install it manually
First, query the etcd version:
curl -L --cacert /var/lib/rancher/k3s/server/tls/etcd/server-ca.crt --cert /var/lib/rancher/k3s/server/tls/etcd/server-client.crt --key /var/lib/rancher/k3s/server/tls/etcd/server-client.key https://127.0.0.1:2379/version
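The version endpoint answers with a small JSON document. As a minimal sketch, assuming the response contains an "etcdserver" field such as {"etcdserver":"3.5.0","etcdcluster":"3.5.0"}, the helper below (the name etcd_version_tag is hypothetical, not part of etcd or k3s) turns that response into the v-prefixed tag used in the download step:

```shell
#!/bin/sh
# etcd_version_tag (hypothetical helper): read the JSON body returned by
# the /version endpoint on stdin and print the server version with a
# leading "v". Prints nothing if no "etcdserver" field is found.
etcd_version_tag() {
    sed -n 's/.*"etcdserver"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/v\1/p'
}

# Usage (pipe the curl command from the article into the helper):
#   ETCD_VER=$(curl -sL --cacert ... --cert ... --key ... \
#       https://127.0.0.1:2379/version | etcd_version_tag)
```

Using sed keeps the sketch dependency-free; on hosts where jq is installed, a JSON-aware parser would be the more robust choice.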
Change the version number after ETCD_VER=v to the version obtained above, then run the following commands to install the matching etcdctl:
ETCD_VER=v3.5.0
# choose either URL
GOOGLE_URL=https://storage.googleapis.com/etcd
GITHUB_URL=https://github.com/etcd-io/etcd/releases/download
DOWNLOAD_URL=${GOOGLE_URL}
rm -f /tmp/etcd-${ETCD_VER}-linux-amd64.tar.gz
rm -rf /tmp/etcd-download-test && mkdir -p /tmp/etcd-download-test
curl -L ${DOWNLOAD_URL}/${ETCD_VER}/etcd-${ETCD_VER}-linux-amd64.tar.gz -o /tmp/etcd-${ETCD_VER}-linux-amd64.tar.gz
tar xzvf /tmp/etcd-${ETCD_VER}-linux-amd64.tar.gz -C /usr/local/bin --strip-components=1
rm -f /tmp/etcd-${ETCD_VER}-linux-amd64.tar.gz
etcd --version
etcdctl version
2. List the etcd members and get the node ID
Run the etcdctl member list command to list the etcd members and find the ID of the stale node prepaid-de:
ETCDCTL_ENDPOINTS='https://127.0.0.1:2379' ETCDCTL_CACERT='/var/lib/rancher/k3s/server/tls/etcd/server-ca.crt' ETCDCTL_CERT='/var/lib/rancher/k3s/server/tls/etcd/server-client.crt' ETCDCTL_KEY='/var/lib/rancher/k3s/server/tls/etcd/server-client.key' ETCDCTL_API=3 etcdctl member list
Output:
55a46072d57ff780, started, prepaid-de-e09366eb, https://173.249.x.x:2380, https://173.249.x.x:2379, true
81eee0d08a87f11f, started, dal-120fff0b, https://185.241.x.x:2380, https://185.241.x.x:2379, false
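In a longer member list it is easy to copy the wrong ID by hand. The sketch below assumes the default comma-separated output shown above, where the first field is the member ID and the third is the member name. The function name member_id_by_name is hypothetical and only illustrates the lookup:

```shell
#!/bin/sh
# member_id_by_name PREFIX (hypothetical helper): read the output of
# `etcdctl member list` on stdin and print the ID (field 1) of every
# member whose name (field 3) starts with PREFIX.
member_id_by_name() {
    awk -F', ' -v prefix="$1" 'index($3, prefix) == 1 { print $1 }'
}

# Usage:
#   etcdctl member list | member_id_by_name prepaid-de
```

The helper prints one ID per matching member, so check that exactly one line comes back before using the result in a removal command.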
3. Remove the stale etcd member
Run the etcdctl member remove command to remove the stale etcd node:
ETCDCTL_ENDPOINTS='https://127.0.0.1:2379' ETCDCTL_CACERT='/var/lib/rancher/k3s/server/tls/etcd/server-ca.crt' ETCDCTL_CERT='/var/lib/rancher/k3s/server/tls/etcd/server-client.crt' ETCDCTL_KEY='/var/lib/rancher/k3s/server/tls/etcd/server-client.key' ETCDCTL_API=3 etcdctl member remove 55a46072d57ff780
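Repeating the full set of variables on every call is error-prone. One possible arrangement, shown here only as a sketch, exports the same settings once and rejects arguments that do not look like a hexadecimal member ID (for example a node name pasted by mistake). The function names k3s_etcd_env and remove_member and the ETCDCTL_BIN override are assumptions made for this illustration; the certificate paths are the ones used in the commands above.

```shell
#!/bin/sh
# k3s_etcd_env (hypothetical helper): export the etcdctl connection
# settings used in this article so later calls can stay short.
k3s_etcd_env() {
    tls=/var/lib/rancher/k3s/server/tls/etcd
    export ETCDCTL_API=3
    export ETCDCTL_ENDPOINTS='https://127.0.0.1:2379'
    export ETCDCTL_CACERT="$tls/server-ca.crt"
    export ETCDCTL_CERT="$tls/server-client.crt"
    export ETCDCTL_KEY="$tls/server-client.key"
}

# remove_member ID (hypothetical helper): refuse an empty argument or one
# containing anything other than lowercase hex digits, then run
# `etcdctl member remove ID`. ETCDCTL_BIN is an assumed override that
# lets the binary be swapped out, e.g. for a dry run with echo.
remove_member() {
    case "$1" in
        ''|*[!0-9a-f]*)
            echo "not a member ID: $1" >&2
            return 1
            ;;
    esac
    k3s_etcd_env
    "${ETCDCTL_BIN:-etcdctl}" member remove "$1"
}

# Usage:
#   ETCDCTL_BIN=echo remove_member 55a46072d57ff780   # dry run
#   remove_member 55a46072d57ff780                    # real removal
```

The check only validates the shape of the argument. It does not confirm that the ID belongs to the stale node, so the member list should still be inspected first.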
At this point the etcd service is back to normal, and new members can join the cluster.