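# Upgrading a single-node kubeadm Kubernetes cluster to a highly available cluster

## Overview

For a production environment, a single master node is too risky, so it is well worth building a highly available cluster. High availability here mainly concerns the control plane, that is, the kube-apiserver, etcd, kube-controller-manager, and kube-scheduler components. kube-controller-manager and kube-scheduler get high availability from Kubernetes itself, while for apiserver and etcd we have to set up a highly available cluster manually.

There are many high-availability architectures, such as the typical haproxy + keepalived setup, or nginx used as a proxy. To illustrate how to upgrade a single master to a highly available cluster, we use the comparatively simpler nginx approach. It has some drawbacks, but it is enough to show how high availability is achieved.

![img](http://xieys.club/images/posts/2174174-20210613174758133-1534182897.png)

As the figure above shows, nginx and keepalived should be installed on all control plane nodes to proxy the apiserver. Here I prepared 2 nodes as control plane nodes (production usually needs at least 3): master and master1. I assume that Docker is already installed and configured on every node and that node initialization is done. (Because machine resources are limited, only one nginx host is used here and keepalived is not installed.)

| Hostname | IP             |
| -------- | -------------- |
| master   | 192.168.116.20 |
| master1  | 192.168.116.21 |
| nginx    | 192.168.116.25 |
| node1    | 192.168.116.30 |

## Procedure

### Update the certificate

Since the cluster is being converted into a highly available one, a load balancer will sit in front of the APIServer, and access to the APIServer through that load balancer has to work. The default APIServer certificate therefore needs to be updated, because it does not contain the addresses we need; the SAN list must include some additional names.

First we need a kubeadm configuration file. If you installed the cluster with a configuration file, you can update that file directly. If you did not, and installed the cluster directly with kubeadm init, you can create a configuration file from the kubeadm configuration stored in the cluster, because kubeadm writes its configuration to a ConfigMap named kubeadm-config in the kube-system namespace. The following command exports that configuration into a kubeadm.yaml file:

```
[root@master ~]# kubectl -n kube-system get configmap kubeadm-config -o jsonpath='{.data.ClusterConfiguration}' > kubeadm.yaml
```

The generated kubeadm.yaml does not list any extra SANs, so we need to add a certSANs list under the apiServer property. If you started the cluster with a kubeadm configuration file, it may already contain a certSANs list; if not, add it. Here we add the new domain name api.k8s.local, the hostnames master and master1, and the IP addresses 192.168.116.20, 192.168.116.21, and 192.168.116.25. Multiple IPs can be added; 192.168.116.25 is the VIP (in this setup, the address of the nginx load balancer). The data to add under apiServer looks like this:

```
[root@master ~]# cat kubeadm.yaml
apiServer:
  certSANs:
  - api.k8s.local
  - master
  - master1
  - 192.168.116.20
  - 192.168.116.21
  - 192.168.116.25
  extraArgs:
    authorization-mode: Node,RBAC
  timeoutForControlPlane: 4m0s
apiVersion: kubeadm.k8s.io/v1beta2
certificatesDir: /etc/kubernetes/pki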
clusterName: kubernetes
controllerManager: {}
dns:
  type: CoreDNS
etcd:
  local:
    dataDir: /var/lib/etcd
imageRepository: registry.aliyuncs.com/google_containers
kind: ClusterConfiguration
kubernetesVersion: v1.16.9
networking:
  dnsDomain: cluster.local
  podSubnet: 10.244.0.0/16
  serviceSubnet: 10.96.0.0/12
scheduler: {}
```

Only the certSANs entries under apiServer are new. They are added on top of the standard SAN list, so there is no need to worry that names such as kubernetes and kubernetes.default are missing here; those are part of the standard SAN list.

After updating the kubeadm configuration file we can update the certificate. First move the existing APIServer certificate and key out of the way, because if kubeadm detects that they already exist in the designated location, it will not create new ones.

```
# Back up
[root@master ~]# mv /etc/kubernetes/pki/apiserver.{crt,key} .
# Generate the new certificate
[root@master ~]# kubeadm init phase certs apiserver --config kubeadm.yaml
[certs] Generating "apiserver" certificate and key
[certs] apiserver serving cert is signed for DNS names [master kubernetes kubernetes.default kubernetes.default.svc kubernetes.default.svc.cluster.local api.k8s.local master master1] and IPs [10.96.0.1 192.168.116.20 192.168.116.20 192.168.116.21 192.168.116.25]
```

The output shows the DNS names and IP addresses the APIServer certificate is signed for. Always compare it with the names you intended to sign; if anything is missing, add it to certSANs above and regenerate the certificate.

The command uses the kubeadm configuration file specified above to generate a new certificate and key for the APIServer. Because that file contains the certSANs list, kubeadm automatically adds these SANs when creating the new certificate.

The last step is to restart the APIServer so that it picks up the new certificate. The simplest way is to restart the APIServer container directly:

```
[root@master ~]# docker restart `docker ps | grep kube-apiserver | grep -v pause | awk '{print $1}'`
```

### Verify the certificate

To verify that the certificate was updated, we can edit the APIServer address in the kubeconfig file, replace it with one of the newly added IP addresses or hostnames, and then use kubectl against the cluster to see whether it still works.

We can also use the *openssl* command to check whether the generated certificate contains the newly added SAN entries:

```
[root@master ~]# openssl x509 -in /etc/kubernetes/pki/apiserver.crt -text
...
DNS:master, DNS:kubernetes, DNS:kubernetes.default, DNS:kubernetes.default.svc, DNS:kubernetes.default.svc.cluster.local, DNS:api.k8s.local, DNS:master, DNS:master1, IP Address:10.96.0.1, IP Address:192.168.116.20, IP Address:192.168.116.20, IP Address:192.168.116.21, IP Address:192.168.116.25
...
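```

If everything above went well, the last step is to save the cluster configuration back to the kubeadm-config ConfigMap in the cluster. This is very important: when kubeadm is used on the cluster later, the data will not be lost. During an upgrade, for example, the certificate will still be signed with the entries in certSANs.

```
[root@master ~]# kubeadm config upload from-file --config kubeadm.yaml
Command "from-file" is deprecated, please see kubeadm init phase upload-config
[upload-config] Storing the configuration used in ConfigMap "kubeadm-config" in the "kube-system" Namespace
# If the command above fails, edit the ConfigMap directly and add the required content
[root@master ~]# kubectl -n kube-system edit configmap kubeadm-config
```

After saving the configuration with the command above, the following command verifies that it was stored:

```
[root@master ~]# kubectl -n kube-system get configmap kubeadm-config -o yaml
apiVersion: v1
data:
  ClusterConfiguration: |
    apiServer:
      certSANs:
      - api.k8s.local
      - master
      - master1
      - 192.168.116.20
      - 192.168.116.21
      - 192.168.116.25
...
```

Updating the names in the APIServer certificate is needed in many scenarios, such as adding a load balancer in front of the control plane, or adding new DNS names or IP addresses as control plane endpoints, so it is well worth knowing how to update the cluster certificate.

### Deploy nginx

As a container cluster system, Kubernetes gives Pods self-healing through health checks plus restart policies, spreads Pods across nodes through its scheduling algorithm while keeping the desired replica count, and automatically starts Pods on other Nodes when a Node fails. This provides high availability at the application layer.

For the Kubernetes cluster itself, high availability also has to cover two further layers: the etcd database and the Kubernetes Master components. A kubeadm cluster starts only one etcd instance, which is a single point of failure. In this walkthrough etcd is not built as a separate external cluster; it gains a second member when the additional control plane node joins (stacked etcd), as the join output later in this article shows.

The Master node acts as the control center and keeps the whole cluster healthy by continuously communicating with the Kubelet and kube-proxy on the worker nodes. If the Master node fails, no cluster management is possible through kubectl or the API.

The Master node runs three main services: kube-apiserver, kube-controller-manager, and kube-scheduler. kube-controller-manager and kube-scheduler already achieve high availability through their own leader election, so Master high availability is mainly about kube-apiserver. That component serves an HTTP API, so making it highly available is similar to a web server: put a load balancer in front of it, and it can be scaled horizontally.

kube-apiserver high-availability architecture:

![img](http://xieys.club/images/posts/2174174-20210613182506683-1713683351.png)

- Nginx is a mainstream web server and reverse proxy; here it load-balances the apiserver at layer 4.
- Keepalived is mainstream high-availability software that provides active/standby failover by binding a VIP. In the topology above, Keepalived decides whether to fail over (float the VIP) based on the state of Nginx. For example, when the primary Nginx node goes down, the VIP is automatically bound to the standby Nginx node, so the VIP stays available and Nginx itself is highly available.

```
# To save machines, only one nginx host is used here
[root@nginx ~]# yum -y install gcc pcre pcre-devel zlib-devel make openssl-devel wget
[root@nginx ~]# wget http://nginx.org/download/nginx-1.20.1.tar.gz
[root@nginx ~]# tar -xvf nginx-1.20.1.tar.gz
# The group is named nginx so that "user nginx;" in nginx.conf can resolve its default group
[root@nginx ~]# groupadd nginx
[root@nginx ~]# useradd -g nginx -s /sbin/nologin nginx
[root@nginx ~]# cd nginx-1.20.1
[root@nginx nginx-1.20.1]# ./configure --prefix=/usr/local/nginx --user=nginx --group=nginx --with-http_stub_status_module --with-http_ssl_module --with-http_realip_module --with-http_sub_module --with-http_flv_module --with-http_mp4_module --with-http_random_index_module --with-stream
[root@nginx nginx-1.20.1]# make && make install
# Configuration
[root@nginx ~]# egrep -v '^#|^$' /usr/local/nginx/conf/nginx.conf
user nginx;
worker_processes auto;
error_log /usr/local/nginx/logs/error.log;
pid /run/nginx.pid;
include /usr/share/nginx/modules/*.conf;
events {
    worker_connections 1024;
}
stream {
    log_format  main  '$remote_addr $upstream_addr - [$time_local] $status $upstream_bytes_sent';
    access_log  /usr/local/nginx/logs/k8s-access.log  main;
    upstream k8s-apiserver {
        server 192.168.116.20:6443;   # master APISERVER IP:PORT
        server 192.168.116.21:6443;   # master1 APISERVER IP:PORT
    }
    server {
        listen 6443;
        proxy_pass k8s-apiserver;
    }
}
http {
    log_format  main  '$remote_addr - $remote_user [$time_local] "$request" '
                      '$status $body_bytes_sent "$http_referer" '
                      '"$http_user_agent" "$http_x_forwarded_for"';
    access_log  /usr/local/nginx/logs/access.log  main;
    sendfile            on;
    tcp_nopush          on;
    tcp_nodelay         on;
    keepalive_timeout   65;
    types_hash_max_size 4096;
    include             /usr/local/nginx/conf/mime.types;
    default_type        application/octet-stream;
    # Load modular configuration files from the /etc/nginx/conf.d directory.
    # See http://nginx.org/en/docs/ngx_core_module.html#include
    # for more information.
    include /etc/nginx/conf.d/*.conf;
    server {
        listen       80;
        listen       [::]:80;
        server_name  _;
        root         /usr/share/nginx/html;
        # Load configuration files for the default server block.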
        include /etc/nginx/default.d/*.conf;
        error_page 404 /404.html;
        location = /404.html {
        }
        error_page 500 502 503 504 /50x.html;
        location = /50x.html {
        }
    }
}
# Start
[root@nginx nginx-1.20.1]# /usr/local/nginx/sbin/nginx
```

### Update the configuration

Once nginx is running, the load-balanced apiserver address is `https://192.168.116.25:6443`. Next, replace the apiserver address in each kubeconfig file with the load balancer address, and restart the component that reads the file.

> kubelet.conf

```
[root@master ~]# cat /etc/kubernetes/kubelet.conf
...
    server: https://192.168.116.25:6443
  name: kubernetes
...
[root@master ~]# systemctl restart kubelet
```

> controller-manager.conf

```
[root@master ~]# cat /etc/kubernetes/controller-manager.conf
...
    server: https://192.168.116.25:6443
  name: kubernetes
...
[root@master ~]# docker restart `docker ps | grep kube-controller-manager | grep -v pause | awk '{print $1}'`
```

> scheduler.conf

```
[root@master ~]# cat /etc/kubernetes/scheduler.conf
...
    server: https://192.168.116.25:6443
  name: kubernetes
...
[root@master ~]# docker restart `docker ps | grep kube-scheduler | grep -v pause | awk '{print $1}'`
```

> Update kube-proxy

kube-proxy reads this kubeconfig only at startup, so existing kube-proxy Pods need to be recreated to pick up the new address.

```
[root@master ~]# kubectl edit cm kube-proxy -n kube-system
...
  kubeconfig.conf: |-
    apiVersion: v1
    kind: Config
    clusters:
    - cluster:
        certificate-authority: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
        server: https://192.168.116.25:6443
      name: default
    contexts:
    - context:
        cluster: default
        namespace: default
        user: default
      name: default
...
```

The `~/.kube/config` file that kubectl uses to access the cluster needs the same change.

```
[root@master ~]# cat .kube/config
...
    server: https://192.168.116.25:6443
  name: kubernetes
...
```

### Update the control plane

Now that a load balancer sits in front of the control plane, the kubeadm-config ConfigMap has to be updated with the correct information. (A control plane node will be added to the cluster shortly, so it matters that this ConfigMap is correct.)

First, edit the ConfigMap with the following command and add the `controlPlaneEndpoint` setting:

```
[root@master ~]# kubectl -n kube-system edit configmap kubeadm-config
apiVersion: v1
data:
  ClusterConfiguration: |
    apiServer:
      certSANs:
      - api.k8s.local
      - master
      - master1
      - 192.168.116.20
      - 192.168.116.21
      - 192.168.116.25
      extraArgs:
        authorization-mode: Node,RBAC
      timeoutForControlPlane: 4m0s
    apiVersion: kubeadm.k8s.io/v1beta2
    certificatesDir: /etc/kubernetes/pki
    clusterName: kubernetes
    controlPlaneEndpoint: 192.168.116.25:6443   # setting to add
    controllerManager: {}
    dns:
      type: CoreDNS
    etcd:
      local:
        dataDir: /var/lib/etcd
    imageRepository: registry.aliyuncs.com/google_containers
    kind: ClusterConfiguration
    kubernetesVersion: v1.16.9
    networking:
      dnsDomain: cluster.local
      podSubnet: 10.244.0.0/16
      serviceSubnet: 10.96.0.0/12
    scheduler: {}
  ClusterStatus: |
    apiEndpoints:
      master:
        advertiseAddress: 192.168.116.20
        bindPort: 6443
    apiVersion: kubeadm.k8s.io/v1beta2
    kind: ClusterStatus
kind: ConfigMap
metadata:
  creationTimestamp: "2020-09-29T15:35:35Z"
  name: kubeadm-config
  namespace: kube-system
  resourceVersion: "148258"
  selfLink: /api/v1/namespaces/kube-system/configmaps/kubeadm-config
  uid: 7c04d48f-af5b-4dab-ad74-eb2c06c2510b
```

Next, update the `cluster-info` ConfigMap in the `kube-public` namespace. It contains a kubeconfig file whose `server:` line points at the single control plane node. Use `kubectl -n kube-public edit cm cluster-info` and change that `server:` line to point at the control plane's load balancer.

```
[root@master ~]# kubectl edit cm cluster-info -n kube-public
# Please edit the object below. Lines beginning with a '#' will be ignored,
# and an empty file will abort the edit. If an error occurs while saving this file will be
# reopened with the relevant failures.
#
apiVersion: v1
data:
  kubeconfig: |
    apiVersion: v1
    clusters:
    - cluster:
        certificate-authority-data: LS0tLS1CRUdJTiBDRVJUSUZJQ0FURS0tLS0tCk1JSUN5RENDQWJDZ0F3SUJBZ0lCQURBTkJna3Foa2lHOXcwQkFRc0ZBREFWTVJNd0VRWURWUVFERXdwcmRXSmwKY201bGRHVnpNQjRYRFRJd01Ea3lPVEUxTXpVeE5Gb1hEVE13TURreU56RTFNelV4TkZvd0ZURVRNQkVHQTFVRQpBeE1LYTNWaVpYSnVaWFJsY3pDQ0FTSXdEUVlKS29aSWh2Y05BUUVCQlFBRGdnRVBBRENDQVFvQ2dnRUJBS0ZyCjVJWmV1ZVVyZlJxUmhXQzU0Q1o1ZTczUDl2MElXQ1YvWHd2c0FhWXo4SmtVSFNjbHZoNWZCMGRFbkhFNzR6ZXQKcWYyamNmeWNWdHlyZlJCQ1lsczlWMUkydzMvcHRvbExlUUVwYm9xbFZ3akQvYUVhZk9vdStrZ3loZjc0QVZ6Mwpic2ttQ09rc1h3TDFYaGVpUzh0VU1jSEZ0OURqU1VsOVN6eXVKbWhRajBHU29GbGg3NVAvSEo1VTVXdlJBenBYCk5sQ24rYzhxMkduK3BJWG9SK1hOQWF0TFRyZzBSYXBJeDE4US81cUJtQk4zaXJ6ZnN3WUd0cFBWTTZsMmtGSGgKeFpsS0NJRjJxaHpwenRLN1BsL3htb2dZaWR4bTVTS0dSWjJia3RFRUsreitYekd5cGJjc3QyZUl5S3hDMTBBWQpRdmpGaENjd2pMdXFtdzVqWTVzQ0F3RUFBYU1qTUNFd0RnWURWUjBQQVFIL0JBUURBZ0trTUE4R0ExVWRFd0VCCi93UUZNQU1CQWY4d0RRWUpLb1pJaHZjTkFRRUxCUUFEZ2dFQkFIbSt5d2hrWXc1RlhlRFlhVWoySDhqY1kvNlMKeWhmejZuRUw4bENjV2w3TjM2ck1vc3ZWTTBocCs3Wnp6K2RkS1R4a1hjclZ1OURsVE9uME1mQ2ZiZi9ZOVNBbwpsNlhOQUxTcFBmSE5XdkVRQTFEbUR4MVVWdVgrZXA0OTRiNEszVkVpQ0pJN3pBZDVpdHRVc0dWaWZ4ZzBxT2dPCjQvYTVpTHViTlR3a1BsN01adU4yYW1QYXBUNEtkb25wVWNYTTk2eVRUNEF2bkRUM1ZJRGI3eHA3Yld2UmRIcUkKUmdwYlBldGY3TmdobWUvK3hKbzkzU2tmUE5seVNFNmhxQkJFMGNwVGlJMzdlT1ZzQzBDZmRQZ00rMjhidW56WAptZkV5U2JHVHhNMFRGNitCSmxOMCt6YjZhYjgvVTIyWXlFWjlLcXhjcWFKR1g0V3NnY1hFaVR0dEY5cz0KLS0tLS1FTkQgQ0VSVElGSUNBVEUtLS0tLQo=
        server: https://192.168.116.25:6443
      name: ""
    contexts: []
    current-context: ""
    kind: Config
    preferences: {}
    users: []
kind: ConfigMap
metadata:
  creationTimestamp: "2020-09-29T15:35:36Z"
  name: cluster-info
  namespace: kube-public
  resourceVersion: "124903"
  selfLink: /api/v1/namespaces/kube-public/configmaps/cluster-info
  uid: f5b74218-84d7-4de1-ba0e-1ef3cabbe65d
```

After the update, cluster-info shows the load balancer address.

### Add a control plane node

Next we add the additional control plane node. First, upload the cluster certificates to the cluster with the following command so that other control plane nodes can use them:

```
[root@master ~]# kubeadm init phase upload-certs --upload-certs
W0824 16:44:59.056367 32171 version.go:101] could not fetch a Kubernetes version from the internet: unable to get URL "https://dl.k8s.io/release/stable-1.txt": Get https://dl.k8s.io/release/stable-1.txt: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)
W0824 16:44:59.103275 32171 version.go:102] falling back to the local client version: v1.16.9
[upload-certs] Storing the certificates in Secret "kubeadm-certs" in the "kube-system" Namespace
[upload-certs] Using certificate key:
450964c0a914353ca1c9e5d0bc2ac27395db2f39abf2381d1bd1d80fb4b362f4
```

The command above generates a new certificate key, but it is only valid for 2 hours. Because the existing cluster has been running for a while, the original bootstrap Token has also expired (a Token lives for 24 hours by default), so we need to create a new Token for joining the new control plane node:

```
[root@master ~]# kubeadm token create --print-join-command --config kubeadm.yaml
kubeadm join 192.168.116.25:6443 --token r69sza.bszdok47uwbidske --discovery-token-ca-cert-hash sha256:8fee3d9ed90a4496bafeadfbcea7b33f4b1af0c019a0d053cb64f10b7976e3f3
```

Run the printed join command on master1, the node to be added, appending `--control-plane` and the `--certificate-key` value from the upload-certs step:

```
[root@master1 ~]# kubeadm join 192.168.116.25:6443 --token r69sza.bszdok47uwbidske --discovery-token-ca-cert-hash sha256:8fee3d9ed90a4496bafeadfbcea7b33f4b1af0c019a0d053cb64f10b7976e3f3 --control-plane --certificate-key 450964c0a914353ca1c9e5d0bc2ac27395db2f39abf2381d1bd1d80fb4b362f4
[preflight] Running pre-flight checks
        [WARNING SystemVerification]: this Docker version is not on the list of validated versions: 19.03.13. Latest validated version: 18.09
        [WARNING Hostname]: hostname "master1" could not be reached
        [WARNING Hostname]: hostname "master1": lookup master1 on 114.114.114.114:53: no such host
[preflight] Reading configuration from the cluster...
[preflight] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -oyaml'
[preflight] Running pre-flight checks before initializing the new control plane instance
[preflight] Pulling images required for setting up a Kubernetes cluster
[preflight] This might take a minute or two, depending on the speed of your internet connection
[preflight] You can also perform this action in beforehand using 'kubeadm config images pull'
[download-certs] Downloading the certificates in Secret "kubeadm-certs" in the "kube-system" Namespace
[certs] Using certificateDir folder "/etc/kubernetes/pki"
[certs] Generating "apiserver" certificate and key
[certs] apiserver serving cert is signed for DNS names [master1 kubernetes kubernetes.default kubernetes.default.svc kubernetes.default.svc.cluster.local api.k8s.local master master1] and IPs [10.96.0.1 192.168.116.21 192.168.116.25 192.168.116.20 192.168.116.21 192.168.116.25]
[certs] Generating "apiserver-kubelet-client" certificate and key
[certs] Generating "front-proxy-client" certificate and key
[certs] Generating "etcd/server" certificate and key
[certs] etcd/server serving cert is signed for DNS names [master1 localhost] and IPs [192.168.116.21 127.0.0.1 ::1]
[certs] Generating "etcd/peer" certificate and key
[certs] etcd/peer serving cert is signed for DNS names [master1 localhost] and IPs [192.168.116.21 127.0.0.1 ::1]
[certs] Generating "etcd/healthcheck-client" certificate and key
[certs] Generating "apiserver-etcd-client" certificate and key
[certs] Valid certificates and keys now exist in "/etc/kubernetes/pki"
[certs] Using the existing "sa" key
[kubeconfig] Generating kubeconfig files
[kubeconfig] Using kubeconfig folder "/etc/kubernetes"
[kubeconfig] Writing "admin.conf" kubeconfig file
[kubeconfig] Writing "controller-manager.conf" kubeconfig file
[kubeconfig] Writing "scheduler.conf" kubeconfig file
[control-plane] Using manifest folder "/etc/kubernetes/manifests"
[control-plane] Creating static Pod manifest for "kube-apiserver"
[control-plane] Creating static Pod manifest for "kube-controller-manager"
[control-plane] Creating static Pod manifest for "kube-scheduler"
[check-etcd] Checking that the etcd cluster is healthy
[kubelet-start] Downloading configuration for the kubelet from the "kubelet-config-1.16" ConfigMap in the kube-system namespace
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet-start] Activating the kubelet service
[kubelet-start] Waiting for the kubelet to perform the TLS Bootstrap...
[etcd] Announced new etcd member joining to the existing etcd cluster
[etcd] Creating static Pod manifest for "etcd"
[etcd] Waiting for the new etcd member to join the cluster. This can take up to 40s
{"level":"warn","ts":"2021-08-24T16:52:50.553+0800","caller":"clientv3/retry_interceptor.go:61","msg":"retrying of unary invoker failed","target":"passthrough:///https://192.168.116.21:2379","attempt":0,"error":"rpc error: code = DeadlineExceeded desc = context deadline exceeded"}
[upload-config] Storing the configuration used in ConfigMap "kubeadm-config" in the "kube-system" Namespace
[mark-control-plane] Marking the node master1 as control-plane by adding the label "node-role.kubernetes.io/master=''"
[mark-control-plane] Marking the node master1 as control-plane by adding the taints [node-role.kubernetes.io/master:NoSchedule]

This node has joined the cluster and a new control plane instance was created:

* Certificate signing request was sent to apiserver and approval was received.
* The Kubelet was informed of the new secure connection details.
* Control plane (master) label and taint were applied to the new node.
* The Kubernetes control plane instances scaled up.
* A new etcd member was added to the local/stacked etcd cluster.

To start administering your cluster from this node, you need to run the following as a regular user:

        mkdir -p $HOME/.kube
        sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
        sudo chown $(id -u):$(id -g) $HOME/.kube/config

Run 'kubectl get nodes' to see this node join the cluster.
```

### Check etcd

```
[root@master1 ~]# cat /etc/kubernetes/manifests/etcd.yaml
apiVersion: v1
...
    - --initial-cluster=master1=https://192.168.116.21:2380,master=https://192.168.116.20:2380
    - --initial-cluster-state=existing
...
```

### Check that the cluster is healthy

```
[root@master ~]# kubectl get nodes
NAME      STATUS   ROLES    AGE     VERSION
master    Ready    master   328d    v1.16.9
master1   Ready    master   8m10s   v1.16.9
node1     Ready    <none>   328d    v1.16.9
[root@master ~]# kubectl get cs
NAME                 AGE
scheduler
controller-manager
etcd-0
# Cause: this is a version-related kubectl bug; Kubernetes also intends to retire the get cs command
# Workaround: it currently does not affect the running cluster; add -o yaml to see the status
```
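
Because `kubectl get cs` is not very informative on this version, it can help to probe each apiserver directly as well as through the load balancer. The following is a minimal sketch rather than part of the original procedure. It assumes `curl` is installed and that the apiserver's `/healthz` endpoint answers unauthenticated requests (if anonymous access is restricted in your cluster, the probe simply reports FAIL). The function names `probe` and `check_endpoints` are made up for this example; the addresses come from the host table at the top of this article.

```shell
#!/usr/bin/env bash
# Sketch: probe /healthz on each control plane endpoint and on the load balancer.
# The default endpoint list is taken from this article's host table; adjust as needed.
ENDPOINTS="${ENDPOINTS:-192.168.116.20:6443 192.168.116.21:6443 192.168.116.25:6443}"

# probe prints the body returned by https://<endpoint>/healthz (nothing on failure).
# -k skips certificate verification: this only checks reachability, not the SAN list.
probe() {
  curl -ks --max-time 5 "https://$1/healthz" || true
}

# check_endpoints prints one OK/FAIL line per endpoint and returns non-zero
# if any endpoint did not answer with the body "ok".
check_endpoints() {
  local rc=0 ep body
  for ep in "$@"; do
    body="$(probe "$ep")"
    if [ "$body" = "ok" ]; then
      echo "OK   $ep"
    else
      echo "FAIL $ep"
      rc=1
    fi
  done
  return "$rc"
}

# Usage (not executed automatically):
#   check_endpoints $ENDPOINTS
```

Running `check_endpoints $ENDPOINTS` from any machine that can reach the three addresses prints one line per endpoint. A FAIL for 192.168.116.25:6443 while both masters report OK would point at the nginx layer rather than at an apiserver.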