Building a Test Cluster with Kind

Preface

Official documentation: https://kind.sigs.k8s.io/

GitHub repository: https://github.com/kubernetes-sigs/kind

Mirror for users in China: https://gitcode.com/gh_mirrors/ki/kind/overview

kind is a tool for running local Kubernetes clusters using Docker container "nodes". It was primarily designed for testing Kubernetes itself, but it is also useful for local development and CI.

Kind stands for Kubernetes IN Docker, and it is exactly what it sounds like: Kubernetes running inside Docker. kind builds a cluster by starting Docker containers from prepared node images; each container simulates a Kubernetes node, and together they form a complete cluster. Note that clusters created with kind are only suitable for development, learning, and testing, never for production.

Setting Up a Kubernetes Environment with Kind

The release history is available on GitHub: https://github.com/kubernetes-sigs/kind/releases

Each release lists the node images that match it.

Prerequisites

Mirror acceleration: first pull the image through the mirror, retag it, then remove the mirror tag:

sudo docker pull m.daocloud.io/docker.io/kindest/node:v1.34.0
sudo docker tag m.daocloud.io/docker.io/kindest/node:v1.34.0 docker.io/kindest/node:v1.34.0
sudo docker rmi m.daocloud.io/docker.io/kindest/node:v1.34.0
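
The same pull/tag/untag sequence repeats for every image that goes through the mirror; a small helper sketch, assuming the m.daocloud.io prefix used above, avoids retyping it:

# usage: mirror_pull kindest/node:v1.34.0
mirror_pull() {
  local image="$1"
  local mirror="m.daocloud.io/docker.io/${image}"
  sudo docker pull "${mirror}" &&
    sudo docker tag "${mirror}" "docker.io/${image}" &&
    sudo docker rmi "${mirror}"
}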

Check the result:

$ sudo docker image ls
REPOSITORY     TAG       IMAGE ID       CREATED       SIZE
kindest/node   v1.34.0   4357c93ef232   2 weeks ago   985MB

Single-Node kind Cluster

sudo kind create cluster --name huari-test --image kindest/node:v1.34.0 --retain; sudo kind export logs --name huari-test

Parameter explanation:

  • --image selects the node image, and thus the Kubernetes version
  • --name sets the cluster name
  • --retain keeps the node containers around even if creation fails, so they can be inspected
  • kind export logs dumps the cluster's logs to a local directory for troubleshooting

List the clusters created by kind:

sudo kind get clusters
huari-test

Switch the kubectl context as prompted; this is handy when multiple clusters are involved:

sudo kubectl cluster-info --context kind-huari-test
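
When several clusters are involved, you can also list every context and change the default one; a quick sketch using the standard kubectl config subcommands (kind names this cluster's context kind-huari-test):

sudo kubectl config get-contexts
sudo kubectl config use-context kind-huari-test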

Inspect the cluster:

# list the cluster nodes
sudo kubectl get nodes
# list all pods in the cluster
sudo kubectl get pods -A -owide

Delete the cluster:

sudo kind delete cluster --name huari-test

Building a Kind Cluster with One Control Plane and Two Workers

Create a file named huari.yaml with the following content:

kind: Cluster
# three nodes in total: one control-plane node and two worker nodes
apiVersion: kind.x-k8s.io/v1alpha4
nodes:
- role: control-plane # control-plane node
- role: worker # worker node
- role: worker # worker node

Create the cluster:

sudo kind create cluster --config=huari.yaml --name huari-test --image kindest/node:v1.34.0 --retain; sudo kind export logs --name huari-test

Switch the kubectl context:

sudo kubectl cluster-info --context kind-huari-test

Inspect the cluster:

# list the cluster nodes
sudo kubectl get nodes
# list all pods in the cluster
sudo kubectl get pods -A -owide

Delete the cluster:

sudo kind delete cluster --name huari-test

Building a Three-Control-Plane, Three-Worker HA kind Cluster

Create a huari.yaml file with six nodes in total, three control-plane nodes and three worker nodes:

kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
nodes:
- role: control-plane
- role: control-plane
- role: control-plane
- role: worker
- role: worker
- role: worker

Create the HA cluster:

sudo kind create cluster --config=huari.yaml --name huari-test --image kindest/node:v1.34.0 --retain; sudo kind export logs --name huari-test

An HA cluster also needs the haproxy image for its load balancer, which has to be handled manually the same way:

sudo docker pull m.daocloud.io/docker.io/kindest/haproxy:v20230606-42a2262b
sudo docker tag m.daocloud.io/docker.io/kindest/haproxy:v20230606-42a2262b docker.io/kindest/haproxy:v20230606-42a2262b
sudo docker rmi m.daocloud.io/docker.io/kindest/haproxy:v20230606-42a2262b

Switch the kubectl context:

sudo kubectl cluster-info --context kind-huari-test

Inspect the cluster:

# list the cluster nodes
sudo kubectl get nodes
# list all pods in the cluster
sudo kubectl get pods -A -owide

Delete the cluster:

sudo kind delete cluster --name huari-test

Troubleshooting

Problem Description

Creating the three-control-plane, three-worker HA kind cluster failed with an error.

Configuration:

kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
nodes:
- role: control-plane
- role: control-plane
- role: control-plane
- role: worker
- role: worker
- role: worker

Error:

I0916 14:07:31.164509 282 etcd.go:593] [etcd] Promoting the learner a4010febcb7ad87f failed: etcdserver: can only promote a learner member which is in sync with leader
{"level":"warn","ts":"2025-09-16T14:07:31.664361Z","logger":"etcd-client","caller":"v3@v3.6.4/retry_interceptor.go:65","msg":"retrying of unary invoker failed","target":"etcd-endpoints://0xc0007a43c0/172.18.0.7:2379","method":"/etcdserverpb.Cluster/MemberPromote","attempt":0,"error":"rpc error: code = FailedPrecondition desc = etcdserver: can only promote a learner member which is in sync with leader"}
I0916 14:07:31.664483 282 etcd.go:593] [etcd] Promoting the learner a4010febcb7ad87f failed: etcdserver: can only promote a learner member which is in sync with leader
error: error execution phase etcd-join: error creating local etcd static pod manifest file: etcdserver: can only promote a learner member which is in sync with leader

I had hit this problem on Ubuntu before; dropping to two control-plane nodes worked around it there, while on macOS the same three-control-plane configuration started fine.

With two control-plane nodes the cluster does start, but the node list is wrong:

$ sudo kubectl get nodes
NAME                        STATUS   ROLES           AGE     VERSION
huari-test-control-plane    Ready    control-plane   5m54s   v1.34.0
huari-test-control-plane2   Ready    control-plane   5m22s   v1.34.0
huari-test-worker2          Ready    <none>          5m21s   v1.34.0

The worker nodes huari-test-worker and huari-test-worker3 are not shown, although their containers exist:

$ sudo docker ps | grep huari-test
1fcd57452e8c kindest/node:v1.34.0 "/usr/local/bin/entr…" 4 minutes ago Up 4 minutes huari-test-worker2
f8e8f598552c kindest/node:v1.34.0 "/usr/local/bin/entr…" 4 minutes ago Up 4 minutes huari-test-worker3
a6e6a569f86c kindest/haproxy:v20230606-42a2262b "haproxy -W -db -f /…" 4 minutes ago Up 4 minutes 127.0.0.1:33377->6443/tcp huari-test-external-load-balancer
80b12fa4d035 kindest/node:v1.34.0 "/usr/local/bin/entr…" 4 minutes ago Up 4 minutes 127.0.0.1:43807->6443/tcp huari-test-control-plane
c71b91c9a29b kindest/node:v1.34.0 "/usr/local/bin/entr…" 4 minutes ago Up 4 minutes huari-test-worker
8bcd6dfd6c8a kindest/node:v1.34.0 "/usr/local/bin/entr…" 4 minutes ago Up 4 minutes 127.0.0.1:33187->6443/tcp huari-test-control-plane2

Error log inside the container:

$ sudo docker exec -it huari-test-worker bash
root@huari-test-worker:/# systemctl status kubelet
Sep 16 14:15:56 huari-test-worker kubelet[8366]: E0916 14:15:56.561227 8366 manager.go:294] Registration of the raw container factory failed: inotify_init: too many open files
Sep 16 14:15:56 huari-test-worker kubelet[8366]: E0916 14:15:56.561242 8366 kubelet.go:1686] "Failed to start cAdvisor" err="inotify_init: too many open files"
Sep 16 14:15:56 huari-test-worker systemd[1]: kubelet.service: Main process exited, code=exited, status=1/FAILURE
Sep 16 14:15:56 huari-test-worker systemd[1]: kubelet.service: Failed with result 'exit-code'.

Temporary Fix

sudo sysctl fs.inotify.max_user_instances=8192
sudo sysctl fs.inotify.max_user_watches=524288
sudo docker restart huari-test-worker huari-test-worker3

Permanent Fix

echo fs.inotify.max_user_instances=8192 | sudo tee -a /etc/sysctl.conf
echo fs.inotify.max_user_watches=524288 | sudo tee -a /etc/sysctl.conf
sudo sysctl -p

Verification

Everything is back to normal:

$ sudo kubectl get nodes
NAME                        STATUS   ROLES           AGE     VERSION
huari-test-control-plane    Ready    control-plane   14m     v1.34.0
huari-test-control-plane2   Ready    control-plane   13m     v1.34.0
huari-test-worker           Ready    <none>          6m55s   v1.34.0
huari-test-worker2          Ready    <none>          13m     v1.34.0
huari-test-worker3          Ready    <none>          6m55s   v1.34.0

Recreating the three-control-plane, three-worker kind cluster now also starts normally:

$ sudo kubectl get nodes
NAME                         STATUS   ROLES           AGE   VERSION
huari-test-control-plane     Ready    control-plane   81s   v1.34.0
huari-test-control-plane2    Ready    control-plane   56s   v1.34.0
huari-test-control-plane3    Ready    control-plane   33s   v1.34.0
huari-test-worker            Ready    <none>          32s   v1.34.0
huari-test-worker2           Ready    <none>          32s   v1.34.0
huari-test-worker3           Ready    <none>          32s   v1.34.0

Parameter explanation:

  • fs.inotify.max_user_instances: how many inotify instances one user may create (kubelet, Docker, and VS Code each count as one, for example)
  • fs.inotify.max_user_watches: how many files/directories one user may watch in total (each instance can watch many files)

The root cause of the problem above: kubelet has to watch the files under

  • /var/lib/kubelet/pods/
  • /etc/kubernetes/manifests/
  • /var/lib/containerd/

Once the number of watched files exceeds the limit, it fails with inotify_init: too many open files, crash-loops, and the node never registers with the cluster.
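
Before raising the limits, it can help to see the current values and how many inotify instances are already in use; a diagnostic sketch (the /proc scan counts inotify file descriptors across all processes, so it needs root to see everything):

# current limits
sudo sysctl fs.inotify.max_user_instances fs.inotify.max_user_watches
# inotify instances currently open, system-wide
sudo find /proc/*/fd -lname 'anon_inode:inotify' 2>/dev/null | wc -l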

Exporting the kind Cluster kubeconfig

Export command:

sudo kind export kubeconfig --name=huari-test --kubeconfig=$HOME/.kube/config
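
A quick way to confirm the exported kubeconfig works (kind-huari-test is the context name kind writes into the file):

kubectl --kubeconfig "$HOME/.kube/config" config current-context
kubectl --kubeconfig "$HOME/.kube/config" get nodes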

Advanced Usage

We have now built clusters of several shapes, but how do we actually access them?

With plain Docker, reaching a service deployed inside a container usually means port mapping: traffic arriving at a chosen host port is forwarded to a chosen container port. Since kind nodes are themselves Docker containers, can port mapping give us access to the cluster in the same way?

And when we deploy workloads, how do we pull images? This is containers inside containers: images live in the first-layer containers (the nodes), so how do we get an image in there?

Port Mapping

Consider a scenario: an Nginx service runs inside the kind cluster, listening on port 80. Can another machine reach port 80 of the host running the kind cluster and hit that Nginx service?

Not by default. The kind cluster runs inside Docker, so port 80 of the Nginx container inside the kind cluster and port 80 of the host are in different network namespaces.

Port mappings, configured as follows, solve this class of problem.

Add an extraPortMappings entry to the config file:

kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
nodes:
- role: control-plane
  extraPortMappings:
  - containerPort: 80
    hostPort: 80
    listenAddress: "0.0.0.0"
    protocol: tcp
- role: control-plane
- role: control-plane
- role: worker
- role: worker
- role: worker

With this in place, a Pod in the resulting cluster that exposes port 80 via NodePort or hostNetwork can be reached through port 80 of the host.

Note that only the first control-plane node is given a mapping here. When you add mappings to other nodes, the hostPort values must not collide: each node's hostPort has to be unique on the host.

Create the HA cluster:

sudo kind create cluster --config=huari.yaml --name huari-test --image kindest/node:v1.34.0 --retain; sudo kind export logs --name huari-test

Switch the kubectl context:

sudo kubectl cluster-info --context kind-huari-test

Inspect the cluster:

# list the cluster nodes
sudo kubectl get nodes
# list all pods in the cluster
sudo kubectl get pods -A -owide

Delete the cluster:

sudo kind delete cluster --name huari-test

Exposing kube-apiserver

Sometimes we build a Kubernetes environment with kind on one machine and write code on another, only to find that the second machine cannot connect to the kind cluster's kube-apiserver to debug, say, an Operator.

That is because by default kube-apiserver listens on 127.0.0.1 and a random port. To reach it from outside, it has to listen on an externally visible interface such as eth0 instead of lo (which stands for 127.0.0.1, i.e. localhost).

Again the config file does the job: add a networking.apiServerAddress entry set to a local interface IP (adjust for your environment):

kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
networking:
  apiServerAddress: "10.10.151.201"
nodes:
- role: control-plane
  extraPortMappings:
  - containerPort: 6443
    hostPort: 6443
    listenAddress: "10.10.151.201"
    protocol: tcp
- role: control-plane
- role: control-plane
- role: worker
- role: worker
- role: worker

Create the HA cluster:

sudo kind create cluster --config=huari.yaml --name huari-test --image kindest/node:v1.34.0 --retain; sudo kind export logs --name huari-test

Switch the kubectl context:

sudo kubectl cluster-info --context kind-huari-test

Inspect the cluster:

# list the cluster nodes
sudo kubectl get nodes
# list all pods in the cluster
sudo kubectl get pods -A -owide
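
Before tearing it down, a quick reachability check from the other machine (assuming 10.10.151.201 is routable from there; /version is readable without credentials under default RBAC, and -k skips verification of the self-signed certificate):

curl -k https://10.10.151.201:6443/version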

Delete the cluster:

sudo kind delete cluster --name huari-test

Installing Ingress

For kind, the officially recommended route is the ingress-nginx manifest tailored for kind, which lets the controller occupy the node's ports 80/443 directly, so no cloud load balancer is needed; this suits local and bare-metal setups best.

Install ingress-nginx:

kubectl apply -f https://raw.githubusercontent.com/kubernetes/ingress-nginx/main/deploy/static/provider/kind/deploy.yaml

Wait for the controller to become ready:

kubectl -n ingress-nginx wait --for=condition=ready pod -l app.kubernetes.io/component=controller --timeout=90s

Confirm that the host is listening on 80/443:

$ sudo netstat -tulnp | grep -E ':80|:443'
tcp 0 0 0.0.0.0:80 0.0.0.0:* LISTEN 275756/docker-proxy

Importing Images

An environment built with kind essentially runs inside containers, so images on the host are not visible to it by default. Import them like this:

# for an image named my-testimage:v1
kind load docker-image my-testimage:v1 --name huari-test
# for an image shipped as a tar archive, my-testimage.tar
kind load image-archive my-testimage.tar --name huari-test
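
To confirm an import worked, list the images inside one of the node containers; kind node images ship with crictl (huari-test-control-plane is the node container name used by the clusters above):

sudo docker exec huari-test-control-plane crictl images | grep my-testimage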

Hands-On

Cluster Setup

kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
networking:
  apiServerAddress: "10.10.151.201"
nodes:
- role: control-plane
  extraPortMappings:
  - containerPort: 6443
    hostPort: 6443
    listenAddress: "10.10.151.201"
    protocol: tcp
- role: control-plane
- role: control-plane
- role: worker
  extraPortMappings:
  - containerPort: 80
    hostPort: 7080
    listenAddress: "0.0.0.0"
    protocol: tcp
  - containerPort: 443
    hostPort: 7443
    listenAddress: "0.0.0.0"
    protocol: tcp
- role: worker
  extraPortMappings:
  - containerPort: 80
    hostPort: 8080
    listenAddress: "0.0.0.0"
    protocol: tcp
  - containerPort: 443
    hostPort: 8443
    listenAddress: "0.0.0.0"
    protocol: tcp
- role: worker
  extraPortMappings:
  - containerPort: 80
    hostPort: 9080
    listenAddress: "0.0.0.0"
    protocol: tcp
  - containerPort: 443
    hostPort: 9443
    listenAddress: "0.0.0.0"
    protocol: tcp

Create the HA cluster:

sudo kind create cluster --config=huari.yaml --name huari-test --image kindest/node:v1.34.0 --retain; sudo kind export logs --name huari-test

Switch the kubectl context:

sudo kubectl cluster-info --context kind-huari-test

Inspect the cluster:

# list the cluster nodes
sudo kubectl get nodes
# list all pods in the cluster
sudo kubectl get pods -A -owide

Delete the cluster (when you are done):

sudo kind delete cluster --name huari-test

Ingress Installation

The version of ingress-nginx customized for kind needs to be installed.

Install ingress-nginx:

kubectl apply -f https://raw.githubusercontent.com/kubernetes/ingress-nginx/main/deploy/static/provider/kind/deploy.yaml

Wait for the controller to become ready:

kubectl -n ingress-nginx wait --for=condition=ready pod -l app.kubernetes.io/component=controller --timeout=90s

Confirm that the host is listening on 80/443:

$ sudo netstat -tulnp | grep -E ':80|:443'
tcp 0 0 0.0.0.0:80 0.0.0.0:* LISTEN 275756/docker-proxy

Test Goals

There are two main things to test: external access to the API server, and service access through the mapped ports.

Confirm the Listening Addresses

# check whether the host is listening on 10.10.151.201:6443 and on port 80
$ sudo netstat -tulnp | grep -E ':6443|:80'
tcp 0 0 0.0.0.0:80 0.0.0.0:* LISTEN 244210/docker-proxy
tcp 0 0 10.10.151.201:6443 0.0.0.0:* LISTEN 244220/docker-proxy

Test External Access to the API Server

$ curl -k https://10.10.151.201:6443/livez
ok

Test Port 80 (Deploy a Simple Service)

Check which node the controller landed on:

$ sudo kubectl -n ingress-nginx get pod -o wide
NAME                                        READY   STATUS    RESTARTS   AGE    IP           NODE                NOMINATED NODE   READINESS GATES
ingress-nginx-controller-6869595b64-f8tj7   1/1     Running   0          7m6s   10.244.5.3   huari-test-worker   <none>           <none>

The controller landed on the huari-test-worker node, so traffic to the host's ports 7080/7443 now reaches the nginx-ingress-controller.

Create a file named nginx-80.yaml with the configuration below. (To force the Pod onto the node that maps port 80, see the nodeSelector sketch after the deploy step; strictly speaking the backend can run on any node, because the ingress controller routes to the Service.)

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: nginx
  annotations:
    nginx.ingress.kubernetes.io/rewrite-target: /
spec:
  ingressClassName: nginx
  rules:
  - host: nginx.local # any domain you like
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: nginx
            port:
              number: 80
---
apiVersion: v1
kind: Service
metadata:
  name: nginx
spec:
  selector:
    app: nginx
  ports:
  - port: 80
    targetPort: 80
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx
spec:
  replicas: 1
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: m.daocloud.io/docker.io/nginx:alpine
        ports:
        - containerPort: 80

Deploy it:

$ sudo kubectl apply -f nginx-80.yaml
deployment.apps/nginx created
service/nginx created
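
If you do want the Pod pinned to huari-test-worker (the node whose port 80 maps to host port 7080), one sketch is to patch a nodeSelector into the Deployment; this is optional, since the ingress controller reaches the Service wherever the Pod runs:

sudo kubectl patch deployment nginx --type merge -p '{"spec":{"template":{"spec":{"nodeSelector":{"kubernetes.io/hostname":"huari-test-worker"}}}}}'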

Test access from outside:

curl -H "Host: nginx.local" http://10.10.151.201:7080

It should return:

<!DOCTYPE html>
<html>
<head>
<title>Welcome to nginx!</title>
<style>
html { color-scheme: light dark; }
body { width: 35em; margin: 0 auto;
font-family: Tahoma, Verdana, Arial, sans-serif; }
</style>
</head>
<body>
<h1>Welcome to nginx!</h1>
<p>If you see this page, the nginx web server is successfully installed and
working. Further configuration is required.</p>
<p>For online documentation and support please refer to
<a href="http://nginx.org/">nginx.org</a>.<br/>
Commercial support is available at
<a href="http://nginx.com/">nginx.com</a>.</p>
<p><em>Thank you for using nginx.</em></p>
</body>
</html>
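
Instead of passing the Host header by hand, you can map the test domain in /etc/hosts on the client machine (a convenience sketch; nginx.local is the hostname configured in the Ingress above):

echo "10.10.151.201 nginx.local" | sudo tee -a /etc/hosts
curl http://nginx.local:7080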