Contents

Preliminaries

1. Goals

2. Environment setup

3. System initialization

Disable the firewall

Disable SELinux

Disable swap

Hostnames

Configure hosts

Pass bridged IPv4 traffic to iptables chains

Time synchronization

Deploying the master node

Install Docker/kubeadm/kubelet on all nodes

Deploy the Kubernetes master

0. Collected init errors

1. Use the kubectl tool

2. Join the nodes

3. Install the Pod network plugin

Test the Kubernetes cluster


Preliminaries

1. Goals

  • Install Docker and kubeadm on all nodes
  • Deploy the Kubernetes master
  • Deploy a container network plugin
  • Deploy the Kubernetes nodes and join them to the cluster
  • Deploy the Dashboard web UI to inspect Kubernetes resources visually

2. Environment setup

The machines here are the same ones used earlier for the zk, kafka, hadoop and canal clusters, reused as-is. Their configuration is covered in these earlier posts:

Zookeeper - local installation and parameter configuration

Zookeeper - cluster setup + cluster start script

Kafka + Zookeeper + Hadoop cluster configuration

Hadoop 3.x - local installation + fully distributed installation + cluster configuration + xsync distribution script (fixing errors when starting the Hadoop cluster as the root user)

Canal + MySQL + Zookeeper + Kafka real-time data synchronization

Role         IP
k8s-master   192.168.150.102
k8s-node1    192.168.150.103
k8s-node2    192.168.150.104

3. System initialization

Disable the firewall

systemctl stop firewalld
systemctl disable firewalld

Disable SELinux

Temporarily:

setenforce 0

Permanently:

sed -i 's/enforcing/disabled/' /etc/selinux/config

Disable swap

Temporarily:

swapoff -a

Permanently:

sed -ri 's/.*swap.*/#&/' /etc/fstab
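The `&` in the replacement re-inserts the matched line, so the net effect is prefixing every fstab line that mentions swap with `#`. A quick demo of the substitution on a throwaway copy (not the real /etc/fstab):

```shell
# Write a sample fstab line to a temp file and apply the same sed:
printf '/dev/mapper/centos-swap swap swap defaults 0 0\n' > /tmp/fstab.demo
sed -ri 's/.*swap.*/#&/' /tmp/fstab.demo
cat /tmp/fstab.demo   # the swap line is now commented out
```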

Hostnames

Replace the old Hadoop hostnames here — run each command on its corresponding machine:

hostnamectl set-hostname k8s-master
hostnamectl set-hostname k8s-node1
hostnamectl set-hostname k8s-node2

Configure hosts

Configure this on k8s-master (the worker nodes need the same entries too, otherwise the join step later fails hostname lookup):

cat >> /etc/hosts << EOF
192.168.150.102 k8s-master
192.168.150.103 k8s-node1
192.168.150.104 k8s-node2
EOF

Pass bridged IPv4 traffic to iptables chains

Put plainly, this makes traffic crossing Linux bridges visible to iptables, so Kubernetes (via kube-proxy's rules) can see and control all the traffic flows:

cat > /etc/sysctl.d/k8s.conf << EOF
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
EOF

Apply the settings:

sysctl --system

Time synchronization

yum install -y ntpdate
ntpdate time.windows.com

Deploying the master node

Install Docker/kubeadm/kubelet on all nodes

Docker:

wget https://mirrors.aliyun.com/docker-ce/linux/centos/docker-ce.repo -O /etc/yum.repos.d/docker-ce.repo
yum -y install docker-ce-18.06.1.ce-3.el7
systemctl enable docker && systemctl start docker

Add the Kubernetes yum repo:

cat > /etc/yum.repos.d/kubernetes.repo << EOF
[kubernetes]
name=Kubernetes
baseurl=https://mirrors.aliyun.com/kubernetes/yum/repos/kubernetes-el7-x86_64
enabled=1
gpgcheck=0
repo_gpgcheck=0
gpgkey=https://mirrors.aliyun.com/kubernetes/yum/doc/yum-key.gpg https://mirrors.aliyun.com/kubernetes/yum/doc/rpm-package-key.gpg
EOF

Install kubeadm, kubelet and kubectl:

yum install -y kubelet-1.25.0 kubeadm-1.25.0 kubectl-1.25.0

Enable on boot:

systemctl enable kubelet

Deploy the Kubernetes master

Run on k8s-master (192.168.150.102):

kubeadm init \
--apiserver-advertise-address=192.168.150.102 \
--image-repository registry.aliyuncs.com/google_containers \
--kubernetes-version v1.25.0 \
--service-cidr=10.96.0.0/12 \
--pod-network-cidr=10.244.0.0/16

Note: as versions are updated, make sure --kubernetes-version here matches the kubelet/kubeadm/kubectl versions installed earlier.

Parameter notes:

--apiserver-advertise-address=192.168.150.102

The master host's IP address; my master is 192.168.150.102.

--image-repository=registry.aliyuncs.com/google_containers

The image registry. The default upstream registry is unreachable from here, so the Aliyun mirror registry.aliyuncs.com/google_containers is used.

--kubernetes-version=v1.25.0

The Kubernetes version to install (matching the packages installed above).

--service-cidr=10.96.0.0/12

The Service network range. Use 10.96.0.0/12 as-is, reuse it in later installs too, and don't change it.

--pod-network-cidr=10.244.0.0/16

The IP range for pod-to-pod networking inside the cluster. It must not overlap service-cidr; if unsure how to configure it, just use 10.244.0.0/16.
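As a sanity check that the two ranges really don't overlap, here is a small bash sketch using plain integer arithmetic (no external tools; the CIDR sizes are hard-coded for the two ranges above):

```shell
# Convert dotted-quad to an integer, then compare the range boundaries.
ip2int() { IFS=. read -r a b c d <<< "$1"; echo $(( (a<<24)|(b<<16)|(c<<8)|d )); }
svc_start=$(ip2int 10.96.0.0);  svc_end=$(( svc_start + (1 << (32-12)) - 1 ))   # 10.96.0.0/12
pod_start=$(ip2int 10.244.0.0); pod_end=$(( pod_start + (1 << (32-16)) - 1 ))  # 10.244.0.0/16
if [ "$pod_start" -gt "$svc_end" ] || [ "$pod_end" -lt "$svc_start" ]; then
  echo "no overlap"
else
  echo "overlap - pick different CIDRs"
fi
```

10.96.0.0/12 ends at 10.111.255.255, well below 10.244.0.0, so the check prints "no overlap".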

On success, the output ends with the kubeadm join command to run on the worker nodes.


0. Collected init errors

Unfortunately, an error has occurred:
        timed out waiting for the condition

This error is likely caused by:
        - The kubelet is not running
        - The kubelet is unhealthy due to a misconfiguration of the node in some way (required cgroups disabled)

If you are on a systemd-powered system, you can try to troubleshoot the error with the following commands:
        - 'systemctl status kubelet'
        - 'journalctl -xeu kubelet'

Additionally, a control plane component may have crashed or exited when started by the container runtime.
To troubleshoot, list all containers using your preferred container runtimes CLI.
Here is one example how you may list all running Kubernetes containers by using crictl:
        - 'crictl --runtime-endpoint unix:///var/run/containerd/containerd.sock ps -a | grep kube | grep -v pause'
        Once you have found the failing container, you can inspect its logs with:
        - 'crictl --runtime-endpoint unix:///var/run/containerd/containerd.sock logs CONTAINERID'
error execution phase wait-control-plane: couldn't initialize a Kubernetes cluster
To see the stack trace of this error execute with --v=5 or higher

Congratulations, initialization failed — let's get to work.

First confirm the kubelet is actually running and not stopped:

systemctl status kubelet.service

Check the logs:

journalctl -xeu kubelet

The output is long, so only part of it was captured; the gist here was that the node could not be fetched.

Next, make sure the installed versions are consistent:

yum list | grep kube

If the versions differ, remove the packages and reinstall:

yum remove -y kubeadm.x86_64 kubectl.x86_64 kubelet.x86_64

Reinstall, replacing -1.25.0 with the version you actually want:

yum install -y kubelet-1.25.0 kubeadm-1.25.0 kubectl-1.25.0

--------------------------------------------------------------------------------------------------------------------------


[preflight] Running pre-flight checks
error execution phase preflight: [preflight] Some fatal errors occurred:
        [ERROR CRI]: container runtime is not running: output: time="2022-09-27T11:34:39+08:00" level=fatal msg="unable to determine runtime API version: rpc error: code = Unavailable desc = connection error: desc = \"transport: Error while dialing dial unix /var/run/containerd/containerd.sock: connect: no such file or directory\""
, error: exit status 1
[preflight] If you know what you are doing, you can make a check non-fatal with `--ignore-preflight-errors=...`
To see the stack trace of this error execute with --v=5 or higher

If containerd is already installed: some containerd packages ship a config.toml that disables the CRI plugin, so removing it and restarting lets containerd come back up with CRI enabled:

rm -rf /etc/containerd/config.toml
systemctl restart containerd

If containerd is not installed yet:

It needs to be installed on all three machines:

yum install containerd  jq -y
containerd config default > /etc/containerd/config.toml
systemctl enable --now containerd
vim /etc/containerd/config.toml

Change the sandbox image line to the Aliyun mirror:

sandbox_image = "registry.aliyuncs.com/google_containers/pause:3.6"

Save, exit, and restart:

systemctl restart containerd.service
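If you'd rather script the edit than open vim, a sed one-liner does the same thing. The demo below runs against a temporary copy so it is safe to try anywhere — for the real change, point it at /etc/containerd/config.toml instead (the registry.k8s.io default shown here is an assumption; check what your generated config actually contains):

```shell
# Demo on a temp copy; substitute /etc/containerd/config.toml for real use.
printf 'sandbox_image = "registry.k8s.io/pause:3.6"\n' > /tmp/config.toml
sed -i 's|sandbox_image = ".*"|sandbox_image = "registry.aliyuncs.com/google_containers/pause:3.6"|' /tmp/config.toml
cat /tmp/config.toml
```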

--------------------------------------------------------------------------------------------------------------------------


[preflight] Running pre-flight checks
error execution phase preflight: [preflight] Some fatal errors occurred:
        [ERROR FileAvailable--etc-kubernetes-manifests-kube-apiserver.yaml]: /etc/kubernetes/manifests/kube-apiserver.yaml already exists
        [ERROR FileAvailable--etc-kubernetes-manifests-kube-controller-manager.yaml]: /etc/kubernetes/manifests/kube-controller-manager.yaml already exists
        [ERROR FileAvailable--etc-kubernetes-manifests-kube-scheduler.yaml]: /etc/kubernetes/manifests/kube-scheduler.yaml already exists
        [ERROR FileAvailable--etc-kubernetes-manifests-etcd.yaml]: /etc/kubernetes/manifests/etcd.yaml already exists
        [ERROR Port-10250]: Port 10250 is in use
[preflight] If you know what you are doing, you can make a check non-fatal with `--ignore-preflight-errors=...`
To see the stack trace of this error execute with --v=5 or higher

kubeadm reset

Then re-run the init.

--------------------------------------------------------------------------------------------------------------------------


1. Use the kubectl tool

mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
kubectl get nodes


2. Join the nodes

Copy the kubeadm join command printed at the end of kubeadm init and run it. Here I ran it directly on the master first — which fails, shown below to demonstrate the error.

error execution phase preflight: [preflight] Some fatal errors occurred:
        [ERROR FileAvailable--etc-kubernetes-kubelet.conf]: /etc/kubernetes/kubelet.conf already exists
        [ERROR Port-10250]: Port 10250 is in use
        [ERROR FileAvailable--etc-kubernetes-pki-ca.crt]: /etc/kubernetes/pki/ca.crt already exists
[preflight] If you know what you are doing, you can make a check non-fatal with `--ignore-preflight-errors=...`
To see the stack trace of this error execute with --v=5 or higher

The join operation must run on the worker nodes, not on the master. Copy the command to the other two machines and run it there.

The default token is valid for 24 hours; once it expires it can no longer be used and a new one must be created. There is a one-liner for that:

kubeadm token create --print-join-command
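For reference, the join command printed by kubeadm init (and by the command above) has this general shape — the token and hash below are placeholders, not real values:

```shell
kubeadm join 192.168.150.102:6443 --token <token> \
    --discovery-token-ca-cert-hash sha256:<hash>
```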

--------------------------------------------------------------------------------------------------------------------------

A possible error:

[preflight] Running pre-flight checks
error execution phase preflight: [preflight] Some fatal errors occurred:
        [ERROR CRI]: container runtime is not running: output: time="2022-09-27T15:34:07+08:00" level=fatal msg="unable to determine runtime API version: rpc error: code = Unavailable desc = connection error: desc = \"transport: Error while dialing dial unix /var/run/containerd/containerd.sock: connect: no such file or directory\""
, error: exit status 1
[preflight] If you know what you are doing, you can make a check non-fatal with `--ignore-preflight-errors=...`
To see the stack trace of this error execute with --v=5 or higher

The cause: containerd on the node has the same problem; apply the fix from the section above.

--------------------------------------------------------------------------------------------------------------------------

Another error:

[preflight] Running pre-flight checks
        [WARNING Hostname]: hostname "k8s-node1" could not be reached
        [WARNING Hostname]: hostname "k8s-node1": lookup k8s-node1 on 114.114.114.114:53: no such host
error execution phase preflight: [preflight] Some fatal errors occurred:
        [ERROR CRI]: container runtime is not running: output: time="2022-09-27T15:30:29+08:00" level=fatal msg="unable to determine runtime API version: rpc error: code = Unavailable desc = connection error: desc = \"transport: Error while dialing dial unix /var/run/containerd/containerd.sock: connect: no such file or directory\""
, error: exit status 1
[preflight] If you know what you are doing, you can make a check non-fatal with `--ignore-preflight-errors=...`
To see the stack trace of this error execute with --v=5 or higher

The node cannot resolve the cluster hostnames — add the same hosts entries used on the master to /etc/hosts on that machine:

vim /etc/hosts

--------------------------------------------------------------------------------------------------------------------------

With the join successful, go back to the master and take a look:

kubectl get nodes


3. Install the Pod network plugin

wget --no-check-certificate https://docs.projectcalico.org/manifests/calico.yaml

Next, tweak the manifest:

vim calico.yaml

Change the pod CIDR here (it should match the --pod-network-cidr used at init, 10.244.0.0/16) and keep the indentation aligned, or the apply will fail — see the error further down:
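For reference, the edited entry usually looks like the fragment below — CALICO_IPV4POOL_CIDR is the typical name of this setting in calico.yaml, but verify against your copy. The demo writes it to a throwaway file purely to show the expected alignment of name and value:

```shell
# "name" and "value" sit at the same indent level; the demo file is throwaway.
cat > /tmp/calico-snippet.yaml << 'EOF'
            - name: CALICO_IPV4POOL_CIDR
              value: "10.244.0.0/16"
EOF
cat /tmp/calico-snippet.yaml
```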

kubectl apply -f calico.yaml

kubectl get pods -n kube-system

Startup can be slow — give it a few minutes.

A possible error at this step:

error: error parsing calico.yaml: error converting YAML to JSON: yaml: line 206: mapping values are not allowed in this context

The problem is the YAML formatting above: the name and value keys of the edited entry must be aligned at the same indentation — do not add extra spaces!

--------------------------------------------------------------------------------------------------------------------------


Test the Kubernetes cluster

kubectl get cs

Create a pod in the cluster and verify it runs normally:

kubectl create deployment nginx --image=nginx
kubectl expose deployment nginx --port=80 --type=NodePort

kubectl get pod,svc
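For scripting, the assigned NodePort can be pulled straight out with jsonpath (assuming the service is named nginx, as created above):

```shell
kubectl get svc nginx -o jsonpath='{.spec.ports[0].nodePort}'
```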

Access the service at any node IP plus the NodePort reported by kubectl get pod,svc; for example, mine was:

http://192.168.150.103:31341/

Done.
