k8s Canal (by quqi99)
Author: Zhang Hua  Published: 2021-01-25
Copyright: this article may be reposted freely, but please mark the original source and author with a hyperlink and keep this copyright notice.
In a traditional tiered application architecture, something like OpenStack's Security Group can implement network access rules between tiers. In containerized scenarios, however, one container per application is a much finer granularity, and the number of host nodes and their IP addresses change rapidly, so the tiered-firewall approach is no longer workable. Calico is a tool for exactly this case: it implements routing and access control for container traffic by programming iptables and routes on every node, and coordinates node configuration through etcd. A K8s NetworkPolicy can use Calico as the CNI to enforce isolation: only traffic matching the rules may enter a pod, and likewise only matching traffic may leave it.
Canal is the combination of Flannel and Calico: access control is implemented by Calico (calico-node -felix), while networking is still provided by Flannel (Calico offers BGP routing and IPIP tunnels; Flannel offers VXLAN tunnels).
The usage of Calico's network rules is described in (https://www.open-open.com/news/view/1a7c496). When k8s integrates Canal through CNI it can use the network-policy part implemented by Calico; the flow is as follows (see also https://www.jianshu.com/p/331235d8bcbb):
- A NetworkPolicy resource is created via the kubectl client
- Calico's policy-controller (calico-kube-controllers, which embeds calicoctl for writing policies) watches NetworkPolicy resources and writes them into calico's etcd database
- On each node, calico-felix fetches the policy from the etcd database and invokes iptables to apply the corresponding configuration.
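As a sketch of the first step above, a minimal NetworkPolicy can be created in the same heredoc style this article uses later for calicoctl. The resource name, namespace, labels and port here are hypothetical, not taken from this cluster:

```shell
# Hypothetical example: admit only TCP/80 ingress from pods labeled
# app=api into pods labeled app=web; all other ingress to app=web is denied.
cat << EOF | tee np-allow-api.yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-api-to-web
  namespace: default
spec:
  podSelector:
    matchLabels:
      app: web
  policyTypes:
  - Ingress
  ingress:
  - from:
    - podSelector:
        matchLabels:
          app: api
    ports:
    - protocol: TCP
      port: 80
EOF
# kubectl apply -f np-allow-api.yaml   # step 1: create via the kubectl client
```

After `kubectl apply`, the policy-controller writes the policy into calico's etcd, and felix on the pod's node renders the matching iptables rules.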
Calico policy architecture (figure omitted in this copy); the iptables rules eventually generated on the node are shown in the sections below.
Test environment
The IPs of kubernetes-master/0, kubernetes-worker/0, and kubernetes-worker/1 are:
kubernetes-master/0*   active  idle  3  10.5.2.205  6443/tcp        Kubernetes master running.
  canal/2*             active  idle     10.5.2.205                  Flannel subnet 10.1.49.1/24
  containerd/2*        active  idle     10.5.2.205                  Container runtime available
kubernetes-worker/0*   active  idle  4  10.5.3.197  80/tcp,443/tcp  Kubernetes worker running.
  canal/0              active  idle     10.5.3.197                  Flannel subnet 10.1.94.1/24
  containerd/0         active  idle     10.5.3.197                  Container runtime available
kubernetes-worker/1    active  idle  5  10.5.4.4    80/tcp,443/tcp  Kubernetes worker running.
  canal/1              active  idle     10.5.4.4                    Flannel subnet 10.1.3.1/24
  containerd/1         active  idle     10.5.4.4                    Container runtime available

$ kubectl get pods -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
ubuntu-debug-794979f648-4dj8w 1/1 Running 1 118m 10.1.3.14 juju-11ff05-k8s-5 <none> <none>
ubuntu-debug-794979f648-fwdmv 1/1 Running 1 118m 10.1.94.7 juju-11ff05-k8s-4 <none> <none>
Flannel within canal provides pod-to-pod communication
The flannel tunnel network is 10.1.0.0/16, of type vxlan:
# etcdctl --cert-file /etc/ssl/flannel/client-cert.pem --key-file /etc/ssl/flannel/client-key.pem --ca-file /etc/ssl/flannel/client-ca.pem -endpoints=https://10.5.1.44:2379 get /coreos.com/network/config
{"Network": "10.1.0.0/16", "Backend": {"Type": "vxlan"}}
The subnets flannel allocated to kubernetes-master/0, kubernetes-worker/0, and kubernetes-worker/1 are:
root@juju-11ff05-k8s-4:~# etcdctl --cert-file /etc/ssl/flannel/client-cert.pem --key-file /etc/ssl/flannel/client-key.pem --ca-file /etc/ssl/flannel/client-ca.pem -endpoints=https://10.5.1.44:2379 ls /coreos.com/network/subnets/
/coreos.com/network/subnets/10.1.94.0-24 (kubernetes-worker/0)
/coreos.com/network/subnets/10.1.3.0-24 (kubernetes-worker/1)
/coreos.com/network/subnets/10.1.49.0-24 (kubernetes-master/0)
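As the listing above shows, each lease key is simply the subnet CIDR with '/' replaced by '-'. That mapping can be sketched in a line of shell (the subnet value is taken from the output above):

```shell
# Map a flannel subnet CIDR to its etcd lease key: 10.1.94.0/24 -> 10.1.94.0-24
subnet="10.1.94.0/24"
key="/coreos.com/network/subnets/$(echo "$subnet" | tr '/' '-')"
echo "$key"   # /coreos.com/network/subnets/10.1.94.0-24
```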
Flannel allocates according to subnet.env; kubernetes-master/0, kubernetes-worker/0, and kubernetes-worker/1 should all have a configuration like the following (only the one on kubernetes-worker/0 is shown):
root@juju-11ff05-k8s-4:~# cat /run/flannel/subnet.env
FLANNEL_NETWORK=10.1.0.0/16
FLANNEL_SUBNET=10.1.94.1/24
FLANNEL_MTU=8908
FLANNEL_IPMASQ=true
root@juju-11ff05-k8s-4:~# ip addr show flannel.1 |grep global
    inet 10.1.94.0/32 scope global flannel.1
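The relationship between FLANNEL_SUBNET and the address on flannel.1 above can be sketched as follows. This is a standalone illustration using a mock subnet.env file, not a canal tool:

```shell
# Mock subnet.env with the values seen on kubernetes-worker/0
tmpenv=$(mktemp)
cat > "$tmpenv" <<'EOF'
FLANNEL_NETWORK=10.1.0.0/16
FLANNEL_SUBNET=10.1.94.1/24
FLANNEL_MTU=8908
FLANNEL_IPMASQ=true
EOF
# FLANNEL_SUBNET carries the per-node gateway (x.y.z.1/24), while the
# flannel.1 VTEP interface itself holds the network address x.y.z.0/32.
subnet=$(sed -n 's/^FLANNEL_SUBNET=//p' "$tmpenv")       # 10.1.94.1/24
vtep_ip="$(echo "$subnet" | cut -d. -f1-3).0"
echo "expect on flannel.1: inet ${vtep_ip}/32"
rm -f "$tmpenv"
```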
The firewall should allow traffic from this subnet in:
#on kubernetes-worker/0, it should ALLOW 10.1.94.0/24
root@juju-11ff05-k8s-4:~# iptables-save |grep 10.1.94 |grep POST
-A POSTROUTING ! -s 10.1.0.0/16 -d 10.1.94.0/24 -j RETURN
# iptables -t nat -nvL |grep 'Chain POSTROUTING' -A9
Chain POSTROUTING (policy ACCEPT 1447 packets, 89820 bytes)
 pkts bytes target            prot opt in  out  source        destination
 1885  119K KUBE-POSTROUTING  all  --  *   *    0.0.0.0/0     0.0.0.0/0     /* kubernetes postrouting rules */
    8   460 fan-egress        all  --  *   *    252.0.0.0/8   0.0.0.0/0
  183 11249 RETURN            all  --  *   *    10.1.0.0/16   10.1.0.0/16
    6   360 MASQUERADE        all  --  *   *    10.1.0.0/16   !224.0.0.0/4
    0     0 RETURN            all  --  *   *    !10.1.0.0/16  10.1.94.0/24
    0     0 MASQUERADE        all  --  *   *    !10.1.0.0/16  10.1.0.0/16
In addition, the underlying docker or containerd should load the subnet that etcd configured for each node:
- With docker, traffic from a container first reaches docker0 and is then routed to flannel.1 by the routes below
- With containerd, the subnet is passed to flannel via the /etc/cni/net.d/10-canal.conflist file shown below.
Calico/Felix within canal provides network access rules
Normally calico's felix (run as calico-node -felix) runs on every node and is responsible for both routes and network access rules. Here the routing part is left to Flannel, so mainly felix's access-rule function is used. Felix implements these access rules with the iptables below (see the Calico network model - https://www.cnblogs.com/menkeyi/p/11364977.html), though it is unclear whether canal makes any further changes.
root@juju-11ff05-k8s-4:~# route -n |grep flannel
10.1.3.0 10.1.3.0 255.255.255.0 UG 0 0 0 flannel.1
10.1.49.0 10.1.49.0 255.255.255.0 UG 0 0 0 flannel.1
The traffic is then encapsulated in the vxlan tunnel and sent to the remote endpoint.
How Calico manages network policy has not been fully investigated yet (my guess is that the allow firewall rules in the previous section are configured by this part, but that is only a guess).
root@juju-11ff05-k8s-4:~# ip addr show |grep cali -A1
6: cali8215eb6fd4a@if3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default link/ether ee:ee:ee:ee:ee:ee brd ff:ff:ff:ff:ff:ff link-netns cni-262d6f97-ad60-b48b-85e1-ee180b620c30
--
7: cali105686d7bac@if3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default link/ether ee:ee:ee:ee:ee:ee brd ff:ff:ff:ff:ff:ff link-netns cni-09304b68-97b3-c72b-2bd6-b47d9f626890
root@juju-11ff05-k8s-4:~# route -n
Kernel IP routing table
Destination Gateway Genmask Flags Metric Ref Use Iface
0.0.0.0 10.5.0.1 0.0.0.0 UG 100 0 0 ens3
10.1.3.0 10.1.3.0 255.255.255.0 UG 0 0 0 flannel.1
10.1.49.0 10.1.49.0 255.255.255.0 UG 0 0 0 flannel.1
10.1.94.6 0.0.0.0 255.255.255.255 UH 0 0 0 cali105686d7bac
10.1.94.7 0.0.0.0 255.255.255.255 UH 0 0 0 cali8215eb6fd4a
10.5.0.0 0.0.0.0 255.255.0.0 U 0 0 0 ens3
169.254.169.254 10.5.0.1 255.255.255.255 UGH 100 0 0 ens3
252.0.0.0 0.0.0.0 255.0.0.0 U 0 0 0 fan-252
root@juju-11ff05-k8s-4:~# ip netns exec cni-09304b68-97b3-c72b-2bd6-b47d9f626890 route -n
Kernel IP routing table
Destination Gateway Genmask Flags Metric Ref Use Iface
0.0.0.0 169.254.1.1 0.0.0.0 UG 0 0 0 eth0
169.254.1.1 0.0.0.0 255.255.255.255 UH 0 0 0 eth0
root@juju-11ff05-k8s-4:~# ip netns exec cni-262d6f97-ad60-b48b-85e1-ee180b620c30 route -n
Kernel IP routing table
Destination Gateway Genmask Flags Metric Ref Use Iface
0.0.0.0 169.254.1.1 0.0.0.0 UG 0 0 0 eth0
169.254.1.1 0.0.0.0 255.255.255.255 UH 0 0 0 eth0
root@juju-11ff05-k8s-4:~# ctr c ls
CONTAINER IMAGE RUNTIME
calico-node rocks.canonical.com:443/cdk/calico/node:v3.10.1 io.containerd.runc.v2
root@juju-11ff05-k8s-4:~# ip netns
cni-09304b68-97b3-c72b-2bd6-b47d9f626890 (id: 1)
cni-262d6f97-ad60-b48b-85e1-ee180b620c30 (id: 0)
root@juju-11ff05-k8s-4:~# ip netns exec cni-09304b68-97b3-c72b-2bd6-b47d9f626890 ip addr show |grep eth0
3: eth0@if7: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default
    inet 10.1.94.6/32 scope global eth0
root@juju-11ff05-k8s-4:~# ip netns exec cni-262d6f97-ad60-b48b-85e1-ee180b620c30 ip addr show |grep eth0
3: eth0@if6: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default
    inet 10.1.94.7/32 scope global eth0
root@juju-11ff05-k8s-4:~# cat /etc/cni/net.d/10-canal.conflist
{
  "name": "cdk-canal",
  "cniVersion": "0.3.0",
  "plugins": [
    {
      "type": "calico",
      "etcd_endpoints": "https://10.5.1.44:2379",
      "etcd_key_file": "/opt/calicoctl/etcd-key",
      "etcd_cert_file": "/opt/calicoctl/etcd-cert",
      "etcd_ca_cert_file": "/opt/calicoctl/etcd-ca",
      "log_level": "info",
      "ipam": {"type": "host-local", "subnet": "10.1.94.1/24"},
      "policy": {"type": "k8s"},
      "kubernetes": {"kubeconfig": "/root/cdk/kubeconfig"}
    },
    {"type": "portmap", "capabilities": {"portMappings": true}, "snat": true}
  ]
}
iptables provides Service-to-Pod communication
Take the dashboard as an example:
$ kubectl get services -A |grep kubernetes-dashboard
kubernetes-dashboard dashboard-metrics-scraper ClusterIP 10.152.183.92 <none> 8000/TCP 3d22h
kubernetes-dashboard kubernetes-dashboard ClusterIP 10.152.183.89 <none> 443/TCP 3d22h
$ kubectl describe --namespace kubernetes-dashboard service kubernetes-dashboard |grep -E 'IPs|Endpoints|Port'
IPs: 10.152.183.89
Port: <unset> 443/TCP
TargetPort: 8443/TCP
Endpoints: 10.1.3.16:8443
$ kubectl get pods -A -o wide |grep dashboard
kubernetes-dashboard dashboard-metrics-scraper-74757fb5b7-9rqxd 1/1 Running 1 3d22h 10.1.3.17 juju-11ff05-k8s-5 <none> <none>
kubernetes-dashboard kubernetes-dashboard-64f87676d4-s26bs 1/1 Running 1 3d22h 10.1.3.16 juju-11ff05-k8s-5 <none> <none>
# If there were multiple endpoints there would be multiple KUBE-SEP-xxx chains to implement LB; KUBE-MARK-MASQ is used to set the mark
root@juju-11ff05-k8s-4:~# iptables-save |grep kubernetes-dashboard
-A KUBE-SERVICES ! -s 10.1.0.0/16 -d 10.152.183.89/32 -p tcp -m comment --comment "kubernetes-dashboard/kubernetes-dashboard cluster IP" -m tcp --dport 443 -j KUBE-MARK-MASQ #egress
-A KUBE-SERVICES -d 10.152.183.89/32 -p tcp -m comment --comment "kubernetes-dashboard/kubernetes-dashboard cluster IP" -m tcp --dport 443 -j KUBE-SVC-CEZPIJSAUFW5MYPQ #ingress
-A KUBE-SVC-CEZPIJSAUFW5MYPQ -m comment --comment "kubernetes-dashboard/kubernetes-dashboard" -j KUBE-SEP-RMJCCMAWKQ3DCGZP
-A KUBE-SEP-RMJCCMAWKQ3DCGZP -s 10.1.3.16/32 -m comment --comment "kubernetes-dashboard/kubernetes-dashboard" -j KUBE-MARK-MASQ #egress
-A KUBE-SEP-RMJCCMAWKQ3DCGZP -p tcp -m comment --comment "kubernetes-dashboard/kubernetes-dashboard" -m tcp -j DNAT --to-destination 10.1.3.16:8443 #ingress
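The dashboard Service above has a single endpoint, so KUBE-SVC-CEZPIJSAUFW5MYPQ jumps straight to one KUBE-SEP chain. With N endpoints, kube-proxy spreads traffic using statistic-match probabilities 1/N, 1/(N-1), ..., 1. The sketch below only prints such rules rather than loading them; the chain name KUBE-SVC-XXX and the two extra endpoint IPs are made up for illustration:

```shell
# Print (not load) iptables-style LB rules for a service with several endpoints.
print_lb_rules() {
  set -- $1                       # split the endpoint list into positional args
  n=$#; i=1
  for ep in "$@"; do
    left=$((n - i + 1))           # endpoints not yet consumed
    if [ "$left" -gt 1 ]; then
      prob=$(awk "BEGIN {printf \"%.5f\", 1/$left}")
      echo "-A KUBE-SVC-XXX -m comment --comment \"$ep\" -m statistic --mode random --probability $prob -j KUBE-SEP-$i"
    else
      echo "-A KUBE-SVC-XXX -m comment --comment \"$ep\" -j KUBE-SEP-$i"
    fi
    i=$((i + 1))
  done
}
rules=$(print_lb_rules "10.1.3.16 10.1.3.17 10.1.3.18")
echo "$rules"
```

The first rule matches with probability 1/3, the second with 1/2 of the remainder, and the last rule catches everything left, giving each endpoint an equal share overall.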
Related services
systemctl status flannel
systemctl status calico-node.service
systemctl status snap.kubelet.daemon.service
Appendix - a customer issue
A customer has a k8s-over-vSphere environment managed by juju, and reported via goldpinger that pod-to-pod networking was broken.
First we hit https://bugs.launchpad.net/juju/+bug/1831244: a username/password problem between the juju controllers and vSphere, so "juju status --format yaml" showed the controller in a suspended state (note: this is only visible with --format yaml).
After resolving LP #1831244 new nodes could be created normally. We migrated the failing keycloak pod to a new node, but the problem persisted; 'kubectl logs' showed the keycloak failure cause: 'Caused by: java.net.UnknownHostException: postgres'
Clearly keycloak depends on the postgres service, so there was a DNS problem.
coredns had not been moved from the old node to the new one, and the old node had a problem: the subnet used by canal's flannel (/run/flannel/subnet.env) did not match the subnet used by canal's calico (/etc/cni/net.d/10-canal.conflist). As a result the second-to-last iptables line below on the old node was wrong and did not allow flannel's packets through.
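The mismatch can be reproduced as a standalone check. Mock files stand in for the real /run/flannel/subnet.env and /etc/cni/net.d/10-canal.conflist; the 172.17.247.x value is the stale flannel lease from this case, while the other subnet is made up for illustration:

```shell
tmp=$(mktemp -d)
# Mock the two files that disagreed on the old node
printf 'FLANNEL_SUBNET=172.17.247.1/24\n' > "$tmp/subnet.env"
printf '{"plugins":[{"ipam":{"type":"host-local","subnet":"172.17.250.1/24"}}]}\n' > "$tmp/10-canal.conflist"
flannel_subnet=$(sed -n 's/^FLANNEL_SUBNET=//p' "$tmp/subnet.env")
cni_subnet=$(sed -n 's/.*"subnet":"\([^"]*\)".*/\1/p' "$tmp/10-canal.conflist")
if [ "$flannel_subnet" = "$cni_subnet" ]; then status=consistent; else status=MISMATCH; fi
echo "flannel=$flannel_subnet cni=$cni_subnet -> $status"
rm -rf "$tmp"
```

Running the same two extractions against the real files on each node is a quick way to spot a node in this state.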
grep -A9 "Chain POSTROUTING" sos_commands/networking/iptables_-t_nat_-nvL
Chain POSTROUTING (policy ACCEPT 0 packets, 0 bytes)
pkts bytes target prot opt in out source destination
1411K 88M cali-POSTROUTING all -- * * 0.0.0.0/0 0.0.0.0/0 /* cali:O3lYWMrLQYEMJtB5 */
1378K 86M CNI-HOSTPORT-MASQ all -- * * 0.0.0.0/0 0.0.0.0/0 /* CNI portfwd requiring masquerade */
1376K 86M KUBE-POSTROUTING all -- * * 0.0.0.0/0 0.0.0.0/0 /* kubernetes postrouting rules */
800K 50M RETURN all -- * * 172.17.240.0/20 172.17.240.0/20
0 0 MASQUERADE all -- * * 172.17.240.0/20 !224.0.0.0/4
0 0 RETURN all -- * * !172.17.240.0/20 172.17.247.0/24
521K 31M MASQUERADE all -- * * !172.17.240.0/20 172.17.240.0/20
The following command queries each node's subnet configuration in etcd:
juju run -u kubernetes-worker/<N> etcdctl --cert-file /etc/ssl/flannel/client-cert.pem --key-file /etc/ssl/flannel/client-key.pem --ca-file /etc/ssl/flannel/client-ca.pem -endpoints=https://10.1.234.195:2379,https://10.1.234.196:2379,https://10.1.234.203:2379 ls /coreos.com/network/subnets/
The stale lease must first be deleted from etcd - rm /coreos.com/network/subnets/172.17.247.0-24
Then restart flannel with "systemctl restart flannel.service", update the subnet in /run/flannel/subnet.env to the correct one, and confirm it with "ip a s flannel.1". Finally:
systemctl restart calico-node.service
systemctl restart snap.kubelet.daemon.service
calicoctl test
juju ssh kubernetes-worker/0 -- sudo -s
#https://docs.projectcalico.org/getting-started/clis/calicoctl/install
curl -O -L https://github.com/projectcalico/calicoctl/releases/download/v3.17.1/calicoctl
cp calicoctl /usr/bin/ && chmod +x /usr/bin/calicoctl
#copy env variable like ETCD_ENDPOINTS from the output of 'ps -ef |grep calicoctl'
ETCD_ENDPOINTS=https://10.5.0.131:2379 ETCD_CA_CERT_FILE=/opt/calicoctl/etcd-ca ETCD_CERT_FILE=/opt/calicoctl/etcd-cert ETCD_KEY_FILE=/opt/calicoctl/etcd-key calicoctl get networkPolicy
ETCD_ENDPOINTS=https://10.5.0.131:2379 ETCD_CA_CERT_FILE=/opt/calicoctl/etcd-ca ETCD_CERT_FILE=/opt/calicoctl/etcd-cert ETCD_KEY_FILE=/opt/calicoctl/etcd-key calicoctl get globalNetworkPolicy
k8s networkpolicy - https://docs.projectcalico.org/security/kubernetes-network-policy
calico networkpolicy - https://docs.projectcalico.org/security/calico-network-policy
cat << EOF | sudo tee calicoPolicyTest.yaml
apiVersion: projectcalico.org/v3
kind: GlobalNetworkPolicy
metadata:
  name: deny-circle-blue
spec:
  selector: k8s-app == 'kubernetes-dashboard'
  ingress:
  - action: Deny
    protocol: TCP
    destination:
      ports:
      - 22
EOF
ETCD_ENDPOINTS=https://10.5.0.131:2379 ETCD_CA_CERT_FILE=/opt/calicoctl/etcd-ca ETCD_CERT_FILE=/opt/calicoctl/etcd-cert ETCD_KEY_FILE=/opt/calicoctl/etcd-key calicoctl get globalNetworkPolicy
#go to the worker which the dashboard pod is on (kubectl get pod -A -o wide |grep dash)
iptables-save |grep 2222
reference
[1] https://courses.academy.tigera.io/courses/course-v1:tigera+CCO-L1+CCO-L1-2020/course/