How to Secure Kubernetes the Hard Way

Kubernetes is open-source, robust, and one of the most popular container orchestration platforms available in the market. However, because of its complexity, not everyone can secure it appropriately.

Even experienced system administrators struggle with its multiple moving parts and numerous settings (which were built in to provide flexibility).

That alone might be the reason you end up in a disaster. Because of its popularity, multiple vendors offer managed Kubernetes services that have some level of security built in, so you do not have to worry much about the Kubernetes architecture. But what if you do not have that option?

Well, you can create your cluster in any environment using automation tools such as kubeadm. However, most admins who start with kubeadm struggle to understand how Kubernetes works behind the scenes, as kubeadm takes care of setting up the cluster for you.

The best way to learn that is to bootstrap your Kubernetes cluster the hard way. This story is an adaptation of Kelsey Hightower's "Kubernetes the Hard Way"; however, unlike the original guide, this one focuses on building a more secure, production-ready Kubernetes cluster.

This story is an advanced-level topic, and it assumes that you already have some experience administering Kubernetes clusters. Read "How to Secure Kubernetes the Easy Way" for a more automated setup.

Cluster Architecture

The original guide was for educational purposes only and, therefore, did not consider the resilience and security aspects of the cluster. It had the following configuration:

  • Three control plane (master) nodes.
  • Three worker nodes.
  • One Google Cloud network load balancer.
  • NGINX instances to provide a health check for the API server.
  • Runs a stacked etcd cluster (which means that the etcd service runs within the control plane nodes).
  • Exposes all nodes to the internet through external IPs.
  • There are no firewall rules to limit traffic.
  • There is no disaster recovery built in, as all nodes run in the same zone.

Since we are building while keeping security and resilience in mind, I will modify it to use the following cluster architecture:

Cluster Architecture
  • We will run a regional Kubernetes cluster. That means that our cluster will be resilient to zone outages, but not to a regional disruption.
  • Three master nodes (master01, master02, and master03) run in three different zones.
  • Three etcd nodes (etcd01, etcd02, and etcd03) run in three different zones.
  • Two worker nodes (node01 and node02) run in two different zones.
  • Two NGINX load balancers (masterlb and masterlb-dr) run in two different zones in an active-standby configuration. I will utilize an external static IP and an internal static IP with aliasing to ensure that, at any given time, all nodes communicate with the static IP bound to the active load balancer node.
  • A bastion host runs in one zone (we can create another bastion host in case of a zone outage).
  • All servers apart from the bastion host and load balancers are internal. They don't have an external IP attached.
  • Since the nodes need outbound internet connectivity, I have utilized a Cloud NAT gateway for egress traffic from the internal servers.

Firewall Rules

Kubernetes clusters operate a flat network and, therefore, all nodes are placed in the same subnetwork. However, I will apply a restrictive policy and allow only the required traffic through the required ports.

Firewall Rules

That will ensure that if someone gains unauthorized access to any of our servers, the damage would be limited to that server and they would not be able to access other nodes.

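For illustration, one such restrictive rule can be expressed in Terraform roughly as follows; the resource name, network name, and tags here are hypothetical, and the actual rules live in the repository used later in this story.

# Hypothetical sketch: allow only the Kubernetes API port from workers and
# load balancers to the masters; all other ingress stays blocked.
resource "google_compute_firewall" "allow-api-server" {
  name    = "allow-api-server"
  network = "kthw-network"   # assumed network name

  allow {
    protocol = "tcp"
    ports    = ["6443"]
  }

  source_tags = ["worker", "loadbalancer"]
  target_tags = ["master"]
}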

There is no SSH access open between any of the nodes apart from the bastion host. A firewall protects the bastion host and only allows trusted clients access through IP whitelisting.

All set! Let’s get started.

Spinning Up Infrastructure Using Terraform

We will spin up the infrastructure using Terraform (instead of the gcloud commands the original guide used) because it is declarative and more modern. It is also easier to clean up and resume from where we started.

Set up IAM and admin

You would now need to configure the Google Cloud environment to allow Terraform to manage infrastructure remotely.

You first need to create a service account and grant it enough permissions so that it can administer the required infrastructure.

I have assigned the “Project Editor” role to the service account, but you may consider more restrictive access as needed by your organizational policies.

Go to IAM and Admin -> Service Accounts.

Create a service account called Terraform.

Grant the project editor role to the service account so that it can spin up resources.

Grant the service account “service account user” access and generate a JSON key for Terraform to authenticate with Google Cloud APIs as the service account.

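If you prefer the command line over the console, the equivalent gcloud commands look roughly like this (replace PROJECT_ID with your project ID; the service account name and key file name are only examples):

gcloud iam service-accounts create terraform --display-name "Terraform"
gcloud projects add-iam-policy-binding PROJECT_ID \
  --member "serviceAccount:terraform@PROJECT_ID.iam.gserviceaccount.com" \
  --role "roles/editor"
gcloud projects add-iam-policy-binding PROJECT_ID \
  --member "serviceAccount:terraform@PROJECT_ID.iam.gserviceaccount.com" \
  --role "roles/iam.serviceAccountUser"
gcloud iam service-accounts keys create credentials.json \
  --iam-account terraform@PROJECT_ID.iam.gserviceaccount.com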

Installing Terraform

The Terraform CLI is a single binary file, and it is simple to set up and use. Go to Terraform's website and download the appropriate package for your environment.

Unzip the package and then either set the path pointing to the Terraform binary or move the Terraform binary to your bin directory.

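For example, on a Linux workstation the installation could look like this (the version number is only an example; use whichever release suits your environment):

curl -LO https://releases.hashicorp.com/terraform/0.12.24/terraform_0.12.24_linux_amd64.zip
unzip terraform_0.12.24_linux_amd64.zip
sudo mv terraform /usr/local/bin/
terraform version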

Spin up infrastructure

Run the following for Terraform to spin up the infrastructure. That will create the entire infrastructure, including the firewall rules and static IP addresses, for you.

git clone https://github.com/bharatmicrosystems/kthw-terraform.git
cd kthw-terraform/
cp terraform.tfvars.example terraform.tfvars

Modify the terraform.tfvars file and replace the variables with relevant values for the project, region, and source_ranges. The source range should contain the IP address of your machine so that you can connect to your bastion host from your system.

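A filled-in terraform.tfvars could look something like the snippet below; the values are placeholders, and the exact variable names and format come from terraform.tfvars.example in the repository.

project       = "my-gcp-project"
region        = "europe-west2"
source_ranges = ["203.0.113.25/32"]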

Copy the JSON key you’ve downloaded to kthw-terraform/ and rename it to credentials.json.

Initialize, plan, and apply the configuration to the Google Cloud platform.

terraform init
terraform plan
terraform apply

Go to the Google Compute Engine section and you will see that Terraform has created one bastion, three masters, three etcd nodes, two load balancers, and two worker nodes.

GCE Instances

Setting Up Kubernetes the Hard Way

The original guide uses the API server certificate for etcd as it uses a stacked etcd configuration. I will generate separate certificates and keys for etcd as we are running an external etcd setup.

We will place the etcd cluster behind a load balancer, which gives us multiple advantages:

  • The etcd nodes can have ephemeral IPs.
  • You can add and remove etcd nodes according to your requirements.
  • NGINX provides an automatic health check of its back-end members and will not send traffic to an unhealthy etcd instance, avoiding runtime issues.
  • You don't need to update the control plane configuration if you make changes to the etcd cluster (such as adding or removing etcd nodes).

We will allow only the desired traffic and block the rest of it. That is required to protect our cluster from unauthorized access. We will encrypt secrets at rest on the etcd cluster as suggested in the original guide.

We will use Docker instead of containerd as the container runtime. We will create our environment on CentOS 7 instead of Ubuntu 16.04.6 LTS.

Log In to Bastion Host

SSH into the bastion host from your local system by running the gcloud compute ssh command.

gcloud compute ssh bastion --zone europe-west2-a

It should open an SSH session with the bastion host. Within the bastion host, clone the kthw-terraform repository.

The repository contains scripts that are already written for you and tested on the Google Cloud Platform. To gain a better understanding, feel free to open and explore the scripts. I will only elaborate on the contents of the scripts where they differ from the original guide.

git clone https://github.com/bharatmicrosystems/kthw-terraform.git
cd kthw-terraform/
cp -a scripts/ exec/
cd exec/
masters=master01,master02,master03
workers=node01,node02
loadbalancers=masterlb,masterlb-dr
etcds=etcd01,etcd02,etcd03
internal_vip=masterlb-internal-vip
external_vip=masterlb-external-vip

Set Up CA

This setup is exactly the same as in the original guide. Run the following:

sh -x setup-ca.sh

Output:

ca-key.pem
ca.pem

Generate Client and Server Certificates

This setup is identical to the original guide, with the only differences being that we generate a separate certificate and key pair for the etcd cluster, and we point the API server certificate at a static internal IP instead of the external IP.

sh -x setup-certs.sh $masters $workers $internal_vip $etcds

Output:

admin-key.pem
admin.pem
node01-key.pem
node01.pem
node02-key.pem
node02.pem
kube-controller-manager-key.pem
kube-controller-manager.pem
kube-proxy-key.pem
kube-proxy.pem
kube-scheduler-key.pem
kube-scheduler.pem
kubernetes-key.pem
kubernetes.pem
etcd-key.pem
etcd.pem
service-account-key.pem
service-account.pem

Generate Kubernetes Configuration Files for Authentication

We will now generate kubeconfigs so that the Kubernetes components can connect, authenticate, and authorize with the API server.

Unlike the original guide, in this configuration, all components talk to the API server via the load balancer. The advantages of this configuration are similar to the benefits of putting the etcd cluster behind a load balancer.

sh -x generate-kubeconfig.sh $workers $internal_vip

Output:

node01.kubeconfig
node02.kubeconfig
kube-proxy.kubeconfig
kube-controller-manager.kubeconfig
kube-scheduler.kubeconfig
admin.kubeconfig

Bootstrap the Load Balancer

Since we are using NGINX as a load balancer instead of the Google Cloud network load balancer, we will configure NGINX to send traffic to the API servers and the etcd nodes.

The NGINX configuration looks roughly like the sketch below.

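This is a minimal sketch only, assuming NGINX's stream module for TCP passthrough and passive health checks via max_fails/fail_timeout; the actual file is generated by setup-nginx.sh.

# Sketch of the load balancer's stream configuration (illustrative only)
stream {
    upstream kubernetes_api {
        server master01:6443 max_fails=3 fail_timeout=10s;
        server master02:6443 max_fails=3 fail_timeout=10s;
        server master03:6443 max_fails=3 fail_timeout=10s;
    }

    upstream etcd_cluster {
        server etcd01:2379 max_fails=3 fail_timeout=10s;
        server etcd02:2379 max_fails=3 fail_timeout=10s;
        server etcd03:2379 max_fails=3 fail_timeout=10s;
    }

    server {
        listen 6443;
        proxy_pass kubernetes_api;
    }

    server {
        listen 2379;
        proxy_pass etcd_cluster;
    }
}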

To bootstrap the NGINX cluster using the script run:

sh -x setup-nginx.sh $loadbalancers $masters $etcds $internal_vip $external_vip

We will also alias the VIP to the first load balancer and run a keep-alive daemon to ensure that the VIP is re-aliased to the standby load balancer in case the primary load balancer goes down.

The keep-alive utility (called “assign-vip” in this case) will run as a daemon and map the VIP automatically to the active instance.

The standby instance will keep polling for the VIP to be available and if the VIP is released for some reason (such as in case that the primary load balancer crashes), the standby instance will take that over and assign the VIP to itself.

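The exact implementation ships with the repository, but conceptually the standby's polling loop could be sketched as below. The VIP range, peer name, and interval are assumptions; the gcloud alias-IP commands are simply one way of moving an internal VIP between instances.

#!/bin/bash
# Conceptual sketch of an "assign-vip" style watcher on the standby load balancer.
VIP_RANGE="10.240.0.100/32"   # hypothetical internal VIP alias range
PEER="masterlb"               # the primary load balancer instance
SELF=$(hostname)

while true; do
  # If the API server is no longer reachable through the VIP, claim the alias.
  if ! curl -ks --max-time 5 "https://${VIP_RANGE%/*}:6443/healthz" > /dev/null; then
    PEER_ZONE=$(gcloud compute instances list --filter="name=${PEER}" --format="value(zone)")
    SELF_ZONE=$(gcloud compute instances list --filter="name=${SELF}" --format="value(zone)")
    # Release the alias IP from the peer and attach it to this instance.
    gcloud compute instances network-interfaces update "${PEER}" --zone "${PEER_ZONE}" --aliases ""
    gcloud compute instances network-interfaces update "${SELF}" --zone "${SELF_ZONE}" --aliases "${VIP_RANGE}"
  fi
  sleep 10
done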

Output below.

NGINX should be running on both the masterlb and masterlb-dr node.

+ sudo systemctl status nginx
● nginx.service - nginx - high performance web server
   Loaded: loaded (/usr/lib/systemd/system/nginx.service; enabled; vendor preset: disabled)
   Active: active (running) since Sun 2020-04-12 23:14:32 UTC; 77ms ago
     Docs: http://nginx.org/en/docs/
  Process: 1763 ExecStart=/usr/sbin/nginx -c /etc/nginx/nginx.conf (code=exited, status=0/SUCCESS)
 Main PID: 1764 (nginx)
   CGroup: /system.slice/nginx.service
           ├─1764 nginx: master process /usr/sbin/nginx -c /etc/nginx/nginx.conf
           └─1765 nginx: master process /usr/sbin/nginx -c /etc/nginx/nginx.conf

Distributing Certificates and Kubeconfigs

We then need to distribute the certs, keys, and kubeconfigs to the servers.

sh -x distribute-certs.sh $masters $workers $etcds
sh -x distribute-kubeconfig.sh $masters $workers

Generate the Data Encryption Config and Key

Like the original guide, we will generate a data encryption config and key. That allows Kubernetes to store secrets as encrypted text in etcd.

That is extremely important: if someone gains access to your etcd cluster, they should not be able to view your secrets simply by taking a hex dump of the etcd data.

The command below will generate encryption-config.yaml and copy it to all master nodes.

sh -x generate-data-enc.config.sh $masters
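
For reference, the generated file has the same shape as in the original guide, with a randomly generated, base64-encoded 32-byte key substituted for ENCRYPTION_KEY:

kind: EncryptionConfig
apiVersion: v1
resources:
  - resources:
      - secrets
    providers:
      - aescbc:
          keys:
            - name: key1
              secret: ${ENCRYPTION_KEY}
      - identity: {}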

Bootstrap the etcd Cluster

The etcd cluster is used to store the state of the Kubernetes cluster and, therefore, is a vital component.

If someone gains access to your etcd cluster, it is equivalent to giving them root access to your cluster. Consequently, we have kept the etcd cluster completely separate and blocked all traffic apart from the etcd API running on port 2379, which only the master nodes have access to via the etcd load balancer.

Instead of advertising the local client IP, the etcd cluster will advertise the static load balancer IP as the API server would connect to etcd via the load balancer.

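The relevant etcd flags therefore differ from the original guide roughly as sketched below; this is only an excerpt, and the peer TLS and --initial-cluster flags follow the original guide. INTERNAL_VIP is the load balancer's static internal IP, and INTERNAL_IP is the node's own address.

/usr/local/bin/etcd \\
  --name ${ETCD_NAME} \\
  --cert-file=/etc/etcd/etcd.pem \\
  --key-file=/etc/etcd/etcd-key.pem \\
  --trusted-ca-file=/etc/etcd/ca.pem \\
  --client-cert-auth \\
  --listen-client-urls https://${INTERNAL_IP}:2379,https://127.0.0.1:2379 \\
  --advertise-client-urls https://${INTERNAL_VIP}:2379 \\
  --initial-advertise-peer-urls https://${INTERNAL_IP}:2380 \\
  --listen-peer-urls https://${INTERNAL_IP}:2380 \\
  --data-dir=/var/lib/etcd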

sh -x setup-etcd.sh $etcds $internal_vip

You can check the health of your etcd cluster by running the command below on any of your etcd nodes (the setup script above already runs it for you).

sudo ETCDCTL_API=3 /usr/local/bin/etcdctl member list \
  --endpoints=https://127.0.0.1:2379 \
  --cacert=/etc/etcd/ca.pem \
  --cert=/etc/etcd/etcd.pem \
  --key=/etc/etcd/etcd-key.pem

Output:

17a5143233735134, started, etcd02, https://etcd02:2380, https://10.154.15.236:2379, false
92ad41089af8c0f8, started, etcd01, https://etcd01:2380, https://10.154.15.236:2379, false
cfc65b5eb6a19652, started, etcd03, https://etcd03:2380, https://10.154.15.236:2379, false

Bootstrap the Kubernetes Control Plane

The Kubernetes control plane is responsible for managing your Kubernetes cluster. We will install the Kube API server, Kube controller manager, and Kube scheduler on the control plane nodes.

The process of setting up the control plane closely follows the original guide.

However, unlike the original guide, the Kube API server will advertise the floating VIP of the load balancer instead of its internal IP, and it will use the etcd certificates instead of the API server certificate to authenticate with the etcd servers.

The API server will also use a single load-balanced endpoint of the etcd cluster rather than a comma-separated list of etcd URLs.

/usr/local/bin/kube-apiserver \\
  --advertise-address=${LOAD_BALANCER_VIP} \\
  --allow-privileged=true \\
  --apiserver-count=3 \\
  --audit-log-maxage=30 \\
  --audit-log-maxbackup=3 \\
  --audit-log-maxsize=100 \\
  --audit-log-path=/var/log/audit.log \\
  --authorization-mode=Node,RBAC \\
  --bind-address=0.0.0.0 \\
  --client-ca-file=/var/lib/kubernetes/ca.pem \\
  --enable-admission-plugins=NamespaceLifecycle,NodeRestriction,LimitRanger,ServiceAccount,DefaultStorageClass,ResourceQuota \\
  --etcd-cafile=/var/lib/kubernetes/ca.pem \\
  --etcd-certfile=/var/lib/kubernetes/etcd.pem \\
  --etcd-keyfile=/var/lib/kubernetes/etcd-key.pem \\
  --etcd-servers=https://${LOAD_BALANCER_VIP}:2379 \\
  --event-ttl=1h \\
  --encryption-provider-config=/var/lib/kubernetes/encryption-config.yaml \\
  --kubelet-certificate-authority=/var/lib/kubernetes/ca.pem \\
  --kubelet-client-certificate=/var/lib/kubernetes/kubernetes.pem \\
  --kubelet-client-key=/var/lib/kubernetes/kubernetes-key.pem \\
  --kubelet-https=true \\
  --runtime-config=api/all \\
  --service-account-key-file=/var/lib/kubernetes/service-account.pem \\
  --service-cluster-ip-range=10.32.0.0/24 \\
  --service-node-port-range=30000-32767 \\
  --tls-cert-file=/var/lib/kubernetes/kubernetes.pem \\
  --tls-private-key-file=/var/lib/kubernetes/kubernetes-key.pem \\
  --v=2

To bootstrap your control plane, run:

sh -x setup-master.sh $masters $internal_vip

To check the status of your control plane, run the following on all of your control plane nodes. The setup command above already runs it for you automatically, so if you watch the output, you can see when it runs.

kubectl get componentstatuses --kubeconfig admin.kubeconfig

Output:

NAME                 STATUS    MESSAGE             ERROR
controller-manager   Healthy   ok
scheduler            Healthy   ok
etcd-0               Healthy   {"health":"true"}

As you can see, the number of etcd members shown is just one instead of three.

The reason is that, because the etcd cluster sits behind a load balancer, Kubernetes is not aware of the individual etcd nodes. That is fine, as we have offloaded the load balancing to our load balancer instead of the API server.

Setup RBAC Between Control Plane and Kubelet

The control plane needs to communicate with the kubelet API to manage the cluster. Kubelet is the interface between Kubernetes and the container runtime.

To allow secure communication between the control plane and the kubelet, I have, like the original guide, set the kubelet's --authorization-mode flag to Webhook.

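The objects the script creates are essentially the ones from the original guide: a ClusterRole granting the API server access to the kubelet API, bound to the kubernetes user that the API server's client certificate identifies as.

apiVersion: rbac.authorization.k8s.io/v1beta1
kind: ClusterRole
metadata:
  annotations:
    rbac.authorization.kubernetes.io/autoupdate: "true"
  labels:
    kubernetes.io/bootstrapping: rbac-defaults
  name: system:kube-apiserver-to-kubelet
rules:
  - apiGroups: [""]
    resources:
      - nodes/proxy
      - nodes/stats
      - nodes/log
      - nodes/spec
      - nodes/metrics
    verbs: ["*"]
---
apiVersion: rbac.authorization.k8s.io/v1beta1
kind: ClusterRoleBinding
metadata:
  name: system:kube-apiserver
  namespace: ""
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: system:kube-apiserver-to-kubelet
subjects:
  - apiGroup: rbac.authorization.k8s.io
    kind: User
    name: kubernetes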

sh -x setup-rbac.sh $masters

Output:

clusterrole.rbac.authorization.k8s.io/system:kube-apiserver-to-kubelet created
clusterrolebinding.rbac.authorization.k8s.io/system:kube-apiserver created

Bootstrap the Worker Nodes

Worker nodes are the powerhouse of a Kubernetes cluster. The container workloads run on the worker nodes and, therefore, there can be numerous worker nodes on your cluster.

In this story, I will just be bootstrapping two worker nodes. However, you can have as many worker nodes as you like. Unlike the original guide, we will be installing Docker on the worker nodes instead of containerd, and the kubelet configuration below reflects that.

/usr/local/bin/kubelet \\
  --config=/var/lib/kubelet/kubelet-config.yaml \\
  --docker=unix:///var/run/docker.sock \\
  --docker-endpoint=unix:///var/run/docker.sock \\
  --image-pull-progress-deadline=2m \\
  --kubeconfig=/var/lib/kubelet/kubeconfig \\
  --network-plugin=cni \\
  --register-node=true \\
  --cgroup-driver=systemd \\
  --v=2

To bootstrap your Kubernetes worker nodes, run:

sh -x setup-worker.sh $workers

To check if the setup was successful, run:

gcloud compute ssh master01 --internal-ip \
  --command "kubectl get nodes --kubeconfig admin.kubeconfig"

Output:

NAME     STATUS     ROLES    AGE   VERSION
node01   NotReady   <none>   98s   v1.15.3
node02   NotReady   <none>   34s   v1.15.3

Configure Weave Net and DNS Add-On

You might have noticed that the nodes are not yet ready. That is because the cluster does not have the correct routes defined in the routing tables.

Unlike the original guide, where Kelsey Hightower created manual routes mapping the cluster IP range to the node IP range, we will make use of the Weave Net CNI plugin for Kubernetes.

That handles the mapping automatically and also provides us with other features such as pod security policies and Kubernetes network policies. We will also install the DNS add-on so that pods can discover other services through the service name instead of the service IP.

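Under the hood, the script applies manifests along these lines from a master node; the URLs are the ones commonly used at the time of writing and may differ from what setup-networking.sh actually pins.

# Install the Weave Net CNI plugin
kubectl apply -f "https://cloud.weave.works/k8s/net?k8s-version=$(kubectl version | base64 | tr -d '\n')"

# Install the CoreDNS add-on used by the original guide
kubectl apply -f https://storage.googleapis.com/kubernetes-the-hard-way/coredns.yaml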

sh -x setup-networking.sh $masters

Re-run kubectl get nodes to see if they are ready.

gcloud compute ssh master01 --internal-ip \
  --command "kubectl get nodes --kubeconfig admin.kubeconfig"

Output:

NAME     STATUS   ROLES    AGE   VERSION
node01   Ready    <none>   11m   v1.15.3
node02   Ready    <none>   10m   v1.15.3

And yes, they are ready now!

Smoke Test

We have now configured a secure, highly available, production-ready Kubernetes cluster the hard way, and it is time to run some smoke tests.

Test for kube-proxy

We will deploy an NGINX container and then run a port-forward on the control plane. If kube-proxy is set up correctly, we will be able to access NGINX through the forwarded port on localhost.

ZONE=`gcloud compute instances list --filter="name=master01"| grep master01 | awk '{ print $2 }'`
gcloud compute ssh --zone=$ZONE --internal-ip master01
#Setup NGINX on a container
kubectl run nginx --image=nginx
kubectl get pods -l run=nginx
POD_NAME=$(kubectl get pods -l run=nginx -o jsonpath="{.items[0].metadata.name}")
kubectl port-forward $POD_NAME 8081:80

From another SSH session on the bastion host, run:

ZONE=`gcloud compute instances list --filter="name=master01"| grep master01 | awk '{ print $2 }'`
gcloud compute ssh --zone=$ZONE --internal-ip master01
curl --head http://127.0.0.1:8081

Output:

HTTP/1.1 200 OK
Server: nginx/1.17.9
Date: Sun, 12 Apr 2020 23:58:08 GMT
Content-Type: text/html
Content-Length: 612
Last-Modified: Tue, 03 Mar 2020 14:32:47 GMT
Connection: keep-alive
ETag: "5e5e6a8f-264"
Accept-Ranges: bytes

Test for exec

Test whether we can execute commands within the running container.

POD_NAME=$(kubectl get pods -l run=nginx -o jsonpath="{.items[0].metadata.name}")
kubectl exec -ti $POD_NAME -- nginx -v

Output:

nginx version: nginx/1.17.9

Test for NodePort

Test whether the NodePort service works correctly.

kubectl expose deployment nginx --port 80 --type NodePort
NODEPORT=$(kubectl get svc nginx -o jsonpath="{.spec.ports[0].nodePort}")
curl -I node01:$NODEPORT
curl -I node02:$NODEPORT

Output:

+ curl -I node01:32403
HTTP/1.1 200 OK
Server: nginx/1.17.9
Date: Sun, 12 Apr 2020 23:58:06 GMT
Content-Type: text/html
Content-Length: 612
Last-Modified: Tue, 03 Mar 2020 14:32:47 GMT
Connection: keep-alive
ETag: "5e5e6a8f-264"
Accept-Ranges: bytes
+ curl -I node02:32403
HTTP/1.1 200 OK
Server: nginx/1.17.9
Date: Sun, 12 Apr 2020 23:58:08 GMT
Content-Type: text/html
Content-Length: 612
Last-Modified: Tue, 03 Mar 2020 14:32:47 GMT
Connection: keep-alive
ETag: "5e5e6a8f-264"
Accept-Ranges: bytes

Test logs

Test whether we can retrieve logs from the container.

kubectl logs $POD_NAME

Output:

10.200.192.0 - - [12/Apr/2020:23:58:06 +0000] "HEAD / HTTP/1.1" 200 0 "-" "curl/7.29.0" "-"
10.200.0.1 - - [12/Apr/2020:23:58:08 +0000] "HEAD / HTTP/1.1" 200 0 "-" "curl/7.29.0" "-"

Test for CoreDNS

Test whether we can successfully run nslookup from within a pod. For this, we will launch a busybox pod from which we will try to look up the kubernetes service. If CoreDNS is installed correctly, kubernetes is the default service created.

kubectl run busybox --image=busybox:1.28 --command -- sleep 3600
kubectl get pods
POD_NAME=$(kubectl get pods -l run=busybox -o jsonpath="{.items[0].metadata.name}")
sleep 10
kubectl exec -ti $POD_NAME -- nslookup kubernetes

Output:

Server:    10.32.0.10
Address 1: 10.32.0.10 kube-dns.kube-system.svc.cluster.local
Name:      kubernetes
Address 1: 10.32.0.1 kubernetes.default.svc.cluster.local

Test for secret encryption at rest

We will test whether Kubernetes encrypts secrets at rest or not.

kubectl create secret generic kubernetes-the-hard-way --from-literal="mykey=mydata"

Now log in to one of the etcd nodes and do a hex dump of the secret.

ZONE=`gcloud compute instances list --filter="name=etcd01"| grep etcd01 | awk '{ print $2 }'`
gcloud compute ssh --zone=$ZONE --internal-ip etcd01
sudo ETCDCTL_API=3 /usr/local/bin/etcdctl get \
  --endpoints=https://127.0.0.1:2379 \
  --cacert=/etc/etcd/ca.pem \
  --cert=/etc/etcd/etcd.pem \
  --key=/etc/etcd/etcd-key.pem \
  /registry/secrets/default/kubernetes-the-hard-way | hexdump -C

Output:

00000000  2f 72 65 67 69 73 74 72  79 2f 73 65 63 72 65 74  |/registry/secret|
00000010  73 2f 64 65 66 61 75 6c  74 2f 6b 75 62 65 72 6e  |s/default/kubern|
00000020  65 74 65 73 2d 74 68 65  2d 68 61 72 64 2d 77 61  |etes-the-hard-wa|
00000030  79 0a 6b 38 73 3a 65 6e  63 3a 61 65 73 63 62 63  |y.k8s:enc:aescbc|
00000040  3a 76 31 3a 6b 65 79 31  3a 37 8e ba c4 45 8b 58  |:v1:key1:7...E.X|
00000050  36 6d 23 a9 3f 3a 4d 3e  36 45 bd 37 be 23 85 52  |6m#.?:M>6E.7.#.R|
00000060  01 75 e9 df de 5a 66 9d  d0 26 04 d2 98 92 a8 ab  |.u...Zf..&......|
00000070  80 cb f2 fe a5 c5 16 9c  d5 14 1d c6 de 92 5b 1a  |..............[.|
00000080  6e 93 0c 91 17 ed d9 40  74 80 16 b1 36 45 7c cb  |n......@t...6E|.|
00000090  5e 1a 24 05 9f 2f 58 c0  b2 83 f7 2d b8 2d ca b6  |^.$../X....-.-..|
000000a0  1d 09 f8 3b 18 23 7c eb  1a 35 35 25 68 2f 6a 55  |...;.#|..55%h/jU|
000000b0  2f f2 53 f0 8a 04 53 dc  f0 56 4b a5 23 f1 fe 4a  |/.S...S..VK.#..J|
000000c0  bc 4d 0d a9 d8 03 2c 4b  1b 9e cd 12 ee 41 df ed  |.M....,K.....A..|
000000d0  4a f8 e5 19 a3 4c ed 06  74 08 07 d3 7e 1c 03 f2  |J....L..t...~...|
000000e0  a5 21 a4 0e 6a 91 86 93  5f 0a                    |.!..j..._.|
000000ea

And, as you can see, the secret is encrypted!

Smoke test cleanup

Clean up the resources we created for the test.

kubectl delete secret kubernetes-the-hard-way
kubectl delete svc nginx
kubectl delete deployment nginx
kubectl delete deployment busybox

Cleaning Up

If you created the cluster for learning, it would make sense to destroy the infrastructure after you finish to avoid a huge bill.

terraform destroy

Conclusion

Thanks for reading, I hope you enjoyed the article. The scope of the story ends with bootstrapping your cluster as per the original guide.

There is much more you can do to make your cluster more secure, which I will cover in future write-ups. To gain a high-level understanding, read "How to Harden Your Kubernetes Cluster for Production." Keep watching this space for more!

Translated from: https://medium.com/better-programming/how-to-secure-kubernetes-the-hard-way-9b421b36aba4
