Chaos Engineering with ChaosToolkit, Part 1: Deleting K8s Pods

1. Overview

Today we will try out chaostoolkit, an open-source tool for chaos engineering.
Its goal is to provide a free, open, community-driven toolkit and API.

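2. ChaosToolkit Basics

2.1. Principles of Chaos Engineering

  • Official source code: https://github.com/chaostoolkit/chaostoolkit
  • To understand this tool, you need to know the key points set out in the Principles of Chaos Engineering, shown below:

    [Figure: key points of the Principles of Chaos Engineering]

  • Remember the first of these points: build a steady-state hypothesis.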
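To make the steady-state idea concrete, here is a minimal Python sketch of the flow an experiment follows: check the hypothesis, inject the fault, check the hypothesis again, then roll back. This is not ChaosToolkit's own code. The function `run_experiment`, the toy `state` dictionary, and the status strings are hypothetical names chosen for illustration.

```python
from typing import Callable, List


def run_experiment(
    probe: Callable[[], bool],
    actions: List[Callable[[], None]],
    rollbacks: List[Callable[[], None]],
) -> str:
    """Run one chaos experiment and return a status string.

    Hypothetical helper for illustration, not part of ChaosToolkit.
    """
    # 1. Verify the steady state before injecting any fault.
    if not probe():
        return "failed"  # the system was not healthy to begin with
    try:
        # 2. Inject the faults (the experimental method).
        for action in actions:
            action()
        # 3. Verify the steady state again after the faults.
        return "completed" if probe() else "deviated"
    finally:
        # 4. Always attempt the rollbacks, whatever the outcome.
        for rollback in rollbacks:
            rollback()


# Toy system under test: a counter of running replicas.
state = {"replicas": 3}


def healthy() -> bool:
    return state["replicas"] >= 1


def kill_one() -> None:
    state["replicas"] -= 1


status = run_experiment(healthy, [kill_one], [])
print(status)
```

The point of the sketch is that the same probe runs twice: once to confirm the system is normal, and once to see whether the fault moved it away from normal.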

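2.2. Chaos Engineering Architecture

  • Before running the tool, let's look at its architecture.

  • Put simply, ChaosToolkit operates on your system under test through Drivers.
    Its features include the following:

    [Figure: ChaosToolkit architecture and feature overview]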

3. Installing ChaosToolkit

Environment: CentOS 7.8, k8s 1.19.5, a sample application

3.1. Installation commands

1. Install python3

sudo yum install python3 python3-venv

2. Install pipenv

pip3 install pipenv

3. Install chaostoolkit together with its k8s extension and reporting module

pip3 install -U chaostoolkit
pip3 install -U chaostoolkit-kubernetes
pip3 install -U chaostoolkit-reporting

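4. Create a virtual environment

python3 -m venv .bundler
source .bundler/bin/activate

To avoid affecting other environments, we work inside a Python virtual environment. Packages installed with pip3 before activation do not land in it, so if you want them isolated, activate the environment first and then run the pip3 install commands from step 3.

The installation above was performed on the k8s master node. If you are installing on a machine outside the cluster, configure the corresponding k8s context there; for details see: https://chaostoolkit.org/drivers/kubernetes/.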
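After installing, a quick way to confirm that the packages are visible to the active interpreter is to look up their import names. The sketch below assumes the three pip packages expose the modules `chaostoolkit`, `chaosk8s`, and `chaosreport`. Only `chaosk8s` appears in the tool's output later in this article, so treat the other two names as assumptions. The helper `missing_modules` is a hypothetical name.

```python
import importlib.util
from typing import Iterable, List


def missing_modules(names: Iterable[str]) -> List[str]:
    """Return the top-level module names that this interpreter cannot find."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# Assumed import names for the three pip packages installed above.
REQUIRED = ["chaostoolkit", "chaosk8s", "chaosreport"]

if __name__ == "__main__":
    missing = missing_modules(REQUIRED)
    if missing:
        print("Not importable in this environment:", ", ".join(missing))
    else:
        print("All ChaosToolkit modules are importable.")
```

Run it with the virtual environment activated; a non-empty list usually means the packages were installed into a different Python environment.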

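3.2. Using ChaosToolkit

1. chaos discover: discover capabilities

First run the discover command. chaostoolkit uses the cluster configuration in ~/.kube/config to generate a discovery.json file, which lists all the operations that can be performed against k8s. A successful run looks like this: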

(.bundler) [root@s5 chaostoolkit_scenarios]# chaos discover chaostoolkit-kubernetes
[2021-06-23 12:18:07 INFO] Attempting to download and install package 'chaostoolkit-kubernetes'
[2021-06-23 12:18:08 INFO] Package downloaded and installed in current environment
[2021-06-23 12:18:09 INFO] Discovering capabilities from chaostoolkit-kubernetes
[2021-06-23 12:18:09 INFO] Searching for actions in chaosk8s.actions
[2021-06-23 12:18:09 INFO] Searching for probes in chaosk8s.probes
[2021-06-23 12:18:09 INFO] Searching for actions in chaosk8s.deployment.actions
[2021-06-23 12:18:09 INFO] Searching for actions in chaosk8s.deployment.probes
[2021-06-23 12:18:09 INFO] Searching for actions in chaosk8s.node.actions
[2021-06-23 12:18:09 INFO] Searching for actions in chaosk8s.node.probes
[2021-06-23 12:18:09 INFO] Searching for actions in chaosk8s.pod.actions
[2021-06-23 12:18:09 INFO] Searching for probes in chaosk8s.pod.probes
[2021-06-23 12:18:09 INFO] Searching for actions in chaosk8s.replicaset.actions
[2021-06-23 12:18:09 INFO] Searching for actions in chaosk8s.service.actions
[2021-06-23 12:18:09 INFO] Searching for actions in chaosk8s.service.probes
[2021-06-23 12:18:09 INFO] Searching for actions in chaosk8s.statefulset.actions
[2021-06-23 12:18:09 INFO] Searching for probes in chaosk8s.statefulset.probes
[2021-06-23 12:18:09 INFO] Searching for actions in chaosk8s.crd.actions
[2021-06-23 12:18:09 INFO] Searching for probes in chaosk8s.crd.probes
[2021-06-23 12:18:09 INFO] Discovery outcome saved in ./discovery.json
(.bundler) [root@s5 chaostoolkit_scenarios]#
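Once discovery.json exists, it can be inspected with a few lines of Python. The sketch below groups activity names by the module that provides them. It assumes the file has a top-level "activities" list whose entries carry "name" and "mod" keys; that layout is an assumption, so check it against your own file. The `sample` data is hand-written for illustration, not real tool output, and `activities_by_module` is a hypothetical helper.

```python
import json
from collections import defaultdict
from typing import Dict, List


def activities_by_module(discovery: dict) -> Dict[str, List[str]]:
    """Group discovered activity names by the module that provides them.

    Assumes each entry under "activities" has "mod" and "name" keys.
    """
    grouped: Dict[str, List[str]] = defaultdict(list)
    for activity in discovery.get("activities", []):
        grouped[activity["mod"]].append(activity["name"])
    return dict(grouped)


# Hand-written sample mimicking the assumed layout (not real tool output).
sample = {
    "activities": [
        {"type": "action", "name": "delete_pods", "mod": "chaosk8s.pod.actions"},
        {"type": "action", "name": "terminate_pods", "mod": "chaosk8s.pod.actions"},
        {"type": "probe", "name": "count_pods", "mod": "chaosk8s.pod.probes"},
    ]
}
print(json.dumps(activities_by_module(sample), indent=2))
```

For the real file, load it with `json.load` and pass the resulting dictionary to the same function.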

2. chaos init: generate an experiment

Run the init command and follow the prompts to create a chaos experiment.

(.bundler) [root@s5 chaostoolkit_scenarios]# chaos init
You are about to create an experiment.
This wizard will walk you through each step so that you can build
the best experiment for your needs.

An experiment is made up of three elements:
- a steady-state hypothesis [OPTIONAL]
- an experimental method
- a set of rollback activities [OPTIONAL]

Only the method is required. Also your experiment will
not run unless you define at least one activity (probe or action)
within it
Experiment's title: E2 # set the experiment's name here

A steady state hypothesis defines what 'normality' looks like in your system
The steady state hypothesis is a collection of conditions that are used,
at the beginning of an experiment, to decide if the system is in a recognised
'normal' state. The steady state conditions are then used again when your experiment
is complete to detect where your system may have deviated in an interesting,
weakness-detecting way

Initially you may not know what your steady state hypothesis is
and so instead you might create an experiment without one
This is why the stead state hypothesis is optional.
Do you want to define a steady state hypothesis now? [y/N]: y # create a steady-state hypothesis; this is a key concept in chaos engineering, yet most other chaos tools have no such step
Hypothesis's title: H2

You may now define probes that will determine
the steady-state of your system.
Add an activity
1) all_microservices_healthy
2) deployment_is_fully_available
3) deployment_is_not_fully_available
4) microservice_available_and_healthy
5) microservice_is_not_available
6) read_microservices_logs
7) service_endpoint_is_initialized
8) count_pods
9) pod_is_not_available
10) pods_in_conditions
11) pods_in_phase
12) pods_not_in_phase
13) read_pod_logs
14) statefulset_fully_available
15) statefulset_not_fully_available
16) get_cluster_custom_object
17) get_custom_object
18) list_cluster_custom_objects
19) list_custom_objects
Activity (0 to escape): 1 # pick the probe that checks the steady-state hypothesis; put simply, this defines the expected result
!!!DEPRECATED!!!
Do you want to use this probe? [y/N]: y # confirm the probe selected above

A steady-state probe requires a tolerance value, within which
your system is in a reognised `normal` state.

What is the tolerance for this probe?: normal

You now need to fill the arguments for this activity. Default
values will be shown between brackets. You may simply press return
to use it or not set any value.
Argument's value for 'ns' [default]: chaosnamespace # the k8s namespace to operate on
Do you want to select another activity? [y/N]: y # whether to select another activity
Add an activity
1) all_microservices_healthy
2) deployment_is_fully_available
3) deployment_is_not_fully_available
4) microservice_available_and_healthy
5) microservice_is_not_available
6) read_microservices_logs
7) service_endpoint_is_initialized
8) count_pods
9) pod_is_not_available
10) pods_in_conditions
11) pods_in_phase
12) pods_not_in_phase
13) read_pod_logs
14) statefulset_fully_available
15) statefulset_not_fully_available
16) get_cluster_custom_object
17) get_custom_object
18) list_cluster_custom_objects
19) list_custom_objects
Activity (0 to escape): 1 # select the specific activity (here, a probe again)
!!!DEPRECATED!!!
Do you want to use this probe? [y/N]: y # confirm the activity selected above

You now need to fill the arguments for this activity. Default
values will be shown between brackets. You may simply press return
to use it or not set any value.
Argument's value for 'ns' [default]:
Do you want to select another activity? [y/N]: N # whether to add another activity; I stop here

An experiment's method contains actions and probes. Actions
vary real-world events in your system to determine if your
steady-state hypothesis is maintained when those events occur.

An experimental method can also contain probes to gather additional
information about your system as your method is executed.
Do you want to define an experimental method? [y/N]: y # define the experimental method
Add an activity
1) kill_microservice
2) remove_service_endpoint
3) scale_microservice
4) start_microservice
5) all_microservices_healthy
6) deployment_is_fully_available
7) deployment_is_not_fully_available
8) microservice_available_and_healthy
9) microservice_is_not_available
10) read_microservices_logs
11) service_endpoint_is_initialized
12) create_deployment
13) delete_deployment
14) scale_deployment
15) deployment_available_and_healthy
16) deployment_fully_available
17) deployment_not_fully_available
18) cordon_node
19) create_node
20) delete_nodes
21) drain_nodes
22) uncordon_node
23) get_nodes
24) delete_pods
25) exec_in_pods
26) terminate_pods
27) count_pods
28) pod_is_not_available
29) pods_in_conditions
30) pods_in_phase
31) pods_not_in_phase
32) read_pod_logs
33) delete_replica_set
34) create_service_endpoint
35) delete_service
36) service_is_initialized
37) create_statefulset
38) remove_statefulset
39) scale_statefulset
40) statefulset_fully_available
41) statefulset_not_fully_available
42) create_cluster_custom_object
43) create_custom_object
44) delete_cluster_custom_object
45) delete_custom_object
46) patch_cluster_custom_object
47) patch_custom_object
48) replace_cluster_custom_object
49) replace_custom_object
50) get_cluster_custom_object
51) get_custom_object
52) list_cluster_custom_objects
53) list_custom_objects
Activity (0 to escape): 24 # here I choose activity 24, delete_pods, which deletes pods
!!!DEPRECATED!!!
Do you want to use this action? [y/N]: y # confirm the choice

You now need to fill the arguments for this activity. Default
values will be shown between brackets. You may simply press return
to use it or not set any value.
Argument's value for 'name': DeleteRedisPOD # I use this as a name for the step; note that 'name' is what the default label_selector below substitutes for {name}
Argument's value for 'ns' [default]: chaosnamespace # the k8s namespace to operate on
Argument's value for 'label_selector' [name in ({name})]: app=redis # the label of the target objects, so that they can be found
Do you want to select another activity? [y/N]: N # whether to add another activity; I stop here

An experiment may optionally define a set of remedial actions
that are used to rollback the system to a given state.
Do you want to add some rollbacks now? [y/N]: N # whether to add rollbacks; I am deleting the redis pods and k8s recreates them automatically, so no rollback is needed

Experiment created and saved in './experiment.json' # the experiment file has been generated
(.bundler) [root@s5 chaostoolkit_scenarios]#
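The wizard's answers end up as a declarative experiment file. As a rough sketch of that shape, the Python below assembles a comparable structure and prints it as JSON. It is an approximation under stated assumptions, not a copy of the file the wizard wrote: `build_experiment` is a hypothetical helper, only the namespace and label selector arguments are carried over, and the tolerance is passed through exactly as typed. Compare against your own experiment.json before relying on any field.

```python
import json


def build_experiment(title, hypothesis_title, namespace, label_selector, tolerance):
    """Assemble an experiment shaped like the answers given to the wizard."""
    probe = {
        "type": "probe",
        "name": "all_microservices_healthy",
        "tolerance": tolerance,
        "provider": {
            "type": "python",
            "module": "chaosk8s.probes",
            "func": "all_microservices_healthy",
            "arguments": {"ns": namespace},
        },
    }
    action = {
        "type": "action",
        "name": "delete_pods",
        "provider": {
            "type": "python",
            "module": "chaosk8s.pod.actions",
            "func": "delete_pods",
            "arguments": {"ns": namespace, "label_selector": label_selector},
        },
    }
    return {
        "title": title,
        "description": "N/A",
        "steady-state-hypothesis": {"title": hypothesis_title, "probes": [probe]},
        "method": [action],
        "rollbacks": [],  # none declared: k8s recreates the deleted pods
    }


experiment = build_experiment("E2", "H2", "chaosnamespace", "app=redis", "normal")
print(json.dumps(experiment, indent=2))
```

Generating the file from code like this can be handy when the same experiment has to be repeated across several namespaces or label selectors.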

3. chaos run: execute the experiment

(.bundler) [root@s5 chaostoolkit_scenarios]# chaos run experiment.json
[2021-06-28 23:03:23 INFO] Validating the experiment's syntax
[2021-06-28 23:03:24 INFO] Experiment looks valid
[2021-06-28 23:03:24 INFO] Running experiment: E2
[2021-06-28 23:03:24 INFO] Steady-state strategy: default
[2021-06-28 23:03:24 INFO] Rollbacks strategy: default
[2021-06-28 23:03:24 INFO] Steady state hypothesis: H2
[2021-06-28 23:03:24 INFO] Probe: all_microservices_healthy
[2021-06-28 23:03:24 WARNING] all_microservices_healthy function is DEPRECATED and will be removed in the next         releases, please use all_pods_healthy instead
[2021-06-28 23:03:24 INFO] Steady state hypothesis is met!
[2021-06-28 23:03:24 INFO] Playing your experiment's method now...
[2021-06-28 23:03:24 INFO] Action: delete_pods
[2021-06-28 23:03:24 INFO] Steady state hypothesis: H2
[2021-06-28 23:03:24 INFO] Probe: all_microservices_healthy
[2021-06-28 23:03:24 WARNING] all_microservices_healthy function is DEPRECATED and will be removed in the next         releases, please use all_pods_healthy instead
[2021-06-28 23:03:24 INFO] Steady state hypothesis is met!
[2021-06-28 23:03:24 INFO] Let's rollback...
[2021-06-28 23:03:24 INFO] No declared rollbacks, let's move on.
[2021-06-28 23:03:24 INFO] Experiment ended with status: completed
(.bundler) [root@s5 chaostoolkit_scenarios]#
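If you want to act on a run automatically, for example in a script that repeats the experiment, the console output can be summarized with a small parser. The sketch below relies only on the message texts visible in the log above. `summarize_run` is a hypothetical helper, and the sample lines are abridged from that log.

```python
import re
from typing import Dict, List, Optional

STATUS_RE = re.compile(r"Experiment ended with status: (\w+)")


def summarize_run(lines: List[str]) -> Dict[str, object]:
    """Extract the final status and hypothesis checks from `chaos run` output."""
    status: Optional[str] = None
    met = 0
    warnings = 0
    for line in lines:
        match = STATUS_RE.search(line)
        if match:
            status = match.group(1)
        if "Steady state hypothesis is met!" in line:
            met += 1
        if " WARNING] " in line:
            warnings += 1
    return {"status": status, "hypothesis_met": met, "warnings": warnings}


# Lines abridged from the run shown above.
log = """\
[2021-06-28 23:03:24 INFO] Steady state hypothesis is met!
[2021-06-28 23:03:24 INFO] Action: delete_pods
[2021-06-28 23:03:24 WARNING] all_microservices_healthy function is DEPRECATED
[2021-06-28 23:03:24 INFO] Steady state hypothesis is met!
[2021-06-28 23:03:24 INFO] Experiment ended with status: completed
""".splitlines()

print(summarize_run(log))
```

A hypothesis count of 2 means the steady state held both before and after the action; the warning count is a reminder that the deprecated probe should eventually be replaced, as the log itself suggests.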

4. Check the results

Before the experiment:
[root@s5 ~]# kubectl get pods -n chaosnamespace -o wide
NAME                                   READY   STATUS    RESTARTS   AGE     IP               NODE   NOMINATED NODE   READINESS GATES
...........................
redis-master-b96c9795b-nqzmr           1/1     Running   0          3d9h    10.100.220.84    s6     <none>           <none>
redis-slave-6b8d456947-6r42k           1/1     Running   0          3d9h    10.100.220.86    s6     <none>           <none>
redis-slave-6b8d456947-z55m5           1/1     Running   0          3d9h    10.100.53.206    s7     <none>           <none>

After the experiment:
[root@s5 ~]# kubectl get pods -n chaosnamespace -o wide
NAME                                   READY   STATUS              RESTARTS   AGE     IP               NODE   NOMINATED NODE   READINESS GATES
...............................
redis-master-b96c9795b-92rc6           0/1     ContainerCreating   0          3s      <none>           s6     <none>           <none>
redis-master-b96c9795b-nqzmr           0/1     Terminating         0          3d9h    10.100.220.84    s6     <none>           <none>
redis-slave-6b8d456947-5m2xt           0/1     ContainerCreating   0          2s      <none>           s6     <none>           <none>
redis-slave-6b8d456947-6r42k           1/1     Terminating         0          3d9h    10.100.220.86    s6     <none>           <none>
redis-slave-6b8d456947-fj4xc           0/1     ContainerCreating   0          3s      <none>           s7     <none>           <none>
redis-slave-6b8d456947-z55m5           1/1     Terminating         0          3d9h    10.100.53.206    s7     <none>           <none>

After the pods have fully started:
[root@s5 ~]# kubectl get pods -n chaosnamespace -o wide
NAME                                   READY   STATUS    RESTARTS   AGE     IP               NODE   NOMINATED NODE   READINESS GATES
.......................
redis-master-b96c9795b-92rc6           1/1     Running   0          5m43s   10.100.220.89    s6     <none>           <none>
redis-slave-6b8d456947-5m2xt           1/1     Running   0          5m42s   10.100.220.90    s6     <none>           <none>
redis-slave-6b8d456947-fj4xc           1/1     Running   0          5m43s   10.100.53.211    s7     <none>           <none>
[root@s5 ~]#

The results above show that the experiment succeeded: all of the redis pods were killed and then recreated by k8s.
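The same check can be done mechanically by comparing pod names before and after the experiment: if no name survives, every pod was replaced. The sketch below uses the pod names from the listings above, with the columns trimmed for brevity. `pod_names` is a hypothetical helper that simply takes the first column of each non-header row.

```python
from typing import Set


def pod_names(kubectl_output: str) -> Set[str]:
    """Return the pod names from `kubectl get pods` text output."""
    names = set()
    for line in kubectl_output.splitlines():
        fields = line.split()
        # Skip blank lines and the header row; keep the first column.
        if fields and fields[0] != "NAME":
            names.add(fields[0])
    return names


before = """\
NAME                           READY   STATUS    RESTARTS   AGE
redis-master-b96c9795b-nqzmr   1/1     Running   0          3d9h
redis-slave-6b8d456947-6r42k   1/1     Running   0          3d9h
redis-slave-6b8d456947-z55m5   1/1     Running   0          3d9h
"""
after = """\
NAME                           READY   STATUS    RESTARTS   AGE
redis-master-b96c9795b-92rc6   1/1     Running   0          5m43s
redis-slave-6b8d456947-5m2xt   1/1     Running   0          5m42s
redis-slave-6b8d456947-fj4xc   1/1     Running   0          5m43s
"""

survivors = pod_names(before) & pod_names(after)
print("pods that survived:", sorted(survivors))
print("all pods replaced:", not survivors)
```

Because the ReplicaSet hash in each name stays the same while the random suffix changes, an empty intersection confirms that the pods were recreated rather than left untouched.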
