Health Checks for Spring Boot 2 Applications in OpenShift Pods

1. Prepare the test Spring Boot project. It requires a Java 8 JDK or greater and Maven 3.3.x or greater.

git clone https://github.com/megadotnet/Openshift-healthcheck-demo.git

This article assumes you are familiar with basic Java application development and that your OpenShift container platform is already deployed. Our test project depends on Spring Boot Actuator 2, which brings the following new features:

  • Support for Jersey RESTful web services
  • Support for the reactive WebFlux web stack
  • New endpoint mappings
  • Simplified creation of user-defined endpoints
  • Improved endpoint security

Actuator provides 13 built-in endpoints, including health, info, metrics, env, beans, loggers, and mappings.

For security, Spring Boot 2.x Actuator exposes only two endpoints over HTTP by default, /actuator/health and /actuator/info; the others can be switched on in the configuration file.
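The switch mentioned above is Actuator's exposure property. A minimal application.properties sketch (these are the standard Spring Boot 2 property names; adjust to your project):

```properties
# Expose only health and info over HTTP (the Spring Boot 2 default)
management.endpoints.web.exposure.include=health,info

# Or expose everything (not recommended outside development)
# management.endpoints.web.exposure.include=*

# Include component details in the health response
management.endpoint.health.show-details=always
```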

Deploying the compiled jar to the OpenShift container platform

The OpenShift deployment process is summarized below (a binary build is used here):
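The console output that follows came from a flow along these lines (a sketch, not the exact commands from the original post; the build name health-demo matches this demo, but your builder image, project, and registry will differ):

```shell
# Package the fat jar locally
mvn clean package -DskipTests

# Create a binary build against the s2i-java builder image
oc new-build --image-stream=openshift/s2i-java:latest --binary=true --name=health-demo

# Upload the local artifacts and start the build
mkdir -p oc-build && cp target/*.jar oc-build/
oc start-build health-demo --from-dir=oc-build --follow

# Deploy the resulting image and expose a route
oc new-app health-demo
oc expose svc/health-demo
```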

--> Found image dc046fe (16 months old) in image stream "openshift/s2i-java" under tag "latest" for "s2i-java:latest"

Java S2I builder 1.0

--------------------

Platform for building Java (fatjar) applications with maven or gradle

Tags: builder, maven-3, gradle-2.6, java, microservices, fatjar

* A source build using binary input will be created

* The resulting image will be pushed to image stream "health-demo:latest"

* A binary build was created, use 'start-build --from-dir' to trigger a new build

--> Creating resources with label app=health-demo ...

imagestream "health-demo" created

buildconfig "health-demo" created

--> Success

Uploading directory "oc-build" as binary input for the build ...

build "health-demo-1" started

--> Found image fb46616 (5 minutes old) in image stream "hshreport-stage/health-demo" under tag "latest" for "health-demo:latest"

Java S2I builder 1.0

--------------------

Platform for building Java (fatjar) applications with maven or gradle

Tags: builder, maven-3, gradle-2.6, java, microservices, fatjar

* This image will be deployed in deployment config "health-demo"

* Ports 7575/tcp, 8080/tcp will be load balanced by service "health-demo"

* Other containers can access this service through the hostname "health-demo"

--> Creating resources with label app=health-demo ...

deploymentconfig "health-demo" created

service "health-demo" created

--> Success

Run 'oc status' to view your app.

route "health-demo" exposed

In the final step above, the route is exposed so that we can reach the application for the demo.

Demo walkthrough:

Round 1

Edit the YAML of the deploymentConfig we just deployed and add a readiness probe, as follows:

---
readinessProbe:
   failureThreshold: 3
   httpGet:
     path: /actuator/health
     port: 8080
     scheme: HTTP
   initialDelaySeconds: 10
   periodSeconds: 10
   successThreshold: 1
   timeoutSeconds: 1

The parameters above deserve a brief explanation:

  • initialDelaySeconds: how many seconds to wait after the container starts before running the first probe.
  • periodSeconds: how often to probe. Default 10 seconds; minimum 1 second.
  • timeoutSeconds: probe timeout. Default 1 second; minimum 1 second.
  • successThreshold: after a failure, the minimum number of consecutive successes for the probe to be considered successful. Default 1; must be 1 for liveness; minimum 1.
  • failureThreshold: after a success, the minimum number of consecutive failures for the probe to be considered failed. Default 3; minimum 1.
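As a quick sanity check on these numbers: with the readiness settings above, a pod that never responds is marked NotReady after at most initialDelaySeconds plus periodSeconds times failureThreshold seconds (a back-of-the-envelope bound that ignores per-probe timeouts):

```shell
# Worst-case seconds before a never-ready pod is marked NotReady,
# using the readiness settings shown above
initial=10    # initialDelaySeconds
period=10     # periodSeconds
failures=3    # failureThreshold
worst_case=$((initial + period * failures))
echo "$worst_case"
```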

Alternatively, configure readiness with the OpenShift CLI:

oc set probe dc/app-cli \
  --readiness \
  --get-url=http://:8080/notreal \
  --initial-delay-seconds=5

$ oc get pod -w

# after editing the deploymentConfig, the pod was redeployed

NAME READY                 STATUS   RESTARTS  AGE

health-demo-1-build 0/1 Completed    0        16m

health-demo-2-sqh4z 1/1 Running      0        11m

Call the HTTP API to stop Tomcat: curl http://${value-name-app}-MY_PROJECT_NAME.LOCAL_OPENSHIFT_HOSTNAME/api/stop. Note that this URL depends on the DNS environment of your deployment.

The application log shows:

Stopping Tomcat context.

2020-01-11 22:17:21.004 INFO 1 --- [nio-8080-exec-9] o.apache.catalina.core.StandardWrapper : Waiting for [1] instance(s) to be deallocated for Servlet [dispatcherServlet]

2020-01-11 22:17:22.008 INFO 1 --- [nio-8080-exec-9] o.apache.catalina.core.StandardWrapper : Waiting for [1] instance(s) to be deallocated for Servlet [dispatcherServlet]

2020-01-11 22:17:23.012 INFO 1 --- [nio-8080-exec-9] o.apache.catalina.core.StandardWrapper : Waiting for [1] instance(s) to be deallocated for Servlet [dispatcherServlet]

2020-01-11 22:17:23.114 INFO 1 --- [nio-8080-exec-9] o.a.c.c.C.[Tomcat].[localhost].[/] : Destroying Spring FrameworkServlet 'dispatcherServlet'

# watch the pod

$ oc get pod -w

NAME READY STATUS RESTARTS AGE

health-demo-1-build 0/1 Completed 0 16m

health-demo-2-sqh4z 1/1 Running 0 11m

health-demo-2-sqh4z 0/1 Running 0 13m

# inspect the pod's details

$ oc describe pod/health-demo-2-sqh4z

Name: health-demo-2-sqh4z

Namespace: hshreport-stage

Security Policy: restricted

Node: openshift-lb-02.hsh.io/10.108.78.145

Start Time: Sat, 11 Jan 2020 22:08:59 +0800

Labels: app=health-demo

deployment=health-demo-2

deploymentconfig=health-demo

Annotations: kubernetes.io/created-by={"kind":"SerializedReference","apiVersion":"v1","reference":{"kind":"ReplicationController","namespace":"hshreport-stage","name":"health-demo-2","uid":"e6436263-347b-11ea-856c...

openshift.io/deployment-config.latest-version=2

openshift.io/deployment-config.name=health-demo

openshift.io/deployment.name=health-demo-2

openshift.io/generated-by=OpenShiftNewApp

openshift.io/scc=restricted

Status: Running

IP: 10.131.5.124

Controllers: ReplicationController/health-demo-2

Containers:

health-demo:

Container ID: docker://25cdf63f55d839610287b4e2a3cc67182377bfe5010990357f83329286c7e64f

Image: docker-registry.default.svc:5000/hshreport-stage/health-demo@sha256:292f09b7d9ca9bc12560febe3f4ba73e50b3c1a5701cbd55689186e844157fb6

Image ID: docker-pullable://docker-registry.default.svc:5000/hshreport-stage/health-demo@sha256:292f09b7d9ca9bc12560febe3f4ba73e50b3c1a5701cbd55689186e844157fb6

Ports: 7575/TCP, 8080/TCP

State: Running

Started: Sat, 11 Jan 2020 22:09:09 +0800

Ready: False

Restart Count: 0

Readiness: http-get http://:8080/actuator/health delay=10s timeout=1s period=10s #success=1 #failure=3

Environment:

APP_OPTIONS: -Xmx512m -Xss512k -Djava.net.preferIPv4Stack=true -Dfile.encoding=utf-8

DEPLOYER: liu.xxxxx (Administrator) (cicd-1.1.24)

REVISION:

SPRING_PROFILES_ACTIVE: stage

TZ: Asia/Shanghai

Mounts:

/var/run/secrets/kubernetes.io/serviceaccount from default-token-n4klp (ro)

Conditions:

Type Status

Initialized True

Ready False

PodScheduled True

Volumes:

default-token-n4klp:

Type: Secret (a volume populated by a Secret)

SecretName: default-token-n4klp

Optional: false

QoS Class: BestEffort

Node-Selectors: region=primary

Tolerations: <none>

Events:

FirstSeen LastSeen Count From SubObjectPath Type Reason Message

--------- -------- ----- ---- ------------- -------- ------ -------

16m 16m 1 default-scheduler Normal Scheduled Successfully assigned health-demo-2-sqh4z to openshift-lb-02.hsh.io

16m 16m 1 kubelet, openshift-lb-02.hsh.io spec.containers{health-demo} Normal Pulling pulling image "docker-registry.default.svc:5000/hshreport-stage/health-demo@sha256:292f09b7d9ca9bc12560febe3f4ba73e50b3c1a5701cbd55689186e844157fb6"

16m 16m 1 kubelet, openshift-lb-02.hsh.io spec.containers{health-demo} Normal Pulled Successfully pulled image "docker-registry.default.svc:5000/hshreport-stage/health-demo@sha256:292f09b7d9ca9bc12560febe3f4ba73e50b3c1a5701cbd55689186e844157fb6"

15m 15m 1 kubelet, openshift-lb-02.hsh.io spec.containers{health-demo} Normal Created Created container

15m 15m 1 kubelet, openshift-lb-02.hsh.io spec.containers{health-demo} Normal Started Started container

15m 15m 1 kubelet, openshift-lb-02.hsh.io spec.containers{health-demo} Warning Unhealthy Readiness probe failed: Get http://10.131.5.124:8080/actuator/health: dial tcp 10.131.5.124:8080: getsockopt: connection refused

7m 5m 16 kubelet, openshift-lb-02.hsh.io spec.containers{health-demo} Warning Unhealthy Readiness probe failed: HTTP probe failed with statuscode: 404

Note the Warning events above: the pod was not restarted, because we configured only a readiness probe.

Round 2

Now add a liveness health check to the application deployed earlier. /actuator/health is the default health check endpoint of the Spring Boot 2 sample project.
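If the pod's port 8080 is reachable, the endpoint can be checked directly; a sketch using port forwarding (the pod name here is the one from Round 1, so substitute your own) shows the default response of a healthy Spring Boot 2 app:

```shell
# Forward local port 8080 to the pod (pod name from this demo; substitute yours)
oc port-forward health-demo-2-sqh4z 8080:8080 &

# Query the default health endpoint; a healthy app reports status UP
curl -s http://localhost:8080/actuator/health
# {"status":"UP"}
```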

Edit the deploymentConfig to add both readiness and liveness probes:

---
livenessProbe:
   failureThreshold: 3
   httpGet:
     path: /actuator/health
     port: 8080
     scheme: HTTP
   initialDelaySeconds: 60
   periodSeconds: 10
   successThreshold: 1
   timeoutSeconds: 1
name: health-demo
ports:
   -
     containerPort: 7575
     protocol: TCP
   -
     containerPort: 8080
     protocol: TCP
readinessProbe:
   failureThreshold: 3
   httpGet:
     path: /actuator/health
     port: 8080
     scheme: HTTP
   initialDelaySeconds: 10
   periodSeconds: 10
   successThreshold: 1
   timeoutSeconds: 1

The probes can also be configured through the web console.

The web UI fields correspond exactly to the parameters in the YAML.

With the oc CLI: # configure liveness/readiness probes on DCs

oc set probe dc cotd1 --liveness -- echo ok

oc set probe dc/cotd1 --readiness --get-url=http://:8080/index.php --initial-delay-seconds=2

A TCP example:

oc set probe dc/blog --readiness --liveness --open-tcp 8080

Removing probes:

$ oc set probe dc/blog --readiness --liveness --remove

After calling stop, the pod is still running, but a request through the router now returns in the browser:

Application is not available

After stopping Tomcat via the URL, part of the application log in the pod reads:

Stopping Tomcat context.

2020-01-11 22:17:21.004 INFO 1 --- [nio-8080-exec-9] o.apache.catalina.core.StandardWrapper : Waiting for [1] instance(s) to be deallocated for Servlet [dispatcherServlet]

2020-01-11 22:17:22.008 INFO 1 --- [nio-8080-exec-9] o.apache.catalina.core.StandardWrapper : Waiting for [1] instance(s) to be deallocated for Servlet [dispatcherServlet]

2020-01-11 22:17:23.012 INFO 1 --- [nio-8080-exec-9] o.apache.catalina.core.StandardWrapper : Waiting for [1] instance(s) to be deallocated for Servlet [dispatcherServlet]

2020-01-11 22:17:23.114 INFO 1 --- [nio-8080-exec-9] o.a.c.c.C.[Tomcat].[localhost].[/] : Destroying Spring FrameworkServlet 'dispatcherServlet'

A moment later, watch the pod:

$ oc get pod -w

NAME READY STATUS RESTARTS AGE

health-demo-1-build 0/1 Completed 0 33m

health-demo-3-02v11 1/1 Running 0 5m

health-demo-3-02v11 0/1 Running 0 7m

health-demo-3-02v11 0/1 Running 1 7m

$ oc get pod

NAME READY STATUS RESTARTS AGE

health-demo-1-build 0/1 Completed 0 36m

health-demo-3-02v11 1/1 Running 1 8m

Now request: curl http://${value-name-app}-MY_PROJECT_NAME.LOCAL_OPENSHIFT_HOSTNAME/api/greeting?name=s2i

The browser then displays:

{"content":"Hello, s2i!"} (the recovery took 41.783 seconds)

The pod has now been restarted; the application log shows:

2020-01-11 22:35:13.597 INFO 1 --- [ main] s.b.a.e.w.s.WebMvcEndpointHandlerMapping : Mapped "{[/actuator],methods=[GET],produces=[application/vnd.spring-boot.actuator.v2+json || application/json]}" onto protected java.util.Map<java.lang.String, java.util.Map<java.lang.String, org.springframework.boot.actuate.endpoint.web.Link>> org.springframework.boot.actuate.endpoint.web.servlet.WebMvcEndpointHandlerMapping.links(javax.servlet.http.HttpServletRequest,javax.servlet.http.HttpServletResponse)

2020-01-11 22:35:13.750 INFO 1 --- [ main] o.s.j.e.a.AnnotationMBeanExporter : Registering beans for JMX exposure on startup

2020-01-11 22:35:13.873 INFO 1 --- [ main] o.s.b.w.embedded.tomcat.TomcatWebServer : Tomcat started on port(s): 8080 (http) with context path ''

2020-01-11 22:35:13.882 INFO 1 --- [ main] dev.snowdrop.example.ExampleApplication : Started ExampleApplication in 8.061 seconds (JVM running for 9.682)

2020-01-11 22:35:22.445 INFO 1 --- [nio-8080-exec-1] o.a.c.c.C.[Tomcat].[localhost].[/] : Initializing Spring FrameworkServlet 'dispatcherServlet'

2020-01-11 22:35:22.445 INFO 1 --- [nio-8080-exec-1] o.s.web.servlet.DispatcherServlet : FrameworkServlet 'dispatcherServlet': initialization started

2020-01-11 22:35:22.485 INFO 1 --- [nio-8080-exec-1] o.s.web.servlet.DispatcherServlet : FrameworkServlet 'dispatcherServlet': initialization completed in 39 ms

$ oc describe pod/health-demo-3-02v11

Name: health-demo-3-02v11

Namespace: hshreport-stage

Security Policy: restricted

Node: openshift-node-04.hsh.io/10.108.78.139

Start Time: Sat, 11 Jan 2020 22:32:12 +0800

Labels: app=health-demo

deployment=health-demo-3

deploymentconfig=health-demo

Annotations: kubernetes.io/created-by={"kind":"SerializedReference","apiVersion":"v1","reference":{"kind":"ReplicationController","namespace":"hshreport-stage","name":"health-demo-3","uid":"23ad2f21-347f-11ea-856c...

openshift.io/deployment-config.latest-version=3

openshift.io/deployment-config.name=health-demo

openshift.io/deployment.name=health-demo-3

openshift.io/generated-by=OpenShiftNewApp

openshift.io/scc=restricted

Status: Running

IP: 10.129.5.178

Controllers: ReplicationController/health-demo-3

Containers:

health-demo:

Container ID: docker://3e5a6b081022c914d8e118dce829294570e54f441b84394a2b13f6eebb4f5c74

Image: docker-registry.default.svc:5000/hshreport-stage/health-demo@sha256:292f09b7d9ca9bc12560febe3f4ba73e50b3c1a5701cbd55689186e844157fb6

Image ID: docker-pullable://docker-registry.default.svc:5000/hshreport-stage/health-demo@sha256:292f09b7d9ca9bc12560febe3f4ba73e50b3c1a5701cbd55689186e844157fb6

Ports: 7575/TCP, 8080/TCP

State: Running

Started: Sat, 11 Jan 2020 22:35:04 +0800

Last State: Terminated

Reason: Error

Exit Code: 143

Started: Sat, 11 Jan 2020 22:32:15 +0800

Finished: Sat, 11 Jan 2020 22:35:03 +0800

Ready: True

Restart Count: 1

Liveness: http-get http://:8080/actuator/health delay=60s timeout=1s period=10s #success=1 #failure=3

Readiness: http-get http://:8080/actuator/health delay=10s timeout=1s period=10s #success=1 #failure=3

Environment:

APP_OPTIONS: -Xmx512m -Xss512k -Djava.net.preferIPv4Stack=true -Dfile.encoding=utf-8

DEPLOYER: liu.xxxxxx(Administrator) (cicd-1.1.24)

REVISION:

SPRING_PROFILES_ACTIVE: stage

TZ: Asia/Shanghai

Mounts:

/var/run/secrets/kubernetes.io/serviceaccount from default-token-n4klp (ro)

Conditions:

Type Status

Initialized True

Ready True

PodScheduled True

Volumes:

default-token-n4klp:

Type: Secret (a volume populated by a Secret)

SecretName: default-token-n4klp

Optional: false

QoS Class: BestEffort

Node-Selectors: region=primary

Tolerations: <none>

Events:

FirstSeen LastSeen Count From SubObjectPath Type Reason Message

--------- -------- ----- ---- ------------- -------- ------ -------

17m 17m 1 default-scheduler Normal Scheduled Successfully assigned health-demo-3-02v11 to openshift-node-04.hsh.io

15m 14m 3 kubelet, openshift-node-04.hsh.io spec.containers{health-demo} Warning Unhealthy Liveness probe failed: HTTP probe failed with statuscode: 404

15m 14m 3 kubelet, openshift-node-04.hsh.io spec.containers{health-demo} Warning Unhealthy Readiness probe failed: HTTP probe failed with statuscode: 404

17m 14m 2 kubelet, openshift-node-04.hsh.io spec.containers{health-demo} Normal Pulling pulling image "docker-registry.default.svc:5000/hshreport-stage/health-demo@sha256:292f09b7d9ca9bc12560febe3f4ba73e50b3c1a5701cbd55689186e844157fb6"

17m 14m 2 kubelet, openshift-node-04.hsh.io spec.containers{health-demo} Normal Pulled Successfully pulled image "docker-registry.default.svc:5000/hshreport-stage/health-demo@sha256:292f09b7d9ca9bc12560febe3f4ba73e50b3c1a5701cbd55689186e844157fb6"

17m 14m 2 kubelet, openshift-node-04.hsh.io spec.containers{health-demo} Normal Created Created container

14m 14m 1 kubelet, openshift-node-04.hsh.io spec.containers{health-demo} Normal Killing Killing container with id docker://health-demo:pod "health-demo-3-02v11_hshreport-stage(27e5a1da-347f-11ea-856c-0050568d3d78)" container "health-demo" is unhealthy, it will be killed and re-created.

17m 14m 2 kubelet, openshift-node-04.hsh.io spec.containers{health-demo} Normal Started Started container

We call the stop HTTP API a second time:

$ oc get pod -w

NAME READY STATUS RESTARTS AGE

health-demo-1-build 0/1 Completed 0 47m

health-demo-3-02v11 1/1 Running 1 19m

health-demo-3-02v11 0/1 Running 1 19m

health-demo-3-02v11 0/1 Running 2 20m

health-demo-3-02v11 1/1 Running 2 20m

$ oc get pod

NAME READY STATUS RESTARTS AGE

health-demo-1-build 0/1 Completed 0 49m

health-demo-3-02v11 1/1 Running 2 21

The HTTP request returns:

{"content":"Hello, s2i!"} (the recovery took 51.984 seconds)

$ oc describe pod/health-demo-3-02v11

Name: health-demo-3-02v11

Namespace: hshreport-stage

Security Policy: restricted

Node: openshift-node-04.hsh.io/10.108.78.139

Start Time: Sat, 11 Jan 2020 22:32:12 +0800

Labels: app=health-demo

deployment=health-demo-3

deploymentconfig=health-demo

Annotations: kubernetes.io/created-by={"kind":"SerializedReference","apiVersion":"v1","reference":{"kind":"ReplicationController","namespace":"hshreport-stage","name":"health-demo-3","uid":"23ad2f21-347f-11ea-856c...

openshift.io/deployment-config.latest-version=3

openshift.io/deployment-config.name=health-demo

openshift.io/deployment.name=health-demo-3

openshift.io/generated-by=OpenShiftNewApp

openshift.io/scc=restricted

Status: Running

IP: 10.129.5.178

Controllers: ReplicationController/health-demo-3

Containers:

health-demo:

Container ID: docker://e12d1975aa26b07643ae1666ae6bce7ceab4f25fb4c6c947427ba526ad6fdf7b

Image: docker-registry.default.svc:5000/hshreport-stage/health-demo@sha256:292f09b7d9ca9bc12560febe3f4ba73e50b3c1a5701cbd55689186e844157fb6

Image ID: docker-pullable://docker-registry.default.svc:5000/hshreport-stage/health-demo@sha256:292f09b7d9ca9bc12560febe3f4ba73e50b3c1a5701cbd55689186e844157fb6

Ports: 7575/TCP, 8080/TCP

State: Running

Started: Sat, 11 Jan 2020 22:47:14 +0800

Last State: Terminated

Reason: Error

Exit Code: 143

Started: Sat, 11 Jan 2020 22:35:04 +0800

Finished: Sat, 11 Jan 2020 22:47:02 +0800

Ready: True

Restart Count: 2

Liveness: http-get http://:8080/actuator/health delay=60s timeout=1s period=10s #success=1 #failure=3

Readiness: http-get http://:8080/actuator/health delay=10s timeout=1s period=10s #success=1 #failure=3

Environment:

APP_OPTIONS: -Xmx512m -Xss512k -Djava.net.preferIPv4Stack=true -Dfile.encoding=utf-8

DEPLOYER: liu.xxxxx (Administrator) (cicd-1.1.24)

REVISION:

SPRING_PROFILES_ACTIVE: stage

TZ: Asia/Shanghai

Mounts:

/var/run/secrets/kubernetes.io/serviceaccount from default-token-n4klp (ro)

Conditions:

Type Status

Initialized True

Ready True

PodScheduled True

Volumes:

default-token-n4klp:

Type: Secret (a volume populated by a Secret)

SecretName: default-token-n4klp

Optional: false

QoS Class: BestEffort

Node-Selectors: region=primary

Tolerations: <none>

Events:

FirstSeen LastSeen Count From SubObjectPath Type Reason Message

--------- -------- ----- ---- ------------- -------- ------ -------

21m 21m 1 default-scheduler Normal Scheduled Successfully assigned health-demo-3-02v11 to openshift-node-04.hsh.io

21m 6m 3 kubelet, openshift-node-04.hsh.io spec.containers{health-demo} Normal Pulling pulling image "docker-registry.default.svc:5000/hshreport-stage/health-demo@sha256:292f09b7d9ca9bc12560febe3f4ba73e50b3c1a5701cbd55689186e844157fb6"

19m 6m 6 kubelet, openshift-node-04.hsh.io spec.containers{health-demo} Warning Unhealthy Liveness probe failed: HTTP probe failed with statuscode: 404

19m 6m 6 kubelet, openshift-node-04.hsh.io spec.containers{health-demo} Warning Unhealthy Readiness probe failed: HTTP probe failed with statuscode: 404

18m 6m 2 kubelet, openshift-node-04.hsh.io spec.containers{health-demo} Normal Killing Killing container with id docker://health-demo:pod "health-demo-3-02v11_hshreport-stage(27e5a1da-347f-11ea-856c-0050568d3d78)" container "health-demo" is unhealthy, it will be killed and re-created.

21m 6m 3 kubelet, openshift-node-04.hsh.io spec.containers{health-demo} Normal Pulled Successfully pulled image "docker-registry.default.svc:5000/hshreport-stage/health-demo@sha256:292f09b7d9ca9bc12560febe3f4ba73e50b3c1a5701cbd55689186e844157fb6"

21m 6m 3 kubelet, openshift-node-04.hsh.io spec.containers{health-demo} Normal Created Created container

21m 6m 3 kubelet, openshift-node-04.hsh.io spec.containers{health-demo} Normal Started Started container

# check the events

$ oc get ev

LASTSEEN FIRSTSEEN COUNT NAME KIND SUBOBJECT TYPE REASON SOURCE MESSAGE

29m 41m 6 health-demo-3-02v11 Pod spec.containers{health-demo} Warning Unhealthy kubelet, openshift-node-04.hsh.io Liveness probe failed: HTTP probe failed with statuscode: 404

29m 41m 6 health-demo-3-02v11 Pod spec.containers{health-demo} Warning Unhealthy kubelet, openshift-node-04.hsh.io Readiness probe failed: HTTP probe failed with statuscode: 404

29m 41m 2 health-demo-3-02v11 Pod spec.containers{health-demo} Normal Killing kubelet, openshift-node-04.hsh.io Killing container with id docker://health-demo:pod "health-demo-3-02v11_hshreport-stage(27e5a1da-347f-11ea-856c-0050568d3d78)" container "health-demo" is unhealthy, it will be killed and re-created.

19m 19m 1 health-demo-3-02v11 Pod spec.containers{health-demo} Normal Killing kubelet, openshift-node-04.hsh.io Killing container with id docker://health-demo:Need to kill Pod

44m 44m 1 health-demo-3-deploy Pod Normal Scheduled default-scheduler Successfully assigned health-demo-3-deploy to openshift-lb-02.hsh.io

44m 44m 1 health-demo-3-deploy Pod spec.containers{deployment} Normal Pulled kubelet, openshift-lb-02.hsh.io Container image "openshift/origin-deployer:v3.6.1" already present on machine

44m 44m 1 health-demo-3-deploy Pod spec.containers{deployment} Normal Created kubelet, openshift-lb-02.hsh.io Created container

44m 44m 1 health-demo-3-deploy Pod spec.containers{deployment} Normal Started kubelet, openshift-lb-02.hsh.io Started container

44m 44m 1 health-demo-3 ReplicationController Normal SuccessfulCreate replication-controller Created pod: health-demo-3-02v11

19m 19m 1 health-demo-3 ReplicationController Normal SuccessfulDelete

Note that the pod name health-demo-3-02v11 did not change. That concludes the demo.

Summary

Liveness probes can perform three kinds of checks:

  • HTTP(S) checks: checks a given URL endpoint served by the container and evaluates the HTTP response code.
  • Container execution checks: a command, typically a script, run at intervals to verify that the container is behaving as expected. A non-zero exit code from the command results in a liveness check failure.
  • TCP socket checks: checks that a TCP connection can be established on a specific TCP port in the application pod.


The difference between readiness and liveness

Readiness asks whether the pod can receive traffic; liveness asks whether it is alive. When a readiness probe fails, the pod's IP is removed from the endpoints of every Service that matches it, so no Service will route requests to that pod anymore. When a liveness probe fails, the container is killed outright, and with a restart policy of Always the container is restarted.

Both the readiness probe and the liveness probe observe the state of the container process, but with separate responsibilities: the former decides whether the process's address is included in the Service load-balancing list, while the latter decides whether to restart the process to recover from a fault. Both exist and work concurrently throughout the process's lifetime.

The kubelet can optionally run, and react to, two kinds of probes on a container:

  • livenessProbe: indicates whether the container is running. If the liveness probe fails, the kubelet kills the container, which is then subject to its restart policy. If the container does not provide a liveness probe, the default state is Success.
  • readinessProbe: indicates whether the container is ready to serve requests. If the readiness probe fails, the endpoints controller removes the pod's IP address from the endpoints of all Services that match the pod. Before the initial delay, the readiness state defaults to Failure. If the container does not provide a readiness probe, the default state is Success.

Best practices

Generally speaking, the liveness probe should be configured more leniently than the readiness probe. When a backend process is under heavy load, we can temporarily remove it from the forwarding list via readiness, but liveness decides whether to restart the process, and at that moment a restart is often unnecessary. So the liveness check period can be a bit longer, and its failure tolerance a bit higher; judge the exact values by your actual situation.
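Applied to this demo, that advice might look like the following sketch (the numbers are illustrative, not taken from the original deployment):

```yaml
readinessProbe:            # react quickly: only stops traffic routing
  httpGet:
    path: /actuator/health
    port: 8080
  initialDelaySeconds: 10
  periodSeconds: 5
  failureThreshold: 3
livenessProbe:             # react slowly: a restart is expensive
  httpGet:
    path: /actuator/health
    port: 8080
  initialDelaySeconds: 60
  periodSeconds: 20
  failureThreshold: 5
```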


That's all for today. I hope it serves as a useful reference for cloud native practice, system architecture design, and team management.


Author: Petter Liu
Source: http://www.cnblogs.com/wintersun/
The copyright of this article is shared by the author and cnblogs. Reposting is welcome, but without the author's consent this statement must be retained and a prominent link to the original must appear on the article page; otherwise the right to pursue legal liability is reserved. This article is also published on my independent blog, Petter Liu Blog.
