This post collects notes from my hands-on work with KubeEdge, including open questions and solutions. It is updated from time to time.

Miscellaneous

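Building kubeedge fails with 2 GB of memory; 4 GB works.
If replicas of the same pod export the same node port, scaling up fails because the node port is already in use.
Run the program once to generate the config file, then modify it. Pay attention to where the config file is located and to the platform architecture: on an ARM platform, if pause is not set to kubeedge/pause-arm:3.1, it fails.
Check the hostname. It must be valid (lowercase letters, digits, hyphen -, dot .), otherwise the node cannot register. Sometimes the returned message is just err:<nil>, which gives nothing to troubleshoot with.
The edge system needs a default gateway, otherwise a segmentation fault occurs at runtime. According to the issue this has been fixed, but I still see it.
KubeEdge is not fully equivalent to k8s; some k8s commands are not implemented yet. For example, the commands for inspecting and running containers are missing.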
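
The hostname rule in the list above can be checked before trying to register a node. Below is a minimal sketch in Python. It assumes the rule is the RFC 1123 subdomain format that Kubernetes uses for object names (lowercase letters, digits, '-' and '.', each label starting and ending with a letter or digit, at most 253 characters); whether KubeEdge applies exactly this check is my assumption, and the function name is made up for illustration.

```python
import re
import socket

# Assumed rule: RFC 1123 subdomain, i.e. dot-separated labels made of
# lowercase letters, digits and '-', each label starting and ending
# with a letter or digit.
_LABEL = r"[a-z0-9]([-a-z0-9]*[a-z0-9])?"
_SUBDOMAIN_RE = re.compile(rf"{_LABEL}(\.{_LABEL})*")


def is_valid_node_name(name: str) -> bool:
    """Return True if name looks like a valid node hostname (hypothetical helper)."""
    return len(name) <= 253 and _SUBDOMAIN_RE.fullmatch(name) is not None


if __name__ == "__main__":
    # Check the hostname of the machine this script runs on.
    host = socket.gethostname()
    print(host, "OK" if is_valid_node_name(host) else "INVALID")
```

Running it on the edge machine before starting edgecore gives a clear answer, instead of having to guess from an err:<nil> message.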

Related bugs I have collected

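Recorded 2020.3.19:
The kubectl exec and kubectl logs commands are not supported. The maintainers say support will come later. To be watched.
Scheduling information is insufficient. kubectl describe only tells you that the pod was scheduled to some node; whether it then started successfully or failed is unknown. The only option is to log in to the node machine and check with docker logs.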

Problems

Pods cannot be scheduled

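Environment: 3 hosts with k8s already deployed. k8s was cleaned up first.
I created a deployment the usual k8s way and checked the pod: it showed Pending. After deleting the pod, it showed Terminating. On another try, one pod could run on one of the nodes. After scaling up, that node ran its pods, but pods on the other node stayed Pending. After a whole night, nothing had changed.
I force-stopped cloudcore and edgecore, and the nodes in k8s showed NotReady. The containers on the nodes kept running.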

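Questions:
Why can the pods not be scheduled? Would it help to shut down the pods gracefully first and then stop cloudcore? I have not found a way so far.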

Cloud side log:

messagehandler.go:448] write error, connection for node edge-node2 will be closed, affected event id: dba8d7ec-ffa4-4c6f-ac6e-accfa527a366, parent_id: , group: resource, source: edgecontroller, resource: default/pod/nginx-deployment-77698bff7d-jdm8k, operation: update, reason tls: use of closed connection

Edge side log:

process.go:130] failed to send message: tls: use of closed connection
process.go:196] websocket write error: failed to send message, error: tls: use of closed connection

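Guess: the connection was dropped, yet the node status still shows Ready. I do not know why.
Follow-up: I deleted the deployment, waited a while, deployed again, and it succeeded.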

With a normal connection and pods running, the node went to NotReady after one night. Pods were destroyed and recreated over and over.

# kubectl get pod
NAME                                         READY   STATUS        RESTARTS   AGE
led-light-mapper-deployment-94bbdf88-26h2d   0/1     Terminating   0          14h
led-light-mapper-deployment-94bbdf88-2hwxq   0/1     Terminating   0          90m
led-light-mapper-deployment-94bbdf88-4f8pd   0/1     Terminating   0          80m
led-light-mapper-deployment-94bbdf88-52p9w   0/1     Terminating   0          15m
led-light-mapper-deployment-94bbdf88-8t9cl   0/1     Terminating   0          30m
led-light-mapper-deployment-94bbdf88-9bpt7   0/1     Terminating   0          95m
led-light-mapper-deployment-94bbdf88-9nfk6   0/1     Terminating   0          65m
led-light-mapper-deployment-94bbdf88-c8wtb   0/1     Terminating   0          85m
led-light-mapper-deployment-94bbdf88-kpcx4   0/1     Terminating   0          75m
led-light-mapper-deployment-94bbdf88-kwgqs   0/1     Terminating   0          35m
led-light-mapper-deployment-94bbdf88-l6hn2   0/1     Terminating   0          55m
led-light-mapper-deployment-94bbdf88-pk6fx   0/1     Terminating   0          5m1s
led-light-mapper-deployment-94bbdf88-qk9gj   0/1     Terminating   0          60m
led-light-mapper-deployment-94bbdf88-sgns2   0/1     Terminating   0          100m
led-light-mapper-deployment-94bbdf88-sk8gf   0/1     Terminating   0          20m
led-light-mapper-deployment-94bbdf88-svkgr   0/1     Terminating   0          50m
led-light-mapper-deployment-94bbdf88-tjz7z   0/1     Terminating   0          45m
led-light-mapper-deployment-94bbdf88-vwx7w   0/1     Pending       0          1s
led-light-mapper-deployment-94bbdf88-xfsc8   0/1     Terminating   0          10m
led-light-mapper-deployment-94bbdf88-xpq8k   0/1     Terminating   0          40m
led-light-mapper-deployment-94bbdf88-zhj24   0/1     Terminating   0          25m
led-light-mapper-deployment-94bbdf88-zncjg   0/1     Terminating   0          70m
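
With this many pods it is easier to count by status than to read the list. The sketch below is a small Python helper of my own (the function name is hypothetical) that tallies the STATUS column of pasted kubectl get pod output. It assumes the default five-column layout shown above, and the sample is a shortened copy of that output.

```python
from collections import Counter


def count_pod_status(kubectl_output: str) -> Counter:
    """Tally the STATUS column of `kubectl get pod` text output."""
    counts = Counter()
    for line in kubectl_output.splitlines():
        fields = line.split()
        # Skip blank lines, the shell prompt line and the header row.
        if len(fields) < 5 or fields[0] in ("#", "NAME"):
            continue
        counts[fields[2]] += 1  # third column is STATUS
    return counts


# Shortened sample taken from the output above.
SAMPLE = """\
# kubectl get pod
NAME                                         READY   STATUS        RESTARTS   AGE
led-light-mapper-deployment-94bbdf88-26h2d   0/1     Terminating   0          14h
led-light-mapper-deployment-94bbdf88-vwx7w   0/1     Pending       0          1s
led-light-mapper-deployment-94bbdf88-xfsc8   0/1     Terminating   0          10m
"""

if __name__ == "__main__":
    for status, n in count_pod_status(SAMPLE).items():
        print(status, n)
```

In the full listing above, the AGE values are spaced about 5 minutes apart, which suggests a new pod is created roughly every 5 minutes while the old ones stay stuck in Terminating.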

Checking the edge side:

I0319 09:17:05.425874    2147 communicate.go:151] has msg
I0319 09:17:05.426062    2147 communicate.go:155] redo task due to no recv
I0319 09:17:05.427233    2147 communicate.go:151] has msg
I0319 09:17:05.427416    2147 communicate.go:155] redo task due to no recv
I0319 09:17:05.428657    2147 dtcontext.go:69] CommModule is healthy 1584580625
context_channel.go:175] the message channel is full, message: {Header:{ID:5f072fe2-b8cf-411e-8aee-16e927f27433 ParentID: Timestamp:1584580605260 ResourceVersion:391570 Sync:false} Router:{Source:edgecontroller Group:resource Operation:update Resource:default/pod/led-light-mapper-deployment-94bbdf88-26h2d} Content:map[metadata:map[creationTimestamp:2020-03-18T10:23:50Z deletionGracePeriodSeconds:30 deletionTimestamp:2020-03-18T23:40:09Z generateName:led-light-mapper-deployment-94bbdf88- labels:map[app:led-light-mapper pod-template-hash:94bbdf88] name:led-light-mapper-deployment-94bbdf88-26h2d namespace:default ownerReferences:[map[apiVersion:apps/v1 blockOwnerDeletion:true controller:true kind:ReplicaSet name:led-light-mapper-deployment-94bbdf88 uid:52c44b48-1214-4b10-9007-23093a953a40]] resourceVersion:391570 selfLink:/api/v1/namespaces/default/pods/led-light-mapper-deployment-94bbdf88-26h2d uid:12002c7e-69fe-4a31-bf66-759d78380abe] spec:map[containers:[map[image:latelee/led-light-mapper:v1.1 imagePullPolicy:IfNotPresent name:led-light-mapper-container resources:map[] securityContext:map[privileged:true] terminationMessagePath:/dev/termination-log terminationMessagePolicy:File volumeMounts:[map[mountPath:/opt/kubeedge/ name:config-volume] map[mountPath:/var/run/secrets/kubernetes.io/serviceaccount name:default-token-gb4kq readOnly:true]]]] dnsPolicy:ClusterFirst enableServiceLinks:true hostNetwork:true nodeName:latelee.org.ttucon-2142ec priority:0 restartPolicy:Always schedulerName:default-scheduler securityContext:map[] serviceAccount:default serviceAccountName:default terminationGracePeriodSeconds:30 tolerations:[map[effect:NoExecute key:node.kubernetes.io/not-ready operator:Exists tolerationSeconds:300] map[effect:NoExecute key:node.kubernetes.io/unreachable operator:Exists tolerationSeconds:300]] volumes:[map[configMap:map[defaultMode:420 name:device-profile-config-edge-node2] name:config-volume] map[name:default-token-gb4kq secret:map[defaultMode:420 secretName:default-token-gb4kq]]]] status:map[phase:Pending qosClass:BestEffort]]}

DNS warnings:

I0319 16:25:18.563472   17947 record.go:24] Warning MissingClusterDNS kubelet does not have ClusterDNS IP configured and cannot create Pod using "ClusterFirst" policy. Falling back to "Default" policy.
I0319 16:25:18.563724   17947 record.go:24] Warning MissingClusterDNS pod: "webgin-deployment-747c6887f5-dwmtb_default(1ceb1dd6-6dae-4aff-a2c6-d0de64373031)". kubelet does not have ClusterDNS IP configured and cannot create Pod using "ClusterFirst" policy. Falling back to "Default" policy.
I0319 16:25:18.563902   17947 record.go:19] Warning DNSConfigForming Nameserver limits were exceeded, some nameservers have been omitted, the applied nameserver line is: 8.8.8.8 8.8.4.4 2001:4860:4860::8888
E0319 16:25:18.564035   17947 dns.go:135] Nameserver limits were exceeded, some nameservers have been omitted, the applied nameserver line is: 8.8.8.8 8.8.4.4 2001:4860:4860::8888
I0319 16:30:09.037479   17947 edged.go:808] consume added pod [webgin-deployment-7ccff86d8b-s227c] successfully
I0319 16:30:10.506631   17947 record.go:19] Normal Started Started container webgin
E0319 16:30:10.507199   17947 kuberuntime_container.go:172] Failed to create legacy symbolic link "/var/log/containers/webgin-deployment-747c6887f5-f6547_default_webgin-1772b70cd7725f77c30b9cf47e3ce57159d9fdccf47c0c19aed8edf779c52c16.log" to container "1772b70cd7725f77c30b9cf47e3ce57159d9fdccf47c0c19aed8edf779c52c16" log "/var/log/pods/default_webgin-deployment-747c6887f5-f6547_abc27c3c-50f1-49e9-9f2e-b00fa802dc7f/webgin/0.log": symlink /var/log/pods/default_webgin-deployment-747c6887f5-f6547_abc27c3c-50f1-49e9-9f2e-b00fa802dc7f/webgin/0.log /var/log/containers/webgin-deployment-747c6887f5-f6547_default_webgin-1772b70cd7725f77c30b9cf47e3ce57159d9fdccf47c0c19aed8edf779c52c16.log: no such file or directory
I0319 16:30:10.507557   17947 edged.go:808] consume added pod [webgin-deployment-747c6887f5-f6547] successfully
I0319 16:30:10.667156   17947 edged.go:648] sync loop ignore event: [ContainerDied], with pod [1ceb1dd6-6dae-4aff-a2c6-d0de64373031] not found
W0319 16:30:10.685178   17947 docker_sandbox.go:394] failed to read pod IP from plugin/docker: Couldn't find network status for default/webgin-deployment-747c6887f5-f6547 through plugin: invalid network status for
W0319 16:30:10.871129   17947 docker_sandbox.go:394] failed to read pod IP from plugin/docker: Couldn't find network status for default/webgin-deployment-747c6887f5-f6547 through plugin: invalid network status for
I0319 16:30:10.914857   17947 container_manager_linux.go:880] Found 44 PIDs in root, 44 of them are not to be moved
I0319 16:30:11.088286   17947 edged.go:645] sync loop get event [ContainerStarted], ignore it now.
I0319 16:30:11.327738   17947 edged.go:645] sync loop get event [ContainerStarted], ignore it now.
W0319 16:30:12.413498   17947 docker_sandbox.go:394] failed to read pod IP from plugin/docker: Couldn't find network status for default/webgin-deployment-747c6887f5-f6547 through plugin: invalid network status for
W0319 16:30:12.543879   17947 docker_sandbox.go:394] failed to read pod IP from plugin/docker: Couldn't find network status for default/webgin-deployment-747c6887f5-f6547 through plugin: invalid network status for
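
The "Nameserver limits were exceeded" lines in the log above show that only three nameservers were applied and the rest were dropped. As a rough sketch (assuming the limit is 3, as the log suggests, and that the list comes from the node's /etc/resolv.conf), the Python below shows which entries of a resolv.conf would be kept and which dropped. The function names are my own, and the fourth nameserver in the sample is a hypothetical extra entry added only to trigger the limit.

```python
from typing import List, Tuple

# Assumed limit: the log shows exactly three nameservers being applied.
MAX_NAMESERVERS = 3


def parse_nameservers(resolv_conf_text: str) -> List[str]:
    """Return the nameserver addresses in file order, ignoring other lines."""
    servers = []
    for line in resolv_conf_text.splitlines():
        fields = line.split()
        if len(fields) >= 2 and fields[0] == "nameserver":
            servers.append(fields[1])
    return servers


def split_by_limit(servers: List[str],
                   limit: int = MAX_NAMESERVERS) -> Tuple[List[str], List[str]]:
    """Split the list into (applied, omitted) according to the limit."""
    return servers[:limit], servers[limit:]


# First three entries are from the log; the last one is hypothetical.
SAMPLE = """\
# sample resolv.conf
nameserver 8.8.8.8
nameserver 8.8.4.4
nameserver 2001:4860:4860::8888
nameserver 2001:4860:4860::8844
"""

if __name__ == "__main__":
    applied, omitted = split_by_limit(parse_nameservers(SAMPLE))
    print("applied:", " ".join(applied))
    print("omitted:", " ".join(omitted))
```

On an edge node, passing the contents of /etc/resolv.conf to parse_nameservers shows whether trimming the file to three entries would silence the warning.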

Log from a successful pod deployment:

I0319 16:25:18.564503   17947 edged.go:808] consume added pod [webgin-deployment-747c6887f5-dwmtb] successfully
I0319 16:25:18.564974   17947 proxy.go:318] [L4 Proxy] process other resource: kube-system/endpoints/kube-scheduler
I0319 16:25:18.688263   17947 edged_volumes.go:54] Using volume plugin "kubernetes.io/empty-dir" to mount wrapped_default-token-gb4kq
