目 录

摘 要 I
Abstract II
1 绪论 1
1.1 选题背景和意义 1
1.2 国内外研究现况及发展趋势 2
1.2.1 国内研究现状 2
1.2.2 国外研究现状 2
1.3 主要研究内容 3
2 故障诊断的总体设计方案 4
2.1 故障模型的要求 4
2.2 决策树建立故障树模型 4
2.2.1 信息熵 4
2.2.2 信息增益 5
2.2.3 ID3算法 5
2.3 支持向量机二分类原理 8
2.3.1 SVM原理 8
2.3.2 对偶问题 11
2.3.3 SVM核函数 13
3 具体方案设计 16
3.1 数据获取 16
3.2 数据存取 17
3.2.1 本地环境 17
3.2.2 大数据平台环境 19
3.3 决策树实现 21
3.3.1 连续属性值离散化 21
3.3.2 样本初始化 23
3.3.3 生成决策树 25
3.3.4 数据测试类 29
3.4 支持向量机实现 30
3.5 人机交互界面设计 32
3.5.1 界面组件介绍 33
3.5.2 操作命令介绍 34
3.5.3 适用场景 36
4 性能分析 38
4.1 性能度量 38
4.2 决策树性能度量 39
4.3 支持向量机性能度量 40
5 结论 42
5.1 全文总结 42
5.2 展望 43
致谢 44
参考文献 45
附录 48
5 结论
5.1 全文总结
本次设计内容总体达到了预期进度,成功的建立了故障诊断的模型和人工交互界面。并且通过划分训练集和测试集,实现了对故障或者异常情况的推理。
本次设计所采用的数据均为连续性数据,在支持向量机的模型中影响不大, 但是对于决策树模型的精度影响十分之大。后来经过EADC算法的离散化处理,决策树模型的精度有了很大的进步。而且在性能分析中提到:决策树对于数据量十分敏感,在运行环境的最大承载能力下,决策树的精度能达到58%左右,而且上升趋势仍然十分明显。由此可见如果将决策树模型应用于实际生产环境,辅之以足够的运算能力,也会有一定的实际应用价值。同时因为模型的计算过程较快,所以数据的更新迭代可以周期性的展开,确保数据驱动所使用的数据都是近期的历史数据,紧随着生产设备的实际情况而更新。不用过多的拘泥于过往的先验知识。
当然,基于数据驱动模型的缺陷也是显而易见的:对于未曾出现过的故障类型或者是异常数据,就会有无法处理或者是判定错误的现象,在这一点上钟福磊的混合故障诊断模型显然更加具有实用性。
另外借鉴与LibSVM的支持向量机模型由于其集成度很高,内部相当完善,所以在数据量的变化过程中具有较好的表现。但是由于其在运算时所需要的时间与内存等相较于决策树模型更多,而且其精度与数据量的关系远没有决策树那么明显。
所以在实际的应用中,决策树适合于数据量较大,对于精度要求高,易于理解的场合中。而且本次设计的GUI界面配套了决策树的输出,本文转载自http://www.biyezuopin.vip/onews.asp?id=15234利于操作人员理解使用。而SVM更加适合于数据量较小,但是对于精度有一定要求的场合。

5.2 展望
本次毕设过程中,我学习到了很多的知识,尤其是对于算法实现,数据选取,数据挖掘,模型优化等内容有了长足的认识与进步。而在模型建立过程中,还有很多理论和技术上的问题需要解决:
(1)本次设计对于各个属性的选取并未经过严格考虑,在实际应用中应该提前对数据进行筛选和清理,比如计算各个属性参数之间的关联度剔除一些很明显的错误训练样本和验证样本,这样不仅能降低训练、测试开销,而且还能提高模型的预测精度;
(2)对于模型的处理问题,剪枝方法未曾实现,而且剪枝虽然能提高模型对一些特定数据的精度并且减少预测开销,但是在一定程度上也限制了被剪枝模型对于更大范围数据的分类能力;
(3)另外,在数据获取部分,应该使用一些机械方面的预备知识,对数据检测模块进行优化,使得获得的数据在计算关联度之前就先天利于模型构建;
(4)SVM模块的内容并未深入,参数均采用默认类型。在实际使用时,可以调节一些参数,使得支持向量机模型更加适合于不同的工作环境;
(5)GUI的功能可以进行扩充,同时改进人机交互体验;
(6)Java与Mysql的使用在设计过程中还有待改进与优化的余地。
如果完成了上面所述的内容,配合基于数据驱动的模型只需提前设置好规范的数据格式即可适应于各类环境的特性,相信构建的模型会具有更大的实用价值。

import numpy as np
import matplotlib.pyplot as plt  ##########################三条曲线对照图##########################
# evenly sampled time at 200ms intervals
# x = [200,400,800,1000,2000,3000,4000,5000,6000,8000,10000,12000,15000,17500,20000,22500,25000,27500,30000,32500,35000,37500,40000,42500,45000,47500,50000,52500,55000,57500,60000,62500,65000,67500,70000,72500,75000,77500,80000,82500,85000,87500,90000,92500,95000,97500,100000,110000,120000,125000,130000,140000,150000,160000,175000,]
# y1 = [33.0,30.0,36.0,37.2,31.7,34.0,35.5,37.6,36.1,37.8,40.3,42.4,35.7,40.0,46.1,42.1,42.0,45.1,45.6,48.3,45.6,48.3,48.7,46.8,48.0,48.0,46.8,46.9,48.3,49.6,49.2,48.9,49.0,50.0,51.9,48.9,50.4,53.2,53.3,50.7,50.9,51.4,50.1,51.9,51.6,53.2,52.1,51.2,53.6,51.7,54.7,54.4,53.5,54.4,56.3,]
# y2 = [61.1,58.5,51.1,54.3,53.0,56.5,54.3,57.1,52.2,57.3,54.6,54.0,54.5,57.0,56.4,56.6,56.2,56.3,56.2,56.7,55.3,56.3,57.5,55.7,57.0,57.0,56.3,55.2,56.1,57.7,56.7,55.8,58.1,57.6,57.4,57.7,57.8,58.4,57.8,57.4,58.2,58.4,56.5,58.2,57.7,59.7,57.5,57.0,58.0,56.0,58.0,57.7,57.7,59.4,58.3,]
# y3 = [32.4,57.4,49.6,59.5,51.3,51.8,52.7,54.0,51.0,52.7,48.9,49.1,48.6,49.7,49.0,49.6,49.7,50.7,48.6,48.6,48.5,48.7,47.6,48.3,48.5,47.2,47.4,47.2,48.7,46.7,47.6,48.0,46.4,48.0,46.1,47.2,47.1,47.3,48.6,46.2,46.3,45.7,46.1,46.5,46.7,44.9,46.3,47.4,47.2,47.6,47.1,47.8,46.6,45.3,48.5,]
# plt.plot(x,y1,color = 'red',linestyle = '-',label = 'Accuracy')
# plt.plot(x,y2,color = 'blue',linestyle = '-.',label = 'Precision')
# plt.plot(x,y3,color = 'green',linestyle = ':',label = 'Recall')
# plt.xlabel('Data Amount')
# plt.ylabel('Percentage (%)')
# plt.legend(loc='lowwer right')
# plt.show()
##########################三条曲线对照图####################################################查全率、查准率精度变化图##########################
# x = [3000,4000,5000,6000,8000,10000,12000,15000,17500,20000,22500,25000,27500,30000,32500,35000,37500,40000,42500,45000,47500,50000,52500,55000,57500,60000,62500,65000,67500,70000,72500,75000,77500,80000,82500,85000,87500,90000,92500,95000,97500,100000,110000,120000,125000,130000,140000,150000,160000,175000,]
# y2 = [56.5,54.3,57.1,52.2,57.3,54.6,54.0,54.5,57.0,56.4,56.6,56.2,56.3,56.2,56.7,55.3,56.3,57.5,55.7,57.0,57.0,56.3,55.2,56.1,57.7,56.7,55.8,58.1,57.6,57.4,57.7,57.8,58.4,57.8,57.4,58.2,58.4,56.5,58.2,57.7,59.7,57.5,57.0,58.0,56.0,58.0,57.7,57.7,59.4,58.3,]
# y3 = [51.8,52.7,54.0,51.0,52.7,48.9,49.1,48.6,49.7,49.0,49.6,49.7,50.7,48.6,48.6,48.5,48.7,47.6,48.3,48.5,47.2,47.4,47.2,48.7,46.7,47.6,48.0,46.4,48.0,46.1,47.2,47.1,47.3,48.6,46.2,46.3,45.7,46.1,46.5,46.7,44.9,46.3,47.4,47.2,47.6,47.1,47.8,46.6,45.3,48.5,]
# plt.plot(x,y2,color = 'blue',linestyle = '-.',label = 'Precision')
# plt.plot(x,y3,color = 'green',linestyle = ':',label = 'Recall')
# plt.xlabel('Data Amount')
# plt.ylabel('Percentage (%)')
# plt.legend(loc='upper left')
# plt.title('Comparative Analysis',color='black')
# plt.annotate('Precision', xy=(63000, 55.5), xytext=(70000, 53), arrowprops=dict(facecolor='blue', shrink=0.01), )
# plt.annotate('Recall', xy=(55000, 49), xytext=(70000, 51), arrowprops=dict(facecolor='green', shrink=0.01), )
# plt.show()
##########################查全率、查准率精度变化图########################### ##########################P-R图##########################
# y2 = [61.1,58.5,51.1,54.3,53.0,56.5,54.3,57.1,52.2,57.3,54.6,54.0,54.5,57.0,56.4,56.6,56.2,56.3,56.2,56.7,55.3,56.3,57.5,55.7,57.0,57.0,56.3,55.2,56.1,57.7,56.7,55.8,58.1,57.6,57.4,57.7,57.8,58.4,57.8,57.4,58.2,58.4,56.5,58.2,57.7,59.7,57.5,57.0,58.0,56.0,58.0,57.7,57.7,59.4,58.3,]
# y3 = [32.4,57.4,49.6,59.5,51.3,51.8,52.7,54.0,51.0,52.7,48.9,49.1,48.6,49.7,49.0,49.6,49.7,50.7,48.6,48.6,48.5,48.7,47.6,48.3,48.5,47.2,47.4,47.2,48.7,46.7,47.6,48.0,46.4,48.0,46.1,47.2,47.1,47.3,48.6,46.2,46.3,45.7,46.1,46.5,46.7,44.9,46.3,47.4,47.2,47.6,47.1,47.8,46.6,45.3,48.5,]
# y2=y2[20:]
# y3=y3[20:]
# plt.plot(y2,y3,color = 'blue',linestyle = ':')
# plt.xlabel('Precision')
# plt.ylabel('Recall')
# plt.legend(loc='lowwer right')
# plt.show()
##########################P-R图####################################################决策树精度变化图##########################
# x = [200,400,800,1000,2000,3000,4000,5000,6000,8000,10000,12000,15000,17500,20000,22500,25000,27500,30000,32500,35000,37500,40000,42500,45000,47500,50000,52500,55000,57500,60000,62500,65000,67500,70000,72500,75000,77500,80000,82500,85000,87500,90000,92500,95000,97500,100000,110000,120000,125000,130000,140000,150000,160000,175000,]
# y1 = [33.0,30.0,36.0,37.2,31.7,34.0,35.5,37.6,36.1,37.8,40.3,42.4,35.7,40.0,46.1,42.1,42.0,45.1,45.6,48.3,45.6,48.3,48.7,46.8,48.0,48.0,46.8,46.9,48.3,49.6,49.2,48.9,49.0,50.0,51.9,48.9,50.4,53.2,53.3,50.7,50.9,51.4,50.1,51.9,51.6,53.2,52.1,51.2,53.6,51.7,54.7,54.4,53.5,54.4,56.3,]
# y2 = [61.1,58.5,51.1,54.3,53.0,56.5,54.3,57.1,52.2,57.3,54.6,54.0,54.5,57.0,56.4,56.6,56.2,56.3,56.2,56.7,55.3,56.3,57.5,55.7,57.0,57.0,56.3,55.2,56.1,57.7,56.7,55.8,58.1,57.6,57.4,57.7,57.8,58.4,57.8,57.4,58.2,58.4,56.5,58.2,57.7,59.7,57.5,57.0,58.0,56.0,58.0,57.7,57.7,59.4,58.3,]
# y3 = [32.4,57.4,49.6,59.5,51.3,51.8,52.7,54.0,51.0,52.7,48.9,49.1,48.6,49.7,49.0,49.6,49.7,50.7,48.6,48.6,48.5,48.7,47.6,48.3,48.5,47.2,47.4,47.2,48.7,46.7,47.6,48.0,46.4,48.0,46.1,47.2,47.1,47.3,48.6,46.2,46.3,45.7,46.1,46.5,46.7,44.9,46.3,47.4,47.2,47.6,47.1,47.8,46.6,45.3,48.5,]
# plt.plot(x,y1,'rs')
# plt.title('DecisionTree',color='black')
# plt.xlabel('Data Amount')
# plt.ylabel('Accuracy (%)')
# plt.show()
##########################决策树精度变化图####################################################SVM精度变化图##########################
# x = [200,400,800,1000,2000,3000,4000,5000,6000,8000,10000,12000,15000,17500,20000,22500,25000,27500,30000,32500,35000,37500,40000,42500,45000,]
# y1 = [55.0,49.5,51.0,56.0,54.3,53.9,54.2,55.4,56.8,55.9,56.1,55.2,56.1,56.8,56.4,57.4,56.8,56.8,57.4,57.6,56.7,58.1,57.6,57.2,58.4,]
# plt.plot(x,y1,'rs')
# plt.title('Support Vector Machine',color='black')
# plt.xlabel('Data Amount')
# plt.ylabel('Accuracy (%)')
# # plt.plot(x,y1,color = 'red',linestyle = '-',label = 'Accuracy')
# plt.show()
##########################SVM精度变化图####################################################精度变化对比分析图##########################
# x = [200,400,800,1000,2000,3000,4000,5000,6000,8000,10000,12000,15000,17500,20000,22500,25000,27500,30000,32500,35000,37500,40000,42500,45000,]
# y1 = [55.0,49.5,51.0,56.0,54.3,53.9,54.2,55.4,56.8,55.9,56.1,55.2,56.1,56.8,56.4,57.4,56.8,56.8,57.4,57.6,56.7,58.1,57.6,57.2,58.4,]
# x1 = [200,400,800,1000,2000,3000,4000,5000,6000,8000,10000,12000,15000,17500,20000,22500,25000,27500,30000,32500,35000,37500,40000,42500,45000,47500,50000,52500,55000,57500,60000,62500,65000,67500,70000,72500,75000,77500,80000,82500,85000,87500,90000,92500,95000,97500,100000,110000,120000,125000,130000,140000,150000,160000,175000,]
# y2 = [33.0,30.0,36.0,37.2,31.7,34.0,35.5,37.6,36.1,37.8,40.3,42.4,35.7,40.0,46.1,42.1,42.0,45.1,45.6,48.3,45.6,48.3,48.7,46.8,48.0,48.0,46.8,46.9,48.3,49.6,49.2,48.9,49.0,50.0,51.9,48.9,50.4,53.2,53.3,50.7,50.9,51.4,50.1,51.9,51.6,53.2,52.1,51.2,53.6,51.7,54.7,54.4,53.5,54.4,56.3,]
# s1 = []
# count = 0
# sy1 = []
# while count<len(x):
#     s1.append(x1[count])
#     sy1.append(y2[count])
#     count+=1
# plt.plot(x,y1,'r',s1,sy1,'b')
# plt.title('Comparative Analysis',color='black')
# plt.annotate('SVM', xy=(12000, 54.5), xytext=(15000, 50), arrowprops=dict(facecolor='red', shrink=0.01), )
# plt.annotate('DecisionTree', xy=(17000, 38), xytext=(20000, 33), arrowprops=dict(facecolor='blue', shrink=0.01), )
# plt.xlabel('Data Amount')
# plt.ylabel('Accuracy (%)')
# # plt.plot(x,y1,color = 'red',linestyle = '-',label = 'Accuracy')
# plt.show()
##########################精度变化对比分析图##########################






































基于大数据的工业设备故障诊断模型设计相关推荐

  1. ssm基于大数据的智能公交平台的设计与实现毕业设计源码261620

    目  录 摘要 1 绪论 1.1研究背景 1.2研究现状 1.3系统开发技术的特色 1.4论文结构与章节安排 2智能公交平台 分析 2.1 可行性分析 2.2 系统流程分析 2.2.1数据增加流程 2 ...

  2. 基于大数据的公共建筑能耗监测系统的应用探究

    摘要:为了解决当前公共建筑能耗居高不下的突出问题,借助当前信息化技术手段,围绕公共建筑能耗监测系统中的大数据应用,从监测系统的总设计框架入手,分别就物联网中数据采集器设计方式.数据传输技术.数据库部署 ...

  3. 基于大数据的病虫害预警系统

    1.1 开发背景及研究意义 基于大数据的病虫害管理系统是利用现代信息技术和大数据分析技术,对农作物病虫害进行监测.预测.预警.防控和管理的一种新型系统.它的主要目的是提供一个快速.高效.准确的信息服务 ...

  4. ISME:基于大数据准确预测土壤的枯萎病发生

    基于大数据整合准确预测土壤的枯萎病发生 Predicting disease occurrence with high accuracy based on soil macroecological p ...

  5. ​易生信-宏基因组积微学术论坛:基于大数据整合准确预测土壤的枯萎病发生...

    博彩众家之长,积微成就突破.为促进我国宏基因组研究领域的学术交流和技术分享,推动微生物组领域的发展,"宏基因组"公众号联合国内外优秀人才组织"易生信-宏基因组 积微学术论 ...

  6. ISME:南农沈其荣团队基于大数据准确预测土壤的枯萎病发生

    基于大数据整合准确预测土壤的枯萎病发生 Predicting disease occurrence with high accuracy based on soil macroecological p ...

  7. 报名 | 基于大数据的中国城市技术社会治理探索

    本讲座选取北京.深圳.成都三个城市基于大数据手段的技术社会治理探索的四个街道(区)的典型案例,以社区社会资本.行政资源配置力度为划分标准,将这些街区的探索实践归纳为四种中国城市大数据技术社会治理模式. ...

  8. 大数据文字游戏_基于大数据的游戏化教学系统研究.docx

    基于大数据的游戏化教学系统研究 ―.引言 目前,我国高校在线开放课程的建设已经取得了较大的发展,在线课程的使用已经使高 校教学发生了巨大的变化.课程的网络资源可以作为延伸课堂教学的工具,有效减轻了课堂 ...

  9. 基于大数据的用户行为预测

    2019独角兽企业重金招聘Python工程师标准>>> 随着智能手机的普及和APP形态的愈发丰富,移动设备的应用安装量急剧上升.用户在每天使用这些APP的过程中,也会产生大量的线上和 ...

最新文章

  1. R语言ggplot2可视化:使用热力图可视化dataframe数据、自定义设置热力图的颜色、自定添加标题、轴标签、热力图线框等
  2. C++中static关键字用法
  3. Spring-mybatis 抽取 baseDao。
  4. .NET开发中应该遵循的几点建议
  5. ovirt 双机_ovirt kvm嵌套虚拟化
  6. java 可变参数_90.Java可变参数
  7. 位置问题_德容:不需过多讨论我的位置问题,我会逐渐适应巴萨
  8. d2rq java,知识图谱学习与实践(6)——从结构化数据进行知识抽取(D2RQ介绍)...
  9. cf——Sasha and a Bit of Relax(dp,math)
  10. RayData学习总结
  11. 2022卡塔尔世界杯引爆全球,跨境电商如何做好选品和营销?
  12. H指数问题(USACO)
  13. 一行python并行加速for循环_加速列表和for循环python
  14. 2018 年 5 款最好的 Linux 游戏
  15. jQuery 图像 360 度旋转插件
  16. 程序员与健康的一些参考资料,转自蝈蝈俊.net
  17. 【IoT】创业:创建硬件公司的 9 个步骤
  18. 很多人都在换苹果原装电池,那么原装电池是哪个厂商生产的?
  19. 关于MyBatis-Plus自动更新时间的小坑
  20. 佳明表盘 c语言开发的吗,说点不一样的:Garmin 佳明 Connect IQ 户外手表 使用心得...

热门文章

  1. 数字化转型导师坚鹏:企业数字化营销能力提升
  2. 招商银行个人银行专业版最新版本 v6.0.4.9
  3. Kettle8.2查询组件之数据库连接
  4. r7 5800u和r5 7530u选哪个 锐龙r75800u和r57530u差多少
  5. 高等数学——求导工具
  6. Practical C++ Programming电子书pdf下载
  7. 写作中的英文标点符号
  8. 诚意分享:阿里大师推荐的这份Java开发必读书单
  9. 参观IDC机房观后感
  10. 微观经济学14周作业(博弈论)