What is a random forest?

Random forest is a classification technique, proposed by Leo Breiman (2001), that, given a set of class-labeled data, builds a set of classification trees. Each tree is grown from a bootstrap sample of the training data. When growing an individual tree, a random subset of attributes is drawn at each split (hence the term "random"), from which the best attribute for the split is selected. Classification is based on the majority vote of the individually grown tree classifiers in the forest.

For a more detailed explanation: http://en.wikipedia.org/wiki/Random_forest
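The procedure described above (bootstrap sampling, a random attribute subset, majority voting) can be sketched in plain Python. This is a toy illustration, not any of the libraries linked below: to stay short it grows one-level decision stumps instead of full trees, and all names (`best_stump`, `train_forest`, `predict`) are invented for this example.

```python
import random
from collections import Counter

def best_stump(X, y, feat_idx):
    """Exhaustively pick the (feature, threshold) split with the fewest
    misclassifications on this (bootstrap) sample."""
    best = None
    for f in feat_idx:
        for t in sorted(set(row[f] for row in X)):
            left = [yi for row, yi in zip(X, y) if row[f] <= t]
            right = [yi for row, yi in zip(X, y) if row[f] > t]
            if not left or not right:
                continue  # degenerate split, skip
            l_lab = Counter(left).most_common(1)[0][0]
            r_lab = Counter(right).most_common(1)[0][0]
            err = sum(yi != l_lab for yi in left) + sum(yi != r_lab for yi in right)
            if best is None or err < best[0]:
                best = (err, f, t, l_lab, r_lab)
    if best is None:  # sample had no usable split: always predict the majority class
        maj = Counter(y).most_common(1)[0][0]
        return (0, float("inf"), maj, maj)
    return best[1:]  # (feature, threshold, left_label, right_label)

def train_forest(X, y, n_trees=25, n_feats=1, seed=0):
    rng = random.Random(seed)
    n = len(X)
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]      # bootstrap sample
        Xb, yb = [X[i] for i in idx], [y[i] for i in idx]
        feats = rng.sample(range(len(X[0])), n_feats)   # random attribute subset
        forest.append(best_stump(Xb, yb, feats))
    return forest

def predict(forest, x):
    votes = Counter(l if x[f] <= t else r for f, t, l, r in forest)
    return votes.most_common(1)[0][0]                   # majority vote
```

Real implementations grow deep, unpruned trees and redraw the attribute subset at every node, but the ensemble structure is the same.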

Matlab library downloads

Original implementation:

http://www.stat.berkeley.edu/~breiman/RandomForests/cc_software.htm (a new version is forthcoming)

Implementation ported from R:

http://randomforest-matlab.googlecode.com/files/Windows-Precompiled-RF_MexStandalone-v0.02-.zip

Ensemble classification applications based on Random Forest:

ENSEMBLE CLASSIFICATION

(1) A conference paper investigating binary classification strategies with ensemble classification has been published: [Chan, J.C.-W., Demarchi, L., Van De Voorde, T., & Canters, F. (2008), "Binary classification strategies for mapping urban land cover with ensemble classifiers", Proceedings of the IEEE International Geoscience and Remote Sensing Symposium (IGARSS), July 6-11, 2008, Boston, Massachusetts, USA, Vol. III, pp. 1004-1007.] (see Annex A.9)
Since the data sets related to HABISTAT were not ready at the beginning of 2008, a study on binary classification with ensemble classifiers was conducted using two data sets covering suburban areas. In the paper, two binary classification strategies were examined to further extend the strength of ensemble classifiers for mapping urban objects. The first strategy was a one-against-one approach: a pairwise binary classification in which n(n-1)/2 classifiers are created, n being the number of classes. Each of the n(n-1)/2 classifiers was trained using only training cases from two classes at a time, and the ensemble was combined by majority voting. The second strategy was a one-against-all binary approach: if there are n classes, with a = {1,…, n} being one of the classes, then n classifiers were generated, each representing a binary classification of a versus non-a. This ensemble was combined using accuracy estimates obtained for each class.

Both binary strategies were applied to two single classifiers (decision trees and artificial neural networks) and two ensemble classifiers (Random Forest and Adaboost). Two multi-source data sets were used: one prepared for an object-based classification and one for a conventional pixel-based approach. Our results indicate that ensemble classifiers generate significantly higher accuracies than a single classifier. Compared to a single C5.0 tree, Random Forest and Adaboost increased the accuracy by 2 to 12%, the range of increase depending on the data set used. Applying binary classification strategies often increases accuracy further, but only marginally (between 1% and 3%). All increases are statistically significant, except on one occasion. Coupling ensemble classifiers with binary classification always yielded the highest accuracies.

For the first data set, the highest accuracy was obtained with Adaboost and a one-against-one strategy, 4.3% better than a single tree; for the second data set, with Random Forest and a one-against-all strategy, 13.6% higher than a single tree.

While these improvements are statistically significant, the gains are marginal. Given the long training time, we have to consider carefully whether it is worthwhile to apply this strategy.
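The two binary strategies above can be sketched in a few lines of Python. This is an illustrative toy, not the code used in the paper: a simple nearest-centroid learner (`train_centroid`) stands in for the decision-tree and ensemble classifiers, the one-against-all combination rule is simplified to a first-hit rule rather than the paper's per-class accuracy estimates, and all function names are invented for this example.

```python
from itertools import combinations
from collections import Counter

def train_centroid(X, y):
    """Toy binary learner (nearest class centroid) standing in for the
    trees/ensembles used in the paper."""
    cents = {}
    for lab in set(y):
        rows = [x for x, yi in zip(X, y) if yi == lab]
        cents[lab] = [sum(col) / len(rows) for col in zip(*rows)]
    def clf(x):
        dists = {lab: sum((a - b) ** 2 for a, b in zip(x, c))
                 for lab, c in cents.items()}
        return min(dists, key=dists.get)
    return clf

def one_vs_one(X, y):
    """Train n(n-1)/2 pairwise classifiers; combine by majority voting."""
    classes = sorted(set(y))
    clfs = []
    for a, b in combinations(classes, 2):
        Xab = [x for x, yi in zip(X, y) if yi in (a, b)]
        yab = [yi for yi in y if yi in (a, b)]
        clfs.append(train_centroid(Xab, yab))
    return lambda x: Counter(c(x) for c in clfs).most_common(1)[0][0]

def one_vs_all(X, y):
    """Train n classifiers of a vs. non-a; combined here by a simple
    first-hit rule (simplified from the paper's accuracy-based rule)."""
    classes = sorted(set(y))
    clfs = {a: train_centroid(X, [1 if yi == a else 0 for yi in y])
            for a in classes}
    def predict(x):
        hits = [a for a, c in clfs.items() if c(x) == 1]
        return hits[0] if hits else classes[0]
    return predict
```

Swapping `train_centroid` for any binary learner (a tree, Random Forest, Adaboost, a neural network) reproduces the experimental setups compared in the paper.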

(2) We used the ensemble classifier Random Forest to produce four levels of classification on three different data sets in the framework of workpackage Validation WP 5200. The data used for this experiment are AHS airborne data. A total of 12 classifications were made (see Figure 9). The Random Forest results were compared with the performance of two other classifiers: Linear Discriminant Analysis and Markov Random Fields.

The processing raises issues concerning the number of training samples and the spatial independence of the samples (see Table 5). These issues with the training, testing and validation sets were discussed during the mid-term evaluation and are under investigation.

Figure 9. Validation exercise using airborne AHS data. The columns represent 3 data sets and rows represent 4 levels of classification. Classifications were done using Random Forest.

Table 5. Table showing the classification scheme and training size at each level.

(3) The use of ensemble classification was studied in all classification tasks with spaceborne data. Two conference papers on the classification of heathlands using superresolution-enhanced CHRIS data were presented. Random Forest was used for the classifications and produced rather consistent and satisfactory results. Below are two illustrations (Figure 10 and Figure 11) of the application of Random Forest to the original CHRIS and superresolution-enhanced CHRIS data sets. For more details, please refer to the paper attached in Annex A.8. Random Forest seems to have worked very well with our data sets, and we will continue to use and investigate the strength of this ensemble classifier.


Figure 10. Random Forest classification of SR CHRIS (Kalmthout, Belgium). Results presented at IGARSS, July 6-11, 2008, Boston, Massachusetts, USA. (see Annex B of annual report #1)

Figure 11. Random Forest classification of SR CHRIS (Ginkel, the Netherlands). Results presented at the 6th EARSeL SIG Imaging Spectroscopy workshop 2009, Tel Aviv, March 16-19 2009. (see Annex A.8)

Source: http://habistat.vgt.vito.be/modules/Results/EC.php

Random Forest implementation in the Orange software

http://orange.biolab.si/doc/widgets/_static/Classify/RandomForest.htm

