EL: Bagging: Kaggle competition, building a Bagging model on the Titanic dataset to predict whether each passenger survived

Contents

Output

Design approach

Core code


Output

Design approach

Core code

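from sklearn.ensemble import BaggingRegressor

# clf_LoR is the base estimator defined earlier (going by its name, the logistic regression model);
# X and y are the training features and the survival labels.
bagging_clf = BaggingRegressor(clf_LoR, n_estimators=10, max_samples=0.8, max_features=1.0, bootstrap=True, bootstrap_features=False, n_jobs=-1)
bagging_clf.fit(X, y)

# BaggingRegressor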
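Because the call above wraps a base model in BaggingRegressor rather than BaggingClassifier, predict returns the average of the base models' outputs, not a class label. The following is a minimal sketch of how that average might be turned into a 0/1 survival prediction. It rests on assumptions that are not in the original post: the data is synthetic (make_classification stands in for the preprocessed Titanic features), base_model is a hypothetical stand-in for clf_LoR, and the 0.5 threshold is one possible choice.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import BaggingRegressor
from sklearn.linear_model import LogisticRegression

# Assumption: synthetic binary data standing in for the preprocessed Titanic features.
X_demo, y_demo = make_classification(n_samples=300, n_features=8, random_state=0)

# Hypothetical stand-in for clf_LoR from the post.
base_model = LogisticRegression(max_iter=1000)

# Same ensemble settings as the post; n_jobs is left at its default here.
demo_bagging = BaggingRegressor(base_model, n_estimators=10, max_samples=0.8,
                                max_features=1.0, bootstrap=True,
                                bootstrap_features=False, random_state=0)
demo_bagging.fit(X_demo, y_demo)

# Each base model predicts 0 or 1; the regressor averages those predictions,
# so each value is the fraction of base models that voted for class 1.
vote_fraction = demo_bagging.predict(X_demo)

# Assumption: label a row as class 1 when at least half of the base models agree.
survived_pred = (vote_fraction >= 0.5).astype(int)
```

If class labels or class probabilities are wanted directly, BaggingClassifier accepts the same ensemble parameters and provides predict and predict_proba.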
class BaggingRegressor Found at: sklearn.ensemble.bagging

class BaggingRegressor(BaseBagging, RegressorMixin):
    """A Bagging regressor.

    A Bagging regressor is an ensemble meta-estimator that fits base
    regressors each on random subsets of the original dataset and then
    aggregate their individual predictions (either by voting or by averaging)
    to form a final prediction. Such a meta-estimator can typically be used as
    a way to reduce the variance of a black-box estimator (e.g., a decision
    tree), by introducing randomization into its construction procedure and
    then making an ensemble out of it.

    This algorithm encompasses several works from the literature. When random
    subsets of the dataset are drawn as random subsets of the samples, then
    this algorithm is known as Pasting [1]_. If samples are drawn with
    replacement, then the method is known as Bagging [2]_. When random subsets
    of the dataset are drawn as random subsets of the features, then the method
    is known as Random Subspaces [3]_. Finally, when base estimators are built
    on subsets of both samples and features, then the method is known as
    Random Patches [4]_.

    Read more in the :ref:`User Guide <bagging>`.

    Parameters
    ----------
    base_estimator : object or None, optional (default=None)
        The base estimator to fit on random subsets of the dataset.
        If None, then the base estimator is a decision tree.

    n_estimators : int, optional (default=10)
        The number of base estimators in the ensemble.

    max_samples : int or float, optional (default=1.0)
        The number of samples to draw from X to train each base estimator.
        - If int, then draw `max_samples` samples.
        - If float, then draw `max_samples * X.shape[0]` samples.

    max_features : int or float, optional (default=1.0)
        The number of features to draw from X to train each base estimator.
        - If int, then draw `max_features` features.
        - If float, then draw `max_features * X.shape[1]` features.

    bootstrap : boolean, optional (default=True)
        Whether samples are drawn with replacement.

    bootstrap_features : boolean, optional (default=False)
        Whether features are drawn with replacement.

    oob_score : bool
        Whether to use out-of-bag samples to estimate
        the generalization error.

    warm_start : bool, optional (default=False)
        When set to True, reuse the solution of the previous call to fit
        and add more estimators to the ensemble, otherwise, just fit
        a whole new ensemble.

    n_jobs : int, optional (default=1)
        The number of jobs to run in parallel for both `fit` and `predict`.
        If -1, then the number of jobs is set to the number of cores.

    random_state : int, RandomState instance or None, optional (default=None)
        If int, random_state is the seed used by the random number generator;
        If RandomState instance, random_state is the random number generator;
        If None, the random number generator is the RandomState instance used
        by `np.random`.

    verbose : int, optional (default=0)
        Controls the verbosity of the building process.

    Attributes
    ----------
    estimators_ : list of estimators
        The collection of fitted sub-estimators.

    estimators_samples_ : list of arrays
        The subset of drawn samples (i.e., the in-bag samples) for each base
        estimator. Each subset is defined by a boolean mask.

    estimators_features_ : list of arrays
        The subset of drawn features for each base estimator.

    oob_score_ : float
        Score of the training dataset obtained using an out-of-bag estimate.

    oob_prediction_ : array of shape = [n_samples]
        Prediction computed with out-of-bag estimate on the training
        set. If n_estimators is small it might be possible that a data point
        was never left out during the bootstrap. In this case,
        `oob_prediction_` might contain NaN.

    References
    ----------

    .. [1] L. Breiman, "Pasting small votes for classification in large
           databases and on-line", Machine Learning, 36(1), 85-103, 1999.

    .. [2] L. Breiman, "Bagging predictors", Machine Learning, 24(2), 123-140,
           1996.

    .. [3] T. Ho, "The random subspace method for constructing decision
           forests", Pattern Analysis and Machine Intelligence, 20(8), 832-844,
           1998.

    .. [4] G. Louppe and P. Geurts, "Ensembles on Random Patches", Machine
           Learning and Knowledge Discovery in Databases, 346-361, 2012.
    """

    def __init__(self,
                 base_estimator=None,
                 n_estimators=10,
                 max_samples=1.0,
                 max_features=1.0,
                 bootstrap=True,
                 bootstrap_features=False,
                 oob_score=False,
                 warm_start=False,
                 n_jobs=1,
                 random_state=None,
                 verbose=0):
        super(BaggingRegressor, self).__init__(
            base_estimator,
            n_estimators=n_estimators,
            max_samples=max_samples,
            max_features=max_features,
            bootstrap=bootstrap,
            bootstrap_features=bootstrap_features,
            oob_score=oob_score,
            warm_start=warm_start,
            n_jobs=n_jobs,
            random_state=random_state,
            verbose=verbose)

    def predict(self, X):
        """Predict regression target for X.

        The predicted regression target of an input sample is computed as the
        mean predicted regression targets of the estimators in the ensemble.

        Parameters
        ----------
        X : {array-like, sparse matrix} of shape = [n_samples, n_features]
            The training input samples. Sparse matrices are accepted only if
            they are supported by the base estimator.

        Returns
        -------
        y : array of shape = [n_samples]
            The predicted values.
        """
        check_is_fitted(self, "estimators_features_")
        # Check data
        X = check_array(X, accept_sparse=['csr', 'csc'])

        # Parallel loop
        n_jobs, n_estimators, starts = _partition_estimators(self.n_estimators,
                                                             self.n_jobs)

        all_y_hat = Parallel(n_jobs=n_jobs, verbose=self.verbose)(
            delayed(_parallel_predict_regression)(
                self.estimators_[starts[i]:starts[i + 1]],
                self.estimators_features_[starts[i]:starts[i + 1]],
                X)
            for i in range(n_jobs))

        # Reduce
        y_hat = sum(all_y_hat) / self.n_estimators

        return y_hat

    def _validate_estimator(self):
        """Check the estimator and set the base_estimator_ attribute."""
        super(BaggingRegressor, self)._validate_estimator(
            default=DecisionTreeRegressor())

    def _set_oob_score(self, X, y):
        n_samples = y.shape[0]

        predictions = np.zeros((n_samples,))
        n_predictions = np.zeros((n_samples,))

        for estimator, samples, features in zip(self.estimators_,
                                                self.estimators_samples_,
                                                self.estimators_features_):
            # Create mask for OOB samples
            mask = ~samples

            predictions[mask] += estimator.predict((X[mask, :])[:, features])
            n_predictions[mask] += 1

        if (n_predictions == 0).any():
            warn("Some inputs do not have OOB scores. "
                 "This probably means too few estimators were used "
                 "to compute any reliable oob estimates.")
            n_predictions[n_predictions == 0] = 1

        predictions /= n_predictions

        self.oob_prediction_ = predictions
        self.oob_score_ = r2_score(y, predictions)
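The docstring above names four sampling schemes (Pasting, Bagging, Random Subspaces, Random Patches) that differ only in how samples and features are drawn. The sketch below shows one way those schemes could be expressed through the constructor parameters, plus an out-of-bag score. The data is synthetic, and the specific fractions (0.5) and estimator counts are illustrative assumptions, not values from the post.

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import BaggingRegressor

# Assumption: synthetic regression data, used only to make the sketch runnable.
X_reg, y_reg = make_regression(n_samples=200, n_features=10, noise=0.1,
                               random_state=0)

# Parameter settings corresponding to the four schemes named in the docstring.
variants = {
    # Pasting: subsets of samples, drawn without replacement.
    "pasting": dict(bootstrap=False, max_samples=0.5),
    # Bagging: samples drawn with replacement.
    "bagging": dict(bootstrap=True, max_samples=1.0),
    # Random Subspaces: every sample, but only a subset of the features.
    "random_subspaces": dict(bootstrap=False, max_samples=1.0, max_features=0.5),
    # Random Patches: subsets of both samples and features.
    "random_patches": dict(bootstrap=False, max_samples=0.5, max_features=0.5),
}

fitted = {}
for name, params in variants.items():
    # With no base estimator given, a decision tree regressor is used.
    model = BaggingRegressor(n_estimators=20, random_state=0, **params)
    fitted[name] = model.fit(X_reg, y_reg)

# Out-of-bag estimate: requires bootstrap=True so that each base estimator
# leaves some training rows out. oob_score_ is an R^2 score (see _set_oob_score).
oob_model = BaggingRegressor(n_estimators=50, bootstrap=True, oob_score=True,
                             random_state=0).fit(X_reg, y_reg)
oob_r2 = oob_model.oob_score_
```

After fitting, estimators_features_ on each model lists the feature indices each base estimator was trained on, which is a quick way to confirm that the feature subsampling took effect.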
