ML - LightGBM: Feature interpretability on the Titanic dataset with LightGBM and the SHAP algorithm (quantifying each feature's contribution score to the model)

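Contents

Feature interpretability on the Titanic dataset with LightGBM and the SHAP algorithm (quantifying each feature's contribution score to the model)

Design

Output

Core code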


Related articles
ML - LightGBM: Feature interpretability on the Titanic dataset with LightGBM and the SHAP algorithm (quantifying each feature's contribution score to the model): implementation

Feature interpretability on the Titanic dataset with LightGBM and the SHAP algorithm (quantifying each feature's contribution score to the model)
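
As a rough illustration of what a "contribution score" means here, the sketch below computes exact Shapley values for a tiny hand-written scoring function by enumerating every feature coalition. Everything in it is an assumption made for illustration: the `model` function, its coefficients, the three binary features and the all-zero `baseline` are hypothetical, and this brute-force enumeration is not the TreeSHAP algorithm that shap applies to LightGBM models.

```python
from itertools import combinations
from math import factorial


def model(x):
    """Hypothetical survival score; NOT a trained LightGBM model."""
    female, first_class, child = x
    return 0.125 + 0.5 * female + 0.125 * first_class + 0.25 * female * first_class


def shapley_values(predict, x, baseline):
    """Exact Shapley values of `predict` at `x`, relative to a single `baseline` row."""
    n = len(x)

    def value(subset):
        # Features in `subset` take the instance's value; the rest stay at the baseline.
        mixed = [x[i] if i in subset else baseline[i] for i in range(n)]
        return predict(mixed)

    phi = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        total = 0.0
        for size in range(n):
            # Shapley weight for coalitions of this size: |S|! (n - |S| - 1)! / n!
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                coalition = set(subset)
                total += weight * (value(coalition | {i}) - value(coalition))
        phi.append(total)
    return phi


x = [1, 1, 1]          # female, first class, child (illustrative passenger)
baseline = [0, 0, 0]   # reference passenger (an assumption of this sketch)
phi = shapley_values(model, x, baseline)
print(dict(zip(["female", "first_class", "child"], phi)))
print(sum(phi), model(x) - model(baseline))
```

For this toy function the contributions work out analytically to 0.625 for `female`, 0.25 for `first_class` and 0 for `child` (which the function never uses), and they sum to `model(x) - model(baseline)` = 0.875. That additivity is what lets SHAP values be read as per-feature shares of a single prediction.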

Design

To be updated……

Output

Core code

# Excerpt 1: the shap package initializer (shap/__init__.py, version 0.37.0)
# flake8: noqa

import warnings
import sys

__version__ = '0.37.0'

# check python version
if (sys.version_info < (3, 0)):
    warnings.warn("As of version 0.29.0 shap only supports Python 3 (not 2)!")

from ._explanation import Explanation, Cohorts

# explainers
from .explainers._explainer import Explainer
from .explainers._kernel import Kernel as KernelExplainer
from .explainers._sampling import Sampling as SamplingExplainer
from .explainers._tree import Tree as TreeExplainer
from .explainers._deep import Deep as DeepExplainer
from .explainers._gradient import Gradient as GradientExplainer
from .explainers._linear import Linear as LinearExplainer
from .explainers._partition import Partition as PartitionExplainer
from .explainers._permutation import Permutation as PermutationExplainer
from .explainers._additive import Additive as AdditiveExplainer
from .explainers import other

# plotting (only loaded if matplotlib is present)
def unsupported(*args, **kwargs):
    warnings.warn("matplotlib is not installed so plotting is not available! Run `pip install matplotlib` to fix this.")

try:
    import matplotlib
    have_matplotlib = True
except ImportError:
    have_matplotlib = False
if have_matplotlib:
    from .plots._beeswarm import summary_legacy as summary_plot
    from .plots._decision import decision as decision_plot, multioutput_decision as multioutput_decision_plot
    from .plots._scatter import dependence_legacy as dependence_plot
    from .plots._force import force as force_plot, initjs, save_html, getjs
    from .plots._image import image as image_plot
    from .plots._monitoring import monitoring as monitoring_plot
    from .plots._embedding import embedding as embedding_plot
    from .plots._partial_dependence import partial_dependence as partial_dependence_plot
    from .plots._bar import bar_legacy as bar_plot
    from .plots._waterfall import waterfall as waterfall_plot
    from .plots._group_difference import group_difference as group_difference_plot
    from .plots._text import text as text_plot
else:
    summary_plot = unsupported
    decision_plot = unsupported
    multioutput_decision_plot = unsupported
    dependence_plot = unsupported
    force_plot = unsupported
    initjs = unsupported
    save_html = unsupported
    image_plot = unsupported
    monitoring_plot = unsupported
    embedding_plot = unsupported
    partial_dependence_plot = unsupported
    bar_plot = unsupported
    waterfall_plot = unsupported
    text_plot = unsupported


# other stuff :)
from . import datasets
from . import utils
from . import links

#from . import benchmark

from .utils._legacy import kmeans
from .utils import sample, approximate_interactions


# Excerpt 2: the start of summary_legacy, which the initializer above imports
# from .plots._beeswarm under the name summary_plot

# TODO: Add support for hclustering based explanations where we sort the leaf order by magnitude and then show the dendrogram to the left
def summary_legacy(shap_values, features=None, feature_names=None, max_display=None, plot_type=None,
                   color=None, axis_color="#333333", title=None, alpha=1, show=True, sort=True,
                   color_bar=True, plot_size="auto", layered_violin_max_num_bins=20, class_names=None,
                   class_inds=None,
                   color_bar_label=labels["FEATURE_VALUE"],
                   cmap=colors.red_blue,
                   # deprecated
                   auto_size_plot=None,
                   use_log_scale=False):
    """Create a SHAP beeswarm plot, colored by feature values when they are provided.

    Parameters
    ----------
    shap_values : numpy.array
        For single output explanations this is a matrix of SHAP values (# samples x # features).
        For multi-output explanations this is a list of such matrices of SHAP values.

    features : numpy.array or pandas.DataFrame or list
        Matrix of feature values (# samples x # features) or a feature_names list as shorthand

    feature_names : list
        Names of the features (length # features)

    max_display : int
        How many top features to include in the plot (default is 20, or 7 for interaction plots)

    plot_type : "dot" (default for single output), "bar" (default for multi-output), "violin",
        or "compact_dot".
        What type of summary plot to produce. Note that "compact_dot" is only used for
        SHAP interaction values.

    plot_size : "auto" (default), float, (float, float), or None
        What size to make the plot. By default the size is auto-scaled based on the number of
        features that are being displayed. Passing a single float will cause each row to be that
        many inches high. Passing a pair of floats will scale the plot by that
        number of inches. If None is passed then the size of the current figure will be left
        unchanged.
    """

    # support passing an explanation object
    if str(type(shap_values)).endswith("Explanation'>"):
        shap_exp = shap_values
        base_value = shap_exp.base_value
        shap_values = shap_exp.values
        if features is None:
            features = shap_exp.data
        if feature_names is None:
            feature_names = shap_exp.feature_names
        # if out_names is None: # TODO: waiting for slicer support of this
        #     out_names = shap_exp.output_names

    # deprecation warnings
    if auto_size_plot is not None:
        warnings.warn("auto_size_plot=False is deprecated and is now ignored! Use plot_size=None instead.")

    multi_class = False
    if isinstance(shap_values, list):
        multi_class = True
        if plot_type is None:
            plot_type = "bar" # default for multi-output explanations
        assert plot_type == "bar", "Only plot_type = 'bar' is supported for multi-output explanations!"
    else:
        if plot_type is None:
            plot_type = "dot" # default for single output explanations
        assert len(shap_values.shape) != 1, "Summary plots need a matrix of shap_values, not a vector."

    # default color:
    if color is None:
        if plot_type == 'layered_violin':
            color = "coolwarm"
        elif multi_class:
            color = lambda i: colors.red_blue_circle(i/len(shap_values))
        else:
            color = colors.blue_rgb
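
The excerpt above stops before any plotting happens. As a hedged sketch of the global score that a "bar" summary plot reports, the snippet below ranks features by their mean absolute SHAP value over a matrix shaped (# samples x # features), as described in the docstring. The matrix values and the feature names are made up for illustration and are not results from the Titanic model; `mean_abs_shap` is a hypothetical helper, not part of shap.

```python
def mean_abs_shap(shap_matrix, feature_names):
    """Rank features by mean |SHAP| over all samples (largest first)."""
    n_samples = len(shap_matrix)
    scores = {
        name: sum(abs(row[j]) for row in shap_matrix) / n_samples
        for j, name in enumerate(feature_names)
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Illustrative SHAP matrix: 4 samples x 3 features (values are invented).
feature_names = ["Sex", "Pclass", "Age"]
shap_matrix = [
    [0.5, -0.25, 0.0],
    [-1.0, 0.25, 0.5],
    [0.5, -0.5, 0.0],
    [-1.0, 0.0, 0.0],
]

ranking = mean_abs_shap(shap_matrix, feature_names)
for name, score in ranking:
    print(f"{name}: {score}")
```

Taking the absolute value first matters: positive and negative contributions from different samples would otherwise cancel, and a feature that strongly pushes predictions in both directions would look unimportant.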
