Hierarchical Performance Metrics and Where to Find Them

Hierarchical machine learning models are one top-notch trick. As discussed in previous posts, considering the natural taxonomy of the data when designing our models can be well worth our while. Instead of flattening out and ignoring those inner hierarchies, we’re able to use them, making our models smarter and more accurate.


“More accurate”, I say — are they, though? How can we tell? We are people of science, after all, and we expect bold claims to be supported by the data. This is why we have performance metrics. Whether it’s precision, f1-score, or any other lovely metric we’ve got our eye on — if using hierarchy in our models improves their performance, the metrics should show it.


Problem is, if we use regular performance metrics — the ones designed for flat, one-level classification — we go back to ignoring that natural taxonomy of the data.


If we do hierarchy, let’s do it all the way. If we’ve decided to celebrate our data’s taxonomy and build our model in its image, this needs to also be a part of measuring its performance.


How do we do this? The answer lies below.


Before We Dive In

This post is about measuring the performance of machine learning models designed for hierarchical classification. It kind of assumes you know what all those words mean. If you don’t, check out my previous posts on the topic. Especially the one introducing the subject. Really. You’re gonna want to know what hierarchical classification is before learning how to measure it. That’s kind of an obvious one.


Throughout this post, I’ll be giving examples based on this taxonomy of common house pets:


The taxonomy of common house pets. My neighbor just adopted the cutest baby Pegasus.

Oh So Many Metrics

So we’ve got a whole ensemble of hierarchically-structured local classifiers, ready to do our bidding. How do we evaluate them?


That is not a trivial problem, and the solution is not obvious. As we’ve seen in previous posts in this series, different projects require different treatment. The best metric could differ depending on the specific requirements and limitations of your project.


All in all, there are three main options to choose from. Let’s introduce them, shall we?


The contestants, in all their grace and glory:


The Down-To-Earth One: Flat Classification Metrics

These are the same classification metrics we all know and love (precision, recall, f-score — you name it), applied… Well, flatly.


Same as with the original “flat classification” approach (described in the first post in this series), this method is all about ignoring the hierarchy. Only the final, leaf-node predictions are considered (in our house pets example, those are the specific breeds), and they’re all considered as equal classes, without any special treatment to sibling-classes vs. non-sibling ones.


This method is simple, but, obviously, not ideal. We don’t want the errors at different levels of the class hierarchy to be penalized in the same way (if I mistook a Pegasus for a Narwhal, that’s not as bad as mistaking it for a Labrador). Also, there isn’t an obvious way to handle cases where the final prediction is not a leaf-node one — which could definitely be the case if you implemented the previously-mentioned blocking by confidence method.

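To see just how flat this is, here’s a minimal sketch in plain Python (the labels below are made up from the house-pets taxonomy): per-class precision, recall and f-score computed on leaf predictions only. Note how mistaking a Pegasus for a Narwhal costs exactly as much as mistaking it for a Labrador.

```python
def flat_prf(y_true, y_pred, positive):
    """Per-class precision/recall/F1, treating `positive` vs. everything
    else as a binary problem -- the hierarchy is ignored entirely."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p == positive)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1

# One hit and two misses -- one miss "close" in the taxonomy, one very far,
# yet both are penalized identically:
y_true = ["pegasus", "pegasus", "pegasus"]
y_pred = ["pegasus", "narwhal", "labrador"]
print(flat_prf(y_true, y_pred, positive="pegasus"))  # → (1.0, 0.3333333333333333, 0.5)
```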

The Hipster One: A Custom-Made Metric

Not happy with the flat metrics, and feel that creative spark tingling in your fingertips? You can conjure up your own special metric, which specifically fits your unique snowflake of a use case.


This could be useful where the model needs to fit some unusual business constraints. If, for example, you don’t really care about falsely identifying dogs as unicorns, but a Sphynx cat must be correctly spotted or all hell breaks loose, you can design your metrics accordingly, giving more or less weight to different errors.

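As a toy illustration (the class pairs and costs below are invented for the example, not taken from any real project), such a metric can be as simple as an average confusion cost with hand-picked weights:

```python
# Hypothetical business-driven costs: confusing a dog for a unicorn is cheap,
# missing a Sphynx cat is catastrophic. Any unlisted confusion costs 1.0.
COST = {
    ("dog", "unicorn"): 0.1,   # we barely care about this mix-up
    ("sphynx", "dog"): 10.0,   # all hell breaks loose
}

def weighted_error(y_true, y_pred, cost=COST, default=1.0):
    """Average per-example cost; correct predictions cost nothing."""
    total = sum(cost.get((t, p), default)
                for t, p in zip(y_true, y_pred) if t != p)
    return total / len(y_true)

# Two errors with very different price tags, averaged over three examples:
print(weighted_error(["dog", "sphynx", "dog"], ["unicorn", "dog", "dog"]))
```

Lower is better here; unlike precision or recall, a metric like this one directly encodes which mistakes the business can and cannot live with.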

The Pretentious One: Hierarchy-Specific Variations on the Regular Classification Metrics

Those are variations of well-known precision, recall and f-score metrics, specifically adapted to fit hierarchical classification.


Please bear with me as I throw some math notations in your general direction:


hP = Σi |Pi ∩ Ti| / Σi |Pi|        hR = Σi |Pi ∩ Ti| / Σi |Ti|        hF = 2 · hP · hR / (hP + hR)

Definitions for hierarchical precision (hP), hierarchical recall (hR) and hierarchical f-measure (hF), respectively.

What does it all mean, though?


Pi is the set consisting of the most specific class (or classes, in the case of a multi-label problem) predicted for each test example i, together with all of its/their ancestor classes; Ti is the set consisting of the true most specific class(es) of test example i, and all of its/their ancestor classes; and each summation is computed, of course, over all of the test set examples.


This one is a bit of a handful to unpack, so if you find yourself puzzled, check out the appendix, where I explain it in more detail.

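The definitions above can also be sketched in a few lines of code. The child-to-parent dictionary below is an illustrative fragment of the house-pets taxonomy, not anyone’s production code:

```python
# Hypothetical child -> parent edges from the house-pets taxonomy.
PARENT = {"dalmatian": "dog", "labrador": "dog",
          "sphynx": "cat", "narwhal": "whale"}

def with_ancestors(label):
    """A predicted/true class together with all of its ancestor classes."""
    out = set()
    while label is not None:
        out.add(label)
        label = PARENT.get(label)
    return out

def hierarchical_prf(y_true, y_pred):
    """hP, hR and hF: set overlaps summed over all test examples."""
    P = [with_ancestors(p) for p in y_pred]
    T = [with_ancestors(t) for t in y_true]
    overlap = sum(len(p & t) for p, t in zip(P, T))
    hp = overlap / sum(len(p) for p in P)
    hr = overlap / sum(len(t) for t in T)
    hf = 2 * hp * hr / (hp + hr) if hp + hr else 0.0
    return hp, hr, hf

# A labrador called a dalmatian still gets credit for the shared "dog" ancestor:
print(hierarchical_prf(["labrador", "dalmatian"], ["dalmatian", "dalmatian"]))
# → (0.75, 0.75, 0.75)
```

A flat metric would score that same labrador mistake as a total miss; here the shared ancestor keeps it a half-credit error, which is exactly the behavior we wanted.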

Now, if you’ve implemented your model with non-mandatory leaf-node prediction (meaning the most specific level predicted doesn’t have to be the deepest one), some adjustments need to be made; I won’t go into it here, but if this is something you want to read more about, let me know.


Which Metric Is The Perfect Match?

The everlasting question. As I previously mentioned, there isn’t one obvious answer, but here are my own thoughts on the subject:


  • Flat metrics: it’s a simple enough method, but it loses hierarchy information, which you probably deem important if you went through the trouble of building a hierarchical ensemble model in the first place. I would recommend using flat metrics only for super quick-and-dirty projects, where time is a major limiting factor.


  • Custom-made, unique metrics: might be a better fit, but you pay with time and effort. Also, since you’ll be using a metric that wasn’t peer reviewed, you could be missing something important. I would recommend custom-made metrics only when the project at hand has very unique requirements that should be taken into account when evaluating model performance.


  • Hierarchical versions of common classification metrics: this method is somewhat intuitive (once you get the hang of it), and it makes a lot of sense for a hierarchical model. However, it might not fit your own use-case the best (for example, there’s no added weight for a correct/false prediction of the deepest class — which might be important in some use cases). It also requires some extra implementation time. All in all, though, I think it’s a good enough premade solution, and should probably be the first choice for most projects.


To Conclude

A machine learning model is nothing without its performance metrics, and hierarchical models require their own special care. There is no one best method to measure hierarchy-based classification: different approaches have their own pros and cons, and each project has its own best fit. If you got this far, you hopefully have an idea as to which method is best for yours, and can now measure your model once you’ve got it rolling.


This post concludes my four-post series about hierarchical classification models. If you’ve read all of them, you should have all of the tools you need to design, build and measure an outstanding hierarchical classification project. I hope you put it to the best use possible.


Previous posts in the series:


  1. The Hitchhiker’s Guide to Hierarchical Classification


  2. Hierarchical Classification with Local Classifiers: Down the Rabbit Hole


  3. Hierarchical Classification by Local Classifiers: Your Must-Know Tweaks & Tricks


Noa Weiss is an AI & Machine Learning Consultant based in Tel Aviv.


Appendix

Can’t figure out those pesky hierarchical metrics? I’m here to help.


In the table below I go over the mock results of a “common house pets” hierarchical model, looking at the measures for the “Dalmatian” class (remember: precision, recall and f-score metrics are calculated per class, treating the labels — both predicted and true — as binary).


I go over a few examples, checking out what each of them contributes to both the precision and the recall scores. Remember — the final precision/recall scores are computed by summing those individual contributions over all of the examples.


Demystifying hierarchical metrics one dog at a time.

Comments by example:


  1. Misclassification of a different breed as a Dalmatian: a full point for recall (as the “dog” part was correctly identified), but only half a point for precision (as “dog” was correct, but the predicted “dalmatian” label was wrong). Recall isn’t negatively affected since the “Labrador” label, which was missed here, is not part of the [Dog, Dalmatian] classes, which are the ones measured here.


  2. Misclassification of a narwhal as a dalmatian — a zero for precision (as both the “dog” and “dalmatian” predicted labels are wrong), but the recall metric isn’t affected, since the true narwhal label is irrelevant to the measurements of the [Dog, Dalmatian] classes.


  3. Perfect prediction — an extra point for both precision and recall.


  4. Misclassification of a dalmatian as a different breed: a full point for the precision metric (as the “dog” classifier, the only one of the two that came out positive, was correct), but only half a point for recall (as the “dog” label was correctly identified, but the “dalmatian” one was missed).


  5. A dalmatian misclassified as a Rainbow unicorn: 0 for recall (as both dog and dalmatian labels were missed), but the precision score isn’t affected.


  6. This example doesn’t teach us anything about the performance of the Dog/Dalmatian classifiers, so it stands to reason it doesn’t affect the score.

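The per-example contributions above can be reproduced with a short sketch. The taxonomy dictionary is an illustrative fragment of the running example, and `None` marks a metric the example doesn’t affect at all:

```python
# Hypothetical child -> parent edges from the house-pets taxonomy.
PARENT = {"dalmatian": "dog", "labrador": "dog",
          "narwhal": "whale", "rainbow unicorn": "unicorn"}

def with_ancestors(label):
    """A class together with all of its ancestor classes."""
    out = set()
    while label is not None:
        out.add(label)
        label = PARENT.get(label)
    return out

MEASURED = {"dog", "dalmatian"}  # the classes whose scores we're computing

def contribution(true_leaf, pred_leaf):
    """Per-example contribution to the Dog/Dalmatian precision and recall.
    None means the example doesn't affect that metric."""
    p = with_ancestors(pred_leaf) & MEASURED
    t = with_ancestors(true_leaf) & MEASURED
    prec = len(p & t) / len(p) if p else None
    rec = len(p & t) / len(t) if t else None
    return prec, rec

print(contribution("labrador", "dalmatian"))         # → (0.5, 1.0)   example 1
print(contribution("narwhal", "dalmatian"))          # → (0.0, None)  example 2
print(contribution("dalmatian", "dalmatian"))        # → (1.0, 1.0)   example 3
print(contribution("dalmatian", "labrador"))         # → (1.0, 0.5)   example 4
print(contribution("dalmatian", "rainbow unicorn"))  # → (None, 0.0)  example 5
print(contribution("narwhal", "narwhal"))            # → (None, None) example 6
```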

Source: C.N. Silla & A.A. Freitas, A survey of hierarchical classification across different application domains (2011), Data Mining and Knowledge Discovery, 22(1–2):182–196


Translated from: https://towardsdatascience.com/hierarchical-performance-metrics-and-where-to-find-them-7090aaa07183
