Machine Learning's Triangle of Error

By David Weinberger

AI Outside In is a column by PAIR’s writer-in-residence, David Weinberger, who offers his outsider perspective on key ideas in machine learning. His opinions are his own and do not necessarily reflect those of Google.

Machine learning's superpower

When we humans argue over what’s fair, sometimes it’s about principles, sometimes about consequences, and sometimes about trade-offs. But machine learning systems can bring us to think about fairness — and many other things — in terms of three interrelated factors: two ways the machine learning (ML) can go wrong, and the most basic way of adjusting the balance between these potential errors. The type of error you’d rather live with depends entirely on the sort of fairness — defined mathematically — you’re aiming your ML system at. But one way or another, you have to decide.

At their heart, many ML systems are classifiers. They ask: Should this photo go into the bucket of beach photos or not? Should this dark spot on a medical scan be classified as a fibrous growth or something else? Should this book go on the “Recommended for You” or “You’re Gonna Hate It” list? ML’s superpower is that it lets computers make these sorts of “decisions” based on what they’ve inferred from looking at thousands or even millions of examples that have already been reliably classified. From these examples they notice patterns that indicate which categories new inputs should be put into.

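To make the classifier idea concrete, here is a minimal sketch of my own, not the article’s: it trains scikit-learn’s LogisticRegression (a stand-in for whatever model a real system would use) on a handful of invented, already-labeled examples, then asks it to put a new input into a bucket.

```python
# A minimal sketch of a classifier: learn from reliably labeled
# examples, then sort a new input into a bucket. The two numeric
# features and the labels are toy values, invented for illustration.
from sklearn.linear_model import LogisticRegression

X_train = [[0.9, 0.8], [0.8, 0.7], [0.2, 0.1], [0.1, 0.3]]
y_train = [1, 1, 0, 0]  # 1 = "Beach", 0 = "Not a Beach"

model = LogisticRegression()
model.fit(X_train, y_train)  # infer patterns from the labeled examples

# Classify a new, unseen input.
print(model.predict([[0.7, 0.9]]))  # expected: [1], the "Beach" bucket
```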

While this works better than almost anyone would expect — and a tremendous amount of research is devoted to fundamental improvements in classification algorithms — virtually every ML system that classifies inputs mis-classifies some of them. An image classifier might think that the photo of a desert is a photo of a beach. The cellphone you’re dictating into might insist that you said “Wreck a nice beach” instead of “Recognize speech.”

So, researchers and developers typically test and tune their ML systems by having them classify data that’s already been reliably tagged — the same sort of data these systems were trained on. In fact, it’s typical to hold back some of the inputs the system is being trained on so that it can test itself on data it hasn’t yet seen. Since the right classifications are known for the test inputs, the developers can quickly see how well the system has done.

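The held-back data the article describes is usually produced with a simple split of the labeled examples. A sketch, again using scikit-learn and invented toy data:

```python
# Sketch of holding back part of the labeled data so the system can
# be tested on inputs it hasn't seen. Data and split size are
# illustrative, not from the article.
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

X = [[0.9, 0.8], [0.8, 0.7], [0.6, 0.9], [0.2, 0.1], [0.1, 0.3], [0.3, 0.2]]
y = [1, 1, 1, 0, 0, 0]  # 1 = "Beach", 0 = "Not a Beach"

# Hold back a third of the labeled examples for testing.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=1 / 3, random_state=0)

model = LogisticRegression().fit(X_train, y_train)

# Because the right answers for the test inputs are already known,
# the developers can immediately see how well the system did.
print(model.score(X_test, y_test))  # fraction classified correctly
```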

In this sort of basic testing, there are two ways the system can go wrong. An image classifier designed simply to identify photos of beaches might, say, put an image of the Sahara into the “Beach” bucket, or it might put an image of a beach into the “Not a Beach” bucket.

For this post’s purposes, let’s call the first “False alarms”: the ML thinks the photo of the Sahara depicts a beach.

The second “Missed targets”: the ML failed to recognize an actual beach photo.

ML practitioners use other terms for these errors. False alarms are false positives. Missed targets are false negatives. But just about everyone finds these names confusing, even many professionals. Non-medical folk understandably can assume that positive test results are always good news. In the ML world, it’s easy to confuse the positivity of the classification with the positivity of the trait being classified. For example, ML might be used to look at lots of metrics to assess whether a car is likely to need service soon. If a healthy car is put into the “Needs Service” bucket, it would count as a false positive even though we might think of needing service as a negative. And logically, shouldn’t a false negative be a positive? The concepts are crucial, but the terms are anything but intuitive.

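In code, the two error types fall out of comparing a classifier’s predictions with the known answers. A sketch using scikit-learn’s confusion_matrix; the labels below are invented for illustration:

```python
# Counting the two kinds of error. In standard terms these are the
# off-diagonal cells of a confusion matrix: false positives (this
# post's "false alarms") and false negatives ("missed targets").
from sklearn.metrics import confusion_matrix

y_true = [1, 1, 1, 0, 0, 0, 0, 1]  # 1 = actually a beach photo
y_pred = [1, 0, 1, 0, 1, 0, 0, 1]  # what the classifier said

tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
print(f"False alarms (false positives):   {fp}")  # a desert called a beach
print(f"Missed targets (false negatives): {fn}")  # a beach the ML missed
```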

So, let’s go with false alarms and missed targets as we talk about errors.

Deep-reaching consequences

Take an example that doesn’t involve machine learning, at least not yet. Let’s say you’re adjusting a body scanner at an airport security checkpoint. Those who fly often (back in the day) can attest to the fact that most of the people for whom the scanner buzzes are in fact not security threats. They get manually screened by an agent — often a pat-down — and are sent on their way. That’s not an accident or a misadjustment. The scanners are set to generate false alarms rather frequently: if there’s any doubt, the machine beeps a human over to double check.

That’s a bit of a bother for the mis-classified passengers, but if the machine were set to create fewer false alarms, it potentially would miss genuine threats. So it errs on the side of false alarms, rather than missed targets.

There are two things to note here. First, reducing false alarms can increase the number of missed targets, and vice versa. Second, which trade-off is better depends on the goal of the machine learning system. And that always depends on the context.

For example, false alarms are not too much of a bother when the result is that more passengers get delayed for a few seconds. But if the ML is being used to recommend preventive surgery, false alarms could potentially lead people to put themselves at unnecessary risk. Having a kidney removed for no good reason is far worse than getting an unnecessary pat down. (This is obviously why a human doctor will be involved in your decision.)

The consequences can reach deep. If your ML system is predicting which areas of town ought to be patrolled most closely by the police, then tolerating a high rate of false alarms may mean that local people will feel targeted for stop-and-frisk operations, potentially alienating them from the police force, which can have its own harmful consequences on a community…as well as other highly consequential outcomes.

False alarms are possible in every system designed by humans. They can be very expensive, in whatever dimensions you’re calculating costs.

It gets no less complex when considering how many missed targets you’re going to design your ML system to accept. If you tune your airport scanner so that it generates fewer false alarms, some people who are genuine threats may be waved on through, endangering an entire airplane. On the other hand, if your ML is deciding who is worthy of being granted a loan, a false alarm — someone who is granted a loan and then defaults on it — may be more costly to the lender than the missed opportunity of turning down someone who would have repaid the loan.

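A back-of-envelope cost calculation makes the lender’s asymmetry concrete. The dollar figures below are invented purely for illustration; a real lender would plug in its own numbers:

```python
# Sketch of asymmetric error costs for a lender. All figures are
# invented for illustration only.
COST_FALSE_ALARM = 10_000   # loan granted, borrower defaults
COST_MISSED_TARGET = 1_500  # profit lost by turning down a good borrower

def total_cost(false_alarms: int, missed_targets: int) -> int:
    """Total cost of a given mix of the two error types."""
    return (false_alarms * COST_FALSE_ALARM
            + missed_targets * COST_MISSED_TARGET)

# Trading five false alarms away for fifteen extra missed targets is
# still cheaper for the lender, even though more good applicants get
# turned down.
print(total_cost(false_alarms=10, missed_targets=5))   # 107500
print(total_cost(false_alarms=5, missed_targets=20))   # 80000
```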

Now, to not miss an opportunity to be confusing when talking about ML, consider an online book store that presents each user with suggestions for the next book to buy. What should the ML be told to prefer: Adding false alarms to the list, or avoiding missed opportunities? False alarms in this case are books the ML thinks the reader will be interested in, but the reader in fact doesn’t care about. Missed opportunities are the books the readers might actually buy but the ML thinks the reader wouldn’t care about. From the store’s point of view, what’s the best adjustment of those two sliders?

That question isn’t easy, and not just because the terms are non-intuitive for most of us. For one thing, should the buckets for books be “User Will Buy It” or, perhaps, “User Will Enjoy It”? Or maybe, “User Will Be Stretched By It”?

Then, for reasons external to ML, not all missed opportunities and false alarms are equal. For example, maybe your loan application ML is doing fine sorting applications into “Approve” and “Disapprove” buckets in terms of the missed opportunities and false alarms your company can tolerate. But suppose many more applications that become missed opportunities are coming from women or racial minorities. The system is performing up to specification, but that specification turns out to have unfair and unacceptable results.

Think hard and out loud

Adjusting the mix of false alarms and missed opportunities brings us to the third point of the Triangle of Error: the ML confidence level.

One of the easiest ways to adjust the percentage of false alarms and missed targets is to change the threshold of confidence required to make it into the bin. (Other ways include training the system on better data or adjusting its classification algorithms.) For example, suppose you’ve trained an ML system on hundreds of thousands of images that have been manually labeled as “Smiling” or “Not Smiling”. From this training, the ML has learned that a broad expanse of light patches towards the bottom of the image is highly correlated with smiles, but then there are the Clint Eastwoods whose smiles are much subtler. When the ML comes across a photo like that, it may classify it as smiling, but not as confidently as the image of the person with the broad, toothy grin.

If you want to lower the percentage of false alarms, you can raise the confidence level required to be put into the “Smiling” bin. Let’s say that on a scale of 0 to 10, the ML gives a particular toothy grin a 9, while Clint gets a 5. If you stipulate that it takes at least a 6 to make it into the “Smile” bin, Clint won’t make the grade; he’ll become a missed target. Your “Smile” bucket will become more accurate, but your “Not Smile” bucket will have at least one more missed opportunity.

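On the article’s 0-to-10 scale, the threshold move looks like this; a sketch using the made-up scores from the paragraph above:

```python
# Sketch of raising the confidence threshold, using the article's
# made-up 0-to-10 smile scores. Moving the bar from 5 to 6 turns
# Clint's borderline smile from a hit into a missed target.
scores = {"toothy grin": 9, "Clint Eastwood": 5}

def classify(threshold: int) -> dict:
    """Bin each photo as 'Smiling' iff its score clears the threshold."""
    return {name: ("Smiling" if score >= threshold else "Not Smiling")
            for name, score in scores.items()}

print(classify(threshold=5))  # both photos make the "Smiling" bin
print(classify(threshold=6))  # Clint is now a missed target
```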

Was that the right choice? That’s not something the machine can answer. It takes humans — design teams, communities, the full range of people affected by the machine learning — to decide what they want from the system, and what the trade-offs should be to best achieve that result.

Deciding on the trade-offs occasions difficult conversations. But perhaps one of the most useful consequences of machine learning at the social level is not only that it requires us humans to think hard and out loud about these issues, but the requisite conversations implicitly acknowledge that we can never entirely escape error. At best we can decide how to err in ways that meet our goals and that treat all as fairly as possible.

Translated from: https://medium.com/people-ai-research/machine-learnings-triangle-of-error-2c05267cb2bd
