Machine Learning and Plato's Allegory of the Cave

The allegory of the cave was presented by the Greek philosopher Plato in his work Republic, originally to compare “the effect of education and the lack of it on our nature”. Oddly enough, the state-of-the-art field of machine learning still fits, more or less, into this mold that is more than 2,000 years old.

The Allegory of the Cave

The allegory is told through a conversation between Socrates and his disciple Glaucon.

Simply put, Socrates tells Glaucon to imagine people living in a vast subterranean cave, open to the outside world only through a single steep and strenuous tunnel. The people are chained facing a tall wall, unable to turn their heads or break the chains. Behind the prisoners a great fire burns, casting whatever happens behind them as shadows of all shapes onto the wall they face.

Because the prisoners can never turn their heads at any point in their lives, they grow up and die watching the shadows on the wall, which constitute all the reality they know.

Symbolically speaking, the shadows on the wall represent the superficial truth: a ‘virtual’ reality perceived only through our senses, as opposed to the ultimate ‘real’ reality, which, according to Platonism, is made up of Forms (ideas) that we mortals can never know through sensation.

Further, the two-dimensional projections on the wall constitute the whole world for the poor prisoners. Their curious minds will try to develop stories or histories for those phantoms: how they tend to interact with each other, when some shadows tend to show up together, whether one peculiar shadow causes another to appear, and so on. They speculate and conjure up the most elaborate theories to explain the shadows’ behavior; yet sadly they remain ignorant of the true nature of the shadows.

If only they could turn their heads, they would find out that some of their favorite shadow theories were based on bad jokes played by tricksters right behind them, who might stop playing at any moment and so extinguish some of the prisoner team's best models. Even if we assume a benevolent world in which no one is out to trick the poor prisoners, at any moment a small random change in the angle at which a three-dimensional figure is presented could cause a big difference in its projection on the wall, surely wreaking considerable havoc among the prisoner theorists.

What Machine Learning Does

Despite all the hype and promises surrounding machine learning and artificial intelligence, what machine learning algorithms essentially do is simple. They look at a set of observations of a phenomenon together with the attached labels or values (labels for a classification task, values for a regression task), try to come up with a function, built from known functions, whose shape seems to match that of the data, and then use that function to assign a label or value when a new observation comes in.
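As a minimal sketch of that description, the snippet below fits a straight line (the "known function") to a handful of observations by ordinary least squares and then assigns a value to a new observation. The data points and the names `fit_line` and `predict` are made up for illustration; real projects would normally rely on a library rather than hand-written formulas.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Spread of x, and how x and y vary together
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(model, x):
    """Assign a value to a new observation using the fitted line."""
    slope, intercept = model
    return slope * x + intercept


# Hypothetical observations (the 'shadows'), roughly following y = 2x + 1
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.1, 2.9, 5.2, 6.8, 9.0]

model = fit_line(xs, ys)        # slope is about 1.97, intercept about 1.06
new_value = predict(model, 5.0)  # about 10.91
print(round(model[0], 2), round(model[1], 2), round(new_value, 2))
```

The model never sees whatever process generated the numbers; it only finds the line that best matches the recorded points and extrapolates from it.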

Photo by Markus Spiske on Unsplash

The rules of thumb are that (1) the quality of any machine learning output hinges heavily on the quality of the input (simply put, garbage in, garbage out); and (2) machine learning algorithms can never ‘learn’ or discover novel rules that are as yet unknown to us.

In other words, what we can ultimately achieve with machine learning models is bounded by the quality of the data on the one hand, and by the known ‘reality’ of the researchers who crafted those models on the other.

The Parallels

Let us go back to the allegory of the cave for a while. Just as real-life three-dimensional figures are projected onto the wall as two-dimensional shadows, in daily machine learning practice the descriptions of real-life objects (variables), once put down as numerical or categorical data, are never a complete picture of the original object.
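The loss of information can be made concrete with a toy example. In the sketch below, a hypothetical `project` function plays the role of the fire and the wall by dropping the depth coordinate of each point; a flat square and a cube (both invented for illustration) then cast exactly the same shadow, so no amount of analysis of the shadow alone can tell them apart.

```python
def project(points):
    """Orthographic projection onto the wall: drop the depth coordinate z."""
    return sorted({(x, y) for x, y, z in points})


# Two different three-dimensional objects, given by their corner points
flat_square = [(0, 0, 0), (1, 0, 0), (1, 1, 0), (0, 1, 0)]
cube = [(x, y, z) for x in (0, 1) for y in (0, 1) for z in (0, 1)]

objects_differ = set(flat_square) != set(cube)
same_shadow = project(flat_square) == project(cube)

print(objects_differ, same_shadow)  # True True
```

A dataset whose features omit some property of the real object is in the same position: distinct objects collapse into identical rows.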

The reason might be that we lack measurements of certain properties because of various logistical constraints. More likely, though, as people constrained since birth by all sorts of priming and collective learning experiences, we simply do not know that certain crucial properties of an object even exist. Obviously, if we are not even aware that something exists, how can we look for it in the first place, then ask questions about it, and finally measure it?

Projections are never a good description of the original object

In that light, the current efforts to bring alternative data into all kinds of machine learning problems make perfect sense and are laudable: we have realized that the traditional feature set no longer meets our needs, and since we are working with data of ‘shadows’ anyway, why not borrow snippets from other ‘shadows’ that may help us in practical terms? No wonder satellite images of factories and social network sentiment data are increasingly incorporated into automated trading models; the truth is that they have been good factors for predicting equity performance, and old-schoolers stuck with balance sheets are simply missing out.

It is not hard to imagine that millions of machine learning models are trained every day on data of ‘shadows’ while being tasked with figuring out the relationships (correlation, or even causality, if some overachieving project manager demands it) between the real-life objects that gave rise to the data. We need not go into the Neoplatonic doctrine of emanationism, which basically says that everything derives from a perfect first reality and becomes less pure and less perfect the further it is derived and alienated from that first reality; we might be able to describe the dilemma by turning to another example.

In Lewis Carroll’s novel Through the Looking-Glass, when the White Knight explains to Alice the song Haddocks’ Eyes, the author discusses the distinction between ‘the song, what the song is called, the name of the song, and what the name of the song is called’. The song itself, being a Platonic Form, gives rise to derived forms such as the name of the song, or even the name of the name of the song (which can go on infinitely); yet these derivations might not even bear a close relationship to the theme of the song. Nonetheless, the (low-level) derivations are in most cases what the real object is known and analyzed by, whereas the object itself remains ineffable or unknown.

Occam’s Razor

Therefore, the moment we make an observation of an object, a derivation has been made and we have created a shadow on the wall. If we realize that, at the end of the day, we are working with datasets of ‘shadows’, we might be better at setting our expectations of what can come out of the machine learning grinder. We might accidentally and transiently break the chains in our minds by letting loose our curiosity and imagination and postulating a wildly parameterized model of reality, but Occam’s razor apparently rules the cave, and epiphanies are infrequent after all.

Photo by visuals on Unsplash

Translated from: https://towardsdatascience.com/machine-learning-and-platos-allegory-of-the-cave-9b9846bb63f3
