An Introduction to Machine Learning

Say you are practising basketball on your own, trying to shoot the ball into the hoop. If you miss on the first try, your instinct will probably be to move forward or backward, jump higher or lower, or extend your arms differently. Whatever you change, the goal stays the same: get the ball into the basket. If one tactic does not work, you try another until you reach your goal. This is the idea behind machine learning.

Machine learning is an application of artificial intelligence that gives systems the ability to learn and improve from experience automatically, without being explicitly programmed. It focuses on developing computer programs that can access data and use statistical analysis to predict an output, updating those predictions as new data becomes available (i.e., learning).

“A computer program is said to learn from experience E with respect to some task T and some performance measure P, if its performance on T, as measured by P, improves with experience E.” - Tom Mitchell

Classification of Machine Learning

Machine learning falls into several categories:

Supervised Learning
Unsupervised Learning
Reinforcement Learning

Supervised Learning: Here, the system is given previously labelled data, so it can apply what it learned from those labelled examples to new data and predict future events. It is like someone memorizing new facts while checking them against their notes. The learning algorithm compares its output with the correct, intended output and uses the errors to adjust the model. A typical example is spam filtering: you already have some emails that have been labelled “spam”, and you classify a new email as spam or not depending on whether it shares the qualities of the spam ones. Classification of this kind is one type of supervised learning; regression is another.
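The compare-and-correct loop described above can be sketched with a perceptron, which is just one simple supervised learner among many. The two features (`has_link`, `mentions_prize`) and the four labelled emails are hypothetical, made up for illustration.

```python
# Toy supervised learning: a perceptron trained on labelled examples.
# The features and labels below are hypothetical, chosen only to illustrate.

def predict(weights, bias, features):
    """Return 1 (spam) if the weighted sum is positive, else 0 (not spam)."""
    activation = bias + sum(w * x for w, x in zip(weights, features))
    return 1 if activation > 0 else 0


def train(examples, epochs=20, learning_rate=1.0):
    """Adjust the weights whenever a prediction disagrees with its label."""
    weights = [0.0] * len(examples[0][0])
    bias = 0.0
    for _ in range(epochs):
        for features, label in examples:
            error = label - predict(weights, bias, features)  # 0 when correct
            bias += learning_rate * error
            weights = [w + learning_rate * error * x
                       for w, x in zip(weights, features)]
    return weights, bias


# Each example: ((has_link, mentions_prize), label), where label 1 means spam.
labelled_emails = [
    ((0, 0), 0),
    ((0, 1), 0),
    ((1, 0), 0),
    ((1, 1), 1),
]

weights, bias = train(labelled_emails)
predictions = [predict(weights, bias, f) for f, _ in labelled_emails]
print(predictions)
```

The only signal the model receives is the difference between its guess and the label, which is what makes this supervised.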

Unsupervised Learning: Here, the system is presented with unlabelled, uncategorized data, and it is left to the algorithm to find patterns in the data on its own. There is no right output for the system to figure out; instead, it explores the data and draws inferences that describe hidden structure. Recommendation systems of the kind used for marketing automation on the web often draw on this type of learning. Clustering and association are types of unsupervised learning.
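As a rough illustration of association, the sketch below counts how often pairs of items show up together in unlabelled shopping baskets. The baskets and the `pair_support` helper are hypothetical; real association-rule mining adds more machinery on top of this counting step.

```python
from collections import Counter
from itertools import combinations


def pair_support(baskets):
    """Fraction of baskets in which each pair of items appears together."""
    counts = Counter()
    for basket in baskets:
        for pair in combinations(sorted(basket), 2):
            counts[pair] += 1
    return {pair: n / len(baskets) for pair, n in counts.items()}


# Hypothetical, unlabelled purchase data: nobody told us what goes with what.
baskets = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "butter", "milk"},
    {"milk"},
]

support = pair_support(baskets)
for pair, value in sorted(support.items(), key=lambda kv: -kv[1]):
    print(pair, value)
```

Pairs with high support are candidates for "people who bought this also bought that" suggestions.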

Reinforcement Learning: Here, as in unsupervised learning, the examples you present to the system lack labels, but this time each one is accompanied by positive or negative feedback (a reward) according to the solution the algorithm proposes. The algorithm is trained through this system of reward and punishment; the approach is closely related to dynamic programming. It allows the algorithm, or agent, to automatically determine the ideal behaviour within a specific context in order to maximize its performance. The agent learns by interacting with its environment, and the approach is typically seen when computers learn to play games, optimize their scores, and even outperform human players.
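To make the reward loop concrete, here is a minimal sketch using tabular Q-learning, one common reinforcement learning algorithm. The environment is a hypothetical five-cell corridor in which the agent is rewarded only for reaching the rightmost cell; the hyperparameters (`alpha`, `gamma`, `epsilon`) are illustrative choices, not recommendations.

```python
import random

N_STATES = 5  # positions 0..4 on a line; reaching position 4 ends the episode
ACTIONS = ("left", "right")


def step(state, action):
    """Hypothetical environment: move one cell, reward 1.0 at the goal."""
    nxt = max(0, state - 1) if action == "left" else state + 1
    done = nxt == N_STATES - 1
    return nxt, (1.0 if done else 0.0), done


def q_learning(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    """Learn action values from reward feedback alone."""
    rng = random.Random(seed)
    q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}
    for _ in range(episodes):
        state = 0
        for _ in range(100):  # cap the episode length
            if rng.random() < epsilon:
                action = rng.choice(ACTIONS)  # explore
            else:  # exploit, breaking ties at random
                best = max(q[(state, a)] for a in ACTIONS)
                action = rng.choice(
                    [a for a in ACTIONS if q[(state, a)] == best])
            nxt, reward, done = step(state, action)
            future = 0.0 if done else gamma * max(q[(nxt, a)] for a in ACTIONS)
            q[(state, action)] += alpha * (reward + future - q[(state, action)])
            state = nxt
            if done:
                break
    return q


q = q_learning()
policy = [max(ACTIONS, key=lambda a: q[(s, a)]) for s in range(N_STATES - 1)]
print(policy)
```

Nobody tells the agent that "right" is correct; it works that out because moving right is what eventually earns the reward.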

Choosing the Right Machine Learning Problem

You have collected a bunch of data and want to analyse it with machine learning techniques. How do you choose the right machine learning problem for your use case? The problem categories we will cover in this article are:

Classification
Regression
Clustering
Dimensionality reduction

Classification: Use classification when you need to sort your input data into categories or classes. Predicting categories is a very common use case, and the categories can be virtually anything. As I mentioned in the email example above: is this email “spam” or “not spam”? Should it go to the “inbox” or the “spam” folder? If you are a financial trader constantly monitoring the stock market, then given past information on the market, company performance and stock performance, should you “buy”, “sell” or “hold”? Or say you are working with image data and want to do object recognition: is this a “cat”, a “mouse” or a “dog”? The list is endless, but in every case the output of a classification model is a category or class.
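The spam question can be sketched with a tiny naive Bayes text classifier, one of several techniques that suit this kind of problem. The four training emails are hypothetical, and the helper names `train_naive_bayes` and `classify` are invented for this example.

```python
import math
from collections import Counter, defaultdict


def train_naive_bayes(examples):
    """Count words per class from (text, label) pairs."""
    word_counts = defaultdict(Counter)
    class_counts = Counter()
    for text, label in examples:
        class_counts[label] += 1
        word_counts[label].update(text.lower().split())
    vocabulary = {w for counts in word_counts.values() for w in counts}
    return word_counts, class_counts, vocabulary


def classify(text, model):
    """Return the class with the highest log-probability (add-one smoothing)."""
    word_counts, class_counts, vocabulary = model
    total_examples = sum(class_counts.values())
    scores = {}
    for label in class_counts:
        total_words = sum(word_counts[label].values())
        score = math.log(class_counts[label] / total_examples)
        for word in text.lower().split():
            if word in vocabulary:  # ignore words never seen in training
                score += math.log((word_counts[label][word] + 1)
                                  / (total_words + len(vocabulary)))
        scores[label] = score
    return max(scores, key=scores.get)


# Hypothetical labelled emails.
emails = [
    ("win money now", "spam"),
    ("free prize win", "spam"),
    ("meeting at noon", "not spam"),
    ("lunch meeting tomorrow", "not spam"),
]
model = train_naive_bayes(emails)
print(classify("free money", model))
print(classify("meeting tomorrow", model))
```

Whatever the input, the output is always one of the known class labels, which is the defining trait of a classification model.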

Regression: When you want your model to predict continuous numeric values, use a regression model. If you are a financial trader who needs to predict tomorrow’s price of a stock given current market sentiment and the company’s previous earnings, a regression model is your guy. You might also be predicting a car’s mileage from its attributes, or the price of a house from its location and condition. Once you can see the nature of the problem, it is easier to know which tool to use.
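A minimal sketch of regression, assuming a single input feature and ordinary least squares: with the size and price figures below (hypothetical, in arbitrary units) the fitted line is used to estimate the price of a house the model has not seen.

```python
def fit_line(xs, ys):
    """Ordinary least squares for one feature: returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    return slope, mean_y - slope * mean_x


# Hypothetical data: house size vs. price, both in arbitrary units.
sizes = [1.0, 2.0, 3.0, 4.0]
prices = [3.0, 5.0, 7.0, 9.0]

slope, intercept = fit_line(sizes, prices)
predicted = slope * 5.0 + intercept  # price estimate for an unseen size
print(slope, intercept, predicted)
```

Unlike the classifier, the prediction here is a number on a continuous scale rather than a label.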

Clustering: When you have a really large dataset and no idea what is in it, clustering can help you make sense of it. In social media ad targeting, finding groups of users who are interested in a particular field, so you can show them specific ads, is an application of clustering. Another is document discovery: you could gather all the documents related to armed robbery cases and see whether patterns emerge across them. Clustering simply lets you discover fine-grained patterns on your own.
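One way to see clustering at work is k-means, sketched below in plain Python. The six 2-D points are hypothetical, and using the first `k` points as starting centroids is a simplification made for this example; practical implementations usually pick starting points more carefully.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def k_means(points, k, iterations=10):
    """Plain k-means; the first k points serve as the initial centroids."""
    centroids = [tuple(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        labels = [min(range(k), key=lambda c: squared_distance(p, centroids[c]))
                  for p in points]
        # Update step: move each centroid to the mean of its points.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = tuple(sum(dim) / len(members)
                                     for dim in zip(*members))
    return labels, centroids


# Hypothetical unlabelled points forming two loose groups.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 1.5),
          (8.0, 8.0), (9.0, 8.5), (8.5, 9.0)]
labels, centroids = k_means(points, k=2)
print(labels)
print(centroids)
```

The algorithm is never told which group a point belongs to; the grouping emerges from the distances alone.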

Dimensionality Reduction: This is a preprocessing technique used to find the most informative features in your data. Say you have 500 different variables: which of them are most significant? Which features deserve more attention? This is where dimensionality reduction comes into play. It is used to preprocess your data so that you can build more robust, better-performing machine learning models, whether for classification, regression or any other task. Dimensionality reduction also helps us find latent factors when we have a lot of data and no target values.
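Principal component analysis (PCA) is one widely used dimensionality reduction technique. The sketch below estimates only the first principal direction, using power iteration on the covariance matrix, and then collapses three features into one. The data and the function name `first_principal_component` are hypothetical, and a real project would normally rely on a numerical library instead.

```python
import math


def first_principal_component(rows, iterations=100):
    """Estimate the top principal direction by power iteration."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    # Sample covariance matrix of the centered data.
    cov = [[sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(d)]
           for i in range(d)]
    # Starting guess; it must not be orthogonal to the true direction.
    v = [1.0 / math.sqrt(d)] * d
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:  # the data has no variance at all
            break
        v = [x / norm for x in w]
    return v, centered


# Hypothetical data: the first two features move together, the third is constant.
rows = [(1.0, 1.0, 0.0), (2.0, 2.0, 0.0), (3.0, 3.0, 0.0), (4.0, 4.0, 0.0)]
direction, centered = first_principal_component(rows)
# Project every row onto the principal direction: three features become one.
reduced = [sum(x * v for x, v in zip(r, direction)) for r in centered]
print(direction)
print(reduced)
```

Here the single new feature keeps all the variation in the data, because the first two columns are redundant and the third carries no information.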

Conclusion

Machine learning comes into the picture when problems cannot be solved by typical approaches. It enables the analysis of large amounts of data and can deliver faster, more accurate results, helping to identify profitable opportunities or dangerous risks.

This article is intended only as an introduction to the concept of machine learning. There is a lot more to learn, and you can get there by wanting to learn, making time and finding the right resources online. I hope I have made you want to learn more.

Translated from: https://medium.com/@amarachi.anyim00/an-introduction-to-machine-learning-493d16017d9b
