Machine Learning: Linear Regression 1 - Linear Regression, the First Step into a Fascinating World

Artificial Intelligence has gained momentum in recent years thanks to improvements in computing capacity. Within this broad field, the sub-branch of Machine Learning has attracted a lot of attention. Machine Learning means training a machine on historical data for a given problem statement so that it can learn from those historical values and predict future values. This process is usually described as training and testing a Machine Learning model. Machine Learning is used across many industries to predict outcomes. For example, in medicine it is used to identify whether a tumour is malignant, and recruiters use it to check whether a resume suits a particular job profile. It is also used to predict customer churn for business development. For anyone starting to study Machine Learning algorithms, Linear Regression is a natural starting point, so we begin our journey there.

What is Linear Regression?

Before moving on to Linear Regression, let us first understand what regression is. Regression is a method for modelling a dependent variable based on one or more independent variables. The type of regression technique differs depending on the number and type of dependent and independent variables. It is mostly used for forecasting and for investigating cause-and-effect relationships between variables.

Simple Linear Regression is used when there is only one X variable (predictor variable) to predict the Y variable (target variable). It is appropriate only when a linear relationship exists between the predictor and target variables. The blue line in the image represents the best fit line, that is, the line with the minimum total error in predicting the data points. Simple Linear Regression is usually denoted by:

Y = B_0 + B_1 * X

B_0 is the intercept: the value of Y at which the line crosses the Y-axis (where X = 0).

B_1 is the slope of the line: the change in Y for a one-unit increase in X.
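
As a minimal sketch of how the line equation turns an input into a prediction, the snippet below evaluates Y = B_0 + B_1 * X in plain Python. The function name `predict` and the coefficient values are hypothetical and chosen only for illustration; they do not come from the article.

```python
def predict(x, b0, b1):
    """Return the point on the line Y = B_0 + B_1 * X for a single input x."""
    return b0 + b1 * x


# Hypothetical coefficients, picked only to make the arithmetic easy to follow.
b0, b1 = 2.0, 0.5

predictions = [predict(x, b0, b1) for x in [0, 2, 4]]
print(predictions)  # [2.0, 3.0, 4.0]
```

With X = 0 the prediction equals the intercept, and each increase of 2 in X raises the prediction by 2 * B_1 = 1.0.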

The ultimate goal of Linear Regression is to find the best values for B_0 and B_1. Let us explore the two most important concepts, the Cost Function and Gradient Descent, before jumping into the algorithm itself.

Cost Function:

The cost function helps us find the best possible values for B_1 (slope) and B_0 (intercept), which in turn give us the best fit line. The best fit line is the one for which the total error is minimized. The problem can therefore be converted from a search problem into a minimization problem: find the B_0 and B_1 for which the total error between predicted and actual values is smallest.

The function we choose to minimize measures the total prediction error as follows: take the difference between the actual and predicted value at each data point, square it, sum the squares over all data points, and divide that sum by the number of data points. With n data points (x_i, y_i), this is MSE = (1/n) * sum over i of (y_i - (B_0 + B_1 * x_i))^2. Because it is the mean of the squared errors, this cost function is commonly called the Mean Squared Error (MSE). The slope and intercept are then adjusted until the MSE settles at its minimum value.
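
The calculation described above can be sketched in a few lines of Python. This is an illustrative sketch, not the article's own code: the function name `mean_squared_error` and the toy data are assumptions made for the example.

```python
def mean_squared_error(xs, ys, b0, b1):
    """Mean of the squared differences between actual ys and the line b0 + b1 * x."""
    n = len(xs)
    total = 0.0
    for x, y in zip(xs, ys):
        error = y - (b0 + b1 * x)  # actual value minus predicted value
        total += error ** 2        # square so positive and negative errors do not cancel
    return total / n


# Made-up toy data lying exactly on the line y = 1 + 2x.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]

perfect = mean_squared_error(xs, ys, b0=1.0, b1=2.0)  # 0.0: the line passes through every point
worse = mean_squared_error(xs, ys, b0=0.0, b1=2.0)    # 1.0: every prediction is off by exactly 1
```

A lower MSE means the candidate line sits closer to the data, which is why the search for the best B_0 and B_1 becomes a search for the minimum of this function.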

Gradient Descent

Gradient Descent is one of the key concepts needed to understand Linear Regression. It is a widely used method of updating B_1 and B_0 to reduce the cost function. The main idea is to change the slope and intercept iteratively, a small step at a time, in the direction that reduces the cost function.
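
One possible way to write this iterative update for the MSE cost is sketched below. The article does not give an update rule, so the details here are assumptions: the gradients are the partial derivatives of the MSE with respect to B_0 and B_1, and the `learning_rate` and `steps` values are hypothetical settings that happen to work for this toy data.

```python
def gradient_descent(xs, ys, learning_rate=0.05, steps=2000):
    """Fit y = b0 + b1 * x by repeatedly stepping against the gradient of the MSE."""
    n = len(xs)
    b0, b1 = 0.0, 0.0  # arbitrary starting guess
    for _ in range(steps):
        errors = [y - (b0 + b1 * x) for x, y in zip(xs, ys)]
        # Partial derivatives of MSE = (1/n) * sum(error^2) with respect to b0 and b1.
        grad_b0 = -2.0 / n * sum(errors)
        grad_b1 = -2.0 / n * sum(e * x for e, x in zip(errors, xs))
        # Move a small step in the direction that lowers the cost.
        b0 -= learning_rate * grad_b0
        b1 -= learning_rate * grad_b1
    return b0, b1


# Made-up toy data lying exactly on the line y = 1 + 2x.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]

b0, b1 = gradient_descent(xs, ys)
print(round(b0, 4), round(b1, 4))  # close to 1 and 2
```

If the learning rate is too large the updates can overshoot and diverge; if it is too small, many more steps are needed to get near the minimum.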

Assumptions of Linear Regression:

1) X and Y are assumed to have a linear relationship. If the data do not show a linear relationship, a transformation such as a log transformation can be applied to make it linear.

2) The input data is assumed to be free of noise, so outliers should be removed before training the model.

3) Remove collinearity among the input variables, because highly correlated inputs can cause the model to overfit.

4) Check that all the variables follow an approximately Gaussian distribution.

5) Predictions tend to be more reliable if you rescale the inputs using scaling techniques such as standardization or normalization.
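
Two of the preparation steps in the list above, the log transformation from item 1 and the rescaling from item 5, can be sketched with the Python standard library. The data are made up, and the helper name `standardize` is hypothetical; standardization is just one of several possible scaling techniques.

```python
import math
import statistics

# Made-up data following y = 2 * exp(0.5 * x), which is not linear in x.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [2.0 * math.exp(0.5 * x) for x in xs]

# Log transformation: log(y) = log(2) + 0.5 * x, which is linear in x.
log_ys = [math.log(y) for y in ys]

# Successive differences of log(y) are all about 0.5, the slope of the transformed line.
slopes = [log_ys[i + 1] - log_ys[i] for i in range(len(xs) - 1)]


def standardize(values):
    """Rescale values to zero mean and unit (population) standard deviation."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    return [(v - mean) / std for v in values]


scaled_xs = standardize(xs)
```

After the transformation, a straight line can be fitted to `xs` and `log_ys` even though the original relationship between `xs` and `ys` was curved.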

Conclusion

Linear regression is an algorithm every data science enthusiast should know as a foundation in Machine Learning. It is very simple, yet it can be used in many scenarios. I hope this article was helpful to all of you!

Translated from: https://medium.com/ml-course-microsoft-udacity/linear-regression-first-step-into-fascinating-world-of-machine-learning-ce21efd22792
