Abstract: This article is the transcript of Lesson 48, "Decision Boundary," in Chapter 7, "Logistic Regression," of Andrew Ng's Machine Learning course. I transcribed it while watching the videos and edited it for conciseness and readability so it can be reviewed later, and I'm sharing it here in the hope that it helps others with their studies. If you find any errors, corrections are warmly welcome and sincerely appreciated.

In the last video, we talked about the hypothesis representation for logistic regression.

What I'd like to do now is tell you about something called the decision boundary, and this will give us a better sense of what the logistic regression hypothesis function is computing. To recap, this is what we wrote out last time, where we said that the hypothesis is represented as $h_\theta(x)=g(\theta^Tx)$, where $g$ is this function called the sigmoid function, $g(z)=\frac{1}{1+e^{-z}}$, which slowly increases from 0 to 1, asymptoting at 1. What I want to do now is try to understand better when this hypothesis will make predictions that y is equal to 1 versus when it might make predictions that y is equal to 0, and understand better what the hypothesis function looks like, particularly when we have more than one feature. Concretely, this hypothesis is outputting estimates of the probability that y is equal to 1 given x and parameterized by $\theta$, that is, $h_\theta(x)=P(y=1|x;\theta)$. So if we wanted to predict whether y is equal to 1 or y is equal to 0, here is something we might do. Whenever the hypothesis outputs that the probability of y being 1 is greater than or equal to 0.5, so that it is more likely to be y=1 than y=0, let's predict y=1. Otherwise, if the estimated probability of y being 1 is less than 0.5, let's predict y=0. I chose greater than or equal to 0.5 versus less than 0.5: if $h_\theta(x)$ is equal to 0.5 exactly, we could predict positive or negative, but I put a greater-than-or-equal-to here so we default to predicting positive when $h_\theta(x)$ is 0.5. But that's a detail that really doesn't matter that much. What I want to do is understand better when exactly $h_\theta(x)$ will be greater than or equal to 0.5, so that we end up predicting y=1. If we look at the plot of the sigmoid function, we'll notice that $g(z)$ is greater than or equal to 0.5 whenever z is greater than or equal to 0. So it's in the right half of the figure that g takes on values that are 0.5 and higher; at this point here, g(z) equals 0.5. So when z is positive, g(z), the sigmoid function, is greater than or equal to 0.5. Since the hypothesis for logistic regression is $h_\theta(x)=g(\theta^Tx)$, it is therefore going to be greater than or equal to 0.5 whenever $\theta^Tx$ is greater than or equal to 0, because here $\theta^Tx$ takes the role of z. So what we've shown is that our hypothesis is going to predict y=1 whenever $\theta^Tx\geq0$. Let's now consider the other case, when the hypothesis will predict y=0. Well, by a similar argument, $h_\theta(x)$ is going to be less than 0.5 whenever g(z) is less than 0.5, and the range of values of z that causes g(z) to take on values less than 0.5 is when z is negative. So when g(z) is less than 0.5, our hypothesis will predict that y=0, and by a similar argument to what we had earlier, $h_\theta(x)=g(\theta^Tx)<0.5$ whenever $\theta^Tx<0$. And so, we'll predict y=0 whenever this quantity $\theta^Tx$ is less than 0. To summarize what we just worked out: if we decide to predict whether y=1 or y=0 depending on whether the estimated probability is greater than or equal to 0.5, or less than 0.5, that's the same as saying that we'll predict y=1 whenever $\theta^Tx\geq0$, and we'll predict y=0 whenever $\theta^Tx<0$.
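As a quick check of this equivalence, here is a minimal sketch in Python/NumPy (not from the lecture; the sample values of z are made up) showing that thresholding the sigmoid at 0.5 gives exactly the same decisions as thresholding z at 0:

```python
import numpy as np

def sigmoid(z):
    """Sigmoid function g(z) = 1 / (1 + exp(-z))."""
    return 1.0 / (1.0 + np.exp(-z))

# g(z) >= 0.5 exactly when z >= 0, so thresholding h_theta(x) = g(theta^T x)
# at 0.5 is the same as thresholding theta^T x at 0.
z = np.array([-2.0, -0.5, 0.0, 0.5, 2.0])
print(np.round(sigmoid(z), 3))    # [0.119 0.378 0.5   0.622 0.881]
print(sigmoid(z) >= 0.5)          # [False False  True  True  True]
print(z >= 0)                     # same boolean pattern
```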

Let's use this to better understand how the hypothesis of logistic regression makes those predictions. Now, let's suppose we have a training set like that shown on the slide, and suppose our hypothesis is $h_\theta(x)=g(\theta_0+\theta_1x_1+\theta_2x_2)$. We haven't talked yet about how to fit the parameters of this model; we'll talk about that in the next video. But suppose that, with that procedure still to be specified, we end up choosing the following values for the parameters. Let's say we choose $\theta_0=-3$, $\theta_1=1$, $\theta_2=1$. So this means my parameter vector is going to be $\theta=[-3,\ 1,\ 1]^T$. Given this choice of hypothesis parameters, let's try to figure out where the hypothesis will end up predicting y=1 and where it will end up predicting y=0. Using the formulas that we worked out on the previous slide, we know that y=1 is more likely, that is, the probability that y=1 is greater than or equal to 0.5, whenever $\theta^Tx$ is greater than or equal to 0. And this formula that I just underlined, $-3+x_1+x_2$, is, of course, $\theta^Tx$ when $\theta$ is equal to the value of the parameters that we just chose. So, for any example with features $x_1$ and $x_2$ that satisfies $-3+x_1+x_2\geq0$, our hypothesis will think that y=1 is more likely, or will predict that y=1. We can also take the -3 and bring it to the right and rewrite this as $x_1+x_2\geq3$. And so, equivalently, we found that this hypothesis will predict y=1 whenever $x_1+x_2$ is greater than or equal to 3. Let's see what that means on the figure. If I write down the equation $x_1+x_2=3$, this defines the equation of a straight line. And if I draw what that straight line looks like, it gives me the following line, which passes through 3 and 3 on the $x_1$ and $x_2$ axes. So the part of the input space, the part of the $x_1$-$x_2$ plane, that corresponds to $x_1+x_2\geq3$ is going to be this right half plane, that is, everything to the upper right of this magenta line that I just drew. And so, the region where our hypothesis will predict y=1 is really this huge half space over to the upper right. Let me just write that down; I'm going to call this the y=1 region. In contrast, the region where $x_1+x_2$ is less than 3 is where we'll predict y=0, and that corresponds to this region. It's really a half plane as well, but that region on the left is the region where our hypothesis predicts y=0. I want to give this magenta line that I drew a name. This line is called the decision boundary. Concretely, this straight line $x_1+x_2=3$ corresponds to the set of points where $h_\theta(x)$ is equal to 0.5 exactly. And the decision boundary, that is this straight line, is the line that separates the region where the hypothesis predicts y=1 from the region where the hypothesis predicts y=0. And just to be clear, the decision boundary is a property of the hypothesis, including the parameters $\theta_0$, $\theta_1$, and $\theta_2$. In the figure I drew a training set, a data set, in order to help the visualization. But even if we take away the data set, this decision boundary, and the regions where we predict y=1 versus y=0, are a property of the hypothesis and of its parameters, not a property of the data set. Later on, of course, we'll talk about how to fit the parameters, and there we'll end up using the training set, using our data, to determine the value of the parameters.
But once we have particular values for the parameters $\theta_0$, $\theta_1$, and $\theta_2$, that completely defines the decision boundary, and we don't actually need to plot a training set in order to plot the decision boundary.
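To make this concrete, here is a small sketch (the helper names and the test points are illustrative, not from the slide) of the prediction rule with $\theta=[-3,\ 1,\ 1]^T$, which predicts y=1 exactly when $-3+x_1+x_2\geq0$, i.e. on the upper-right side of the line $x_1+x_2=3$:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

theta = np.array([-3.0, 1.0, 1.0])        # theta_0 = -3, theta_1 = 1, theta_2 = 1

def predict(x1, x2):
    """Predict y for a single example with features (x1, x2)."""
    x = np.array([1.0, x1, x2])            # prepend the intercept feature x0 = 1
    return int(sigmoid(theta @ x) >= 0.5)  # same as int(-3 + x1 + x2 >= 0)

# Made-up points on either side of the decision boundary x1 + x2 = 3
print(predict(1.0, 1.0))   # 0: x1 + x2 = 2 < 3 (lower-left of the line)
print(predict(2.0, 2.0))   # 1: x1 + x2 = 4 >= 3 (upper-right of the line)
print(predict(1.5, 1.5))   # 1: exactly on the boundary, h = 0.5, defaults to y=1
```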

Let's now look at a more complex example where, as usual, I have crosses to denote my positive examples and o's to denote my negative examples. Given a training set like this, how can I get logistic regression to fit this sort of data? Earlier, when we were talking about polynomial regression, or about linear regression, we talked about how we can add extra higher order polynomial terms to the features, and we can do the same for logistic regression. Concretely, let's say my hypothesis looks like this: $h_\theta(x)=g(\theta_0+\theta_1x_1+\theta_2x_2+\theta_3x_1^2+\theta_4x_2^2)$, where I've added two extra features, $x_1^2$ and $x_2^2$, to my features, so that I now have 5 parameters, $\theta_0$ through $\theta_4$. As before, we'll defer to the next video our discussion on how to automatically choose values for the parameters $\theta_0$ through $\theta_4$. But let's say that, with that procedure still to be specified, I end up choosing $\theta_0=-1$, $\theta_1=0$, $\theta_2=0$, $\theta_3=1$, and $\theta_4=1$. What this means is that with this particular choice of parameters, my parameter vector is $\theta=[-1,\ 0,\ 0,\ 1,\ 1]^T$. Following our earlier discussion, this means that my hypothesis will predict y=1 whenever $\theta^Tx\geq0$, that is, whenever $-1+x_1^2+x_2^2\geq0$. And if I take the -1 and just bring it to the right, I'm saying that my hypothesis will predict y=1 whenever $x_1^2+x_2^2\geq1$. So, what does the decision boundary look like? Well, if you were to plot the curve $x_1^2+x_2^2=1$, some of you will recognize that's the equation of a circle of radius 1 centered at the origin. So, that is my decision boundary, and everything outside the circle I'm going to predict as y=1. So out here is my y=1 region, and inside the circle is where I'll predict y=0. So, by adding these more complex polynomial terms to my features, I can get more complex decision boundaries that don't just try to separate the positive and negative examples with a straight line; in this example I get a decision boundary that is a circle. Once again, the decision boundary is a property not of the training set, but of the hypothesis and of the parameters. So long as we're given my parameter vector $\theta$, that defines the decision boundary, which is the circle. The training set is not what we use to define the decision boundary; the training set may be used to fit the parameters $\theta$, and we'll talk about how to do that later. But once you have the parameters $\theta$, that is what defines the decision boundary. Let me put the training set back just for visualization.
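The same prediction rule carries over to the polynomial features; here is a minimal sketch (test points are made up, not from the slide) with $\theta=[-1,\ 0,\ 0,\ 1,\ 1]^T$, which predicts y=1 exactly outside the unit circle $x_1^2+x_2^2=1$:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

theta = np.array([-1.0, 0.0, 0.0, 1.0, 1.0])   # [theta_0, ..., theta_4]

def predict(x1, x2):
    """Features are [1, x1, x2, x1^2, x2^2], so theta^T x = -1 + x1^2 + x2^2."""
    features = np.array([1.0, x1, x2, x1 ** 2, x2 ** 2])
    return int(sigmoid(theta @ features) >= 0.5)

print(predict(0.5, 0.5))   # 0: 0.25 + 0.25 < 1, inside the unit circle
print(predict(1.0, 1.0))   # 1: 1 + 1 >= 1, outside the circle
print(predict(0.0, 2.0))   # 1: outside the circle
```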

And finally, let's look at a more complex example. Can we come up with even more complex decision boundaries than this? If I have even higher order polynomial terms, so things like $x_1^2$, $x_1^2x_2$, $x_1^2x_2^2$, $x_1^3x_2$, and so on, then it's possible to show that you can get even more complex decision boundaries, and logistic regression can be used to find decision boundaries that may, for example, be an ellipse like that, or, with a different setting of the parameters, some funny shape like that. Or, for even more complex examples, you can also get decision boundaries that look like a more complex shape like that, where everything in here you predict y=1, and everything outside you predict y=0. So with these higher order polynomial features you can get very complex decision boundaries. With these visualizations, I hope that gives you a sense of the range of hypothesis functions you can represent using the representation that we have for logistic regression.
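One way to build such higher-order feature vectors, as a rough sketch (the function name and degree are illustrative, not from the lecture), is to enumerate all monomials of the two inputs up to a chosen degree and then apply the same thresholding rule on $\theta^Tx$:

```python
import numpy as np

def poly_features(x1, x2, degree=6):
    """Map (x1, x2) to every monomial x1^i * x2^j with 0 <= i + j <= degree.

    The first entry (i = j = 0) is the constant term, playing the role of x0 = 1.
    """
    feats = []
    for total in range(degree + 1):
        for i in range(total + 1):
            feats.append((x1 ** i) * (x2 ** (total - i)))
    return np.array(feats)

# Degree 6 yields 28 features; a logistic regression over these features can
# realize elliptical and other highly non-linear decision boundaries.
print(poly_features(1.0, 2.0).shape)   # (28,)
```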

Now that we know what $h_\theta(x)$ can represent, what I'd like to do next, in the following video, is talk about how to automatically choose the parameters $\theta$, so that given a training set we can automatically fit the parameters to our data.

<end>
