Abstract: This article is the transcript of video 86, "Diagnosing Bias vs. Variance", from Chapter 11, "Advice for Applying Machine Learning", of Andrew Ng's Machine Learning course. I transcribed it while working through the videos and lightly edited it to make it more concise and readable, for future reference. I am sharing it here in the hope that it helps others as well; if you find any errors, corrections are welcome and sincerely appreciated!
————————————————

If you run a learning algorithm and it doesn't do as well as you were hoping, almost all the time it will be because you have either a high bias problem or a high variance problem; in other words, either an underfitting problem or an overfitting problem. In this case it's very important to figure out which of these two problems you have: bias, variance, or a bit of both. Knowing which of these is happening gives a very strong indicator of the useful and promising ways to try to improve your algorithm. In this video, I would like to dive more deeply into this bias and variance issue, understand it better, and figure out how to look at a learning algorithm and diagnose whether we might have a bias problem or a variance problem, since this is critical to figuring out how to improve the performance of the learning algorithms you implement.

So you've seen this figure a few times: if you fit too simple a hypothesis, like a straight line, it underfits the data. If you fit too complex a hypothesis, it might fit the training set perfectly but overfit the data. A hypothesis of some intermediate level of complexity, say a degree-two polynomial, with neither too low nor too high a degree, is just right, and gives you the best generalization error of these options. Now that we're armed with the notion of training, validation and test sets, we can understand the concepts of bias and variance a little bit better.
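To make this figure concrete, here is a minimal sketch. The synthetic quadratic data set is my own assumption (not data from the course); it fits a too-simple, a just-right, and a too-complex polynomial and prints the resulting training and cross validation errors:

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative synthetic data (an assumption, not data from the course):
# a quadratic trend plus noise, so degree 2 is the "just right" model.
x = np.linspace(0, 3, 30)
y = 1.0 + 2.0 * x - 0.8 * x**2 + rng.normal(0.0, 0.3, size=x.shape)

# Split alternate points into a training set and a cross validation set.
x_train, y_train = x[::2], y[::2]
x_cv, y_cv = x[1::2], y[1::2]

for d in (1, 2, 10):  # too simple, just right, too complex
    coeffs = np.polyfit(x_train, y_train, deg=d)
    j_train = np.mean((np.polyval(coeffs, x_train) - y_train) ** 2) / 2
    j_cv = np.mean((np.polyval(coeffs, x_cv) - y_cv) ** 2) / 2
    print(f"d={d:2d}  J_train={j_train:.3f}  J_cv={j_cv:.3f}")
```

With data like this, the straight line (d = 1) shows high error on both sets, d = 2 keeps both low, and d = 10 drives the training error down while the cross validation error climbs.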

Concretely, let our training error J_train(θ) and cross validation error J_cv(θ) be defined as in the previous videos: the average squared error as measured on the training set and on the cross validation set, respectively. Now let's plot the following figure. On the horizontal axis I am going to plot the degree of polynomial d, so as I go to the right I'm going to be fitting higher and higher order polynomials. Towards the left of this figure, where maybe d = 1, we're fitting very simple functions, whereas towards the right of the horizontal axis I have much larger values of d, so a much higher degree of polynomial, which corresponds to fitting much more complex functions to your training set. Let's look at the training error and the cross validation error and plot them on this figure. Start with the training error: as we increase the degree of the polynomial, we're going to be able to fit our training set better and better. So if d = 1, we have a relatively high training error, whereas if we have a very high degree polynomial, our training error is going to be really low, maybe even zero, because we'll fit the training set very well. So as we increase the degree of polynomial, we find typically that the training error decreases; I'm going to write J_train(θ) there, because our training error tends to decrease with the degree of the polynomial that we fit to the data.

Next, let's look at the cross validation error (or, for that matter, the test set error, which gives a pretty similar curve). We know that if d = 1, we're fitting a very simple function, so we may be underfitting the training set, and we're going to have a very high cross validation error. If we fit an intermediate degree polynomial, like d = 2 in our example on the previous slide, we're going to have a much lower cross validation error, because we're finding a much better fit to the data. Conversely, if d were too high, say if d took on a value of four, then we're overfitting, and we end up with a high value of cross validation error. So if you vary d smoothly and plot the curve, you might end up with a U-shaped curve like that; that's J_cv(θ). And again, if you plot J_test(θ), you get something very similar. This sort of plot also helps us to better understand bias and variance.
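A rough sketch of how one might reproduce this plot, again on assumed synthetic data: sweep the degree d, record both errors, and plot them. J_train(θ) should decrease with d, while J_cv(θ) traces the U shape described above.

```python
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(1)
x = np.linspace(0, 3, 40)
y = 1.0 + 2.0 * x - 0.8 * x**2 + rng.normal(0.0, 0.3, size=x.shape)
x_train, y_train = x[::2], y[::2]
x_cv, y_cv = x[1::2], y[1::2]

degrees = range(1, 11)
j_train, j_cv = [], []
for d in degrees:
    coeffs = np.polyfit(x_train, y_train, deg=d)
    # Average squared error, halved, matching the course's cost convention.
    j_train.append(np.mean((np.polyval(coeffs, x_train) - y_train) ** 2) / 2)
    j_cv.append(np.mean((np.polyval(coeffs, x_cv) - y_cv) ** 2) / 2)

plt.plot(degrees, j_train, marker="o", label="J_train")
plt.plot(degrees, j_cv, marker="s", label="J_cv")
plt.xlabel("degree of polynomial d")
plt.ylabel("error")
plt.legend()
plt.show()
```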

Concretely, suppose you've applied a learning algorithm and it's not performing as well as you were hoping, so your cross validation set error or your test set error is high. How can we figure out whether the learning algorithm is suffering from high bias or from high variance? The setting of the cross validation error being high corresponds to either the left regime or the right regime of this figure. The regime on the left corresponds to a high bias problem: you're fitting an overly low order polynomial when we really need a higher order polynomial to fit the data. In contrast, the regime on the right corresponds to a high variance problem: d, the degree of the polynomial, is too large for the data set that we have. And this figure gives us a clue for how to distinguish between these two cases. Concretely, in the high bias case, that is, the case of underfitting, we find that both the cross validation error and the training error are going to be high. So if your algorithm is suffering from a bias problem, the training set error will be high, and you might find that the cross validation error is also high: it might be close to, maybe just slightly higher than, the training error. If you see this combination, that's a sign that your algorithm may be suffering from high bias. In contrast, if your algorithm is suffering from high variance, then we'll notice that J_train(θ), the training error, is going to be low; that is, you're fitting the training set very well. Whereas your error on the cross validation set, or your cost function on the cross validation set (assuming this is, say, the squared error, which we're trying to minimize), will be much bigger than your training set error: J_cv(θ) ≫ J_train(θ). The double greater-than sign is the math symbol for "much greater than". So if you see this combination of values, that's a clue that your learning algorithm may be suffering from high variance and might be overfitting. The key that distinguishes these two cases is this: if you have a high bias problem, your training set error will also be high, because your hypothesis is just not fitting the training set well; if you have a high variance problem, your training set error will usually be low, that is, much lower than your cross validation error.
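The diagnostic in this paragraph boils down to a simple rule, which a helper like the one below makes explicit. This is a hypothetical sketch: the function name and its numeric thresholds are my assumptions, since the lecture gives only the qualitative rule.

```python
def diagnose(j_train, j_cv, acceptable=0.5, gap_ratio=2.0):
    """Rough bias/variance label from training and cross validation error.

    The thresholds `acceptable` and `gap_ratio` are illustrative
    assumptions, not values from the lecture; in practice you judge
    them relative to the error your application can tolerate.
    """
    if j_train > acceptable and j_cv > acceptable:
        # Both errors high and close together: underfitting.
        return "high bias (underfitting)"
    if j_train <= acceptable and j_cv > gap_ratio * j_train:
        # Low training error but much higher cross validation error.
        return "high variance (overfitting)"
    return "no clear problem (or a mix of both)"

print(diagnose(j_train=0.9, j_cv=1.0))   # high bias (underfitting)
print(diagnose(j_train=0.05, j_cv=0.8))  # high variance (overfitting)
```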

So, hopefully that gives you a better understanding of the two problems of bias and variance. I still have a lot more to say about bias and variance in the next few videos, where I'll show you in even more detail how to diagnose whether a learning algorithm may be suffering from high bias or high variance. We'll see that by figuring out whether a learning algorithm is suffering from high bias, high variance, or a combination of both, we get much better guidance about the promising things to try in order to improve its performance.

<end>
