Original article: https://en.wikipedia.org/wiki/Huber_loss

In statistics, the Huber loss is a loss function used in robust regression that is less sensitive to outliers in data than the squared error loss. A variant for classification is also sometimes used.

Definition

Figure: Huber loss (green, $\delta = 1$) and squared error loss (blue) as a function of $y - f(x)$.

The Huber loss function describes the penalty incurred by an estimation procedure f. Huber (1964) defines the loss function piecewise by[1]

$$
L_\delta(a) =
\begin{cases}
\frac{1}{2} a^2 & \text{for } |a| \le \delta, \\
\delta \left( |a| - \frac{1}{2}\delta \right) & \text{otherwise.}
\end{cases}
$$

This function is quadratic for small values of $a$ and linear for large values, with equal values and slopes of the two sections at the two points where $|a| = \delta$. The variable $a$ often refers to the residuals, that is, to the difference between the observed and predicted values, $a = y - f(x)$, so the former can be expanded to[2]

$$
L_\delta(y, f(x)) =
\begin{cases}
\frac{1}{2}(y - f(x))^2 & \text{for } |y - f(x)| \le \delta, \\
\delta\,|y - f(x)| - \frac{1}{2}\delta^2 & \text{otherwise.}
\end{cases}
$$
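The piecewise definition above translates directly into code. The following is a minimal sketch using NumPy; `huber_loss` and its parameter names are our own illustration, not a library API:

```python
import numpy as np

def huber_loss(a, delta=1.0):
    """Elementwise Huber loss: quadratic for |a| <= delta, linear beyond."""
    a = np.asarray(a, dtype=float)
    quadratic = 0.5 * a**2
    linear = delta * (np.abs(a) - 0.5 * delta)
    return np.where(np.abs(a) <= delta, quadratic, linear)

# At |a| = delta the two branches agree (both give 0.5 * delta**2),
# which is the "equal values and slopes" property noted above.
print(huber_loss([0.5, 1.0, 3.0], delta=1.0))  # elementwise: 0.125, 0.5, 2.5
```

Note that `np.where` evaluates both branches and selects elementwise, which is fine here since both branches are finite for all inputs.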

Motivation

Two very commonly used loss functions are the squared loss, $L(a) = a^2$, and the absolute loss, $L(a) = |a|$. The squared loss function results in an arithmetic mean-unbiased estimator, and the absolute-value loss function results in a median-unbiased estimator (in the one-dimensional case, and a geometric-median-unbiased estimator for the multi-dimensional case). The squared loss has the disadvantage that it tends to be dominated by outliers: when summing over a set of $a$'s (as in $\sum_{i=1}^{n} L(a_i)$), the sample mean is influenced too much by a few particularly large $a$-values when the distribution is heavy-tailed. In terms of estimation theory, the asymptotic relative efficiency of the mean is poor for heavy-tailed distributions.

As defined above, the Huber loss function is convex in a uniform neighborhood of its minimum $a = 0$; at the boundary of this neighborhood, it has a differentiable extension to an affine function at the points $a = -\delta$ and $a = \delta$. These properties allow it to combine much of the sensitivity of the mean-unbiased, minimum-variance estimator of the mean (using the quadratic loss function) with the robustness of the median-unbiased estimator (using the absolute-value function).
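This robustness can be seen in a small numerical experiment. The sketch below (the data, the brute-force grid search, and all names are our own illustration) estimates a location parameter by minimizing the summed loss over a grid of candidate values: the squared-loss minimizer (the sample mean) is dragged toward a single outlier, while the Huber-loss minimizer stays near the bulk of the data:

```python
import numpy as np

def huber(a, delta=1.0):
    return np.where(np.abs(a) <= delta,
                    0.5 * a**2,
                    delta * (np.abs(a) - 0.5 * delta))

data = np.array([0.9, 1.0, 1.1, 1.0, 0.95, 100.0])  # bulk near 1, one outlier
grid = np.linspace(-5, 105, 11001)  # candidate location estimates, step 0.01

# Location estimate under each loss: argmin over the grid of sum_i L(y_i - c)
sq_est = grid[np.argmin([np.sum((data - c)**2) for c in grid])]
hub_est = grid[np.argmin([np.sum(huber(data - c)) for c in grid])]

print(sq_est)   # the sample mean, ~17.49: dominated by the outlier
print(hub_est)  # ~1.19: stays with the bulk of the data
```

The grid search is only for illustration; in practice the Huber M-estimate is computed with iteratively reweighted least squares or gradient-based optimization.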

Pseudo-Huber loss function

The Pseudo-Huber loss function can be used as a smooth approximation of the Huber loss function, and ensures that derivatives of all orders are continuous. It is defined as[3][4]

$$
L_\delta(a) = \delta^2 \left( \sqrt{1 + (a/\delta)^2} - 1 \right).
$$

As such, this function approximates $a^2/2$ for small values of $a$, and approximates a straight line with slope $\delta$ for large values of $a$.

While the above is the most common form, other smooth approximations of the Huber loss function also exist.[5]
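The two limiting behaviors are easy to verify numerically. A minimal sketch (`pseudo_huber` is our own name for the function defined above; SciPy also ships an equivalent as `scipy.special.pseudo_huber`):

```python
import numpy as np

def pseudo_huber(a, delta=1.0):
    """Smooth approximation of the Huber loss; infinitely differentiable."""
    a = np.asarray(a, dtype=float)
    return delta**2 * (np.sqrt(1.0 + (a / delta)**2) - 1.0)

# Small |a|: behaves like a**2 / 2
print(pseudo_huber(0.01))   # ~0.00005, i.e. 0.01**2 / 2

# Large |a|: approximately linear with slope delta
print(pseudo_huber(1000.0) - pseudo_huber(999.0))  # ~1.0 for delta = 1
```

Unlike the exact Huber loss, there is no branch point: the quadratic and linear regimes blend smoothly around $|a| \approx \delta$.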

Variant for classification

For classification purposes, a variant of the Huber loss called modified Huber is sometimes used. Given a prediction $f(x)$ (a real-valued classifier score) and a true binary class label $y \in \{+1, -1\}$, the modified Huber loss is defined as[6]

$$
L(y, f(x)) =
\begin{cases}
\max(0, 1 - y\,f(x))^2 & \text{for } y\,f(x) \ge -1, \\
-4\,y\,f(x) & \text{otherwise.}
\end{cases}
$$

The term $\max(0, 1 - y\,f(x))$ is the hinge loss used by support vector machines; the quadratically smoothed hinge loss is a generalization of $L$.[6]
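The classification variant can be sketched the same way; as before, `modified_huber` and its argument names are our own illustration (scikit-learn exposes this loss via `SGDClassifier(loss="modified_huber")`):

```python
import numpy as np

def modified_huber(y, score):
    """Modified Huber loss for labels y in {+1, -1} and real-valued scores."""
    margin = y * score  # the margin y * f(x)
    return np.where(margin >= -1.0,
                    np.maximum(0.0, 1.0 - margin)**2,
                    -4.0 * margin)

# Correct, confident prediction (margin >= 1): zero loss
print(modified_huber(1, 2.0))    # 0.0
# At the branch boundary y*f(x) = -1, both branches agree: 4.0
print(modified_huber(1, -1.0))   # 4.0
```

For margins below $-1$ the loss grows only linearly, which is what makes this variant less sensitive to badly misclassified points than the squared hinge loss alone.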

Applications

The Huber loss function is used in robust statistics, M-estimation and additive modelling.[7]

See also

  • Winsorizing
  • Robust regression
  • M-estimator
  • Visual comparison of different M-estimators

References

  1. Huber, Peter J. (1964). "Robust Estimation of a Location Parameter". Annals of Mathematical Statistics 35 (1): 73–101. doi:10.1214/aoms/1177703732. JSTOR 2238020.
  2. Hastie, Trevor; Tibshirani, Robert; Friedman, Jerome (2009). The Elements of Statistical Learning. p. 349. Compared to Hastie et al., the loss is scaled by a factor of ½, to be consistent with Huber's original definition given earlier.
  3. Charbonnier, P.; Blanc-Feraud, L.; Aubert, G.; Barlaud, M. (1997). "Deterministic edge-preserving regularization in computed imaging". IEEE Trans. Image Processing 6 (2): 298–311. doi:10.1109/83.551699.
  4. Hartley, R.; Zisserman, A. (2003). Multiple View Geometry in Computer Vision (2nd ed.). Cambridge University Press. p. 619. ISBN 0-521-54051-8.
  5. Lange, K. (1990). "Convergence of Image Reconstruction Algorithms with Gibbs Smoothing". IEEE Trans. Medical Imaging 9 (4): 439–446. doi:10.1109/42.61759.
  6. Zhang, Tong (2004). Solving large scale linear prediction problems using stochastic gradient descent algorithms. ICML.
  7. Friedman, J. H. (2001). "Greedy Function Approximation: A Gradient Boosting Machine". Annals of Statistics 29 (5): 1189–1232. doi:10.1214/aos/1013203451. JSTOR 2699986.

Reposted from: https://www.cnblogs.com/davidwang456/articles/5586178.html
