• Convex and strictly convex
  • Strongly convex

1. Convex and strictly convex

Two commonly used notions of convexity are convex and strictly convex. Their definitions are:

Definition 1: [convex]: $f(x)$ is said to be convex if the following holds $\forall x,y$ and $\forall \lambda\in[0,1]$:

$$f(\lambda x+(1-\lambda)y)\leq \lambda f(x)+(1-\lambda)f(y)$$

Definition 2: [strictly convex]: $f(x)$ is said to be strictly convex if the following holds $\forall x\neq y$ and $\forall \lambda\in(0,1)$:

$$f(\lambda x+(1-\lambda)y)< \lambda f(x)+(1-\lambda)f(y)$$
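The defining inequality can be probed numerically. The following is a minimal sketch, not a proof: it samples random points and values of $\lambda$ for a one-variable function and reports whether any sample violates the inequality. The helper name `satisfies_convexity`, the sampling interval and the tolerance are illustrative assumptions.

```python
import math
import random

def satisfies_convexity(f, samples=1000, lo=-5.0, hi=5.0, tol=1e-9, seed=0):
    """Sample f(l*x + (1-l)*y) <= l*f(x) + (1-l)*f(y) at random x, y, l."""
    rng = random.Random(seed)
    for _ in range(samples):
        x, y = rng.uniform(lo, hi), rng.uniform(lo, hi)
        lam = rng.random()  # lambda in [0, 1)
        lhs = f(lam * x + (1 - lam) * y)
        rhs = lam * f(x) + (1 - lam) * f(y)
        if lhs > rhs + tol:
            return False  # a sampled counterexample: f is not convex
    return True  # no violation found (evidence only, not a proof)

print(satisfies_convexity(lambda x: x * x))  # x^2 is convex
print(satisfies_convexity(math.sin))         # sin is not convex on [-5, 5]
```

A `True` result only means that no counterexample was sampled, while a `False` result is a genuine certificate of non-convexity.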

There are also several equivalent characterizations:

Theorem 3. [first order condition(1)]: If $f(x)$ is differentiable, then $f(x)$ is convex iff $\forall x,y$

$$f(y) \geq f(x)+\nabla f(x)\cdot (y-x)$$

The same equivalence holds for strict convexity with $>$ in place of $\geq$ (for $x\neq y$).

proof:
necessary: If $f(x)$ is convex, then for $\lambda\in(0,1]$, rearranging Definition 1 gives

$$\begin{aligned} f(x)&\geq \frac{f(\lambda x+(1-\lambda)y ) - (1-\lambda)f(y)}{\lambda}\\ &=f(y) + \frac{f(y+\lambda (x-y)) - f(y)}{\lambda} \end{aligned}$$

Letting $\lambda \rightarrow 0^+$, the difference quotient converges to the directional derivative $\nabla f(y)\cdot(x-y)$, so

$$f(x)\geq f(y) + \nabla f(y)\cdot(x-y)$$
sufficient: If the first order condition is satisfied, apply it at the point $\lambda x+(1-\lambda)y$:

$$\begin{aligned} f(x)&\geq f(\lambda x+(1-\lambda)y)+\nabla f(\lambda x+(1-\lambda)y)\cdot (1-\lambda)(x-y)\\ f(y)&\geq f(\lambda x+(1-\lambda)y)+\nabla f(\lambda x+(1-\lambda)y)\cdot \lambda(y-x) \end{aligned}$$

Multiplying the first inequality by $\lambda$ and the second by $1-\lambda$ and adding them, the gradient terms cancel and we get:

$$\lambda f(x)+(1-\lambda)f(y) \geq f(\lambda x+(1-\lambda)y)$$
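The first order condition says that every tangent plane is a global lower bound. A minimal sketch of a numerical check follows, using log-sum-exp as an illustrative convex function. The function choice, the helper names `logsumexp`, `grad_logsumexp` and `tangent_gap`, and the tolerance are assumptions made for this example.

```python
import math
import random

def logsumexp(x):
    """f(x) = log(sum_i exp(x_i)), computed stably."""
    m = max(x)
    return m + math.log(sum(math.exp(v - m) for v in x))

def grad_logsumexp(x):
    """Gradient of log-sum-exp, i.e. the softmax of x."""
    m = max(x)
    w = [math.exp(v - m) for v in x]
    s = sum(w)
    return [v / s for v in w]

def tangent_gap(f, grad, x, y):
    """f(y) - [f(x) + grad f(x) . (y - x)]; nonnegative when f is convex."""
    linear = f(x) + sum(g * (b - a) for g, a, b in zip(grad(x), x, y))
    return f(y) - linear

rng = random.Random(0)
gaps = []
for _ in range(1000):
    x = [rng.uniform(-3, 3) for _ in range(4)]
    y = [rng.uniform(-3, 3) for _ in range(4)]
    gaps.append(tangent_gap(logsumexp, grad_logsumexp, x, y))

min_gap = min(gaps)
print(min_gap >= -1e-12)  # allow a tiny slack for floating-point rounding
```

The gap is zero when $y=x$ and grows as $y$ moves away from the point of tangency.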

Theorem 4. [first order condition(2)[monotonicity of $\nabla f(x)$]]: If $f(x)$ is differentiable, then $f(x)$ is convex iff $\forall x,y$, $(\nabla f(x)-\nabla f(y))\cdot (x-y)\geq 0$.
proof: necessary:
If $f(x)$ is convex, then $\forall x,y$, we have

$$\begin{aligned} &f(x)\geq f(y) +\nabla f(y)\cdot (x-y)\\ &f(y)\geq f(x)+\nabla f(x)\cdot (y-x) \end{aligned}$$
adding these two inequalities:

$$f(x)+f(y)\geq f(y)+f(x)+(\nabla f(y) - \nabla f(x))\cdot (x-y)$$
i.e.

$$(\nabla f(x)-\nabla f(y))\cdot (x-y)\geq 0$$
sufficient:
Let $g(t) = f(x +t(y-x))$. Then $g'(t) = \nabla f(x+t(y-x))\cdot (y-x)$, and for $t>0$

$$\begin{aligned} g'(t) - g'(0) & = \nabla f(x+t(y-x))\cdot (y-x) - \nabla f(x)\cdot (y-x) \\ &= \frac{1}{t}(\nabla f(x+t(y-x)) - \nabla f(x))\cdot t(y-x)\\ &\geq 0 \end{aligned}$$
so $g'(t)\geq g'(0)$ for all $t\in(0,1]$.

So

$$\begin{aligned} g(1) &=g(0)+ \int_0^1 g'(t)\, dt\geq g(0)+g'(0) \\ \Rightarrow\ & f(y) \geq f(x) +\nabla f(x)\cdot (y-x) \end{aligned}$$
and by Theorem 3, $f(x)$ is convex.
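Gradient monotonicity is easy to test numerically. The sketch below evaluates $(\nabla f(x)-\nabla f(y))\cdot(x-y)$ on random pairs for an illustrative convex quadratic $f(x)=\frac{1}{2}x^TAx$ with $A=B^TB$, and for the concave function $-\frac{1}{2}\|x\|_2^2$ as a contrast. The matrices and helper names are assumptions chosen for this example.

```python
import random

def matvec(A, v):
    return [sum(a * b for a, b in zip(row, v)) for row in A]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def monotonicity_gap(grad, x, y):
    """(grad f(x) - grad f(y)) . (x - y); nonnegative for all pairs when f is convex."""
    gx, gy = grad(x), grad(y)
    return dot([a - b for a, b in zip(gx, gy)], [a - b for a, b in zip(x, y)])

# Illustrative convex quadratic: f(x) = 0.5 * x^T A x with A = B^T B (positive semidefinite).
B = [[1.0, 2.0], [0.0, 1.0]]
A = [[sum(B[k][i] * B[k][j] for k in range(2)) for j in range(2)] for i in range(2)]
grad_f = lambda x: matvec(A, x)  # gradient of 0.5 * x^T A x is A x

rng = random.Random(1)
pairs = [([rng.uniform(-5, 5), rng.uniform(-5, 5)],
          [rng.uniform(-5, 5), rng.uniform(-5, 5)]) for _ in range(1000)]
min_gap_convex = min(monotonicity_gap(grad_f, x, y) for x, y in pairs)

# Contrast: f(x) = -0.5 * ||x||^2 is concave, its gradient -x is monotone decreasing.
grad_concave = lambda x: [-v for v in x]
max_gap_concave = max(monotonicity_gap(grad_concave, x, y) for x, y in pairs)

print(min_gap_convex >= -1e-9, max_gap_concave <= 0.0)
```

For the quadratic, the gap equals $(x-y)^TA(x-y)=\|B(x-y)\|_2^2$, which is why it can never be negative.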

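Theorem 5. [second order condition]: If $f(x)$ is twice differentiable, then $f(x)$ is convex iff $\forall x$

$$\nabla^2 f(x)\succeq 0$$
i.e. the Hessian is positive semidefinite. For strict convexity only one direction holds: $\nabla^2 f(x)\succ 0$ for all $x$ implies that $f(x)$ is strictly convex, but not conversely (e.g. $f(x)=x^4$ is strictly convex with $f''(0)=0$).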
proof:
For simplicity, we first prove the single-variable case:
A twice differentiable $h(x): x\in \mathbb{R}$ is convex iff its second derivative satisfies $h''(x)\geq 0$.
sufficient:
From $h''(x)\geq 0$ and the Taylor expansion $h(y)=h(x)+h'(x)(y-x)+\frac{1}{2}h''(\xi)(y-x)^2$ for some $\xi$ between $x$ and $y$, we have

$$h(y) \geq h(x)+h'(x)(y-x)$$
and from Theorem 3, we know $h(x)$ is convex.
necessary:
$\forall x< z< y$, we have $z=\lambda x+(1-\lambda) y$ with $\lambda = \frac{y-z}{y-x}$

$$\begin{aligned} h(z) &= h(\lambda x+(1-\lambda)y)\\&\leq \lambda h(x)+(1-\lambda)h(y)\\&=\frac{y-z}{y-x} h(x)+\frac{z-x}{y-x}h(y) \end{aligned}$$

$$\Rightarrow (y-x)h(z)\leq (y-z)h(x)+(z-x)h(y)$$

$$\Rightarrow(y-z)(h(z)-h(x))\leq (z-x)(h(y)-h(z))$$

$$\Rightarrow\frac{h(z)-h(x)}{z-x}\leq \frac{h(y) - h(z)}{y-z}$$

So for $t_1< x< z< y< t_2$, we have

$$\frac{h(x)-h(t_1)}{x-t_1}\leq \frac{h(z)-h(x)}{z-x}\leq \frac{h(y) - h(z)}{y-z}\leq \frac{h(t_2) - h(y)}{t_2 - y}$$
letting $t_1\rightarrow x$ and $t_2 \rightarrow y$, we have

$$h'(x)\leq \frac{h(z)-h(x)}{z-x}\leq \frac{h(y) - h(z)}{y-z} \leq h'(y)$$

So $h'(x)$ is nondecreasing, and therefore $h''(x)\geq 0$.

Now we prove the multivariable case. For a point $x$ and a direction $\ell$, let $g(t)=f(x+t\ell)$, a function of one variable.
necessary:
From convexity of $f(x)$,

$$\begin{aligned} g(\lambda t_1+(1-\lambda)t_2) &= f(x+\lambda t_1\ell+(1-\lambda)t_2\ell)\\ &\leq \lambda f(x+t_1\ell)+(1-\lambda)f(x+t_2\ell)\\ &= \lambda g(t_1)+(1-\lambda)g(t_2) \end{aligned}$$
So $g(t)$ is convex as a function of one variable. Then by the single-variable case

$$g''(t) = \ell^T\nabla^2 f(x+t\ell) \ell\geq 0$$
Taking $t=0$, this holds for every direction $\ell$, so

$$\nabla^2 f(x) \succeq 0$$
sufficient:
Let $g(t) = f(x+t(y-x))$, then

$$g''(t)=(y-x)^T \nabla^2 f(x+t(y-x)) (y-x)\geq 0$$
So $g(t)$ is convex by the single-variable case.

Then

$$\begin{aligned} f(\lambda x+(1-\lambda)y) &= f(x+(1-\lambda)(y-x))\\ &=g(1-\lambda) = g(\lambda \cdot 0 +(1-\lambda)\cdot 1)\\ &\leq \lambda g(0)+(1-\lambda)g(1)\\ &=\lambda f(x)+(1-\lambda)f(y) \end{aligned}$$
So $f(x)$ is convex.

The proof shows that convexity of a function on a convex set is a one-dimensional property: $f(x)$ is convex iff its restriction to every line is convex.
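This one-dimensional view suggests a simple numerical probe: restrict $f$ to random lines $g(t)=f(x+t\ell)$ and estimate $g''(0)=\ell^T\nabla^2 f(x)\ell$ with a central second difference. The sketch below does this for two illustrative quadratics. The function names, the step size `h` and the sampling ranges are assumptions for the example.

```python
import math
import random

def curvature_along_line(f, x, direction, h=1e-2):
    """Central second difference of g(t) = f(x + t*direction) at t = 0."""
    g = lambda t: f([xi + t * di for xi, di in zip(x, direction)])
    return (g(h) - 2.0 * g(0.0) + g(-h)) / (h * h)

def min_sampled_curvature(f, dim=2, samples=500, seed=0):
    """Smallest estimated curvature over random base points and unit directions."""
    rng = random.Random(seed)
    worst = math.inf
    for _ in range(samples):
        x = [rng.uniform(-3, 3) for _ in range(dim)]
        d = [rng.gauss(0, 1) for _ in range(dim)]
        norm = math.sqrt(sum(v * v for v in d))
        d = [v / norm for v in d]  # normalize to a unit direction
        worst = min(worst, curvature_along_line(f, x, d))
    return worst

bowl = lambda x: x[0] ** 2 + x[0] * x[1] + x[1] ** 2  # Hessian [[2, 1], [1, 2]], eigenvalues 1 and 3
saddle = lambda x: x[0] ** 2 - x[1] ** 2              # Hessian diag(2, -2), indefinite

print(min_sampled_curvature(bowl) > 0)    # every sampled line bends upward
print(min_sampled_curvature(saddle) < 0)  # some sampled line bends downward
```

A single line with negative curvature is enough to rule out convexity, whereas nonnegative curvature on sampled lines is only supporting evidence.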

Intuition:

  • convex says the function is $\geq$ each of its tangent linear functions
  • strictly convex says the function is $>$ each of its tangent linear functions (away from the point of tangency)

2. Strongly convex

Definition 3: [strongly convex]: $f(x)$ is said to be $m$-strongly convex (with $m>0$) if $f(x)-\frac{m}{2}\|x\|_2^2$ is convex.

Then, applying the results of the last section to $f(x)-\frac{m}{2}\|x\|_2^2$, we have that:
first order condition (1):

$$f(y)\geq f(x)+\nabla f(x)\cdot (y-x)+\frac{m}{2}\|y-x\|_2^2$$
first order condition (2)[monotonicity of the gradient]:

$$(\nabla f(x) - \nabla f(y)) \cdot (x-y)\geq m\|x-y\|_2^2$$
second order condition:

$$\nabla^2 f(x)\succeq m\cdot I$$
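As a sketch of how the first of these conditions follows, write $g(x)=f(x)-\frac{m}{2}\|x\|_2^2$ (the name $g$ is introduced here only for illustration) and apply first order condition (1) of the last section to $g$:

```latex
\begin{align}
g(y) &\geq g(x)+\nabla g(x)\cdot(y-x), \qquad \nabla g(x)=\nabla f(x)-mx\\
f(y) &\geq f(x)+\nabla f(x)\cdot(y-x)+\frac{m}{2}\left(\|y\|_2^2-\|x\|_2^2-2x\cdot(y-x)\right)\\
&= f(x)+\nabla f(x)\cdot(y-x)+\frac{m}{2}\|y-x\|_2^2
\end{align}
```

The other two conditions follow in the same way from Theorem 4 and Theorem 5, using $\nabla g(x)=\nabla f(x)-mx$ and $\nabla^2 g(x)=\nabla^2 f(x)-m\cdot I$.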

Intuition: strongly convex says the function is $\geq$ a quadratic function, namely its tangent linear function plus $\frac{m}{2}\|y-x\|_2^2$.

Theorem: If a function is strongly convex and twice continuously differentiable, then its first derivative is Lipschitz continuous on the sublevel set $S=\{x: f(x)\leq f(x^{(0)})\}$.
proof: Firstly, we claim that the subset $S$ is bounded. Let $x^*$ be the minimizer of $f(x)$, so that $\nabla f(x^*)=0$. Then $\forall y \in S$, we have

$$\begin{aligned} &f(x^{(0)})\geq f(y) \geq f(x^*) +\nabla f(x^*)\cdot (y-x^*)+\frac{m}{2}\|y-x^*\|_2^2\\ &\Rightarrow \| y - x^*\|_2^2 \leq \frac{2}{m} \left(f(x^{(0)})-f(x^*)\right) \end{aligned}$$

Since $S$ is also closed, it is compact. The maximum eigenvalue of $\nabla^2 f(x)$ is a continuous function of $x$, so it attains an upper bound $M$ on $S$, i.e. $\nabla^2 f(x)\preceq M\cdot I$ on $S$, which says that $\nabla f(x)$ is Lipschitz continuous on $S$ with constant $M$.
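For a quadratic $f(x)=\frac{1}{2}x^TAx$ the Hessian is the constant matrix $A$, so the two constants can be read off directly: $m$ is the smallest eigenvalue of $A$ and $M$ the largest, and both bounds hold globally rather than only on a sublevel set. The sketch below checks the strong convexity lower bound and the Lipschitz bound on random pairs. The matrix $A$, the helper names and the tolerances are illustrative assumptions.

```python
import math
import random

A = [[2.0, 1.0], [1.0, 2.0]]  # illustrative symmetric matrix with eigenvalues 1 and 3

def sym2x2_eigs(S):
    """Eigenvalues (min, max) of a symmetric 2x2 matrix in closed form."""
    a, b, c = S[0][0], S[0][1], S[1][1]
    mid, rad = (a + c) / 2.0, math.hypot((a - c) / 2.0, b)
    return mid - rad, mid + rad

def grad(x):
    """Gradient of f(x) = 0.5 * x^T A x, which is A x."""
    return [A[0][0] * x[0] + A[0][1] * x[1], A[1][0] * x[0] + A[1][1] * x[1]]

def f(x):
    g = grad(x)
    return 0.5 * (x[0] * g[0] + x[1] * g[1])

m, M = sym2x2_eigs(A)  # strong convexity constant and Lipschitz constant

rng = random.Random(0)
ok_strong, ok_lipschitz = True, True
for _ in range(1000):
    x = [rng.uniform(-5, 5), rng.uniform(-5, 5)]
    y = [rng.uniform(-5, 5), rng.uniform(-5, 5)]
    d = [y[0] - x[0], y[1] - x[1]]
    gx, gy = grad(x), grad(y)
    dist2 = d[0] ** 2 + d[1] ** 2
    # Strong convexity: f(y) >= f(x) + grad f(x).(y - x) + (m/2) ||y - x||^2
    lower = f(x) + gx[0] * d[0] + gx[1] * d[1] + 0.5 * m * dist2
    ok_strong = ok_strong and f(y) >= lower - 1e-9
    # Lipschitz gradient: ||grad f(x) - grad f(y)|| <= M ||x - y||
    gdiff = math.hypot(gx[0] - gy[0], gx[1] - gy[1])
    ok_lipschitz = ok_lipschitz and gdiff <= M * math.sqrt(dist2) + 1e-9

print(m, M, ok_strong, ok_lipschitz)
```

Together the two bounds sandwich $f$ between two quadratics that touch it at $x$, and the ratio $M/m$ is the condition number of the Hessian.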
