Amir Beck's INTRODUCTION TO NONLINEAR OPTIMIZATION: Theory, Algorithms, and Applications with MATLAB, Exercise 5.2

Reference: Amir Beck, Introduction to Nonlinear Optimization: Theory, Algorithms, and Applications with MATLAB, SIAM, 2014.

This post covers the MATLAB part, problem (ii).
Problem (i): https://blog.csdn.net/IYXUAN/article/details/121454946

Objectives

  • Understand the gradient method with backtracking, the damped Newton method, and the hybrid Newton method;
  • learn to use MATLAB to solve the optimization problem with the three methods above;
  • understand the relative strengths and weaknesses of the three methods and be able to explain the observed results.

Environment

MATLAB R2021a

Problem statement

Consider the Freudenstein and Roth test function

$f(x) = f_1(x)^2 + f_2(x)^2, \quad x \in \mathbb{R}^2,$

where

$f_1(x) = -13 + x_1 + ((5 - x_2)x_2 - 2)x_2\,,$

$f_2(x) = -29 + x_1 + ((x_2 + 1)x_2 - 14)x_2\,.$

Use MATLAB to employ the following three methods on the problem of minimizing $f$:

  1. the gradient method with backtracking and parameters $(s, \alpha, \beta) = (1, 0.5, 0.5)$;
  2. the hybrid Newton's method with parameters $(s, \alpha, \beta) = (1, 0.5, 0.5)$;
  3. the damped Gauss–Newton method with a backtracking line search strategy and parameters $(s, \alpha, \beta) = (1, 0.5, 0.5)$.

All the algorithms should use the stopping criterion $\|\nabla f(x)\| \le \epsilon$ with $\epsilon = 10^{-5}$. Each algorithm should be employed four times on the following four starting points: $(-50, 7)^T$, $(20, 7)^T$, $(20, -18)^T$, $(5, -10)^T$. For each of the four starting points, compare the number of iterations and the point to which each method converged. If a method did not converge, explain why.

Algorithm descriptions

gradient_method_backtracking

Description of the gradient method with backtracking:

Set the tolerance $\epsilon$;

Initialize $x_0 \in \mathbb{R}^n$;

FOR k = 0, 1, 2, …

  1. pick the descent direction $d_k = -\nabla f(x_k)$;
  2. choose a step size $t_k$ such that $f(x_k + t_k d_k) < f(x_k)$;
  3. set $x_{k+1} = x_k + t_k d_k$;
  4. IF $\|\nabla f(x_{k+1})\| \le \epsilon$ THEN STOP and OUTPUT $x_{k+1}$;

In step 2 (step-size selection), a constant-step gradient method simply takes $t_k = \bar{t}$ for every $k$. With a backtracking (inexact) line search, the step starts at $t_k = s$, $s > 0$, and is repeatedly shrunk by $t_k \leftarrow \beta t_k$, $\beta \in (0, 1)$, until the sufficient-decrease condition $f(x_k) - f(x_k + t_k d_k) \ge -\alpha t_k \nabla f(x_k)^T d_k$, $\alpha \in (0, 1)$, is satisfied. One can show that this inequality holds for all sufficiently small $t$, so the backtracking loop always terminates.
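
For concreteness, here is a minimal MATLAB sketch of this backtracking rule for the gradient direction $d_k = -\nabla f(x_k)$. The helper name backtracking_step is only illustrative; the implementation actually used in the experiment appears in gradient_method_backtracking below.

function t = backtracking_step(f, g, x, s, alpha, beta)
% Backtracking step-size selection for the gradient direction d = -g(x) (illustrative sketch).
% f, g ............. handles for the objective and its gradient
% s, alpha, beta ... line-search parameters, here (1, 0.5, 0.5)
grad = g(x);
t = s;                                              % start from the initial step s
while f(x) - f(x - t*grad) < alpha*t*norm(grad)^2   % sufficient-decrease test not yet met
    t = beta*t;                                     % shrink the step
end
end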

newton_backtracking

Description of Newton's method with backtracking:

Newton's method with backtracking is identical to the gradient method above, except that step 1 selects the Newton direction instead:

$d_k = -\left(\nabla^2 f(x_k)\right)^{-1} \nabla f(x_k).$
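
The implementations below compute this direction with the backslash operator rather than an explicit inverse: they solve $\nabla^2 f(x_k)\, d = \nabla f(x_k)$ and then step $x_{k+1} = x_k - t_k d$, which is the same Newton update. A small self-contained sketch (the handles g and h are copied from the driver script further below, and $(20, 7)^T$ is one of the prescribed starting points):

% one damped Newton step, solving the Newton system with backslash
g = @(x) [12*x(2)^2 - 32*x(2) + 4*x(1) - 84; 24*x(2) - 32*x(1) ...
    + 24*x(1)*x(2) - 240*x(2)^2 + 8*x(2)^3 - 40*x(2)^4 + 12*x(2)^5 + 864];
h = @(x) [4, 24*x(2) - 32; 24*x(2) - 32, 60*x(2)^4 - 160*x(2)^3 ...
    + 24*x(2)^2 - 480*x(2) + 24*x(1) + 24];
x = [20; 7];      % starting point (20, 7)^T
t = 1;            % unit step, before any backtracking
d = h(x) \ g(x);  % solve hessian * d = gradient; no explicit inverse is formed
x = x - t*d       % identical to x + t*( -inv(h(x))*g(x) )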

newton_hybrid

Description of the hybrid Newton method:

When selecting the descent direction $d_k$, if the Hessian $\nabla^2 f(x_k)$ is not positive definite, the method falls back to the backtracking gradient step, i.e. $d_k = -\nabla f(x_k)$; if $\nabla^2 f(x_k)$ is positive definite, $d_k$ is taken to be the Newton direction. Positive definiteness of the Hessian is tested with a Cholesky factorization.
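
The direction selection can be sketched as the following helper (the name hybrid_direction is illustrative; it mirrors the logic inside newton_hybrid further below):

function d = hybrid_direction(gval, hval)
% Hybrid direction selection (sketch): Newton direction if the Hessian is
% positive definite, gradient direction otherwise.
% gval, hval ... gradient and Hessian at the current iterate
[L, p] = chol(hval, 'lower');   % p == 0 iff hval is (numerically) positive definite
if p == 0
    d = L' \ (L \ gval);        % Newton direction: solves hval*d = gval via the factor L
else
    d = gval;                   % fallback: gradient direction
end
end

In either branch the iterate is then updated as $x_{k+1} = x_k - t_k d$ with a backtracked step $t_k$, exactly as in the gradient and Newton methods above.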

Procedure

Computing $f(x)$, $g(x)$, $h(x)$

$f(x) = 2x_1^2 + 12x_1x_2^2 - 32x_1x_2 - 84x_1 + 2x_2^6 - 8x_2^5 + 2x_2^4 - 80x_2^3 + 12x_2^2 + 864x_2 + 1010$

$g(x) = \left(12x_2^2 - 32x_2 + 4x_1 - 84,\; 24x_2 - 32x_1 + 24x_1x_2 - 240x_2^2 + 8x_2^3 - 40x_2^4 + 12x_2^5 + 864\right)^T$

$h(x) = \begin{pmatrix} 4 & 24x_2 - 32 \\ 24x_2 - 32 & 60x_2^4 - 160x_2^3 + 24x_2^2 - 480x_2 + 24x_1 + 24 \end{pmatrix}$

clc;clear
syms f1 f2 x1 x2 f g h
f1=-13+x1+((5-x2)*x2-2)*x2;
f2=-29+x1+((x2+1)*x2-14)*x2;
f=expand(f1^2+f2^2)
% f = 2*x1^2 + 12*x1*x2^2 - 32*x1*x2 - 84*x1 + 2*x2^6 - 8*x2^5 + 2*x2^4 - 80*x2^3 + 12*x2^2 + 864*x2 + 1010
g=gradient(f)
% g = [12*x2^2 - 32*x2 + 4*x1 - 84; 24*x2 - 32*x1 + 24*x1*x2 - 240*x2^2 + 8*x2^3 - 40*x2^4 + 12*x2^5 + 864]
h=hessian(f)
% h = [4, 24*x2 - 32; 24*x2 - 32, 60*x2^4 - 160*x2^3 + 24*x2^2 - 480*x2 + 24*x1 + 24]
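
As a quick sanity check, substituting the two stationary points reported later in this post into the symbolic gradient shows that the gradient vanishes exactly at the global minimizer $(5, 4)$ and is close to zero at the (rounded) local minimizer $(11.4128, -0.8968)$:

% sanity check: run after the symbolic block above (which defines x1, x2, f, g)
double(subs(f, [x1 x2], [5 4]))               % 0: both residuals f1 and f2 vanish at (5,4)
double(subs(g, [x1 x2], [5 4]))               % [0; 0]: the gradient vanishes exactly
double(subs(g, [x1 x2], [11.4128 -0.8968]))   % approximately [0; 0] (point rounded to 4 decimals)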

Contour and surface plots of the function

Figure 1. Contour and surface plots of $f(x)$ around the global minimizer $x^* = (5, 4)$.

Figure 2. Contour and surface plots of $f(x)$ around the local minimizer $x^* = (11.4128, -0.8968)$.

clc;clear
% local optimality
x1 = linspace(11.40,11.42,50);
x2 = linspace(-0.897,-0.8965,50);
% global optimality
% x1 = linspace(4.9,5.1,50);
% x2 = linspace(3.9,4.1,50);
[X1,X2] = meshgrid(x1,x2);
f = 2*X1.^2 + 12*X1.*X2.^2 - 32*X1.*X2 - 84*X1 + 2*X2.^6 - 8*X2.^5 ...
    + 2*X2.^4 - 80*X2.^3 + 12*X2.^2 + 864*X2 + 1010;
createfigure(X1,X2,f)
%% Generated from the surfc plot:
function createfigure(xdata1, ydata1, zdata1)
%CREATEFIGURE(xdata1, ydata1, zdata1)
%  XDATA1:  surface xdata
%  YDATA1:  surface ydata
%  ZDATA1:  surface zdata
%  Auto-generated by MATLAB on 08-Dec-2021 15:44:51

% Create figure
figure1 = figure;
% Create axes
axes1 = axes('Parent',figure1);
hold(axes1,'on');
% Create surf
surf(xdata1,ydata1,zdata1,'Parent',axes1);
% Create contour
contour(xdata1,ydata1,zdata1,'ZLocation','zmin');
view(axes1,[-113 13]);
grid(axes1,'on');
axis(axes1,'tight');
hold(axes1,'off');
% Create colorbar
colorbar(axes1);

Driver script

clc;clear
%% init
s = 1;
alpha = .5;
beta = .5;
epsilon = 1e-5;
p1 = [-50;7];
p2 = [20;7];
p3 = [20;-18];
p4 = [5;-10];
f = @(x) 2*x(1)^2 + 12*x(1)*x(2)^2 - 32*x(1)*x(2) - 84*x(1) ...
    + 2*x(2)^6 - 8*x(2)^5 + 2*x(2)^4 - 80*x(2)^3 + 12*x(2)^2 ...
    + 864*x(2) + 1010;
g = @(x) [12*x(2)^2 - 32*x(2) + 4*x(1) - 84; 24*x(2) - 32*x(1) ...
    + 24*x(1)*x(2) - 240*x(2)^2 + 8*x(2)^3 - 40*x(2)^4 + 12*x(2)^5 + 864];
h = @(x) [4, 24*x(2) - 32; 24*x(2) - 32, 60*x(2)^4 - 160*x(2)^3 ...
    + 24*x(2)^2 - 480*x(2) + 24*x(1) + 24];
%% call func
% gradient_method_backtracking
[x_gb1,fun_val_gb1] = gradient_method_backtracking(f,g,p1,s,alpha,...
beta,epsilon);
[x_gb2,fun_val_gb2] = gradient_method_backtracking(f,g,p2,s,alpha,...
beta,epsilon);
[x_gb3,fun_val_gb3] = gradient_method_backtracking(f,g,p3,s,alpha,...
beta,epsilon);
[x_gb4,fun_val_gb4] = gradient_method_backtracking(f,g,p4,s,alpha,...
beta,epsilon);
% newton_backtracking
x_nb1 = newton_backtracking(f,g,h,p1,alpha,beta,epsilon);
x_nb2 = newton_backtracking(f,g,h,p2,alpha,beta,epsilon);
x_nb3 = newton_backtracking(f,g,h,p3,alpha,beta,epsilon);
x_nb4 = newton_backtracking(f,g,h,p4,alpha,beta,epsilon);
% newton_hybrid
x_nh1 = newton_hybrid(f,g,h,p1,alpha,beta,epsilon);
x_nh2 = newton_hybrid(f,g,h,p2,alpha,beta,epsilon);
x_nh3 = newton_hybrid(f,g,h,p3,alpha,beta,epsilon);
x_nh4 = newton_hybrid(f,g,h,p4,alpha,beta,epsilon);
function [x,fun_val]=gradient_method_backtracking(f,g,x0,s,alpha,beta,epsilon)
% Gradient method with backtracking stepsize rule
%
% INPUT
%=======================================
% f ......... objective function
% g ......... gradient of the objective function
% x0......... initial point
% s ......... initial choice of stepsize
% alpha ..... tolerance parameter for the stepsize selection
% beta ...... the constant in which the stepsize is multiplied
% at each backtracking step (0<beta<1)
% epsilon ... tolerance parameter for stopping rule
% OUTPUT
%=======================================
% x ......... optimal solution (up to a tolerance)
% of min f(x)
% fun_val ... optimal function value
x=x0;
grad=g(x);
fun_val=f(x);
iter=0;
while (norm(grad)>epsilon)
    iter=iter+1;
    t=s;
    while (fun_val-f(x-t*grad)<alpha*t*norm(grad)^2)
        t=beta*t;
    end
    x=x-t*grad;
    fun_val=f(x);
    grad=g(x);
    fprintf('iter_number = %3d norm_grad = %2.6f fun_val = %2.6f \n',...
        iter,norm(grad),fun_val);
end
function x=newton_backtracking(f,g,h,x0,alpha,beta,epsilon)
% Newton's method with backtracking
%
% INPUT
%=======================================
% f ......... objective function
% g ......... gradient of the objective function
% h ......... hessian of the objective function
% x0......... initial point
% alpha ..... tolerance parameter for the stepsize selection strategy
% beta ...... the proportion in which the stepsize is multiplied
% at each backtracking step (0<beta<1)
% epsilon ... tolerance parameter for stopping rule
% OUTPUT
%=======================================
% x ......... optimal solution (up to a tolerance)
% of min f(x)
% fun_val ... optimal function value
x=x0;
gval=g(x);
hval=h(x);
d=hval\gval;
iter=0;
while ((norm(gval)>epsilon)&&(iter<10000))
    iter=iter+1;
    t=1;
    while (f(x-t*d)>f(x)-alpha*t*gval'*d)
        t=beta*t;
    end
    x=x-t*d;
    fprintf('iter= %2d f(x)=%10.10f\n',iter,f(x))
    gval=g(x);
    hval=h(x);
    d=hval\gval;
end
if (iter==10000)
    fprintf('did not converge\n')
end
function x=newton_hybrid(f,g,h,x0,alpha,beta,epsilon)
% Hybrid Newton's method
%
% INPUT
%=======================================
% f ......... objective function
% g ......... gradient of the objective function
% h ......... hessian of the objective function
% x0......... initial point
% alpha ..... tolerance parameter for the stepsize selection strategy
% beta ...... the proportion in which the stepsize is multiplied
% at each backtracking step (0<beta<1)
% epsilon ... tolerance parameter for stopping rule
% OUTPUT
%=======================================
% x ......... optimal solution (up to a tolerance)
% of min f(x)
% fun_val ... optimal function value
x=x0;
gval=g(x);
hval=h(x);
[L,p]=chol(hval,'lower');
if (p==0)
    d=L'\(L\gval);
else
    d=gval;
end
iter=0;
while ((norm(gval)>epsilon)&&(iter<10000))
    iter=iter+1;
    t=1;
    while (f(x-t*d)>f(x)-alpha*t*gval'*d)
        t=beta*t;
    end
    x=x-t*d;
    fprintf('iter= %2d f(x)=%10.10f\n',iter,f(x))
    gval=g(x);
    hval=h(x);
    [L,p]=chol(hval,'lower');
    if (p==0)
        d=L'\(L\gval);
    else
        d=gval;
    end
end
if (iter==10000)
    fprintf('did not converge\n')
end

Results and analysis

$p_1 = (-50, 7)^T$, $p_2 = (20, 7)^T$, $p_3 = (20, -18)^T$, $p_4 = (5, -10)^T$

gradient_method_backtracking

      x*                  fun_val     iter
p1    (5.0, 4.0)          0.000000    -
p2    (5.0, 4.0)          0.000000    -
p3    (11.4128, -0.8968)  48.984254   -
p4    (5.0, 4.0)          0.000000    -

("-": the stopping criterion was never met; the gradient norm stalled above the tolerance, see the analysis below.)
# p1
iter_number =   1 norm_grad = 16886.999242 fun_val = 11336.705024
iter_number =   2 norm_grad = 5600.459983 fun_val = 5803.984203
iter_number =   3 norm_grad = 1056.831987 fun_val = 4723.717473
iter_number =   4 norm_grad = 434.668354 fun_val = 4676.341028
iter_number =   5 norm_grad = 181.792854 fun_val = 4664.456052
iter_number =   6 norm_grad = 481.845230 fun_val = 4645.697772
iter_number =   7 norm_grad = 1450.431648 fun_val = 2784.316562
iter_number =   8 norm_grad = 626.401853 fun_val = 2597.516663
iter_number =   9 norm_grad = 220.402369 fun_val = 2565.305713
iter_number =  10 norm_grad = 116.512243 fun_val = 2561.054079
......
iter_number = 20849 norm_grad = 0.000092 fun_val = 0.000000
iter_number = 20850 norm_grad = 0.000092 fun_val = 0.000000
iter_number = 20851 norm_grad = 0.000092 fun_val = 0.000000
iter_number = 20852 norm_grad = 0.000092 fun_val = 0.000000
......
# p2
iter_number =   1 norm_grad = 20020.154130 fun_val = 11955.275024
iter_number =   2 norm_grad = 8107.335439 fun_val = 3744.085975
iter_number =   3 norm_grad = 2993.085982 fun_val = 1129.842099
iter_number =   4 norm_grad = 949.370487 fun_val = 445.778197
iter_number =   5 norm_grad = 192.946640 fun_val = 320.464548
iter_number =   6 norm_grad = 86.882053 fun_val = 313.996463
iter_number =   7 norm_grad = 41.236184 fun_val = 311.880377
iter_number =   8 norm_grad = 223.087207 fun_val = 295.899225
iter_number =   9 norm_grad = 91.972368 fun_val = 287.479235
iter_number =  10 norm_grad = 40.293830 fun_val = 285.264643
......
iter_number = 13961 norm_grad = 0.000151 fun_val = 0.000000
iter_number = 13962 norm_grad = 0.000151 fun_val = 0.000000
iter_number = 13963 norm_grad = 0.000151 fun_val = 0.000000
iter_number = 13964 norm_grad = 0.000151 fun_val = 0.000000
iter_number = 13965 norm_grad = 0.000151 fun_val = 0.000000
......
# p3
iter_number =   1 norm_grad = 10459534.883496 fun_val = 26901796.557585
iter_number =   2 norm_grad = 4328861.498929 fun_val = 9350549.111728
iter_number =   3 norm_grad = 1815010.486957 fun_val = 3306985.630845
iter_number =   4 norm_grad = 764085.159510 fun_val = 1178877.486706
iter_number =   5 norm_grad = 322588.854922 fun_val = 423840.683140
iter_number =   6 norm_grad = 136745.858345 fun_val = 154327.967236
iter_number =   7 norm_grad = 58355.714510 fun_val = 57260.204151
iter_number =   8 norm_grad = 25173.561268 fun_val = 21780.708212
iter_number =   9 norm_grad = 11034.676282 fun_val = 8504.776429
iter_number =  10 norm_grad = 4932.916085 fun_val = 3367.298679
......
iter_number = 11759 norm_grad = 0.000034 fun_val = 48.984254
iter_number = 11760 norm_grad = 0.000034 fun_val = 48.984254
iter_number = 11761 norm_grad = 0.000034 fun_val = 48.984254
iter_number = 11762 norm_grad = 0.000034 fun_val = 48.984254
iter_number = 11763 norm_grad = 0.000034 fun_val = 48.984254
iter_number = 11764 norm_grad = 0.000034 fun_val = 48.984254
......
# p4
iter_number =   1 norm_grad = 740484.580250 fun_val = 1125740.309089
iter_number =   2 norm_grad = 318800.867908 fun_val = 410873.876977
iter_number =   3 norm_grad = 135283.737154 fun_val = 147433.167712
iter_number =   4 norm_grad = 57286.724183 fun_val = 52647.763355
iter_number =   5 norm_grad = 24273.902315 fun_val = 18647.021114
iter_number =   6 norm_grad = 10264.843110 fun_val = 6442.533586
iter_number =   7 norm_grad = 4279.439838 fun_val = 2094.562330
iter_number =   8 norm_grad = 1694.354839 fun_val = 604.979471
iter_number =   9 norm_grad = 568.486420 fun_val = 158.093768
iter_number =  10 norm_grad = 98.759692 fun_val = 69.674565
......
iter_number = 6072 norm_grad = 0.000107 fun_val = 0.000000
iter_number = 6073 norm_grad = 0.000107 fun_val = 0.000000
iter_number = 6074 norm_grad = 0.000107 fun_val = 0.000000
iter_number = 6075 norm_grad = 0.000107 fun_val = 0.000000
iter_number = 6076 norm_grad = 0.000107 fun_val = 0.000000
......

newton_backtracking

      x*                  fun_val     iter
p1    (5.0, 4.0)          -0.000000   8
p2    (5.0, 4.0)          0.000000    11
p3    (11.4128, -0.8968)  48.984254   16
p4    (11.4128, -0.8968)  48.984254   13
# p1
iter=  1 f(x)=17489.2000310639
iter=  2 f(x)=3536.1276294897
iter=  3 f(x)=556.9651095482
iter=  4 f(x)=50.4029326519
iter=  5 f(x)=1.2248344195
iter=  6 f(x)=0.0013284549
iter=  7 f(x)=0.0000000018
iter=  8 f(x)=-0.0000000000
# p2
iter=  1 f(x)=18114.8519548468
iter=  2 f(x)=3676.4191962662
iter=  3 f(x)=583.8090713510
iter=  4 f(x)=53.8298627069
iter=  5 f(x)=1.3684607619
iter=  6 f(x)=0.0016459572
iter=  7 f(x)=0.0000000027
iter=  8 f(x)=0.0000000007
iter=  9 f(x)=0.0000000002
iter= 10 f(x)=0.0000000000
iter= 11 f(x)=0.0000000000
# p3
iter=  1 f(x)=21356364.5665758252
iter=  2 f(x)=5561714.7070109081
iter=  3 f(x)=1444932.1368662491
iter=  4 f(x)=374422.3650773597
iter=  5 f(x)=96843.5106842914
iter=  6 f(x)=25078.5606496656
iter=  7 f(x)=6556.5446423615
iter=  8 f(x)=1764.6276022580
iter=  9 f(x)=509.7515417783
iter= 10 f(x)=171.5021447876
iter= 11 f(x)=76.9343325176
iter= 12 f(x)=52.3964690617
iter= 13 f(x)=49.0499328705
iter= 14 f(x)=48.9842770185
iter= 15 f(x)=48.9842536792
iter= 16 f(x)=48.9842536792
# p4
iter=  1 f(x)=695503.8833055196
iter=  2 f(x)=180005.8157903731
iter=  3 f(x)=46559.5630006172
iter=  4 f(x)=12100.5787363822
iter=  5 f(x)=3202.4785180644
iter=  6 f(x)=889.1211707798
iter=  7 f(x)=275.2063763915
iter=  8 f(x)=106.0983438392
iter=  9 f(x)=59.2021422471
iter= 10 f(x)=49.5512377643
iter= 11 f(x)=48.9860167506
iter= 12 f(x)=48.9842536959
iter= 13 f(x)=48.9842536792

newton_hybrid

      x*                  fun_val     iter
p1    (5.0, 4.0)          -0.000000   8
p2    (5.0, 4.0)          0.000000    11
p3    (11.4128, -0.8968)  48.984254   16
p4    (11.4128, -0.8968)  48.984254   13
# p1
iter=  1 f(x)=17489.2000310639
iter=  2 f(x)=3536.1276294897
iter=  3 f(x)=556.9651095482
iter=  4 f(x)=50.4029326519
iter=  5 f(x)=1.2248344195
iter=  6 f(x)=0.0013284549
iter=  7 f(x)=0.0000000018
iter=  8 f(x)=-0.0000000000
# p2
iter=  1 f(x)=18114.8519548468
iter=  2 f(x)=3676.4191962662
iter=  3 f(x)=583.8090713510
iter=  4 f(x)=53.8298627069
iter=  5 f(x)=1.3684607619
iter=  6 f(x)=0.0016459572
iter=  7 f(x)=0.0000000027
iter=  8 f(x)=0.0000000007
iter=  9 f(x)=0.0000000002
iter= 10 f(x)=0.0000000000
iter= 11 f(x)=0.0000000000
# p3
iter=  1 f(x)=21356364.5665758252
iter=  2 f(x)=5561714.7070109081
iter=  3 f(x)=1444932.1368662491
iter=  4 f(x)=374422.3650773597
iter=  5 f(x)=96843.5106842914
iter=  6 f(x)=25078.5606496656
iter=  7 f(x)=6556.5446423615
iter=  8 f(x)=1764.6276022580
iter=  9 f(x)=509.7515417783
iter= 10 f(x)=171.5021447876
iter= 11 f(x)=76.9343325176
iter= 12 f(x)=52.3964690617
iter= 13 f(x)=49.0499328705
iter= 14 f(x)=48.9842770185
iter= 15 f(x)=48.9842536792
iter= 16 f(x)=48.9842536792
# p4
iter=  1 f(x)=695503.8833055196
iter=  2 f(x)=180005.8157903731
iter=  3 f(x)=46559.5630006172
iter=  4 f(x)=12100.5787363822
iter=  5 f(x)=3202.4785180644
iter=  6 f(x)=889.1211707798
iter=  7 f(x)=275.2063763915
iter=  8 f(x)=106.0983438392
iter=  9 f(x)=59.2021422471
iter= 10 f(x)=49.5512377643
iter= 11 f(x)=48.9860167506
iter= 12 f(x)=48.9842536959
iter= 13 f(x)=48.9842536792

Analysis

The gradient method with backtracking (gradient_method_backtracking) performs poorly on $f(x)$: from every starting point (p1-p4) it never satisfies the stopping rule $\|\nabla f(x)\| \le \epsilon$, $\epsilon = 10^{-5}$, and therefore never terminates. The iterates and function values do converge, however: from p1, p2 and p4 the method approaches the global minimizer, $\bar{x} \to (5.0, 4.0)$ with $fun\_val \to 0$, while from p3 it approaches only the local minimizer. In every run the gradient norm eventually stalls just above the tolerance (at roughly $9.2 \times 10^{-5}$, $1.5 \times 10^{-4}$, $3.4 \times 10^{-5}$ and $1.1 \times 10^{-4}$ for p1-p4, respectively): once the attainable decrease in $f$ falls below floating-point resolution, the backtracking loop shrinks the step to an effectively zero length, the iterate stops changing, and the gradient norm can no longer be pushed below $\epsilon$. The method thus fails to terminate not because it diverges, but because its convergence is too slow to reach the required gradient tolerance in finite precision.
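
A plausible contributing factor, not examined in the original write-up, is the conditioning of the Hessian near the minimizers, which governs the linear convergence rate of gradient descent. A quick check (h is the Hessian handle from the driver script above):

% condition number of the Hessian at the two limit points (h from the driver script)
cond(h([5; 4]))                 % roughly 1.3e3 at the global minimizer (5, 4)
cond(h([11.4128; -0.8968]))     % also on the order of 1e3 at the local minimizer

With condition numbers on the order of $10^3$, the gradient method can be expected to need thousands of iterations to reduce the gradient norm by many orders of magnitude, which is consistent with the iteration counts observed above.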

The backtracking Newton method (newton_backtracking) and the hybrid Newton method (newton_hybrid) behave essentially identically, and both converge from all four starting points. From p1 and p2 they reach the global minimizer $(5.0, 4.0)$ in 8 and 11 iterations, respectively; from p3 and p4 they reach the local minimizer $(11.4128, -0.8968)$ within 16 iterations. Comparing the two methods' logs for all four starting points, the outputs are exactly the same. This indicates that the Cholesky factorization of $\nabla^2 f(x_k)$ succeeded, i.e. the Hessian was positive definite, at every iterate encountered, so the hybrid method always took the Newton direction and never fell back to the gradient step; on these runs it is therefore equivalent to the backtracking Newton method. A quick numerical check of the Hessian at the two limit points is given below.
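
This can be spot-checked with the same Cholesky test the hybrid method uses; at both limit points the factorization succeeds, i.e. the Hessian is positive definite there (h is the Hessian handle from the driver script above):

% positive-definiteness check at the two limit points; p == 0 means chol succeeded
[~, p_global] = chol(h([5; 4]))              % p_global = 0: positive definite at (5, 4)
[~, p_local]  = chol(h([11.4128; -0.8968]))  % p_local  = 0: positive definite at (11.4128, -0.8968)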

In summary, on $f(x)$ with the stopping rule $\|\nabla f(x)\| \le \epsilon$, $\epsilon = 10^{-5}$, the Newton-type methods require far fewer iterations than the gradient method. For all of the methods, the choice of starting point determines whether the iterates converge to the global minimizer or only to a local one.

©️ Sylvan Ding ❤️
