Introduction to Nonlinear Optimization, Exercise 5.2: The Freudenstein and Roth Test Function
Source: Amir Beck, *Introduction to Nonlinear Optimization: Theory, Algorithms, and Applications with MATLAB*, SIAM, 2014.
This post covers the MATLAB part, item (ii), of the exercise.
Item (i) is discussed at: https://blog.csdn.net/IYXUAN/article/details/121454946
Objectives
- Understand gradient descent with backtracking, damped Newton's method, and the hybrid Newton's method;
- Learn to solve optimization problems in MATLAB with these three methods;
- Understand the strengths and weaknesses of the three methods and be able to explain the observed results.
Environment
MATLAB R2021a
Problem Statement
Consider the Freudenstein and Roth test function
$$f(x) = f_1(x)^2 + f_2(x)^2, \quad x \in \mathbb{R}^2,$$
where
$$f_1(x) = -13 + x_1 + \left(\left(5 - x_2\right)x_2 - 2\right)x_2\,,$$
$$f_2(x) = -29 + x_1 + \left(\left(x_2 + 1\right)x_2 - 14\right)x_2\,.$$
Use MATLAB to employ the following three methods on the problem of minimizing $f$:
- the gradient method with backtracking and parameters $(s, \alpha, \beta) = (1, 0.5, 0.5)$;
- the hybrid Newton's method with parameters $(s, \alpha, \beta) = (1, 0.5, 0.5)$;
- the damped Gauss–Newton method with a backtracking line search strategy with parameters $(s, \alpha, \beta) = (1, 0.5, 0.5)$.
All the algorithms should use the stopping criterion $\|\nabla f(x)\| \le \varepsilon$, $\varepsilon = 10^{-5}$. Each algorithm should be employed four times on the following four starting points: $(-50, 7)^T$, $(20, 7)^T$, $(20, -18)^T$, $(5, -10)^T$. For each of the four starting points, compare the number of iterations and the point to which each method converged. If a method did not converge, explain why.
Algorithm Descriptions
gradient_method_backtracking
Gradient descent with backtracking:
Set the tolerance $\varepsilon > 0$;
Initialize $x_0 \in \mathbb{R}^n$;
FOR $k = 0, 1, 2, \dots$
- Pick the descent direction $d_k = -g(x_k)$, where $g = \nabla f$;
- Pick a step size $t_k$ such that $f(x_k + t_k d_k) < f(x_k)$;
- Set $x_{k+1} = x_k + t_k d_k$;
- IF $\|g(x_{k+1})\| \le \varepsilon$ THEN STOP and OUTPUT $x_{k+1}$.
In step 2, the step-size selection, a constant-step-size gradient method takes $t_k = \bar{t}$ for every $k$. An inexact backtracking line search instead initializes $t_k = s$, $s > 0$, and repeatedly updates $t_k \gets \beta t_k$, $\beta \in (0,1)$, until the sufficient-decrease condition $f(x_k) - f(x_k + t_k d_k) \ge -\alpha t_k \nabla f(x_k)^T d_k$, $\alpha \in (0,1)$, is satisfied. One can show there exists some $\hat{t} > 0$ such that the inequality holds for every $t \in [0, \hat{t}\,]$, so the backtracking loop terminates after finitely many reductions.
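A minimal MATLAB sketch of one such backtracking step; the quadratic objective here is a hypothetical stand-in for illustration, not the test function:
% Minimal sketch of the backtracking rule above; the objective f and
% gradient g below are illustrative placeholders.
f = @(x) x(1)^2 + 10*x(2)^2;          % example objective
g = @(x) [2*x(1); 20*x(2)];           % its gradient
x = [3; -2];                          % current iterate
s = 1; alpha = 0.5; beta = 0.5;       % (s, alpha, beta) as in the exercise
d = -g(x);                            % descent direction
t = s;                                % start from the initial step size
% shrink t until the sufficient-decrease condition holds
while f(x) - f(x + t*d) < -alpha*t*g(x)'*d
    t = beta*t;
end
x = x + t*d;                          % accept the step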
newton_backtracking
Backtracking Newton's method:
Backtracking Newton is the same as the gradient method above, except that the descent direction in step 1 is replaced by the Newton direction
$$d_k = -\left(\nabla^2 f(x_k)\right)^{-1} \nabla f(x_k).$$
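In practice the inverse is never formed explicitly; a minimal sketch of the equivalent linear solve, using the `hval`/`gval` naming of the implementations below (which compute the positive direction `d` and then step as `x - t*d`):
% Equivalent to d = inv(hval)*gval but cheaper and more stable:
% solve the linear system hval * d = gval with backslash.
d = hval \ gval;
x = x - t*d;   % damped Newton step with step size t from backtracking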
newton_hybrid
Hybrid Newton's method:
When choosing the descent direction $d_k$: if the Hessian $\nabla^2 f(x_k)$ is not positive definite, take a backtracking gradient step instead, i.e. $d_k = -\nabla f(x_k)$; if $\nabla^2 f(x_k)$ is positive definite, take the Newton direction. Positive definiteness is tested with a Cholesky factorization, which succeeds exactly when the matrix is positive definite, as sketched below.
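A minimal sketch of this test, mirroring how the implementation below uses `chol`; here `hval` and `gval` denote the Hessian and gradient at the current iterate:
% chol returns p == 0 iff hval is positive definite; in that case
% L is lower triangular with hval = L*L', so the Newton system
% hval*d = gval is solved by two triangular solves.
[L, p] = chol(hval, 'lower');
if p == 0
    d = L' \ (L \ gval);   % Newton direction (step taken as x - t*d)
else
    d = gval;              % fall back to the gradient direction
end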
Procedure
Computing $f(x)$, $g(x)$, and $h(x)$
$$f(x) = 2x_1^2 + 12x_1x_2^2 - 32x_1x_2 - 84x_1 + 2x_2^6 - 8x_2^5 + 2x_2^4 - 80x_2^3 + 12x_2^2 + 864x_2 + 1010$$
$$g(x) = \begin{pmatrix} 12x_2^2 - 32x_2 + 4x_1 - 84 \\ 24x_2 - 32x_1 + 24x_1x_2 - 240x_2^2 + 8x_2^3 - 40x_2^4 + 12x_2^5 + 864 \end{pmatrix}$$
$$h(x) = \begin{pmatrix} 4 & 24x_2 - 32 \\ 24x_2 - 32 & 60x_2^4 - 160x_2^3 + 24x_2^2 - 480x_2 + 24x_1 + 24 \end{pmatrix}$$
clc;clear
% Derive f and its gradient/Hessian symbolically
syms x1 x2
f1 = -13 + x1 + ((5 - x2)*x2 - 2)*x2;
f2 = -29 + x1 + ((x2 + 1)*x2 - 14)*x2;
f = expand(f1^2 + f2^2)
% f = 2*x1^2 + 12*x1*x2^2 - 32*x1*x2 - 84*x1 + 2*x2^6 - 8*x2^5 + 2*x2^4 - 80*x2^3 + 12*x2^2 + 864*x2 + 1010
g=gradient(f)
% g = [12*x2^2 - 32*x2 + 4*x1 - 84; 24*x2 - 32*x1 + 24*x1*x2 - 240*x2^2 + 8*x2^3 - 40*x2^4 + 12*x2^5 + 864]
h=hessian(f)
% h = [4, 24*x2 - 32; 24*x2 - 32, 60*x2^4 - 160*x2^3 + 24*x2^2 - 480*x2 + 24*x1 + 24]
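The printed expressions are copied by hand into anonymous functions in the driver script below. Alternatively (a sketch, not used in the runs reported here), `matlabFunction` can generate the handles directly from the symbolic results:
% Convert the symbolic f, g, h into numeric handles taking one
% 2-vector argument x = [x1; x2], avoiding manual transcription.
fh = matlabFunction(f, 'Vars', {[x1; x2]});
gh = matlabFunction(g, 'Vars', {[x1; x2]});
hh = matlabFunction(h, 'Vars', {[x1; x2]});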
Contour and surface plots of the function
Figure 1. Contour and surface plots of $f(x)$ around the global minimizer $x^* = (5, 4)$.
Figure 2. Contour and surface plots of $f(x)$ around the local minimizer $x^* = (11.4128, -0.8968)$.
clc;clear
% local optimality
x1 = linspace(11.40,11.42,50);
x2 = linspace(-0.897,-0.8965,50);
% global optimality
% x1 = linspace(4.9,5.1,50);
% x2 = linspace(3.9,4.1,50);
[X1,X2] = meshgrid(x1,x2);
f = 2*X1.^2 + 12*X1.*X2.^2 - 32*X1.*X2 - 84*X1 + 2*X2.^6 - 8*X2.^5 ...
    + 2*X2.^4 - 80*X2.^3 + 12*X2.^2 + 864*X2 + 1010;
createfigure(X1,X2,f)
%% Generated by the surfc workflow:
function createfigure(xdata1, ydata1, zdata1)
%CREATEFIGURE(xdata1, ydata1, zdata1)
%  XDATA1: surface xdata
%  YDATA1: surface ydata
%  ZDATA1: surface zdata
%  Auto-generated by MATLAB on 08-Dec-2021 15:44:51

% Create figure and axes
figure1 = figure;
axes1 = axes('Parent',figure1);
hold(axes1,'on');

% Create surface and contour
surf(xdata1,ydata1,zdata1,'Parent',axes1);
contour(xdata1,ydata1,zdata1,'ZLocation','zmin');
view(axes1,[-113 13]);
grid(axes1,'on');
axis(axes1,'tight');
hold(axes1,'off');

% Create colorbar
colorbar(axes1);
end
Running the methods
clc;clear
%% init
s = 1;
alpha = .5;
beta = .5;
epsilon = 1e-5;
p1 = [-50;7];
p2 = [20;7];
p3 = [20;-18];
p4 = [5;-10];
f = @(x) 2*x(1)^2 + 12*x(1)*x(2)^2 - 32*x(1)*x(2) - 84*x(1) ...
    + 2*x(2)^6 - 8*x(2)^5 + 2*x(2)^4 - 80*x(2)^3 + 12*x(2)^2 ...
    + 864*x(2) + 1010;
g = @(x) [12*x(2)^2 - 32*x(2) + 4*x(1) - 84; 24*x(2) - 32*x(1) ...
    + 24*x(1)*x(2) - 240*x(2)^2 + 8*x(2)^3 - 40*x(2)^4 + 12*x(2)^5 + 864];
h = @(x) [4, 24*x(2) - 32; 24*x(2) - 32, 60*x(2)^4 - 160*x(2)^3 ...
    + 24*x(2)^2 - 480*x(2) + 24*x(1) + 24];
%% call func
% gradient_method_backtracking
[x_gb1,fun_val_gb1] = gradient_method_backtracking(f,g,p1,s,alpha,...
beta,epsilon);
[x_gb2,fun_val_gb2] = gradient_method_backtracking(f,g,p2,s,alpha,...
beta,epsilon);
[x_gb3,fun_val_gb3] = gradient_method_backtracking(f,g,p3,s,alpha,...
beta,epsilon);
[x_gb4,fun_val_gb4] = gradient_method_backtracking(f,g,p4,s,alpha,...
beta,epsilon);
% newton_backtracking
x_nb1 = newton_backtracking(f,g,h,p1,alpha,beta,epsilon);
x_nb2 = newton_backtracking(f,g,h,p2,alpha,beta,epsilon);
x_nb3 = newton_backtracking(f,g,h,p3,alpha,beta,epsilon);
x_nb4 = newton_backtracking(f,g,h,p4,alpha,beta,epsilon);
% newton_hybrid
x_nh1 = newton_hybrid(f,g,h,p1,alpha,beta,epsilon);
x_nh2 = newton_hybrid(f,g,h,p2,alpha,beta,epsilon);
x_nh3 = newton_hybrid(f,g,h,p3,alpha,beta,epsilon);
x_nh4 = newton_hybrid(f,g,h,p4,alpha,beta,epsilon);
function [x,fun_val]=gradient_method_backtracking(f,g,x0,s,alpha,...
    beta,epsilon)
% Gradient method with backtracking stepsize rule
%
% INPUT
%=======================================
% f ......... objective function
% g ......... gradient of the objective function
% x0......... initial point
% s ......... initial choice of stepsize
% alpha ..... tolerance parameter for the stepsize selection
% beta ...... the constant by which the stepsize is multiplied
%             at each backtracking step (0<beta<1)
% epsilon ... tolerance parameter for the stopping rule
% OUTPUT
%=======================================
% x ......... optimal solution (up to a tolerance) of min f(x)
% fun_val ... optimal function value
x=x0;
grad=g(x);
fun_val=f(x);
iter=0;
while (norm(grad)>epsilon)
    iter=iter+1;
    t=s;
    % backtrack until the sufficient-decrease condition holds
    while (fun_val-f(x-t*grad)<alpha*t*norm(grad)^2)
        t=beta*t;
    end
    x=x-t*grad;
    fun_val=f(x);
    grad=g(x);
    fprintf('iter_number = %3d norm_grad = %2.6f fun_val = %2.6f \n',...
        iter,norm(grad),fun_val);
end
end
function x=newton_backtracking(f,g,h,x0,alpha,beta,epsilon)
% Newton's method with backtracking
%
% INPUT
%=======================================
% f ......... objective function
% g ......... gradient of the objective function
% h ......... Hessian of the objective function
% x0......... initial point
% alpha ..... tolerance parameter for the stepsize selection strategy
% beta ...... the proportion by which the stepsize is multiplied
%             at each backtracking step (0<beta<1)
% epsilon ... tolerance parameter for the stopping rule
% OUTPUT
%=======================================
% x ......... optimal solution (up to a tolerance) of min f(x)
x=x0;
gval=g(x);
hval=h(x);
d=hval\gval;
iter=0;
while ((norm(gval)>epsilon)&&(iter<10000))
    iter=iter+1;
    t=1;
    % backtrack until the sufficient-decrease condition holds
    while (f(x-t*d)>f(x)-alpha*t*gval'*d)
        t=beta*t;
    end
    x=x-t*d;
    fprintf('iter= %2d f(x)=%10.10f\n',iter,f(x))
    gval=g(x);
    hval=h(x);
    d=hval\gval;
end
if (iter==10000)
    fprintf('did not converge\n')
end
end
function x=newton_hybrid(f,g,h,x0,alpha,beta,epsilon)
% Hybrid Newton's method
%
% INPUT
%=======================================
% f ......... objective function
% g ......... gradient of the objective function
% h ......... Hessian of the objective function
% x0......... initial point
% alpha ..... tolerance parameter for the stepsize selection strategy
% beta ...... the proportion by which the stepsize is multiplied
%             at each backtracking step (0<beta<1)
% epsilon ... tolerance parameter for the stopping rule
% OUTPUT
%=======================================
% x ......... optimal solution (up to a tolerance) of min f(x)
x=x0;
gval=g(x);
hval=h(x);
[L,p]=chol(hval,'lower');
if (p==0)
    % Hessian is positive definite: Newton direction via triangular solves
    d=L'\(L\gval);
else
    % Hessian is not positive definite: fall back to the gradient direction
    d=gval;
end
iter=0;
while ((norm(gval)>epsilon)&&(iter<10000))
    iter=iter+1;
    t=1;
    while (f(x-t*d)>f(x)-alpha*t*gval'*d)
        t=beta*t;
    end
    x=x-t*d;
    fprintf('iter= %2d f(x)=%10.10f\n',iter,f(x))
    gval=g(x);
    hval=h(x);
    [L,p]=chol(hval,'lower');
    if (p==0)
        d=L'\(L\gval);
    else
        d=gval;
    end
end
if (iter==10000)
    fprintf('did not converge\n')
end
end
Results and Analysis
$p_1 = (-50, 7)^T$, $p_2 = (20, 7)^T$, $p_3 = (20, -18)^T$, $p_4 = (5, -10)^T$
gradient_method_backtracking
| start | x* | fun_val | iter |
|---|---|---|---|
| p1 | (5.0, 4.0) | 0.000000 | – |
| p2 | (5.0, 4.0) | 0.000000 | – |
| p3 | (11.4128, -0.8968) | 48.984254 | – |
| p4 | (5.0, 4.0) | 0.000000 | – |

(– : the stopping criterion was never met; the values shown are where the iterates stalled. From p4 the gradient norm plateaued at about 1.07e-4; see the discussion below.)
# p1
iter_number = 1 norm_grad = 16886.999242 fun_val = 11336.705024
iter_number = 2 norm_grad = 5600.459983 fun_val = 5803.984203
iter_number = 3 norm_grad = 1056.831987 fun_val = 4723.717473
iter_number = 4 norm_grad = 434.668354 fun_val = 4676.341028
iter_number = 5 norm_grad = 181.792854 fun_val = 4664.456052
iter_number = 6 norm_grad = 481.845230 fun_val = 4645.697772
iter_number = 7 norm_grad = 1450.431648 fun_val = 2784.316562
iter_number = 8 norm_grad = 626.401853 fun_val = 2597.516663
iter_number = 9 norm_grad = 220.402369 fun_val = 2565.305713
iter_number = 10 norm_grad = 116.512243 fun_val = 2561.054079
......
iter_number = 20849 norm_grad = 0.000092 fun_val = 0.000000
iter_number = 20850 norm_grad = 0.000092 fun_val = 0.000000
iter_number = 20851 norm_grad = 0.000092 fun_val = 0.000000
iter_number = 20852 norm_grad = 0.000092 fun_val = 0.000000
......
# p2
iter_number = 1 norm_grad = 20020.154130 fun_val = 11955.275024
iter_number = 2 norm_grad = 8107.335439 fun_val = 3744.085975
iter_number = 3 norm_grad = 2993.085982 fun_val = 1129.842099
iter_number = 4 norm_grad = 949.370487 fun_val = 445.778197
iter_number = 5 norm_grad = 192.946640 fun_val = 320.464548
iter_number = 6 norm_grad = 86.882053 fun_val = 313.996463
iter_number = 7 norm_grad = 41.236184 fun_val = 311.880377
iter_number = 8 norm_grad = 223.087207 fun_val = 295.899225
iter_number = 9 norm_grad = 91.972368 fun_val = 287.479235
iter_number = 10 norm_grad = 40.293830 fun_val = 285.264643
......
iter_number = 13961 norm_grad = 0.000151 fun_val = 0.000000
iter_number = 13962 norm_grad = 0.000151 fun_val = 0.000000
iter_number = 13963 norm_grad = 0.000151 fun_val = 0.000000
iter_number = 13964 norm_grad = 0.000151 fun_val = 0.000000
iter_number = 13965 norm_grad = 0.000151 fun_val = 0.000000
......
# p3
iter_number = 1 norm_grad = 10459534.883496 fun_val = 26901796.557585
iter_number = 2 norm_grad = 4328861.498929 fun_val = 9350549.111728
iter_number = 3 norm_grad = 1815010.486957 fun_val = 3306985.630845
iter_number = 4 norm_grad = 764085.159510 fun_val = 1178877.486706
iter_number = 5 norm_grad = 322588.854922 fun_val = 423840.683140
iter_number = 6 norm_grad = 136745.858345 fun_val = 154327.967236
iter_number = 7 norm_grad = 58355.714510 fun_val = 57260.204151
iter_number = 8 norm_grad = 25173.561268 fun_val = 21780.708212
iter_number = 9 norm_grad = 11034.676282 fun_val = 8504.776429
iter_number = 10 norm_grad = 4932.916085 fun_val = 3367.298679
......
iter_number = 11759 norm_grad = 0.000034 fun_val = 48.984254
iter_number = 11760 norm_grad = 0.000034 fun_val = 48.984254
iter_number = 11761 norm_grad = 0.000034 fun_val = 48.984254
iter_number = 11762 norm_grad = 0.000034 fun_val = 48.984254
iter_number = 11763 norm_grad = 0.000034 fun_val = 48.984254
iter_number = 11764 norm_grad = 0.000034 fun_val = 48.984254
......
# p4
iter_number = 1 norm_grad = 740484.580250 fun_val = 1125740.309089
iter_number = 2 norm_grad = 318800.867908 fun_val = 410873.876977
iter_number = 3 norm_grad = 135283.737154 fun_val = 147433.167712
iter_number = 4 norm_grad = 57286.724183 fun_val = 52647.763355
iter_number = 5 norm_grad = 24273.902315 fun_val = 18647.021114
iter_number = 6 norm_grad = 10264.843110 fun_val = 6442.533586
iter_number = 7 norm_grad = 4279.439838 fun_val = 2094.562330
iter_number = 8 norm_grad = 1694.354839 fun_val = 604.979471
iter_number = 9 norm_grad = 568.486420 fun_val = 158.093768
iter_number = 10 norm_grad = 98.759692 fun_val = 69.674565
......
iter_number = 6072 norm_grad = 0.000107 fun_val = 0.000000
iter_number = 6073 norm_grad = 0.000107 fun_val = 0.000000
iter_number = 6074 norm_grad = 0.000107 fun_val = 0.000000
iter_number = 6075 norm_grad = 0.000107 fun_val = 0.000000
iter_number = 6076 norm_grad = 0.000107 fun_val = 0.000000
......
newton_backtracking
| start | x* | fun_val | iter |
|---|---|---|---|
| p1 | (5.0, 4.0) | -0.000000 | 8 |
| p2 | (5.0, 4.0) | 0.000000 | 11 |
| p3 | (11.4128, -0.8968) | 48.984254 | 16 |
| p4 | (11.4128, -0.8968) | 48.984254 | 13 |
# p1
iter= 1 f(x)=17489.2000310639
iter= 2 f(x)=3536.1276294897
iter= 3 f(x)=556.9651095482
iter= 4 f(x)=50.4029326519
iter= 5 f(x)=1.2248344195
iter= 6 f(x)=0.0013284549
iter= 7 f(x)=0.0000000018
iter= 8 f(x)=-0.0000000000
# p2
iter= 1 f(x)=18114.8519548468
iter= 2 f(x)=3676.4191962662
iter= 3 f(x)=583.8090713510
iter= 4 f(x)=53.8298627069
iter= 5 f(x)=1.3684607619
iter= 6 f(x)=0.0016459572
iter= 7 f(x)=0.0000000027
iter= 8 f(x)=0.0000000007
iter= 9 f(x)=0.0000000002
iter= 10 f(x)=0.0000000000
iter= 11 f(x)=0.0000000000
# p3
iter= 1 f(x)=21356364.5665758252
iter= 2 f(x)=5561714.7070109081
iter= 3 f(x)=1444932.1368662491
iter= 4 f(x)=374422.3650773597
iter= 5 f(x)=96843.5106842914
iter= 6 f(x)=25078.5606496656
iter= 7 f(x)=6556.5446423615
iter= 8 f(x)=1764.6276022580
iter= 9 f(x)=509.7515417783
iter= 10 f(x)=171.5021447876
iter= 11 f(x)=76.9343325176
iter= 12 f(x)=52.3964690617
iter= 13 f(x)=49.0499328705
iter= 14 f(x)=48.9842770185
iter= 15 f(x)=48.9842536792
iter= 16 f(x)=48.9842536792
# p4
iter= 1 f(x)=695503.8833055196
iter= 2 f(x)=180005.8157903731
iter= 3 f(x)=46559.5630006172
iter= 4 f(x)=12100.5787363822
iter= 5 f(x)=3202.4785180644
iter= 6 f(x)=889.1211707798
iter= 7 f(x)=275.2063763915
iter= 8 f(x)=106.0983438392
iter= 9 f(x)=59.2021422471
iter= 10 f(x)=49.5512377643
iter= 11 f(x)=48.9860167506
iter= 12 f(x)=48.9842536959
iter= 13 f(x)=48.9842536792
newton_hybrid
| start | x* | fun_val | iter |
|---|---|---|---|
| p1 | (5.0, 4.0) | -0.000000 | 8 |
| p2 | (5.0, 4.0) | 0.000000 | 11 |
| p3 | (11.4128, -0.8968) | 48.984254 | 16 |
| p4 | (11.4128, -0.8968) | 48.984254 | 13 |
# p1
iter= 1 f(x)=17489.2000310639
iter= 2 f(x)=3536.1276294897
iter= 3 f(x)=556.9651095482
iter= 4 f(x)=50.4029326519
iter= 5 f(x)=1.2248344195
iter= 6 f(x)=0.0013284549
iter= 7 f(x)=0.0000000018
iter= 8 f(x)=-0.0000000000
# p2
iter= 1 f(x)=18114.8519548468
iter= 2 f(x)=3676.4191962662
iter= 3 f(x)=583.8090713510
iter= 4 f(x)=53.8298627069
iter= 5 f(x)=1.3684607619
iter= 6 f(x)=0.0016459572
iter= 7 f(x)=0.0000000027
iter= 8 f(x)=0.0000000007
iter= 9 f(x)=0.0000000002
iter= 10 f(x)=0.0000000000
iter= 11 f(x)=0.0000000000
# p3
iter= 1 f(x)=21356364.5665758252
iter= 2 f(x)=5561714.7070109081
iter= 3 f(x)=1444932.1368662491
iter= 4 f(x)=374422.3650773597
iter= 5 f(x)=96843.5106842914
iter= 6 f(x)=25078.5606496656
iter= 7 f(x)=6556.5446423615
iter= 8 f(x)=1764.6276022580
iter= 9 f(x)=509.7515417783
iter= 10 f(x)=171.5021447876
iter= 11 f(x)=76.9343325176
iter= 12 f(x)=52.3964690617
iter= 13 f(x)=49.0499328705
iter= 14 f(x)=48.9842770185
iter= 15 f(x)=48.9842536792
iter= 16 f(x)=48.9842536792
# p4
iter= 1 f(x)=695503.8833055196
iter= 2 f(x)=180005.8157903731
iter= 3 f(x)=46559.5630006172
iter= 4 f(x)=12100.5787363822
iter= 5 f(x)=3202.4785180644
iter= 6 f(x)=889.1211707798
iter= 7 f(x)=275.2063763915
iter= 8 f(x)=106.0983438392
iter= 9 f(x)=59.2021422471
iter= 10 f(x)=49.5512377643
iter= 11 f(x)=48.9860167506
iter= 12 f(x)=48.9842536959
iter= 13 f(x)=48.9842536792
Discussion
Gradient descent with backtracking (gradient_method_backtracking) performs poorly on $f(x)$: from every starting point (p1–p4) it fails to satisfy the stopping criterion $\|\nabla f(x)\| \le \varepsilon$, $\varepsilon = 10^{-5}$, so its iteration count grows without bound. Nevertheless, as the iterations accumulate, the iterate and the function value do settle down. From p1, p2, and p4, backtracking gradient descent drives $\bar{x} \to (5.0, 4.0)$ and $fun\_val \to 0.0$, which is clearly the global minimum of $f(x)$; from p3 it settles at the local minimum instead. Moreover, from p4 the gradient norm stalls around $1.07 \times 10^{-4}$, a worse plateau than from p1 and p2. In short, the logs show that near the minimizers of $f(x)$, global or local, the gradient method's steps become so small that $\|\nabla f(x_k)\|$ plateaus above the tolerance in finite-precision arithmetic, so the stopping rule is never triggered even though the iterates have effectively converged.
Backtracking Newton (newton_backtracking) and hybrid Newton (newton_hybrid) behave almost identically, and both converge well from all four starting points. From p1 and p2, both reach the global minimum $(5.0, 4.0)$ in about 10 iterations; from p3 and p4, both reach the local minimum $(11.4128, -0.8968)$ within 16 iterations. Comparing the two methods' output over the four starting points, the logs are in fact identical. This means the hybrid method took the Newton direction at every iteration, i.e. $\nabla^2 f(x_k)$ was positive definite at every iterate visited, so the gradient-descent fallback was never triggered and hybrid Newton reduced to backtracking Newton on these runs.
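The positive-definiteness claim can be sanity-checked at the two limit points; a minimal sketch, reusing the Hessian handle h from the driver script above (both factorizations should succeed if the reasoning above is correct):
% chol reports p == 0 exactly when the matrix is positive definite.
h = @(x) [4, 24*x(2) - 32; 24*x(2) - 32, 60*x(2)^4 - 160*x(2)^3 ...
    + 24*x(2)^2 - 480*x(2) + 24*x(1) + 24];
[~, p_global] = chol(h([5; 4]));             % Hessian at the global minimizer
[~, p_local]  = chol(h([11.4128; -0.8968])); % Hessian at the local minimizer
fprintf('PD at global: %d, PD at local: %d\n', p_global == 0, p_local == 0);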
In summary, on $f(x)$ with the stopping criterion $\|\nabla f(x)\| \le \varepsilon$, $\varepsilon = 10^{-5}$, the Newton-type methods need far fewer iterations than gradient descent. For both families of methods, a different starting point can lead to convergence to a point other than the global minimizer.
©️ Sylvan Ding ❤️