This article consists of two parts:

  • Logistic Regression
    Problem background: predict whether a student will be admitted to university.

  • Regularized logistic regression
    Problem background: predict whether a microchip passes quality assurance testing.

Logistic Regression

Problem background: predict whether a student will be admitted to university.
Dataset: historical data on past applicants: the two exam scores and the admission result (0 means not admitted, 1 means admitted).
Goal: build a classification model that estimates an applicant's probability of admission based on the two exam scores.

The two exam scores serve as the input features.

Initialization and loading the data
Dataset: ex2data1.txt (listed at the end of this article)

%% Initialization
clear ; close all; clc

%% Load Data
%  The first two columns contain the exam scores and the third column
%  contains the label.

data = load('ex2data1.txt');
X = data(:, [1, 2]); y = data(:, 3);

Part 1: Plotting

%% ==================== Part 1: Plotting ====================
%  We start the exercise by first plotting the data to understand the
%  problem we are working with.

fprintf(['Plotting data with + indicating (y = 1) examples and o ' ...
         'indicating (y = 0) examples.\n']);

plotData(X, y);   % plotData is the function we implement ourselves

% Put some labels
hold on;
% Labels and Legend
xlabel('Exam 1 score')
ylabel('Exam 2 score')

% Specified in plot order
legend('Admitted', 'Not admitted')
hold off;

fprintf('\nProgram paused. Press enter to continue.\n');
pause;

Desired result: the two axes are the two exam scores, and each point's color indicates the admission outcome. Black means admitted, yellow means not admitted.

plotData.m

function plotData(X, y)
%PLOTDATA Plots the data points X and y into a new figure
%   PLOTDATA(x,y) plots the data points with + for the positive examples
%   and o for the negative examples. X is assumed to be a Mx2 matrix.

% Create New Figure
figure; hold on;

% ====================== YOUR CODE HERE ======================
% Instructions: Plot the positive and negative examples on a
%               2D plot, using the option 'k+' for the positive
%               examples and 'ko' for the negative examples.
%

pos = find(y==1);
neg = find(y==0);
plot(X(pos,1), X(pos,2), 'k+', 'LineWidth', 2, ...
     'MarkerSize', 7);
plot(X(neg,1), X(neg,2), 'ko', 'MarkerFaceColor', 'y', ...
     'MarkerSize', 7);
% =========================================================================

hold off;

end

The resulting plot:

Reference notes
The find function
Purpose: returns the indices of nonzero elements (i.e., their positions in the array).

find - Find indices and values of nonzero elements
This MATLAB function returns a vector containing the linear indices of each
nonzero element in array X.
k = find(X)
k = find(X,n)
k = find(X,n,direction)
[row,col] = find(___)
[row,col,v] = find(___)

In this exercise we use pos = find(y==1). Here y is a column vector, so find(y==1) returns the positions where y equals 1; pos then selects the corresponding rows of X for plotting.
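A minimal illustration of how find works with a logical condition (the vector y_demo here is a made-up example, not part of the exercise data):

y_demo = [1; 0; 1; 0];
pos = find(y_demo == 1)   % returns [1; 3]
neg = find(y_demo == 0)   % returns [2; 4]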

Plotting options

'MarkerFaceColor' - marker fill color
In this exercise, 'MarkerFaceColor','y' sets the fill color to yellow.
(Source: MATLAB Help Center.)
Without 'MarkerFaceColor','y', the markers are drawn in black and white only.

Part 2: Compute Cost and Gradient

%% ============ Part 2: Compute Cost and Gradient ============
%  In this part of the exercise, you will implement the cost and gradient
%  for logistic regression. You need to complete the code in
%  costFunction.m

%  Setup the data matrix appropriately, and add ones for the intercept term
[m, n] = size(X);

% Add intercept term to x and X_test
X = [ones(m, 1) X];

% Initialize fitting parameters
initial_theta = zeros(n + 1, 1);

% Compute and display initial cost and gradient
[cost, grad] = costFunction(initial_theta, X, y);

fprintf('Cost at initial theta (zeros): %f\n', cost);
fprintf('Expected cost (approx): 0.693\n');
fprintf('Gradient at initial theta (zeros): \n');
fprintf(' %f \n', grad);
fprintf('Expected gradients (approx):\n -0.1000\n -12.0092\n -11.2628\n');

% Compute and display cost and gradient with non-zero theta
test_theta = [-24; 0.2; 0.2];
[cost, grad] = costFunction(test_theta, X, y);

fprintf('\nCost at test theta: %f\n', cost);
fprintf('Expected cost (approx): 0.218\n');
fprintf('Gradient at test theta: \n');
fprintf(' %f \n', grad);
fprintf('Expected gradients (approx):\n 0.043\n 2.566\n 2.647\n');

fprintf('\nProgram paused. Press enter to continue.\n');
pause;

The sigmoid.m function
Hypothesis function:
$h_\theta(x) = g(\theta^T x)$
where g is the sigmoid function:
$g(z) = \frac{1}{1 + e^{-z}}$

function g = sigmoid(z)
%SIGMOID Compute sigmoid function
%   g = SIGMOID(z) computes the sigmoid of z.

% You need to return the following variables correctly
g = zeros(size(z));

% ====================== YOUR CODE HERE ======================
% Instructions: Compute the sigmoid of each value of z (z can be a matrix,
%               vector or scalar).

g = 1 ./ (1 + exp(-z));   % element-wise division is required here; plain / errors for vector or matrix z
% =============================================================

end

Properties of the function:

sigmoid(100000)
ans = 1
sigmoid(0)
ans = 0.5000
sigmoid(-100000)
ans = 0
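Because the implementation uses element-wise operations, sigmoid can also be applied directly to a vector or matrix; for example (output rounded to four decimals):

sigmoid([-5 0 5])
ans = 0.0067    0.5000    0.9933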

Cost function:
$J(\theta) = -\frac{1}{m}\sum_{i=1}^{m}\left[ y^{(i)} \log\left(h_\theta(x^{(i)})\right) + (1 - y^{(i)}) \log\left(1 - h_\theta(x^{(i)})\right) \right]$

The gradient of the cost is a vector whose $j^{th}$ element is defined as
$\frac{\partial J(\theta)}{\partial \theta_j} = \frac{1}{m}\sum_{i=1}^{m}\left(h_\theta(x^{(i)}) - y^{(i)}\right) x_j^{(i)}$
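The costFunction.m below implements these formulas in their equivalent vectorized form, with $h = g(X\theta)$:
$J(\theta) = -\frac{1}{m}\left[ y^{T}\log(h) + (1-y)^{T}\log(1-h) \right], \qquad \nabla_\theta J(\theta) = \frac{1}{m}X^{T}(h - y)$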

costFunction.m

function [J, grad] = costFunction(theta, X, y)
%COSTFUNCTION Compute cost and gradient for logistic regression
%   J = COSTFUNCTION(theta, X, y) computes the cost of using theta as the
%   parameter for logistic regression and the gradient of the cost
%   w.r.t. to the parameters.

% Initialize some useful values
m = length(y); % number of training examples

% You need to return the following variables correctly
J = 0;
grad = zeros(size(theta));

% ====================== YOUR CODE HERE ======================
% Instructions: Compute the cost of a particular choice of theta.
%               You should set J to the cost.
%               Compute the partial derivatives and set grad to the partial
%               derivatives of the cost w.r.t. each parameter in theta
%
% Note: grad should have the same dimensions as theta
%
h = sigmoid(X*theta);
first = y .* log(h);            % first term, element-wise multiplication
second = (1 - y) .* log(1 - h); % second term, also element-wise
J = -1/m * sum(first + second); % sum up to get the cost

grad = 1/m * X' * (h - y);
% =============================================================

end

Part 3: Optimizing using fminunc

Next we call the built-in function fminunc.

First we set the options for fminunc. Setting GradObj to on tells fminunc that our objective function returns two values: the cost and the gradient.
Setting MaxIter to 400 tells fminunc to run for at most 400 iterations.

options = optimset('GradObj', 'on', 'MaxIter', 400);

@(t)(costFunction(t, X, y)) creates an anonymous function of a single argument t that calls costFunction with X and y fixed.
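A minimal illustration of MATLAB anonymous functions (the square function sq here is just for demonstration):

sq = @(x) x.^2;   % handle to a one-argument function
sq(3)             % returns 9

In the same way, @(t)(costFunction(t, X, y)) fixes X and y and leaves t as the only free variable, which is the form fminunc expects.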

[theta, cost] = ...
    fminunc(@(t)(costFunction(t, X, y)), initial_theta, options);
%% ============= Part 3: Optimizing using fminunc  =============
%  In this exercise, you will use a built-in function (fminunc) to find the
%  optimal parameters theta.

%  Set options for fminunc
options = optimset('GradObj', 'on', 'MaxIter', 400);

%  Run fminunc to obtain the optimal theta
%  This function will return theta and the cost
[theta, cost] = ...
    fminunc(@(t)(costFunction(t, X, y)), initial_theta, options);

% Print theta to screen
fprintf('Cost at theta found by fminunc: %f\n', cost);
fprintf('Expected cost (approx): 0.203\n');
fprintf('theta: \n');
fprintf(' %f \n', theta);
fprintf('Expected theta (approx):\n');
fprintf(' -25.161\n 0.206\n 0.201\n');

% Plot Boundary
plotDecisionBoundary(theta, X, y);

% Put some labels
hold on;
% Labels and Legend
xlabel('Exam 1 score')
ylabel('Exam 2 score')

% Specified in plot order
legend('Admitted', 'Not admitted')
hold off;

fprintf('\nProgram paused. Press enter to continue.\n');
pause;

With fminunc you do not have to write the gradient descent loop yourself or pick a learning rate; you only supply costFunction, which computes the cost and gradient. fminunc converges to the optimal parameters and returns the final cost and θ.

Cost at theta found by fminunc: 0.203498
Expected cost (approx): 0.203
theta: 
 -25.161343 
 0.206232 
 0.201472 
Expected theta (approx):
 -25.161
 0.206
 0.201

Next we plot the decision boundary.

For the hypothesis
$h_\theta(x) = g(\theta_1 + \theta_2 x_2 + \theta_3 x_3)$
the sigmoid's decision threshold is at z = 0: when z > 0 the prediction is 1, and when z < 0 the prediction is 0.

So on the decision boundary, with the variables being $(x_2, x_3)$:
$\theta_1 + \theta_2 x_2 + \theta_3 x_3 = 0$
The subscripts 1, 2, 3 are used here because MATLAB indexing starts at 1 rather than 0, which keeps the math aligned with the code.
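Solving this equation for $x_3$ gives the line that is actually drawn (this is what the plot_y expression in plotDecisionBoundary.m computes):
$x_3 = -\frac{1}{\theta_3}\left(\theta_2 x_2 + \theta_1\right)$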

The plotting function is as follows:
plotDecisionBoundary.m

function plotDecisionBoundary(theta, X, y)
%PLOTDECISIONBOUNDARY Plots the data points X and y into a new figure with
%the decision boundary defined by theta
%   PLOTDECISIONBOUNDARY(theta, X,y) plots the data points with + for the
%   positive examples and o for the negative examples. X is assumed to be
%   a either
%   1) Mx3 matrix, where the first column is an all-ones column for the
%      intercept.
%   2) MxN, N>3 matrix, where the first column is all-ones

% Plot Data
plotData(X(:,2:3), y);   % reuse the plotData function from above
hold on

if size(X, 2) <= 3
    % Only need 2 points to define a line, so choose two endpoints
    plot_x = [min(X(:,2))-2,  max(X(:,2))+2];

    % Calculate the decision boundary line
    plot_y = (-1./theta(3)).*(theta(2).*plot_x + theta(1));

    % Plot, and adjust axes for better viewing
    plot(plot_x, plot_y)

    % Legend, specific for the exercise
    legend('Admitted', 'Not admitted', 'Decision Boundary')
    axis([30, 100, 30, 100])
else
    % Here is the grid range
    u = linspace(-1, 1.5, 50);
    v = linspace(-1, 1.5, 50);

    z = zeros(length(u), length(v));
    % Evaluate z = theta*x over the grid
    for i = 1:length(u)
        for j = 1:length(v)
            z(i,j) = mapFeature(u(i), v(j))*theta;
        end
    end
    z = z'; % important to transpose z before calling contour

    % Plot z = 0
    % Notice you need to specify the range [0, 0]
    contour(u, v, z, [0, 0], 'LineWidth', 2)
end
hold off

end

To draw a line we need the pairs of (x, y) endpoint values.
For example:

plot_x=[0,10];
plot_y=[3,13];
plot(plot_x,plot_y);

This draws the line segment from (0, 3) to (10, 13).

For this exercise, what are the corresponding point pairs?
The variables are $(x_2, x_3)$: $x_2$ is plot_x, and $x_3$ (i.e., plot_y) is obtained by solving $\theta_1 + \theta_2 x_2 + \theta_3 x_3 = 0$.

% Only need 2 points to define a line, so choose two endpoints
plot_x = [min(X(:,2))-2,  max(X(:,2))+2];

% Calculate the decision boundary line (this is in fact x_3)
plot_y = (-1./theta(3)).*(theta(2).*plot_x + theta(1));

Part 4: Predict and Accuracies

%% ============== Part 4: Predict and Accuracies ==============
%  After learning the parameters, you'll like to use it to predict the outcomes
%  on unseen data. In this part, you will use the logistic regression model
%  to predict the probability that a student with score 45 on exam 1 and
%  score 85 on exam 2 will be admitted.
%
%  Furthermore, you will compute the training and test set accuracies of
%  our model.
%
%  Your task is to complete the code in predict.m

%  Predict probability for a student with score 45 on exam 1
%  and score 85 on exam 2
prob = sigmoid([1 45 85] * theta);
fprintf(['For a student with scores 45 and 85, we predict an admission ' ...
         'probability of %f\n'], prob);
fprintf('Expected value: 0.775 +/- 0.002\n\n');

% Compute accuracy on our training set
p = predict(theta, X);

fprintf('Train Accuracy: %f\n', mean(double(p == y)) * 100);
fprintf('Expected accuracy (approx): 89.0\n');
fprintf('\n');

Below is the content of predict.m.

The return value p holds the predictions, each of which is 0 or 1: 0 means not admitted, 1 means admitted.
X has dimensions m×(n+1), where m is the number of training examples; theta is an (n+1)×1 vector.

index = find(sigmoid(X*theta) >= 0.5);   % find the indices where the prediction is >= 0.5
p(index)=1;

Below is a simplified version using logical indexing, which is the style MATLAB recommends and runs faster.

% or, equivalently
index=sigmoid(X*theta)>=0.5;
p(index)=1;

The complete function:

function p = predict(theta, X)
%PREDICT Predict whether the label is 0 or 1 using learned logistic
%regression parameters theta
%   p = PREDICT(theta, X) computes the predictions for X using a
%   threshold at 0.5 (i.e., if sigmoid(theta'*x) >= 0.5, predict 1)

m = size(X, 1); % Number of training examples

% You need to return the following variables correctly
p = zeros(m, 1);

% ====================== YOUR CODE HERE ======================
% Instructions: Complete the following code to make predictions using
%               your learned logistic regression parameters.
%               You should set p to a vector of 0's and 1's
%

index = sigmoid(X*theta) >= 0.5;
p(index) = 1;
% =========================================================================

end

Then we compute the training accuracy.
The accuracy is computed by comparing the predictions p with the true labels y element by element: each match contributes 1 and each mismatch contributes 0, and the mean of these values is the accuracy. For example, if 90 out of 100 predictions match the true labels, the mean is (90×1 + 10×0)/100 = 0.9.
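A minimal illustration with made-up vectors (p_demo and y_demo are hypothetical names, just for this example):

p_demo = [1; 0; 1; 1];
y_demo = [1; 0; 0; 1];
mean(double(p_demo == y_demo)) * 100   % 3 of the 4 predictions match, so this prints 75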

fprintf('Train Accuracy: %f\n', mean(double(p == y)) * 100);

Output:

For a student with scores 45 and 85, we predict an admission probability of 0.776291
Expected value: 0.775 +/- 0.002

Train Accuracy: 89.000000
Expected accuracy (approx): 89.0

Dataset
ex2data1.txt

34.62365962451697,78.0246928153624,0
30.28671076822607,43.89499752400101,0
35.84740876993872,72.90219802708364,0
60.18259938620976,86.30855209546826,1
79.0327360507101,75.3443764369103,1
45.08327747668339,56.3163717815305,0
61.10666453684766,96.51142588489624,1
75.02474556738889,46.55401354116538,1
76.09878670226257,87.42056971926803,1
84.43281996120035,43.53339331072109,1
95.86155507093572,38.22527805795094,0
75.01365838958247,30.60326323428011,0
82.30705337399482,76.48196330235604,1
69.36458875970939,97.71869196188608,1
39.53833914367223,76.03681085115882,0
53.9710521485623,89.20735013750205,1
69.07014406283025,52.74046973016765,1
67.94685547711617,46.67857410673128,0
70.66150955499435,92.92713789364831,1
76.97878372747498,47.57596364975532,1
67.37202754570876,42.83843832029179,0
89.67677575072079,65.79936592745237,1
50.534788289883,48.85581152764205,0
34.21206097786789,44.20952859866288,0
77.9240914545704,68.9723599933059,1
62.27101367004632,69.95445795447587,1
80.1901807509566,44.82162893218353,1
93.114388797442,38.80067033713209,0
61.83020602312595,50.25610789244621,0
38.78580379679423,64.99568095539578,0
61.379289447425,72.80788731317097,1
85.40451939411645,57.05198397627122,1
52.10797973193984,63.12762376881715,0
52.04540476831827,69.43286012045222,1
40.23689373545111,71.16774802184875,0
54.63510555424817,52.21388588061123,0
33.91550010906887,98.86943574220611,0
64.17698887494485,80.90806058670817,1
74.78925295941542,41.57341522824434,0
34.1836400264419,75.2377203360134,0
83.90239366249155,56.30804621605327,1
51.54772026906181,46.85629026349976,0
94.44336776917852,65.56892160559052,1
82.36875375713919,40.61825515970618,0
51.04775177128865,45.82270145776001,0
62.22267576120188,52.06099194836679,0
77.19303492601364,70.45820000180959,1
97.77159928000232,86.7278223300282,1
62.07306379667647,96.76882412413983,1
91.56497449807442,88.69629254546599,1
79.94481794066932,74.16311935043758,1
99.2725269292572,60.99903099844988,1
90.54671411399852,43.39060180650027,1
34.52451385320009,60.39634245837173,0
50.2864961189907,49.80453881323059,0
49.58667721632031,59.80895099453265,0
97.64563396007767,68.86157272420604,1
32.57720016809309,95.59854761387875,0
74.24869136721598,69.82457122657193,1
71.79646205863379,78.45356224515052,1
75.3956114656803,85.75993667331619,1
35.28611281526193,47.02051394723416,0
56.25381749711624,39.26147251058019,0
30.05882244669796,49.59297386723685,0
44.66826172480893,66.45008614558913,0
66.56089447242954,41.09209807936973,0
40.45755098375164,97.53518548909936,1
49.07256321908844,51.88321182073966,0
80.27957401466998,92.11606081344084,1
66.74671856944039,60.99139402740988,1
32.72283304060323,43.30717306430063,0
64.0393204150601,78.03168802018232,1
72.34649422579923,96.22759296761404,1
60.45788573918959,73.09499809758037,1
58.84095621726802,75.85844831279042,1
99.82785779692128,72.36925193383885,1
47.26426910848174,88.47586499559782,1
50.45815980285988,75.80985952982456,1
60.45555629271532,42.50840943572217,0
82.22666157785568,42.71987853716458,0
88.9138964166533,69.80378889835472,1
94.83450672430196,45.69430680250754,1
67.31925746917527,66.58935317747915,1
57.23870631569862,59.51428198012956,1
80.36675600171273,90.96014789746954,1
68.46852178591112,85.59430710452014,1
42.0754545384731,78.84478600148043,0
75.47770200533905,90.42453899753964,1
78.63542434898018,96.64742716885644,1
52.34800398794107,60.76950525602592,0
94.09433112516793,77.15910509073893,1
90.44855097096364,87.50879176484702,1
55.48216114069585,35.57070347228866,0
74.49269241843041,84.84513684930135,1
89.84580670720979,45.35828361091658,1
83.48916274498238,48.38028579728175,1
42.2617008099817,87.10385094025457,1
99.31500880510394,68.77540947206617,1
55.34001756003703,64.9319380069486,1
74.77589300092767,89.52981289513276,1

Regularized logistic regression

Problem background: predict whether a microchip passes quality assurance testing.
Dataset: historical results of two tests on past microchips.

Visualizing the data

%% Initialization
clear ; close all; clc

%% Load Data
%  The first two columns contain the X values and the third column
%  contains the label (y).

data = load('ex2data2.txt');
X = data(:, [1, 2]); y = data(:, 3);

plotData(X, y);

% Put some labels
hold on;

% Labels and Legend
xlabel('Microchip Test 1')
ylabel('Microchip Test 2')

% Specified in plot order
legend('y = 1', 'y = 0')
hold off;

Result: the two test scores for each chip are plotted, again using the plotData function from above.

We can see that the decision boundary here is not linear, so a linear function of the raw features will not separate the classes; we need polynomial features.

Part 1: Regularized Logistic Regression

To create more features from each data point, we map $x_1$ and $x_2$ onto all polynomial terms of $x_1$ and $x_2$ up to the sixth power:
$\mathrm{mapFeature}(x) = \left[1,\; x_1,\; x_2,\; x_1^2,\; x_1 x_2,\; x_2^2,\; x_1^3,\; \ldots,\; x_1 x_2^5,\; x_2^6\right]^T$

This transforms the two features ($x_1$ and $x_2$) into a 28×1 feature vector.
A logistic regression classifier trained on this higher-dimensional feature vector has a more complex, nonlinear decision boundary.
The polynomial feature mapping is implemented by the following function.
mapFeature.m

function out = mapFeature(X1, X2)
% MAPFEATURE Feature mapping function to polynomial features
%
%   MAPFEATURE(X1, X2) maps the two input features
%   to quadratic features used in the regularization exercise.
%
%   Returns a new feature array with more features, comprising of
%   X1, X2, X1.^2, X2.^2, X1*X2, X1*X2.^2, etc..
%
%   Inputs X1, X2 must be the same size
%

degree = 6;
out = ones(size(X1(:,1)));
for i = 1:degree
    for j = 0:i
        out(:, end+1) = (X1.^(i-j)).*(X2.^j);
    end
end

end
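A quick sanity check of the mapping (the inputs 0.5 and 0.3 are arbitrary made-up values): with degree = 6 the output has 1 + 2 + 3 + 4 + 5 + 6 + 7 = 28 columns, matching the 28-dimensional feature vector mentioned above.

out = mapFeature(0.5, 0.3);
size(out)   % 1  28
out(1:4)    % 1.0000  0.5000  0.3000  0.2500   (i.e., 1, x1, x2, x1^2)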

Next we compute the cost function and gradient for regularized logistic regression.
The cost function is
$J(\theta) = -\frac{1}{m}\sum_{i=1}^{m}\left[ y^{(i)} \log\left(h_\theta(x^{(i)})\right) + (1 - y^{(i)}) \log\left(1 - h_\theta(x^{(i)})\right) \right] + \frac{\lambda}{2m}\sum_{j=2}^{n} \theta_j^2$
Note that the intercept term $\theta_1$ (theta(1) in MATLAB, since MATLAB indexing starts at 1) is not regularized, which is why the regularization sum starts at j = 2.
The corresponding gradient is
$\frac{\partial J(\theta)}{\partial \theta_1} = \frac{1}{m}\sum_{i=1}^{m}\left(h_\theta(x^{(i)}) - y^{(i)}\right) x_1^{(i)}$
$\frac{\partial J(\theta)}{\partial \theta_j} = \frac{1}{m}\sum_{i=1}^{m}\left(h_\theta(x^{(i)}) - y^{(i)}\right) x_j^{(i)} + \frac{\lambda}{m}\theta_j \qquad \text{for } j \ge 2$

costFunctionReg.m
The core of the file is the MATLAB implementation of the two formulas above.

function [J, grad] = costFunctionReg(theta, X, y, lambda)
%COSTFUNCTIONREG Compute cost and gradient for logistic regression with regularization
%   J = COSTFUNCTIONREG(theta, X, y, lambda) computes the cost of using
%   theta as the parameter for regularized logistic regression and the
%   gradient of the cost w.r.t. to the parameters.

% Initialize some useful values
m = length(y); % number of training examples

% You need to return the following variables correctly
J = 0;
grad = zeros(size(theta));

% ====================== YOUR CODE HERE ======================
% Instructions: Compute the cost of a particular choice of theta.
%               You should set J to the cost.
%               Compute the partial derivatives and set grad to the partial
%               derivatives of the cost w.r.t. each parameter in theta

first = y .* log(sigmoid(X*theta));
second = (1 - y) .* log(1 - sigmoid(X*theta));
theta_1 = [0; theta(2:end)];   % zero out theta(1) so it does not take part in regularization

J = -sum(first + second)/m + lambda/(2*m)*(theta_1'*theta_1);
% Note: this is not element-wise multiplication; element-wise would give a vector,
% but we need a scalar here: (row vector) * (column vector) = scalar.

% calculate the gradient (theta(1) is not regularized)
grad = X'*(sigmoid(X*theta) - y)/m + lambda/m*theta_1;
% =============================================================

end

Below we call this function and run regularized logistic regression.

%% =========== Part 1: Regularized Logistic Regression ============
%  In this part, you are given a dataset with data points that are not
%  linearly separable. However, you would still like to use logistic
%  regression to classify the data points.
%
%  To do so, you introduce more features to use -- in particular, you add
%  polynomial features to our data matrix (similar to polynomial
%  regression).
%

% Add Polynomial Features

% Note that mapFeature also adds a column of ones for us, so the intercept
% term is handled
X = mapFeature(X(:,1), X(:,2));

% Initialize fitting parameters
initial_theta = zeros(size(X, 2), 1);   % size(X, 2) is the length of the second dimension of X, i.e., the number of columns

% Set regularization parameter lambda to 1
lambda = 1;

% Compute and display initial cost and gradient for regularized logistic
% regression
[cost, grad] = costFunctionReg(initial_theta, X, y, lambda);

fprintf('Cost at initial theta (zeros): %f\n', cost);
fprintf('Expected cost (approx): 0.693\n');
fprintf('Gradient at initial theta (zeros) - first five values only:\n');
fprintf(' %f \n', grad(1:5));
fprintf('Expected gradients (approx) - first five values only:\n');
fprintf(' 0.0085\n 0.0188\n 0.0001\n 0.0503\n 0.0115\n');

fprintf('\nProgram paused. Press enter to continue.\n');
pause;

% Compute and display cost and gradient
% with all-ones theta and lambda = 10
test_theta = ones(size(X,2),1);
[cost, grad] = costFunctionReg(test_theta, X, y, 10);

fprintf('\nCost at test theta (with lambda = 10): %f\n', cost);
fprintf('Expected cost (approx): 3.16\n');
fprintf('Gradient at test theta - first five values only:\n');
fprintf(' %f \n', grad(1:5));
fprintf('Expected gradients (approx) - first five values only:\n');
fprintf(' 0.3460\n 0.1614\n 0.1948\n 0.2269\n 0.0922\n');

fprintf('\nProgram paused. Press enter to continue.\n');
pause;

Test output:

Cost at initial theta (zeros): 0.693147
Expected cost (approx): 0.693
Gradient at initial theta (zeros) - first five values only:
 0.008475 
 0.018788 
 0.000078 
 0.050345 
 0.011501 
Expected gradients (approx) - first five values only:
 0.0085
 0.0188
 0.0001
 0.0503
 0.0115

Program paused. Press enter to continue.

Cost at test theta (with lambda = 10): 3.164509
Expected cost (approx): 3.16
Gradient at test theta - first five values only:
 0.346045 
 0.161352 
 0.194796 
 0.226863 
 0.092186 
Expected gradients (approx) - first five values only:
 0.3460
 0.1614
 0.1948
 0.2269
 0.0922

Program paused. Press enter to continue.

Part 2: Regularization and Accuracies

Regularization and accuracy: try different values of λ and observe how the decision boundary and the training-set accuracy change.

%% ============= Part 2: Regularization and Accuracies =============
%  Optional Exercise:
%  In this part, you will get to try different values of lambda and
%  see how regularization affects the decision boundary
%
%  Try the following values of lambda (0, 1, 10, 100).
%
%  How does the decision boundary change when you vary lambda? How does
%  the training set accuracy vary?
%

% Initialize fitting parameters
initial_theta = zeros(size(X, 2), 1);

% Set regularization parameter lambda to 1 (you should vary this)
lambda = 1;

% Set Options
options = optimset('GradObj', 'on', 'MaxIter', 400);

% Optimize
[theta, J, exit_flag] = ...
    fminunc(@(t)(costFunctionReg(t, X, y, lambda)), initial_theta, options);

% Plot Boundary
plotDecisionBoundary(theta, X, y);
hold on;
title(sprintf('lambda = %g', lambda))

% Labels and Legend
xlabel('Microchip Test 1')
ylabel('Microchip Test 2')

legend('y = 1', 'y = 0', 'Decision boundary')
hold off;

% Compute accuracy on our training set
p = predict(theta, X);

fprintf('Train Accuracy: %f\n', mean(double(p == y)) * 100);
fprintf('Expected accuracy (with lambda = 1): 83.1 (approx)\n');

Result: the script plots the decision boundary, with the current value of λ shown in the title.

With a smaller λ (less regularization, a more tightly fit boundary), the corresponding training accuracy is higher:

Train Accuracy: 88.983051
Expected accuracy (with lambda = 1): 83.1 (approx)

With a much larger λ (heavy regularization, an underfit boundary), the corresponding training accuracy drops:

Train Accuracy: 61.016949
Expected accuracy (with lambda = 1): 83.1 (approx)
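Such comparisons can be generated by wrapping the optimization in a loop over λ. A minimal sketch reusing the functions defined above (the loop itself is not part of the original exercise script):

for lambda = [0 1 10 100]
    initial_theta = zeros(size(X, 2), 1);
    options = optimset('GradObj', 'on', 'MaxIter', 400);
    theta = fminunc(@(t)(costFunctionReg(t, X, y, lambda)), initial_theta, options);
    p = predict(theta, X);
    fprintf('lambda = %g: train accuracy = %.2f%%\n', lambda, mean(double(p == y)) * 100);
end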

The decision boundary itself is drawn by the same plotDecisionBoundary.m function listed in Part 3 above.


Dataset
The contents of ex2data2.txt are as follows:


0.051267,0.69956,1
-0.092742,0.68494,1
-0.21371,0.69225,1
-0.375,0.50219,1
-0.51325,0.46564,1
-0.52477,0.2098,1
-0.39804,0.034357,1
-0.30588,-0.19225,1
0.016705,-0.40424,1
0.13191,-0.51389,1
0.38537,-0.56506,1
0.52938,-0.5212,1
0.63882,-0.24342,1
0.73675,-0.18494,1
0.54666,0.48757,1
0.322,0.5826,1
0.16647,0.53874,1
-0.046659,0.81652,1
-0.17339,0.69956,1
-0.47869,0.63377,1
-0.60541,0.59722,1
-0.62846,0.33406,1
-0.59389,0.005117,1
-0.42108,-0.27266,1
-0.11578,-0.39693,1
0.20104,-0.60161,1
0.46601,-0.53582,1
0.67339,-0.53582,1
-0.13882,0.54605,1
-0.29435,0.77997,1
-0.26555,0.96272,1
-0.16187,0.8019,1
-0.17339,0.64839,1
-0.28283,0.47295,1
-0.36348,0.31213,1
-0.30012,0.027047,1
-0.23675,-0.21418,1
-0.06394,-0.18494,1
0.062788,-0.16301,1
0.22984,-0.41155,1
0.2932,-0.2288,1
0.48329,-0.18494,1
0.64459,-0.14108,1
0.46025,0.012427,1
0.6273,0.15863,1
0.57546,0.26827,1
0.72523,0.44371,1
0.22408,0.52412,1
0.44297,0.67032,1
0.322,0.69225,1
0.13767,0.57529,1
-0.0063364,0.39985,1
-0.092742,0.55336,1
-0.20795,0.35599,1
-0.20795,0.17325,1
-0.43836,0.21711,1
-0.21947,-0.016813,1
-0.13882,-0.27266,1
0.18376,0.93348,0
0.22408,0.77997,0
0.29896,0.61915,0
0.50634,0.75804,0
0.61578,0.7288,0
0.60426,0.59722,0
0.76555,0.50219,0
0.92684,0.3633,0
0.82316,0.27558,0
0.96141,0.085526,0
0.93836,0.012427,0
0.86348,-0.082602,0
0.89804,-0.20687,0
0.85196,-0.36769,0
0.82892,-0.5212,0
0.79435,-0.55775,0
0.59274,-0.7405,0
0.51786,-0.5943,0
0.46601,-0.41886,0
0.35081,-0.57968,0
0.28744,-0.76974,0
0.085829,-0.75512,0
0.14919,-0.57968,0
-0.13306,-0.4481,0
-0.40956,-0.41155,0
-0.39228,-0.25804,0
-0.74366,-0.25804,0
-0.69758,0.041667,0
-0.75518,0.2902,0
-0.69758,0.68494,0
-0.4038,0.70687,0
-0.38076,0.91886,0
-0.50749,0.90424,0
-0.54781,0.70687,0
0.10311,0.77997,0
0.057028,0.91886,0
-0.10426,0.99196,0
-0.081221,1.1089,0
0.28744,1.087,0
0.39689,0.82383,0
0.63882,0.88962,0
0.82316,0.66301,0
0.67339,0.64108,0
1.0709,0.10015,0
-0.046659,-0.57968,0
-0.23675,-0.63816,0
-0.15035,-0.36769,0
-0.49021,-0.3019,0
-0.46717,-0.13377,0
-0.28859,-0.060673,0
-0.61118,-0.067982,0
-0.66302,-0.21418,0
-0.59965,-0.41886,0
-0.72638,-0.082602,0
-0.83007,0.31213,0
-0.72062,0.53874,0
-0.59389,0.49488,0
-0.48445,0.99927,0
-0.0063364,0.99927,0
0.63265,-0.030612,0
