Machine Learning: The Logistic Regression Model

1. Logistic Regression

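Logistic regression is a machine learning algorithm for binary classification. Its goal is to predict, from the input features, the probability that a sample belongs to a particular class.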

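The core idea is to pass the output of a linear model through a logistic function, turning it into a probability between 0 and 1 that can be used for classification. The logistic function is usually the sigmoid function, sigmoid(t) = 1 / (1 + e^(-t)): its input is the linear combination of the features, and its output lies strictly between 0 and 1. When the output is 0.5 or higher we predict the positive class; otherwise we predict the negative class.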
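The mapping from a linear score to a class label can be sketched in a few lines. This is a minimal illustration, not part of the implementation later in this post; the `scores` array is made-up data standing in for the output of the linear model.

```python
import numpy as np

def sigmoid(t):
    """Map any real number (or array) into the open interval (0, 1)."""
    return 1.0 / (1.0 + np.exp(-t))

# Made-up linear scores for four samples (assumption for illustration).
scores = np.array([-3.0, -0.5, 0.0, 2.0])
probabilities = sigmoid(scores)

# Threshold at 0.5, which is the same as checking whether the score is >= 0.
labels = (probabilities >= 0.5).astype(int)
print(probabilities)
print(labels)
```

Because the sigmoid is monotonic and equals 0.5 at zero, the decision boundary is exactly the set of points where the linear score is zero.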

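In logistic regression, the model parameters are found by maximum likelihood estimation: we look for the parameters that maximize the log-likelihood function. In practice this is done by minimizing the negative log-likelihood (the cross-entropy loss) with an optimizer such as gradient descent.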
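Maximizing the log-likelihood is equivalent to minimizing its negative, and the gradient of that loss has a compact vectorized form. The sketch below checks the analytic gradient against a finite-difference approximation. The function names, the random data, and the step size `h` are assumptions chosen for illustration only.

```python
import numpy as np

def sigmoid(t):
    return 1.0 / (1.0 + np.exp(-t))

def neg_log_likelihood(theta, X_b, y):
    """Average negative log-likelihood (cross-entropy) of the labels y."""
    p = sigmoid(X_b.dot(theta))
    return -np.mean(y * np.log(p) + (1 - y) * np.log(1 - p))

def gradient(theta, X_b, y):
    """Analytic gradient of neg_log_likelihood with respect to theta."""
    return X_b.T.dot(sigmoid(X_b.dot(theta)) - y) / len(y)

def numerical_gradient(theta, X_b, y, h=1e-6):
    """Central finite differences, used only to sanity-check the formula."""
    grad = np.zeros_like(theta)
    for i in range(len(theta)):
        step = np.zeros_like(theta)
        step[i] = h
        grad[i] = (neg_log_likelihood(theta + step, X_b, y)
                   - neg_log_likelihood(theta - step, X_b, y)) / (2 * h)
    return grad

# Made-up data: 5 samples, a bias column of ones plus 2 random features.
rng = np.random.default_rng(0)
X_b = np.hstack([np.ones((5, 1)), rng.normal(size=(5, 2))])
y = np.array([0, 1, 1, 0, 1])
theta = np.array([0.1, -0.2, 0.3])

max_diff = np.max(np.abs(gradient(theta, X_b, y)
                         - numerical_gradient(theta, X_b, y)))
print(max_diff)  # should be tiny if the analytic gradient is correct
```

A check like this is a cheap way to catch sign or scaling mistakes before running gradient descent.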

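Logistic regression is simple, fast, and easy to implement, so it is widely used in practice, for example in medical diagnosis, financial risk assessment, and natural language processing.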

2. Code Implementation

metrics.py

import numpy as np
from math import sqrt


def _safe_divide(numerator, denominator):
    """Return numerator / denominator, or 0.0 when the denominator is zero.

    NumPy scalar division by zero yields nan or inf with a warning instead of
    raising, so a try/except around the division would not catch it.
    """
    return numerator / denominator if denominator != 0 else 0.0


# Classification accuracy
def accuracy_score(y_true, y_predict):
    """Compute the accuracy of y_predict against y_true (y_test)."""
    assert y_true.shape[0] == y_predict.shape[0], \
        "the size of y_true must be equal to the size of y_predict"
    return np.sum(y_true == y_predict) / len(y_true)


# Evaluation metrics for linear regression models
def mean_squared_error(y_true, y_predict):
    """Compute the MSE between y_true and y_predict."""
    assert len(y_true) == len(y_predict), \
        "the size of y_true must be equal to the size of y_predict"
    return np.sum((y_true - y_predict) ** 2) / len(y_true)


def root_mean_squared_error(y_true, y_predict):
    """Compute the RMSE between y_true and y_predict."""
    return sqrt(mean_squared_error(y_true, y_predict))


def mean_absolute_error(y_true, y_predict):
    """Compute the MAE between y_true and y_predict."""
    assert len(y_true) == len(y_predict), \
        "the size of y_true must be equal to the size of y_predict"
    return np.sum(np.absolute(y_true - y_predict)) / len(y_true)


def r2_score(y_true, y_predict):
    """Compute the R Square between y_true and y_predict."""
    return 1 - mean_squared_error(y_true, y_predict) / np.var(y_true)


# Evaluation metrics for classification (labels are 0 and 1)
def TN(y_true, y_predict):
    assert len(y_true) == len(y_predict)
    return np.sum((y_true == 0) & (y_predict == 0))


def FP(y_true, y_predict):
    assert len(y_true) == len(y_predict)
    return np.sum((y_true == 0) & (y_predict == 1))


def FN(y_true, y_predict):
    assert len(y_true) == len(y_predict)
    return np.sum((y_true == 1) & (y_predict == 0))


def TP(y_true, y_predict):
    assert len(y_true) == len(y_predict)
    return np.sum((y_true == 1) & (y_predict == 1))


def confusion_matrix(y_true, y_predict):
    # Rows are the true class, columns are the predicted class.
    return np.array([
        [TN(y_true, y_predict), FP(y_true, y_predict)],
        [FN(y_true, y_predict), TP(y_true, y_predict)]
    ])


def precision_score(y_true, y_predict):
    tp = TP(y_true, y_predict)
    fp = FP(y_true, y_predict)
    return _safe_divide(tp, tp + fp)


def recall_score(y_true, y_predict):
    tp = TP(y_true, y_predict)
    fn = FN(y_true, y_predict)
    return _safe_divide(tp, tp + fn)


def f1_score(y_true, y_predict):
    precision = precision_score(y_true, y_predict)
    recall = recall_score(y_true, y_predict)
    return _safe_divide(2 * precision * recall, precision + recall)


def TPR(y_true, y_predict):
    # The true positive rate is the same quantity as recall.
    return recall_score(y_true, y_predict)


def FPR(y_true, y_predict):
    fp = FP(y_true, y_predict)
    tn = TN(y_true, y_predict)
    return _safe_divide(fp, fp + tn)

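The code above defines functions for evaluating machine learning models, covering metrics for both regression and classification. Each function does the following:
accuracy_score(y_true, y_predict): computes the accuracy of a classification model.
mean_squared_error(y_true, y_predict): computes the mean squared error (MSE) of a regression model.
root_mean_squared_error(y_true, y_predict): computes the root mean squared error (RMSE) of a regression model.
mean_absolute_error(y_true, y_predict): computes the mean absolute error (MAE) of a regression model.
r2_score(y_true, y_predict): computes the R² score of a regression model.
TN(y_true, y_predict): counts the true negatives of a binary classifier.
FP(y_true, y_predict): counts the false positives of a binary classifier.
FN(y_true, y_predict): counts the false negatives of a binary classifier.
TP(y_true, y_predict): counts the true positives of a binary classifier.
confusion_matrix(y_true, y_predict): builds the confusion matrix of a binary classifier, laid out as [[TN, FP], [FN, TP]].
precision_score(y_true, y_predict): computes the precision of a binary classifier, TP / (TP + FP).
recall_score(y_true, y_predict): computes the recall of a binary classifier, TP / (TP + FN).
f1_score(y_true, y_predict): computes the F1 score of a binary classifier, the harmonic mean of precision and recall.
TPR(y_true, y_predict): computes the true positive rate of a binary classifier, which equals recall.
FPR(y_true, y_predict): computes the false positive rate of a binary classifier, FP / (FP + TN).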
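To make the classification metrics concrete, here is a small worked example. It recomputes the counts directly with NumPy so that it runs on its own; the expressions mirror the definitions in metrics.py, and the two label vectors are made-up data.

```python
import numpy as np

# Made-up ground truth and predictions for eight samples.
y_true = np.array([1, 0, 1, 1, 0, 0, 1, 0])
y_predict = np.array([1, 0, 0, 1, 0, 1, 0, 0])

tn = np.sum((y_true == 0) & (y_predict == 0))  # 3
fp = np.sum((y_true == 0) & (y_predict == 1))  # 1
fn = np.sum((y_true == 1) & (y_predict == 0))  # 2
tp = np.sum((y_true == 1) & (y_predict == 1))  # 2

accuracy = (tp + tn) / len(y_true)                  # 5 / 8 = 0.625
precision = tp / (tp + fp)                          # 2 / 3
recall = tp / (tp + fn)                             # 2 / 4 = 0.5
f1 = 2 * precision * recall / (precision + recall)  # 4 / 7
fpr = fp / (fp + tn)                                # 1 / 4 = 0.25

print(np.array([[tn, fp], [fn, tp]]))
print(accuracy, precision, recall, f1, fpr)
```

Here accuracy, precision, and recall all differ, which shows why accuracy alone can hide how a classifier treats the positive class.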

LogisticRegression.py

import numpy as np
from .metrics import accuracy_score


# Logistic regression handles classification tasks,
# so the model is scored with classification accuracy.
class LogisticRegression:

    def __init__(self):
        """Initialize the Logistic Regression model."""
        self.coef_ = None  # coefficients
        self.interception_ = None  # intercept
        self._theta = None

    def _sigmoid(self, t):
        """Sigmoid function (private helper)."""
        return 1. / (1. + np.exp(-t))

    def fit(self, X_train, y_train, eta=0.01, n_iters=1e4):
        """Train the model on X_train, y_train with batch gradient descent."""
        assert X_train.shape[0] == y_train.shape[0], \
            "the size of X_train must be equal to the size of y_train"

        def J(theta, X_b, y):
            """Loss (average negative log-likelihood) for the given theta."""
            y_hat = self._sigmoid(X_b.dot(theta))
            # np.log(0) returns -inf with a warning instead of raising, so a
            # try/except would not catch it; clip the probabilities instead.
            y_hat = np.clip(y_hat, 1e-15, 1 - 1e-15)
            return -np.sum(y * np.log(y_hat) + (1 - y) * np.log(1 - y_hat)) / len(y)

        def dJ(theta, X_b, y):
            """Gradient of the loss with respect to theta (vectorized)."""
            return X_b.T.dot(self._sigmoid(X_b.dot(theta)) - y) / len(y)

        def gradient_descent(X_b, y, initial_theta, eta, n_iters=1e4, epsilon=1e-8):
            """Run gradient descent until the loss stops changing or n_iters is reached."""
            theta = initial_theta
            cur_iter = 0

            while cur_iter < n_iters:
                gradient = dJ(theta, X_b, y)
                last_theta = theta
                theta = theta - eta * gradient
                if abs(J(theta, X_b, y) - J(last_theta, X_b, y)) < epsilon:
                    break
                cur_iter += 1

            return theta

        # Prepend a column of ones so that theta[0] acts as the intercept.
        X_b = np.hstack([np.ones((len(X_train), 1)), X_train])
        initial_theta = np.zeros(X_b.shape[1])
        self._theta = gradient_descent(X_b, y_train, initial_theta, eta, n_iters)

        self.interception_ = self._theta[0]
        self.coef_ = self._theta[1:]

        return self

    def predict_proba(self, X_predict):
        """Return the vector of positive-class probabilities for X_predict."""
        assert self.coef_ is not None and self.interception_ is not None, \
            "must fit before predict!"
        assert X_predict.shape[1] == len(self.coef_), \
            "the feature number of X_predict must be equal to X_train"

        X_b = np.hstack([np.ones((X_predict.shape[0], 1)), X_predict])
        return self._sigmoid(X_b.dot(self._theta))

    def predict(self, X_predict):
        """Return the vector of predicted labels (0 or 1) for X_predict."""
        assert self.coef_ is not None and self.interception_ is not None, \
            "must fit before predict!"
        assert X_predict.shape[1] == len(self.coef_), \
            "the feature number of X_predict must be equal to X_train"

        proba = self.predict_proba(X_predict)
        return np.array(proba >= 0.5, dtype='int')

    def score(self, X_test, y_test):
        """Compute the accuracy of the model on the test set X_test, y_test."""
        y_predict = self.predict(X_test)
        return accuracy_score(y_test, y_predict)

    def __repr__(self):
        return "LogisticRegression()"

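This code implements a logistic regression model with the basic training and prediction functionality. Specifically, it provides the following methods:

Initialize the model (__init__);

A custom sigmoid function (_sigmoid);

Train the model with batch gradient descent (fit);

Given a dataset X_predict, return the vector of predicted probabilities (predict_proba);

Given a dataset X_predict, return the vector of predicted labels (predict);

Compute the accuracy of the model on the test set X_test and y_test (score);

Return the string representation of the model (__repr__).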
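The following sketch shows the same train-then-predict flow end to end on a toy dataset. Because LogisticRegression.py uses a package-relative import, the sketch does not import the class. Instead it re-implements the batch gradient descent update in two hypothetical helpers, `fit_logistic` and `predict`. It also runs a fixed number of iterations and omits the loss-based early stopping. The data, learning rate, and iteration count are assumptions for illustration.

```python
import numpy as np

def sigmoid(t):
    return 1.0 / (1.0 + np.exp(-t))

def fit_logistic(X, y, eta=0.1, n_iters=1000):
    """Hypothetical helper: batch gradient descent on the cross-entropy loss."""
    X_b = np.hstack([np.ones((len(X), 1)), X])  # prepend the intercept column
    theta = np.zeros(X_b.shape[1])
    for _ in range(n_iters):
        gradient = X_b.T.dot(sigmoid(X_b.dot(theta)) - y) / len(y)
        theta = theta - eta * gradient
    return theta

def predict(theta, X):
    """Hypothetical helper: label a sample 1 when its probability is >= 0.5."""
    X_b = np.hstack([np.ones((len(X), 1)), X])
    return (sigmoid(X_b.dot(theta)) >= 0.5).astype(int)

# Made-up one-feature dataset: negative values are class 0, positive are class 1.
X_train = np.array([[-2.0], [-1.5], [-1.0], [-0.5], [0.5], [1.0], [1.5], [2.0]])
y_train = np.array([0, 0, 0, 0, 1, 1, 1, 1])

theta = fit_logistic(X_train, y_train)
train_accuracy = np.mean(predict(theta, X_train) == y_train)
new_labels = predict(theta, np.array([[-1.2], [0.8]]))

print(theta)           # theta[0] is the intercept, theta[1] the coefficient
print(train_accuracy)
print(new_labels)
```

With the class from this post, the equivalent calls would be fit on the training data followed by predict or score on held-out data.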

Time:2023.3.27
