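ML / NB: Using the Naive Bayes (NB) algorithm (CountVectorizer, stop words not removed) to classify the fetch_20newsgroups dataset (20 categories of news text) and evaluate the predictions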

Output results

Design approach

Core code
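The pipeline named in the title (CountVectorizer with stop words kept, then MultinomialNB, then evaluation) can be sketched roughly as follows. This is a minimal sketch, not the original post's code: the helper names `fit_count_nb`, `evaluate`, and `run_20newsgroups` are hypothetical, and the 25% test split and `random_state=33` are assumptions chosen only for illustration.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.metrics import classification_report
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import MultinomialNB


def fit_count_nb(train_texts, train_labels):
    """Fit a CountVectorizer (stop words kept) followed by MultinomialNB."""
    # stop_words=None is the default, so no stop-word filtering is applied.
    vec = CountVectorizer(stop_words=None)
    X_train = vec.fit_transform(train_texts)
    clf = MultinomialNB()  # default alpha=1.0, i.e. Laplace smoothing
    clf.fit(X_train, train_labels)
    return vec, clf


def evaluate(vec, clf, test_texts, test_labels, target_names=None):
    """Return accuracy and a per-class text report on held-out documents."""
    X_test = vec.transform(test_texts)  # reuse the training vocabulary
    predictions = clf.predict(X_test)
    accuracy = clf.score(X_test, test_labels)
    report = classification_report(
        test_labels, predictions, target_names=target_names
    )
    return accuracy, report


def run_20newsgroups(test_size=0.25, random_state=33):
    """Hypothetical driver; downloads the dataset on first use."""
    from sklearn.datasets import fetch_20newsgroups

    news = fetch_20newsgroups(subset="all")
    X_train, X_test, y_train, y_test = train_test_split(
        news.data, news.target, test_size=test_size, random_state=random_state
    )
    vec, clf = fit_count_nb(X_train, y_train)
    return evaluate(vec, clf, X_test, y_test, target_names=news.target_names)
```

Calling `accuracy, report = run_20newsgroups()` and printing both values would give the overall accuracy plus precision, recall, and F1 per newsgroup. The vectorizer is fitted on the training texts only, so the test set never influences the vocabulary.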

https://www.cnblogs.com/yunyaniu/articles/10465701.html

class MultinomialNB Found at: sklearn.naive_bayes

class MultinomialNB(BaseDiscreteNB):
    """
    Naive Bayes classifier for multinomial models

    The multinomial Naive Bayes classifier is suitable for classification with
    discrete features (e.g., word counts for text classification). The
    multinomial distribution normally requires integer feature counts. However,
    in practice, fractional counts such as tf-idf may also work.

    Read more in the :ref:`User Guide <multinomial_naive_bayes>`.

    Parameters
    ----------
    alpha : float, optional (default=1.0)
        Additive (Laplace/Lidstone) smoothing parameter
        (0 for no smoothing).

    fit_prior : boolean, optional (default=True)
        Whether to learn class prior probabilities or not.
        If false, a uniform prior will be used.

    class_prior : array-like, size (n_classes,), optional (default=None)
        Prior probabilities of the classes. If specified the priors are not
        adjusted according to the data.

    Attributes
    ----------
    class_log_prior_ : array, shape (n_classes, )
        Smoothed empirical log probability for each class.

    intercept_ : property
        Mirrors ``class_log_prior_`` for interpreting MultinomialNB
        as a linear model.

    feature_log_prob_ : array, shape (n_classes, n_features)
        Empirical log probability of features
        given a class, ``P(x_i|y)``.

    coef_ : property
        Mirrors ``feature_log_prob_`` for interpreting MultinomialNB
        as a linear model.

    class_count_ : array, shape (n_classes,)
        Number of samples encountered for each class during fitting. This
        value is weighted by the sample weight when provided.

    feature_count_ : array, shape (n_classes, n_features)
        Number of samples encountered for each (class, feature)
        during fitting. This value is weighted by the sample weight when
        provided.

    Examples
    --------
    >>> import numpy as np
    >>> X = np.random.randint(5, size=(6, 100))
    >>> y = np.array([1, 2, 3, 4, 5, 6])
    >>> from sklearn.naive_bayes import MultinomialNB
    >>> clf = MultinomialNB()
    >>> clf.fit(X, y)
    MultinomialNB(alpha=1.0, class_prior=None, fit_prior=True)
    >>> print(clf.predict(X[2:3]))
    [3]

    Notes
    -----
    For the rationale behind the names `coef_` and `intercept_`, i.e.
    naive Bayes as a linear classifier, see J. Rennie et al. (2003),
    Tackling the poor assumptions of naive Bayes text classifiers, ICML.

    References
    ----------
    C.D. Manning, P. Raghavan and H. Schuetze (2008). Introduction to
    Information Retrieval. Cambridge University Press, pp. 234-265.
    http://nlp.stanford.edu/IR-book/html/htmledition/naive-bayes-text-classification-1.html
    """

    def __init__(self, alpha=1.0, fit_prior=True, class_prior=None):
        self.alpha = alpha
        self.fit_prior = fit_prior
        self.class_prior = class_prior

    def _count(self, X, Y):
        """Count and smooth feature occurrences."""
        if np.any((X.data if issparse(X) else X) < 0):
            raise ValueError("Input X must be non-negative")
        self.feature_count_ += safe_sparse_dot(Y.T, X)
        self.class_count_ += Y.sum(axis=0)

    def _update_feature_log_prob(self, alpha):
        """Apply smoothing to raw counts and recompute log probabilities"""
        smoothed_fc = self.feature_count_ + alpha
        smoothed_cc = smoothed_fc.sum(axis=1)
        self.feature_log_prob_ = np.log(smoothed_fc) - np.log(smoothed_cc.reshape(-1, 1))

    def _joint_log_likelihood(self, X):
        """Calculate the posterior log probability of the samples X"""
        check_is_fitted(self, "classes_")
        X = check_array(X, accept_sparse='csr')
        return safe_sparse_dot(X, self.feature_log_prob_.T) + self.class_log_prior_
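The three private methods quoted above amount to a few lines of array arithmetic. The following is a minimal sketch, assuming dense integer counts and a tiny made-up matrix `X`; the function names `multinomial_nb_log_probs` and `joint_log_likelihood` are hypothetical and are not part of scikit-learn. It recomputes the smoothed log probabilities by hand so they can be compared with a fitted `MultinomialNB`.

```python
import numpy as np
from sklearn.naive_bayes import MultinomialNB


def multinomial_nb_log_probs(X, y, alpha=1.0):
    """Recompute smoothed feature log probabilities and class log priors."""
    classes = np.unique(y)
    # One-hot indicator matrix Y, shape (n_samples, n_classes).
    Y = (y[:, None] == classes[None, :]).astype(float)
    feature_count = Y.T @ X          # mirrors safe_sparse_dot(Y.T, X)
    class_count = Y.sum(axis=0)      # number of samples per class
    smoothed_fc = feature_count + alpha
    smoothed_cc = smoothed_fc.sum(axis=1)
    feature_log_prob = np.log(smoothed_fc) - np.log(smoothed_cc.reshape(-1, 1))
    class_log_prior = np.log(class_count) - np.log(class_count.sum())
    return classes, feature_log_prob, class_log_prior


def joint_log_likelihood(X, feature_log_prob, class_log_prior):
    """Unnormalized log posterior of each class for each row of X."""
    return X @ feature_log_prob.T + class_log_prior


# Made-up count matrix: 4 documents, 3 vocabulary terms, 2 classes.
X = np.array([[2, 1, 0], [1, 0, 0], [0, 1, 3], [0, 0, 2]])
y = np.array([0, 0, 1, 1])

classes, flp, clp = multinomial_nb_log_probs(X, y, alpha=1.0)
jll = joint_log_likelihood(X, flp, clp)
manual_pred = classes[np.argmax(jll, axis=1)]

# Reference model fitted on the same data for comparison.
clf = MultinomialNB(alpha=1.0).fit(X, y)
```

With `alpha=1.0` every (class, term) pair receives one pseudo-count, so a term never seen in a class still gets a finite log probability instead of `log(0)`. This matters for text data, where most terms are absent from most classes.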
