Content index: https://blog.csdn.net/weixin_43093481/article/details/114989382?spm=1001.2014.3001.5501
Course notes: 1.2 Sentiment Analysis with Naïve Bayes
Code: https://github.com/Ogmx/Natural-Language-Processing-Specialization
————————————————————————————————————

Assignment 2: Naive Bayes

Learning objectives:
 Learn how naive Bayes works and apply it to sentiment analysis on tweets: given a tweet, decide whether it carries positive or negative sentiment.

Specifically, you will learn how to:

  • Train a naive Bayes model for sentiment analysis
  • Test the model
  • Compute ratios of positive to negative word counts
  • Do some error analysis
  • Predict on your own data

You may already be familiar with naive Bayes and how it builds on conditional probability and independence.

  • In this assignment, you will use the ratio of the probabilities of positive and negative sentiment.
  • This approach gives a simple and fast solution for binary classification problems.

Import Python libraries

from utils import process_tweet, lookup
import pdb
from nltk.corpus import stopwords, twitter_samples
import numpy as np
import pandas as pd
import nltk
import string
from nltk.tokenize import TweetTokenizer
from os import getcwd

Download the data

nltk.download('stopwords')
nltk.download('twitter_samples')

Split the dataset

# get the sets of positive and negative tweets
all_positive_tweets = twitter_samples.strings('positive_tweets.json')
all_negative_tweets = twitter_samples.strings('negative_tweets.json')

# split the data into two pieces, one for training and one for testing (validation set)
test_pos = all_positive_tweets[4000:]
train_pos = all_positive_tweets[:4000]
test_neg = all_negative_tweets[4000:]
train_neg = all_negative_tweets[:4000]

train_x = train_pos + train_neg
test_x = test_pos + test_neg

# avoid assumptions about the length of all_positive_tweets
train_y = np.append(np.ones(len(train_pos)), np.zeros(len(train_neg)))
test_y = np.append(np.ones(len(test_pos)), np.zeros(len(test_neg)))

Part 1: Data Preprocessing

For any machine learning project, once you have gathered the data, the first step is to process it into the form the model expects:

  • Remove noise: drop words that carry no sentiment information, such as common stopwords like 'I, you, are, is, etc.'.
  • Also remove Twitter-specific markup such as retweet marks, hyperlinks, and hashtags, since they carry no sentiment information either.
  • Punctuation does carry some sentiment information, but for simplicity it is removed as well.
  • Finally, stem each word: for example, "motivation", "motivated", and "motivate" are all reduced to the common stem "motiv-".

Use the provided process_tweet() function to preprocess the data (a sketch of what it does follows the example below).

custom_tweet = "RT @Twitter @chapagain Hello There! Have a great day. :) #good #morning http://chapagain.com.np"

# print cleaned tweet
print(process_tweet(custom_tweet))

['hello', 'great', 'day', ':)', 'good', 'morn']
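
process_tweet() lives in the course's utils.py, whose source is not shown in this post. As a rough guide, here is a minimal sketch of the preprocessing it performs, assuming NLTK's TweetTokenizer, English stopword list, and PorterStemmer; the name process_tweet_sketch and the exact regexes are illustrative, not necessarily the course's exact code:

import re
import string
from nltk.corpus import stopwords
from nltk.stem import PorterStemmer
from nltk.tokenize import TweetTokenizer

def process_tweet_sketch(tweet):
    """Clean, tokenize, and stem a tweet (an approximation of utils.process_tweet)."""
    # remove old-style retweet marks ("RT"), hyperlinks, and the '#' of hashtags
    tweet = re.sub(r'^RT[\s]+', '', tweet)
    tweet = re.sub(r'https?://[^\s]+', '', tweet)
    tweet = re.sub(r'#', '', tweet)
    # tokenize, lowercasing everything and stripping @handles
    tokenizer = TweetTokenizer(preserve_case=False, strip_handles=True, reduce_len=True)
    tokens = tokenizer.tokenize(tweet)
    stopwords_english = stopwords.words('english')
    stemmer = PorterStemmer()
    # drop stopwords and punctuation, then stem what remains
    return [stemmer.stem(tok) for tok in tokens
            if tok not in stopwords_english and tok not in string.punctuation]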

Part 1.1: Implement helper functions

To train a naive Bayes model, you first need to build a frequency dictionary whose keys are (word, label) pairs and whose values are the corresponding counts. Here label is 1 or 0, marking positive or negative sentiment.

Implement the lookup() helper function. It takes the freqs dictionary, a word, and a label (1 or 0), and returns the number of times that (word, label) pair appears in the corpus.

For example, given the two tweets ["i am rather excited", "you are rather happy"], both with label 1, the frequency dictionary is:

{
    ("rather", 1): 2,
    ("happi", 1): 1,
    ("excit", 1): 1
}

  • Every word in the corpus is assigned the same label, 1.
  • Words like "i" and "am" do not appear, because they are stopwords and were removed during preprocessing.
  • "rather" appears once in each of the two tweets, so its count is 2. (A sketch of lookup() follows below.)
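
lookup() is imported from utils.py at the top of this notebook. A minimal sketch consistent with the freqs structure above (the name lookup_sketch is chosen purely for illustration):

def lookup_sketch(freqs, word, label):
    """Return how often the (word, label) pair occurs in freqs, or 0 if it is absent."""
    return freqs.get((word, label), 0)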

Implement count_tweets()

Implement the count_tweets() function. It takes a list of tweets, processes them, and returns the frequency dictionary.

  • Keys are (stem, label) pairs, e.g. ("happi", 1).
  • Values are the number of times each pair appears in the corpus (an integer).
# UNQ_C1 (UNIQUE CELL IDENTIFIER, DO NOT EDIT)
def count_tweets(result, tweets, ys):
    '''
    Input:
        result: a dictionary that will be used to map each pair to its frequency
        tweets: a list of tweets
        ys: a list corresponding to the sentiment of each tweet (either 0 or 1)
    Output:
        result: a dictionary mapping each pair to its frequency
    '''
    ### START CODE HERE (REPLACE INSTANCES OF 'None' with your code) ###
    for y, tweet in zip(ys, tweets):
        for word in process_tweet(tweet):
            # define the key, which is the word and label tuple
            pair = (word, y)
            # if the key exists in the dictionary, increment the count
            if pair in result:
                result[pair] += 1
            # else, if the key is new, add it to the dictionary and set the count to 1
            else:
                result[pair] = 1
    ### END CODE HERE ###
    return result
# Testing your function
result = {}
tweets = ['i am happy', 'i am tricked', 'i am sad', 'i am tired', 'i am tired']
ys = [1, 0, 0, 0, 0]
count_tweets(result, tweets, ys)

{('happi', 1): 1, ('trick', 0): 1, ('sad', 0): 1, ('tire', 0): 2}

Part 2: Train the Naive Bayes Model

Naive Bayes is an algorithm that can be used for sentiment analysis, and it trains and predicts in a relatively short time.

How do you train a naive Bayes classifier?

  • The first step in training a naive Bayes classifier is to identify the classes.
  • Then create a probability for each class:
    $P(D_{pos})$ is the probability that a document is positive.
    $P(D_{neg})$ is the probability that a document is negative.
    They can be computed with the following formulas:

$$P(D_{pos}) = \frac{D_{pos}}{D} \tag{1}$$

$$P(D_{neg}) = \frac{D_{neg}}{D} \tag{2}$$

where $D$ is the total number of documents (i.e. the total number of tweets), $D_{pos}$ is the number of positive tweets, and $D_{neg}$ is the number of negative tweets.

Prior and Logprior

The prior probability captures the underlying chance that a tweet in the dataset is positive or negative. In other words, with no further information, if we draw a tweet at random from the dataset, what is the probability that it is positive, and what is the probability that it is negative? That is the prior.

The prior is the ratio of the two probabilities, $\frac{P(D_{pos})}{P(D_{neg})}$.
Taking its logarithm rescales it, which gives the logprior:

$$\text{logprior} = \log\left(\frac{P(D_{pos})}{P(D_{neg})}\right) = \log\left(\frac{D_{pos}}{D_{neg}}\right)$$

Note that $\log\left(\frac{A}{B}\right)$ is equivalent to $\log(A) - \log(B)$, so the logprior can also be written as the difference of two logs:

$$\text{logprior} = \log(P(D_{pos})) - \log(P(D_{neg})) = \log(D_{pos}) - \log(D_{neg}) \tag{3}$$
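
For example, with 3,000 positive and 1,000 negative tweets, the logprior would be $\log(3000) - \log(1000) = \log(3) \approx 1.10$; the balanced 4,000/4,000 training split used here gives exactly 0.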

Positive and Negative Probability of a Word

To compute a word's positive and negative probability, use the following inputs:

  • $freq_{pos}$ and $freq_{neg}$ are the frequencies of a word in the positive and the negative class; for example, a word's positive frequency is the number of times it appears with the label 1.
  • $N_{pos}$ and $N_{neg}$ are the total numbers of positive and negative words in the dataset (all tweets).
  • $V$ is the number of unique words in the dataset, counting each word only once.

Compute a word's positive and negative probability with:

$$P(W_{pos}) = \frac{freq_{pos} + 1}{N_{pos} + V} \tag{4}$$

$$P(W_{neg}) = \frac{freq_{neg} + 1}{N_{neg} + V} \tag{5}$$

Note the "+1" in the numerators: this is additive (Laplace) smoothing. See the Wikipedia article on additive smoothing for a detailed explanation.
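
For instance, a word that never occurs in the positive class ($freq_{pos} = 0$) still gets $P(W_{pos}) = \frac{1}{N_{pos} + V} > 0$, which keeps the log ratio in equation (6) below finite.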

Log likelihood

To compute a word's loglikelihood, use:

$$\text{loglikelihood} = \log\left(\frac{P(W_{pos})}{P(W_{neg})}\right) \tag{6}$$
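
As a quick example, if $P(W_{pos}) = 0.05$ and $P(W_{neg}) = 0.01$, the loglikelihood is $\log(5) \approx 1.61$: positive values mark words that lean positive, negative values mark words that lean negative.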

Build the freqs dictionary
  • Given your count_tweets() function, build the freqs dictionary containing all the counts.
  • In the freqs dictionary, the keys are (word, label) pairs.
  • The values are the number of times each pair appears.

This dictionary will be used several times.

# Build the freqs dictionary for later uses
freqs = count_tweets({}, train_x, train_y)

Train the model

Given the frequency dictionary, train_x (the tweets), and train_y (their labels), implement a naive Bayes classifier:

Compute $V$
  • Count the number of unique words in the freqs dictionary to get $V$ (the set function is useful here).
Compute $freq_{pos}$ and $freq_{neg}$
  • Using the freqs dictionary, compute each word's positive frequency $freq_{pos}$ and negative frequency $freq_{neg}$.
Compute $N_{pos}$ and $N_{neg}$
  • Using the freqs dictionary, compute the total number of positive words $N_{pos}$ and the total number of negative words $N_{neg}$.
Compute $D$, $D_{pos}$, $D_{neg}$
  • Using train_y, compute the total number of tweets $D$, the number of positive tweets $D_{pos}$, and the number of negative tweets $D_{neg}$.
  • Compute the probability that a tweet is positive, $P(D_{pos})$, and the probability that it is negative, $P(D_{neg})$.
Compute the logprior
  • The logprior is $\log(D_{pos}) - \log(D_{neg})$.
Compute the loglikelihood
  • Finally, iterate over every word in the vocabulary, using the lookup function to get each word's positive frequency $freq_{pos}$ and negative frequency $freq_{neg}$.
  • Compute each word's positive probability $P(W_{pos})$ and negative probability $P(W_{neg})$ with:

$$P(W_{pos}) = \frac{freq_{pos} + 1}{N_{pos} + V} \tag{4}$$

$$P(W_{neg}) = \frac{freq_{neg} + 1}{N_{neg} + V} \tag{5}$$

Note: store each word's loglikelihood in a dictionary whose keys are the words and whose values are the words' loglikelihoods.

  • Then compute the loglikelihood: $\log\left(\frac{P(W_{pos})}{P(W_{neg})}\right)$.
# UNQ_C2 (UNIQUE CELL IDENTIFIER, DO NOT EDIT)
def train_naive_bayes(freqs, train_x, train_y):
    '''
    Input:
        freqs: dictionary from (word, label) to how often the word appears
        train_x: a list of tweets
        train_y: a list of labels corresponding to the tweets (0,1)
    Output:
        logprior: the log prior. (equation 3 above)
        loglikelihood: the log likelihood of your naive Bayes equation. (equation 6 above)
    '''
    loglikelihood = {}
    logprior = 0

    ### START CODE HERE (REPLACE INSTANCES OF 'None' with your code) ###
    # calculate V, the number of unique words in the vocabulary
    vocab = set([pair[0] for pair in freqs.keys()])
    V = len(vocab)

    # calculate N_pos and N_neg
    N_pos = N_neg = 0
    for pair in freqs.keys():
        # if the label is positive (greater than zero)
        if pair[1] > 0:
            # Increment the number of positive words by the count for this (word, label) pair
            N_pos += freqs[pair]
        # else, the label is negative
        else:
            # increment the number of negative words by the count for this (word, label) pair
            N_neg += freqs[pair]

    # Calculate D, the number of documents
    D = len(train_y)

    # Calculate D_pos, the number of positive documents (*hint: use sum(<np_array>))
    D_pos = sum(train_y == 1)

    # Calculate D_neg, the number of negative documents (*hint: compute using D and D_pos)
    D_neg = D - D_pos

    # Calculate logprior
    logprior = np.log(D_pos) - np.log(D_neg)

    # For each word in the vocabulary...
    for word in vocab:
        # get the positive and negative frequency of the word
        freq_pos = freqs.get((word, 1), 0)
        freq_neg = freqs.get((word, 0), 0)

        # calculate the probability that each word is positive, and negative
        p_w_pos = (freq_pos + 1) / (N_pos + V)
        p_w_neg = (freq_neg + 1) / (N_neg + V)

        # calculate the log likelihood of the word
        loglikelihood[word] = np.log(p_w_pos / p_w_neg)
    ### END CODE HERE ###

    return logprior, loglikelihood
# UNQ_C3 (UNIQUE CELL IDENTIFIER, DO NOT EDIT)
# You do not have to input any code in this cell, but it is relevant to grading, so please do not change anything
logprior, loglikelihood = train_naive_bayes(freqs, train_x, train_y)
print(logprior)
print(len(loglikelihood))

0.0
9089

Part 3: Test the Model

Now that we have the logprior and loglikelihood, we can validate the model by making predictions on some tweets.

Implement naive_bayes_predict

Instructions
Implement the naive_bayes_predict function to make predictions on tweets.

  • The function takes a tweet, the logprior, and the loglikelihood dictionary.
  • It returns a score indicating whether the tweet is more likely positive or negative.
  • For each tweet, sum up the loglikelihoods of its words.
  • Then add the logprior to this sum to predict the sentiment of the tweet:

$$p = \text{logprior} + \sum_{i}^{N} \text{loglikelihood}_i$$

Note

The prior is computed from the training data, which is a balanced dataset (4,000 positive and 4,000 negative tweets). The ratio of positive to negative documents is therefore 1, and the logprior is 0.

In this assignment the logprior happens to be 0, but for an unbalanced dataset it is nonzero, so do not forget to add the logprior.

# UNQ_C4 (UNIQUE CELL IDENTIFIER, DO NOT EDIT)
def naive_bayes_predict(tweet, logprior, loglikelihood):
    '''
    Input:
        tweet: a string
        logprior: a number
        loglikelihood: a dictionary of words mapping to numbers
    Output:
        p: the sum of all the loglikelihoods of each word in the tweet (if found in the dictionary) + logprior (a number)
    '''
    ### START CODE HERE (REPLACE INSTANCES OF 'None' with your code) ###
    # process the tweet to get a list of words
    word_l = process_tweet(tweet)

    # initialize probability to zero
    p = 0

    # add the logprior
    p += logprior

    for word in word_l:
        # check if the word exists in the loglikelihood dictionary
        if word in loglikelihood:
            # add the log likelihood of that word to the probability
            p += loglikelihood[word]
    ### END CODE HERE ###

    return p
# UNQ_C5 (UNIQUE CELL IDENTIFIER, DO NOT EDIT)
# You do not have to input any code in this cell, but it is relevant to grading, so please do not change anything

# Experiment with your own tweet.
my_tweet = 'She smiled.'
p = naive_bayes_predict(my_tweet, logprior, loglikelihood)
print('The expected output is', p)

The expected output is 1.5740278623499175
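
Since the score 1.57 is greater than 0, the model classifies 'She smiled.' as positive.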

Implement test_naive_bayes

Instructions

  • Implement test_naive_bayes to check the accuracy of your predictions.
  • The function takes test_x, test_y, logprior, and loglikelihood.
  • It returns the accuracy of the model.
  • Use the naive_bayes_predict function to make a prediction for every tweet in test_x.
# UNQ_C6 (UNIQUE CELL IDENTIFIER, DO NOT EDIT)
def test_naive_bayes(test_x, test_y, logprior, loglikelihood):
    """
    Input:
        test_x: A list of tweets
        test_y: the corresponding labels for the list of tweets
        logprior: the logprior
        loglikelihood: a dictionary with the loglikelihoods for each word
    Output:
        accuracy: (# of tweets classified correctly)/(total # of tweets)
    """
    accuracy = 0  # return this properly

    ### START CODE HERE (REPLACE INSTANCES OF 'None' with your code) ###
    y_hats = []
    for tweet in test_x:
        # if the prediction is > 0
        if naive_bayes_predict(tweet, logprior, loglikelihood) > 0:
            # the predicted class is 1
            y_hat_i = 1
        else:
            # otherwise the predicted class is 0
            y_hat_i = 0

        # append the predicted class to the list y_hats
        y_hats.append(y_hat_i)

    # error is the average of the absolute values of the differences between y_hats and test_y
    error = sum(y_hats != test_y) / len(test_y)

    # Accuracy is 1 minus the error
    accuracy = 1 - error
    ### END CODE HERE ###

    return accuracy
print("Naive Bayes accuracy = %0.4f" %(test_naive_bayes(test_x, test_y, logprior, loglikelihood)))

Naive Bayes accuracy = 0.9940
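
An accuracy of 0.9940 on the 2,000-tweet test set (1,000 positive and 1,000 negative) corresponds to 1,988 tweets classified correctly.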

# UNQ_C7 (UNIQUE CELL IDENTIFIER, DO NOT EDIT)
# You do not have to input any code in this cell, but it is relevant to grading, so please do not change anything

# Run this cell to test your function
for tweet in ['I am happy', 'I am bad', 'this movie should have been great.', 'great', 'great great', 'great great great', 'great great great great']:
    # print('%s -> %f' % (tweet, naive_bayes_predict(tweet, logprior, loglikelihood)))
    p = naive_bayes_predict(tweet, logprior, loglikelihood)
    # print(f'{tweet} -> {p:.2f} ({p_category})')
    print(f'{tweet} -> {p:.2f}')

I am happy -> 2.15
I am bad -> -1.29
this movie should have been great. -> 2.14
great -> 2.14
great great -> 4.28
great great great -> 6.41
great great great great -> 8.55
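
Note how the score grows roughly linearly with repetition: each additional 'great' adds its loglikelihood of about 2.14. This additivity is a direct consequence of the naive independence assumption.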

# Feel free to check the sentiment of your own tweet below
my_tweet = 'you are bad :('
naive_bayes_predict(my_tweet, logprior, loglikelihood)

-8.801622640492191

Part 4: Filter Words by the Ratio of Positive to Negative Counts

  • Some words have more positive counts than negative ones and can be considered more "positive"; likewise, some words can be considered more "negative".
  • One way to define a word's level of positiveness or negativeness, without computing the loglikelihood, is to compare its positive frequency to its negative frequency.
    • You could, of course, also use the loglikelihood for this comparison.
  • You can compute the ratio of a word's positive count to its negative count.
  • Once you have this ratio, you can filter words by how high or low it is.

Implement get_ratio()

  • Given the freqs dictionary and a word, use lookup(freqs, word, 1) to get the word's positive count.
  • Similarly, use the lookup() function to get the word's negative count.
  • Compute the ratio of positive to negative counts:

$$ratio = \frac{\text{pos\_words} + 1}{\text{neg\_words} + 1}$$

where pos_words and neg_words are the word's counts in the respective classes.

Word    Positive word count    Negative word count
glad              41                       2
arriv             57                       4
:(                 1                    3663
:-(                0                     378
# UNQ_C8 (UNIQUE CELL IDENTIFIER, DO NOT EDIT)
def get_ratio(freqs, word):
    '''
    Input:
        freqs: dictionary containing the words
        word: string to lookup
    Output: a dictionary with keys 'positive', 'negative', and 'ratio'.
        Example: {'positive': 10, 'negative': 20, 'ratio': 0.5}
    '''
    pos_neg_ratio = {'positive': 0, 'negative': 0, 'ratio': 0.0}

    ### START CODE HERE (REPLACE INSTANCES OF 'None' with your code) ###
    # use lookup() to find positive counts for the word (denoted by the integer 1)
    pos_neg_ratio['positive'] = lookup(freqs, word, 1)

    # use lookup() to find negative counts for the word (denoted by integer 0)
    pos_neg_ratio['negative'] = lookup(freqs, word, 0)

    # calculate the ratio of positive to negative counts for the word
    pos_neg_ratio['ratio'] = (pos_neg_ratio['positive'] + 1) / (pos_neg_ratio['negative'] + 1)
    ### END CODE HERE ###

    return pos_neg_ratio
get_ratio(freqs, 'happi')['ratio']

8.526315789473685
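
A ratio well above 1, such as 8.53 here for 'happi', marks a strongly positive word; at the other extreme, the table above gives ':(' a ratio of (1 + 1)/(3663 + 1) ≈ 0.0005, marking it as strongly negative.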

Implement get_words_by_threshold(freqs, label, threshold)

  • If label is 1, keep words whose positive/negative ratio is greater than or equal to the threshold.
  • If label is 0, keep words whose positive/negative ratio is less than or equal to the threshold.
  • Use the get_ratio() function to get a dictionary containing the positive count, the negative count, and the ratio of positive to negative counts.
  • Collect the results in a dictionary keyed by word, where each value is the pos_neg_ratio dictionary returned by get_ratio().
    For example, the structure looks like this:
{'happi': {'positive': 10, 'negative': 20, 'ratio': 0.5}}
for key in freqs.keys():
    word, _ = key
    print(freqs[(word, _)])

23
30
7
14
27
72
2847
60
7
2
5
80

# UNQ_C9 (UNIQUE CELL IDENTIFIER, DO NOT EDIT)
def get_words_by_threshold(freqs, label, threshold):
    '''
    Input:
        freqs: dictionary of words
        label: 1 for positive, 0 for negative
        threshold: ratio that will be used as the cutoff for including a word in the returned dictionary
    Output:
        word_list: dictionary containing the word and information on its positive count, negative count,
            and ratio of positive to negative counts.
        example of a key value pair:
        {'happi': {'positive': 10, 'negative': 20, 'ratio': 0.5}}
    '''
    word_list = {}

    ### START CODE HERE (REPLACE INSTANCES OF 'None' with your code) ###
    for key in freqs.keys():
        word, _ = key

        # get the positive/negative ratio for a word
        pos_neg_ratio = get_ratio(freqs, word)

        # if the label is 1 and the ratio is greater than or equal to the threshold...
        if label == 1 and pos_neg_ratio['ratio'] >= threshold:
            # Add the pos_neg_ratio to the dictionary
            word_list[word] = pos_neg_ratio

        # If the label is 0 and the pos_neg_ratio is less than or equal to the threshold...
        elif label == 0 and pos_neg_ratio['ratio'] <= threshold:
            # Add the pos_neg_ratio to the dictionary
            word_list[word] = pos_neg_ratio

        # otherwise, do not include this word in the list (do nothing)
    ### END CODE HERE ###

    return word_list
# Test your function: find negative words at or below a threshold
get_words_by_threshold(freqs, label=0, threshold=0.05)

{'
