1 Association Rules
- 1.0.1 Identify number of customers
- 1.0.2 Identify customer doing most purchasing & amount
- 1.0.3 Clean Data
- 1.0.4 Finding all the credit records
- 1.0.5 Basket Creation
- 1.0.6 Examples1
- 1.0.7 examples2
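  - 1.0.7.1 Frequent itemsets:
  - 1.0.7.2 Support:
  - 1.0.7.3 Confidence:
  - 1.0.7.4 Association analysis example:
  - 1.0.7.5 First, create the data:
  - 1.0.7.6 Convert to a list of transactions:
  - 1.0.7.7 Convert to the format the model accepts:
  - 1.0.7.8 Find frequent itemsets:
  - 1.0.7.9 Find association rules: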

Association Rules

import pandas as pd
import numpy as np

data = pd.read_excel('Data/Online Retail.xlsx')

from mlxtend.frequent_patterns import apriori, association_rules

data.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 541909 entries, 0 to 541908
Data columns (total 8 columns):
InvoiceNo      541909 non-null object
StockCode      541909 non-null object
Description    540455 non-null object
Quantity       541909 non-null int64
InvoiceDate    541909 non-null datetime64[ns]
UnitPrice      541909 non-null float64
CustomerID     406829 non-null float64
Country        541909 non-null object
dtypes: datetime64[ns](1), float64(2), int64(1), object(4)
memory usage: 33.1+ MB

data.head()

	InvoiceNo	StockCode	Description	Quantity	InvoiceDate	UnitPrice	CustomerID	Country
0	536365	85123A	WHITE HANGING HEART T-LIGHT HOLDER	6	2010-12-01 08:26:00	2.55	17850.0	United Kingdom
1	536365	71053	WHITE METAL LANTERN	6	2010-12-01 08:26:00	3.39	17850.0	United Kingdom
2	536365	84406B	CREAM CUPID HEARTS COAT HANGER	8	2010-12-01 08:26:00	2.75	17850.0	United Kingdom
3	536365	84029G	KNITTED UNION FLAG HOT WATER BOTTLE	6	2010-12-01 08:26:00	3.39	17850.0	United Kingdom
4	536365	84029E	RED WOOLLY HOTTIE WHITE HEART.	6	2010-12-01 08:26:00	3.39	17850.0	United Kingdom

data.Country.value_counts()

United Kingdom          495478
Germany                   9495
France                    8557
EIRE                      8196
Spain                     2533
Netherlands               2371
Belgium                   2069
Switzerland               2002
Portugal                  1519
Australia                 1259
Norway                    1086
Italy                      803
Channel Islands            758
Finland                    695
Cyprus                     622
Sweden                     462
Unspecified                446
Austria                    401
Denmark                    389
Japan                      358
Poland                     341
Israel                     297
USA                        291
Hong Kong                  288
Singapore                  229
Iceland                    182
Canada                     151
Greece                     146
Malta                      127
United Arab Emirates        68
European Community          61
RSA                         58
Lebanon                     45
Lithuania                   35
Brazil                      32
Czech Republic              30
Bahrain                     19
Saudi Arabia                10
Name: Country, dtype: int64

Identify number of customers

data.CustomerID.nunique()  # number of distinct customer IDs; missing IDs are not counted

Identify customer doing most purchasing & amount

data['TotalPrice'] = data['Quantity'] * data['UnitPrice']

res = data.groupby(['CustomerID']).TotalPrice.sum()

res.sort_values(ascending=False)

CustomerID
14646.0    279489.02
18102.0    256438.49
17450.0    187482.17
14911.0    132572.62
12415.0    123725.45
             ...
12503.0     -1126.00
17603.0     -1165.30
14213.0     -1192.20
15369.0     -1592.49
17448.0     -4287.63
Name: TotalPrice, Length: 4372, dtype: float64

data.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 541909 entries, 0 to 541908
Data columns (total 9 columns):
InvoiceNo      541909 non-null object
StockCode      541909 non-null object
Description    540455 non-null object
Quantity       541909 non-null int64
InvoiceDate    541909 non-null datetime64[ns]
UnitPrice      541909 non-null float64
CustomerID     406829 non-null float64
Country        541909 non-null object
TotalPrice     541909 non-null float64
dtypes: datetime64[ns](1), float64(3), int64(1), object(4)
memory usage: 37.2+ MB

Clean Data

data.InvoiceNo.value_counts()

573585     1114
581219      749
581492      731
580729      721
558475      705
           ...
C552241       1
C549840       1
556417        1
C560825       1
C554870       1
Name: InvoiceNo, Length: 25900, dtype: int64

Finding all the credit records

data[data.InvoiceNo.astype('str').str.startswith('C')]

	InvoiceNo	StockCode	Description	Quantity	InvoiceDate	UnitPrice	CustomerID	Country	TotalPrice
141	C536379	D	Discount	-1	2010-12-01 09:41:00	27.50	14527.0	United Kingdom	-27.50
154	C536383	35004C	SET OF 3 COLOURED FLYING DUCKS	-1	2010-12-01 09:49:00	4.65	15311.0	United Kingdom	-4.65
235	C536391	22556	PLASTERS IN TIN CIRCUS PARADE	-12	2010-12-01 10:24:00	1.65	17548.0	United Kingdom	-19.80
236	C536391	21984	PACK OF 12 PINK PAISLEY TISSUES	-24	2010-12-01 10:24:00	0.29	17548.0	United Kingdom	-6.96
237	C536391	21983	PACK OF 12 BLUE PAISLEY TISSUES	-24	2010-12-01 10:24:00	0.29	17548.0	United Kingdom	-6.96
...	...	...	...	...	...	...	...	...	...
540449	C581490	23144	ZINC T-LIGHT HOLDER STARS SMALL	-11	2011-12-09 09:57:00	0.83	14397.0	United Kingdom	-9.13
541541	C581499	M	Manual	-1	2011-12-09 10:28:00	224.69	15498.0	United Kingdom	-224.69
541715	C581568	21258	VICTORIAN SEWING BOX LARGE	-5	2011-12-09 11:57:00	10.95	15311.0	United Kingdom	-54.75
541716	C581569	84978	HANGING HEART JAR T-LIGHT HOLDER	-1	2011-12-09 11:58:00	1.25	17315.0	United Kingdom	-1.25
541717	C581569	20979	36 PENCILS TUBE RED RETROSPOT	-5	2011-12-09 11:58:00	1.25	17315.0	United Kingdom	-6.25

9288 rows × 9 columns

# Drop the credit invoices; .copy() avoids a SettingWithCopyWarning on the assignment below
data = data[~data.InvoiceNo.astype('str').str.startswith('C')].copy()

data['Description'] = data.Description.str.strip()
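
The credit filter and the whitespace cleanup can be tried on a tiny hand-made frame. This is a minimal sketch: the three rows and the names `toy`, `is_credit` and `sales` are invented for illustration and are not taken from the dataset.

```python
import pandas as pd

# Hypothetical miniature of the invoice table (values invented for illustration).
toy = pd.DataFrame({
    "InvoiceNo": [536365, "C536379", 536370],
    "Description": [" WHITE METAL LANTERN ", "Discount", "POSTAGE"],
    "Quantity": [6, -1, 3],
})

# Invoice numbers mix ints and strings, so cast to str before testing the prefix.
is_credit = toy["InvoiceNo"].astype(str).str.startswith("C")

# .copy() yields an independent frame, so the column assignment below
# does not trigger pandas' SettingWithCopyWarning.
sales = toy[~is_credit].copy()
sales["Description"] = sales["Description"].str.strip()

print(sales)
```

Only the two rows without the `C` prefix remain, and the padded description is trimmed.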

Basket Creation

data_Germany = data[data.Country == 'Germany']

data_Germany.groupby(['InvoiceNo','Description'])['Quantity'].sum()

InvoiceNo  Description
536527     3 HOOK HANGER MAGIC GARDEN             12
           5 HOOK HANGER MAGIC TOADSTOOL          12
           5 HOOK HANGER RED MAGIC TOADSTOOL      12
           ASSORTED COLOUR LIZARD SUCTION HOOK    24
           CHILDREN'S CIRCUS PARADE MUG           12
                                                  ..
581578     SPOTTY BUNTING                          9
           VINTAGE DONKEY TAIL GAME                6
           WRAP ALPHABET POSTER                   25
           WRAP CIRCUS PARADE                     25
           WRAP RED APPLES                        25
Name: Quantity, Length: 9015, dtype: int64

data.Description.value_counts()

WHITE HANGING HEART T-LIGHT HOLDER    2327
JUMBO BAG RED RETROSPOT               2115
REGENCY CAKESTAND 3 TIER              2019
PARTY BUNTING                         1707
LUNCH BAG RED RETROSPOT               1594
                                      ...
FOUND                                    1
JAM JAR WITH BLUE LID                    1
PINK POLKADOT KIDS BAG                   1
showroom                                 1
BIRD ON BRANCH CANVAS SCREEN             1
Name: Description, Length: 4194, dtype: int64

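# Note: despite the variable name, this filter selects the France rows,
# so the basket and the rules that follow describe French invoices.
basket_Germany = (data[data['Country'] == "France"]
                  .groupby(['InvoiceNo', 'Description'])['Quantity']
                  .sum().unstack().reset_index().fillna(0)
                  .set_index('InvoiceNo'))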

basket_Germany.head()

Description	10 COLOUR SPACEBOY PEN	12 COLOURED PARTY BALLOONS	12 EGG HOUSE PAINTED WOOD	12 MESSAGE CARDS WITH ENVELOPES	12 PENCIL SMALL TUBE WOODLAND	12 PENCILS SMALL TUBE RED RETROSPOT	12 PENCILS SMALL TUBE SKULL	12 PENCILS TALL TUBE POSY	12 PENCILS TALL TUBE RED RETROSPOT	12 PENCILS TALL TUBE WOODLAND	...	WRAP VINTAGE PETALS DESIGN	YELLOW COAT RACK PARIS FASHION	YELLOW GIANT GARDEN THERMOMETER	YELLOW SHARK HELICOPTER	ZINC STAR T-LIGHT HOLDER	ZINC FOLKART SLEIGH BELLS	ZINC HERB GARDEN CONTAINER	ZINC METAL HEART DECORATION	ZINC T-LIGHT HOLDER STAR LARGE	ZINC T-LIGHT HOLDER STARS SMALL
InvoiceNo
536370	0.0	0.0	0.0	0.0	0.0	0.0	0.0	0.0	0.0	0.0	...	0.0	0.0	0.0	0.0	0.0	0.0	0.0	0.0	0.0	0.0
536852	0.0	0.0	0.0	0.0	0.0	0.0	0.0	0.0	0.0	0.0	...	0.0	0.0	0.0	0.0	0.0	0.0	0.0	0.0	0.0	0.0
536974	0.0	0.0	0.0	0.0	0.0	0.0	0.0	0.0	0.0	0.0	...	0.0	0.0	0.0	0.0	0.0	0.0	0.0	0.0	0.0	0.0
537065	0.0	0.0	0.0	0.0	0.0	0.0	0.0	0.0	0.0	0.0	...	0.0	0.0	0.0	0.0	0.0	0.0	0.0	0.0	0.0	0.0
537463	0.0	0.0	0.0	0.0	0.0	0.0	0.0	0.0	0.0	0.0	...	0.0	0.0	0.0	0.0	0.0	0.0	0.0	0.0	0.0	0.0

5 rows × 1563 columns

# Encode quantities as 0/1: any positive quantity counts as "purchased"
basket_encoded = basket_Germany.applymap(lambda x: 0 if x <=0 else 1)
basket_Germany = basket_encoded
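
The basket construction is easier to follow on a handful of rows. The sketch below uses invented invoice lines (the names `lines`, `basket` and `basket_bool` are illustrative) and encodes the result as booleans rather than 0/1, which is an assumption of this example and not what the notebook does.

```python
import pandas as pd

# Hypothetical invoice lines (invented for illustration).
lines = pd.DataFrame({
    "InvoiceNo": [1, 1, 2, 2, 3],
    "Description": ["CUPS", "PLATES", "CUPS", "POSTAGE", "PLATES"],
    "Quantity": [6, 6, 12, 1, -2],
})

# One row per invoice, one column per product, summed quantity in each cell.
basket = (
    lines.groupby(["InvoiceNo", "Description"])["Quantity"]
    .sum()
    .unstack(fill_value=0)
)

# Any positive quantity becomes True; zero or negative becomes False.
basket_bool = basket > 0

print(basket_bool)
```

Invoice 3 ends up with no purchased items because its only line has a negative quantity, which mirrors the `x <= 0` branch of the encoding.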

frq_items = apriori(basket_Germany, min_support = 0.05, use_colnames = True)

rules = association_rules(frq_items, metric ="confidence", min_threshold = .1)
# rules = rules.sort_values(['confidence', 'lift'], ascending =[False, False])
# print(rules.head())

rules = rules.sort_values(['confidence', 'lift'], ascending =[False, False])

rules.head(20)

	antecedents	consequents	antecedent support	consequent support	support	confidence	lift	leverage	conviction
34	(JUMBO BAG WOODLAND ANIMALS)	(POSTAGE)	0.076531	0.765306	0.076531	1.000000	1.306667	0.017961	inf
228	(PLASTERS IN TIN CIRCUS PARADE, RED TOADSTOOL ...	(POSTAGE)	0.051020	0.765306	0.051020	1.000000	1.306667	0.011974	inf
241	(RED TOADSTOOL LED NIGHT LIGHT, PLASTERS IN TI...	(POSTAGE)	0.053571	0.765306	0.053571	1.000000	1.306667	0.012573	inf
269	(SET/20 RED RETROSPOT PAPER NAPKINS, SET/6 RED...	(SET/6 RED SPOTTY PAPER PLATES)	0.102041	0.127551	0.099490	0.975000	7.644000	0.086474	34.897959
267	(SET/6 RED SPOTTY PAPER PLATES, SET/20 RED RET...	(SET/6 RED SPOTTY PAPER CUPS)	0.102041	0.137755	0.099490	0.975000	7.077778	0.085433	34.489796
302	(SET/6 RED SPOTTY PAPER CUPS, SET/20 RED RETRO...	(SET/6 RED SPOTTY PAPER PLATES)	0.084184	0.127551	0.081633	0.969697	7.602424	0.070895	28.790816
299	(SET/6 RED SPOTTY PAPER PLATES, SET/20 RED RET...	(SET/6 RED SPOTTY PAPER CUPS)	0.084184	0.137755	0.081633	0.969697	7.039282	0.070036	28.454082
114	(RED RETROSPOT PICNIC BAG)	(POSTAGE)	0.071429	0.765306	0.068878	0.964286	1.260000	0.014213	6.571429
126	(SET OF 9 BLACK SKULL BALLOONS)	(POSTAGE)	0.066327	0.765306	0.063776	0.961538	1.256410	0.013015	6.102041
153	(SET/6 RED SPOTTY PAPER PLATES)	(SET/6 RED SPOTTY PAPER CUPS)	0.127551	0.137755	0.122449	0.960000	6.968889	0.104878	21.556122
78	(PACK OF 6 SKULL PAPER CUPS)	(POSTAGE)	0.063776	0.765306	0.061224	0.960000	1.254400	0.012417	5.867347
119	(RETROSPOT PARTY BAG + STICKER SET)	(POSTAGE)	0.061224	0.765306	0.058673	0.958333	1.252222	0.011818	5.632653
28	(GUMBALL COAT RACK)	(POSTAGE)	0.058673	0.765306	0.056122	0.956522	1.249855	0.011219	5.397959
79	(PACK OF 6 SKULL PAPER PLATES)	(POSTAGE)	0.056122	0.765306	0.053571	0.954545	1.247273	0.010621	5.163265
261	(SET/6 RED SPOTTY PAPER PLATES, POSTAGE)	(SET/6 RED SPOTTY PAPER CUPS)	0.107143	0.137755	0.102041	0.952381	6.913580	0.087281	18.107143
29	(JAM MAKING SET PRINTED)	(POSTAGE)	0.053571	0.765306	0.051020	0.952381	1.244444	0.010022	4.928571
140	(TEA PARTY BIRTHDAY CARD)	(POSTAGE)	0.094388	0.765306	0.089286	0.945946	1.236036	0.017050	4.341837
139	(STRAWBERRY LUNCH BOX WITH CUTLERY)	(POSTAGE)	0.122449	0.765306	0.114796	0.937500	1.225000	0.021085	3.755102
123	(ROUND SNACK BOXES SET OF4 WOODLAND)	(POSTAGE)	0.158163	0.765306	0.147959	0.935484	1.222366	0.026916	3.637755
20	(CHILDRENS CUTLERY SPACEBOY)	(CHILDRENS CUTLERY DOLLY GIRL)	0.068878	0.071429	0.063776	0.925926	12.962963	0.058856	12.535714

rules['name'] = rules.antecedents.astype('str')

rules[rules.name.str.contains('CHILDRENS CUTLERY SPACEBOY')]

	antecedents	consequents	antecedent support	consequent support	support	confidence	lift	leverage	conviction	name
20	(CHILDRENS CUTLERY SPACEBOY)	(CHILDRENS CUTLERY DOLLY GIRL)	0.068878	0.071429	0.063776	0.925926	12.962963	0.058856	12.535714	frozenset({'CHILDRENS CUTLERY SPACEBOY'})
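
Matching on the string form of the antecedents works, but it also matches substrings of product names. A possible alternative is to test set membership directly, since the antecedents are frozensets. The sketch below uses an invented stand-in table; `toy_rules` and `rules_with` are hypothetical names, not part of mlxtend.

```python
import pandas as pd

# Hypothetical stand-in for the rules table (values invented for illustration).
toy_rules = pd.DataFrame({
    "antecedents": [frozenset({"CUPS"}), frozenset({"CUPS", "NAPKINS"}), frozenset({"PLATES"})],
    "consequents": [frozenset({"PLATES"}), frozenset({"PLATES"}), frozenset({"CUPS"})],
    "confidence": [0.9, 0.95, 0.8],
})

def rules_with(rules, item):
    """Return the rules whose antecedent contains exactly this item."""
    mask = rules["antecedents"].apply(lambda s: item in s)
    return rules[mask]

print(rules_with(toy_rules, "CUPS"))
```

With this approach a partial name such as "CUP" selects nothing, whereas a substring search on the string form would match it.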

Examples1

import numpy as np
import matplotlib.pyplot as plt
import matplotlib.gridspec as gridspec
import itertools
from sklearn.linear_model import LogisticRegression
from sklearn.svm import SVC
from sklearn.ensemble import RandomForestClassifier
from mlxtend.classifier import EnsembleVoteClassifier
from mlxtend.data import iris_data
from mlxtend.plotting import plot_decision_regions

# Initializing Classifiers
clf1 = LogisticRegression(random_state=0)
clf2 = RandomForestClassifier(random_state=0)
clf3 = SVC(random_state=0, probability=True)
eclf = EnsembleVoteClassifier(clfs=[clf1, clf2, clf3], weights=[2, 1, 1], voting='soft')

# Loading some example data
X, y = iris_data()
X = X[:, [0, 2]]

# Plotting Decision Regions
gs = gridspec.GridSpec(2, 2)
fig = plt.figure(figsize=(10, 8))

<Figure size 720x576 with 0 Axes>

%matplotlib qt
for clf, lab, grd in zip([clf1, clf2, clf3, eclf],
                         ['Logistic Regression', 'Random Forest', 'RBF kernel SVM', 'Ensemble'],
                         itertools.product([0, 1], repeat=2)):
    clf.fit(X, y)
    ax = plt.subplot(gs[grd[0], grd[1]])
    fig = plot_decision_regions(X=X, y=y, clf=clf, legend=2)
    plt.title(lab)
plt.show()

d:\Anaconda3\lib\site-packages\sklearn\linear_model\logistic.py:432: FutureWarning: Default solver will be changed to ‘lbfgs’ in 0.22. Specify a solver to silence this warning.
FutureWarning)
d:\Anaconda3\lib\site-packages\sklearn\linear_model\logistic.py:469: FutureWarning: Default multi_class will be changed to ‘auto’ in 0.22. Specify the multi_class option to silence this warning.
“this warning.”, FutureWarning)
d:\Anaconda3\lib\site-packages\sklearn\svm\base.py:193: FutureWarning: The default value of gamma will change from ‘auto’ to ‘scale’ in version 0.22 to account better for unscaled features. Set gamma explicitly to ‘auto’ or ‘scale’ to avoid this warning.
“avoid this warning.”, FutureWarning)

examples2

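Frequent itemsets:

A frequent itemset is a set of items that often appear together, for example {wine (葡萄酒), diapers (尿布), soy milk (豆奶)} in the example shopping data (the five transactions built later in this section). From the same data one can also find the association rule diapers -> wine, which means that someone who buys diapers is quite likely to buy wine as well. How should frequent itemsets and association rules be defined and measured? This is where support and confidence come in.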

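Support:

Support: the support of an itemset is the proportion of records in the dataset that contain that itemset. In the example data, the support of {soy milk (豆奶)} is 4/5 and the support of {soy milk, diapers (尿布)} is 3/5. Support is defined on itemsets, so one can set a minimum support and keep only the itemsets that meet it.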
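
The support values quoted above can be checked with a few lines of plain Python. This is a minimal sketch: the `support` helper and the English item names are introduced here for illustration and are not part of mlxtend.

```python
# The five example transactions of this section, with item names in English.
transactions = [
    {"soy milk", "lettuce"},
    {"lettuce", "diapers", "wine", "beets"},
    {"soy milk", "diapers", "wine", "orange juice"},
    {"lettuce", "soy milk", "diapers", "wine"},
    {"lettuce", "soy milk", "diapers", "orange juice"},
]

def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)

print(support({"soy milk"}, transactions))             # 4 of 5 transactions
print(support({"soy milk", "diapers"}, transactions))  # 3 of 5 transactions
```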

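Confidence:

Confidence is defined for an association rule such as {diapers (尿布)} -> {wine (葡萄酒)}. It is computed as support({diapers, wine}) / support({diapers}). Here support({diapers, wine}) is 3/5 and support({diapers}) is 4/5, so the confidence of "diapers -> wine" is 3/4 = 0.75. In other words, the rule holds for 75% of the records that contain diapers (75% of the customers who bought diapers also bought wine).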
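
The same arithmetic can be written as a short helper. This sketch works with raw counts instead of dividing two supports, which gives the same ratio while avoiding floating-point rounding; `count` and `confidence` are hypothetical helper names and the item names are English renderings.

```python
# The five example transactions of this section, with item names in English.
transactions = [
    {"soy milk", "lettuce"},
    {"lettuce", "diapers", "wine", "beets"},
    {"soy milk", "diapers", "wine", "orange juice"},
    {"lettuce", "soy milk", "diapers", "wine"},
    {"lettuce", "soy milk", "diapers", "orange juice"},
]

def count(itemset, transactions):
    """Number of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    return sum(1 for t in transactions if itemset <= t)

def confidence(antecedent, consequent, transactions):
    """count(antecedent and consequent together) / count(antecedent)."""
    both = set(antecedent) | set(consequent)
    return count(both, transactions) / count(antecedent, transactions)

print(confidence({"diapers"}, {"wine"}, transactions))  # 3 / 4
print(confidence({"wine"}, {"diapers"}, transactions))  # 3 / 3
```

Confidence is not symmetric: every basket with wine also contains diapers, but not the other way round.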

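With these three basic concepts in place, the rest of this section uses mlxtend to work through the association analysis of the shopping list from start to finish.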

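Association analysis example:

First, create the data:

The data is first put into DataFrame format; a way of converting it back into a list follows.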

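import pandas as pd

# Items: 豆奶 = soy milk, 莴苣 = lettuce, 尿布 = diapers, 葡萄酒 = wine, 甜菜 = beets, 橙汁 = orange juice
shopping_list = [['豆奶','莴苣'],
                 ['莴苣','尿布','葡萄酒','甜菜'],
                 ['豆奶','尿布','葡萄酒','橙汁'],
                 ['莴苣','豆奶','尿布','葡萄酒'],
                 ['莴苣','豆奶','尿布','橙汁']]
shopping_df = pd.DataFrame(shopping_list)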

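Convert to a list of transactions:

Next, convert the DataFrame back into a list of item lists. (The data we receive is often already a DataFrame, so two ways of getting back to the list form are shown.)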

# Method 1
# df_arr = shopping_df.stack().groupby(level=0).apply(list).tolist()

# Method 2: drop the padding in each row and turn the rest into a list
def deal(data):
    return data.dropna().tolist()

df_arr = shopping_df.apply(deal,axis=1).tolist()
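
The reason the conversion has to drop missing values is that a DataFrame built from lists of unequal length pads the shorter rows. The sketch below shows this on two transactions with English item names; it uses a row-wise list comprehension, which is a variation on the methods above and an assumption of this example.

```python
import pandas as pd

shopping_list = [
    ["soy milk", "lettuce"],
    ["lettuce", "diapers", "wine", "beets"],
]

# Rows of unequal length are padded with missing values on the right.
shopping_df = pd.DataFrame(shopping_list)

# Dropping the padding row by row recovers the original lists.
recovered = [row.dropna().tolist() for _, row in shopping_df.iterrows()]

print(shopping_df.shape)
print(recovered)
```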

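Convert to the format the model accepts:

mlxtend's models accept only a specific data format. (TransactionEncoder works much like one-hot encoding: each distinct item becomes its own boolean column.)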


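from mlxtend.preprocessing import TransactionEncoder

# Data passed to the model must be in a specific format. TransactionEncoder converts it
# to boolean values; a custom function producing 0/1 would also work.
te = TransactionEncoder()    # create the encoder
df_tf = te.fit_transform(df_arr)
# df_01 = df_tf.astype('int')            # convert True/False to 0/1 (another approach from the official docs)
# df_name = te.inverse_transform(df_tf)        # map the encoded values back to the original item names
df = pd.DataFrame(df_tf,columns=te.columns_)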
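
To see what kind of table this step produces, the encoding can be reproduced by hand. This is a plain Python and pandas stand-in written for illustration, not mlxtend's implementation; the names `items`, `rows` and `onehot` are hypothetical and the item names are English renderings.

```python
import pandas as pd

transactions = [
    ["soy milk", "lettuce"],
    ["lettuce", "diapers", "wine", "beets"],
    ["soy milk", "diapers", "wine", "orange juice"],
    ["lettuce", "soy milk", "diapers", "wine"],
    ["lettuce", "soy milk", "diapers", "orange juice"],
]

# Sorted list of distinct items: one column per item.
items = sorted({item for t in transactions for item in t})

# One row per transaction; True where the transaction contains the item.
rows = [[item in t for item in items] for t in transactions]
onehot = pd.DataFrame(rows, columns=items)

print(onehot)
```

Each column sum is the number of transactions containing that item, so dividing by the number of rows gives the single-item supports.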

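Find frequent itemsets:

Import apriori and set the minimum support, min_support=0.05, to find the frequent itemsets; itemsets of at least a given length can also be selected.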

from mlxtend.frequent_patterns import apriori

# use_colnames=True labels the itemsets with the item names; the default, False, uses column indices
frequent_itemsets = apriori(df,min_support=0.05,use_colnames=True)
# frequent_itemsets = apriori(df,min_support=0.05)
frequent_itemsets.sort_values(by='support',ascending=False,inplace=True)   # frequent itemsets can be sorted by support
# print(frequent_itemsets[frequent_itemsets.itemsets.apply(lambda x: len(x)) >= 2])  # select frequent itemsets of length >= 2
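
For readers curious about what a level-wise search of this kind does, here is a compact sketch of the Apriori idea in plain Python. It is an illustration under simplifying assumptions (small data, no optimisation) and not mlxtend's code; `apriori_sketch` is a hypothetical name, the item names are English renderings, and the threshold 0.6 is chosen only to keep the output short.

```python
from itertools import combinations

transactions = [
    {"soy milk", "lettuce"},
    {"lettuce", "diapers", "wine", "beets"},
    {"soy milk", "diapers", "wine", "orange juice"},
    {"lettuce", "soy milk", "diapers", "wine"},
    {"lettuce", "soy milk", "diapers", "orange juice"},
]

def apriori_sketch(transactions, min_support):
    """Return {frozenset: support} for every itemset meeting min_support."""
    n = len(transactions)

    def support(itemset):
        return sum(1 for t in transactions if itemset <= t) / n

    # Level 1: frequent single items.
    items = {item for t in transactions for item in t}
    current = {frozenset([i]) for i in items}
    current = {s for s in current if support(s) >= min_support}

    frequent = {}
    k = 1
    while current:
        for s in current:
            frequent[s] = support(s)
        k += 1
        # Join step: union pairs of frequent (k-1)-itemsets into k-itemsets.
        candidates = {a | b for a in current for b in current if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must itself be frequent.
        candidates = {
            c for c in candidates
            if all(frozenset(sub) in current for sub in combinations(c, k - 1))
        }
        current = {c for c in candidates if support(c) >= min_support}
    return frequent

result = apriori_sketch(transactions, min_support=0.6)
for itemset, sup in sorted(result.items(), key=lambda kv: (-kv[1], sorted(kv[0]))):
    print(sorted(itemset), sup)
```

The prune step is what makes the search tractable: a candidate is only counted if all of its smaller subsets already passed the threshold.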

Find association rules:

Import association_rules and keep the rules whose 'confidence' is at least 0.9.

from mlxtend.frequent_patterns import association_rules

# metric accepts several measures; the metric columns of the returned table (e.g. 'lift', 'leverage') can be used here
association_rule = association_rules(frequent_itemsets,metric='confidence',min_threshold=0.9)
association_rule.sort_values(by='leverage',ascending=False,inplace=True)    # rules can be sorted by leverage
# print(association_rule)
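
Rule generation itself amounts to splitting a frequent itemset into an antecedent and a consequent and keeping the splits that clear the confidence threshold. The sketch below does this for one itemset in plain Python; `count` and `rules_from` are hypothetical helpers with English item names, and this is not how mlxtend is implemented internally.

```python
from itertools import combinations

transactions = [
    {"soy milk", "lettuce"},
    {"lettuce", "diapers", "wine", "beets"},
    {"soy milk", "diapers", "wine", "orange juice"},
    {"lettuce", "soy milk", "diapers", "wine"},
    {"lettuce", "soy milk", "diapers", "orange juice"},
]

def count(itemset):
    """Number of transactions that contain every item in `itemset`."""
    return sum(1 for t in transactions if itemset <= t)

def rules_from(itemset, min_confidence):
    """Yield (antecedent, consequent, confidence) for each split of `itemset`."""
    itemset = frozenset(itemset)
    for r in range(1, len(itemset)):
        for antecedent in combinations(sorted(itemset), r):
            antecedent = frozenset(antecedent)
            consequent = itemset - antecedent
            conf = count(itemset) / count(antecedent)
            if conf >= min_confidence:
                yield antecedent, consequent, conf

found = list(rules_from({"diapers", "wine"}, min_confidence=0.9))
for a, c, conf in found:
    print(sorted(a), "->", sorted(c), conf)
```

Only wine -> diapers survives at this threshold; diapers -> wine has confidence 0.75 and is filtered out.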

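This yields the association rules that satisfy the thresholds set above.

mlxtend describes association rules as a DataFrame rather than with the -> notation, where:

antecedents: the rule's antecedent (left-hand side)
consequents: the rule's consequent (right-hand side)
antecedent support: support of the antecedent
consequent support: support of the consequent
support: support of the rule (support of the union of antecedent and consequent)
confidence: confidence of the rule (rule support / antecedent support)
lift: the ratio of the probability of the consequent given the antecedent to the overall probability of the consequent.
leverage: how much more often the antecedent and consequent appear together than would be expected if they were independent.
conviction: similar in purpose to lift, but computed from the complements 1 - support(consequent) and 1 - confidence.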

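Lift (here $support(X\cup Y)$ is the rule support, i.e. the share of records containing every item of X and Y):
$lift(X\rightarrow Y) = \frac{support(X\cup Y)}{support(X)*support(Y)}$
Lift equals 1 when the antecedent and consequent are independent; the larger the lift, the stronger the association between them.

Leverage:
$leverage(X\rightarrow Y) = support(X\cup Y)-support(X)*support(Y)$

Conviction:

$conviction(X\rightarrow Y) = \frac{1-support(Y)}{1-confidence(X\rightarrow Y)}$
The larger the conviction, the stronger the association between antecedent and consequent. For all three measures, a larger value indicates a stronger association.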
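
As a worked check of the three formulas, the sketch below evaluates them for the rule {diapers} -> {wine} from the five-transaction example, using exact fractions so that no rounding creeps in. The variable names are chosen for this illustration only.

```python
from fractions import Fraction

# Supports for the rule {diapers} -> {wine} in the five-transaction example,
# kept as exact fractions to avoid floating-point noise.
support_x = Fraction(4, 5)    # {diapers}
support_y = Fraction(3, 5)    # {wine}
support_xy = Fraction(3, 5)   # {diapers, wine}

confidence = support_xy / support_x
lift = support_xy / (support_x * support_y)
leverage = support_xy - support_x * support_y
conviction = (1 - support_y) / (1 - confidence)

print(confidence, lift, leverage, conviction)
```

Lift above 1, leverage above 0 and conviction above 1 all point the same way: diapers and wine occur together more often than independence would predict.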

mlxtend documentation: https://rasbt.github.io/mlxtend/

mlxtend on GitHub: https://github.com/rasbt/mlxtend
