参照:SF-Crime Analysis & Prediction
Crime Scene Exploration and Model Fit
Random Forest Crime Classification(特征工程和预测)
主要是因为这个数据集包含了时间序列和坐标点。练习一下特征处理。

数据分析

导入库

#%%
%matplotlib inline
import numpy as np
import pandas as pd
import math
import seaborn as snsimport matplotlib.pyplot as plt
import matplotlib.ticker as tickerfrom datetime import tzinfo,timedelta,datetime

datetime库概述

导入数据

data_train = pd.read_csv("C:\\Users\\Nihil\\Documents\\pythonlearn\\data\\kaggle\\sf-crime\\train.csv")
print(data_train.dtypes)
Dates          object
Category       object
Descript       object
DayOfWeek      object
PdDistrict     object
Resolution     object
Address        object
X             float64
Y             float64
dtype: object
print(data_train.head())
                 Dates        Category                      Descript  \
0  2015-05-13 23:53:00        WARRANTS                WARRANT ARREST
1  2015-05-13 23:53:00  OTHER OFFENSES      TRAFFIC VIOLATION ARREST
2  2015-05-13 23:33:00  OTHER OFFENSES      TRAFFIC VIOLATION ARREST
3  2015-05-13 23:30:00   LARCENY/THEFT  GRAND THEFT FROM LOCKED AUTO
4  2015-05-13 23:30:00   LARCENY/THEFT  GRAND THEFT FROM LOCKED AUTO   DayOfWeek PdDistrict      Resolution                    Address  \
0  Wednesday   NORTHERN  ARREST, BOOKED         OAK ST / LAGUNA ST
1  Wednesday   NORTHERN  ARREST, BOOKED         OAK ST / LAGUNA ST
2  Wednesday   NORTHERN  ARREST, BOOKED  VANNESS AV / GREENWICH ST
3  Wednesday   NORTHERN            NONE   1500 Block of LOMBARD ST
4  Wednesday       PARK            NONE  100 Block of BRODERICK ST   X          Y
0 -122.425892  37.774599
1 -122.425892  37.774599
2 -122.424363  37.800414
3 -122.426995  37.800873
4 -122.438738  37.771541
print(data_train.columns.values)
['Dates' 'Category' 'Descript' 'DayOfWeek' 'PdDistrict' 'Resolution''Address' 'X' 'Y']

单一特征分析

Category

类型:类别特征
有哪些特征?

print('number of crime categories is {}'.format(len(data_train.Category.unique())))
print('Type:')
print(data_train.Category.value_counts().index)

统计频次还可以用:

print('number of crime categories is {}'.format(data_train.Category.nunique()))
number of crime categories is 39
number of crime categories is 39
Type
Index(['LARCENY/THEFT', 'OTHER OFFENSES', 'NON-CRIMINAL', 'ASSAULT','DRUG/NARCOTIC', 'VEHICLE THEFT', 'VANDALISM', 'WARRANTS', 'BURGLARY','SUSPICIOUS OCC', 'MISSING PERSON', 'ROBBERY', 'FRAUD','FORGERY/COUNTERFEITING', 'SECONDARY CODES', 'WEAPON LAWS','PROSTITUTION', 'TRESPASS', 'STOLEN PROPERTY', 'SEX OFFENSES FORCIBLE','DISORDERLY CONDUCT', 'DRUNKENNESS', 'RECOVERED VEHICLE', 'KIDNAPPING','DRIVING UNDER THE INFLUENCE', 'RUNAWAY', 'LIQUOR LAWS', 'ARSON','LOITERING', 'EMBEZZLEMENT', 'SUICIDE', 'FAMILY OFFENSES', 'BAD CHECKS','BRIBERY', 'EXTORTION', 'SEX OFFENSES NON FORCIBLE', 'GAMBLING','PORNOGRAPHY/OBSCENE MAT', 'TREA'],dtype='object')

作图:每个类型出现的频次

number_of_crimes = data_train.Category.value_counts()
ax = sns.barplot(x=number_of_crimes.index,y=number_of_crimes)
ax.set_xticklabels(number_of_crimes.index,rotation=90)

时间属性

类别:时间序列
type(数据类型):object(需要转换)

统计各个属性的频次

print(data_train.DayOfWeek.value_counts())
Friday       133734
Wednesday    129211
Saturday     126810
Thursday     125038
Tuesday      124965
Monday       121584
Sunday       116707
Name: DayOfWeek, dtype: int64
crimeTime = data_train.DayOfWeek.value_counts()
ax = sns.barplot(x=crimeTime.index,y=crimeTime)
ax.set_xticklabels(crimeTime.index,rotation=90)

转换日期格式

data_train.Dates = pd.to_datetime(data_train.Dates,format='%Y/%m/%d %H:%M:%S')
print(data_train.Dates.dtypes)
datetime64[ns]

- 如何只留下年月日,去掉分秒

data_train['Dates'] = pd.to_datetime(data_train['Dates'])
data_train['Date'] = data_train['Dates'].dt.date
print(data_train.Date)
0    2015-05-13
1    2015-05-13
2    2015-05-13
3    2015-05-13
4    2015-05-13
Name: Date, dtype: object

写一个函数,方便转换测试集和训练集

def transformTimeDataset(dataset):dataset['Dates'] = pd.to_datetime(dataset['Dates'])dataset['Date'] = dataset['Dates'].dt.datedataset['n_days'] = (dataset['Date']-dataset['Date'].min()).apply(lambda x:x.days)dataset['Year'] = dataset['Dates'].dt.yeardataset['DayOfWeek'] = dataset['Dates'].dt.dayofweekdataset['WeekOfYear'] = dataset['Dates'].dt.weekofyeardataset['Month'] = dataset['Dates'].dt.monthdataset['Hour'] = dataset['Dates'].dt.hourdataset = dataset.drop('Dates',1)return dataset

转换

data_train = transformTimeDataset(data_train)
data_test = transformTimeDataset(data_test)
print(data_train.Date.head())
0    2015-05-13
1    2015-05-13
2    2015-05-13
3    2015-05-13
4    2015-05-13
Name: Date, dtype: object

地理信息

有几个街区,街区类别

print('number of PdDistrict is {}'.format(data_train.PdDistrict.nunique()))
print('Type:')
print(data_train.PdDistrict.value_counts().index)
number of PdDistrict is 10
Type:
Index(['SOUTHERN', 'MISSION', 'NORTHERN', 'BAYVIEW', 'CENTRAL', 'TENDERLOIN','INGLESIDE', 'TARAVAL', 'PARK', 'RICHMOND'],dtype='object')
给街区编热独码。
dataset = pd.get_dummies(data=dataset, columns=[ 'PdDistrict'], drop_first = True)
print(dataset)
PdDistrict_CENTRAL  PdDistrict_INGLESIDE  PdDistrict_MISSION  \
0       0                     0                   0
1       0                     0                   0
2       0                     0                   0
3       0                     0                   0
4       0                     0                   0   PdDistrict_NORTHERN  PdDistrict_PARK  PdDistrict_RICHMOND  \
0                    1                0                    0
1                    1                0                    0
2                    1                0                    0
3                    1                0                    0
4                    0                1                    0   PdDistrict_SOUTHERN  PdDistrict_TARAVAL  PdDistrict_TENDERLOIN
0                    0                   0                      0
1                    0                   0                      0
2                    0                   0                      0
3                    0                   0                      0
4                    0                   0                      0

地址

print('number of Address is {}'.format(data_train.Address.nunique()))
print('Type:')
print(data_train.Address.value_counts())
number of Address is 23228
Type:
800 Block of BRYANT ST                 26533
800 Block of MARKET ST                  6581
2000 Block of MISSION ST                5097
1000 Block of POTRERO AV                4063
900 Block of MARKET ST                  3251
0 Block of TURK ST                      3228
0 Block of 6TH ST                       2884
300 Block of ELLIS ST                   2703
400 Block of ELLIS ST                   2590
16TH ST / MISSION ST                    2504
1000 Block of MARKET ST                 2489
1100 Block of MARKET ST                 2319
2000 Block of MARKET ST                 2168
100 Block of OFARRELL ST                2140
700 Block of MARKET ST                  2081
3200 Block of 20TH AV                   2035
100 Block of 6TH ST                     1887
500 Block of JOHNFKENNEDY DR            1824
TURK ST / TAYLOR ST                     1810
200 Block of TURK ST                    1800
0 Block of PHELAN AV                    1791
0 Block of UNITEDNATIONS PZ             1789
0 Block of POWELL ST                    1717
100 Block of EDDY ST                    1681
1400 Block of PHELPS ST                 1629
300 Block of EDDY ST                    1589
100 Block of GOLDEN GATE AV             1353
100 Block of POWELL ST                  1333
200 Block of INTERSTATE80 HY            1316
MISSION ST / 16TH ST                    1300...
CRISP RD / QUESADA AV                      1
AGUA WY / TERESITA BL                      1
MUNICH ST / NAYLOR ST                      1
SPEAR ST / THE EMBARCADERO SOUTH ST        1
SENECA AV / BERTITA ST                     1
600 Block of ARTHUR AV                     1
MAJESTIC AV / SUMMIT ST                    1
FERNWOOD DR / BRENTWOOD AV                 1
PACHECO ST / GREAT HWY                     1
CLAREMONT BL / DORCHESTER WY               1
ARBOR ST / HILIRITAS AV                    1
23RD ST / SEVERN ST                        1
CERVANTES BL / BAY ST                      1
TELEGRAPH HILL BL / LOMBARD ST             1
300 Block of MATEO ST                      1
CABRILLO ST / 22ND AV                      1
WAWONA ST / 33RD AV                        1
CONRAD ST / POPPY LN                       1
16TH ST / SPENCER AL                       1
MARSILY ST / ST MARYS AV                   1
MOULTRIE ST / OGDEN AV                     1
1500 Block of BURROWS ST                   1
OGDEN AV / FOLSOM ST                       1
LAWTON ST / LOWER GREAT HY                 1
CHENERY ST / MIZPAH ST                     1
3RD ST / JAMES LICK FREEWAY HY             1
5THSTNORTH ST / ELLIS ST                   1
PARADISE AV / BURNSIDE AV                  1
PORTOLA DR / 15TH AV                       1
DE HARO ST / ALAMEDA ST                    1
Name: Address, Length: 23228, dtype: int64

这个操作太有意思了,小本本记下来
把文本里含有某个关键词的赋值1,其余赋值为0。

dataset['Block'] = dataset['Address'].str.contains('block', case=False)
dataset['Block'] = dataset['Block'].map(lambda x: 1 if x == True else 0)
print(dataset.Block)
0    0
1    0
2    0
3    1
4    1
Name: Block, dtype: int64

为了方便测试集和训练集的转换,写一个函数(其实可以把时间和地址的转换写一起,我只是为了方便看明白所以分开了)

def transformdGeoDataset(dataset):dataset['Block'] = dataset['Address'].str.contains('block', case=False)dataset['Block'] = dataset['Block'].map(lambda x: 1 if x == True else 0)dataset.drop('Address', 1)dataset = pd.get_dummies(data=dataset,columns='PdDistrict',drop_first=True)return dataset

转换测试集与训练集

data_train = transformdGeoDataset(data_train)
data_test = transformdGeoDataset(data_test)
print(data_train.head())

输出结果

         Category                      Descript  DayOfWeek      Resolution                    Address           X          Y        Date  n_days  Year  WeekOfYear  Month  Hour  Block  PdDistrict_CENTRAL  PdDistrict_INGLESIDE  PdDistrict_MISSION  PdDistrict_NORTHERN  PdDistrict_PARK  PdDistrict_RICHMOND  PdDistrict_SOUTHERN  PdDistrict_TARAVAL  PdDistrict_TENDERLOIN
0        WARRANTS                WARRANT ARREST          2  ARREST, BOOKED         OAK ST / LAGUNA ST -122.425892  37.774599  2015-05-13    4510  2015          20      5    23      0                   0                     0                   0                    1                0                    0                    0                   0                      0
1  OTHER OFFENSES      TRAFFIC VIOLATION ARREST          2  ARREST, BOOKED         OAK ST / LAGUNA ST -122.425892  37.774599  2015-05-13    4510  2015          20      5    23      0                   0                     0                   0                    1                0                    0                    0                   0                      0
2  OTHER OFFENSES      TRAFFIC VIOLATION ARREST          2  ARREST, BOOKED  VANNESS AV / GREENWICH ST -122.424363  37.800414  2015-05-13    4510  2015          20      5    23      0                   0                     0                   0                    1                0                    0                    0                   0                      0
3   LARCENY/THEFT  GRAND THEFT FROM LOCKED AUTO          2            NONE   1500 Block of LOMBARD ST -122.426995  37.800873  2015-05-13    4510  2015          20      5    23      1                   0                     0                   0                    1                0                    0                    0                   0                      0
4   LARCENY/THEFT  GRAND THEFT FROM LOCKED AUTO          2            NONE  100 Block of BRODERICK ST -122.438738  37.771541  2015-05-13    4510  2015          20      5    23      1                   0                     0                   0                    0                1                    0                    0                   0                      0

X,Y坐标点

写了一下午的又没了,一脸血泪,记得保存。

sns.pairplot(data_train[["X", "Y"]])

从图中可看出,y点有80的值较为离群。

sns.boxplot(data_train[["Y"]])


查看Y<80的X的分布

data_train = data_train[train_data["Y"] < 80]
sns.distplot(data_train[["X"]])

去掉一些属性

data_train = data_train.drop(["Descript", "Resolution","Address","Dates","Date"], axis = 1)
data_test = data_test.drop(["Address","Dates","Date"], axis = 1)

对Target(Category)进行编码

from sklearn.preprocessing import LabelEncoder
le = LabelEncoder()
data_train.Category = le.fit_transform(data_train.Category)

X与y

X = data_train.drop("Category",axis=1).values
y = data_train['Category'].valuesfrom sklearn.model_selection import train_test_split
X_train,X_test,y_train,y_test = train_test_split(X,y,test_size=0.1,random_state=42)

训练模型采用的数据均来自train.csv,把Train里的数据分为训练集和测试集。

决策树模型

from sklearn.tree import DecisionTreeClassifier
dtree = DecisionTreeClassifier()
dtree.fit(X_train,y_train)

预测与评估

from sklearn.metrics import classification_report,confusion_matrixcm = confusion_matrix(y_test,predictions)
fig,ax = plt.subplots(figsize=(20,20))
sns.heatmap(cm,annot=False,ax = ax)
ax.set_xlabel('Predicted labels')
ax.set_ylabel('True labels')
ax.set_title('Confusion Matrix')
plt.show()

precision,recall,f1-score,support

print(classification_report(y_test,predictions))
        precision    recall  f1-score   support0       0.04      0.04      0.04       1681       0.21      0.27      0.23      76322       0.00      0.00      0.00        413       0.00      0.00      0.00        284       0.14      0.14      0.14      37625       0.02      0.04      0.03       4096       0.02      0.03      0.02       2127       0.35      0.51      0.41      54378       0.01      0.01      0.01       4219       0.00      0.00      0.00        9110       0.00      0.00      0.00        2911       0.02      0.02      0.02        4412       0.11      0.13      0.12      104313       0.07      0.08      0.07      162114       0.07      0.07      0.07        1415       0.04      0.04      0.04       23216       0.38      0.34      0.36     1745517       0.06      0.07      0.06       19218       0.26      0.23      0.25       13819       0.49      0.57      0.53      256220       0.21      0.19      0.20      931121       0.25      0.23      0.24     1246422       0.00      0.00      0.00         123       0.59      0.55      0.57       77924       0.03      0.03      0.03       31125       0.08      0.07      0.07      228126       0.19      0.17      0.18       20427       0.00      0.00      0.00      103128       0.15      0.13      0.14       47829       0.00      0.00      0.00        1430       0.02      0.01      0.02       47031       0.03      0.02      0.02        4532       0.07      0.06      0.06      319733       0.00      0.00      0.00         034       0.03      0.02      0.03       76135       0.13      0.11      0.11      454036       0.43      0.46      0.44      529637       0.12      0.09      0.11      421638       0.10      0.08      0.09       875accuracy                           0.25     87805macro avg       0.12      0.12      0.12     87805
weighted avg       0.25      0.25      0.25     87805

随机森林

from sklearn.ensemble import RandomForestClassifier
rfc = RandomForestClassifier(n_estimators=40,min_samples_split=100)
rfc.fit(X_train,y_train)
print(rfc)
RandomForestClassifier(bootstrap=True, class_weight=None, criterion='gini',max_depth=None, max_features='auto', max_leaf_nodes=None,min_impurity_decrease=0.0, min_impurity_split=None,min_samples_leaf=1, min_samples_split=100,min_weight_fraction_leaf=0.0, n_estimators=40,n_jobs=None, oob_score=False, random_state=None,verbose=0, warm_start=False)

验证

rfc_pred = rfc.predict(X_test)
print(classification_report(y_test,rfc_pred))
              precision    recall  f1-score   support0       0.00      0.00      0.00       1681       0.20      0.18      0.19      76322       0.00      0.00      0.00        413       0.00      0.00      0.00        284       0.23      0.03      0.06      37625       0.18      0.02      0.04       4096       0.00      0.00      0.00       2127       0.34      0.43      0.38      54378       0.00      0.00      0.00       4219       0.00      0.00      0.00        9110       0.00      0.00      0.00        2911       0.00      0.00      0.00        4412       0.29      0.00      0.01      104313       0.28      0.01      0.01      162114       0.00      0.00      0.00        1415       0.00      0.00      0.00       23216       0.30      0.77      0.43     1745517       0.75      0.02      0.03       19218       1.00      0.04      0.08       13819       0.53      0.29      0.38      256220       0.24      0.15      0.18      931121       0.28      0.37      0.32     1246422       0.00      0.00      0.00         123       0.56      0.68      0.62       77924       0.00      0.00      0.00       31125       0.00      0.00      0.00      228126       0.77      0.11      0.20       20427       0.33      0.00      0.00      103128       0.00      0.00      0.00       47829       0.00      0.00      0.00        1430       0.00      0.00      0.00       47031       0.00      0.00      0.00        4532       0.00      0.00      0.00      319734       0.26      0.02      0.03       76135       0.24      0.01      0.02      454036       0.28      0.24      0.26      529637       0.28      0.00      0.01      421638       1.00      0.00      0.00       875accuracy                           0.29     87805macro avg       0.22      0.09      0.09     87805
weighted avg       0.27      0.29      0.23     87805

特征重要度

n_features = X.shape[1]#列
plt.barh(range(n_features),rfc.feature_importances_)
plt.yticks(np.arange(n_features),data_train.columns[1:])
plt.show()

n_features = X.shape[1]
print(n_features)
18

Submission(保存并且输出)

1.把Target的类型作为keys,其标签作为values

rfc_pred = rfc.predict(X_test)
keys = le.classes_
values = le.transform(le.classes_)
print(keys)

2.将Target的类别及进行特征工程后的值保存为字典

dictionary = dict(zip(keys,values))
print(dictionary)
{'ARSON': 0, 'ASSAULT': 1, 'BAD CHECKS': 2, 'BRIBERY': 3, 'BURGLARY': 4, 'DISORDERLY CONDUCT': 5, 'DRIVING UNDER THE INFLUENCE': 6, 'DRUG/NARCOTIC': 7, 'DRUNKENNESS': 8, 'EMBEZZLEMENT': 9, 'EXTORTION': 10, 'FAMILY OFFENSES': 11, 'FORGERY/COUNTERFEITING': 12, 'FRAUD': 13, 'GAMBLING': 14, 'KIDNAPPING': 15, 'LARCENY/THEFT': 16, 'LIQUOR LAWS': 17, 'LOITERING': 18, 'MISSING PERSON': 19, 'NON-CRIMINAL': 20, 'OTHER OFFENSES': 21, 'PORNOGRAPHY/OBSCENE MAT': 22, 'PROSTITUTION': 23, 'RECOVERED VEHICLE': 24, 'ROBBERY': 25, 'RUNAWAY': 26, 'SECONDARY CODES': 27, 'SEX OFFENSES FORCIBLE': 28, 'SEX OFFENSES NON FORCIBLE': 29, 'STOLEN PROPERTY': 30, 'SUICIDE': 31, 'SUSPICIOUS OCC': 32, 'TREA': 33, 'TRESPASS': 34, 'VANDALISM': 35, 'VEHICLE THEFT': 36, 'WARRANTS': 37, 'WEAPON LAWS': 38}

3.对测试集进行处理

查看测试集的信息

print(data_test.head())
 Id  DayOfWeek           X          Y  n_days  Year  WeekOfYear  Month  Hour  Block  PdDistrict_CENTRAL  PdDistrict_INGLESIDE  PdDistrict_MISSION  PdDistrict_NORTHERN  PdDistrict_PARK  PdDistrict_RICHMOND  PdDistrict_SOUTHERN  PdDistrict_TARAVAL  PdDistrict_TENDERLOIN
0   0          6 -122.399588  37.735051    4512  2015          19      5    23      1                   0                     0                   0                    0                0                    0                    0                   0                      0
1   1          6 -122.391523  37.732432    4512  2015          19      5    23      0                   0                     0                   0                    0                0                    0                    0                   0                      0
2   2          6 -122.426002  37.792212    4512  2015          19      5    23      1                   0                     0                   0                    1                0                    0                    0                   0                      0
3   3          6 -122.437394  37.721412    4512  2015          19      5    23      1                   0                     1                   0                    0                0                    0                    0                   0                      0
4   4          6 -122.437394  37.721412    4512  2015          19      5    23      1                   0                     1                   0                    0                0                    0                    0                   0                      0

去掉多余特征‘Id’

data_test = data_test.drop('Id',axis=1)

用前面训练好的模型rfc对测试集进行预测

y_pred_prob = rfc.predict_proba(data_test)
print(y_pred_prob)
[[0.00220038 0.14332224 0.         ... 0.12182985 0.04372455 0.02692804][0.0003572  0.0505187  0.         ... 0.07283049 0.06742673 0.03087049][0.0040157  0.08378697 0.         ... 0.06877263 0.02119305 0.00594818]...[0.00373092 0.12110726 0.00178571 ... 0.22416137 0.03237321 0.00850585][0.0169558  0.15710357 0.00148783 ... 0.12023433 0.05170328 0.00645309][0.00309621 0.06249053 0.015625   ... 0.17716719 0.03481506 0.00728016]]

好气啊又没有了。

封装结果

results = pd.DataFrame(y_pred_prob)
results.columns = keys
results.to_csv(path_or_buf="rfc_predict_4.csv",index=True, index_label = 'Id')

芝加哥犯罪率数据集(数据分析与特征处理)相关推荐

  1. 一键数据分析自动化特征工程!

    创造新的特征是一件十分困难的事情,需要丰富的专业知识和大量的时间.机器学习应用的本质基本上就是特征工程.--Andrew Ng 业内常说数据决定了模型效果上限,而机器学习算法是通过数据特征做出预测的, ...

  2. ML之FE:对爬取的某平台二手房数据进行数据分析以及特征工程处理

    ML之FE:对爬取的某平台二手房数据进行数据分析以及特征工程处理 目录 对爬取的某平台二手房数据进行数据分析以及特征工程处理 1.定义数据集 2.特征工程(数据分析+数据处理) 对爬取的某平台二手房数 ...

  3. ML之DataScience:基于机器学习处理数据科学(DataScience)任务(数据分析、特征工程、科学预测等)的简介、流程、案例应用执行详细攻略

    ML之DataScience:基于机器学习处理数据科学(DataScience)任务(数据分析.特征工程.科学预测等)的简介.流程.案例应用执行详细攻略 目录 数据科学的任务(数据分析.特征工程.科学 ...

  4. python sklearn 归一化_数据分析|Python特征工程(5)

    OX00 引言 数据和特征决定了机器学习的上限,而模型和算法只是逼近这个上限而已.由此可见,特征工程在机器学习中占有相当重要的地位.在实际应用当中,可以说特征工程是机器学习成功的关键. 特征做不好,调 ...

  5. 机器学习核心总结-概念、线性回归、损失函数、泛化及数据集划分、特征工程、逻辑回归和分类

    文章目录 一.机器学习入门概念 一.基本概念 机器学习:让机器进行学习和决策 机器学习分类:无监督学习.监督学习.强化学习 深度学习:模拟人脑,自动提取输入特征,是实现机器学习的方式之一 神经网络:一 ...

  6. python数据分析及特征工程(实战)

    python数据分析及特征工程(实战) 1.数据分析 1.1单属性分析 1.1.1 异常值分析 1.1.2 分布分析 1.1.3 对比分析 1.1.4 结构分析 1.2多属性分析 1.2.1假设检验 ...

  7. 机器学习之数据分析与特征工程

    通过七月在线的限免课程,学习了数据分析与特征工程,记录一下学习的过程供日后回顾 问题与建模 首先需要明确要解决的问题:回归?分类?根据要解决的问题进行建模. 建模流程为:识别问题,理解数据,数据预处理 ...

  8. ML之FE:对人类性别相关属性数据集进行数据特征分布可视化分析与挖掘

    ML之FE:对人类性别相关属性数据集进行数据特征分布可视化分析与挖掘 目录 对人类性别相关属性数据集进行数据特征分布可视化分析与挖掘 输出结果 实现代码 对人类性别相关属性数据集进行数据特征分布可视化 ...

  9. OpenCV对图片数据集提取HOG特征并用SVM进行识别

    OpenCV对图片数据集提取HOG特征并用SVM进行识别 代码编程环境 设置图片大小源代码 提取HOG源代码 SVM识别源代码 实验结果 代码编程环境 Windows系统: OpenCV3.4.1: ...

最新文章

  1. 自动类型转换和强制类型转换
  2. 浙大计算机知识基础,计算机基础知识题浙大远程
  3. IOS开发基础知识--碎片39
  4. 一个无法捕获ADO.NET Dataset的内存错误
  5. 浅析 Mybatis 与 Hibernate 的区别与用途
  6. Replace Pioneer
  7. 谷歌+安卓,他已经改变了世界两次,但还想多来几次
  8. html+css+dom补充
  9. Quartz-第三篇 quartz-misfire 错失,补偿执行
  10. 使用二维码识别技术的好处_二维码门禁系统,是如何实现解密开锁的呢?
  11. Python 计时器倒计时弹窗提醒
  12. Leetcode第904题
  13. 初学者吉他怎么选?适合男女生新手吉他入门品牌推荐!
  14. _EPROCESS断链 —— 实现进程内核隐藏
  15. 计算机组成原理唐朔飞第二版答案第六章,计算机组成原理第六章部分课后题答案(唐朔飞版)...
  16. 解决:navicat连接mysql报错10060
  17. u-boot-1.3.4 移植到S3C2440 (带有某些解析)
  18. java技术栈 高清视频
  19. EDA,该如何做这困兽之斗
  20. 教师资格证报名显示内核服务器错误,中小学教师资格证报考支付卡状态失效是怎么回事?..._教师资格考试_帮考网...

热门文章

  1. python矩阵和向量乘法总结
  2. 企业流程篇--项目管理(七)
  3. android 4.4以上sd卡,怎样无根绕过Android 4.4(KitKat)外部SD卡限制
  4. cmd显示服务器对区域没有权威,查询dns解析服务器地址cmd命令
  5. 3DTouch简单实现
  6. 交通指示牌的特征匹配代码
  7. 网站信息的采集系列(一)--基本流程
  8. pthread_cond_destroy死锁卡住问题处理记录
  9. 微信小程序获取启动参数
  10. sicily 9562 SUME