Deep Learning Models for Patent Classification

Continuing the study of the patent classification code at:
https://github.com/newzhoujian/LCASPatentClassification

Deep learning models (seven in total)
Word2Vec+ANN.py (artificial neural network)
Word2Vec+ATT.py
Word2Vec+GRU.py
Word2Vec+BiGRU.py
Word2Vec+TextCNN.py
Word2Vec+BiGRU+TextCNN.py
Word2Vec+BiGRU+ATT+TextCNN.py

Data

Train / validation / test split

X_train, X_val, X_test = X[:50000], X[50000:60000], X[60000:]
y_train, y_val, y_test = y[:50000], y[50000:60000], y[60000:]

Environment setup

os.environ["CUDA_VISIBLE_DEVICES"] = "0"                  # use the first GPU only
config = tf.ConfigProto(allow_soft_placement=True)        # fall back to CPU when an op has no GPU kernel
config.gpu_options.per_process_gpu_memory_fraction = 1.   # let the process use all of the GPU memory
sess = tf.Session(config=config)
KTF.set_session(sess)                                     # register the session with Keras (TF 1.x API)

Hyperparameter settings

sent_maxlen = 200   # maximum sequence length (tokens per document)
word_size = 300     # word-vector dimensionality
sent_size = 300
sess_size = 200
batch_size = 200

Model training

How do word2vec word vectors relate to the neural network's embedding layer?
Reference:
https://blog.csdn.net/anshuai_aw1/article/details/84329337
In short, the network's embedding layer can be used to load a pre-trained word-embedding model, as sketched below.
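All of the snippets below rely on an embedding matrix called word2vec_metrix, which is built in the repository's preprocessing step and not shown in this post. The following is a minimal sketch of how such a matrix could be assembled, assuming a gensim Word2Vec model and a Keras Tokenizer; the file name 'word2vec.model' and the variable texts are illustrative assumptions, not code from the repository.

import numpy as np
from gensim.models import Word2Vec
from keras.preprocessing.text import Tokenizer

# Build the word -> integer-id mapping used to encode the documents
tokenizer = Tokenizer()
tokenizer.fit_on_texts(texts)              # texts: preprocessed patent texts (assumed)
word_index = tokenizer.word_index

# Copy each known word's vector into row idx of the embedding matrix;
# row 0 stays all-zero and is reserved for padding
w2v = Word2Vec.load('word2vec.model')      # hypothetical path to the trained model
word2vec_metrix = np.zeros((len(word_index) + 1, 300))
for word, idx in word_index.items():
    if word in w2v.wv:
        word2vec_metrix[idx] = w2v.wv[word]

# The matrix is then frozen inside the network's Embedding layer:
# Embedding(len(word2vec_metrix), 300, weights=[word2vec_metrix],
#           input_length=sent_maxlen, trainable=False)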

1 ANN (artificial neural network)

Adapted from the blog post: https://blog.csdn.net/wuyanxue/article/details/79980904

with tf.device('/gpu:0'):
    cnt = 60000
    # Initialize a Keras tensor
    model_input = Input(shape=(sent_maxlen,))
    # The embedding layer is loaded with the pre-trained word vectors
    # input_dim: vocabulary size of the text data
    # output_dim: dimensionality of the layer's output vectors
    # input_length: length of the input sequences
    wordembed = Embedding(len(word2vec_metrix), 300, weights=[word2vec_metrix], input_length=200, trainable=False)(model_input)
    # Flatten(): collapse to a one-dimensional vector
    wordembed = Flatten()(wordembed)
    # Fully connected layer
    wordembed = Dense(600, activation='relu')(wordembed)
    # Dropout to reduce overfitting
    conv_out = Dropout(0.5)(wordembed)
    # Fully connected output layer
    model_output = Dense(8, activation='softmax')(conv_out)
    model = Model(inputs=model_input, outputs=model_output)
    # Print a per-layer parameter summary
    model.summary()
    checkpoint = ModelCheckpoint('../../PCmodel/Word2VecANN.h5', monitor='val_categorical_accuracy', verbose=1, save_best_only=True, mode='max')
    model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['categorical_accuracy'])
    history = model.fit(X_train, y_train, batch_size=200, epochs=20, validation_data=[X_val, y_val], callbacks=[checkpoint])
    model.save('../../PCmodel/Word2VecANN.h5')
    y_pred = model.predict(X_test)
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
input_3 (InputLayer)         (None, 200)               0
_________________________________________________________________
embedding_2 (Embedding)      (None, 200, 300)          69231300
_________________________________________________________________
flatten_2 (Flatten)          (None, 60000)             0
_________________________________________________________________
dense_3 (Dense)              (None, 600)               36000600
_________________________________________________________________
dropout_2 (Dropout)          (None, 600)               0
_________________________________________________________________
dense_4 (Dense)              (None, 8)                 4808
=================================================================
Total params: 105,236,708
Trainable params: 36,005,408
Non-trainable params: 69,231,300

macro_precision: 0.3586128473581196
macro_recall: 0.3232228788937765
macro_f1: 0.3312381715366206

2 ATT

with tf.device('/gpu:0'):
    cnt = 60000
    model_input = Input(shape=(sent_maxlen,))
    wordembed = Embedding(len(word2vec_metrix), 300, weights=[word2vec_metrix], input_length=200, trainable=False)(model_input)
    # attention
    attention_pre = Dense(300)(wordembed)
    attention_probs = Softmax()(attention_pre)
    attention_mul = Lambda(lambda x: x[0] * x[1])([attention_probs, wordembed])
    attention_mul = Dropout(0.5)(attention_mul)
    attention_mul = Flatten()(attention_mul)
    conv_out = Dropout(0.5)(attention_mul)
    model_output = Dense(8, activation='softmax')(conv_out)
    model = Model(inputs=model_input, outputs=model_output)
    model.summary()
    checkpoint = ModelCheckpoint('../../PCmodel/Word2VecATT.h5', monitor='val_categorical_accuracy', verbose=1, save_best_only=True, mode='max')
    model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['categorical_accuracy'])
    history = model.fit(X_train, y_train, batch_size=200, epochs=70, validation_data=[X_val, y_val], callbacks=[checkpoint])
    model.save('../../PCmodel/Word2VecATT.h5')
    y_pred = model.predict(X_test)
Layer (type)                    Output Shape         Param #     Connected to
==================================================================================================
input_4 (InputLayer)            (None, 200)          0
__________________________________________________________________________________________________
embedding_3 (Embedding)         (None, 200, 300)     69231300    input_4[0][0]
__________________________________________________________________________________________________
dense_5 (Dense)                 (None, 200, 300)     90300       embedding_3[0][0]
__________________________________________________________________________________________________
softmax_1 (Softmax)             (None, 200, 300)     0           dense_5[0][0]
__________________________________________________________________________________________________
lambda_1 (Lambda)               (None, 200, 300)     0           softmax_1[0][0]                  embedding_3[0][0]
__________________________________________________________________________________________________
dropout_3 (Dropout)             (None, 200, 300)     0           lambda_1[0][0]
__________________________________________________________________________________________________
flatten_3 (Flatten)             (None, 60000)        0           dropout_3[0][0]
__________________________________________________________________________________________________
dropout_4 (Dropout)             (None, 60000)        0           flatten_3[0][0]
__________________________________________________________________________________________________
dense_6 (Dense)                 (None, 8)            480008      dropout_4[0][0]
==================================================================================================
Total params: 69,801,608
Trainable params: 570,308
Non-trainable params: 69,231,300

macro_precision: 0.6013798296383913
macro_recall: 0.6010459238438004
macro_f1: 0.6002507803210585

3 GRU

with tf.device('/gpu:0'):
    cnt = 60000
    model_input = Input(shape=(sent_maxlen,))
    wordembed = Embedding(len(word2vec_metrix), 300, weights=[word2vec_metrix], input_length=200, trainable=False)(model_input)
    sen2vec = GRU(300, activation='tanh', return_sequences=False)(wordembed)
    conv_out = Dropout(0.5)(sen2vec)
    model_output = Dense(8, activation='softmax')(conv_out)
    model = Model(inputs=model_input, outputs=model_output)
    model.summary()
    checkpoint = ModelCheckpoint('../../PCmodel/Word2VecGRU.h5', monitor='val_categorical_accuracy', verbose=1, save_best_only=True, mode='max')
    model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['categorical_accuracy'])
    history = model.fit(X_train, y_train, batch_size=200, epochs=20, validation_data=[X_val, y_val], callbacks=[checkpoint])
    model.save('../../PCmodel/Word2VecGRU.h5')
    y_pred = model.predict(X_test)
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
input_5 (InputLayer)         (None, 200)               0
_________________________________________________________________
embedding_4 (Embedding)      (None, 200, 300)          69231300
_________________________________________________________________
gru_1 (GRU)                  (None, 300)               540900
_________________________________________________________________
dropout_5 (Dropout)          (None, 300)               0
_________________________________________________________________
dense_7 (Dense)              (None, 8)                 2408
=================================================================
Total params: 69,774,608
Trainable params: 543,308
Non-trainable params: 69,231,300

macro_precision: 0.6771869219680868
macro_recall: 0.6748016800950041
macro_f1: 0.6752740405416235

4 BiGRU

with tf.device('/gpu:0'):
    cnt = 60000
    model_input = Input(shape=(sent_maxlen,))
    wordembed = Embedding(len(word2vec_metrix), 300, weights=[word2vec_metrix], input_length=200, trainable=False)(model_input)
    sen2vec = Bidirectional(GRU(300, activation='tanh', return_sequences=False))(wordembed)
    conv_out = Dropout(0.5)(sen2vec)
    model_output = Dense(8, activation='softmax')(conv_out)
    model = Model(inputs=model_input, outputs=model_output)
    model.summary()
    checkpoint = ModelCheckpoint('../../PCmodel/Word2VecBiGRU.h5', monitor='val_categorical_accuracy', verbose=1, save_best_only=True, mode='max')
    model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['categorical_accuracy'])
    history = model.fit(X_train, y_train, batch_size=200, epochs=20, validation_data=[X_val, y_val], callbacks=[checkpoint])
    model.save('../../PCmodel/Word2VecBiGRU.h5')
    y_pred = model.predict(X_test)
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
input_6 (InputLayer)         (None, 200)               0
_________________________________________________________________
embedding_5 (Embedding)      (None, 200, 300)          69231300
_________________________________________________________________
bidirectional_1 (Bidirection (None, 600)               1081800
_________________________________________________________________
dropout_6 (Dropout)          (None, 600)               0
_________________________________________________________________
dense_8 (Dense)              (None, 8)                 4808
=================================================================
Total params: 70,317,908
Trainable params: 1,086,608
Non-trainable params: 69,231,300

macro_precision: 0.6613445594141754
macro_recall: 0.6588297751045578
macro_f1: 0.6594757746759219

5 TextCNN

with tf.device('/gpu:0'):
    cnt = 60000
    model_input = Input(shape=(sent_maxlen,))
    wordembed = Embedding(len(word2vec_metrix), 300, weights=[word2vec_metrix], input_length=200, trainable=False)(model_input)
    # sen2vec = Bidirectional(GRU(300, activation='tanh', return_sequences=True))(wordembed)
    convs = []
    filter_size = [2, 3, 4, 5]
    for i in filter_size:
        conv_layer = Conv1D(filters=300, kernel_size=i)(wordembed)
        conv_layer = BatchNormalization()(conv_layer)
        conv_layer = Activation('relu')(conv_layer)
        pool_layer = MaxPooling1D(200 - i + 1, 1)(conv_layer)
        pool_layer = Flatten()(pool_layer)
        convs.append(pool_layer)
    conv_out = concatenate(convs, axis=1)
    conv_out = Dropout(0.5)(conv_out)
    model_output = Dense(8, activation='softmax')(conv_out)
    model = Model(inputs=model_input, outputs=model_output)
    model.summary()
    checkpoint = ModelCheckpoint('../../PCmodel/Word2VecTextCNN.h5', monitor='val_categorical_accuracy', verbose=1, save_best_only=True, mode='max')
    model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['categorical_accuracy'])
    history = model.fit(X_train, y_train, batch_size=200, epochs=15, validation_data=[X_val, y_val], callbacks=[checkpoint])
    model.save('../../PCmodel/Word2VecTextCNN.h5')
    y_pred = model.predict(X_test)
__________________________________________________________________________________________________
Layer (type)                    Output Shape         Param #     Connected to
==================================================================================================
input_8 (InputLayer)            (None, 200)          0
__________________________________________________________________________________________________
embedding_7 (Embedding)         (None, 200, 300)     69231300    input_8[0][0]
__________________________________________________________________________________________________
conv1d_5 (Conv1D)               (None, 199, 300)     180300      embedding_7[0][0]
__________________________________________________________________________________________________
conv1d_6 (Conv1D)               (None, 198, 300)     270300      embedding_7[0][0]
__________________________________________________________________________________________________
conv1d_7 (Conv1D)               (None, 197, 300)     360300      embedding_7[0][0]
__________________________________________________________________________________________________
conv1d_8 (Conv1D)               (None, 196, 300)     450300      embedding_7[0][0]
__________________________________________________________________________________________________
batch_normalization_5 (BatchNor (None, 199, 300)     1200        conv1d_5[0][0]
__________________________________________________________________________________________________
batch_normalization_6 (BatchNor (None, 198, 300)     1200        conv1d_6[0][0]
__________________________________________________________________________________________________
batch_normalization_7 (BatchNor (None, 197, 300)     1200        conv1d_7[0][0]
__________________________________________________________________________________________________
batch_normalization_8 (BatchNor (None, 196, 300)     1200        conv1d_8[0][0]
__________________________________________________________________________________________________
activation_5 (Activation)       (None, 199, 300)     0           batch_normalization_5[0][0]
__________________________________________________________________________________________________
activation_6 (Activation)       (None, 198, 300)     0           batch_normalization_6[0][0]
__________________________________________________________________________________________________
activation_7 (Activation)       (None, 197, 300)     0           batch_normalization_7[0][0]
__________________________________________________________________________________________________
activation_8 (Activation)       (None, 196, 300)     0           batch_normalization_8[0][0]
__________________________________________________________________________________________________
max_pooling1d_5 (MaxPooling1D)  (None, 1, 300)       0           activation_5[0][0]
__________________________________________________________________________________________________
max_pooling1d_6 (MaxPooling1D)  (None, 1, 300)       0           activation_6[0][0]
__________________________________________________________________________________________________
max_pooling1d_7 (MaxPooling1D)  (None, 1, 300)       0           activation_7[0][0]
__________________________________________________________________________________________________
max_pooling1d_8 (MaxPooling1D)  (None, 1, 300)       0           activation_8[0][0]
__________________________________________________________________________________________________
flatten_8 (Flatten)             (None, 300)          0           max_pooling1d_5[0][0]
__________________________________________________________________________________________________
flatten_9 (Flatten)             (None, 300)          0           max_pooling1d_6[0][0]
__________________________________________________________________________________________________
flatten_10 (Flatten)            (None, 300)          0           max_pooling1d_7[0][0]
__________________________________________________________________________________________________
flatten_11 (Flatten)            (None, 300)          0           max_pooling1d_8[0][0]
__________________________________________________________________________________________________
concatenate_2 (Concatenate)     (None, 1200)         0           flatten_8[0][0]                  flatten_9[0][0]                  flatten_10[0][0]                 flatten_11[0][0]
__________________________________________________________________________________________________
dropout_8 (Dropout)             (None, 1200)         0           concatenate_2[0][0]
__________________________________________________________________________________________________
dense_10 (Dense)                (None, 8)            9608        dropout_8[0][0]
==================================================================================================
Total params: 70,506,908
Trainable params: 1,273,208
Non-trainable params: 69,233,700

macro_precision: 0.6755142868067746
macro_recall: 0.6721758488510332
macro_f1: 0.670720748929639

6 BiGRU + TextCNN

with tf.device('/gpu:0'):
    cnt = 60000
    model_input = Input(shape=(sent_maxlen,))
    wordembed = Embedding(len(word2vec_metrix), 300, weights=[word2vec_metrix], input_length=200, trainable=False)(model_input)
    sen2vec = Bidirectional(GRU(300, activation='tanh', return_sequences=True))(wordembed)
    convs = []
    filter_size = [2, 3, 4, 5]
    for i in filter_size:
        conv_layer = Conv1D(filters=300, kernel_size=i)(sen2vec)
        conv_layer = BatchNormalization()(conv_layer)
        conv_layer = Activation('relu')(conv_layer)
        pool_layer = MaxPooling1D(200 - i + 1, 1)(conv_layer)
        pool_layer = Flatten()(pool_layer)
        convs.append(pool_layer)
    conv_out = concatenate(convs, axis=1)
    conv_out = Dropout(0.5)(conv_out)
    model_output = Dense(8, activation='softmax')(conv_out)
    model = Model(inputs=model_input, outputs=model_output)
    model.summary()
    checkpoint = ModelCheckpoint('../../PCmodel/Word2VecBiGRUTextCNN.h5', monitor='val_categorical_accuracy', verbose=1, save_best_only=True, mode='max')
    model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['categorical_accuracy'])
    history = model.fit(X_train, y_train, batch_size=200, epochs=15, validation_data=[X_val, y_val], callbacks=[checkpoint])
    model.save('../../PCmodel/Word2VecBiGRUTextCNN.h5')
    y_pred = model.predict(X_test)
__________________________________________________________________________________________________
Layer (type)                    Output Shape         Param #     Connected to
==================================================================================================
input_7 (InputLayer)            (None, 200)          0
__________________________________________________________________________________________________
embedding_6 (Embedding)         (None, 200, 300)     69231300    input_7[0][0]
__________________________________________________________________________________________________
bidirectional_2 (Bidirectional) (None, 200, 600)     1081800     embedding_6[0][0]
__________________________________________________________________________________________________
conv1d_1 (Conv1D)               (None, 199, 300)     360300      bidirectional_2[0][0]
__________________________________________________________________________________________________
conv1d_2 (Conv1D)               (None, 198, 300)     540300      bidirectional_2[0][0]
__________________________________________________________________________________________________
conv1d_3 (Conv1D)               (None, 197, 300)     720300      bidirectional_2[0][0]
__________________________________________________________________________________________________
conv1d_4 (Conv1D)               (None, 196, 300)     900300      bidirectional_2[0][0]
__________________________________________________________________________________________________
batch_normalization_1 (BatchNor (None, 199, 300)     1200        conv1d_1[0][0]
__________________________________________________________________________________________________
batch_normalization_2 (BatchNor (None, 198, 300)     1200        conv1d_2[0][0]
__________________________________________________________________________________________________
batch_normalization_3 (BatchNor (None, 197, 300)     1200        conv1d_3[0][0]
__________________________________________________________________________________________________
batch_normalization_4 (BatchNor (None, 196, 300)     1200        conv1d_4[0][0]
__________________________________________________________________________________________________
activation_1 (Activation)       (None, 199, 300)     0           batch_normalization_1[0][0]
__________________________________________________________________________________________________
activation_2 (Activation)       (None, 198, 300)     0           batch_normalization_2[0][0]
__________________________________________________________________________________________________
activation_3 (Activation)       (None, 197, 300)     0           batch_normalization_3[0][0]
__________________________________________________________________________________________________
activation_4 (Activation)       (None, 196, 300)     0           batch_normalization_4[0][0]
__________________________________________________________________________________________________
max_pooling1d_1 (MaxPooling1D)  (None, 1, 300)       0           activation_1[0][0]
__________________________________________________________________________________________________
max_pooling1d_2 (MaxPooling1D)  (None, 1, 300)       0           activation_2[0][0]
__________________________________________________________________________________________________
max_pooling1d_3 (MaxPooling1D)  (None, 1, 300)       0           activation_3[0][0]
__________________________________________________________________________________________________
max_pooling1d_4 (MaxPooling1D)  (None, 1, 300)       0           activation_4[0][0]
__________________________________________________________________________________________________
flatten_4 (Flatten)             (None, 300)          0           max_pooling1d_1[0][0]
__________________________________________________________________________________________________
flatten_5 (Flatten)             (None, 300)          0           max_pooling1d_2[0][0]
__________________________________________________________________________________________________
flatten_6 (Flatten)             (None, 300)          0           max_pooling1d_3[0][0]
__________________________________________________________________________________________________
flatten_7 (Flatten)             (None, 300)          0           max_pooling1d_4[0][0]
__________________________________________________________________________________________________
concatenate_1 (Concatenate)     (None, 1200)         0           flatten_4[0][0]                  flatten_5[0][0]                  flatten_6[0][0]                  flatten_7[0][0]
__________________________________________________________________________________________________
dropout_7 (Dropout)             (None, 1200)         0           concatenate_1[0][0]
__________________________________________________________________________________________________
dense_9 (Dense)                 (None, 8)            9608        dropout_7[0][0]
==================================================================================================
Total params: 72,848,708
Trainable params: 3,615,008
Non-trainable params: 69,233,700

macro_precision: 0.678810654617721
macro_recall: 0.6628354708407058
macro_f1: 0.6670883069148354

7 BiGRU + ATT + TextCNN

with tf.device('/gpu:0'):
    cnt = 60000
    model_input = Input(shape=(sent_maxlen,))
    wordembed = Embedding(len(word2vec_metrix), 300, weights=[word2vec_metrix], input_length=200, trainable=False)(model_input)
    sen2vec = Bidirectional(GRU(300, activation='tanh', return_sequences=True))(wordembed)
    attention_pre = Dense(600)(sen2vec)
    attention_probs = Softmax()(attention_pre)
    attention_mul = Lambda(lambda x: x[0] * x[1])([attention_probs, sen2vec])
    attention_mul = Dropout(0.5)(attention_mul)
    convs = []
    filter_size = [7, 8, 9]
    for i in filter_size:
        conv_layer = Conv1D(filters=312, kernel_size=i)(attention_mul)
        conv_layer = BatchNormalization()(conv_layer)
        conv_layer = Activation('relu')(conv_layer)
        pool_layer = MaxPooling1D(200 - i + 1, 1)(conv_layer)
        pool_layer = Flatten()(pool_layer)
        convs.append(pool_layer)
    conv_out = concatenate(convs, axis=1)
    conv_out = Dropout(0.5)(conv_out)
    model_output = Dense(8, activation='softmax')(conv_out)
    model = Model(inputs=model_input, outputs=model_output)
    model.summary()
    checkpoint = ModelCheckpoint('../../PCmodel/Word2VecBiGRUATTTextCNN.h5', monitor='val_categorical_accuracy', verbose=1, save_best_only=True, mode='max')
    model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['categorical_accuracy'])
    history = model.fit(X_train, y_train, batch_size=200, epochs=15, validation_data=[X_val, y_val], callbacks=[checkpoint])
    model.save('../../PCmodel/Word2VecBiGRUATTTextCNN.h5')
    y_pred = model.predict(X_test)
__________________________________________________________________________________________________
Layer (type)                    Output Shape         Param #     Connected to
==================================================================================================
input_9 (InputLayer)            (None, 200)          0
__________________________________________________________________________________________________
embedding_8 (Embedding)         (None, 200, 300)     69231300    input_9[0][0]
__________________________________________________________________________________________________
bidirectional_3 (Bidirectional) (None, 200, 600)     1081800     embedding_8[0][0]
__________________________________________________________________________________________________
dense_11 (Dense)                (None, 200, 600)     360600      bidirectional_3[0][0]
__________________________________________________________________________________________________
softmax_2 (Softmax)             (None, 200, 600)     0           dense_11[0][0]
__________________________________________________________________________________________________
lambda_2 (Lambda)               (None, 200, 600)     0           softmax_2[0][0]                  bidirectional_3[0][0]
__________________________________________________________________________________________________
dropout_9 (Dropout)             (None, 200, 600)     0           lambda_2[0][0]
__________________________________________________________________________________________________
conv1d_9 (Conv1D)               (None, 194, 312)     1310712     dropout_9[0][0]
__________________________________________________________________________________________________
conv1d_10 (Conv1D)              (None, 193, 312)     1497912     dropout_9[0][0]
__________________________________________________________________________________________________
conv1d_11 (Conv1D)              (None, 192, 312)     1685112     dropout_9[0][0]
__________________________________________________________________________________________________
batch_normalization_9 (BatchNor (None, 194, 312)     1248        conv1d_9[0][0]
__________________________________________________________________________________________________
batch_normalization_10 (BatchNo (None, 193, 312)     1248        conv1d_10[0][0]
__________________________________________________________________________________________________
batch_normalization_11 (BatchNo (None, 192, 312)     1248        conv1d_11[0][0]
__________________________________________________________________________________________________
activation_9 (Activation)       (None, 194, 312)     0           batch_normalization_9[0][0]
__________________________________________________________________________________________________
activation_10 (Activation)      (None, 193, 312)     0           batch_normalization_10[0][0]
__________________________________________________________________________________________________
activation_11 (Activation)      (None, 192, 312)     0           batch_normalization_11[0][0]
__________________________________________________________________________________________________
max_pooling1d_9 (MaxPooling1D)  (None, 1, 312)       0           activation_9[0][0]
__________________________________________________________________________________________________
max_pooling1d_10 (MaxPooling1D) (None, 1, 312)       0           activation_10[0][0]
__________________________________________________________________________________________________
max_pooling1d_11 (MaxPooling1D) (None, 1, 312)       0           activation_11[0][0]
__________________________________________________________________________________________________
flatten_12 (Flatten)            (None, 312)          0           max_pooling1d_9[0][0]
__________________________________________________________________________________________________
flatten_13 (Flatten)            (None, 312)          0           max_pooling1d_10[0][0]
__________________________________________________________________________________________________
flatten_14 (Flatten)            (None, 312)          0           max_pooling1d_11[0][0]
__________________________________________________________________________________________________
concatenate_3 (Concatenate)     (None, 936)          0           flatten_12[0][0]                 flatten_13[0][0]                 flatten_14[0][0]
__________________________________________________________________________________________________
dropout_10 (Dropout)            (None, 936)          0           concatenate_3[0][0]
__________________________________________________________________________________________________
dense_12 (Dense)                (None, 8)            7496        dropout_10[0][0]
==================================================================================================
Total params: 75,178,676
Trainable params: 5,945,504
Non-trainable params: 69,233,172

Epoch 00015: val_categorical_accuracy did not improve from 0.68530
macro_precision: 0.6969946770346792
macro_recall: 0.6918944837355078
macro_f1: 0.6927655522710012

Evaluation

from sklearn import metrics

y1 = [np.argmax(i) for i in y_test]
y2 = [np.argmax(i) for i in y_pred]
print('macro_precision:\t', end='')
print(metrics.precision_score(y1, y2, average='macro'))
print('macro_recall:\t\t', end='')
print(metrics.recall_score(y1, y2, average='macro'))
print('macro_f1:\t\t', end='')
print(metrics.f1_score(y1, y2, average='macro'))
Model               precision   recall    f1
ANN                 35.86%      32.32%    33.12%
ATT                 60.14%      60.10%    60.03%
GRU                 67.72%      67.48%    67.53%
BiGRU               66.13%      65.88%    65.94%
TextCNN             67.55%      67.22%    67.07%
BiGRU+TextCNN       67.88%      66.28%    66.71%
BiGRU+ATT+TextCNN   69.70%      69.19%    69.28%
