Table of Contents

  • Preface
  • The overall ideas of DIN and DIEN
  • Shortcomings of DIN's interest modeling
  • DIEN's approach to interest modeling
  • Architecture
  • Behavior Layer
  • Interest Extractor Layer
    • Auxiliary loss
  • Interest Evolving Layer
    • Characteristics of interest evolution
    • Purpose
    • Computing the attention scores
    • How DIN combines attention scores
    • How DIEN combines attention scores (AUGRU)
  • TF2 implementation

Preface

Since DIEN combines a GRU with a recommendation network, please see this earlier post for the sequence-model background:

https://blog.csdn.net/qq_42363032/article/details/122042268?spm=1001.2014.3001.5501

The overall ideas of DIN and DIEN


Shortcomings of DIN's interest modeling
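
DIN attends directly over the raw historical behaviors and treats them as an unordered set: it takes behaviors themselves as interests and ignores their temporal order, so it can capture neither the latent interest behind the behaviors nor the way that interest drifts over time.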


DIEN's approach to interest modeling
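
DIEN addresses this in two steps: an Interest Extractor Layer runs a GRU over the behavior sequence to distill a latent interest state at every step (supervised by an auxiliary loss), and an Interest Evolving Layer uses an attention-equipped GRU (AUGRU) to model how only the interests related to the target ad evolve over time.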

Architecture

Behavior Layer

This layer is essentially just an embedding layer: it embeds the user behavior sequence, and the result is fed into the model together with the embeddings of the other features.
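
As a minimal sketch (the vocabulary size of 1000 and the embedding dimension of 4 are placeholder values; the padding convention follows the full implementation later in this post), the layer boils down to a shared `layers.Embedding` with `mask_zero=True`, plus an explicit 0/1 mask:

import tensorflow as tf
from tensorflow.keras import layers

# Minimal sketch of the Behavior Layer: one shared embedding table for the
# behavior sequence and the candidate ad; id 0 is reserved for padding.
seq_emb = layers.Embedding(input_dim=1000 + 1, output_dim=4, mask_zero=True)

behaviors = tf.constant([[3, 17, 42, 0, 0]])                    # (batch=1, seq_len=5), 0 = padding
user_behavior = seq_emb(behaviors)                              # (1, 5, 4)
mask = tf.expand_dims(tf.cast(behaviors > 0, tf.float32), -1)   # (1, 5, 1), 1 where real behavior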

Interest Extractor Layer

Auxiliary loss
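
The idea is to push the extractor's hidden state at step t to actually encode the behavior at step t+1: the user's real next behavior serves as the positive sample, and a randomly drawn behavior as the negative sample. As a reminder of the paper's formulation (a sketch; the implementation below scores each pair with a small MLP followed by a sigmoid instead of an inner product):

$$
L_{aux} = -\frac{1}{N}\sum_{i=1}^{N}\sum_{t}\Big[\log\sigma\big(\mathbf{h}_t^i,\mathbf{e}_b^i[t+1]\big) + \log\big(1-\sigma(\mathbf{h}_t^i,\hat{\mathbf{e}}_b^i[t+1])\big)\Big]
$$

where $\mathbf{e}_b^i[t+1]$ is the real next behavior and $\hat{\mathbf{e}}_b^i[t+1]$ a negatively sampled one.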


Interest Evolving Layer

This layer learns the interest-evolution process from the interest states. It does not learn every evolution path, only the part of the evolution related to the target ad. An attention mechanism therefore selects the interests relevant to the target ad and learns how they evolve, so that the model can estimate the degree of interest in the target ad at the next moment.

Characteristics of interest evolution
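
Two observations from the paper drive this layer: interests drift, so a user's preference within one category (say, books) changes gradually over time; and different interests (books vs. clothes) evolve along almost independent tracks, so only the track related to the target ad needs to be followed.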

Purpose
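
The paper claims two benefits: the final interest state is enriched with more relevant history information, and following the interest evolution trend makes predicting the next-moment interest in the target ad more accurate than using the last behavior alone.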

Computing the attention scores
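
In the paper, the attention weight of hidden state $\mathbf{h}_t$ with respect to the target-ad embedding $\mathbf{e}_a$ is a softmax over a bilinear product:

$$
a_t = \frac{\exp(\mathbf{h}_t W \mathbf{e}_a)}{\sum_{j=1}^{T}\exp(\mathbf{h}_j W \mathbf{e}_a)}
$$

The implementation below instead reuses a DIN-style activation unit: an MLP over the concatenation [query, key, query − key, query ⊙ key] that outputs one score per time step.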

How DIN combines attention scores
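
In DIN, the score is simply multiplied into each behavior embedding and the weighted embeddings are sum-pooled into a fixed-length interest vector; that is exactly what `AttentionPoolingLayer` below does when `return_score=False`.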

How DIEN combines attention scores (AUGRU)
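
AUGRU keeps the full update gate $\mathbf{z}_t$ but rescales it with the attention score $a_t$, so behaviors with little relevance to the target ad barely update the hidden state (the docstrings of the cell classes below use the same notation):

$$
\begin{aligned}
\mathbf{r}_t &= \sigma(W_{ir}\mathbf{x}_t + b_{ir} + W_{hr}\mathbf{h}_{t-1} + b_{hr}) \\
\mathbf{z}_t &= \sigma(W_{iz}\mathbf{x}_t + b_{iz} + W_{hz}\mathbf{h}_{t-1} + b_{hz}) \\
\tilde{\mathbf{z}}_t &= a_t \cdot \mathbf{z}_t \\
\tilde{\mathbf{h}}_t &= \tanh(W_{ih}\mathbf{x}_t + b_{ih} + \mathbf{r}_t \odot (W_{hh}\mathbf{h}_{t-1} + b_{hh})) \\
\mathbf{h}_t &= (1-\tilde{\mathbf{z}}_t)\odot\mathbf{h}_{t-1} + \tilde{\mathbf{z}}_t\odot\tilde{\mathbf{h}}_t
\end{aligned}
$$

The code also implements the paper's ablation variants: AIGRU scales the GRU inputs by $a_t$, and AGRU replaces the update gate with $a_t$ entirely.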




TF2 implementation

# coding:utf-8
# @Time: 2022/1/5 11:07 上午
# @File: ctr_DIEN.py
'''
DIEN
'''
import numpy as np
import tensorflow as tf
from tensorflow.keras import layers
from tensorflow.keras import Model
from tensorflow.keras import optimizers
from tensorflow.keras import metrics
import tensorflow.keras.backend as K
from sklearn.model_selection import train_test_split
import yaml
from tqdm import tqdm
import pandas as pd
from tools import *  # helper utilities (e.g. sparseFeature)


'''   Negative sampling to generate auxiliary samples   '''
def auxiliary_sample(data):
    '''
    In DIEN, while the Interest Extractor Layer uses a GRU to extract interests,
    an auxiliary binary classifier supervises the quality of the extracted interest
    representation: the user's real next behavior serves as the positive sample and
    a negatively sampled behavior as the negative sample of an auxiliary loss.
    :param data: pandas-df
    :return: neg_sample : negative samples   numpy.ndarray
    '''
    cate_max = np.max(data.iloc[:, 2:-1].values)  # all behavior categories
    pos_sample = data.iloc[:, 3:-1].values  # drop the trailing cateID and the first column (it has no "next" behavior)
    neg_sample = np.zeros_like(pos_sample)
    for i in range(pos_sample.shape[0]):
        for j in range(pos_sample.shape[1]):
            if pos_sample[i, j] > 0:
                idx = np.random.randint(low=1, high=cate_max + 1)
                while idx == pos_sample[i, j]:
                    idx = np.random.randint(low=1, high=cate_max + 1)
                neg_sample[i, j] = idx
            else:
                break  # the remaining behaviors are all zero padding
    return neg_sample


'''   Dice   '''
class Dice(layers.Layer):
    def __init__(self):
        super(Dice, self).__init__()
        self.epsilon = 1e-9

    def build(self, input_shape):
        self.alpha = self.add_weight(name='dice_alpha',
                                     shape=(),
                                     initializer=tf.keras.initializers.Zeros(),
                                     trainable=True)

    def call(self, x):
        top = x - tf.reduce_mean(x, axis=0)
        bottom = tf.sqrt(tf.math.reduce_variance(x, axis=0) + self.epsilon)
        norm_x = top / bottom
        p = tf.sigmoid(norm_x)
        x = self.alpha * x * (1 - p) + x * p
        return x


'''   Embedding Layer   '''
class EmbeddingLayer(layers.Layer):
    def __init__(self, seq_size, seq_dim, feature_columns):
        super(EmbeddingLayer, self).__init__()
        self.feature_dim = seq_size
        self.embed_dim = seq_dim
        # Embedding for the behavior features and the query ad.
        # mask_zero=True: zeros in the input are masked out, i.e. 0 is the padding value
        self.seq_emb = layers.Embedding(seq_size + 1, seq_dim, mask_zero=True, name='item_emb')
        # Features other than the behavior sequence: user profile, context, etc.
        self.feature_columns = feature_columns
        # Embeddings for the other features
        self.feature_emb = {'emb_{}'.format(i): layers.Embedding(feat['vocabulary_size'], feat['embed_dim'])
                            for i, feat in enumerate(self.feature_columns)}

    def call(self, x, neg_x=None):
        """
        input :
            x : (other_features * 2, behaviors * 40, ads * 1) -> batch * (others + behaviors + ads) -> (2000, 43)
            neg_x : (behaviors * 39) -> batch * behaviors -> (2000, 39)
        output :
            query_ad : (batch * 1 * embed_dim) -> (2000, 1, 4)
            user_behavior : (batch * Time_seq_len * embed_dim) -> (2000, 40, 4)
            mask : (2000, 40, 1)
            neg_user_behavior : (2000, 39, 4)
            neg_mask : (2000, 39, 1)
            profile_context : (2000, 2 * fea_dim)
        """
        # User Profile, Context Feature
        profile_context = x[:, :2]  # (2000, 2)
        # User Behaviors
        behaviors_x = x[:, 2:-1]  # (2000, 40)
        # Candidate Ad
        ads_x = x[:, -1]  # (2000,)
        # embedding
        profile_context = tf.concat([self.feature_emb['emb_{}'.format(i)](profile_context[:, i])
                                     for i in range(len(self.feature_columns))], axis=1)  # (2000, 2 * fea_dim)
        query_ad = tf.expand_dims(self.seq_emb(ads_x), axis=1)  # (2000, 1, 4); the behavior embedding is 3-D, so add a dim here
        user_behavior = self.seq_emb(behaviors_x)  # (2000, 40, 4)
        # build the mask
        mask = tf.cast(behaviors_x > 0, tf.float32)  # (2000, 40)
        mask = tf.expand_dims(mask, axis=-1)  # (2000, 40, 1)
        if neg_x is not None:
            neg_mask = tf.cast(neg_x > 0, tf.float32)
            neg_mask = tf.expand_dims(neg_mask, axis=-1)
            neg_user_behavior = self.seq_emb(neg_x)
            return query_ad, user_behavior, mask, \
                   neg_user_behavior, neg_mask, profile_context
        return query_ad, user_behavior, mask, profile_context


'''   Interest Extractor Layer   '''
class InterestExtractLayer(Model):
    def __init__(self, embed_dim, extract_fc_dims, extract_dropout):
        super(InterestExtractLayer, self).__init__()
        # A plain GRU to extract interest states from the behavior sequence.
        # return_sequences=True: return the hidden state of every time step
        self.GRU = layers.GRU(units=embed_dim, activation='tanh', recurrent_activation='sigmoid', return_sequences=True)
        # An MLP to compute the auxiliary loss
        self.auxiliary_mlp = tf.keras.Sequential()
        for fc_dim in extract_fc_dims:
            self.auxiliary_mlp.add(layers.Dense(fc_dim))
            self.auxiliary_mlp.add(layers.Activation('relu'))
            self.auxiliary_mlp.add(layers.Dropout(extract_dropout))
        self.auxiliary_mlp.add(layers.Dense(1))

    def call(self, user_behavior, mask, neg_user_behavior=None, neg_mask=None):
        """
        user_behavior : (2000, 40, 4)
        mask : (2000, 40, 1)
        neg_user_behavior : (2000, 39, 4)
        neg_mask : (2000, 39, 1)
        """
        # turn the 0/1 mask into booleans
        mask_bool = tf.cast(tf.squeeze(mask, axis=2), tf.bool)  # (2000, 40)
        gru_interests = self.GRU(user_behavior, mask=mask_bool)  # (2000, 40, 4)
        # compute the auxiliary loss, only when negative samples are provided
        if neg_user_behavior is not None:
            # the real behaviors user_behavior are the e's in the paper's figure; the GRU states are the h's
            gru_embed = gru_interests[:, 1:]  # (2000, 39, 4)
            neg_mask_bool = tf.cast(tf.squeeze(neg_mask, axis=2), tf.bool)  # (2000, 39)
            # positive samples: the next real behavior
            pos_seq = tf.concat([gru_embed, user_behavior[:, 1:]], -1)  # (2000, 39, 8)
            pos_res = self.auxiliary_mlp(pos_seq)  # (2000, 39, 1)
            pos_res = tf.sigmoid(pos_res[neg_mask_bool])  # keep non-padding positions, then sigmoid -> (N, 1), e.g. (18290, 1)
            pos_target = tf.ones_like(pos_res, tf.float32)  # labels
            # negative samples: one drawn from the un-clicked items
            neg_seq = tf.concat([gru_embed, neg_user_behavior], -1)  # (2000, 39, 8)
            neg_res = self.auxiliary_mlp(neg_seq)  # (2000, 39, 1)
            neg_res = tf.sigmoid(neg_res[neg_mask_bool])
            neg_target = tf.zeros_like(neg_res, tf.float32)
            # auxiliary loss: binary cross-entropy (y_true first, y_pred second)
            aux_loss = tf.keras.losses.binary_crossentropy(
                tf.concat([pos_target, neg_target], axis=0),
                tf.concat([pos_res, neg_res], axis=0))
            aux_loss = tf.cast(aux_loss, tf.float32)
            aux_loss = tf.reduce_mean(aux_loss)
            return gru_interests, aux_loss
        return gru_interests, 0


'''   Activation Unit   '''
class ActivationUnit(Model):
    def __init__(self, embed_dim, att_dropout=0.2, att_fc_dims=[32, 16]):
        super(ActivationUnit, self).__init__()
        self.fc_layers = tf.keras.Sequential()
        for fc_dim in att_fc_dims:
            self.fc_layers.add(layers.Dense(fc_dim))
            self.fc_layers.add(Dice())
            self.fc_layers.add(layers.Dropout(att_dropout))
        self.fc_layers.add(layers.Dense(1))

    def call(self, query, user_behavior):
        """
        query : embedding of the single candidate ad -> batch * 1 * embed
        user_behavior : behavior embedding matrix -> batch * seq_len * embed
        """
        # repeat the ad embedding along the sequence axis
        seq_len = user_behavior.shape[1]
        queries = tf.concat([query] * seq_len, axis=1)
        attn_input = tf.concat([queries,
                                user_behavior,
                                queries - user_behavior,
                                queries * user_behavior], axis=-1)
        out = self.fc_layers(attn_input)
        return out


'''   Attention Pooling Layer   '''
class AttentionPoolingLayer(Model):
    def __init__(self, embed_dim, att_dropout=0.2, att_fc_dims=[32, 16], return_score=False):
        super(AttentionPoolingLayer, self).__init__()
        self.active_unit = ActivationUnit(embed_dim, att_dropout, att_fc_dims)
        self.return_score = return_score

    def call(self, query, user_behavior, mask):
        """
        query : embedding of the single candidate ad -> batch * 1 * embed
        user_behavior : behavior embedding matrix -> batch * seq_len * embed
        mask : positions padded with 0 are masked out -> batch * seq_len * 1
        """
        # attention weights
        attn_weights = self.active_unit(query, user_behavior)
        # weight the behaviors with the scores (sum pooling is done by the caller)
        if not self.return_score:
            output = user_behavior * attn_weights * mask
            return output
        return attn_weights


'''   AGRU cell   '''
class AGRUCell(layers.Layer):
    """
    Attention based GRU (AGRU)
    Formulas:
        r = sigmoid(W_ir * x + b_ir + W_hr * h + b_hr)
        # z = sigmoid(W_iz * x + b_iz + W_hz * h + b_hz)   (replaced by the attention score)
        h' = tanh(W_ih * x + b_ih + r * (W_hh * h + b_hh))
        h = (1 - att_score) * h + att_score * h'
    """
    def __init__(self, units):
        super().__init__()
        self.units = units
        # As an RNN cell it must expose a state_size attribute:
        # state_size is the dimension of the output at every time step
        self.state_size = units

    def build(self, input_shape):
        # the input is a tuple: (gru_embed, atten_scores)  (2000, 4), (2000, 1)
        # so the input dimension at step t is:
        dim_xt = input_shape[0][-1]
        # reset-gate parameters
        self.w_ir = tf.Variable(tf.random.normal(shape=[dim_xt, self.units]), name='w_ir')
        self.w_hr = tf.Variable(tf.random.normal(shape=[self.units, self.units]), name='w_hr')
        self.b_ir = tf.Variable(tf.random.normal(shape=[self.units]), name='b_ir')
        self.b_hr = tf.Variable(tf.random.normal(shape=[self.units]), name='b_hr')
        # the update gate is replaced by att_score
        # candidate-state parameters
        self.w_ih = tf.Variable(tf.random.normal(shape=[dim_xt, self.units]), name='w_ih')
        self.w_hh = tf.Variable(tf.random.normal(shape=[self.units, self.units]), name='w_hh')
        self.b_ih = tf.Variable(tf.random.normal(shape=[self.units]), name='b_ih')
        self.b_hh = tf.Variable(tf.random.normal(shape=[self.units]), name='b_hh')

    def call(self, inputs, states):
        x_t, att_score = inputs
        states = states[0]
        # reset gate
        r_t = tf.sigmoid(tf.matmul(x_t, self.w_ir) + self.b_ir +
                         tf.matmul(states, self.w_hr) + self.b_hr)
        # candidate hidden state
        h_t_ = tf.tanh(tf.matmul(x_t, self.w_ih) + self.b_ih +
                       tf.multiply(r_t, tf.matmul(states, self.w_hh) + self.b_hh))
        # output: the attention score takes the place of the update gate
        h_t = tf.multiply(1 - att_score, states) + tf.multiply(att_score, h_t_)
        # for a GRU, the output at the current step equals the state passed to the next step
        next_state = h_t
        return h_t, next_state


'''   AUGRU cell   '''
class AUGRUCell(layers.Layer):
    """
    GRU with attentional update gate (AUGRU)
    Formulas:
        r = sigmoid(W_ir * x + b_ir + W_hr * h + b_hr)
        z = sigmoid(W_iz * x + b_iz + W_hz * h + b_hz)
        z = z * att_score
        h' = tanh(W_ih * x + b_ih + r * (W_hh * h + b_hh))
        h = (1 - z) * h + z * h'
    """
    def __init__(self, units):
        super().__init__()
        self.units = units
        # As an RNN cell it must expose a state_size attribute:
        # state_size is the dimension of the output at every time step
        self.state_size = units

    def build(self, input_shape):
        # the input is a tuple: (gru_embed, atten_scores)
        # so the input dimension at step t is:
        dim_xt = input_shape[0][-1]
        # reset-gate parameters
        self.w_ir = tf.Variable(tf.random.normal(shape=[dim_xt, self.units]), name='w_ir')
        self.w_hr = tf.Variable(tf.random.normal(shape=[self.units, self.units]), name='w_hr')
        self.b_ir = tf.Variable(tf.random.normal(shape=[self.units]), name='b_ir')
        self.b_hr = tf.Variable(tf.random.normal(shape=[self.units]), name='b_hr')
        # update-gate parameters
        self.w_iz = tf.Variable(tf.random.normal(shape=[dim_xt, self.units]), name='w_iz')
        self.w_hz = tf.Variable(tf.random.normal(shape=[self.units, self.units]), name='w_hz')
        self.b_iz = tf.Variable(tf.random.normal(shape=[self.units]), name='b_iz')
        self.b_hz = tf.Variable(tf.random.normal(shape=[self.units]), name='b_hz')
        # candidate-state parameters
        self.w_ih = tf.Variable(tf.random.normal(shape=[dim_xt, self.units]), name='w_ih')
        self.w_hh = tf.Variable(tf.random.normal(shape=[self.units, self.units]), name='w_hh')
        self.b_ih = tf.Variable(tf.random.normal(shape=[self.units]), name='b_ih')
        self.b_hh = tf.Variable(tf.random.normal(shape=[self.units]), name='b_hh')

    def call(self, inputs, states):
        x_t, att_score = inputs
        states = states[0]
        # reset gate
        r_t = tf.sigmoid(tf.matmul(x_t, self.w_ir) + self.b_ir +
                         tf.matmul(states, self.w_hr) + self.b_hr)
        # update gate
        z_t = tf.sigmoid(tf.matmul(x_t, self.w_iz) + self.b_iz +
                         tf.matmul(states, self.w_hz) + self.b_hz)
        # attentional update gate
        z_t = tf.multiply(att_score, z_t)
        # candidate hidden state
        h_t_ = tf.tanh(tf.matmul(x_t, self.w_ih) + self.b_ih +
                       tf.multiply(r_t, tf.matmul(states, self.w_hh) + self.b_hh))
        # output
        h_t = tf.multiply(1 - z_t, states) + tf.multiply(z_t, h_t_)
        # for a GRU, the output at the current step equals the state passed to the next step
        next_state = h_t
        return h_t, next_state


'''   Interest Evolution Layer   '''
class InterestEvolutionLayer(Model):
    def __init__(self, input_size, gru_type='AUGRU', evolution_dropout=0.2, att_dropout=0.2, att_fc_dims=[32, 16]):
        super(InterestEvolutionLayer, self).__init__()
        self.gru_type = gru_type
        self.dropout = evolution_dropout
        if gru_type == 'GRU':
            self.attention = AttentionPoolingLayer(embed_dim=input_size,
                                                   att_dropout=att_dropout,
                                                   att_fc_dims=att_fc_dims)
            self.evolution = layers.GRU(units=input_size, return_sequences=True)
        elif gru_type == 'AIGRU':
            self.attention = AttentionPoolingLayer(embed_dim=input_size,
                                                   att_dropout=att_dropout,
                                                   att_fc_dims=att_fc_dims,
                                                   return_score=True)
            self.evolution = layers.GRU(units=input_size)
        elif gru_type == 'AGRU':
            self.attention = AttentionPoolingLayer(embed_dim=input_size,
                                                   att_dropout=att_dropout,
                                                   att_fc_dims=att_fc_dims,
                                                   return_score=True)
            self.evolution = layers.RNN(AGRUCell(units=input_size))
        elif gru_type == 'AUGRU':
            self.attention = AttentionPoolingLayer(embed_dim=input_size,
                                                   att_dropout=att_dropout,
                                                   att_fc_dims=att_fc_dims,
                                                   return_score=True)
            self.evolution = layers.RNN(AUGRUCell(units=input_size))

    def call(self, query_ad, gru_interests, mask):
        """
        query_ad : B * 1 * E -> (2000, 1, 4)
        gru_interests : B * T * H -> (2000, 40, 4)
        mask : B * T * 1 -> (2000, 40, 1)
        """
        mask_bool = tf.cast(tf.squeeze(mask, axis=2), tf.bool)  # (2000, 40)
        if self.gru_type == 'GRU':
            # GRU followed by attention
            out = self.evolution(gru_interests, mask=mask_bool)  # (2000, 40, 4)
            out = self.attention(query_ad, out, mask)  # (2000, 40, 4)
            out = tf.reduce_sum(out, axis=1)  # (2000, 4)
        elif self.gru_type == 'AIGRU':
            # AIGRU: scale the GRU inputs by the attention score
            att_score = self.attention(query_ad, gru_interests, mask)  # (2000, 40, 1)
            out = att_score * gru_interests  # (2000, 40, 4)
            out = self.evolution(out, mask=mask_bool)  # (2000, 4)
        elif self.gru_type == 'AGRU' or self.gru_type == 'AUGRU':
            # AGRU or AUGRU: feed the attention score into the cell
            att_score = self.attention(query_ad, gru_interests, mask)  # (2000, 40, 1)
            out = self.evolution((gru_interests, att_score), mask=mask_bool)  # (2000, 4)
        return out


'''   DIEN   '''
class DIEN(Model):
    def __init__(self, seq_size, seq_dim, feature_columns, mlp_dims, gru_type='AUGRU',
                 extract_fc_dims=[100, 50], extract_dropout=0.,
                 evolution_dropout=0.2, att_dropout=0.2, att_fc_dims=[32, 16]):
        super(DIEN, self).__init__()
        self.feature_dim = seq_size
        self.embed_dim = seq_dim
        self.gru_type = gru_type
        # Embedding Layer
        self.embedding = EmbeddingLayer(seq_size, seq_dim, feature_columns)
        # Interest Extract Layer
        self.interest_extract = InterestExtractLayer(embed_dim=seq_dim,
                                                     extract_fc_dims=extract_fc_dims,
                                                     extract_dropout=extract_dropout)
        # Interest Evolution Layer
        self.interest_evolution = InterestEvolutionLayer(input_size=seq_dim,
                                                         gru_type=gru_type,
                                                         evolution_dropout=evolution_dropout,
                                                         att_dropout=att_dropout,
                                                         att_fc_dims=att_fc_dims)
        # final MLP for prediction
        self.final_mlp = tf.keras.Sequential()
        for fc_dim in mlp_dims:
            self.final_mlp.add(layers.Dense(fc_dim))
            self.final_mlp.add(layers.Activation('relu'))
            self.final_mlp.add(layers.Dropout(evolution_dropout))
        self.final_mlp.add(layers.Dense(1))

    def call(self, x, neg_x=None):
        """
        x : (other_features * 2, behaviors * 40, ads * 1) -> batch * (others + behaviors + ads) -> (2000, 43)
        neg_x : (behaviors * 39) -> batch * behaviors -> (2000, 39)
        """
        # Embedding: only the behavior sequence goes through interest extraction and
        # evolution; the other feature embeddings are concatenated with the interest
        _ = self.embedding(x, neg_x)
        if neg_x is not None:
            query_ad, user_behavior, mask, neg_user_behavior, neg_mask, profile_context = _
        else:
            query_ad, user_behavior, mask, profile_context = _
            neg_user_behavior = None
            neg_mask = None
        # Interest Extraction
        gru_interest, aux_loss = self.interest_extract(user_behavior, mask, neg_user_behavior, neg_mask)
        # Interest Evolution
        final_interest = self.interest_evolution(query_ad, gru_interest, mask)
        # MLP for prediction
        concat_out = tf.concat([tf.squeeze(query_ad, 1),
                                final_interest,
                                profile_context], axis=1)
        out = self.final_mlp(concat_out)
        out = tf.sigmoid(out)
        return out, aux_loss


if __name__ == '__main__':
    tf.compat.v1.logging.set_verbosity(tf.compat.v1.logging.ERROR)

    with open('config.yaml', 'r') as f:
        config = yaml.safe_load(f)

    data = pd.read_csv(config['read_path_amazon'], index_col=0)
    # add two discrete columns to simulate user-profile and context features
    tmp = data.shape[0]
    discrete_1 = np.random.randint(1, 100, (tmp,))
    discrete_2 = np.random.randint(1, 100, (tmp,))
    data['user_profile'] = discrete_1
    data['context'] = discrete_2
    # reorder the columns: the two synthetic columns first, then the behavior
    # sequence, and the label last
    hist = []
    for i in range(40):
        hist.append('hist_cate_{}'.format(i))
    data = data[['user_profile', 'context'] + hist + ['cateID', 'label']]

    data_X = data.iloc[:, :-1]
    data_y = data.label.values
    fields = data_X.max().max()
    print('fields: ', fields)

    other_spare_emb = [sparseFeature(feat, data_X[feat].max() + 1, config['DIEN']['fea_dim'])
                       for feat in ['user_profile', 'context']]

    tmp_X, test_X, tmp_y, test_y = train_test_split(data_X, data_y, test_size=0.2, random_state=42, stratify=data_y)
    train_X, val_X, train_y, val_y = train_test_split(tmp_X, tmp_y, test_size=0.1, random_state=42, stratify=tmp_y)
    train_X_neg = auxiliary_sample(train_X)
    train_X = train_X.values
    val_X = val_X.values
    test_X = test_X.values

    train_loader = tf.data.Dataset.from_tensor_slices((train_X, train_X_neg, train_y)) \
        .shuffle(len(train_X)).batch(config['train']['batch_size'])
    val_loader = tf.data.Dataset.from_tensor_slices((val_X, val_y)).batch(config['train']['batch_size'])

    model = DIEN(seq_size=fields, seq_dim=config['DIEN']['seq_dim'], feature_columns=other_spare_emb,
                 mlp_dims=config['DIEN']['mlp_dims'], gru_type=config['DIEN']['gru_type'],
                 extract_fc_dims=config['DIEN']['extract_fc_dims'],
                 extract_dropout=config['DIEN']['extract_dropout'],
                 evolution_dropout=config['DIEN']['evolution_dropout'],
                 att_dropout=config['DIEN']['att_dropout'],
                 att_fc_dims=config['DIEN']['att_fc_dims'])

    adam = optimizers.Adam(learning_rate=config['train']['adam_lr'], beta_1=0.95, beta_2=0.96,
                           decay=config['train']['adam_lr'] / config['train']['epochs'])

    epochs = config['train']['epochs']
    for epoch in range(epochs):
        epoch_train_loss = tf.keras.metrics.Mean()
        for batch, (x, neg_x, y) in tqdm(enumerate(train_loader)):
            with tf.GradientTape() as tape:
                out, aux_loss = model(x, neg_x)
                out = tf.squeeze(out, axis=1)  # (batch, 1) -> (batch,) to match y
                loss = tf.keras.losses.binary_crossentropy(y, out)
                # loss_target = loss + alpha * aux_loss
                loss = tf.reduce_mean(loss) + config['DIEN']['alpha_aux_loss'] * aux_loss
                loss = tf.reduce_mean(loss)
            grads = tape.gradient(loss, model.trainable_variables)
            adam.apply_gradients(grads_and_vars=zip(grads, model.trainable_variables))
            epoch_train_loss(loss)

        epoch_val_loss = tf.keras.metrics.Mean()
        for batch, (x, y) in tqdm(enumerate(val_loader)):
            out, _ = model(x)
            out = tf.squeeze(out, axis=1)
            loss = tf.keras.losses.binary_crossentropy(y, out)
            loss = tf.reduce_mean(loss)
            epoch_val_loss(loss)

        print('EPOCH : %s, train loss : %s, val loss: %s' % (epoch,
                                                             epoch_train_loss.result().numpy(),
                                                             epoch_val_loss.result().numpy()))

    model.summary()
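
Example output from the first three epochs: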
EPOCH : 0, train loss : 2.1815844, val loss: 0.6931561
EPOCH : 1, train loss : 1.776755, val loss: 0.69315314
EPOCH : 2, train loss : 1.1352692, val loss: 0.69314873
