This paper was published at WWW 2018. It brings knowledge graph information into news recommendation.
Keywords: News recommendation; knowledge graph representation; deep neural networks; attention model

Motivation

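Earlier news recommendation methods did not use external knowledge, so they struggle to discover latent knowledge-level connections between news items.
In addition, news recommendation is highly time-sensitive and has to follow changes in the user's interests.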

Three challenges in news recommendation:

News articles are highly time-sensitive and their relevance expires quickly within a short period, so collaborative filtering works poorly.
How to dynamically measure a user’s interest based on his diversified reading history for current candidate news.
News language is usually highly condensed and comprised of a large amount of knowledge entities and common sense. Traditional methods do not use knowledge and rely only on word co-occurrence and clustering.

To extract deep logical connections among news, it is necessary to introduce additional knowledge graph information into news recommendations. This is the motivation for bringing external knowledge into news recommendation.

Methodology

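The click history of user $i$ is denoted $\{t_{1}^{i},t_{2}^{i},...,t_{N_{i}}^{i}\}$, where $t_{j}^{i}$ is the title of the $j$-th news item clicked by user $i$. Each title is a sequence of words, and each word may be associated with an entity in the knowledge graph.
Input: a candidate news item and the user's click history
Output: the probability that the user clicks this news item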

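First, each word in the news is linked to its corresponding entity in the knowledge graph, and the contextual entities of each entity are also looked up and used.
Then a knowledge-aware convolutional neural network (KCNN) fuses the word-level and knowledge-level representations of news and generates a knowledge-aware embedding vector.
KCNN differs from previous work in two ways: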

multi-channel
word-entity-aligned

To get a dynamic representation of a user with respect to current candidate news, we use an attention module to automatically match candidate news to each piece of clicked news, and aggregate the user’s history with different weights.

A few pieces of background are introduced first.

Knowledge Graph Embedding

A typical knowledge graph is a collection of (entity 1, relation, entity 2) triples.

The goal of knowledge graph embedding is to learn a low-dimensional representation vector for each entity and relation that preserves the structural information of the original knowledge graph.

Among these, translation-based methods work relatively well. Concrete examples include TransE, TransH, and TransD.
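To make the translation-based idea concrete: TransE treats a relation as a translation in embedding space, so a plausible triple (h, r, t) should satisfy h + r ≈ t. The sketch below scores a triple by squared L2 distance. The function name `transe_score` and the toy vectors are assumptions made for illustration, not taken from the paper's code.

```python
def transe_score(head, relation, tail):
    """Squared L2 distance ||h + r - t||^2; lower means more plausible."""
    return sum((h + r - t) ** 2 for h, r, t in zip(head, relation, tail))


# Toy 2-dimensional embeddings (made up for illustration).
h = [1.0, 0.0]
r = [0.0, 1.0]
t_good = [1.0, 1.0]   # exactly h + r, so the distance is zero
t_bad = [3.0, -1.0]   # far from h + r

good = transe_score(h, r, t_good)
bad = transe_score(h, r, t_bad)
```

Training would push the score of observed triples below that of corrupted ones; that part is omitted here.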

CNN for Sentence Representation Learning

The bag-of-words model has many problems.
Kim CNN is used for sentence representation instead.
QQ截图20180801160401.png
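The representation is computed as shown in the figure above: convolution kernels over multiple channels and with multiple window sizes are applied, and the results form the final sentence representation vector.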
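A minimal single-channel sketch of this kind of sentence encoder, assuming each kernel is a `window x dim` weight matrix followed by ReLU and max-over-time pooling. The helper names and toy numbers are hypothetical, and the sentence is assumed to be at least as long as the largest window.

```python
def conv_max_pool(sentence, kernel, bias=0.0):
    """Slide one kernel (window x dim) over the sentence, then max-pool over time."""
    window = len(kernel)
    features = []
    for start in range(len(sentence) - window + 1):
        total = bias
        for offset in range(window):
            word_vec = sentence[start + offset]
            total += sum(w * x for w, x in zip(kernel[offset], word_vec))
        features.append(max(0.0, total))  # ReLU non-linearity
    return max(features)


def sentence_vector(sentence, kernels):
    """One pooled feature per kernel; kernels may have different window sizes."""
    return [conv_max_pool(sentence, k) for k in kernels]


sentence = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # 3 words, embedding dim 2
kernels = [
    [[1.0, 1.0]],               # window size 1
    [[1.0, 0.0], [0.0, 1.0]],   # window size 2
]
vec = sentence_vector(sentence, kernels)
```

The length of the output depends only on the number of kernels, not on the sentence length, which is what makes it usable as a fixed-size sentence vector.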

RNN-based and CNN-based sentence representations could also be combined.

DEEP KNOWLEDGE-AWARE NETWORK (DKN)

The method proposed in the paper is described from here on.
The overall framework of DKN is shown below:

QQ截图20180801165512.png
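KCNN encodes each news item as a vector. The Attention Net then produces a representation of the click history, which is combined with the vector of the candidate news and fed into a deep neural network to obtain the final prediction.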

Knowledge Distillation

QQ截图20180802102651.png
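This part explains how knowledge is incorporated into the features.
Step 1: use entity linking to identify the entities in the news.
Step 2: extract a subgraph from the original KG. A subgraph built only from the entities found in the news may be too sparse, so it is expanded to include entities within one hop of the identified ones.
Step 3: knowledge graph embedding.
Step 4: the learned entity embeddings are used as input to KCNN and DKN.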
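Step 2 can be sketched as a filter over KG triples: keep every triple that touches a linked entity, which pulls in its one-hop neighbours. The function name `expand_one_hop` and the toy triples are placeholders for illustration; a real KG would need an index rather than a linear scan.

```python
def expand_one_hop(triples, seed_entities):
    """Return the entities and triples within one hop of the seed entities."""
    seeds = set(seed_entities)
    # Keep every triple whose head or tail is one of the linked entities.
    sub_triples = [(h, r, t) for h, r, t in triples if h in seeds or t in seeds]
    entities = set(seeds)
    for h, _, t in sub_triples:
        entities.add(h)
        entities.add(t)
    return entities, sub_triples


# Toy knowledge graph with placeholder names.
kg = [
    ("A", "r1", "B"),
    ("B", "r2", "C"),
    ("C", "r3", "D"),
    ("E", "r1", "A"),
]
entities, sub_triples = expand_one_hop(kg, ["A"])
```

Here only `B` and `E` are added to the seed `A`; `C` and `D` are two or more hops away and are left out.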

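The embeddings of the entities in the news alone carry limited information about the structure of the knowledge graph. To indicate where an entity sits in the graph, a "context entity" embedding $\bar{e}$ is added, defined as the average of the embeddings of the entities one hop away from that entity.
$\bar{e}=\frac{1}{|context(e)|}\sum_{e_{i}\in context(e)}e_i$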
QQ截图20180802103840.png
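The context embedding is just an element-wise mean over the one-hop neighbours. A small sketch, assuming embeddings are stored in a dict keyed by entity name. Returning a zero vector for an entity with no neighbours is an assumption of this sketch, not something stated in these notes.

```python
def context_embedding(entity, neighbors, embeddings):
    """Average the embeddings of the entities one hop away from `entity`."""
    ctx = neighbors.get(entity, [])
    dim = len(embeddings[entity])
    if not ctx:
        return [0.0] * dim  # assumption: zero vector when there is no context
    return [sum(embeddings[n][d] for n in ctx) / len(ctx) for d in range(dim)]


# Toy embeddings and adjacency (made up for illustration).
embeddings = {"A": [1.0, 0.0], "B": [0.0, 2.0], "C": [4.0, 2.0]}
neighbors = {"A": ["B", "C"], "B": ["A"], "C": []}

ctx_a = context_embedding("A", neighbors, embeddings)
ctx_c = context_embedding("C", neighbors, embeddings)
```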

Knowledge-aware CNN

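As described above, each word embedding $w_i$ comes with an entity embedding $e_i\in \mathbb{R}^{k\times1}$ and a corresponding context embedding $\bar{e}_i\in \mathbb{R}^{k\times1}$, where $k$ is the dimension of the entity embeddings.
Given these features, a straightforward idea is to concatenate them and treat the entities as "pseudo words":
$W=[w_1 w_2... w_n e_{t_1} e_{t_2}...]$ However, this has the following problems: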

The concatenating strategy breaks up the connection between words and associated entities and is unaware of their alignment.

Word embeddings and entity embeddings are learned by different methods, meaning it is not suitable to convolute them together in a single vector space.

The concatenating strategy implicitly forces word embeddings and entity embeddings to have the same dimension, which may not be optimal in practice.

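The authors therefore take the approach shown at the lower left of Figure 3 and treat the three embeddings as separate channels, like the channels of an image. What if the entity embedding dimension differs from the word embedding dimension? A transformation layer maps the entity embeddings to the same dimension as the word embeddings. From there the computation is the same as for a convolutional neural network in CV.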
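A sketch of the word-entity-aligned, multi-channel input. The transformation layer is assumed here to be a plain linear map with a `d x k` matrix, chosen for simplicity. Words without a linked entity are given zero entity and context vectors, which is also an assumption of this sketch. All names and numbers are hypothetical.

```python
def transform(vec, matrix):
    """Map a k-dim entity vector to d dims with a d x k matrix (linear map)."""
    return [sum(m * x for m, x in zip(row, vec)) for row in matrix]


def build_multichannel_input(words, entities, contexts, matrix):
    """Return one [word, entity, context] channel triple per word position."""
    aligned = []
    for w, e, c in zip(words, entities, contexts):
        aligned.append([w, transform(e, matrix), transform(c, matrix)])
    return aligned


# Word dim d = 2, entity dim k = 3.
matrix = [[1.0, 0.0, 0.0],
          [0.0, 1.0, 1.0]]
words = [[0.5, 0.5], [1.0, 0.0]]
entities = [[1.0, 2.0, 3.0], [0.0, 0.0, 0.0]]   # second word has no entity
contexts = [[2.0, 0.0, 1.0], [0.0, 0.0, 0.0]]

x = build_multichannel_input(words, entities, contexts, matrix)
```

Because the three vectors at each position share one index, a kernel sliding over positions sees a word together with its own entity and context, which is the alignment that plain concatenation loses.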

Attention-based User Interest Extraction

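The Attention Net here is not the attention mechanism from Attention Is All You Need. Instead, a DNN computes a weight for each historical click. On reflection, though, it is essentially the same as an attention mechanism. One issue is that the method ignores time: intuitively, more recent history should influence the current choice more. Finally, the click probability is predicted by another DNN.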
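The weighting step can be sketched as follows: score each clicked news item against the candidate, normalise the scores with softmax, and take the weighted sum as the user embedding. The scoring DNN is replaced here by a plain dot product (`dot_score`) as a stand-in, so this is not the network described in the paper. All names are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot_score(clicked, candidate):
    """Stand-in for the scoring DNN: a simple dot product."""
    return sum(a * b for a, b in zip(clicked, candidate))


def user_embedding(history, candidate, score_fn=dot_score):
    """Aggregate clicked-news vectors with candidate-dependent weights."""
    weights = softmax([score_fn(h, candidate) for h in history])
    dim = len(candidate)
    user = [sum(w * h[d] for w, h in zip(weights, history)) for d in range(dim)]
    return user, weights


history = [[1.0, 0.0], [0.0, 1.0]]   # two clicked news embeddings
candidate = [1.0, 0.0]
user, weights = user_embedding(history, candidate)
```

The clicked item that resembles the candidate gets the larger weight, so the user vector changes with the candidate. A recency term could be added inside `score_fn` to address the time issue noted above.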


Reading notes on DKN: Deep Knowledge-Aware Network for News Recommendation