Keywords: news recommendation; knowledge graph representation; deep neural networks; attention model


On the other hand, news recommendation is highly time-sensitive and has to adapt as the user's interests change.


  • News articles are highly time-sensitive and their relevance expires quickly within a short period, so collaborative filtering performs poorly.
  • How to dynamically measure a user’s interest, based on his or her diversified reading history, with respect to the current candidate news.
  • News language is usually highly condensed and comprises a large number of knowledge entities and common sense. Traditional methods do not incorporate this knowledge and rely only on word co-occurrence and clustering.

To extract deep logical connections among news, it is necessary to introduce additional knowledge graph information into news recommendation. In other words, external knowledge is brought into the recommender.


We denote the click history of user $i$ as $[t_{1}^{i}, t_{2}^{i}, \ldots, t_{N_{i}}^{i}]$, where $t_{j}^{i}$ is the title of the $j$-th news item clicked by user $i$. Each title is a sequence of words, and each word may be associated with an entity in the knowledge graph.

A knowledge-aware convolutional neural network (KCNN) is then used to fuse the word-level and knowledge-level representations of news and generate a knowledge-aware embedding vector. KCNN has two distinguishing properties:

  • multi-channel
  • word-entity-aligned

To get a dynamic representation of a user with respect to current candidate news, we use an attention module to automatically match candidate news to each piece of clicked news, and aggregate the user’s history with different weights.


Knowledge Graph Embedding


The goal of knowledge graph embedding is to learn a low-dimensional representation vector for each entity and relation that preserves the structural information of the original knowledge graph.

Among these, translation-based methods work well. Concrete examples include TransE, TransH, and TransD.
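To make the translation-based idea concrete, here is a minimal sketch of the TransE scoring function, which treats a relation as a translation so that $h + r \approx t$ for a true triple. The toy vectors below are made up for illustration and are not taken from the paper.

```python
import numpy as np

def transe_score(h, r, t):
    """TransE dissimilarity: squared L2 distance between h + r and t.
    A lower score means the triple (h, r, t) is more plausible."""
    return float(np.sum((h + r - t) ** 2))

# Hypothetical 2-dimensional embeddings, chosen by hand for illustration.
h = np.array([1.0, 0.0])         # head entity
r = np.array([0.0, 1.0])         # relation
t_true = np.array([1.0, 1.0])    # tail that satisfies h + r = t exactly
t_false = np.array([-1.0, 0.0])  # unrelated tail

score_true = transe_score(h, r, t_true)    # 0.0
score_false = transe_score(h, r, t_false)  # 2^2 + 1^2 = 5.0
```

TransH and TransD keep the same translation idea but first project the entities into a relation-specific space before measuring the distance.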

CNN for Sentence Representation Learning

Kim CNN is used for sentence representation.




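KCNN encodes each news item as a vector. The Attention Net then produces a representation of the click history, which is combined with the vector of the candidate news and fed into a deep neural network to obtain the final prediction.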

Knowledge Distillation

Step 1: identify the entities in the news through entity linking.
Step 3: knowledge graph embedding.
Step 4: the learned entity embeddings are used as input to KCNN and DKN.

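The embeddings of the entities that appear in a news item do not, on their own, carry enough of the knowledge graph's structural information. To indicate where an entity sits in the graph, "context entity" information $\bar{e}$ is added: the average of the embeddings of the entities one hop away from that entity.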
$$\bar{e}=\frac{1}{|context(e)|}\sum_{e_{i}\in context(e)}e_i$$
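The context embedding can be sketched in a few lines. The entity names, the `neighbors` adjacency dictionary, and the zero-vector fallback for entities without neighbors are all assumptions made for this illustration; the notes do not say how the empty case is handled.

```python
import numpy as np

def context_embedding(entity, neighbors, embeddings, dim):
    """Average the embeddings of all entities one hop away from `entity`.
    Returns a zero vector if the entity has no neighbors (an assumption)."""
    ctx = neighbors.get(entity, [])
    if not ctx:
        return np.zeros(dim)
    return np.mean([embeddings[e] for e in ctx], axis=0)

# Hypothetical toy graph: entity "A" is linked to "B" and "C".
embeddings = {
    "A": np.array([0.0, 0.0]),
    "B": np.array([1.0, 3.0]),
    "C": np.array([3.0, 5.0]),
}
neighbors = {"A": ["B", "C"]}

e_bar = context_embedding("A", neighbors, embeddings, dim=2)  # mean of B and C
```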

Knowledge-aware CNN

As described above, each word embedding $w_i$ may have an associated entity embedding $e_i\in \mathbb{R}^{k\times1}$ and a corresponding context embedding $\bar{e}_i\in \mathbb{R}^{k\times1}$, where $k$ is the dimension of the entity embeddings.
Given these features, a straightforward idea is to concatenate them, treating the entities as "pseudo words":
$$W=[w_1\ w_2\ \ldots\ w_n\ e_{t_1}\ e_{t_2}\ \ldots]$$
This approach, however, has the following problems:

  • The concatenating strategy breaks up the connection between words and associated
    entities and is unaware of their alignment.
  • Word embeddings and entity embeddings are learned by different methods, meaning it is not suitable to convolve them together in a single vector space.
  • The concatenating strategy implicitly forces word embeddings and entity embeddings to have the same dimension, which may not be optimal in practice.

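The authors therefore adopt the approach shown in the lower left of Figure 3 and treat the word, entity, and context embeddings as separate channels, analogous to the color channels of an image. Since the entity embeddings and the word embeddings may differ in dimension, a transformation layer first maps the entity and context embeddings to the word-embedding dimension. From there, the computation is the same as in a convolutional neural network in computer vision.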
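The multi-channel idea can be sketched with NumPy as follows. This is an illustrative sketch rather than the authors' implementation: the dimensions, the random weights, the tanh transformation, the ReLU activation, and the use of zero vectors for words without a linked entity are all assumptions made here.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d, k = 6, 8, 4  # title length, word dim, entity dim (toy values)

words = rng.normal(size=(n, d))     # word embeddings, one row per word
entities = rng.normal(size=(n, k))  # aligned entity embeddings
contexts = rng.normal(size=(n, k))  # aligned context embeddings
entities[[1, 4]] = 0.0              # words 1 and 4 have no linked entity
contexts[[1, 4]] = 0.0

M = rng.normal(size=(d, k)) * 0.1   # transformation matrix (k -> d)
b_t = np.zeros(d)                   # transformation bias

def transform(x):
    """Map entity-space vectors (n, k) into the word space (n, d)."""
    return np.tanh(x @ M.T + b_t)

def kcnn(words, entities, contexts, filters, biases):
    # Stack the three aligned views as channels: shape (n, d, 3).
    W = np.stack([words, transform(entities), transform(contexts)], axis=-1)
    features = []
    for h, b_h in zip(filters, biases):
        width = h.shape[0]
        # Slide the filter over word positions, covering all 3 channels at once.
        c = [max(0.0, float(np.sum(h * W[i:i + width])) + b_h)
             for i in range(W.shape[0] - width + 1)]
        features.append(max(c))     # max-over-time pooling
    return np.array(features)

# One filter per window width 1, 2, 3; each spans the full embedding and all channels.
filters = [rng.normal(size=(width, d, 3)) for width in (1, 2, 3)]
news_vec = kcnn(words, entities, contexts, filters, [0.0, 0.0, 0.0])
```

Because each filter covers all three channels at the same word positions, the word-entity alignment that plain concatenation loses is preserved. In practice many filters per width would be used, and `news_vec` would be correspondingly longer.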

Attention-based User Interest Extraction

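The Attention Net here is not the attention mechanism from "Attention Is All You Need". Instead, it uses a DNN to compute a weight for each historical click. On closer inspection, though, this is essentially the same idea as an attention mechanism. One limitation is that the method does not take time into account; intuitively, more recent clicks should have a larger influence on the current prediction. Finally, the click probability is predicted by another DNN.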
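A minimal sketch of this weighting scheme is shown below. It assumes the attention DNN takes the concatenation of a clicked-news vector and the candidate vector, that the scores are normalized with a softmax, and that the final DNN ends in a sigmoid. The layer sizes and random weights are placeholders, not values from the paper.

```python
import numpy as np

def mlp(x, W1, b1, w2, b2):
    """One hidden ReLU layer followed by a scalar output per input row."""
    hidden = np.maximum(0.0, x @ W1 + b1)
    return hidden @ w2 + b2

def softmax(z):
    e = np.exp(z - np.max(z))  # subtract the max for numerical stability
    return e / e.sum()

def predict_click(history, candidate, attn_params, pred_params):
    # Pair every clicked-news vector with the candidate: shape (N, 2m).
    cand = np.broadcast_to(candidate, history.shape)
    pairs = np.concatenate([history, cand], axis=1)
    weights = softmax(mlp(pairs, *attn_params))  # one weight per click
    user = weights @ history                     # weighted user embedding (m,)
    logit = mlp(np.concatenate([user, candidate]), *pred_params)
    return weights, float(1.0 / (1.0 + np.exp(-logit)))

rng = np.random.default_rng(0)
N, m, hdim = 5, 3, 4  # clicks in history, news vector dim, hidden units

def init_params():
    return (rng.normal(size=(2 * m, hdim)), np.zeros(hdim),
            rng.normal(size=hdim), 0.0)

history = rng.normal(size=(N, m))  # stand-ins for KCNN outputs of clicked news
candidate = rng.normal(size=m)     # stand-in for KCNN output of candidate news
weights, prob = predict_click(history, candidate, init_params(), init_params())
```

Because the weights depend on the candidate, the same click history yields a different user vector for each candidate news item. Nothing in this computation uses click order or timestamps, which is the limitation noted above.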

