[Paper Reading Notes] 2019_WSDM_Session-Based Social Recommendation via Dynamic Graph Attention Networks

Paper link: https://doi.org/10.1145/3289600.3290989
Venue: WSDM
Publish time: 2019
Authors and affiliations:

  • Weiping Song School of EECS, Peking University songweiping@pku.edu.cn
  • Zhiping Xiao School of EECS, UC Berkeley patricia.xiao@berkeley.edu
  • Yifan Wang School of EECS, Peking University yifanwang@pku.edu.cn
  • Laurent Charlin Mila & HEC Montreal laurent.charlin@hec.ca
  • Ming Zhang∗ School of EECS, Peking University mzhang_cs@pku.edu.cn
  • Jian Tang∗ Mila & HEC Montreal jian.tang@hec.ca

Datasets (as introduced in the paper):

  • Douban http://www.douban.com
  • Delicious https://grouplens.org/datasets/hetrec-2011/
  • Yelp https://www.yelp.com/dataset

Code:

  • (not released by the authors)

Other:

    Write-ups by others:

    • 论文《Session-based Social Recommendation via Dynamic Graph Attention Networks》阅读

    Key novelties in brief: (1) Session-based models almost always use an RNN (other papers also try LSTM, GRU, ...).
    (2) Across sessions, a user's current interest changes (dynamic), and her friends' influence changes as well (context-dependent social influences), so an attention mechanism is applied.
    (3) Friends' influence is further split into short-term and long-term preferences, which are concatenated and passed through a ReLU.

    • We propose a model based on graph convolutional networks for session-based social recommendation in online communities.
      • Our model first learns individual user representations by modeling the users’ current interests.
      • Each user’s representation is then aggregated with her friends’ representations using a graph convolutional network with a novel attention mechanism.
      • The combined representation, along with the user’s original representation, is then used to form item recommendations.
    • Training details: the Adam optimizer and dropout are used.

    ABSTRACT

    • (1) Online communities such as Facebook and Twitter are enormously popular and have become an essential part of the daily life of many of their users. Through these platforms, users can discover and create information that others will then consume. In that context, recommending relevant information to users becomes critical for viability. However, recommendation in online communities is a challenging problem:

      • (1) users’ interests are dynamic, and
      • (2) users are influenced by their friends.
      • Moreover, the influencers may be context-dependent.
      • That is, different friends may be relied upon for different topics.
      • Modeling both signals is therefore essential for recommendations.
    • (2) We propose a recommender system for online communities based on a dynamic-graph-attention neural network.

      • We model dynamic user behaviors with a recurrent neural network,
      • and context-dependent social influence with a graph-attention neural network, which dynamically infers the influencers based on users’ current interests.
      • The whole model can be efficiently fit on large-scale data.
      • Experimental results on several real-world data sets demonstrate the effectiveness of our proposed approach over several competitive baselines including state-of-the-art models.

    CCS CONCEPTS

    • Information systems → Social recommendation; • Computing methodologies → Ranking; Learning latent representations;

    KEYWORDS

    Dynamic interests; social network; graph convolutional networks; session-based recommendation

    1 INTRODUCTION

    • (1) Online social communities are an essential part of today’s online experience. Platforms such as Facebook, Twitter, and Douban enable users to create and share information as well as consume the information created by others. Recommender systems for these platforms are therefore critical to surface information of interest to users and to improve long-term user engagement. However, online communities come with extra challenges for recommender systems.

    • (2) First, user interests are dynamic by nature. A user may be interested in sports items for a period of time and then search for new music groups.

      • Second, since online communities often promote sharing information among friends, users are also likely to be influenced by their friends. For instance, a user looking for a movie may be influenced by what her friends have liked.
      • Further, the set of influencers can be dynamic since they can be context-dependent. For instance, a user will trust a set of friends who like comedies when searching for funny films, while she could be influenced by another set of friends when searching for action movies.
    • (2) Motivating Example. Figure 1 presents the behavior of Alice and her friends in an online community. Behaviors are described by a sequence of actions (e.g., item clicks). To capture users’ dynamic interests, their actions are segmented into sub-sequences denoted as sessions. We are therefore interested in session-based recommendation [28]: within each session, we recommend the next item Alice should consume based on the items in the current session she has consumed so far. Figure 1 presents two sessions: session (a) and (b). In addition, the items consumed by Alice’s friends are also available, and we would like to utilize them for better recommendations. We are thus in a session-based social recommendation setting.

    • (3) In session (a), Alice browses sports items. Two of her friends, Bob and Eva, are notorious sports fans (long-term interests), and they have been browsing sports items recently (short-term interests). Considering both facts, Alice may be influenced by the two and, e.g., decide to learn more about Ping Pong next. In session (b), Alice is interested in “literature & art” items. The situation differs from session (a) since none of her friends have consumed such items recently, but David is generally interested in this topic (long-term interests). In this case, it would make sense for Alice to be influenced by David and, say, be recommended a book that David enjoyed. These examples show how a user’s current interests, combined with the (short- and long-term) interests of different friends, provide session-based social recommendations. In this paper, we present a recommendation model based on both.

    • (4) The current recommendation literature has modeled either users’ dynamic interests or their social influences but, as far as we know, has never combined both (as in the example above).

      • A recent study [13] (“Session-based Recommendations with Recurrent Neural Networks”, ICLR 2016) models session-level user behaviors using recurrent neural networks, ignoring social influences.
      • Others studied merely social influences [4, 24, 41].
        • For example, Ma et al. [24] explore the social influence of friends’ long-term preferences on recommendations. However, the influences from different users are static, not reflecting the users’ current interests.
    • (5) We propose an approach to model both users’ session-based interests and dynamic social influences.

      • That is, which subset of a user’s friends influence her (the influencers) according to her current session.
      • Our recommendation model is based on dynamic-graph-attention networks.
      • Our approach first models user behaviors within a session using a recurrent neural network (RNN) [7].
      • According to the user’s current interests—captured by the hidden representation of the RNN—we capture the influences of friends using a graph-attention network [33].
      • To provide session-level recommendations, we distinguish the model of friends’ short-term preferences from that of their long-term ones.
      • The influence of each friend, given the user’s current interests, is then determined automatically using an attention mechanism [1, 40].
    • (6) We conduct extensive experiments on data sets collected from several online communities (Douban, Delicious, and Yelp). Our proposed approach outperforms well-known competitive baselines by modeling both users’ dynamic behaviors and dynamic social influences.

    • (7) To summarize, we make the following contributions:

      • We propose to study both dynamic user interests and context-dependent social influences for recommendation in online communities.
      • We propose a novel recommendation approach based on dynamic-graph-attention networks for modeling both dynamic user interests and context-dependent social influences. The approach can effectively scale to large data sets.
      • We conduct extensive experiments on real-world data sets. Experimental results demonstrate the effectiveness of our model over strong and state-of-the-art baselines.
    • (8) Organization. §2 discusses related work. In §3 we give a formal definition of the session-based social recommendation problem. Our session-based social recommendation approach is described in §4. §5 presents the experimental results, followed by concluding remarks in §6.

    2 RELATED WORK

    • We discuss three lines of research relevant to our work:

      • (1) recommender systems that model dynamic user behaviors,
      • (2) social recommender systems that take social influence into consideration, and
      • (3) recent progress on convolutional networks developed for graph-structured data.

    2.1 Dynamic Recommendation

    • (1) Modeling user interests that change over time has already received some attention [5, 19, 39]. Most of these models are based on (Gaussian) matrix factorization [26].

      • For example, Xiong et al. [39] learned temporal representations by factorizing the (user, item, time) tensor.
      • Koren [19] developed a similar model named timeSVD++.
      • Charlin et al. [5] modeled similarly but using Poisson factorization [10].
    • However, these approaches assume that the interests of users change slowly and smoothly over long-term horizons, typically on the order of months or years.

      • To effectively capture users’ short-term interests, recent works introduce RNNs to model their recent (ordered) behaviors.
      • For example,
        • Hidasi et al. [13] first proposed Session-RNN to model a user’s interest within a session.
        • Li et al. [21] further extended Session-RNN with an attention mechanism to capture a user’s local and global interests.
        • Wu et al. [37] used two separate RNNs to update the representations of both users and items based on new observations.
        • Beutel et al. [2] built an RNN-based recommender that can incorporate auxiliary context information.
      • These models assume that items exhibit coherence within a period of time, and we use a similar approach to model session-based user interests.

    2.2 Social Recommendation

    • (1) Modeling the influence of friends on user interests has also received attention [15, 16, 23–25]. Most proposed models are (also) based on Gaussian or Poisson matrix factorization.

      • For example, Ma et al. [24] studied social recommendation by regularizing latent user factors such that the factors of connected users are close to each other.
      • Chaney et al. [4] weighted the contribution of friends to a user’s recommendations using a learned “trust factor”.
      • Zhao et al. [41] proposed an approach to leverage social networks for active learning.
      • Xiao et al. [38] framed the problem as transfer learning between the social domain and the recommendation domain.
      • These approaches model social influences under the assumption that influences are uniform across friends and independent of the user’s preferences.
      • Tang et al. [31] and Tang et al. [30] proposed multi-faceted trust relations, which rely on additional side information (e.g., item category) to define facets.
      • Wang et al. [35] and Wang et al. [34] distinguished strong and weak ties among users for recommendation in social networks. However, they ignore the user’s short-term behaviors and integrate context-independent social influences.
      • Our proposed approach models dynamic social influences by modeling both dynamic user interests and context-dependent social influences.

    2.3 Graph Convolutional Networks

    • (1) Graph convolutional networks (GCNs) inherit from convolutional neural networks (CNNs), which have achieved great success in computer vision and several other applications.

      • CNNs are mainly developed for data with 2-D grid structures such as images [20].
      • Recent works focus on modeling more general graph-structured data using CNNs [3, 6, 12, 18].
        • Specifically, Kipf and Welling [18] proposed graph convolutional networks (GCNs) for semi-supervised graph classification. The model learns node representations by leveraging both the node attributes and the graph structure. It is composed of multiple graph-convolutional layers, each of which updates node representations using a combination of the current node’s representation and that of its neighbors. Through this process, the dependency between nodes is captured. However, in the original formulation, all neighbors are given static “weights” when updating the node representations.
        • Velickovic et al. [33] addressed this problem by proposing graph-attention networks, which weigh the contributions of neighbors differently using an attention mechanism [1, 40].
    • (2) We propose a dynamic-graph-attention network. Compared to previous work,

      • we focus on a different application (modeling context-dependent social influences for recommendation), and
      • we model a dynamic graph, where the features of nodes evolve over time and the attention between nodes also changes along with the current context.

    3 PROBLEM DEFINITION

    • (1) Recommender systems suggest relevant items to their users according to their historical behaviors.
    • In classical recommendation models (e.g., matrix factorization [26]), the order in which a user consumes items is ignored.
    • However, in online communities, user preferences change rapidly, and the order of user preference behaviors must be considered so as to model users’ dynamic interests.
    • In practice, since a user’s entire history can be extremely long (e.g., certain online communities have existed for years) and users’ interests switch quickly, a common approach is to segment user preference behaviors into sessions (e.g., using timestamps and treating each user’s behavior within a week as one session) and provide recommendations at the session level [13]. We define this problem as follows:
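The segmentation step described above can be sketched as follows. This is a minimal illustration, not code from the paper: the one-week window and the toy event log are assumptions (the paper only says behaviors are split into sessions by timestamp, e.g., one week per session).

```python
from datetime import datetime, timedelta

def segment_into_sessions(events, window_days=7):
    """Group a user's (timestamp, item) events into weekly sessions.

    A new session starts whenever an event falls `window_days` or more
    after the start of the current session."""
    events = sorted(events)  # chronological order
    sessions, current, session_start = [], [], None
    for ts, item in events:
        if session_start is None or ts - session_start >= timedelta(days=window_days):
            if current:
                sessions.append(current)
            current, session_start = [], ts
        current.append(item)
    if current:
        sessions.append(current)
    return sessions

log = [
    (datetime(2019, 1, 1), "i1"),
    (datetime(2019, 1, 3), "i2"),
    (datetime(2019, 1, 10), "i3"),  # falls into the next weekly session
]
print(segment_into_sessions(log))  # [['i1', 'i2'], ['i3']]
```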

    DEFINITION 1. (Session-based Recommendation)

    • (1) Let $U$ denote the set of users and $I$ be the set of items.

    • Each user $u$ is associated with a set of sessions up to time step $T$, $I^u_T = \{\overrightarrow{S}^u_1, \overrightarrow{S}^u_2, \ldots, \overrightarrow{S}^u_T\}$,

      • where $\overrightarrow{S}^u_t$ is the $t$-th session of user $u$.
    • Within each session, $\overrightarrow{S}^u_t$ consists of a sequence of user behaviors $\{i^u_{t,1}, i^u_{t,2}, \ldots, i^u_{t,N_{u,t}}\}$,

      • where $i^u_{t,p}$ is the $p$-th item consumed by user $u$ in the $t$-th session,
      • and $N_{u,t}$ is the number of items in the session.
    • For each user $u$, given a new session $\overrightarrow{S}^u_{T+1} = \{i^u_{T+1,1}, \ldots, i^u_{T+1,n}\}$,
    • the goal of session-based recommendation is to recommend a set of items from $I$ that the user is likely to be interested in at the next step $n+1$, i.e., $i^u_{T+1,n+1}$.
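Concretely, the prediction targets implied by Definition 1 can be enumerated from a session as (prefix, next item) pairs. This is a small illustrative sketch, not code from the paper:

```python
def next_item_examples(session):
    """For each prefix of a session, the ground truth is the item
    consumed at the next step n + 1 (Definition 1)."""
    return [(session[:n], session[n]) for n in range(1, len(session))]

pairs = next_item_examples(["i1", "i2", "i3"])
print(pairs)  # [(['i1'], 'i2'), (['i1', 'i2'], 'i3')]
```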

    • (2) In online communities, users’ interests are not only correlated with their historical behaviors but also commonly influenced by their friends.

      • For example, if a friend watches a movie, I may also be interested in watching it. This is known as social influence [32].
      • Moreover, the influences from friends are context-dependent. In other words, the influences from friends vary from one situation to another.
      • For example, if a user wants to buy a laptop, she will more likely refer to friends who are keen on high-tech devices, while she may be influenced by photographer friends when shopping for a camera.
      • As in Figure 1, a user can be influenced by both her friends’ short- and long-term preferences.
    • (3) To provide effective recommendations to users in online communities, we propose to model both users’ dynamic interests and context-dependent social influences. We define the resulting problem as follows:

    DEFINITION 2. (Session-based Social Recommendation)

    • (1) Let $U$ denote the set of users,

      • $I$ be the set of items,
      • and $G = (U, E)$ be the social network,
        • where $E$ is the set of social links between users.
    • (2) Given a new session $\overrightarrow{S}^u_{T+1} = \{i^u_{T+1,1}, \ldots, i^u_{T+1,n}\}$ of user $u$,

      • the goal of session-based social recommendation is to recommend a set of items from $I$ that $u$ is likely to be interested in at the next time step $n+1$ by utilizing information from both

        • her dynamic interests (i.e., information from $\cup^{T+1}_{t=1} \overrightarrow{S}^u_t$)
        • and the social influences (i.e., information from $\cup^{N(u)}_{k=1} \cup^T_{t=1} \overrightarrow{S}^k_t$,

          • where $N(u)$ is the set of $u$’s friends).

    4 DYNAMIC SOCIAL RECOMMENDER SYSTEMS

    • (1) As discussed previously, users are guided not only by their current preferences but also by their friends’ preferences.

    • We propose a novel dynamic graph attention model, Dynamic Graph Recommendation (DGREC), which models both types of preferences.

    • (2) DGREC is composed of four modules (Figure 2).

      • (1) First (§4.1), a recurrent neural network (RNN) [7] models the sequence of items consumed in the (target) user’s current session.
      • (2) Her friends’ interests are modeled using a combination of their short- and long-term preferences (§4.2).
        • The short-term preferences, i.e., the items in their most recent session, are also encoded using an RNN.
        • Friends’ long-term preferences are encoded with a learned individual embedding.
      • (3) The model then combines the representation of the current user with the representations of her friends using a graph-attention network (§4.3).
        • This is a key part of our model and contribution: our proposed mechanism learns to weigh the influence of each friend based on the user’s current interests.
      • (4) At the final step (§4.4), the model produces recommendations by combining a user’s current preferences with her (context-dependent) social influences.
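The four modules can be sketched end to end as follows. This is a minimal NumPy stand-in, not the authors' implementation: the session encoder uses a plain tanh recurrence instead of the paper's LSTM, a single dot-product attention replaces the graph-attention layers, and all sizes and random embeddings are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
D, n_items = 4, 10
item_emb = rng.normal(size=(n_items, D))  # illustrative item embeddings

def encode_session(items):
    """Modules (1)/(2): encode a session with a simple recurrence."""
    h = np.zeros(D)
    for i in items:
        h = np.tanh(item_emb[i] + h)
    return h

def social_attention(h_user, friend_reprs):
    """Module (3): weigh each friend by the target user's current state."""
    scores = np.array([h_user @ f for f in friend_reprs])
    alpha = np.exp(scores - scores.max())
    alpha /= alpha.sum()
    return sum(a * f for a, f in zip(alpha, friend_reprs))

# Module (4): combine individual and social representations, score items.
h_user = encode_session([0, 3, 5])                          # current session
friends = [encode_session([1, 2]), encode_session([7, 8])]  # friends' sessions
combined = np.concatenate([h_user, social_attention(h_user, friends)])
W_out = rng.normal(size=(n_items, 2 * D))
scores = W_out @ combined
print(int(np.argmax(scores)))  # index of the top-ranked next item
```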

    4.1 Dynamic Individual Interests

    • (1) To capture a user’s rapidly-changing interests, we use an RNN to model the actions (e.g., clicks) of the (target) user in the current session.
    • (2) RNNs are standard for modeling sequences and have recently been used for modeling users’ (sequential) preference data [13].
    • (3) The RNN infers the representation of a user’s session $\overrightarrow{S}^u_{T+1} = \{i^u_{T+1,1}, \ldots, i^u_{T+1,n}\}$ token by token, by recursively combining the representation of all previous tokens with the latest token, i.e., $h_n = f(i^u_{T+1,n}, h_{n-1})$,

      • where $h_n$ represents the user’s interests
      • and $f(\cdot,\cdot)$ is a non-linear function combining both sources of information.
    • (4) In practice, the long short-term memory (LSTM) [14] unit is often used as the combination function $f(\cdot,\cdot)$, following the standard LSTM equations:

      $i_n = \sigma(W_i x_n + U_i h_{n-1} + b_i)$, $f_n = \sigma(W_f x_n + U_f h_{n-1} + b_f)$, $o_n = \sigma(W_o x_n + U_o h_{n-1} + b_o)$,
      $\tilde{c}_n = \tanh(W_c x_n + U_c h_{n-1} + b_c)$, $c_n = f_n \odot c_{n-1} + i_n \odot \tilde{c}_n$, $h_n = o_n \odot \tanh(c_n)$,

      • where $x_n$ is the embedding of item $i^u_{T+1,n}$
      • and $\sigma$ is the sigmoid function: $\sigma(x) = (1 + \exp(-x))^{-1}$.
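A minimal NumPy sketch of one LSTM step $h_n = f(x_n, h_{n-1})$. The single stacked weight layout, the sizes, and the random initialization are illustrative assumptions, not the paper's configuration:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x, h, c, W, U, b):
    """One LSTM step with gates stacked in the order [i, f, o, g].

    Shapes (illustrative): W is (4D, E), U is (4D, D), b is (4D,)."""
    D = h.shape[0]
    z = W @ x + U @ h + b
    i, f, o = sigmoid(z[:D]), sigmoid(z[D:2*D]), sigmoid(z[2*D:3*D])
    g = np.tanh(z[3*D:])
    c_new = f * c + i * g          # update the cell state
    h_new = o * np.tanh(c_new)     # hidden state = user's current interest
    return h_new, c_new

rng = np.random.default_rng(0)
E, D = 4, 3
W = rng.normal(size=(4 * D, E))
U = rng.normal(size=(4 * D, D))
b = np.zeros(4 * D)
h, c = np.zeros(D), np.zeros(D)
for x in rng.normal(size=(5, E)):  # five consumed items in the session
    h, c = lstm_step(x, h, c, W, U, b)
print(h.shape)  # (3,)
```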

    4.2 Representing Friends’ Interests

    • We consider both friends’ short- and long-term interests.

      • Short-term interests are modeled using the sequence of recently-consumed items (e.g., a friend’s latest online session).
      • Long-term interests represent a friend’s average interests and are modeled using an individual embedding.

    4.2.1 Short-term preference:

    • (1) For a target user’s current session $\overrightarrow{S}^u_{T+1}$, her friends’ short-term interests are represented using their sessions right before session $T+1$ (our model generalizes beyond a single session, but this is effective empirically).
    • Each friend $k$’s actions $\overrightarrow{S}^k_T = \{i^k_{T,1}, i^k_{T,2}, \ldots, i^k_{T,N_{k,T}}\}$ are modeled using an RNN.
    • In fact, here we reuse the RNN for modeling the target user’s session (§4.1). In other words, both RNNs share the same weights.
    • We represent friend $k$’s short-term preference $s^s_k$ by the final output (last hidden state) of the RNN.

    4.2.2 Long-term preference:

    • (1) Friends’ long-term preferences reflect their average interests. Since long-term preferences are not time-sensitive, we use a single vector to represent them.

    • Formally, friend $k$’s long-term preference $s^l_k$ is the $k$-th row of the user embedding matrix $W_u$.

    • (2) Finally, we combine friends’ short- and long-term preferences through concatenation followed by a non-linear transformation: $s_k = \mathrm{ReLU}(W_1 [s^s_k; s^l_k])$,

      • where $\mathrm{ReLU}(x) = \max(0, x)$ is a non-linear activation function
      • and $W_1$ is the transformation matrix.
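The fusion $s_k = \mathrm{ReLU}(W_1[s^s_k; s^l_k])$ in NumPy; the dimension $D = 4$ and the random values are illustrative:

```python
import numpy as np

def fuse_preferences(s_short, s_long, W1):
    """Concatenate a friend's short- and long-term representations,
    then apply a linear map followed by ReLU."""
    concat = np.concatenate([s_short, s_long])
    return np.maximum(0.0, W1 @ concat)

rng = np.random.default_rng(1)
D = 4
s_short = rng.normal(size=D)       # RNN output for the friend's last session
s_long = rng.normal(size=D)        # row of the user embedding matrix W_u
W1 = rng.normal(size=(D, 2 * D))   # transformation matrix
s_k = fuse_preferences(s_short, s_long, W1)
print(s_k.shape)         # (4,)
print((s_k >= 0).all())  # True: ReLU output is non-negative
```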

    4.3 Context-dependent Social Influences

    • (1) We described how to obtain representations of the target user (§4.1) and her friends (§4.2).

      • We now combine both into a single representation that is then used downstream (§4.4).
      • The combined representation is a mixture of the target user’s interests and her friends’ interests.
    • (2) We obtain this combined representation using a novel graph attention network.

      • First, we encode the friendship network as a graph where nodes correspond to users (i.e., target users and their friends) and edges denote friendship. In addition, each node uses its corresponding user’s representation (§4.1 & §4.2) as (dynamic) features.
      • Second, these features are propagated along the edges using a message-passing algorithm [9].
      • The main novelty of our approach lies in using an attention mechanism to weigh the features traveling along each edge.
      • A weight corresponds to the level of a friend’s influence. After a fixed number of iterations of message passing, the resulting features at the target user’s node form the combined representation.
    • (3) Below we detail how we design the node features as well as the accompanying graph-attention mechanism.

    4.3.1 Dynamic feature graph.

    • (1) For each user, we build a graph where nodes correspond to that user and her friends. (对于每个用户,我们构建一个图,其中节点对应于该用户及其朋友。)
    • For target user $u$ with $|N(u)|$ friends, the graph has $|N(u)| + 1$ nodes.
    • User $u$’s initial representation $h_n$ is used as node $u$’s features $h^{(0)}_u$ (the features are updated whenever $u$ consumes a new item in $\overrightarrow{S}^u_{T+1}$).
    • For a friend $k$, the corresponding node feature is set to $s_k$ and remains unchanged for the duration of time step $T+1$.
    • Formally, the node features are $h^{(0)}_u = h_n$ and $\{h^{(0)}_k = s_k,\ k \in N(u)\}$.
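The ego-graph initialization above can be sketched as follows (a toy illustration; the dictionary keys and 2-dimensional features are our assumptions):

```python
import numpy as np

def build_node_features(h_n, friend_reps):
    """Initial features of user u's ego graph (§4.3.1): node u gets the
    dynamic session representation h_n (refreshed whenever u consumes a
    new item), while each friend node k keeps its fixed representation
    s_k for the duration of time step T+1."""
    feats = {"u": np.asarray(h_n, dtype=float)}
    feats.update({k: np.asarray(s, dtype=float)
                  for k, s in friend_reps.items()})
    return feats

# |N(u)| = 2 friends  ->  |N(u)| + 1 = 3 nodes
feats = build_node_features([1.0, 0.0],
                            {"k1": [0.5, 0.5], "k2": [0.0, 1.0]})
```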

    4.3.2 Graph-Attention Network.

    • (1) With the node features defined as above, we then pass messages (features) to combine friends’ and the target user’s interests. (通过上面定义的节点特性,我们可以 传递消息(特性) 来结合朋友和目标用户的兴趣。)

      • This procedure is formalized as inference in a graph convolutional network [18]. (这个过程被形式化为图卷积网络中的推理[18]。)
    • (2) Kipf and Welling [18] introduce graph convolutional networks for semi-supervised node representation learning. In these networks, the convolutional layers “pass” information between nodes. The number of layers $L$ of the network corresponds to the number of iterations of message passing. (Kipf和Welling[18]介绍了用于 半监督节点表示学习图卷积网络。网络的层数$L$对应于消息传递的迭代次数。)

      • In practice, we propagate information on a graph that also contains higher-order relationships (e.g., friends of friends of friends). In the $l$-th layer of the network, the target user then receives information from users that are $l$ degrees away.
      • However, all neighbors are treated equally. (然而,所有邻居都受到平等对待。)
      • Instead, we propose a novel dynamic graph attention network to model context-dependent social influences. (相反,我们提出了一种新的动态图形注意网络来模拟上下文相关的社会影响。)

        Figure 3: Graphical model of a single convolutional layer with the attention mechanism, where the output, conditioned on the current interests, is interpreted as context-dependent social influence.
    • (3) The fixed symmetric normalized Laplacian is widely used as a propagation strategy in existing graph convolutional networks [6, 18]. (在现有的图卷积网络中,固定对称归一化拉普拉斯算子 被广泛用作一种 传播策略 [6,18]。)

      • In order to distinguish the influence of each friend, we must break the static propagation schema first. (为了区分每个朋友的影响,我们必须首先打破 静态传播模式。)
      • We propose to use an attention mechanism to guide the influence propagation. The process is illustrated in Figure 3. We first calculate the similarity between the target user’s node representation $h^{(l)}_u$ and all of its neighbors’ representations $h^{(l)}_k$, normalized with a softmax: $\alpha^{(l)}_{uk} = \frac{\exp(f(h^{(l)}_u, h^{(l)}_k))}{\sum_{k'} \exp(f(h^{(l)}_u, h^{(l)}_{k'}))}$,
      • where $h^{(l)}_u$ is the representation of node/user $u$ at layer $l$,
      • and $f(h^{(l)}_u, h^{(l)}_k) = {h^{(l)}_u}^\top h^{(l)}_k$ is the similarity function between two elements.
    • Intuitively, $\alpha^{(l)}_{uk}$ is the level of influence or weight of friend $k$ on user $u$ (conditioned on the current context $h^{(l)}_u$).

    • Note that we also include a self-connection edge to preserve a user’s revealed interests. The weights $\alpha^{(l)}_u$ then combine the features: $\tilde{h}^{(l)}_u = \sum_{k \in N(u) \cup \{u\}} \alpha^{(l)}_{uk} h^{(l)}_k$,

    • where $\tilde{h}^{(l)}_u$ is a mixture of user $u$’s friends’ interests at layer $l$, followed by a non-linear transformation: $h^{(l+1)}_u = ReLU(W^{(l)} \tilde{h}^{(l)}_u)$.

      • $W^{(l)}$ is the shared, learnable weight matrix at layer $l$.
      • We obtain the final representation of each node by stacking this attention layer $L$ times.
      • The combined (social-influenced) representation is denoted $h^{(L)}_u$.
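One attention layer can be sketched like this (a simplified single-node view, assuming dot-product similarity and an identity weight matrix for the demo; the real model batches this over sampled neighbors):

```python
import numpy as np

def attention_layer(h_u, neighbors, W):
    """One dynamic-graph-attention layer (§4.3.2): dot-product
    similarity f(h_u, h_k) = h_u^T h_k between the target node and
    each neighbor (including a self-connection), softmax to obtain the
    influence weights alpha, weighted mixture h~, then ReLU(W h~)."""
    nodes = [np.asarray(h_u, float)] + [np.asarray(h, float) for h in neighbors]
    sims = np.array([nodes[0] @ h_k for h_k in nodes])  # self-connection first
    sims -= sims.max()                                  # numerical stability
    alpha = np.exp(sims) / np.exp(sims).sum()           # context-dependent weights
    h_tilde = sum(a * h for a, h in zip(alpha, nodes))  # mixture of interests
    return np.maximum(0.0, W @ h_tilde), alpha

h_next, alpha = attention_layer([1.0, 0.0],
                                [[0.9, 0.1], [0.0, 1.0]],
                                np.eye(2))
```

The neighbor whose features align better with the target’s current interest ([0.9, 0.1] vs. [0.0, 1.0] here) receives the larger weight.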

    4.4 Recommendation

    • (1) Since a user’s interest depends on both her recent behaviors and social influences, her final representation is obtained by combining them using a fully-connected layer: $\hat{h}_n = W_2 [h_n; h^{(L)}_u]$,

      • where $W_2$ is a linear transformation matrix,
      • and $\hat{h}_n$ is the final representation of user $u$’s current interest.
    • (2) We then obtain the probability that the next item will be $y$ using a softmax function: $p(y) = \frac{\exp(z_y^\top \hat{h}_n)}{\sum_{j=1}^{|I|} \exp(z_j^\top \hat{h}_n)}$,

      • where $N(u)$ is user $u$’s set of friends according to the social network $G$,
      • $z_y$ is the embedding of item $y$,
      • and $|I|$ is the total number of items.
    • (3) We also tested our model with two popular context-independent propagation strategies that do not use an attention mechanism: (我们还用两种流行的不使用注意机制的上下文无关传播策略测试了我们的模型)

      • (a) averaging friends’ interests; and (平均朋友的兴趣;以及)
      • (b) element-wise max-pooling over their interests —similar to techniques for aggregating word-level embeddings [36]. (元素级的最大兴趣池化,类似于 聚合单词级嵌入 的技术[36]。)
      • Mean aggregation outperforms the latter, but both are inferior to our proposed attention model. (平均聚集优于后者,但两者都不如我们提出的注意模型。)
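The recommendation step, combining the individual and social representations and scoring all items with a softmax, can be sketched as follows (dimensions, the random weights, and the item-embedding matrix are illustrative assumptions):

```python
import numpy as np

def recommend(h_n, h_social, W2, item_emb):
    """Final representation h^ = W2 [h_n ; h_u^(L)], then a softmax over
    all |I| item embeddings z_y, yielding p(next item = y)."""
    h_hat = W2 @ np.concatenate([h_n, h_social])
    logits = item_emb @ h_hat                 # z_y^T h^ for every item y
    logits -= logits.max()                    # numerical stability
    return np.exp(logits) / np.exp(logits).sum()

rng = np.random.default_rng(1)
p = recommend(rng.normal(size=3), rng.normal(size=3),
              rng.normal(size=(3, 6)),        # W2: [d, 2d]
              rng.normal(size=(10, 3)))       # 10 toy items
```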

    4.5 Training

    • (1) We train the model by maximizing the log-likelihood of the observed items in all user sessions: (我们通过最大化所有用户会话中观察项目的对数似然来训练模型)
    • (2) This function is optimized using gradient descent. (该函数使用 梯度下降法进行优化。)
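The objective can be illustrated as a negative log-likelihood over sessions (a sketch: `prob_seqs` holds, per session, the model probability assigned to each observed next item; the name is ours):

```python
import numpy as np

def session_nll(prob_seqs):
    """Negative log-likelihood of the observed items over all user
    sessions; training maximizes the log-likelihood, i.e. minimizes
    this quantity by gradient descent."""
    return -sum(np.log(p) for probs in prob_seqs for p in probs)

loss = session_nll([[0.5, 0.25], [0.125]])   # two toy sessions
```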

    5 EXPERIMENTS

    • (1) Studying the effectiveness of our DGRec using real-world data sets, we highlight the following results: (通过使用真实数据集研究DGRec的有效性,我们强调了以下结果)

      • DGRec significantly outperforms all seven methods that it is compared to under all experimental settings. (DGRec在所有实验设置下的表现都明显优于所有七种方法。)
      • Ablation studies demonstrate the usefulness of the different components of DGRec. (消融研究证明了DGRec不同成分的有用性。)
      • Exploring the fitted models shows that attention contextually weighs the influences of friends. (探索拟合模型表明,注意力在语境中衡量朋友的影响。)

    5.1 Experimental Setup

    5.1.1 Data Sets.

    • (1) We study all models using data collected from three well-known online communities. Descriptive statistics for all data sets are in Table 1. (我们使用从三个著名在线社区收集的数据研究所有模型。所有数据集的描述性统计数据见表1。)

    • (2) Douban. A popular site on which users can review movies, music, and books they consume. (豆瓣。一个受欢迎的网站,用户可以在上面查看他们消费的电影、音乐和书籍。)

      • We crawled the data using the identities of the users in the movie community, obtaining every movie they reviewed along with associated timestamps. (我们使用电影社区中用户的身份对数据进行了爬网,获得了他们观看的每一部电影以及相关的时间戳。)
      • We also crawled the users’ social networks. (我们还对用户的社交网络进行了爬取。)
      • We construct our data set by using each review as an evidence that a user consumed an item. ( 我们通过使用每次评论作为用户消费物品的证据 来构建数据集。)
      • Users tend to be highly active on Douban so we segment users’ behaviors (movie consumption) into week-long sessions. (用户在豆瓣上往往非常活跃, 所以我们将用户的行为(电影消费)分为为期一周的时段。 )
    • (3) Delicious. An online bookmarking system where users can store, share, and discover web bookmarks and assign them a variety of semantic tags. (一个在线书签系统,用户可以在其中存储、共享和发现网络书签,并为它们分配各种语义标记。)

      • The task we consider is personalized tag recommendation for bookmarks. (我们考虑的任务是书签的个性化标签推荐。)
      • Each session is a sequence of tags a user has assigned to a bookmark (tagging actions are timestamped). (每个会话都是用户分配给书签的一系列标记(标记操作带有时间戳)。)
      • This differs from the ordinary definition of sessions as a sequence of consumptions over a short horizon. (这与会话的一般定义不同,会话是短期内的消费序列。)
    • (4) Yelp. An online review system where users review local businesses (e.g., restaurants and shops). (一个在线评价系统,用户可以在该系统中评价当地企业(如餐厅和商店)。)

      • Similar to Douban, we treat each review as an observation. (与豆瓣相似,我们将每次评价视为一次观察。)
      • Based on the empirical frequency of the reviews, we segment the data into month-long sessions. (根据评论的经验频率,我们将数据分为为期一个月的会话。)
    • (5) We also tried different segmentation strategies. (我们还尝试了不同的细分策略。)

      • Preliminary results showed that our method consistently outperformed RNN-Session and NARM for other session lengths. (初步结果表明,对于其他会话长度,我们的方法始终优于RNN-Session和NARM。)
      • We leave a systematic study of session-segmentation strategies as future work. (我们将优化会话分割的系统研究留作未来的工作。)

    5.1.2 Train/valid/test splits.

    • (1) We reserve the sessions of the last $d$ days for testing and filter out items that did not appear in the training set. (我们保留最后$d$天的会话用于测试,并过滤掉未出现在训练集中的项目。)
    • Due to the different sparseness of the three data sets, we choose $d$ = 180, 50, and 25 for the Douban, Yelp, and Delicious data sets, respectively. (由于这三个数据集的稀疏性不同,我们分别为豆瓣、Yelp和Delicious数据集选择$d$=180、50和25。)
    • We randomly and equally split the held-out sessions into validation and test sets. (我们随机平均地将保留的会话分为验证集和测试集。)
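The split protocol can be sketched with stdlib tools (our own sketch of the procedure described above, not the authors' code; `sessions` is assumed to be a list of `(session_date, items)` pairs):

```python
from datetime import date, timedelta
import random

def split_sessions(sessions, d, seed=0):
    """Hold out the sessions of the last d days (measured back from the
    latest date in the data), then randomly and equally split the
    held-out sessions into validation and test sets."""
    latest = max(s_date for s_date, _ in sessions)
    cutoff = latest - timedelta(days=d)
    train = [s for s in sessions if s[0] <= cutoff]
    held = [s for s in sessions if s[0] > cutoff]
    random.Random(seed).shuffle(held)
    half = len(held) // 2
    return train, held[:half], held[half:]

# 100 daily toy sessions, last 25 days held out
sessions = [(date(2018, 1, 1) + timedelta(days=i), [i]) for i in range(100)]
train, valid, test = split_sessions(sessions, d=25)
```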

    5.1.3 Competing Models.

    • (1) We compare DGREC to three classes of recommenders:

      • (A) classical methods that utilize neither social nor temporal factors;
      • (B) social recommenders, which take context-independent social influences into consideration; and
      • (C ) session-based recommendation methods, which model user interests in sessions. (Below, we indicate a model’s class next to its name.)
    • ItemKNN [22] (A): inspired by the classic KNN model, it looks for items that are similar to items liked by a user in the past. (受经典KNN模型的启发,它寻找与用户过去喜欢的物品相似的物品。)

    • BPR-MF [27] (A): matrix factorization (MF) technique trained using a ranking objective as opposed to a regression objective. (矩阵分解(MF)技术使用排序目标而不是回归目标进行训练。)

    • SoReg [24] (B): uses the social network to regularize the latent user factors of matrix factorization. (使用社交网络来正则化矩阵分解的 潜在用户因素。)

    • SBPR [41] (B): an approach for social recommendations based on BPR-MF. The social network is used to provide additional training samples for matrix factorization. (基于BPR-MF的社会推荐方法。社交网络用于为矩阵分解提供额外的训练样本。)

    • TranSIV [38] (B): uses shared latent factors to transfer the learned information from the social domain to the recommendation domain. (使用共享的潜在因素将学习到的信息从社交领域转移到推荐领域。)

    • RNN-Session [13] (C ): recent state-of-the-art approach that uses recurrent neural networks for session-based recommendations. (最近最先进的方法,使用 循环神经网络 进行 基于会话 的推荐。)

    • NARM [21] (C ): a hybrid model of both session-level preferences and the user’s “main purpose”, where the main purpose is obtained via attending on previous behaviors within the session. ( 会话级偏好用户“主要目的” 的混合模型,其中主要目的是通过参与会话中以前的行为来实现的。)

    5.1.4 Evaluation Metrics.

    • (1) We evaluate all models with two widely used ranking-based metrics: (我们使用两个广泛使用的基于排名的指标评估所有模型:)

      • Recall@K and
      • Normalized Discounted Cumulative Gain (NDCG).
    • (2) Recall@K measures the proportion of the top-K recommended items that are in the evaluation set. (测量评估集中top-K推荐项目的比例。)

      • We use K = 20.
    • (3) NDCG is a standard ranking metric. (NDCG是一个标准的排名指标。)

      • In the context of session-based recommendation, it is formulated as: $NDCG = \frac{1}{\log_2(1 + rank_{pos})}$, (在基于会话的推荐中,其表述如下)

        • where $rank_{pos}$ denotes the rank of the positive item. (表示正项的排名)
      • We report the average value of NDCG over all the testing examples. (我们报告了所有测试示例中NDCG的平均值。)
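Both metrics are straightforward to compute; a minimal sketch (function names are ours):

```python
import math

def recall_at_k(ranked, positives, k=20):
    """Recall@K: proportion of the evaluation items that appear in the
    top-K recommended list."""
    return len(set(ranked[:k]) & set(positives)) / len(positives)

def ndcg_single(ranked, positive):
    """NDCG for one test example: 1 / log2(1 + rank_pos), ranks
    starting at 1; averaged over all test examples in the paper."""
    rank_pos = ranked.index(positive) + 1
    return 1.0 / math.log2(1 + rank_pos)

ranked = list(range(100))                     # items ranked best to worst
r = recall_at_k(ranked, positives=[3, 50], k=20)
n = ndcg_single(ranked, positive=0)
```

A positive item ranked first yields NDCG = 1; the score decays logarithmically with rank.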

    5.1.5 Hyper-parameter Settings.

    • (1) For RNN-Session, NARM and our models, we use a batch size of 200. (对于RNN会话、NARM和我们的模型,我们使用200的批量。)
    • (2) We use Adam [17] for optimization due to its effectiveness, with $\beta_1 = 0.9$, $\beta_2 = 0.999$, and $\epsilon = 10^{-8}$, as suggested in TensorFlow [8]. (由于Adam[17]的有效性,我们使用它进行优化)
    • The initial learning rate is empirically set to 0.002 (根据经验,初始学习率设置为0.002)
    • and decayed at the rate of 0.98 every 400 steps. (以每400步0.98的速度衰减。)
    • For all models, the dimensions of the user (when needed) and item representations are fixed to 100 following Hidasi et al. [13]. (对于所有模型,用户尺寸(需要时)和项目表示固定为100,如下Hidasi等人[13]。)
    • We cross-validated the number of hidden units of the LSTMs and the performance plateaued around 100 hidden units. (我们交叉验证了LSTM的隐藏单元数量,性能稳定在大约100个隐藏单元。)
    • The neighborhood sample sizes are empirically set to 10 and 15 in the first and second convolutional layers, respectively. (根据经验,在第一和第二卷积层中,邻域样本大小分别设置为10和15。)
    • We tried to use more friends in each layer but observed no significant improvement. (我们试图在每一层中使用更多的朋友,但没有观察到明显的改善。)
    • In our models, dropout [29] with rate 0.2 is used to avoid overfitting. (在我们的模型中,使用率为0.2的dropout[29]来避免过度拟合。)

    5.1.6 Implementation Details.

    • (1) We implement our model using TensorFlow [8]. (我们使用TensorFlow[8]实现了我们的模型。)
    • (2) Training graph attention networks on our data with mini-batch gradient descent is not trivial since node degrees have a large range. (使用 小批量梯度下降法 在我们的数据上训练 图注意网络 并非易事,因为节点度有很大的范围。)
    • We found the neighbor sampling technique proposed in [11] pretty effective. (我们发现[11]中提出的邻域采样技术非常有效)
    • Further, to reasonably reduce the computational cost of training DGREC, we represent friends’ short-term interests using only their most recent sessions. (此外,为了合理降低训练DGREC的计算成本,我们仅使用朋友最近的会话来代表他们的短期兴趣。)
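The fixed-size neighbor sampling from [11] can be sketched as follows (a simplification; the with-replacement fallback for low-degree users is our assumption, and the budgets 10/15 come from §5.1.5):

```python
import random

def sample_neighbors(friends, size, seed=0):
    """Cap the number of friends aggregated per convolutional layer
    (10 and 15 in the experiments) so that mini-batch cost stays
    bounded on high-degree nodes; sample with replacement when a user
    has fewer friends than the budget."""
    rng = random.Random(seed)
    if len(friends) >= size:
        return rng.sample(friends, size)        # without replacement
    return [rng.choice(friends) for _ in range(size)]

few = sample_neighbors(["a", "b", "c"], size=10)   # low-degree user
many = sample_neighbors(list(range(100)), size=10) # high-degree user
```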

    5.2 Quantitative Results (定量结果)

    • (1) The performance of different algorithms is summarized in Table 2. (表2总结了不同算法的性能)
    • ItemKNN and BPR-MF perform very similarly, except on Douban. (ItemKNN和BPR-MF的表现非常相似,除了豆瓣。)
    • A particularity of Douban is that users typically only consume each item once (different from Delicious and Yelp). (豆瓣的一个特殊性是,用户通常只消费一次每一个项目(不同于美味和Yelp)。)
    • MF-based methods tend to recommend previously consumed items, which explains BPR-MF’s poor performance. (基于MF的方法倾向于推荐以前消费过的产品,这解释了BPR-MF的糟糕表现。)
    • By modeling social influence, the performance of social recommenders improves compared to BPR-MF in most cases. (通过对社会影响进行建模,在大多数情况下,与BPR-MF相比,社会推荐人的绩效有所提高。)
    • However, the improvement is marginal because these three algorithms (B) only model context-independent social influence. (然而,由于这三种算法(B)只模拟了与上下文无关的社会影响,因此改进是微不足道的。)
    • By modeling dynamic user interests, RNN-Session significantly outperforms ItemKNN and BPR, which is consistent with the results in Hidasi et al. [13]. (通过对动态用户兴趣进行建模,RNN会话显著优于ItemKNN和BPR,这与Hidasi等人[13]的结果一致。)
    • Further, NARM extends RNN-Session by explicitly modeling user’s main purpose and becomes the strongest baseline. (此外,NARM通过明确建模用户的主要目的来扩展RNN会话,并成为最强大的基线。)
    • Our proposed model DGREC achieves the best performance among all the algorithms by modeling both user’s dynamic interests and context-dependent social influences. (我们提出的模型DGREC通过建模用户的动态兴趣 和 上下文相关的社会影响,在所有算法中取得了最好的性能。)
    • Besides, the improvement over RNN-Session and NARM is more significant compared to that of SoReg over BPR-MF, which shows the necessity of modeling context-dependent social influences. (此外,相对于BPR-MF,RNN会话和NARM的改进比SoReg更为显著,这表明了建模依赖于上下文的社会影响的必要性。)

    5.3 Variations of DGREC

    To justify and gain further insights into the specifics of DGREC’s architecture, we now study and compare variations of our model. (为了证明DGREC架构的合理性并进一步深入了解其细节,我们现在研究并比较我们模型的各种变化。)

    5.3.1 Self v.s. Social.

    • (1) DGREC obtains users’ final preferences as a combination of user’s consumed items in the current session and context-dependent social influences (see Eq. 8). (DGREC将用户在 当前会话中消费的物品上下文相关的社会影响 结合起来,获得用户的最终偏好(见等式8)。)
    • To tease apart the contribution of both sources of information, we compare DGREC against two submodels: (为了区分这两种信息来源的贡献,我们将DGREC与两个子模型进行比较)
      • (a) (DGRECself) a model of the user’s current session only (Eq. 8 without the social-influence features $h^{(L)}_u$); and ((DGRECself)仅用户当前会话的模型(等式8,无社会影响特征))
      • (b) (DGRECsocial) a model using context-dependent social-influence features only (Eq. 8 without the individual features $h_n$). ((DGRECsocial)仅使用上下文相关的社会影响特征的模型(公式8,不含个体特征))
    • Note that when using individual features only, DGRECself is identical to RNN-Session (hence the results are reproduced from Table 2). (请注意,当仅使用个体特征时,DGRECself与RNN会话相同(因此,结果从表2中复制)。)
    • Table 3 reports the performance of all three models on our data sets. (表3报告了我们数据集上所有三种模型的性能。)
      • DGRECself consistently outperforms DGRECsocial across all three data sets, which means that overall users’ individual interests have a higher impact on recommendation quality. (DGRECself在所有三个数据集上都始终优于DGRECsocial,这意味着总体用户的个人兴趣对推荐质量有更高的影响。)
      • Compared to the full model DGREC, the performance of both DGRECself and DGRECsocial significantly decreases. (与全模型DGREC相比,DGRECself和DGRECsocial的性能都显著降低。)
    • To achieve good recommendation performance in online communities, it is, therefore, crucial to model both a user’s current interests as well as her (dynamic) social influences. (因此,为了在在线社区中实现良好的推荐性能,对用户当前的兴趣以及她的(动态)社会影响进行建模至关重要。)

    5.3.2 Short-term v.s. Long-term.

    • (1) DGREC provides a mechanism for encoding friends’ short- as well as long-term interests (see § 4.2). (DGREC提供了一种编码 朋友短期和 长期兴趣的机制(见§4.2)。)
    • We study the impact of each on the model’s performance. Similar to above, we compare using either short- or long-term interests to the results of using both. (我们研究了每种方法对模型性能的影响。与上面类似,我们将使用短期或长期利益与使用两者的结果进行比较。)
    • Figure 4 reports that, for Douban, the predictive capability of friends’ short-term interests drastically outperforms that of friends’ long-term interests, and shows comparable performance with regard to the full model. (图4显示,豆瓣对朋友短期兴趣的预测能力显著优于朋友长期兴趣的预测能力,并显示出与完整模型相当的性能。)
      • It is reasonable, considering that the interests of users in online communities (e.g., Douban) change frequently, and exploiting users’ short-term interests should be able to predict user behaviors more quickly. (考虑到在线社区(如豆瓣)中用户的兴趣经常变化,利用用户的短期兴趣应该能够更快地预测用户行为,这是合理的。)
    • Interestingly, on the data set Delicious, different results are observed. Using long-term interests yield more accurate predictions than doing short-term. (有趣的是,在数据集Delicious上,观察到了不同的结果。使用长期利益比使用短期利益产生更准确的预测。)
      • This is not surprising since, on Delicious website, users tend to have static interests. (这并不奇怪,因为在Delicious网站上,用户往往有静态兴趣)

    5.3.3 Number of Convolutional Layers.

    • (1) DGREC aggregates friends’ interests using a multi-layer graph convolutional network. (DGREC使用多层图卷积网络聚合朋友的兴趣。)

    • More convolutional layers will yield influences from higher-order friends. (更多的卷积层将产生来自高阶朋友的影响。)

    • In our study so far we have used two-layer graph convolutional networks. (到目前为止,在我们的研究中,我们使用了两层图卷积网络。)

    • To validate this choice we compare the performance to one- and three-layer networks but maintain the number of selected friends to 10 and 5 in the first and third layer, respectively. (为了验证这一选择,我们将性能与一层和三层网络进行比较,但在第一层和第三层,选择的朋友数分别保持在10和5。)

    • Table 4 shows a significant decline in performance when using a single layer. This implies that the interests of friends’ friends (obtained by 2 layers) is important for recommendations. (表4显示了使用单层时性能的显著下降。这意味着朋友的朋友的兴趣(通过两层获得)对于推荐很重要。)

    • (2) Next, we test our model using three convolutional layers to explore the influences of even higher-order friends. The influence of the third layer on the performance is small. (接下来,我们使用三个卷积层来测试我们的模型,以探索更高阶朋友的影响。第三层对性能的影响很小。)

      • There is a small improvement for Yelp (Yelp有一个小小的提升)
      • but a slightly larger drop in performance for both Douban and Delicious, (但豆瓣和Delicious的表现都略有下降,)
      • which may be attributed to model overfitting or noises introduced by higher-order friends. (这可能是由于模型过度拟合或高阶朋友引入的噪声造成的。)
      • This confirms that two convolutional layers are enough for our data sets. (这证实了对于我们的数据集来说,两个卷积层就足够了。)

    5.4 Exploring Attention

    • (1) DGREC uses an attention mechanism to weigh the contribution of different friends based on a user’s current session. (DGREC使用一种注意力机制,根据用户当前会话来衡量不同朋友的贡献。)

    • (2) We hypothesized that while friends have varying interests, user session typically only explores a subset of these interests. As a consequence, for a target user, different subsets of her friends should be relied upon in different situations. We now explore the results of the attention learned by our model. (我们假设,虽然朋友有不同的兴趣,但用户会话通常只探索这些兴趣的一个子集。因此,对于目标用户,在不同的情况下应该依赖其朋友的不同子集。现在,我们探索通过我们的模型学习注意力的结果。)

    • (3) First, we randomly select a Douban user from those who have at least 5 test sessions as well as 5 friends and plot her attention weights (Eq. 6) within and across session(s) in Figure 5. (首先,我们从至少有5次测试的人和5个朋友中随机选择一个豆瓣用户,并在图5中绘制出她在测试过程中的注意力权重(等式6)。)

      • For the inter-session level plot (left), we plot the average attention weight of a friend within a session. (对于会话间水平图(左),我们绘制了一个会话中朋友的平均注意力权重)
      • For intra-session level plot (right), the user’s attention weights within one session (i.e., SessionId=7) are presented. (对于会话内级别图(右),显示了一个会话内用户的注意力权重(即SessionId=7)。)
      • We make the following observations. (我们做以下观察。)
        • First, the user allocates her attention to different friends across different sessions. (首先,用户在不同的会话中将注意力分配给不同的朋友。)

          • This indicates that social influence is indeed conditioned on context (i.e., target user’s current interests). (这表明社会影响确实取决于背景(即目标用户当前的兴趣)。)
          • Further, friend #8 obtains little attention in all sessions, which means that social links do not necessarily lead to observed shared interest. (此外,friend#8在所有课程中都很少受到关注,这意味着社交联系不一定会带来共同的兴趣。)
        • Second, the distribution of attention is relatively stable within a single session. (其次,注意力在一次会话中的分布相对稳定。)
        • This confirms that the user’s behaviors are coherent in a short period and suitable to be processed in a session manner. (这证实了用户的行为在短时间内是一致的,并且适合以会话方式进行处理。)


    Figure 5: Heatmaps of attention weights across different sessions (left) and within one session (right). For both plots, the y-axis represents the target user’s friends. The x-axis represents (1) the target user’s eight sessions, on the left, and (2) the order of items within session #7, on the right.

    • (4) As a second exploration of the behavior of the attention mechanism we take a macro approach and analyze the attention across all users (as opposed to a single user across friends). (作为对注意力机制行为的第二次探索,我们采用宏观方法,分析所有用户的注意力(而不是朋友中的单个用户)。)

      • We use the attention levels inferred on the Douban test set. (我们使用豆瓣测试集中推断的注意力水平。)
      • Figure 6 reports the empirical distributions of the inter-session (brown) and intrasession (blue) attention variance (i.e., how much does the attention weights vary in each case). (图6报告了会话间(棕色)和会话内(蓝色)注意方差的经验分布(即,在每种情况下,注意权重的变化有多大)。)
        • The intra-session variance is lower on average. (会话内方差平均较低。)

          • This agrees with our assumption that users’ interests tend to be focused within a short time so that the same set of friends are attended to for the duration of a session. (这与我们的假设是一致的,即 用户的兴趣往往会在短时间内集中,以便在会话期间关注同一组朋友 。)
          • On the contrary, a user is more likely to trust different friends in different sessions, which further validates modeling context-dependent social influences via attention-based graph convolutional networks . (相反,用户更可能在不同的会话中信任不同的朋友,这进一步验证了通过基于注意的图卷积网络建模上下文相关的社会影响)

            Figure 6: Distributions of DGREC’s inter-session and intra-session attention variance. Variance values are discretized into 20 bins.

    6 CONCLUSIONS

    • (1) We propose a model based on graph convolutional networks for session-based social recommendation in online communities. (我们提出了一个基于 图卷积网络 的在线社区 基于会话 的社会推荐模型。)

      • Our model first learns individual user representations by modeling the users’ current interests. (我们的模型首先通过对 用户当前兴趣 的建模来学习 单个用户的表示)
      • Each user’s representation is then aggregated with her friends’ representations using a graph convolutional networks with a novel attention mechanism. (然后,每个 用户的表示她的朋友的表示 使用一个带有新颖 注意机制图卷积网络 进行聚合。)
      • The combined representation along with the user’s original representation is then used to form item recommendations. (然后,将 组合表示用户的原始表示 一起用于形成项目推荐。)
    • (2) Experimental results on three realworld data sets demonstrate the superiority of our model compared to several state-of-the-art models. (在三个真实数据集上的实验结果表明,与几种最先进的模型相比,我们的模型具有优越性。)
    • (3) Next steps involve exploring user and item features indicative of preferences and further improving the performance of recommender systems for online communities. (接下来的步骤包括探索表明偏好的用户和项目特征,并进一步提高在线社区推荐系统的性能。)

    7 ACKNOWLEDGEMENT

    References
