[Paper Reading Notes] 2018_AAAI_Social Recommendation with an Essential Preference Space (AAAI, 2018), Chun-Yi Liu, Chuan Zhou, Jia Wu, Yue Hu, Li Guo

Publish time: 2018
Affiliations: Chinese Academy of Sciences
University of Chinese Academy of Sciences
Macquarie University
Datasets: FilmTrust, Flixster, Epinions, Ciao


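1 Proposes the concept of an "essential preference space".
2 Unlike SoRec, the user latent vector is different in the two domains.
(i) When a user rates an item, the factors considered are things like the item's price, practicality, brand, and other item-describing factors that we cannot observe directly.
(ii) When a user decides whom to trust in the social network (or makes a rating or purchase decision there), the factors considered are how well they know the person, the person's interests, whether the person is trustworthy, and other people-describing factors that we also cannot observe directly. These are certainly not item attributes.
(iii) So the factors a user considers for items and for people are necessarily different. In a recommender system, this means the user latent vector in the user-item interaction and the user latent vector in the social network should belong to different latent spaces, each derived from a shared essential preference space.
3 The authors assume that the user preferences in different scenarios are results of different linear combinations from a more underlying user preference space.
(i) Use MF to factorize the rating matrix and model the rating space.
(ii) Use LINE (Large-scale Information Network Embedding) to obtain the user vectors in the social network space.
(iii) After modeling the rating, purchase, and social network information, model them jointly (through a series of derivations) to learn the user latent vector in the essential preference space. Projecting it into the rating space gives the user latent vector in the rating space, from which the predicted rating is obtained.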


(1) we assume that the user preferences in different scenarios are results of different linear combinations from a more underlying user preference space
(2) we propose a novel social recommendation framework, called social recommendation with an essential preference space (SREPS) (the new model proposed in the paper)
   simultaneously models the structural information in the social network, the rating and the consumption information in the recommender system under the capture of essential preference space.

1 Introduction

In summary, the user latent vectors in recommender system and social network should belong to different latent spaces rather than two different vectors in the same latent space (this is the paper's main novelty and the key difference from SoRec's approach)

Figure 1: An explanatory example. The purple box is the scenarios that the user rates items. The user mainly considers quality, price and brand to make decisions. The green box is the scenarios in social networks, and the user regards appearance, personality traits and social position to decide whether to trust other people. Clearly, the user considers different latent factors in these two scenarios.

1.1 essential preference space

(1) describe the differences in a user's multiple preferences across scenarios such as the recommender system and the social network.
(2) a network embedding model called large scale information network embedding (LINE)
(3) A social recommendation framework SREPS is proposed to jointly model the rating, consumption and social relation information based on the essential preference space (the new model proposed in the paper; its main contribution)
(4) An effective stochastic gradient descent algorithm is designed for the model learning

2 Preliminaries and Problem Definitions


(1) a user set $P$ with $m$ users and an item set $Q$ with $n$ items. Let $R=[r_{ui}]_{m\times n}$ denote the user-item rating matrix,
(2) a social network is represented by a graph $\mathcal{G}^{\mathcal{S}}=(\mathcal{V}^{\mathcal{S}}_{\mathcal{P}}, \mathcal{E}^{\mathcal{S}})$, where $\mathcal{V}^{\mathcal{S}}_{\mathcal{P}}$ is the set of vertexes that represent users, and $\mathcal{E}^{\mathcal{S}}$ is the set of edges,
(3) A recommendation network is a bipartite graph $\mathcal{G}^{\mathcal{R}}=(\mathcal{V}^{\mathcal{R}}_{\mathcal{P}}, \mathcal{V}^{\mathcal{R}}_{\mathcal{Q}}, \mathcal{E}^{\mathcal{R}})$, where $\mathcal{V}^{\mathcal{R}}_{\mathcal{P}}$ is a set of vertexes to represent users, $\mathcal{V}^{\mathcal{R}}_{\mathcal{Q}}$ is another set of vertexes to represent items, and $\mathcal{E}^{\mathcal{R}}$ is the set of directed unweighted edges from users to items.
(4) we explore the structure information of the recommendation network to learn the user implicit preferences with the help of a property-preserving network embedding method.

Matrix Factorization (method used)

The unknown entries can then be predicted by calculating the inner products of these latent vectors, i.e., $r_{ui}=U^{T}_{u}V_i$, where $U^T_u$ is the transpose of the latent vector $U_u$. Formally, the loss function for matrix factorization is as follows,

where the second term controlled by $\lambda$ is to avoid overfitting
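
The objective can be sketched in a few lines of plain Python. The equation itself is not reproduced in these notes, so this assumes the common form $\frac{1}{2}\sum_{(u,i)\in\Omega}(r_{ui}-U_u^TV_i)^2+\frac{\lambda}{2}(\|U\|_F^2+\|V\|_F^2)$; the 1/2 factors, the function names and the toy numbers are assumptions for illustration only.

```python
def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def mf_loss(ratings, U, V, lam):
    """Regularized squared-error loss over the observed ratings only.

    ratings: dict mapping (u, i) -> r_ui for observed entries
    U, V:    lists of user and item latent vectors
    lam:     regularization weight (lambda)
    """
    fit = 0.5 * sum((r - dot(U[u], V[i])) ** 2 for (u, i), r in ratings.items())
    reg = 0.5 * lam * (sum(dot(x, x) for x in U) + sum(dot(y, y) for y in V))
    return fit + reg


# Toy example (made-up numbers): 2 users, 2 items, latent dimension 2.
U = [[1.0, 0.0], [0.0, 1.0]]
V = [[2.0, 0.0], [0.0, 3.0]]
ratings = {(0, 0): 2.0, (1, 1): 4.0}

print(dot(U[1], V[1]))              # predicted rating of user 1 for item 1
print(mf_loss(ratings, U, V, 0.1))  # data-fit term plus regularization
```

Only the observed pairs contribute to the data-fit term, while the regularizer touches every latent vector.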

Large-scale Information Network Embedding (method used: LINE)

(1) LINE captures both the local structures and similarity of neighborhood network structures between two vertexes.
(2) In LINE, each vertex can be treated as a context for the other vertexes, and vertexes with similar distributions over a context are assumed to be similar.
(3) the probability of context $t$ generated from vertex $s$ is defined as

$$p(t|s)=\frac{\exp(C_t^{T}E_s)}{\sum_{v\in\mathcal{V}}\exp(C_v^{T}E_s)}$$

where $E_s$ is the embedding vector of vertex $s$ and $C_t$ is the context vector of vertex $t$.
(4) The empirical probability is $\hat{p}(t|s)=\frac{\omega_{st}}{d^{out}_{s}}$, where $d^{out}_{s}$ is the out-degree of vertex $s$, i.e., $d^{out}_{s}=\sum_{v\in\mathcal{V}}\omega_{sv}$
(5) The objective function of LINE is defined as

$$O=\sum_{s\in\mathcal{V}}d^{out}_{s}\,KL(\hat{p}(\cdot|s), p(\cdot|s))$$

where $KL(p, q)$ is the KL-divergence of the probability distributions $p$ and $q$, and $\hat{p}(\cdot|s)$ and $p(\cdot|s)$ are the empirical and defined distributions of contexts generated from vertex $s$, respectively.
(6) Omitting the constants, which do not affect the optimization of the objective function, the final loss function can be put as

$$O=-\sum_{(s,t)\in\mathcal{E}}\omega_{st}\log p(t|s)$$
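
A minimal sketch of the LINE quantities in this list, assuming the usual softmax form for $p(t|s)$ over embedding vectors $E$ and context vectors $C$, and the edge-weighted negative log-likelihood as the final loss. The function names, the edge-weight dictionary and the toy vectors are illustrative assumptions, not taken from the paper.

```python
import math


def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def context_prob(E, C, s):
    """Softmax p(. | s) over all vertexes, from embeddings E and contexts C."""
    scores = [dot(c, E[s]) for c in C]
    top = max(scores)                       # shift scores for numerical stability
    exps = [math.exp(x - top) for x in scores]
    total = sum(exps)
    return [x / total for x in exps]


def empirical_prob(weights, s, n):
    """p_hat(. | s) = w_st / d_s^out; assumes s has at least one outgoing edge."""
    out_degree = sum(w for (a, _), w in weights.items() if a == s)
    return [weights.get((s, t), 0.0) / out_degree for t in range(n)]


def line_loss(weights, E, C):
    """Weighted negative log-likelihood: -sum over edges of w_st * log p(t | s)."""
    return -sum(w * math.log(context_prob(E, C, s)[t])
                for (s, t), w in weights.items())


# Toy directed graph with 3 vertexes and made-up 2-d vectors.
weights = {(0, 1): 2.0, (0, 2): 1.0, (1, 0): 1.0}
E = [[0.1, 0.2], [0.0, -0.3], [0.5, 0.1]]
C = [[0.2, 0.0], [0.4, 0.1], [-0.1, 0.3]]

print(empirical_prob(weights, 0, 3))
print(context_prob(E, C, 0))
print(line_loss(weights, E, C))
```

The full softmax here sums over every vertex, which is why practical training replaces it with negative sampling.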

3 Social Recommendation with an Essential Preference Space (SREPS) (the proposed model)

Figure 2: The overview of our SREPS model. Each user has a latent vector in the essential preference space, and his semantic
latent vectors are projections from the essential preference space by multiplying space projection matrices (i.e. $M_E$, $M_C$, $M_R$
and $M_I$). We model the historical rating information with matrix factorization, while the social and recommendation networks
are modeled by network embedding. By jointly modeling these elements, we can learn the user latent vectors in the essential
preference space and the space projection matrices. Finally, we can use user latent vectors in essential preference space, the
rating space projection matrix $M_R$, and the item latent vectors in the rating space to predict the final rating.

Essential Preference Space

Definition 1 Semantic Latent Space (a semantic latent space defined by the authors)

(1) For a particular scenario like item rating and friend trusting, the corresponding semantic latent space is inferred from the user feedback and can be used to explain the user preferences by characterizing users in terms of latent factors.
(2) For example, the user latent vectors learned from Eq. (1) belong to a rating semantic latent space, and the embedding learned representation vectors from Eq. (3) belong to a social semantic latent space. (User latent vectors obtained from different objectives, that is, from different semantic latent spaces, should be different. This is aimed mainly at SoRec.)

Definition 2 Essential Preference Space (an essential preference space defined by the authors)

(1) The essential preference space is used to describe the fundamental factors that influence user preferences
(2) The factor in each semantic latent space is a linear combination of factors in the essential preference space. (this explains the authors' theory)
(3) Let $\hat{U}_u\in R^l$ be the latent vector in essential preference space for user $u$. Let $U_u\in R^{d_0}$ be the latent vector in rating semantic latent space for user $u$ in Eq. (1), which can be obtained from the following transition

$$U_u=M_R\hat{U}_u$$

where $M_R\in R^{d_0\times l}$ is the space projection matrix that maps the essential preference space into the rating semantic latent space.
(4) Similarly, the embedding vector $E_u$ and context vector $C_u$ in Eq. (3) can be mapped from $\hat{U}_u$ by space projection matrices $M_E\in R^{d_1\times l}$ and $M_C\in R^{d_1\times l}$ as follows:

$$E_u=M_E\hat{U}_u,\quad C_u=M_C\hat{U}_u$$
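
Read literally, each mapping in Definition 2 is a matrix-vector product applied to the same essential-space vector. The sketch below illustrates this with $l=3$, $d_0=2$ and $d_1=1$; every number and the helper `matvec` are made up for illustration.

```python
def matvec(M, x):
    """Multiply a matrix (given as a list of rows) by a vector."""
    return [sum(m * v for m, v in zip(row, x)) for row in M]


# Essential preference vector of one user, l = 3 (made-up numbers).
U_hat = [1.0, 2.0, -1.0]

# Hypothetical projection matrices: M_R is d0 x l, M_E and M_C are d1 x l.
M_R = [[1.0, 0.0, 0.0],
       [0.0, 1.0, 1.0]]   # d0 = 2
M_E = [[0.5, 0.5, 0.0]]   # d1 = 1
M_C = [[0.0, 0.0, 2.0]]   # d1 = 1

U_u = matvec(M_R, U_hat)  # user vector in the rating semantic space
E_u = matvec(M_E, U_hat)  # embedding vector in the social semantic space
C_u = matvec(M_C, U_hat)  # context vector in the social semantic space
print(U_u, E_u, C_u)
```

The same `U_hat` yields vectors of different dimensions and meanings, which is the point of keeping one shared space under several semantic spaces.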

The SREPS Model

(1) By incorporating Eq. (4) and Eq. (5), the rating loss function in Eq. (1) without regularizations can be represented as

(2) The loss function $O_2$ for the social network representation is as follows

where $\omega_{st}$ is the weight of edge $(s, t)$.
(3) Similarly, the loss function $O_3$ for the recommendation network representation is expressed as

where $M_I\in R^{d_2\times l}$ is the space projection matrix corresponding to the recommendation network and $B_i\in R^{d_2}$ is the context vector of item $i$.
(4) Note that the recommendation network is a bipartite graph.
   Hence, different from the social network, the user vertexes have only the embedding vectors, while the item vertexes have only context vectors.
   In sum, the loss function of the SREPS model is

where $\alpha\geq0$ and $\beta\geq0$ are parameters that control the balance of the loss function, satisfying $\alpha+\beta\leq1$, and $Reg$ is the regularization term:

where $\lambda$ is the regularization parameter.
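
The three losses can be sketched together once every semantic vector is written as a projection of $\hat{U}_u$. This is only a sketch under stated assumptions: the equations are not reproduced in these notes, so the full-softmax forms of $O_2$ and $O_3$, the convex combination $(1-\alpha-\beta)O_1+\alpha O_2+\beta O_3$ (suggested by the constraint $\alpha+\beta\leq1$, including which weight goes with which term) and all function names are guesses, and the regularization term is left out.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def matvec(M, x):
    return [dot(row, x) for row in M]


def log_softmax_at(scores, k):
    """Log of the k-th softmax probability, computed stably."""
    top = max(scores)
    return scores[k] - top - math.log(sum(math.exp(x - top) for x in scores))


def rating_loss(ratings, U_hat, M_R, V):
    """O1: squared error with U_u = M_R @ U_hat_u."""
    return 0.5 * sum((r - dot(matvec(M_R, U_hat[u]), V[i])) ** 2
                     for (u, i), r in ratings.items())


def social_loss(edges, U_hat, M_E, M_C):
    """O2: LINE loss with E_s = M_E @ U_hat_s and C_t = M_C @ U_hat_t."""
    E = [matvec(M_E, x) for x in U_hat]
    C = [matvec(M_C, x) for x in U_hat]
    return -sum(w * log_softmax_at([dot(c, E[s]) for c in C], t)
                for (s, t), w in edges.items())


def recsys_loss(consumed, U_hat, M_I, B):
    """O3: users are embedded via M_I, items only have context vectors B."""
    loss = 0.0
    for u, i in consumed:
        e_u = matvec(M_I, U_hat[u])
        loss -= log_softmax_at([dot(b, e_u) for b in B], i)
    return loss


def sreps_objective(o1, o2, o3, alpha, beta):
    """Assumed weighting of the three losses (regularization omitted)."""
    if alpha < 0 or beta < 0 or alpha + beta > 1:
        raise ValueError("need alpha >= 0, beta >= 0 and alpha + beta <= 1")
    return (1 - alpha - beta) * o1 + alpha * o2 + beta * o3


# Toy data (made-up): 2 users, 2 items, l = 2, d0 = d1 = d2 = 1.
U_hat = [[1.0, 0.0], [0.0, 1.0]]
M_R, M_E, M_C, M_I = [[1.0, 1.0]], [[1.0, 0.0]], [[0.0, 1.0]], [[1.0, -1.0]]
V, B = [[2.0], [3.0]], [[0.5], [-0.5]]
ratings, edges, consumed = {(0, 1): 4.0}, {(0, 1): 1.0}, [(0, 1)]

o1 = rating_loss(ratings, U_hat, M_R, V)
o2 = social_loss(edges, U_hat, M_E, M_C)
o3 = recsys_loss(consumed, U_hat, M_I, B)
print(sreps_objective(o1, o2, o3, alpha=0.3, beta=0.3))
```

Because all three terms depend on the same `U_hat`, a gradient step on any one of them moves the shared essential-space vectors.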


Optimization Approach

we simultaneously learn the parameters by sampling examples from different parts of the SREPS loss function

Rating Loss

(1) We randomly sample a pair $(u, i)$ from the set of observed entries $\Omega$
(2) the gradients of the rating loss function $L_1:=O_1+Reg_1$ for the sampled pair $(u, i)$ are as follows,

where $I_l$ is an $l\times l$ identity matrix and $\delta^{R}_{ui}=\hat{U}^{T}_{u}M^{T}_{R}V_i-r_{ui}$.
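
One SGD step on the rating loss might look like the sketch below. The gradient equations are not reproduced in these notes, so this assumes the per-sample loss $\frac{1}{2}(\delta^{R}_{ui})^2+\frac{\lambda}{2}(\|V_i\|^2+\|M_R\hat{U}_u\|^2+\|\hat{U}_u\|^2)$. The regularizer is a guess by analogy with the terms listed for the other two losses, and it gives a $\hat{U}_u$ gradient of $\delta^{R}_{ui}M_R^TV_i+\lambda(M_R^TM_R+I_l)\hat{U}_u$, which matches the appearance of $I_l$ above. All function names are hypothetical.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def matvec(M, x):
    return [dot(row, x) for row in M]


def mat_t_vec(M, y):
    """Compute M^T y for a matrix given as a list of rows."""
    return [sum(M[r][c] * y[r] for r in range(len(M))) for c in range(len(M[0]))]


def sample_loss(u_hat, M_R, v, r, lam):
    """Assumed per-sample rating loss L1 for one observed pair (u, i)."""
    p = matvec(M_R, u_hat)                      # U_u = M_R @ U_hat_u
    delta = dot(p, v) - r
    return 0.5 * delta ** 2 + 0.5 * lam * (dot(v, v) + dot(p, p) + dot(u_hat, u_hat))


def rating_grads(u_hat, M_R, v, r, lam):
    """Gradients of sample_loss w.r.t. U_hat_u, M_R and V_i."""
    p = matvec(M_R, u_hat)
    delta = dot(p, v) - r
    g_v = [delta * pk + lam * vk for pk, vk in zip(p, v)]
    # Shared factor (delta * V_i + lam * M_R U_hat_u) appears in both remaining gradients.
    coeff = [delta * vk + lam * pk for vk, pk in zip(v, p)]
    g_M = [[c * x for x in u_hat] for c in coeff]            # outer product
    g_u = [b + lam * x for b, x in zip(mat_t_vec(M_R, coeff), u_hat)]
    return g_u, g_M, g_v


def sgd_step(u_hat, M_R, v, r, lam, lr):
    """Return updated copies of (U_hat_u, M_R, V_i) after one gradient step."""
    g_u, g_M, g_v = rating_grads(u_hat, M_R, v, r, lam)
    new_u = [x - lr * g for x, g in zip(u_hat, g_u)]
    new_M = [[m - lr * g for m, g in zip(row, g_row)] for row, g_row in zip(M_R, g_M)]
    new_v = [x - lr * g for x, g in zip(v, g_v)]
    return new_u, new_M, new_v


# Toy values (made-up): l = 3, d0 = 2.
u_hat = [0.3, -0.2, 0.5]
M_R = [[0.1, 0.4, -0.3], [0.2, -0.1, 0.6]]
v = [0.7, -0.5]
before = sample_loss(u_hat, M_R, v, 3.0, 0.1)
after = sample_loss(*sgd_step(u_hat, M_R, v, 3.0, 0.1, lr=0.01), 3.0, 0.1)
print(before, after)
```

Comparing the analytic gradients with finite differences is a cheap way to check a derivation like this before training with it.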

Social Network Embedding

(1) Now we optimize the loss function $L_2:=O_2+Reg_2$ in the social network embedding, where $Reg_2$ is the regularization corresponding to $\|M_E\hat{U}_u\|^2$, $\|M_C\hat{U}_u\|^2$ and $\|\hat{U}_u\|^2$
(2) we can change $p(t|s)$ into the following form

where $\sigma(\cdot)$ is the sigmoid function, $K$ is the number of negative samples, and the negative vertexes $v_{n_i}$ are drawn from the distribution $P_n(v)$
(3) Here we set $P_n(v) \propto d^{3/4}_{v}$, where $d_v$ is the out-degree of vertex $v$. Empirically, we set $K=5$. Thus, for the randomly sampled edge $(t, s)$, we can obtain the gradients as follows,
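
The negative-sampling step can be sketched as follows, assuming the standard form $\log\sigma(C_t^TE_s)+\sum_{i=1}^{K}\log\sigma(-C_{v_{n_i}}^TE_s)$ as the replacement for $\log p(t|s)$ (the equation itself is not reproduced in these notes). The function names and the toy degrees are illustrative assumptions.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def noise_distribution(out_degrees):
    """P_n(v) proportional to d_v^(3/4), normalized to sum to 1."""
    weights = [d ** 0.75 for d in out_degrees]
    total = sum(weights)
    return [w / total for w in weights]


def sample_negatives(p_n, K, rng):
    """Draw K vertex ids from P_n, with replacement."""
    return rng.choices(range(len(p_n)), weights=p_n, k=K)


def negative_sampling_loss(e_s, c_t, negative_contexts):
    """Loss for one edge: -[log s(c_t.e_s) + sum_i log s(-c_neg_i.e_s)]."""
    loss = -math.log(sigmoid(dot(c_t, e_s)))
    for c_neg in negative_contexts:
        loss -= math.log(sigmoid(-dot(c_neg, e_s)))
    return loss


# Toy graph (made-up out-degrees and context vectors), K = 5 as in the notes.
out_degrees = [1, 16, 81, 0]
C = [[0.2, 0.0], [0.4, 0.1], [-0.1, 0.3], [0.0, 0.5]]
e_s, c_t = [0.3, -0.2], C[1]

p_n = noise_distribution(out_degrees)
rng = random.Random(0)                      # fixed seed for repeatability
neg_ids = sample_negatives(p_n, 5, rng)
print(p_n)
print(negative_sampling_loss(e_s, c_t, [C[v] for v in neg_ids]))
```

Raising degrees to the power 3/4 flattens the distribution, so low-degree vertexes are drawn as negatives more often than their raw degree would suggest, while a vertex with out-degree 0 is never drawn.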

Recommendation Network Embedding

(1) the recommendation network embedding objective $L_3:=O_3+Reg_3$, where $Reg_3$ contains $\|B_i\|^2$, $\|M_I\hat{U}_u\|^2$ and $\|\hat{U}_u\|^2$
(2) Since the recommendation network is a bipartite network, we simply sample the negative item vertexes according to a uniform distribution.
(3) Similarly, we can obtain the gradients for each edge $(u, i)\in\mathcal{E}^{\mathcal{R}}$ as follows
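
For the bipartite case, the sketch below shows uniform negative item sampling and the gradient with respect to the item context vectors only. It holds the user embedding $M_I\hat{U}_u$ fixed and ignores $Reg_3$; both simplifications, and all function names, are assumptions made for illustration rather than the paper's update rules.

```python
import math
import random


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def sample_uniform_negatives(n_items, K, rng):
    """Draw K item ids uniformly at random, with replacement."""
    return [rng.randrange(n_items) for _ in range(K)]


def edge_loss(e_u, B, pos, negs):
    """Negative-sampling loss for one (user, item) edge."""
    loss = -math.log(sigmoid(dot(B[pos], e_u)))
    for j in negs:
        loss -= math.log(sigmoid(-dot(B[j], e_u)))
    return loss


def context_grads(e_u, B, pos, negs):
    """Gradient contributions w.r.t. the touched context vectors B_j.

    Each contribution is (sigmoid(B_j . e_u) - label) * e_u, with label 1 for
    the consumed item and 0 for a sampled negative. An item that appears
    several times yields several pairs, which simply add up.
    """
    pairs = []
    for j, label in [(pos, 1.0)] + [(j, 0.0) for j in negs]:
        g = sigmoid(dot(B[j], e_u)) - label
        pairs.append((j, [g * x for x in e_u]))
    return pairs


# Toy values (made-up): user embedding in d2 = 2 dimensions, 3 items.
e_u = [0.3, -0.4]
B = [[0.1, 0.2], [-0.3, 0.5], [0.4, 0.0]]
rng = random.Random(0)
negs = sample_uniform_negatives(len(B), 2, rng)
print(negs, edge_loss(e_u, B, 0, negs))
print(context_grads(e_u, B, 0, negs))
```

Uniform sampling may occasionally pick the consumed item itself as a negative; whether to filter such draws is a design choice this sketch leaves open.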

4 Experiments

Experimental Settings

Evaluation Metrics

mean absolute error (MAE)
root mean square error (RMSE)
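
Both metrics follow their standard definitions; a small sketch with made-up ratings (the function names are illustrative):

```python
import math


def mae(actual, predicted):
    """Mean absolute error over paired ratings."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root mean square error over paired ratings."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


actual = [4.0, 3.0, 5.0]
predicted = [3.0, 3.0, 3.0]
print(mae(actual, predicted), rmse(actual, predicted))
```

RMSE penalizes the single large error on the third rating more heavily than MAE does, which is why the two are usually reported together.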

Comparison Methods


Results and Analysis

Parameter Sensitivity

Hyper Parameters $\alpha$ and $\beta$

Dimensions $l$, $d_0$, $d_1$ and $d_2$

Related Work


shared common user latent vectors factorized by ratings and by trust. (This seems to be the main target the paper argues against.)




