2020.09.16
FM (Factorization Machines)
2021.09.18
- Paper reading
  - Interactive Recommender System via Knowledge Graph-enhanced Reinforcement Learning
  - Hierarchical Reinforcement Learning for Integrated Recommendation
  - DEAR: Deep Reinforcement Learning for Online Advertising Impression in Recommender Systems
  - Reinforcement Recommendation with User Multi-aspect Preference
  - User Response Models to Improve a REINFORCE Recommender System
  - Reinforcement Learning with a Disentangled Universal Value Function for Item Recommendation
2019.09.23
2021.09.30

2020.09.16

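FM (Factorization Machines)

Code:

Read data with pd.read_csv().
np.empty(shape[, dtype, order]) returns a one- or multi-dimensional array of the given shape and dtype. The elements are not initialized: they hold whatever values were already in memory, so they look arbitrary and should be overwritten before use.
Take elements at a fixed interval with [start::step]: begin at index start and take every step-th element.
lambda defines an anonymous function, e.g. g = lambda x: x + 1. The parameters go to the left of the colon and the returned expression to the right.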
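The four notes above can be tried in a few lines. This is only a minimal sketch: the in-memory CSV text and all variable names are made up for illustration, and a StringIO stands in for a real file path.

```python
import io

import numpy as np
import pandas as pd

# Read tabular data; the CSV content here is illustrative, not a real dataset.
csv_text = "label,x1,x2\n1,0.5,a\n0,1.5,b\n1,2.5,c\n0,3.5,d\n"
df = pd.read_csv(io.StringIO(csv_text))

# np.empty allocates memory without initializing it, so fill it before reading.
buf = np.empty((2, 3), dtype=np.float64)
buf[:] = 0.0

# [start::step] keeps every step-th element, beginning at index start.
values = list(range(10))
every_third_from_1 = values[1::3]

# A lambda is an anonymous function: parameters before the colon, result after.
g = lambda x: x + 1
```

Here every_third_from_1 holds the elements at indices 1, 4 and 7, and g(1) evaluates to 2.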

The Criteo Display Ads dataset is a CTR prediction competition dataset from Kaggle. It contains 13 numerical features (I1-I13) and 26 categorical features (C1-C26).
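The column layout can be written down directly. The sketch below assumes a tab-separated file without a header row and with the click label as the first column; the single row is fabricated so the snippet runs on its own, and the missing-value fill is a common preprocessing choice rather than something taken from these notes.

```python
import io

import pandas as pd

# 13 numerical features I1-I13 and 26 categorical features C1-C26.
num_cols = [f"I{i}" for i in range(1, 14)]
cat_cols = [f"C{i}" for i in range(1, 27)]
columns = ["label"] + num_cols + cat_cols  # assumption: label comes first

# One made-up row in the assumed tab-separated, header-less layout.
fields = ["1"] + [str(i) for i in range(13)] + [f"cat{i}" for i in range(26)]
df = pd.read_csv(io.StringIO("\t".join(fields) + "\n"),
                 sep="\t", header=None, names=columns)

# Assumed preprocessing: replace missing values before feeding a model.
df[num_cols] = df[num_cols].fillna(0)
df[cat_cols] = df[cat_cols].fillna("-1")
```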

2021.09.18

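Interactive recommender system (IRS): "an IRS consecutively recommends items to individual users and receives their feedbacks which makes it possible to refine its recommendation policy during such interactive processes." (The feedback comes from an individual user.)
Conversational recommender system (CRS): ① An interactive recommender system can be seen as an early prototype of CRSs, and IRSs are still being studied. Most IRSs follow two steps: 1) recommend a list; 2) collect the user's feedback on that recommendation. The two steps are then repeated in a loop.
② This is not a good interaction pattern, however. First, the interaction is monotonous: every round repeats recommendation and feedback collection, so users easily lose patience. Second, a good recommender should recommend only when its confidence is high. Third, because the number of items is huge, learning a user's interests by recommending items is inefficient.
③ CRSs introduce richer interaction modes. For example, the system can proactively ask the user about item attributes: "What color of phone do you like?" or "Do you like rock music?" These richer modes address the three problems of IRSs: the system interacts more efficiently, learns the user's interests quickly, and recommends only when it is confident enough.
④ A core task of CRSs is deciding how to ask questions, i.e., when to ask a question and when to make a recommendation.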
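The ask-or-recommend decision in ④ can be sketched as a toy loop. Everything here is hypothetical: the count of confirmed attributes is a crude stand-in for model confidence, and the names run_session and needed are invented for illustration; a real CRS would learn this policy.

```python
def run_session(answers, needed=2):
    """Simulate one conversation.

    answers: iterable of booleans, whether the user confirms each asked attribute.
    needed:  number of confirmed attributes treated as "confident enough".
    Returns the list of actions taken, ending with "recommend" if reached.
    """
    confirmed, log = 0, []
    for liked in answers:
        if confirmed >= needed:
            # Confident enough: stop asking and recommend.
            log.append("recommend")
            break
        # Not confident yet: ask about another attribute.
        log.append("ask")
        confirmed += int(liked)
    return log


actions = run_session([True, False, True, True])
```

With these answers the system asks three times (the second answer adds no evidence) and recommends on the fourth turn.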

Paper reading

I have recently been reading some RL + RS papers. Below are some informal notes.

Interactive Recommender System via Knowledge Graph-enhanced Reinforcement Learning

The paper uses prior knowledge from a knowledge graph (KG) to (i) guide the candidate selection for better candidate item retrieval, (ii) enrich the representation of items and user states, and (iii) propagate user preferences among the correlated items over KG to deal with the sparsity of user feedback.

Hierarchical Reinforcement Learning for Integrated Recommendation

1. Recommendation is split into two parts: channel recommendation first, then item recommendation ("The low-level agent is a channel selector, which generates a personalized channel list. The high-level agent is an item recommender, which recommends specific items from heterogeneous channels under the channel constraints."). The paper also designs several reward functions and four loss functions.
2. Integrated recommendation: not restricted to one domain; it recommends more diverse, heterogeneous items at the same time. "Integrated recommendation is proposed to simultaneously recommend these heterogeneous items from different sources (i.e., channels) in a single recommendation system."
Challenges: different channels (news, article, long video and short video) use different ranking strategies, so how should they be fused? How should a user's preference for a channel and for an item be measured? The task is ranking.
Inputs and outputs: "The inputs are heterogeneous items from different channels, and the output is a recommended list (i.e., top 10 items)"
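To make the two-level flow concrete, here is a toy sketch in which greedy rules stand in for the two learned agents. It is not the paper's model: the preference scores, the halving decay, and the function names are all assumptions made for illustration.

```python
def select_channels(channel_pref, length):
    """Low-level step: pick one channel per slot.

    Greedy stand-in for a learned channel selector: take the highest-scoring
    channel, then halve its score so other channels get a chance.
    """
    scores = dict(channel_pref)
    chosen = []
    for _ in range(length):
        best = max(scores, key=scores.get)
        chosen.append(best)
        scores[best] /= 2
    return chosen


def recommend_items(channel_list, items_by_channel):
    """High-level step: fill each slot with the best unused item of its channel."""
    used, result = set(), []
    for channel in channel_list:
        candidates = [(score, item)
                      for item, score in items_by_channel[channel].items()
                      if item not in used]
        if candidates:
            _, item = max(candidates)
            used.add(item)
            result.append(item)
    return result


prefs = {"news": 0.5, "short_video": 0.3, "article": 0.2}
items = {
    "news": {"n1": 0.9, "n2": 0.4},
    "short_video": {"v1": 0.7},
    "article": {"a1": 0.6},
}
channels = select_channels(prefs, 4)
final_list = recommend_items(channels, items)
```

The point of the split is that the item step only ever ranks within the channel fixed for each slot, which is what "under the channel constraints" means in the quoted abstract.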

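Ideas worth borrowing: ① define multiple reward signals; ② the low-level / high-level split.
The paper uses DDPG (Actor-Critic).
DDPG uses convolutional neural networks to approximate the policy function μ and the Q function, i.e., a policy network and a Q network, and then trains these networks with deep learning methods.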

DEAR: Deep Reinforcement Learning for Online Advertising Impression in Recommender Systems

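The paper proposes an advertising strategy and studies when inserting ads maximizes long-term revenue ((i) whether to interpolate an ad or not in the recommendation list, and if yes, (ii) the optimal ad and (iii) the optimal location to interpolate). Goal: maximize advertising revenue while minimizing the negative impact of ads on user experience.
It uses a Deep Q-network (DQN).
To look into: the reward design. How is advertising revenue maximized, and how is the negative impact on users measured?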
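The three decisions (insert or not, which ad, which location) can be read as one argmax over estimated values. The sketch below only illustrates that reading with made-up Q-values; it is not DEAR's network architecture, and the function name and array layout are assumptions.

```python
import numpy as np


def choose_ad_action(q_no_ad, q_ads):
    """Pick the ad action with the highest estimated value.

    q_no_ad: estimated value of showing the list without any ad.
    q_ads:   2-D array, q_ads[i, j] = estimated value of ad i at location j.
    Returns None (no ad) or a tuple (ad_index, location_index).
    """
    i, j = np.unravel_index(np.argmax(q_ads), q_ads.shape)
    if q_ads[i, j] <= q_no_ad:
        return None  # inserting any ad is not worth it in this state
    return int(i), int(j)


# Illustrative values only: 2 candidate ads, 2 possible locations.
q_ads = np.array([[0.1, 0.4],
                  [0.9, 0.2]])
with_ad = choose_ad_action(0.5, q_ads)
without_ad = choose_ad_action(1.0, q_ads)
```

In the first call the best pair beats the no-ad value, so ad 1 goes to location 0; in the second call no ad is inserted.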

Reinforcement Recommendation with User Multi-aspect Preference

Models users' multi-aspect preferences; also uses Actor-Critic. Surprisingly, the paper has no model architecture figure.

User Response Models to Improve a REINFORCE Recommender System

Models user responses; involves auxiliary tasks.

Reinforcement Learning with a Disentangled Universal Value Function for Item Recommendation

Summarizes several challenges of applying reinforcement learning to recommendation: "In this paper, we summarize three key practical challenges of large-scale RL-based recommender systems: massive state and action spaces, high-variance environment, and the unspecific reward setting in recommendation. All these problems remain largely unexplored in the existing literature and make the application of RL challenging."
It is a model-based reinforcement learning framework.
It seems to be about disentanglement, and it is not tied to one specific application.

2019.09.23

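pip install failed with "check_hostname requires server_hostname". The cause turned out to be an active VPN; turning it off fixed it.
Setting the random seed:

  import random
  import numpy as np
  import torch

  seed = 42  # example value; any fixed integer works
  torch.manual_seed(seed)
  torch.cuda.manual_seed(seed)
  torch.cuda.manual_seed_all(seed)
  np.random.seed(seed)
  random.seed(seed)
  # torch.set_deterministic is deprecated; newer PyTorch uses this call instead
  torch.use_deterministic_algorithms(True)
  torch.backends.cudnn.benchmark = False
  torch.backends.cudnn.deterministic = True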
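A quick way to check that seeding works is to wrap it in a helper and draw the same numbers twice. This is a sketch with a hypothetical helper name, set_seed; it seeds PyTorch only if it is installed, and leaves out the strict determinism switch so it runs anywhere.

```python
import random

import numpy as np


def set_seed(seed):
    """Seed Python, NumPy and, if it is installed, PyTorch."""
    random.seed(seed)
    np.random.seed(seed)
    try:
        import torch
    except ImportError:
        return  # PyTorch not available; the other generators are seeded
    torch.manual_seed(seed)
    torch.cuda.manual_seed_all(seed)  # ignored when no GPU is present
    torch.backends.cudnn.benchmark = False
    torch.backends.cudnn.deterministic = True


# Same seed, same draws.
set_seed(0)
first = (random.random(), float(np.random.rand()))
set_seed(0)
second = (random.random(), float(np.random.rand()))
```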

Error "module 'gym.envs.box2d' has no attribute 'LunarLander'": most likely a package (probably the Box2D dependency) was not installed properly.

2021.09.30

Following a file with tailf failed with "Command 'tailf' not found"; using tail -f instead works.
Check GPU usage on the server with nvidia-smi.
Check resource usage with top.
