1. Movie ratings dataset

load ('ex8_movies.mat');

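The dataset contains two matrices, Y and R, both of size 1682 × 943 (movies i × users j).
Y(i,j) is the rating that user j gave to movie i. R contains only 0s and 1s: R(i,j) = 1 means user j has rated movie i, and 0 means no rating.
To compute the average of the existing ratings for movie 1, the code is: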

mean(Y(1, R(1, :)));
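
The indexing above only works when R(1, :) is a logical vector. A minimal sketch on a made-up toy matrix (the values of Y below are hypothetical, not taken from ex8_movies.mat) shows the same computation with an explicit logical() conversion, assuming R might be stored as a numeric 0/1 matrix:

```matlab
% Toy data (hypothetical): 3 movies x 4 users; 0 in Y means "not rated".
Y = [5 0 3 4;
     0 0 2 0;
     1 4 0 0];
R = double(Y ~= 0);          % numeric 0/1 indicator matrix

% logical() makes the mask explicit; a numeric 0/1 vector would otherwise
% be interpreted as index values, and 0 is not a valid index.
rated = logical(R(1, :));
avg1  = mean(Y(1, rated));   % mean over the rated entries of movie 1 only

fprintf('Average rating for movie 1: %.2f\n', avg1);
```

Averaging over Y(1, :) directly would count the unrated zeros and pull the mean down, which is why the mask is needed.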

Visualization of the data:

2. Collaborative filtering algorithm
2.1. Collaborative filtering cost function

The cost function is computed as:

J = sum(sum(((X*Theta' - Y).*R).^2))/2;
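
Written out, the vectorized line above evaluates the unregularized collaborative filtering objective. The notation here is an assumption based on the usual course conventions: x^{(i)} is row i of X, \theta^{(j)} is row j of Theta, r(i,j) = R(i,j), and n_m, n_u are the numbers of movies and users.

```latex
J\bigl(x^{(1)},\dots,x^{(n_m)},\theta^{(1)},\dots,\theta^{(n_u)}\bigr)
  = \frac{1}{2}\sum_{(i,j)\,:\,r(i,j)=1}
    \Bigl( (\theta^{(j)})^{T} x^{(i)} - y^{(i,j)} \Bigr)^{2}
```

Multiplying elementwise by R zeroes out the terms with r(i,j) = 0, so only rated entries contribute to the sum.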

Running it gives:

Cost at loaded parameters: 22.224604
(this value should be about 22.22)
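
As a sanity check on the vectorized cost, the sketch below compares it against an explicit double loop over rated entries. The small matrices are hypothetical toy values chosen only for illustration, not the parameters loaded by the exercise.

```matlab
% Hypothetical toy problem: 4 movies, 3 users, 2 latent features.
X     = [ 1.0  0.5;  0.2  1.5; -0.3  0.8;  0.9 -1.1];   % movie features
Theta = [ 0.3 -0.2;  1.0  0.4; -0.5  0.7];              % user parameters
Y     = [5 0 3; 0 4 0; 1 0 2; 0 0 5];                   % 0 = not rated
R     = double(Y ~= 0);

% Vectorized form, same expression as in the text.
J_vec = sum(sum(((X*Theta' - Y).*R).^2))/2;

% Explicit loop: add the squared error only where a rating exists.
J_loop = 0;
for i = 1:size(Y, 1)
  for j = 1:size(Y, 2)
    if R(i, j) == 1
      J_loop = J_loop + (X(i, :)*Theta(j, :)' - Y(i, j))^2 / 2;
    end
  end
end

fprintf('vectorized: %.6f, loop: %.6f\n', J_vec, J_loop);
```

The two values should agree up to floating-point rounding. The vectorized form is preferable on the full 1682 × 943 matrix because it avoids the interpreted loop.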

2.2. Collaborative filtering gradient

X_grad = (R.*(X*Theta' - Y))*Theta;      % num_movies x num_features
Theta_grad = (R.*(X*Theta' - Y))'*X;     % num_users x num_features
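
These two lines are the matrix form of the partial derivatives of the unregularized cost. In the usual course notation (assumed here: x^{(i)} is row i of X, \theta^{(j)} is row j of Theta, and k indexes features), they read:

```latex
\frac{\partial J}{\partial x^{(i)}_{k}}
  = \sum_{j\,:\,r(i,j)=1}
    \Bigl( (\theta^{(j)})^{T} x^{(i)} - y^{(i,j)} \Bigr)\,\theta^{(j)}_{k},
\qquad
\frac{\partial J}{\partial \theta^{(j)}_{k}}
  = \sum_{i\,:\,r(i,j)=1}
    \Bigl( (\theta^{(j)})^{T} x^{(i)} - y^{(i,j)} \Bigr)\,x^{(i)}_{k}
```

The masked error matrix R.*(X*Theta' - Y) holds the bracketed term for every (i,j). Multiplying it by Theta sums over users, and multiplying its transpose by X sums over movies.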

Running the program gives:

Checking Gradients (without regularization) ...
    5.5335    5.5335
    3.6186    3.6186
    5.4422    5.4422
   -1.7312   -1.7312
    4.1196    4.1196
   -1.4833   -1.4833
   -6.0734   -6.0734
    2.3490    2.3490
    7.6341    7.6341
    1.8651    1.8651
    4.1192    4.1192
   -1.5834   -1.5834
    1.2828    1.2828
   -6.1573   -6.1573
    1.6628    1.6628
    1.1686    1.1686
    5.5630    5.5630
    0.3050    0.3050
    4.6442    4.6442
   -1.6691   -1.6691
   -2.1505   -2.1505
   -3.6832   -3.6832
    3.4067    3.4067
   -4.0743   -4.0743
    0.5567    0.5567
   -2.1056   -2.1056
    0.9168    0.9168
The above two columns you get should be very similar.
(Left-Your Numerical Gradient, Right-Analytical Gradient)

If your cost function implementation is correct, then
the relative difference will be small (less than 1e-9).

Relative Difference: 1.84952e-12
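
The two-column comparison above comes from the exercise's own gradient checker. A minimal stand-alone sketch of the same idea, using central differences on a hypothetical toy problem (all values and the step size h are illustrative assumptions), could look like this:

```matlab
% Hypothetical toy problem: 4 movies, 3 users, 2 latent features.
X     = [ 1.0  0.5;  0.2  1.5; -0.3  0.8;  0.9 -1.1];
Theta = [ 0.3 -0.2;  1.0  0.4; -0.5  0.7];
Y     = [5 0 3; 0 4 0; 1 0 2; 0 0 5];
R     = double(Y ~= 0);

% Unregularized cost as a function of the parameters.
cost = @(Xc, Tc) sum(sum(((Xc*Tc' - Y).*R).^2))/2;

% Analytical gradients (same formulas as in the text).
E          = R.*(X*Theta' - Y);
X_grad     = E*Theta;
Theta_grad = E'*X;

% Numerical gradients: perturb one parameter at a time by +/- h.
h = 1e-4;
num_X = zeros(size(X));
for k = 1:numel(X)
  d = zeros(size(X));  d(k) = h;
  num_X(k) = (cost(X + d, Theta) - cost(X - d, Theta)) / (2*h);
end
num_Theta = zeros(size(Theta));
for k = 1:numel(Theta)
  d = zeros(size(Theta));  d(k) = h;
  num_Theta(k) = (cost(X, Theta + d) - cost(X, Theta - d)) / (2*h);
end

numgrad  = [num_X(:); num_Theta(:)];
grad     = [X_grad(:); Theta_grad(:)];
rel_diff = norm(numgrad - grad) / norm(numgrad + grad);

disp([numgrad grad]);                        % left: numerical, right: analytical
fprintf('Relative Difference: %g\n', rel_diff);
```

A relative difference many orders of magnitude below 1 suggests the analytical gradient matches the cost function. A value near 1 usually points to a sign error or a missing mask.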

2.3. Regularized cost function

In code, the regularized cost is:

J = sum(sum((R.*(X*Theta' - Y)).^2))/2 + sum(sum(Theta.^2))*lambda/2 + ...
    sum(sum(X.^2))*lambda/2;
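
The expression above adds an L2 penalty on both parameter matrices to the squared-error term. In the usual course notation (assumed here: n features, n_m movies, n_u users, x^{(i)} and \theta^{(j)} the rows of X and Theta), the regularized objective is:

```latex
J = \frac{1}{2}\sum_{(i,j)\,:\,r(i,j)=1}
      \Bigl( (\theta^{(j)})^{T} x^{(i)} - y^{(i,j)} \Bigr)^{2}
  + \frac{\lambda}{2}\sum_{j=1}^{n_u}\sum_{k=1}^{n}\bigl(\theta^{(j)}_{k}\bigr)^{2}
  + \frac{\lambda}{2}\sum_{i=1}^{n_m}\sum_{k=1}^{n}\bigl(x^{(i)}_{k}\bigr)^{2}
```

Unlike the data term, the two penalty sums run over every entry of Theta and X, with no masking by R.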

2.4. Regularized gradient

In code, the regularized cost and gradients are:

J = sum(sum((R.*(X*Theta' - Y)).^2))/2 + sum(sum(Theta.^2))*lambda/2 + ...
    sum(sum(X.^2))*lambda/2;
X_grad = R.*(X*Theta' - Y)*Theta + X*lambda;
Theta_grad = (R.*(X*Theta' - Y))'*X + Theta*lambda;
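
Differentiating the L2 penalty contributes one extra linear term to each gradient. In the usual course notation (assumed here, with k indexing features), the regularized partial derivatives are:

```latex
\frac{\partial J}{\partial x^{(i)}_{k}}
  = \sum_{j\,:\,r(i,j)=1}
    \Bigl( (\theta^{(j)})^{T} x^{(i)} - y^{(i,j)} \Bigr)\,\theta^{(j)}_{k}
  + \lambda\,x^{(i)}_{k},
\qquad
\frac{\partial J}{\partial \theta^{(j)}_{k}}
  = \sum_{i\,:\,r(i,j)=1}
    \Bigl( (\theta^{(j)})^{T} x^{(i)} - y^{(i,j)} \Bigr)\,x^{(i)}_{k}
  + \lambda\,\theta^{(j)}_{k}
```

These correspond to the `+ X*lambda` and `+ Theta*lambda` terms in the code. Setting lambda = 0 recovers the unregularized gradients.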

Running the program gives:

Cost at loaded parameters (lambda = 1.5): 31.344056
(this value should be about 31.34)
Checking Gradients (with regularization) ...
    2.2223    2.2223
    0.7968    0.7968
   -3.2924   -3.2924
   -0.7029   -0.7029
   -4.2016   -4.2016
    3.5969    3.5969
    0.8859    0.8859
    1.0523    1.0523
   -7.8499   -7.8499
    0.3904    0.3904
   -0.1347   -0.1347
   -2.3656   -2.3656
    2.1066    2.1066
    1.6703    1.6703
    0.8519    0.8519
   -1.0380   -1.0380
    2.6537    2.6537
    0.8114    0.8114
   -0.8604   -0.8604
   -0.5884   -0.5884
   -0.7108   -0.7108
   -4.0652   -4.0652
    0.2494    0.2494
   -4.3484   -4.3484
   -3.6167   -3.6167
   -4.1277   -4.1277
   -3.2439   -3.2439
The above two columns you get should be very similar.
(Left-Your Numerical Gradient, Right-Analytical Gradient)

If your cost function implementation is correct, then
the relative difference will be small (less than 1e-9).

Relative Difference: 1.78901e-12

3. Movie recommender system

New user ratings:
Rated 4 for Toy Story (1995)
Rated 3 for Twelve Monkeys (1995)
Rated 5 for Usual Suspects, The (1995)
Rated 4 for Outbreak (1995)
Rated 5 for Shawshank Redemption, The (1994)
Rated 3 for While You Were Sleeping (1995)
Rated 5 for Forrest Gump (1994)
Rated 2 for Silence of the Lambs, The (1991)
Rated 4 for Alien (1979)
Rated 5 for Die Hard 2 (1990)
Rated 5 for Sphere (1998)

Program paused. Press enter to continue.

Training collaborative filtering...
Iteration     1 | Cost: 3.108511e+05
Iteration     2 | Cost: 1.475959e+05
Iteration     3 | Cost: 1.000321e+05
Iteration     4 | Cost: 7.707565e+04
Iteration     5 | Cost: 6.153638e+04
Iteration     6 | Cost: 5.719300e+04
Iteration     7 | Cost: 5.239113e+04
Iteration     8 | Cost: 4.771435e+04
Iteration     9 | Cost: 4.559863e+04
Iteration    10 | Cost: 4.385394e+04
Iteration    11 | Cost: 4.263562e+04
Iteration    12 | Cost: 4.184598e+04
Iteration    13 | Cost: 4.116751e+04
Iteration    14 | Cost: 4.073297e+04
Iteration    15 | Cost: 4.032577e+04
Iteration    16 | Cost: 4.009203e+04
Iteration    17 | Cost: 3.986428e+04
Iteration    18 | Cost: 3.971337e+04
Iteration    19 | Cost: 3.958890e+04
Iteration    20 | Cost: 3.949630e+04
Iteration    21 | Cost: 3.940187e+04
Iteration    22 | Cost: 3.934142e+04
Iteration    23 | Cost: 3.930822e+04
Iteration    24 | Cost: 3.926063e+04
Iteration    25 | Cost: 3.922334e+04
Iteration    26 | Cost: 3.920956e+04
Iteration    27 | Cost: 3.917145e+04
Iteration    28 | Cost: 3.914804e+04
Iteration    29 | Cost: 3.913479e+04
Iteration    30 | Cost: 3.910882e+04
Iteration    31 | Cost: 3.908992e+04
Iteration    32 | Cost: 3.908209e+04
Iteration    33 | Cost: 3.907380e+04
Iteration    34 | Cost: 3.906903e+04
Iteration    35 | Cost: 3.906437e+04
Iteration    36 | Cost: 3.905754e+04
Iteration    37 | Cost: 3.905112e+04
Iteration    38 | Cost: 3.904531e+04
Iteration    39 | Cost: 3.904023e+04
Iteration    40 | Cost: 3.903390e+04
Iteration    41 | Cost: 3.902800e+04
Iteration    42 | Cost: 3.902367e+04
Iteration    43 | Cost: 3.902195e+04
Iteration    44 | Cost: 3.902007e+04
Iteration    45 | Cost: 3.901780e+04
Iteration    46 | Cost: 3.901699e+04
Iteration    47 | Cost: 3.901489e+04
Iteration    48 | Cost: 3.901190e+04
Iteration    49 | Cost: 3.900929e+04
Iteration    50 | Cost: 3.900742e+04
Iteration    51 | Cost: 3.900630e+04
Iteration    52 | Cost: 3.900485e+04
Iteration    53 | Cost: 3.900348e+04
Iteration    54 | Cost: 3.900283e+04
Iteration    55 | Cost: 3.900208e+04
Iteration    56 | Cost: 3.900118e+04
Iteration    57 | Cost: 3.899982e+04
Iteration    58 | Cost: 3.899860e+04
Iteration    59 | Cost: 3.899710e+04
Iteration    60 | Cost: 3.899381e+04
Iteration    61 | Cost: 3.899242e+04
Iteration    62 | Cost: 3.899094e+04
Iteration    63 | Cost: 3.898986e+04
Iteration    64 | Cost: 3.898908e+04
Iteration    65 | Cost: 3.898811e+04
Iteration    66 | Cost: 3.898754e+04
Iteration    67 | Cost: 3.898736e+04
Iteration    68 | Cost: 3.898712e+04
Iteration    69 | Cost: 3.898687e+04
Iteration    70 | Cost: 3.898673e+04
Iteration    71 | Cost: 3.898634e+04
Iteration    72 | Cost: 3.898524e+04
Iteration    73 | Cost: 3.898369e+04
Iteration    74 | Cost: 3.898322e+04
Iteration    75 | Cost: 3.898257e+04
Iteration    76 | Cost: 3.898194e+04
Iteration    77 | Cost: 3.898141e+04
Iteration    78 | Cost: 3.898077e+04
Iteration    79 | Cost: 3.898025e+04
Iteration    80 | Cost: 3.897962e+04
Iteration    81 | Cost: 3.897909e+04
Iteration    82 | Cost: 3.897861e+04
Iteration    83 | Cost: 3.897735e+04
Iteration    84 | Cost: 3.897609e+04
Iteration    85 | Cost: 3.897534e+04
Iteration    86 | Cost: 3.897488e+04
Iteration    87 | Cost: 3.897468e+04
Iteration    88 | Cost: 3.897414e+04
Iteration    89 | Cost: 3.897389e+04
Iteration    90 | Cost: 3.897371e+04
Iteration    91 | Cost: 3.897355e+04
Iteration    92 | Cost: 3.897320e+04
Iteration    93 | Cost: 3.897304e+04
Iteration    94 | Cost: 3.897290e+04
Iteration    95 | Cost: 3.897276e+04
Iteration    96 | Cost: 3.897254e+04
Iteration    97 | Cost: 3.897240e+04
Iteration    98 | Cost: 3.897232e+04
Iteration    99 | Cost: 3.897222e+04
Iteration   100 | Cost: 3.897217e+04

Recommender system learning completed.

Program paused. Press enter to continue.

Top recommendations for you:
Predicting rating 5.0 for movie Saint of Fort Washington, The (1993)
Predicting rating 5.0 for movie Great Day in Harlem, A (1994)
Predicting rating 5.0 for movie Someone Else's America (1995)
Predicting rating 5.0 for movie Entertaining Angels: The Dorothy Day Story (1996)
Predicting rating 5.0 for movie Santa with Muscles (1996)
Predicting rating 5.0 for movie Aiqing wansui (1994)
Predicting rating 5.0 for movie Prefontaine (1997)
Predicting rating 5.0 for movie They Made Me a Criminal (1939)
Predicting rating 5.0 for movie Marlene Dietrich: Shadow and Light (1996)
Predicting rating 5.0 for movie Star Kid (1997)

Original ratings provided:
Rated 4 for Toy Story (1995)
Rated 3 for Twelve Monkeys (1995)
Rated 5 for Usual Suspects, The (1995)
Rated 4 for Outbreak (1995)
Rated 5 for Shawshank Redemption, The (1994)
Rated 3 for While You Were Sleeping (1995)
Rated 5 for Forrest Gump (1994)
Rated 2 for Silence of the Lambs, The (1991)
Rated 4 for Alien (1979)
Rated 5 for Die Hard 2 (1990)
Rated 5 for Sphere (1998)
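
The log above comes from the exercise script, whose code is not reproduced in this post. As a rough illustration of the same pipeline (normalize ratings, minimize the regularized cost, predict unrated entries), here is a minimal sketch on hypothetical toy data. It uses plain gradient descent rather than the optimizer that produced the log, and the values of lambda, alpha, num_iters and the initialization are all assumptions for illustration.

```matlab
% Hypothetical toy data: 5 movies x 4 users, 0 = not rated.
Y = [5 4 0 1;
     4 0 1 1;
     0 5 2 0;
     1 1 5 4;
     0 1 4 5];
R = (Y ~= 0);
[num_movies, num_users] = size(Y);
num_features = 2;  lambda = 0.1;  alpha = 0.01;  num_iters = 2000;

% Mean normalization: subtract each movie's mean over its rated entries.
Ymean = zeros(num_movies, 1);
Ynorm = zeros(size(Y));
for i = 1:num_movies
  idx = R(i, :);
  Ymean(i) = mean(Y(i, idx));
  Ynorm(i, idx) = Y(i, idx) - Ymean(i);
end

% Small deterministic initial values (a stand-in for random initialization).
X     = 0.1 * sin(reshape(1:num_movies*num_features, num_movies, num_features));
Theta = 0.1 * cos(reshape(1:num_users*num_features,  num_users,  num_features));

% Plain gradient descent on the regularized cost from section 2.4.
J_hist = zeros(num_iters, 1);
for it = 1:num_iters
  E = R .* (X*Theta' - Ynorm);
  J_hist(it) = sum(sum(E.^2))/2 + lambda/2*(sum(sum(Theta.^2)) + sum(sum(X.^2)));
  X_grad     = E*Theta + lambda*X;
  Theta_grad = E'*X    + lambda*Theta;
  X     = X     - alpha*X_grad;
  Theta = Theta - alpha*Theta_grad;
end

% Predictions: add the per-movie mean back to the learned low-rank product.
P = X*Theta' + repmat(Ymean, 1, num_users);
fprintf('Cost: %.4f -> %.4f\n', J_hist(1), J_hist(end));
for i = find(~R(:, 1))'
  fprintf('Predicted rating of user 1 for movie %d: %.2f\n', i, P(i, 1));
end
```

Adding the mean back means that a user with no ratings at all is predicted each movie's average rating rather than zero.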
