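This post is a repost of the machine learning resources compiled by Sam, a PhD student at Harvard, covering data fundamentals, geometry, probability theory, statistical learning, deep learning, and more. The content is very rich, and this blog is a verbatim copy kept as a backup. For the latest version, read the post Sam maintains: https://sgfin.github.io/learning-resources/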


ML Resources

This is a not-particularly-systematic attempt to curate a handful of my favorite resources for learning statistics and machine learning. This isn’t meant to be comprehensive, and in fact is still missing the vast majority of my favorite explainers. Rather, it’s just a smattering of resources I’ve found myself turning to multiple times and thus would like to have in one place. The organization is as follows:

  • Open Courses and Textbooks: Cover a fairly broad topic reasonably comprehensively, and would take weeks to months to work through start-to-finish.

  • Tutorials, Overviews, and (Individual) Lecture Notes: Explain a specific topic extremely clearly, and take minutes to hours (or a few days tops) to work through from start-to-finish.

  • Cheatsheets: Provide structured access to useful bits of information on the order of seconds.

Finally, I’ve added a section with links to a few miscellaneous websites that often produce great content.

Of the above, the second section is both the most incomplete and the one that I am most excited about. I hope to use it to capture the best explanations of tricky topics that I have read online, to make it easier to re-learn them later when I inevitably forget. (In a perfect world, Chris Olah and/or distill.pub would just write an article on everything, but in the meantime I have to gather scraps from everywhere else.)

If you stumble upon this list and have suggestions for me to add (especially for the middle section!), please feel free to reach out! But I’m only trying to post things on here that I’ve read, so it may be caught in my to-read list for a while before it makes it on here. Of course, the source for this webpage is on github, so you can also just take it.

Open Courses and Textbooks

I’m trying to limit this list to things that are legally accessible online, for free.

Foundation

File Description
Math for ML Book Math for machine learning book by Deisenroth, Faisal, and Ong, available on github.
Boyd Applied Linear Algebra Freely available book from Boyd and Vandenberghe on Applied LA (website).
Fast.ai Computational Linear Algebra Rachel Thomas has put together this great online textbook for computational linear algebra with accompanying youtube videos.
MIT 6.041 Intro Probability John Tsitsiklis et al have put together some great resources. Their classic MIT intro to probability has been archived on OCW and also offered on Edx (Part 1, Part 2). The textbook is also excellent.
Joe Blitzstein’s Stat110 Joe Blitzstein’s undergrad probability course has a high overlap in content with 6.041. Like 6.041, it also has a great textbook, youtube videos, and an edx offering. It’s a bit more playful, as well.
MathematicalMonk This guy is amazing. Some 250 youtube tutorials on ML, Probability, and Information Theory. What’s great about these playlists is any individual video could go into section 2!

Statistics

File Description
Doug Sparks’ Stats 200 Nice course notes from Doug Sparks’ 2014 offering of Stats 200.
Modern Statistics for Modern Biology This online textbook is from Susan Holmes and Wolfgang Huber, and provides a nice and accessible intro to the parts of modern data science relevant to computational biologists. It also happens to be a piece of typographic art, created with bookdown.
Statistical Rethinking Lecture Videos on youtube accompany this very well-reviewed introductory textbook.
Hernan and Robins Causal Inference Book Long-upcoming textbook on causal inference (from the epidemiology perspective), with drafts fairly frequently updated on the web page.

Classic Machine Learning

File Description
CS 229 Lecture Notes Classic note set from Andrew Ng’s amazing grad-level intro to ML: CS229.
ESL and ISL from Hastie et al Beginner (ISL) and Advanced (ESL) presentations of classic machine learning from world-class stats professors. Slides and video for a MOOC on ISL are available here.
CS 228 PGM Notes Really great course notes on Probabilistic Graphical Models from Stanford. PDF export wasn’t ideal so linking only to website.
Blei Foundations of Graphical Models Course 2016 course notes on Foundations of Graphical Models from David Blei (2016 website).

Deep Learning

File Description
Roger Grosse’s CSC321 Notes Notes from Roger Grosse’s CSC 321; full website here. Probably the single best intro to DL course I’ve found from any university. Notes and slides are gorgeous.
Fast.Ai Wonderful set of intro lectures + notebooks from Jeremy Howard and Rachel Thomas. In addition, Hiromi Suenaga has released excellent and self-contained notes of the whole series with timestamp links back to videos: FastAI DL Part 1, FastAI DL Part 2, and FastAI ML.
CS231N DL for Vision Amazing notes from Andrej Karpathy, with lectures on Youtube as well.
CS224 Deep Learning for NLP 2017 Fantastic course notes on Deep Learning for NLP from Stanford’s CS224. Github repo here
CMU CS 11-747 Fantastic course on Deep Learning for NLP from CMU’s Graham Neubig. Really great lecture videos on Youtube here
Deep Learning Book This textbook by Ian Goodfellow, Yoshua Bengio, and Aaron Courville is probably the closest we have to a de facto standard textbook for DL.

Reinforcement Learning

File Description
Sutton and Barto Open RL Book De-facto standard intro to RL, even though the textbook is only now about to be published!
Berkeley Deep Reinforcement Learning RL class from Berkeley taught by top dogs in the field, lectures posted to Youtube.

Optimization

File Description
Boyd Convex Optimization Book Famous and freely available textbook from Boyd and Vandenberghe, accompanied by slides and Youtube videos. More advanced follow-up class here.
NYU Optimization-based Data Analysis 2016 and 2017 Fantastic course notes on Optimization-based data analysis from NYU 2016 website and 2017 website.

Tutorials, Overviews, and (Individual) Lecture Notes

This section is fledgling at best, but was my real motivation in making this page. Archetypes include basically anything on distill.pub, good blog or medium posts, etc. Depth-first learning looks like a great access point here, but I haven’t gotten to do more than skim any of those, yet.

Fundamentals

File Description
CS 229 Linear Algebra Notes Linear algebra reference from Stanford’s Machine Learning Course.
Matrix Calc for DL (pdf here) Really nice overview of matrix calculus for deep learning from Parr/Howard. Citable on arxiv.

Probability and Statistics

File Description
Hernan Selection Bias Nice summary of selection bias via DAGs by Hernan et al.

Classic Machine Learning/Data Science NOS

File Description
Roughgarden SVD Notes Really great presentation of SVD from Tim Roughgarden’s CS168 at Stanford.
Roughgarden PCA Notes Really great presentation of PCA from Tim Roughgarden’s CS168 at Stanford.

Bayesian Machine Learning

File Description
Blei Exponential Families/Variational Inference A couple of the course notes I particularly like from Blei’s 2011 Probabilistic Modeling Course.
Blei Variational Inference Review Overview on Variational Inference from David Blei available on arxiv

Deep Learning

File Description
Adversarial Examples/Robust ML Part 1, Part 2, and Part 3 The Madry lab is one of the top research groups in robust deep learning research. They put together a fantastic intro to these topics on their blog. I hope they keep making posts…
Distill Attention Amazingly clear presentation of the attention mechanism and its (early) variants
Distill Building Interpretability Coolest visualizations of NN internals I’ve ever seen
Distill Feature Visualization Running theme: If it’s on distill.pub, read it.
Chris Olah Understanding LSTMs Chris Olah is a master of his craft, and here offers a fantastic overview of LSTMs and GRUs.

Natural Language Processing

File Description
Chris Olah on Word Embeddings Chris Olah explaining word embeddings and the like.
The Annotated Transformer Harvard’s Sasha Rush created a line-by-line annotation of “Attention is All You Need” that also serves as a working notebook. Pedagogical brilliance, and it would be awesome to do this for a couple papers per year.
Goldberg’s Primer on NNs for NLP Overview of Deep Learning for NLP from Yoav Goldberg downloaded from here.
Neubig’s Tutorial on NNs for NLP Overview of Deep Learning for NLP from Graham Neubig. Downloaded from arxiv and pairs nicely with his course and videos.

Reinforcement Learning

File Description
Karpathy’s Pong From Pixels Andrej Karpathy has a real gift for didactics. This is a self-contained explanation of deep reinforcement learning sufficient to understand a basic atari agent.
Weng’s A (Long) Peek into RL A nice blog post covering the foundations of reinforcement learning
OpenAI’s Intro to RL The introductory tutorial for OpenAI’s new “Spinning Up in Deep RL” website.

Information Theory

File Description
Chris Olah Visual Information Theory As always, Chris Olah creates an amazing presentation both in words and images. Goal is to visualize key information theory concepts.
Cover and Thomas Ch2 - Entropy and Information The extremely well-written introductory chapter from the classic information theory textbook.
Cover and Thomas Ch11 - Info Theory and Statistics The information theory and statistics chapter from the classic information theory textbook.
Deriving Probability Distributions from Maximum Entropy Principle It feels slimy and self-serving to include this, but I wrote this post to better understand how information theory can be used to understand/derive common probability distributions from first principles.
Deriving the information entropy of the multivariate gaussian Another blog post I wrote to try to understand information theory + statistics.

Optimization

File Description
Ruder Gradient Descent Overview (PDF here) Great overview of gradient descent algorithms.
Bottou Large-Scale Optimization Notes on Optimization from Bottou, Curtis, and Nocedal. Downloaded from arxiv.

Cheatsheets

Math

File Description
Probability Cheatsheet Probability cheat sheet, from William Chen’s github
CS 229 TA Cheatsheet 2018 TA cheatsheet from the 2018 offering of Stanford’s Machine Learning Course, Github repo here.
CS Theory Cheatsheet CS theory cheat sheet, originally accessed here

Programming

File Description
R dplyr cheatsheet Cheatsheet for Hadley’s amazing data wrangling package, dplyr. One of many from RStudio
R ggplot2 cheatsheet Cheatsheet for Hadley’s amazing plotting package, ggplot2. One of many from RStudio
SQL Joins cheatsheet Graphical description of classic SQL joins w/ toy code
Python pandas cheatsheet Cheatsheet for python’s data wrangling package, pandas. Downloaded from here
Python numpy cheatsheet Cheatsheet for python’s numerical package, numpy. Downloaded from Datacamp
Python keras cheatsheet Cheatsheet for python’s NN package, keras. Downloaded from Datacamp.
Python scikit-learn cheatsheet Cheatsheet for python’s ML package, scikit-learn. Downloaded from Datacamp.
Python seaborn tutorial Tutorial for python’s plotting system, seaborn. Haven’t found a great one yet for matplotlib.
Graphic Design cheatsheet Cute little graphic design cheatsheet downloaded from here

Miscellaneous websites

File Description
Chris Olah’s Blog Essentially everything on here is gold. I am so grateful for the hours he must put into these posts.
distill.pub Distill navigates a really interesting gap between super-blog and research journal. I wish that we had more publications like this.
Pytorch Tutorials The tutorials put out by the pytorch developers are really fantastic. Easy to see why the community is growing so fast.
Sebastian Ruder’s blog Sebastian has produced a lot of really great explanations, like the one on gradient descent methods I linked to above. He also maintains a website tracking progress on NLP benchmarks
Berkeley AI Research (BAIR) Blog BAIR produces a lot of great research, and uses this blog to release more accessible presentations of their papers.
Off the Convex Path Nice blog on machine learning and optimization.
Ferenc Huszár’s blog Pretty popular blog that has a lot of explorations/musings on ML from an author with a rigorous mathematical perspective
Thibaut Lienart’s Blog This website has some notes on math and optimization that seem interesting.
