http://blog.csdn.net/pipisorry/article/details/43089121

Machine Learning - Study notes on Andrew Ng's course

Introduction to Machine Learning

Contents:

What is Machine Learning

Supervised Learning

Unsupervised Learning

Origins and use cases of machine learning:

Machine Learning
- Grew out of work in AI
- New capability for computers

Examples:
- Database mining
  Large datasets from growth of automation/web.
  E.g., web click data, medical records, biology, engineering
- Applications we can't program by hand.
  E.g., autonomous helicopter, handwriting recognition, most of
  Natural Language Processing (NLP), computer vision.

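Typical use cases of machine learning in business operations

Lead scoring, market segmentation, personalized recommendation, customer churn prevention, pricing support, product roadmaps, credit risk scoring, fraud detection and discovery, and so on.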

[Machine Learning – 9 Most Common Usecases for Higher Business Growth]

Definition of Machine Learning

Arthur Samuel (1959). Machine Learning:

Field of study that gives computers the ability to learn without being explicitly programmed.

Tom Mitchell (1998) Well-posed Learning Problem:

A computer program is said to learn from experience E with respect to some task T and some performance measure P, if its performance on T, as measured by P, improves with experience E.

In other words, a program learns when experience E raises its score on task T, with that score given by the performance measure P.

Example: Suppose your email program watches which emails you do or do not mark as spam, and based on that learns how to better filter spam. What is the task T in this setting?

Classifying emails as spam or not spam. (Task T)
Watching you label emails as spam or not spam. (Experience E)
The number (or fraction) of emails correctly classified as spam/not spam. (Performance measure P)
None of the above; this is not a machine learning problem.

In this example the program learns by watching you label spam (E), its task is to classify emails as spam or not spam (T), and it keeps improving its performance (P) as it learns.
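The T/E/P split in the spam example can be made concrete with a small sketch. Everything here is an assumption for illustration: the word-counting model, the function names `train`, `classify` and `accuracy`, and the made-up emails are not from the course.

```python
from collections import Counter

def train(labeled_emails):
    """Experience E: count how often each word appears in spam and in non-spam."""
    spam_words, ham_words = Counter(), Counter()
    for text, is_spam in labeled_emails:
        (spam_words if is_spam else ham_words).update(text.lower().split())
    return spam_words, ham_words

def classify(model, text):
    """Task T: call an email spam if its words were seen more often in spam."""
    spam_words, ham_words = model
    score = sum(spam_words[w] - ham_words[w] for w in text.lower().split())
    return score > 0

def accuracy(model, test_emails):
    """Performance P: fraction of emails classified correctly."""
    correct = sum(classify(model, text) == is_spam for text, is_spam in test_emails)
    return correct / len(test_emails)

# Hypothetical labeled emails (True = spam), invented for this sketch.
labeled = [
    ("win free money now", True),
    ("meeting agenda for monday", False),
    ("free prize claim now", True),
    ("lunch on monday", False),
]
test_emails = [
    ("claim your free prize", True),
    ("agenda for lunch meeting", False),
    ("win money", True),
    ("see you monday", False),
]

before = accuracy(train([]), test_emails)      # no experience yet
after = accuracy(train(labeled), test_emails)  # after watching four labels
print(before, after)
```

With no experience the model labels everything as not spam, so P only reflects the non-spam half of the test set; after seeing the labeled emails, P on the same test set goes up, which is what Mitchell's definition calls learning.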

Machine learning algorithms
- Supervised learning
- Unsupervised learning

Others: Reinforcement learning, recommender systems.

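Supervised Learning

Supervised learning: the "right answers" are given. Every example in the training set, e.g. a data set of (size in feet², price in $1000s) pairs, comes with its correct value (here, the price). You can think of this as labeled training data.

Regression: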

Predict continuous valued output (price)

Regression example: predicting price from size (figure not included in the source).
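A minimal sketch of regression on (size, price) pairs, assuming a straight-line model fitted by ordinary least squares. The notes do not specify a fitting method at this point, and the numbers below are made up for illustration; `fit_line` is a hypothetical helper.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical training set: size in square feet, price in $1000s.
sizes = [1000, 1500, 2000, 2500]
prices = [200, 300, 400, 500]

slope, intercept = fit_line(sizes, prices)

# The prediction is a continuous value, not a class label.
predicted_price = slope * 1800 + intercept
print(round(predicted_price, 2))
```

The output can be any real number along the fitted line, which is what distinguishes regression from the classification setting.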

Classification:
Discrete valued output (0 or 1)

Classification example 1 (one feature):

Classification example 2 (two features; on the right, examples of additional features):
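For the one-feature case, a classifier can be as simple as a threshold on that feature. The sketch below is an assumption for illustration, not the course's method: it places the threshold halfway between the two class means, assumes class 1 tends to have the larger values, and uses made-up data.

```python
def fit_threshold(values, labels):
    """Put the decision threshold halfway between the two class means."""
    class0 = [v for v, y in zip(values, labels) if y == 0]
    class1 = [v for v, y in zip(values, labels) if y == 1]
    mean0 = sum(class0) / len(class0)
    mean1 = sum(class1) / len(class1)
    return (mean0 + mean1) / 2

def predict(threshold, value):
    """Discrete output: 1 if the feature exceeds the threshold, else 0."""
    return 1 if value > threshold else 0

# Hypothetical labeled data: one feature value per example, label 0 or 1.
values = [1.0, 2.0, 3.0, 6.0, 7.0, 8.0]
labels = [0, 0, 0, 1, 1, 1]

threshold = fit_threshold(values, labels)
print(threshold, predict(threshold, 3.5), predict(threshold, 5.0))
```

With more features the decision boundary becomes a line or surface instead of a single threshold, but the output is still one of a fixed set of labels.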

Example: telling classification and regression apart:

You’re running a company, and you want to develop learning algorithms to address each of two problems.
Problem 1: You have a large inventory of identical items. You want to predict how many of these items will sell over the next 3 months.
Problem 2: You’d like software to examine individual customer accounts, and for each account decide if it has been hacked/compromised.

Question: Should you treat these as classification or as regression problems?

1 Treat both as classification problems.
2 Treat problem 1 as a classification problem, problem 2 as a regression problem.
3 Treat problem 1 as a regression problem, problem 2 as a classification problem.
4 Treat both as regression problems.

Answer: Option 3 is correct.

For problem 1, I would treat this as a regression problem: with thousands of items in stock, the number of items sold can be treated as a real, continuous value.
For problem 2, I would treat this as a classification problem: set the value to predict to 0 to denote an account that has not been hacked, and to 1 to denote an account that has been hacked.

Unsupervised Learning
Unsupervised learning is a learning setting where you give the algorithm a ton of data and ask it to find structure in the data, without giving it the right answers for the examples in the data set.

Difference from supervised learning:

Supervised learning works on labeled data; unsupervised learning works on unlabeled data.

Clustering (one type of unsupervised learning)

Example: grouping individuals by clustering their genes. This is unsupervised learning because we are not telling the algorithm in advance that these are type 1 people, those are type 2 people, those are type 3 people, and so on. Instead, we are saying: here is a bunch of data, find the structure in it.

Example 1: applications of clustering:

- Organizing large computer clusters. Figure out which machines tend to work together; if you can put those machines together, you can make your data center work more efficiently.
- Social network analysis. Given knowledge about which friends you email the most, or given your Facebook friends or your Google+ circles, can we automatically identify cohesive groups of friends, that is, groups of people who all know each other?
- Market segmentation. Many companies have huge databases of customer information. Can you look at this customer data set and automatically group your customers into market segments, so that you can sell or market to each segment more efficiently?
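The notes do not name a clustering algorithm here. As one common choice, the sketch below uses k-means with fixed starting centroids; the function names, the starting centroids and the customer-like data points are all assumptions made up for illustration.

```python
def squared_distance(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def mean_point(points):
    """Coordinate-wise mean of a non-empty list of points."""
    return tuple(sum(coords) / len(points) for coords in zip(*points))

def kmeans(points, centroids, iterations=10):
    """Alternate between assigning points to the nearest centroid and
    moving each centroid to the mean of its assigned points."""
    clusters = []
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)),
                          key=lambda i: squared_distance(p, centroids[i]))
            clusters[nearest].append(p)
        # Keep the old centroid if a cluster ends up empty.
        centroids = [mean_point(c) if c else centroids[i]
                     for i, c in enumerate(clusters)]
    return centroids, clusters

# Hypothetical unlabeled data: two features per customer, no labels given.
points = [(1, 1), (1, 2), (2, 1), (8, 8), (8, 9), (9, 8)]

# Starting centroids are an arbitrary choice for this sketch.
centroids, clusters = kmeans(points, [points[0], points[3]])
print(clusters)
```

No labels are passed in at any point; the two groups come out of the structure of the data alone, which is the defining trait of the unsupervised setting.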

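Example 2 of unsupervised learning: the cocktail party problem

The "cocktail party problem" is a problem in computer speech recognition. Current speech recognition technology can recognize what a single speaker says with fairly high accuracy, but when two or more people speak at once, the recognition rate drops sharply.

Cocktail party problem algorithm (Octave):
% From the indexing, x has one row per recording channel and one column per sample.
[W,s,v] = svd((repmat(sum(x.*x,1),size(x,1),1).*x)*x');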

Example: telling supervised and unsupervised learning apart:

Question: Of the following examples, which would you address using an unsupervised learning algorithm?

1 Given email labeled as spam/not spam, learn a spam filter.
2 Given a set of news articles found on the web, group them into sets of articles about the same story.
3 Given a database of customer data, automatically discover market segments and group customers into different market segments.
4 Given a dataset of patients diagnosed as either having diabetes or not, learn to classify new patients as having diabetes or not.

Answer: 2 and 3.

Examples 1 and 4 are supervised learning problems, while 2 and 3 are unsupervised learning problems.

Explanation: With labeled data, such as spam and non-spam email, we treat the task as a supervised learning problem.
The news story example is exactly the Google News example from the lecture: a clustering algorithm groups articles about the same story together, so that is unsupervised learning.

About Octave as a machine learning development environment

Why? If you use Octave as your learning tool and as your prototyping tool, it will let you learn and prototype learning algorithms much more quickly. Use a tool like Octave to prototype the learning algorithm first, and only after you have gotten it to work, migrate it to C++ or Java or whatever.

Octave installation tutorial: Octave / Matlab Tutorial

Octave documentation

