Loading the data and simple data exploration

In [1]:

import numpy as np

In [2]:

import matplotlib as mpl
import matplotlib.pyplot as plt

In [3]:

from sklearn import datasets

In [4]:

iris = datasets.load_iris()

In [5]:

iris.keys()

Out[5]:

dict_keys(['data', 'target', 'frame', 'target_names', 'DESCR', 'feature_names', 'filename'])
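
The object returned by `load_iris()` behaves like a dictionary whose keys are also available as attributes. The sketch below shows both access styles and the `return_X_y=True` shortcut; the names `X_all` and `y_all` are only illustrative and are not used elsewhere in this notebook.

```python
import numpy as np
from sklearn import datasets

iris = datasets.load_iris()

# Key access and attribute access return the same underlying array.
same_object = iris["data"] is iris.data

# return_X_y=True skips the container and returns (data, target) directly.
X_all, y_all = datasets.load_iris(return_X_y=True)

print(same_object)
print(X_all.shape, y_all.shape)
```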

In [6]:

print(iris.DESCR)
.. _iris_dataset:

Iris plants dataset
--------------------

**Data Set Characteristics:**

    :Number of Instances: 150 (50 in each of three classes)
    :Number of Attributes: 4 numeric, predictive attributes and the class
    :Attribute Information:
        - sepal length in cm
        - sepal width in cm
        - petal length in cm
        - petal width in cm
        - class:
                - Iris-Setosa
                - Iris-Versicolour
                - Iris-Virginica

    :Summary Statistics:

    ============== ==== ==== ======= ===== ====================
                    Min  Max   Mean    SD   Class Correlation
    ============== ==== ==== ======= ===== ====================
    sepal length:   4.3  7.9   5.84   0.83    0.7826
    sepal width:    2.0  4.4   3.05   0.43   -0.4194
    petal length:   1.0  6.9   3.76   1.76    0.9490  (high!)
    petal width:    0.1  2.5   1.20   0.76    0.9565  (high!)
    ============== ==== ==== ======= ===== ====================

    :Missing Attribute Values: None
    :Class Distribution: 33.3% for each of 3 classes.
    :Creator: R.A. Fisher
    :Donor: Michael Marshall (MARSHALL%PLU@io.arc.nasa.gov)
    :Date: July, 1988

The famous Iris database, first used by Sir R.A. Fisher. The dataset is taken
from Fisher's paper. Note that it's the same as in R, but not as in the UCI
Machine Learning Repository, which has two wrong data points.

This is perhaps the best known database to be found in the
pattern recognition literature.  Fisher's paper is a classic in the field and
is referenced frequently to this day.  (See Duda & Hart, for example.)  The
data set contains 3 classes of 50 instances each, where each class refers to a
type of iris plant.  One class is linearly separable from the other 2; the
latter are NOT linearly separable from each other.

.. topic:: References

   - Fisher, R.A. "The use of multiple measurements in taxonomic problems"
     Annual Eugenics, 7, Part II, 179-188 (1936); also in "Contributions to
     Mathematical Statistics" (John Wiley, NY, 1950).
   - Duda, R.O., & Hart, P.E. (1973) Pattern Classification and Scene Analysis.
     (Q327.D83) John Wiley & Sons.  ISBN 0-471-22361-1.  See page 218.
   - Dasarathy, B.V. (1980) "Nosing Around the Neighborhood: A New System
     Structure and Classification Rule for Recognition in Partially Exposed
     Environments".  IEEE Transactions on Pattern Analysis and Machine
     Intelligence, Vol. PAMI-2, No. 1, 67-71.
   - Gates, G.W. (1972) "The Reduced Nearest Neighbor Rule".  IEEE Transactions
     on Information Theory, May 1972, 431-433.
   - See also: 1988 MLC Proceedings, 54-64.  Cheeseman et al"s AUTOCLASS II
     conceptual clustering system finds 3 classes in the data.
   - Many, many more ...
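
The summary statistics quoted in the description can be recomputed from `iris.data`. This is a minimal sketch, assuming the SD column is the sample standard deviation (`ddof=1`); the dictionary name `stats` is only illustrative. Since the description notes that this copy of the data differs from the UCI copy in two points, the last digit of some values may not match the table exactly.

```python
import numpy as np
from sklearn import datasets

iris = datasets.load_iris()

# Per-feature min, max, mean and sample standard deviation.
stats = {}
for name, column in zip(iris.feature_names, iris.data.T):
    stats[name] = {
        "min": column.min(),
        "max": column.max(),
        "mean": column.mean(),
        "sd": column.std(ddof=1),  # assumption: sample SD, not population SD
    }

for name, s in stats.items():
    print(f"{name:<18} min={s['min']:.1f} max={s['max']:.1f} "
          f"mean={s['mean']:.2f} sd={s['sd']:.2f}")
```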

In [7]:

iris.data

Out[7]:

array([[5.1, 3.5, 1.4, 0.2],
       [4.9, 3. , 1.4, 0.2],
       [4.7, 3.2, 1.3, 0.2],
       [4.6, 3.1, 1.5, 0.2],
       [5. , 3.6, 1.4, 0.2],
       [5.4, 3.9, 1.7, 0.4],
       [4.6, 3.4, 1.4, 0.3],
       [5. , 3.4, 1.5, 0.2],
       [4.4, 2.9, 1.4, 0.2],
       [4.9, 3.1, 1.5, 0.1],
       [5.4, 3.7, 1.5, 0.2],
       [4.8, 3.4, 1.6, 0.2],
       [4.8, 3. , 1.4, 0.1],
       [4.3, 3. , 1.1, 0.1],
       [5.8, 4. , 1.2, 0.2],
       [5.7, 4.4, 1.5, 0.4],
       [5.4, 3.9, 1.3, 0.4],
       [5.1, 3.5, 1.4, 0.3],
       [5.7, 3.8, 1.7, 0.3],
       [5.1, 3.8, 1.5, 0.3],
       [5.4, 3.4, 1.7, 0.2],
       [5.1, 3.7, 1.5, 0.4],
       [4.6, 3.6, 1. , 0.2],
       [5.1, 3.3, 1.7, 0.5],
       [4.8, 3.4, 1.9, 0.2],
       [5. , 3. , 1.6, 0.2],
       [5. , 3.4, 1.6, 0.4],
       [5.2, 3.5, 1.5, 0.2],
       [5.2, 3.4, 1.4, 0.2],
       [4.7, 3.2, 1.6, 0.2],
       [4.8, 3.1, 1.6, 0.2],
       [5.4, 3.4, 1.5, 0.4],
       [5.2, 4.1, 1.5, 0.1],
       [5.5, 4.2, 1.4, 0.2],
       [4.9, 3.1, 1.5, 0.2],
       [5. , 3.2, 1.2, 0.2],
       [5.5, 3.5, 1.3, 0.2],
       [4.9, 3.6, 1.4, 0.1],
       [4.4, 3. , 1.3, 0.2],
       [5.1, 3.4, 1.5, 0.2],
       [5. , 3.5, 1.3, 0.3],
       [4.5, 2.3, 1.3, 0.3],
       [4.4, 3.2, 1.3, 0.2],
       [5. , 3.5, 1.6, 0.6],
       [5.1, 3.8, 1.9, 0.4],
       [4.8, 3. , 1.4, 0.3],
       [5.1, 3.8, 1.6, 0.2],
       [4.6, 3.2, 1.4, 0.2],
       [5.3, 3.7, 1.5, 0.2],
       [5. , 3.3, 1.4, 0.2],
       [7. , 3.2, 4.7, 1.4],
       [6.4, 3.2, 4.5, 1.5],
       [6.9, 3.1, 4.9, 1.5],
       [5.5, 2.3, 4. , 1.3],
       [6.5, 2.8, 4.6, 1.5],
       [5.7, 2.8, 4.5, 1.3],
       [6.3, 3.3, 4.7, 1.6],
       [4.9, 2.4, 3.3, 1. ],
       [6.6, 2.9, 4.6, 1.3],
       [5.2, 2.7, 3.9, 1.4],
       [5. , 2. , 3.5, 1. ],
       [5.9, 3. , 4.2, 1.5],
       [6. , 2.2, 4. , 1. ],
       [6.1, 2.9, 4.7, 1.4],
       [5.6, 2.9, 3.6, 1.3],
       [6.7, 3.1, 4.4, 1.4],
       [5.6, 3. , 4.5, 1.5],
       [5.8, 2.7, 4.1, 1. ],
       [6.2, 2.2, 4.5, 1.5],
       [5.6, 2.5, 3.9, 1.1],
       [5.9, 3.2, 4.8, 1.8],
       [6.1, 2.8, 4. , 1.3],
       [6.3, 2.5, 4.9, 1.5],
       [6.1, 2.8, 4.7, 1.2],
       [6.4, 2.9, 4.3, 1.3],
       [6.6, 3. , 4.4, 1.4],
       [6.8, 2.8, 4.8, 1.4],
       [6.7, 3. , 5. , 1.7],
       [6. , 2.9, 4.5, 1.5],
       [5.7, 2.6, 3.5, 1. ],
       [5.5, 2.4, 3.8, 1.1],
       [5.5, 2.4, 3.7, 1. ],
       [5.8, 2.7, 3.9, 1.2],
       [6. , 2.7, 5.1, 1.6],
       [5.4, 3. , 4.5, 1.5],
       [6. , 3.4, 4.5, 1.6],
       [6.7, 3.1, 4.7, 1.5],
       [6.3, 2.3, 4.4, 1.3],
       [5.6, 3. , 4.1, 1.3],
       [5.5, 2.5, 4. , 1.3],
       [5.5, 2.6, 4.4, 1.2],
       [6.1, 3. , 4.6, 1.4],
       [5.8, 2.6, 4. , 1.2],
       [5. , 2.3, 3.3, 1. ],
       [5.6, 2.7, 4.2, 1.3],
       [5.7, 3. , 4.2, 1.2],
       [5.7, 2.9, 4.2, 1.3],
       [6.2, 2.9, 4.3, 1.3],
       [5.1, 2.5, 3. , 1.1],
       [5.7, 2.8, 4.1, 1.3],
       [6.3, 3.3, 6. , 2.5],
       [5.8, 2.7, 5.1, 1.9],
       [7.1, 3. , 5.9, 2.1],
       [6.3, 2.9, 5.6, 1.8],
       [6.5, 3. , 5.8, 2.2],
       [7.6, 3. , 6.6, 2.1],
       [4.9, 2.5, 4.5, 1.7],
       [7.3, 2.9, 6.3, 1.8],
       [6.7, 2.5, 5.8, 1.8],
       [7.2, 3.6, 6.1, 2.5],
       [6.5, 3.2, 5.1, 2. ],
       [6.4, 2.7, 5.3, 1.9],
       [6.8, 3. , 5.5, 2.1],
       [5.7, 2.5, 5. , 2. ],
       [5.8, 2.8, 5.1, 2.4],
       [6.4, 3.2, 5.3, 2.3],
       [6.5, 3. , 5.5, 1.8],
       [7.7, 3.8, 6.7, 2.2],
       [7.7, 2.6, 6.9, 2.3],
       [6. , 2.2, 5. , 1.5],
       [6.9, 3.2, 5.7, 2.3],
       [5.6, 2.8, 4.9, 2. ],
       [7.7, 2.8, 6.7, 2. ],
       [6.3, 2.7, 4.9, 1.8],
       [6.7, 3.3, 5.7, 2.1],
       [7.2, 3.2, 6. , 1.8],
       [6.2, 2.8, 4.8, 1.8],
       [6.1, 3. , 4.9, 1.8],
       [6.4, 2.8, 5.6, 2.1],
       [7.2, 3. , 5.8, 1.6],
       [7.4, 2.8, 6.1, 1.9],
       [7.9, 3.8, 6.4, 2. ],
       [6.4, 2.8, 5.6, 2.2],
       [6.3, 2.8, 5.1, 1.5],
       [6.1, 2.6, 5.6, 1.4],
       [7.7, 3. , 6.1, 2.3],
       [6.3, 3.4, 5.6, 2.4],
       [6.4, 3.1, 5.5, 1.8],
       [6. , 3. , 4.8, 1.8],
       [6.9, 3.1, 5.4, 2.1],
       [6.7, 3.1, 5.6, 2.4],
       [6.9, 3.1, 5.1, 2.3],
       [5.8, 2.7, 5.1, 1.9],
       [6.8, 3.2, 5.9, 2.3],
       [6.7, 3.3, 5.7, 2.5],
       [6.7, 3. , 5.2, 2.3],
       [6.3, 2.5, 5. , 1.9],
       [6.5, 3. , 5.2, 2. ],
       [6.2, 3.4, 5.4, 2.3],
       [5.9, 3. , 5.1, 1.8]])

In [8]:

iris.data.shape

Out[8]:

(150, 4)

In [9]:

iris.feature_names

Out[9]:

['sepal length (cm)',
 'sepal width (cm)',
 'petal length (cm)',
 'petal width (cm)']

In [10]:

iris.target

Out[10]:

array([0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
       0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
       0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
       1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
       1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2,
       2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2,
       2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2])

In [12]:

iris.target.shape

Out[12]:

(150,)

In [13]:

iris.target_names

Out[13]:

array(['setosa', 'versicolor', 'virginica'], dtype='<U10')
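
The integers in `iris.target` index into `iris.target_names`, so indexing one array with the other turns the numeric labels into species names. A minimal sketch, which also counts the samples per class; the names `species`, `labels` and `counts` are only illustrative.

```python
import numpy as np
from sklearn import datasets

iris = datasets.load_iris()

# Fancy indexing: each 0/1/2 label picks the matching species name.
species = iris.target_names[iris.target]

# Count how many samples carry each label.
labels, counts = np.unique(iris.target, return_counts=True)
for label, count in zip(labels, counts):
    print(label, iris.target_names[label], count)
```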

In [19]:

# Keep only the first two features (sepal length and sepal width) so the data can be drawn in 2D
X = iris.data[:, :2]

In [21]:

X.shape

Out[21]:

(150, 2)

In [23]:

plt.scatter(X[:, 0], X[:, 1])

Out[23]:

<matplotlib.collections.PathCollection at 0x41f874b970>

In [24]:

y = iris.target

In [28]:

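# One scatter call per class: the boolean mask y == k selects the rows of class k
plt.scatter(X[y == 0, 0], X[y == 0, 1], color="red")
plt.scatter(X[y == 1, 0], X[y == 1, 1], color="blue")
plt.scatter(X[y == 2, 0], X[y == 2, 1], color="green")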

Out[28]:

<matplotlib.collections.PathCollection at 0x41fd33a190>

In [29]:

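# Same plot, with a distinct marker per class so the classes stay distinguishable without colour
plt.scatter(X[y == 0, 0], X[y == 0, 1], color="red", marker='o')
plt.scatter(X[y == 1, 0], X[y == 1, 1], color="blue", marker='+')
plt.scatter(X[y == 2, 0], X[y == 2, 1], color="green", marker='x')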

Out[29]:

<matplotlib.collections.PathCollection at 0x41fe363c70>
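
The three near-identical `plt.scatter` calls can also be written as a loop that adds axis labels and a legend. This is only a sketch: `plot_classes` is a hypothetical helper, not part of the original notebook, and the colour and marker choices simply mirror the cell above.

```python
import matplotlib.pyplot as plt
from sklearn import datasets

iris = datasets.load_iris()
y = iris.target


def plot_classes(X, y, xlabel, ylabel):
    """Scatter two feature columns with one colour and marker per class."""
    fig, ax = plt.subplots()
    styles = [("red", "o"), ("blue", "+"), ("green", "x")]
    for label, (color, marker) in enumerate(styles):
        mask = y == label  # rows belonging to this class
        ax.scatter(X[mask, 0], X[mask, 1], color=color, marker=marker,
                   label=iris.target_names[label])
    ax.set_xlabel(xlabel)
    ax.set_ylabel(ylabel)
    ax.legend()
    return ax


# Sepal length against sepal width, as in the cells above.
ax = plot_classes(iris.data[:, :2], y,
                  iris.feature_names[0], iris.feature_names[1])
```

In a notebook the figure is displayed automatically; in a plain script, call `plt.show()` afterwards.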

In [30]:

# Switch to the last two features (petal length and petal width)
X = iris.data[:, 2:]

In [31]:

plt.scatter(X[y == 0, 0], X[y == 0, 1], color="red", marker='o')
plt.scatter(X[y == 1, 0], X[y == 1, 1], color="blue", marker='+')
plt.scatter(X[y == 2, 0], X[y == 2, 1], color="green", marker='x')

Out[31]:

<matplotlib.collections.PathCollection at 0x41fe3cd5b0>
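
The dataset description states that one class is linearly separable from the other two. With the petal features this can be checked numerically: a single threshold on petal length is enough to isolate setosa (label 0). A minimal sketch; taking the midpoint of the gap as the threshold is an arbitrary choice made for illustration.

```python
from sklearn import datasets

iris = datasets.load_iris()
y = iris.target
petal_length = iris.data[:, 2]

# Largest setosa petal length versus the smallest one among the other classes.
setosa_max = petal_length[y == 0].max()
others_min = petal_length[y != 0].min()

# Any value inside the gap works; the midpoint is an arbitrary choice.
threshold = (setosa_max + others_min) / 2
predicted_setosa = petal_length < threshold

print(setosa_max, others_min, threshold)
print((predicted_setosa == (y == 0)).all())
```

The same check does not succeed for versicolor against virginica, which matches the description's note that those two classes are not linearly separable from each other.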
