pandas (2): Indexing, Selecting, and Assigning
First, load the data and check that it looks right.
import pandas as pd
reviews = pd.read_csv("winemag-data-130k-v2.csv", index_col=0)
reviews.head()
| | country | description | designation | points | price | province | region_1 | region_2 | taster_name | taster_twitter_handle | title | variety | winery |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | Italy | Aromas include tropical fruit, broom, brimston... | Vulkà Bianco | 87 | NaN | Sicily & Sardinia | Etna | NaN | Kerin O’Keefe | @kerinokeefe | Nicosia 2013 Vulkà Bianco (Etna) | White Blend | Nicosia |
1 | Portugal | This is ripe and fruity, a wine that is smooth... | Avidagos | 87 | 15.0 | Douro | NaN | NaN | Roger Voss | @vossroger | Quinta dos Avidagos 2011 Avidagos Red (Douro) | Portuguese Red | Quinta dos Avidagos |
2 | US | Tart and snappy, the flavors of lime flesh and... | NaN | 87 | 14.0 | Oregon | Willamette Valley | Willamette Valley | Paul Gregutt | @paulgwine | Rainstorm 2013 Pinot Gris (Willamette Valley) | Pinot Gris | Rainstorm |
3 | US | Pineapple rind, lemon pith and orange blossom ... | Reserve Late Harvest | 87 | 13.0 | Michigan | Lake Michigan Shore | NaN | Alexander Peartree | NaN | St. Julian 2013 Reserve Late Harvest Riesling ... | Riesling | St. Julian |
4 | US | Much like the regular bottling from 2012, this... | Vintner's Reserve Wild Child Block | 87 | 65.0 | Oregon | Willamette Valley | Willamette Valley | Paul Gregutt | @paulgwine | Sweet Cheeks 2012 Vintner's Reserve Wild Child... | Pinot Noir | Sweet Cheeks |
1. Select one column from the DataFrame and assign it to a variable
desc = reviews.description
# or
desc = reviews["description"]  # both forms work
desc.head()
"""
0 Aromas include tropical fruit, broom, brimston...
1 This is ripe and fruity, a wine that is smooth...
2 Tart and snappy, the flavors of lime flesh and...
3 Pineapple rind, lemon pith and orange blossom ...
4 Much like the regular bottling from 2012, this...
Name: description, dtype: object
"""
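The two access styles are not always interchangeable. The sketch below uses a small made-up DataFrame (`toy` and its `unit price` column are assumptions for illustration, not part of the wine dataset) to show that attribute access only works when the column name is a valid Python identifier, while bracket access works for any name.

```python
import pandas as pd

# Hypothetical toy data, not the wine dataset.
toy = pd.DataFrame({
    "description": ["crisp", "fruity"],
    "unit price": [15.0, 14.0],  # this column name contains a space
})

by_attribute = toy.description    # works: "description" is a valid identifier
by_bracket = toy["description"]   # works for any column name
same = by_attribute.equals(by_bracket)

# A column whose name contains a space can only be reached with brackets.
prices = toy["unit price"]
```

For that reason the bracket form is usually the safer default when column names come from an external file.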
2. Get the first value of the description column of reviews
first_description = reviews.description.iloc[0]
first_description
'''
"Aromas include tropical fruit, broom, brimstone and dried herb. The palate isn't overly expressive, offering unripened apple, citrus and dried sage alongside brisk acidity."
'''
3. Get the first row of reviews, i.e. the first record
first_row = reviews.iloc[0]
first_row
'''
country Italy
description Aromas include tropical fruit, broom, brimston...
designation Vulkà Bianco
points 87
price NaN
province Sicily & Sardinia
region_1 Etna
region_2 NaN
taster_name Kerin O’Keefe
taster_twitter_handle @kerinokeefe
title Nicosia 2013 Vulkà Bianco (Etna)
variety White Blend
winery Nicosia
Name: 0, dtype: object
'''
4. Select the first 10 values of the description column of reviews.
first_descriptions = reviews.description.iloc[:10]
first_descriptions
'''
0 Aromas include tropical fruit, broom, brimston...
1 This is ripe and fruity, a wine that is smooth...
2 Tart and snappy, the flavors of lime flesh and...
3 Pineapple rind, lemon pith and orange blossom ...
4 Much like the regular bottling from 2012, this...
5 Blackberry and raspberry aromas show a typical...
6 Here's a bright, informal red that opens with ...
7 This dry and restrained wine offers spice in p...
8 Savory dried thyme notes accent sunnier flavor...
9 This has great depth of flavor with its fresh ...
Name: description, dtype: object
'''
5. Select the records at index 1, 2, 3, 5, and 8
index = [1,2,3,5,8]
sample_reviews = reviews.iloc[index]
# sample_reviews = reviews.loc[index]
'''
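The difference between loc and iloc here:
iloc selects by integer position, so it picks the rows at positions 1, 2, 3, 5, and 8 (counting from 0).
loc selects by index label. With this dataset's index, which runs 0, 1, 2, ..., both calls return the same rows. If the index instead started at 1500 and increased from there, passing [1, 2, 3, 5, 8] to loc would raise a KeyError, and you would need [1501, 1502, 1503, 1505, 1508] to get the same rows.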
'''
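The loc/iloc distinction can be checked on a small made-up frame. In this sketch the DataFrame `toy` and its index starting at 1500 are assumptions chosen to mirror the scenario described above; it is not the wine dataset.

```python
import pandas as pd

# Hypothetical toy frame whose index labels start at 1500.
toy = pd.DataFrame({"points": list(range(80, 90))}, index=range(1500, 1510))

positions = [1, 2, 3, 5, 8]
by_position = toy.iloc[positions]   # rows at integer positions 1, 2, 3, 5, 8

labels = [1501, 1502, 1503, 1505, 1508]
by_label = toy.loc[labels]          # rows whose index labels are 1501, ...

same_rows = by_position.equals(by_label)

# Labels 1, 2, 3, 5, 8 do not exist in this index, so loc raises KeyError.
try:
    toy.loc[positions]
    raised = False
except KeyError:
    raised = True
```

Here `same_rows` and `raised` both end up true: the two selections agree only once the positions are translated into the matching labels.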
# result
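6. Create a variable df containing the `country`, `province`, `region_1`, and `region_2` columns of reviews, keeping only the records with index labels 0, 1, 10, and 100. In other words, produce the following DataFrame: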
cols = ['country', 'province', 'region_1', 'region_2']
indices = [0, 1, 10, 100]
df = reviews.loc[indices, cols]
df
# Result:
| | country | province | region_1 | region_2 |
---|---|---|---|---|
0 | Italy | Sicily & Sardinia | Etna | NaN |
1 | Portugal | Douro | NaN | NaN |
10 | US | California | Napa Valley | Napa |
100 | US | New York | Finger Lakes | Finger Lakes |
7. Create a variable df containing the country and variety columns for the first 100 rows:
cols = ['country', 'variety']
df = reviews.loc[:99, cols]
# or
cols_idx = [0, 11]
df = reviews.iloc[:100, cols_idx]
| | country | variety |
---|---|---|
0 | Italy | White Blend |
1 | Portugal | Portuguese Red |
2 | US | Pinot Gris |
3 | US | Riesling |
4 | US | Pinot Noir |
5 | Spain | Tempranillo-Merlot |
... | ... | ... |
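The two versions use different stop values (`:99` versus `:100`) because loc slices include the end label while iloc slices exclude the stop position, as ordinary Python slices do. A minimal sketch on a made-up six-row frame (`toy` is an assumption for illustration):

```python
import pandas as pd

# Hypothetical toy frame with the default index 0..5.
toy = pd.DataFrame({"country": list("abcdef"), "variety": list("uvwxyz")})

# loc slices by label and includes the end label: labels 0, 1, 2.
by_label = toy.loc[:2, ["country", "variety"]]

# iloc slices by position and excludes the stop: positions 0, 1, 2.
by_position = toy.iloc[:3, [0, 1]]

n_label = len(by_label)
n_position = len(by_position)
```

Both selections contain three rows, which is why `reviews.loc[:99, cols]` and `reviews.iloc[:100, cols_idx]` each return 100 rows when the index runs 0, 1, 2, ....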
8. Create a DataFrame named italian_wines containing the reviews of wines produced in Italy. Hint: `reviews.country` holds the country of origin.
italian_wines = reviews[reviews.country == 'Italy']
italian_wines.head()
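The comparison `reviews.country == 'Italy'` produces a boolean Series, and indexing the DataFrame with it keeps only the rows where the value is True. The sketch below shows the intermediate mask on a made-up three-row frame (`toy` is an assumption for illustration):

```python
import pandas as pd

# Hypothetical toy data, not the wine dataset.
toy = pd.DataFrame({
    "country": ["Italy", "US", "Italy"],
    "points": [87, 90, 92],
})

mask = toy.country == "Italy"   # boolean Series aligned with toy's index
italian = toy[mask]             # keeps rows where the mask is True
```

The original index labels of the kept rows are preserved, so `italian` still carries labels 0 and 2.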
9. Create a DataFrame named top_oceania_wines containing the reviews with at least 95 points for wines from Australia or New Zealand.
top_oceania_wines = reviews[
    reviews.country.isin(['Australia', 'New Zealand']) & (reviews.points >= 95)
]
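Two details matter when combining conditions like this: `isin` matches strings exactly, including case, and `&` binds more tightly than `>=`, so a comparison written inline needs its own parentheses. The sketch below names each condition separately on a made-up frame (`toy` and its values are assumptions for illustration):

```python
import pandas as pd

# Hypothetical toy data; the lowercase "new zealand" row is deliberate.
toy = pd.DataFrame({
    "country": ["Australia", "New Zealand", "new zealand", "Italy", "Australia"],
    "points": [96, 95, 97, 99, 90],
})

in_oceania = toy.country.isin(["Australia", "New Zealand"])  # exact, case-sensitive match
high_score = toy.points >= 95

# Element-wise AND of the two boolean Series.
top = toy[in_oceania & high_score]
```

Only rows 0 and 1 survive: the lowercase "new zealand" row fails the `isin` test, the Italy row is outside the country list, and the last Australia row scores below 95.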