pandas - not in index



I'm trying to replace the NaN values in column 'A' (the first of which appears in row 9) with the value from the row before:

y = len(df)
for i in range(1,y):
    if (np.isnan(df.ix[i,'A'])):
        df.ix[i,'A'] = dfH.ix[i-1,'A']

gives the error

KeyError: "[9 'A'] not in index"

but row 9, column A is clearly in my DataFrame. What is going on here?


Updated: 2019-09-04 16:02




You could use fillna with ffill method:

df['A'].fillna(method='ffill', inplace = True)
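
As a minimal sketch of the same idea on made-up data: recent pandas releases deprecate the `method=` argument of `fillna`, and an in-place call on a selected column may not write back to the parent frame, so calling `ffill()` and assigning the result is the more conservative spelling. The frame below is hypothetical and only mimics the question (first NaN at row 9).

```python
import numpy as np
import pandas as pd

# Hypothetical data: the first NaN in column 'A' sits at row 9.
df = pd.DataFrame(
    {"A": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, np.nan, np.nan, 12.0]}
)

# Forward-fill: each NaN takes the last valid value above it.
# Assigning the result back avoids relying on inplace=True.
df["A"] = df["A"].ffill()

print(df["A"].tolist())
```

This replaces the explicit loop from the question, so no row-by-row indexing with `.ix` (which newer pandas versions no longer provide) is needed.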



>>> myseries[myseries == 7]
3    7
dtype: int64
>>> myseries[myseries == 7].index[0]

Though I admit that there should be a better way to do this, this at least avoids iterating and looping through the object and moves it to the C level.
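
A runnable sketch of the lookup above. The contents of `myseries` are an assumption chosen to reproduce the displayed output (value 7 at label 3); the original definition is not shown on this page.

```python
import pandas as pd

# Assumed data: any Series with the value 7 stored at label 3 would do.
myseries = pd.Series([1, 4, 0, 7, 5], index=[0, 1, 2, 3, 4])

# Boolean mask keeps only the entries equal to 7.
matches = myseries[myseries == 7]

# The index of the filtered Series holds the matching labels;
# [0] takes the first one and raises IndexError if nothing matched.
first_label = matches.index[0]
print(first_label)
```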


First, merge the two DataFrames on their indexes with a left join:
pd.merge(staff_df, student_df, how='left', left_index=True, right_index=True)

Second, do a left join of the two DataFrames on the column Name:
pd.merge(staff_df, student_df, how='left', left_on='Name', right_on='Name')
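
To make the two joins concrete, here is a small sketch. The contents of `staff_df` and `student_df` are invented for illustration; only the variable names and the `pd.merge` arguments come from the text above.

```python
import pandas as pd

# Hypothetical example frames.
staff_df = pd.DataFrame(
    {"Name": ["Kelly", "Sally", "James"], "Role": ["HR", "Liaison", "Grader"]}
)
student_df = pd.DataFrame(
    {"Name": ["James", "Mike", "Sally"], "School": ["Business", "Law", "Engineering"]}
)

# Left join on the Name column: every staff row is kept,
# and School is NaN where no student matches.
by_name = pd.merge(staff_df, student_df, how="left", left_on="Name", right_on="Name")

# Left join on the index: move Name into the index of both frames first.
by_index = pd.merge(
    staff_df.set_index("Name"),
    student_df.set_index("Name"),
    how="left",
    left_index=True,
    right_index=True,
)

print(by_name)
print(by_index)
```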


In [5]: a.reset_index().merge(b, how="left").set_index('index')

   col1  to_merge_on  col2
a     1            1     1
b     2            3     2
c     3            4   NaN
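
The following sketch reconstructs inputs that would produce the table above. The definitions of `a` and `b` are assumptions inferred from the output; the point is that `reset_index()` turns the index into a regular column so it survives the merge, and `set_index('index')` restores it afterwards.

```python
import pandas as pd

# Assumed inputs, inferred from the displayed result.
a = pd.DataFrame({"col1": [1, 2, 3], "to_merge_on": [1, 3, 4]}, index=["a", "b", "c"])
b = pd.DataFrame({"to_merge_on": [1, 3, 5], "col2": [1, 2, 3]})

# merge() joins on the shared column 'to_merge_on' and would otherwise
# discard a's index, so we carry it through as the column named 'index'.
merged = a.reset_index().merge(b, how="left").set_index("index")

print(merged)
```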


Sure, you can use .get_loc():

In [45]: df = DataFrame({"pear": [1,2,3], "apple": [2,3,4], "orange": [3,4,5]})

In [46]: df.columns
Out[46]: Index([apple, orange, pear], dtype=object)

In [47]: df.columns.get_loc("pear")
Out[47]: 2

Though to be honest, I don't often need this myself. Usually access by name does what I want ( d
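
A self-contained version of that session, as a sketch. The session above appears to come from an older pandas that sorted dict keys alphabetically; recent versions keep insertion order, so this example passes `columns=` explicitly (an addition, not part of the original) to get the same column layout.

```python
import pandas as pd

# Fix the column order explicitly so the result does not depend on
# how a given pandas version orders dict keys.
df = pd.DataFrame(
    {"pear": [1, 2, 3], "apple": [2, 3, 4], "orange": [3, 4, 5]},
    columns=["apple", "orange", "pear"],
)

# get_loc returns the integer position of a label in the Index.
position = df.columns.get_loc("pear")
print(position)
```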


See the docs: http://pandas.pydata.org/pandas-docs/stable/dsintro.html#column-selection-addition-deletion and http://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.insert.html

Using idx = 0 will insert at the beginning:
df.insert(idx, col_name, value)
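
A short sketch of `DataFrame.insert` placing a column at the front. The frame and the column name `new_col` are hypothetical.

```python
import pandas as pd

# Hypothetical frame with two existing columns.
df = pd.DataFrame({"x": [1, 2, 3], "y": [4, 5, 6]})

# insert(loc, column, value) modifies df in place and returns None;
# loc=0 puts the new column first.
idx = 0
df.insert(idx, "new_col", [10, 20, 30])

print(list(df.columns))
```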



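This is a pretty compact way to do it. For simplicity I overwrote the 'score' values with their ranks, but you can keep the original scores if you prefer (it is just a little more verbose).

df['score'] = df.groupby('group')['score'].rank()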



score 1 2 3
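
A minimal sketch of the per-group ranking, on invented data. Only the column names `group` and `score` come from the snippet above; the values are hypothetical.

```python
import pandas as pd

# Hypothetical scores in two groups.
df = pd.DataFrame(
    {
        "group": ["a", "a", "a", "b", "b"],
        "score": [10, 30, 20, 5, 1],
    }
)

# rank() is computed separately inside each group; the default is
# ascending order, with ties (none here) receiving the average rank.
df["score"] = df.groupby("group")["score"].rank()

print(df)
```

To keep the original scores, the ranks could be assigned to a differently named column instead of overwriting `score`.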





Use groupby with the level parameter to group the data, then use mean and std. If you want the mean as a new column in your existing DataFrame, use transform, which returns a Series with the same index as your df:

grouped = df.groupby(level = ['Country','State', 'City'])
df['Mean'] = grouped['price_observation'].transform('mean')
df['Std'] = grouped['price_observatio
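
A runnable sketch of the grouping above. The data is invented, and the `Std` line is an assumption: the original is cut off, and this version simply mirrors the `Mean` line with `'std'`.

```python
import pandas as pd

# Hypothetical observations indexed by (Country, State, City).
index = pd.MultiIndex.from_tuples(
    [("US", "CA", "LA"), ("US", "CA", "LA"), ("US", "NY", "NYC"), ("US", "NY", "NYC")],
    names=["Country", "State", "City"],
)
df = pd.DataFrame({"price_observation": [10.0, 20.0, 30.0, 50.0]}, index=index)

# Group on the index levels rather than on columns.
grouped = df.groupby(level=["Country", "State", "City"])

# transform broadcasts each group's statistic back to every row of that
# group, so the result lines up with df's index.
df["Mean"] = grouped["price_observation"].transform("mean")
df["Std"] = grouped["price_observation"].transform("std")  # assumed completion

print(df)
```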


That's what map is for. Just do u = s.map(t).
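
As a sketch of what `s.map(t)` does when `t` is itself a Series: each value of `s` is looked up in the index of `t`. The contents of `s` and `t` below are hypothetical, since the original question is not shown here.

```python
import pandas as pd

# Hypothetical inputs: s holds keys, t maps keys (its index) to values.
s = pd.Series([1, 2, 3, 1])
t = pd.Series(["x", "y", "z"], index=[1, 2, 3])

# Each element of s is replaced by t's value at that label;
# keys missing from t's index would become NaN.
u = s.map(t)

print(u.tolist())
```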

What condition do you want to satisfy?

import pandas as pd
df = pd.DataFrame([['This is also a interesting topic', 2],
                   ['the valley of flowers ...', 1],
                   ['found in the hilly terrain', 5],
                   ['we must preserve it ', 6]],
                  columns=['description', 'count'])



