# import the necessary modules
import pandas as pd

banking = pd.read_csv("bank.csv", index_col=0)
banking.dropna(axis=0, inplace=True)
banking.insert(17, 'deposit_yes', 0)
for i in range(0, len(banking['deposit']), 1):
    if banking['deposit'][i] == 'yes':
        banking['deposit_yes'][i] == 1

Error: A value is trying to be set on a copy of a slice from a DataFrame
KeyError: 83, raised from "raise KeyError(key) from err"
Data: https://drive.google.com/file/d/18RGaTG1-ClLAnn3Gm1Rh5PuXLljWdond/view?usp=drivesdk
Can you spot where the mistake is? Thanks.
Posted on 2021-02-21 19:24:36
To get the average age of the customers who have a deposit:
import pandas as pd

URL = "https://drive.google.com/file/d/18RGaTG1-ClLAnn3Gm1Rh5PuXLljWdond/view?usp=drivesdk"
# build a direct-download link from the file id in the sharing URL
banking = pd.read_csv(f'https://drive.google.com/uc?export=download&id={URL.split("/")[-2]}', index_col=0)

# this gives the average age of everyone whose deposit is 'yes'
banking[banking['deposit'] == 'yes']['age'].mean()

# explained step by step
# filter the data to keep only the rows where deposit is 'yes'
banking[banking['deposit'] == 'yes']
# select the age column and take its mean
banking[banking['deposit'] == 'yes']['age'].mean()

Posted on 2021-02-21 19:07:43
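Two things.
First, your loop iterates over positions (0, 1, 2, ...), but banking['deposit'][i] looks a row up by its index label, so i has to be a label, not a position. After dropna the index labels can have gaps, which is probably why label 83 raises the KeyError.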
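The mismatch between positions and labels can be reproduced on a small frame. The sketch below is only an illustration: the four-row DataFrame is a made-up stand-in for bank.csv, and its values are assumptions, not the real data. It shows the positional loop failing on the label that dropna removed, and a label-based loop that assigns through .loc.

```python
import pandas as pd

# Hypothetical stand-in for bank.csv: the row labelled 1 has a missing age.
df = pd.DataFrame({
    "age": [59, None, 41, 55],
    "deposit": ["yes", "no", "yes", "no"],
})
df.dropna(axis=0, inplace=True)      # removes the row labelled 1
remaining_labels = list(df.index)    # the labels now have a gap

# A positional counter used as a label fails at the dropped label.
missing_label = None
try:
    for i in range(len(df)):
        df["deposit"][i]             # label lookup on an integer index
except KeyError as err:
    missing_label = err.args[0]

# Looping over the labels themselves works; .loc writes into df directly
# instead of into a temporary copy.
df["deposit_yes"] = 0
for label in df.index:
    if df.loc[label, "deposit"] == "yes":
        df.loc[label, "deposit_yes"] = 1
```

As posted, the last line of the loop in the question uses ==, which only compares; an assignment needs a single =.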
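Second, (almost) never iterate over the rows, because it is slow. Use the built-in vectorized operations instead. In this case, the new deposit_yes column can be created with the following code: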
import pandas as pd

df = pd.DataFrame({'deposit': ['yes', 'no']})
df['deposit_yes'] = (df['deposit'] == 'yes').astype(int)
df

Result:

  deposit  deposit_yes
0     yes            1
1      no            0

Source: https://stackoverflow.com/questions/66301353