Q: Python pandas - groupby() skips duplicate values in a DataFrame

Stack Overflow user

Asked 2019-06-21 23:57:59

Answers: 2 · Views: 1.4K · Followers: 0 · Votes: 0

I have a pandas DataFrame that I group with the groupby() function, but pandas skips the duplicate values and shows only the unique ones.

Here is a sample DataFrame:

import pandas as pd

data = [
    ['American Mathematical Society', 'Journal', 2, 'Mathematics & Statistics'],
    ['American Mathematical Society', 'Journal', 2, 'Mathematics & Statistics'],
    ['American Mathematical Society', 'Journal', 38, 'Mathematics & Statistics'],
    ['American Mathematical Society', 'Journal', 4, 'Mathematics & Statistics'],
]

df = pd.DataFrame(data, columns=['Provider', 'Type', 'Downloads JR1 2017', 'Field'])

Now I use groupby to group the rows into a list the way I want:

jr1_provider = df.groupby(['Provider', 'Field', 'Downloads JR1 2017'], as_index=False).sum().values.tolist()

Here is the output:

[['American Mathematical Society', 'Mathematics & Statistics', 2, 'JournalJournal'], ['American Mathematical Society', 'Mathematics & Statistics', 4, 'Journal'], ['American Mathematical Society', 'Mathematics & Statistics', 38, 'Journal']]

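However, the output should contain 4 items. I can see that the duplicate was removed from the result, because two of the rows have the value 2 in the 'Downloads JR1 2017' column.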
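To see why only three rows come back, it can help to count how many rows fall under each group key. The sketch below rebuilds the sample data from the question and calls `size()` on the same three-column grouping; the names `keys`, `group_sizes`, `n_groups`, and `rows_in_dup_group` are introduced here for illustration only. Because 'Downloads JR1 2017' is itself one of the grouping keys, the two rows with the value 2 share a key and are merged into one group rather than dropped.

```python
import pandas as pd

data = [
    ['American Mathematical Society', 'Journal', 2, 'Mathematics & Statistics'],
    ['American Mathematical Society', 'Journal', 2, 'Mathematics & Statistics'],
    ['American Mathematical Society', 'Journal', 38, 'Mathematics & Statistics'],
    ['American Mathematical Society', 'Journal', 4, 'Mathematics & Statistics'],
]
df = pd.DataFrame(data, columns=['Provider', 'Type', 'Downloads JR1 2017', 'Field'])

# Same grouping keys as in the question (illustrative variable name).
keys = ['Provider', 'Field', 'Downloads JR1 2017']

# size() counts the rows that landed in each group.
group_sizes = df.groupby(keys).size()
print(group_sizes)

# Number of distinct key combinations, and the row count of the group
# whose download value is 2.
n_groups = len(group_sizes)
rows_in_dup_group = int(
    group_sizes.loc[('American Mathematical Society', 'Mathematics & Statistics', 2)]
)
print(n_groups, rows_in_dup_group)
```

Four input rows produce three groups, and the group for the value 2 holds two rows, which is also why its 'Type' strings were concatenated into 'JournalJournal'.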

Why does this happen, and how can I get all of the results back?

The output I want is the name of the 'Provider' plus the sum of 'Downloads JR1 2017'. For example:

['American Mathematical Society', 46]

Tags: python, pandas

2 Answers

Stack Overflow user

Accepted answer

Posted 2019-06-22 02:01:21

Based on the additional details in your comments, how about:

df.groupby(['Provider', 'Field'], as_index=False).sum()
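As a minimal sketch of how this grouping could produce the list the asker wants, the example below rebuilds the sample data and groups by 'Provider' and 'Field' only. Selecting the 'Downloads JR1 2017' column before calling `sum()` is an addition made here, not part of the answer above; without it, the string column 'Type' may also be "summed" by concatenation. The names `totals` and `provider_totals` are illustrative.

```python
import pandas as pd

data = [
    ['American Mathematical Society', 'Journal', 2, 'Mathematics & Statistics'],
    ['American Mathematical Society', 'Journal', 2, 'Mathematics & Statistics'],
    ['American Mathematical Society', 'Journal', 38, 'Mathematics & Statistics'],
    ['American Mathematical Society', 'Journal', 4, 'Mathematics & Statistics'],
]
df = pd.DataFrame(data, columns=['Provider', 'Type', 'Downloads JR1 2017', 'Field'])

# Group on the descriptive columns only, so every download count
# contributes to the sum instead of acting as a group key.
totals = (
    df.groupby(['Provider', 'Field'], as_index=False)['Downloads JR1 2017']
      .sum()
)

# Keep just the provider name and its total, as a list of lists.
provider_totals = totals[['Provider', 'Downloads JR1 2017']].values.tolist()
print(provider_totals)
```

For the sample data this prints one entry per provider with the total 2 + 2 + 38 + 4 = 46.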

Votes: 2

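Stack Overflow user

Posted 2019-06-22 00:00:05

So you could look into transform (provider_subset here is presumably the asker's DataFrame, df in the example above):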

jr1_provider = provider_subset.groupby(['Provider', 'Field', 'Downloads JR1 2017'], as_index=False).transform('sum').values.tolist()
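For comparison, here is a hedged sketch of what `transform('sum')` does in general: it returns one value per input row, so no rows are collapsed. Unlike the line above, this variation groups by 'Provider' and 'Field' only and transforms the 'Downloads JR1 2017' column; the column name 'Group total' and the variable `with_totals` are made up for this illustration.

```python
import pandas as pd

data = [
    ['American Mathematical Society', 'Journal', 2, 'Mathematics & Statistics'],
    ['American Mathematical Society', 'Journal', 2, 'Mathematics & Statistics'],
    ['American Mathematical Society', 'Journal', 38, 'Mathematics & Statistics'],
    ['American Mathematical Society', 'Journal', 4, 'Mathematics & Statistics'],
]
df = pd.DataFrame(data, columns=['Provider', 'Type', 'Downloads JR1 2017', 'Field'])

# Work on a copy so the original frame stays unchanged.
with_totals = df.copy()

# transform('sum') broadcasts each group's sum back to every row of
# that group, so the result has the same length as the input.
with_totals['Group total'] = (
    with_totals.groupby(['Provider', 'Field'])['Downloads JR1 2017']
               .transform('sum')
)
print(with_totals[['Provider', 'Downloads JR1 2017', 'Group total']])
```

All four rows are kept, each carrying the group total alongside its own download count. This suits cases where the per-row detail is still needed; for a single total per provider, an aggregating `sum()` is the more direct fit.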

Votes: 3

Original page content provided by Stack Overflow. Translation support provided by the Tencent Cloud Xiaowei IT-domain engine.

Original link:

https://stackoverflow.com/questions/56706848
