
Duplicate rows with a condition in a pandas DataFrame (Python)

Stack Overflow user
Asked on 2019-10-01 23:09:22
2 answers, 102 views, 0 followers, 2 votes

I have a problem with my DataFrame.

My df is:

product      power                   brand
product_1    3 x 1500W               brand_A
product_2    2x1000W + 1x100W
product 3    1x1500W + 1x500W        brand_B
product 4    500W

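I need to repeat each row once per unit listed in its power column (for example, "3 x 1500W" becomes three rows of 1500W).

My expected df: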

product      power               brand          new_product
product_1    1500W               brand_A        product_1_1
product_1    1500W               brand_A        product_1_2
product_1    1500W               brand_A        product_1_3
product_2    1000W                              product_2_1
product_2    1000W                              product_2_2
product_2    100W                               product_2_3
product 3    1500W               brand_B        product_3_1
product 3    500W                brand_B        product_3_2
product 4    500W                               product_4_1

Thanks for your help.


2 Answers

Stack Overflow user

Posted on 2019-10-01 23:16:54

I would do a string extraction and a merge, followed by some cleanup:

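import numpy as np

# extract every "<count> x <watts>W" pair: column 0 is the count, column 1 the power
df1 = (df.power.str.extractall(r'(\d+)\s?x\s?(\d+W)')
         .reset_index(level=1,drop=True)
      )

# repeat each power value <count> times (the extracted counts are strings,
# so cast them to int), then align with the original rows on the index
new_df = df.merge(df1[1].repeat(df1[0].astype(int)), 
                  left_index=True, 
                  right_index=True,
                  how='outer')

# update the power column; rows without an "x" (e.g. '500W') keep their value
new_df['power']= np.where(new_df[1].isna(), new_df['power'], new_df[1])

# drop the extra 1 column
new_df.drop(1, axis=1, inplace=True)

# new_product column
new_df['new_product'] = (new_df['product'] + '_' + 
                         new_df.groupby('product').cumcount().add(1).astype(str) )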

Output:

     product  power    brand  new_product
0  product_1  1500W  brand_A  product_1_1
0  product_1  1500W  brand_A  product_1_2
0  product_1  1500W  brand_A  product_1_3
1  product_2  1000W     None  product_2_1
1  product_2  1000W     None  product_2_2
1  product_2   100W     None  product_2_3
2  product 3  1500W  brand_B  product 3_1
2  product 3   500W  brand_B  product 3_2
3  product 4   500W     None  product 4_1
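
To see what the extraction step produces before the merge, the intermediate frame can be inspected on its own. This is a minimal sketch that rebuilds the question's data; the names `pairs` and `unmatched` are illustrative and not part of the answer.

```python
import pandas as pd

df = pd.DataFrame({
    'product': ['product_1', 'product_2', 'product 3', 'product 4'],
    'power': ['3 x 1500W', '2x1000W + 1x100W', '1x1500W + 1x500W', '500W'],
    'brand': ['brand_A', None, 'brand_B', None],
})

# One row per "<count> x <watts>W" match. The second index level numbers
# the matches found within each original row.
pairs = df['power'].str.extractall(r'(\d+)\s?x\s?(\d+W)')
print(pairs)

# Rows whose power string contains no "x" yield no match at all, which is
# why the outer merge and the np.where fallback are needed for them.
unmatched = df.index.difference(pairs.index.get_level_values(0))
print(list(unmatched))
```

Both extracted columns hold strings, so the counts have to be converted to integers before they are used as repeat counts.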
Votes: 3

Stack Overflow user

Posted on 2019-10-01 23:50:55

@Quang Hoang's answer is the more correct one, since it uses only pandas methods. In any case, I'll leave a solution in plain Python:

import pandas as pd
import numpy as np 

cols = ['product', 'power', 'brand']

data = [
  ['product_1', '3 x 1500W', 'brand_A'],
  ['product_2', '2x1000W + 1x100W', np.nan],
  ['product 3', '1x1500W + 1x500W', 'brand_B'],
  ['product 4', '500W', np.nan]
]

df = pd.DataFrame(columns=cols, data=data)
print(df)

Original data:

     product             power    brand
0  product_1         3 x 1500W  brand_A
1  product_2  2x1000W + 1x100W      NaN
2  product 3  1x1500W + 1x500W  brand_B
3  product 4              500W      NaN

Data wrangling

items = df.power.values.tolist()
brands = df.brand.values.tolist()

new_data = []

for idx, (item, brand) in enumerate(zip(items, brands)):
  unit_no = 0  # running unit counter within the current product
  for power_model in item.split('+'):
      parts = power_model.strip().split('x')
      if len(parts) == 2:
        units, val = parts
      else:
        # no "x" means a single unit, e.g. '500W'
        units = 1
        val = parts[0]

      for _ in range(int(units)):
        unit_no += 1
        new_data.append(
            [
              f'product_{idx + 1}', 
              val.strip(),  # '3 x 1500W' leaves a leading space on the value
              brand,
              f'product_{idx + 1}_{unit_no}'
            ]
        )

new_cols = ['product', 'power', 'brand', 'new_product']
df2 = pd.DataFrame(columns=new_cols, data=new_data)

print(df2)

Result

     product  power    brand  new_product
0  product_1  1500W  brand_A  product_1_1
1  product_1  1500W  brand_A  product_1_2
2  product_1  1500W  brand_A  product_1_3
3  product_2  1000W      NaN  product_2_1
4  product_2  1000W      NaN  product_2_2
5  product_2   100W      NaN  product_2_3
6  product_3  1500W  brand_B  product_3_1
7  product_3   500W  brand_B  product_3_2
8  product_4   500W      NaN  product_4_1
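
The parsing step of the loop above can also be isolated into a small helper, which makes it easier to check on its own. This is only a sketch: `expand_power` is a hypothetical name that does not appear in the answer, and it assumes the same "count x value" format joined by "+" as in the question.

```python
def expand_power(spec):
    """Expand a spec such as '2x1000W + 1x100W' into one entry per unit."""
    expanded = []
    for part in spec.split('+'):
        pieces = [p.strip() for p in part.split('x')]
        if len(pieces) == 2:
            units, value = int(pieces[0]), pieces[1]
        else:
            # no "x" means a single unit, e.g. '500W'
            units, value = 1, pieces[0]
        expanded.extend([value] * units)
    return expanded


print(expand_power('3 x 1500W'))
print(expand_power('2x1000W + 1x100W'))
print(expand_power('500W'))
```

Numbering the returned entries from 1 gives the suffix used in the new_product column.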
Votes: 0
Original page content provided by Stack Overflow.
Original link:

https://stackoverflow.com/questions/58187756
