Q: How do I merge the first 12 columns?

Stack Overflow user

Asked 2019-05-13 20:15:49

5 answers · 100 views · 0 followers · Score 1

I have a text file that contains text like this:

Somename of someone                                   1234 7894
Even some more name                                   2345 5343
Even more of the same                                 6572 6456
I am a customer                                       1324 7894
I am another customer                                 5612 3657
Also I am a customer and I am number Three            9631 7411
And I am number four and not the latest one in list   8529 9369
And here I am                                         4567 9876

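I need to create a CSV file from it, but the problem is that the name can span up to 12 columns, so I need to merge everything in the first 12 columns into a single column, so that the CSV file looks like this: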

Somename of someone,123456,789456

cut -d ' ' -f1-11  test | sed "s/[[:space:]]/\\ /g" | sed "s/\t/\\ /g" > test1

This gives me a file containing the first 12 columns.

Tags: text, awk, sed, python, bash

5 Answers

Stack Overflow user

Answered 2019-05-13 21:48:20

With GNU sed, using \s/\S for whitespace/non-whitespace and -E to enable EREs:

$ sed -E 's/\s+(\S+)\s+(\S+)$/,\1,\2/' file
Somename of someone,1234,7894
Even some more name,2345,5343
Even more of the same,6572,6456
I am a customer,1324,7894
I am another customer,5612,3657
Also I am a customer and I am number Three,9631,7411
And I am number four and not the latest one in list,8529,9369
And here I am,4567,9876

and the functional equivalent with any POSIX sed:

$ sed 's/[[:space:]]*\([^[:space:]]\{1,\}\)[[:space:]]*\([^[:space:]]\{1,\}\)$/,\1,\2/' file
Somename of someone,1234,7894
Even some more name,2345,5343
Even more of the same,6572,6456
I am a customer,1324,7894
I am another customer,5612,3657
Also I am a customer and I am number Three,9631,7411
And I am number four and not the latest one in list,8529,9369
And here I am,4567,9876

or with any awk:

$ awk -v OFS=',' '{x=$(NF-1) OFS $NF; sub(/([[:space:]]+[^[:space:]]+){2}$/,""); print $0, x}' file
Somename of someone,1234,7894
Even some more name,2345,5343
Even more of the same,6572,6456
I am a customer,1324,7894
I am another customer,5612,3657
Also I am a customer and I am number Three,9631,7411
And I am number four and not the latest one in list,8529,9369
And here I am,4567,9876
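
The awk command above relies on the regex interval {2}, which some older awk implementations may not accept. A minimal sketch of the same idea without intervals, assuming the columns are separated only by spaces or tabs; the function name name_and_last_two is hypothetical, and lines with fewer than three fields are skipped rather than printed:

```shell
# Hypothetical helper: keep the name as one CSV column, followed by the last two fields.
name_and_last_two() {
  awk -v OFS=',' 'NF >= 3 {
    a = $(NF-1); b = $NF                           # save the last two fields
    sub(/[ \t]+[^ \t]+[ \t]+[^ \t]+[ \t]*$/, "")   # cut them and the padding off $0
    print $0, a, b
  }' "$@"
}

printf '%s\n' 'I am a customer                                       1324 7894' | name_and_last_two
```

Because the name is taken from $0 rather than rebuilt from individual fields, any spacing inside the name is preserved as it appears in the input.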

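Score 2

Stack Overflow user

Answered 2019-05-13 20:35:36

If the various columns that make up the name all belong to the same CSV column and should therefore stay as they are, why not just process the last two columns?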

$ sed 's/\t* *\([0-9]\+\)\t* *\([0-9]\+\)$/,\1,\2/' input_file
Somename of someone,123456,789456
Even some more name,234567,534312
Even more of the same,657212,645613

Score 1

Stack Overflow user

Answered 2019-05-13 21:34:29

If you don't mind switching to GNU AWK, you could do it like this:

gawk 'BEGIN {FIELDWIDTHS = "54 5 5"; OFS = ","} {print $1, $2, $3}' FILE

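Further explanation:

In effect, you have 3 fixed-width columns of data, hence FIELDWIDTHS = "54 5 5".

You want the output field separator to be a comma, hence OFS = ",".

Note that FIELDWIDTHS is a GNU AWK feature.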
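
Where gawk is not available, the same slicing can be approximated in any POSIX awk with substr(). A minimal sketch, assuming the 54/5/5 widths above hold for every line; the function name fixed_width_to_csv is hypothetical, and like the one-liner above it leaves the padding in place:

```shell
# Hypothetical helper: slice three fixed-width columns without FIELDWIDTHS.
fixed_width_to_csv() {
  awk -v OFS=',' '{
    name   = substr($0, 1, 54)    # characters 1-54: the padded name
    first  = substr($0, 55, 5)    # characters 55-59: first number
    second = substr($0, 60, 5)    # characters 60-64: second number
    print name, first, second
  }' "$@"
}

# Build one sample line with the assumed widths and convert it.
printf '%-54s%-5s%s\n' 'I am a customer' 1324 7894 | fixed_width_to_csv
```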

If you don't mind keeping the whitespace in the CSV, you're done.

Otherwise, if you also need to strip the whitespace:

# test.gawk

BEGIN {
  FIELDWIDTHS = "54 5 5"
  OFS = ","
}
{
  for (f=1; f<=NF; f++) {
    sub(/ +$/, "", $f)   # Strip trailing spaces from this field.
  }
  print
}

Testing:

▶ gawk -f test.gawk FILE
Somename of someone,1234,7894
Even some more name,2345,5343
Even more of the same,6572,6456
I am a customer,1324,7894
I am another customer,5612,3657
Also I am a customer and I am number Three,9631,7411
And I am number four and not the latest one in list,8529,9369
And here I am,4567,9876

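(Note that in the second version, as Ed Morton suggested in the comments, I can simply use print at the end, because modifying the fields effectively updates $0, with the field separators replaced by OFS.)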
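
The behaviour described in that note can be seen in isolation: assigning to any field makes awk rebuild $0 from its fields, joined by OFS. A minimal sketch with made-up input (the variable names are only for illustration):

```shell
# Assigning to a field (even to itself) makes awk rebuild $0 using OFS.
rebuilt=$(printf '%s\n' 'a b c' | awk -v OFS=',' '{ $1 = $1; print }')

# Without any field assignment, $0 is printed exactly as it was read.
untouched=$(printf '%s\n' 'a b c' | awk -v OFS=',' '{ print }')

printf '%s\n%s\n' "$rebuilt" "$untouched"
```

The first line printed is a,b,c and the second is a b c, which is why the sub() calls on $f in test.gawk are enough to turn the record into CSV.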

Score 1

Original page content provided by Stack Overflow. Translation support provided by the Tencent Cloud Xiaowei IT-domain engine.

Original link:

https://stackoverflow.com/questions/56112158
