How do I merge the first 12 columns?

Stack Overflow user
Asked 2019-05-13 20:15:49
5 answers, 100 views, 0 followers, score 1

I have a text file that contains text like this:

Somename of someone                                   1234 7894
Even some more name                                   2345 5343
Even more of the same                                 6572 6456
I am a customer                                       1324 7894
I am another customer                                 5612 3657
Also I am a customer and I am number Three            9631 7411
And I am number four and not the latest one in list   8529 9369
And here I am                                         4567 9876

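I need to create a CSV file from this. The problem is that the name can span up to 12 whitespace-separated columns, so I need to merge everything in the first 12 columns into a single column, so that the CSV file looks like this: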

Somename of someone,123456,789456

This command:

cut -d ' ' -f1-11  test | sed "s/[[:space:]]/\\ /g" | sed "s/\t/\\ /g" > test1

gives me a file containing the first 12 columns.


5 Answers

Stack Overflow user

Answered 2019-05-13 21:48:20

Using GNU sed for \s/\S (whitespace/non-whitespace) and -E to enable EREs:

$ sed -E 's/\s+(\S+)\s+(\S+)$/,\1,\2/' file
Somename of someone,1234,7894
Even some more name,2345,5343
Even more of the same,6572,6456
I am a customer,1324,7894
I am another customer,5612,3657
Also I am a customer and I am number Three,9631,7411
And I am number four and not the latest one in list,8529,9369
And here I am,4567,9876

and the functional equivalent with any POSIX sed:

$ sed 's/[[:space:]]*\([^[:space:]]\{1,\}\)[[:space:]]*\([^[:space:]]\{1,\}\)$/,\1,\2/' file
Somename of someone,1234,7894
Even some more name,2345,5343
Even more of the same,6572,6456
I am a customer,1324,7894
I am another customer,5612,3657
Also I am a customer and I am number Three,9631,7411
And I am number four and not the latest one in list,8529,9369
And here I am,4567,9876

or using any awk:

$ awk -v OFS=',' '{x=$(NF-1) OFS $NF; sub(/([[:space:]]+[^[:space:]]+){2}$/,""); print $0, x}' file
Somename of someone,1234,7894
Even some more name,2345,5343
Even more of the same,6572,6456
I am a customer,1324,7894
I am another customer,5612,3657
Also I am a customer and I am number Three,9631,7411
And I am number four and not the latest one in list,8529,9369
And here I am,4567,9876
Score: 2

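Stack Overflow user

Answered 2019-05-13 20:35:36

If the columns that make up the name all belong to the same CSV column and should therefore stay unchanged, why not process only the last two columns?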

$ sed 's/\t* *\([0-9]\+\)\t* *\([0-9]\+\)$/,\1,\2/' input_file
Somename of someone,123456,789456
Even some more name,234567,534312
Even more of the same,657212,645613
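
The command above relies on \+ and \t, which are GNU sed extensions. A minimal portable sketch of the same idea follows, assuming the last two columns are purely numeric and separated by at least one space or tab. The function name last_two_to_csv is hypothetical and only used for illustration.

```shell
# Portable sketch: rewrite only the last two numeric columns.
# [[:blank:]] matches spaces and tabs; [0-9][0-9]* replaces the GNU-only \+.
# Assumption: the two numbers are separated by at least one blank.
last_two_to_csv() {
  sed 's/[[:blank:]]*\([0-9][0-9]*\)[[:blank:]][[:blank:]]*\([0-9][0-9]*\)[[:blank:]]*$/,\1,\2/'
}

printf '%s\n' 'Somename of someone                                   1234 7894' | last_two_to_csv
# prints: Somename of someone,1234,7894
```

Unlike the original expression, this variant requires a blank between the two numbers, so a line that ends in a single number is left untouched instead of being split.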
Score: 1

Stack Overflow user

Answered 2019-05-13 21:34:29

If you don't mind switching to GNU AWK, you can do this:

gawk 'BEGIN {FIELDWIDTHS = "54 5 5"; OFS = ","} {print $1, $2, $3}' FILE

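Further explanation:

  • You effectively have 3 columns of fixed-width data, hence FIELDWIDTHS = "54 5 5"

  • You want the output field separator to be a comma, hence OFS = ","

Note that FIELDWIDTHS is a GNU AWK feature.

If you don't mind keeping the padding whitespace in the CSV, then you are done.

Or, if you also need to remove the whitespace, then: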

# test.gawk

BEGIN {
  FIELDWIDTHS = "54 5 5"
  OFS = ","
}
{
  for (f=1; f<=NF; f++) {
    sub(/ +$/, "", $f)   # Strip trailing spaces from this field.
  }
  print
}

Testing:

▶ gawk -f test.gawk FILE
Somename of someone,1234,7894
Even some more name,2345,5343
Even more of the same,6572,6456
I am a customer,1324,7894
I am another customer,5612,3657
Also I am a customer and I am number Three,9631,7411
And I am number four and not the latest one in list,8529,9369
And here I am,4567,9876
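
For an awk without FIELDWIDTHS, the same fixed-width split can probably be approximated with substr(). This is only a sketch, assuming the same layout as above (name in columns 1-54, followed by two columns of width 5) and a POSIX awk that supports user-defined functions. The names fixed_to_csv and trim are hypothetical.

```shell
# Hypothetical helper: slice fixed-width columns with substr() instead of FIELDWIDTHS.
# Assumes the layout used above: name in columns 1-54, then two 5-wide columns.
fixed_to_csv() {
  awk -v OFS=',' '
    # Strip leading and trailing blanks from a string.
    function trim(s) { gsub(/^[ \t]+|[ \t]+$/, "", s); return s }
    { print trim(substr($0, 1, 54)), trim(substr($0, 55, 5)), trim(substr($0, 60, 5)) }
  '
}

printf '%-54s%-5s%s\n' 'I am a customer' 1324 7894 | fixed_to_csv
# prints: I am a customer,1324,7894
```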

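(Note that in the second version, test.gawk, as Ed Morton suggested in the comments, I can simply use print at the end: modifying a field makes awk rebuild $0, with the fields joined by OFS.)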
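
The rebuild behaviour mentioned in that note can be seen in isolation with a small sketch. It uses the default whitespace field splitting rather than FIELDWIDTHS, and the function name rejoin_with_commas is hypothetical.

```shell
# Assigning to any field (even to itself) makes awk rebuild $0 using OFS.
rejoin_with_commas() {
  awk -v OFS=',' '{ $1 = $1; print }'
}

printf '%s\n' 'a b   c' | rejoin_with_commas          # prints: a,b,c

# Without a field assignment, $0 is printed exactly as it was read.
printf '%s\n' 'a b   c' | awk -v OFS=',' '{ print }'  # prints: a b   c
```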

Score: 1
Original page content provided by Stack Overflow.
Original link:

https://stackoverflow.com/questions/56112158
