How to use sed to replace a specific character inside a substring
I have a CSV file with many rows, like this:
"ABC-DEF-d98263","12345678","176568981","","588","ABC-DEF-11947","","GAUZE PACKING STRIPS 1/4"","","","2019-02-04T19:09:00-05:00","","XXX","XXX","2019-02-12T23:57:48-06:00","XXX-XXX-176568981"
"ABC-DEF-d1494751","98765432","98765432","1073552394","284","ABC-DEF-77997","","ACE WRAP 3"","","","2015-10-29T18:45:00-07:00","Sent","XXX","XXX","2018-04-05T19:38:41-05:00","XXX-XXX-76954940"
I want to replace the "", in column 8 of each row (the one right after GAUZE PACKING STRIPS 1/4 or ACE WRAP 3) with ", while leaving every other "", untouched.
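I tried sed 's/[[:alnum:]]""//g' file.csv, but it also removed the character in front of the quotes (the 4 and the 3). Is there any way to do this? Many thanks!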
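To see why the attempt misbehaves, here is a small reproduction. The row is a shortened, hypothetical stand-in for the real data; only the damaged field is taken from the question.

```shell
# One shortened, hypothetical row: two empty fields plus the damaged field.
row='"","GAUZE PACKING STRIPS 1/4"","",""'

# The original attempt: [[:alnum:]] is part of the match, so the 4 in front
# of the doubled quote is deleted together with both quote characters.
attempt=$(printf '%s\n' "$row" | sed 's/[[:alnum:]]""//g')
printf '%s\n' "$attempt"
```

The empty "" fields survive only because they are preceded by a comma rather than a letter or digit; the field itself loses its last character and both of its closing quotes.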
The task is to replace a specific character inside a substring using sed. A solution using awk had already been given, but it needed to be converted into a sed command.
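The solution is as follows. sed only provides the backreferences \1 through \9 (\10 is read as \1 followed by a literal 0), so the command gets by with two groups: it skips the first seven quoted fields, keeps the eighth field together with its closing quote, and drops the extra quote that follows. This assumes that none of the first seven fields contains an embedded double quote:
$ sed -E 's/^(("[^"]*",){7}"[^"]*")"/\1/' file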
Output:
"ABC-DEF-d98263","12345678","176568981","","588","ABC-DEF-11947","","GAUZE PACKING STRIPS 1/4","","","2019-02-04T19:09:00-05:00","","XXX","XXX","2019-02-12T23:57:48-06:00","XXX-XXX-176568981"
"ABC-DEF-d1494751","98765432","98765432","1073552394","284","ABC-DEF-77997","","ACE WRAP 3","","","2015-10-29T18:45:00-07:00","Sent","XXX","XXX","2018-04-05T19:38:41-05:00","XXX-XXX-76954940"
With that, sed replaces the character in the targeted substring only.
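To check a column-8 substitution on a small scale, the sketch below runs a column-anchored command against a throwaway file. The two rows are shortened, hypothetical versions of the real data, and the pattern assumes that every field is double-quoted, that the first seven fields contain no embedded quote, and that the local sed accepts -E.

```shell
# Hypothetical sample file: two shortened rows with a stray quote in column 8.
tmp=$(mktemp)
cat > "$tmp" <<'EOF'
"a","b","c","","e","f","","GAUZE PACKING STRIPS 1/4"","",""
"a","b","c","d","e","f","","ACE WRAP 3"","","Sent"
EOF

# Group 1 = seven complete quoted fields plus the eighth field including its
# closing quote. The extra quote after it is matched outside the group, so
# replacing the whole match with \1 drops it.
fixed=$(sed -E 's/^(("[^"]*",){7}"[^"]*")"/\1/' "$tmp")
printf '%s\n' "$fixed"
rm -f "$tmp"
```

Because the pattern is anchored with ^ and counts fields, it cannot touch a doubled quote in any other column.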
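The question is how to use sed to replace a specific character inside a substring. The solution is to use sed's regular expressions and capture groups to match and replace that character.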
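You can use a capture group to match anything that sits between double quotes and is immediately followed by another double quote. The regular expression for this is: ("[^",]*")". Two points are worth noting. First, the quotes " are matched literally, and the expression [^",]* in the middle matches any run of characters other than " and ,. This prevents the matched string from containing a quote of its own.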
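Second, the parentheses () form a capture group, and a backslash followed by a number refers to whatever the sub-expression between the () matched. For example, \1 is replaced with the match of the first capture group, \3 with the match of the third, and so on.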
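As a minimal illustration of numbered backreferences, the sketch below reorders three captured words. The input string is made up for the example, and -E is assumed to be available so that the groups can be written with plain parentheses.

```shell
# Three capture groups; the replacement reorders them with \3, \2 and \1.
swapped=$(printf '%s\n' 'GAUZE-PACKING-STRIPS' |
  sed -E 's/([A-Z]+)-([A-Z]+)-([A-Z]+)/\3-\2-\1/')
printf '%s\n' "$swapped"
```

Anything matched outside a group, here the two hyphens, only reappears in the output because the replacement writes it out again.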
A sed script that solves this could look like: sed -re 's/("[^",]*")"/\1/g'.
Note that the last double quote lies outside the capture group, so \1 does not put it back and it is dropped from the output.
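Writing capture groups with plain, unescaped parentheses is a feature of extended regular expressions (ERE), so the -r flag (GNU sed; -E is the more portable spelling) is needed to enable them. Without it, sed uses basic regular expressions (BRE), where a group has to be written as \( \).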
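The two syntaxes can be compared side by side. In this sketch the input line is a shortened, hypothetical row, and both commands are expected to produce the same result.

```shell
line='"588","GAUZE PACKING STRIPS 1/4"","",""'

# ERE: plain parentheses form the group, enabled with -E (or -r in GNU sed).
ere=$(printf '%s\n' "$line" | sed -E 's/("[^",]*")"/\1/g')

# BRE (sed's default): the same group written with escaped parentheses.
bre=$(printf '%s\n' "$line" | sed 's/\("[^",]*"\)"/\1/g')

printf '%s\n%s\n' "$ere" "$bre"
```

Which form to use is mostly a matter of readability; the BRE form has the advantage of needing no extra flag.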
Also note the /g at the end. It makes sed match and replace every occurrence on a line rather than only the first.
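The effect of the g flag is easy to see on a line with two damaged fields. The line below is a made-up combination of the two product names from the question.

```shell
line='"GAUZE PACKING STRIPS 1/4"","","ACE WRAP 3"",""'

# Without g: only the first damaged field on the line is repaired.
first_only=$(printf '%s\n' "$line" | sed -E 's/("[^",]*")"/\1/')

# With g: every damaged field on the line is repaired.
all=$(printf '%s\n' "$line" | sed -E 's/("[^",]*")"/\1/g')

printf '%s\n%s\n' "$first_only" "$all"
```

In the first result the ACE WRAP 3 field still carries its stray quote; in the second both fields are clean and the empty "" fields are unchanged.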
Here is an example:
$ cat test
"ABC-DEF-d98263","12345678","176568981","","588","ABC-DEF-11947","","GAUZE PACKING STRIPS 1/4"","","","2019-02-04T19:09:00-05:00",""","XXX","XXX","2019-02-12T23:57:48-06:00"","XXX-XXX-176568981"
$ cat test | sed -re 's/("[^",]*")"/\1/g'
"ABC-DEF-d98263","12345678","176568981","","588","ABC-DEF-11947","","GAUZE PACKING STRIPS 1/4","","","2019-02-04T19:09:00-05:00","","XXX","XXX","2019-02-12T23:57:48-06:00","XXX-XXX-176568981"
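This example works great! Thanks 🙂 Since I process a lot of files, I added an extra option so that it runs quietly and writes the changes back to the original file: sed -i -re 's/("[^",]*")"/\1/g' file.csv.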
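When many files are edited in place, keeping a backup of each one is a cheap safety net. The sketch below is one way to do that, not the poster's exact workflow: the directory and file names are hypothetical, and it relies on the -i.bak form (suffix attached directly to -i), which both GNU and BSD sed accept.

```shell
# Hypothetical working directory with two small CSV files.
dir=$(mktemp -d)
printf '%s\n' '"588","GAUZE PACKING STRIPS 1/4"",""' > "$dir/a.csv"
printf '%s\n' '"284","ACE WRAP 3"","Sent"' > "$dir/b.csv"

# -i.bak edits each file in place and keeps the original as <name>.bak.
for f in "$dir"/*.csv; do
  sed -i.bak -E 's/("[^",]*")"/\1/g' "$f"
done

result=$(cat "$dir/a.csv" "$dir/b.csv")   # repaired contents
backup=$(cat "$dir/a.csv.bak")            # untouched original of a.csv
printf '%s\n' "$result"
rm -rf "$dir"
```

Once the repaired files have been checked, the .bak copies can be deleted.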