I'm looking for a SimpleGrepSedPerlOrPythonOneLiner that outputs all quotations found in a text.
Example 1:
echo “HAL,” noted Frank,“said that everything was going extremely well.” | SimpleGrepSedPerlOrPythonOneLiner
Standard output:
"HAL," "said that everything was going extremely well."
Example 2:
cat MicrosoftWindowsXPEula.txt | SimpleGrepSedPerlOrPythonOneLiner
Standard output:
"EULA" "Software" "Workstation Computer" "Device" "DRM"
and so on.
Solution
I like this one:
perl -ne 'print "$_\n" foreach /"((?>[^"\\]|\\+[^"]|\\(?:\\\\)*")*)"/g;'
It's a bit verbose, but it handles escaped quotes and backtracking better than the simplest implementation. Here is what it says:
my $re = qr{
    "                 # Begin it with literal quote
    (
      (?>             # prevent backtracking once the alternation has been
                      # satisfied. It either agrees or it does not. This expression
                      # only needs one direction, or we fail out of the branch

          [^"\\]      # a character that is not a dquote or a backslash
      |   \\+         # OR if a backslash, then any number of backslashes followed by
          [^"]        # something that is not a quote
      |   \\          # OR again a backslash
          (?>\\\\)*   # followed by any number of *pairs* of backslashes (as units)
          "           # and a quote
      )*              # any number of *set* qualifying phrases
    )                 # all batched up together
    "                 # Ended by a literal quote
}x;
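For anyone who would rather use Python, here is a rough sketch of the same idea. It is not a port of the Perl regex above. It swaps in a simpler and commonly used escape-aware pattern, where a backslash always consumes the character after it. The names `QUOTED` and `extract_quoted` are made up for illustration, and the sketch assumes plain ASCII double quotes.

```python
import re

# Assumption: a simplified escape-aware pattern, NOT a translation of the
# Perl regex. Inside the quotes we accept either
#   [^"\\]  one character that is neither a double quote nor a backslash, or
#   \\.     a backslash followed by any single character (an escape).
# The two branches can never start on the same character, so the engine
# has no ambiguous paths to backtrack through.
QUOTED = re.compile(r'"((?:[^"\\]|\\.)*)"', re.DOTALL)

def extract_quoted(text):
    """Return the contents of each double-quoted span, escapes left as-is."""
    return QUOTED.findall(text)

sample = r'He said "she replied \"no\" twice" and then "left".'
for quote in extract_quoted(sample):
    print(quote)
```

The escaped quotes inside the first span do not end the match, so the sample yields two results rather than four fragments.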
If you don't need that much power (say the input is likely to be just dialogue rather than structured quoting), then
/"([^"]*)"/
will probably work as well as anything else.
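As a minimal sketch, the simple pattern could be used from Python like this. The helper name `simple_quotes` is hypothetical, and the input is assumed to use straight ASCII double quotes. Curly quotes such as “ and ” would have to be normalized first or added to the pattern.

```python
import re

def simple_quotes(text):
    # [^"]* runs up to the next double quote, so an escaped quote (\")
    # is NOT treated specially here; it simply ends the match early.
    return re.findall(r'"([^"]*)"', text)

line = '"HAL," noted Frank, "said that everything was going extremely well."'

# Re-wrap each captured span in quotes and print them on one line.
print(" ".join('"{}"'.format(q) for q in simple_quotes(line)))
# prints: "HAL," "said that everything was going extremely well."
```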