"a\tb c\nd".split => ["a","b","c","d"] "a\tb c\nd".split(' ') => ["a","d"] "a\tb c\nd".split(/ /) => ["a\tb","c\nd"]
The source (string.c from 2.0.0) is more than 200 lines long and contains a passage like this:
    /* L 5909 */
    else if (rb_enc_asciicompat(enc2) == 1) {
        if (RSTRING_LEN(spat) == 1 && RSTRING_PTR(spat)[0] == ' ') {
            split_type = awk;
        }
    }
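Later, in the code for the awk split type, the actual argument is not even used any more, and the split behaves the same as a plain split with no argument.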
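To make the effect of that check concrete, here is a small sketch comparing a few separators on the same string. It assumes the global `$;` is left at its default of nil, so that a bare `split` also takes the awk path; the variable names are only for illustration.

```ruby
s = "a\tb  c"  # a tab between a and b, two spaces between b and c

one_space  = s.split(" ")   # exactly one space: awk mode, any whitespace run separates
two_spaces = s.split("  ")  # two spaces: an ordinary literal separator
tab        = s.split("\t")  # a tab: an ordinary literal separator
no_arg     = s.split        # no argument (assuming $; is nil): same as one space

p one_space   # ["a", "b", "c"]
p two_spaces  # ["a\tb", "c"]
p tab         # ["a", "b  c"]
p no_arg      # ["a", "b", "c"]
```

Only the one-character string " " is special-cased; every other string is matched literally.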
>Does anyone else feel this is somewhat broken?
>Is there a good reason for it?
>Does "magic" like this happen in Ruby more often than most people think?
Solution
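This is consistent with Perl's split(), which in turn emulates awk's split(). So it is a long-standing tradition with origins in Unix.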
From the perldoc on split:
As another special case, split emulates the default behavior of the command line tool awk when the PATTERN is either omitted or a literal string composed of a single space character (such as ' ' or "\x20", but not e.g. / /). In this case, any leading whitespace in EXPR is removed before splitting occurs, and the PATTERN is instead treated as if it were /\s+/; in particular, this means that any contiguous whitespace (not just a single space character) is used as a separator. However, this special treatment can be avoided by specifying the pattern / / instead of the string " ", thereby allowing only a single space character to be a separator.
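The same rule can be checked on the Ruby side. The sketch below spells out the perldoc description with an explicit regex and compares it with `split(' ')`. The helper `awk_like_split` is a hypothetical name introduced here, not part of Ruby, and the comparison is only claimed for inputs like these samples (ASCII whitespace, no NUL characters).

```ruby
# Hypothetical helper: "remove leading whitespace, then split on /\s+/",
# as the perldoc passage describes the special case.
def awk_like_split(str)
  str.lstrip.split(/\s+/)
end

samples = ["a\tb c\nd", "  leading and   trailing  ", "", "   "]

samples.each do |s|
  # For these samples both columns print the same array.
  puts "#{s.inspect}: #{s.split(' ').inspect} | #{awk_like_split(s).inspect}"
end

# Without the special case, leading and repeated separators produce empty fields.
p "  a b".split(/\s+/)   # ["", "a", "b"]
p "  a  b ".split(/ /)   # ["", "", "a", "", "b"]
```

If only a single literal space should act as the separator, pass the regex / / rather than the string ' ', just as the perldoc text recommends for Perl.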