[Perl] A First Look at Perl Regular Expressions

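Perl's regular expressions are very widely used, and many shell tools accept the same or similar syntax. GNU grep, for example, takes Perl-compatible patterns with the -P option, while sed uses POSIX basic or extended regular expressions, which share the core syntax but not every Perl feature.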

Metacharacters in Perl regular expressions:

    .                   any single character (except newline, by default)
    \s                  a whitespace character (space, tab, newline, ...)
    \S                  a non-whitespace character
    \d                  a digit (0-9)
    \D                  a non-digit
    \w                  a word character (a-z, A-Z, 0-9, _)
    \W                  a non-word character
    [aeiou]             matches a single character in the given set
    [^aeiou]            matches a single character outside the given set
    [0-9]               matches any single digit
    [a-z]               matches any single lowercase letter
    [^0-9]              matches any single non-digit
    [^a-z]              matches any single character that is not a lowercase letter
    (foo|bar|baz)       matches any of the alternatives specified
    ^                   start of string
    $                   end of string
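A minimal sketch of a few of these metacharacters in use. The helper name `describe` and the sample strings are made up for illustration and are not part of the original article.

```perl
use strict;
use warnings;

# Hypothetical helper: list which of four sample patterns match a string.
sub describe {
    my ($s) = @_;
    my @hits;
    push @hits, 'starts with a digit'   if $s =~ /^\d/;
    push @hits, 'contains whitespace'   if $s =~ /\s/;
    push @hits, 'contains a vowel'      if $s =~ /[aeiou]/;
    push @hits, 'ends with foo/bar/baz' if $s =~ /(foo|bar|baz)$/;
    return join(', ', @hits);
}

print "'$_': ", describe($_), "\n" for '3 baz', 'xyz', 'foo';
```

Here '3 baz' satisfies all four tests, 'xyz' satisfies none, and 'foo' satisfies only the last two.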

Other metacharacters (quantifiers):

    *                   zero or more of the previous thing
    +                   one or more of the previous thing
    ?                   zero or one of the previous thing
    {3}                 matches exactly 3 of the previous thing
    {3,6}               matches between 3 and 6 of the previous thing
    {3,}                matches 3 or more of the previous thing
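The quantifiers can be checked with a small table of cases, sketched below. The sample strings are illustrative assumptions. Note that the patterns are anchored with ^ and $: without anchors, a pattern such as /\d{3}/ would also succeed on "1234", because it only needs to find three digits somewhere in the string.

```perl
use strict;
use warnings;

# Each entry: [pattern, a string that should match, a string that should not].
my @cases = (
    [ qr/^ab*c$/,    'ac',      'adc'     ],  # *     : zero or more b
    [ qr/^ab+c$/,    'abbc',    'ac'      ],  # +     : one or more b
    [ qr/^ab?c$/,    'abc',     'abbc'    ],  # ?     : zero or one b
    [ qr/^\d{3}$/,   '123',     '1234'    ],  # {3}   : exactly three digits
    [ qr/^\d{3,6}$/, '12345',   '1234567' ],  # {3,6} : three to six digits
    [ qr/^\d{3,}$/,  '1234567', '12'      ],  # {3,}  : three or more digits
);

for my $c (@cases) {
    my ($re, $yes, $no) = @$c;
    printf "%-22s '%s': %s, '%s': %s\n",
        $re,
        $yes, ($yes =~ $re ? 'match' : 'no match'),
        $no,  ($no  =~ $re ? 'match' : 'no match');
}
```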

Simple examples:

    /^\d+/              string starts with one or more digits
    /^$/                nothing in the string (start and end are adjacent)
    /(\d\s){3}/         three digits, each followed by a whitespace
                        character (e.g. "3 4 5 ")
    /(a.)+/             matches a string in which every odd-numbered letter
                        is a (e.g. "abacadaf")
    # This loop reads from STDIN and prints the non-empty lines:
    while (<>) {
        next if /^$/;
        print;
    }
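One detail of the loop above is worth a sketch: /^$/ skips only lines that are completely empty, because $ also matches just before a trailing newline. A line that contains only spaces still gets printed. The variant below, which uses an in-memory list of made-up lines instead of STDIN, compares /^$/ with /^\s*$/.

```perl
use strict;
use warnings;

# Sample input (an assumption for illustration): two real lines,
# one empty line, and one line containing only spaces.
my @lines = ("first\n", "\n", "   \n", "second\n");

# /^$/ drops only the empty line; /^\s*$/ also drops the whitespace-only one.
my @non_empty = grep { !/^$/ }    @lines;
my @non_blank = grep { !/^\s*$/ } @lines;

print scalar(@non_empty), ' lines pass the /^$/ filter', "\n";
print scalar(@non_blank), ' lines pass the /^\s*$/ filter', "\n";
```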

Notes:

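\d matches a single digit character; for ASCII text it is equivalent to [0-9].

\d+ matches a run of one or more digits, equivalent to [0-9]+.

\D matches a single non-digit character, equivalent to [^0-9].

\D+ matches a run of one or more non-digits, equivalent to [^0-9]+.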
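As a small sketch of \d+ and \D+ side by side, the snippet below pulls the digit runs and the non-digit runs out of one string. The sample text is an assumption chosen for illustration.

```perl
use strict;
use warnings;

my $s = 'Order 66 shipped 2024-01-15';

# In list context, a /g match returns every non-overlapping match.
my @digit_runs     = $s =~ /\d+/g;   # 66, 2024, 01, 15
my @non_digit_runs = $s =~ /\D+/g;   # "Order ", " shipped ", "-", "-"

print join('|', @digit_runs), "\n";
print join('|', @non_digit_runs), "\n";
```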

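\w matches a single word character (a letter, a digit, or an underscore); for ASCII text it is equivalent to [a-zA-Z0-9_].

\w+ is equivalent to [a-zA-Z0-9_]+.

\W matches a single non-word character, equivalent to [^a-zA-Z0-9_].

\W+ is equivalent to [^a-zA-Z0-9_]+.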
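The underscore is easy to overlook: \w includes it, while a hand-written class such as [a-zA-Z0-9] does not. A brief sketch with a made-up sample string shows the difference.

```perl
use strict;
use warnings;

my $text = 'user_name42 = "Ann-Marie";';

# \w+ grabs runs of letters, digits, and underscores.
my @words = $text =~ /\w+/g;

# [a-zA-Z0-9]+ stops at the underscore, so the first token splits in two.
my @alnum = $text =~ /[a-zA-Z0-9]+/g;

print join('|', @words), "\n";   # user_name42|Ann|Marie
print join('|', @alnum), "\n";   # user|name42|Ann|Marie
```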

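\s matches a single whitespace character; for ASCII text it is roughly equivalent to [ \t\n\r\f] (Perl 5.18 and later also include the vertical tab).

\s+ is equivalent to [ \t\n\r\f]+.

\S matches a single non-whitespace character, equivalent to [^ \t\n\r\f].

\S+ is equivalent to [^ \t\n\r\f]+.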
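Two common uses of \s and \S, sketched under the assumption of a messy input string invented for this example: extracting the non-whitespace tokens, and normalizing runs of whitespace.

```perl
use strict;
use warnings;

my $raw = "  alpha\tbeta \n gamma\r\n";

# \S+ picks out each run of non-whitespace characters.
my @tokens = $raw =~ /\S+/g;

# Collapse every run of whitespace to one space, then trim both ends.
(my $clean = $raw) =~ s/\s+/ /g;
$clean =~ s/^\s+|\s+$//g;

print join(',', @tokens), "\n";   # alpha,beta,gamma
print "[$clean]\n";               # [alpha beta gamma]
```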

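\b matches at a word boundary: a position with a word character (\w) on one side and a non-word character, or the start or end of the string, on the other. It matches a position, not a character.

\B matches at any position that is not a word boundary.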
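To make the boundary assertions concrete, here is a short sketch that counts occurrences of "cat" in a made-up sentence three ways: anywhere, as a whole word, and strictly inside a longer word.

```perl
use strict;
use warnings;

my $s = 'cat concatenate category bobcat cat.';

# The "= () =" idiom counts the matches of a /g pattern.
my $anywhere   = () = $s =~ /cat/g;        # every occurrence
my $whole_word = () = $s =~ /\bcat\b/g;    # "cat" standing alone
my $inside     = () = $s =~ /\Bcat\B/g;    # "cat" with word characters on both sides

print "anywhere=$anywhere whole_word=$whole_word inside=$inside\n";
```

Only the "cat" in "concatenate" has word characters on both sides, so \Bcat\B finds a single match, while \bcat\b finds the first word and the "cat" before the final period.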

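(pattern) Parentheses capture the text they match, which is very useful. After a successful match, the text matched by the first pair of parentheses is available in the variable $1, the second in $2, and so on. Inside the same pattern, \1, \2, and so on refer back to what those groups matched.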
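A minimal sketch of both forms, using invented sample strings: captures read after the match, and the backreference \1 used inside a pattern to find a doubled word.

```perl
use strict;
use warnings;

# Captures after a match: in list context the match returns the groups,
# which are also available as $1, $2, $3.
my $date = '2024-01-15';
my ($year, $month, $day) = $date =~ /^(\d{4})-(\d{2})-(\d{2})$/;
print "year=$year month=$month day=$day\n";

# Backreference inside the pattern: \1 must repeat what group 1 matched.
my $line = 'this is is a test';
my ($repeated) = $line =~ /\b(\w+)\s+\1\b/;
print "repeated word: $repeated\n";
```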

/pattern/i The i modifier makes the match case-insensitive: uppercase and lowercase letters are treated as equal when matching.
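The effect of /i is easy to see by filtering the same list with and without it. The list of names below is an assumption for illustration.

```perl
use strict;
use warnings;

my @names = ('Perl', 'PERL', 'perl', 'pearl');

my @case_sensitive   = grep { /^perl$/ }  @names;   # only "perl"
my @case_insensitive = grep { /^perl$/i } @names;   # "Perl", "PERL", "perl"

print "without /i: @case_sensitive\n";
print "with /i:    @case_insensitive\n";
```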

\ To match a metacharacter such as "*" literally, put a backslash in front of it (\*). The backslash removes the character's special meaning.
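The sketch below shows what goes wrong without the backslash, using an invented sample string. It also uses \Q...\E, a standard Perl escape that quotes every metacharacter in an interpolated variable, which helps when the literal text comes from elsewhere.

```perl
use strict;
use warnings;

my $expr = '2*3=6';

# Escaped: \* is a literal asterisk, so the whole "2*3" is matched.
my ($escaped) = $expr =~ /(2\*3)/;

# Unescaped: 2* means "zero or more 2s", so the match is just "3".
my ($unescaped) = $expr =~ /(2*3)/;

print "escaped matched '$escaped', unescaped matched '$unescaped'\n";

# \Q...\E escapes metacharacters in a variable, so "." stays literal.
my $needle = 'a.b';
my $plain_hits  = grep { /^$needle$/ }     ('a.b', 'axb');   # "." matches any character
my $quoted_hits = grep { /^\Q$needle\E$/ } ('a.b', 'axb');   # "." matches only "."

print "plain=$plain_hits quoted=$quoted_hits\n";
```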

References:

http://perldoc.perl.org/perlintro.html#Regular-expressions
