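Basic definitions:

| Pattern | Meaning | Notes |
|---|---|---|
| `ab` | Concatenation: `a` followed by `b` | |
| `a\|b` | Alternation: `a` or `b` | |
| `a*` | Zero or more | Greedy: takes as many matching characters as possible |
| `a+` | One or more | Greedy |
| `a?` | Zero or one | Greedy |
| `+?`, `*?`, `??` | Lazy versions of the quantifiers | The opposite of greedy: takes as few characters as possible and extends the match only when the rest of the expression fails |

Matching an HTML tag (e.g. `<HTML>`):
Before (greedy): `<.+>`
After (lazy): `<.+?>` or `<.*?>`
Alternative to laziness: a greedy repetition combined with a negated character class, which needs no backtracking.
Even better: `<[^>]+>`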
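The difference between the three tag patterns above is easiest to see on a concrete input. The following is a minimal sketch; the class name `GreedyLazyDemo`, the helper `findAll`, and the sample string are hypothetical and chosen only for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class GreedyLazyDemo {
    // Collects every match of the regex in the input, in order of appearance.
    static List<String> findAll(String regex, String input) {
        List<String> matches = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(input);
        while (m.find()) {
            matches.add(m.group());
        }
        return matches;
    }

    public static void main(String[] args) {
        String input = "<b>bold</b>"; // hypothetical sample input

        // Greedy: .+ runs to the last '>' in the input.
        System.out.println(findAll("<.+>", input));    // [<b>bold</b>]
        // Lazy: .+? stops at the first '>' it can reach.
        System.out.println(findAll("<.+?>", input));   // [<b>, </b>]
        // Negated class: [^>]+ can never run past a '>'.
        System.out.println(findAll("<[^>]+>", input)); // [<b>, </b>]
    }
}
```

The lazy and negated-class patterns return the same matches here; the negated class simply reaches them without trying and discarding longer candidates.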
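Shorthand:

| Pattern | Meaning | Equivalent |
|---|---|---|
| `.` | Any character (by default, except line terminators) | |
| `\d` (`\\d` in a Java string) | Any digit | `[0-9]` |
| `\D` | Anything except a digit | `[^0-9]` |
| `\w` | Word character | `[a-zA-Z_0-9]` |
| `\W` | Anything except a word character | `[^\w]` |
| `\s` | Whitespace character | `[ \t\n\x0B\f\r]` (note that the class includes a space) |
| `\S` | Anything except whitespace | `[^\s]` |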
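As a quick illustration of the shorthand classes, the sketch below uses `\d`, `\s`, and `\W` through `String.matches` and `String.replaceAll`. The class and method names are hypothetical; note that each backslash is doubled because the patterns are written as Java string literals.

```java
final class ShorthandDemo {
    // True when the whole string consists of one or more digits.
    static boolean isAllDigits(String s) {
        return s.matches("\\d+");
    }

    // Replaces each run of whitespace (spaces, tabs, newlines) with one space.
    static String collapseWhitespace(String s) {
        return s.replaceAll("\\s+", " ");
    }

    // Removes every character that is not a letter, digit, or underscore.
    static String stripNonWord(String s) {
        return s.replaceAll("\\W", "");
    }

    public static void main(String[] args) {
        System.out.println(isAllDigits("2024"));             // true
        System.out.println(isAllDigits("20a4"));             // false
        System.out.println(collapseWhitespace("a \t\n b"));  // a b
        System.out.println(stripNonWord("a-b_c!"));          // ab_c
    }
}
```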
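Boundaries:

| Pattern | Meaning | Notes |
|---|---|---|
| `^` | Start | Anchor: beginning of the input |
| `$` | End | Anchor: end of the input |
| `\b` | Word boundary | Pay attention to how it is determined: it is a zero-width position where a word character meets a non-word character or the edge of the input |
| `\B` | Anything except a word boundary | |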
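Because anchors and word boundaries match positions rather than characters, counting matches is a convenient way to see where they apply. This is a minimal sketch; `BoundaryDemo`, `countMatches`, and the sample strings are hypothetical, and the `(?m)` (MULTILINE) flag is an addition that the table does not cover.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class BoundaryDemo {
    // Counts how many times the regex matches somewhere in the input.
    static int countMatches(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        int count = 0;
        while (m.find()) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        String text = "cat concat cats cat."; // hypothetical sample input

        // Only the standalone words count: the first "cat" and the one before '.'.
        System.out.println(countMatches("\\bcat\\b", text)); // 2
        // \B requires a word character right before 'c': only inside "concat".
        System.out.println(countMatches("\\Bcat", text));    // 1

        String lines = "a\nb\na";
        // Without flags, ^ matches only at the very start of the input.
        System.out.println(countMatches("^a", lines));       // 1
        // With (?m), ^ also matches right after each line terminator.
        System.out.println(countMatches("(?m)^a", lines));   // 2
    }
}
```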
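Brackets, negation, and others

| Pattern | Meaning | Notes |
|---|---|---|
| `\` | Escape: treat the next character as a literal | Needed for characters such as `[`, `\`, `^`, `-`, `.`, `]`; a `"` must also be escaped inside a Java string literal |
| `(ab)` | Parentheses treat their contents as a group | |
| `[ab]` | `a` or `b`; matches exactly one character | Square brackets mean "one of" |
| `[^ab]` | Negation of `[ab]` | A caret at the start of the brackets means "not" |
| `[a-z]` | Range | A hyphen denotes a range |
| `&&` | Intersection (Boolean AND of two character classes) | Matches only what both classes have in common |
| `[a,b]` | `a`, `,`, or `b` | A comma inside brackets is a literal character, not a separator |
| `((?!xxxx).)*` | Negative lookahead (advanced usage): any run of characters that does not contain `xxxx` | |
| `(?(A)B\|C)`, `(?(A)B)` | Conditional: if A then B else C; if A then B | Syntax from other regex flavors; `java.util.regex` does not support it |

Details: http://ocpsoft.org/opensource/guide-to-regular-expressions-in-java-part-2/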
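The bracket, group, and lookahead rows above can be checked with whole-string matches. The sketch below is illustrative only: `BracketDemo` and `fullMatch` are hypothetical names, and `foo` stands in for the `xxxx` placeholder.

```java
final class BracketDemo {
    // True when the regex matches the entire input, not just a part of it.
    static boolean fullMatch(String regex, String input) {
        return input.matches(regex);
    }

    public static void main(String[] args) {
        // A character class matches exactly one character.
        System.out.println(fullMatch("[ab]", "a"));    // true
        System.out.println(fullMatch("[ab]", "ab"));   // false
        // A group can be repeated as a unit.
        System.out.println(fullMatch("(ab)+", "abab")); // true
        // A caret inside brackets negates the class.
        System.out.println(fullMatch("[^ab]", "c"));   // true
        // A comma inside brackets is just another literal member.
        System.out.println(fullMatch("[a,b]", ","));   // true

        // Negative lookahead: accept only strings that never contain "foo".
        System.out.println(fullMatch("((?!foo).)*", "bar baz")); // true
        System.out.println(fullMatch("((?!foo).)*", "a foo b")); // false
    }
}
```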
Repeating a specific character:

| Pattern | Meaning |
|---|---|
| `X{N}` | Match the single character `X` exactly N times |
| `X{N,}` | At least N times |
| `X{N,M}` | At least N times but no more than M times |
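A short sketch of the three counted quantifiers, using `a` for `X` and small numbers for N and M. The class name `QuantifierDemo` and the helper `fullMatch` are hypothetical.

```java
final class QuantifierDemo {
    // True when the regex matches the entire input.
    static boolean fullMatch(String regex, String input) {
        return input.matches(regex);
    }

    public static void main(String[] args) {
        // X{N}: exactly three.
        System.out.println(fullMatch("a{3}", "aaa"));    // true
        System.out.println(fullMatch("a{3}", "aa"));     // false
        // X{N,}: two or more.
        System.out.println(fullMatch("a{2,}", "aaaa"));  // true
        System.out.println(fullMatch("a{2,}", "a"));     // false
        // X{N,M}: between two and three, inclusive.
        System.out.println(fullMatch("a{2,3}", "aaa"));  // true
        System.out.println(fullMatch("a{2,3}", "aaaa")); // false
    }
}
```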
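Things to note:

| Pattern | Meaning | Notes |
|---|---|---|
| `\\` | In Java, add one extra `\` when escaping | |
| `[(ab)(ba)]` | `[]` does not work for strings | Square brackets match single characters only |
| `[(ab)(bc)]` | Same problem | Roughly the same as `[abc]`; the parentheses are literal characters inside brackets |
| `ab\|ba` or `(ab\|ba)` | The proper way to match alternative strings | |
| `[^a-z]` | Not `a` to `z` | |
| `[^abc]` | Not `a`, `b`, or `c` | |
| `"\\"` -> `"\\\\"` | For Java | The regex `\\` (one literal backslash) is written as the string literal `"\\\\"` |
| `[a-z&&[def]]` | Intersection (be careful) | In `a` to `z` and also one of `d`, `e`, `f`; effectively `[def]` |
| `[a-d[m-p]]` | Union | `a` to `d`, or `m` to `p` |
| `[a-z&&[^bc]]` | Subtraction | `a` to `z` except `b` and `c` |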
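The notes above can be verified directly in Java. The following sketch checks the union, intersection, and subtraction classes, the brackets-versus-alternation pitfall, and the doubled backslash; `ClassOpsDemo` and `fullMatch` are hypothetical names used only for this example.

```java
final class ClassOpsDemo {
    // True when the regex matches the entire input.
    static boolean fullMatch(String regex, String input) {
        return input.matches(regex);
    }

    public static void main(String[] args) {
        // Intersection: must be in a-z and in [def].
        System.out.println(fullMatch("[a-z&&[def]]", "e"));  // true
        System.out.println(fullMatch("[a-z&&[def]]", "a"));  // false
        // Union: a-d or m-p.
        System.out.println(fullMatch("[a-d[m-p]]", "n"));    // true
        // Subtraction: a-z without b and c.
        System.out.println(fullMatch("[a-z&&[^bc]]", "b"));  // false
        System.out.println(fullMatch("[a-z&&[^bc]]", "d"));  // true

        // Brackets match one character; parentheses inside them are literal.
        System.out.println(fullMatch("[(ab)(ba)]", "ab"));   // false
        System.out.println(fullMatch("[(ab)(ba)]", "("));    // true
        // Alternation is the way to match one of several strings.
        System.out.println(fullMatch("ab|ba", "ba"));        // true

        // The string "\\" holds one backslash; the regex needs "\\\\" to match it.
        System.out.println(fullMatch("\\\\", "\\"));         // true
    }
}
```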