Regular Expressions: Groups
1 Basic Concepts
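A capturing group saves the text matched by a subexpression of a regular expression into a group that is either numbered automatically or named explicitly, so that it can be referenced later.
In Java, you retrieve the text that matched a regular expression by calling group() on a Matcher. After a successful match, group() returns the text matched by the whole expression, and group(n) returns the text matched by the n-th capturing group.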
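The explicitly named form mentioned above can be sketched as follows. This is a minimal illustration, assuming Java 7 or later (where the (?<name>...) syntax is available); the class name NamedGroupDemo, the helper extractYear, and the sample input are hypothetical and not part of the original examples.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class NamedGroupDemo {
    // Hypothetical helper: returns the text captured by the group named "year",
    // or null when the input contains no yyyy-MM date.
    static String extractYear(String input) {
        Pattern p = Pattern.compile("(?<year>\\d{4})-(?<month>\\d{2})");
        Matcher m = p.matcher(input);
        return m.find() ? m.group("year") : null;
    }

    public static void main(String[] args) {
        Matcher m = Pattern.compile("(?<year>\\d{4})-(?<month>\\d{2})")
                           .matcher("released 2019-07");
        if (m.find()) {
            System.out.println(m.group());        // whole match: 2019-07
            System.out.println(m.group("year"));  // by name: 2019
            System.out.println(m.group(1));       // same group by number: 2019
        }
    }
}
```

A named group still receives a number, so group("year") and group(1) return the same text here.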
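A capturing group is a sub-pattern of a Pattern delimited by a pair of parentheses "()". Capturing groups let you pick out the parts of a match that you care about most. Groups are numbered by counting their opening parentheses from left to right. For example, the expression "(x)(y\w)(z)" contains three such groups: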
1. x
2. y\w
3. z
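Group numbers are assigned in the order in which each "(" appears, from left to right. Group zero, group(0), always stands for the entire expression. The digits beneath the following pattern mark the group that each parenthesis opens or closes: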
(\d{4})-(\d{2}-(\d\d))
1     1 2      3    32
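To check the numbering of nested groups in practice, the pattern above can be run against a sample date. The sketch below is only an illustration; the class GroupNumberingDemo, the helper allGroups, and the input "2024-05-17" are assumptions introduced here.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GroupNumberingDemo {
    // Hypothetical helper: returns group(0)..group(groupCount()) of the first
    // match, or an empty list when the regex does not match the input.
    static List<String> allGroups(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        List<String> groups = new ArrayList<>();
        if (m.find()) {
            for (int i = 0; i <= m.groupCount(); i++) {
                groups.add(m.group(i));
            }
        }
        return groups;
    }

    public static void main(String[] args) {
        List<String> g = allGroups("(\\d{4})-(\\d{2}-(\\d\\d))", "2024-05-17");
        for (int i = 0; i < g.size(); i++) {
            System.out.println("group(" + i + ") = " + g.get(i));
        }
        // group(0) = 2024-05-17
        // group(1) = 2024
        // group(2) = 05-17
        // group(3) = 17
    }
}
```

Group 3 is nested inside group 2, yet it is numbered after it because its opening parenthesis appears later.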
2 Examples
2.1 Example 1
java code
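// Requires java.util.regex.Pattern and java.util.regex.Matcher;
// @Test comes from whichever test framework is in use.
@Test
public void test2() {
    Pattern p = Pattern.compile("(\\d+,)(\\d+)");
    String s = "123,456-34,345";
    Matcher m = p.matcher(s);
    while (m.find()) {
        System.out.println(m.group());   // whole match, same as group(0)
        System.out.println(m.group(1));  // digits plus the trailing comma
        System.out.println(m.group(2));  // digits after the comma
    }
    System.out.println(m.groupCount());  // number of capturing groups in the pattern
}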
console
123,456
123,
456
34,345
34,
345
2
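The final 2 in the output above is the number of capturing groups in the pattern, not the number of matches found; the two values only happen to coincide for this input. The following sketch separates them. The class GroupCountDemo, its two helpers, and the second input string are hypothetical names and data introduced for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GroupCountDemo {
    // groupCount() depends only on the pattern, so no find() call is needed.
    static int countGroups(String regex) {
        return Pattern.compile(regex).matcher("").groupCount();
    }

    // Counts how many times the regex matches in the input.
    static int countMatches(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        int n = 0;
        while (m.find()) {
            n++;
        }
        return n;
    }

    public static void main(String[] args) {
        String regex = "(\\d+,)(\\d+)";
        System.out.println(countGroups(regex));                     // 2
        System.out.println(countMatches(regex, "123,456-34,345"));  // 2
        System.out.println(countMatches(regex, "1,2-3,4-5,6"));     // 3
    }
}
```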
2.2 Example 2
java code
@Test
public void test3() {
    String regex = "(x)(y\\w*)(z)";
    String input = "exY123z,xy456z";
    // CASE_INSENSITIVE lets "y" in the pattern match "Y" in the input
    Pattern p = Pattern.compile(regex, Pattern.CASE_INSENSITIVE);
    Matcher m = p.matcher(input);
    while (m.find()) {
        System.out.println(m.group());   // whole match
        System.out.println(m.group(2));  // text matched by (y\w*)
    }
    System.out.println(m.groupCount());  // three capturing groups
}
console
xY123z
Y123
xy456z
y456
3
2.3 Example 3
java code
@Test
public void test4() {
    String str = "Hello,World! in Java.";
    Pattern pattern = Pattern.compile("W(or)(ld!)");
    Matcher matcher = pattern.matcher(str);
    while (matcher.find()) {
        System.out.println("Group 0:" + matcher.group(0));
        System.out.println("Group 1:" + matcher.group(1));
        System.out.println("Group 2:" + matcher.group(2));
        // start(n) is the inclusive start index, end(n) the exclusive end index
        System.out.println("Start 0:" + matcher.start(0) + " End 0:" + matcher.end(0));
        System.out.println("Start 1:" + matcher.start(1) + " End 1:" + matcher.end(1));
        System.out.println("Start 2:" + matcher.start(2) + " End 2:" + matcher.end(2));
        // From the start of the whole match to the end of group 1
        System.out.println(str.substring(matcher.start(0), matcher.end(1)));
    }
}
console
Group 0:World!
Group 1:or
Group 2:ld!
Start 0:6 End 0:12
Start 1:7 End 1:9
Start 2:9 End 2:12
Wor
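The offsets printed above relate to group(n) in a simple way: start(n) is inclusive and end(n) is exclusive, so slicing the input with them reproduces the group text. A minimal sketch, assuming the requested group took part in the match; the class GroupOffsetsDemo and the helper sliceGroup are hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GroupOffsetsDemo {
    // Hypothetical helper: rebuilds the text of a group from its offsets.
    // Assumes the group participated in the match; for a group that did not,
    // start() returns -1 and substring() would throw.
    static String sliceGroup(String regex, String input, int group) {
        Matcher m = Pattern.compile(regex).matcher(input);
        if (!m.find()) {
            return null;
        }
        return input.substring(m.start(group), m.end(group));
    }

    public static void main(String[] args) {
        String str = "Hello,World! in Java.";
        System.out.println(sliceGroup("W(or)(ld!)", str, 0)); // World!
        System.out.println(sliceGroup("W(or)(ld!)", str, 1)); // or
        System.out.println(sliceGroup("W(or)(ld!)", str, 2)); // ld!
    }
}
```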
3 Discussion
3.1 Requirement
When using regular expressions, we sometimes need only one part of the matched string. For example:
http://xxx.xx.com/index.jsp?user=linzhicong&aaa=1
http://xxx.xx.com/index.jsp?user=yangfangwei&aaa=1
http://xxx.xx.com/index.jsp?login=hahaha&aaa=1
Suppose the requirement is to extract the value of the user parameter from these URLs (linzhicong and yangfangwei).
3.2 Implementation 1
java code
@Test
public void test5() {
    String str = "http://xxx.xx.com/index.jsp?user=linzhicong&aaa=1";
    Pattern pattern = Pattern.compile("user=(.*)&");
    Matcher matcher = pattern.matcher(str);
    while (matcher.find()) {
        System.out.println(matcher.group(1));  // text between "user=" and the last "&"
    }
}
console
linzhicong
3.3 Implementation 2
java code
@Test
public void test6() {
    // Three URLs concatenated into one input string
    String str = "http://xxx.xx.com/index.jsp?user=linzhicong&aaa=1"
            + "http://xxx.xx.com/index.jsp?user=yangfangwei&aaa=1"
            + "http://xxx.xx.com/index.jsp?login=hahaha&aaa=1";
    Pattern pattern = Pattern.compile("user=(.*)&");
    Matcher matcher = pattern.matcher(str);
    while (matcher.find()) {
        System.out.println(matcher.group(1));
    }
}
console
linzhicong&aaa=1http://xxx.xx.com/index.jsp?user=yangfangwei&aaa=1http://xxx.xx.com/index.jsp?login=hahaha
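Why does this result appear, and how should the pattern be changed to meet the requirement?
3.4 Implementation 3
Because .* is greedy and also matches "&", the group extends to the last "&" in the input. For this input, a narrower character class keeps the group inside a single parameter value, for example "user=([^\.]*)&", "user=([\w]*)&", "user=([a-z]*)&", and so on.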
java code
@Test
public void test7() {
    String str = "http://xxx.xx.com/index.jsp?user=linzhicong&aaa=1"
            + "http://xxx.xx.com/index.jsp?user=yangfangwei&aaa=1"
            + "http://xxx.xx.com/index.jsp?login=hahaha&aaa=1";
    // [a-z]* cannot match "&", so each capture stops at the end of the value
    Pattern pattern = Pattern.compile("user=([a-z]*)&");
    Matcher matcher = pattern.matcher(str);
    while (matcher.find()) {
        System.out.println(matcher.group(1));
    }
}
console
linzhicong
yangfangwei
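Two other common ways to obtain the same result are a reluctant quantifier, "user=(.*?)&", and a negated character class that stops at the separator, "user=([^&]*)". Neither pattern appears in the original examples; the sketch below is an illustration only, and the class UserParamDemo and the helper extractUsers are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class UserParamDemo {
    // Hypothetical helper: collects group(1) of every match of the regex.
    static List<String> extractUsers(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        List<String> users = new ArrayList<>();
        while (m.find()) {
            users.add(m.group(1));
        }
        return users;
    }

    public static void main(String[] args) {
        String input = "http://xxx.xx.com/index.jsp?user=linzhicong&aaa=1"
                + "http://xxx.xx.com/index.jsp?user=yangfangwei&aaa=1"
                + "http://xxx.xx.com/index.jsp?login=hahaha&aaa=1";

        // Greedy: a single capture that runs on to the last "&"
        System.out.println(extractUsers("user=(.*)&", input).size());  // 1
        // Reluctant: stops at the first "&" after each "user="
        System.out.println(extractUsers("user=(.*?)&", input));  // [linzhicong, yangfangwei]
        // Negated class: the group can never contain "&"
        System.out.println(extractUsers("user=([^&]*)", input)); // [linzhicong, yangfangwei]
    }
}
```

The negated class does not require a trailing "&", so it would also capture a user value that is the last parameter in a URL.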