grep Usage and Regular Expressions Explained

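The first thing to remember is that regular expressions are not the same as shell wildcards (globs): the same symbol can mean different things in each. A regular expression is just a notation for describing string patterns, and any tool that implements the notation can process it. vim, grep, awk, and sed all support regular expressions.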

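1. grep Usage

Description

Searches the input for lines matching a regular expression and prints them.

Usage

# grep [-acinsvx] 'pattern' filename

The pattern may be a regular expression.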

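Options

-a: treat binary files as text

-c: print only the number of matching lines

-n: prefix each matching line with its line number

-i: ignore case when matching

-s: suppress error messages about nonexistent or unreadable files

-v: invert the match, i.e. select the lines that do not match

-x: select only lines that the pattern matches in their entirety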
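
The options above can be tried on a tiny throwaway file. The following is a minimal sketch; the file contents and the variable names are made up for illustration and are not part of the article's sample file.

```shell
# Build a small temporary sample file (hypothetical content).
sample=$(mktemp)
printf '%s\n' 'The cat' 'the dog' 'other' 'the' > "$sample"

count_c=$(grep -c 'the' "$sample")    # lines containing "the": "the dog", "other", "the"
count_i=$(grep -ci 'the' "$sample")   # -i additionally matches "The cat"
count_v=$(grep -cv 'the' "$sample")   # lines that do NOT contain "the"
whole_x=$(grep -nx 'the' "$sample")   # -x: the whole line must match; -n adds the line number

echo "$count_c $count_i $count_v $whole_x"   # prints: 3 4 1 4:the
rm -f "$sample"
```

Note that -c counts matching lines, not individual matches, so a line containing the pattern twice still counts once.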

Examples

Contents of regular_express.txt

"Open Source" is a good mechanism to develop programs.
apple is my favorite food.
Football game is not use feet only.
this dress doesn't fit me.
However,this dress is about $ 3183 dollars.
GNU is free air not free beer.
I can't finish the test.
Oh! the soup taste good!
motorcycle is cheap than car.
the symbol '*' is represented as star.
The gd software is a library for drafting programs.
You are the best is menu you are the no.1.
The world is the same with 'glad'.
google is the best tools for search keyword.
goooooogle yes!
go! go! Let's go.
#I am VBird

Use -n to search for lines containing the, and print their line numbers

# grep -n 'the' regular_express.txt

7:I can't finish the test.
8:Oh! the soup taste good!
10:the symbol '*' is represented as star.
12:You are the best is menu you are the no.1.
13:The world is the same with 'glad'.
14:google is the best tools for search keyword.

Use -v to search for lines that do not contain the, and print their line numbers

# grep -nv 'the' regular_express.txt

1:"Open Source" is a good mechanism to develop programs.
2:apple is my favorite food.
3:Football game is not use feet only.
4:this dress doesn't fit me.
5:However,this dress is about $ 3183 dollars.
6:GNU is free air not free beer.
9:motorcycle is cheap than car.
11:The gd software is a library for drafting programs.
15:goooooogle yes!
16:go! go! Let's go.
17:#I am VBird

2. grep and Basic Regular Expressions

2.1 Syntax

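\: escape; removes the special meaning of the character that follows

^: matches the beginning of a line. ^ does not mean the same as [^ ]: '^string' matches string only at the start of a line

$: matches the end of a line (a position, not a character)

^$: matches an empty line, because nothing sits between the line start and the line end

\<: matches the beginning of a word (supported by GNU grep)

\>: matches the end of a word (supported by GNU grep)

[ ]: matches any single character listed inside the brackets; for example, [ade] matches a, d, or e

[^ ]: with ^ as the first character inside [ ], matches any single character that is not listed inside the brackets

[ - ]: a character range. [a-z] is the lowercase letters, [0-9] the digits 0 through 9, [A-Z] the uppercase letters, and [a-zA-Z0-9] any English letter or digit

.: matches any single character

*: matches zero or more repetitions of the preceding character

.*: matches zero or more arbitrary characters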
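
Three of the entries above are easy to misread: * repeats the preceding character rather than acting as a wildcard, [^ ] excludes a single character rather than a whole line, and \< \> anchor to word boundaries. A small sketch with made-up input (the word lists and variable names are illustrative, and the word anchors assume GNU grep):

```shell
words=$(printf '%s\n' 'gd' 'god' 'good' 'gxd')

# "*" repeats the PREVIOUS character: g, zero or more o, d.
star=$(printf '%s\n' "$words" | grep -c 'go*d')      # gd, god, good -> 3
# ".*" is "any character, repeated": g, anything, d.
dotstar=$(printf '%s\n' "$words" | grep -c 'g.*d')   # all four lines -> 4

# [^g] consumes one character that is not g; it does not reject lines containing g.
negated=$(printf '%s\n' 'goo' 'foo goo' 'oo' | grep -c '[^g]oo')   # only "foo goo" -> 1

# \< and \> match at word boundaries (GNU grep).
word=$(printf '%s\n' 'the test' 'other' 'bathe' | grep -c '\<the\>')   # only "the test" -> 1

echo "$star $dotstar $negated $word"
```

The line oo does not match '[^g]oo' because the bracket expression must consume a real character in front of oo.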

2.2 实例

Use [ ] to match one character from a set

[ ] stands for any one of the characters inside it; for example, [ae] means a or e.

# grep -n 't[ae]st' regular_express.txt 

7:I can't finish the test.
8:Oh! the soup taste good!

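Use [^ ] to match a character that is not in the set

To match lines in which oo is preceded by a character other than g, use '[^g]oo' as the pattern. A line matches as soon as any one oo is preceded by a non-g character, so a line that contains goo can still match through another occurrence.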

# grep -n '[^g]oo' regular_express.txt 

2:apple is my favorite food.
3:Football game is not use feet only.
14:google is the best tools for search keyword.
15:goooooogle yes!

Use [ - ] to specify a character range

Match lines containing a digit

# grep -n '[0-9]' regular_express.txt 

5:However,this dress is about $ 3183 dollars.
12:You are the best is menu you are the no.1.

Use ^ to match lines that begin with the pattern following it

Match lines beginning with the

# grep -n '^the' regular_express.txt 

10:the symbol '*' is represented as star.

Match lines beginning with a lowercase letter

# grep -n '^[a-z]' regular_express.txt 

2:apple is my favorite food.
4:this dress doesn't fit me.
9:motorcycle is cheap than car.
10:the symbol '*' is represented as star.
14:google is the best tools for search keyword.
15:goooooogle yes!
16:go! go! Let's go.

Match lines that do not begin with an English letter

# grep -n '^[^a-zA-Z]' regular_express.txt 

1:"Open Source" is a good mechanism to develop programs.
17:#I am VBird
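
Ranges such as [a-z] can behave differently depending on the locale. POSIX character classes such as [[:lower:]], [[:alpha:]], and [[:digit:]] express the same intent without naming a range. A minimal sketch on made-up input (the lines and variable names are illustrative):

```shell
input=$(printf '%s\n' 'apple' 'Banana' '#note' '42 items')

# Lines that start with a lowercase letter.
lower=$(printf '%s\n' "$input" | grep -c '^[[:lower:]]')      # apple -> 1
# Lines that start with something other than a letter.
nonalpha=$(printf '%s\n' "$input" | grep -c '^[^[:alpha:]]')  # "#note", "42 items" -> 2
# Lines that contain a digit anywhere.
digits=$(printf '%s\n' "$input" | grep -c '[[:digit:]]')      # "42 items" -> 1

echo "$lower $nonalpha $digits"
```

The class name includes its own brackets, so it sits inside a second pair: [[:lower:]], not [:lower:].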

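Use $ to match lines that end with the pattern preceding it

Match lines ending with .

// . is a special character in regular expressions, so it must be escaped with \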

# grep -n '\.$' regular_express.txt

1:"Open Source" is a good mechanism to develop programs.
2:apple is my favorite food.
3:Football game is not use feet only.
4:this dress doesn't fit me.
5:However,this dress is about $ 3183 dollars.
6:GNU is free air not free beer.
7:I can't finish the test.
9:motorcycle is cheap than car.
10:the symbol '*' is represented as star.
11:The gd software is a library for drafting programs.
12:You are the best is menu you are the no.1.
13:The world is the same with 'glad'.
14:google is the best tools for search keyword.
16:go! go! Let's go.

Use '^$' to match empty lines (the sample file has none, so nothing is printed)

# grep -n '^$' regular_express.txt

Use -v '^$' to match non-empty lines

# grep -vn '^$' regular_express.txt 

1:"Open Source" is a good mechanism to develop programs.
2:apple is my favorite food.
3:Football game is not use feet only.
4:this dress doesn't fit me.
5:However,this dress is about $ 3183 dollars.
6:GNU is free air not free beer.
7:I can't finish the test.
8:Oh! the soup taste good!
9:motorcycle is cheap than car.
10:the symbol '*' is represented as star.
11:The gd software is a library for drafting programs.
12:You are the best is menu you are the no.1.
13:The world is the same with 'glad'.
14:google is the best tools for search keyword.
15:goooooogle yes!
16:go! go! Let's go.
17:#I am VBird
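
The pattern '^$' only matches lines that are completely empty. Lines that contain nothing but spaces or tabs are not matched, which is a common surprise when cleaning up files. The sketch below uses a made-up temporary file to contrast '^$' with '^[[:space:]]*$'; the file contents and variable names are illustrative.

```shell
f=$(mktemp)
# Four lines: text, a truly empty line, a line of three spaces, text.
printf 'first\n\n   \nlast\n' > "$f"

empty=$(grep -c '^$' "$f")                  # only the truly empty line -> 1
blank=$(grep -c '^[[:space:]]*$' "$f")      # empty or whitespace-only -> 2
content=$(grep -vc '^[[:space:]]*$' "$f")   # lines with visible content -> 2

echo "$empty $blank $content"
rm -f "$f"
```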

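The any-character . and the repetition *

In bash, * is a wildcard that stands for any number of arbitrary characters, but in a regular expression it means zero or more repetitions of the preceding character. For example, oo* requires the first o, while the second o may appear zero, one, or many times, so the pattern matches one or more consecutive o's. A . stands for exactly one arbitrary character, which must be present. The bash pattern g??d corresponds to the regular expression 'g..d', which good, gxxd, and gabd all satisfy.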
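
The difference between the shell wildcard and the regular expression can be seen by applying both to the same set of file names. This is a sketch only: the directory, the file names, and the helper count_args are made up for illustration.

```shell
# Helper (hypothetical): prints how many arguments it received.
count_args() { echo "$#"; }

dir=$(mktemp -d)
touch "$dir/gd" "$dir/good" "$dir/gxxd"

# Shell globs, expanded by the shell before any command runs.
glob_q=$(count_args "$dir"/g??d)      # g, exactly two characters, d: good, gxxd -> 2
glob_star=$(count_args "$dir"/g*d)    # g, anything, d: gd, good, gxxd -> 3

# Regular expressions, interpreted by grep.
regex_dots=$(ls "$dir" | grep -c '^g..d$')   # same meaning as the glob g??d -> 2
regex_star=$(ls "$dir" | grep -c '^g*d$')    # zero or more g, then d: only gd -> 1

echo "$glob_q $glob_star $regex_dots $regex_star"
rm -rf "$dir"
```

The same text g*d selects three names as a glob but only one as a regular expression, which is why patterns passed to grep should be quoted so that the shell does not expand them first.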

Match the string g..d, where .. stands for any two characters

# grep -n 'g..d' regular_express.txt 

1:"Open Source" is a good mechanism to develop programs.
8:Oh! the soup taste good!
13:The world is the same with 'glad'.

Match strings containing two or more consecutive o's

// In 'ooo*' the first two o's are required; the trailing o* allows zero or more additional o's.

# grep -n 'ooo*' regular_express.txt 

1:"Open Source" is a good mechanism to develop programs.
2:apple is my favorite food.
3:Football game is not use feet only.
8:Oh! the soup taste good!
14:google is the best tools for search keyword.
15:goooooogle yes!

Match strings that start and end with g and have at least one o in between, i.e. gog, goog, gooog, ...

# grep -n 'goo*g' regular_express.txt 

14:google is the best tools for search keyword.
15:goooooogle yes!

Match strings that start and end with g

// .* means zero or more arbitrary characters

# grep -n 'g.*g' regular_express.txt   

1:"Open Source" is a good mechanism to develop programs.
11:The gd software is a library for drafting programs.
14:google is the best tools for search keyword.
15:goooooogle yes!
16:go! go! Let's go.
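
grep prints whole lines, so the output above does not show which part of each line 'g.*g' actually matched. The -o option (available in GNU and BSD grep, not covered in the option list above) prints only the matched text and makes the longest-match behavior of .* visible. A minimal sketch on a made-up line:

```shell
line='go! go! go.'

# ".*" is greedy: the match runs from the first g to the LAST g on the line.
longest=$(printf '%s\n' "$line" | grep -o 'g.*g')
# "[^g]*" cannot cross another g, so the match stops at the next g.
shortest=$(printf '%s\n' "$line" | grep -o 'g[^g]*g')

echo "$longest"    # prints: go! go! g
echo "$shortest"   # prints: go! g
```

Replacing . with a negated bracket expression is the usual way to keep a match short in basic regular expressions.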

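Specifying a repetition count with { }

* and .* can only express "zero or more". To bound the number of repetitions precisely, use an interval {range}: 2,5 (separated by a comma) means 2 to 5 repetitions, and 2, means 2 or more. Note that in basic regular expressions an unescaped { is an ordinary character, so the interval must be written with backslashes, as \{ \}.

Match lines containing two consecutive o's.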

# grep -n 'o\{2\}' regular_express.txt 

1:"Open Source" is a good mechanism to develop programs.
2:apple is my favorite food.
3:Football game is not use feet only.
8:Oh! the soup taste good!
14:google is the best tools for search keyword.
15:goooooogle yes!

Match lines containing g followed by 2 to 5 o's and then another g.

# grep -n 'go\{2,5\}g' regular_express.txt 

14:google is the best tools for search keyword.

Match lines containing g followed by 2 or more o's and then g.

# grep -n 'go\{2,\}g' regular_express.txt 

14:google is the best tools for search keyword.
15:goooooogle yes!
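
The interval forms can be compared side by side on a short word list. The sketch below uses made-up words (gog has one o, goog two, gooooog five, goooooog six) and also shows the same interval written for grep -E, where the braces are not escaped.

```shell
words=$(printf '%s\n' 'gog' 'goog' 'gooooog' 'goooooog')

bre=$(printf '%s\n' "$words" | grep -c 'go\{2,5\}g')     # 2 to 5 o's: goog, gooooog -> 2
ere=$(printf '%s\n' "$words" | grep -Ec 'go{2,5}g')      # same interval in extended syntax -> 2
atleast=$(printf '%s\n' "$words" | grep -c 'go\{2,\}g')  # 2 or more o's -> 3
# Without surrounding context, o\{2\} matches any line that CONTAINS oo,
# so lines with more than two o's match as well.
two=$(printf '%s\n' "$words" | grep -c 'o\{2\}')         # -> 3

echo "$bre $ere $atleast $two"
```

An upper bound only takes effect when something after the interval, such as the closing g here, stops the match from sliding along a longer run.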

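Note: to strip ^ and - of their special meaning inside [ ], put ^ anywhere except first and put - last. '[^a-z.!^ -]' matches a single character that is not a lowercase letter, ., !, ^, space, or - (note the space inside the brackets). Inside [ ] the . is already literal and a backslash is an ordinary character, so writing \. there is unnecessary. The shell wildcard negation is [!range]; the regular expression form is [^range].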
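
The placement rules for ^ and - inside brackets can be checked with a few made-up strings. In this sketch LC_ALL=C is set so that a-z means exactly the ASCII lowercase letters; the inputs and variable names are illustrative.

```shell
# "." is literal inside brackets, "^" is literal when not first, "-" is literal when last.
literal=$(printf '%s\n' 'a.b' 'a^b' 'a-b' 'axb' | LC_ALL=C grep -c 'a[.^-]b')   # all but axb -> 3

# Negated set: a line matches if it contains at least one character that is
# NOT a lowercase letter, ".", "!", "^", space, or "-".
negated=$(printf '%s\n' 'abc' 'a b-c!' 'abC' | LC_ALL=C grep -c '[^a-z.!^ -]')  # only abC -> 1

echo "$literal $negated"
```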

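3. grep and Extended Regular Expressions

Extended regular expressions add a few special symbols to the basic set, which makes some operations more convenient.

Remove blank lines and lines beginning with #

# grep -v '^$' regular_express.txt | grep -v '^#'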

"Open Source" is a good mechanism to develop programs.
apple is my favorite food.
Football game is not use feet only.
this dress doesn't fit me.
(remaining lines omitted; every line except #I am VBird is printed)

With egrep, which uses extended regular expressions, and the extended operator |, this becomes much simpler.

# egrep -v '^$|^#' regular_express.txt 

"Open Source" is a good mechanism to develop programs.
apple is my favorite food.
Football game is not use feet only.
this dress doesn't fit me.
(remaining lines omitted; every line except #I am VBird is printed)

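Note that grep uses basic regular expressions by default, while egrep uses extended ones. egrep is equivalent to grep -E, and recent GNU grep releases mark egrep as obsolescent, so grep -E is the preferred way to use extended regular expressions. Here | means "or": the pattern matches lines that match ^$ or ^#.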
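
The two-step pipeline and the single grep -E call should produce the same result. The sketch below checks this on a made-up configuration file (the file contents and variable names are illustrative) and adds a third variant that stays in basic regular expressions by passing two patterns with -e.

```shell
conf=$(mktemp)
printf '%s\n' '# comment' '' 'port=22' '' 'user=root' > "$conf"

two_step=$(grep -v '^$' "$conf" | grep -v '^#')     # two grep processes
one_step=$(grep -Ev '^$|^#' "$conf")                # one process, ERE alternation
multi_e=$(grep -v -e '^$' -e '^#' "$conf")          # one process, two BRE patterns

echo "$one_step"
# prints:
# port=22
# user=root
rm -f "$conf"
```

With -v and several patterns, a line is kept only if it matches none of them, which is the same condition as the alternation.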

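Some of the extended special symbols:

+: one or more repetitions of the preceding character or group.

?: zero or one occurrence of the preceding character or group.

|: alternation ("or"); for example, 'gd|good|dog' matches strings containing gd, good, or dog.

( ): groups part of the pattern into a single unit. For example, to search for glad or good, use 'g(la|oo)d'. A group can also take +, ?, and *. For example, to match a string that begins with A, ends with C, and has one or more xyz in between, use 'A(xyz)+C'.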
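
Each of the four extended operators can be exercised with grep -E on a short word list. This is a sketch with made-up words; the patterns are anchored with ^ and $ so that each count reflects whole-line matches only.

```shell
words=$(printf '%s\n' 'gd' 'god' 'good' 'glad' 'AC' 'AxyzC' 'AxyzxyzC')

plus=$(printf '%s\n' "$words" | grep -Ec '^go+d$')         # one or more o: god, good -> 2
opt=$(printf '%s\n' "$words" | grep -Ec '^go?d$')          # zero or one o: gd, god -> 2
group=$(printf '%s\n' "$words" | grep -Ec '^g(la|oo)d$')   # glad or good -> 2
rep=$(printf '%s\n' "$words" | grep -Ec '^A(xyz)+C$')      # AxyzC, AxyzxyzC, but not AC -> 2

echo "$plus $opt $group $rep"
```

In '^A(xyz)+C$' the + applies to the whole group, so AC, which has no xyz at all, does not match.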

4. References

https://blog.csdn.net/hellochenlian/article/details/34088179