Regular Expressions: Study Notes

Full regular expressions are composed of two types of characters. The special characters are called metacharacters, while the rest are called literal, or normal text, characters. It might help to consider regular expressions as their own language, with literal text acting as the words and metacharacters as the grammar.

The egrep command interprets the first command-line argument as a regular expression, and any remaining arguments as the file(s) to search. Note, however, that the single quotes placed around the expression on the command line are not part of the regular expression, but are needed by the command shell.

The metacharacters ^ and $ represent the start and end, respectively, of the line of text as it is being checked.

[...], usually called a character class, lets you list the characters you want to allow at that point in the match. Within a character class, the character-class metacharacter - indicates a range of characters. Note that a dash is a metacharacter only within a character class; otherwise it matches the normal dash character. If you use [^...] instead of [...], the class matches any character that isn't listed.

The metacharacter . is a shorthand for a character class that matches any character. It can be convenient when you want to have an "any character here" placeholder in your expression.

A very convenient metacharacter is |, which means "or". When the alternation is only part of a larger expression, parentheses are required to limit its scope, because without them the expression means something different.

Case-insensitive matching is not part of the regular-expression language, but it is a related useful feature that many tools provide. egrep's command-line option -i tells it to do a case-insensitive match.

A common problem is that a regular expression that matches the word you want can often also match where the "word" is embedded within a larger word. You can use the metasequences \< and \> if your version of egrep happens to support them. You can think of them as word-based versions of ^ and $ that match the position at the start and end of a word.

The metacharacter ? means optional. It is placed after a character, or after a string that is surrounded by parentheses. The item is allowed to appear at that point in the expression, but it does not have to be present for the match to succeed. Similar to the question mark are + and *. The metacharacter + means "one or more of the immediately preceding item", and * means "any number, including none, of the item". Some versions of egrep support a metasequence for providing your own minimum and maximum number of repetitions: {min,max}, placed after the item.

Backreferencing is a regular-expression feature that allows you to match new text that is the same as some text matched earlier in the expression. To find a doubled word, we wrap the expression for the first word in parentheses and replace the second word with the special metasequence \1 (\2, \3, and so on refer to the second, third, and later sets of parentheses). For example, we can use \<([a-zA-Z]+) +\1\> to find a doubled word.
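
The features summarized above can be tried out quickly in a small script. The following is only a minimal sketch in Python rather than egrep, so a few things are assumptions and not part of the notes: Python's re module has no \< and \> metasequences, so \b (word boundary) is used in their place; the -i option is approximated with the re.IGNORECASE flag; and the helper names matches and find_doubled_word, as well as all sample strings, are made up for illustration.

```python
import re


def matches(pattern, text, flags=0):
    """Return True if pattern matches anywhere in text, like egrep on one line."""
    return re.search(pattern, text, flags) is not None


def find_doubled_word(text):
    """Return the first doubled word in text, or None if there is none."""
    # \b stands in for \< and \>, which Python's re module does not provide.
    # The parentheses capture the first word so that \1 can refer back to it.
    m = re.search(r"\b([a-zA-Z]+) +\1\b", text)
    return m.group(1) if m else None


# Each entry pairs one feature from the notes with a small check.
# All sample strings are illustrative assumptions.
checks = {
    "line anchors ^ and $":
        matches(r"^cat$", "cat") and not matches(r"^cat$", "concat"),
    "character class with ranges":
        matches(r"^[0-9a-f]+$", "c0ffee") and not matches(r"^[0-9a-f]+$", "coffee"),
    "dash outside a class is literal":
        matches(r"a-z", "from a-z") and not matches(r"a-z", "m"),
    "negated class [^...]":
        matches(r"q[^u]", "Iraqi") and not matches(r"q[^u]", "quit"),
    "dot matches any character":
        matches(r"03.19.76", "03/19/76") and matches(r"03.19.76", "03-19-76"),
    "alternation needs parentheses for scope":
        matches(r"gr(a|e)y", "grey")
        and not matches(r"gr(a|e)y", "hey")
        and matches(r"gra|ey", "hey"),  # without parentheses: "gra" OR "ey"
    "case-insensitive matching":
        matches(r"from", "FROM", re.IGNORECASE) and not matches(r"from", "FROM"),
    "word boundaries":
        matches(r"\bcat\b", "the cat sat") and not matches(r"\bcat\b", "concatenate"),
    "optional item ?":
        matches(r"^July? 4(th)?$", "Jul 4") and matches(r"^July? 4(th)?$", "July 4th"),
    "plus and star":
        matches(r"^ab*c$", "ac")
        and not matches(r"^ab+c$", "ac")
        and matches(r"^ab+c$", "abbc"),
    "interval {min,max}":
        matches(r"^[0-9]{2,4}$", "123")
        and not matches(r"^[0-9]{2,4}$", "1")
        and not matches(r"^[0-9]{2,4}$", "12345"),
    "backreference \\1":
        find_doubled_word("this is the the end") == "the",
}

for name, ok in checks.items():
    print(f"{name}: {'ok' if ok else 'FAILED'}")
```

The alternation entry shows why the parentheses matter: gr(a|e)y matches only "gray" or "grey", while gra|ey matches any text containing "gra" or "ey". In the doubled-word check, the word boundaries keep "this is" from being reported as a repeat of "is".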