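A Different Flavor of Regular Expressions

Write a regular expression that checks whether a given string contains only digits or letters.

A requirement like this can be met with something along the following lines.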

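// StringUtils is a third-party helper (for example, from Apache Commons Lang).
public boolean isAlphaNumeric(String value) {
    // Reject null and "" up front; otherwise "*" would let the empty string match.
    if (StringUtils.isEmpty(value)) {
        return false;
    }
    return value.matches("^[a-zA-Z0-9]*$");
}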
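For comparison, here is a minimal sketch of the same check without the StringUtils dependency. The class name AlphaNumericCheck is made up for illustration. It relies on two facts: String.matches compiles the pattern on every call, and matches() always tests the whole input, so the ^ and $ anchors are optional.

```java
import java.util.regex.Pattern;

public class AlphaNumericCheck {
    // Compiled once and reused across calls.
    private static final Pattern ALPHA_NUMERIC = Pattern.compile("[a-zA-Z0-9]+");

    /** Returns true only for non-empty strings of ASCII letters and digits. */
    public static boolean isAlphaNumeric(String value) {
        // "+" instead of "*" rejects the empty string without a separate check.
        return value != null && ALPHA_NUMERIC.matcher(value).matches();
    }

    public static void main(String[] args) {
        System.out.println(isAlphaNumeric("abc123"));  // true
        System.out.println(isAlphaNumeric("abc-123")); // false
        System.out.println(isAlphaNumeric(""));        // false
        System.out.println(isAlphaNumeric(null));      // false
    }
}
```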

Besides the character classes and predefined character classes that java.util.regex.Pattern offers, such as:

Character classes
[abc]	a, b, or c (simple class)
[^abc]	Any character except a, b, or c (negation)
[a-zA-Z]	a through z or A through Z, inclusive (range)
[a-d[m-p]]	a through d, or m through p: [a-dm-p] (union)
[a-z&&[def]]	d, e, or f (intersection)
[a-z&&[^bc]]	a through z, except for b and c: [ad-z] (subtraction)
[a-z&&[^m-p]]	a through z, and not m through p: [a-lq-z] (subtraction)
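The union, intersection, and subtraction forms in the table are easy to misread, so a short sketch may help. The class and helper names here are hypothetical; the patterns are taken directly from the table.

```java
public class CharacterClassDemo {
    /** Thin wrapper so each check reads as (pattern, input). */
    public static boolean matches(String regex, String input) {
        return input.matches(regex);
    }

    public static void main(String[] args) {
        // Union: a through d, or m through p
        System.out.println(matches("[a-d[m-p]]", "n"));    // true
        System.out.println(matches("[a-d[m-p]]", "f"));    // false
        // Intersection: only d, e, or f
        System.out.println(matches("[a-z&&[def]]", "e"));  // true
        System.out.println(matches("[a-z&&[def]]", "a"));  // false
        // Subtraction: a through z without b and c
        System.out.println(matches("[a-z&&[^bc]]", "a"));  // true
        System.out.println(matches("[a-z&&[^bc]]", "b"));  // false
        // Subtraction: a through z without m through p
        System.out.println(matches("[a-z&&[^m-p]]", "q")); // true
        System.out.println(matches("[a-z&&[^m-p]]", "n")); // false
    }
}
```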

Predefined character classes
.	Any character (may or may not match line terminators)
\d	A digit: [0-9]
\D	A non-digit: [^0-9]
\s	A whitespace character: [ \t\n\x0B\f\r]
\S	A non-whitespace character: [^\s]
\w	A word character: [a-zA-Z_0-9]
\W	A non-word character: [^\w]
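A quick sketch of the predefined classes in use, with default flags. The class and method names are illustrative only. One detail worth noticing is that \w includes the underscore, so it is not the same as "letters or digits".

```java
public class PredefinedClassDemo {
    /** True if the input is one or more word characters: [a-zA-Z_0-9]. */
    public static boolean isWord(String s) {
        return s.matches("\\w+");
    }

    /** True if the input is one or more ASCII digits. */
    public static boolean isDigits(String s) {
        return s.matches("\\d+");
    }

    /** True if the input is one or more whitespace characters. */
    public static boolean isWhitespace(String s) {
        return s.matches("\\s+");
    }

    public static void main(String[] args) {
        System.out.println(isWord("snake_case_1")); // true, "_" is a word character
        System.out.println(isWord("kebab-case"));   // false, "-" is not
        System.out.println(isDigits("2024"));       // true
        System.out.println(isDigits("20.24"));      // false
        System.out.println(isWhitespace(" \t\n"));  // true
        System.out.println(isWhitespace(" a "));    // false
    }
}
```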

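Reference: https://docs.oracle.com/javase/7/docs/api/java/util/regex/Pattern.html

the class also supports Unicode character property aliases, POSIX regular expression character classes, and java.lang.Character classes. Let's look at the relevant source code and API documentation, and then see what different regular expressions we can come up with. Let's go.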

Unicode character property aliases

// Unicode character property aliases,defined in
            // http://www.unicode.org/Public/UNIDATA/PropertyValueAliases.txt
            defCategory("Cn",1<<Character.UNASSIGNED);
            defCategory("Lu",1<<Character.UPPERCASE_LETTER);
            defCategory("Ll",1<<Character.LOWERCASE_LETTER);
            defCategory("Lt",1<<Character.TITLECASE_LETTER);
            defCategory("Lm",1<<Character.MODIFIER_LETTER);
            defCategory("Lo",1<<Character.OTHER_LETTER);
            defCategory("Mn",1<<Character.NON_SPACING_MARK);
            defCategory("Me",1<<Character.ENCLOSING_MARK);
            defCategory("Mc",1<<Character.COMBINING_SPACING_MARK);
            defCategory("Nd",1<<Character.DECIMAL_DIGIT_NUMBER);
            defCategory("Nl",1<<Character.LETTER_NUMBER);
            defCategory("No",1<<Character.OTHER_NUMBER);
            defCategory("Zs",1<<Character.SPACE_SEPARATOR);
            defCategory("Zl",1<<Character.LINE_SEPARATOR);
            defCategory("Zp",1<<Character.PARAGRAPH_SEPARATOR);
            defCategory("Cc",1<<Character.CONTROL);
            defCategory("Cf",1<<Character.FORMAT);
            defCategory("Co",1<<Character.PRIVATE_USE);
            defCategory("Cs",1<<Character.SURROGATE);
            defCategory("Pd",1<<Character.DASH_PUNCTUATION);
            defCategory("Ps",1<<Character.START_PUNCTUATION);
            defCategory("Pe",1<<Character.END_PUNCTUATION);
            defCategory("Pc",1<<Character.CONNECTOR_PUNCTUATION);
            defCategory("Po",1<<Character.OTHER_PUNCTUATION);
            defCategory("Sm",1<<Character.MATH_SYMBOL);
            defCategory("Sc",1<<Character.CURRENCY_SYMBOL);
            defCategory("Sk",1<<Character.MODIFIER_SYMBOL);
            defCategory("So",1<<Character.OTHER_SYMBOL);
            defCategory("Pi",1<<Character.INITIAL_QUOTE_PUNCTUATION);
            defCategory("Pf",1<<Character.FINAL_QUOTE_PUNCTUATION);
            defCategory("L",((1<<Character.UPPERCASE_LETTER) |
                              (1<<Character.LOWERCASE_LETTER) |
                              (1<<Character.TITLECASE_LETTER) |
                              (1<<Character.MODIFIER_LETTER)  |
                              (1<<Character.OTHER_LETTER)));
            defCategory("M",((1<<Character.NON_SPACING_MARK) |
                              (1<<Character.ENCLOSING_MARK)   |
                              (1<<Character.COMBINING_SPACING_MARK)));
            defCategory("N",((1<<Character.DECIMAL_DIGIT_NUMBER) |
                              (1<<Character.LETTER_NUMBER)        |
                              (1<<Character.OTHER_NUMBER)));
            defCategory("Z",((1<<Character.SPACE_SEPARATOR) |
                              (1<<Character.LINE_SEPARATOR)  |
                              (1<<Character.PARAGRAPH_SEPARATOR)));
            defCategory("C",((1<<Character.CONTROL)     |
                              (1<<Character.FORMAT)      |
                              (1<<Character.PRIVATE_USE) |
                              (1<<Character.SURROGATE))); // Other
            defCategory("P",((1<<Character.DASH_PUNCTUATION)      |
                              (1<<Character.START_PUNCTUATION)     |
                              (1<<Character.END_PUNCTUATION)       |
                              (1<<Character.CONNECTOR_PUNCTUATION) |
                              (1<<Character.OTHER_PUNCTUATION)     |
                              (1<<Character.INITIAL_QUOTE_PUNCTUATION) |
                              (1<<Character.FINAL_QUOTE_PUNCTUATION)));
            defCategory("S",((1<<Character.MATH_SYMBOL)     |
                              (1<<Character.CURRENCY_SYMBOL) |
                              (1<<Character.MODIFIER_SYMBOL) |
                              (1<<Character.OTHER_SYMBOL)));
            defCategory("LC",((1<<Character.UPPERCASE_LETTER) |
                               (1<<Character.LOWERCASE_LETTER) |
                               (1<<Character.TITLECASE_LETTER)));
            defCategory("LD",((1<<Character.UPPERCASE_LETTER) |
                               (1<<Character.LOWERCASE_LETTER) |
                               (1<<Character.TITLECASE_LETTER) |
                               (1<<Character.MODIFIER_LETTER)  |
                               (1<<Character.OTHER_LETTER)     |
                               (1<<Character.DECIMAL_DIGIT_NUMBER)));
            defRange("L1",0x00,0xFF); // Latin-1
            map.put("all",new CharPropertyFactory() {
                    CharProperty make() { return new All(); }});

\p{Lu}	            An uppercase letter (category)
[\p{L}&&[^\p{Lu}]] 	Any letter except an uppercase letter (subtraction)
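As a rough illustration of the general categories and the subtraction form above, the sketch below uses \p{L}, \p{N}, and [\p{L}&&[^\p{Lu}]]. The class and method names are hypothetical, and non-ASCII inputs are written as Unicode escapes so the file encoding does not matter.

```java
public class UnicodeCategoryDemo {
    /** One or more characters from any letter category (Lu, Ll, Lt, Lm, Lo). */
    public static boolean isLetters(String s) {
        return s.matches("\\p{L}+");
    }

    /** One or more characters from any number category (Nd, Nl, No). */
    public static boolean isNumbers(String s) {
        return s.matches("\\p{N}+");
    }

    /** One or more letters, none of which is an uppercase letter. */
    public static boolean isNonUppercaseLetters(String s) {
        return s.matches("[\\p{L}&&[^\\p{Lu}]]+");
    }

    public static void main(String[] args) {
        System.out.println(isLetters("abc"));               // true
        System.out.println(isLetters("\u4e2d\u6587"));      // true, CJK ideographs are category Lo
        System.out.println(isLetters("abc1"));              // false
        System.out.println(isNumbers("123"));               // true
        System.out.println(isNumbers("\u00BD"));            // true, U+00BD (one half) is category No
        System.out.println(isNonUppercaseLetters("abc"));   // true
        System.out.println(isNonUppercaseLetters("aBc"));   // false
    }
}
```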

Examples

Check whether a string consists only of uppercase letters

/**
 * Checks whether the string consists only of uppercase letters.
 */
String regexExp = "^[\\p{Lu}]*$";
System.out.println("a".matches(regexExp));  // false
System.out.println("B".matches(regexExp));  // true
System.out.println("aB".matches(regexExp)); // false
System.out.println("@".matches(regexExp));  // false

POSIX character classes (US-ASCII only)

// Posix regular expression character classes,defined in
            // http://www.unix.org/onlinepubs/009695399/basedefs/xbd_chap09.html
            defRange("ASCII",0x00,0x7F);   // ASCII
            defCtype("Alnum",ASCII.ALNUM);  // Alphanumeric characters
            defCtype("Alpha",ASCII.ALPHA);  // Alphabetic characters
            defCtype("Blank",ASCII.BLANK);  // Space and tab characters
            defCtype("Cntrl",ASCII.CNTRL);  // Control characters
            defRange("Digit",'0','9');     // Numeric characters
            defCtype("Graph",ASCII.GRAPH);  // printable and visible
            defRange("Lower",'a','z');     // Lower-case alphabetic
            defRange("Print",0x20,0x7E);   // Printable characters
            defCtype("Punct",ASCII.PUNCT);  // Punctuation characters
            defCtype("Space",ASCII.SPACE);  // Space characters
            defRange("Upper",'A','Z');     // Upper-case alphabetic
            defCtype("XDigit",ASCII.XDIGIT); // hexadecimal digits

POSIX character classes (US-ASCII only)

\p{Lower}	A lower-case alphabetic character: [a-z]
\p{Upper}	An upper-case alphabetic character: [A-Z]
\p{ASCII}	All ASCII: [\x00-\x7F]
\p{Alpha}	An alphabetic character: [\p{Lower}\p{Upper}]
\p{Digit}	A decimal digit: [0-9]
\p{Alnum}	An alphanumeric character: [\p{Alpha}\p{Digit}]
\p{Punct}	Punctuation: One of !"#$%&'()*+,-./:;<=>?@[\]^_`{|}~
\p{Graph}	A visible character: [\p{Alnum}\p{Punct}]
\p{Print}	A printable character: [\p{Graph}\x20]
\p{Blank}	A space or a tab: [ \t]
\p{Cntrl}	A control character: [\x00-\x1F\x7F]
\p{XDigit}	A hexadecimal digit: [0-9a-fA-F]
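The "US-ASCII only" qualifier matters in practice. The sketch below contrasts POSIX classes with their Unicode category counterparts under default flags (with Pattern.UNICODE_CHARACTER_CLASS the POSIX classes behave differently). Class and method names are made up for this example.

```java
public class PosixVsUnicodeDemo {
    public static boolean posixUpper(String s) {
        return s.matches("\\p{Upper}+"); // [A-Z] only
    }

    public static boolean unicodeUpper(String s) {
        return s.matches("\\p{Lu}+");    // any Unicode uppercase letter
    }

    public static boolean posixDigit(String s) {
        return s.matches("\\p{Digit}+"); // [0-9] only
    }

    public static boolean unicodeDigit(String s) {
        return s.matches("\\p{Nd}+");    // any Unicode decimal digit
    }

    public static void main(String[] args) {
        String eAcute = "\u00C9";      // U+00C9, Latin capital E with acute
        String arabicThree = "\u0663"; // U+0663, Arabic-Indic digit three

        System.out.println(posixUpper("ABC"));         // true
        System.out.println(posixUpper(eAcute));        // false
        System.out.println(unicodeUpper(eAcute));      // true
        System.out.println(posixDigit("123"));         // true
        System.out.println(posixDigit(arabicThree));   // false
        System.out.println(unicodeDigit(arabicThree)); // true
    }
}
```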

Examples

Check whether a string consists only of ASCII characters

/**
 * Checks whether the string consists only of ASCII characters.
 */
String regexExp = "^[\\p{ASCII}]*$";
System.out.println("a".matches(regexExp));  // true
System.out.println("1".matches(regexExp));  // true
System.out.println("-".matches(regexExp));  // true
System.out.println("@".matches(regexExp));  // true
System.out.println("我".matches(regexExp)); // false

Check whether a string consists only of lowercase letters

/**
 * Checks whether the string consists only of lowercase letters.
 */
String regexExp = "^[\\p{Lower}]*$";
System.out.println("a".matches(regexExp)); // true
System.out.println("1".matches(regexExp)); // false
System.out.println("-".matches(regexExp)); // false
System.out.println("@".matches(regexExp)); // false
System.out.println("B".matches(regexExp)); // false

Check whether a string consists only of digits

/**
 * Checks whether the string consists only of digits.
 */
String regexExp = "^[\\p{Digit}]*$";
System.out.println("a".matches(regexExp)); // false
System.out.println("1".matches(regexExp)); // true
System.out.println("-".matches(regexExp)); // false
System.out.println("@".matches(regexExp)); // false
System.out.println("B".matches(regexExp)); // false

The remaining classes, such as Upper for checking uppercase letters, are left for interested readers to try.
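For readers who want a starting point, here is one possible sketch covering Upper, Alnum, XDigit, and Punct. The class and method names are hypothetical. Unlike the examples above, it uses + rather than *, so an empty string is rejected.

```java
public class PosixClassExercises {
    public static boolean isUpper(String s) {
        return s.matches("\\p{Upper}+");
    }

    public static boolean isAlnum(String s) {
        return s.matches("\\p{Alnum}+");
    }

    public static boolean isHex(String s) {
        return s.matches("\\p{XDigit}+");
    }

    public static boolean isPunct(String s) {
        return s.matches("\\p{Punct}+");
    }

    public static void main(String[] args) {
        System.out.println(isUpper("ABC"));       // true
        System.out.println(isUpper("AbC"));       // false
        System.out.println(isAlnum("abc123"));    // true
        System.out.println(isAlnum("abc_123"));   // false, "_" is punctuation
        System.out.println(isHex("CAFEbabe01"));  // true
        System.out.println(isHex("0xFF"));        // false, "x" is not a hex digit
        System.out.println(isPunct("!?"));        // true
        System.out.println(isPunct("a!"));        // false
    }
}
```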

java.lang.Character classes (simple Java character type)

// Java character properties,defined by methods in Character.java
            defClone("javaLowerCase",new CloneableProperty() {
                boolean isSatisfiedBy(int ch) {
                    return Character.isLowerCase(ch);}});
            defClone("javaUpperCase",new CloneableProperty() {
                boolean isSatisfiedBy(int ch) {
                    return Character.isUpperCase(ch);}});
            defClone("javaAlphabetic",new CloneableProperty() {
                boolean isSatisfiedBy(int ch) {
                    return Character.isAlphabetic(ch);}});
            defClone("javaIdeographic",new CloneableProperty() {
                boolean isSatisfiedBy(int ch) {
                    return Character.isIdeographic(ch);}});
            defClone("javaTitleCase",new CloneableProperty() {
                boolean isSatisfiedBy(int ch) {
                    return Character.isTitleCase(ch);}});
            defClone("javaDigit",new CloneableProperty() {
                boolean isSatisfiedBy(int ch) {
                    return Character.isDigit(ch);}});
            defClone("javaDefined",new CloneableProperty() {
                boolean isSatisfiedBy(int ch) {
                    return Character.isDefined(ch);}});
            defClone("javaLetter",new CloneableProperty() {
                boolean isSatisfiedBy(int ch) {
                    return Character.isLetter(ch);}});
            defClone("javaLetterOrDigit",new CloneableProperty() {
                boolean isSatisfiedBy(int ch) {
                    return Character.isLetterOrDigit(ch);}});
            defClone("javaJavaIdentifierStart",new CloneableProperty() {
                boolean isSatisfiedBy(int ch) {
                    return Character.isJavaIdentifierStart(ch);}});
            defClone("javaJavaIdentifierPart",new CloneableProperty() {
                boolean isSatisfiedBy(int ch) {
                    return Character.isJavaIdentifierPart(ch);}});
            defClone("javaUnicodeIdentifierStart",new CloneableProperty() {
                boolean isSatisfiedBy(int ch) {
                    return Character.isUnicodeIdentifierStart(ch);}});
            defClone("javaUnicodeIdentifierPart",new CloneableProperty() {
                boolean isSatisfiedBy(int ch) {
                    return Character.isUnicodeIdentifierPart(ch);}});
            defClone("javaIdentifierIgnorable",new CloneableProperty() {
                boolean isSatisfiedBy(int ch) {
                    return Character.isIdentifierIgnorable(ch);}});
            defClone("javaSpaceChar",new CloneableProperty() {
                boolean isSatisfiedBy(int ch) {
                    return Character.isSpaceChar(ch);}});
            defClone("javaWhitespace",new CloneableProperty() {
                boolean isSatisfiedBy(int ch) {
                    return Character.isWhitespace(ch);}});
            defClone("javaISOControl",new CloneableProperty() {
                boolean isSatisfiedBy(int ch) {
                    return Character.isISOControl(ch);}});
            defClone("javaMirrored",new CloneableProperty() {
                boolean isSatisfiedBy(int ch) {
                    return Character.isMirrored(ch);}});

Reference: https://docs.oracle.com/javase/7/docs/api/java/util/regex/Pattern.html

java.lang.Character classes (simple java character type)

\p{javaLowerCase}	Equivalent to java.lang.Character.isLowerCase()
\p{javaUpperCase}	Equivalent to java.lang.Character.isUpperCase()
\p{javaWhitespace}	Equivalent to java.lang.Character.isWhitespace()
\p{javaMirrored}	Equivalent to java.lang.Character.isMirrored()
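Because these classes delegate to java.lang.Character, they follow Unicode rules rather than ASCII ranges. The sketch below (hypothetical class and method names, default flags) shows \p{javaWhitespace} accepting a character that \s rejects, and \p{javaMirrored} matching parentheses.

```java
public class JavaCharacterClassDemo {
    public static boolean isJavaWhitespace(String s) {
        return s.matches("\\p{javaWhitespace}+");
    }

    public static boolean isPlainWhitespace(String s) {
        return s.matches("\\s+");
    }

    public static boolean isMirrored(String s) {
        return s.matches("\\p{javaMirrored}+");
    }

    public static void main(String[] args) {
        String emSpace = "\u2003"; // U+2003 EM SPACE, a space separator outside ASCII

        System.out.println(Character.isWhitespace(emSpace.charAt(0))); // true
        System.out.println(isJavaWhitespace(emSpace));                 // true
        System.out.println(isPlainWhitespace(emSpace));                // false, \s is [ \t\n\x0B\f\r]
        System.out.println(isMirrored("()"));                          // true
        System.out.println(isMirrored("ab"));                          // false
    }
}
```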

Examples

Check whether a string consists only of lowercase letters

/**
 * Checks whether the string consists only of lowercase letters.
 */
String regexExp = "^[\\p{javaLowerCase}]*$";
System.out.println("a".matches(regexExp)); // true
System.out.println("1".matches(regexExp)); // false
System.out.println("-".matches(regexExp)); // false
System.out.println("@".matches(regexExp)); // false
System.out.println("B".matches(regexExp)); // false

Check whether a string consists only of digits or letters

/**
 * Checks whether the string consists only of digits or letters.
 */
String regexExp = "^[\\p{javaLetterOrDigit}]*$";
System.out.println("a".matches(regexExp));  // true
System.out.println("1".matches(regexExp));  // true
System.out.println("-".matches(regexExp));  // false
System.out.println("@".matches(regexExp));  // false
System.out.println("B1".matches(regexExp)); // true
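One caveat is worth spelling out: \p{javaLetterOrDigit} is not a drop-in replacement for [a-zA-Z0-9], because Character.isLetterOrDigit accepts letters and digits from any script. A small comparison, with illustrative class and method names:

```java
public class LetterOrDigitComparison {
    /** ASCII letters and digits only, as in the opening example. */
    public static boolean asciiOnly(String s) {
        return s.matches("[a-zA-Z0-9]+");
    }

    /** Any character for which Character.isLetterOrDigit returns true. */
    public static boolean javaLetterOrDigit(String s) {
        return s.matches("\\p{javaLetterOrDigit}+");
    }

    public static void main(String[] args) {
        String chinese = "\u6211"; // U+6211, a CJK ideograph
        String cafe = "caf\u00E9"; // ends with U+00E9, e with acute

        System.out.println(asciiOnly("B1"));            // true
        System.out.println(javaLetterOrDigit("B1"));    // true
        System.out.println(asciiOnly(chinese));         // false
        System.out.println(javaLetterOrDigit(chinese)); // true
        System.out.println(asciiOnly(cafe));            // false
        System.out.println(javaLetterOrDigit(cafe));    // true
        System.out.println(javaLetterOrDigit("a-1"));   // false
    }
}
```

If the requirement really means ASCII letters and digits, \p{Alnum} or the explicit range is the safer choice.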

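Summary

Having worked through the examples above, you now have several more ways to write regular expressions that check for digits, letters, ASCII characters, and so on.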