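Regular Expressions: the REGEXP_REPLACE Function
Please credit the source when reposting: http://www.jb51.cc/article/p-getsvitq-dk.html
This question again tests regular expressions. Following the question, we first run this query: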
    hr@OCM> col "PHONENUMBER" for a50
    hr@OCM> SELECT phone_number, REGEXP_REPLACE(phone_number, '([[:digit:]]{3})\.([[:digit:]]{3})\.([[:digit:]]{4})', '(\1)\2-\3') "PHONENUMBER"
      2  FROM employees;

    PHONE_NUMBER         PHONENUMBER
    -------------------- --------------------------------------------------
    650.507.9833         (650)507-9833
    650.507.9844         (650)507-9844
    515.123.4444         (515)123-4444
    011.44.1644.429264   011.44.1644.429264
    011.44.1644.429263   011.44.1644.429263
    011.44.1644.429262   011.44.1644.429262
    ... (rows omitted) ...
    650.501.4876         (650)501-4876
    650.507.9811         (650)507-9811
    650.507.9822         (650)507-9822

    107 rows selected.
From the query results, the correct answer is C.
*************************************************************
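Let us first look at the traditional SQL REPLACE function, which substitutes one string for another. Suppose your data contains unnecessary spaces in the text and you want to replace them with a single space. With REPLACE, you must state exactly how many spaces to replace, yet the number of extra spaces may differ from place to place in the text. Consider a string with three spaces between Joe and Smith, and a REPLACE call whose arguments say that two spaces are to be replaced with one. In that case the result still has an extra space between Joe and Smith.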
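The limitation described above can be reproduced with a short query. This is a minimal sketch assuming an Oracle session; the literal and the column alias are illustrative and not taken from the original article.

```sql
-- 'Joe   Smith' has three spaces between the two names.
-- REPLACE swaps each exact two-space run for one space, so the first two
-- spaces collapse to one and the third space is left in place.
SELECT REPLACE('Joe   Smith', '  ', ' ') AS replaced
FROM dual;
-- Result: 'Joe  Smith' (two spaces remain)
```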
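The REGEXP_REPLACE function takes substitution a step further; its syntax is listed in Table 9. The following call replaces any run of two or more spaces with a single space. The ( ) subexpression contains a single space, which the {2,} quantifier allows to repeat two or more times.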
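REGEXP_REPLACE('Joe   Smith', '( ){2,}', ' ')
This means that every substring of the searched string 'Joe   Smith' that matches the regular expression ( ){2,} (here, the run of three spaces) is replaced by the third argument of REGEXP_REPLACE, which is a single space.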
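A useful feature of regular expressions is the ability to store subexpressions for later reuse; this is also called backreferencing (summarized in Table 10). It enables sophisticated replacements, such as swapping patterns into new positions or finding repeated words or letters. The matched part of each subexpression is saved in a temporary buffer. The buffers are numbered from left to right and are accessed with the \digit notation, where digit is a number from 1 to 9 that refers to the digit-th subexpression; each subexpression is delimited by a pair of parentheses.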
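The next example converts the name Ellen Hildi Smith to Smith, Ellen Hildi by referring to the individual subexpressions by number.
The SQL statement contains three separate subexpressions enclosed in parentheses. Each one consists of the match-any-character metacharacter (.) followed by the * metacharacter, meaning that any character (except newline) is matched zero or more times. A space separates the subexpressions, and that space must be matched too. The parentheses create subexpressions that capture values and can be referenced with \digit: the first subexpression is \1, the second is \2, and so on. These backreferences are used in the last argument of the function ('\3, \1 \2'), which returns the captured substrings arranged in the desired format (including the comma and the space). Table 11 details the individual components of this regular expression.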
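Note:
REGEXP_REPLACE('Ellen Hildi Smith', '(.*) (.*) (.*)', '\3, \1 \2')
This means that the substring of the searched string 'Ellen Hildi Smith' that matches the regular expression (.*) (.*) (.*) (here, the whole string Ellen Hildi Smith) is replaced by the third argument of REGEXP_REPLACE, which is '\3, \1 \2'. The spaces that separate the subexpressions in the pattern must be matched as well. \1 stands for the first parenthesized subexpression (.*) in the pattern, whose value is Ellen; \2 is Hildi and \3 is Smith.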
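The call above can be run as a complete statement. This is a minimal sketch assuming an Oracle session; the column alias is illustrative.

```sql
-- Three captured groups separated by single spaces:
--   \1 = 'Ellen', \2 = 'Hildi', \3 = 'Smith'
-- The replacement string reorders them and adds a comma and a space.
SELECT REGEXP_REPLACE('Ellen Hildi Smith',
                      '(.*) (.*) (.*)',
                      '\3, \1 \2') AS reordered_name
FROM dual;
-- Result: 'Smith, Ellen Hildi'
```

Because the input contains exactly two spaces and the pattern requires two literal spaces, there is only one way to split the name among the three groups.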
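Backreferences are useful for replacing, formatting, and substituting values, and you can also use them to find adjacent occurrences of a value. The next example uses the REGEXP_SUBSTR function to find any alphanumeric value that is repeated after intervening whitespace. The result is the substring that contains the repeated word is.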
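In the regular expression ([[:alnum:]]+)([[:space:]]+)\1, the \1 refers to the first parenthesized subexpression, ([[:alnum:]]+). The pattern has the same shape as ([[:alnum:]]+)([[:space:]]+)([[:alnum:]]+), with one important difference: \1 does not match an arbitrary alphanumeric run, it must match exactly the text that the first subexpression captured.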
REGEXP_SUBSTR('The final test is is the implementation', '([[:alnum:]]+)([[:space:]]+)\1')
This returns the substring of the searched string 'The final test is is the implementation' that matches the regular expression ([[:alnum:]]+)([[:space:]]+)\1, which here is 'is is'.
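The same pattern can also be passed to REGEXP_REPLACE to remove the repetition instead of only locating it. This is a minimal sketch assuming an Oracle session; using '\1' as the replacement string is an illustration added here, not part of the original article.

```sql
-- Locate the repeated word, then collapse it to a single occurrence.
SELECT REGEXP_SUBSTR('The final test is is the implementation',
                     '([[:alnum:]]+)([[:space:]]+)\1') AS repeated,
       REGEXP_REPLACE('The final test is is the implementation',
                      '([[:alnum:]]+)([[:space:]]+)\1',
                      '\1') AS deduplicated
FROM dual;
-- repeated:     'is is'
-- deduplicated: 'The final test is the implementation'
```

Note that the pattern has no word-boundary check, so on other inputs it can also match inside words (for example the end of one word and the start of the next).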
From the official documentation:
REGEXP_REPLACE extends the functionality of the REPLACE function by letting you search a string for a regular expression pattern. By default, the function returns source_char with every occurrence of the regular expression pattern replaced with replace_string. The string returned is in the same character set as source_char. The function returns VARCHAR2 if the first argument is not a LOB and returns CLOB if the first argument is a LOB.
This function complies with the POSIX regular expression standard and the Unicode Regular Expression Guidelines. For more information, please refer to Appendix C, "Oracle Regular Expression Support".
source_char is a character expression that serves as the search value. It is commonly a character column and can be of any of the datatypes CHAR, VARCHAR2, NCHAR, NVARCHAR2, CLOB, or NCLOB.
pattern is the regular expression. It is usually a text literal and can be of any of the datatypes CHAR, VARCHAR2, NCHAR, or NVARCHAR2. It can contain up to 512 bytes. If the datatype of pattern is different from the datatype of source_char, Oracle Database converts pattern to the datatype of source_char. For a listing of the operators you can specify in pattern, please refer to Appendix C, "Oracle Regular Expression Support".
replace_string can be of any of the datatypes CHAR, VARCHAR2, NCHAR, NVARCHAR2, CLOB, or NCLOB. If replace_string is a CLOB or NCLOB, then Oracle truncates replace_string to 32K. The replace_string can contain up to 500 backreferences to subexpressions in the form \n, where n is a number from 1 to 9. If n is the backslash character in replace_string, then you must precede it with the escape character (\\). For more information on backreference expressions, please refer to the notes to "Oracle Regular Expression Support", Table C-1.
position is a positive integer indicating the character of source_char where Oracle should begin the search. The default is 1, meaning that Oracle begins the search at the first character of source_char.
occurrence is a nonnegative integer indicating the occurrence of the replace operation:
- If you specify 0, then Oracle replaces all occurrences of the match.
- If you specify a positive integer n, then Oracle replaces the nth occurrence.
match_parameter is a text literal that lets you change the default matching behavior of the function. This argument affects only the matching process and has no effect on replace_string. You can specify one or more of the following values for match_parameter:
- 'i' specifies case-insensitive matching.
- 'c' specifies case-sensitive matching.
- 'n' allows the period (.), which is the match-any-character character, to match the newline character. If you omit this parameter, the period does not match the newline character.
- 'm' treats the source string as multiple lines. Oracle interprets ^ and $ as the start and end, respectively, of any line anywhere in the source string, rather than only at the start or end of the entire source string. If you omit this parameter, Oracle treats the source string as a single line.
- 'x' ignores whitespace characters. By default, whitespace characters match themselves.
If you specify multiple contradictory values, Oracle uses the last value. For example, if you specify 'ic', then Oracle uses case-sensitive matching. If you specify a character other than those shown above, then Oracle returns an error.
If you omit match_parameter, then the period does not match the newline character and the source string is treated as a single line.
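The optional position, occurrence, and match_parameter arguments described above can be compared side by side. This is a minimal sketch assuming an Oracle session; the literals and column aliases are illustrative only.

```sql
-- Arguments: source_char, pattern, replace_string, position, occurrence, match_parameter
SELECT REGEXP_REPLACE('a-b-c-d', '-', '+', 1, 0)            AS all_occurrences,  -- 'a+b+c+d'
       REGEXP_REPLACE('a-b-c-d', '-', '+', 1, 2)            AS second_only,      -- 'a-b+c-d'
       REGEXP_REPLACE('Joe joe JOE', 'joe', 'X', 1, 0, 'i') AS ignore_case,      -- 'X X X'
       REGEXP_REPLACE('Joe joe JOE', 'joe', 'X', 1, 0, 'c') AS match_case        -- 'Joe X JOE'
FROM dual;
```

Passing 'i' or 'c' explicitly keeps the result independent of the session's default case sensitivity.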
The following example examines phone_number, looking for the pattern xxx.xxx.xxxx. Oracle reformats this pattern with (xxx) xxx-xxxx.

    SELECT
      REGEXP_REPLACE(phone_number,
                     '([[:digit:]]{3})\.([[:digit:]]{3})\.([[:digit:]]{4})',
                     '(\1) \2-\3') "REGEXP_REPLACE"
      FROM employees;

    REGEXP_REPLACE
    --------------------------------------------------------------------------------
    (515) 123-4567
    (515) 123-4568
    (515) 123-4569
    (590) 423-4567
    ...
The following example examines country_name. Oracle puts a space after each non-null character in the string.
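A query of this kind might look like the sketch below. It assumes an Oracle session and uses a string literal in place of the country_name column, so it runs without the sample schema; the literal and alias are illustrative.

```sql
-- (.) captures one character at a time; '\1 ' writes it back followed by a space.
SELECT REGEXP_REPLACE('Argentina', '(.)', '\1 ') AS spaced
FROM dual;
-- Result: 'A r g e n t i n a ' (note the trailing space after the last character)
```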
The following example examines the string, looking for two or more spaces. Oracle replaces each occurrence of two or more spaces with a single space.
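A query of this kind might look like the sketch below, assuming an Oracle session; the address literal is illustrative and contains runs of two to five spaces.

```sql
-- ( ){2,} matches any run of two or more spaces; each run becomes one space.
SELECT REGEXP_REPLACE('500   Oracle     Parkway,    Redwood  Shores, CA',
                      '( ){2,}', ' ') AS collapsed
FROM dual;
-- Result: '500 Oracle Parkway, Redwood Shores, CA'
```

Single spaces, such as the one before CA, do not match the pattern and are left unchanged.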