General Usage of Regular Expressions

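The key is learning to write regular expressions well. A poorly written pattern returns messy data, which becomes especially troublesome later when scraping web pages and extracting content from HTML.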
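As a rough illustration of how much the pattern matters when pulling content out of HTML, the sketch below runs two nearly identical patterns over the same fragment. The HTML string and the variable names are made up for this example, and it is written for Python 3.

```python
import re

# Hypothetical HTML fragment, invented purely for illustration.
html = '<li>Alice</li><li>Bob</li><li>Carol</li>'

# Greedy .* runs to the LAST </li>, so everything lands in one messy match.
greedy_items = re.findall(r'<li>(.*)</li>', html)

# Non-greedy .*? stops at the NEAREST </li>, so each item is matched separately.
lazy_items = re.findall(r'<li>(.*?)</li>', html)

print(greedy_items)  # ['Alice</li><li>Bob</li><li>Carol']
print(lazy_items)    # ['Alice', 'Bob', 'Carol']
```

The only difference between the two patterns is a single `?`, yet only the second returns clean data.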

# coding: utf-8
# Python's built-in re module provides regular expression support
import re
# Regular expressions

string = 'abccccccdedfdgbgds'

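# 1. Build the regular expression
# a.*  greedy mode: matches as many characters as possible, here the whole string 'abccccccdedfdgbgds'
# a.*? non-greedy mode: matches as few characters as possible, here only 'a'
# a.*b (used below) is greedy, so it matches up to the last 'b': 'abccccccdedfdgb'
pattern = re.compile('a.*b')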
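# 2. Use the regular expression to search for data in a larger string
# match() arguments: 1. the regular expression  2. the string to search
# match() only succeeds if the string starts with text matching the pattern;
# it returns a match object on success and None otherwise
rs = re.match(pattern, string)
# Check whether the result is empty; only read the value if there is a match
if rs:
    print(rs.group())
else:
    print('No result found!')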

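# search() does not require the match to be at the start of the string; it scans
# the whole string and returns the first piece of text that matches the pattern
pattern1 = re.compile('b.*?d')
# Arguments: 1. the regular expression  2. the string to search
rs = re.search(pattern1, string)
if rs:
    print(rs.group())
else:
    print('No matching data found')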

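# findall() finds every piece of text in the string that matches the pattern and
# returns the results as a list; if nothing matches, it returns an empty list
# Arguments: 1. the regular expression  2. the string to search
rs = re.findall(pattern1, string)
if rs:
    for content in rs:
        print(content)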

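# The r prefix creates a raw string: escape sequences in it are kept as written
# and not interpreted, so \n below is a backslash followed by n, not a newline
string_2 = r'1.Find student\n2.Add student\n3.Exit'
# replace() substitutes a fixed substring of the source string
rs = string_2.replace(r'\n', ',')
print(rs)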

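# sub() replaces the parts of the original string that match a pattern
# r'\\n' matches a literal backslash followed by n
pattern2 = re.compile(r'\\n')
# Arguments: 1. the regular expression  2. the replacement text  3. the string to process
rs = re.sub(pattern2, ',', string_2)
print(rs)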

string3 = 'abcdefghigkdlmhn'
pattern3 = re.compile(r'd.*?h')
# Both non-greedy matches, 'defgh' and 'dlmh', are replaced
rs = re.sub(pattern3, '------', string3)
print(rs)  # abc------igk------n
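
The greedy and non-greedy behaviour described in the comments above, and the difference between match(), search() and findall(), can be checked side by side on the same sample string. This is only a minimal sketch written for Python 3; the variable names are chosen for illustration and are not part of the original code.

```python
import re

string = 'abccccccdedfdgbgds'

# Greedy: .* takes as much as it can, so the match ends at the LAST 'b'.
greedy = re.match(r'a.*b', string).group()
# Non-greedy: .*? takes as little as it can, so the match ends at the FIRST 'b'.
lazy = re.match(r'a.*?b', string).group()
print(greedy)  # abccccccdedfdgb
print(lazy)    # ab

# match() fails because the string does not start with 'b' ...
at_start = re.match(r'b.*?d', string)
# ... while search() scans forward and returns the first match.
first_hit = re.search(r'b.*?d', string).group()
print(at_start)   # None
print(first_hit)  # bccccccd

# findall() returns every non-overlapping match as a list.
all_hits = re.findall(r'b.*?d', string)
print(all_hits)   # ['bccccccd', 'bgd']
```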
