A detailed series on regular expressions, explaining each topic through code examples.
Source code download: http://download.csdn.net/detail/gnail_oug/9504094
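Example functionality: collect the email addresses that appear on a web page.
Step 1: Read the page content from a URL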
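import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.net.HttpURLConnection;
import java.net.MalformedURLException;
import java.net.URL;

/**
 * Reads the content of a web page from a URL.
 * @date 2016-04-27 10:34:13
 * @author sgl
 * @param urlStr address of the page to read
 * @return the page content as one string (empty if the read fails)
 */
public static String readHtml(String urlStr){
    StringBuilder sb=new StringBuilder();
    BufferedReader br=null;
    try {
        URL url=new URL(urlStr);
        HttpURLConnection conn=(HttpURLConnection)url.openConnection();
        InputStream in=conn.getInputStream();
        // No charset is given, so the platform default encoding is used
        br=new BufferedReader(new InputStreamReader(in));
        String line=null;
        // readLine() strips line terminators, so the lines are joined back to back
        while((line=br.readLine())!=null){
            sb.append(line);
        }
    } catch (MalformedURLException e) {
        e.printStackTrace();
    } catch (IOException e) {
        e.printStackTrace();
    } finally{
        // Closing the reader also closes the underlying stream
        if(br!=null){
            try {
                br.close();
            } catch (IOException e) {
                e.printStackTrace();
            }
        }
    }
    return sb.toString();
}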
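One detail of readHtml worth noting: it appends each line without a separator, so text at the end of one line and the start of the next end up joined. The sketch below shows one possible variant of the reading loop that keeps a line break and names the charset explicitly. The class ReadTextSketch and the helper readAll are hypothetical names, not part of the original source, and UTF-8 is an assumption about the page encoding.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

class ReadTextSketch {

    // Hypothetical helper: reads the whole stream as UTF-8 (assumed encoding)
    // and appends '\n' after every line so adjacent lines are not fused.
    static String readAll(InputStream in) {
        StringBuilder sb = new StringBuilder();
        // try-with-resources closes the reader even if reading fails
        try (BufferedReader br = new BufferedReader(
                new InputStreamReader(in, StandardCharsets.UTF_8))) {
            String line;
            while ((line = br.readLine()) != null) {
                sb.append(line).append('\n');
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Stand-in for a network stream, so the sketch runs offline
        byte[] page = "name: alice\nbob@qq.com\n".getBytes(StandardCharsets.UTF_8);
        System.out.print(readAll(new ByteArrayInputStream(page)));
    }
}
```

With the separator kept, "alice" and "bob@qq.com" stay on separate lines; joined back to back they would read "alicebob@qq.com", which an email pattern would then pick up as a single address.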
Step 2: Find the email addresses in the page content
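import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/**
 * Finds the email addresses in a string.
 * @date 2016-04-27 10:35:27
 * @author sgl
 * @param str text to search
 * @return every match, in order of appearance (duplicates included)
 */
public static List<String> findEmail(String str){
    // local part, "@", domain labels, then a final label of 2 to 6 word characters
    Pattern p=Pattern.compile("[\\w-]+@[\\w\\.-]*\\w+\\.\\w{2,6}");
    Matcher m=p.matcher(str);
    List<String> list=new ArrayList<String>();
    // find() advances to the next match on each call
    while(m.find()){
        list.add(m.group());
    }
    return list;
}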
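To see what the pattern in findEmail accepts, it can help to read it piece by piece and try it on a short string. The sketch below reuses the same pattern. The class name EmailRegexSketch and the sample text are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class EmailRegexSketch {

    // Same pattern as findEmail, read left to right:
    //   [\w-]+       local part: letters, digits, underscore or hyphen (no dots)
    //   @            literal at sign
    //   [\w\.-]*\w+  domain labels, ending in at least one word character
    //   \.\w{2,6}    a dot followed by a final label of 2 to 6 word characters
    static final Pattern EMAIL =
            Pattern.compile("[\\w-]+@[\\w\\.-]*\\w+\\.\\w{2,6}");

    static List<String> find(String text) {
        List<String> found = new ArrayList<>();
        Matcher m = EMAIL.matcher(text);
        while (m.find()) {
            found.add(m.group());
        }
        return found;
    }

    public static void main(String[] args) {
        // Made-up sample text, not taken from the page in the article
        String sample = "Contact: 530180782@qq.com, or first.last@example.com;"
                + " not-an-email@localhost";
        System.out.println(find(sample));
    }
}
```

Running it prints [530180782@qq.com, last@example.com]. Because the local part does not allow dots, first.last@example.com is matched only from "last" onward, and not-an-email@localhost is skipped because the pattern requires a dot in the domain. This pattern is a pragmatic filter for scraping, not a full validator of email syntax.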
Step 3: Run both methods and print the email addresses
public static void main(String[] args) {
    String htmlTxt=Demo02.readHtml("http://blog.sina.com.cn/s/blog_515617e60101e151.html");
    List<String> list=Demo02.findEmail(htmlTxt);
    for(String email:list){
        System.out.println(email);
    }
    System.out.println(list.size());
}
Output (middle portion omitted; the last line is the number of matches):
530180782@qq.com
243678025@qq.com
398018489@qq.com
....
398018489@qq.com
595064131@qq.com
362483245@qq.com
285340035@qq.com
448280012@qq.com
438
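The output lists every match, so an address that appears more than once on the page is printed more than once (398018489@qq.com shows up twice above). If only distinct addresses are wanted, one possible approach is to pass the list through a LinkedHashSet, which drops repeats while keeping first-seen order. The class UniqueEmailsSketch and its unique method are hypothetical additions, not part of the original source.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;

class UniqueEmailsSketch {

    // Hypothetical helper: removes repeated addresses, preserving the order
    // in which each address was first seen.
    static List<String> unique(List<String> emails) {
        return new ArrayList<>(new LinkedHashSet<>(emails));
    }

    public static void main(String[] args) {
        // Stand-in for the list returned by findEmail
        List<String> found = Arrays.asList(
                "398018489@qq.com", "595064131@qq.com", "398018489@qq.com");
        List<String> distinct = unique(found);
        for (String email : distinct) {
            System.out.println(email);
        }
        System.out.println(distinct.size());
    }
}
```

The comparison here is exact and case-sensitive. Treating Foo@qq.com and foo@qq.com as the same address would need normalization first, which this sketch does not attempt.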