Scraping Images from Tencent Weibo with Regular Expressions

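Recently I suddenly became interested in the content on Tencent Weibo, so I looked into its API. The official documentation, however, was really hard to get through, so I ended up finding an open-source library that wraps both the Tencent Weibo and Sina Weibo interfaces. What frustrated me most was that both APIs, without exception, require a verification code after authorization, and that code has to be typed in by hand; a program cannot obtain it, so I had to give up on that route. What then? Analyze the page source directly, of course. Enough said, here is the code!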

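// Requires: using System.Net; using System.Text.RegularExpressions; using System.Windows.Forms;
WebClient Client = new WebClient();
// HttpWebRequest and HttpWebResponse would also work for fetching the page source, but WebClient is simpler here.
string webstring = Client.DownloadString(Url);
// This pattern was worked out from the page source; it differs from page to page, for reasons I have not figured out.
Regex r = new Regex(@"http://t3\.qpic\.cn/mblogpic/([a-z0-9]+?)/460 ");
MatchCollection m = r.Matches(webstring);
string[] src = new string[m.Count];
for (int i = 0; i < m.Count; i++)
{
    string small = m[i].Groups[0].ToString();
    int k = small.LastIndexOf("/") + 1;
    // Look closely at Tencent Weibo images and you will find that the large image shown on click
    // is stored separately from the thumbnail: the thumbnail is 460, the large image is 2000.
    string large = small.Substring(0, k) + "2000";
    string filename = Application.StartupPath + @"\images\" + m[i].Groups[1].ToString() + ".jpg";
    // A new WebClient per download: one instance cannot run several asynchronous operations at once.
    WebClient downloader = new WebClient();
    downloader.DownloadFileCompleted += (sender, e) => downloader.Dispose();
    downloader.DownloadFileAsync(new System.Uri(large), filename);
    src[i] = filename;
}
return src;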
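
The URL rewriting done in the loop can also be separated from the download step, which makes it easy to check against a saved copy of the page. Below is a minimal sketch, assuming the same `t3.qpic.cn` thumbnail pattern without the trailing space. `ImageUrlExtractor` and `ExtractLargeImageUrls` are hypothetical names used only for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class ImageUrlExtractor
{
    // Assumed pattern: thumbnail URLs end in "/460"; group 1 captures the image id.
    private static readonly Regex Thumbnail =
        new Regex(@"http://t3\.qpic\.cn/mblogpic/([a-z0-9]+)/460");

    // Maps each image id found in the page source to the URL of its large version.
    // Using a dictionary also drops thumbnails that appear more than once.
    public static Dictionary<string, string> ExtractLargeImageUrls(string html)
    {
        var result = new Dictionary<string, string>();
        foreach (Match match in Thumbnail.Matches(html))
        {
            string id = match.Groups[1].Value;
            // Swap the size segment: 460 (thumbnail) becomes 2000 (large image).
            result[id] = "http://t3.qpic.cn/mblogpic/" + id + "/2000";
        }
        return result;
    }
}
```

Calling `ImageUrlExtractor.ExtractLargeImageUrls(webstring)` would then give the ids to use as file names and the URLs to download.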

One problem: WebClient does not support concurrent I/O operations

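Using WebClient can trigger this error. A single WebClient instance can run only one operation at a time, so starting a second DownloadFileAsync call on it before the first has finished throws a NotSupportedException. Creating the instances in the loop solves it: call new WebClient() inside the loop body for each download.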
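
As a variation on that fix, the sketch below creates one `WebClient` per file but uses `DownloadFileTaskAsync` (available on `WebClient` since .NET Framework 4.5) rather than `DownloadFileAsync`, so the caller can wait until every file has been written. This is a swapped-in technique, not what the original code does, and `ImageDownloader` and `DownloadAllAsync` are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Net;
using System.Threading.Tasks;

public static class ImageDownloader
{
    // Downloads every (file name, URL) pair into targetDir and returns the local paths.
    public static async Task<string[]> DownloadAllAsync(
        IDictionary<string, string> urlsByName, string targetDir)
    {
        Directory.CreateDirectory(targetDir); // no-op if the folder already exists
        var paths = new List<string>();
        var downloads = new List<Task>();
        foreach (var pair in urlsByName)
        {
            string path = Path.Combine(targetDir, pair.Key + ".jpg");
            paths.Add(path);
            downloads.Add(DownloadOneAsync(new Uri(pair.Value), path));
        }
        await Task.WhenAll(downloads); // completes once all files are saved
        return paths.ToArray();
    }

    private static async Task DownloadOneAsync(Uri address, string path)
    {
        // A fresh instance per download: a WebClient runs one operation at a time.
        using (var client = new WebClient())
        {
            await client.DownloadFileTaskAsync(address, path);
        }
    }
}
```

Unlike the fire-and-forget loop, the returned paths here refer to files that have already finished downloading when the task completes.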
