I want to fetch data from this page:
http://www.canadapost.ca/cpotools/apps/track/personal/findByTrackNumber?trackingNumber=0656887000494793
But that page redirects to:
http://www.canadapost.ca/cpotools/apps/track/personal/findByTrackNumber?execution=eXs1
So when I try to fetch the data with open from OpenURI, it raises a RuntimeError with the message "HTTP redirection loop".
Solution
You need a tool like Mechanize. From its description:
The Mechanize library is used for
automating interaction with websites.
Mechanize automatically stores and
sends cookies, follows redirects, can
follow links, and submit forms. Form
fields can be populated and submitted.
Mechanize also keeps track of the
sites that you have visited as a
history.
That is exactly what you need. So,
sudo gem install mechanize
Then
require 'mechanize'

# Current releases expose the class as Mechanize; very old releases used WWW::Mechanize.
agent = Mechanize.new
page = agent.get "http://www.canadapost.ca/cpotools/apps/track/personal/findByTrackNumber?trackingNumber=0656887000494793"

page.content             # Get the resulting page as a string
page.body                # Same string; content is an alias of body
page.search(".somecss")  # Search for specific elements by XPath/CSS using Nokogiri
And you're ready to rock.
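
For illustration only, the cookie storing and redirect following that the Mechanize description mentions can be sketched with nothing but the standard library. This is a rough sketch, not how Mechanize is implemented. It assumes (the question does not confirm this) that the site sets a session cookie on the first response and keeps redirecting until that cookie is sent back. The helper names store_cookies, cookie_header and fetch_with_cookies are hypothetical, and the cookie parsing ignores attributes such as Path and Expires.

```ruby
require 'net/http'
require 'uri'

# Merge Set-Cookie header values into a plain name => value hash.
# Simplification: cookie attributes (Path, Expires, ...) are discarded.
def store_cookies(jar, set_cookie_headers)
  Array(set_cookie_headers).each do |header|
    pair = header.split(';', 2).first.to_s
    name, value = pair.split('=', 2)
    next if name.nil? || name.strip.empty?
    jar[name.strip] = value.to_s.strip
  end
  jar
end

# Build the value of the Cookie request header from the jar.
def cookie_header(jar)
  jar.map { |name, value| "#{name}=#{value}" }.join('; ')
end

# Follow redirects by hand, sending back any cookies received so far.
# Gives up after `limit` requests instead of looping forever.
def fetch_with_cookies(url, limit: 10)
  jar = {}
  uri = URI(url)
  limit.times do
    request = Net::HTTP::Get.new(uri)
    request['Cookie'] = cookie_header(jar) unless jar.empty?
    response = Net::HTTP.start(uri.host, uri.port,
                               use_ssl: uri.scheme == 'https') do |http|
      http.request(request)
    end
    store_cookies(jar, response.get_fields('Set-Cookie'))
    return response.body unless response.is_a?(Net::HTTPRedirection)

    location = response['Location']
    raise 'redirect without a Location header' if location.nil?
    uri = URI.join(uri.to_s, location) # resolves relative Location values
  end
  raise "gave up after #{limit} requests"
end
```

Calling fetch_with_cookies with the tracking URL from the question would return the final response body as a string if the assumption about cookies holds. In practice Mechanize is the better choice, since it also handles cookie scoping, forms and parsing.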