How can I scrape a page that requires parameters using R?

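The page I want to scrape, http://stoptb.org/countries/tbteam/searchExperts.asp, only returns data after parameters are submitted from this page: http://stoptb.org/countries/tbteam/experts.asp. Because the parameters are not embedded in the URL, I don't know how to pass them from R. Is there a way to do this in R?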

(By the way, I know next to nothing about ASP, so maybe that is the piece I'm missing.)

Solution

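You can use the RHTMLForms package, which reads the description of an HTML form and generates an R function that submits it.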

You may need to install it first:

# install.packages("RHTMLForms",repos = "http://www.omegahat.org/R")

Or, on Windows, you may need

# install.packages("RHTMLForms",repos = "http://www.omegahat.org/R",type = "source")


 library(RHTMLForms)
 library(RCurl)
 library(XML)
 # Read the form definitions from the page that hosts the search form
 forms <- getHTMLFormDescription("http://stoptb.org/countries/tbteam/experts.asp")
 # Build an R function that submits the form named sExperts
 fun <- createFunction(forms$sExperts)
 # find experts with expertise in "Infection control: Engineering Consultant"
 results <- fun(Expertise = "Infection control: Engineering Consultant")

 # Pull the results table out of the returned HTML
 tableData <- getNodeSet(htmlParse(results), "//*/table[@class = 'data']")
 readHTMLTable(tableData[[1]])

#                              V1                   V2                     V3
#1                                                <NA>                   <NA>
#2                 Name of Expert Country of Residence                  Email
#3               Girmay,Desalegn             Ethiopia    deskebede@yahoo.com
#4            IVANCHENKO,VARVARA              Estonia v.ivanchenko81@mail.ru
#5                   JAUCOT,Alex              Belgium  alex.jaucot@gmail.com
#6 Mulder,Hans Johannes Henricus              Namibia        hmulder@iway.na
#7                    Walls,Neil            Australia        neil@nwalls.com
#8                 Zuccotti,Thea                Italy     thea_zuc@yahoo.com
#                  V4
#1               <NA>
#2 Number of Missions
#3                  0
#4                  3
#5                  0
#6                  0
#7                  0
#8                  1
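
In the output above, the real column names ("Name of Expert" and so on) arrive as the second data row, after an empty first row. A minimal sketch of tidying such a table follows. `promoteHeader` is a hypothetical helper written for this illustration (it is not part of RHTMLForms or XML), and the small `raw` data frame is a made-up stand-in so the example runs offline.

```r
# Hypothetical helper (not from RHTMLForms or XML): use one of the data rows
# as the column names and drop it together with every row above it.
promoteHeader <- function(tbl, headerRow = 2) {
  # Work on plain character columns so factor levels do not get in the way
  tbl <- as.data.frame(lapply(tbl, as.character), stringsAsFactors = FALSE)
  header <- unlist(tbl[headerRow, ], use.names = FALSE)
  body <- tbl[-seq_len(headerRow), , drop = FALSE]
  names(body) <- make.names(header, unique = TRUE)
  rownames(body) <- NULL
  body
}

# Made-up stand-in with the same shape as the scraped table
raw <- data.frame(
  V1 = c("", "Name of Expert", "Doe,Jane"),
  V2 = c(NA, "Country of Residence", "Utopia"),
  V3 = c(NA, "Number of Missions", "2"),
  stringsAsFactors = FALSE
)

clean <- promoteHeader(raw)
# Counts were read as text, so convert them explicitly
clean$Number.of.Missions <- as.integer(clean$Number.of.Missions)
print(clean)
```

With the real data, the same call would be `promoteHeader(readHTMLTable(tableData[[1]]))`, assuming the header keeps landing in row 2.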

Or create a reader so that the generated function returns a table directly

 returnTable <- function(results){
   tableData <- getNodeSet(htmlParse(results),"//*/table[@class = 'data']")
   readHTMLTable(tableData[[1]])
 }
 fun = createFunction(forms$sExperts,reader = returnTable)
 fun(CBased = "Bhutan") # find experts based in Bhutan
#                 V1                   V2                      V3
#1                                   <NA>                    <NA>
#2    Name of Expert Country of Residence                   Email
#3 Wangchuk,Lungten               Bhutan drlungten@health.gov.bt
#                  V4
#1               <NA>
#2 Number of Missions
#3                  2
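
If RHTMLForms cannot be installed, the form can probably be submitted by hand with `postForm` from RCurl. The following is only a sketch under assumptions: that the form on experts.asp sends its fields by POST to searchExperts.asp, and that the server accepts a request containing only the fields you name. RHTMLForms also fills in the form's other fields with their defaults, so this simpler request may need extra fields. `searchExperts` is a hypothetical wrapper name. The field names `Expertise` and `CBased` are the ones used above.

```r
library(RCurl)
library(XML)

# Assumption: the search form POSTs to this URL (the results page named in the question)
searchUrl <- "http://stoptb.org/countries/tbteam/searchExperts.asp"

# Hypothetical wrapper: named arguments become form fields, e.g. CBased = "Bhutan"
searchExperts <- function(...) {
  # style = "POST" sends the fields as application/x-www-form-urlencoded
  html <- postForm(searchUrl, ..., style = "POST")
  tables <- getNodeSet(htmlParse(html, asText = TRUE),
                       "//table[@class = 'data']")
  # Return NULL rather than failing when no results table is present
  if (length(tables) == 0) return(NULL)
  readHTMLTable(tables[[1]])
}

# Example call (needs network access, so it is left commented out):
# searchExperts(CBased = "Bhutan")
```

The generated function from `createFunction` remains the safer route, because it is built from the form's actual definition rather than from guesses about it.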
