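I am trying to write a scraper to download some information, similar to this Stack Overflow post. The answer there is useful for creating the filled-in form, but I am struggling to find a way to submit the form when the submit button is not part of the form. Here is an example: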
library(rvest)

session <- html_session("www.chase.com")
form <- html_form(session)[[3]]
filledform <- set_values(form, `user_name` = user_name, `usr_password` = usr_password)
session <- submit_form(session, filledform)
At this point, I get this error:
Error in names(submits)[[1]] : subscript out of bounds
How can I submit this form?
Solution
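Here is a dirty hack that works for me. After studying the submit_form source code, I figured I could work around the problem by injecting a fake submit button into my in-code copy of the form, which submit_form then picks up as the button to submit with. It works, except that it emits a message that often names an irrelevant input object (though not in the example below). Despite that message, the code works for me: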
library(rvest)

session <- html_session("www.chase.com")
form <- html_form(session)[[3]]

# Form on home page has no submit button,
# so inject a fake submit button or else rvest cannot submit it.
# When I do this, rvest gives a warning "Submitting with '___'", where "___" is
# often an irrelevant field item.
# This warning might be an rvest (version 0.3.2) bug, but the code works.
fake_submit_button <- list(name = NULL,
                           type = "submit",
                           value = NULL,
                           checked = NULL,
                           disabled = NULL,
                           readonly = NULL,
                           required = FALSE)
attr(fake_submit_button, "class") <- "input"
form[["fields"]][["submit"]] <- fake_submit_button

user_name <- "user"
usr_password <- "password"

filledform <- set_values(form, `user_name` = user_name, `usr_password` = usr_password)
session <- submit_form(session, filledform)
A successful run prints the following message, which I simply ignore:
> Submitting with 'submit'
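
For anyone wondering why the fake button helps: the error text points at names(submits)[[1]], which suggests that the submit step collects the fields whose type is "submit" and takes the name of the first one. Below is a minimal base R sketch of that idea. It is an illustration under that assumption, not rvest's actual code; the `fields` list and the `first_submit` helper are hypothetical stand-ins for the form's fields and the lookup.

```r
# Hypothetical stand-in for form$fields: a named list of inputs,
# none of which has type "submit" (this is not rvest's real object).
fields <- list(
  user_name    = list(name = "user_name",    type = "text"),
  usr_password = list(name = "usr_password", type = "password")
)

# Assumed lookup: keep only submit-type fields, then take the first name.
first_submit <- function(fields) {
  submits <- Filter(function(x) identical(tolower(x$type), "submit"), fields)
  names(submits)[[1]]
}

# With no submit-type field, indexing the empty name vector raises an error;
# capture its message instead of stopping the script.
before <- tryCatch(first_submit(fields), error = function(e) conditionMessage(e))

# Inject a fake submit field under the key "submit", as in the answer above.
fields[["submit"]] <- list(name = NULL, type = "submit", value = NULL)
after <- first_submit(fields)

print(before)
print(after)
```

In this sketch `after` is the key the fake field was stored under, "submit", which matches the name shown in the "Submitting with 'submit'" message.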