How to submit a login form in the rvest package without a button argument

This is the same problem as open issue #159 in the rvest package: form submission breaks when not every field in the form has a type value. The bug may be fixed in a future release.

However, we can work around the issue by monkey patching the underlying function rvest:::submit_request.

The core problem is the helper function is_submit. Initially, it's defined like this:

is_submit <- function(x) tolower(x$type) %in% c("submit", "image", "button")

Reasonable as this looks, it fails in two scenarios:

  1. There is no type element.
  2. The type element is NULL.
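
To make the failure concrete, here is a minimal sketch in base R. The field list is hypothetical (it is not the real United form); it only mimics a form in which one field carries no type.

```r
# Hypothetical field list: "b" has no type element at all.
fields <- list(
  a = list(name = "a", type = "text"),
  b = list(name = "b"),
  c = list(name = "c", type = "submit")
)

# The original helper, as quoted above.
is_submit_orig <- function(x) tolower(x$type) %in% c("submit", "image", "button")

# For "b", x$type is NULL, so tolower() yields character(0) and the helper
# returns a zero-length logical instead of a single FALSE.
result_lengths <- vapply(lapply(fields, is_submit_orig), length, integer(1))
print(result_lengths)

# Filter() collapses the per-field results with unlist(), so a zero-length
# result shifts the remaining ones out of alignment with the fields.
print(names(Filter(is_submit_orig, fields)))
```

The helper has to return exactly one TRUE or FALSE per field for Filter to select the right elements.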

Both of these happen to occur on the United login form. We can resolve this by adding two checks inside the function.

custom.submit_request <- function (form, submit = NULL) 
{
  is_submit <- function(x) {
    if (!exists("type", x) || is.null(x$type)) {
      return(FALSE)
    }
    tolower(x$type) %in% c("submit", "image", "button")
  } 
  submits <- Filter(is_submit, form$fields)
  if (length(submits) == 0) {
    stop("Could not find possible submission target.", call. = FALSE)
  }
  if (is.null(submit)) {
    submit <- names(submits)[[1]]
    message("Submitting with '", submit, "'")
  }
  if (!(submit %in% names(submits))) {
    stop("Unknown submission name '", submit, "'.\n", "Possible values: ", 
         paste0(names(submits), collapse = ", "), call. = FALSE)
  }
  other_submits <- setdiff(names(submits), submit)
  method <- form$method
  if (!(method %in% c("POST", "GET"))) {
    warning("Invalid method (", method, "), defaulting to GET", 
            call. = FALSE)
    method <- "GET"
  }
  url <- form$url
  fields <- form$fields
  fields <- Filter(function(x) length(x$value) > 0, fields)
  fields <- fields[setdiff(names(fields), other_submits)]
  values <- pluck(fields, "value")
  names(values) <- names(fields)
  list(method = method, encode = form$enctype, url = url, values = values)
}
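
As a quick sanity check, the guarded helper can be exercised on its own, outside rvest. This is a simplified sketch: it keeps only the is.null() check (which also covers a missing type element, since x$type is NULL in that case), and the field list is hypothetical rather than taken from the United form.

```r
# Simplified stand-in for the patched helper: always returns one TRUE/FALSE.
is_submit_safe <- function(x) {
  if (is.null(x$type)) {
    return(FALSE)
  }
  tolower(x$type) %in% c("submit", "image", "button")
}

# Hypothetical fields: one with no type, one with an explicit NULL type.
fields <- list(
  user   = list(name = "user", type = "text"),
  hidden = list(name = "hidden"),
  empty  = list(name = "empty", type = NULL),
  go     = list(name = "go", type = "Submit")
)

# Only the genuine submit field should survive the filter.
submits <- Filter(is_submit_safe, fields)
print(names(submits))
```

With every field mapped to a single logical value, Filter keeps only the fields whose type is actually a submit-like control.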

To monkey patch, we need to use the R.utils package (install via install.packages("R.utils") if you don't have it).

library(R.utils)

reassignInPackage("submit_request", "rvest", custom.submit_request)

From there, we can issue our own request.

account <- account %>% 
     submit_form(login, "ctl00$ContentInfo$SignInSecure")

And that works!

(Well, "works" is a stretch. United enforces more aggressive authentication requirements, including known browsers, so the request comes back as 401 Unauthorized. The original error, however, is gone.)

A full reproducible example involves a couple of other minor code changes:

library(magrittr)
library(rvest)

url <- "https://www.united.com/web/en-US/apps/account/account.aspx"
account <- html_session(url)
login <- account %>%
  html_nodes("form") %>%
  extract2(1) %>%
  html_form() %>%
  set_values(
    `ctl00$ContentInfo$SignIn$onepass$txtField` = "USER",
    `ctl00$ContentInfo$SignIn$password$txtPassword` = "PASS")
account <- account %>% 
  submit_form(login, "ctl00$ContentInfo$SignInSecure")