Automated access by scripts
Revision as of 17:14, 16 May 2011
Raven was designed to authenticate human users of a web browser. However, in practice, skilled users who know how to program, especially system administrators, often need to automate workflows that involve HTTP access to a Raven-protected resource. While there exist workarounds (see section "Workaround" below) that allow scripts to get past the Raven login screen, these are cumbersome and fragile.
Proposal: WLS to recognize two new cookies
The following simple proposal to make the Raven WLS more script-friendly was made by Markus Kuhn on 28 Apr 2010 on the cs-raven-discuss mailing list.
It involves having the Raven WLS recognize three cookies (Ucam-WLS-ID, which it already understands, plus two new ones) that carry the login name, the password and a mode flag and that, if presented, allow a client to bypass the manual login screen and any additional manual confirmation screens during the Raven authentication sequence.
A script that tries to access any Raven-protected content will first add three cookies for https://raven.cam.ac.uk/ to the "cookie jar" of its HTTP client tool/library:
- Ucam-WLS-ID=your-crsid
- Ucam-WLS-Passwd=your-raven-password
- Ucam-WLS-mode=automatic
In addition, the script needs to instruct its HTTP client tool/library to automatically follow any HTTP redirects that it encounters.
Ucam-WLS-ID is already understood by the Raven WLS today. It replaces the login-name form element on the login page with the provided value, such that the user does not have to type in their CRSID. It is currently set if a user asks https://raven.cam.ac.uk/auth/account/ to pre-fill their login name in the password form.
Ucam-WLS-Passwd would equivalently allow a client to tell Raven in advance the password, such that there is no need for Raven to display any interactive password form if both Ucam-WLS-ID and Ucam-WLS-Passwd are provided.
Ucam-WLS-mode=automatic would tell Raven explicitly that the client is a machine and is therefore not interested in any interactive notification or confirmation screens, as it can't understand English prose anyway. If Ucam-WLS-mode=automatic is present, any new interactive notifications or confirmations are postponed until the next time the user logs in without the cookie Ucam-WLS-mode=automatic.
With these three cookies set correctly, the WLS would either immediately redirect the client back to the application server's WAA where it came from (HTTP result code 302 or 303), or – if the login was not successful (wrong or missing password) – abort with an HTTP error (403 "Forbidden" seems appropriate).
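The outcome described above could be checked in a script roughly as follows. This is only a sketch of the proposed behaviour, which the WLS may never implement; the function name classify_status and the result labels are made up for illustration. It assumes the request was made with curl -L, so a successful login ends in a 2xx status from the application, while a rejected login ends in 403.

```shell
#!/bin/sh
# Sketch only: interprets the final HTTP status of a "curl -L" request
# under the PROPOSED WLS behaviour (not known to be implemented).
# classify_status is a hypothetical helper name.

classify_status() {
    case "$1" in
        2??) echo "ok" ;;                     # redirects followed, page delivered
        403) echo "login-rejected" ;;         # proposed WLS answer to a bad login
        3??) echo "redirect-not-followed" ;;  # curl was probably run without -L
        *)   echo "unexpected" ;;             # includes curl's "000" on failure
    esac
}

# Usage (needs network access, so shown as a comment only):
#   status=$(curl -s -L -o page.html -w '%{http_code}' \
#                 -b /tmp/cookiejar.txt -c /tmp/cookiejar.txt "$url")
#   [ "$(classify_status "$status")" = ok ] || exit 1
```

Note that a 403 could equally come from the application server itself, so the label "login-rejected" is a guess rather than a certainty.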
Example
Say you want to access a Raven-protected web page using curl, a popular Unix command-line tool for making HTTP requests.
To deliver the above three cookies safely (i.e. only to https://raven.cam.ac.uk/), create a file "/tmp/cookiejar.txt" with content
raven.cam.ac.uk FALSE / TRUE 2147483647 Ucam-WLS-ID your-crsid
raven.cam.ac.uk FALSE / TRUE 2147483647 Ucam-WLS-Passwd your-raven-password
raven.cam.ac.uk FALSE / TRUE 2147483647 Ucam-WLS-mode automatic
[Columns: domain, tailmatch, path, secure, expires, name, value; in the actual file, curl expects the fields to be separated by single tab characters]
It is important to set the "secure" flag to TRUE for Ucam-WLS-Passwd such that this cookie is not accidentally submitted over an insecure non-HTTPS connection. The "expires" value is just 2^31−1 = 2147483647 (19 Jan 2038), the maximum time_t value supported on 32-bit platforms.
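Since the file holds a password and its fields must be tab-separated, it may be easier to generate it from a script than to type it by hand. The following is a minimal sketch; the function name write_raven_jar is hypothetical, and the Ucam-WLS-Passwd and Ucam-WLS-mode cookies are only proposed here, not something the WLS is known to accept.

```shell
#!/bin/sh
# Sketch only: writes the three cookies in the tab-separated cookie file
# format that curl reads with -b. write_raven_jar is a hypothetical helper;
# two of the three cookies exist only in this proposal.
#
# Usage: write_raven_jar FILE CRSID PASSWORD
# (CRSID and PASSWORD must not contain tab or newline characters)

write_raven_jar() {
    jar=$1 crsid=$2 passwd=$3
    expires=2147483647        # 2^31-1, i.e. 19 Jan 2038
    rm -f "$jar"              # make sure the file is created afresh
    (
        umask 077             # the file holds a password: owner-only access
        {
            # columns: domain, tailmatch, path, secure, expires, name, value
            printf 'raven.cam.ac.uk\tFALSE\t/\tTRUE\t%s\t%s\t%s\n' \
                "$expires" Ucam-WLS-ID "$crsid"
            printf 'raven.cam.ac.uk\tFALSE\t/\tTRUE\t%s\t%s\t%s\n' \
                "$expires" Ucam-WLS-Passwd "$passwd"
            printf 'raven.cam.ac.uk\tFALSE\t/\tTRUE\t%s\t%s\t%s\n' \
                "$expires" Ucam-WLS-mode automatic
        } > "$jar"
    )
}

# Example call (commented out so that nothing is written by default):
#   write_raven_jar /tmp/cookiejar.txt your-crsid your-raven-password
```

The subshell keeps the umask change local, so the rest of the calling script still creates files with its usual permissions.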
Then a single curl call of the form
curl -L -b/tmp/cookiejar.txt -c/tmp/cookiejar.txt ....
will not only get you past Raven, but will also leave in your cookie jar the application's session cookie that then makes further Raven calls unnecessary (until timeout). Option -L causes curl to automatically follow redirects, option -b reads the cookie jar, and option -c writes back an updated cookie jar.
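To avoid repeating the three options on every call, they could be bundled in a small shell function. This is just a sketch: the names raven_curl, RAVEN_JAR and CURL are made up for illustration, and the wrapper would only get past Raven if the WLS adopted the proposed cookies.

```shell
#!/bin/sh
# Sketch only: bundles the three curl options discussed above.
# raven_curl, RAVEN_JAR and CURL are hypothetical names.

RAVEN_JAR=${RAVEN_JAR:-/tmp/cookiejar.txt}   # cookie jar with the WLS cookies
CURL=${CURL:-curl}                           # overridable, e.g. for dry runs

raven_curl() {
    # -L follows redirects, -b reads the jar, -c writes the updated jar back
    "$CURL" -L -b "$RAVEN_JAR" -c "$RAVEN_JAR" "$@"
}

# Example call (needs network access, so shown as a comment only):
#   raven_curl -s "$url"
```

Setting CURL=echo before a call prints the command-line arguments instead of running curl, which is handy for checking what a script would send.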
In other words, getting past Raven with curl will become as easy as creating one temporary file that contains the username and password, plus providing three additional command-line options to each invocation of curl.
Most other HTTP scripting tools and libraries have equivalent facilities to set cookies and automatically follow redirects, and would therefore benefit equally from this simple extension.
Workaround
The following example illustrates how one can currently access a Raven-protected URL from within a shell script and curl. Note that the script has to rewrite the WLS URL that contains the session state, to replace the login page authenticate.html with the redirect page authenticate2.html. This script will fail if there are any interactive confirmations or notifications after the login stage, or if the WLS authors make any changes to the URL structure, form fields, or other aspects of the site design.
#!/bin/bash
# Demonstration of Raven authentication into Lookup using curl
# Markus Kuhn -- 2010-04-08
#
# Usage: login=mgk25 passwd=... ./raven-client-demo
#
# some parameters
# login=mgk25   # Raven userid to be used
# passwd=...    # Raven password to be used
url='http://www.lookup.cam.ac.uk/person/mgk25'  # some URL that triggers a redirect to Raven
#
# storage space for session cookies (secret)
cookiejar=/tmp/raven-demo-$USER.cookiejar
rm -f $cookiejar ; touch $cookiejar ; chmod go-rwx $cookiejar
#
# firstly, trigger and then handle the Raven redirect
curlopt="-s --output /dev/null -b$cookiejar -c$cookiejar"
# get redirected to Raven's "authenticate.html" form with the right cookie
redirect1=`curl -w'%{redirect_url}' $curlopt "$url"`
# fill in the password form by accessing Raven's "authenticate2.html" page
redirect2=`curl -w'%{redirect_url}' $curlopt --data userid="$login" --data-urlencode pwd="$passwd" --data submit=Submit "${redirect1/authenticate.html/authenticate2.html}"`
# now follow the redirect back and obtain the application's session
# cookie that attests our successful login
curl $curlopt "$redirect2"
# and now we can get on to do some real work ...
# Example 1: download the Lookup webpage $url
curl -b$cookiejar -c$cookiejar -s "$url"
# Example 2: download the CSV list of CRSIDs that are members of an institution
institution=WOLFC
curl -b$cookiejar -c$cookiejar -s --data sort=crsid --data _action_download_members=Download http://www.lookup.cam.ac.uk/inst/$institution/bulk-update-members
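The URL rewrite is the most fragile step of this script: if the first redirect does not contain authenticate.html, the substitution silently leaves the URL unchanged and the login data is posted to the wrong page. One way to make that failure visible is sketched below. The function name rewrite_login_url is hypothetical, and the sample URL in the usage comment is only an illustrative shape, not a guaranteed description of what the WLS sends.

```shell
#!/bin/sh
# Sketch only: replaces the first "authenticate.html" in a redirect URL
# with "authenticate2.html", and fails loudly if it is not there.
# rewrite_login_url is a hypothetical helper name.

rewrite_login_url() {
    case "$1" in
        *authenticate.html*)
            # text before the first match + new page name + text after it
            printf '%s\n' \
                "${1%%authenticate.html*}authenticate2.html${1#*authenticate.html}"
            ;;
        *)
            echo "unexpected redirect, no authenticate.html in: $1" >&2
            return 1
            ;;
    esac
}

# Usage inside the script above (illustrative):
#   form_url=$(rewrite_login_url "$redirect1") || exit 1
# With an illustrative input such as
#   https://raven.cam.ac.uk/auth/authenticate.html?ver=1
# the function prints
#   https://raven.cam.ac.uk/auth/authenticate2.html?ver=1
```

The check only catches a missing page name; it cannot detect other changes to the form fields or to the site design.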
Tip: use the Firefox add-on Live HTTP headers on an interactive session to understand how to run the same transaction automatically, e.g. using curl.