Re-using Raven's password database: Difference between revisions

From RavenWiki
Revision as of 17:16, 3 December 2008

WARNING: this is incomplete 'work in progress'

Raven currently provides a usable authentication system for interactive, web-based applications but doesn't support non-interactive web-based applications or non-web ones, and these cover many potential uses: Windows/MacOS/Unix logon, IMAP, POP, SMTP, LDAP, WebDAV (and so CalDAV), etc. Since Raven of necessity has a database containing username/password pairs for most people in the University, and most of them know their Raven password, it is tempting to assume that extending it to support these other uses would be simple.

This paper explores various ways in which password-based authentication could be provided and attempts to point out the advantages and drawbacks of each proposal. In reading this, it is useful to consider what security properties you would actually want from such a system. It's also important to consider who might be attempting to attack what, and how much we care about it: are we talking about a) a bored University student, b) a tabloid journalist, c) an organised criminal, or d) the American NSA? And are they trying to gain access to a) their mate's photo archive, b) the next heir to the throne's email account, c) the University financial system, or d) everything? Remember that the important thing isn't that legitimate users are required to provide their password before gaining access, but that people cannot realistically gain access using an identity that is not their own. These are not the same thing.

"At first he thought that the whole world had blown up. Then he thought that only the Forest part of it had; and then he thought that only he had..." Piglet, Winnie-The-Pooh, A.A. Milne, Methuen 1926.

Option the first: forget passwords

Stepping back a bit, we could just forget passwords and simply ask people to tell us what their CRSid was. This would even simplify Raven itself:

[Image: Raven login, no password]

This would save people from having a password that they have to remember, and would save the UCS from having to issue them. Both of these would be significant advantages. Anyone can easily set up almost any system 'protected' in this way - the hard part might actually be stopping it from asking for a password.

The obvious downside is that we'd have to trust everyone who can get to a login prompt not to lie (and for many networked services this means everyone in the entire world), and this is rather unrealistic.

Option the second: use a single fixed password

Rather than forgetting passwords completely, we could use a single, fixed, 'well known' password to protect all accounts.

This would be easy to remember (and easy to recover if you forgot it). It's also fairly easy to set up a system protected in this way.

The problem is that we'd still have to trust everyone who knew the password not to lie. It's fair to assume that a 'secret' known by 55,000 (and rising) people would not stay a secret for long, so we'd have to trust quite a lot of people not to lie. It would also be next to impossible ever to change this password. Again this is probably not a realistic option (though that doesn't actually stop people using it, usually on a small scale - think of the codes used to arm and disarm most intruder alarm systems).

Option the third: distribute a list of user name/plaintext password pairs

We could (in theory, though not currently in practice) extract a list of user names and corresponding plain text passwords from Raven, one for each user, and then distribute this list to every system that needs to authenticate users.

Users could then quote their user name and password, and the system could look up the user name and compare the offered password with the one on the list - if it matches they would get in, if not they wouldn't. Implementing authentication in this way would be fairly easy, and system administrators could post-process the list into whatever form their system actually needed.
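The lookup described above could be sketched roughly as follows. This is only an illustration: the user names, passwords and the `PASSWORD_LIST` and `check_login` names are invented for the example and are not part of Raven.

```python
import hmac

# Hypothetical distributed list: user name -> plaintext password.
# Anyone who can read this dictionary can log in as anyone on it.
PASSWORD_LIST = {
    "spqr1": "correct horse",
    "abc23": "letmein",
}


def check_login(username: str, offered: str) -> bool:
    """Return True only if the offered password matches the listed one."""
    expected = PASSWORD_LIST.get(username)
    if expected is None:
        return False
    # Constant-time comparison, so that timing does not reveal how much
    # of the offered password was correct.
    return hmac.compare_digest(expected.encode(), offered.encode())
```

The check itself is a handful of lines, which is the attraction; the cost is that every system running it holds a complete copy of the list.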

This avoids the problem of having to trust most users not to lie - the vast majority of users will only be able to authenticate as themselves. It's still possible for users to give away their password, but that's a feature of password-based authentication that's effectively impossible to avoid.

The list itself, however, is now a major problem. Anyone with access to it can authenticate as any user on any system that relies on this authentication service. Worse, many people use the same password for multiple authentication systems, so anyone with access to the list could probably also authenticate on systems that don't use this authentication system.

All of the managers of all the systems using the authentication service would inherently have access to the list, so we'd have to trust them. Even if we assume that none of these administrators would be actively malicious, there is still the danger that they might accidentally or recklessly allow the list to leak (from a hacked server, on a memory stick, on a laptop left on a train, etc.). So the security of each system under this model still depends on a whole group of people that each system's administrator has no particular reason to trust. In practice the only option would be to severely restrict the systems that could participate, to the point where it could never provide a 'universal' authentication system for the University.

Option the fourth: distribute a list of user name/'crypted' password pairs

Rather than distributing plaintext passwords, we could distribute them 'non-reversibly encrypted' (or 'hashed') - for example using the 'crypt' or 'md5' password format used in Unix password files. In principle this makes it possible to check that a proffered password is correct (by encrypting it and checking the encrypted versions match), without exposing all the passwords on the list.
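As a rough sketch of the 'hash it and compare' check, assuming Python's standard `hashlib`: PBKDF2 is used here purely as a stand-in, not the 'crypt' or 'md5' formats mentioned above, and the function names are invented for the example.

```python
import hashlib
import hmac
import os


def hash_password(password: str, salt: bytes) -> bytes:
    """One-way hash of a password (PBKDF2 as an illustrative stand-in)."""
    return hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 100_000)


def make_entry(password: str) -> tuple[bytes, bytes]:
    """Build the (salt, hash) pair that would appear on the distributed list."""
    salt = os.urandom(16)
    return salt, hash_password(password, salt)


def verify(offered: str, entry: tuple[bytes, bytes]) -> bool:
    """Hash the offered password the same way and compare the results."""
    salt, stored = entry
    return hmac.compare_digest(hash_password(offered, salt), stored)
```

Note that `verify` still needs the plaintext password to be offered to it; only the stored copy is hashed.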

Unfortunately hashed passwords can be recovered by a 'dictionary attack', in which an attacker generates a dictionary of words and their hashed equivalents and then searches the hashed passwords for matches. Since users have a bad habit of choosing common words (or trivial variations of them) as passwords, such an attack will normally recover a reasonable proportion of passwords. So even with hashed passwords the list would be vulnerable to compromise and would still have to be treated as confidential. Worse, some authentication schemes (particularly those using challenge-response) need access to plaintext passwords, or at least particular hashes of the password, and so can't be supported under this sort of scheme.
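A minimal sketch of such an attack, under simplifying assumptions: the leaked list, the word list and all names are invented, and the hashes are unsalted SHA-256 so that a single precomputed table works. Real password formats are usually salted, which forces the attacker to hash guesses per user but does not prevent the attack.

```python
import hashlib


def h(word: str) -> str:
    """Unsalted hash, standing in for whatever format the list uses."""
    return hashlib.sha256(word.encode()).hexdigest()


# Hypothetical leaked list of user name -> hashed password.
leaked = {
    "spqr1": h("password"),
    "abc23": h("dragon1"),
    "xyz99": h("k7#Qv!p2Lw"),
}

wordlist = ["password", "dragon", "letmein"]


def variations(word: str):
    """Yield the word plus a few trivial variations users tend to choose."""
    yield word
    yield word.capitalize()
    for digit in "0123456789":
        yield word + digit


def dictionary_attack(leaked: dict[str, str], wordlist: list[str]) -> dict[str, str]:
    """Return the user name -> password pairs the dictionary recovers."""
    table = {h(c): c for w in wordlist for c in variations(w)}
    return {user: table[hv] for user, hv in leaked.items() if hv in table}


recovered = dictionary_attack(leaked, wordlist)
```

Here the two users who chose a common word, or a common word with a digit appended, are recovered, while the third is not.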

Option the fifth: central password verification

Even if a list could be made to work, distributing it would be difficult, especially if it needs to be done in a timely manner. One solution would be to have a 'central password verification service'. In this model, a system using the authentication service solicits a user name and password from a user and forwards them to the central service, which returns a match/no match response. Almost any network protocol could be used for this, but it's common to use LDAP and overload LDAP's 'user login' process to provide authentication. Alternatives include POP, IMAP, RADIUS, and home-grown protocols. Configuring these so that they don't expose plaintext passwords on the network, and so that the authentication server can't be spoofed, makes them somewhat more complicated and is a step which is often omitted.

This approach successfully avoids exposing the list of everyone's plaintext or hashed passwords to system administrators (and potentially others), which significantly reduces the exposure. However everyone logging in still has to offer their plaintext password for verification, at which point it is still vulnerable to a malicious or negligent system administrator. Consider two sites 'A' and 'B' which are both using such a system and have some users in common. What grounds can site 'A's administrators have for believing that site 'B's administrators will not (accidentally or deliberately) capture and disclose or use username/password pairs that would allow forged logins on site 'A'? What if site 'A' were on the Student Run Computer System and site 'B' were the University Financial System, or vice-versa?
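The site 'A' and site 'B' problem can be made concrete with a small in-process sketch. Everything here is hypothetical: the class names are invented, the 'network' is a plain method call, and how the central service stores passwords is irrelevant to the point being made.

```python
import hmac


class CentralVerifier:
    """Stand-in for the central service: answers match / no match."""

    def __init__(self, passwords: dict[str, str]):
        self._passwords = dict(passwords)

    def verify(self, username: str, password: str) -> bool:
        expected = self._passwords.get(username)
        if expected is None:
            return False
        return hmac.compare_digest(expected.encode(), password.encode())


class Site:
    """A well-behaved site: forwards credentials and keeps nothing."""

    def __init__(self, name: str, verifier: CentralVerifier):
        self.name = name
        self.verifier = verifier

    def login(self, username: str, password: str) -> bool:
        return self.verifier.verify(username, password)


class CapturingSite(Site):
    """A malicious or careless site: it must see the plaintext to forward
    it, so nothing stops it keeping a copy of every working password."""

    def __init__(self, name: str, verifier: CentralVerifier):
        super().__init__(name, verifier)
        self.captured: list[tuple[str, str]] = []

    def login(self, username: str, password: str) -> bool:
        ok = super().login(username, password)
        if ok:
            self.captured.append((username, password))
        return ok


central = CentralVerifier({"spqr1": "s3cret"})
site_a = Site("A", central)
site_b = CapturingSite("B", central)

site_b.login("spqr1", "s3cret")           # an ordinary login at site B
stolen_user, stolen_password = site_b.captured[0]
forged = site_a.login(stolen_user, stolen_password)  # replayed at site A
```

Site 'A' has done nothing wrong and the central service never disclosed its list, yet the forged login at 'A' succeeds.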

There is unfortunately a further problem. If asked for their 'central authentication system password', how can a user know that it is actually safe to quote it? The problem is that passwords are only safe if you don't disclose them, but that to use them you have to disclose them. If you disclose them to a malicious site then you've lost. This is the basis of 'phishing' attacks aimed (with significant success) at getting people to disclose their electronic banking or mail system passwords. Given all of the following, which are the bogus sites that are just trying to capture your password?

[Image: Assorted login boxes]

There is no obvious solution to this.

Option the sixth: Kerberos (or similar)

Kerberos was designed precisely to overcome this problem. It allows a central verification service to assert that a user has correctly quoted a password, and so has authenticated themselves, without the user having to disclose their password to anything other than their local workstation. It uses assorted cryptographic sleights of hand to do this and has numerous other important properties, such as preventing the recipient of an assertion from using it to impersonate the user on another service.
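The two properties just described can be illustrated with a deliberately simplified toy. This is not the Kerberos protocol: real Kerberos uses encrypted tickets, session keys, timestamps and replay protection, all of which are omitted here, and every name below is invented. The sketch only assumes that the central service shares one secret key with each user (derived from their password) and another with each service.

```python
import hashlib
import hmac


def derive_key(username: str, password: str) -> bytes:
    """Done on the workstation; the password itself goes no further."""
    return hashlib.pbkdf2_hmac("sha256", password.encode(), username.encode(), 50_000)


def mac(key: bytes, *fields: str) -> str:
    """Keyed checksum over the given fields."""
    return hmac.new(key, "\x00".join(fields).encode(), hashlib.sha256).hexdigest()


class CentralService:
    def __init__(self, user_keys: dict[str, bytes], service_keys: dict[str, bytes]):
        self.user_keys = user_keys
        self.service_keys = service_keys

    def issue(self, username: str, service: str, proof: str):
        """Issue an assertion for one named service, or None on failure."""
        user_key = self.user_keys.get(username)
        service_key = self.service_keys.get(service)
        if user_key is None or service_key is None:
            return None
        if not hmac.compare_digest(proof, mac(user_key, username, service)):
            return None
        return (username, service, mac(service_key, username, service))


def workstation_request(central: CentralService, username: str, password: str, service: str):
    """Prove knowledge of the password without sending it anywhere."""
    key = derive_key(username, password)
    return central.issue(username, service, mac(key, username, service))


def service_accepts(own_name: str, own_key: bytes, assertion) -> bool:
    """A service accepts only assertions addressed to it and made with its key."""
    username, service, tag = assertion
    return service == own_name and hmac.compare_digest(tag, mac(own_key, username, service))


key_a, key_b = b"service-a-secret", b"service-b-secret"
central = CentralService(
    {"spqr1": derive_key("spqr1", "s3cret")},
    {"A": key_a, "B": key_b},
)
ticket_b = workstation_request(central, "spqr1", "s3cret", "B")
relabelled = (ticket_b[0], "A", ticket_b[2])  # service B's attempt to reuse it at A
```

Service 'B' accepts `ticket_b`, but neither `ticket_b` nor the relabelled copy is accepted by service 'A', and at no point did either service see the password.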

In principle Raven could be extended to become a Kerberos central verification service (a 'KDC'). But using Kerberos comes at a cost. Firstly, the user's local workstation needs Kerberos software installed and configured. Secondly, the Kerberos protocol is significantly more complicated than a username/password exchange, so unless software already supports it, it is likely to be difficult to add. Further, Kerberos, like all other password systems, assumes that you can trust your local workstation. If it's been compromised, for example by a virus that installs a key logger or by a malicious or inept system administrator, then even Kerberos cannot keep password authentication secure.

The only upside is that Microsoft have adopted Kerberos (or at least, a version of Kerberos) for authentication under Windows, and so many computer users, especially those participating in a well-run Active Directory domain, may already have valid Kerberos credentials, and these may work with some Microsoft software.

Note that some software (for example this 'Kerberos' module for Apache, http://modauthkerb.sourceforge.net/) actually just misuses the initial Kerberos user authentication process by soliciting the user's username and plaintext password and then seeing if it can authenticate as the user. In doing so it is just using the Kerberos system as a central password verification service, with all the problems that this entails.

Where does Raven fit into this?

The current Ucam WebAuth system used by Raven, and a whole range of similar systems, depend on the client software in use being a web browser, and on browsers including as standard a fortuitous combination of features:

  • Support for HTTP redirects, allowing the client to be instructed to contact the authentication server directly, so that this communication bypasses the server initiating authentication;
  • Support for the https: protocol, providing both security for the user's password on the wire and a way for users to positively confirm that they are communicating with the real authentication server and not an imposter (and so it's safe to disclose their password); and
  • Provision of a user interface which can solicit a user name and password.

Ucam WebAuth has some features in common with Kerberos, but without the need to distribute or configure client software, though its use does require much more investment at the server end than a simple username/password system. It does, crucially, only require the user to disclose their password to the central authentication server, and provides a way for users to easily identify it. As a result it just manages to provide a reasonably reliable authentication system using passwords, but of course only in a web environment.
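The redirect step, and the check that a user is effectively making when they look at the address bar, might be sketched as below. The server address and the `url` parameter name are placeholders rather than the actual Ucam WebAuth protocol, and the signed response that the authentication server sends back is not shown.

```python
from urllib.parse import urlencode, urlsplit

# Hypothetical address of the central authentication server.
AUTH_SERVER = "https://auth.example.org/authenticate"


def redirect_to_auth_server(return_url: str) -> str:
    """Location header an application would send instead of asking for a
    password itself; the parameter name 'url' is an assumption."""
    return AUTH_SERVER + "?" + urlencode({"url": return_url})


def looks_like_auth_server(location: str) -> bool:
    """Rough analogue of the user's check: https, and exactly the right host."""
    parts = urlsplit(location)
    expected = urlsplit(AUTH_SERVER)
    return parts.scheme == "https" and parts.hostname == expected.hostname


location = redirect_to_auth_server("https://app.example.org/private?x=1")
```

The application never handles the password at all; the browser carries the user to the one place where it is safe to type it, and the https: certificate is what lets the user tell that place from an imitation.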