Shibboleth Attribute Usage and Derivation
This was a working document belonging to the Computing Service's Shibboleth Development Project. This project is complete (Raven now supports Shibboleth) and this document only remains for historical and reference purposes. Be aware that it is not being maintained and may be misleading if read out of context.
Introduction
Shibboleth's ability to support authorisation depends on IdPs releasing values for relevant attributes, and on IdPs and SPs agreeing their meaning. The choice of attributes and their meaning is a matter either for bilateral agreement between IdPs and SPs, or for agreement between the members of a federation.
As far as 'formal' Shibboleth use in the UK goes, the UK Access Management Federation for Education and Research is likely to be the driving force for the foreseeable future. This will be particularly true of the use of Shibboleth as a replacement for Athens. This document reviews the attributes recommended for use within the UK Federation and their definitions, and sets out options for populating them with particular reference to Athens replacement. A number of open questions appear in the text.
UK Federation attribute policy
The UK Federation defines four attributes that identity providers are recommended to support, and that service providers should consider when setting attribute requirements. From [UKTRP] sec. 7:
- eduPersonScopedAffiliation. This attribute indicates the user's relationship (e.g., staff, student, etc.) with the organisation. For many applications, examination of this attribute is sufficient to determine whether the user has sufficient privilege to access the resource.
- eduPersonTargetedID. If a service provider is presented only with the affiliation of an anonymous subject, as provided by eduPersonScopedAffiliation, it cannot provide service personalisation or usage monitoring across sessions. These capabilities are enabled by the eduPersonTargetedID attribute, which provides a persistent user pseudonym, distinct for each service provider.
- eduPersonPrincipalName. This attribute is used where a persistent user identifier, consistent across different services, is required. It often corresponds to the user's single sign-on (SSO) name, and may be useful for securing both internal institutional services and external services where access control lists are used.
- eduPersonEntitlement. This attribute enables an organisation to assert that a user satisfies an additional set of specific conditions that apply for access to a particular resource. A user may possess different values of the eduPersonEntitlement attribute relevant to different resources.
Federation policy recommends that wherever possible only the first two attributes (eduPersonScopedAffiliation and eduPersonTargetedID) should be required by SPs.
The four core attributes are derived from definitions in the EduPerson Object Class Specification ([eduPerson03] and/or [eduPerson06]), but note that the definition of eduPersonPrincipalName changed significantly between [eduPerson03] and [eduPerson06].
Secs. 7.3 and 7.5 of [UKTRP] acknowledge that additional attributes may be required in some cases. Sec. 7.4 recommends that these should if possible be taken from the eduPerson object class ([eduPerson06], [eduPerson03]), the person and organizationalPerson object classes (X.521), or the inetOrgPerson object class (RFC 2798). MIMAS Landmap, for example, appears to optionally accept givenName, sn, ou, and mail, presumably to support account provisioning. It is likely that these and other similar attributes would be useful for Shibboleth deployments within the University.
Core attribute definition and use
Three of the four core attributes are 'scoped' and share a common syntax: local-part@security-domain, where local-part is attribute-specific and security-domain is a DNS name that the federation operator has verified is registered to the identity provider's owner. In effect this provides a partitioned namespace within which IdPs can create unique names. According to [UKTRP] sec. 7.1.1 "Institutions in the HE/FE sector are recommended to use their principal institutional domain name as their scope".
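The shared scoped syntax can be illustrated with a short Python sketch. The function name parse_scoped is hypothetical and chosen only for illustration. The sketch assumes that the scope is everything after the last '@', and it makes no attempt to check the scope against federation metadata.

```python
from typing import Tuple


def parse_scoped(value: str) -> Tuple[str, str]:
    """Split a scoped attribute value into (local_part, security_domain)."""
    # Assumption: a DNS name cannot contain '@', so the scope is taken to be
    # everything after the last '@' in the value.
    local_part, sep, scope = value.rpartition("@")
    if not sep or not local_part or not scope:
        raise ValueError(f"not a scoped value: {value!r}")
    # DNS names are case-insensitive, so the scope is normalised to lower case.
    return local_part, scope.lower()


print(parse_scoped("member@cam.ac.uk"))
```

A value with no '@', or with an empty local part or scope, is rejected rather than guessed at.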
Specific issues for the four core attributes are as follows:
eduPersonScopedAffiliation (ePSA)
This is defined in [UKTRP] sec. 7.1.2 and [eduPerson06] sec. 2.2.9. [UKTRP] says:
- "This attribute enables an organisation to assert its relationship with the user. This addresses the common case where a resource is provided on a site licence basis, and the only access requirement is that the user is a bona fide member of the organisation, or a specific school or faculty within it.
- "The attribute is multi-valued (that is, a user can have more than one value for the attribute), and is structured as a scoped attribute, with the form affiliation@security-domain, where affiliation is one of a number of prescribed categories of user."
The prescribed user categories are student, staff, faculty, employee, member, affiliate, and alum. Limited guidance on the expected use of these categories is given in [UKTRP] secs. 7.1.2.1 and 7.1.2.3, and [eduPerson06] sec. 2.2.1. In particular, [eduPerson06] says of the 'member' value:
- "'Member' is intended to include faculty, staff, student, and other persons with a basic set of privileges that go with membership in the university community (e.g., they are given institutional email and calendar accounts). It could be glossed as "member in good standing of the university community."
and
- "Each institution decides the criteria for membership in each affiliation classification."
Unofficial discussion with UK Federation staff suggests that the woolliness of this definition is deliberate, and is intended to give organisations latitude in deciding what constitutes "member in good standing" within their community. There appears to be an assumption that contracts between universities and content providers have a similar degree of woolliness (they do, after all, tend to allow IP address based 'authentication' which is similarly woolly), or can be mutually agreed to be compatible. There is a suggestion that, at least in their default state, the JISC standard contracts will eventually be aligned with this. SPs who wish to be more selective about licensing have the option of requiring an eduPersonEntitlement (see below).
EDINA are already using an ePSA value of 'member@cam.ac.uk' to grant access to 'Film and Sound Online' and appear to be happy with the inherent woolliness. There are currently no known examples of the use of categories other than 'member'.
ePSA is multi-valued, allowing all possible combinations of categories to be expressed. ePSA, in itself, is unlikely to constitute personal data within the meaning of the Data Protection Act 1998.
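As a rough illustration of how an SP might consume ePSA, the following Python sketch checks released values against the prescribed categories and then applies a 'member'-only access rule of the kind EDINA is described as using. The function names and the list of released values are hypothetical; this is a sketch of the idea, not any SP's actual implementation.

```python
# The seven prescribed affiliation categories listed above.
PRESCRIBED_AFFILIATIONS = frozenset(
    ["student", "staff", "faculty", "employee", "member", "affiliate", "alum"]
)


def is_valid_epsa(value, expected_scope):
    """True if value is affiliation@scope with a prescribed affiliation."""
    affiliation, sep, scope = value.rpartition("@")
    return bool(sep) and affiliation in PRESCRIBED_AFFILIATIONS and scope == expected_scope


def is_member(epsa_values, scope):
    """True if any released value asserts 'member' for the given scope."""
    return f"member@{scope}" in epsa_values


# Hypothetical set of values released for one user (ePSA is multi-valued).
released = ["staff@cam.ac.uk", "member@cam.ac.uk"]
print(all(is_valid_epsa(v, "cam.ac.uk") for v in released))
print(is_member(released, "cam.ac.uk"))
```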
eduPersonTargetedID (ePTID)
ePTID is described in [UKTRP] sec. 7.1.3 and [eduPerson06] sec 2.2.10. It is intended to support functions such as personalisation in a way that does not reveal the user's identity or allow collusion between SPs. Its value is a persistent, opaque user identifier whose value is different for each SP to which it is released. It has the form pseudonym@security-domain.
A number of existing Shibbolised services in the UK Federation make use of ePTID if it is available (see http://www.ukfederation.org.uk/content/Documents/AttributeUsage).
ePTID is theoretically multi-valued, but in practice only a single value will ever be released at one time. Federation documentation, in particular the Federation's Recommendations for Use of Personal Data [UKRUPD], suggests that ePTID is unlikely to constitute personal data within the meaning of the Data Protection Act 1998. It is not clear that this is the case. If it isn't, then steps will be required to comply with the data protection principles set out in the Act.
- Question: Is ePTID 'Personal Data' under the terms of the Data Protection Act?
- Question: If so, how should we best comply with the data protection principles set out in the Act in respect of ePTID, including the likelihood of transfer outside Europe?
eduPersonPrincipalName (ePPN)
This attribute is described in [UKTRP] sec. 7.1.4 and [eduPerson06] sec 2.2.8. It represents a persistent, unique user identifier which is consistent across all services. A typical use case would be a resource protected by an access control list which is updated using values supplied out-of-band. To support this use there is an argument for constructing ePPN from values with which users are already familiar.
ePPN has the form local-name@security-domain. Despite the apparent similarity, it should not be confused with Kerberos identifiers or email addresses. [UKTRP] recommends that the value of ePPN should correspond to the identifier which a user presents when authenticating to local institutional services.
Within the UK Federation, currently only MIMAS Landmap and the Shibboleth project wiki require ePPN, though it's not clear that the Shibboleth wiki should.
ePPN is single-valued. It is almost inevitable that ePPN constitutes personal data within the meaning of the Data Protection Act 1998. Any exchange of this attribute will have to comply with the data protection principles set out in the Act.
- Question: How should we best comply with the data protection principles set out in the Data Protection Act in respect of ePPN, including the likelihood of transfer outside Europe?
eduPersonEntitlement (ePE)
See [UKTRP] sec. 7.1.5 and [eduPerson06] 2.2.2. This provides an escape mechanism allowing an IdP to assert one or more entitlements, typically specified by an SP, on behalf of particular IdP users. An example use case would be asserting that a particular user is entitled to access a particular resource under the terms of the relevant licence.
Values for ePE have the form of Uniform Resource Identifiers (URI), most frequently using the 'http' or 'urn' schemes. In the case of a value using the 'http' scheme, [UKTRP] recommends but does not require that the value resolve to a document giving the definition of the value.
Three uses have already emerged in practice:
- EDINA Film and Sound Online requires that an ePE of urn:mace:ac.uk:sdss.ac.uk:entitlement:emol.sdss.ac.uk:restricted be released for anyone qualified to access their medically-restricted material.
- The Eduserv Shibboleth-to-Athens gateway requires that an ePE (of an unknown form) is released to indicate the Athens permission set(s) that should be associated with the user on the Athens side of the gateway.
- JSTOR is not yet available via Shibboleth in the UK, but in the US they require that an ePE value of urn:mace:dir:entitlement:common-lib-terms be asserted for anyone entitled to access the resource. This US-specific value appears to be (ill-)defined at http://middleware.internet2.edu/urn-mace/urn-mace-dir-entitlement.html
ePE is multi-valued, though in practice only entitlements relevant to a particular SP will be released to it. ePE, in itself, is unlikely to constitute personal data within the meaning of the Data Protection Act 1998. However, there may be circumstances (see [UKRUPD] sec. 3.2.4) in which ePE values could represent personal data or even sensitive personal data.
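The URI form of ePE values, and the way an SP might test for a required entitlement, can be sketched in Python as follows. The function names are hypothetical, and the check is a plain string comparison on the released values; it is an illustrative assumption rather than a description of how any of the services above actually evaluate entitlements.

```python
from urllib.parse import urlparse

# Entitlement value quoted above for JSTOR in the US.
COMMON_LIB_TERMS = "urn:mace:dir:entitlement:common-lib-terms"


def entitlement_scheme(value):
    """Return the URI scheme of an ePE value, or '' if it has none."""
    return urlparse(value).scheme.lower()


def has_entitlement(released_values, required):
    """True if the required entitlement is among the released ePE values."""
    return required in released_values


# Hypothetical values released to one SP.
released = [COMMON_LIB_TERMS]
print(entitlement_scheme(COMMON_LIB_TERMS))
print(has_entitlement(released, COMMON_LIB_TERMS))
```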
Populating Attributes
The standard Internet2 IdP includes reasonably flexible support for obtaining attribute values from external sources such as LDAP directories or SQL databases based on a user's authenticated ID (a CrsID in our case). It also includes support for manipulating values to derive new ones. The expectation is that the data needed to drive a University IdP will, where possible, come from lookup.
In deciding how attributes will be populated there are in effect three possibilities:
- All the information needed is already available in lookup. This is the easiest option and requires only configuration of the IdP.
- At least some of the information needed is not currently in lookup but is available on-line in the University. To use this it will either be necessary to feed it to lookup, or to give the IdP access by some other route. Data from the Computing Service's Jackdaw database, even if not in lookup, may be reasonably easy to access. An issue here is that lookup is in the middle of a major rewrite but there are currently no staff available to complete this work.
- At least some of the information needed is not currently available on-line. To make use of such information someone will have to maintain it manually and it will somehow have to be made available to the IdP. Groups in lookup could perhaps be used for this, but managing groups of any significant size is likely to be cumbersome. It is possible that entitlement to access some existing electronic resources, at least according to current criteria, will fall into this category.
An additional complication is that individual users are entitled to suppress almost all of their lookup data, and that this suppression applies equally to access to data by automated systems such as a Shibboleth IdP. It is not considered acceptable to force users to make information visible simply to enable them to access particular resources. In theory the IdP could be given privileged access to suppressed data, but that then raises the question of how the user's express wish that data be suppressed should be expressed in the values supplied as Shibboleth attributes.
- Question: Should a central IdP have privileged access to data that users have suppressed in lookup? If so, how?
- Question: If so, how should the fact that data was suppressed be expressed in the IdP's policy on releasing attribute values?
The Internet2 IdP supports the concept of Attribute Release Policies. Such policies are applied on a system-wide and/or per-user basis and define what attributes and attribute values can be disclosed to particular SPs. These could perhaps be used to give users control over their own information, but the author has yet to see a user interface to this functionality, other than direct editing of XML. An assumption seems to be that users will establish their attribute release policies in advance of any authentication that will result in attribute release. An alternative approach might be to modify the IdP to tell users exactly what is about to be disclosed about them, and to give them the option of suppressing particular items. Without support for a 'remember this decision in future' feature, doing this on every authentication is likely to rapidly become tedious. In any case, data protection considerations require that informed consent be obtained before at least some attribute values are released under at least some circumstances.
- Question: How can users be given effective, informed, usable control over attribute values released on their behalf?
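The idea behind attribute release policies, combined with per-user suppression, can be sketched in Python as below. This is not the Internet2 IdP's XML policy format; the policy table, the SP entity ID, the default policy and the function name are all hypothetical and serve only to show a system-wide policy being narrowed by a user's own choices.

```python
# Hypothetical system-wide policy: SP entity ID -> attribute names it may receive.
RELEASE_POLICY = {
    "https://sp.example.org/shibboleth": {
        "eduPersonScopedAffiliation",
        "eduPersonTargetedID",
    },
}
# Hypothetical default for SPs with no specific entry.
DEFAULT_RELEASE = {"eduPersonScopedAffiliation"}


def filter_attributes(sp_entity_id, attributes, user_suppressed=frozenset()):
    """Return only the attributes that policy and the user both allow."""
    allowed = RELEASE_POLICY.get(sp_entity_id, DEFAULT_RELEASE)
    return {
        name: values
        for name, values in attributes.items()
        if name in allowed and name not in user_suppressed
    }


user_attributes = {
    "eduPersonScopedAffiliation": ["member@cam.ac.uk"],
    "eduPersonPrincipalName": ["fjc55@cam.ac.uk"],
}
print(filter_attributes("https://sp.example.org/shibboleth", user_attributes))
```

In this sketch a user's suppression can only remove attributes from what policy allows; it can never add to it.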
Issues relating to specific attributes:
eduPersonScopedAffiliation
From the discussion above it seems clear that a value of 'member' for ePSA should be asserted in line with the "member in good standing" or "bona fide member of the organisation" criteria from the relevant documentation and not, for example, necessarily in line with existing criteria used to issue Athens accounts. Just how useful this attribute will prove to be will depend on the future attitude of SPs.
On this basis it seems that possession of a Raven account, or perhaps appearing in lookup, could be sufficient to grant 'member' status. We do, for example, grant access to our Lapwing wireless network to anyone with a Raven account. In theory it would be a good idea to move away from granting resources purely based on possession of a Raven account, but in the absence of a widely-supported and widely understood "member in good standing" attribute in our data stores this may be the best we can manage.
- Question: Is it likely to be acceptable to assert an ePSA value of 'member@cam.ac.uk' based on general criteria such as "member in good standing" or "bona fide member of the organisation"?
- Question: If so, should this depend on possession of a Raven account, on appearing in lookup, or some other criteria (what)?
It may be possible to assert other ePSA values given additional data - Jackdaw maintains 'Staff' and 'Student' flags for example, though the coverage is not complete. Since there are no current use cases for such information it may not yet be worth doing.
- Question: Is there currently any value in asserting any values of ePSA other than 'member'? If so, on what basis?
eduPersonTargetedID
The Internet2 IdP includes support for deriving ePTID algorithmically, in either its [eduPerson03] or [eduPerson06] format, from a hash of a secret and the IDs of the user, the IdP and the SP.
The drawback to this is that if any of these data items change then so will the corresponding ePTID, with the result that the user will lose access to the 'account' it represents. This approach also makes it impossible to give the user a new ePTID for a particular service. This might be required, for example, if the anonymity of their existing ID were to be compromised.
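The algorithmic approach and its drawback can be illustrated with a minimal Python sketch. It assumes HMAC-SHA256 over the three identifiers, which is a choice made here for illustration and is not claimed to be the algorithm the Internet2 IdP uses. The function name, secret and entity IDs are likewise hypothetical.

```python
import hashlib
import hmac


def computed_eptid(secret, user_id, idp_id, sp_id, scope):
    """Derive a pseudonym@scope value from a secret and three identifiers."""
    # Assumption: the identifiers are joined with NUL so that different
    # combinations cannot produce the same input string.
    message = "\0".join([user_id, idp_id, sp_id]).encode("utf-8")
    digest = hmac.new(secret, message, hashlib.sha256).hexdigest()
    return f"{digest[:32]}@{scope}"


secret = b"example-secret"  # hypothetical; a real secret must be kept private
idp = "https://idp.example.org/shibboleth"  # hypothetical IdP entity ID
a = computed_eptid(secret, "fjc55", idp, "https://sp-a.example.org/sp", "cam.ac.uk")
b = computed_eptid(secret, "fjc55", idp, "https://sp-b.example.org/sp", "cam.ac.uk")
print(a != b)  # each SP sees a different pseudonym for the same user
```

Because the value is a pure function of its inputs, nothing needs to be stored, but a change to the secret or to any of the identifiers silently produces a different pseudonym.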
The alternative is to generate random ePTID values the first time a user accesses a particular service via a particular IdP and to store this value for future use. The downside with this is the need to provide appropriate secure storage.
- Question: Should ePTID be derived algorithmically or should it be generated at random and stored for future reuse? If the latter, where?
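The stored-random alternative can be sketched in Python as follows. The class and method names are hypothetical, and the in-memory dictionary merely stands in for the secure persistent storage that a real deployment would need.

```python
import secrets


class StoredTargetedIds:
    """Issue a random pseudonym per (user, SP) pair and remember it."""

    def __init__(self, scope):
        self._scope = scope
        self._store = {}  # stand-in for secure persistent storage

    def get(self, user_id, sp_id):
        key = (user_id, sp_id)
        if key not in self._store:
            # First access to this SP: generate and keep a random value.
            self._store[key] = secrets.token_hex(16)
        return f"{self._store[key]}@{self._scope}"

    def reset(self, user_id, sp_id):
        # Forget the stored value so that the next access issues a new one,
        # for example after the old pseudonym's anonymity is compromised.
        self._store.pop((user_id, sp_id), None)


ids = StoredTargetedIds("cam.ac.uk")
first = ids.get("fjc55", "https://sp.example.org/sp")
print(first == ids.get("fjc55", "https://sp.example.org/sp"))
```

Unlike a computed value, a stored value survives changes to the identifiers used to look it up only if the store is migrated accordingly, and it can be replaced on request.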
eduPersonPrincipalName
Following UK Federation guidelines, the ePPN for a user with a CrsID of fjc55 would be fjc55@cam.ac.uk. This is likely to be textually identical to their email address.
There are two potential problems with this: it will be difficult to convince users that we are not releasing their email address when we release their ePPN, and some SPs (particularly internal ones) may try to use ePPN as an email address which will work for some but not all of our users.
A recent discussion on the JISC-SHIBBOLETH@JISCMAIL.AC.UK and UKFEDERATION-DISCUSS@JISCMAIL.AC.UK mailing lists included the following (from Ian Young <ian@iay.org.uk>):
- "If you use something other than your institutional domain name, you'll be doing something different to everyone else. This in turn means that you'll have to have a conversation with every SP explaining your choice to them."
and:
- "You could do this [use some persistent identifier other than CrsID], it would certainly be less hard to explain to other people than an odd choice of scope. It's not what eduPerson says (they talk in terms of being able to use the LHS of ePPN for local authentication) but the main practical implication is going to be that people won't know what their ePPN is, which may cause some support issues."
In practice it looks as if it would be a courageous decision not to use CrsID as the local part of ePPN, or not to use cam.ac.uk as the scope for our institutional IdP.
- Question: Do we have any realistic alternative to using CrsID as the local part of ePPN and 'cam.ac.uk' as the scope?
eduPersonEntitlement
It is not yet clear how much use this attribute will get, since SPs could rely on ePSA in many cases. However, in practice at least some and perhaps many SPs will demand that their own ePE values be asserted for users that meet their own particular licensing conditions.
While tedious, this may not be as difficult to achieve as it appears at first sight. We currently manage to grant Athens accounts on an 'all-or-nothing' basis so it should be possible to grant a corresponding block of ePE values on the same basis. The complication, which we may in any case have to address to support asserting the ePE values required by the Shibboleth-to-Athens gateway, is that our current Athens policy appears to rely on information (such as identification of Visiting Scholars and persons with affiliated status, exclusion of non-academically active retired staff, etc.) which is not currently available on-line. Exact replication of the current policy is likely to require that lists of at least some categories of users be manually maintained by somebody, presumably the Library for cases that apply directly to library-based electronic resources.
Other use cases for ePE will presumably emerge in due course.
- Question: Is it likely that we can establish common criteria for release of at least ePE values relating to library-based electronic resources?
- Question: Is it necessary to use current Athens account criteria for this?
- Question: If so, how can we arrange access to the information necessary to evaluate these criteria?
Other attributes
Values for 'simple' attributes, such as givenName, sn, ou, and mail, could in general be derived from data in lookup. A notable exception is givenName, which on principle the Computing Service does not maintain. More complex attributes could perhaps be populated, though without examples or use cases it is difficult to speculate.
References
- [eduPerson03]
- Internet2 Middleware Architecture Committee for Education, Directory Working Group (MACE-Dir). EduPerson Object Class Specification (200312). Document ID Internet2-mace-dir-eduPerson-200312. http://www.nmi-edit.org/eduPerson/internet2-mace-dir-eduperson-200312.html
- [eduPerson06]
- Internet2 Middleware Architecture Committee for Education, Directory Working Group (MACE-Dir). EduPerson Object Class Specification (200604). Document ID internet2-mace-dir-eduPerson-200604. http://www.nmi-edit.org/eduPerson/internet2-mace-dir-eduperson-200604.html
- [UKTRP]
- UK Access Management Federation for Education and Research: Technical Recommendations for Participants. Document ID ST/AAI/UKF/DOC/003. http://www.ukfederation.org.uk/library/uploads/Documents/technical-recommendations-for-participants.pdf
- [UKRUPD]
- UK Access Management Federation for Education and Research: Recommendations for Use of Personal Data. Document ID ST/AAI/UKF/DOC/004. http://www.ukfederation.org.uk/library/uploads/Documents/recommendations-for-use-of-personal-data.pdf