More on eduPersonPrincipalName
Update 2007-07-20: we do not plan to go down this route, and will be using <crsid>@cam.ac.uk as eduPersonPrincipalName.
An attribute commonly used with Shibboleth is eduPersonPrincipalName (ePPN; see the eduPerson specification). It represents a persistent, globally unique user identifier for an individual. My reading of the various standards and specifications had led me to conclude that for us it would have to be of the form firstname.lastname@example.org, which is inconveniently similar to a valid email address (see Shibboleth Attribute Usage and Derivation for more details) and includes things like the user's initials, some indication of how long they have been associated with the University, etc. Recent correspondence has identified one early adopter (Cardiff) already using something other than their normal user-id as the local part of ePPN, and one who intends to. This leads me to reconsider the issue.
Ideally, the local part of an ePPN would
- be unique within the current and future University population and never reassigned
- be clearly not a CRSid or other University identifier
- be reasonably easy to communicate, e.g. by telephone, though not necessarily memorable
- leak no other information about the person to whom it is assigned
I don't believe any existing user data meets these requirements, almost by definition. This, regrettably, suggests we have to assign a new identifier.
I suggest we use a randomly assigned, entirely numeric identifier for this purpose and make the resulting value of eduPersonPrincipalName available in the 'Identity' section of lookup. The identifier could be assigned either by Jackdaw or lookup. The assigning system would need, at a minimum, to record all such identifiers ever allocated to ensure uniqueness.
CRSid space appears to include about 4,500,000,000 different identifiers. Jackdaw's UID counter currently stands at approximately 160,000, growing at approximately 10,000/year. At this rate a 6-digit number would last us until about 2090 and a 7-digit one to almost the end of the millennium. To ease random generation, it would be helpful if the number of available identifiers were significantly bigger, say by a factor of 10, than the maximum number of identifiers that will ever be needed.
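The capacity arithmetic above can be checked with a short sketch. The starting count and growth rate come from the figures quoted; the start year of 2007 is an assumption taken from the date of the update note, and the function names are purely illustrative.

```python
def exhaustion_year(digits, start_year=2007, allocated=160_000, per_year=10_000):
    """Year in which a sequential counter of `digits` digits would run out.

    start_year is an assumption (the date of this post); allocated and
    per_year are the approximate Jackdaw figures quoted in the text.
    """
    capacity = 10 ** digits
    return start_year + (capacity - allocated) // per_year


def collision_probability(used, pool):
    """Chance that one uniformly random draw from `pool` values is already used."""
    return used / pool


if __name__ == "__main__":
    for digits in (6, 7):
        print(digits, "digits last until about", exhaustion_year(digits))
    # Worst case for random allocation: 10,000,000 identifiers already issued
    # out of the 900,000,000 nine-digit values from 100,000,000 upwards.
    print("retry chance:", collision_probability(10_000_000, 900_000_000))
```

With these assumptions a 6-digit counter lasts until about 2091 and a 7-digit one until about 2991. Even if every 7-digit-sized allocation were drawn at random from a 9-digit pool, only around 1 draw in 90 would need to be retried.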
I suggest we use a 9-digit number starting at 100,000,000, formatted as three blocks of three (both for ease of reading and to distinguish it from, e.g. University Staff or Student number). So for example
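As a rough sketch of how such an identifier might be allocated and displayed: the code below draws a random 9-digit number, retries if it has been issued before, and formats it as three blocks of three. The names `allocate` and `format_identifier`, the in-memory `used` set (standing in for whatever record Jackdaw or lookup would keep), and the hyphen separator are all assumptions for illustration, not part of the proposal.

```python
import secrets

LOW = 100_000_000   # smallest permitted value, as proposed in the text
HIGH = 999_999_999  # largest 9-digit value


def allocate(used):
    """Return a random identifier not in `used`, and record it as issued.

    `used` is a set standing in for the persistent record of every
    identifier ever allocated (hypothetical; a real system would store
    this in a database).
    """
    while True:
        candidate = LOW + secrets.randbelow(HIGH - LOW + 1)
        if candidate not in used:
            used.add(candidate)
            return candidate


def format_identifier(n, sep="-"):
    """Format a 9-digit identifier as three blocks of three digits.

    The separator is an assumption; the text does not specify one.
    """
    digits = f"{n:09d}"
    return sep.join((digits[0:3], digits[3:6], digits[6:9]))


if __name__ == "__main__":
    issued = set()
    print(format_identifier(allocate(issued)))
```

Because the pool is far larger than the number of identifiers ever likely to be issued, the retry loop will almost always succeed on the first draw.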
Note that I believe such an identifier will have exactly the same properties under the Data Protection Act as one composed from a CRSid.