Moving the UID/GID of a user
This page documents the recommended procedure for moving the UID and/or GID of a user in the CL Unix LDAP tables.
Background
Modern POSIX operating systems (Linux, macOS, FreeBSD) expect users to have a numeric user identifier (UID) of at least 1000, as lower UIDs are reserved for pseudo-users allocated by the operating-system vendor. However, when the Computer Laboratory first used Unix file systems in the mid-1980s, that limit was still 100. As a result, as of 2016, we still have 20 users with UID < 1000.
A campaign to renumber these users and their files will have several benefits:
- their user identifier will no longer be confused with the identity of system pseudo-users (e.g., files owned by maj1 will finally show up in “ls -l” as owned by “maj1” and not as e.g. “avahi-autoipd”)
- the operating system will recognize them as regular users, enabling operations (such as certain xdm login functions) that are blocked for UIDs < 1000
- incorporating the LDAP tables on a self-managed computer will no longer risk interfering with the local /etc/passwd tables
- LDAP users with UID ≥ 500 can be incorporated on self-managed machines via libnss-extrausers
- various local configuration hacks (e.g. changing the minimum_uid parameter of pam_krb5.so) that work around the default UID ≥ 1000 assumption can be dropped on lab-managed Linux machines (improving defense in depth)
Affected users
User               CRSID  old_UID  old_GID  new_UID  new_GID  migration date/time
---------------------------------------------------------------------------------------
Martyn Johnson     maj1   101      60101    tbd      tbd      tbd
Piete Brooks       pb22   104      60104    tbd      tbd      tbd
Arthur Norman      acn1   107      60107    tbd      tbd      tbd
Mike Gordon        mjcg   110      60110    tbd      tbd      tbd
Peter Robinson     pr10   111      60111    tbd      tbd      tbd
Jon Fairbairn      jf15   128      60128    tbd      tbd      tbd
Larry Paulson      lp15   138      60138    tbd      tbd      tbd
Caroline Blackmun  ceb4   145      60145    tbd      tbd      tbd
Graham Titmus      gt19   178      60178    tbd      tbd      tbd
Ian Leslie         iml1   243      60243    tbd      tbd      tbd
Glynn Winskel      gw104  244      60244    tbd      tbd      tbd
Andy Hopper        ah12   260      60260    tbd      tbd      tbd
Alan Mycroft       am21   300      60300    tbd      tbd      tbd
Frank King         fhk1   301      60301    tbd      tbd      tbd
Martin Richards    mr10   302      60302    tbd      tbd      tbd
Derek McAuley      drm10  336      60336    tbd      tbd      tbd
Jean Bacon         jmb25  341      60341    tbd      tbd      tbd
Robin Fairbairns   rf10   344      60344    tbd      tbd      tbd
Ted Briscoe        ejb1   412      60412    tbd      tbd      tbd
Ann Copestake      aac10  432      60432    tbd      tbd      tbd
List produced with
/homes/mgk25/proj/filer/ldap_uids.pl -r100,499 -L
Moving process
Preparation
- Inform user of the planned date and time of the change and advise them to log out and (ideally) also terminate long-running processes during the migration period.
- Ask the user on which POSIX file systems other than those on elmer they own files (local disks of desktops, servers and virtual machines connected to the Unix LDAP servers).
- Make a note of their old numeric UID and GID (e.g. in table above). We will refer to these as $old_UID and $old_GID.
Actual move
- Update their UID and GID in the administrative database to the new values, and make a note of these (e.g. in table above). We will refer to these as $new_UID and $new_GID. As per the new departmental UID/GID allocation plan:
- If the user is a person identified by CRSID: make sure 1100 ≤ $new_UID = $new_GID < 9000.
- For a pseudo-user: make sure 9000 ≤ $new_UID = $new_GID < 9500.
- Wait until the new UID and GID have propagated to all the LDAP servers, as well as to the name server database cache (NSDB) on elmer. The latter can be forced with the nfs nsdb flush command.
- Make lists of directory prefixes of the locations of files that still have the old UID/GID:
ugid-find uid=$old_UID prefixes dirnames print >uiddir-$old_UID
ugid-find gid=$old_GID prefixes dirnames print >giddir-$old_GID
- Review the directory prefix lists to see whether they look plausible, and manually fix where needed. (This is especially important if the renumbering is needed to resolve a collision, i.e. where multiple people have used the same numeric identifier in the past.)
- Log into one of the omnipotent NFSv3 clients. Check with "mount" that it really mounts elmer with nfsvers=3 (so we can see untranslated UID/GID values). Then run as "root":
xargs -r <uiddir-$old_UID chown -hcR --from=$old_UID $new_UID
xargs -r <giddir-$old_GID chgrp -hcR --from=$old_GID $new_GID
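The allocation rule from the first step of this list can be checked mechanically before anything is written to the administrative database. The following is only a minimal sketch: check_new_ids is a hypothetical helper name, not part of the lab tooling, and it encodes nothing beyond the two ranges quoted above.

```shell
# check_new_ids KIND NEW_UID NEW_GID
# KIND is "person" (identified by CRSID) or "pseudo" (pseudo-user).
# Succeeds only if NEW_UID equals NEW_GID and lies in the range quoted
# in the allocation plan for that kind of user.
check_new_ids() {
    kind=$1 uid=$2 gid=$3
    case $kind in
        person) lo=1100 hi=9000 ;;   # 1100 <= id < 9000
        pseudo) lo=9000 hi=9500 ;;   # 9000 <= id < 9500
        *) echo "unknown kind: $kind" >&2; return 2 ;;
    esac
    [ "$uid" -eq "$gid" ] && [ "$uid" -ge "$lo" ] && [ "$uid" -lt "$hi" ]
}
```

For example, check_new_ids person "$new_UID" "$new_GID" || echo "outside allocation plan" would flag a value that does not fit the plan.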
Completion
- Finally, notify the user that they can log in again.
- Remind the user to run the following commands as root on their own local file systems, or do it for them, as agreed:
chown -hcR --from=$old_UID $new_UID path ...
chgrp -hcR --from=$old_GID $new_GID path ...
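Before running these commands on a local file system, it can be useful to list what they would change. The following is a sketch using standard find options; list_owned is a hypothetical helper name, and the use of -xdev (to avoid wandering into other mounted file systems, such as NFS mounts of the filer) is a suggestion rather than part of the procedure above.

```shell
# list_owned OLD_UID OLD_GID PATH...
# Print every file below PATH... that still has the old numeric owner or
# the old numeric group, staying on the file system each PATH lives on.
list_owned() {
    uid=$1 gid=$2
    shift 2
    find "$@" -xdev \( -uid "$uid" -o -gid "$gid" \) -print
}
```

A call such as list_owned "$old_UID" "$old_GID" path ... prints the candidates without modifying anything.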
Some notes on the process
Dealing with Windows NTFS files
The filer can store files with either Unix or Windows NTFS access-control attributes. We neither need nor want to touch files with Windows NTFS attributes, for two reasons:
- Executing a chown/chgrp on a Windows NTFS file will turn it into a Unix file, and potentially destroy access-control list information as a result.
- There is no need to chown/chgrp a Windows NTFS file, because such files have Windows SIDs instead of UIDs.
The trick is to first change the UID and GID of a user in the Unix LDAP tables, and then wait until that change has propagated through to the NetApp name server database cache. When a Unix user queries an NTFS file with stat(), the NetApp will first translate the SID of the file via an Active Directory LDAP lookup into a user name (CRSID). It will then translate that CRSID via a Unix LDAP lookup into the UID that stat() shows.
Once the UID and GID change in the Unix LDAP server has reached the NetApp name server database cache, all Windows NTFS files will already show the new (emulated) UID/GID via NFSv3, and will therefore be ignored by the subsequent ugid-find and chown/chgrp invocations. This way, only Unix files are affected by the migration.
NFS version
It is very important that the commands
- ugid-scan
- ugid-find
- chown
- chgrp
are only executed on NFSv3 clients. These commands deal with UID/GID numbers that are no longer listed in LDAP, and an NFSv4 server would map all of these to nobody/nogroup. NFSv4 does not communicate numeric UID or GID values; instead it uses LDAP to translate them into “CRSID@cl.cam.ac.uk” strings on the wire, which would be counterproductive here.
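The manual check of the "mount" output mentioned in the procedure can be scripted. The following is a sketch under assumptions: is_nfsv3 is a hypothetical helper name, the filer is assumed to appear as "elmer:" at the start of the mount source, and the option is assumed to be reported as vers=3 or nfsvers=3 in Linux-style "mount" output.

```shell
# is_nfsv3 SERVER
# Read "mount"-style lines on stdin. Succeed only if at least one line has
# a source starting with "SERVER:" and every such line carries the option
# vers=3 or nfsvers=3 (other options such as mountvers=3 are not accepted).
is_nfsv3() {
    awk -v srv="$1" '
        index($1, srv ":") == 1 {
            seen = 1
            if ($0 !~ /(^|[(, ])(nfs)?vers=3([,) ]|$)/) bad = 1
        }
        END { exit (seen && !bad) ? 0 : 1 }
    '
}
```

Used as mount | is_nfsv3 elmer || echo "not a pure NFSv3 client", it gives a quick sanity check before any chown or chgrp is run.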
Kerberos/AD identity
During the migration process, the Active Directory Kerberos identity (principal name, SID, etc.) remains unaffected. We are only changing the LDAP UID/GID. Therefore we are not creating a new user identity and deleting an old one in the Active Directory. This way, there will be hardly any disruption to Windows users, and all Kerberos and Active Directory metadata associated with the user remains intact.
Dealing with odd filenames
The above recipe uses the ugid-find “print” command to generate a list of LF-terminated path names. This makes it easy to manually review and edit the generated list of directory prefixes with a text editor. However, this format breaks if one of the listed pathnames contains a linefeed character. This is very unlikely in Unix-originated filenames, but does occasionally occur with files generated by Windows Explorer. The ugid-find “print0” command and the xargs option -0 can be used to exchange a NUL-terminated list of pathnames, which is immune to such problems.
A more sensible approach is to rename any file whose name contains an LF, since such a name is very likely an accident.
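Files with an LF in their name can be located ahead of the migration. The following is a sketch using find; list_lf_names is a hypothetical helper name, and the output is NUL-terminated so that the odd names themselves survive intact.

```shell
# list_lf_names PATH...
# Print, NUL-terminated, every path below PATH... whose final component
# contains a linefeed character.
list_lf_names() {
    nl='
'
    find "$@" -name "*${nl}*" -print0
}
```

The result can be passed on with xargs -0, for example list_lf_names path ... | xargs -0 -r ls -ld, to inspect the affected files before renaming them.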
Dealing with long-running processes
If the user has long-running processes that they do not want to stop for the duration of the migration, one possible approach is to have two separate migrations, one for the UID and one for the GID. The user then would have to arrange that the long-running processes retain file access during the migration through at least one of these, the UID or the GID, while the other one is being moved. This still requires that these processes are restarted (after a new login) between the UID and GID move, such that they benefit from the new identity after the first half of the migration.