Moving the UID/GID of a user

From Computer Laboratory System Administration

This page documents the recommended procedure for moving the UID and/or GID of a user in the CL Unix LDAP tables.

Background

Modern POSIX operating systems (Linux, macOS, FreeBSD) expect users to have a numeric user identifier (UID) of at least 1000, as lower UIDs are reserved for pseudo-users allocated by the operating-system vendor. However, when the Computer Laboratory first used Unix file systems in the mid-1980s, that limit was still 100. As a result, we still have, as of 2016, 20 users with UID < 1000.

A campaign to renumber these users and their files will have several benefits:

  • their user identifier will no longer be confused with the identity of system pseudo-users (e.g., files owned by maj1 will finally show up in “ls -l” as owned by “maj1” and not as e.g. “avahi-autoipd”)
  • the operating system will recognize them as regular users, enabling operations (such as certain xdm login functions) that are blocked for UIDs < 1000
  • incorporating the LDAP tables on a self-managed computer will no longer risk interfering with the local /etc/passwd tables
  • LDAP users with UID ≥ 500 can be incorporated on self-managed machines via libnss-extrausers
  • various local configuration hacks (e.g. changing the minimum_uid parameter of pam_krb5.so) that work around the default UID ≥ 1000 assumption can be dropped on lab-managed Linux machines (improving defense in depth)

Affected users

User              CRSID    old_UID  old_GID    new_UID  new_GID     migration date/time
---------------------------------------------------------------------------------------
Martyn Johnson     maj1       101    60101        tbd      tbd       tbd
Piete Brooks       pb22       104    60104        tbd      tbd       tbd
Arthur Norman      acn1       107    60107        tbd      tbd       tbd
Mike Gordon        mjcg       110    60110        tbd      tbd       tbd
Peter Robinson     pr10       111    60111        tbd      tbd       tbd
Jon Fairbairn      jf15       128    60128        tbd      tbd       tbd
Larry Paulson      lp15       138    60138        tbd      tbd       tbd
Caroline Blackmun  ceb4       145    60145        tbd      tbd       tbd
Graham Titmus      gt19       178    60178        tbd      tbd       tbd
Ian Leslie         iml1       243    60243        tbd      tbd       tbd
Glynn Winskel      gw104      244    60244        tbd      tbd       tbd
Andy Hopper        ah12       260    60260        tbd      tbd       tbd
Alan Mycroft       am21       300    60300        tbd      tbd       tbd
Frank King         fhk1       301    60301        tbd      tbd       tbd
Martin Richards    mr10       302    60302        tbd      tbd       tbd
Derek McAuley      drm10      336    60336        tbd      tbd       tbd
Jean Bacon         jmb25      341    60341        tbd      tbd       tbd
Robin Fairbairns   rf10       344    60344        tbd      tbd       tbd
Ted Briscoe        ejb1       412    60412        tbd      tbd       tbd
Ann Copestake      aac10      432    60432        tbd      tbd       tbd

List produced with

/homes/mgk25/proj/filer/ldap_uids.pl -r100,499 -L

Moving process

Preparation

  1. Inform user of the planned date and time of the change and advise them to log out and (ideally) also terminate long-running processes during the migration period.
  2. Ask the user on which POSIX file systems, other than those on elmer, they own files (local disks of desktops, servers and virtual machines connected to the Unix LDAP servers).
  3. Make a note of their old numeric UID and GID (e.g. in table above). We will refer to these as $old_UID and $old_GID.

Actual move

  1. In the administrative database, update their UID and GID to the new values, and make a note of these (e.g. in table above). We will refer to these as $new_UID and $new_GID. As per the new departmental UID/GID allocation plan:
    • If the user is a person identified by CRSID: make sure 1100 ≤ $new_UID = $new_GID < 9000.
    • For a pseudo-user: make sure 9000 ≤ $new_UID = $new_GID < 9500.
  2. Wait until the new UID and GID have propagated to all the LDAP servers, as well as to the name server database cache (NSDB) on elmer. The latter can be forced with the nfs nsdb flush command.
  3. Make lists of directory prefixes of the locations of files that still have the old UID/GID:
     ugid-find uid=$old_UID prefixes dirnames print >uiddir-$old_UID
     ugid-find gid=$old_GID prefixes dirnames print >giddir-$old_GID
    
  4. Review the directory prefix lists to see whether they look plausible, and manually fix where needed. (This is especially important if the renumbering is needed to resolve a collision, i.e. where multiple people have used the same numeric identifier in the past.)
  5. Log into one of the omnipotent NFSv3 clients. Check with "mount" that it really mounts elmer with nfsvers=3 (so we can see untranslated UID/GID values). Then run as "root":
      xargs -r <uiddir-$old_UID chown -hcR --from=$old_UID $new_UID
      xargs -r <giddir-$old_GID chgrp -hcR --from=$old_GID $new_GID
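
Before running the chown/chgrp step, it can help to double-check the planned numbers and the propagation mechanically. The following is a minimal sketch, not part of the official procedure: the function names check_new_id and current_uid are hypothetical, and it assumes that getent is available and that the client resolves users via the Unix LDAP servers.

```shell
# Hypothetical helper: succeed only if the new UID equals the new GID and
# both fall into the range given by the allocation plan in step 1.
check_new_id() {
    # $1 = new UID, $2 = new GID, $3 = kind: "person" or "pseudo"
    case $3 in
        person) min=1100; max=9000 ;;
        pseudo) min=9000; max=9500 ;;
        *) echo "unknown kind: $3" >&2; return 2 ;;
    esac
    [ "$1" -eq "$2" ] && [ "$1" -ge "$min" ] && [ "$1" -lt "$max" ]
}

# Hypothetical helper: print the UID that the local name service currently
# reports for a user name (assumes getent is installed).
current_uid() {
    getent passwd "$1" | cut -d: -f3
}

# Possible use (values are placeholders):
#   check_new_id "$new_UID" "$new_GID" person || echo "outside the plan" >&2
#   [ "$(current_uid maj1)" = "$new_UID" ] || echo "not yet propagated" >&2
```

Note that current_uid only reflects what the local client sees; it says nothing about the name server database cache on elmer, which still has to be checked or flushed as described in step 2.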
    

Completion

  1. Finally, notify the user that they can log in again.
  2. Remind the user to run the following commands as root on their own local file systems, or do it for them, as agreed:
      chown -hcR --from=$old_UID $new_UID path ...
      chgrp -hcR --from=$old_GID $new_GID path ...
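
On a local file system it may be worth previewing what would be affected before running the two commands above. A minimal sketch, assuming a find that supports -uid and -gid (GNU and BSD find both do); the function name list_old_owned is hypothetical:

```shell
# Hypothetical helper: list, without changing anything, the entries under
# the given paths that still carry the old numeric UID or GID.
# -xdev keeps find on the file system of each starting path, so that
# network mounts below it are not traversed by accident.
list_old_owned() {
    # $1 = old UID, $2 = old GID, remaining arguments = paths
    uid=$1; gid=$2; shift 2
    find "$@" -xdev \( -uid "$uid" -o -gid "$gid" \) -print
}

# Possible use (the path is a placeholder):
#   list_old_owned "$old_UID" "$old_GID" /local/scratch | less
```

The -xdev restriction applies to this preview only; chown -R and chgrp -R follow whatever lies below the given paths, so the paths still need to be chosen with care.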
    

Some notes on the process

Dealing with Windows NTFS files

The filer can store files with either Unix or Windows NTFS access-control attributes. We neither need nor want to touch files with Windows NTFS attributes, for two reasons:

  • Executing a chown/chgrp on a Windows NTFS file will turn it into a Unix file, and potentially destroy access-control list information as a result.
  • There is no need to chown/chgrp a Windows NTFS file, because such files have Windows SIDs instead of UIDs.

The trick is to first change the UID and GID of a user in the Unix LDAP tables, and then wait until that change has propagated through to the NetApp name server database cache. When a Unix user queries an NTFS file with stat(), the NetApp will first translate the SID of the file via an Active Directory LDAP lookup into a user name (CRSID). It will then translate that CRSID via a Unix LDAP lookup into the UID that stat() shows.

Once the UID and GID change in the Unix LDAP server has reached the NetApp name server database cache, all Windows NTFS files will already show the new (emulated) UID/GID via NFSv3, and will therefore be ignored by the subsequent ugid-find and chown/chgrp invocations. This way, only Unix files are affected by the migration.

NFS version

It is very important that the commands

  • ugid-scan
  • ugid-find
  • chown
  • chgrp

are only executed on NFSv3 clients. This is because they are dealing here with UID/GID numbers that are no longer listed in LDAP, and the NFSv4 server will map all of these to nobody/nogroup. NFSv4 does not communicate numeric UID or GID values; instead, it uses LDAP to translate these into “CRSID@cl.cam.ac.uk” strings on the wire. This would be counterproductive here.
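
The NFS version check can be scripted rather than done by eye. The following is a sketch under assumptions: it expects Linux-style "mount" output (or /proc/mounts lines) on standard input, where the negotiated version typically appears as vers=3 and the mount option may also be spelled nfsvers=3. The function name is_nfsv3_mount is hypothetical.

```shell
# Hypothetical helper: read mount lines on stdin and succeed only if there
# is at least one mount from the given server and every such mount lists
# vers=3 or nfsvers=3 as a separate option (mountvers=3 does not count).
is_nfsv3_mount() {
    # $1 = server name exactly as it appears before the ":" in mount output
    awk -v srv="$1" '
        BEGIN { total = 0; v3 = 0 }
        index($1, srv ":") == 1 {
            total++
            if ($0 ~ /(^|[ (,])(nfs)?vers=3([,) ]|$)/) v3++
        }
        END { if (total > 0 && total == v3) exit 0; exit 1 }
    '
}

# Possible use:
#   mount | is_nfsv3_mount elmer || { echo "not an NFSv3 client" >&2; exit 1; }
```

Requiring every mount from the server to be NFSv3, rather than just one, is a deliberately conservative choice for this sketch.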

Kerberos/AD identity

During the migration process, the Active Directory Kerberos identity (principal name, SID, etc.) remains unaffected. We are only changing the LDAP UID/GID. Therefore we are not creating a new user identity and deleting an old one in the Active Directory. This way, there will be hardly any disruption to Windows users, and all Kerberos and Active Directory metadata associated with the user remains intact.

Dealing with odd filenames

The above recipe uses the ugid-find “print” command to generate a list of LF-terminated path names. This is to make it easy to manually review and edit the generated list of directory prefixes with a text editor. However, this format might break if one of the listed path names contains a linefeed character. This is very unlikely in Unix-originated filenames, but does occasionally occur with files generated by Windows Explorer. The ugid-find “print0” command and the xargs option -0 can be used to exchange a NUL-terminated list of path names, which is immune to such problems.

A more sensible approach is to rename any files whose names contain an LF, since such names are very likely accidental.
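
Files with a linefeed in their name can be located ahead of the migration with plain find. A minimal sketch, assuming a find that supports -print0 (GNU and BSD find both do); the function names count_lf_names and list_lf_names are hypothetical:

```shell
# Hypothetical helper: print the number of entries below the given paths
# whose name contains a linefeed character.
count_lf_names() {
    nl='
'
    find "$@" -name "*${nl}*" -exec printf '.' \; | wc -c | tr -d ' '
}

# Hypothetical helper: print each such path on a single line, with every
# embedded linefeed shown as "?", so the list is safe to read in a terminal.
list_lf_names() {
    nl='
'
    find "$@" -name "*${nl}*" -print0 | tr '\n\0' '?\n'
}

# Possible use (the path is a placeholder):
#   list_lf_names /homes/someuser
```

Because list_lf_names replaces the linefeeds, its output is meant for review only and cannot be fed back to mv or xargs as it stands.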

Dealing with long-running processes

If the user has long-running processes that they do not want to stop for the duration of the migration, one possible approach is to have two separate migrations, one for the UID and one for the GID. The user then would have to arrange that the long-running processes retain file access during the migration through at least one of these, the UID or the GID, while the other one is being moved. This still requires that these processes are restarted (after a new login) between the UID and GID move, such that they benefit from the new identity after the first half of the migration.

See also