Moving the UID/GID of a user
See also
- UID/GID allocation
- ugid-scan UID/GID file search engine (https://github.com/mgkuhn/ugid-scan)
Revision as of 18:06, 21 March 2017
This page documents the recommended procedure for moving the UID and/or GID of a user in the CL Unix LDAP tables.
Background
Modern POSIX operating systems (Linux, macOS, FreeBSD) expect users to have a numeric user identifier (UID) of at least 1000, as lower UIDs are reserved for pseudo-users allocated by the operating-system vendor. However, when the Computer Laboratory first used Unix file systems in the mid-1980s, that limit was still 100. As a result, as of 2016, we still have 20 users with UID < 1000.
A campaign to renumber these users and their files will have several benefits:
- their user identifier will no longer be confused with the identity of system pseudo-users (e.g., files owned by maj1 will finally show up in “ls -l” as owned by “maj1” and not as e.g. “avahi-autoipd”)
- the operating system will recognize them as regular users, enabling operations (such as certain xdm login functions) that are blocked for UIDs < 1000
- incorporating the LDAP tables on a self-managed computer will no longer risk interfering with the local /etc/passwd tables
- LDAP users with UID ≥ 500 can be incorporated on self-managed machines via libnss-extrausers (https://www.cl.cam.ac.uk/~mgk25/offline-ldap/)
- various local configuration hacks (e.g. changing the minimum_uid parameter of pam_krb5.so) to work around the default UID ≥ 1000 assumption can be dropped on lab-managed Linux machines (improving defense in depth)
Moving process
Preparation
- Inform the user of the planned date and time of the change and advise them to log out and (ideally) also terminate long-running processes during the migration period.
- Ask the user on which POSIX file systems other than those on elmer they own files (local disks of desktops, servers and virtual machines connected to the Unix LDAP servers).
- Make a note of their old numeric UID and GID. We will refer to these as $old_UID and $old_GID.
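As a minimal sketch of the last preparation step, the old values could be captured with id(1) before anything is changed. The helper name record_ids is hypothetical, and the sketch assumes it is run on a machine whose name service still returns the old LDAP entry.

```shell
#!/bin/sh
# Sketch: print the numeric UID and primary GID currently registered for an
# account, in a form that can be kept with the migration notes.
# "record_ids" is a hypothetical helper name, not part of any existing tool.
record_ids() {
    user=$1
    # id(1) resolves the name through NSS, so on an LDAP-connected machine
    # it reports the values from the LDAP tables.
    uid=$(id -u "$user") || return 1
    gid=$(id -g "$user") || return 1
    printf 'old_UID=%s old_GID=%s\n' "$uid" "$gid"
}
```

For example, `record_ids maj1 >>migration-notes.txt` would append one line for the user maj1; the file name is only an illustration.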
Actual move
- Update their UID and GID in the administrative database to the new values, and make a note of these. We will refer to these as $new_UID and $new_GID. As per the new departmental UID/GID allocation plan:
  - If the user is a person identified by CRSID: make sure 1100 ≤ $new_UID = $new_GID < 9000.
  - For a pseudo-user: make sure 9000 ≤ $new_UID = $new_GID < 9500.
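- Wait until the new UID and GID have propagated to all the LDAP servers, as well as to the name server database cache (NSDB) on elmer (https://library.netapp.com/ecmdocs/ECMP1196993/html/GUID-EE8A6E6E-B37D-4F90-BD67-A5B9493A88A8.html). The latter can be forced with the nfs nsdb flush command (https://library.netapp.com/ecmdocs/ECMP1196993/html/GUID-A71E3CF5-DF18-4D88-8321-86FD075FCB2A.html).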
- Make lists of directory prefixes of the locations of files that still have the old UID/GID:
    ugid-find uid=$old_UID dirnames prefixes print >uiddir-$old_UID
    ugid-find gid=$old_GID dirnames prefixes print >giddir-$old_GID
- Review the directory prefix lists to see whether they look plausible, and manually fix where needed. (This is especially important if the renumbering is needed to resolve a collision, i.e. where multiple people have used the same numeric identifier in the past.)
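- Log in to one of the omnipotent NFSv3 clients. Check with "mount" that it really mounts elmer with nfsvers=3 (so that we can see untranslated UID/GID values). Then run as "root" the commands below. Both use GNU chown, whose --from option restricts the change to files that currently have the given owner or group (chgrp does not reliably offer --from), and xargs -d '\n' treats each line of the list as one path, so that names containing spaces survive:
    xargs -r -d '\n' <uiddir-$old_UID chown -hcR --from=$old_UID $new_UID
    xargs -r -d '\n' <giddir-$old_GID chown -hcR --from=:$old_GID :$new_GID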
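The range constraints in the allocation step of this list can be checked mechanically before the administrative database is touched. The following is a sketch only: the function name check_new_ids and the labels "person" and "pseudo" are hypothetical, while the numeric limits are the ones quoted in that step.

```shell
#!/bin/sh
# Sketch: validate a proposed UID/GID pair against the allocation ranges
#   person: 1100 <= UID = GID < 9000
#   pseudo: 9000 <= UID = GID < 9500
# "check_new_ids", "person" and "pseudo" are hypothetical names.
check_new_ids() {
    kind=$1 new_UID=$2 new_GID=$3
    # Both values must be non-empty strings of decimal digits.
    for v in "$new_UID" "$new_GID"; do
        case $v in
            ''|*[!0-9]*) echo "not a number: '$v'" >&2; return 2 ;;
        esac
    done
    # The plan requires the UID and the GID to be identical.
    [ "$new_UID" -eq "$new_GID" ] || { echo "UID and GID differ" >&2; return 1; }
    case $kind in
        person) lo=1100 hi=9000 ;;
        pseudo) lo=9000 hi=9500 ;;
        *) echo "unknown kind: $kind" >&2; return 2 ;;
    esac
    # Lower bound inclusive, upper bound exclusive.
    [ "$new_UID" -ge "$lo" ] && [ "$new_UID" -lt "$hi" ] || {
        echo "$new_UID is outside [$lo, $hi)" >&2
        return 1
    }
}
```

A call such as `check_new_ids person "$new_UID" "$new_GID" || exit 1` at the top of a migration script would stop early on a mistyped value.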
Completion
- Finally, notify the user that they can log in again.
- Remind the user to run the following commands as root on their own local file systems, or do it for them, as agreed (GNU chown; the second command changes only the group):
    chown -hcR --from=$old_UID $new_UID path ...
    chown -hcR --from=:$old_GID :$new_GID path ...
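Before changing ownership on a local file system, it can help to preview which files still carry the old numeric identifiers. The sketch below uses find(1) for this; the helper name list_old_ids is hypothetical, and the -uid and -gid tests are assumed to be available (they are in GNU and BSD find).

```shell
#!/bin/sh
# Sketch: list files under the given paths that still have the old numeric
# owner or group, without changing anything.
# "list_old_ids" is a hypothetical helper name.
list_old_ids() {
    old_UID=$1 old_GID=$2
    shift 2
    # -xdev keeps find on the file system of each starting path, so other
    # mounts (for example NFS) below it are not descended into.
    find "$@" -xdev \( -uid "$old_UID" -o -gid "$old_GID" \) -print
}
```

For instance, `list_old_ids "$old_UID" "$old_GID" /home /srv | less` shows what a subsequent chown run over the same paths would touch; the two paths are placeholders.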
Some notes on the process
Dealing with Windows NTFS files
The filer can store files with either Unix or Windows NTFS access-control attributes. We neither need nor want to touch files with Windows NTFS attributes, for two reasons:
- Executing a chown/chgrp on a Windows NTFS file will turn it into a Unix file, and potentially destroy access-control list information as a result.
- There is no need to chown/chgrp a Windows NTFS file, because such files have Windows SIDs instead of UIDs.
The trick is to first change the UID and GID of a user in the Unix LDAP tables, and then wait until that change has propagated through to the NetApp name server database cache. When a Unix user queries an NTFS file with stat(), the NetApp will first translate the SID of the file via an Active Directory LDAP lookup into a user name (CRSID). It will then translate that CRSID via a Unix LDAP lookup into the UID that stat() shows.
Once the UID and GID change in the Unix LDAP server has reached the NetApp name server database cache, all Windows NTFS files will already show the new (emulated) UID/GID via NFSv3, and will therefore be ignored by the subsequent ugid-find and chown/chgrp invocations. This way, only Unix files are affected by the migration.
NFS version
It is very important that the commands
- ugid-scan
- ugid-find
- chown
- chgrp
are only executed on NFSv3 clients. This is because they deal with UID/GID numbers that are no longer listed in LDAP, and the NFSv4 server will map all of these to nobody/nogroup. NFSv4 does not communicate numeric UID or GID values; instead, it uses LDAP to translate these into “CRSID@cl.cam.ac.uk” strings on the wire. This would be counterproductive here.
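The NFS version check could be scripted along the following lines. This is a sketch under assumptions: it expects input in the Linux /proc/mounts format, where NFSv3 mounts have file-system type "nfs" with vers=3 among the options and NFSv4 mounts have type "nfs4"; the helper name all_mounts_nfsv3 is hypothetical, and the server name must be given exactly as it appears before the colon in the mount table.

```shell
#!/bin/sh
# Sketch: succeed only if there is at least one mount from the given NFS
# server and every such mount uses protocol version 3.
# Reads lines in /proc/mounts format (device, mount point, type, options, ...)
# on standard input. "all_mounts_nfsv3" is a hypothetical helper name.
all_mounts_nfsv3() {
    server=$1
    awk -v srv="$server" '
        # Field 1 has the form "server:/path" for NFS mounts.
        index($1, srv ":") == 1 {
            seen = 1
            # Require type "nfs" and vers=3 (or nfsvers=3) in the options.
            if ($3 != "nfs" || ("," $4 ",") !~ /,(nfs)?vers=3,/) bad = 1
        }
        END { exit ((seen && !bad) ? 0 : 1) }
    '
}
```

On a client this might be used as `all_mounts_nfsv3 elmer </proc/mounts || echo "do not run the migration here"`.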
Kerberos/AD identity
During the migration process, the Active Directory Kerberos identity (principal name, SID, etc.) remains unaffected. We are only changing the LDAP UID/GID. Therefore we are not creating a new user identity and deleting an old one in the Active Directory. This way, there will be hardly any disruption to Windows users, and all Kerberos and Active Directory metadata associated with the user remains intact.
Dealing with odd filenames
The above recipe uses the ugid-find “print” command to generate a list of LF-terminated path names. This makes it easy to review and edit the generated list of directory prefixes manually with a text editor. However, this format breaks if one of the listed path names contains a linefeed character. This is very unlikely in Unix-originated filenames, but does occasionally occur with files generated by Windows Explorer. The ugid-find “print0” command and the xargs option -0 can be used to exchange a NUL-terminated list of path names, which is immune to such problems.
A more sensible approach is to rename files whose names contain an LF, which is very likely there by accident.
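To locate such names in the first place, find(1) can be given a pattern that contains a literal linefeed. The sketch below is illustrative: find_lf_names is a hypothetical helper name, and the commented pipeline at the end assumes that “print0” simply takes the place of “print” on the ugid-find command line.

```shell
#!/bin/sh
# Sketch: list path names below a directory whose final component contains
# a linefeed, NUL-terminated so that the output itself stays unambiguous.
# "find_lf_names" is a hypothetical helper name.
find_lf_names() {
    # A variable holding exactly one linefeed character (portable sh).
    nl='
'
    find "$1" -name "*${nl}*" -print0
}

# NUL-terminated variant of the recipe (assumed syntax, not executed here):
#   ugid-find uid=$old_UID dirnames prefixes print0 |
#       xargs -r0 chown -hcR --from=$old_UID $new_UID
```

The output can be made readable for review with, for example, `find_lf_names /some/dir | tr '\n\0' '?\n'`, which shows each embedded linefeed as a question mark and puts one name on each line.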