Service Desk Knowledgebase: Scratch space

From Computer Laboratory System Administration


Help Desk Scratch Space

Special information for re-use of PCCL0xx machines for 2015/10

There is a ticklist of steps to ensure that a lab system is set up and documented correctly at http://www.wiki.cl.cam.ac.uk/rowiki/SysInfo/MachineSetup. Its ToC can be used as an aide-memoire to check that everything has been done, and the text itself describes what needs to be done.

Because of the unfortunate timing of the expected release dates for new Intel CPUs and chipsets and Asus motherboards, a number of 2015/10 arrivals will be given ex-SW11 PWF Dell machines to tide them over until the BMC version of the Asus motherboard is available and tested.

To aid with the setup of these machines, the ToC has been analysed and the expected required steps listed below.

There are two classes of users:

  1. RSs should be allocated machine names with addresses in the range 128.232.65.50 to 128.232.65.59
  2. Other 'misc pool' temporary use (e.g. short term visitors; people buying kit when they have settled in) should be allocated machine names with addresses in the range 128.232.65.60 to 128.232.65.69


HelpDesk needs to tell oper the user (and office+desk if known) and a DNS name ($HOST) to use. To find an available name, go to laira, cd /anfs/glob/src/etc/named/src, and view cl.data. Machines in the IP address range whose TXT record still contains the x's (i.e. ex-PCCL0xx) are still available, e.g.:

 petteril        IN      A       128.232.65.59
                 IN      TXT     "Using the case of ex-PCCL0xx"
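
The search through cl.data can be sketched with awk. This is only an illustration: list_free_hosts is a hypothetical helper name, and it assumes that each A record is immediately followed by its TXT record, as in the excerpt above, and that the pool is 128.232.65.50 to 128.232.65.69.

```shell
# list_free_hosts FILE [LO] [HI]
# Print "name address" for hosts in 128.232.65.LO..HI whose TXT record still
# holds the "ex-PCCL0xx" placeholder. Hypothetical helper; assumes the TXT
# line directly follows the A line.
list_free_hosts() {
    awk -v lo="${2:-50}" -v hi="${3:-69}" '
        # Remember the most recent A record.
        $2 == "IN" && $3 == "A" { name = $1; addr = $4; next }
        # A placeholder TXT record directly after it marks the host as free.
        $1 == "IN" && $2 == "TXT" && /ex-PCCL0xx/ && name != "" {
            n = split(addr, o, ".")
            if (n == 4 && (o[1] "." o[2] "." o[3]) == "128.232.65" &&
                o[4] + 0 >= lo && o[4] + 0 <= hi)
                print name, addr
        }
        # Any other line ends the association with the previous A record.
        { name = "" }
    ' "$1"
}

# Example (on laira):
#   list_free_hosts /anfs/glob/src/etc/named/src/cl.data 50 59
```

Passing 50 59 restricts the search to the RS range, and 60 69 to the 'misc pool' range.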

When done, oper will tell Helpdesk the Inv# and 'old' name of the system used, so that 2.2 and 2.3 can be done.

  • Check that the RT Subject: line contains the user's CRSID and the assigned host, e.g. "2015/10 RS Christopher Bryant cjb255 kit albacore"
  • 2.2 DNS - pre use if needed (e.g. Linux): update the TXT RR for the machine to note the PCCL0xx machine actually used (change vi to ed, pico, etc as preferred)
(cd /global/src/etc/named; co -l src/cl.data; vi src/cl.data; ci -u src/cl.data)
  • Machine then handed over to operators for the actual install of the OS and the positioning of the hardware. After that is completed then the following POST INSTALL tasks need doing
  • 4.0 keytab install (Linux): Ensure $HOST has a correct keytab.
sudo klist -k /etc/krb5.keytab

The output should show the machine name, not no-name-linux. Log in to $HOST using sudo ssh $HOST and run the command below; if it fails, contact gt19 to create a new keytab:

cl-onserver --keytab
  • 4.1 User Admin - when running if needed (Linux): If oper were not told the 'assigned user', on $HOST run:
cl-asuser cl-hostid-fix --user $CRSID -a
/global/src/usr.bin/ssh/fetch-host-key scan $HOST
  • 4.8 ownfiles - at leisure if needed (Linux): to ensure that ownfiles data is collected, run
(umask 2; touch /usr/groups/linux/ownfiles/CKSUM/$HOST)
  • 4.9 WoL - at leisure: to ensure that WoL is available, run:
/usr/groups/netmaint/boot_wol_file-add.pl $HOST
  • Update the RS or "Visitors" work queue to mark the task to be "completed" in the first case, or "OK" in the second (also adding the Inventory number and name of the PC after, e.g. "OK #16200 ouse")
  • Resolve the RT ticket.
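
The Subject: line check in the list above can be done by eye, but a small shell test makes it harder to miss a missing CRSID or host. This is a minimal sketch; subject_ok is a hypothetical helper, and it assumes the CRSID and host appear as space-separated words in the Subject.

```shell
# subject_ok SUBJECT CRSID HOST
# Succeed only if both CRSID and HOST appear as whole, space-separated
# words in SUBJECT. Hypothetical helper, not an existing lab tool.
subject_ok() {
    case " $1 " in
        *" $2 "*) ;;          # CRSID present
        *) return 1 ;;
    esac
    case " $1 " in
        *" $3 "*) ;;          # host present
        *) return 1 ;;
    esac
}

if subject_ok "2015/10 RS Christopher Bryant cjb255 kit albacore" cjb255 albacore; then
    echo "Subject OK"
fi
```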

Common case of setting up a new Ubuntu machine

There is a ticklist of steps to ensure that a lab system is set up and documented correctly at http://www.wiki.cl.cam.ac.uk/rowiki/SysInfo/MachineSetup. Its ToC can be used as an aide-memoire to check that everything has been done, and the text itself describes what needs to be done.

The common case of a new machine called '$HOST' for user '$CRSID' is run through below. To run the commands which use $HOST and $CRSID, set them as shell variables first. So for machine foo with user spqr1, run:

HOST=foo
CRSID=spqr1

HelpDesk sets some things up, asks oper to do their bit, and when told that it is done, tests it and finishes off the job.

  • 2.1 Gather info - first thing: collect all the information required using the RT ticket, such as the machine name, the subdomain (i.e. 'special' subnet, such as the Security Group, DTG or SRG experimental networks (if any)), VLAN, assigned user, etc.
  • Check that the RT Subject: line contains the user's CRSID and the assigned host, e.g. "2015/10 RS Christopher Bryant cjb255 kit albacore"
  • 2.2 DNS - pre use if needed (e.g. Linux): create an entry in the DNS for the machine on the correct subnet, with the appropriate subdomain (if any), and for any BMC. Include a TXT RR with the owner and the RT ticket number. BMCs on the same VLAN as the host (typically user workstations using iAMT) should have the same name as the host, with a -bmc suffix. BMCs using the BMC subnet (typically servers with dedicated BMC NICs) should be on the BMC VLAN in the .bmc subdomain. If a 'same VLAN' BMC is in a subdomain, create a DNS alias for the BMC in the root domain. On the Managed Linux subnet, the top half of the subnet is used for the BMCs, with the BMC address being in the class C network whose third octet is 8 larger. Some subnets (e.g. SRG) have 'port blocked' CIDR blocks for BMCs, so look to see where other BMCs are. Thus a standard machine might be:
foo         IN      A       128.232.65.83
            IN      TXT     "pb22 RT#12345"
...
foo-bmc     IN      A       128.232.73.83

while a machine on the security subnet might be

foo.sec     IN      A       128.232.18.83
            IN      TXT     "pb22 RT#1234"
foo-bmc.sec IN      A       128.232.18.84
foo-bmc     IN      CNAME   foo-bmc.sec

To update the DNS for $HOST, then install and test it, run the following on an omnipotent server (change vi to ed, pico, etc as preferred):

cd /global/src/etc/named
co -l src/cl.data
vi src/cl.data
ci -u src/cl.data
make install
host $HOST dns0
  • 3 Machine install: tell the operators which equipment to use (some way to identify it, where it is, what its Inventory Number is, etc) and the info listed below if they need it:
    • 2.3 Inventory - pre use if using DHCP (Windows) <CLCO>: create or update the Inventory information, telling them any equipment details which are not already on the Inventory (e.g. "PC WoC ASUS 1150 Q87M-E i5-4670 32GB"), name, PO number, supplier, owner, user, RT ticket number and any other info for the 'comment', print off and stick on a label
    • 3.1 Network setup <oper>: tell them the office, desk, floorbox and VLAN to use
    • 3.2 BMC BIOS setup - if present <oper>: tell them if there is a BMC
    • 3.3 OS install <oper>: tell them to do a 'standard Linux install'
    • 4.10 Wiring database - once physically installed <oper>: check that the wiring info is up to date
  • 3.4 keytab install (Linux): Ensure $HOST has a keytab. If the command below fails, create a new keytab (contact gt19 if necessary). Log in to $HOST (this will probably need to be done from laira using 'sudo ssh $HOST' if the host doesn't have a keytab) and run:
cl-onserver --keytab
  • 4.1 User Admin - when running if needed (Linux): If oper were not told the 'assigned user', on $HOST run:
cl-asuser cl-hostid-fix --user $CRSID -a
  • 4.3 Tell the user - when done: send the user a message along the following lines:
The machine mentioned in the Subject: line has now been re-installed for you and should be ready to use.
When you arrive on October 5th, please log in and check that the basics work, i.e. that you can log in, access the web, and send email.
If not, please reply to this ticket, which will re-open it, and we will try to sort the problem.

If you have other requests, please do NOT reply to this ticket, but instead open a new ticket, and mention this one.

Now may be a good time to look again at
http://www.wiki.cl.cam.ac.uk/rowiki/SysInfo/BedtimeReading
and the pages to which it points.
  • 4.5 hosts.props - at leisure: All machines should be added to hosts.props in /global/src/usr.lib. The format is somewhat overwhelming, so it may be easiest to copy a similar existing entry (note that they are sorted alphabetically). You can find the basic HW spec of the machine $HOST, and then use that string as $type to see which other machines are similar (e.g. backus for a Q87M-E system). If there are no other matches, try removing words from the end of $type to look for more generic information. If there is no useful match, email sys-admin asking for help. When a suitable machine '$from' has been found, clone its information. So for $HOST, try
type=$(/anfs/repl/etc/wtfi -S CL_Equipment-raw -q -f Equipment -w "$HOST")
echo trying type=\"$type\"
/anfs/repl/etc/wtfi -S CL_Equipment-raw -q -f name "$type" | sort -u
# set from to be a suitable host, e.g.: from=backus
/global/src/usr.lib/hosts.props-add.pl $from $HOST
  • 4.6 ssh_known_hosts - at leisure if needed (Linux): when the machine is running, on a different machine run
/global/src/usr.bin/ssh/fetch-host-key scan $HOST
  • 4.7 BMC ACL - when up if present: check that the user has BMC credentials in /homes/$CRSID/.amtpw, then from a Lab machine, open a browser to the BMC (typically http://$HOST-bmc.cl.cam.ac.uk:16992) as user admin, delete any previously assigned user, and add the new one with all privs. The command to set up credentials on an omnipotent server is:
/usr/groups/netmaint/setamt $CRSID
  • 4.8 ownfiles - at leisure if needed (Linux): to ensure that ownfiles data is collected, run
(umask 2; touch /usr/groups/linux/ownfiles/CKSUM/$HOST)
  • 4.9 WoL - at leisure: to ensure that Wake on Lan (WoL) is available, on laira run:
/usr/groups/netmaint/boot_wol_file-add.pl $HOST
  • Check that the RS or "Visitors" work queue has the task "completed" in the first case, or "OK" in the second (also adding the Inventory number and name of the PC after, e.g. "OK #16200 ouse")
  • Resolve the RT ticket.
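
The Managed Linux BMC addressing rule from step 2.2 (third octet 8 larger, as in 128.232.65.83 and 128.232.73.83) can be computed rather than worked out by hand. This is a sketch under that assumption only; bmc_addr is a hypothetical helper, and the rule does not apply to subnets such as the security subnet, where the BMC sits on the host's own network.

```shell
# bmc_addr ADDR
# Print ADDR with its third octet increased by 8 (Managed Linux subnet
# convention only). Runs in a subshell so the IFS change stays local.
# Hypothetical helper for illustration.
bmc_addr() (
    IFS=.
    set -- $1                         # split the dotted quad into $1..$4
    [ $# -eq 4 ] || exit 1            # reject anything that is not four octets
    echo "$1.$2.$(($3 + 8)).$4"
)

bmc_addr 128.232.65.83   # prints 128.232.73.83
```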

Procedure To Be Tidied Up (maybe done above?)

So from https://rt.cl.cam.ac.uk/Ticket/Display.html?id=96922 this seems to boil down to the following for the Help Desk for the https://rt.cl.cam.ac.uk/Ticket/Display.html?id=96580 test case.

NOTE: There is now ONLY one case...


Machine install

keytab install (Linux)

On $HOST: cl-onserver --keytab
If there is no keytab to install, create one and retry

Tidies

  • 4.1 User Admin - when running if needed (Linux)
If oper were not told the 'assigned user' for 3.3,
on $HOST: cl-asuser cl-hostid-fix --user $CRSID -a
  • 4.2 Arrivals - when done
fix https://dbwebserver.ad.cl.cam.ac.uk/SCG/Equipment/PhDArrivals.aspx
Update RT ticket to include machine name in ticket Subject:
  • 4.3 Tell the user - when done
Send user message
Resolve RT
  • 4.6 ssh_known_hosts - at leisure if needed (Linux)
when the machine is running, on *another* machine run
/global/src/usr.bin/ssh/fetch-host-key scan $HOST
  • 4.8 ownfiles - at leisure if needed (Linux)
run: (umask 2; touch /usr/groups/linux/ownfiles/CKSUM/$HOST)
  • 4.9 WoL - at leisure
run: /usr/groups/netmaint/boot_wol_file-add.pl $HOST
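
The three 'at leisure' commands above all take the host name, so they can be generated in one go and reviewed before being run. The sketch below only prints the commands and executes nothing; post_install_cmds is a hypothetical helper name, while the command paths are those listed above.

```shell
# post_install_cmds HOST
# Print (do not run) the at-leisure commands for HOST, for review or pasting.
# Note that the fetch-host-key line must be run on a machine other than HOST.
# Hypothetical helper for illustration.
post_install_cmds() {
    host=$1
    if [ -z "$host" ]; then
        echo "usage: post_install_cmds HOST" >&2
        return 1
    fi
    cat <<EOF
/global/src/usr.bin/ssh/fetch-host-key scan $host
(umask 2; touch /usr/groups/linux/ownfiles/CKSUM/$host)
/usr/groups/netmaint/boot_wol_file-add.pl $host
EOF
}

post_install_cmds foo
```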


HelpDesk needs to tell oper the DNS name to use as well as the user and office+desk. When done, oper will tell Helpdesk the Inv# and 'old' name of the system used, so that 2.2 and 2.3 can be done.

Pre install

  • 2.2 DNS - pre use if needed (e.g. Linux)
names and addresses assigned, but once 3.3 is done, update TXT RR
  • 2.3 Inventory - pre use if using DHCP (Windows) <CLCO>
once 3.3 is done, put RT# in comment, set user and office