Service Desk Knowledgebase: Scratch space: Difference between revisions

From Computer Laboratory System Administration

Revision as of 13:24, 9 December 2015

Help Desk Scratch Space

Special information for re-use of PCCL0xx machines for 2015/10

There is a ticklist of steps to ensure a lab system is set up and documented correctly at http://www.wiki.cl.cam.ac.uk/rowiki/SysInfo/MachineSetup. Its ToC can be used as an aide-memoire to check that everything has been done, and the text itself explains what needs to be done.

Because of the unfortunate expected availability dates for new Intel CPUs and chipsets and Asus motherboards, a number of 2015/10 arrivals will be given ex-SW11 PWF Dell machines to tide them over until the BMC version of the Asus motherboard is available and tested.

To aid with the setup of these machines, the ToC has been analysed and the expected required steps listed below.

There are two classes of users:

  1. RSs should be allocated machine names with addresses 128.232.65.50 to 128.232.65.59
  2. Other 'misc pool' temporary users (e.g. short term visitors; people buying kit when they have settled in) should be allocated machine names with addresses 128.232.65.60 to 128.232.65.69


HelpDesk needs to tell oper the user (and office+desk if known) and a DNS name ($HOST) to use. To find a name, go to laira, cd /anfs/glob/src/etc/named/src and view cl.data. Within the IP address range, machines whose TXT record still contains x's (i.e. ex-PCCL0xx) are still available, e.g.:

 petteril        IN      A       128.232.65.59
                 IN      TXT     "Using the case of ex-PCCL0xx"
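The lookup described above could be sketched as a small filter over cl.data. This is only an illustration: the function name free_pccl_names is hypothetical, and it assumes that every candidate entry has the two-line layout shown above (an A record followed immediately by a nameless TXT record) and that "still available" means the TXT text contains the literal placeholder PCCL0xx.

```shell
# Hypothetical helper: list hosts in 128.232.65.<lo>..<hi> whose TXT record
# still contains the literal placeholder "PCCL0xx".
# $1 = zone data file, $2 = lowest last octet, $3 = highest last octet
free_pccl_names() {
    awk -v lo="$2" -v hi="$3" '
        # "name IN A address": remember the name if the address is in range
        $2 == "IN" && $3 == "A" {
            name = ""
            n = split($4, o, ".")
            if (n == 4 && o[1] == 128 && o[2] == 232 && o[3] == 65 &&
                o[4] + 0 >= lo + 0 && o[4] + 0 <= hi + 0) {
                name = $1
                addr = $4
            }
            next
        }
        # Nameless "IN TXT ..." line: belongs to the A record just before it
        $1 == "IN" && $2 == "TXT" && name != "" {
            if (index($0, "PCCL0xx")) print name, addr
            name = ""
            next
        }
        # Any other line ends the current entry
        { name = "" }
    ' "$1"
}
```

For the RS range this might be run on laira as: free_pccl_names /anfs/glob/src/etc/named/src/cl.data 50 59 (and with 60 69 for the 'misc pool').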

When done, oper will tell Helpdesk the Inv# and 'old' name of the system used, so that 2.2 and 2.3 can be done.

  • Check that the RT Subject: line contains the user's CRSID and the assigned host, e.g. "2015/10 RS Christopher Bryant cjb255 kit albacore"
  • 2.2 DNS - pre use if needed (e.g. Linux): update the TXT RR for the machine to note the PCCL0xx machine actually used (change vi to ed, pico, etc as preferred)
(cd /global/src/etc/named; co -l src/cl.data; vi src/cl.data; ci -u src/cl.data)
  • Machine then handed over to operators for the actual install of the OS and the positioning of the hardware. After that is completed then the following POST INSTALL tasks need doing

The Post-Install tasks have now been migrated to [Service_Desk_Knowledgebase:_Resources#Post-Install_Tasks Resources - Post-Install Tasks]
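The Subject: line check in the list above can be done by eye, but a small helper makes it repeatable. The sketch below is an assumption-laden illustration: the function name subject_has_user_and_host is hypothetical, the Subject text is passed in by hand (nothing here talks to RT), and it assumes the CRSID and host appear as whitespace-separated words.

```shell
# Hypothetical helper: succeed only if the Subject text contains both the
# CRSID and the host name as whole, space-separated words.
# $1 = Subject text, $2 = CRSID, $3 = host name
subject_has_user_and_host() {
    case " $1 " in
        *" $2 "*) ;;          # CRSID found
        *) return 1 ;;
    esac
    case " $1 " in
        *" $3 "*) ;;          # host name found
        *) return 1 ;;
    esac
}

# Example using the Subject line quoted in the checklist
if subject_has_user_and_host "2015/10 RS Christopher Bryant cjb255 kit albacore" cjb255 albacore
then
    echo "Subject OK"
fi
```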

Common case of setting up a new Ubuntu machine

There is a ticklist of steps to ensure a lab system is set up and documented correctly at http://www.wiki.cl.cam.ac.uk/rowiki/SysInfo/MachineSetup. Its ToC can be used as an aide-memoire to check that everything has been done, and the text itself explains what needs to be done.

The common case of a new machine called '$HOST' for user '$CRSID' is run through below. To be able to run the commands which use $HOST and $CRSID, set them as shell variables. So for a machine called "foo" whose assigned user has the CRSid "spqr1", run:

HOST=foo
CRSID=spqr1
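Several later commands build paths from $HOST, so running them with the variable unset would act on the wrong target. A minimal guard, assuming a POSIX shell, is sketched below; the function name check_setup_vars is hypothetical and not part of the documented procedure.

```shell
# Hypothetical guard: fail (with a message on stderr) if HOST or CRSID is
# unset or empty. The subshell stops ${VAR:?} from ending the calling shell.
check_setup_vars() {
    ( : "${HOST:?HOST is not set}" "${CRSID:?CRSID is not set}" )
}

HOST=foo
CRSID=spqr1
check_setup_vars && echo "ready: $HOST for $CRSID"
```

Running check_setup_vars before each batch of commands is cheap, and it catches the case where a new terminal has been opened without the variables being set again.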

Essentially the HelpDesk sets some things up, asks oper to do their bit, and, when told that it is done, tests it and finishes off the job.

  • 2.1 Gather info - first thing: collect all the information required using the RT ticket, such as the machine name, the subdomain (i.e. 'special' subnet, such as the Security Group, DTG or SRG experimental networks (if any)), VLAN, assigned user, etc.
  • Check that the RT Subject: line contains the user's CRSID and the assigned host, e.g. "2015/10 RS Christopher Bryant cjb255 kit albacore"
  • 2.2 DNS - pre use if needed (e.g. Linux): create an entry in the DNS for the machine on the correct subnet, with the appropriate subdomain (if any), and any BMC. Include a TXT RR with the owner and the RT ticket number. BMCs on the same VLAN as the host (typically user workstations using iAMT) should have the same name as the host, with a -bmc suffix, but if using the BMC subnet (typically servers with dedicated BMC NICs) they should be on the BMC VLAN in the .bmc subdomain. If a 'same VLAN' BMC is in a subdomain, create a DNS alias for the BMC in the root domain. On the Managed Linux subnet, the top half of the subnet is used for the BMCs, with the BMC address being in the class C whose third octet is 8 larger. Some subnets (e.g. SRG) have 'port blocked' CIDR blocks for BMCs, so look to see where other BMCs are. Thus a standard machine might be
foo         IN      A       128.232.65.83
            IN      TXT     "pb22 RT#12345"
...
foo-bmc     IN      A       128.232.73.83

while a machine on the security subnet might be

foo.sec     IN      A       128.232.18.83
            IN      TXT     "pb22 RT#1234"
foo-bmc.sec IN      A       128.232.18.84
foo-bmc     IN      CNAME   foo-bmc.sec
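The "8 larger" rule for the Managed Linux subnet is simple arithmetic on the third octet, as in the standard example above (128.232.65.83 becomes 128.232.73.83). The sketch below only illustrates that arithmetic: the function name bmc_addr is hypothetical, and it does not apply to subnets such as the security or SRG ones, where the BMC address is chosen differently.

```shell
# Hypothetical helper: print the Managed Linux BMC address for a host
# address by adding 8 to the third octet. Fails if the input does not
# have four dot-separated fields.
# $1 = host IPv4 address, e.g. 128.232.65.83
bmc_addr() {
    echo "$1" | awk -F. '
        NF == 4 { printf "%s.%s.%d.%s\n", $1, $2, $3 + 8, $4; ok = 1 }
        END     { exit !ok }
    '
}

bmc_addr 128.232.65.83
```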

To update the DNS for $HOST (change vi to ed, pico, etc as preferred), then install and test it, run on an omnipotent server:

cd /global/src/etc/named
co -l src/cl.data
vi src/cl.data
ci -u src/cl.data
make install
host $HOST dns0
  • 3 Machine install: tell the operators which equipment to use (some way to identify it, where it is, what its Inventory Number is, etc) and the info listed below if they need it:
    • 2.3 Inventory - pre use if using DHCP (Windows) <Done by a Computer Lab CO>: create or update the Inventory information, telling them any equipment details which are not already on the Inventory (e.g. "PC WoC ASUS 1150 Q87M-E i5-4670 32GB"), name, PO number, supplier, owner, user, RT ticket number and any other info for the 'comment', print off and stick on a label
    • 3.1 Network setup <Done by Oper>: tell them the office, desk, floorbox and VLAN to use
    • 3.2 BMC BIOS setup - if present <Done by Oper>: tell them if there is a BMC
    • 3.3 OS install <Done by Oper>: tell them to do a 'standard Linux install'
    • 4.10 Wiring database - once physically installed <Done by Oper>: check that the wiring info is up to date
  • 3.4 keytab install (Linux): Ensure $HOST has a keytab. If the command below fails, create a new keytab <Contact gt19 if necessary>. Log in to $HOST (this will probably need to be done from laira using 'sudo ssh $HOST' if the host doesn't have a keytab) and run:
cl-onserver --keytab
  • 4.1 User Admin - when running if needed (Linux): If oper were not told the 'assigned user', on $HOST run:
cl-asuser cl-hostid-fix --user "$CRSID" -a
  • 4.3 Tell the user - when done: send the user a message such as:
The machine mentioned in the Subject: line has now been re-installed for you and should be ready to use.
When you arrive on October 5th, please log in and check that the basics work, i.e. that you can log in, access the web, and send email.
If not, please reply to this ticket, which will re-open it, and we will try to sort the problem.

If you have other requests, please do NOT reply to this ticket, but instead open a new ticket, and mention this one.

Now may be a good time to look again at
http://www.wiki.cl.cam.ac.uk/rowiki/SysInfo/BedtimeReading
and the pages to which it points.
  • 4.5 hosts.props - at leisure: All machines should be added to hosts.props in /global/src/usr.lib. The format is somewhat overwhelming, so it may be easiest to copy a similar existing entry (note that they are sorted alphabetically). You can find the basic HW spec of the machine $HOST, and then use that string in place of $type to see which other machines are similar (e.g. backus for a Q87M-E system). If there are no other matches, try removing words from the end of $type to look for more generic information. If there is no useful match, email sys-admin asking for help. When a suitable machine '$from' has been found, clone its information. So for $HOST, try
type=$(/anfs/repl/etc/wtfi -S CL_Equipment-raw -q -f Equipment -w "$HOST")
echo trying type=\"$type\"
/anfs/repl/etc/wtfi -S CL_Equipment-raw -q -f name "$type" | sort -u
# set from to be a suitable host, e.g.: from=backus
/global/src/usr.lib/hosts.props-add.pl $from $HOST
  • 4.6 ssh_known_hosts - at leisure if needed (Linux): when the machine is running, on a different machine run
/global/src/usr.bin/ssh/fetch-host-key scan $HOST
  • 4.7 BMC ACL - when up if present: check that the user has BMC credentials in /homes/$CRSID/.amtpw, then from a Lab machine, open a browser to the BMC (typically http://$HOST-bmc.cl.cam.ac.uk:16992) as user admin, delete any previously assigned user, and add the new one with all privs. The command to set up credentials on an omnipotent server is:
/usr/groups/netmaint/setamt $CRSID
  • 4.8 ownfiles - at leisure if needed (Linux): to ensure that ownfiles data is collected, run
(umask 2; touch /usr/groups/linux/ownfiles/CKSUM/$HOST)
  • 4.9 WoL - at leisure: to ensure that Wake on Lan (WoL) is available, on laira run:
/usr/groups/netmaint/boot_wol_file-add.pl $HOST
  • Check that the RS or "Visitors" work queue has the task "completed" in the first case, or "OK" in the second (also adding the Inventory number and name of the PC after, e.g. "OK #16200 ouse")
  • Resolve the RT ticket.
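The 'at leisure' steps 4.6, 4.8 and 4.9 above are run on different machines, so it can help to print them for a given host before pasting each one where it belongs. The sketch below only prints; it executes nothing. The function name print_leisure_steps is hypothetical, and the commands are copied unchanged from the steps above.

```shell
# Hypothetical helper: print (do not run) the at-leisure commands for a host.
# $1 = host name
print_leisure_steps() {
    host=${1:?usage: print_leisure_steps HOST}
    cat <<EOF
# 4.6 ssh_known_hosts (run on a machine other than $host)
/global/src/usr.bin/ssh/fetch-host-key scan $host
# 4.8 ownfiles
(umask 2; touch /usr/groups/linux/ownfiles/CKSUM/$host)
# 4.9 WoL (run on laira)
/usr/groups/netmaint/boot_wol_file-add.pl $host
EOF
}

print_leisure_steps foo
```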

Procedure To Be Tidied Up (maybe done above?)

So from https://rt.cl.cam.ac.uk/Ticket/Display.html?id=96922 this seems to boil down to the following for the Help Desk for the https://rt.cl.cam.ac.uk/Ticket/Display.html?id=96580 test case.

NOTE: There is now ONLY one case...


Machine install

keytab install (Linux)

On $HOST: cl-onserver --keytab
If there is no keytab to install, create one and retry

Tidies

  • 4.1 User Admin - when running if needed (Linux)
If oper were not told the 'assigned user' for 3.3,
on $HOST: cl-asuser cl-hostid-fix --user $CRSID -a
  • 4.2 Arrivals - when done
fix https://dbwebserver.ad.cl.cam.ac.uk/SCG/Equipment/PhDArrivals.aspx
Update RT ticket to include machine name in ticket Subject:
  • 4.3 Tell the user - when done
Send user message
Resolve RT
  • 4.6 ssh_known_hosts - at leisure if needed (Linux)
when the machine is running, on *another* machine run
/global/src/usr.bin/ssh/fetch-host-key scan $HOST
  • 4.8 ownfiles - at leisure if needed (Linux)
run: (umask 2; touch /usr/groups/linux/ownfiles/CKSUM/$HOST)
  • 4.9 WoL - at leisure
run: /usr/groups/netmaint/boot_wol_file-add.pl $HOST


HelpDesk needs to tell oper the DNS name to use as well as the user and office+desk. When done, oper will tell Helpdesk the Inv# and 'old' name of the system used, so that 2.2 and 2.3 can be done.

Pre install

  • 2.2 DNS - pre use if needed (e.g. Linux)
names and addresses assigned, but once 3.3 is done, update TXT RR
  • 2.3 Inventory - pre use if using DHCP (Windows) <CLCO>
once 3.3 is done, put RT# in comment, set user and office