Service Desk Knowledgebase: Operator Tasks

From Computer Laboratory System Administration
Revision as of 17:29, 9 November 2015 by pb22 (talk | contribs) (→‎Reinstall a Linux machine: Add fsck to test device is not in use)


This is the Operator Tasks content page of the CL Wiki Service Desk Knowledgebase. Its purpose is to provide information to the Service Desk team on how to handle problems and requests about this CL service. If you are involved in the provision of this CL service, please feel free to add to the knowledge about it.

If CL staff need to tell the Service Desk team about problems with this service please email
sys-admin-aside@cl.cam.ac.uk.


Key Service Description & URLs

The operator processes are to install and move hardware, and to install and update the operating system when a machine is re-deployed. In addition, telephones and network connections are maintained. Unused equipment is to be reclaimed and unused wiring is to be removed.

CL Customer Documentation

Further CL Sys-Admin Resources

Underpinning Services

  • ??? - Any supporting or underpinning services

Customer-base for this Service

  • All staff and students of the Computer Laboratory

Costs

  • Free to all current staff and students of the Computer Laboratory

SLA

  • N/A

Service Desk Call Handling Procedure

  • RT tickets can be passed to the Operator team by changing the Queue to oper, with the Owner set to Nobody and the Status set to new.
  • RT tickets can be escalated to the sys-admin team by changing the Queue to sys-admin, with the Owner set to Nobody and the Status set to new.
  • RT tickets can be escalated to the experts by changing the Queue to backoffice, with the Owner set to Nobody and the Status set to new.
  • RT tickets can be Taken from the oper queue by members of the oper team. When the requested work has been completed, the ticket's Status should be set to new, the Owner to Nobody and the Queue to sys-admin.
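
The queue changes above can also be written down as a command for RT's command-line client. The following is a minimal sketch only: it assumes the rt client is installed and configured, which this page does not state, and the helper name rt_handover_cmd and the ticket number 12345 are hypothetical. The helper prints the command rather than running it, so it can be reviewed first.

```shell
#!/bin/sh
# rt_handover_cmd TICKET QUEUE
# Hypothetical helper: prints (does not run) an rt command that moves a
# ticket to QUEUE with the Owner set to Nobody and the Status set to new.
rt_handover_cmd() {
    ticket=$1
    queue=$2
    # Only the three queues named on this page are accepted.
    case $queue in
        oper|sys-admin|backoffice) ;;
        *) echo "rt_handover_cmd: unknown queue '$queue'" >&2; return 1 ;;
    esac
    # Ticket ids are plain decimal numbers.
    case $ticket in
        ''|*[!0-9]*) echo "rt_handover_cmd: bad ticket id '$ticket'" >&2; return 1 ;;
    esac
    echo "rt edit ticket/$ticket set queue=$queue owner=Nobody status=new"
}

# Example with a made-up ticket number.
rt_handover_cmd 12345 oper
```

The same changes can always be made through the RT web interface as described in the bullets.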

Contacts

Primary

Other

Availability

  • Monday to Friday 9-5

Hints, Tips & Known Issues

Procedure for Patching

Graham Titmus (27/01/2015)

The patches are documented in the database. All physical cables on the patch panels should be documented. A patch used only for a telephone should have a comment that starts with Telephone. All other patches will also have a connection to a physical machine.

If a person is being moved from one room to another, the Staff List should be updated with the new office number once the move is completed. The desk they are at can be found from the Office Plan; any mistakes in that should be reported to Reception to be corrected.

Adding a patch

  1. Go to the floorbox page and enter the box name (something like WC2E-012; the box number always ends with 3 digits, padded with zeros if required) and press [Enter]
  2. If no connection shows up for the port you plan to use press the [Add Connection] button
  3. On the AddConnection page add in the port number (between 1 and 4, phones usually are in port 1) and the machine inventory number. Then click on create.
  4. You will be taken back to the floorbox page. Click on [Trace] by the connection you have just added.
  5. On the Cable Trace page you should see a single line for a connection in the floor box. Click on [Add Patch].
  6. You will then be on the Add Patch page. Enter the other end of the patch in the form HOST-012 (use 3 digits with preceding zeros for the last part of the host port). Click on [Create]. The wiring database is now completely updated.
  7. Put a comment on the RT ticket saying which HOST port the patch was added to.
  8. The VLAN now needs to be configured for that port by the operators, following the Updating VLANs in the Cisco switches procedure.
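
The naming rule used in steps 1 and 6 (a prefix, a hyphen, then a number padded to three digits with zeros) is easy to get wrong when typed by hand. A minimal sketch of the rule follows; the helper name pad_port is hypothetical and is not part of the wiring database tooling.

```shell
#!/bin/sh
# pad_port PREFIX NUMBER
# Hypothetical helper: prints PREFIX-NNN with NUMBER zero-padded to three
# digits, matching the floor box (WC2E-012) and host port (HOST-012) forms.
pad_port() {
    prefix=$1
    number=$2
    # Reject anything that is not a plain decimal number.
    case $number in
        ''|*[!0-9]*) echo "pad_port: '$number' is not a number" >&2; return 1 ;;
    esac
    # Strip leading zeros so that printf does not treat the value as octal.
    number=$(echo "$number" | sed 's/^0*//')
    printf '%s-%03d\n' "$prefix" "${number:-0}"
}

pad_port WC2E 12
pad_port HOST 7
```

Input that is already padded, such as 012, is accepted and printed unchanged as 012.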

Removing a patch

If the connection is shared with a telephone then the patch should not be removed but the CONnection for the machine does need to be removed.

With Phone
  1. Go to the floorbox page and enter the box name (something like WC2E-012; the box number always ends with 3 digits, padded with zeros if required) and press [Enter]
  2. Click on [Trace] by the connection you want to remove.
  3. On the Cable trace page you should see a four line trace for a connection in the floor box. Click on the number to the left of the label [CON].
  4. On the next page click on Delete Connection
  5. Put a comment on the RT ticket saying which HOST port the patch was removed from.
  6. The VLAN now needs to be configured for that port by the operators, following the Updating VLANs in the Cisco switches procedure.
Without phone
  1. Go to the floorbox page and enter the box name (something like WC2E-012; the box number always ends with 3 digits, padded with zeros if required) and press [Enter]
  2. Click on [Trace] by the connection you want to remove.
  3. On the Cable trace page you should see a four line trace for a connection in the floor box. Click on [Delete All].
  4. Put a comment on the RT ticket saying which HOST port the patch was removed from.
  5. The VLAN now needs to be configured for that port by the operators, following the Updating VLANs in the Cisco switches procedure.

Reinstall a Linux machine

Graham Titmus (27/01/2015)

To reinstall a machine that is currently working you can do so by remotely copying a fresh image into a spare partition, while the machine is in use.

  • Use SSH to log in to the machine using your CL account
  • Check which partitions with an 'ext' file system are available with:
    cl-asuser blkid | grep ext
    For example:
 kiku:~: cl-asuser blkid | grep ext
 /dev/md2: LABEL="kiku_U12.04_md3" UUID="58956853-b55f-475c-bbe1-4177875228d5" TYPE="ext4"
 /dev/md1: LABEL="kiku_U14.04_md1" UUID="58956853-b55f-475c-beef-4177875225d8" TYPE="ext4"
 /dev/mapper/vg01-scratch: LABEL="scratch" UUID="53ce4c02-dde6-4542-89ca-9a47446143a7" TYPE="ext3"
 kiku:~: 
  • Check which partition is in use at present with:
    df /
    For example:
 kiku:~: df /
 Filesystem     1K-blocks     Used Available Use% Mounted on
 /dev/md1        31364084 17037644  12985036  57% /
 kiku:~: 
  • Choose a device such as /dev/md2 that is not currently in use. The disc may show up as a UUID, which you can check against the 'blkid' listing. (On machines that use plain partitions such as /dev/sda2 rather than md devices, use dev=sda1, dev=sda2 etc. instead, again picking the one that is not the current root.) Then do:
    dev=md2
    cl-asuser blkid | grep /dev/$dev
    For example:
 kiku:~: cl-asuser blkid | grep /dev/md2
 /dev/md2: LABEL="kiku_U14.04_md2" UUID="80e8e80e-2002-1404-c1c1-000055e6cb10" TYPE="ext4"
 kiku:~: 
  • If the device is listed as available, and is 'spare' because it is NOT reported as in use by the "df /" command, run the following to check that it is not in use. It should give a brief summary such as "sumire_U14.04_md: clean, 321333/1050624 files, 1905673/4194304 blocks" and not something like "/dev/md1 is mounted. e2fsck: Cannot continue, aborting.":
    sudo fsck /dev/$dev
  • Zero the first 3 blocks to mark the partition as 'unused':
    sudo dd if=/dev/zero bs=512 count=3 of=/dev/$dev
    For example:
 kiku:~: sudo dd if=/dev/zero bs=512 count=3 of=/dev/md2
 3+0 records in
 3+0 records out
 1536 bytes (1.5 kB) copied, 0.0400386 s, 38.4 kB/s
 kiku:~: 
  • To do the reinstall run:
    sudo /a/misc-nosnap1/distros/ubuntu/clone/restore reinstall $dev
    For example:
 kiku:~: sudo /a/misc-nosnap1/distros/ubuntu/clone/restore reinstall md2
 [sudo] password for vrw10:
 Assumimng OK to use UUID [80e8e80e-2002-1404-c1c1-000055e6dd99]: 
 copy M.gz to /dev/md2 for kiku (128.232.64.14) mks40 :
 unpack M.gz into /dev/md2
 /dev/md2: LABEL="kiku_U14.04_md2" UUID="80e8e80e-2002-1404-c1c1-000055e6dc2a" TYPE="ext4"
 Left /dev/md2 ASIS
 fsck from util-linux 2.20.1
 e2fsck 1.42.9 (4-Feb-2014)
 kiku_U14.04_md2: clean, 277873/1313280 files, 1463806/5242880 blocks
 fsck from util-linux 2.20.1
 e2fsck 1.42.9 (4-Feb-2014)
 Pass 1: Checking inodes, blocks, and sizes
 Pass 2: Checking directory structure
 Pass 3: Checking directory connectivity
 Pass 4: Checking reference counts
 Pass 5: Checking group summary information
 kiku_U14.04_md2: 277873/1313280 files (0.7% non-contiguous), 1463806/5242880 blocks
 resize2fs 1.42.9 (4-Feb-2014)
 The filesystem is already 5242880 blocks long.  Nothing to do!
 
 tune2fs 1.42.9 (4-Feb-2014)
 no etc/krb5.keytab-Kiku
 Refresh files from CP ...
 ‘/etc/user-config/bundles’ -> ‘/mnt/etc/user-config/bundles’
 ‘/etc/user-config/patches’ -> ‘/mnt/etc/user-config/patches’
 ‘/etc/network/interfaces’ -> ‘/mnt/etc/network/interfaces’
 ‘/etc/krb5.keytab’ -> ‘/mnt/etc/krb5.keytab’
 ‘/etc/ssh/ssh_host_dsa_key’ -> ‘/mnt/etc/ssh/ssh_host_dsa_key’
 ‘/etc/ssh/ssh_host_dsa_key-DIST’ -> ‘/mnt/etc/ssh/ssh_host_dsa_key-DIST’
 ‘/etc/ssh/ssh_host_dsa_key.pub’ -> ‘/mnt/etc/ssh/ssh_host_dsa_key.pub’
 ‘/etc/ssh/ssh_host_dsa_key.pub-DIST’ -> ‘/mnt/etc/ssh/ssh_host_dsa_key.pub-DIST’
 ‘/etc/ssh/ssh_host_ecdsa_key’ -> ‘/mnt/etc/ssh/ssh_host_ecdsa_key’
 ‘/etc/ssh/ssh_host_ecdsa_key.pub’ -> ‘/mnt/etc/ssh/ssh_host_ecdsa_key.pub’
 ‘/etc/ssh/ssh_host_ed25519_key’ -> ‘/mnt/etc/ssh/ssh_host_ed25519_key’
 ‘/etc/ssh/ssh_host_ed25519_key.pub’ -> ‘/mnt/etc/ssh/ssh_host_ed25519_key.pub’
 ‘/etc/ssh/ssh_host_rsa_key’ -> ‘/mnt/etc/ssh/ssh_host_rsa_key’
 ‘/etc/ssh/ssh_host_rsa_key-DIST’ -> ‘/mnt/etc/ssh/ssh_host_rsa_key-DIST’
 ‘/etc/ssh/ssh_host_rsa_key.pub’ -> ‘/mnt/etc/ssh/ssh_host_rsa_key.pub’
 ‘/etc/ssh/ssh_host_rsa_key.pub-DIST’ -> ‘/mnt/etc/ssh/ssh_host_rsa_key.pub-DIST’
 warning: failed to read mtab
 mount: can't find /run/rpc_pipefs in /etc/fstab or /etc/mtab
   warning: failed to mount run/rpc_pipefs - this is probably not a problem
 chrooted to /mnt run: apt-get install mdadm
 Reading package lists... Done
 Building dependency tree
 Reading state information... Done
 mdadm is already the newest version.
 0 to upgrade, 0 to newly install, 0 to remove and 0 not to upgrade.
 
 ... it is not a problem if the next command fails - press RETURN to proceed
   
 chrooted to /mnt run: grub-install /dev/sda ...
 Installing for i386-pc platform.
 Installation finished. No error reported.
 chrooted to /mnt run: grub-mkconfig -o /boot/grub/grub.cfg
 Generating grub configuration file ...
 Warning: Setting GRUB_TIMEOUT to a non-zero value when GRUB_HIDDEN_TIMEOUT is set is no longer supported.
 Found linux image: /boot/vmlinuz-3.13.0-59-generic
 Found initrd image: /boot/initrd.img-3.13.0-59-generic
 Found memtest86+ image: /boot/memtest86+.elf
 Found memtest86+ image: /boot/memtest86+.bin
 done
 Generating grub configuration file ...
 Warning: Setting GRUB_TIMEOUT to a non-zero value when GRUB_HIDDEN_TIMEOUT is set is no longer supported.
 Found linux image: /boot/vmlinuz-3.13.0-52-generic
 Found initrd image: /boot/initrd.img-3.13.0-52-generic
 Found linux image: /boot/vmlinuz-3.13.0-51-generic
 Found initrd image: /boot/initrd.img-3.13.0-51-generic
 Found memtest86+ image: /boot/memtest86+.elf
 Found memtest86+ image: /boot/memtest86+.bin
 done
 
 ======= Start chroot shell. If nothing to do, type 'exit' or wait 60 seconds
 root@kiku:/# timed out waiting for input: auto-logout
 ======= End   shell
 resize2fs 1.42.9 (4-Feb-2014)
 The filesystem is already 5242880 blocks long.  Nothing to do!
 
 
 /a/misc-nosnap1/distros/ubuntu/clone/restore completed
 kiku:~: 
  • NOTE: If sudo says that the command is not permitted, someone with suitable privileges can install the cl-sudoers-d-uis-oper-reinstall package. Anyone listed in /etc/sudoers.d/cl-90-uis-oper-reinstall will then be able to run restore with a first argument of 'safe', i.e.:
    sudo /a/misc-nosnap1/distros/ubuntu/clone/restore safe reinstall $dev
  • The script will start one or two root shells to allow the system to be checked and tweaked if needed (one chroot'ed to the new FS; the second, if not using 'safe', using the main root FS). In each case, once there is nothing more to do, follow the displayed prompt "type 'exit' or wait 60 seconds": exit the shell or just wait for it to time out.
  • When the reinstall is completed type
    w
    to check who is logged in.
  • Reboot the machine if there are no other users; otherwise liaise with any active users before doing so. Use:
    sudo reboot
  • Once rebooted, check you can log in
  • If the machine is to be given to someone other than the previous owner, ensure the database is up to date and run the following, replacing $crsid with their actual CRSid, e.g. gt19:
    cl-asuser cl-hostid-fix --user $crsid -a
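
Choosing the spare device in the steps above means comparing the 'blkid' listing with the current root shown by "df /". A minimal sketch of that comparison follows; the helper name spare_ext_devices is hypothetical and is not part of the CL tooling. It only lists candidates: data volumes with an ext file system (such as the scratch volume in the example) also appear, so the fsck check described above is still needed before anything is overwritten. It also assumes "df /" reports a device name rather than a UUID.

```shell
#!/bin/sh
# spare_ext_devices ROOT_DEVICE
# Hypothetical helper: reads `blkid` output on stdin and prints every
# device with an ext2/3/4 file system except ROOT_DEVICE.
spare_ext_devices() {
    awk -v root="$1" '
        /TYPE="ext[234]"/ {
            dev = $1
            sub(/:$/, "", dev)      # blkid prints "/dev/md2:" with a colon
            if (dev != root) print dev
        }'
}

# On a live machine this could be combined with the commands above:
#   root=$(df / | awk 'NR == 2 { print $1 }')
#   cl-asuser blkid | spare_ext_devices "$root"

# Self-contained demonstration using the sample listing from this page,
# with /dev/md1 as the current root.
spare_ext_devices /dev/md1 <<'EOF'
/dev/md2: LABEL="kiku_U12.04_md3" UUID="58956853-b55f-475c-bbe1-4177875228d5" TYPE="ext4"
/dev/md1: LABEL="kiku_U14.04_md1" UUID="58956853-b55f-475c-beef-4177875225d8" TYPE="ext4"
/dev/mapper/vg01-scratch: LABEL="scratch" UUID="53ce4c02-dde6-4542-89ca-9a47446143a7" TYPE="ext3"
EOF
```

For the sample listing this prints /dev/md2 and /dev/mapper/vg01-scratch; only /dev/md2 is a spare system partition, which is why the result is a shortlist to check by hand rather than a final answer.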

Install a new machine

Graham Titmus (27/01/2015)

The simplest option is to clone a master disc.

Categorising Keywords

  • Machine moves installation network patching telephones