Service Desk Knowledgebase: XenE
Latest revision as of 09:16, 3 May 2016
This is the XenE content page of the CL Wiki Service Desk Knowledgebase. Its purpose is to provide information to the Service Desk team on how to handle problems and requests about this CL service. If you are involved with the provision of this CL service, please feel free to add to the knowledge about it.
If CL staff need to tell the Service Desk team about problems with this service please email
sys-admin-aside@cl.cam.ac.uk.
Key Service Description & URLs
- Xen virtual machine monitor
- http://www.wiki.cl.cam.ac.uk/clwiki/SysInfo/HelpDesk/XenE - CL Sys-Admin Documentation
- http://www.wiki.cl.cam.ac.uk/clwiki/SysInfo/XeneAdmin - CL Documentation
- Computer Laboratory News (Twitter use @UC_CL_SysAdm)
CL Customer Documentation
Further CL Sys-Admin Documentation
- ???
Underpinning Services
- ??? - Any supporting or underpinning services
Customer-base for this Service
- All staff and students of the Computer Laboratory
Costs
- Free to all current staff and students of the Computer Laboratory
SLA
- N/A
Service Desk Call Handling Procedure
- RT tickets should be processed like any others: ascertain what the actual problem is, check that the server / service name is in the Subject:, etc. Look for simple things such as the machine being powered off, file system errors causing the file system to become read-only (e.g. after a network or filer outage), or a shortage of RAM (look for the 'Out Of Memory killer', oom-killer). If you can log in and see other users or long-running jobs, tread carefully; otherwise a reboot is probably a good start if there is nothing obvious. If it's not a simple case, add a comment asking for CO CL help, and if there is no response during the HD period, escalate it by changing the Queue to backoffice with the Owner set to Nobody and the Status as new. Tell the requestor:
This problem is not straightforward, so I am passing this request over to the experts who will be in contact when they need further info, or have some news.
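The two log symptoms mentioned above (oom-killer activity and a file system remounted read-only) can be looked for mechanically. The following is only a minimal sketch: the function name scan_log is hypothetical, and the patterns are assumptions based on typical Linux kernel messages, so they may need adjusting for a particular VM.

```python
import re

# Assumed patterns, based on common Linux kernel log wording.
# They are illustrative, not an exhaustive or authoritative list.
PATTERNS = {
    "oom-killer": re.compile(r"oom-killer|out of memory", re.IGNORECASE),
    "read-only filesystem": re.compile(r"remounting filesystem read-only", re.IGNORECASE),
}


def scan_log(lines):
    """Return {symptom: [matching lines]} for each symptom seen in `lines`."""
    findings = {}
    for line in lines:
        for symptom, pattern in PATTERNS.items():
            if pattern.search(line):
                findings.setdefault(symptom, []).append(line.rstrip("\n"))
    return findings
```

One way to use it is to pass in the lines of a saved log file, e.g. scan_log(open(path)), where path is whichever kernel log or dmesg capture is available on the VM. An empty result only means these particular patterns were not found.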
Checking VMs are in expected state
- On an omnipotent machine run cl-onserver --xe cl-vm-status
- Restart any machines which should be running
Checking if VM is operational
iwm21 (talk) 14:02, 9 July 2015 (BST) (9th July 2015)
- If a server is basically broken (as this one appears to be - cannot ssh in) the first step is to try 'Reboot' from XenCenter.
- If it doesn't reboot after a while, right click on the VM and select 'Force reboot'.
- If it comes up needing a fsck, let it run and accept suggested fixes.
- If it does not respond to input at the 'fsck' stage, change it to boot with 'init=/bin/sh', fsck, then reset the boot option and reboot.
- If it doesn't work after the above (try ssh, and expected services such as web) escalate.
Otherwise ask the user to check that it is OK.
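The "try ssh, and expected services such as web" check can be approximated from another machine by testing whether the relevant TCP ports accept connections. This is a sketch under stated assumptions: the function names are hypothetical, and ports 22 and 80 are example defaults standing in for whatever services the VM is expected to offer.

```python
import socket


def port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name-resolution failures.
        return False


def check_services(host, ports=(22, 80)):
    """Map each port to True/False; 22 (ssh) and 80 (web) are example defaults."""
    return {port: port_open(host, port) for port in ports}
```

A successful connection only shows that something is listening on the port, not that the service behind it is healthy, so the user should still be asked to confirm.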
A Xen VM is not running
The procedure below is sufficient to cover the majority of problems reported with VMs not running/available.
- First make sure you have the Xen console running and the pools visible and connected (see the section Accessing the Xen Console).
- In the search bar at the top of the left pane type part of the name of the VM you are looking for.
- Expand any pools and servers that are displayed until you see the VM of interest.
- If the VM cannot be found then escalate as above.
- If the machine is stopped then start it and check it boots correctly by selecting the console tab and monitoring progress.
- If it is running select it and then select the console tab in the main pane.
- You should then see what is wrong. Linux servers can get stuck performing a file system check and need confirmation before they will fix up errors; allow that to proceed.
- Any other error needs escalating.
A Xen VM needs more disc space
- First make sure you have the Xen console running and the pools visible and connected (see the section Accessing the Xen Console).
- In the search bar at the top of the left pane type part of the name of the VM you are looking for.
- Expand any pools and servers that are displayed until you see the VM of interest.
- If the VM cannot be found then escalate as above.
- Click on the Storage tab in the main pane and look at the disc sizes.
- Click on the Console tab and then login
- Check the disc usage with df -h to see whether the file system is using all the space the disc has. If the file system is much smaller than the disc, you can probably advise the owner to grow it using cl-asuser resize2fs. Be aware that there might be multiple partitions on a disc, in which case they are labelled with a final number; if the whole disc is used, the name ends in a letter. This is only a concern if there are numbers other than 1 associated with a disc; check with sudo blkid.
- If it is as big as it can be, arrange with the user when the machine can be shut down. When it can, use the shutdown button at the top.
- When the machine is halted (shows as red in left pane) click on the Storage tab again.
- Select the disc to be expanded with a left click and then click on the Properties button.
- Click on the Size and Location section and then increase the size as needed.
- Click OK to close the properties window and then click on Start to restart the VM.
- Once the machine is running again, select the Console tab, log in and resize the disc to the requested size using something like the following (substituting the correct disc name):
cl-asuser resize2fs /dev/xvda1 +2G
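The comparison made earlier in this procedure (is the file system already using all of the disc, or is there room to grow it?) can be sketched in Python as below. The function names and the 5% slack threshold are hypothetical, chosen only for illustration. The disc size has to be read from the Storage tab, because it is not visible to shutil.

```python
import shutil

GIB = 1024 ** 3


def usage_summary(path="/"):
    """Return (total_gib, used_gib, percent_used) for the file system holding `path`."""
    usage = shutil.disk_usage(path)
    percent = 100.0 * usage.used / usage.total if usage.total else 0.0
    return usage.total / GIB, usage.used / GIB, percent


def has_unused_disc_space(fs_bytes, disc_bytes, slack=0.05):
    """True if the file system is smaller than the disc by more than `slack`.

    `slack` (5% here) is an arbitrary allowance for partitioning overhead.
    """
    return fs_bytes < disc_bytes * (1.0 - slack)
```

The percentage can differ slightly from the Use% column of df -h, which excludes blocks reserved for root from its calculation.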
A Xen VM needs more memory
Check with the requestor when the VM can be shut down. When it can:
- First make sure you have the Xen console running and the pools visible and connected (see the section Accessing the Xen Console).
- In the search bar at the top of the left pane type part of the name of the VM you are looking for.
- Expand any pools and servers that are displayed until you see the VM of interest.
- If the VM cannot be found then escalate as above.
- Select the VM in the left pane, click on the Console tab and check you have the correct machine.
- Shut it down using the red button in the top bar.
- When the machine shows with a red label in the left pane click on the Memory tab then on the Edit button.
- Alter the memory as requested (we always use fixed sizes) and click OK.
- Restart the machine and check it shows the appropriate value in the Memory tab.
- Select the Console tab and verify that the machine is running.
- Escalate any faults as above.
Accessing the Xen Console
- Use Remote Desktop Connection (RDP) to go to ts01.ad.cl.cam.ac.uk or ts00.ad.cl.cam.ac.uk and login using your CRSid@ad.cl.cam.ac.uk account
- Use [Start] > All Programs > Citrix > Citrix XenCenter
- Wait - Citrix XenCenter takes a long while to start up!
The first time it is run (or if no Xen pools are found), the pools of machines need to be added using:
- Click XenCenter (in left panel)
- Click Add a server (in right panel)
- Set Server: xene-pool1.cl.cam.ac.uk
- Set User name: root
- Set Password: "Enable password"
- Click [Add]
- Click Add a server (in right panel)
- Set Server: xene-pool2.cl.cam.ac.uk
- Set User name: root
- Set Password: "Enable password"
- Click [Add]
- Click Add a server (in right panel)
- Set Server: xene-pool3.cl.cam.ac.uk
- Set User name: root
- Set Password: "Enable password"
- Click [Add]
- Click Add a server (in right panel)
- Set Server: xene-pool4.cl.cam.ac.uk
- Set User name: root
- Set Password: "Enable password"
- Click [Add]
To finish:
- Close down Citrix XenCenter with [X]
- Use [Start] [Log off] to terminate your RDP to the TS01 terminal server
Contacts
Primary
- pb22@cl.cam.ac.uk if URGENT for Piete Brooks
- ???@lists.cam.ac.uk (Goes to ???)
- Tel: ???
Other
Availability
- Monday:
- Tuesday:
- Wednesday:
- Thursday:
- Friday:
- Saturday: Closed
- Sunday: Closed
Additional CL Staff Resources
- ???
Hints, Tips & Known Issues
How can you tell if it's a Xen Virtual Machine?
Piete Brooks (1 Sept 2015)
http://www.wiki.cl.cam.ac.uk/clwiki/SysInfo/VirtualMachines has: "Determining that a machine is using xen, VirtualPC or VMWare" which is rather geeky and out of date, but it still holds.
The tricky bit is to find its MAC address - you can:
- ping it from a machine on the same VLAN (I use ware or sxp12) and then use arp
- look in ownfiles
- look in the arpwatch logs
- look at the IPv6 address
- look in the Inventory Database
The fifth option (the Inventory Database) should also tell you that it is 'Virtual Machine' 'XenServer' 'xene-pool' 'Xen' etc. For example:
https://dbwebserver.ad.cl.cam.ac.uk/SCG/Equipment/InventoryDetails.aspx?InventoryNo=18968
has Equipment: Virtual machine and Supplier: XenServer for the server www-netfpga.
Also, at the bottom, the Media for any network should be virtual
If the serv*ER* (rather than serv*ICE*) is not in the Inventory database, it's probably Xen (and missed the 'bulk add' a while back - so please add it)
The other approach is to type (part of) the name into the XenCenter search box (see Accessing the Xen Console), and if it finds nothing, it's probably not Xen.
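For the "look at the IPv6 address" route to the MAC address, an autoconfigured (modified EUI-64) IPv6 address embeds the MAC, which can then be compared against a vendor prefix. The sketch below rests on assumptions: the function names are hypothetical, and 00:16:3e is the prefix registered to XenSource, which a given VM may or may not use, so a non-match proves nothing.

```python
import ipaddress


def mac_from_slaac(address):
    """Return the MAC embedded in a modified EUI-64 IPv6 address, or None."""
    iid = ipaddress.IPv6Address(address).packed[8:]  # low 64 bits: interface identifier
    if iid[3:5] != b"\xff\xfe":
        return None  # not EUI-64 derived (e.g. a privacy or manually assigned address)
    # Undo the EUI-64 transform: flip the universal/local bit, drop the ff:fe filler.
    mac = bytes([iid[0] ^ 0x02]) + iid[1:3] + iid[5:8]
    return ":".join(f"{byte:02x}" for byte in mac)


def looks_like_xen(mac, prefix="00:16:3e"):
    """True if `mac` starts with the assumed Xen vendor prefix."""
    return mac is not None and mac.lower().startswith(prefix)
```

For example, mac_from_slaac("2001:db8::216:3eff:fe12:3456") gives 00:16:3e:12:34:56 (the address uses the documentation prefix and is not a real host).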
Categorising Keywords
- XenE Xen XenAppliance XenEnterprise XenServer