Ubuntu
This is the process and overview of working through Cobbler to deploy a physical server or VM in the RHEV farm. This guide does not cover any pre-work on a server or inventory, or post-install work with Salt configuration and switch configurations.
If you are deploying a server from scratch, refer to the deployment landing page to get started. We will update the documentation in the future, but for now, ignore references to GTSWD and Simon, as these are no longer supported in Ubuntu 22 onward and have been replaced with Salt.
Add to Cobbler
- Navigate to https://shoemaker.cc.gatech.edu/cobbler_web
- Sign in with GT credentials
- Navigate to "Systems" in the left-hand bar
- If you have a system you know fits what you want:
- Find that system and click "Copy"
- Set the name of your new host
- If there is not a system that fits what you want:
- Click "Create New System" at the top of the page
Verify the Cobbler configuration
- Name: This will set $HOSTNAME on the host
- Profile: Determines OS and base metadata
- Netboot Enabled: Required for first/re-install
- Once the install successfully completes, this will become unchecked and will need to be re-enabled for future reinstalls
- Kickstart metadata: This will determine all of the variables we will use
- By default you'll probably want to fill in the disks that will be used in the kickstart metadata
- Edit Interface: Should have name of network interface
- MAC Address: In 1f:1f:1f:1f:1f:1f format, the MAC address that will be connecting to Cobbler
Save and sync!
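Before pasting a MAC address into the interface form, it can help to check that it is in the colon-separated form shown above. The following is a minimal sketch, not part of Cobbler; the function names are hypothetical, and whether Cobbler itself accepts other separators or letter cases is not assumed here.

```python
import re

# Six colon-separated pairs of hex digits, e.g. 1f:1f:1f:1f:1f:1f
MAC_PATTERN = re.compile(r"[0-9A-Fa-f]{2}(:[0-9A-Fa-f]{2}){5}")


def is_cobbler_mac(value: str) -> bool:
    """Return True if value is in the 1f:1f:1f:1f:1f:1f form."""
    return MAC_PATTERN.fullmatch(value.strip()) is not None


def normalize_mac(value: str) -> str:
    """Strip any separators and re-join as lowercase colon-separated pairs."""
    digits = re.sub(r"[^0-9A-Fa-f]", "", value)
    if len(digits) != 12:
        raise ValueError(f"not a 48-bit MAC address: {value!r}")
    digits = digits.lower()
    return ":".join(digits[i:i + 2] for i in range(0, 12, 2))
```

For example, a MAC copied from a switch as 0018.8b86.d771 or from a label as 00-18-8B-86-D7-71 can be passed through normalize_mac before it is entered.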
Cobbler Metadata Options
Metadata Cheatsheet
When defining variables in Cobbler, always use a single space to separate variables, e.g., var1=value var2=value
Variable Name | Default Value | Info |
---|---|---|
disks | /dev/sda | Comma-separated list of disks to use for OS installation. Defining two disks will create a RAID 1 setup. Prepend with "/dev/". E.g., "disks=/dev/sda,/dev/sdb" |
swap_size | 8G | Size in GiB of the swap space on the machine. Setting to "0" will not create any swap space during installation. |
swap_part | 0 (False) | By default, a swap file ("/swapfile") will be created on the machine. Setting to 1 (True) will create a swap partition instead. Note: If using a swap partition, the value of "swap_size" will be doubled if there are two disks. E.g., "swap_part=1" |
legacy | 0 (False) | By default, UEFI firmware is assumed to be available. Setting to 1 (True) will use Legacy hardware. Note: Installation will FAIL if this value has not been defined & the system is using Legacy firmware. |
boot_size | 2048 | The size of the "/boot" partition in MiB |
boot_size_efi | 1024 | The size of the "/boot/efi" partition in MiB |
root_size | N/A | If defined, the size of the system's root partition ("/") in MiB. If undefined, the "/" partition will grow to fill the remaining disk space after any "tmp", "var", or "usr" definitions |
tmp_size | N/A | If defined, the size of the system's "/tmp" partition in MiB. If undefined, no separate partition/logical volume (LV) will be created |
var_size | N/A | If defined, the size of the system's "/var" partition in MiB. If undefined, no separate partition/LV will be created |
usr_size | N/A | If defined, the size of the system's "/usr" partition in MiB. If undefined, no separate partition/LV will be created |
home_size | N/A | If defined, the size of the system's "/home" partition in MiB. If undefined, no separate partition/LV will be created |
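To see which values an install will end up with, the table can be read as a set of defaults that system-level metadata overrides. The sketch below only illustrates that reading; the real defaults are applied by the autoinstall templates, and the names UBUNTU_DEFAULTS and effective_metadata are hypothetical.

```python
from __future__ import annotations

# Defaults copied from the cheatsheet table above; rows with an "N/A"
# default are left out, meaning no separate partition/LV unless defined.
UBUNTU_DEFAULTS = {
    "disks": "/dev/sda",
    "swap_size": "8G",
    "swap_part": "0",
    "legacy": "0",
    "boot_size": "2048",
    "boot_size_efi": "1024",
}


def effective_metadata(overrides: dict[str, str]) -> dict[str, str]:
    """Overlay the metadata set on a system on top of the table defaults."""
    merged = dict(UBUNTU_DEFAULTS)
    merged.update(overrides)
    return merged
```

Calling effective_metadata({"disks": "/dev/sda,/dev/sdb"}) keeps the 8G swap file and default boot sizes while switching to the two-disk layout.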
Storage configuration
To set custom storage parameters, we currently use keywords in the kickstart metadata to define any additional partitions beyond the default. Metadata values are written in the form label=value label2=value. Entries are separated by a space, and an = with no surrounding spaces assigns the value to the label. Most labels take a single value, but disks accepts as many disks as you list, separated by commas.
Examples:
- disks=/dev/sda,/dev/sdb swap_size=15G will create a RAID mirror of sda and sdb and create a swap.img file at 15G
- disks=/dev/sda,/dev/sdb swap_size=15G swap_part=1 will create a RAID mirror of sda and sdb and create a swap partition at 15G
- disks=/dev/sda,/dev/sdb swap_size=15G var_size=20G will create a RAID mirror of sda and sdb and create a swap.img file at 15G and a /var partition at 20G
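The label=value format described above can be checked with a few lines of Python before it is pasted into the kickstart metadata field. This is a rough sketch under stated assumptions: parse_metadata is a hypothetical helper, not something Cobbler provides, and rejecting a repeated label is a choice made here because how the templates treat duplicates is not documented on this page.

```python
def parse_metadata(text: str) -> dict:
    """Split 'label=value label2=value' into a dict; 'disks' becomes a list."""
    result = {}
    for token in text.split():
        label, sep, value = token.partition("=")
        # A token without '=', or with an empty side, usually means a stray space
        if not sep or not label or not value:
            raise ValueError(f"expected label=value, got {token!r}")
        if label in result:
            raise ValueError(f"duplicate label: {label}")
        result[label] = value.split(",") if label == "disks" else value
    return result
```

A string such as "swap_size = 15G" fails fast here because the spaces around = split it into three tokens, which is the mistake the single-space rule is meant to prevent.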
Regardless of configuration, there will always be two partitions created for grub and /boot. Additionally, regardless of the partitions added below, root will always take up the remaining space on the drive.
Ubuntu Defaults:
- Assumes /dev/sda as the only disk being used
- Everything is installed as an LVM group in a single partition and mounted as /
- A swap file of 8GB
There can be a variety of reasons to have different partition and disk setups between different servers, so these metadata keywords will hopefully accommodate most of what we can imagine seeing.
Accepted Keywords
disks=/dev/diska,/dev/diskb will specify the disks that the OS will be installed on.
- If two disks are entered, the disks will be configured as RAID 1
- If more than two disks are entered, the disks will be configured as RAID 5
swap_part=1 will specify whether swap will be a .img file in root or a separate partition.
- By default, a .img file will be created instead of a partition
- "1" will create swap as a partition
swap_size=10G will specify the size of swap.
- By default this will be a .img file in root unless swap_part is set to 1
- The default size is 8GB
- Values can be entered as MB values with no identifier, G or T for GB/TB sizes, or a percentage of the drive that is being used
tmp_size=10G will specify the size of /tmp.
- Specifying this value will create a partition that will be mounted at /tmp
- By default this partition will not be created and /tmp will exist in the root partition
- Values can be entered as MB values with no identifier, G or T for GB/TB sizes, or a percentage of the drive that is being used
var_size=10G will specify the size of /var.
- Specifying this value will create a partition that will be mounted at /var
- By default this partition will not be created and /var will exist in the root partition
- Values can be entered as MB values with no identifier, G or T for GB/TB sizes, or a percentage of the drive that is being used
home_size=10G will specify the size of /home.
- Specifying this value will create a partition that will be mounted at /home
- By default this partition will not be created and /home will exist in the root partition
- Values can be entered as MB values with no identifier, G or T for GB/TB sizes, or a percentage of the drive that is being used
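The size and RAID rules listed above can be summarized in a short sketch. It is only an illustration of the rules as written: the helper names are hypothetical, whole-number values are assumed, and the choice of 1024 MB per G is an assumption since the page does not say whether the templates use binary or decimal multiples.

```python
from __future__ import annotations


def size_to_mb(value: str, disk_mb: int) -> int:
    """Convert a *_size value to MB: bare number, G, T, or percent of the disk."""
    value = value.strip()
    if value.endswith("%"):
        return disk_mb * int(value[:-1]) // 100
    if value.endswith("G"):
        return int(value[:-1]) * 1024          # assumption: 1 G = 1024 MB
    if value.endswith("T"):
        return int(value[:-1]) * 1024 * 1024   # assumption: 1 T = 1024 G
    return int(value)                          # no identifier means MB


def raid_level(disks: list[str]) -> str | None:
    """Return the RAID level implied by the number of disks, or None for one disk."""
    if not disks:
        raise ValueError("at least one disk is required")
    if len(disks) == 1:
        return None
    return "RAID 1" if len(disks) == 2 else "RAID 5"
```

For instance, var_size=25% on a 400000 MB disk works out to 100000 MB, and three entries in disks imply RAID 5.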
VM Considerations
RAM and ISO size
The .iso for Ubuntu 24 is almost twice the size of the .iso for Ubuntu 22 and, as such, can be too large for the initial RAM disk if default memory parameters are used. Historically we have used 4096 MB as the base RAM for a VM; however, you will need to increase this, at least during provisioning.
RHV 4.2 and Legacy Boot
We should be building in the RHEV 4.4 instance, but if you are building on the 4.2 instance, note that RHEV 4.2 does not support UEFI booting, so VMs will fail to load if the Cobbler profile is set to deploy a UEFI system. The fix is a kickstart metadata option to enable legacy boot:
legacy=1
This will update the autoinstall template for the system to ensure BIOS boot parameters are used during partitioning.
RHEV 4.4 supports UEFI, so using the updated RHEV farm should make the legacy boot process unnecessary (hopefully).
Custom Root Partitioning
If the above metadata storage configuration does not fill the needs you are looking for, there is an alternative snippet that will only accept root_size and disks as metadata entries. The format for these entries is the same as above. This can be used to give an appropriate amount of space for root and leave the rest of the drive unpartitioned where you can configure storage to whatever is appropriate for your use case.
Please note: while the kickstart is tested, I have not had a chance to test the management metadata override yet.
In the template files section under management of your system, add the following line:
/var/lib/cobbler/snippets/ubuntu-meta-data=meta-data /var/lib/cobbler/snippets/ubuntu-user-data-root=user-data
This overrides the following field in the ubuntu-22 profile. Please do not change this at the profile level:
/var/lib/cobbler/snippets/ubuntu-meta-data=meta-data /var/lib/cobbler/snippets/ubuntu-user-data=user-data
RHEL
This is the process and overview of working through Cobbler to deploy a physical server or VM in the RHEV farm. This guide does not cover any pre-work on a server or inventory, or post-install work with Salt configuration and switch configurations.
To start deploying a server from scratch, refer to the deployment landing page. Any references to GTSWD or Simon should be ignored, as Salt has replaced these tools.
Add to Cobbler
- Navigate to https://shoemaker.cc.gatech.edu/cobbler_web
- Sign in with GT credentials
- Navigate to "Systems" in the left-hand bar
- If you have a system you know fits what you want:
- Find that system and click "Copy"
- Set the name of your new host
- If there is not a system that fits what you want:
- Click "Create New System" at the top of the page
Verify the Cobbler configuration
- Name: This will set $HOSTNAME on the host
- Profile: Determines OS and base metadata
- Use "RHEL9.0-x86_64-coc-salt"
- Netboot Enabled: Required for first/re-install
- Once the install successfully completes, this will become unchecked and will need to be re-enabled for future reinstalls
- Kickstart metadata: This will determine all of the variables we will use
- The available metadata options are detailed below
- Kickstart
- Leave as "inherit"
- Edit Interface: Should have name of network interface
- MAC Address: In 1f:1f:1f:1f:1f:1f format, the MAC address that will be connecting to Cobbler
Save and sync!
Metadata Options and Storage Configuration
Variable Name | Default Value | Info |
---|---|---|
disks | sda | Comma-separated list of disks to use for OS installation. Defining two disks will create a RAID 1 setup. Do not prepend with "/dev/". E.g., "disks=sda,sdb" |
swap_size | 8192 | Size in MiB of the swap space on the machine. Setting to "0" will not create any swap space during installation. |
swap_part | 0 (False) | By default, a swap file ("/swapfile") will be created on the machine. Setting to 1 (True) will create a swap partition instead. Note: If using a swap partition, the value of "swap_size" will be doubled if there are two disks. E.g., "swap_part=1" |
legacy | 0 (False) | By default, UEFI firmware is assumed to be available. Setting to 1 (True) will use Legacy hardware. Note: Installation will FAIL if this value has not been defined & the system is using Legacy firmware. |
boot_size | 2048 | The size of the "/boot" partition in MiB |
boot_size_efi | 1024 | The size of the "/boot/efi" partition in MiB |
root_size | N/A | If defined, the size of the system's root partition ("/") in MiB. If undefined, the "/" partition will grow to fill the remaining disk space after any "tmp", "var", or "usr" definitions |
tmp_size | N/A | If defined, the size of the system's "/tmp" partition in MiB. If undefined, no separate partition/logical volume (LV) will be created |
var_size | N/A | If defined, the size of the system's "/var" partition in MiB. If undefined, no separate partition/LV will be created |
usr_size | N/A | If defined, the size of the system's "/usr" partition in MiB. If undefined, no separate partition/LV will be created |
home_size | N/A | If defined, the size of the system's "/home" partition in MiB. If undefined, no separate partition/LV will be created |
A space should separate all variable definitions, e.g., "var1=value var2=value"
Examples:
- (no metadata defined; defaults) - UEFI installation on a single "sda" disk. An 8 GiB "/swapfile" will be created. "/boot" will be 2GiB, and "/boot/efi" will be 1GiB. "/" will take the rest of the disk space.
- "disks=nvme0n1,nvme1n1 swap_part=1 var_size=4096" - UEFI installation on disks "nvme0n1" & "nvme1n1" as RAID 1. Two 8GiB swap partitions will be created, one on each disk, for 16GiB of swap total. "/var" will be a 4GiB partition. "/boot" and "/boot/efi" will use default config. No separate "/usr", "/tmp", or "/home" partitions will be created. "/" will use the rest of the disk space.
- "disks=sdx,sdy swap_size=0 legacy=1 root_size=12288" - Legacy installation on disks "sdx" & "sdy" as RAID 1. No "/swapfile" or swap partition will be created. "/boot" will use the default config. No separate "/boot/efi", "/usr", "/tmp", "/home", or "/var" partitions will be created. "/" will use 12GiB of space. The remainder of the disk will be unallocated.
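The swap arithmetic in these examples (a swap partition on each of two disks doubles the total) can be written out as a small sketch. This is an illustration of the rule in the table, not the kickstart logic itself; total_swap_mib is a hypothetical name and the 8192 MiB default is taken from the RHEL table above.

```python
from __future__ import annotations


def total_swap_mib(disks: list[str], swap_size: int = 8192,
                   swap_part: bool = False) -> int:
    """Total swap in MiB for a given disks/swap_size/swap_part combination."""
    if swap_size == 0:
        return 0                  # swap_size=0 disables swap entirely
    if swap_part and len(disks) == 2:
        return swap_size * 2      # one swap partition per disk
    return swap_size              # single swap file or single partition
```

With disks nvme0n1 and nvme1n1 and swap_part=1, this gives 16384 MiB, matching the 16GiB total in the second example.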
Troubleshooting
Figuring out Disk Names/NIC Names
If you don't know the disk names you need for OS installation, attempt to boot with the defaults and watch the console. The installer will fail to install and enter a crash mode. From a management console, you can enter a shell and run lsblk or ip a to get information about the drives and network interfaces. Update the information in the metadata or network interface, and attempt to boot again.
Ubuntu DHCP Fails #1
When trying to run an Ubuntu kickstart, as of 12/15/22 we have been seeing some systems fail their DHCP configuration step. One of the ways to address this is to enable the "management" variable for the network interface in Cobbler.
Determining this situation applies to you
- The Ubuntu installer will fail, usually with a claim that DHCP timed out
- Switching to the full log at the console (e.g., Alt+F4 to view tty4), you see log messages that the installer tries to DHCP on a different interface than the one you expect (usually this shows up as the installer trying to DHCP on `eno1`)
How to fix
- Switch to another TTY (e.g., Alt + F2) and use ip l to determine the system interface name that you are using for setup
- On https://shoemaker.cc.gatech.edu/cobbler_web, in the system profile, ensure the name of the network interface matches the name the system is using
- SSH into shoemaker, then use the Cobbler CLI to set the interface management variable to true, e.g., cobbler system edit --name=AAA --interface=YYY --management=true
  - Note: the --name parameter should be set to the same name that was input when creating the system profile in Cobbler. Using cobbler system report --name=AAA can help verify you have the right system name
  - Note: the --interface parameter should match the interface string you used in step 2 above
The above references Cobbler CLI commands. For more info on using the Cobbler CLI, look at the following section in this page or the Cobbler documentation.
In debugging the above, Tim Trent found the most success using the same interface name as the system used (e.g., if ip l lists an interface as ens8f0, go ahead and use that in Cobbler). It may not be necessary, but unless something changes go ahead and try that. When using a different interface name, Tim got a separate failure from the Ubuntu installer and netcfg segmentation faults in the logs.
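If this fix needs to be applied to several hosts, the command line from the steps above can be assembled programmatically. The sketch below only builds and prints the command; it uses just the flags quoted in this section (--name, --interface, --management=true), and management_command is a hypothetical helper name.

```python
import shlex


def management_command(system_name: str, interface: str) -> list:
    """Build the 'cobbler system edit' argument list from the steps above."""
    return [
        "cobbler", "system", "edit",
        f"--name={system_name}",
        f"--interface={interface}",
        "--management=true",
    ]


if __name__ == "__main__":
    # Print the command for review instead of running it
    print(shlex.join(management_command("AAA", "ens8f0")))
```

Keeping the arguments as a list means the same value could be handed to subprocess.run on shoemaker without shell quoting issues, once the printed command has been checked.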
Get NBP file but no PXEboot #1
When trying to run a kickstart you may see the system claim an NBP file has been downloaded but then instantly fail and proceed to other boot options (e.g., PXE on another interface, UEFI shell, etc.). In this variant, cc-booter already has an interface on the appropriate subnet, and the fix is on the switch port the server is connected to.
Determining this situation applies to you
- The system makes it to the PXE step on the appropriate interface and recognizes that there is media connected and a link on the interface
- The system displays a message that the "NBP File" has downloaded
- The system does not display any further messages for that interface and continues through further boot options
- /var/log/messages on shoemaker.cc.gatech.edu indicates a tftp session has opened from your expected IP address, but this tftp session is instantly "closed by user"
- This will look something like the following:
Dec 15 15:00:00 shoemaker in.tftpd[0018]: RRQ from 130.207.1.1 filename pxelinux.0
Dec 15 15:00:00 shoemaker in.tftpd[0018]: Error code 8: User aborted the transfer
Dec 15 15:00:00 shoemaker in.tftpd[9998]: RRQ from 130.207.1.1 filename pxelinux.0
Dec 15 15:00:00 shoemaker in.tftpd[9998]: Client 130.207.1.1 finished pxelinux.0
- CC-Booter has an interface on the appropriate subnet
How to fix
- Verify portfast is enabled on the port the server is connected to (see Cisco IOS and NXOS Basics)
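Spotting the aborted transfer in /var/log/messages can be scripted. The sketch below is written against the sample log lines shown above only; the regular expression and the function name are assumptions, and real in.tftpd output on shoemaker may differ in detail.

```python
import re

# Matches the 'in.tftpd[PID]: message' part of a syslog line
TFTPD_LINE = re.compile(r"in\.tftpd\[(?P<pid>\d+)\]: (?P<msg>.*)$")


def aborted_transfers(lines, client_ip):
    """Return PIDs of tftpd sessions where client_ip's request was aborted."""
    requested = set()
    aborted = []
    for line in lines:
        match = TFTPD_LINE.search(line)
        if match is None:
            continue
        pid, msg = match.group("pid"), match.group("msg")
        if msg.startswith(f"RRQ from {client_ip} "):
            requested.add(pid)
        elif msg.startswith("Error code 8") and pid in requested:
            aborted.append(pid)
    return aborted


SAMPLE = [
    "Dec 15 15:00:00 shoemaker in.tftpd[0018]: RRQ from 130.207.1.1 filename pxelinux.0",
    "Dec 15 15:00:00 shoemaker in.tftpd[0018]: Error code 8: User aborted the transfer",
    "Dec 15 15:00:00 shoemaker in.tftpd[9998]: RRQ from 130.207.1.1 filename pxelinux.0",
    "Dec 15 15:00:00 shoemaker in.tftpd[9998]: Client 130.207.1.1 finished pxelinux.0",
]
```

Running aborted_transfers(SAMPLE, "130.207.1.1") flags only the first session, since the second one finished normally.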
Get NBP file but no PXEboot #2
When trying to run a kickstart you may see the system claim an NBP file has been downloaded but then instantly fail and proceed to other boot options (e.g., pxe on another interface, UEFI shell, etc.). This was caused by the system cc-booter not having an interface on the same VLAN as the system being deployed.
Determining this situation applies to you
- The system makes it to the PXE step on the appropriate interface and recognizes that there is media connected and a link on the interface
- The system displays a message that the "NBP File" has downloaded
- The system does not display any further messages for that interface and continues through further boot options
- /var/log/messages on shoemaker.cc.gatech.edu indicates a tftp session has opened from your expected IP address, but this tftp session is instantly "closed by user"
- This will look something like the following:
Dec 15 15:00:00 shoemaker in.tftpd[0018]: RRQ from 130.207.1.1 filename pxelinux.0
Dec 15 15:00:00 shoemaker in.tftpd[0018]: Error code 8: User aborted the transfer
Dec 15 15:00:00 shoemaker in.tftpd[9998]: RRQ from 130.207.1.1 filename pxelinux.0
Dec 15 15:00:00 shoemaker in.tftpd[9998]: Client 130.207.1.1 finished pxelinux.0
How to fix
- Add a new interface for cc-booter that is on the same VLAN as the system you are having trouble deploying
There are some issues in RHEVM about having too many interfaces for a given machine, so these will be added per VLAN as needed rather than all at once. Please check with the infrastructure team before modifying cc-booter.
GRUB Fails to install on nvme
When trying to run an Ubuntu kickstart, the installer will often fail at the grub-install step if installing onto an nvme drive. This is due to a hard-coded reference to /dev/sda in the default Ubuntu kickstart.
Determining this situation applies to you
- You have selected to install the OS on an nvme device, e.g., /dev/nvme0n1
- The installer makes it to the grub step and then fails
- Alternate TTY view of installer logs (Alt+F4) mentions installing on /dev/sda
How to fix
- In Cobbler, modify the system profile for the affected machine. Under the "general" tab, select the `...coc-ubuntu...nvme.seed` kickstart
- Restart and try the PXE boot again
RAID partition fails on VM
When trying to run an Ubuntu kickstart, the installer may fail at the disk-partitioning step while trying to set up a software RAID. This is due to the base kickstart having some configuration issues when using RAID versus other partitioning schemes.
Determining this situation applies to you
- You are installing Ubuntu onto one of the CoC RHEVM VMs
- The installer makes it to the partitioning step and asks for input
- On selecting "Yes", the partitioner starts, but then fails with an "Error while setting up RAID"
- Alternate TTY view of installer logs (Alt+F4) mentions the following errors:
md-devices: mdadm: No arrays found in config file or automatically
partman: No matching physical volumes found
...
partman-auto-raid: Error: No recipe specified in partman-auto-raid/recipe
How to fix
- In Cobbler, modify the system profile for the affected machine. Under the "general" tab, select the `...coc-ubuntu...lvm.seed` kickstart
- Restart and try the PXE boot again
IPMI Remote console fails
Sometimes the remote console on the IPMI interface does not show the screen.
Determining if the situation applies to you
- You go to mgt-<hostname>.cc.gatech.edu and open up the remote console by clicking Remote Control > Launch Console
- You have verified that you set the current interface to HTML5 (and not Java plugin)
- You see a blank white screen and pressing Alt+Ctrl+F2 does not change anything
How to fix
- Do a BMC reset from the IPMI page, found at Maintenance > BMC Reset. Select the option to preserve user settings. Even with that option selected, the reset will still reset the BMC ADMIN password, which we will need to change back to the TSO IPMI password. The original BMC password can be found on the server (either on a tag or on the motherboard). To reset the ADMIN password without the original password, follow the steps below.
- Run the following on the host machine
- install ipmitool
root@host:~# apt install ipmitool
- Find out the user_id of the ADMIN user
root@host:~# ipmitool user list
- Use the user_id from above to reset the password
root@host:~# ipmitool user set password <user_id> <new_password>
- Now log in to the IPMI console with username ADMIN and the usual TSO IPMI password.
Leftover EFI Partitions break GRUB
When trying to deploy an OS onto a machine with disks that have not been wiped, GRUB can get confused and fail to load after install. This occurs when an existing EFI partition holds a partition table that disagrees with what the bootloader thinks should exist. This is particularly common in the Ubuntu 22 deployment as, assuming there are multiple disks, the only disks that will be wiped are those specified for the install and any additional disks (which may have leftover EFI partitions) are ignored by the autoinstall deployment.
Determining this situation applies to you
- You have completed the initial PXEBoot/kickstart of a machine without error and proceeded to the first reboot
- Instead of displaying a grub menu, the grub shell appears
- Using a sequence of commands like the following, you are able to successfully boot into the system
grub> root=(lvm/benegesserit--vg)
grub> linux /boot/vmlinuz-X.Y.Z-generic root=/dev/benegesserit-vg/root
grub> initrd /boot/initrd.img-X.Y.Z-generic
grub> boot
- Your system shows correct partitions via lsblk or fdisk -l, but there is an existing, unused partition from a previous OS install's EFI files.
How to fix
- Wipe/delete the old partition table from the unused disk
There are multiple ways to clear out the old disk, from full wipes like dd if=/dev/zero of=/dev/sdc to commands like shred, but it is sufficient to simply use a tool like fdisk to create a new, empty partition table on the device or to clear just the partition that has the EFI/boot data.
Enabling new users to have edit access on cobbler systems
- Edit /etc/cobbler/users.conf to include the name of the user you want to have edit access.
- Restart the service with systemctl restart cobblerd
Cobbler Command Line Documentation
This documentation is not strictly necessary and may be out of date, since we primarily use the web interface, but it is kept here for historical purposes.
Cobbler Profiles
Cobbler allows one to describe how a host should be installed using several "objects": distros, profiles, repos, and systems. The distro object is a representation of a Linux distribution. A profile is a child object of a distro, but it has a lot of variables that are used to customize an installation. The repo object represents a software repository that can be used during installation. Finally, the system object represents a physical host and is associated with a profile. Cobbler uses the variables of the profile and the variables of the system to generate the necessary information to PXE boot a host. That's probably confusing, but hopefully the examples will clear it up.
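One way to picture the relationship between these objects is as nested records where a system's own variables take precedence over those of its profile. The classes below are a conceptual sketch only, not Cobbler's implementation or API; the class layout, the distro name, and the profile-level kickstart path are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Distro:
    name: str


@dataclass
class Profile:
    name: str
    distro: Distro
    variables: dict = field(default_factory=dict)


@dataclass
class System:
    name: str
    profile: Profile
    variables: dict = field(default_factory=dict)

    def resolved(self) -> dict:
        """Profile variables, with any system-level values overriding them."""
        merged = dict(self.profile.variables)
        merged.update(self.variables)
        return merged


distro = Distro("f14-i386")  # hypothetical distro name
profile = Profile(
    "f14-i386-desktop-coc",
    distro,
    {"kickstart": "/var/lib/cobbler/kickstart/default.ks"},  # hypothetical path
)
host = System(
    "strangepork",
    profile,
    {"kickstart": "/var/lib/cobbler/kickstart/strangepork.ks"},
)
```

Here host.resolved() reports the system-level kickstart, while a system with no variables of its own would fall back to whatever its profile defines.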
Adding a new system
To add a new system to Cobbler, specify the MAC address, DNS name, and profile:
# cobbler system add --name=strangepork --dns-name=strangepork.cc.gatech.edu --mac=00:18:8b:86:d7:71 --profile=f14-i386-desktop-coc
That's it! Cobbler will create a pxelinux config file for the host linked to the MAC address. When the host is network booted, it will get the customized config file that has the instructions for where the appropriate kernel and initrd are for the f14 distribution (in this case), as well as a URL to a customized kickstart for the f14-i386-desktop-coc profile. To see a list of Cobbler profiles, just run cobbler profile list.
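When checking that the per-host config file exists, it helps to know how PXELINUX names files tied to a MAC address: the ARP hardware type (01 for Ethernet) followed by the MAC in lowercase hex with dashes. The sketch below assumes Cobbler follows that standard PXELINUX convention on shoemaker, which this page does not confirm; the function name is hypothetical.

```python
def pxelinux_config_name(mac: str) -> str:
    """Return the PXELINUX per-host config filename for an Ethernet MAC."""
    # '01' is the ARP hardware type for Ethernet; the MAC follows in
    # lowercase hex with dashes instead of colons.
    return "01-" + mac.lower().replace(":", "-")
```

For the strangepork example above, the expected filename under the pxelinux.cfg directory would be 01-00-18-8b-86-d7-71.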
Editing a system
To edit a system, use the same options as add (to see a list of options, type cobbler system --help), except specify the edit subcommand:
# cobbler system edit --name=strangepork --profile=f14-i386-desktop-test
The edit subcommand cannot change a system's name, but there is a rename subcommand that can.
Modifying kickstarts
To modify the kickstart, there are two options. If a custom kickstart is needed for just a one-off host install, the easiest thing to do is override the kickstart variable on the host:
# cobbler system edit --name=strangepork --kickstart=/var/lib/cobbler/kickstart/strangepork.ks
To make a new kickstart available to a group of hosts, the best approach is to create a new profile. To copy an existing profile:
# cobbler profile copy --name=f14-i386-desktop-coc --newname=f14-i386-desktop-test --kickstart=/var/lib/cobbler/kickstart/test.ks
If just overriding the kickstart, inheriting from another profile is probably better:
# cobbler profile add --name=f14-i386-desktop-test --parent=f14-i386-desktop-coc --kickstart=/var/lib/cobbler/kickstart/test.ks