Simon stream creation best practices

1) Pre-requisites

Simon streams are (almost always) written as bash scripts, so you should be familiar with bash scripting!
You can write Python and Perl scripts as well; just mind your shebangs!
You must have SSH access to dev-gtswd2.cc.gatech.edu.
- Streams are located at /s1/gtswd
It is best practice to run shellcheck to test your scripts before deploying them (though it will probably complain about undefined variables).

1. Structure

Current best practice is to separate configuration from installation. There are two levels of abstraction you can use:

- Create separate patches for installation and configuration in a single stream (e.g., x86-rhel7/coc-sw-web-phpxx)
- Create separate streams containing only software or configuration patches (e.g., x86-rhel7/coc-sw-base and x86-rhel7/coc-config-base)

Both approaches are acceptable, and each comes with its own set of benefits.

1.1 Stream Names

When naming a stream, best practice is to format the name in the following way:

coc-[sw | config]-[$GROUP]-$PURPOSE

Caveat: If the stream is for software that is conceivably applicable to units outside of coc, it can be best practice to omit the `coc-` prefix.

Adding `sw` or `config` to the stream name is optional, but helpful if a stream is dedicated to a specific set of tasks.
The $GROUP in the stream name is also optional and should specify which broad group the stream is intended for. Examples include institutional units, such as GVU, CSE, and CERCS, or organizational units, such as Web, Ops, and Research.

The $PURPOSE portion of the stream name should be a human-readable summary that captures the essence of the stream (e.g., fireeye for a stream that installs FireEye, or workstation for a stream that sets up a workstation deployment).
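
As a minimal sketch of the naming format, the helper below assembles a stream name from its parts. `build_stream_name` is a hypothetical function written for illustration; it is not part of Simon, and it always adds the `coc-` prefix.

```shell
#!/bin/bash
# build_stream_name is a hypothetical helper, not part of Simon.
# Usage: build_stream_name KIND GROUP PURPOSE
#   KIND:    "sw", "config", or "" to omit it
#   GROUP:   e.g. "web", or "" to omit it
#   PURPOSE: required, e.g. "fireeye"
build_stream_name() {
    kind=$1 group=$2 purpose=$3
    if [ -z "$purpose" ]; then
        echo "PURPOSE is required" >&2
        return 1
    fi
    case "$kind" in
        ""|sw|config) ;;
        *) echo "KIND must be sw, config, or empty" >&2; return 1 ;;
    esac
    name="coc"
    if [ -n "$kind" ]; then name="$name-$kind"; fi
    if [ -n "$group" ]; then name="$name-$group"; fi
    echo "$name-$purpose"
}

build_stream_name sw web php       # prints coc-sw-web-php
build_stream_name "" "" fireeye    # prints coc-fireeye
```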

1.2 Patch Names

Patches in a stream are run sequentially, ordered by the numeric prefix of the patch name, including decimals (0 -> 1, 1 -> 1.2, 1.2 -> 2). Each patch name must start with a number (`{X}X_`); human-readable naming can and should be appended. In general, if a patch depends on software or configuration that will be handled within the same stream, simply name the patch to indicate a later (i.e., higher-numbered) position.

Where possible, current style is to include in the patch name both the process to be carried out and the artifact it operates on, e.g., 1_install_httpd, 2_apache_templates.
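
To preview the order a set of patch names would run in, a numeric sort on the prefix is one option. This is only a sketch: `order_patches` is a hypothetical helper, the patch names are made up, and it assumes the prefix is separated from the rest of the name by `_`. It does not call Simon and may not match Simon's own ordering in every edge case.

```shell
#!/bin/bash
# order_patches: hypothetical helper that sorts patch names read from stdin
# by their numeric prefix (the text before the first "_").
# LC_ALL=C makes "." the decimal point, so 1.2 sorts between 1 and 2.
order_patches() {
    LC_ALL=C sort -t_ -k1,1n
}

printf '%s\n' 10_cleanup 2_apache_templates 1.2_vhost_config 0_base_packages 1_install_httpd \
    | order_patches
```

A plain alphabetical sort would place 10_cleanup before 2_apache_templates, which is why the sketch sorts numerically.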

1.3 Patch Directories

When a patch directory is used, it should be named as follows:

$ORDER.dir

Where `$ORDER` matches the step number of the patch that will use that directory.

2. Writing a Patch

In general, remember to create new streams as appropriate. A new stream is primarily a set of directories, so the main issue to learn is writing new patches.

2.1 Rule #1: CASE

The first rule for writing patches is Copy And Steal Everything. Work with Simon is generally geared towards making code reusable, and even specialized scripts often need only simple edits. Where possible, copying existing work will make your life much easier and is fully encouraged.

2.2 The Patch Header

The patch header is a block comment that always appears at the top of the script. A sample header is:

########################
# 1.awesomesoftware
# 
# 19700101 gburdell3
#FLAGS="dir"
#Depends on: coc-sw-base/0
########################

The following sections deconstruct components of the header.

2.2.1 Patch Name

The first line of the header should simply be the name of the patch. This is to improve readability.

2.2.2 Commentary

The commentary is generally a remark on what the patch is supposed to do, if the name doesn't make it immediately obvious.

2.2.3 Date & User

The next text-based line (usually after an empty comment, for readability) is the date the patch was written and the username of the author. This helps track the context of the patch as well as who the point of contact should be if there are questions about it.

2.2.4 Flags

A variety of flags can be applied to a patch. Simon parses them only if no empty line appears anywhere before the line specifying the flags, which is why we include them in the header. Some of the available flags are:

- dir: There is a directory containing files which should be copied to the local machine when the script runs. The directory can be referenced in the script by the bash variable `${GTSWD_PATCH_DIR}`.
- zip/tgz/etc.: An 'archive extension' flag indicates that there is an archive which should be downloaded and uncompressed at script runtime. This is not commonly used anymore (you can simply put the archive in a patch directory and access it with the `dir` flag), but it is perfectly acceptable.
- reboot: If a patch requires a reboot, this flag can be used to 'schedule' a reboot in the Simon patch process. The machine will not attempt to reboot immediately; rather, it will save the reboot request until a patch requests a change in machine state.
- rebootnow: Similar to the `reboot` flag, but this instructs Simon to stop execution and reboot right away. Conceptually, this should be included in scripts where the machine should not execute any further commands until a reboot occurs.
- boottime: This informs Simon that the script is ideally run at boot time. This is the primary example of a 'state-change' request in Simon patches: if a script has requested a reboot, that reboot will happen before the `boottime`-flagged patch executes.
  - `boottime` will not schedule a reboot, so the flag is often combined with `reboot` or `rebootnow` in order to ensure a reboot happens. This is not required!
  - If you manually initiate a Simon run, you can force all boottime scripts to run without needing a state-change by using the command `simon -b`, e.g., `/var/gtswd/simon/simon -b sync-and-nightly`

Note: Flags are not case-sensitive; simply ensure the flags line is prefaced by the `#` symbol.
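
Because an empty line before the flags line keeps Simon from parsing it, a quick check before deploying can help. The sketch below is not Simon's parser: `flags_parseable` is a hypothetical lint helper, and it conservatively treats a whitespace-only line as empty, which is an assumption.

```shell
#!/bin/bash
# flags_parseable FILE: hypothetical lint helper (not Simon's parser).
# Succeeds only if FILE contains a "#FLAGS=" line (any letter case) and no
# empty line appears before it. Whitespace-only lines count as empty here.
flags_parseable() {
    awk '
        /^[ \t]*$/ { exit }                                  # empty line came first
        tolower($0) ~ /^#[ \t]*flags=/ { found = 1; exit }   # flags line came first
        END { exit !found }
    ' "$1"
}

# Demo using the sample header from section 2.2.
sample=$(mktemp)
cat > "$sample" <<'EOF'
########################
# 1.awesomesoftware
#
# 19700101 gburdell3
#FLAGS="dir"
#Depends on: coc-sw-base/0
########################
EOF

if flags_parseable "$sample"; then
    echo "flags line OK"
else
    echo "flags line missing or preceded by an empty line"
fi
rm -f "$sample"
```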

2.2.5 Dependencies

Patch dependencies can be specified in a manner similar to flags. They follow the same conventions as flags, but the syntax is:

# Depends on: $STREAM/$PATCH_NUMBER

Note: The dependency only requires the patch number. The full name of the patch CAN be included for readability, but if it gets renamed things will break, so usually we only use the number.
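
When reviewing a stream, it can be handy to list what each patch depends on. The following is a rough sketch under stated assumptions: `list_dependencies` is a hypothetical helper that reads the header comment with grep and sed, the sample file is made up, and nothing here reflects how Simon itself resolves dependencies.

```shell
#!/bin/bash
# list_dependencies FILE: hypothetical helper that prints the
# STREAM/PATCH_NUMBER part of every "# Depends on:" comment line in FILE.
list_dependencies() {
    grep -i '^# *depends on:' "$1" | sed 's/^[^:]*: *//'
}

# Demo with a made-up patch header.
patch=$(mktemp)
cat > "$patch" <<'EOF'
########################
# 1.awesomesoftware
#
# 19700101 gburdell3
#FLAGS="dir"
#Depends on: coc-sw-base/0
########################
EOF

# Split each dependency into its stream and patch number.
list_dependencies "$patch" | while IFS= read -r dep; do
    echo "stream=${dep%/*} patch=${dep##*/}"
done
rm -f "$patch"
```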

2.5 The Patch Variables and GTSWD Functions

The beginning of a patch usually has a few common elements, mostly importing script functions and setting up variables. Generally speaking, these are copied from other scripts, but a short description is provided here:

- For software installs, variables are often used to specify the package list, service list, and appropriate install command. This allows the main logic of the code to be reused in new patches, where the only changes required for package/platform compatibility are in those three variables.
- PAUSENSLCD is a boolean variable commonly used with scripts that invoke processes that will modify users (e.g., apt/rpm installs of services). If you include this variable, it is expected that a "true" value will cause the script to stop/restart nslcd during the script run in order to avoid conflicting operations.
- Most scripts source the main GTSWD functions through a common line at the beginning. (maybe include here?)
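
The variable block described above might look roughly like the sketch below. Every value is a placeholder, the use of `systemctl` to stop and start nslcd and to restart services is an assumption, and the `run` wrapper with its `DRY_RUN` switch is added purely for this example so it can be run safely. The line that sources the GTSWD functions is left out because it is not given here.

```shell
#!/bin/bash
# Placeholder values; adjust per package and platform.
PACKAGES="httpd mod_ssl"            # package list
SERVICES="httpd"                    # service list
INSTALL_COMMAND="yum -y install"    # platform-specific install command
PAUSENSLCD="true"                   # stop nslcd while packages install

# DRY_RUN and run() are assumptions for this sketch: when DRY_RUN is "true"
# (the default here), commands are printed instead of executed.
DRY_RUN="${DRY_RUN:-true}"
run() {
    if [ "$DRY_RUN" = "true" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

install_packages() {
    if [ "$PAUSENSLCD" = "true" ]; then run systemctl stop nslcd; fi
    # Word splitting of the two variables is intentional.
    # shellcheck disable=SC2086
    run $INSTALL_COMMAND $PACKAGES
    rc=$?
    if [ "$PAUSENSLCD" = "true" ]; then run systemctl start nslcd; fi
    for svc in $SERVICES; do run systemctl restart "$svc"; done
    return "$rc"
}

install_packages
```

Porting the patch to another platform would then only mean editing the three variables at the top, e.g., setting INSTALL_COMMAND to an apt-based command.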

2.6 The Patch Logic

The patch logic is largely up to you. There are a few style/practice guidelines that are recommended, but they are not hard requirements.

2.6.1 Echo the script name

Usually the following line is included as one of the first lines in any script:

echo "[ $0 ]"

This is to include the script's name in the output log to make debugging easier (i.e., if an error occurs, you know which script threw it).

2.6.2 Include Exit codes

A common practice is to include an exit code at each step of the process. If you use a steadily incrementing exit code, then the output log will provide information as to which step in the patch caused issues. For example:

$INSTALL_COMMAND awesomepackage || exit 2

The above line will allow you to know that, if the log shows the script exited with code 2, then installing `awesomepackage` is the step that failed.

A few other logical constructs are also used: `&& exit 0`, for instances where you only consider a script successfully completed if a specific command completes, and brackets, for when multiple commands (such as echoing log info) should be run on command failure.

Best practice, especially when making use of exit codes for each line, is to include a final exit statement as the last line in the patch. This should be either `exit 0` or `exit 1` (or similar) depending on whether a construct like `&& exit 0` is used in the script.
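
Put together, the exit-code conventions above could be sketched as follows. The three step functions are hypothetical stand-ins for real commands, and `FAIL_CONFIGURE` exists only to simulate a failure. The sample patch body runs in a subshell so that `exit` ends only the sample and not the shell that calls it.

```shell
#!/bin/bash
# Hypothetical stand-ins for real work.
step_install()   { true; }
step_configure() { [ "${FAIL_CONFIGURE:-no}" != "yes" ]; }
step_enable()    { true; }

# The "( ... )" body is a subshell, so `exit` leaves only the sample patch.
sample_patch() (
    echo "[ $0 ]"
    step_install || exit 2
    # Grouping lets several commands run when a step fails.
    step_configure || { echo "configure step failed" >&2; exit 3; }
    step_enable || exit 4
    # Final exit statement as the last line of the patch.
    exit 0
)

sample_patch
echo "exit code: $?"
```

If the log showed exit code 3 for this patch, the configure step is where to start looking.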

2.6.3 Write code that won't hurt if it is run twice

Best practice is to write streams with checks/logic that ensure they won't cause any trouble if they are run more than once. You can't assume that the code will only be run once: rule #1 (section 2.1) says to copy and steal, and if someone doesn't edit a stolen script, it runs again.
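
One common way to make a step safe to repeat is to check for the change before making it. The sketch below appends a line to a file only when that exact line is not already present. `ensure_line` is a hypothetical helper, and the config line is a made-up example.

```shell
#!/bin/bash
# ensure_line FILE LINE: hypothetical helper that appends LINE to FILE only
# if no identical line already exists, so running it twice changes nothing.
ensure_line() {
    file=$1 line=$2
    grep -qxF -- "$line" "$file" 2>/dev/null || printf '%s\n' "$line" >> "$file"
}

conf=$(mktemp)
ensure_line "$conf" "MaxClients 50"
ensure_line "$conf" "MaxClients 50"   # second run adds nothing
cat "$conf"                           # the line appears once
rm -f "$conf"
```

The same check-then-act pattern applies to other steps, such as testing whether a file or user already exists before creating it.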

3. Resources

We have template patches available in the TSO GitHub:

- Template for installing software: https://it.github.gatech.edu/TSO/simon-sw-install
- Template for configuring software: https://it.github.gatech.edu/TSO/simon-sw-config
