UserData Template for Ubuntu 14.04 EC2 Instances in AWS

In any elastic environment there is a recurring question: how do you quickly spin up new boxes? Over time multiple options have emerged. Many environments rely on pre-baked machine images. In Amazon AWS those are called Amazon Machine Images (AMIs), in Joyent’s SDC simply images, but no matter the name they are pre-built, (mostly) pre-configured digital artifacts that the underlying cloud layer will bootstrap and execute. They are fast to bootstrap, but limited: it is hard to manage different versions, hard to switch virtualization technologies (PV vs. HVM, AWS vs. Joyent, etc.), and hard to deal with software versioning. Managing an elastic environment with pre-baked images is probably the fastest way to start, but probably the most expensive way in the long run.

Another option is to use some sort of configuration management system: Chef, Puppet, Salt, Ansible … there are a lot of choices. Those are flexible, but depending on the usage scenario they can be slow and may require additional “interventions” to work properly. There are two additional “gotchas” that are not commonly discussed. First, those tools force some sort of in-house configuration/pseudo-programming language and terminology on you. Second, security is tricky to implement within such a system. Managing elastic environments with configuration management systems is definitely possible, but comes with some dependencies and prerequisites you should account for in the design phase.

The third option, an AWS UserData / Joyent script, is a reasonable compromise. This is effectively a script that executes once upon virtual machine creation. It allows you to configure the instance, attach and configure storage, install software, etc. There are obvious benefits to that approach:

  • Treat that script like any other coding artifact, use version control, code reviews, etc;
  • It is easily modifiable upon need or request;
  • It can be used with virtually any instance type;
  • It is a single source of truth for the instance configuration;
  • It integrates nicely with the whole Control Plane concept.

Here is a basic template for Ubuntu 14.04, used with reasonable success to cover a wide variety of deployment needs:

Trivial. Yet it incorporates a lot in just ~200 lines of code:

  1. Disk layout management;
  2. Package repositories configuration;
  3. Basic tool set and third party software installation;
  4. Service reconfiguration (NTP, Automatic security updates);
  5. System reconfiguration (limits, sysctl, users, directories, crontab);
  6. Post-reboot startup configuration;
  7. Identity discovery and self-tagging.
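
As an illustration of item 5 only, here is a minimal sketch of how limits and sysctl settings could be dropped into place from a UserData script. It is not taken from the template itself: the file names, the values, and the ROOT_PREFIX variable are assumptions made for this example. ROOT_PREFIX lets you try the snippet against a scratch directory; on a real instance it would be set to an empty string so the files land under /etc.

```shell
#!/bin/bash
# Sketch only: values and file names below are illustrative assumptions.
set -ex

# Hypothetical knob: defaults to a scratch directory so nothing under /etc is touched.
# Export ROOT_PREFIX="" to write to the real locations.
ROOT_PREFIX="${ROOT_PREFIX-$(mktemp -d)}"

mkdir -p "$ROOT_PREFIX/etc/sysctl.d" "$ROOT_PREFIX/etc/security/limits.d"

# Kernel parameters (load one file by hand with: sysctl -p <file>).
cat > "$ROOT_PREFIX/etc/sysctl.d/60-userdata.conf" <<'EOF'
# Illustrative values only
net.core.somaxconn = 1024
vm.swappiness = 10
EOF

# Per-user resource limits, read by pam_limits at login.
cat > "$ROOT_PREFIX/etc/security/limits.d/60-userdata.conf" <<'EOF'
# <domain> <type> <item> <value>
*    soft    nofile    65536
*    hard    nofile    65536
EOF

echo "configuration written under: ${ROOT_PREFIX:-/}"
```

Keeping each concern in its own drop-in file, rather than editing /etc/sysctl.conf or /etc/security/limits.conf in place, makes the script safe to re-run and easy to review.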

As an added bonus, the cloud-init package will log all output produced during script execution to /var/log/cloud-init-output.log for failure investigation. The current script uses the -ex bash options, which means it will echo every command before executing it (-x) and exit as soon as a command fails (-e).
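
To see what the two flags do in practice, here is a small sketch (not part of the original template) that runs a three-command snippet under bash -ex and captures the result. The snippet and the variable names are made up for illustration.

```shell
#!/bin/bash
# Run a tiny script under "bash -ex" and capture both its output and exit status.
status=0
output=$(bash -ex -c 'echo "step 1"; false; echo "step 2"' 2>&1) || status=$?

# -x: every command is traced (on stderr) before it runs.
# -e: the script stops at "false", so "step 2" is never reached.
echo "$output"
echo "exit status: $status"
```

The captured output contains the trace lines and "step 1", but never "step 2", and the reported exit status is the non-zero one returned by false. This is the kind of trail you would expect to find in the cloud-init output log when a UserData script fails part of the way through.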

NOTE: One important component, log file management, is purposefully omitted from the template UserData. We plan on discussing that in a separate article.


DevOps Shell Script Template

In the everyday life of a DevOps engineer you will have to create multiple pieces of code. Some of those will be run once, others … well, others will live forever. Although it may be tempting to just put all the commands in a text editor, save the result and execute it, one should always consider the “bigger picture”. What will happen if your script is run on another OS, on another Linux distribution, or even on a different version of the same Linux distribution?! Another point of view is to think about what will happen if your neat 10-line script somehow has to be executed on, say, 500 servers?! Can you be sure that all the commands will run successfully there? Can you even be sure that all the commands will be present? Usually … No!

Faced with similar problems on a daily basis, we started devising simple solutions and practices to address them. One of those is standardizing the way different utilities behave, the way they take arguments and the way they report errors. Upon further investigation it became clear that a pattern can be extracted and synthesized into a series of templates one can use in daily work to keep common behavior between different utilities and components.

Here is the basic template used in shell scripts:

Nothing fancy. Basic framework that does the following:

  1. Lines 3 – 13: Make sure basic documentation, dependency list and example usage patterns are provided with the script itself;
  2. Lines 15 – 16: Define meaningful return codes to allow other utils to identify possible execution problems and react accordingly;
  3. Lines 18 – 27: Basic help/usage() function to provide the user with short guidance on how to use the script;
  4. Lines 29 – 52: Dependency checks to make sure all utilities the script needs are available and executable in the system;
  5. Lines 54 – 77: Argument parsing of everything passed on the command line that supports both short and long argument names;
  6. Lines 79 – 91: Validity checks of the argument values that should make sure arguments are passed contextually correct values;
  7. Lines 95 – N: Actual programming logic to be implemented …
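
The structure described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the original template, and it does not follow the line numbering quoted in the list: the greeting logic, the option names and the E_* return code names are all hypothetical.

```shell
#!/bin/bash
#
# greet.sh: print a greeting a given number of times (hypothetical example).
#
# Dependencies: cat
#
# Usage:
#   greet.sh --name Alice --count 3
#   greet.sh -n Alice -c 3

# Return codes (names are assumptions), so callers can tell failures apart.
E_OK=0
E_USAGE=1
E_DEPENDENCY=2
E_ARGUMENT=3

usage() {
    cat <<EOF
Usage: ${0##*/} [-n NAME] [-c COUNT] [-h]
  -n, --name NAME     name to greet (default: world)
  -c, --count COUNT   number of greetings, a positive integer (default: 1)
  -h, --help          show this help
EOF
}

# Verify that every command named in the arguments can be found.
check_dependencies() {
    local cmd missing=0
    for cmd in "$@"; do
        if ! command -v "$cmd" >/dev/null 2>&1; then
            echo "ERROR: required command not found: $cmd" >&2
            missing=1
        fi
    done
    [ "$missing" -eq 0 ] || return "$E_DEPENDENCY"
}

main() {
    local name="world" count=1 i=0

    check_dependencies cat || return $?

    # Argument parsing: short and long names for each option.
    while [ "$#" -gt 0 ]; do
        case "$1" in
            -n|--name)
                [ "$#" -ge 2 ] || { usage >&2; return "$E_USAGE"; }
                name="$2"; shift 2 ;;
            -c|--count)
                [ "$#" -ge 2 ] || { usage >&2; return "$E_USAGE"; }
                count="$2"; shift 2 ;;
            -h|--help)
                usage; return "$E_OK" ;;
            *)
                echo "ERROR: unknown argument: $1" >&2
                usage >&2
                return "$E_USAGE" ;;
        esac
    done

    # Validity checks of the argument values.
    case "$count" in
        ''|*[!0-9]*)
            echo "ERROR: --count must be a positive integer" >&2
            return "$E_ARGUMENT" ;;
    esac
    if [ "$count" -lt 1 ] || [ -z "$name" ]; then
        echo "ERROR: --count must be at least 1 and --name must not be empty" >&2
        return "$E_ARGUMENT"
    fi

    # Actual programming logic.
    while [ "$i" -lt "$count" ]; do
        echo "Hello, $name"
        i=$((i + 1))
    done
    return "$E_OK"
}

main "$@"
```

Because main returns a distinct code for each failure class, a wrapper such as a Nagios plugin or a startup script can branch on the exit status instead of parsing error messages.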

This template is successfully used in various scenarios: command line utilities, Nagios plugins, startup/shutdown scripts, UserData scripts, daemons implemented in shell script with the help of start-stop-daemon, etc. It is also used to allow deployment on multiple operating systems and distribution versions. The resulting utilities and system components are more resilient, include better documentation and dependency sections, and provide the user with a similar and intuitive way to get help or pass arguments. Error handling is functional enough to go beyond the simple OK / ERROR state. And all of those are important features when components must run in highly heterogeneous environments such as most cloud deployments!