In any elastic environment there is a recurring question: how do you quickly spin up new boxes? Over time, multiple options have emerged. Many environments rely on pre-baked machine images. In AWS those are called Amazon Machine Images (AMIs), in Joyent’s SDC simply images, but no matter the name, they are pre-built, (mostly) pre-configured digital artifacts that the underlying cloud layer bootstraps and executes. They are fast to bootstrap, but limited: it is hard to manage different versions, hard to switch virtualization technologies (PV vs. HVM, AWS vs. Joyent, etc.), and hard to deal with software versioning. Managing an elastic environment with pre-baked images is probably the fastest way to start, but probably the most expensive way in the long run.
Another option is to use some sort of configuration management system. Chef, Puppet, Salt, Ansible … there are a lot of choices. Those are flexible but, depending on the usage scenario, can be slow and may require additional “interventions” to work properly. There are two additional “gotchas” that are not commonly discussed. First, those tools force some sort of in-house configuration/pseudo-programming language and terminology on you. Second, security is tricky to implement within such a system. Managing elastic environments with configuration management systems is definitely possible, but comes with some dependencies and prerequisites you should account for in the design phase.
The third option, an AWS UserData / Joyent script, is a reasonable compromise. This is effectively a script that executes once upon virtual machine creation. It allows you to configure the instance, attach and configure storage, install software, etc. There are obvious benefits to that approach:
- The script can be treated like any other code artifact, with version control, code reviews, etc.;
- It is easily modifiable upon need or request;
- It can be used with virtually any instance type;
- It is a single source of truth for the instance configuration;
- It integrates nicely with the whole Control Plane concept.
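For illustration, a UserData script is typically handed to the instance at launch time. The sketch below assumes the AWS CLI and its `--user-data file://...` option; the helper name `build_launch_command`, the image ID and the instance type are placeholders invented for this example, and the helper only prints the command so it can be reviewed before anything is launched.

```shell
#!/bin/bash
# Hypothetical helper: prints (does not run) an AWS CLI launch command
# that passes a UserData script file to a new instance.
build_launch_command() {
    local image_id="$1"        # placeholder image ID, not a real one
    local instance_type="$2"   # e.g. the instance type used in your environment
    local script_path="$3"     # local path to the UserData script
    printf 'aws ec2 run-instances --image-id %s --instance-type %s --user-data file://%s\n' \
        "$image_id" "$instance_type" "$script_path"
}

# Print the command for review instead of executing it.
build_launch_command "ami-00000000" "m3.medium" "userdata.sh"
```

Because the script lives in a plain file, the same versioned artifact can be reused for every launch.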
Here is a basic template for Ubuntu 14.04, used with reasonable success to cover a wide variety of deployment needs:
```shell
#!/bin/bash -ex

# DESCRIPTION: The following UserData script is created to ...
#
# Maintainer: ivachkov [at] xi-group [dot] com
#
# Requirements:
#   OS: Ubuntu 14.04 LTS
#   Repositories:
#       ...
#   Packages:
#       htop, iotop, dstat, ...
#   PIP Packages:
#       boto, awscli, ...
#
# Additional information if necessary
# ...
#

# Debian apt-get install function to eliminate prompts
export DEBIAN_FRONTEND=noninteractive
apt_get_install() {
    DEBIAN_FRONTEND=noninteractive apt-get -y \
        -o DPkg::Options::=--force-confnew \
        install "$@"
}

# Configure disk layout
INSTANCE_STORE_0="/dev/xvdb"
IS0_PART_1="/dev/xvdb1"
IS0_PART_2="/dev/xvdb2"

# INSTANCE_STORE_1="/dev/xvdc"
# IS1_PART_1="/dev/xvdc1"
# IS1_PART_2="/dev/xvdc2"
# ...

# Unmount /dev/xvdb if already mounted
# ("|| true" keeps -e from aborting the script when grep finds no match)
MOUNTED=$(df -h | awk '{print $1}' | grep "$INSTANCE_STORE_0" || true)
if [ -n "$MOUNTED" ]; then
    umount -f "$INSTANCE_STORE_0"
fi

# Partition the disk (8GB for SWAP / Rest for /mnt)
(echo n; echo p; echo 1; echo 2048; echo +8G; echo t; echo 82; echo n; echo p; echo 2; echo; echo; echo w) | fdisk $INSTANCE_STORE_0

# Make and enable swap
mkswap $IS0_PART_1
swapon $IS0_PART_1

# Make /mnt partition and mount it
mkfs.ext4 $IS0_PART_2
mount $IS0_PART_2 /mnt

# Update /etc/fstab if necessary
# sed -i s/$INSTANCE_STORE_0/$IS0_PART_2/g /etc/fstab

# Add external repositories
#
# Example 1: MongoDB
# apt-key adv --keyserver hkp://keyserver.ubuntu.com:80 --recv 7F0CEB10
# echo 'deb http://downloads-distro.mongodb.org/repo/ubuntu-upstart dist 10gen' | sudo tee /etc/apt/sources.list.d/mongodb.list
#
# Example 2: Salt
# add-apt-repository ppa:saltstack/salt
#
# Example 3: *Internal repository*
# curl --silent https://apt.mydomain.com/my.apt.gpg.key | apt-key add -
# curl --silent -o /etc/apt/sources.list.d/my.apt.list https://apt.mydomain.com/my.apt.list

# Update the package indexes
apt-get update && DEBIAN_FRONTEND=noninteractive apt-get -y -o Dpkg::Options::="--force-confnew" dist-upgrade

# Install basic APT packages and requirements
apt_get_install htop sysstat dstat iotop
# apt_get_install ...
apt_get_install python-pip
apt_get_install ntp
# apt_get_install ...

# Install PIP requirements
pip install six==1.8.0
pip install boto
pip install awscli
# pip install ...

# Configure NTP
service ntp stop        # Stop ntp daemon to free NTP socket
sleep 3                 # Give the daemon some time to exit
ntpdate pool.ntp.org    # Sync time
service ntp start       # Re-enable the NTP daemon

# Configure other system-specific settings ...

# Configure automatic security updates
cat > /etc/apt/apt.conf.d/20auto-upgrades << "EOF"
APT::Periodic::Update-Package-Lists "1";
APT::Periodic::Unattended-Upgrade "1";
EOF
/etc/init.d/unattended-upgrades restart

# Update system limits
cat > /etc/security/limits.d/my_limits.conf << "EOF"
*    soft nofile 999999
*    hard nofile 999999
root soft nofile 999999
root hard nofile 999999
EOF
ulimit -n 999999

# Update sysctl variables
cat > /etc/sysctl.d/my_sysctl.conf << "EOF"
net.core.somaxconn=65535
net.core.netdev_max_backlog=65535
# net.core.rmem_max=8388608
# net.core.wmem_max=8388608
# net.core.rmem_default=65536
# net.core.wmem_default=65536
# net.ipv4.tcp_rmem=8192 873800 8388608
# net.ipv4.tcp_wmem=4096 655360 8388608
# net.ipv4.tcp_mem=8388608 8388608 8388608
# net.ipv4.tcp_max_tw_buckets=6000000
# net.ipv4.tcp_max_syn_backlog=65536
# net.ipv4.tcp_max_orphans=262144
# net.ipv4.tcp_synack_retries = 2
# net.ipv4.tcp_syn_retries = 2
# net.ipv4.tcp_fin_timeout = 7
# net.ipv4.tcp_slow_start_after_idle = 0
# net.ipv4.ip_local_port_range = 2000 65000
# net.ipv4.tcp_window_scaling = 1
# net.ipv4.tcp_max_syn_backlog = 3240000
# net.ipv4.tcp_congestion_control = cubic
EOF
sysctl -p /etc/sysctl.d/my_sysctl.conf

# Create specific users and groups
# addgroup ...
# useradd ...
# usermod ...

# Create expected set of directories
DIRECTORIES="
    /var/log/...
    /run/...
    /srv/...
    /opt/...
"

for DIRECTORY in $DIRECTORIES; do
    mkdir -p $DIRECTORY
    chown USER:GROUP $DIRECTORY
done

# Create custom_crontab
cat > /home/ubuntu/custom_crontab << "EOF"
EOF

# Enable custom cronjobs
su - ubuntu -c "/usr/bin/crontab /home/ubuntu/custom_crontab"

# Install main application / service
# ...
# ...

# Configure main application / service
# ...
# ...

# Make everything survive reboot
cat > /etc/rc.local << "EOF"
#!/bin/sh

# Regenerate disk layout on ephemeral storage
# ...

# Start the application
# ...
EOF

# Start application
# service XXX restart

# Tag the instance (NOTE: Depends on configured AWS CLI)
INSTANCE_ID=$(curl -s http://169.254.169.254/latest/meta-data/instance-id)
# aws ec2 create-tags --resources $INSTANCE_ID --tags Key=Name,Value=...

# Mark successful execution
exit 0
```
Trivial. Yet it incorporates a lot in just ~200 lines of code:
- Disk layout management;
- Package repositories configuration;
- Basic tool set and third party software installation;
- Service reconfiguration (NTP, Automatic security updates);
- System reconfiguration (limits, sysctl, users, directories, crontab);
- Post-reboot startup configuration;
- Identity discovery and self-tagging.
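The last item, identity discovery and self-tagging, can be sketched as a pair of small functions. This is a minimal sketch, not the template's own code: the function names `fetch_metadata` and `tag_instance` and the `AWS_CMD` override are assumptions made for this example, and the tag value is a placeholder. It relies only on the metadata URL and the `aws ec2 create-tags` call already used in the template.

```shell
#!/bin/bash
# Hypothetical helpers sketching identity discovery and self-tagging.

# Fetch a single key from the EC2 instance metadata service.
# -f fails on HTTP errors; --max-time keeps the call from hanging
# when the metadata service is unreachable.
fetch_metadata() {
    curl -sf --max-time 2 "http://169.254.169.254/latest/meta-data/$1"
}

# Tag the current instance with a Name.
# AWS_CMD is an assumption of this sketch: it lets the CLI be swapped
# out (for example with "echo") to preview the call.
tag_instance() {
    local name="$1" instance_id
    instance_id="$(fetch_metadata instance-id)" || return 1
    # Refuse to tag when the ID could not be discovered.
    [ -n "$instance_id" ] || return 1
    ${AWS_CMD:-aws} ec2 create-tags \
        --resources "$instance_id" \
        --tags "Key=Name,Value=$name"
}

# Example (commented out; needs EC2 and a configured AWS CLI):
# tag_instance "web-01"
```

Keeping the lookup and the tagging separate means a failed metadata call returns an error instead of tagging with an empty instance ID.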
As an added bonus, the cloud-init package logs all output produced during script execution to /var/log/cloud-init-output.log for failure investigation. The current script uses the -ex bash parameters, which means it explicitly echoes all executed commands (-x) and exits at the first sign of unsuccessful command execution (-e).
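The effect of -e is easy to observe in isolation. The sketch below runs three tiny child scripts under `bash -e` and records their output and exit status; the variable names are made up for this example. It also shows a side effect worth keeping in mind when writing UserData scripts: `grep` returns a non-zero status when nothing matches, so an unguarded assignment from it stops a -e script just like any other failure.

```shell
#!/bin/bash
# Run small child scripts under -e and capture what happens.

# 1. Execution stops at the first failing command.
out1="$(bash -ec 'echo first; false; echo second')" && rc1=0 || rc1=$?

# 2. Pitfall: grep exits non-zero when nothing matches, so a plain
#    assignment from it also stops the script.
out2="$(bash -ec 'found=$(echo abc | grep xyz); echo reached')" && rc2=0 || rc2=$?

# 3. Guarding the command with "|| true" lets the script continue.
out3="$(bash -ec 'found=$(echo abc | grep xyz || true); echo reached')" && rc3=0 || rc3=$?

echo "1: output='$out1' status=$rc1"
echo "2: output='$out2' status=$rc2"
echo "3: output='$out3' status=$rc3"
```

The first case prints `first` and exits with status 1, the second prints nothing and exits with status 1, and the third prints `reached` and exits with status 0.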
NOTE: One important component, log file management, is purposefully omitted from the template UserData. We plan on discussing that in a separate article.
References
- http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/user-data.html
- http://wiki.joyent.com/wiki/display/sdc/Using+the+Metadata+API