Xi Group Ltd. Company Blog » Development

How to deploy single-node Hadoop setup in AWS

Ivo Vachkov — Wed, 04 Feb 2015 08:19:25 +0000

A common issue in the software development lifecycle is the need to quickly bootstrap a vanilla environment, deploy some code onto it, run it, and then scrap it. This is a core concept in Continuous Integration / Continuous Delivery (CI/CD) and a stepping stone towards immutable infrastructure. A properly automated implementation can also save time (no need to configure the environment manually) and money (no need to track potential regression issues in the development process).

Over the course of several years, we have found this to be extremely useful in BigData projects that use Hadoop. Installing Hadoop is not always straightforward. It depends on various internal and external components (JDK, Map-Reduce framework, HDFS, etc.) and it can be messy: different components communicate over various ports and protocols, and HDFS uses somewhat clumsy semantics to deal with files and directories. For those and similar reasons we decided to present our take on Hadoop installation on a single node for development purposes.

The following shell script is a simplified, fully functional skeleton implementation that will install Hadoop on a c3.xlarge Fedora 20 node in AWS and run a test job on it:

#!/bin/bash

# Key file to be generated and its filesystem location
KEY_NAME="test-hadoop-key"
KEY_FILE="/tmp/$KEY_NAME"

# Security group name and description
SG_NAME="test-hadoop-sg"
SG_DESC="Test Hadoop Security Group"

# Temporary files; General Log and Instance User data
LOG_FILE="/tmp/test-hadoop-setup.log"
USR_DATA="/tmp/test-hadoop-userdata.sh"

# Instance details
AWS_PROFILE="$$profile$$"
AWS_REGION="us-east-1"
AMI_ID="ami-21362b48"
INST_TAG="test-hadoop-single"
INST_TYPE="c3.xlarge"
DISK_SIZE="20"

# Default return codes
RET_CODE_OK=0
RET_CODE_ERROR=1

# Check for various utilities that will be used 

# Check for supported operating system
P_UNAME=`whereis uname | cut -d' ' -f2`
if [ ! -x "$P_UNAME" ]; then
	echo "$0: No UNAME available in the system"
	exit $RET_CODE_ERROR;
fi
OS=`$P_UNAME`
if [ "$OS" != "Linux" ]; then
	echo "$0: Unsupported OS!";
	exit $RET_CODE_ERROR;
fi

# Check if awscli is available in the system
P_AWS=`whereis aws | cut -d' ' -f2`
if [ ! -x "$P_AWS" ]; then
	echo "$0: No 'aws' available in the system!";
	exit $RET_CODE_ERROR;
fi

# Check if awk is available in the system
P_AWK=`whereis awk | cut -d' ' -f2`
if [ ! -x "$P_AWK" ]; then
	echo "$0: No 'awk' available in the system!";
	exit $RET_CODE_ERROR;
fi

# Check if grep is available in the system
P_GREP=`whereis grep | cut -d' ' -f2`
if [ ! -x "$P_GREP" ]; then
	echo "$0: No 'grep' available in the system!";
	exit $RET_CODE_ERROR;
fi

# Check if sed is available in the system
P_SED=`whereis sed | cut -d' ' -f2`
if [ ! -x "$P_SED" ]; then
	echo "$0: No 'sed' available in the system!";
	exit $RET_CODE_ERROR;
fi

# Check if ssh is available in the system
P_SSH=`whereis ssh | cut -d' ' -f2`
if [ ! -x "$P_SSH" ]; then
	echo "$0: No 'ssh' available in the system!";
	exit $RET_CODE_ERROR;
fi

# Check if ssh-keygen is available in the system
P_SSH_KEYGEN=`whereis ssh-keygen | cut -d' ' -f2`
if [ ! -x "$P_SSH_KEYGEN" ]; then
	echo "$0: No 'ssh-keygen' available in the system!";
	exit $RET_CODE_ERROR;
fi

# Userdata code to bootstrap Hadoop 2.X on Fedora 20 instance
cat > $USR_DATA << "EOF"
#!/bin/bash

# Mark execution start
echo "START" > /root/userdata.state

# Install Hadoop
yum --assumeyes install hadoop-common hadoop-common-native hadoop-hdfs hadoop-mapreduce hadoop-mapreduce-examples hadoop-yarn

# Configure HDFS
hdfs-create-dirs

# Bootstrap Hadoop services
systemctl start hadoop-namenode && sleep 2
systemctl start hadoop-datanode && sleep 2
systemctl start hadoop-nodemanager && sleep 2
systemctl start hadoop-resourcemanager && sleep 2

# Make Hadoop services start after reboot
systemctl enable hadoop-namenode hadoop-datanode hadoop-nodemanager hadoop-resourcemanager

# Configure Hadoop user
runuser hdfs -s /bin/bash /bin/bash -c "hadoop fs -mkdir /user/fedora"
runuser hdfs -s /bin/bash /bin/bash -c "hadoop fs -chown fedora /user/fedora"

# Deploy additional software dependencies
# ... 

# Deploy main application 
# ... 

# Mark execution end
echo "DONE" > /root/userdata.state
EOF

# Create Security Group
echo -n "Creating '$SG_NAME' security group ... "
aws ec2 create-security-group --group-name $SG_NAME --description "$SG_DESC" --region $AWS_REGION --profile $AWS_PROFILE > $LOG_FILE
echo "Done."

# Add open SSH access
echo -n "Adding access rules to '$SG_NAME' security group ... "
aws ec2 authorize-security-group-ingress --group-name $SG_NAME --protocol tcp --port 22 --cidr 0.0.0.0/0 --region $AWS_REGION --profile $AWS_PROFILE >> $LOG_FILE

# Add open Hadoop ports access
aws ec2 authorize-security-group-ingress --group-name $SG_NAME --protocol tcp --port 8088 --cidr 0.0.0.0/0 --region $AWS_REGION --profile $AWS_PROFILE >> $LOG_FILE
aws ec2 authorize-security-group-ingress --group-name $SG_NAME --protocol tcp --port 50010 --cidr 0.0.0.0/0 --region $AWS_REGION --profile $AWS_PROFILE >> $LOG_FILE
aws ec2 authorize-security-group-ingress --group-name $SG_NAME --protocol tcp --port 50020 --cidr 0.0.0.0/0 --region $AWS_REGION --profile $AWS_PROFILE >> $LOG_FILE
aws ec2 authorize-security-group-ingress --group-name $SG_NAME --protocol tcp --port 50030 --cidr 0.0.0.0/0 --region $AWS_REGION --profile $AWS_PROFILE >> $LOG_FILE
aws ec2 authorize-security-group-ingress --group-name $SG_NAME --protocol tcp --port 50070 --cidr 0.0.0.0/0 --region $AWS_REGION --profile $AWS_PROFILE >> $LOG_FILE
aws ec2 authorize-security-group-ingress --group-name $SG_NAME --protocol tcp --port 50075 --cidr 0.0.0.0/0 --region $AWS_REGION --profile $AWS_PROFILE >> $LOG_FILE
aws ec2 authorize-security-group-ingress --group-name $SG_NAME --protocol tcp --port 50090 --cidr 0.0.0.0/0 --region $AWS_REGION --profile $AWS_PROFILE >> $LOG_FILE
echo "Done."

# Generate New Key Pair and Import it
echo -n "Generating key pair '$KEY_NAME' for general access ... "
rm -rf $KEY_FILE $KEY_FILE.pub
ssh-keygen -t rsa -f $KEY_FILE -N '' >> $LOG_FILE
aws ec2 import-key-pair --key-name $KEY_NAME --public-key-material "`cat $KEY_FILE.pub`" --region $AWS_REGION --profile $AWS_PROFILE >> $LOG_FILE
echo "Done."

# Build the Hadoop box
echo -n "Starting Hadoop instance ... "
RI_OUT=`aws ec2 run-instances --image-id $AMI_ID --count 1 --instance-type $INST_TYPE --key-name $KEY_NAME --security-groups $SG_NAME --user-data file://$USR_DATA --block-device-mappings "[{\"DeviceName\":\"/dev/sda1\", \"Ebs\":{\"VolumeSize\":$DISK_SIZE, \"DeleteOnTermination\": true} } ]" --region $AWS_REGION --profile $AWS_PROFILE`
# Extract the instance ID by key name instead of relying on a fixed field position
I_ID=`echo $RI_OUT | grep -o '"InstanceId": *"[^"]*"' | head -n 1 | cut -d'"' -f4`
echo $RI_OUT >> $LOG_FILE
echo "Done."

# Tag the Hadoop box
echo -n "Tagging Hadoop instance '$I_ID' ... "
aws ec2 create-tags --resources $I_ID --tags Key=Name,Value=$INST_TAG --region $AWS_REGION --profile $AWS_PROFILE >> $LOG_FILE
echo "Done."

# Obtain instance public IP address
echo -n "Obtaining instance '$I_ID' public hostname ... "

# Delays in AWS fabric, reiterate until public hostname is assigned ...
while true; do
	sleep 3

	HOST=`aws ec2 describe-instances --instance-ids $I_ID --region $AWS_REGION --profile $AWS_PROFILE | grep PublicDnsName | awk -F":" '{print $2}' | awk '{print $1}' | sed 's/,$//' | sed -e 's/^"//'  -e 's/"$//'`;
	if [[ $HOST == ec2* ]]; then
		break;
	fi
done
echo "Done."

# Poll until system is ready
echo -n "Waiting for instance '$I_ID' to configure itself (will take approx. 5 minutes) ... "
while true; do
	sleep 5;

	TEMP_OUT=`ssh -q -o "StrictHostKeyChecking=no" -i $KEY_FILE -t fedora@$HOST "sudo cat /root/userdata.state"`;

	# Clear some strange symbols 
	STATE=`echo $TEMP_OUT | cut -c1-4`;

	if [ "$STATE" = "DONE" ]; then
		break;
	fi
done
echo "Done."

# Test Hadoop setup
echo "========== Testing Single-node Hadoop =========="
ssh -q -o "StrictHostKeyChecking=no" -i $KEY_FILE fedora@$HOST "hadoop jar /usr/share/java/hadoop/hadoop-mapreduce-examples.jar pi 10 1000000"
echo "========== Done =========="

# Run main Application here
# echo "========== Testing Main Application Single-node Hadoop =========="
# ssh -q -o "StrictHostKeyChecking=no" -i $KEY_FILE fedora@$HOST "hadoop jar ..."
# echo "========== Done =========="

# Terminate instance
echo -n "Terminating Hadoop instance '$I_ID' ... "
aws ec2 terminate-instances --instance-ids $I_ID --region $AWS_REGION --profile $AWS_PROFILE >> $LOG_FILE

# Poll until instance is terminated
while true; do
	sleep 5;

	TERMINATED=`aws ec2 describe-instances --instance-ids $I_ID --region $AWS_REGION --profile $AWS_PROFILE | grep terminated`;
	if [ ! -z "$TERMINATED" ]; then
		break;
	fi
done
echo "Done."

# Remove SSH Keypair
echo -n "Removing key pair '$KEY_NAME' ... "
aws ec2 delete-key-pair --key-name $KEY_NAME --region $AWS_REGION --profile $AWS_PROFILE >> $LOG_FILE
echo "Done."

# Remove Security Group
echo -n "Removing '$SG_NAME' security group ... "
aws ec2 delete-security-group --group-name $SG_NAME --region $AWS_REGION --profile $AWS_PROFILE >> $LOG_FILE
echo "Done."

# Remove local resources
rm -rf $USR_DATA
rm -rf $KEY_FILE $KEY_FILE.pub
rm -rf $LOG_FILE

# Normal termination
exit $RET_CODE_OK
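
The utility checks at the top of the script all follow the same pattern, so they could be collapsed into one helper. The following is only a sketch: the function name require_tools is hypothetical, and it uses the shell builtin 'command -v' instead of whereis to locate each tool.

```shell
#!/bin/bash

# Return 0 if every named tool is found on PATH; otherwise report the
# missing ones on stderr and return 1. 'require_tools' is a hypothetical
# helper name, not part of the original script.
require_tools() {
	local missing=0 tool
	for tool in "$@"; do
		if ! command -v "$tool" > /dev/null 2>&1; then
			echo "$0: No '$tool' available in the system!" >&2
			missing=1
		fi
	done
	return $missing
}

# Possible usage in place of the individual checks:
# require_tools uname aws awk grep sed ssh ssh-keygen || exit 1
```

Reporting all missing tools before returning saves a round of re-runs when more than one dependency is absent.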

Additional notes:

Please edit the AWS_PROFILE variable; all AWS CLI commands depend on it!
The activity log is written to /tmp/test-hadoop-setup.log. It is recreated with every new run of the script and removed on normal termination.
In case of normal execution, all allocated resources will be cleaned up upon termination.
This script is ready to be used as a Jenkins build-and-deploy job.
Since the single-node Hadoop/HDFS is terminated, output data that goes to HDFS should be transferred out of the instance before termination!
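
The cleanup described in the notes only happens when the script runs to completion. One possible way to also cover failures and interrupts is a cleanup handler registered with trap. The sketch below reuses the variable names of the main script, but the handler itself and the temporary working directory are assumptions made so that the example can be run safely on its own.

```shell
#!/bin/bash

# Names mirror the variables of the main script; the temporary directory
# stands in for /tmp so the sketch can be run safely on its own.
WORK_DIR=`mktemp -d`
KEY_FILE="$WORK_DIR/test-hadoop-key"
USR_DATA="$WORK_DIR/test-hadoop-userdata.sh"
I_ID=""

# Hypothetical cleanup handler: always remove local files, and touch
# remote resources only if they were actually created.
cleanup() {
	if [ -n "$I_ID" ]; then
		echo "Would terminate instance '$I_ID' here"
		# aws ec2 terminate-instances --instance-ids $I_ID ...
	fi
	rm -f "$KEY_FILE" "$KEY_FILE.pub" "$USR_DATA"
}

# Run cleanup on any exit path, including errors and Ctrl-C
trap cleanup EXIT
trap 'exit 1' INT TERM
```

With the handler in place, the explicit removal steps at the end of the script become a fallback rather than the only cleanup path.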

Example run should look like:

:~> ./aws-hadoop-single.sh
Creating 'test-hadoop-sg' security group ... Done.
Adding access rules to 'test-hadoop-sg' security group ... Done.
Generating key pair 'test-hadoop-key' for general access ... Done.
Starting Hadoop instance ... Done.
Tagging Hadoop instance 'i-b3b27f5c' ... Done.
Obtaining instance 'i-b3b27f5c' public hostname ... Done.
Waiting for instance 'i-b3b27f5c' to configure itself (will take approx. 5 minutes) ... Done.
========== Testing Single-node Hadoop ==========
Number of Maps  = 10
Samples per Map = 1000000
Wrote input for Map #0
Wrote input for Map #1
Wrote input for Map #2
Wrote input for Map #3
Wrote input for Map #4
Wrote input for Map #5
Wrote input for Map #6
Wrote input for Map #7
Wrote input for Map #8
Wrote input for Map #9
Starting Job
15/02/04 07:27:05 INFO client.RMProxy: Connecting to ResourceManager at /0.0.0.0:8032
15/02/04 07:27:05 INFO input.FileInputFormat: Total input paths to process : 10
15/02/04 07:27:05 INFO mapreduce.JobSubmitter: number of splits:10
15/02/04 07:27:05 INFO Configuration.deprecation: user.name is deprecated. Instead, use mapreduce.job.user.name
15/02/04 07:27:05 INFO Configuration.deprecation: mapred.jar is deprecated. Instead, use mapreduce.job.jar
15/02/04 07:27:05 INFO Configuration.deprecation: mapred.map.tasks.speculative.execution is deprecated. Instead, use mapreduce.map.speculative
15/02/04 07:27:05 INFO Configuration.deprecation: mapred.reduce.tasks is deprecated. Instead, use mapreduce.job.reduces
15/02/04 07:27:05 INFO Configuration.deprecation: mapred.output.value.class is deprecated. Instead, use mapreduce.job.output.value.class
15/02/04 07:27:05 INFO Configuration.deprecation: mapred.reduce.tasks.speculative.execution is deprecated. Instead, use mapreduce.reduce.speculative
15/02/04 07:27:05 INFO Configuration.deprecation: mapreduce.map.class is deprecated. Instead, use mapreduce.job.map.class
15/02/04 07:27:05 INFO Configuration.deprecation: mapred.job.name is deprecated. Instead, use mapreduce.job.name
15/02/04 07:27:05 INFO Configuration.deprecation: mapreduce.reduce.class is deprecated. Instead, use mapreduce.job.reduce.class
15/02/04 07:27:05 INFO Configuration.deprecation: mapreduce.inputformat.class is deprecated. Instead, use mapreduce.job.inputformat.class
15/02/04 07:27:05 INFO Configuration.deprecation: mapred.input.dir is deprecated. Instead, use mapreduce.input.fileinputformat.inputdir
15/02/04 07:27:05 INFO Configuration.deprecation: mapred.output.dir is deprecated. Instead, use mapreduce.output.fileoutputformat.outputdir
15/02/04 07:27:05 INFO Configuration.deprecation: mapreduce.outputformat.class is deprecated. Instead, use mapreduce.job.outputformat.class
15/02/04 07:27:05 INFO Configuration.deprecation: mapred.map.tasks is deprecated. Instead, use mapreduce.job.maps
15/02/04 07:27:05 INFO Configuration.deprecation: mapred.output.key.class is deprecated. Instead, use mapreduce.job.output.key.class
15/02/04 07:27:05 INFO Configuration.deprecation: mapred.working.dir is deprecated. Instead, use mapreduce.job.working.dir
15/02/04 07:27:05 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1423034805647_0001
15/02/04 07:27:05 INFO impl.YarnClientImpl: Submitted application application_1423034805647_0001 to ResourceManager at /0.0.0.0:8032
15/02/04 07:27:05 INFO mapreduce.Job: The url to track the job: http://ip-10-63-188-40:8088/proxy/application_1423034805647_0001/
15/02/04 07:27:05 INFO mapreduce.Job: Running job: job_1423034805647_0001
15/02/04 07:27:11 INFO mapreduce.Job: Job job_1423034805647_0001 running in uber mode : false
15/02/04 07:27:11 INFO mapreduce.Job:  map 0% reduce 0%
15/02/04 07:27:24 INFO mapreduce.Job:  map 60% reduce 0%
15/02/04 07:27:33 INFO mapreduce.Job:  map 100% reduce 0%
15/02/04 07:27:34 INFO mapreduce.Job:  map 100% reduce 100%
15/02/04 07:27:34 INFO mapreduce.Job: Job job_1423034805647_0001 completed successfully
Job Finished in 29.302 seconds
15/02/04 07:27:34 INFO mapreduce.Job: Counters: 43
        File System Counters
                FILE: Number of bytes read=226
                FILE: Number of bytes written=882378
                FILE: Number of read operations=0
                FILE: Number of large read operations=0
                FILE: Number of write operations=0
                HDFS: Number of bytes read=2660
                HDFS: Number of bytes written=215
                HDFS: Number of read operations=43
                HDFS: Number of large read operations=0
                HDFS: Number of write operations=3
        Job Counters
                Launched map tasks=10
                Launched reduce tasks=1
                Data-local map tasks=10
                Total time spent by all maps in occupied slots (ms)=93289
                Total time spent by all reduces in occupied slots (ms)=7055
        Map-Reduce Framework
                Map input records=10
                Map output records=20
                Map output bytes=180
                Map output materialized bytes=280
                Input split bytes=1480
                Combine input records=0
                Combine output records=0
                Reduce input groups=2
                Reduce shuffle bytes=280
                Reduce input records=20
                Reduce output records=0
                Spilled Records=40
                Shuffled Maps =10
                Failed Shuffles=0
                Merged Map outputs=10
                GC time elapsed (ms)=1561
                CPU time spent (ms)=7210
                Physical memory (bytes) snapshot=2750681088
                Virtual memory (bytes) snapshot=11076927488
                Total committed heap usage (bytes)=2197291008
        Shuffle Errors
                BAD_ID=0
                CONNECTION=0
                IO_ERROR=0
                WRONG_LENGTH=0
                WRONG_MAP=0
                WRONG_REDUCE=0
        File Input Format Counters
                Bytes Read=1180
        File Output Format Counters
                Bytes Written=97
Estimated value of Pi is 3.14158440000000000000
========== Done ==========
Terminating Hadoop instance 'i-b3b27f5c' ... Done.
Removing key pair 'test-hadoop-key' ... Done.
Removing 'test-hadoop-sg' security group ... Done.
:~>
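
The "Waiting for instance" step relies on a loop that never gives up, so a broken instance would block a build job forever. A bounded retry helper is one way to guard against that. This is a sketch only: wait_until is a hypothetical function, and the retry counts in the usage comment are arbitrary.

```shell
#!/bin/bash

# wait_until MAX_TRIES DELAY COMMAND [ARGS...]
# Re-run COMMAND until it succeeds, at most MAX_TRIES times, sleeping
# DELAY seconds after each failed attempt.
# Returns 0 on success, 1 when all attempts failed.
wait_until() {
	local max_tries=$1 delay=$2 try=1
	shift 2
	while [ $try -le $max_tries ]; do
		if "$@"; then
			return 0
		fi
		sleep $delay
		try=$((try + 1))
	done
	return 1
}

# Possible usage with a hypothetical check for the userdata state file:
# userdata_done() {
# 	ssh -q -o "StrictHostKeyChecking=no" -i $KEY_FILE -t fedora@$HOST \
# 		"sudo cat /root/userdata.state" | grep -q "DONE"
# }
# wait_until 120 5 userdata_done || { echo "Timed out"; exit 1; }
```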

Hopefully, this short introduction will advance your efforts to automate development tasks in BigData projects!

If you want to discuss more complex scenarios including automated deployments over multi-node Hadoop clusters, AWS Elastic MapReduce, AWS DataPipeline or other components of the BigData ecosystem, do not hesitate to Contact Us!


UserData Template for Ubuntu 14.04 EC2 Instances in AWS

Ivo Vachkov — Tue, 27 Jan 2015 11:41:14 +0000

In any elastic environment there is a recurring issue: how to quickly spin up new boxes? Over time multiple options emerge. Many environments rely on pre-baked machine images. In Amazon AWS those are called Amazon Machine Images (AMIs), in Joyent’s SDC simply images, but no matter the name they are pre-built, (mostly) pre-configured digital artifacts that the underlying cloud layer will bootstrap and execute. They are fast to bootstrap, but limited: it is hard to manage different versions, hard to switch virtualization technologies (PV vs. HVM, AWS vs. Joyent, etc.), and hard to deal with software versioning. Managing an elastic environment with pre-baked images is probably the fastest way to start, but probably the most expensive way in the long run.

Another option is to use some sort of configuration management system: Chef, Puppet, Salt, Ansible … there are a lot of choices. Those are flexible, but depending on the usage scenario they can be slow and may require additional “interventions” to work properly. There are two additional “gotchas” that are not commonly discussed. First, those tools force some sort of in-house configuration/pseudo-programming language and terminology on you. Second, security is a tricky concept to implement within such a system. Managing elastic environments with configuration management systems is definitely possible, but comes with dependencies and prerequisites you should account for in the design phase.

The third option, AWS UserData / Joyent script, is a reasonable compromise. This is effectively a script that executes once upon virtual machine creation. It allows you to configure the instance, attach and configure storage, install software, etc. There are obvious benefits to that approach:

Treat that script like any other coding artifact, use version control, code reviews, etc;
It is easily modifiable upon need or request;
It can be used with virtually any instance type;
It is a single source of truth for the instance configuration;
It integrates nicely with the whole Control Plane concept.

Here is a basic template for Ubuntu 14.04, used with reasonable success to cover a wide variety of deployment needs:

#!/bin/bash -ex

# DESCRIPTION: The following UserData script is created to ... 
# 
# Maintainer: ivachkov [at] xi-group [dot] com
# 
# Requirements:
#	OS: Ubuntu 14.04 LTS
#	Repositories: 
#		...
#	Packages:
# 		htop, iotop, dstat, ...
#	PIP Packages:
#		boto, awscli, ...
# 
# Additional information if necessary
# 	... 
# 

# Debian apt-get install function to eliminate prompts
export DEBIAN_FRONTEND=noninteractive
apt_get_install()
{
	DEBIAN_FRONTEND=noninteractive apt-get -y \
		-o DPkg::Options::=--force-confnew \
		install "$@"
}

# Configure disk layout 
INSTANCE_STORE_0="/dev/xvdb"
IS0_PART_1="/dev/xvdb1"
IS0_PART_2="/dev/xvdb2"

# INSTANCE_STORE_1="/dev/xvdc"
# IS1_PART_1="/dev/xvdc1"
# IS1_PART_2="/dev/xvdc2"

# ... 

# Unmount /dev/xvdb if already mounted
# ('|| true' keeps -e from aborting the script when grep finds no match)
MOUNTED=`df -h | awk '{print $1}' | grep $INSTANCE_STORE_0 || true`
if [ ! -z "$MOUNTED" ]; then
	umount -f $INSTANCE_STORE_0
fi

# Partition the disk (8GB for SWAP / Rest for /mnt)
(echo n; echo p; echo 1; echo 2048; echo +8G; echo t; echo 82; echo n; echo p; echo 2; echo; echo; echo w) | fdisk $INSTANCE_STORE_0

# Make and enable swap
mkswap $IS0_PART_1
swapon $IS0_PART_1

# Make /mnt partition and mount it
mkfs.ext4 $IS0_PART_2
mount $IS0_PART_2 /mnt

# Update /etc/fstab if necessary 
# sed -i s/$INSTANCE_STORE_0/$IS0_PART_2/g /etc/fstab

# Add external repositories
# 
# Example 1: MongoDB
# apt-key adv --keyserver hkp://keyserver.ubuntu.com:80 --recv 7F0CEB10
# echo 'deb http://downloads-distro.mongodb.org/repo/ubuntu-upstart dist 10gen' | sudo tee /etc/apt/sources.list.d/mongodb.list
# 
# Example 2: Salt
# add-apt-repository ppa:saltstack/salt
# 
# Example 3: *Internal repository*
# curl --silent https://apt.mydomain.com/my.apt.gpg.key | apt-key add -
# curl --silent -o /etc/apt/sources.list.d/my.apt.list https://apt.mydomain.com/my.apt.list

# Update the package indexes
apt-get update && DEBIAN_FRONTEND=noninteractive apt-get -y -o Dpkg::Options::="--force-confnew" dist-upgrade

# Install basic APT packages and requirements
apt_get_install htop sysstat dstat iotop
# apt_get_install ... 
apt_get_install python-pip
apt_get_install ntp
# apt_get_install ... 

# Install PIP requirements
pip install six==1.8.0
pip install boto
pip install awscli
# pip install ... 

# Configure NTP
service ntp stop		# Stop ntp daemon to free NTP socket
sleep 3				# Give the daemon some time to exit
ntpdate pool.ntp.org		# Sync time
service ntp start		# Re-enable the NTP daemon

# Configure other system-specific settings ... 

# Configure automatic security updates
cat > /etc/apt/apt.conf.d/20auto-upgrades << "EOF"
APT::Periodic::Update-Package-Lists "1";
APT::Periodic::Unattended-Upgrade "1";
EOF
/etc/init.d/unattended-upgrades restart

# Update system limits
cat > /etc/security/limits.d/my_limits.conf << "EOF"
*               soft    nofile          999999
*               hard    nofile          999999
root            soft    nofile          999999
root            hard    nofile          999999
EOF
ulimit -n 999999

# Update sysctl variables
cat > /etc/sysctl.d/my_sysctl.conf << "EOF"
net.core.somaxconn=65535
net.core.netdev_max_backlog=65535
# net.core.rmem_max=8388608
# net.core.wmem_max=8388608
# net.core.rmem_default=65536
# net.core.wmem_default=65536
# net.ipv4.tcp_rmem=8192 873800 8388608
# net.ipv4.tcp_wmem=4096 655360 8388608
# net.ipv4.tcp_mem=8388608 8388608 8388608
# net.ipv4.tcp_max_tw_buckets=6000000
# net.ipv4.tcp_max_syn_backlog=65536
# net.ipv4.tcp_max_orphans=262144
# net.ipv4.tcp_synack_retries = 2
# net.ipv4.tcp_syn_retries = 2
# net.ipv4.tcp_fin_timeout = 7
# net.ipv4.tcp_slow_start_after_idle = 0
# net.ipv4.ip_local_port_range = 2000 65000
# net.ipv4.tcp_window_scaling = 1
# net.ipv4.tcp_max_syn_backlog = 3240000
# net.ipv4.tcp_congestion_control = cubic
EOF
sysctl -p /etc/sysctl.d/my_sysctl.conf

# Create specific users and groups 
# addgroup ...
# useradd ... 
# usermod ...

# Create expected set of directories
DIRECTORIES="
	/var/log/...
	/run/...
	/srv/... 
	/opt/...
	"

for DIRECTORY in $DIRECTORIES; do
	mkdir -p $DIRECTORY
	chown USER:GROUP $DIRECTORY	
done

# Create custom_crontab
cat > /home/ubuntu/custom_crontab << "EOF"

EOF

# Enable custom cronjobs
su - ubuntu -c "/usr/bin/crontab /home/ubuntu/custom_crontab"

# Install main application / service 
# ...
# ... 

# Configure main application / service
# ... 
# ... 

# Make everything survive reboot
cat > /etc/rc.local << "EOF"
#!/bin/sh

# Regenerate disk layout on ephemeral storage 
# ... 

# Start the application 
# ... 

EOF

# Start application
# service XXX restart 

# Tag the instance (NOTE: Depends on a configured AWS CLI)
INSTANCE_ID=`curl -s http://169.254.169.254/latest/meta-data/instance-id`
# aws ec2 create-tags --resources $INSTANCE_ID --tags Key=Name,Value=... 

# Mark successful execution
exit 0

Trivial. Yet it incorporates a lot in just ~200 lines of code:

Disk layout management;
Package repositories configuration;
Basic tool set and third party software installation;
Service reconfiguration (NTP, Automatic security updates);
System reconfiguration (limits, sysctl, users, directories, crontab);
Post-reboot startup configuration;
Identity discovery and self-tagging;
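
For the post-reboot startup step, it helps to remember that on a stock Ubuntu 14.04 system /etc/rc.local is only run at boot if it is executable, and by convention it ends with 'exit 0'. The sketch below writes such a file to a path given as an argument. The helper name write_rc_local and the placeholder comments are assumptions, not part of the template.

```shell
#!/bin/bash

# Write a boot-time script to the given path and make it executable.
# 'write_rc_local' is a hypothetical helper; the target defaults to
# /etc/rc.local when no argument is given.
write_rc_local() {
	local target=${1:-/etc/rc.local}
	cat > "$target" << "EOF"
#!/bin/sh

# Regenerate disk layout on ephemeral storage
# ...

# Start the application
# ...

exit 0
EOF
	chmod 755 "$target"
}

# Possible usage inside the UserData script:
# write_rc_local /etc/rc.local
```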

As an added bonus, the cloud-init package will log all output during the script execution in /var/log/cloud-init-output.log for failure investigations. The script uses the -ex bash parameters, which means it will echo all executed commands (-x) and exit at the first unsuccessful command (-e).
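
One side effect of -e is easy to miss: a variable assignment from a command substitution inherits the exit status of that command, so a grep that finds nothing aborts the whole script. The following sketch demonstrates this in a child shell. The helper survives_errexit is hypothetical and exists only for this illustration.

```shell
#!/bin/bash

# Run a snippet in a child bash with -e and report whether it reached the end.
# 'survives_errexit' is a hypothetical helper used only for this illustration.
survives_errexit() {
	if bash -e -c "$1; echo reached-end" 2> /dev/null | grep -q "reached-end"; then
		echo "completed"
	else
		echo "aborted"
	fi
}

# grep finds nothing, returns 1, and -e stops the script at the assignment
survives_errexit 'MOUNTED=`echo /dev/xvda1 | grep /dev/xvdb`'

# Appending '|| true' keeps the empty result without tripping -e
survives_errexit 'MOUNTED=`echo /dev/xvda1 | grep /dev/xvdb || true`'
```

The first call prints "aborted" and the second prints "completed", which is why checks of the form "is this device mounted?" usually need the '|| true' guard under -e.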

NOTE: One important component, log file management, is purposefully omitted from the template UserData. We plan on discussing that in a separate article.


Small Tip: How to use --block-device-mappings to manage instance volumes with AWS CLI

Ivo Vachkov — Wed, 26 Nov 2014 10:18:37 +0000

This post presents one of the less popular features of the AWS CLI tool set: how to deal with EC2 instance volumes through the --block-device-mappings parameter. A previous post, Small Tip: Use AWS CLI to create instances with bigger root partitions, already presents one of the common use cases, modifying the instance root partition size. However, the use of --block-device-mappings can go far beyond this simple feature.

The default documentation (http://docs.aws.amazon.com/cli/latest/reference/ec2/run-instances.html), although a good start, is somewhat limited. Several tips and tricks are presented here.

The location of the JSON block device mapping specification can be quite flexible. The mappings can be supplied:

1. Using command line directly:

--block-device-mappings '[ {"DeviceName":"/dev/sdb","VirtualName":"ephemeral0"}, {"DeviceName":"/dev/sdc","VirtualName":"ephemeral1"}]'

2. Using file as a source:

--block-device-mappings file:///home/ec2-user/mapping.json

3. Using URL as a source:

--block-device-mappings http://mybucket.s3.amazonaws.com/mapping.json

Source: http://understeer.hatenablog.com/entry/2013/10/18/223618
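
The file-based variant is convenient when the mapping is generated by a script. As a minimal sketch, the helper below writes the ephemeral mapping from the first example to a file and prints the matching command-line argument. The name write_mapping_file is hypothetical, and the AMI ID and instance type in the usage comment are placeholders.

```shell
#!/bin/bash

# Write an ephemeral-volume mapping to a file and print the matching
# command-line argument. 'write_mapping_file' is a hypothetical helper.
write_mapping_file() {
	local target=$1
	cat > "$target" << "EOF"
[
  { "DeviceName": "/dev/sdb", "VirtualName": "ephemeral0" },
  { "DeviceName": "/dev/sdc", "VirtualName": "ephemeral1" }
]
EOF
	echo "--block-device-mappings file://$target"
}

# Possible usage (AMI ID and instance type are placeholders):
# MAPPING_ARG=`write_mapping_file /tmp/mapping.json`
# aws ec2 run-instances --image-id ami-xxxxxxxx --instance-type m3.large $MAPPING_ARG
```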

Other common scenarios:

1. To reorder default ephemeral volumes to ensure stability of the environment:

[
  {
    "DeviceName": "/dev/sde",
    "VirtualName": "ephemeral0"
  },
  {
    "DeviceName": "/dev/sdf",
    "VirtualName": "ephemeral1"
  }
]

NOTE: Useful for additional UserData processing or deployments with hardcoded settings.

2. To allocate additional EBS Volume with specific size (100GB), to be associated with the EC2 instance:

[
  {
    "DeviceName": "/dev/sdg",
    "Ebs": {
      "VolumeSize": 100
    }
  }
]

NOTE: Useful for cases where cheaper instance types are outfitted with big volumes (Disk intensive tasks run on low-CPU/MEM instance types).

3. To allocate new volume from Snapshot ID:

[
  {
    "DeviceName": "/dev/sdh",
    "Ebs": {
      "SnapshotId": "snap-xxxxxxxx"
    }
  }
]

NOTE: Useful for pre-loading newly created instances with specific disk data while still retaining the ability to modify the local copy.

4. To omit mapping of a particular Device Name:

[
  {
    "DeviceName": "/dev/sdj",
    "NoDevice": ""
  }
]

NOTE: Useful to overwrite default AWS behavior.

5. To allocate new EBS Volume with explicit termination behavior (Keep after instance termination):

[
  {
    "DeviceName": "/dev/sdc",
    "Ebs": {
      "VolumeSize": 10,
      "DeleteOnTermination": false
    }
  }
]

NOTE: Useful to keep instance data after termination; the additional cost may be significant if those volumes are not released after examination.

6. To allocate a new, encrypted EBS Volume with Provisioned IOPS:

[
  {
    "DeviceName": "/dev/sdc",
    "Ebs": {
      "VolumeSize": 100,
      "VolumeType": "io1",
      "Iops": 1000,
      "Encrypted": true
    }
  }
]

NOTE: Useful to set minimum required performance levels (I/O Operations Per Second) for the specified volume.
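
When mappings like the ones above are needed with varying sizes, the JSON can be generated instead of written by hand. The helper below is a sketch under that assumption: ebs_mapping is a hypothetical function, not an AWS CLI feature, and it covers only the VolumeSize and DeleteOnTermination fields.

```shell
#!/bin/bash

# Print a one-element block device mapping for a new EBS volume.
# Arguments: device name, size in GB, DeleteOnTermination (true/false,
# defaults to true). 'ebs_mapping' is a hypothetical helper.
ebs_mapping() {
	local device=$1 size=$2 delete=${3:-true}
	printf '[{"DeviceName":"%s","Ebs":{"VolumeSize":%s,"DeleteOnTermination":%s}}]\n' \
		"$device" "$size" "$delete"
}

# Prints: [{"DeviceName":"/dev/sdg","Ebs":{"VolumeSize":100,"DeleteOnTermination":false}}]
ebs_mapping /dev/sdg 100 false
```

The output can be passed directly as the value of --block-device-mappings, provided it is quoted so the shell keeps it as a single argument.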

The outlined functionality should cover a wide range of potential use cases for DevOps engineers who want to use automation to customize their infrastructure. Flexible instance volume management is a key ingredient for successful implementation of the ‘Infrastructure-as-Code’ paradigm!


How to implement multi-cloud deployment for scalability and reliability

Ivo Vachkov — Fri, 18 Jul 2014 15:20:12 +0000

Introduction

This post presents an interesting approach to scalability and reliability:

How to implement multi-cloud application deployment?!

There are many reasons why this is an interesting topic. Avoiding provider lock-in, reducing the impact of cloud provider outages, increasing world-wide coverage, and disaster recovery / preparedness are only some of them. The obvious benefits of multi-cloud deployment are increased reliability and minimized outage impact. However, there are drawbacks too: supporting different sets of code to accommodate similar, but different services, increased cost, increased infrastructure complexity, different tools … Yet, despite the drawbacks, the possible benefits far outweigh the negatives!

In the following article a simple service will be deployed in an automated fashion over two different Cloud Service Providers: Amazon AWS and Joyent. A third provider, CloudFlare, will be used to service DNS requests. The choice of providers is not random. They are chosen because of particular similarities and because of their ease of use. All of those providers have consistent, comprehensive APIs that allow automation through programming in parallel to the command line tools.

Preliminary information

The service setup, described here, although synthetic, is representative of multiple usage scenarios. More complex scenarios are also possible. Special care should be taken to address use of common resources or non-replicable resources/states. Understand the dependencies of your application architecture before using multi-cloud setup. Or contact Xi Group Ltd. to aid you in this process!

The following Cloud Service Providers will be used to deploy executable code: Amazon AWS and Joyent.

DNS requests will be served by CloudFlare. The test domain is: scalability.expert

Required tools are:

Additional information can be found in AWS CLI, Joyent CloudAPI Documentation and CloudFlare ClientAPI.

Implementation Details

A service, the website for www.scalability.expert, has to be deployed over multiple clouds. For simplicity, it is assumed that this is a static web site, served by NginX. It will run on Ubuntu 14.04 LTS. The instance types chosen in both AWS and Joyent are pretty limited, but should provide enough computing power to run NginX and serve static content. CloudFlare must be configured with basic settings for the DNS zone it will serve (in this case, the free CloudFlare account is used).

Each computing instance, when bootstrapped or restarted, will start NginX and register itself in CloudFlare. At that point it should be able to receive client traffic. Upon termination or shutdown, each instance should remove its own entries from CloudFlare, thus preventing DNS zone pollution with dead entries. In a previous article, How to implement Service Discovery in the Cloud, it was discussed how DNS-SD can be implemented for a similar setup with increased client complexity. In a multi-tier architecture this is a proper solution. However, lack of control over the client browser may mean that a simplistic solution, like the one described here, is a better choice.
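
The register-on-start / deregister-on-stop lifecycle maps naturally onto the actions of an init script. The following is a minimal sketch of that dispatch only. The function names are hypothetical and the two placeholders simply print a message where a real script would call the DNS provider's API.

```shell
#!/bin/bash

# Placeholders for the real work; in an actual init script these would
# call the DNS provider's API to add or remove the instance's record.
register_instance()   { echo "registering"; }
deregister_instance() { echo "deregistering"; }

# Map init actions to the registration lifecycle
dispatch() {
	case "$1" in
		start)
			register_instance
			;;
		stop)
			deregister_instance
			;;
		restart)
			deregister_instance
			register_instance
			;;
		*)
			echo "Usage: $0 {start|stop|restart}" >&2
			return 1
			;;
	esac
}

# Possible usage at the end of an init script:
# dispatch "$1"
```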

CloudFlare

CloudFlare setup uses the free account and one domain, scalability.expert, is configured:

Basic configuration includes only one entry for the zone name:

As seen by the orange cloud icon, the requests for this record will be routed through CloudFlare’s network for inspection and analysis!

AWS UserData / Joyent Script

To automate the process of configuring instances, the following UserData script will be used:

#!/bin/bash -ex

# Debian apt-get install function
apt_get_install()
{
    DEBIAN_FRONTEND=noninteractive apt-get -y \
    -o DPkg::Options::=--force-confdef \
    -o DPkg::Options::=--force-confold \
    install "$@"
}
 
# Mark execution start
echo "STARTING" > /root/user_data_run
 
# Some initial setup
export DEBIAN_FRONTEND=noninteractive
apt-get update && apt-get upgrade -y

# Mark progress ...
echo "OS UPDATE FINISHED" >> /root/user_data_run
 
# Install required packages
apt_get_install jq nginx

# Mark progress ...
echo "SOFTWARE DEPENDENCIES INSTALLED" >> /root/user_data_run

# Create test html page
mkdir -p /var/www
cat > /var/www/index.html << "EOF"
<html>
    <head>
        <title>Demo Page</title>
    </head>

    <body>
        <h1>Demo Page</h1>

        <p>Status: running</p>
    </body>
</html>
EOF

# Configure NginX
cat > /etc/nginx/conf.d/demo.conf << "EOF"
# Minimal NginX VirtualHost setup
server {
    listen 8080;
 
    root /var/www;
    index index.html index.htm;
 
    location / {
        try_files $uri $uri/ =404;
    }
}
EOF
 
# Restart NginX with the new settings
/etc/init.d/nginx restart

# Mark progress ...
echo "NGINX CONFIGURED" >> /root/user_data_run

# /etc/init.d startup script
cat > /etc/init.d/cloudflare-submit.sh << "EOF"
#! /bin/bash
#
# Author: Ivo Vachkov (ivachkov@xi-group.com)
#
### BEGIN INIT INFO
# Provides: DNS-SD Service Group Registration / De-Registration
# Required-Start:
# Should-Start:
# Required-Stop:
# Should-Stop:
# Default-Start:  2 3 4 5
# Default-Stop:   0 1 6
# Short-Description:    Start / Stop script for DNS-SD
# Description:          Use to JOIN/LEAVE DNS-SD Service Group
### END INIT INFO

set -e
umask 022

# DNS Configuration details
ZONE="scalability.expert"
HOST="www"
TTL="120"
IP=""

# CloudFlare Specific Settings
CF_HOST="https://www.cloudflare.com/api_json.html"
CF_SERVICEMODE="0" # 0: Disable / 1: Enable CloudFlare acceleration network

# Edit the following parameters with your specific settings
CF_TOKEN="cloudflaretoken" 
CF_ACCOUNT="account@cloudflare.com"

# Execution log file
LOG_FILE=/var/log/cloudflare-submit.log

source /lib/lsb/init-functions

export PATH="${PATH:+$PATH:}/usr/sbin:/sbin:/usr/bin:/usr/local/bin:/usr/local/sbin"

# Get public IP
get_public_ip () {
        # Check what cloud provider this code is running on
        if [ ! -f "/var/lib/cloud/data/instance-id" ]; then
                echo "$0: /var/lib/cloud/data/instance-id is not available! Unsupported environment! Exiting ..."
                exit 1
        fi

        # Get the instance public IP address
        I_ID=`cat /var/lib/cloud/data/instance-id`
        if [[ $I_ID == i-* ]]; then
                # Amazon AWS
                IP=`curl http://169.254.169.254/latest/meta-data/public-ipv4`
        else
                # Joyent
                IP=`ifconfig eth0 | grep "inet addr" | awk '{print $2}' | cut -c6-`
        fi
}

# Default Start function
cloudflare_register () {
        # Get instance public IP address
        get_public_ip

        # Check the result
        if [ -z "$IP" ]; then
                echo "$0: Unable to obtain public IP Address! Exiting ..."
                exit 1
        fi

        # Execute update towards CloudFlare API
        curl -s $CF_HOST \
                -d "a=rec_new" \
                -d "tkn=$CF_TOKEN" \
                -d "email=$CF_ACCOUNT" \
                -d "z=$ZONE" \
                -d "type=A" \
                -d "name=$HOST" \
                -d "content=$IP" \
                -d "ttl=$TTL" >> $LOG_FILE
    
        # Get record ID for this IP
        REC_ID=`curl -s $CF_HOST \
                -d "a=rec_load_all" \
                -d "tkn=$CF_TOKEN" \
                -d "email=$CF_ACCOUNT" \
                -d "z=$ZONE" | jq -a '.response.recs.objs[] | .content, .rec_id' | grep -A 1 $IP| tail -1 | awk -F"\"" '{print $2}'`

        # Update with desired service mode
        curl -s $CF_HOST \
                -d "a=rec_edit" \
                -d "tkn=$CF_TOKEN" \
                -d "email=$CF_ACCOUNT" \
                -d "z=$ZONE" \
                -d "id=$REC_ID" \
                -d "type=A" \
                -d "name=$HOST" \
                -d "content=$IP" \
                -d "ttl=1" \
                -d "service_mode=$CF_SERVICEMODE" >> $LOG_FILE
}

# Default Stop function
cloudflare_deregister () {
        # Get instance public IP address
        get_public_ip

        # Check the result
        if [ -z "$IP" ]; then
                echo "$0: Unable to obtain public IP Address! Exiting ..."
                exit 1
        fi

        # Get record ID for this IP
        REC_ID=`curl -s $CF_HOST \
                -d "a=rec_load_all" \
                -d "tkn=$CF_TOKEN" \
                -d "email=$CF_ACCOUNT" \
                -d "z=$ZONE" | jq -a '.response.recs.objs[] | .content, .rec_id' | grep -A 1 $IP| tail -1 | awk -F"\"" '{print $2}'`

        # Execute update towards CloudFlare API
        curl -s $CF_HOST \
                -d "a=rec_delete" \
                -d "tkn=$CF_TOKEN" \
                -d "email=$CF_ACCOUNT" \
                -d "z=$ZONE" \
                -d "id=$REC_ID" >> $LOG_FILE
}

case "$1" in
start)
        log_daemon_msg "Registering $HOST.$ZONE  with CloudFlare ... " || true
        cloudflare_register
        ;;
stop)
        log_daemon_msg "De-Registering $HOST.$ZONE with CloudFlare ... " || true
        cloudflare_deregister
        ;;
restart)
        log_daemon_msg "Restarting ... " || true
        cloudflare_deregister
        cloudflare_register
        ;;
*)
        log_action_msg "Usage: $0 {start|stop|restart}" || true
        exit 1
esac

exit 0
EOF

# Add it to the startup / shutdown process
chmod +x /etc/init.d/cloudflare-submit.sh
update-rc.d cloudflare-submit.sh defaults 99

# Mark progress ...
echo "CLOUDFLARE SCRIPT INSTALLED" >> /root/user_data_run

# Register with CloudFlare to start receiving requests
/etc/init.d/cloudflare-submit.sh start

# Mark execution end
echo "DONE" > /root/user_data_run

This UserData script contains three components:

Lines 0 – 62: Boilerplate, OS update, installation and configuration of NginX;
Lines 64 – 215: cloudflare-submit.sh, the main script that will be called on startup and shutdown of the instance. cloudflare-submit.sh will register the instance’s public IP address with CloudFlare and set the required protection. By default, protection and acceleration are off. Additional configuration is required to make this script work for your setup: account details must be configured in the specified variables!
Lines 217 – 228: Setting proper script permissions, configuring automatic start of cloudflare-submit.sh and executing it to register with CloudFlare.
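The record lookup inside cloudflare-submit.sh (the curl | jq | grep | awk pipeline) can also be sketched in Python. This is only an illustration, assuming the response has the shape implied by the jq filter used in the script (.response.recs.objs[] with content and rec_id fields); find_record_id and the sample record IDs are hypothetical and not part of the original script:

```python
import json


def find_record_id(api_response, ip):
    """Return the rec_id of the first record whose content equals ip, or None."""
    payload = json.loads(api_response)
    for record in payload["response"]["recs"]["objs"]:
        # Exact comparison, unlike grep, which treats the IP as a pattern
        if record.get("content") == ip:
            return record.get("rec_id")
    return None


# Hypothetical response body, reduced to the fields the script relies on
sample = json.dumps({"response": {"recs": {"objs": [
    {"rec_id": "1001", "content": "54.83.175.90"},
    {"rec_id": "1002", "content": "165.225.137.102"},
]}}})

print(find_record_id(sample, "165.225.137.102"))
```

Comparing the content field exactly avoids matching an address that merely contains the searched IP as a substring.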

Code is reasonably straight-forward. The init.d startup script is divided into multiple functions and output is redirected to a log file for debugging purposes. External dependencies are kept to a minimum. Distinguishing between AWS EC2 and Joyent instances is done by analyzing the instance ID. In AWS, all EC2 instances have instance IDs starting with ‘i-’, while Joyent uses (by the looks of it) some sort of UUID. This part of the logic is particularly important if the code should be extended to support other cloud providers!
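If more providers have to be supported, the instance ID check is the natural place to extend. Here is a minimal Python sketch of the same decision, assuming (as observed above) that EC2 IDs start with "i-" and Joyent IDs parse as UUIDs; detect_provider and the "unknown" fallback are hypothetical additions, not something the init script does:

```python
import uuid


def detect_provider(instance_id):
    """Guess the cloud provider from the format of the instance ID."""
    instance_id = instance_id.strip()
    if instance_id.startswith("i-"):
        return "aws"
    try:
        # Joyent IDs look like UUIDs; anything else is treated as unsupported
        uuid.UUID(instance_id)
    except ValueError:
        return "unknown"
    return "joyent"


print(detect_provider("i-1a2b3c4d"))
print(detect_provider("286b0dc0-d09e-43f2-976a-bb1880ebdb6c"))
```

Returning an explicit "unknown" value, instead of assuming Joyent for everything that is not AWS, makes it harder to register a wrong IP address when the code runs somewhere unexpected.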

Both AWS and Joyent offer Ubuntu 14.04 support, so the same code can be used to configure the instances in an automated fashion. This is particularly handy when it comes to data driven instance management and the DRY principle. Command line tools for both cloud providers also offer similar syntax, which makes it easier to utilize this functionality.

Amazon AWS

Starting new instances within Amazon AWS is straight-forward, assuming awscli is properly configured:

aws ec2 run-instances \
    --image-id ami-018c9568 \
    --count 1 \
    --instance-type t1.micro \
    --key-name test-key \
    --security-groups test-sg \
    --user-data file://userdata-script.sh

Joyent

Starting new instances within Joyent is somewhat more complex, but there is comprehensive documentation:

sdc-createmachine \
    --account account_name \
    --keyId aa:bb:cc:dd:ee:ff:gg:hh:ii:jj:kk:ll:mm:nn:oo:pp \
    --name test \
    --package "4dad8aa6-2c7c-e20a-be26-c7f4f1925a9a" \
    --tag Name=test \
    --url "https://us-east-1.api.joyentcloud.com" \
    --metadata "Name=test" \
    --image 286b0dc0-d09e-43f2-976a-bb1880ebdb6c \
    --script userdata-script.sh

This particular example will start new SmartMachine instance using the 4dad8aa6-2c7c-e20a-be26-c7f4f1925a9a package (g3-devtier-0.25-kvm, 3rd generation, virtual machine (KVM) with 256MB RAM) and 286b0dc0-d09e-43f2-976a-bb1880ebdb6c (ubuntu-certified-14.04) image. SSH key details are supplied through the specific combinations of Web-interface settings and SSH key signature. For the list of available packages (instance types) and images (software stacks) consult the API: ListPackages, ListImages.

NOTE: Joyent offers rich Metadata support, which can be quite a flexible tool when managing a large number of instances!

Successful service configuration

Successful service configuration will result in proper DNS entries being added to the scalability.expert DNS zone in CloudFlare:

After configured TTL, those should be visible world-wide:

:~> nslookup www.scalability.expert
Server:         8.8.4.4
Address:        8.8.4.4#53

Non-authoritative answer:
Name:   www.scalability.expert
Address: 54.83.175.90
Name:   www.scalability.expert
Address: 165.225.137.102

:~>

As seen, both AWS (54.83.175.90) and Joyent (165.225.137.102) IP addresses are returned, i.e. DNS Round-Robin. Service can simply be tested with:
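A client can inspect the same round-robin set programmatically. The following is a minimal Python sketch using only the standard library; resolve_all and its resolver parameter are hypothetical names introduced here so the lookup can be replaced in tests, and the default port 8080 simply mirrors the NginX configuration above:

```python
import socket


def resolve_all(hostname, port=8080, resolver=socket.getaddrinfo):
    """Return the sorted, de-duplicated IPv4 addresses published for hostname."""
    infos = resolver(hostname, port, socket.AF_INET, socket.SOCK_STREAM)
    # Each entry is (family, type, proto, canonname, (address, port))
    return sorted({info[4][0] for info in infos})
```

Calling resolve_all("www.scalability.expert") performs a real DNS lookup, so the result depends on the resolver and on the records registered at that moment.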

:~> curl http://www.scalability.expert:8080/
<html>
    <head>
        <title>Demo Page</title>
    </head>

    <body>
        <h1>Demo Page</h1>

        <p>Status: running</p>
    </body>
</html>
:~>

Resulting calls can be seen in the NginX log files on both instances:

NOTE: CloudFlare protection and acceleration features are explicitly disabled in this example! It is strongly suggested to enable them for production purposes!

Conclusion

It should be clear now that whenever the software architecture follows certain design principles and the application is properly decoupled into multiple tiers, the whole system can be deployed within multiple cloud providers. DevOps principles for automated deployment can be implemented in this environment as well. The overall system gains improved scalability, reliability and, in the case of data driven elastic deployments, even cost! Proper design is key, but the technology provided by companies like Amazon and Joyent makes it easier to turn whiteboard drawings into actual systems with hundreds of nodes!

Small Tip: AWS announces T2 instance types

Ivo Vachkov — Fri, 04 Jul 2014 14:38:04 +0000

One of the oldest and probably one of the most popular instance types, the t1.micro, was recently upgraded by AWS. Three new instance types were introduced to fill the gap between t1.micro and the next size up, m3.medium. The new generation is called T2, uses only HVM-based virtualization and supports EBS-only storage. There are three new instance types:

t2.micro
t2.small
t2.medium

Those instance types are all “Burstable Performance Instances”, which means they are suitable for unsustained loads. This is also supported by the EBS-only storage, which effectively means that high-volume I/O is out of the question. The fact that those instances are all using HVM-based virtualization, however, supports quick SCALE-UP to more potent instance types, if the need arises. One notable remark here is that T2 instances are VPC-only, which is a strong indication of the will to move everything into VPCs nowadays. AWS wants you to start using VPCs from the start!

The instance resource matrix now looks like this:

Instance Type	Virtualization Type	CPU Cores	Memory	Storage
t1.micro	PV	1	0.613 GB	EBS Only
t2.micro	HVM	1	1 GB	EBS Only
m1.small	PV	1	1.7 GB	EBS Only
t2.small	HVM	1	2 GB	EBS Only
m3.medium	HVM	1	3.75 GB	EBS + SSD
t2.medium	HVM	2	4 GB	EBS Only

As stated by AWS, the target uses for the new T2 instance type family include:

Development environments;
Private experimentation;
Educational use;
Build servers / Code repositories;
Low-traffic web applications;
Small databases.

To evaluate the meaning of “Burstable Performance Instances“, here are CPU benchmark results on several instance types:

Instance Type	DES crypts/s	MD5 crypts/s	Blowfish crypts/s	Generic crypts/s
t1.micro	~ 2 407 000	~ 6 869	~ 442	~ 187 257
t2.micro	~ 4 757 000	~ 14 164	~ 851	~ 344 928
m1.small	~ 1 218 000	~ 3 480	~ 222	~ 92 870
t2.small	~ 4 993 000	~ 14 245	~ 854	~ 347 961
m3.medium	~ 2 272 000	~ 6 429	~ 386	~ 158 342
t2.medium	~ 5 045 000	~ 14 592	~ 878	~ 356 544

All instances use default settings for storage, Amazon Linux AMI 2014.03.2 and John The Ripper 1.8.0, measuring real crypts with many salts! The test is fairly synthetic, but answers the key question: What difference does it make to have a Burstable instance type? And the answer: If CPU load is not sustained, t2.micro is roughly twice as fast as t1.micro and m3.medium, and close to four times as fast as m1.small!
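The relative speed-up can be read directly off the table. The following short Python sketch does the arithmetic for the DES column only; the figures are copied from the table above, while the speedup helper is a hypothetical name used for illustration:

```python
# DES crypts/s per instance type, copied from the benchmark table
DES_CRYPTS = {
    "t1.micro": 2407000,
    "t2.micro": 4757000,
    "m1.small": 1218000,
    "t2.small": 4993000,
    "m3.medium": 2272000,
    "t2.medium": 5045000,
}


def speedup(new, old, table=DES_CRYPTS):
    """Return how many times faster `new` is than `old` in the given column."""
    return table[new] / table[old]


for old in ("t1.micro", "m1.small", "m3.medium"):
    print("t2.micro vs {0}: {1:.2f}x".format(old, speedup("t2.micro", old)))
```

The same calculation works for the other columns by passing a different table; the ratios differ slightly per algorithm.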

Price-wise the new instance types are also better. Cost reduction of On Demand prices of more than 35% allows you to run t2.micro for less than 10 USD/m! Watch out, DigitalOcean! Obviously, Amazon wants to change the already established “AWS for business, DigitalOcean for home” mantra into “AWS Everywhere”.

In conclusion, the new T2 instance type family closes the gap between the unacceptably low performance instance type (t1.micro) and the too expensive instance types (m1.small, m3.medium), which creates the sweet spot for entry users, cloud enthusiasts and home users. As someone said: “Now you have an instance type to run WordPress on!”

DevOps Shell Script Template

Ivo Vachkov — Thu, 03 Jul 2014 15:49:21 +0000

In everyday life of a DevOps engineer you will have to create multiple pieces of code. Some of those will be run once, others … well others will live forever. Although it may be compelling to just put all the commands in a text editor, save the result and execute it, one should always consider the “bigger picture”. What will happen if your script is run on another OS, on another Linux distribution, or even on a different version of the same Linux distribution?! Another point of view is to think what will happen if somehow your neat 10-line-script has to be executed on say 500 servers?! Can you be sure that all the commands will run successfully there? Can you even be sure that all the commands will even be present? Usually … No!

Faced with similar problems on a daily basis we started devising simple solutions and practices to address them. One of those is the process of standardizing the way different utilities behave, the way they take arguments and report errors. Upon further investigation it became clear that a pattern can be extracted and synthesized into a series of templates that one can use in daily work to keep common behavior between different utilities and components.

Here is the basic template used in shell scripts:

#!/bin/sh
#
# DESCRIPTION: ... Include functional description ...
#
# Requirements:
#	awk
#	... 
#	uname
#
# Example usage:
#	$ template.sh -h 
#	$ template.sh -p ARG1 -q ARG2
#

RET_CODE_OK=0
RET_CODE_ERROR=1

# Help / Usage function
print_help() {
	echo "$0: Functional description of the utility"
	echo ""
	echo "$0: Usage"
	echo "    [-h] Print help"
	echo "    [-p] (MANDATORY) First argument"
	echo "    [-q] (OPTIONAL) Second argument"
	return $RET_CODE_OK;
}

# Check for supported operating system
p_uname=`whereis uname | cut -d' ' -f2`
if [ ! -x "$p_uname" ]; then
	echo "$0: No UNAME available in the system"
	exit $RET_CODE_ERROR;
fi
OS=`$p_uname`
if [ "$OS" != "Linux" ]; then
	echo "$0: Unsupported OS!";
	exit $RET_CODE_ERROR;
fi

# Check if awk is available in the system
p_awk=`whereis awk | cut -d' ' -f2`
if [ ! -x "$p_awk" ]; then
	echo "$0: No AWK available in the system!";
	exit $RET_CODE_ERROR;
fi

# Check for other used local utilities
#	bc
#	curl
#	grep 
#	etc ...

# Parse command line arguments
while test -n "$1"; do
	case "$1" in
	--help|-h)
		print_help
		exit 0
		;;
	-p)
		P_ARG=$2
		shift
		;;
	-q)
		Q_ARG=$2
		shift
		;;
	*)
		echo "$0: Unknown Argument: $1"
		print_help
		exit $RET_CODE_ERROR;
		;;
	esac
	
	shift
done

# Check if mandatory argument is present?
if [ -z "$P_ARG" ]; then
	echo "$0: Required parameter not specified!"
	print_help
	exit $RET_CODE_ERROR;
fi

# ... 

# Check if optional argument is present and if not, initialize!
if [ -z "$Q_ARG" ]; then
	Q_ARG="0";
fi

# ... 

# DO THE ACTUAL WORK HERE 

exit $RET_CODE_OK;

Nothing fancy. Basic framework that does the following:

Lines 3 – 13: Make sure basic documentation, dependency list and example usage patterns are provided with the script itself;
Lines 15 – 16: Define meaningful return codes to allow other utils to identify possible execution problems and react accordingly;
Lines 18 – 27: Basic help/usage() function to provide the user with short guidance on how to use the script;
Lines 29 – 52: Dependency checks to make sure all utilities the script needs are available and executable in the system;
Lines 54 – 77: Argument parsing of everything passed on the command line that supports both short and long argument names;
Lines 79 – 91: Validity checks of the argument values that should make sure arguments are passed contextually correct values;
Lines 95 – N: Actual programming logic to be implemented …
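For utilities written in Python rather than shell, the same structure carries over almost one-to-one. The following is only a sketch under that assumption, not part of the original template: missing_tools, parse_args and main are hypothetical names, and the -p / -q options mirror the placeholders used above:

```python
import argparse
import shutil

RET_CODE_OK = 0
RET_CODE_ERROR = 1


def missing_tools(names):
    """Return the external utilities that are not available on PATH."""
    return [name for name in names if shutil.which(name) is None]


def parse_args(argv):
    """Parse the command line; -p is mandatory, -q defaults to "0"."""
    parser = argparse.ArgumentParser(description="Functional description of the utility")
    parser.add_argument("-p", required=True, help="(MANDATORY) First argument")
    parser.add_argument("-q", default="0", help="(OPTIONAL) Second argument")
    return parser.parse_args(argv)


def main(argv):
    # Dependency checks come first, exactly as in the shell template
    missing = missing_tools(["awk", "uname"])
    if missing:
        print("Missing utilities: " + ", ".join(missing))
        return RET_CODE_ERROR

    args = parse_args(argv)

    # DO THE ACTUAL WORK HERE, using args.p and args.q
    return RET_CODE_OK
```

A typical entry point would be sys.exit(main(sys.argv[1:])). Note that argparse exits with status 2 on invalid arguments, which differs from the RET_CODE_ERROR value of the shell template.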

This template is successfully used in various scenarios: command line utilities, Nagios plugins, startup/shutdown scripts, UserData scripts, daemons implemented in shell script with the help of start-stop-daemon, etc. It is also used to allow deployment on multiple operating systems and distribution versions. Resulting utilities and system components are more resilient, include better documentation and dependency sections, and provide the user with a similar and intuitive way to get help or pass arguments. Error handling is functional enough to go beyond the simple OK / ERROR state. And all of those are important features when components must be run in highly heterogeneous environments such as most cloud deployments!

Small Tip: How to run non-daemon()-ized processes in the background with SupervisorD

Ivo Vachkov — Thu, 26 Jun 2014 11:18:23 +0000

The following article will demonstrate how to use Ubuntu 14.04 LTS and SupervisorD to manage the not-so-uncommon case of long running services that expect to be running in an active console / terminal. Those are usually quickly / badly written pieces of code that do not use daemon(), or an equivalent function, to properly go into the background but instead run forever in the foreground. Over the years multiple solutions emerged, including quite ugly ones (nohup … 2>&1 logfile &). Luckily, there is a better one, and it’s called SupervisorD. With Ubuntu 14.04 LTS it even comes as a package and it should be part of your DevOps arsenal of tools!

In a typical Python / Web-scale environment multiple components will be implemented in a de-coupled, micro-services, REST-based architecture. One of the popular frameworks for REST is Bottle. And there are multiple approaches to build services with Bottle when a full-blown HTTP Server is available (Apache, NginX, etc.) or if performance matters. All of those are valid and somewhat documented. But still, there is the case (and it is more common than one would think) when a developer will create a Bottle server to handle a simple task and it will propagate into production, using an ugly solution like Screen/TMUX or even nohup. Here is a way to put this under proper control.

Test Server code: test-server.py

#!/usr/bin/env python

# Description: Demo Bottle Server to demonstrate use of SupervisorD
#
# How to run:
#       test-server.py -c test-server.conf
#
# Expects the following configuration file:
#
#       server:
#               bind_ip: 0.0.0.0
#               bind_port: 8080
#
#       configuration_variable: true
#

import argparse
import time
import yaml
import sys

from bottle import route, run, template

# GET: /
@route('/')
def index():
        static_page = """
<html>
<head>
        <title>Test Server</title>
</head>
<body>
        <h1>Test Server is working!</h1>
</body>
</html>
        """
        return static_page

# Return the server->bind_ip value from the parsed configuration
def get_bind_ip(config):
        if config:
                return config['server']['bind_ip']
        else:
                return None

# Return the server->bind_port value from the parsed configuration
def get_bind_port(config):
        if config:
                return config['server']['bind_port']
        else:
                return None

# Return sample configuration variable
def get_config_data(config):
        if config:
                return config['configuration_variable']
        else:
                return None

# Main entry point for the application
def main():
        """ Main Entry Point for the application """

        # Parse command line arguments
        parser = argparse.ArgumentParser(description='Demo Server using Bottle')
        parser.add_argument('-c', '--config', type=str, required=True, dest='config', help='Configuration File Location')

        args = parser.parse_args()
        conf_file = args.config

        # Check config file accessibility; any failure to open it is fatal
        try:
                conf_fd = open(conf_file, 'r')
        except IOError:
                print("{progname}: Unable to read the configuration file ({config})!".format(progname=sys.argv[0], config=conf_file))
                sys.exit(1)

        # Parse the YAML configuration; the with-block closes the file
        with conf_fd:
                config = yaml.safe_load(conf_fd)

        # Get configuration data
        bind_ip = get_bind_ip(config)
        bind_port = get_bind_port(config)

        if bind_ip is None or bind_port is None:
                print("{progname}: Required configuration variable is unavailable!".format(progname=sys.argv[0]))
                sys.exit(1)

        config_data = get_config_data(config)

        # Run the web-server
        if config_data == True:
                run(host=bind_ip, port=bind_port)

if __name__ == '__main__':
    main()

Test server configuration file: test-server.conf

# Sample configuration file in YAML format for test-server.py

server:
    bind_ip: 0.0.0.0
    bind_port: 8080

configuration_variable: true

Manual execution of the server code will look like this:

ubuntu@ip-10-67-161-137:~/test-server$ ./test-server.py -c test-server.conf
Bottle v0.12.0 server starting up (using WSGIRefServer())...
Listening on http://0.0.0.0:8080/
Hit Ctrl-C to quit.

94.155.194.28 - - [23/Jun/2014 12:34:39] "GET / HTTP/1.1" 200 126
^C
ubuntu@ip-10-67-161-137:~/test-server$

When the controlling terminal is lost the server will be terminated. Obviously, this is neither acceptable, nor desirable behavior.

With SupervisorD (sudo aptitude install supervisor) the service can be properly managed using a simple configuration file.

Example SupervisorD configuration file: /etc/supervisor/conf.d/test-server.conf

[program:test-server]
command=/home/ubuntu/test-server/test-server.py -c /home/ubuntu/test-server/test-server.conf
user=ubuntu
redirect_stderr=true

To start the service, execute:

ubuntu@ip-10-67-161-137:~$ sudo supervisorctl start test-server
test-server: started
ubuntu@ip-10-67-161-137:~$

To verify successful service start:

ubuntu@ip-10-67-161-137:~$ ps ax
. . . 
 4353 ?        Ss     0:00 /usr/bin/python /usr/bin/supervisord -c /etc/supervisor/supervisord.conf
 4355 ?        S      0:00 python /home/ubuntu/test-server/test-server.py -c /home/ubuntu/test-server/test-server.conf
. . .
ubuntu@ip-10-67-161-137:~$

SupervisorD will redirect stdout and stderr to properly named log files:

ubuntu@ip-10-67-161-137:~$ sudo cat /var/log/supervisor/test-server-stdout---supervisor-ssaGXP.log
Bottle v0.12.0 server starting up (using WSGIRefServer())...
Listening on http://0.0.0.0:8080/
Hit Ctrl-C to quit.

94.155.194.28 - - [23/Jun/2014 13:31:19] "GET / HTTP/1.1" 200 126
ubuntu@ip-10-67-161-137:~$

Those log files can be integrated with a centralized logging architecture or processed for error / anomaly detection separately.

SupervisorD also comes with a handy command-line control utility, supervisorctl:

ubuntu@ip-10-67-161-137:~$ sudo supervisorctl status test-server
test-server                      RUNNING    pid 4355, uptime 0:11:40
ubuntu@ip-10-67-161-137:~$

With some additional effort SupervisorD can react to various types of events (http://supervisord.org/events.html), which brings it one step closer to a full process monitoring & notification solution!
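To give an idea of what reacting to events involves, here is a minimal Python sketch of an event listener that follows the protocol described in the SupervisorD events documentation: the listener announces READY on stdout, reads a header line and a payload of len bytes from stdin, and acknowledges with a RESULT block. The names parse_header, listen and log_event are hypothetical, and the handler only logs to stderr:

```python
import sys


def parse_header(line):
    """Turn the 'key:value key:value ...' header line into a dictionary."""
    return dict(token.split(":", 1) for token in line.split())


def log_event(event_name, payload):
    # stdout is reserved for the protocol, so diagnostics go to stderr
    sys.stderr.write("{0}: {1}\n".format(event_name, payload))


def listen(stdin=sys.stdin, stdout=sys.stdout, handler=log_event):
    """Process events until the input stream is closed."""
    while True:
        stdout.write("READY\n")
        stdout.flush()

        line = stdin.readline()
        if not line:
            break

        header = parse_header(line)
        payload = stdin.read(int(header["len"]))
        handler(header["eventname"], payload)

        # Acknowledge the event: "RESULT <length>\n" followed by the body
        stdout.write("RESULT 2\nOK")
        stdout.flush()
```

Such a script would be registered in an [eventlistener:...] section of the SupervisorD configuration and would call listen() when started; which events to subscribe to depends on the monitoring needs.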

References

SupervisorD Homepage: http://supervisord.org
Bottle Web Framework: http://bottlepy.org/docs/dev/index.html

How to implement Service Discovery in the Cloud

Ivo Vachkov — Tue, 17 Jun 2014 13:51:42 +0000

Introduction

Service Discovery is not new technology. Unfortunately, it is barely understood and rarely implemented. It is a problem that many system architects face and it is key to multiple desirable qualities of a modern, cloud enabled, elastic distributed system such as reliability, availability, maintainability. There are multiple ways to approach service discovery:

Hardcode service locations;
Develop proprietary solution;
Use existing technology.

Hardcoding is still the common case. How often do you encounter hardcoded URLs in configuration files?! Developing a proprietary solution is becoming popular too. Multiple companies decided to address Service Discovery by implementing some sort of distributed key-value store. Amongst the popular ones: CoreOS’s etcd, Heroku’s Doozer, Apache ZooKeeper, Google’s Chubby. Even Redis can be used for such purposes. But for many cases additional software layers and programming complexity are not needed. There is already an existing solution based on DNS. It is called DNS-SD and is defined in RFC6763.

DNS-SD utilizes PTR, SRV and TXT DNS records to provide flexible service discovery. All major DNS implementations support it. All major cloud providers support it. DNS is well established technology, well understood by both Operations and Development personnel with strong support in programming languages and libraries. It is highly-available by replication.

How does DNS-SD work?

DNS-SD uses three DNS records types: PTR, SRV, TXT:

PTR record is defined in RFC1035 as “domain name pointer”. Unlike CNAME records no processing of the contents is performed, data is returned directly.
SRV record is defined in RFC2782 as “service locator”. It should provide protocol agnostic way to locate services, in contrast to the MX records. It contains four components: priority, weight, port and target.
TXT record is defined in RFC1035 as “text string”.

There are multiple specifics around protocol and service naming conventions that are beyond the scope of this post. For more information please refer to RFC6763. For the purposes of this article, it is assumed that a proprietary TCP-based service called theService, which exists in several incarnations, runs on TCP port 4218 on multiple hosts. The basic idea is:

Create a pointer record for _theService that contains all available incarnations of the service;
For each incarnation create SRV record (where the service is located) and TXT record (any additional information for the client) that specify the service details.
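The relationship between the three record types can be modelled with a small in-memory example. The Python sketch below does not query DNS at all; ZONE is a hypothetical dictionary standing in for the zone contents, and discover is a hypothetical helper that follows PTR records to the SRV and TXT records of each incarnation:

```python
# Hypothetical zone contents: (record type, name) -> list of values
ZONE = {
    ("PTR", "_theService._tcp.unilans.net."): [
        "_incarnation1._theService._tcp.unilans.net.",
    ],
    # SRV values are (priority, weight, port, target)
    ("SRV", "_incarnation1._theService._tcp.unilans.net."): [
        (0, 0, 4218, "host1.unilans.net."),
    ],
    ("TXT", "_incarnation1._theService._tcp.unilans.net."): [
        "txtvers=1",
        "data=sampledata",
    ],
}


def discover(service, zone=ZONE):
    """Follow PTR -> SRV/TXT and return the details of every incarnation."""
    result = {}
    for instance in zone.get(("PTR", service), []):
        result[instance] = {
            "srv": zone.get(("SRV", instance), []),
            "txt": zone.get(("TXT", instance), []),
        }
    return result


print(discover("_theService._tcp.unilans.net."))
```

A real client would perform the same two steps with DNS queries: one PTR lookup for the service name, then SRV and TXT lookups for each returned incarnation.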

This is what sample configuration looks like in AWS Route53 for the unilans.net. domain:

Using nslookup results can be verified:

:~> nslookup -q=PTR _theService._tcp.unilans.net.
Server:         8.8.8.8
Address:        8.8.8.8#53

Non-authoritative answer:
_theService._tcp.unilans.net    name = _incarnation1._theService._tcp.unilans.net.
_theService._tcp.unilans.net    name = _incarnation2._theService._tcp.unilans.net.

Authoritative answers can be found from:

:~> nslookup -q=any _incarnation1._theService._tcp.unilans.net.
Server:         8.8.8.8
Address:        8.8.8.8#53

Non-authoritative answer:
_incarnation1._theService._tcp.unilans.net      text = "txtvers=1\; data=sampledata\;"
_incarnation1._theService._tcp.unilans.net      service = 0 0 4218 host1.unilans.net.

Authoritative answers can be found from:

:~>

Now a client that wants to use incarnation1 of theService has means to access it (Host: host1.unilans.net, Port: 4218).

Load-balancing can be implemented by adding another entry in the service locator record with the same priority and weight:

Resulting DNS lookup:

:~> nslookup -q=any _incarnation1._theService._tcp.unilans.net.
Server:         8.8.8.8
Address:        8.8.8.8#53

Non-authoritative answer:
_incarnation1._theService._tcp.unilans.net      text = "txtvers=1\; data=sampledata\;"
_incarnation1._theService._tcp.unilans.net      service = 0 0 4218 host1.unilans.net.
_incarnation1._theService._tcp.unilans.net      service = 0 0 4218 host100.unilans.net.

Authoritative answers can be found from:

:~>

In a similar way, fail-over can be implemented by using different priority (or load distribution using different weights):

Resulting DNS lookup:

:~> nslookup -q=any _incarnation1._theService._tcp.unilans.net.
Server:         8.8.8.8
Address:        8.8.8.8#53

Non-authoritative answer:
_incarnation1._theService._tcp.unilans.net      text = "txtvers=1\; data=sampledata\;"
_incarnation1._theService._tcp.unilans.net      service = 0 0 4218 host1.unilans.net.
_incarnation1._theService._tcp.unilans.net      service = 1 0 4218 host100.unilans.net.

Authoritative answers can be found from:

:~>

NOTE: With DNS the client is the one to implement the load-balancing or the fail-over (although there are exceptions to this rule)!

Benefits of using DNS-SD for Service Discovery

This technology can be used to support multiple versions of a service. Using the built-in support for different incarnations of the same service, versioning can be implemented in a clean, granular way. This is a common problem in REST systems, usually solved by nasty URL schemes or rewriting URLs. With DNS-SD required metadata can be passed through the TXT records and multiple versions of the communication protocol can be supported, each in a contained environment … No name space pollution, no clumsy URL schemes, no URL rewriting …

This technology can be utilized to reduce complexity while building distributed systems. The clients will most certainly go through the process of name resolution anyway, so why not incorporate service discovery in it?! Instead of dealing with an external system (installation, operation, maintenance) and all the possible issues (hard to configure, hard to maintain, immature, fault-intolerant, requires additional libraries in the codebase, etc), incorporate this with the name resolution. DNS is well supported on virtually all operating systems and with all programming languages that provide network programming abilities. System architecture complexity is reduced because a subsystem that already exists is providing additional services, instead of introducing new systems.

This technology can be utilized to increase reliability / fault-tolerance. Reliability / fault-tolerance can be easily increased by serving multiple entries with the service locator records. Priority can be used by the client to go through the list of entries in controlled manner and weight to balance the load between the service providers on each priority level. The combination of backend support (control plane updating DNS-SD records) and reasonably intelligent clients (implementing service discovery and priority/weight parsing) should give granular control over the fail-over and load-balancing processes in the communication between multiple entities.
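The client-side part of this can be sketched in a few lines of Python. This is a simplified illustration, not the full selection algorithm from RFC2782: pick_target is a hypothetical helper that keeps only the lowest priority value and then makes a weighted random choice, falling back to a uniform choice when all weights are zero:

```python
import random


def pick_target(srv_records, rng=random):
    """Choose (target, port) from SRV tuples of (priority, weight, port, target)."""
    records = list(srv_records)
    if not records:
        return None

    # Lower priority value means "try first"
    best = min(record[0] for record in records)
    candidates = [record for record in records if record[0] == best]

    weights = [record[1] for record in candidates]
    if sum(weights) == 0:
        chosen = rng.choice(candidates)
    else:
        chosen = rng.choices(candidates, weights=weights, k=1)[0]
    return chosen[3], chosen[2]


failover = [(0, 0, 4218, "host1.unilans.net."), (1, 0, 4218, "host100.unilans.net.")]
print(pick_target(failover))
```

A complete client would also retry with the next priority level when the chosen target cannot be reached; that part is left out here.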

This technology supports system elasticity. Modern cloud service providers have APIs to control DNS zones. In this article, AWS Route53 will be used to demonstrate how an elastic service can be introduced through DNS-SD to clients. Backend service scaling logic can modify service locator records to reflect the current service state, as long as a DNS zone modification API is available. This is just part of the control plane for the service …

Bonus point: DNS also gives you simple, replicated key-value store through TXT records!
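Reading such key-value data on the client side is simple. As a minimal Python sketch, assuming the TXT data is presented the way nslookup printed it earlier (key=value pairs separated by escaped semicolons), a hypothetical parse_txt helper could look like this:

```python
def parse_txt(text):
    """Parse 'key=value\\; key=value\\;' TXT data into a dictionary."""
    pairs = {}
    # nslookup escapes semicolons, so normalize them before splitting
    for item in text.replace("\\;", ";").split(";"):
        item = item.strip()
        if not item:
            continue
        key, _, value = item.partition("=")
        pairs[key] = value
    return pairs


print(parse_txt(r"txtvers=1\; data=sampledata\;"))
```

How the raw TXT strings arrive depends on the DNS library in use, so the normalization step may need to be adapted.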

Implementation of Service Discovery with DNS-SD, AWS Route53, AWS IAM and AWS EC2 UserData

Following is a set of steps and sample code to implement Service Discovery in AWS, using Route53, IAM and EC2.

Manual configuration

1. Create PTR and TXT Records for theService in Route53:

This is a simple example for one service with one incarnation (v1).

NOTE: There is no SRV since the service is currently not running anywhere! Active service providers will create/update/delete SRV entries.
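The same two records could also be created through the API instead of the console. The sketch below only builds the request body and makes no AWS call. It assumes the record names implied by the client code later in this article (a PTR at _theservice._tcp.unilans.net. pointing to _v1._theservice._tcp.unilans.net., which also carries the TXT record). The TXT value and the function name build_change_batch are placeholders, and the dictionary layout follows the Route53 ChangeResourceRecordSets request as accepted by boto3, which is a different library from the boto used in the rest of this article.

```python
import json


def build_change_batch(zone, service, protocol, version, txt_value, ttl=60):
    """Build a Route53 ChangeBatch that upserts the PTR and TXT records."""
    base = '_{0}._{1}.{2}'.format(service, protocol, zone)
    instance = '_{0}.{1}'.format(version, base)

    def upsert(name, rtype, value):
        return {
            'Action': 'UPSERT',
            'ResourceRecordSet': {
                'Name': name,
                'Type': rtype,
                'TTL': ttl,
                'ResourceRecords': [{'Value': value}],
            },
        }

    return {
        'Comment': 'DNS-SD bootstrap records',
        'Changes': [
            upsert(base, 'PTR', instance),
            # Route53 expects TXT values wrapped in double quotes
            upsert(instance, 'TXT', '"{0}"'.format(txt_value)),
        ],
    }


if __name__ == '__main__':
    # 'note=demo' is a placeholder TXT value, not taken from the article
    batch = build_change_batch('unilans.net.', 'theservice', 'tcp', 'v1', 'note=demo')
    print(json.dumps(batch, indent=2))
```

With boto3 installed and credentials configured, the result could be passed as the ChangeBatch argument of change_resource_record_sets(), together with the hosted zone ID.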

2. Create IAM role for EC2 instances to be able to modify DNS records in desired Zone:

Use the following policy:

{
   "Version": "2012-10-17",
   "Statement":[
      {
         "Effect":"Allow",
         "Action":[
            "route53:ListHostedZones"
         ],
         "Resource":"*"
      },
      {
         "Effect":"Allow",
         "Action":[
            "route53:GetHostedZone", 
            "route53:ListResourceRecordSets",
            "route53:ChangeResourceRecordSets"
         ],
         "Resource":"arn:aws:route53:::hostedzone/XXXXYYYYZZZZ"
      },
      {
         "Effect":"Allow",
         "Action":[
            "route53:GetChange"
         ],
         "Resource":"arn:aws:route53:::change/*"
      }
   ]
}

… where XXXXYYYYZZZZ is your hosted zone ID!

Automated JOIN/LEAVE in service group

The manual settings outlined in the previous section give the basic framework of the DNS-SD setup. There is no SRV record since there are no active instances providing the service. Ideally, each active service provider will register/de-register with the service when available. This is key here: DNS-SD can be integrated cleanly with the elastic nature of the cloud. Once this integration is in place, all clients will only need to resolve DNS records in order to obtain the list of active service providers. For demonstration purposes the following script was created:

#!/usr/bin/env python

# The following code modifies AWS Route53 entries to demonstrate usage of DNS-SD in cloud environments
#
# To JOIN Service group:
# 	dns-sd.py -z unilans.net -s _v1._theservice._tcp.unilans.net. -p 8080 join
#
# To LEAVE Service group:
#	dns-sd.py -z unilans.net -s _v1._theservice._tcp.unilans.net. -p 8080 leave
#
# NOTE: THIS IS FOR DEMONSTRATION PURPOSES ONLY! ERROR HANDLING IS ABSOLUTE MINIMAL! THIS IS *NOT* PRODUCTION CODE!

import sys
import copy
import argparse

import requests
import boto.route53

def main():
	"""
	Main entry point
	"""

	# Parse command line arguments
	parser = argparse.ArgumentParser(description='Example code to update service records in Route53 hosted DNS zones')
	parser.add_argument('-z', '--zone', type=str, required=True, dest='zone', help='Zone Name')
	parser.add_argument('-s', '--service', type=str, required=True, dest='service', help='Service Name')
	parser.add_argument('-p', '--port', type=int, required=True, dest='port', help='Service Port')
	parser.add_argument('operation', metavar='OPERATION', type=str, help='Operation [join|leave]', choices=['join', 'leave'])

	args = parser.parse_args()
	operation = args.operation
	zone = args.zone
	service = args.service
	port = args.port

	# Establish connection to Route53 API
	conn = boto.route53.connection.Route53Connection()

	# Get zone handler
	z = conn.get_zone(zone)
	if not z:
		print("{progname}: Wrong or inaccessible zone!".format(progname=sys.argv[0]))
		sys.exit(-1)

	# Get EC2 Public IP Address
	response = requests.get('http://169.254.169.254/latest/meta-data/public-ipv4')
	if response.status_code == 200:
		public_ipv4 = response.text
	else:
		print("{progname}: Unable to obtain public IP address from AWS!".format(progname=sys.argv[0]))
		sys.exit(-1)

	# Generate domain-specific hostname
	fqdn_hostname = '{hostname}.{zone}'.format(hostname=public_ipv4.replace(".", "-"), zone=zone)

	# Act, based on operation request (join | leave)
	if operation.upper() == 'join'.upper():
		# Create A record
		z.add_a(fqdn_hostname, public_ipv4, ttl=60)

		# Obtain service locator records
		r = z.find_records(service, 'SRV')
		if not r:
			# Create SRV record
			srv_value = u'0 0 {port} {fqdn}'.format(port=port, fqdn=fqdn_hostname)
			z.add_record('SRV', service, srv_value, ttl=60)
		else:
			# Add to SRV record
			srv_value = u'0 0 {port} {fqdn}'.format(port=port, fqdn=fqdn_hostname)
			tmp_r = copy.deepcopy(r)
			tmp_r.resource_records.append(srv_value)
			z.update_record(r, tmp_r.resource_records)

	elif operation.upper() == 'leave'.upper():
		# Remove entry from the SRV record
		r = z.find_records(service, 'SRV')
		if r:
			tmp_r = copy.deepcopy(r)
			# Rebuild the list instead of removing while iterating, which can skip entries
			tmp_r.resource_records = [record for record in tmp_r.resource_records if fqdn_hostname not in record]

			if len(tmp_r.resource_records) == 0:
				# Remove the SRV entry itself
				z.delete_record(r)
			else:
				# Update the SRV record
				z.update_record(r, tmp_r.resource_records)

		# Remove A record
		r = z.find_records(fqdn_hostname, 'A')
		if r:
			z.delete_record(r)

	else:
		print("{progname}: Wrong operation!".format(progname=sys.argv[0]))
		sys.exit(-1)

if __name__ == '__main__':
	main()

Copy of the code can be downloaded from https://s3-us-west-2.amazonaws.com/blog.xi-group.com/aws-route53-iam-ec2-dns-sd/dns-sd.py

This code, given DNS zone, service name and service port, will update necessary DNS records to join or leave the service group.

Starting with initial state:

Executing JOIN:

dns-sd.py -z unilans.net -s _v1._theservice._tcp.unilans.net. -p 8080 join

Result:

Executing LEAVE:

dns-sd.py -z unilans.net -s _v1._theservice._tcp.unilans.net. -p 8080 leave

Result:

Domain-specific hostname is created, service location record (SRV) is created with proper port and hostname. When host leaves the service group, domain-specific hostname is removed, so is the entry in the SRV record, or the whole record if this is the last entry.
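The record bookkeeping described above can be modelled without touching Route53. The following sketch is a simplified, hypothetical model of the JOIN/LEAVE logic operating on a plain list of SRV values in the "priority weight port target" form used by the script. The helper names and the sample IP address are assumptions. Unlike the script above, it makes JOIN idempotent and compares the target field exactly instead of using a substring match.

```python
def hostname_for(ip, zone):
    """Derive the domain-specific hostname from an IPv4 address."""
    return '{0}.{1}'.format(ip.replace('.', '-'), zone)


def srv_value(port, fqdn, priority=0, weight=0):
    """Format one SRV value: 'priority weight port target'."""
    return '{0} {1} {2} {3}'.format(priority, weight, port, fqdn)


def join(values, port, fqdn):
    """Return the SRV values with this host added (no duplicates)."""
    value = srv_value(port, fqdn)
    return values if value in values else values + [value]


def leave(values, fqdn):
    """Return the SRV values with this host removed.

    The target field is compared exactly, so 1-2-3-4 does not
    accidentally match 11-2-3-4.
    """
    wanted = fqdn.rstrip('.')
    return [v for v in values if v.split()[3].rstrip('.') != wanted]


if __name__ == '__main__':
    # 203.0.113.10 is a documentation address, used for illustration only
    host = hostname_for('203.0.113.10', 'unilans.net')
    values = join([], 8080, host)
    print(values)
    print(leave(values, host))
```

An empty list returned by leave() corresponds to the case where the whole SRV record is deleted.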

Fully automated setup

UserData will be used to fully automate the process. There are many other options (Puppet, Chef, Salt, Ansible) and all of those can be used, but the UserData solution has reduced complexity, has no external dependencies and can be directly utilized by other AWS Services like CloudFormation, AutoScalingGroups, etc.

The full UserData content is as follows:

#!/bin/bash -ex

# Debian apt-get install function
apt_get_install()
{
	DEBIAN_FRONTEND=noninteractive apt-get -y \
	-o DPkg::Options::=--force-confdef \
	-o DPkg::Options::=--force-confold \
	install "$@"
}

# Mark execution start
echo "STARTING" > /root/user_data_run

# Some initial setup
set -e -x
export DEBIAN_FRONTEND=noninteractive
apt-get update && apt-get upgrade -y

# Install required packages
apt_get_install python-boto python-requests
apt_get_install nginx

# Create test html page
mkdir -p /var/www
cat > /var/www/index.html << "EOF"
<html>
	<head>
		<title>Demo Page</title>
	</head>

	<body>
		<h1>Demo Page</h1>

		<p>Status: running</p>
	</body>
</html>
EOF

# Configure NginX
cat > /etc/nginx/conf.d/demo.conf << "EOF"
# Minimal NginX VirtualHost setup
server {
	listen 8080;

	root /var/www;
	index index.html index.htm;

	location / {
		try_files $uri $uri/ =404;
	}
}
EOF

# Restart NginX with the new settings
/etc/init.d/nginx restart

# Create dns-sd.py
cat > /usr/local/sbin/dns-sd.py << "EOF"
#!/usr/bin/env python

# The following code modifies AWS Route53 entries to demonstrate usage of DNS-SD in cloud environments
#
# To JOIN Service group:
# 	dns-sd.py -z unilans.net -s _v1._theservice._tcp.unilans.net. -p 8080 join
#
# To LEAVE Service group:
#	dns-sd.py -z unilans.net -s _v1._theservice._tcp.unilans.net. -p 8080 leave
#
# NOTE: THIS IS FOR DEMONSTRATION PURPOSES ONLY! ERROR HANDLING IS ABSOLUTE MINIMAL! THIS IS *NOT* PRODUCTION CODE!

import sys
import copy
import argparse

import requests
import boto.route53

def main():
	"""
	Main entry point
	"""

	# Parse command line arguments
	parser = argparse.ArgumentParser(description='Example code to update service records in Route53 hosted DNS zones')
	parser.add_argument('-z', '--zone', type=str, required=True, dest='zone', help='Zone Name')
	parser.add_argument('-s', '--service', type=str, required=True, dest='service', help='Service Name')
	parser.add_argument('-p', '--port', type=int, required=True, dest='port', help='Service Port')
	parser.add_argument('operation', metavar='OPERATION', type=str, help='Operation [join|leave]', choices=['join', 'leave'])

	args = parser.parse_args()
	operation = args.operation
	zone = args.zone
	service = args.service
	port = args.port

	# Establish connection to Route53 API
	conn = boto.route53.connection.Route53Connection()

	# Get zone handler
	z = conn.get_zone(zone)
	if not z:
		print("{progname}: Wrong or inaccessible zone!".format(progname=sys.argv[0]))
		sys.exit(-1)

	# Get EC2 Public IP Address
	response = requests.get('http://169.254.169.254/latest/meta-data/public-ipv4')
	if response.status_code == 200:
		public_ipv4 = response.text
	else:
		print("{progname}: Unable to obtain public IP address from AWS!".format(progname=sys.argv[0]))
		sys.exit(-1)

	# Generate domain-specific hostname
	fqdn_hostname = '{hostname}.{zone}'.format(hostname=public_ipv4.replace(".", "-"), zone=zone)

	# Act, based on operation request (join | leave)
	if operation.upper() == 'join'.upper():
		# Create A record
		z.add_a(fqdn_hostname, public_ipv4, ttl=60)

		# Obtain service locator records
		r = z.find_records(service, 'SRV')
		if not r:
			# Create SRV record
			srv_value = u'0 0 {port} {fqdn}'.format(port=port, fqdn=fqdn_hostname)
			z.add_record('SRV', service, srv_value, ttl=60)
		else:
			# Add to SRV record
			srv_value = u'0 0 {port} {fqdn}'.format(port=port, fqdn=fqdn_hostname)
			tmp_r = copy.deepcopy(r)
			tmp_r.resource_records.append(srv_value)
			z.update_record(r, tmp_r.resource_records)

	elif operation.upper() == 'leave'.upper():
		# Remove entry from the SRV record
		r = z.find_records(service, 'SRV')
		if r:
			tmp_r = copy.deepcopy(r)
			# Rebuild the list instead of removing while iterating, which can skip entries
			tmp_r.resource_records = [record for record in tmp_r.resource_records if fqdn_hostname not in record]

			if len(tmp_r.resource_records) == 0:
				# Remove the SRV entry itself
				z.delete_record(r)
			else:
				# Update the SRV record
				z.update_record(r, tmp_r.resource_records)

		# Remove A record
		r = z.find_records(fqdn_hostname, 'A')
		if r:
			z.delete_record(r)

	else:
		print("{progname}: Wrong operation!".format(progname=sys.argv[0]))
		sys.exit(-1)

if __name__ == '__main__':
	main()

EOF

# Make dns-sd.py executable
chmod +x /usr/local/sbin/dns-sd.py

# Create startup job
cat > /etc/init.d/dns-sd << "EOF"
#! /bin/bash
#
# Author: Ivo Vachkov (ivachkov@xi-group.com)
#
### BEGIN INIT INFO
# Provides: DNS-SD Service Group Registration / De-Registration
# Required-Start:
# Should-Start:
# Required-Stop:
# Should-Stop:
# Default-Start:  2 3 4 5
# Default-Stop:   0 1 6
# Short-Description:    Start / Stop script for DNS-SD
# Description:          Use to JOIN/LEAVE DNS-SD Service Group
### END INIT INFO

set -e
umask 022

# Configuration details
DNS_SD="/usr/local/sbin/dns-sd.py"
DNS_ZONE="unilans.net"
SERVICE_NAME="_v1._theservice._tcp.unilans.net."
SERVICE_PORT="8080"

. /lib/lsb/init-functions

export PATH="${PATH:+$PATH:}/usr/sbin:/sbin:/usr/bin:/usr/local/bin:/usr/local/sbin"

# Default Start function
dns_sd_join () {
	$DNS_SD -z $DNS_ZONE -s $SERVICE_NAME -p $SERVICE_PORT join
}

# Default Stop function
dns_sd_leave () {
	$DNS_SD -z $DNS_ZONE -s $SERVICE_NAME -p $SERVICE_PORT leave
}

case "$1" in
start)
	log_daemon_msg "Joining $DNS_ZONE|$SERVICE_NAME:$SERVICE_PORT ... " || true
	dns_sd_join
	;;
stop)
	log_daemon_msg "Leaving $DNS_ZONE|$SERVICE_NAME:$SERVICE_PORT ... " || true
	dns_sd_leave
	;;
restart)
	log_daemon_msg "Restarting ... " || true
	dns_sd_leave
	dns_sd_join
	;;
*)
	log_action_msg "Usage: $0 {start|stop|restart}" || true
	exit 1
esac

exit 0
EOF

# Make /etc/init.d/dns-sd executable
chmod +x /etc/init.d/dns-sd

# Set automatic execution on start/shutdown
update-rc.d dns-sd defaults 99

# Execute initial service group JOIN
/etc/init.d/dns-sd start

# Mark execution end
echo "DONE" > /root/user_data_run

Copy of the code can be downloaded from https://s3-us-west-2.amazonaws.com/blog.xi-group.com/aws-route53-iam-ec2-dns-sd/userdata.sh

Starting 3 test instances to verify functionality:

aws ec2 run-instances --image-id ami-018c9568 --count 3 --instance-type t1.micro --key-name test-key --security-groups test-sg --iam-instance-profile Name=DNS-SD-Route53-EC2-Role --user-data file://userdata.sh

Resulting changes to Route53:

Three new boxes self-registered in the Service group. Manually stopping one of them leads to de-registration:

Elastic systems are possible to implement with DNS-SD! Note, however, that a DNS message is limited to 65535 bytes, so the number of entries that can go into an SRV record, although big, is limited!

Client code

To demonstrate DNS-SD resolution, the following sample code was created:

#!/usr/bin/env python

# The following code demonstrates how to resolve DNS-SD Service Descriptions
#
# Example execution:
#	client.py -z unilans.net. -s theService -p tcp -v v1
#
# NOTE: THIS IS FOR DEMONSTRATION PURPOSES ONLY! ERROR HANDLING IS ABSOLUTE MINIMAL! THIS IS *NOT* PRODUCTION CODE!

import sys
import random
import argparse

import requests
import dns.resolver

def main():
	"""
	Main entry point
	"""

	# Parse command line arguments
	parser = argparse.ArgumentParser(description='Example code to resolve DNS-SD service descriptions')
	parser.add_argument('-z', '--zone', type=str, required=True, dest='zone', help='Zone Name')
	parser.add_argument('-s', '--service', type=str, required=True, dest='service', help='Service Name')
	parser.add_argument('-p', '--protocol', type=str, required=True, dest='protocol', help='Service Transport Protocol [tcp|udp]', choices=['tcp', 'udp'])
	parser.add_argument('-v', '--version', type=str, required=True, dest='version', help='Service Version')

	args = parser.parse_args()
	zone = args.zone
	service = args.service
	protocol = args.protocol
	version = args.version

	# Obtain PTR Record
	service_id = '_{service}._{protocol}.{zone}'.format(service=service, protocol=protocol, zone=zone)
	answer = dns.resolver.query(service_id, 'PTR')

	# Find the service incarnation
	service_version = None
	if answer:
		for record in answer.rrset:
			r = str(record.target).split('.')
			if version in r[0]:
				service_version = str(record.target)

	# Discover and consume the actual service
	service_addr = ''
	service_port = 0
	if service_version:
		# Get SRV and TXT
		answer_srv = dns.resolver.query(service_version, 'SRV')
		answer_txt = dns.resolver.query(service_version, 'TXT')

		# If those are valid get random service location entry
		if answer_srv and answer_txt:
			# list() works whether the rrset stores its items in a list or a dict
			srv_entry = random.choice(list(answer_srv.rrset))
			if srv_entry:
				service_addr = srv_entry.target
				service_port = srv_entry.port

	# Stop here if no service provider was discovered
	if not service_addr:
		print("{progname}: No service provider found!".format(progname=sys.argv[0]))
		sys.exit(-1)

	service_uri = 'http://{host}:{port}/'.format(host=service_addr, port=service_port)
	r = requests.get(service_uri)
	if r.status_code == 200:
		print(r.text)

if __name__ == '__main__':
	main()

Copy of the code can be downloaded from https://s3-us-west-2.amazonaws.com/blog.xi-group.com/aws-route53-iam-ec2-dns-sd/client.py

Why would that be better?! Yes, there is added complexity in the name resolution process. But, more importantly, the details needed to find the service do not depend on where the service runs and are not specific to any particular client. Service-specific infrastructure can change, but the client will not be affected, as long as the discovery process is performed.

Sample run:

:~> client.py -z unilans.net. -s theService -p tcp -v v1
<html>
        <head>
                <title>Demo Page</title>
        </head>

        <body>
                <h1>Demo Page</h1>

                <p>Status: running</p>
        </body>
</html>

:~>

Voilà! Reliable Service Discovery in elastic systems!

Additional Notes

Some additional notes and known limitations:

Examples in this article could be extended to support fail-over or more sophisticated forms of load-balancing. The current random.choice() solution should be good enough for the generic case;
A more complex setup with different priorities and weights can be demonstrated too;
A service health-check before DNS-SD registration can be demonstrated too;
Non-HTTP services can use DNS-SD as well, since the technology is application-agnostic;
TXT contents are not used throughout this article. Those can be used to carry additional meta-data (NOTE: This is public! Anyone can query your DNS TXT records with this setup!).
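As an example of the health-check idea from the notes above, a provider could verify that its own service port accepts connections before joining the service group. This is a minimal sketch under that assumption: the function name port_is_open is hypothetical, and a successful TCP connect says nothing about application-level health.

```python
import socket


def port_is_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Refused, unreachable or timed out
        return False


if __name__ == '__main__':
    # Port 8080 matches the NginX demo setup used in this article
    if port_is_open('127.0.0.1', 8080):
        print('service is up, safe to JOIN')
    else:
        print('service is down, skipping JOIN')
```

A check like this could run right before the JOIN operation, so that a host whose service failed to start never appears in the SRV record.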

Conclusion

A quick implementation of DNS-SD with AWS Route53, IAM and EC2 was presented in this article. It can be used as a bare-bone setup that can be further extended and productized. It solves a common problem in elastic systems: Service Discovery! All key components are implemented in either Python or Shell script with minimal dependencies (sudo aptitude install awscli python-boto python-requests python-dnspython), although the implementation is not dependent on a particular programming language.
