Xi Group Ltd. Company Blog | High-quality DevOps Services

How to deploy single-node Hadoop setup in AWS

2015/02/04 AWS, Big Data, Development, DevOps, Operations No comments AWS, AWS CLI, big data, DevOps, hadoop, map reduce, single-node hadoop

Common issue in the Software Development Lifecycle is the need to quickly bootstrap vanilla environment, deploy some code onto it, run it and then scrap it. This is a core concept in Continuous Integration / Continuous Delivery (CI/CD). It is a stepping stone towards immutable infrastructure. Properly automated implementation can also save time (no need to configure it manually) and money (no need to track potential regression issues in the development process).

Over the course of several years, found this to be extremely useful when used in BigData projects that use Hadoop. Installation of Hadoop is not always straight-forward. It depends on various internal and external components (JDK, Map-Reduce Framework, HDFS, etc). It can be messy. Different components communicate over various ports and protocols. HDFS uses somewhat clumsy semantics to deal with files and directories. For those and similar reasons we decided to present our take on Hadoop installation on a single node for development purposes.

The following shell script is simplified, fully functional skeleton implementation that will install Hadoop on a c3.xlarge, Fedora 20 node in AWS and run a test job on it:

#!/bin/bash

# Key file to be generated and its filesystem location
KEY_NAME="test-hadoop-key"
KEY_FILE="/tmp/$KEY_NAME"

# Security group name and description
SG_NAME="test-hadoop-sg"
SG_DESC="Test Hadoop Security Group"

# Temporary files; General Log and Instance User data
LOG_FILE="/tmp/test-hadoop-setup.log"
USR_DATA="/tmp/test-hadoop-userdata.sh"

# Instance details
AWS_PROFILE="$$profile$$"
AWS_REGION="us-east-1"
AMI_ID="ami-21362b48"
INST_TAG="test-hadoop-single"
INST_TYPE="c3.xlarge"
DISK_SIZE="20"

# Default return codes
RET_CODE_OK=0
RET_CODE_ERROR=1

# Check for various utilities that will be used 

# Check for supported operating system
P_UNAME=`whereis uname | cut -d' ' -f2`
if [ ! -x "$P_UNAME" ]; then
	echo "$0: No UNAME available in the system"
	exit $RET_CODE_ERROR;
fi
OS=`$P_UNAME`
if [ "$OS" != "Linux" ]; then
	echo "$0: Unsupported OS!";
	exit $RET_CODE_ERROR;
fi

# Check if awscli is available in the system
P_AWS=`whereis aws | cut -d' ' -f2`
if [ ! -x "$P_AWS" ]; then
	echo "$0: No 'aws' available in the system!";
	exit $RET_CODE_ERROR;
fi

# Check if awk is available in the system
P_AWK=`whereis awk | cut -d' ' -f2`
if [ ! -x "$P_AWK" ]; then
	echo "$0: No 'awk' available in the system!";
	exit $RET_CODE_ERROR;
fi

# Check if grep is available in the system
P_GREP=`whereis grep | cut -d' ' -f2`
if [ ! -x "$P_GREP" ]; then
	echo "$0: No 'grep' available in the system!";
	exit $RET_CODE_ERROR;
fi

# Check if sed is available in the system
P_SED=`whereis sed | cut -d' ' -f2`
if [ ! -x "$P_SED" ]; then
	echo "$0: No 'sed' available in the system!";
	exit $RET_CODE_ERROR;
fi

# Check if ssh is available in the system
P_SSH=`whereis ssh | cut -d' ' -f2`
if [ ! -x "$P_SSH" ]; then
	echo "$0: No 'ssh' available in the system!";
	exit $RET_CODE_ERROR;
fi

# Check if ssh-keygen is available in the system
P_SSH_KEYGEN=`whereis ssh-keygen | cut -d' ' -f2`
if [ ! -x "$P_SSH_KEYGEN" ]; then
	echo "$0: No 'ssh-keygen' available in the system!";
	exit $RET_CODE_ERROR;
fi

# Userdata code to bootstrap Hadoop 2.X on Fedora 20 instance
cat > $USR_DATA << "EOF"
#!/bin/bash

# Mark execution start
echo "START" > /root/userdata.state

# Install Hadoop
yum --assumeyes install hadoop-common hadoop-common-native hadoop-hdfs hadoop-mapreduce hadoop-mapreduce-examples hadoop-yarn

# Configure HDFS
hdfs-create-dirs

# Bootstrap Hadoop services
systemctl start hadoop-namenode && sleep 2
systemctl start hadoop-datanode && sleep 2
systemctl start hadoop-nodemanager && sleep 2
systemctl start hadoop-resourcemanager && sleep 2

# Make Hadoop services start after reboot
systemctl enable hadoop-namenode hadoop-datanode hadoop-nodemanager hadoop-resourcemanager

# Configure Hadoop user
runuser hdfs -s /bin/bash /bin/bash -c "hadoop fs -mkdir /user/fedora"
runuser hdfs -s /bin/bash /bin/bash -c "hadoop fs -chown fedora /user/fedora"

# Deploy additional software dependencies
# ... 

# Deploy main application 
# ... 

# Mark execution end
echo "DONE" > /root/userdata.state
EOF

# Create Security Group
echo -n "Creating '$SG_NAME' security group ... "
aws ec2 create-security-group --group-name $SG_NAME --description "$SG_DESC" --region $AWS_REGION --profile $AWS_PROFILE > $LOG_FILE
echo "Done."

# Add open SSH access
echo -n "Adding access rules to '$SG_NAME' security group ... "
aws ec2 authorize-security-group-ingress --group-name $SG_NAME --protocol tcp --port 22 --cidr 0.0.0.0/0 --region $AWS_REGION --profile $AWS_PROFILE >> $LOG_FILE

# Add open Hadoop ports access
aws ec2 authorize-security-group-ingress --group-name $SG_NAME --protocol tcp --port 8088 --cidr 0.0.0.0/0 --region $AWS_REGION --profile $AWS_PROFILE >> $LOG_FILE
aws ec2 authorize-security-group-ingress --group-name $SG_NAME --protocol tcp --port 50010 --cidr 0.0.0.0/0 --region $AWS_REGION --profile $AWS_PROFILE >> $LOG_FILE
aws ec2 authorize-security-group-ingress --group-name $SG_NAME --protocol tcp --port 50020 --cidr 0.0.0.0/0 --region $AWS_REGION --profile $AWS_PROFILE >> $LOG_FILE
aws ec2 authorize-security-group-ingress --group-name $SG_NAME --protocol tcp --port 50030 --cidr 0.0.0.0/0 --region $AWS_REGION --profile $AWS_PROFILE >> $LOG_FILE
aws ec2 authorize-security-group-ingress --group-name $SG_NAME --protocol tcp --port 50070 --cidr 0.0.0.0/0 --region $AWS_REGION --profile $AWS_PROFILE >> $LOG_FILE
aws ec2 authorize-security-group-ingress --group-name $SG_NAME --protocol tcp --port 50075 --cidr 0.0.0.0/0 --region $AWS_REGION --profile $AWS_PROFILE >> $LOG_FILE
aws ec2 authorize-security-group-ingress --group-name $SG_NAME --protocol tcp --port 50090 --cidr 0.0.0.0/0 --region $AWS_REGION --profile $AWS_PROFILE >> $LOG_FILE
echo "Done."

# Generate New Key Pair and Import it
echo -n "Generating key pair '$KEY_NAME' for general access ... "
rm -rf $KEY_FILE $KEY_FILE.pub
ssh-keygen -t rsa -f $KEY_FILE -N '' >> $LOG_FILE
aws ec2 import-key-pair --key-name $KEY_NAME --public-key-material "`cat $KEY_FILE.pub`" --region $AWS_REGION --profile $AWS_PROFILE >> $LOG_FILE
echo "Done."

# Build the Hadoop box
echo -n "Starting Hadoop instance ... "
RI_OUT=`aws ec2 run-instances --image-id $AMI_ID --count 1 --instance-type $INST_TYPE --key-name $KEY_NAME --security-groups $SG_NAME --user-data file:///tmp/test-hadoop-userdata.sh --block-device-mapping "[{\"DeviceName\":\"/dev/sda1\", \"Ebs\":{\"VolumeSize\":$DISK_SIZE, \"DeleteOnTermination\": true} } ]" --region $AWS_REGION --profile $AWS_PROFILE`
I_ID=`echo $RI_OUT | grep "InstanceId" | awk '{print $43}' | sed 's/,$//' | sed -e 's/^"//'  -e 's/"$//'`
echo $RI_OUT >> $LOG_FILE
echo "Done."

# Tag the Hadoop box
echo -n "Tagging Hadoop instance '$I_ID' ... "
aws ec2 create-tags --resources $I_ID --tags Key=Name,Value=$INST_TAG --region $AWS_REGION --profile $AWS_PROFILE >> $LOG_FILE
echo "Done."

# Obtain instance public IP address
echo -n "Obtaining instance '$I_ID' public hostname ... "

# Delays in AWS fabric, reiterate until public hostname is assigned ...
while true; do
	sleep 3

	HOST=`aws ec2 describe-instances --instance-ids $I_ID --region $AWS_REGION --profile $AWS_PROFILE | grep PublicDnsName | awk -F":" '{print $2}' | awk '{print $1}' | sed 's/,$//' | sed -e 's/^"//'  -e 's/"$//'`;
	if [[ $HOST == ec2* ]]; then
		break;
	fi
done
echo "Done."

# Poll until system is ready
echo -n "Waiting for instance '$I_ID' to configure itself (will take approx. 5 minutes) ... "
while true; do
	sleep 5;

	TEMP_OUT=`ssh -q -o "StrictHostKeyChecking=no" -i $KEY_FILE -t fedora@$HOST "sudo cat /root/userdata.state"`;

	# Clear some strange symbols 
	STATE=`echo $TEMP_OUT | cut -c1-4`;

	if [ "$STATE" = "DONE" ]; then
		break;
	fi
done
echo "Done."

# Test Hadoop setup
echo "========== Testing Single-node Hadoop =========="
ssh -q -o "StrictHostKeyChecking=no" -i $KEY_FILE fedora@$HOST "hadoop jar /usr/share/java/hadoop/hadoop-mapreduce-examples.jar pi 10 1000000"
echo "========== Done =========="

# Run main Application here
# echo "========== Testing Main Application Single-node Hadoop =========="
# ssh -q -o "StrictHostKeyChecking=no" -i $KEY_FILE fedora@$HOST "hadoop jar ..."
# echo "========== Done =========="

# Terminate instance
echo -n "Terminating Hadoop instance '$I_ID' ... "
aws ec2 terminate-instances --instance-ids $I_ID --region $AWS_REGION --profile $AWS_PROFILE >> $LOG_FILE

# Poll until instance is terminated
while true; do
	sleep 5;

	TERMINATED=`aws ec2 describe-instances --instance-ids $I_ID --region $AWS_REGION --profile $AWS_PROFILE | grep terminated`;
	if [ ! -z "$TERMINATED" ]; then
		break;
	fi
done
echo "Done."

# Remove SSH Keypair
echo -n "Removing key pair '$KEY_NAME' ... "
aws ec2 delete-key-pair --key-name $KEY_NAME --region $AWS_REGION --profile $AWS_PROFILE >> $LOG_FILE
echo "Done."

# Remove Security Group
echo -n "Removing '$SG_NAME' security group ... "
aws ec2 delete-security-group --group-name $SG_NAME --region $AWS_REGION --profile $AWS_PROFILE >> $LOG_FILE
echo "Done."

# Remove local resources
rm -rf $USR_DATA
rm -rf $KEY_FILE $KEY_FILE.pub
rm -rf $LOG_FILE

# Normal termination
exit $RET_CODE_OK

100

101

102

103

104

105

106

107

108

109

110

111

112

113

114

115

116

117

118

119

120

121

122

123

124

125

126

127

128

129

130

131

132

133

134

135

136

137

138

139

140

141

142

143

144

145

146

147

148

149

150

151

152

153

154

155

156

157

158

159

160

161

162

163

164

165

166

167

168

169

170

171

172

173

174

175

176

177

178

179

180

181

182

183

184

185

186

187

188

189

190

191

192

193

194

195

196

197

198

199

200

201

202

203

204

205

206

207

208

209

210

211

212

213

214

215

216

217

218

219

220

221

222

223

224

225

226

227

228

#!/bin/bash

# Key file to be generated and its filesystem location

KEY_NAME="test-hadoop-key"

KEY_FILE="/tmp/$KEY_NAME"

# Security group name and description

SG_NAME="test-hadoop-sg"

SG_DESC="Test Hadoop Security Group"

# Temporary files; General Log and Instance User data

LOG_FILE="/tmp/test-hadoop-setup.log"

USR_DATA="/tmp/test-hadoop-userdata.sh"

# Instance details

AWS_PROFILE="$$profile$$"

AWS_REGION="us-east-1"

AMI_ID="ami-21362b48"

INST_TAG="test-hadoop-single"

INST_TYPE="c3.xlarge"

DISK_SIZE="20"

# Default return codes

RET_CODE_OK=0

RET_CODE_ERROR=1

# Check for various utilities that will be used

# Check for supported operating system

P_UNAME=`whereis uname | cut -d' ' -f2`

if [ ! -x "$P_UNAME" ]; then

echo "$0: No UNAME available in the system"

exit $RET_CODE_ERROR;

OS=`$P_UNAME`

if [ "$OS" != "Linux" ]; then

echo "$0: Unsupported OS!";

exit $RET_CODE_ERROR;

# Check if awscli is available in the system

P_AWS=`whereis aws | cut -d' ' -f2`

if [ ! -x "$P_AWS" ]; then

echo "$0: No 'aws' available in the system!";

exit $RET_CODE_ERROR;

# Check if awk is available in the system

P_AWK=`whereis awk | cut -d' ' -f2`

if [ ! -x "$P_AWK" ]; then

echo "$0: No 'awk' available in the system!";

exit $RET_CODE_ERROR;

# Check if grep is available in the system

P_GREP=`whereis grep | cut -d' ' -f2`

if [ ! -x "$P_GREP" ]; then

echo "$0: No 'grep' available in the system!";

exit $RET_CODE_ERROR;

# Check if sed is available in the system

P_SED=`whereis sed | cut -d' ' -f2`

if [ ! -x "$P_SED" ]; then

echo "$0: No 'sed' available in the system!";

exit $RET_CODE_ERROR;

# Check if ssh is available in the system

P_SSH=`whereis ssh | cut -d' ' -f2`

if [ ! -x "$P_SSH" ]; then

echo "$0: No 'ssh' available in the system!";

exit $RET_CODE_ERROR;

# Check if ssh-keygen is available in the system

P_SSH_KEYGEN=`whereis ssh-keygen | cut -d' ' -f2`

if [ ! -x "$P_SSH_KEYGEN" ]; then

echo "$0: No 'ssh-keygen' available in the system!";

exit $RET_CODE_ERROR;

# Userdata code to bootstrap Hadoop 2.X on Fedora 20 instance

cat > $USR_DATA << "EOF"

#!/bin/bash

# Mark execution start

echo "START" > /root/userdata.state

# Install Hadoop

yum --assumeyes install hadoop-common hadoop-common-native hadoop-hdfs hadoop-mapreduce hadoop-mapreduce-examples hadoop-yarn

# Configure HDFS

hdfs-create-dirs

# Bootstrap Hadoop services

systemctl start hadoop-namenode && sleep 2

systemctl start hadoop-datanode && sleep 2

systemctl start hadoop-nodemanager && sleep 2

systemctl start hadoop-resourcemanager && sleep 2

# Make Hadoop services start after reboot

systemctl enable hadoop-namenode hadoop-datanode hadoop-nodemanager hadoop-resourcemanager

# Configure Hadoop user

runuser hdfs -s /bin/bash /bin/bash -c "hadoop fs -mkdir /user/fedora"

runuser hdfs -s /bin/bash /bin/bash -c "hadoop fs -chown fedora /user/fedora"

# Deploy additional software dependencies

# ...

# Deploy main application

# ...

# Mark execution end

echo "DONE" > /root/userdata.state

EOF

# Create Security Group

echo -n "Creating '$SG_NAME' security group ... "

aws ec2 create-security-group --group-name $SG_NAME --description "$SG_DESC" --region $AWS_REGION --profile $AWS_PROFILE > $LOG_FILE

echo "Done."

# Add open SSH access

echo -n "Adding access rules to '$SG_NAME' security group ... "

aws ec2 authorize-security-group-ingress --group-name $SG_NAME --protocol tcp --port 22 --cidr 0.0.0.0/0 --region $AWS_REGION --profile $AWS_PROFILE >> $LOG_FILE

# Add open Hadoop ports access

aws ec2 authorize-security-group-ingress --group-name $SG_NAME --protocol tcp --port 8088 --cidr 0.0.0.0/0 --region $AWS_REGION --profile $AWS_PROFILE >> $LOG_FILE

aws ec2 authorize-security-group-ingress --group-name $SG_NAME --protocol tcp --port 50010 --cidr 0.0.0.0/0 --region $AWS_REGION --profile $AWS_PROFILE >> $LOG_FILE

aws ec2 authorize-security-group-ingress --group-name $SG_NAME --protocol tcp --port 50020 --cidr 0.0.0.0/0 --region $AWS_REGION --profile $AWS_PROFILE >> $LOG_FILE

aws ec2 authorize-security-group-ingress --group-name $SG_NAME --protocol tcp --port 50030 --cidr 0.0.0.0/0 --region $AWS_REGION --profile $AWS_PROFILE >> $LOG_FILE

aws ec2 authorize-security-group-ingress --group-name $SG_NAME --protocol tcp --port 50070 --cidr 0.0.0.0/0 --region $AWS_REGION --profile $AWS_PROFILE >> $LOG_FILE

aws ec2 authorize-security-group-ingress --group-name $SG_NAME --protocol tcp --port 50075 --cidr 0.0.0.0/0 --region $AWS_REGION --profile $AWS_PROFILE >> $LOG_FILE

aws ec2 authorize-security-group-ingress --group-name $SG_NAME --protocol tcp --port 50090 --cidr 0.0.0.0/0 --region $AWS_REGION --profile $AWS_PROFILE >> $LOG_FILE

echo "Done."

# Generate New Key Pair and Import it

echo -n "Generating key pair '$KEY_NAME' for general access ... "

rm -rf $KEY_FILE $KEY_FILE.pub

ssh-keygen -t rsa -f $KEY_FILE -N '' >> $LOG_FILE

aws ec2 import-key-pair --key-name $KEY_NAME --public-key-material "`cat $KEY_FILE.pub`" --region $AWS_REGION --profile $AWS_PROFILE >> $LOG_FILE

echo "Done."

# Build the Hadoop box

echo -n "Starting Hadoop instance ... "

RI_OUT=`aws ec2 run-instances --image-id $AMI_ID --count 1 --instance-type $INST_TYPE --key-name $KEY_NAME --security-groups $SG_NAME --user-data file:///tmp/test-hadoop-userdata.sh --block-device-mapping "[{\"DeviceName\":\"/dev/sda1\", \"Ebs\":{\"VolumeSize\":$DISK_SIZE, \"DeleteOnTermination\": true} } ]" --region $AWS_REGION --profile $AWS_PROFILE`

I_ID=`echo $RI_OUT | grep "InstanceId" | awk '{print $43}' | sed 's/,$//' | sed -e 's/^"//' -e 's/"$//'`

echo $RI_OUT >> $LOG_FILE

echo "Done."

# Tag the Hadoop box

echo -n "Tagging Hadoop instance '$I_ID' ... "

aws ec2 create-tags --resources $I_ID --tags Key=Name,Value=$INST_TAG --region $AWS_REGION --profile $AWS_PROFILE >> $LOG_FILE

echo "Done."

# Obtain instance public IP address

echo -n "Obtaining instance '$I_ID' public hostname ... "

# Delays in AWS fabric, reiterate until public hostname is assigned ...

while true; do

sleep 3

HOST=`aws ec2 describe-instances --instance-ids $I_ID --region $AWS_REGION --profile $AWS_PROFILE | grep PublicDnsName | awk -F":" '{print $2}' | awk '{print $1}' | sed 's/,$//' | sed -e 's/^"//' -e 's/"$//'`;

if [[ $HOST == ec2* ]]; then

break;

done

echo "Done."

# Poll until system is ready

echo -n "Waiting for instance '$I_ID' to configure itself (will take approx. 5 minutes) ... "

while true; do

sleep 5;

TEMP_OUT=`ssh -q -o "StrictHostKeyChecking=no" -i $KEY_FILE -t fedora@$HOST "sudo cat /root/userdata.state"`;

# Clear some strange symbols

STATE=`echo $TEMP_OUT | cut -c1-4`;

if [ "$STATE" = "DONE" ]; then

break;

done

echo "Done."

# Test Hadoop setup

echo "========== Testing Single-node Hadoop =========="

ssh -q -o "StrictHostKeyChecking=no" -i $KEY_FILE fedora@$HOST "hadoop jar /usr/share/java/hadoop/hadoop-mapreduce-examples.jar pi 10 1000000"

echo "========== Done =========="

# Run main Application here

# echo "========== Testing Main Application Single-node Hadoop =========="

# ssh -q -o "StrictHostKeyChecking=no" -i $KEY_FILE fedora@$HOST "hadoop jar ..."

# echo "========== Done =========="

# Terminate instance

echo -n "Terminating Hadoop instance '$I_ID' ... "

aws ec2 terminate-instances --instance-ids $I_ID --region $AWS_REGION --profile $AWS_PROFILE >> $LOG_FILE

# Poll until instance is terminated

while true; do

sleep 5;

TERMINATED=`aws ec2 describe-instances --instance-ids $I_ID --region $AWS_REGION --profile $AWS_PROFILE | grep terminated`;

if [ ! -z "$TERMINATED" ]; then

break;

done

echo "Done."

# Remove SSH Keypair

echo -n "Removing key pair '$KEY_NAME' ... "

aws ec2 delete-key-pair --key-name $KEY_NAME --region $AWS_REGION --profile $AWS_PROFILE >> $LOG_FILE

echo "Done."

# Remove Security Group

echo -n "Removing '$SG_NAME' security group ... "

aws ec2 delete-security-group --group-name $SG_NAME --region $AWS_REGION --profile $AWS_PROFILE >> $LOG_FILE

echo "Done."

# Remove local resources

rm -rf $USR_DATA

rm -rf $KEY_FILE $KEY_FILE.pub

rm -rf $LOG_FILE

# Normal termination

exit $RET_CODE_OK

Additional notes:

Please, edit the AWS_PROFILE variable. AWS CLI commands depend on this!
Activity log is kept in /tmp/test-hadoop-setup.log and will be recreated with every new run of the script.
In case of normal execution, all allocated resources will be cleaned upon termination.
This script is ready to be used as Jenkins build-and-deploy job.
Since the single-node Hadoop/HDFS is terminated, output data that goes to HDFS should be transferred out of the instance before termination!

Example run should look like:

:~> ./aws-hadoop-single.sh
Creating 'test-hadoop-sg' security group ... Done.
Adding access rules to 'test-hadoop-sg' security group ... Done.
Generating key pair 'test-hadoop-key' for general access ... Done.
Starting Hadoop instance ... Done.
Tagging Hadoop instance 'i-b3b27f5c' ... Done.
Obtaining instance 'i-b3b27f5c' public hostname ... Done.
Waiting for instance 'i-b3b27f5c' to configure itself (will take approx. 5 minutes) ... Done.
========== Testing Single-node Hadoop ==========
Number of Maps  = 10
Samples per Map = 1000000
Wrote input for Map #0
Wrote input for Map #1
Wrote input for Map #2
Wrote input for Map #3
Wrote input for Map #4
Wrote input for Map #5
Wrote input for Map #6
Wrote input for Map #7
Wrote input for Map #8
Wrote input for Map #9
Starting Job
15/02/04 07:27:05 INFO client.RMProxy: Connecting to ResourceManager at /0.0.0.0:8032
15/02/04 07:27:05 INFO input.FileInputFormat: Total input paths to process : 10
15/02/04 07:27:05 INFO mapreduce.JobSubmitter: number of splits:10
15/02/04 07:27:05 INFO Configuration.deprecation: user.name is deprecated. Instead, use mapreduce.job.user.name
15/02/04 07:27:05 INFO Configuration.deprecation: mapred.jar is deprecated. Instead, use mapreduce.job.jar
15/02/04 07:27:05 INFO Configuration.deprecation: mapred.map.tasks.speculative.execution is deprecated. Instead, use mapreduce.map.speculative
15/02/04 07:27:05 INFO Configuration.deprecation: mapred.reduce.tasks is deprecated. Instead, use mapreduce.job.reduces
15/02/04 07:27:05 INFO Configuration.deprecation: mapred.output.value.class is deprecated. Instead, use mapreduce.job.output.value.class
15/02/04 07:27:05 INFO Configuration.deprecation: mapred.reduce.tasks.speculative.execution is deprecated. Instead, use mapreduce.reduce.speculative
15/02/04 07:27:05 INFO Configuration.deprecation: mapreduce.map.class is deprecated. Instead, use mapreduce.job.map.class
15/02/04 07:27:05 INFO Configuration.deprecation: mapred.job.name is deprecated. Instead, use mapreduce.job.name
15/02/04 07:27:05 INFO Configuration.deprecation: mapreduce.reduce.class is deprecated. Instead, use mapreduce.job.reduce.class
15/02/04 07:27:05 INFO Configuration.deprecation: mapreduce.inputformat.class is deprecated. Instead, use mapreduce.job.inputformat.class
15/02/04 07:27:05 INFO Configuration.deprecation: mapred.input.dir is deprecated. Instead, use mapreduce.input.fileinputformat.inputdir
15/02/04 07:27:05 INFO Configuration.deprecation: mapred.output.dir is deprecated. Instead, use mapreduce.output.fileoutputformat.outputdir
15/02/04 07:27:05 INFO Configuration.deprecation: mapreduce.outputformat.class is deprecated. Instead, use mapreduce.job.outputformat.class
15/02/04 07:27:05 INFO Configuration.deprecation: mapred.map.tasks is deprecated. Instead, use mapreduce.job.maps
15/02/04 07:27:05 INFO Configuration.deprecation: mapred.output.key.class is deprecated. Instead, use mapreduce.job.output.key.class
15/02/04 07:27:05 INFO Configuration.deprecation: mapred.working.dir is deprecated. Instead, use mapreduce.job.working.dir
15/02/04 07:27:05 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1423034805647_0001
15/02/04 07:27:05 INFO impl.YarnClientImpl: Submitted application application_1423034805647_0001 to ResourceManager at /0.0.0.0:8032
15/02/04 07:27:05 INFO mapreduce.Job: The url to track the job: http://ip-10-63-188-40:8088/proxy/application_1423034805647_0001/
15/02/04 07:27:05 INFO mapreduce.Job: Running job: job_1423034805647_0001
15/02/04 07:27:11 INFO mapreduce.Job: Job job_1423034805647_0001 running in uber mode : false
15/02/04 07:27:11 INFO mapreduce.Job:  map 0% reduce 0%
15/02/04 07:27:24 INFO mapreduce.Job:  map 60% reduce 0%
15/02/04 07:27:33 INFO mapreduce.Job:  map 100% reduce 0%
15/02/04 07:27:34 INFO mapreduce.Job:  map 100% reduce 100%
15/02/04 07:27:34 INFO mapreduce.Job: Job job_1423034805647_0001 completed successfully
Job Finished in 29.302 seconds
15/02/04 07:27:34 INFO mapreduce.Job: Counters: 43
        File System Counters
                FILE: Number of bytes read=226
                FILE: Number of bytes written=882378
                FILE: Number of read operations=0
                FILE: Number of large read operations=0
                FILE: Number of write operations=0
                HDFS: Number of bytes read=2660
                HDFS: Number of bytes written=215
                HDFS: Number of read operations=43
                HDFS: Number of large read operations=0
                HDFS: Number of write operations=3
        Job Counters
                Launched map tasks=10
                Launched reduce tasks=1
                Data-local map tasks=10
                Total time spent by all maps in occupied slots (ms)=93289
                Total time spent by all reduces in occupied slots (ms)=7055
        Map-Reduce Framework
                Map input records=10
                Map output records=20
                Map output bytes=180
                Map output materialized bytes=280
                Input split bytes=1480
                Combine input records=0
                Combine output records=0
                Reduce input groups=2
                Reduce shuffle bytes=280
                Reduce input records=20
                Reduce output records=0
                Spilled Records=40
                Shuffled Maps =10
                Failed Shuffles=0
                Merged Map outputs=10
                GC time elapsed (ms)=1561
                CPU time spent (ms)=7210
                Physical memory (bytes) snapshot=2750681088
                Virtual memory (bytes) snapshot=11076927488
                Total committed heap usage (bytes)=2197291008
        Shuffle Errors
                BAD_ID=0
                CONNECTION=0
                IO_ERROR=0
                WRONG_LENGTH=0
                WRONG_MAP=0
                WRONG_REDUCE=0
        File Input Format Counters
                Bytes Read=1180
        File Output Format Counters
                Bytes Written=97
Estimated value of Pi is 3.14158440000000000000
========== Done ==========
Terminating Hadoop instance 'i-b3b27f5c' ... Done.
Removing key pair 'test-hadoop-key' ... Done.
Removing 'test-hadoop-sg' security group ... Done.
:~>

100

101

102

103

104

105

106

107

108

:~> ./aws-hadoop-single.sh

Creating 'test-hadoop-sg' security group ... Done.

Adding access rules to 'test-hadoop-sg' security group ... Done.

Generating key pair 'test-hadoop-key' for general access ... Done.

Starting Hadoop instance ... Done.

Tagging Hadoop instance 'i-b3b27f5c' ... Done.

Obtaining instance 'i-b3b27f5c' public hostname ... Done.

Waiting for instance 'i-b3b27f5c' to configure itself (will take approx. 5 minutes) ... Done.

========== Testing Single-node Hadoop ==========

Number of Maps = 10

Samples per Map = 1000000

Wrote input for Map #0

Wrote input for Map #1

Wrote input for Map #2

Wrote input for Map #3

Wrote input for Map #4

Wrote input for Map #5

Wrote input for Map #6

Wrote input for Map #7

Wrote input for Map #8

Wrote input for Map #9

Starting Job

15/02/04 07:27:05 INFO client.RMProxy: Connecting to ResourceManager at /0.0.0.0:8032

15/02/04 07:27:05 INFO input.FileInputFormat: Total input paths to process : 10

15/02/04 07:27:05 INFO mapreduce.JobSubmitter: number of splits:10

15/02/04 07:27:05 INFO Configuration.deprecation: user.name is deprecated. Instead, use mapreduce.job.user.name

15/02/04 07:27:05 INFO Configuration.deprecation: mapred.jar is deprecated. Instead, use mapreduce.job.jar

15/02/04 07:27:05 INFO Configuration.deprecation: mapred.map.tasks.speculative.execution is deprecated. Instead, use mapreduce.map.speculative

15/02/04 07:27:05 INFO Configuration.deprecation: mapred.reduce.tasks is deprecated. Instead, use mapreduce.job.reduces

15/02/04 07:27:05 INFO Configuration.deprecation: mapred.output.value.class is deprecated. Instead, use mapreduce.job.output.value.class

15/02/04 07:27:05 INFO Configuration.deprecation: mapred.reduce.tasks.speculative.execution is deprecated. Instead, use mapreduce.reduce.speculative

15/02/04 07:27:05 INFO Configuration.deprecation: mapreduce.map.class is deprecated. Instead, use mapreduce.job.map.class

15/02/04 07:27:05 INFO Configuration.deprecation: mapred.job.name is deprecated. Instead, use mapreduce.job.name

15/02/04 07:27:05 INFO Configuration.deprecation: mapreduce.reduce.class is deprecated. Instead, use mapreduce.job.reduce.class

15/02/04 07:27:05 INFO Configuration.deprecation: mapreduce.inputformat.class is deprecated. Instead, use mapreduce.job.inputformat.class

15/02/04 07:27:05 INFO Configuration.deprecation: mapred.input.dir is deprecated. Instead, use mapreduce.input.fileinputformat.inputdir

15/02/04 07:27:05 INFO Configuration.deprecation: mapred.output.dir is deprecated. Instead, use mapreduce.output.fileoutputformat.outputdir

15/02/04 07:27:05 INFO Configuration.deprecation: mapreduce.outputformat.class is deprecated. Instead, use mapreduce.job.outputformat.class

15/02/04 07:27:05 INFO Configuration.deprecation: mapred.map.tasks is deprecated. Instead, use mapreduce.job.maps

15/02/04 07:27:05 INFO Configuration.deprecation: mapred.output.key.class is deprecated. Instead, use mapreduce.job.output.key.class

15/02/04 07:27:05 INFO Configuration.deprecation: mapred.working.dir is deprecated. Instead, use mapreduce.job.working.dir

15/02/04 07:27:05 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1423034805647_0001

15/02/04 07:27:05 INFO impl.YarnClientImpl: Submitted application application_1423034805647_0001 to ResourceManager at /0.0.0.0:8032

15/02/04 07:27:05 INFO mapreduce.Job: The url to track the job: http://ip-10-63-188-40:8088/proxy/application_1423034805647_0001/

15/02/04 07:27:05 INFO mapreduce.Job: Running job: job_1423034805647_0001

15/02/04 07:27:11 INFO mapreduce.Job: Job job_1423034805647_0001 running in uber mode : false

15/02/04 07:27:11 INFO mapreduce.Job: map 0% reduce 0%

15/02/04 07:27:24 INFO mapreduce.Job: map 60% reduce 0%

15/02/04 07:27:33 INFO mapreduce.Job: map 100% reduce 0%

15/02/04 07:27:34 INFO mapreduce.Job: map 100% reduce 100%

15/02/04 07:27:34 INFO mapreduce.Job: Job job_1423034805647_0001 completed successfully

Job Finished in 29.302 seconds

15/02/04 07:27:34 INFO mapreduce.Job: Counters: 43

File System Counters

FILE: Number of bytes read=226

FILE: Number of bytes written=882378

FILE: Number of read operations=0

FILE: Number of large read operations=0

FILE: Number of write operations=0

HDFS: Number of bytes read=2660

HDFS: Number of bytes written=215

HDFS: Number of read operations=43

HDFS: Number of large read operations=0

HDFS: Number of write operations=3

Job Counters

Launched map tasks=10

Launched reduce tasks=1

Data-local map tasks=10

Total time spent by all maps in occupied slots (ms)=93289

Total time spent by all reduces in occupied slots (ms)=7055

Map-Reduce Framework

Map input records=10

Map output records=20

Map output bytes=180

Map output materialized bytes=280

Input split bytes=1480

Combine input records=0

Combine output records=0

Reduce input groups=2

Reduce shuffle bytes=280

Reduce input records=20

Reduce output records=0

Spilled Records=40

Shuffled Maps =10

Failed Shuffles=0

Merged Map outputs=10

GC time elapsed (ms)=1561

CPU time spent (ms)=7210

Physical memory (bytes) snapshot=2750681088

Virtual memory (bytes) snapshot=11076927488

Total committed heap usage (bytes)=2197291008

Shuffle Errors

BAD_ID=0

CONNECTION=0

IO_ERROR=0

WRONG_LENGTH=0

WRONG_MAP=0

WRONG_REDUCE=0

File Input Format Counters

Bytes Read=1180

File Output Format Counters

Bytes Written=97

Estimated value of Pi is 3.14158440000000000000

========== Done ==========

Terminating Hadoop instance 'i-b3b27f5c' ... Done.

Removing key pair 'test-hadoop-key' ... Done.

Removing 'test-hadoop-sg' security group ... Done.

:~>

Hopefully, this short introduction will advance your efforts to automate development tasks in BigData projects!

If you want to discuss more complex scenarios including automated deployments over multi-node Hadoop clusters, AWS Elastic MapReduce, AWS DataPipeline or other components of the BigData ecosystem, do not hesitate to Contact Us!

References

UserData Template for Ubuntu 14.04 EC2 Instances in AWS

2015/01/27 AWS, Development, DevOps, Operations No comments AWS, AWS CLI, DevOps, template, UserData

In any elastic environment there is a recurring issue: How to quickly spin up new boxes? Over time multiple options emerge. Many environments will rely on a pre-baked machine instances. In Amazon AWS those are called Amazon Machine Instances (AMIs), in Joyent’s SDC – images, but no matter the name they present pre-build, (mostly) pre-configured digital artifact that the underlying cloud layer will bootstrap and execute. They are fast to bootstrap, but limited. Hard to manage different versions, hard to switch virtualization technologies (PV vs. HVM, AWS vs. Joyent, etc), hard to deal with software versioning. Managing elastic environment with pre-baked images is probably the fastest way to start, but probably the most expensive way in the long run.

Another option is to use some sort of configuration management system. Chef, Puppet, Salt, Ansible … a lot of choices. Those are flexible, but depending on the usage scenarios can be slow and may require additional “interventions” to work properly. There are two additional “gotchas” that are not commonly discussed. First, those tools will force some sort in-house configuration/pseudo-programming language and terminology. Second, security is a tricky concept to implement within such system. Managing elastic environments with configuration management systems is definitely possible, but comes with some dependencies and prerequisites you should account for in the design phase.

Third option, AWS UserData / Joyent script, is a reasonable compromise. This is effectively a script that executes one upon virtual machine creation. It allows you to configure the instance, attach/configure storages, install software, etc. There are obvious benefits to that approach:

Treat that script like any other coding artifact, use version control, code reviews, etc;
It is easily modifiable upon need or request;
It can be used with virtually any instance type;
It is a single source of truth for the instance configuration;
It integrates nicely with the whole Control Plane concept.

Here is a basic template for Ubuntu 14.04 used with reasonable success to cover wide variety of deployment needs:

#!/bin/bash -ex

# DESCRIPTION: The following UserData script is created to ... 
# 
# Maintainer: ivachkov [at] xi-group [dot] com
# 
# Requirements:
#	OS: Ubuntu 14.04 LTS
#	Repositories: 
#		...
#	Packages:
# 		htop, iotop, dstat, ...
#	PIP Packages:
#		boto, awscli, ...
# 
# Additional information if necessary
# 	... 
# 

# Debian apt-get install function to eliminate prompts
export DEBIAN_FRONTEND=noninteractive
apt_get_install()
{
	DEBIAN_FRONTEND=noninteractive apt-get -y \
		-o DPkg::Options::=--force-confnew \
		install $@
}

# Configure disk layout 
INSTANCE_STORE_0="/dev/xvdb"
IS0_PART_1="/dev/xvdb1"
IS0_PART_2="/dev/xvdb2"

# INSTANCE_STORE_1="/dev/xvdc"
# IS1_PART_1="/dev/xvdc1"
# IS1_PART_2="/dev/xvdc2"

# ... 

# Unmount /dev/xvdb if already mounted
MOUNTED=`df -h | awk '{print $1}' | grep $INSTANCE_STORE_0`
if [ ! -z "$MOUNTED" ]; then
	umount -f $INSTANCE_STORE_0
fi

# Partition the disk (8GB for SWAP / Rest for /mnt)
(echo n; echo p; echo 1; echo 2048; echo +8G; echo t; echo 82; echo n; echo p; echo 2; echo; echo; echo w) | fdisk $INSTANCE_STORE_0

# Make and enable swap
mkswap $IS0_PART_1
swapon $IS0_PART_1

# Make /mnt partition and mount it
mkfs.ext4 $IS0_PART_2
mount $IS0_PART_2 /mnt

# Update /etc/fstab if necessary 
# sed -i s/$INSTANCE_STORE_0/$IS0_PART_2/g /etc/fstab

# Add external repositories
# 
# Example 1: MongoDB
# apt-key adv --keyserver hkp://keyserver.ubuntu.com:80 --recv 7F0CEB10
# echo 'deb http://downloads-distro.mongodb.org/repo/ubuntu-upstart dist 10gen' | sudo tee /etc/apt/sources.list.d/mongodb.list
# 
# Example 2: Salt
# add-apt-repository ppa:saltstack/salt
# 
# Example 3: *Internal repository*
# curl --silent https://apt.mydomain.com/my.apt.gpg.key | apt-key add -
# curl --silent -o /etc/apt/sources.list.d/my.apt.list https://apt.mydomain.com/my.apt.list

# Update the packace indexes
apt-get update && DEBIAN_FRONTEND=noninteractive apt-get -y -o Dpkg::Options::="--force-confnew" dist-upgrade

# Install basic APT packages and requirements
apt_get_install htop sysstat dstat iotop
# apt_get_install ... 
apt_get_install python-pip
apt_get_install ntp
# apt_get_install ... 

# Install PIP requirements
pip install six==1.8.0
pip install boto
pip install awscli
# pip install ... 

# Configure NTP
service ntp stop		# Stop ntp daemon to free NTP socket
sleep 3				# Give the daemon some time to exit
ntpdate pool.ntp.org		# Sync time
service ntp start		# Re-enable the NTP daemon

# Configure other system-specific settings ... 

# Configure automatic security updates
cat > /etc/apt/apt.conf.d/20auto-upgrades << "EOF"
APT::Periodic::Update-Package-Lists "1";
APT::Periodic::Unattended-Upgrade "1";
EOF
/etc/init.d/unattended-upgrades restart

# Update system limits
cat > /etc/security/limits.d/my_limits.conf << "EOF"
*               soft    nofile          999999
*               hard    nofile          999999
root            soft    nofile          999999
root            hard    nofile          999999
EOF
ulimit -n 999999

# Update sysctl variables
cat > /etc/sysctl.d/my_sysctl.conf << "EOF"
net.core.somaxconn=65535
net.core.netdev_max_backlog=65535
# net.core.rmem_max=8388608
# net.core.wmem_max=8388608
# net.core.rmem_default=65536
# net.core.wmem_default=65536
# net.ipv4.tcp_rmem=8192 873800 8388608
# net.ipv4.tcp_wmem=4096 655360 8388608
# net.ipv4.tcp_mem=8388608 8388608 8388608
# net.ipv4.tcp_max_tw_buckets=6000000
# net.ipv4.tcp_max_syn_backlog=65536
# net.ipv4.tcp_max_orphans=262144
# net.ipv4.tcp_synack_retries = 2
# net.ipv4.tcp_syn_retries = 2
# net.ipv4.tcp_fin_timeout = 7
# net.ipv4.tcp_slow_start_after_idle = 0
# net.ipv4.ip_local_port_range = 2000 65000
# net.ipv4.tcp_window_scaling = 1
# net.ipv4.tcp_max_syn_backlog = 3240000
# net.ipv4.tcp_congestion_control = cubic
EOF
sysctl -p /etc/sysctl.d/my_sysctl.conf

# Create specific users and groups 
# addgroup ...
# useradd ... 
# usermod ...

# Create expected set of directories
DIRECTORIES="
	/var/log/...
	/run/...
	/srv/... 
	/opt/...
	"

for DIRECTORY in $DIRECTORIES; do
	mkdir -p $DIRECTORY
	chown USER:GROUP $DIRECTORY	
done

# Create custom_crontab
cat > /home/ubuntu/custom_crontab << "EOF"

EOF

# Enable custom cronjobs
su - ubuntu -c "/usr/bin/crontab /home/ubuntu/custom_crontab"

# Install main application / service 
# ...
# ... 

# Configure main application / service
# ... 
# ... 

# Make everythig survive reboot
cat > /etc/rc.local << "EOF"
#!/bin/sh

# Regenerate disk layout on ephemeral storage 
# ... 

# Start the application 
# ... 

EOF

# Start application
# service XXX restart 

# Tag the instance (NOTE: Depends on configure AWS CLI)
INSTANCE_ID=`curl -s http://169.254.169.254/latest/meta-data/instance-id`
# aws ec2 create-tags --resources $INSTANCE_ID --tags Key=Name,Value=... 

# Mark successful execution
exit 0

100

101

102

103

104

105

106

107

108

109

110

111

112

113

114

115

116

117

118

119

120

121

122

123

124

125

126

127

128

129

130

131

132

133

134

135

136

137

138

139

140

141

142

143

144

145

146

147

148

149

150

151

152

153

154

155

156

157

158

159

160

161

162

163

164

165

166

167

168

169

170

171

172

173

174

175

176

177

178

179

180

181

182

183

184

185

186

187

188

189

190

191

192

193

#!/bin/bash -ex

# DESCRIPTION: The following UserData script is created to ...

# Maintainer: ivachkov [at] xi-group [dot] com

# Requirements:

# OS: Ubuntu 14.04 LTS

# Repositories:

# ...

# Packages:

# htop, iotop, dstat, ...

# PIP Packages:

# boto, awscli, ...

# Additional information if necessary

# ...

# Debian apt-get install function to eliminate prompts

export DEBIAN_FRONTEND=noninteractive

apt_get_install()

{

DEBIAN_FRONTEND=noninteractive apt-get -y \

-o DPkg::Options::=--force-confnew \

install $@

}

# Configure disk layout

INSTANCE_STORE_0="/dev/xvdb"

IS0_PART_1="/dev/xvdb1"

IS0_PART_2="/dev/xvdb2"

# INSTANCE_STORE_1="/dev/xvdc"

# IS1_PART_1="/dev/xvdc1"

# IS1_PART_2="/dev/xvdc2"

# ...

# Unmount /dev/xvdb if already mounted

MOUNTED=`df -h | awk '{print $1}' | grep $INSTANCE_STORE_0`

if [ ! -z "$MOUNTED" ]; then

umount -f $INSTANCE_STORE_0

# Partition the disk (8GB for SWAP / Rest for /mnt)

(echo n; echo p; echo 1; echo 2048; echo +8G; echo t; echo 82; echo n; echo p; echo 2; echo; echo; echo w) | fdisk $INSTANCE_STORE_0

# Make and enable swap

mkswap $IS0_PART_1

swapon $IS0_PART_1

# Make /mnt partition and mount it

mkfs.ext4 $IS0_PART_2

mount $IS0_PART_2 /mnt

# Update /etc/fstab if necessary

# sed -i s/$INSTANCE_STORE_0/$IS0_PART_2/g /etc/fstab

# Add external repositories

# Example 1: MongoDB

# apt-key adv --keyserver hkp://keyserver.ubuntu.com:80 --recv 7F0CEB10

# echo 'deb http://downloads-distro.mongodb.org/repo/ubuntu-upstart dist 10gen' | sudo tee /etc/apt/sources.list.d/mongodb.list

# Example 2: Salt

# add-apt-repository ppa:saltstack/salt

# Example 3: *Internal repository*

# curl --silent https://apt.mydomain.com/my.apt.gpg.key | apt-key add -

# curl --silent -o /etc/apt/sources.list.d/my.apt.list https://apt.mydomain.com/my.apt.list

# Update the packace indexes

apt-get update && DEBIAN_FRONTEND=noninteractive apt-get -y -o Dpkg::Options::="--force-confnew" dist-upgrade

# Install basic APT packages and requirements

apt_get_install htop sysstat dstat iotop

# apt_get_install ...

apt_get_install python-pip

apt_get_install ntp

# apt_get_install ...

# Install PIP requirements

pip install six==1.8.0

pip install boto

pip install awscli

# pip install ...

# Configure NTP

service ntp stop # Stop ntp daemon to free NTP socket

sleep 3 # Give the daemon some time to exit

ntpdate pool.ntp.org # Sync time

service ntp start # Re-enable the NTP daemon

# Configure other system-specific settings ...

# Configure automatic security updates

cat > /etc/apt/apt.conf.d/20auto-upgrades << "EOF"

APT::Periodic::Update-Package-Lists "1";

APT::Periodic::Unattended-Upgrade "1";

EOF

/etc/init.d/unattended-upgrades restart

# Update system limits

cat > /etc/security/limits.d/my_limits.conf << "EOF"

* soft nofile 999999

* hard nofile 999999

root soft nofile 999999

root hard nofile 999999

EOF

ulimit -n 999999

# Update sysctl variables

cat > /etc/sysctl.d/my_sysctl.conf << "EOF"

net.core.somaxconn=65535

net.core.netdev_max_backlog=65535

# net.core.rmem_max=8388608

# net.core.wmem_max=8388608

# net.core.rmem_default=65536

# net.core.wmem_default=65536

# net.ipv4.tcp_rmem=8192 873800 8388608

# net.ipv4.tcp_wmem=4096 655360 8388608

# net.ipv4.tcp_mem=8388608 8388608 8388608

# net.ipv4.tcp_max_tw_buckets=6000000

# net.ipv4.tcp_max_syn_backlog=65536

# net.ipv4.tcp_max_orphans=262144

# net.ipv4.tcp_synack_retries = 2

# net.ipv4.tcp_syn_retries = 2

# net.ipv4.tcp_fin_timeout = 7

# net.ipv4.tcp_slow_start_after_idle = 0

# net.ipv4.ip_local_port_range = 2000 65000

# net.ipv4.tcp_window_scaling = 1

# net.ipv4.tcp_max_syn_backlog = 3240000

# net.ipv4.tcp_congestion_control = cubic

EOF

sysctl -p /etc/sysctl.d/my_sysctl.conf

# Create specific users and groups

# addgroup ...

# useradd ...

# usermod ...

# Create expected set of directories

DIRECTORIES="

/var/log/...

/run/...

/srv/...

/opt/...

for DIRECTORY in $DIRECTORIES; do

mkdir -p $DIRECTORY

chown USER:GROUP $DIRECTORY

done

# Create custom_crontab

cat > /home/ubuntu/custom_crontab << "EOF"

EOF

# Enable custom cronjobs

su - ubuntu -c "/usr/bin/crontab /home/ubuntu/custom_crontab"

# Install main application / service

# ...

# Configure main application / service

# ...

# Make everythig survive reboot

cat > /etc/rc.local << "EOF"

#!/bin/sh

# Regenerate disk layout on ephemeral storage

# ...

# Start the application

# ...

EOF

# Start application

# service XXX restart

# Tag the instance (NOTE: Depends on configure AWS CLI)

INSTANCE_ID=`curl -s http://169.254.169.254/latest/meta-data/instance-id`

# aws ec2 create-tags --resources $INSTANCE_ID --tags Key=Name,Value=...

# Mark successful execution

exit 0

Trivial. Yet, incorporates a lot in just ~200 lines of code:

Disk layout management;
Package repositories configuration;
Basic tool set and third party software installation;
Service reconfiguration (NTP, Automatic security updates);
System reconfiguration (limits, sysctl, users, directories, crontab);
Post-reboot startup configuration;
Identity discovery and self-tagging;

As added bonus, the cloud-init package will properly log all output during the script execution in /var/log/cloud-init-output.log for failure investigations. Current script uses -ex bash parameters, which means it will explicitly echo all executed commands (-x) and exit at first sign of unsuccessful command execution (-e).

NOTE: There is one important component, purposefully omitted from the template UserData, the log file management. We plan on discussing that in a separate article.

References

Small Tip: How to use AWS CLI ‘–filter’ parameter

2015/01/20 AWS, DevOps, Operations, Small Tip 5 comments AWS, AWS CLI, DevOps, extract fields, filter parameter, output filter, parse output

This post will present another, useful feature of the AWS CLI tool set, the –filter parameter. This command line parameter is available and extremely helpful in EC2 namespace (aws ec2 describe-*).There are various ways to use –filter parameter.

1. –filter parameter can get filtering properties directly from the command line:

aws ec2 describe-instances --filter Name="instance-id",Values="i-1234abcd"

1	aws ec2 describe-instances --filter Name="instance-id",Values="i-1234abcd"

2. –filter parameter will also use JSON-encoded filter file:

aws ec2 describe-instances --filters file://filters.json

1	aws ec2 describe-instances --filters file://filters.json

The filters.json file uses the following structure:

[
  {
    "Name": "instance-type",
    "Values": ["m1.small", "m1.medium"]
  },
  {
    "Name": "availability-zone",
    "Values": ["us-west-2c"]
  }
]

[

{

"Name": "instance-type",

"Values": ["m1.small", "m1.medium"]

{

"Name": "availability-zone",

"Values": ["us-west-2c"]

}

]

There are various AWS CLI components that provide –filter parameters. For additional information check the References section.

To demonstrate the way this functionality can be used in various scenarios, there are several examples:

1. Filter by availability zone:

aws ec2 describe-instances --filter Name="availability-zone",Values="us-east-1b"

1	aws ec2 describe-instances --filter Name="availability-zone",Values="us-east-1b"

2. Filter by security group (EC2-Classic):

aws ec2 describe-instances --filter Name="group-name",Values="default"

1	aws ec2 describe-instances --filter Name="group-name",Values="default"

3. Filter by security group (EC2-VPC):

aws ec2 describe-instances --filter Name="instance.group-name",Values="default"

1	aws ec2 describe-instances --filter Name="instance.group-name",Values="default"

4. Filter only spot instances

aws ec2 describe-instances --filter Name="instance-lifecycle",Values="spot"

1	aws ec2 describe-instances --filter Name="instance-lifecycle",Values="spot"

5. Filter only running EC2 instances:

aws ec2 describe-instances --filter Name="instance-state-name",Values="running"

1	aws ec2 describe-instances --filter Name="instance-state-name",Values="running"

6. Filter only stopped EC2 instances:

aws ec2 describe-instances --filter Name="instance-state-name",Values="stopped"

1	aws ec2 describe-instances --filter Name="instance-state-name",Values="stopped"

7. Filter by SSH Key name

aws ec2 describe-instances --filter Name="key-name",Values="ssh-key"

1	aws ec2 describe-instances --filter Name="key-name",Values="ssh-key"

8. Filter by Tag:

aws ec2 describe-instances --filter "Name=tag-key,Values=Name" "Name=tag-value,Values=string"

1	aws ec2 describe-instances --filter "Name=tag-key,Values=Name" "Name=tag-value,Values=string"

9. Filter by Tag with a wildcard (‘*’):

aws ec2 describe-instances --filter "Name=tag-key,Values=MyTag" "Name=tag-value,Values=abcd*efgh"

1	aws ec2 describe-instances --filter "Name=tag-key,Values=MyTag" "Name=tag-value,Values=abcd*efgh"

10. Filter by multiple criteria (all running instances with string ’email’ in the value of the Name tag):

aws ec2 describe-instances --filter "Name=instance-state-name,Values=running" "Name=tag-key,Values=Name" "Name=tag-value,Values=*email*"

1	aws ec2 describe-instances --filter "Name=instance-state-name,Values=running" "Name=tag-key,Values=Name" "Name=tag-value,Values=email"

11. Filter by multiple criteria (all running instances with empty Name tag);

aws ec2 describe-instances --filter "Name=instance-state-name,Values=running" "Name=tag-key,Values=Name" "Name=tag-value,Values=''"

1	aws ec2 describe-instances --filter "Name=instance-state-name,Values=running" "Name=tag-key,Values=Name" "Name=tag-value,Values=''"

Those examples are very close to production ones used in several large AWS deployments. They are used to:

Monitor changes in instance populations;
Monitor successful configuration of resources;
Track deployment / rollout of new software version;
Track stopped instances to prevent unnecessary resource usage;
Ensure desired service distributions over availability zones and regions;
Ensure service distribution over instances with different lifecycle;

Be sure to utilize this functionality in your monitoring infrastructure. It has been powerful source of operational insights and great source of raw data for our intelligent control planes!

If you want to talk more on this subject or just share your experience, do not hesitate to Contact Us!

References

AWS, DevOps, Outsourcing …

2014/12/08 AWS, DevOps, Operations, Xi Group Ltd. No comments AWS, cloudops, DevOps, devops consultancy, outsourced devops, outsourcing

Is it possible to outsource DevOps?!

We asked ourselves that exact question before Xi Group Ltd. ventured into offering DevOps services. And the answer did not come easy. DevOps is cultural phenomenon. DevOps relies on close communication and by extension is location-dependent. DevOps is also technological phenomenon. Software has to be created to implement it. So it seemed that outsourcing is not really viable for DevOps …

Several projects and many months later, we know that this is not true. DevOps can be outsourced. External company can be integral part of DevOps strategy and day-to-day operational activities. Such cooperation can be beneficial from cultural and technological perspectives.

From a cultural perspective, choosing specialized DevOps partner can help you in several ways. They will help you break the “enterprise silo” model and mindset. This one is especially hard, even with some form of internal governance and support. Developers under pressure will keep writing code and let “ops” deal with it later. Any operational shortcomings will be attributed to the “ops guys” because once you ship it “it’s their problem”. Knowledgeable third party, not invested in any of the teams, can facilitate a communication flow between those teams speaking their language, explaining core architectural and operational principles, demanding proper implementation and keeping eye on the final goal: easy deployments of functional components. Communication is key! And successful outsourcing partner will know this. Will be vocal and active in all phases of the Software Development Lifecycle. It is the experience that comes from complex deployments of large scale distributed systems, that you should look for in your DevOps partners.

From a technology perspective, choosing specialized DevOps partner can also be beneficial endeavor. The chance is they already have several projects behind their back. They have the basic tooling already developed and can shorten your implementation lifecycle with components and know-how you should otherwise develop on your own. Proper DevOps partner will supply you with proper technology choices for the components you develop/operate. And NO, Docker is not always the answer! There are other ways to achieve immutable infrastructure. Build process automation, deployment automation, monitoring and log processing are already part of our daily arsenal of tools. We developed those, validated their usability in production environments … and now you can benefit from them too! Those the basics you should expect from a DevOps partner. You should expect active input on technology, operational requirements, non-functional requirements, predictability requirements, monitoring and scalability. Anything less … is not DevOps.

So … Is it possible to outsource DevOps?! Yes, we believe it is!

What should you look for in a partner? Experience with complex deployments and proper tooling to shorten your implementation cycle … as a start.

Do you want to know more?! … Contact us!

Small Tip: How to use –block-device-mappings to manage instance volumes with AWS CLI

2014/11/26 AWS, Development, DevOps, Operations, Small Tip AWS, AWS CLI, block device mappings, DevOps, instance store, volumes

This post will present one of the less popular features in the AWS CLI tool set, how to deal with EC2 instance volumes through the use of –block-device-mappings parameter. Previous post, Small Tip: Use AWS CLI to create instances with bigger root partitions already presents one of the common use cases, modifying the instance root partition size. However, use of ‘–block-device-mappings’ can go far beyond this simple feature.

Default documentation (http://docs.aws.amazon.com/cli/latest/reference/ec2/run-instances.html) although a good start is somewhat limited. Several tips and tricks will be presented here.

The location of the JSON block device mapping specification can be quite flexible. The mappings can be supplied:

1. Using command line directly:

--block-device-mappings '[ {"DeviceName":"/dev/sdb","VirtualName":"ephemeral0"}, {"DeviceName":"/dev/sdc","VirtualName":"ephemeral1"}]'

1	--block-device-mappings '[ {"DeviceName":"/dev/sdb","VirtualName":"ephemeral0"}, {"DeviceName":"/dev/sdc","VirtualName":"ephemeral1"}]'

2. Using file as a source:

--block-device-mappings file:////home/ec2-user/mapping.json

1	--block-device-mappings file:////home/ec2-user/mapping.json

3. Using URL as a source:

--block-device-mappings http://mybucket.s3.amazonaws.com/mapping.json

1	--block-device-mappings http://mybucket.s3.amazonaws.com/mapping.json

Source: http://understeer.hatenablog.com/entry/2013/10/18/223618

Other common scenarios:

1. To reorder default ephemeral volumes to ensure stability of the environment:

[
  {
    "DeviceName": "/dev/sde",
    "VirtualName": "ephemeral0"
  },
  {
    "DeviceName": "/dev/sdf",
    "VirtualName": "ephemeral1"
  }
]

[

{

"DeviceName": "/dev/sde",

"VirtualName": "ephemeral0"

{

"DeviceName": "/dev/sdf",

"VirtualName": "ephemeral1"

}

]

NOTE: Useful for additional UserData processing or deployments with hardcoded settings.

2. To allocate additional EBS Volume with specific size (100GB), to be associated with the EC2 instance:

[
  {
    "DeviceName": "/dev/sdg",
    "Ebs": {
      "VolumeSize": 100
    }
  }
]

[

{

"DeviceName": "/dev/sdg",

"Ebs": {

"VolumeSize": 100

}

]

NOTE: Useful for cases where cheaper instance types are outfitted with big volumes (Disk intensive tasks run on low-CPU/MEM instance types).

3. To allocate new volume from Snapshot ID:

[
  {
    "DeviceName": "/dev/sdh",
    "Ebs": {
      "SnapshotId": "snap-xxxxxxxx"
    }
  }
]

[

{

"DeviceName": "/dev/sdh",

"Ebs": {

"SnapshotId": "snap-xxxxxxxx"

}

]

NOTE: Useful to pre-loading newly created instances with specific disk data and still retaining the ability to modify the local copy.

4. To omit mapping of a particular Device Name:

[
  {
    "DeviceName": "/dev/sdj",
    "NoDevice": ""
  }
]

[

{

"DeviceName": "/dev/sdj",

"NoDevice": ""

}

]

NOTE: Useful to overwrite default AWS behavior.

5. To allocate new EBS Volume with explicit termination behavior (Keep after instance termination):

[
  {
    "DeviceName": "/dev/sdc",
    "Ebs": {
      "VolumeSize": 10,
      "DeleteOnTermination": false
    }
  }
]

[

{

"DeviceName": "/dev/sdc",

"Ebs": {

"VolumeSize": 10,

"DeleteOnTermination": false

}

]

NOTE: Useful to keep instance data after termination, additional cost may be significant if those volumes are not released after examination.

6. To allocate new, encrypted, EBS Volume with Reserved IOPS:

[
  {
    "DeviceName": "/dev/sdc",
    "Ebs": {
      "VolumeSize": 10,
      "VolumeType": "io1",
      "Iops": 1000,
      "Encrypted": true
    }
  }
]

[

{

"DeviceName": "/dev/sdc",

"Ebs": {

"VolumeSize": 10,

"VolumeType": "io1",

"Iops": 1000,

"Encrypted": true

}

]

NOTE: Useful to set minimum required performance levels (I/O Operations Per Second) for the specified volume.

Outlined functionality should cover wide range of potentially use cases for DevOps engineers who want to use automation to customize their infrastructure. Flexible instance volume management is a key ingredient for successful implementation of the ‘Infrastructure-as-Code’ paradigm!

References

A few myths about DevOps

2014/10/24 DevOps, Operations, Xi Group Ltd. 1 comment Aggressive DevOps, DevOps, devops-as-a-service, offshored devops, outsourced devops

It has been more than 2 years, since Xi Group Ltd. ventured to provide DevOps services. We worked on multiple projects, for established companies and startups. In that period, we experienced multiple misconceptions about what DevOps actually is or what should it be. Here are the ones we find most common.

Myth: Developers can do Ops

People, lacking operational background, no matter the position or title, will fail to become DevOps Engineers. In the same way in which one can not build a house without proper foundation, becoming DevOps Engineer without operational expertise is futile. In fact, we experience the opposite to be true: Developers are generally bad at Ops. And there are complex reasons for this. Education, work experience and the separation of roles in most enterprises are mostly geared towards deep specialization. In contrast, the DevOps Engineer has to be able to fill-in for a lot of roles: sometimes system architect, sometimes build engineer, monitoring specialist or even quality assurance specialist. Hence to be a good DevOps candidate you need exposure to a lot of technologies and activities. Operation background is a must. Development background helps.

Myth: Sysadmins are obsolete

Saying that system administrator or system engineers are obsolete is like saying that Unix is obsolete. Of course it is … in its original form. But it has been evolving for more than 40 years now. And it is the same with the System Administrator role. DevOps is not revolutionary step. It is evolutional development of the role based on the changes in the environment. Adaptation is required. Culture is changing, the Cloud is mainstream technology today. In fact, many of those that were System Administrators before the DevOps movement emerged were doing ‘DevOps’ in one way or another. And those are the people you want to be on your DevOps team today. Without being disrespectful, we’d prefer forty-something-year-old Unix sysadmin that knows SH and “some TCL, Perl, Expect” to any monkey-patching, “Ruby/node.js’s the deal” developer!

Myth: DevOps can not be outsourced / off-shored

This a matter of organizational structure and culture. In the same way Operations or Development can be outsourced or off-shored, DevOps can be. Communication is key. Presence is important. But technology allows us to have it all without being physically in the same room. In our experience it boils down to company / team culture. If you have the right ingredients in the mix it will not matter where the guy that manages your 1000+ instances Hadoop cluster physically is. He may be in your conference room and if you ignore his input stuff will break. He may be hundred of miles away and if team cohesion is strong product will be deployed.

Myth: You need DevOps only if you use Cloud

Another common misconception. Although DevOps movement gained momentum with the increasing popularity of the Cloud those are not in a causal relation. You can apply most of the DevOps culture and principles without using cloud technologies. Any significantly complex system can benefit from them. The Cloud created specific challenges (elasticity!) that DevOps practices can meet, but the same or similar challenges are found in most distributed systems anyway. DevOps is first cultural, then technological.

Myth: DevOps is supplementary activity

This is common incarnation of the “we can add it later” mentality. In the industry, one will meet a lot of those. We get the “We can add it later” reply for requests about Security, Performance, Maintainability, etc. However, much like Security, if you’re committed to actually do DevOps – do it from the start. Include DevOps-aware participants in all phases of the Software Development Lifecycle (SDLC). Treating DevOps as supplementary activity will bring only sub-optimal results, which in turn translates into increased operational or development cost.

If you are committed to implement DevOps, don’t believe any of the myths mentioned above. Define your goals and create the environment to implement them successfully! After all, DevOps is about better/faster/smarter execution …

How to implement multi-cloud deployment for scalability and reliability

2014/07/18 AWS, Development, DevOps, Operations, theCloud AWS, AWS CLI, cloud, cloudflare, DevOps, distributed systems, dns, elastic computing, joyent, multi-cloud

Introduction

This post will present interesting approach to scalability and reliability:

How to implement multi-cloud application deployment ?!

There are many reasons why this is interesting topic. Avoiding provider lockdown, reducing cloud provider outage impact, increasing world-wide coverage, disaster recovery / preparedness are only some of them. The obvious benefits of multi-cloud deployment are increased reliability and outage impact minimization. However, there are drawbacks too: supporting different sets of code to accommodate similar, but different services, increased cost, increased infrastructure complexity, different tools … Yet, despite the drawbacks, the possible benefits far outweigh the negatives!

In the following article a simple service will be deployed in automated fashion over two different Cloud Service Providers: Amazon AWS and Joyent. Third provider, CloudFlare, will be used to service DNS requests. The choice of providers is not random. They are chosen because of particular similarities and because the ease of use. All of those providers have consistent, comprehensive APIs that allow automation through programming in parallel to the command line tools.

Preliminary information

The service setup, described here, although synthetic, is representative of multiple usage scenarios. More complex scenarios are also possible. Special care should be taken to address use of common resources or non-replicable resources/states. Understand the dependencies of your application architecture before using multi-cloud setup. Or contact Xi Group Ltd. to aid you in this process!

The following Cloud Service Providers will be used to deploy executable code on:

DNS requests will be served by CloudFlare. The test domain is: scalability.expert

Required tools are:

Additional information can be found in AWS CLI, Joyent CloudAPI Documentation and CloudFlare ClientAPI.

Implementation Details

A service, website for www.scalability.expert, has to be deployed over multiple clouds. For simplicity, it is assumed that this is a static web site, served by NginX. It will run on Ubuntu 14.04 LTS. Instance types chosen in both AWS and Joyent are pretty limited, but should provide enough computing power to run NginX and serve static content. CloudFlare must be configured with basic settings for the DNS zone it will serve (in this case, the free CloudFlare account is used).

Each computing instance, when bootstrapped or restarted, will start the NginX and register itself in CloudFlare. At that point it should be able to receive client traffic. Upon termination or shutdown, each instance should remove its own entries from CloudFlare thus preventing DNS zone pollution with dead entries. In a previous article, How to implement Service Discovery in the Cloud, it was discussed how DNS-SD can be implemented for similar setup with increased client complexity. In a multi-tier architecture this a proper solution. However, lack of control over the client browser may prove that a simplistic solution, like the one described here, is a better choice.

CloudFlare

CloudFlare setup uses the free account and one domain, scalability.expert, is configured:

Basic configuration includes only one entry for the zone name:

As seen by the orange cloud icon, the requests for this record will be routed through CloudFlare’s network for inspection and analysis!

AWS UserData / Joyent Script

To automate the process of configuring instances, the following UserData script will be used:

#!/bin/bash -ex

# Debian apt-get install function
apt_get_install()
{
    DEBIAN_FRONTEND=noninteractive apt-get -y \
    -o DPkg::Options::=--force-confdef \
    -o DPkg::Options::=--force-confold \
    install $@
}
 
# Mark execution start
echo "STARTING" > /root/user_data_run
 
# Some initial setup
export DEBIAN_FRONTEND=noninteractive
apt-get update && apt-get upgrade -y

# Mark progress ...
echo "OS UPDATE FINISHED" >> /root/user_data_run
 
# Install required packages
apt_get_install jq nginx

# Mark progress ...
echo "SOFTWARE DEPENDENCIES INSTALLED" >> /root/user_data_run

# Create test html page
mkdir /var/www
cat > /var/www/index.html << "EOF"
<html>
    <head>
        <title>Demo Page</title>
    </head>
 
    <body>
        <center><h2>Demo Page</h2></center><br>
        <center>Status: running</center>
    </body>
</html>
EOF

# Configure NginX
cat > /etc/nginx/conf.d/demo.conf << "EOF"
# Minimal NginX VirtualHost setup
server {
    listen 8080;
 
    root /var/www;
    index index.html index.htm;
 
    location / {
        try_files $uri $uri/ =404;
    }
}
EOF
 
# Restart NginX with the new settings
/etc/init.d/nginx restart

# Mark progress ...
echo "NGINX CONFIGURED" >> /root/user_data_run

# /etc/init.d startup script
cat > /etc/init.d/cloudflare-submit.sh << "EOF"
#! /bin/bash
#
# Author: Ivo Vachkov (ivachkov@xi-group.com)
#
### BEGIN INIT INFO
# Provides: DNS-SD Service Group Registration / De-Registration
# Required-Start:
# Should-Start:
# Required-Stop:
# Should-Stop:
# Default-Start:  2 3 4 5
# Default-Stop:   0 1 6
# Short-Description:    Start / Stop script for DNS-SD
# Description:          Use to JOIN/LEAVE DNS-SD Service Group
### END INIT INFO

set -e
umask 022

# DNS Configuration details
ZONE="scalability.expert"
HOST="www"
TTL="120"
IP=""

# CloudFlare oSpecific Settings
CF_HOST="https://www.cloudflare.com/api_json.html"
CF_SERVICEMODE="0" # 0: Disable / 1: Enable CloudFlare acceleration network

# Edit the following parameters with your specific settings
CF_TOKEN="cloudflaretoken" 
CF_ACCOUNT="account@cloudflare.com"

# Execution log file
LOG_FILE=/var/log/cloudflare-submit.log

source /lib/lsb/init-functions

export PATH="${PATH:+$PATH:}/usr/sbin:/sbin:/usr/bin:/usr/local/bin:/usr/local/sbin"

# Get public IP
get_public_ip () {
        # Check what cloud provider this code is running on
        if [ ! -f "/var/lib/cloud/data/instance-id" ]; then
                echo "$0: /var/lib/cloud/data/instance-id is not available! Unsupported environment! Exiting ..."
                exit 1
        fi

        # Get the instance public IP address
        I_ID=`cat /var/lib/cloud/data/instance-id`
        if [[ $I_ID == i-* ]]; then
                # Amazon AWS
                IP=`curl http://169.254.169.254/latest/meta-data/public-ipv4`
        else
                # Joyent
                IP=`ifconfig eth0 | grep "inet addr" | awk '{print $2}' | cut -c6-`
        fi
}

# Default Start function
cloudflare_register () {
        # Get instance public IP address
        get_public_ip

        # Check the resutl
        if [ -z "$IP" ]; then
                echo "$0: Unable to obtain public IP Address! Exiting ..."
                exit 1
        fi

        # Execute update towards CloudFlare API
        curl -s $CF_HOST \
                -d "a=rec_new" \
                -d "tkn=$CF_TOKEN" \
                -d "email=$CF_ACCOUNT" \
                -d "z=$ZONE" \
                -d "type=A" \
                -d "name=$HOST" \
                -d "content=$IP" \
                -d "ttl=$TTL" >> $LOG_FILE
    
        # Get record ID for this IP
        REC_ID=`curl -s $CF_HOST \
                -d "a=rec_load_all" \
                -d "tkn=$CF_TOKEN" \
                -d "email=$CF_ACCOUNT" \
                -d "z=$ZONE" | jq -a '.response.recs.objs[] | .content, .rec_id' | grep -A 1 $IP| tail -1 | awk -F"\"" '{print $2}'`

        # Update with desired service mode
        curl -s $CF_HOST \
                -d "a=rec_edit" \
                -d "tkn=$CF_TOKEN" \
                -d "email=$CF_ACCOUNT" \
                -d "z=$ZONE" \
                -d "id=$REC_ID" \
                -d "type=A" \
                -d "name=$HOST" \
                -d "content=$IP" \
                -d "ttl=1" \
                -d "service_mode=$CF_SERVICEMODE" >> $LOG_FILE
}

# Default Stop function
cloudflare_deregister () {
        # Get instance public IP address
        get_public_ip

        # Check the resutl
        if [ -z "$IP" ]; then
                echo "$0: Unable to obtain public IP Address! Exiting ..."
                exit 1
        fi

        # Get record ID for this IP
        REC_ID=`curl -s $CF_HOST \
                -d "a=rec_load_all" \
                -d "tkn=$CF_TOKEN" \
                -d "email=$CF_ACCOUNT" \
                -d "z=$ZONE" | jq -a '.response.recs.objs[] | .content, .rec_id' | grep -A 1 $IP| tail -1 | awk -F"\"" '{print $2}'`

        # Execute update towards CloudFlare API
        curl -s $CF_HOST \
                -d "a=rec_delete" \
                -d "tkn=$CF_TOKEN" \
                -d "email=$CF_ACCOUNT" \
                -d "z=$ZONE" \
                -d "id=$REC_ID" >> $LOG_FILE
}

case "$1" in
start)
        log_daemon_msg "Registering $HOST.$ZONE  with CloudFlare ... " || true
        cloudflare_register
        ;;
stop)
        log_daemon_msg "De-Registering $HOST.$ZONE with CloudFlare ... " || true
        cloudflare_deregister
        ;;
restart)
        log_daemon_msg "Restarting ... " || true
        cloudflare_deregister
        cloudflare_register
        ;;
*)
        log_action_msg "Usage: $0 {start|stop|restart}" || true
        exit 1
esac

exit 0
EOF

# Add it to the startup / shutdown process
chmod +x /etc/init.d/cloudflare-submit.sh
update-rc.d cloudflare-submit.sh defaults 99

# Mark progress ...
echo "CLOUDFLARE SCRIPT INSTALLED" >> /root/user_data_run

# Register with CloudFlare to start receiving requests
/etc/init.d/cloudflare-submit.sh start

# Mark execution end
echo "DONE" > /root/user_data_run

100

101

102

103

104

105

106

107

108

109

110

111

112

113

114

115

116

117

118

119

120

121

122

123

124

125

126

127

128

129

130

131

132

133

134

135

136

137

138

139

140

141

142

143

144

145

146

147

148

149

150

151

152

153

154

155

156

157

158

159

160

161

162

163

164

165

166

167

168

169

170

171

172

173

174

175

176

177

178

179

180

181

182

183

184

185

186

187

188

189

190

191

192

193

194

195

196

197

198

199

200

201

202

203

204

205

206

207

208

209

210

211

212

213

214

215

216

217

218

219

220

221

222

223

224

225

226

227

228

#!/bin/bash -ex

# Debian apt-get install function

apt_get_install()

{

DEBIAN_FRONTEND=noninteractive apt-get -y \

-o DPkg::Options::=--force-confdef \

-o DPkg::Options::=--force-confold \

install $@

}

# Mark execution start

echo "STARTING" > /root/user_data_run

# Some initial setup

export DEBIAN_FRONTEND=noninteractive

apt-get update && apt-get upgrade -y

# Mark progress ...

echo "OS UPDATE FINISHED" >> /root/user_data_run

# Install required packages

apt_get_install jq nginx

# Mark progress ...

echo "SOFTWARE DEPENDENCIES INSTALLED" >> /root/user_data_run

# Create test html page

mkdir /var/www

cat > /var/www/index.html << "EOF"

<html>

<head>

</head>

<body>

<center>Status: running</center>

</body>

</html>

EOF

# Configure NginX

cat > /etc/nginx/conf.d/demo.conf << "EOF"

# Minimal NginX VirtualHost setup

server {

listen 8080;

root /var/www;

index index.html index.htm;

location / {

try_files $uri $uri/ =404;

}

EOF

# Restart NginX with the new settings

/etc/init.d/nginx restart

# Mark progress ...

echo "NGINX CONFIGURED" >> /root/user_data_run

# /etc/init.d startup script

cat > /etc/init.d/cloudflare-submit.sh << "EOF"

#! /bin/bash

# Author: Ivo Vachkov (ivachkov@xi-group.com)

### BEGIN INIT INFO

# Provides: DNS-SD Service Group Registration / De-Registration

# Required-Start:

# Should-Start:

# Required-Stop:

# Should-Stop:

# Default-Start: 2 3 4 5

# Default-Stop: 0 1 6

# Short-Description: Start / Stop script for DNS-SD

# Description: Use to JOIN/LEAVE DNS-SD Service Group

### END INIT INFO

set -e

umask 022

# DNS Configuration details

ZONE="scalability.expert"

HOST="www"

TTL="120"

IP=""

# CloudFlare oSpecific Settings

CF_HOST="https://www.cloudflare.com/api_json.html"

CF_SERVICEMODE="0" # 0: Disable / 1: Enable CloudFlare acceleration network

# Edit the following parameters with your specific settings

CF_TOKEN="cloudflaretoken"

CF_ACCOUNT="account@cloudflare.com"

# Execution log file

LOG_FILE=/var/log/cloudflare-submit.log

source /lib/lsb/init-functions

export PATH="${PATH:+$PATH:}/usr/sbin:/sbin:/usr/bin:/usr/local/bin:/usr/local/sbin"

# Get public IP

get_public_ip () {

# Check what cloud provider this code is running on

if [ ! -f "/var/lib/cloud/data/instance-id" ]; then

echo "$0: /var/lib/cloud/data/instance-id is not available! Unsupported environment! Exiting ..."

exit 1

# Get the instance public IP address

I_ID=`cat /var/lib/cloud/data/instance-id`

if [[ $I_ID == i-* ]]; then

# Amazon AWS

IP=`curl http://169.254.169.254/latest/meta-data/public-ipv4`

else

# Joyent

IP=`ifconfig eth0 | grep "inet addr" | awk '{print $2}' | cut -c6-`

}

# Default Start function

cloudflare_register () {

# Get instance public IP address

get_public_ip

# Check the resutl

if [ -z "$IP" ]; then

echo "$0: Unable to obtain public IP Address! Exiting ..."

exit 1

# Execute update towards CloudFlare API

curl -s $CF_HOST \

-d "a=rec_new" \

-d "tkn=$CF_TOKEN" \

-d "email=$CF_ACCOUNT" \

-d "z=$ZONE" \

-d "type=A" \

-d "name=$HOST" \

-d "content=$IP" \

-d "ttl=$TTL" >> $LOG_FILE

# Get record ID for this IP

REC_ID=`curl -s $CF_HOST \

-d "a=rec_load_all" \

-d "tkn=$CF_TOKEN" \

-d "email=$CF_ACCOUNT" \

-d "z=$ZONE" | jq -a '.response.recs.objs[] | .content, .rec_id' | grep -A 1 $IP| tail -1 | awk -F"\"" '{print $2}'`

# Update with desired service mode

curl -s $CF_HOST \

-d "a=rec_edit" \

-d "tkn=$CF_TOKEN" \

-d "email=$CF_ACCOUNT" \

-d "z=$ZONE" \

-d "id=$REC_ID" \

-d "type=A" \

-d "name=$HOST" \

-d "content=$IP" \

-d "ttl=1" \

-d "service_mode=$CF_SERVICEMODE" >> $LOG_FILE

}

# Default Stop function

cloudflare_deregister () {

# Get instance public IP address

get_public_ip

# Check the resutl

if [ -z "$IP" ]; then

echo "$0: Unable to obtain public IP Address! Exiting ..."

exit 1

# Get record ID for this IP

REC_ID=`curl -s $CF_HOST \

-d "a=rec_load_all" \

-d "tkn=$CF_TOKEN" \

-d "email=$CF_ACCOUNT" \

-d "z=$ZONE" | jq -a '.response.recs.objs[] | .content, .rec_id' | grep -A 1 $IP| tail -1 | awk -F"\"" '{print $2}'`

# Execute update towards CloudFlare API

curl -s $CF_HOST \

-d "a=rec_delete" \

-d "tkn=$CF_TOKEN" \

-d "email=$CF_ACCOUNT" \

-d "z=$ZONE" \

-d "id=$REC_ID" >> $LOG_FILE

}

case "$1" in

start)

log_daemon_msg "Registering $HOST.$ZONE with CloudFlare ... " || true

cloudflare_register

;;

stop)

log_daemon_msg "De-Registering $HOST.$ZONE with CloudFlare ... " || true

cloudflare_deregister

;;

restart)

log_daemon_msg "Restarting ... " || true

cloudflare_deregister

cloudflare_register

;;

log_action_msg "Usage: $0 {start|stop|restart}" || true

exit 1

esac

exit 0

EOF

# Add it to the startup / shutdown process

chmod +x /etc/init.d/cloudflare-submit.sh

update-rc.d cloudflare-submit.sh defaults 99

# Mark progress ...

echo "CLOUDFLARE SCRIPT INSTALLED" >> /root/user_data_run

# Register with CloudFlare to start receiving requests

/etc/init.d/cloudflare-submit.sh start

# Mark execution end

echo "DONE" > /root/user_data_run

This UserData script contains three components:

Lines 0 – 62: Boilerplate, OS update, installation and configuration of NginX;
Lines 64 – 215: cloudflare-submit.sh, main script that will be called on startup and shutdown of the instance. cloudflare-submit.sh will register the instance’s public IP address with CloudFlare and set required protection. By default, protection and acceleration is off. Additional configuration is required to make this script work for your setup, account details must be configured in the specified variables!
Lines 217 – 228: Setting proper script permissions, configuring automatic start of cloudflare-submit.sh and executing it to register with CloudFlare.

Code is reasonably straight-forward. init.d startup script is divided to multiple functions and output is redirected to a log file for debugging purposes. External dependencies are kept to a minimum. Distinguishing between AWS EC2 and Joyent instances is done by analyzing the instance ID. In AWS, all EC2 instances have instance IDs starting with ‘i-‘, while Joyent uses (by the looks of it) some sort of UUID. This part of the logic is particularly important if the code should be extended to support other cloud providers!

Both AWS and Joyent offer Ubuntu 14.04 support, so the same code can be use to configure the instances in automated fashion. This is particularly handy when it comes to data driven instance management and the DRY principle. Command line tools for both cloud providers also offer similar syntax, which makes it easier to utilize this functionality.

Amazon AWS

Staring new instances within Amazon AWS is straight-forward, assuming awscli is properly configured:

aws ec2 run-instances \
    --image-id ami-018c9568 \
    --count 1 \
    --instance-type t1.micro \
    --key-name test-key \
    --security-groups test-sg \
    --user-data file://userdata-script.sh

aws ec2 run-instances \

--image-id ami-018c9568 \

--count 1 \

--instance-type t1.micro \

--key-name test-key \

--security-groups test-sg \

--user-data file://userdata-script.sh

Joyent

Starting news instances within Joyent is somewhat more complex, but there is comprehensive documentation:

sdc-createmachine \
    --account account_name \
    --keyId aa:bb:cc:dd:ee:ff:gg:hh:ii:jj:kk:ll:mm:nn:oo:pp \
    --name test \
    --package "4dad8aa6-2c7c-e20a-be26-c7f4f1925a9a" \
    --tag Name=test \
    --url "https://us-east-1.api.joyentcloud.com" \
    --metadata "Name=test" \
    --image 286b0dc0-d09e-43f2-976a-bb1880ebdb6c \
    --script userdata-script.sh

sdc-createmachine \

--account account_name \

--keyId aa:bb:cc:dd:ee:ff:gg:hh:ii:jj:kk:ll:mm:nn:oo:pp \

--name test \

--package "4dad8aa6-2c7c-e20a-be26-c7f4f1925a9a" \

--tag Name=test \

--url "https://us-east-1.api.joyentcloud.com" \

--metadata "Name=test" \

--image 286b0dc0-d09e-43f2-976a-bb1880ebdb6c \

--script userdata-script.sh

This particular example will start new SmartMachine instance using the 4dad8aa6-2c7c-e20a-be26-c7f4f1925a9a package (g3-devtier-0.25-kvm, 3rd generation, virtual machine (KVM) with 256MB RAM) and 286b0dc0-d09e-43f2-976a-bb1880ebdb6c (ubuntu-certified-14.04) image. SSH key details are supplied through the specific combinations of Web-interface settings and SSH key signature. For the list of available packages (instance types) and images (software stacks) consult the API: ListPackages, ListImages.

NOTE: Joyent offers rich Metadata support, which can be quite flexible tool when managing large number of instances!

Successful service configuration

Successful service configuration will result in proper DNS entries to be added to the scalability.expert DNS zone in CloudFlare:

After configured TTL, those should be visible world-wide:

:~> nslookup www.scalability.expert
Server:         8.8.4.4
Address:        8.8.4.4#53

Non-authoritative answer:
Name:   www.scalability.expert
Address: 54.83.175.90
Name:   www.scalability.expert
Address: 165.225.137.102

:~>

:~> nslookup www.scalability.expert

Server: 8.8.4.4

Address: 8.8.4.4#53

Non-authoritative answer:

Name: www.scalability.expert

Address: 54.83.175.90

Name: www.scalability.expert

Address: 165.225.137.102

:~>

As seen, both AWS (54.83.175.90) and Joyent (165.225.137.102) IP addresses are returned, i.e. DNS Round-Robin. Service can simply be tested with:

:~> curl http://www.scalability.expert:8080/
<html>
    <head>
        <title>Demo Page</title>
    </head>

    <body>
        <center><h2>Demo Page</h2></center><br>
        <center>Status: running</center>
    </body>
</html>
:~>

:~> curl http://www.scalability.expert:8080/

<html>

<head>

</head>

<body>

<center>Status: running</center>

</body>

</html>

:~>

Resulting calls can be seen in the NginX log files on both instances:

NOTE: CloudFlare protection and acceleration features are explicitly disabled in this example! It is strongly suggested to enabled them for production purposes!

Conclusion

It should be clear now, that whenever software architecture follows certain design principles and application is properly decoupled in multiple tiers, the whole system can be deployed within multiple cloud providers. DevOps principles for automated deployment can be implemented in this environment as well. The overall system is with improved scalability, reliability and in case of data driven elastic deployments, even cost! Proper design is key, but the technology provided by companies like Amazon and Joyent make it easier to turn whiteboard drawings into actual systems with hundreds of nodes!

References

Small Tip: How to use AWS CLI to start Spot instances with UserData

2014/07/12 AWS, DevOps, Operations, Small Tip AWS, AWS CLI, DevOps, spot instances, UserData

Common occurrence in the list of daily DevOps tasks is the one to deal with AWS EC2 Spot Instances. They offer the same performance, as the OnDemand counterparts, they are cheap to the extend that user can specify the hourly price. The drawback is that AWS can reclaim them if the market price goes beyond the user’s price. Still, those are key component, a basic building block, in every modern elastic system. As such, DevOps engineers must regularly interact with those.

AWS provides proper command line interface, aws ec2 request-spot-instances exposes multiple options to the user. However, some of the common use cases are not comprehensively covered in the documentation. For example, creating Spot Instances with Userdata using the command line tools is somewhat obscure and convoluted, although common need in DevOps and Developers lives. The tricky part: It must be BASE64 encoded!

Assume the following, simple UserData script, must be deployed on numerous EC2 Spot Instances:

#!/bin/bash -ex

# Debian apt-get install function
apt_get_install()
{
        DEBIAN_FRONTEND=noninteractive apt-get -y \
        -o DPkg::Options::=--force-confdef \
        -o DPkg::Options::=--force-confold \
        install $@
}

# Mark execution start
echo "STARTING" > /root/user_data_run

# Some initial setup
set -e -x
export DEBIAN_FRONTEND=noninteractive
apt-get update && apt-get upgrade -y

# Install required packages
apt_get_install nginx

# Create test html page
mkdir /var/www
cat > /var/www/index.html << "EOF"
<html>
        <head>
                <title>Demo Page</title>
                </head>

        <body>
                <center><h2>Demo Page</h2></center><br>
                <center>Status: running</center>
        </body>
</html>
EOF

# Configure NginX
cat > /etc/nginx/conf.d/demo.conf << "EOF"
# Minimal NginX VirtualHost setup
server {
        listen 8080;

        root /var/www;
        index index.html index.htm;

        location / {
                try_files $uri $uri/ =404;
        }
}
EOF

# Restart NginX with the new settings
/etc/init.d/nginx restart

# Mark execution end
echo "DONE" > /root/user_data_run

#!/bin/bash -ex

# Debian apt-get install function

apt_get_install()

{

DEBIAN_FRONTEND=noninteractive apt-get -y \

-o DPkg::Options::=--force-confdef \

-o DPkg::Options::=--force-confold \

install $@

}

# Mark execution start

echo "STARTING" > /root/user_data_run

# Some initial setup

set -e -x

export DEBIAN_FRONTEND=noninteractive

apt-get update && apt-get upgrade -y

# Install required packages

apt_get_install nginx

# Create test html page

mkdir /var/www

cat > /var/www/index.html << "EOF"

<html>

<head>

</head>

<body>

<center>Status: running</center>

</body>

</html>

EOF

# Configure NginX

cat > /etc/nginx/conf.d/demo.conf << "EOF"

# Minimal NginX VirtualHost setup

server {

listen 8080;

root /var/www;

index index.html index.htm;

location / {

try_files $uri $uri/ =404;

}

EOF

# Restart NginX with the new settings

/etc/init.d/nginx restart

# Mark execution end

echo "DONE" > /root/user_data_run

Make sure base64 command is available in your system, or use equivalent, to encode the sample userdata.sh file before passing to the launch specification:

aws ec2 request-spot-instances \
    --spot-price 0.01 \
    --instance-count 2 \
    --launch-specification \
        "{ \
            \"ImageId\":\"ami-a6926dce\", \
            \"InstanceType\":\"m3.medium\", \
            \"KeyName\":\"test-key\", \
            \"SecurityGroups\": [\"test-sg\"], \
            \"UserData\":\"`base64 userdata.sh`\" \
        }"

aws ec2 request-spot-instances \

--spot-price 0.01 \

--instance-count 2 \

--launch-specification \

"{ \

\"ImageId\":\"ami-a6926dce\", \

\"InstanceType\":\"m3.medium\", \

\"KeyName\":\"test-key\", \

\"SecurityGroups\": [\"test-sg\"], \

\"UserData\":\"`base64 userdata.sh`\" \

In this example two spot instance requests will be created for m3.medim instances, using ami-a6926dce AMI, test-key SSH key, running in test-sg Security Group. BASE64-encoded contents of userdata.sh will be attached to the request so upon fulfillment the Userdata will be passed to the newly created instances and executed after boot-up.

Spot instance requests will be created in the AWS EC2 Dashboard:

Once the Spot Instance Requests (SIRs) are fulfilled, InstanceID will be associated for each SIR:

EC2 Instances dashboard will show newly created Spot Instances (notice the “Lifecycle: spot” in Instance details):

Using the proper credentials, one can verify successful execution of the userdata.sh on each instance:

:~> ssh -i ~/.ssh/test-key.pem ubuntu@ec2-54-211-6-104.compute-1.amazonaws.com "tail /var/log/cloud-init-output.log"
Setting up nginx (1.4.6-1ubuntu3) ...
Processing triggers for libc-bin (2.19-0ubuntu6) ...
+ mkdir /var/www
+ cat
+ cat
+ /etc/init.d/nginx restart
 * Restarting nginx nginx
   ...done.
+ echo DONE
Cloud-init v. 0.7.5 finished at Sat, 12 Jul 2014 18:17:09 +0000. Datasource DataSourceEc2.  Up 76.38 seconds
:~>

:~> ssh -i ~/.ssh/test-key.pem ubuntu@ec2-54-211-6-104.compute-1.amazonaws.com "tail /var/log/cloud-init-output.log"

Setting up nginx (1.4.6-1ubuntu3) ...

Processing triggers for libc-bin (2.19-0ubuntu6) ...

+ mkdir /var/www

+ cat

+ /etc/init.d/nginx restart

* Restarting nginx nginx

...done.

+ echo DONE

Cloud-init v. 0.7.5 finished at Sat, 12 Jul 2014 18:17:09 +0000. Datasource DataSourceEc2. Up 76.38 seconds

:~>

… and more importantly, if the configured service works as expected:

:~> curl http://ec2-54-211-6-104.compute-1.amazonaws.com:8080/
<html>
        <head>
                <title>Demo Page</title>
                </head>

        <body>
                <center><h2>Demo Page</h2></center><br>
                <center>Status: running</center>
        </body>
</html>
:~>

:~> curl http://ec2-54-211-6-104.compute-1.amazonaws.com:8080/

<html>

<head>

</head>

<body>

<center>Status: running</center>

</body>

</html>

:~>

Newly created Spot Instances are serving traffic, running at 0.01 USD/hr and will happily do so until the market price for this instance type goes above the specified price!

References

http://docs.aws.amazon.com/cli/latest/reference/ec2/request-spot-instances.html

How to deploy single-node Hadoop setup in AWS

Related Posts

UserData Template for Ubuntu 14.04 EC2 Instances in AWS

Related Posts

Small Tip: How to use AWS CLI ‘–filter’ parameter

Related Posts

AWS, DevOps, Outsourcing …

Is it possible to outsource DevOps?!

Related Posts

Small Tip: How to use –block-device-mappings to manage instance volumes with AWS CLI

Related Posts

A few myths about DevOps

Myth: Developers can do Ops

Myth: Sysadmins are obsolete

Myth: DevOps can not be outsourced / off-shored

Myth: You need DevOps only if you use Cloud

Myth: DevOps is supplementary activity

Related Posts

Small Tip: How to use AWS CLI to start Spot instances with UserData

Related Posts

Categories

Recent Posts