
Wednesday, June 5, 2013

OpenShift on AWS EC2, Part 5 - Preparing Configuration Management (with Puppet)

I'm 5 posts into this and still haven't gotten to any OpenShift yet, except for doling out the instances and defining the ports and securitygroups for network communication.  I did say "from the ground up" though, so if you've been here from the beginning, you knew what you were getting into.

In this post I'm going to build and run the tasks needed to turn an EC2 base instance with almost nothing installed into a Puppet master, or a Puppet client.  There are a number of little details that need managing to get puppet to communicate and to make it as easy as possible to manage updates.

First a short recap for people just joining and so I can get my bearings.

Previously, our heroes...


In the first post I introduced a set of tools I'd worked up for myself to help me understand and then automate the interactions with AWS.

In the second one I registered a DNS domain and delegated it to the AWS Route53 DNS service.

In the third I figured out what hosts (or classes of hosts) I'd need to run for an OpenShift service.  Then I defined a set of network filter rules (using the AWS EC2 securitygroup feature) to make sure that my hosts and my customers could interact.

Finally, in the previous post I selected an AMI to use as the base for my hosts, allocated a static IP address, added a DNS A record, and started instances for the puppet master and broker hosts.  The remaining three (data1, message1, and node1) were left as an exercise for the reader.

So now I have five AWS EC2 instances running.  I can reach them via SSH. The default account ec2-user has sudo ALL permissions. The instances are completely unconfigured.

The next few sections are a bunch of exposition and theory.  They explain some of what I'm doing and why, but don't contain a lot of doing. Scan ahead to the real stuff closer to the bottom if you get bored.

The End of EC2


With the completion of the 4th post, we're done with EC2.  All of the interactions from here on occur over SSH.  The only remaining interactions with Amazon will be with Route53.  The broker will be configured to update the app.example.org zone when applications are added or removed.

You could reach this point with any other host provisioning platform: AWS CloudFormation, libvirt, VirtualBox, Hyper-V, VMware, or bare metal. It doesn't matter. Each of those will have its own provisioning details, but if you can get to networked hosts with stable public domain names, you can pick up here and go on, ignoring everything but the first post.

The first post is still needed for the process I'm defining because the origin-setup tools written with Thor aren't just used for EC2 manipulation.  If that's all they were for I would have used one of the existing EC2 CLI packages.

Configuration Management: An Operations Religion


I mean this with every coloring and shade of meaning it can have, complete with schisms and dogma and redemption and truth.

Some small-shop system administrators think that configuration management isn't for them, that it isn't needed.  I differ with that opinion.  Configuration management systems have two complementary goals.  Only one of them is managing large numbers of systems.  The important goal is managing even one repeatably.  This is the Primary Dogma of System Administration: if you can't do it 1000 times, you can't do it at all.

The service I'm outlining only requires four hosts (the puppet master will be the fifth). I could do it on one; that's how most demos until now have done it. I could describe how to manually install and tweak each of the components in an OpenShift system, but it's very unlikely that anyone would ever be able to reproduce what I described exactly. (I speak from direct experience here: following that kind of description in natural language is hard, and writing it is harder.)  Using a CMS it is possible to expose what needs to be configured specially and what can be defaulted, and to allow (if it's done well) for flexibility and customization.

The religion comes in when you try to decide which one.

I'm going to go with sheep and expedience and choose Puppet.  Other than that I'm not going to explain why.

Brief Principles of Puppet


Puppet is one of the currently popular configuration management systems.  It is widely available and has a large knowledgeable user base. (that's why).

The Master/Agent deployment


The standard installation of puppet contains a puppet master and one or more puppet clients running the puppet agent service. The configuration information is stored on the puppet master host.  The agent processes periodically poll the master for updates to their configuration. When an agent detects a change in the configuration spec the change is applied to the host.

The puppet master scheme has some known scaling issues, but for this scenario it will suit just fine. If the OpenShift service grows beyond what the master/agent model can handle, then there are other ways of managing and distributing the configuration, but they are beyond the scope of this demonstration.

The Site Model Paradigm


That's the only time you'll see me use that word. I promise.

The puppet configuration is really a description of every component, facet and variable you care about in the configuration of your hosts.  It is a model in the sense that it represents the components and their relationships. The model can be compared to reality to find differences. Procedures can be defined to resolve the differences and bring the model and reality into agreement.
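A Puppet manifest is exactly this kind of model fragment: it declares a desired state rather than the steps to reach it. A minimal, hypothetical example of one resource:

```puppet
# Model: this file exists with this owner, mode and content.
# Puppet compares the model to reality and corrects any drift it finds.
file { '/etc/motd':
  ensure  => file,
  owner   => 'root',
  group   => 'root',
  mode    => '0644',
  content => "This host is managed by puppet.\n",
}
```

Nothing here says how to create the file; the agent works that out by diffing the declaration against the host.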

There are some things to be aware of. The model is, at any moment, static. It represents the current ideal configuration. The agents are responsible for polling for changes to the model and for generating the comparisons as well as applying any changes to the host. It is certain that when a change is made to the model, there will be a window of time when the site does not match. Usually it doesn't matter, but sometimes changes have to be coordinated. Later I may add MCollective to the configuration to address this. MCollective is Puppet's messaging/remote procedure call service and it allows for more timing control than the standard Puppet agent pull model.

Also, the model is only aware of what you tell it to be aware of.  Anything that you don't specify is... undetermined. Now, specifying everything will bury you and your servers under the weight of just trying to stay in sync. It's important to determine what you really care about and what you don't. It's also important to look carefully at what you're leaving out, to be sure that it's safe.

Preparing the Puppet Master and Clients


As usual, there's something you have to do before you can do the thing you really want to do. While puppet can manage pretty much anything about a system after it is set up, it can't set itself up from nothing.

  • The puppet master must have a well-known public hostname (DNS). Check.
  • Each participating client must have a well-known public hostname (DNS). Check.
  • The master and clients must each know their own hostname (for identification to the master). Err.
  • The master and clients must have time sync. Ummm.
  • The master and clients must have the puppet (master/agent) software installed. Not check.
  • The master must have any additional required modules installed.
  • The master must have a private certificate authority (CA) so that it can sign client credentials. Not yet.
  • The clients must generate and submit a client certificate for the master to sign. Nope.
  • The master must have a copy of the site configuration files to generate the configuration model. No.

The first four points are generic host setup, and the first two are complete. Installing the puppet software should be simple, but I may need to check and/or tweak the package repositories to get the version I want. The last four are pure puppet configuration and the last one is the goal line.

Hostname


Puppet uses the hostname value set on each host to identify the host. Each host should have its hostname set to the FQDN of the IP address on which it expects incoming connections.
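This matters because Puppet's certificate name (certname) defaults to the host's FQDN. If you can't trust the hostname on a cloud instance, the identity can be pinned explicitly in /etc/puppet/puppet.conf; a sketch:

```ini
[main]
# certname defaults to the host's FQDN; setting it explicitly pins the
# identity this host presents to the master regardless of the local hostname
certname = broker.infra.example.org
```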

Time Sync on Virtual Machines


Time sync needs a little space here.  On an ordinary bare-metal host I'd say "install ntpd on every host". NTP daemons are lightweight, and more reliable and stable than something like a cron job that re-syncs.  Virtual machines are special though.

On a properly configured virtual machine, the system time comes from the VM host. As the guest, you must assume that the host is doing the right thing. The guest VM has a simulated real-time clock (RTC) which is a pass-through either of the host clock or the underlying hardware RTC. In either case, the guest is not allowed to adjust the underlying clock.

Typically a service like ntpd gets time information from outside and not only slews the system (OS) clock but it compares that to the RTC and tries to compensate for drift between the RTC and the "real" time. In the default case it will even adjust the RTC to keep it in line with the system clock and "real" time.

As a guest, it's impolite to go around adjusting your host's clocks.

So a virtual machine system like an IaaS is one of the few places I'd advise against installing a time server.  If your VMs aren't in sync, call your provider and ask them why their hardware clocks are off. If they can't give you a good answer, find a new IaaS provider.

Time Zones and The Cloud


I'm going to throw one more timey-wimey thing in here. I set the system timezone on every server host to UTC.  If I ever have to compare logs on servers from different regions of the world (this is the cloud, remember?) I don't have to convert time zones.  User accounts can always set their timezone to local time using the TZ environment variable. The tasks offer an option so that you can override the timezone.
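The TZ override is purely per-process; nothing on the system changes. For example:

```shell
# The system default stays UTC; TZ overrides it for a single command only.
date -u +%Z                    # the UTC view of "now"
TZ=America/New_York date +%Z   # the same instant in a user's zone (typically EST or EDT)
```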

Host Preparation vs. Software Configuration


It would be fairly easy to write a single task that completes all of the bullet points listed above, but something bothers me about that idea. The first four are generic host tasks. The last four are distinctly puppet configuration related. Installing the software packages sits on the edge of both. The system tasks are required on every host. Only the puppet master will get the puppet master service software and configuration. The puppet clients will get different software and a different configuration process.

I'm going to take advantage of the design of Thor to create three separate tasks to accomplish the job:

  • origin:prepare - do the common hosty tasks
  • origin:puppetmaster - prepare and then install and configure a master
  • origin:puppetclient - prepare, and then install and register a client

So the origin:prepare task needs to set the hostname on the box to match the FQDN. I prefer also to enable the local firewall service and open a port for SSH to minimize the risk of unexpected exposure. This is also where I'd put a task to add a software repository for the puppet packages if needed.

Each of the origin:puppetmaster and origin:puppetclient tasks will invoke the origin:prepare task first.

File Revision Control


Since Configuration Management is all about control and repeatability, it also makes sense to place the configuration files themselves under revision control. For this example I'm going to place the site configuration in a GitHub repository. Changes can be made in a remote work space and pushed to the repository. Then they can be pulled down to the puppet master and the service notified to re-read the configurations. They can also be reverted as needed.

When the Puppet site configuration is created on the puppet master, it will be cloned from the git repo on github.

Initialize the Puppet Master

The puppet master process runs as a service on the puppet server. It listens for polling queries from puppet agents on remote machines. The puppet master service must read the site configurations to build the models that will define each host. The puppet service runs as a non-root user and group, each named "puppet". The default location for puppet configuration files is in /etc/puppet.  This area is only writable by the root user. Other service files reside in /var/lib/puppet. This area is writable by the puppet user and group. Further, SELinux limits access by the puppet user to files outside these spaces.

On RHEL6, the EC2 login user is still root. The user and group settings aren't really needed there, but they are still consistent.

The way I choose to manage this is:
  1. Add the ec2-user to the puppet group
  2. Place the site configuration in /var/lib/puppet/site
  3. Update the puppet configuration file (/etc/puppet/puppet.conf) to reflect the change
  4. Clone the configuration repo into the local configuration directory
  5. Symlink the configuration repo root into the ec2-user home directory.
This way the ec2-user has permission and access to update the site configuration.
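A sketch of those five steps as shell commands on the master (a sketch only; the user, group and paths are the ones named in this post, and the repo URL is the one used later in the series):

```shell
# 1. let the login user write via the puppet group
sudo usermod -a -G puppet ec2-user

# 2. create the site directory, owned by the puppet user and group,
#    group-writable, with the setgid bit so new files inherit the group
sudo mkdir -p /var/lib/puppet/site
sudo chown puppet:puppet /var/lib/puppet/site
sudo chmod 2775 /var/lib/puppet/site

# 3-4. (puppet.conf update is covered below) clone the site repo into place
sudo -u puppet git clone https://github.com/markllama/origin-puppet /var/lib/puppet/site

# 5. make the site configuration convenient to reach from the login account
ln -s /var/lib/puppet/site ~/site
```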

Puppet uses x509 server and client certificates. The puppet master needs a server certificate and needs to self-sign it before it can sign client certificates or accept connections from clients.

Once the server certificate is generated and signed, I also need to enable and start the puppet master service.  Finally, I need to add a firewall rule allowing inbound connections on the puppet master port, 8140/TCP.

So the process of initializing the puppet master is this:

  • install the puppet master software
  • modify the puppet config file to reflect the new site configuration file location
  • install additional puppet modules
  • generate server certificate and sign it
  • add ec2-user to puppet group (or root user on RHEL6)
  • create site configuration directory and set owner, group, permissions
  • clone the git repository into the configuration directory
  • start and enable the puppet master service

Installing Packages


Since I'm using Thor, the package installation process is a Thor task. Each sub-task will only run once within the invocation of its parent. The origin:puppetmaster task calls the origin:prepare task and provides a set of packages needed for a puppet master in addition to any installed as part of the standard preparation (firewall management and augeas). For the puppet master, these additional packages are the puppet-master and git packages. Dependencies are resolved by YUM.

Adding user to Puppet group


The puppet service is controlled by the root user, but runs as a role user and group both called puppet. I would like the login user to be able to manage the puppet site configuration files, but not to log in either as the root or puppet user. I'll add the ec2-user user to the puppet group, and set the group write permissions so that this user can manage the site configuration.

Creating the Site Configuration Space


As noted above, the ec2-user account will be used to manage the puppet site configuration files. The files must be writable by the ec2-user (through the puppet group) but they must also be readable by the puppet user and service. In addition, since these are service configurations rather than (local) host configuration files, I'd prefer that they not reside in /etc.

SELinux policy restricts the location of files which the puppet service processes can read. One of those locations is in /var/lib/puppet. Rather than update the policy, it seems easier to place the site configuration data within /var/lib/puppet.

I create a new directory /var/lib/puppet/site and set the owner, group and permissions so that the puppet user and group can read and write the files. I also set the setgid bit so that new files will inherit the directory's group. This way the ec2-user will have the needed access, and SELinux will not prevent the puppet master service from reading the files. In a later step I'll use git to clone the site configuration files into place.

Install Service Configuration File (setting variables)


Moving the location of the site configuration files from the default (/etc/puppet/manifests) and adding a location for user defined modules requires updating the default configuration file. Currently I make three alterations to the default file:


  • set the puppet master hostname as needed
  • set the location of the site configuration (manifests)
  • add a location to the modulepath
I use a template file, push a copy to the master and use sed to replace the values before copying the updated file into place.
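A sketch of that template-and-sed step (the @...@ placeholder convention here is my own, not necessarily what the origin-setup tasks use; the setting names are Puppet 3-era):

```shell
# puppet.conf template with placeholders for the site-specific values
cat > puppet.conf.tmpl <<'EOF'
[main]
server = @MASTER@

[master]
manifestdir = @SITEDIR@/manifests
modulepath = @SITEDIR@/modules:/etc/puppet/modules
EOF

# fill in the values; the result would then be copied to /etc/puppet/puppet.conf
sed -e 's/@MASTER@/puppet.infra.example.org/' \
    -e 's|@SITEDIR@|/var/lib/puppet/site|g' \
    puppet.conf.tmpl > puppet.conf.new
cat puppet.conf.new
```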

Installing Standard Modules


Puppet provides a set of standard modules for managing common aspects of clients.  These are installed from PuppetLabs' site with the puppet module install command, before the master process is started.

Unpacking Site Configuration (From git)


I already have a task for cloning a git repository on a remote host.  Unpack the site configurations into the directory prepared previously.  The git repo must have two directories at the top: manifests and modules. These will contain the site configuration and any custom modules needed for OpenShift. These locations are configured into the puppet master configuration above.


Adding Firewall Rules


The puppet master service listens on port 8140/TCP. I need to add an allow rule so that inbound connections to the puppet master will succeed.

Just to be safe I also add an explicit rule to allow SSH (22/TCP) before restarting the firewall service.

These match the securitygroup rule definitions defined in the third post. Some people would question the need for running a host-based firewall when EC2 already provides network filtering. I would refer anyone who asks to read up on Defense in Depth.

Filtering the Puppet logs into a separate file


It is much easier to observe the operation of the service if the logs are in a separate file. I add an entry to the /etc/rsyslog.d/ directory and restart the rsyslog daemon to place puppet master logs in /var/log/puppet-master.log.
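A sketch of such an rsyslog drop-in (the file name and the program-name tag are assumptions; check what tag your puppet master actually logs under before using it):

```
# /etc/rsyslog.d/puppetmaster.conf
# send anything tagged 'puppet-master' to its own file, then discard it
# so it doesn't also land in /var/log/messages
if $programname == 'puppet-master' then /var/log/puppet-master.log
& ~
```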


Enabling and Starting the Puppet Master Service


Finally, when all of the puppet master host customization is complete, I can enable and start the puppet master service.

What all that looks like


That's a whole long list and I created a whole set of Thor tasks to manage the steps.  Then I created an uber-task to execute it all.  It starts with the result of origin:baseinstance (run with the securitygroups default and puppetmaster). It results in a running puppet master waiting for clients to connect.

thor origin:puppetmaster puppet.infra.example.org --siterepo https://github.com/markllama/origin-puppet
origin:puppetmaster puppet.infra.example.org
task: remote:available puppet.infra.example.org
task: origin:prepare puppet.infra.example.org
task: remote:distribution puppet.infra.example.org
fedora 18
task: remote:arch puppet.infra.example.org
x86_64
task: remote:timezone puppet.infra.example.org UTC
task: remote:hostname puppet.infra.example.org
task: remote:yum:install puppet.infra.example.org puppet-server git system-config-firewall-base augeas
task: puppet:master:join_group puppet.infra.example.org
task: remote:git:clone puppet.infra.example.org https://github.com/markllama/origin-puppet
task: puppet:master:configure puppet.infra.example.org
task: puppet:master:enable_logging puppet.infra.example.org
task: puppet:module:install puppet.infra.example.org puppetlabs-ntp
task: remote:firewall:stop puppet.infra.example.org
task: remote:firewall:service puppet.infra.example.org ssh
task: remote:firewall:port puppet.infra.example.org 8140
task: remote:firewall:start puppet.infra.example.org
task: remote:service:start puppet.infra.example.org puppetmaster
task: remote:service:enable puppet.infra.example.org puppetmaster

You can check that the puppet master has created and signed its own CA certificate by listing the puppet certificates like this:

thor puppet:cert list puppet.infra.example.org --all
task puppet:cert:list puppet.infra.example.org
+ puppet.infra.example.org BD:27:A5:3B:AE:F5:1D:05:7E:8F:E7:E9:CA:BA:32:4B

This indicates that there is now a single certificate associated with the puppet master.  This certificate will be used to sign the client certificates as they are submitted.

Initializing a Puppet Client


The first part of creating a puppet client host is the same as for the master (almost).  It involves installing some basic puppet packages (puppet, facter, augeas), setting the hostname and time zone and the rest of the hosty stuff.  Then we get to the puppet client registration.

The puppet agent runs on the controlled client hosts. It polls the puppet master periodically checking for updates to the configuration model for the host.
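The agent learns where its master is from the server setting in its own /etc/puppet/puppet.conf; a minimal sketch:

```ini
[agent]
# where to poll for this host's configuration model
server = puppet.infra.example.org
# how often to poll, in seconds (1800, thirty minutes, is the default)
runinterval = 1800
```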

When the puppet agent starts the first time it generates an x509 client certificate and sends a signing request to the puppet master.

When the puppet master receives an unsigned certificate from an agent for the first time it places it in a list of certificates waiting to be signed. The user can then sign and accept each new client certificate and the initial identification process is complete. From then on the puppet agent polls using its client certificate for identification and the signature provides authentication.

The process then for installing and initializing the puppet client is this:

On the client:
  • install the puppet agent package
  • configure the puppet master hostname into the configuration file
  • enable the puppet agent service
  • start the puppet agent service
Then on the puppet master:
  • wait for the client unsigned certificate to arrive
  • sign the new client certificate
This is what it looks like for the broker host:


thor origin:puppetclient broker.infra.example.org puppet.infra.example.org
origin:puppetclient broker.infra.example.org, puppet.infra.example.org
task: remote:available broker.infra.example.org
task: origin:prepare broker.infra.example.org
task: remote:distribution broker.infra.example.org
fedora 18
task: remote:arch broker.infra.example.org
x86_64
task: remote:timezone broker.infra.example.org UTC
task: remote:hostname broker.infra.example.org
task: remote:yum:install broker.infra.example.org puppet facter system-config-firewall-base augeas
task: puppet:agent:set_server broker.infra.example.org puppet.infra.example.org
task: puppet:agent:enable_logging broker.infra.example.org
task: remote:service:enable broker.infra.example.org puppet
task: remote:service:start broker.infra.example.org puppet
task: puppet:cert:sign puppet.infra.example.org broker.infra.example.org

At this point the client can request its own configuration model and the master will confirm the identity of the client and return the requested information.

thor puppet:cert:list puppet.infra.example.org --all
task puppet:cert:list puppet.infra.example.org
+ broker.infra.example.org 09:97:22:B9:A9:16:AE:B1:32:93:EC:3A:6D:7A:CF:67
+ puppet.infra.example.org 70:B8:E0:C0:F8:5B:48:67:4E:92:91:D2:0D:E4:2B:F4

Repeat the origin:puppetclient step for the data1, message1 and node1 instances you created last time. You did create them, right?  Check the certs as each one registers.

The next step is to actually build a model for the client to request by creating a site manifest and a set of node descriptions.

That means: we finally get to do some OpenShift.

Thursday, May 30, 2013

OpenShift on AWS EC2, Part 4 - The First Machine

There's enough infrastructure in place now that I should be able to create the first instance for my OpenShift service.  I'm going to be managing the configuration with a Puppet master, so that will be the first instance I create.

The puppet master must have a public name and a fixed IP address.  I need to be able to reach it via SSH, and the puppet agents need to be able to find it by name (oversimplification, go with me on this).

With Route53 and EC2 configured, I can request a static (elastic) IP and associate it with a hostname in my domain.  I can also associate it with a new instance after the instance is launched. I can specify the network filtering rules so I can access the host over the network.

I actually have a task that does all this in one go, but I'm going to walk through the steps once so it's not magic.

NOTE: if you haven't pulled down the origin-setup tools from github and you want to follow along, you should go back to the first post in this series and do so.

This is not the only way to accomplish the goals set here.  You can use the AWS web console, CloudFormation or even tools like the control plugins for Vagrant.

Instances and Images in EC2


First, a little terminology.  EC2 has a number of terms to disambiguate... things.

An image is a static piece of storage which contains an OS.  It is the "gold copy" that we used to make when we still cloned hard disks to copy systems.  An image cannot run an OS.  It's storage.  An image does have some metadata though.  It has an associated machine architecture.  It has instructions for how it is to be mounted when it is used to create an instance.

(actually, this is a lie, an image is the metadata, the storage is really in a snapshot with a volume but that's too much and not really important right now.)

An instance is a runnable copy of an image.  It has a copy of the disk, but it also has the ability to start and stop.  It is assigned an IP address when it starts.  A copy of your RSA security key is installed when it starts so that you can log in.

When you want a new machine, you create an instance from an image.  You select an image which uses the architecture and contains the OS that you want.  You give the instance a name, a comment, and its security groups.  There are other things you can specify as well, but they don't come into play here.

Finding the Right Image


People like their three letter abbreviations.  On the web interface you'll see the term "AMI", which, I think, stands for "Amazon Machine Image", otherwise known as "an image" in this context.  While the image IDs all begin with ami-, I'm going to continue to refer to them as "images".

For OpenShift I want to start with either a Fedora or a RHEL (or CentOS) image.  I can't think of a reason anymore not to use a 64 bit OS and VM, so I'll specify that.   You can easily find official RHEL images on the AWS web console or using the AWS Marketplace.  You can find CentOS in the Marketplace.  There are "official" Fedora images there too, though they're not publicized.

What I do is use the web interface to find a recommended image and then make a note of the owner ID of the image.  From then on I can use the owner ID to find images using the CLI tools. It doesn't look like you can look up an owner's information from their owner ID.

New instances (running machines) are created from images.  Conversely, new images can be created from an instance.  People can create, register and publish their images, so there can be lots of things that look "official" which may have been altered.  It takes a little sleuthing to find the images that come from the source you want.

Using the AWS console, I narrowed the Fedora x86_64 images down to this:


I made a note of the owner ID, and the pattern for the names and I can search for them on the CLI like this:

thor ec2:image list --owner 125523088429 --name 'Fedora*' --arch x86_64
ami-2509664c 125523088429 x86_64 Fedora-x86_64-17-1-sda 
ami-6f3b5006 125523088429 x86_64 Fedora-x86_64-19-Beta-20130523-sda 
ami-b71078de 125523088429 x86_64 Fedora-x86_64-18-20130521-sda 

Note that the --name search allows for globbing using the asterisk (*) character.
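That glob matching works like shell filename patterns; a quick illustration with the AMI names above:

```shell
# '*' matches any run of characters, so 'Fedora*' matches all the Fedora AMIs
pattern='Fedora*'
for name in Fedora-x86_64-18-20130521-sda CentOS-6.4-x86_64; do
  case "$name" in
    $pattern) echo "match: $name" ;;
    *)        echo "no match: $name" ;;
  esac
done
```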

Launching the First Instance


I think I have enough information now to fire up the first instance for my OpenShift service.  The first one will be the puppet master, as that will control the configuration of the rest.

What I know:

  • hostname - puppet.infra.example.org
  • base image - ami-b71078de
  • securitygroup(s) - default, allow SSH
  • SSH key pair name

Later, I will also need a static (ElasticIP) address and a DNS A record.  Both of those can be set after the instance is running.

There is one last thing to decide.  When you create an EC2 instance, you must specify the instance type, which is a kind of sizing for the machine resources.  AWS has a table of EC2 instance types that you can use to help size your instances to your needs.  Since I'm only building a demo, I'm going to use the t1.micro type.  This has 7GB of instance storage, a single virtual core and enough memory for this purpose.  The CPU usage is also free (storage and unused Elastic IPs still cost).

  • size: t1.micro

So, here we go, with the CLI tools:

thor ec2:instance create --name puppet --type t1.micro --image ami-b71078de --key <mykeyname> --securitygroup default 
task: ec2:instance:create --image ami-b71078de --name puppet
  id = i-d8c912bb

That's actually pretty... anti-climactic. I've got a convention that each task echoes the required arguments back as it is invoked. That way, as the tasks are composed into bigger tasks, you can see what's going on inside while it runs.

All this one seemed to do was return an instance ID. To see what's going on, I can request the status of the instance:

thor ec2:instance status --id i-d8c912bb
pending

Since I'm impatient, I do that a few more times and after about 30 seconds it changes to this:

thor ec2:instance status --id i-d8c912bb
running

I want to log in, but so far all I know is the instance ID. I can ask for the hostname.

thor ec2:instance hostname --id i-d8c912bb
ec2-23-22-234-113.compute-1.amazonaws.com

And with that I should be able to log in via SSH using my private key:

ssh -i ~/.ssh/<mykeyfile>.pem ec2-user@ec2-23-22-234-113.compute-1.amazonaws.com
The authenticity of host 'ec2-23-22-234-113.compute-1.amazonaws.com (23.22.234.113)' can't be established.
RSA key fingerprint is 64:ec:6d:7d:af:ae:9a:70:78:0d:02:28:f1:c3:45:50.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added 'ec2-23-22-234-113.compute-1.amazonaws.com,23.22.234.113' (RSA) to the list of known hosts.

It looks like I generated and saved my key pair right, and specified it correctly when creating the instance.

The Fedora instances don't use the root user as the primary remote login. Instead, there's an ec2-user account which has sudo ALL permissions.  That is, the ec2-user account can use sudo without providing a password. This really just gives you a little separation and forces you to think before you take some action as root.
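That policy comes from a sudoers drop-in baked into the image; the effective rule looks something like this (the drop-in file name varies by image and is an assumption here):

```
# /etc/sudoers.d/cloud-init (file name is an assumption)
ec2-user ALL=(ALL) NOPASSWD:ALL
```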

Getting a Static IP Address


Now I have a host running and I can get into it, but the hostname is some long abstract string in the EC2 amazonaws.com  domain.  I want MY name on it.  I also want to be able to reboot the host and have it get the same IP address and name. Well, it's not quite that simple.

Amazon EC2 has a curious and wonderful feature.  Each running instance actually has two IP addresses associated with it.  One is the internal IP address (the one configured in eth0).  But that's in an RFC 1918 private network space.  You can't route it.  You can't reach it.  You could even have a duplicate inside your corporate or home network.

The second address is an external IP address and this is the one you can see and can route to.  Amazon works some router table magic at the network border to establish the connection between the internal and external addresses.  What this means is that EC2 can change your external IP address without doing a thing to the host behind it.  This is where Elastic IP addresses come in.

As with all of these things, you can do it from the web interface, but since I'm trying to automate things, I've made a set of tasks to manipulate the elastic IPs.  I'm lazy and there's no other kind of IP in EC2 that I can change, so the tasks are in the ec2:ip namespace.

Creating a new IP is pretty much what you'd expect. You're not allowed to specify anything about it so it's as simple as can be:

thor ec2:ip:create
task: ec2:ip:create
184.72.228.220

Once again, not very exciting. Since each IP must be unique, the address itself serves as an ID. An address isn't very useful until it's associated with a running instance. The ipaddress task can retrieve the IP address of an instance. It can also set the external IP address (the address must be an allocated Elastic IP).

thor ec2:instance:ipaddress 184.72.228.220 --id i-d8c912bb
task:  ec2:instance:ipaddress 184.72.228.220

You can get the status and more information about an instance. You can also request the status using the instance name rather than the ID. For objects which have both an ID and a name, you can query using either one, but you must indicate which with an option (--id or --name). For objects like IP addresses, which have no name, the ID is the first argument of any query.

thor ec2:instance:info --name puppet --verbose
EC2 Instance: i-d8c912bb (puppet)
  DNS Name: ec2-184-72-228-220.compute-1.amazonaws.com
  IP Address: 184.72.228.220
  Status: running
  Image: ami-b71078de
  Platform: 
  Private IP: 10.212.234.234
  Private Hostname: ip-10-212-234-234.ec2.internal
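That Private IP sits in the 10.0.0.0/8 RFC 1918 block mentioned earlier, which is easy to verify with Ruby's stdlib IPAddr class. This is just a quick sanity check for illustration, not part of the toolset:

```ruby
require 'ipaddr'

# The three RFC 1918 private ranges; EC2 internal addresses fall in 10/8.
rfc1918 = %w[10.0.0.0/8 172.16.0.0/12 192.168.0.0/16].map { |c| IPAddr.new(c) }

private_ip = IPAddr.new('10.212.234.234')  # the instance's internal address
public_ip  = IPAddr.new('184.72.228.220')  # the Elastic IP

puts rfc1918.any? { |net| net.include?(private_ip) }  # => true
puts rfc1918.any? { |net| net.include?(public_ip) }   # => false
```

So the internal address is unroutable from outside, and only the Elastic IP is reachable.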

And now for something completely different: Route53 and DNS

I now have a running host with the operating system and architecture I want. It has a fixed address. But it has a really funny domain name.

When I created my Route53 zones, I split them in two. infra.example.org will contain my service hosts. app.example.org will contain the application CNAME records. The broker will only have permission to change the application zone, so it won't be able to damage the infrastructure either through a compromise or a bug.

I'm going to call the puppet master puppet.infra.example.org. It will have the IP address I was granted above.

All of the previous tasks were in the ec2: namespace. Route53 is actually a different service within AWS, so it gets its own namespace.

A DNS record has four components:

  • type
  • name
  • value
  • ttl (time to live, in seconds)

All of the infrastructure records will be A (address) records. The TTL has a sensible default and there's generally no reason to override it. The value of an A record is an IP address.

The name in an A record is a Fully Qualified Domain Name (FQDN). It has both the domain suffix and the hostname and any sub-domain parts. To save some trouble parsing, the route53:record:create task expects the zone first, and the host part next as a separate argument. The last two arguments are the type and value.
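The split the task wants can be sketched in a few lines of Ruby. The helper name here is made up for illustration; it is not the actual origin-setup code:

```ruby
# Hypothetical helper: given a zone, extract the bare host part that
# route53:record:create expects as its second argument.
def host_part(fqdn, zone)
  fqdn = fqdn.chomp('.')
  zone = zone.chomp('.')
  unless fqdn.end_with?(".#{zone}")
    raise ArgumentError, "#{fqdn} is not in zone #{zone}"
  end
  fqdn[0...-(zone.length + 1)]   # strip ".zone" from the end
end

puts host_part('puppet.infra.example.org', 'infra.example.org')  # => puppet
```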

thor route53:record:create infra.example.org puppet a 184.72.228.220
task: route53:record:create infra.example.org puppet a 184.72.228.220

Also pretty anti-climactic. This time though there will be an external effect.

First, I can list the contents of the infra.example.org zone from Route53. Then I can also query the A record from DNS, though this may take some time to be available.

thor route53:record:get infra.example.org puppet A 
task: route53:record:get infra.example.org puppet A
puppet.infra.example.org. A
  184.72.228.220

And the same when viewed with host:

host puppet.infra.example.org
puppet.infra.example.org has address 184.72.228.220

The SOA records for AWS Route53 have a TTL of 900 seconds (15 minutes).  When you add or remove a record from a zone, you also cause an update to the SOA record serial number. Between you and Amazon there are almost certainly one or more caching nameservers and they will only refresh their cache when the SOA TTL expires. So you could experience a delay of up to 15 minutes from the time that you create a new record in a zone and when it resolves. I'm hoping this doesn't hold true for individual records, because it's going to cause problems for OpenShift.

You can check the TTL of the SOA record by requesting the record directly using dig:


dig infra.example.org soa

; <<>> DiG 9.9.2-rl.028.23-P2-RedHat-9.9.2-10.P2.fc18 <<>> infra.example.org soa
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 60006
;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 4096
;; QUESTION SECTION:
;infra.example.org.  IN SOA

;; ANSWER SECTION:
infra.example.org. 900 IN SOA ns-1450.awsdns-53.org. awsdns-hostmaster.amazon.com. 1 7200 900 1209600 86400

;; Query time: 222 msec
;; SERVER: 172.30.42.65#53(172.30.42.65)
;; WHEN: Wed May 29 18:46:46 2013
;; MSG SIZE  rcvd: 130

The '900' on the first line of the answer section is the record TTL.
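The SOA RDATA packs several timers into that one answer line, and two of the numbers happen to be 900, so it helps to know which field is which. Splitting the line shows the layout (plain string parsing, just for illustration):

```ruby
# The dig answer line, fields as: name ttl class type, then the SOA RDATA:
# mname rname serial refresh retry expire minimum
answer = 'infra.example.org. 900 IN SOA ns-1450.awsdns-53.org. ' \
         'awsdns-hostmaster.amazon.com. 1 7200 900 1209600 86400'

name, ttl, klass, type, mname, rname,
  serial, refresh, retry_t, expire, minimum = answer.split

puts ttl      # => 900   (the record TTL discussed above)
puts serial   # => 1     (bumped on every zone change)
puts minimum  # => 86400 (the negative-caching TTL)
```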

Wrapping it all up.


The beauty of Thor is that you can take each of the tasks defined above and compose them into more complex tasks.  You can invoke each task individually from the command line or you can invoke the composed task and observe the process.

Because this task uses several others from both EC2 and Route53, I put it under a different namespace. All of the specific composed tasks will go in the origin: namespace.

The composed task is called origin:baseinstance. At the top I know the fully qualified domain name of the host, the image and securitygroups that I want to use to create the instance. Since I already have the puppet master this one will be the broker.

  • hostname: broker.infra.example.org
  • image: ami-b71078de
  • instance type: t1.micro
  • securitygroups: default, broker
  • key pair name: <mykeypair>

thor origin:baseinstance broker --hostname broker.infra.example.org --image ami-b71078de --type t1.micro --keypair <mykeypair> --securitygroup default broker 
task: origin:baseinstance broker
task: ec2:ip:create
184.73.182.10
task: route53:zone:contains broker.infra.example.org
Z1PLM62Y00LCIN infra.example.org.
task: route53:record:create infra.example.org. broker A 184.73.182.10
- image id: ami-b71078de
task: ec2:instance:create ami-b71078de broker
  id = i-19b1f576
task: remote:available ec2-54-226-116-229.compute-1.amazonaws.com
task: ec2:ip:associate 184.73.182.10 i-19b1f576

This process takes about two minutes. If you add --verbose you can see more of what is happening. There is a delay waiting for the A record creation to sync so that you don't accidentally create negative cache records which can slow propagation. Also you can see the remote:available task which polls a host for SSH login access. This allows time for the instance to be created, start running and reach multi-user network state.
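The remote:available task boils down to a poll-with-deadline loop. Here's a minimal sketch of that pattern in generic Ruby; the method name is my own and this is not the actual origin-setup code:

```ruby
# Retry a block until it returns truthy, or raise once a deadline passes.
def wait_until(timeout: 120, interval: 5)
  deadline = Time.now + timeout
  until yield
    raise "timed out after #{timeout}s" if Time.now >= deadline
    sleep interval
  end
  true
end

# The real task's check would be an SSH probe, something like:
#   wait_until { system('ssh', '-o', 'ConnectTimeout=5', host, 'true') }
# Demo with a stub check that succeeds on the third try:
attempts = 0
wait_until(interval: 0) { (attempts += 1) >= 3 }
puts attempts  # => 3
```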

ssh ec2-user@broker.infra.example.org
The authenticity of host 'broker.infra.example.org (184.73.182.10)' can't be established.
RSA key fingerprint is 8f:db:46:25:bf:19:2e:47:f5:f4:4a:23:a5:98:e3:5c.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added 'broker.infra.example.org,184.73.182.10' (RSA) to the list of known hosts.
Last login: Thu May 30 11:37:08 2013 from 66.187.233.206

I will duplicate this process for the data and message servers, and for one node to begin.
My tier of AWS only allows 5 Elastic IP addresses, so I'm at my limit.   For a real production setup, only the broker, nodes and possibly the puppet master require fixed IP addresses and public DNS.  The datastore and message servers could use dynamic addresses, but then they will require some tweaking on restart.  I'm sure Amazon will give you more IP addresses for money, but I haven't looked into it.

Summary


There's a lot packed into this post:
  • Select an image to use as a base
  • Manage IP addresses
  • Bind IP addresses to running instances
  • Create a running instance.
All of this can be done with the AWS console. The ec2 and route53 tasks just make it a little easier, and the origin:baseinstance task wraps it all up so that creating new bare hosts is a single step.

In the next post I'll establish the puppet master service on the puppet server and install a puppet agent on each of the other infrastructure hosts.  From then all of the service management will happen in puppet and we can let EC2 fade into the background.

Tuesday, May 28, 2013

OpenShift on AWS EC2, Part 3: Getting In and Out (securitygroups)

In the previous two posts, I talked about tools to manage AWS EC2 with a CLI toolset, and preparing AWS Route53 so that the OpenShift broker will be able to publish new applications.  There is one more facet of EC2 that needs to be addressed before trying to start the instances which will host the OpenShift service components.

AWS EC2 provides (enforces?) network port filtering.  The filter rule sets are called securitygroups. AWS also offers two forms of EC2, "classic" and "VPC" (virtual private cloud), and managing securitygroups is a little different in each.  I'm going to present securitygroups in EC2-Classic.  If you're going to use EC2-VPC, you'll need to read the Amazon documentation and adapt your processes to the VPC behaviors.  Also note that securitygroups have a scope: they can be applied only in the region in which they are defined.

In EC2-Classic you must associate all of the securitygroups with a new instance when you launch it (create it from an image).  You cannot change the set of securitygroups associated with an instance later.  You can change the rulesets in the securitygroups, and the new rules will be applied immediately to all of the members of the securitygroup.

Amazon provides a default securitygroup which basically blocks all inbound traffic to the members (but not traffic *between* members).  To make OpenShift work we will need a set of securitygroups which allow communications between the OpenShift broker and the back-end services, and between the broker and nodes (through some form of messaging).  We will also need to allow external access to the OpenShift broker (for control) and to the nodes (for user access to the applications).

The creation of the securitygroups probably does not need to be automated.  The securitygroups will be created and the rulesets defined only once for a given OpenShift service.  The web interface is probably appropriate for this.

Since we'll be creating the instances with the CLI, it will be necessary to be able to list, examine, and apply the securitygroups to new instances there as well.

NOTE: These are not the security settings you are looking for.

The securitygroups and rulesets shown here are designed to demonstrate the securitygroup features and the user interface used to manage them.  They are not designed with an eye to the best possible function and security for your service. You must look at your service design and requirements to create the best group and rulesets for your service.

Most people focus on the inbound (ingress) filtering rules.  I'm going to go with that.  I won't be defining any outbound (egress) rule sets.

I expect to need a different group for each type of host:

  • OpenShift broker
  • OpenShift node
  • datastore
  • message broker
  • puppetmaster

In addition I'm going to manage the service hosts with Puppet using a puppetmaster host.  Each of the service hosts will be a puppet client.  I don't think the puppet agent needs any special rules so I only have one additional securitygroup.

If I also planned to use an external authentication service on the broker, I would need a securitygroup for that.  I could also extend this set to include build and test servers for development of OpenShift itself.

Defining Securitygroups

Most of the groups below have only a single rule.  To be rigorous I could add the SSH (22/TCP) rule to the node securitygroup; it is actually required for the operation of the node, not just for administrative remote access.

securitygroup   service          port/proto           source                                   comments
default         SSH              22/TCP               OpenShift Ops                            remote access and control
puppetmaster    puppetmaster     8140/TCP             all managed hosts                        configuration management
datastore       mongodb          27017/TCP            OpenShift broker hosts                   NoSQL DB
messagebroker   activemq/stomp   61613/TCP            OpenShift broker and node hosts          carries MCollective
broker          httpd (apache2)  80/TCP, 443/TCP      OpenShift Ops and users (unrestricted)   Ruby on Rails and Passenger
node            httpd (apache2)  80/TCP, 443/TCP      OpenShift app users (unrestricted)       HTTP routing
node            web sockets      8000/TCP, 8443/TCP   OpenShift app users (unrestricted)       web sockets
node            SSH              22/TCP               OpenShift app users (unrestricted)       shell and app control


Populating each securitygroup is a two step process.  First create the empty security group.  Then add the rules to the group.  At that point, the group is ready to be applied to new instances.
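The shape of that two-step flow can be modeled with a toy class. This is emphatically not the AWS API — just the create-then-authorize sequence, sketched for illustration:

```ruby
# Toy model of the two-step process: create an empty group, then add rules.
SecurityRule = Struct.new(:protocol, :ports, :source)

class ToyGroup
  attr_reader :name, :rules

  def initialize(name, description = '')           # step 1: empty group
    @name, @description, @rules = name, description, []
  end

  def add_rule(protocol, ports, source = '0.0.0.0/0')  # step 2: add rules
    @rules << SecurityRule.new(protocol, ports, source)
    self
  end
end

broker = ToyGroup.new('broker', 'OpenShift broker hosts')
broker.add_rule(:tcp, 80..80).add_rule(:tcp, 443..443)
puts broker.rules.size  # => 2
```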

Creating a Securitygroup

Each security group starts with a name and an optional description string.  The restrictions on the names differ between EC2-Classic and EC2-VPC securitygroups; see the Amazon documentation for the differences. Simple upper/lower case strings with no white space are allowed in both.  The descriptions are more freeform.

You can add new securitygroups on the AWS EC2 console page.  Select the "Security Groups" tab on the left side and click "Create Security Group".  Fill in the name and description fields, make sure that the VPC selector indicates "No VPC" and click "Yes, Create".


Adding Rulesets

Securitygroup rulesets are one of the more complex elements in EC2. When using the web interface, Amazon provides a set of pre-defined rules for things like HTTP and SSH and common database connections. You should use them when they're appropriate.  The web interface also allows you to create custom rulesets.


There are several things to note about this display.  The default group has three mandatory rules (blue and white bars in the lower right).  These allow all of the members of the group unrestricted access to each other.

I'm adding the SSH rule which allows inbound port 22 connections.  I'm leaving the source as the default 0.0.0.0/0.   This is the IPv4 notation for "everything", so there will be no restrictions on the source of inbound SSH connections.  If you want to restrict SSH access so that connections come only from your corporate network, you can set your company's external address range there.
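0.0.0.0/0 really does match every IPv4 address, which Ruby's stdlib IPAddr makes easy to demonstrate. The 203.0.113.0/24 range below is a documentation example standing in for a corporate network, not a real one:

```ruby
require 'ipaddr'

anywhere  = IPAddr.new('0.0.0.0/0')       # the default source: no restriction
corp_only = IPAddr.new('203.0.113.0/24')  # example corporate exit range

puts anywhere.include?('8.8.8.8')         # => true  (any source is allowed)
puts corp_only.include?('8.8.8.8')        # => false (blocked)
puts corp_only.include?('203.0.113.42')   # => true  (allowed)
```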

Since the members of the default group have unrestricted access to each other and since I'm going to apply the default group to all of my instances, it turns out that I only need special rules for access to hosts from the outside.  I need to add the SSH rule above, and I need to allow web access to the broker and node hosts. I am going to create these as distinct groups because I can't change the assigned groups for an instance after it is launched.  I'd like the ability to restrict access to the broker later.



If I were to apply rigorous security to this setup, I would avoid using the default group.  Instead I would create a distinct group for each service component.  Then I would add rulesets which allow only the required communications.  This would decrease the risk that a compromise of one host would grant access to the rest of the service hosts.

Since it's a one-time task, I created both of my securitygroups and rulesets using the web interface. I have written Thor tasks to create and populate securitygroups:

 thor help ec2:securitygroup
Tasks:
  thor ec2:securitygroup:create NAME                        # create a new se...
  thor ec2:securitygroup:delete                             # delete the secu...
  thor ec2:securitygroup:help [TASK]                        # Describe availa...
  thor ec2:securitygroup:info                               # retrieve and re...
  thor ec2:securitygroup:list                               # list the availa...
  thor ec2:securitygroup:rule:add PROTOCOL PORTS [SOURCES]  # add a permissio...
  thor ec2:securitygroup:rules                              # list the rules ...

Options:
  [--verbose]  


The list of tasks is incomplete, as I have not needed to change or delete rulesets.  If I find that I need those tasks, I'll add them.

Next Up

This is everything that must be done before beginning to create running instances for my OpenShift service. In the next post I'll select a base image to use for my host instances and begin creating running machines.


Thursday, May 23, 2013

OpenShift on AWS EC2, Part 1: From the wheels up

Someone asked me recently how to build an  OpenShift Origin service on Amazon Web Services EC2.  My first thought was "easy, we do this all the time".  I started going through what exists for our own testing, development and deployment.  It clearly works, it's clearly the place to start, right?  Just fire up a few instances, tweak the existing puppet configs and zoom! right?

Then I started trying to figure out how to describe it and adapt it to general use, and I found myself adding more and more caveats and limitations and internal assumptions.  It's grown organically to do what is needed but what I have available isn't really designed for general use.   Some of it I couldn't understand just from reading and observing (since I'm kind of a hands on break-it-to-understand-it kind of guy).  Time to start taking it apart so I can put it back together.  When I can do that and it starts up when I turn the key, then I can claim to understand it.

So I decided to go back to the fundamentals not of OpenShift, but of AWS EC2 itself.

Defining a Goal: Machines Ready To Eat


An OpenShift service consists of a number of component services.  Ideally each component would have multiple instances for availability and scaling, but that's not required for initial setup.  Only the OpenShift broker, console and nodes need to be exposed to the users.

The host configuration is complex enough that even for a small service it is best to use a Configuration Management System (CMS) to configure and manage the system, but the CMS can't start work until the hosts exist and have network communications.  The CMS itself must be installed and configured.  Once the hosts exist and are bound together then the CMS can do the rest of the work and a clean boundary of control and access is established. This will later allow the bottom layer (establishing hosts and installing/configuring the CMS) to be replaced without affecting the actual service installation above.

So the goal here is: create and connect hosts with a CMS installed using EC2.  That's the base on which the OpenShift service will be built. If you run each of the component services on its own host using external DNS and authentication services, OpenShift requires a minimum of four hosts:

  • OpenShift Broker
  • Data Store (mongodb)
  • Message Broker (activemq)
  • OpenShift Node

Each of these can (theoretically, at least) be duplicated to provide high availability, but for now I'll start there.   The goal of this series of posts is to create the hosts on which these services will be installed.  We won't come back to OpenShift itself until that's done.

AWS EC2: Getting the lay of the land


If you're not familiar with AWS EC2, go check out https://aws.amazon.com. EC2 is the part of AWS which provides "virtual" hosts (for a fee, of course).  There are free-to-try levels, but you are required to give a credit card to sign up, and you're very likely to start incurring charges for storage even if you stick to the "free" tier.  Read, be informed, decide for yourself.

AWS without the "W"


AWS presents a modern single-page web interface for all interactions, but I'm interested in command line or scripted interaction.  Amazon does provide a REST protocol and has implemented libraries for a wide number of scripting languages.  I'm using the rubygem-aws-sdk library (which is, surprisingly enough, written in Ruby) because I also want to use another Ruby tool called Thor.

Tasks and the Command Line Interface


Thor is a ruby library which helps create really nice command line "tasks". The beauty of Thor is that you can use it both to define individual tasks and to compose those tasks into more complex task sequences.  This allows you to test each step as a distinct CLI operation and also to debug only the step that fails when one inevitably does.

I'm going to use Thor and the aws-sdk to create a CLI interface to the AWS low level operations, and then compose them to create higher level tasks which, in the end, will leave me with a set of hosts ready to receive an OpenShift service.
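The composition idea — small named tasks that higher-level tasks invoke in sequence, echoing each step as they go — can be sketched without Thor itself. This is a toy runner mimicking the "task: …" output style, not the real origin-setup code:

```ruby
# Toy task runner: each invocation announces itself like the real output
# ("task: ec2:ip:create"), and composed tasks simply run other tasks.
TASKS = {}

def task(name, &body)
  TASKS[name] = body
end

def run(name, *args)
  puts "task: #{name}"
  TASKS[name].call(*args)
end

task('ec2:ip:create')         { '184.73.182.10' }  # stubbed result
task('route53:record:create') { |zone, host, ip| "#{host}.#{zone} -> #{ip}" }

task('origin:baseinstance') do |zone, host|
  ip = run('ec2:ip:create')
  run('route53:record:create', zone, host, ip)
end

puts run('origin:baseinstance', 'infra.example.org', 'broker')
```

Each low-level task stays independently runnable and testable, and the composed task is just the sequence — the same property Thor gives the real toolset.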

I'm not going to try to create a comprehensive CLI interface to AWS.  I'm only going to create the steps that I need to get this job done.  A number of the steps will encapsulate operations which may seem trivial, but this will allow for better consistency and visibility of the operations.  A primary goal is to have as little magic as possible. At the same time, I want to avoid overwhelming the user (me) with unnecessary detail when things are working as planned.

I'm not going to make you sit through the entire development process (which isn't complete).  Instead I mean to show the tools that I've developed and use them to cleanly define the base on which an OpenShift service would sit.

AWS Setup


To work with AWS, you must have an established account.  To use the REST API, you need to have generated a set of access keys.  To log into your EC2 instances, you need to have generated an SSH key pair, placed it where your SSH client can find it (usually in $HOME/.ssh), and configured your SSH client to use those keys when logging into EC2 instances (in $HOME/.ssh/config).


  • AWS Access Keys
  • AWS SSH Key Pairs
  • SSH client configuration

You can learn about and generate both sets of keys on the AWS Security Credentials page.




Origin-Setup (really EC2 and SSH tools)


The tool set is currently called origin-setup and it resides in a repository on Github.  The name is a misnomer, there's not actually any OpenShift in most of it.

Requirements


The tasks are written in Ruby using the Thor library.  They also require several other rubygems.  All of them are available on Fedora 18 as RPMs.

  • ruby
  • rubygems
  • rubygem-thor
  • rubygem-aws-sdk
  • rubygem-parseconfig
  • rubygem-net-ssh
  • rubygem-net-scp

Getting (and setting) the Bits


Thor can be used to create stand-alone CLI commands, but I have not done that yet for these tasks. To use them you need to cd into the origin-setup directory and call thor directly.  You will also need to set the RUBYLIB path to find a small helper library which manages the AWS authentication.

git clone https://github.com/markllama/origin-setup
cd origin-setup
export RUBYLIB=`pwd`/lib
thor list --all

AWS Again: configuring the toolset


The final step is to give the origin-setup toolset the information needed to communicate with the AWS REST interface. This goes in the ~/.awscred file in your home directory:

AWSAccessKeyId=YOURKEYIDHERE
AWSSecretKey=YOURSECRETKEYHERE
AWSKeyPairName=YOURKEYPAIRNAMEHERE
RemoteUser=ec2-user
AWSEC2Type=t1.micro

This file contains what are essentially the passwords to your AWS account.  You should set the permissions on this file so that only you can read it, and protect the contents as you would your credit card.

The RemoteUser is the default user for SSH logins (F18+).  For RHEL6 it would be root.  The AWSEC2Type value defines the default instance "type" to be created when you create a new instance.  The t1.micro instance type is small and it is in the free tier.  You will need to choose a larger type for real use.
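Reading that key=value file is straightforward. The real tasks lean on rubygem-parseconfig; this plain-Ruby version just shows the idea, using a temp file in place of ~/.awscred:

```ruby
require 'tempfile'

# Parse simple KEY=VALUE lines, skipping blanks and comments.
def read_awscred(path)
  File.readlines(path).each_with_object({}) do |line, cfg|
    line = line.strip
    next if line.empty? || line.start_with?('#')
    key, value = line.split('=', 2)
    cfg[key] = value
  end
end

cfg = nil
Tempfile.create('awscred') do |f|
  f.puts 'AWSAccessKeyId=YOURKEYIDHERE'
  f.puts 'RemoteUser=ec2-user'
  f.flush
  cfg = read_awscred(f.path)
end

puts cfg['RemoteUser']  # => ec2-user
```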

Turn the Key

You should be able to use the thor command to explore the list of available tasks.  Thor allows the creation of namespaces to contain related tasks.  Most of the important tasks to begin with are in the ec2 namespace.

You can see the available tasks with the thor list command:

thor list ec2 --all
ec2
---
thor ec2:image:create                                     # Create a new imag...
thor ec2:image:delete                                     # Delete an existin...
thor ec2:image:find TAGNAME                               # find the id of im...
thor ec2:image:info                                       # retrieve informat...
thor ec2:image:list                                       # list the availabl...
thor ec2:image:tag --tag=TAG                              # set or retrieve i...
thor ec2:instance:create --image=IMAGE --name=NAME        # create a new EC2 ...
thor ec2:instance:delete                                  # delete an EC2 ins...
thor ec2:instance:hostname                                # print the hostnam...
thor ec2:instance:info                                    # get information a...
thor ec2:instance:ipaddress [IPADDR]                      # set or get the ex...
thor ec2:instance:list                                    # list the set of r...
thor ec2:instance:private_hostname                        # print the interna...
thor ec2:instance:private_ipaddress                       # print the interna...
thor ec2:instance:rename --newname=NEWNAME                # rename an EC2 ins...
thor ec2:instance:start                                   # start an existing...
thor ec2:instance:status                                  # get status of an ...
thor ec2:instance:stop                                    # stop a running EC...
thor ec2:instance:tag --tag=TAG                           # set or retrieve i...
thor ec2:instance:wait                                    # wait until an ins...
thor ec2:ip:associate IPADDR INSTANCE                     # associate and Ela...
thor ec2:ip:create                                        # create a new elas...
thor ec2:ip:delete IPADDR                                 # delete an elastic IP
thor ec2:ip:list                                          # list the defined ...
thor ec2:securitygroup:create NAME                        # create a new secu...
thor ec2:securitygroup:delete                             # delete the securi...
thor ec2:securitygroup:info                               # retrieve and repo...
thor ec2:securitygroup:list                               # list the availabl...
thor ec2:securitygroup:rule:add PROTOCOL PORTS [SOURCES]  # add a permission ...
thor ec2:snapshot:delete SNAPSHOT                         # delete the snapshot
thor ec2:snapshot:list                                    # list the availabl...
thor ec2:volume:delete VOLUME                             # delete the volume
thor ec2:volume:list                                      # list the availabl...


It's time to see if you can talk to EC2.  This first query requests a list of images produced by the Fedora hosted team:

thor ec2:image:list --name \*Fedora\* --owner 125523088429
ami-2509664c Fedora-x86_64-17-1-sda
ami-4b0b6422 Fedora-i386-17-1-sda
ami-6f640c06 Fedora-i386-18-20130521-sda
ami-b71078de Fedora-x86_64-18-20130521-sda
ami-d13758b8 Fedora-18-ec2-20130105-x86_64-sda
ami-dd3758b4 Fedora-18-ec2-20130105-i386-sda
ami-ed375884 Fedora-17-ec2-20120515-i386-sda
ami-fd375894 Fedora-17-ec2-20120515-x86_64-sda

If instead you get a really long messy ruby error, then check the permissions and contents of your ~/.awscred file.

It's probably a good idea, before experimenting too much here, to get familiar with EC2 and Route53 using the web console a bit.

Next post I'll establish the DNS zone in Route53 and show how to manage DNS records to prepare for my OpenShift service.

References


  • AWS EC2 Console - managing remote virtual machines
  • AWS Route53 (DNS) Console - managing DNS
  • rubygem-aws-sdk - an implementation of the AWS REST protocol in Ruby
  • SSH publickey - secure login without passwords
  • Thor - A ruby gem to build command line interface "tasks"
  • Puppet - A popular Configuration Management System
  • Git - a popular Source Code Management system
  • Github - a site for keeping Git repositories
  • origin-setup - a set of Thor tasks for managing AWS EC2 and Route53
    With a goal of automating the creation of an OpenShift Origin service in EC2