Under The Hood of Cloud Computing<br />
<br />
<h1>
A beginner's perspective on OpenStack Designate</h1>
I'm at the <a href="https://www.openstack.org/summit/barcelona-2016">OpenStack Summit</a> this week in Barcelona. Beautiful place, but a conference center is a conference center.<br />
<br />
My first session today was an introduction and discussion of Designate, the OpenStack DNS control module.<br />
<br />
I've been working with OpenStack for VM instances and with OpenShift to run container services in the cloud. One major issue that always gets back-burnered in discussions is DNS. I refer to this as <i>publication</i>, an unusual term but I think the best one to describe a critical aspect of IaaS and PaaS cloud services.<br />
<br />The point of these services is self-serve computing resources, most often services you can offer to others. If you have no way of telling others where to find your services... they can't. DNS is the way you tell people where to find your stuff.<br />
<br />
Historically the DNS service has been managed by an IT "priesthood" who are rightfully protective. DNS is the first and most critical service on any modern network. It's largely invisible and it works so well that most sysadmins don't actually understand how it works. DNS is one of the last services to fall to the self-service mind-set of cloud computing and that's with good reason.<br />
<br />
I was under the misconception that Designate would be solely a dynamic DNS service that would be used to publish new instances or containers within the service. I also thought perhaps it had its own front end to respond to queries. It quickly became clear that Designate is not very useful without those external front-line services.<br />
<br />
Listening to the talk it became clear that the Designate developers also see this conservatism as a barrier to adoption. A significant portion of the talk was dedicated to creating roll-out plans that build confidence slowly, absorbing more and more of the wild barnyard of existing services.<br />
<br />
Designate seems to be more of a control plane and database for DNS services than an actual front-line server responding to queries. You continue to run BIND or Active Directory/DNS or Infoblox to respond to queries, but the database is stored in the OpenStack service (with a back end DB?) and the database propagates to the caching or front end DNS services.<br />
<br />
This leads to the idea of Designate eventually taking over control of all of the DNS services in an enterprise. It has the capability to define roles for users, allowing fine-grained control of what actions users can take, while offering for the first time a true kiosk self-service IP naming mechanism.<br />
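<br />
As a sketch of what that kiosk workflow might look like from the command line (assuming the python-designateclient plugin for the openstack CLI; the zone, e-mail address and IP below are made up):<br />
<br />
<pre class="brush:bash ; title: 'Hypothetical Designate self-service workflow'"># Create a zone I own, then publish a record for a new service
openstack zone create --email dns-admin@example.com apps.example.com.
openstack recordset create apps.example.com. www --type A --record 203.0.113.10
openstack recordset list apps.example.com.
</pre>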
<br />
I know how I plan to use Designate in my OpenShift and OpenStack service deployments. It appears I may still need to create backing DNS servers, but I'll at least get WebUI and API change management for the masses. I've used <span style="font-family: Courier New, Courier, monospace;">nsupdate</span><span style="font-family: Arial, Helvetica, sans-serif;"> to create dynamic DNS zones before, but it always seemed to scare other people off. With Designate I'm going to be able to deploy both my new services and containers within them and publish them with the short turn-around a cloud service demands.</span><br />
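<br />
For comparison, here is roughly what the <span style="font-family: Courier New, Courier, monospace;">nsupdate</span> approach looks like (a sketch; the server, zone and key file are placeholders):<br />
<br />
<pre class="brush:bash ; title: 'Dynamic DNS update with nsupdate (sketch)'"># Push a single A record into a dynamic zone using a TSIG key
nsupdate -k /etc/named/apps.key <<'EOF'
server ns1.example.com
zone apps.example.com
update add myapp.apps.example.com 300 A 203.0.113.20
send
EOF
</pre>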
<span style="font-family: Arial, Helvetica, sans-serif;"><br /></span>
<br />
<ul>
<li><span style="font-family: Arial, Helvetica, sans-serif;"><a href="http://docs.openstack.org/developer/designate/">OpenStack Designate</a></span></li>
</ul>
<h1>
Storage Concepts in Docker: Network and Cloud Storage</h1>
This is the third and final post on storage in Docker. It's also going to be the most abstract, since much of what it covers is still wishful thinking.<br />
<br />
<ul>
<li><a href="http://cloud-mechanic.blogspot.com/2014/10/storage-concepts-in-docker.html">Shared Storage</a></li>
<li><a href="http://cloud-mechanic.blogspot.com/2014/10/storage-concepts-in-docker-persistent.html">Persistent (local) storage</a></li>
<li>Network and Cloud Storage (this post)</li>
</ul>
<div>
<br /></div>
The previous two posts dealt with shared internal storage and persistent host storage in Docker. These two mechanisms allow you to share storage on a single host. While this has its uses, very quickly people find that they need more than local storage.<br />
<br />
<h2>
Types of Storage in Containers</h2>
<div>
Lots of people are talking about storage in Docker containers. Not many are careful to qualify what they mean by that. Some of the conversation is getting confused because different people have different goals for storage in containers.<br />
<br />
<h3>
Docker Internal Storage</h3>
</div>
<div>
This is the simplest form of storage in Docker. Each container has its own space on the host. This is inside the container and it is temporary, being created when the container is instantiated and removed some time after the container is terminated. When two containers reside on the same host they can share this docker-internal storage.<br />
<br /></div>
<h3>
Host Storage</h3>
<div>
Containers can be configured to use host storage. The space must be allocated and configured on the host so that the processes within the containers will have the necessary permissions to read and write to the host storage. Again, containers on the same host can share storage.<br />
<br /></div>
<h3>
Network Storage</h3>
<div>
Or "Network Attached Storage" (NAS) in which I slovenly include Storage Area Networks (SAN).<br />
I'm also including modern storage services like Gluster and Ceph. For container purposes these are the same thing: Storage which is not directly attached via the SCSI or SATA bus, but rather over an IP network but which, once mounted appears to the host as a block device.<br />
<br />
If you are running your minions in an environment where you can configure NAS <u>universally</u> then you may be able to use network storage within your Kubernetes cluster.<br />
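<br />
A minimal sketch of that pattern, assuming an NFS export that every minion can reach (the server and paths are placeholders):<br />
<br />
<pre class="brush:bash ; title: 'NFS mounted identically on every minion (sketch)'"># On every minion: the same export on the same mount point
sudo mount -t nfs filer.example.com:/export/pulp /mnt/pulp

# Any container on any minion can then import the same tree
docker run -d --volume /mnt/pulp:/var/lib/pulp markllama/pulp-content
</pre>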
<br />
Remember that Docker runs as root on each minion. You may find that there are issues related to differences in the user database between the containers, minions and storage. Until the cgroup user namespace work is finished and integrated with Docker, unifying UID/GID maps will be a problem that requires attention when building containers and deploying them.<br />
<br /></div>
<h3>
Cloud Storage</h3>
<div>
Cloud storage is... well, not the other kinds. It's generally offered in a "storage as a service" model. Most people think of Amazon AWS storage (EBS and S3) but Google is growing its cloud storage and OpenStack offers the possibility of creating on-premise cloud storage services as well.<br />
<br />
Cloud storage generally takes two forms. The first is good old-fashioned block storage. The other is newer and is known as object storage. They have different behaviors and use characteristics.<br />
<br /></div>
<h4>
Block Storage</h4>
<div>
Once it is attached to a host, cloud block storage is indistinguishable from direct attached storage. You can use disk utilities to partition it and create filesystems. You can mount it so that the filesystem appears within the host file tree.<br />
<br />
Block storage requires very low latency. This means that it is generally limited to relatively local networks. It works fine within the infrastructure of a cloud service such as AWS or OpenStack, but running block storage over wide area networks is often difficult and prone to failure.<br />
<br />
Block storage is attached to the host and then the docker VOLUME mechanism is used to import the storage tree into one or more containers. If the storage is mounted automatically and uniformly on every minion (and that information is public) then it is possible to use block storage in clusters of container hosts.<br />
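<br />
As a sketch, with OpenStack Cinder (the server name, volume ID and paths are placeholders):<br />
<br />
<pre class="brush:bash ; title: 'Cloud block storage into a container (sketch)'"># Attach the Cinder volume to the instance that will run the container
nova volume-attach minion-01 VOLUME_ID /dev/vdb

# On the minion the attached volume behaves like any local disk
sudo mkfs -t ext4 /dev/vdb
sudo mkdir -p /srv/pulp
sudo mount /dev/vdb /srv/pulp

# Import it into a container with the usual volume mapping
docker run -d --volume /srv/pulp:/var/lib/pulp markllama/pulp-content
</pre>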
<br /></div>
<h4>
Object Storage</h4>
<div>
Object storage is a relatively new idea. For files with a long life that do not change often and can be retrieved as a unit, object storage is often a good fit. It's also a good place to keep configuration information which is too large or sensitive to be placed in an environment variable or CLI argument.<br />
<br />
OpenStack Swift, AWS S3 and Google Cloud Storage are examples of open source and commercial object stores.</div>
<div>
<br /></div>
<div>
The usage characteristics of object storage make it so that latency is not the kind of issue that it is with block storage.</div>
<div>
<br /></div>
<div>
One other characteristic of object storage makes it really suited to use in containers. Object storage is usually accessed by retrieval over HTTP using a RESTful protocol. This means that the container host does not need to be involved in accessing the contents. So long as the container has the software and the access information for the storage, processes within the container can retrieve it. All that is required is that the container be able to reach the storage service through the host network interface(s). This makes object storage a strong choice for container storage wherever the other characteristics are acceptable.<br />
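<br />
A sketch of what that looks like from inside a container (the endpoints, bucket and paths are placeholders):<br />
<br />
<pre class="brush:bash ; title: 'Fetching content from object storage inside a container (sketch)'"># Plain HTTP(S) retrieval; the host is not involved at all
curl -o /etc/pulp/server.conf https://objects.example.com/v1/AUTH_demo/config/server.conf

# or with the AWS CLI, using credentials passed in as environment variables
aws s3 cp s3://my-config-bucket/server.conf /etc/pulp/server.conf
</pre>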
<br /></div>
<div>
<h2>
Storage and Kubernetes</h2>
</div>
<div>
Pretty much every application will need storage in some form. To build large scale containerized applications it will be essential for Kubernetes to make it possible for the containers to access and share persistent storage. The form that the storage takes will depend on the character of the application and the environment of the cluster.</div>
<div>
<br />
With all of the forms of NAS (remember, I'm being slovenly) the host is involved in accessing and mounting the storage so that it appears to Docker as if it is normal host storage. This means that one of three conditions must be met on the host:<br />
<br />
<ol>
<li>All of the available storage is mounted on all minions before any containers start</li>
<li>The host is configured to automount the storage on the first attempt to access a path</li>
<li>The host is able to accept and act on mount requests from Kubernetes</li>
</ol>
<div>
<br /></div>
<div>
This third option also requires modifications to Kubernetes so that the user can specify the path to the required storage and provide any access/authentication information that will be required by the host.</div>
<div>
<br /></div>
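<div>
Option #1, by contrast, is purely a matter of host configuration. A standing fstab entry on every minion would do (a sketch; the server, export and mount point are placeholders):</div>
<div>
<br />
<pre class="brush:bash ; title: 'Pre-mounting NFS on every minion (sketch)'"># In place before Docker starts any containers
echo "filer.example.com:/export/pulp  /mnt/pulp  nfs  defaults 0 0" | sudo tee -a /etc/fstab
sudo mount -a
</pre>
</div>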
<div>
For cloud block storage the only option is #3 from above. Google has added a mechanism to mount Google Compute Engine Persistent Disk volumes into Kubernetes clusters. The current mechanism (as of 20-Oct-2014) is hard coded. The developers understand that they will need a plugin mechanism to allow adding AWS EBS, OpenStack Cinder and others. I don't think work on any other cloud storage services has begun yet.</div>
<div>
<br /></div>
<div>
Object storage is the shining light. While it has limited use cases, those cases are really common and really important. Object storage access can be built into the image and the only thing the Kubernetes cluster must provide is network access to the object store service. </div>
<div>
<br /></div>
<h2>
Summary</h2>
<div>
Generalized shared and cloud storage within Kubernetes clusters (or any cluster of container hosts) is, at this time, an unsolved problem. Everyone knows it is a top priority and everyone working on the idea of clustered container hosts is thinking about it and experimenting with solutions. I don't think it will be long before some solutions become available and I'm confident that there will be working solutions within the timeframe of <strike>*mumble*</strike>. </div>
<div>
<br /></div>
<div>
For Kubernetes, there is <a href="https://github.com/GoogleCloudPlatform/kubernetes/pull/1515">an open issue</a> discussing persistent storage options and how to design them into the service, both on the back end and the front end (how does one tell Kubernetes how to access storage for containers?)</div>
<div>
<br /></div>
<div>
I'm going to be playing with a few of the possibilities because I'm going to need them. Until they are available, I can create a Pulp service in Kubernetes, but I can't make it persistent. Since the startup cost of creating an RPM mirror is huge, it's not much use except as a demonstrator until persistent storage is available.</div>
<div>
<br /></div>
<h2>
References</h2>
<ul>
<li><a href="https://en.wikipedia.org/wiki/Network-attached_storage">Network Attached Storage</a></li>
<ul>
<li><a href="https://en.wikipedia.org/wiki/Storage_area_network">Storage Area Network</a></li>
<li><a href="https://en.wikipedia.org/wiki/Network_File_System">Network File System</a> (NFS)</li>
<li><a href="http://www.gluster.org/">Gluster</a></li>
<li><a href="http://ceph.com/">Ceph</a></li>
<li><a href="https://en.wikipedia.org/wiki/ISCSI">iSCSI</a></li>
</ul>
<li><a href="http://www.openstack.org/software/openstack-storage/">OpenStack Cloud Storage</a> </li>
<ul>
<li><a href="https://wiki.openstack.org/wiki/Cinder#OpenStack_Block_Storage_.28.22Cinder.22.29">Cinder</a> - block storage</li>
<li><a href="https://wiki.openstack.org/wiki/Swift#OpenStack_Object_Storage_.28.22Swift.22.29">Swift </a>- object storage</li>
</ul>
<li>AWS Cloud Storage</li>
<ul>
<li><a href="https://aws.amazon.com/ebs/">EBS</a> - block storage</li>
<li><a href="https://aws.amazon.com/s3/">S3</a> - object storage</li>
</ul>
<li><a href="https://cloud.google.com/storage/docs">Google Cloud Storage</a></li>
<ul>
<li>Google Compute Engine<a href="https://cloud.google.com/compute/docs/disks#persistentdisks"> Persistent Disks</a> - block storage</li>
<li><a href="https://cloud.google.com/storage/">Google Storage</a> - object storage</li>
</ul>
</ul>
</div>
<h1>
Storage Concepts in Docker: Persistent Storage</h1>
This is the second of three posts on storage management in <a href="http://www.docker.com/">Docker</a>:<br />
<br />
<ul>
<li><a href="http://cloud-mechanic.blogspot.com/2014/10/storage-concepts-in-docker.html">Shared Storage and the VOLUME directive</a></li>
<li>Persistent Storage: the --volume CLI option (this post)</li>
<li>Storage in Kubernetes</li>
</ul>
<div>
<br /></div>
<div>
This is a side trip on my way to creating a containerized <a href="http://pulpproject.org/">Pulp</a> content mirroring service using <a href="http://www.docker.com/">Docker</a> and <a href="https://github.com/GoogleCloudPlatform/kubernetes">Kubernetes</a>. The storage portion is important (and confusing) enough to warrant special attention.</div>
<div>
<br />
<h2>
Persistent Storage</h2>
<div>
In the previous post I talked about the mechanisms that Docker offers for sharing storage between containers. This kind of storage is limited to containers on the same host and it does not survive after the last connected container is destroyed.</div>
<div>
<br /></div>
<div>
If you're running a long-lived service like a database or a file repository you're going to need storage which exists outside the container space and has a life span longer than the container which uses it.</div>
<div>
The Dockerfile VOLUME directive is the mechanism to define where external storage will be mounted inside a container.<br />
<br />
NOTE: I'm only discussing single host local storage. The issues around network storage are still wide open and beyond the scope of a single post.</div>
<div>
<br /></div>
<h2>
Container Views and Context</h2>
<div>
Containers work by providing two different views of the resources on the host. Outside the container, the OS can see everything, but the processes inside are fooled into seeing only what the container writer wants them to see. The problem is not just what they see though, but how they see it.</div>
<div>
<br /></div>
<div>
There are a number of resources which define the view of the OS. The most significant ones for file storage are the user and group databases (in <span style="font-family: Courier New, Courier, monospace;">/etc/passwd</span> and <span style="font-family: Courier New, Courier, monospace;">/etc/group</span>). The OS uses numeric UID and GID values to identify users and decide how to apply permissions. These numeric values are mapped to names using the passwd and group files. The host and containers each have their own copies of these files and the entries in these files will almost certainly differ between the host and the container. The ownership and permissions on the external file tree must be set to match the expectations of the processes which will run in the container.<br />
<br />
SELinux also controls access to file resources. The SELinux labels on the file tree on the host must be set so that system policy will allow the processes inside the container to operate on them as needed.<br />
<br />
In this post most of my effort will be spent looking at the view from inside and adjusting the settings on the file tree outside to allow the container processes to do their work.<br />
<br /></div>
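<div>
A quick way to compare the two views (a sketch; the image name is just an example and the numeric IDs will differ from host to host):<br />
<br />
<pre class="brush:bash ; title: 'Comparing host and container user databases (sketch)'"># The host's idea of a user name and its numeric ID
getent passwd mongodb

# The same name as seen inside an image, by overriding the entrypoint to run id
docker run --rm --entrypoint /usr/bin/id some/mongodb-image mongodb
</pre>
</div>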
<h2>
Dockerfile VOLUME directive Redux</h2>
</div>
<div>
As noted in the previous post, the VOLUME directive defines a boundary in the filesystem within a container. That boundary can be used as a handle to export a portion of the container file tree. It can also be used to mark a place to mount an external filesystem for import to the container.</div>
<div>
<br /></div>
<div class="separator" style="clear: both; text-align: center;">
</div>
<div>
<div class="separator" style="clear: both; text-align: center;">
</div>
<div class="separator" style="clear: both; text-align: center;">
<a href="http://1.bp.blogspot.com/-orFO83KfxSI/VDbeCP3IuWI/AAAAAAAAFDg/3kFXEo-Yudc/s1600/MongoDB%2BImage%2B-%2BVolumes.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="http://1.bp.blogspot.com/-orFO83KfxSI/VDbeCP3IuWI/AAAAAAAAFDg/3kFXEo-Yudc/s1600/MongoDB%2BImage%2B-%2BVolumes.png" height="172" width="320" /></a></div>
<br /></div>
<div>
When used with the Docker CLI <span style="font-family: Courier New, Courier, monospace;">--volumes-from</span><span style="font-family: inherit;"> option it is possible to create containers that share storage from one container to any number of others. The mount points defined in the VOLUME directives are mapped one-to-one from the source container to the destinations.</span></div>
<div>
<span style="font-family: inherit;"><br /></span></div>
<h2>
<span style="font-family: inherit;">Importing Storage: The --volume CLI option</span></h2>
<div>
<span style="font-family: inherit;">When starting a Docker container I can cause Docker to map any external path to an internal path using the </span><span style="font-family: Courier New, Courier, monospace;">--volume</span><span style="font-family: inherit;"> (or </span><span style="font-family: Courier New, Courier, monospace;">-v</span><span style="font-family: inherit;">) option. This option takes two paths separated by a colon (:). The first path is the host file or directory to be imported into the container. The second is the mount point within the container.</span></div>
<div>
<br />
<span style="background-color: #073763; color: #f3f3f3; font-family: Courier New, Courier, monospace;">docker run --volume <host path>:<container path> ...</span><br />
<span style="background-color: magenta; color: white; font-family: Courier New, Courier, monospace;"><br /></span></div>
<h2>
Example: MongoDB persistent data</h2>
<div>
Say I want to run a database on my host, but I don't want to have to install the DB software into the system. Docker makes it possible for me to run my database in a container and not have to worry about which version the OS has installed. However, I do want the data to persist if I shut the container down and restart it, whether for maintenance or to upgrade the container.<br />
<br />
The Dockerfile for my MongoDB container looks like this:<br />
<br />
<script src="https://gist.github.com/markllama/829690622aacee395836.js"></script>
<br />
<ul>
<li>Lines 1 and 2 are the boilerplate you've seen to define the base image and the maintainer information.</li>
<li>Line 7 installs the MongoDB server package</li>
<li>Lines 9 - 11 create the directory for the database storage and ensure that it will not be pruned by placing a hidden file named <span style="font-family: Courier New, Courier, monospace;">.keep</span><span style="font-family: inherit;"> inside. They also set the permissions for that directory <i>in the context of the container view</i> to allow the mongodb user to write to the directory.</span></li>
<li>Line 15 specifies the location of the imported volume.</li>
<li>Line 17 opens the firewall for inbound connections to the MongoDB</li>
<li>Lines 19 and 20 set the user that will run the primary process and the location where it will start.</li>
<li>Lines 22 and 23 define the binary to execute when the container starts and the default arguments</li>
</ul>
<br />
To run a shell in the container, use the <span style="font-family: Courier New, Courier, monospace;">--entrypoint</span><span style="font-family: inherit;"> option. Arguments to the </span><span style="font-family: Courier New, Courier, monospace;">docker run</span><span style="font-family: inherit;"> command will be passed directly to the mongod process, overriding the defaults.</span><br />
<br />
<h3>
What works?</h3>
<div>
<br /></div>
<div>
I know that this image works when I just use the default internal Docker storage. I know that file ownership and permissions will be an issue, so the first thing to do is to look inside a working container and see what the ownership and permissions look like.</div>
<div>
<br />
<pre class="brush:bash ; title: 'A Mongodb container with no storage attached' ; highlight: [1,3,6]">docker run -it --name mongodb --entrypoint /bin/sh markllama/mongodb
sh-4.2$ id
uid=184(mongodb) gid=998(mongodb) groups=998(mongodb)
sh-4.2$ ls -ldZ /var/lib/mongodb
drwxr-xr-x. mongodb mongodb system_u:object_r:docker_var_lib_t:s0 /var/lib/mongodb
</pre>
</div>
Now I know the UID and GID which the container process uses (UID = 184, GID = 998). I'll have to make sure that this user/group can write to the host directory which I map into the container.<br />
<br />
I know that the default permissions are 755 (rwx, r-x, r-x), which is fairly common.<br />
<br />
I also see that the directory has a special SELinux label: <span style="font-family: Courier New, Courier, monospace;">docker_var_lib_t</span><span style="font-family: inherit;">. </span><br />
<span style="font-family: inherit;"><br /></span>
<span style="font-family: inherit;">Together, the directory ownership/permissions and the SELinux policy could prevent access by the container process to the host files. Both are going to require root access on the host to fix.</span><br />
<span style="font-family: inherit;"><br /></span>
<span style="font-family: inherit;">Interesting Note: When inside the container, attempts to determine the process SELinux context are met with a message indicating that SELinux is not enabled. Apparently, from the view point inside the container, it isn't.</span><br />
<span style="font-family: inherit;"><br /></span>
<br />
<h2>
<span style="font-family: inherit;">Preparing the Host Directory</span></h2>
</div>
<div>
<span style="font-family: inherit;">I could just go ahead and create a directory with the right ownership, permissions and label and attach it to my MongoDB container and say "Voila!". What fun would that be? Instead I'm going to create the target directory and mount it into a container and try writing to it from inside. When that fails I'll update the ownership, permissionad and label, (from outside) each time checking the view and capabilities (from inside) to see how it changes.</span></div>
<div>
<span style="font-family: inherit;"><br /></span></div>
<div>
<span style="font-family: inherit;">I am going to disable SELinux temporarily so I can isolate the file ownership/permissions from the SELinux labeling.</span></div>
<div>
<br />
<pre class="brush:bash ; title: 'Disable SELinux: Create MongoDB data directory (outside)'; highlight: 4 ">sudo setenforce 0
mkdir ~/mongodb
ls -ldZ ~/mongodb
drwxrwxr-x. mlamouri mlamouri unconfined_u:object_r:user_home_t:s0 /home/mlamouri/mongodb
</pre>
<br />
I'm also going to create a file inside the directory (from the view of the host) so that I can verify (from the view in the container) that I've mounted the correct directory.
<br />
<br />
<pre class="brush:bash ; title: 'Create a file for reference (outside)' ; highlight: 3">touch ~/mongodb/from_outside
ls -lZ ~/mongodb/from_outside
-rw-rw-r--. mlamouri mlamouri unconfined_u:object_r:user_home_t:s0 /home/mlamouri/mongodb/from_outside
</pre>
<br />
Note the default ownership and permissions on that file.
</div>
<div>
<br />
Now I'm ready to try mounting that into the mongodb container (knowing that write access will fail)
<br />
<br />
<h2>
Starting the Container with a Volume Mounted</h2>
</div>
<div>
I want to be able to examine the runtime environment inside the container before I let it fly with a mongod process. I'll set the entrypoint on the CLI to run a shell instead and use the <span style="font-family: Courier New, Courier, monospace;">-it</span><span style="font-family: inherit;"> options so it runs interactively and terminates when I exit the shell.</span></div>
<div>
<br /></div>
<div>
The external path to the volume is <span style="font-family: Courier New, Courier, monospace;">/home/mlamouri/mongodb</span> and the internal path is <span style="font-family: Courier New, Courier, monospace;">/var/lib/mongodb</span>.</div>
<div>
<br />
<pre class="brush:bash ; title: 'Run Container with volume imported' ; highlight: [1,3,6,9,12,15]">docker run -it --name mongodb --volume ~/mongodb:/var/lib/mongodb --entrypoint /bin/sh markllama/mongodb
sh-4.2$ id
uid=184(mongodb) gid=998(mongodb) groups=998(mongodb)
sh-4.2$ pwd
/var/lib/mongodb
sh-4.2$ ls
from_outside
sh-4.2$ ls -ld /var/lib/mongodb
drwxrwxr-x. 2 15149 15149 4096 Oct 9 21:04 /var/lib/mongodb
sh-4.2$ touch from_inside
touch: cannot touch 'from_inside': Permission denied
</pre>
<br />
As expected, from inside the container I can't write to the mounted volume. I can read it (with SELinux disabled) because I have the directory permissions open to the world for read and execute. Now I'll change the ownership of the directory from the outside.
<br />
<br />
<h2>
Adjusting The Ownership</h2>
<pre class="brush: bash ; title: 'Set directory ownership (outside)' ; highlight: [1,2]">sudo chown 184:998 ~/mongodb
ls -ld ~/mongodb
drwxrwxr-x. 2 mongodb polkitd 4096 Oct 9 20:46 /home/bos/mlamouri/mongodb
</pre>
<br />
It turns out I have the mongo-server package installed on my host and it has assigned the same UID to the mongodb user as the container has. However, the group for mongodb inside the container corresponds to the polkitd group on the host.<br />
<br />
Now I can try writing a file there from the inside again. From the (still running) container shell:<br />
<br />
<pre class="brush:bash; title: 'Try to create a file (inside)' ; highlight: [1,5,7,12]">sh-4.2$ ls -l
total 0
-rw-rw-r--. 1 mongodb mongodb 0 Oct 9 20:46 from_outside
sh-4.2$ touch from_inside
sh-4.2$ ls -l
total 0
-rw-r--r--. 1 mongodb mongodb 0 Oct 10 01:55 from_inside
-rw-rw-r--. 1 mongodb mongodb 0 Oct 9 20:46 from_outside
sh-4.2$ ls -Z
-rw-r--r--. mongodb mongodb system_u:object_r:user_home_t:s0 from_inside
-rw-rw-r--. mongodb mongodb unconfined_u:object_r:user_home_t:s0 from_outside
</pre>
<br /></div>
<div>
<br />
<h2>
Re-Enabling SELinux (and causing fails again)</h2>
</div>
<div>
There are two access control barriers for files. The Linux file ownership and permissions are one. The second is SELinux and I have to turn it back on. This will break things again until I also set the SELinux label on the directory on the host.<br />
<br />
<pre class="brush:bash ; title: 'Re-enable SELinux' ; highlight: 1">sudo setenforce 1
</pre>
<br />
Now when I try to read the directory inside the container or create a file, the request is rejected with permission denied.
<br />
<br />
<pre class="brush:bash ; title : 'Test access with SELinux enforcing' ; highlight: [1,3]">sh-4.2$ ls
ls: cannot open directory .: Permission denied
sh-4.2$ touch from_inside_with_selinux
touch: cannot touch 'from_inside_with_selinux': Permission denied
</pre>
<br /></div>
<div>
Just to refresh, here's the SELinux label for the directory as seen from the host:
<br />
<pre class="brush:bash ; title: 'Default SELinux label (outside)' ; highlight: 1">ls -dZ mongodb
drwxrwxr-x. mongodb polkitd unconfined_u:object_r:user_home_t:s0 mongodb
</pre>
<br />
<h2>
SELinux Diversion: What's Happening?</h2>
<div>
In the end I'm just going to apply the SELinux label which I found on the volume directory when I used Docker internal storage. I'm going to step aside for a second here though and look at how I can find out more about what SELinux is rejecting.</div>
<div>
<br /></div>
When SELinux rejects a request it logs that request. The logs go into <span style="font-family: Courier New, Courier, monospace;">/var/log/audit/audit.log</span><span style="font-family: inherit;">. These are definitely cryptic and can be daunting but they're not entirely inscrutable.</span><br />
<span style="font-family: inherit;"><br /></span>
<span style="font-family: inherit;">First I can use what I know to filter out things I don't care about. I know I want AVC messages (AVC is an abbreviation for <i>Access Vector Cache</i>. Yeah. Not useful). These messages are indicated by <i>type=AVC</i> in the logs. Second, I know that I am concerned with attempts to access files labeled <i>user_home_t</i>. These two will help me narrow down the messages I care about.</span><br />
<span style="font-family: inherit;"><br /></span>
<span style="font-family: inherit;">These are very long lines so you may have to scroll right a bit to see the important parts.</span></div>
<div>
<br />
<pre class="brush:bash ; title: 'AVC records for user_home_t' ; highlight: 1">sudo grep type=AVC /var/log/audit/audit.log | grep user_home_t
type=AVC msg=audit(1412948687.224:8235): avc: denied { add_name } for pid=11135 comm="touch" name="from_inside" scontext=system_u:system_r:svirt_lxc_net_t:s0:c687,c763 tcontext=unconfined_u:object_r:user_home_t:s0 tclass=dir permissive=1
type=AVC msg=audit(1412948687.224:8235): avc: denied { create } for pid=11135 comm="touch" name="from_inside" scontext=system_u:system_r:svirt_lxc_net_t:s0:c687,c763 tcontext=system_u:object_r:user_home_t:s0 tclass=file permissive=1
type=AVC msg=audit(1412948876.731:8257): avc: denied { write } for pid=12800 comm="touch" name="mongodb" dev="sda4" ino=7749584 scontext=system_u:system_r:svirt_lxc_net_t:s0:c687,c763 tcontext=unconfined_u:object_r:user_home_t:s0 tclass=dir permissive=0
type=AVC msg=audit(1412948898.965:8258): avc: denied { write } for pid=11108 comm="sh" name=".bash_history" dev="sda4" ino=7751785 scontext=system_u:system_r:svirt_lxc_net_t:s0:c687,c763 tcontext=system_u:object_r:user_home_t:s0 tclass=file permissive=0
type=AVC msg=audit(1412948898.965:8259): avc: denied { append } for pid=11108 comm="sh" name=".bash_history" dev="sda4" ino=7751785 scontext=system_u:system_r:svirt_lxc_net_t:s0:c687,c763 tcontext=system_u:object_r:user_home_t:s0 tclass=file permissive=0
type=AVC msg=audit(1412948898.965:8260): avc: denied { read } for pid=11108 comm="sh" name=".bash_history" dev="sda4" ino=7751785 scontext=system_u:system_r:svirt_lxc_net_t:s0:c687,c763 tcontext=system_u:object_r:user_home_t:s0 tclass=file permissive=0
type=AVC msg=audit(1412949007.595:8289): avc: denied { read } for pid=14158 comm="sh" name=".bash_history" dev="sda4" ino=7751785 scontext=system_u:system_r:svirt_lxc_net_t:s0:c184,c197 tcontext=system_u:object_r:user_home_t:s0 tclass=file permissive=0
type=AVC msg=audit(1412949674.712:8307): avc: denied { write } for pid=14369 comm="touch" name="mongodb" dev="sda4" ino=7749584 scontext=system_u:system_r:svirt_lxc_net_t:s0:c184,c197 tcontext=unconfined_u:object_r:user_home_t:s0 tclass=dir permissive=0
</pre>
<br />
I found something I hadn't really expected. Every time I type a command in the shell within the container, the shell tries to write to the <span style="font-family: Courier New, Courier, monospace;">.bash_history</span><span style="font-family: inherit;"> file. This is only an issue when I'm testing the container with a shell. Remember in the Dockerfile I set the WORKDIR directive to the top of the MongoDB data directory. That means when I start the shell in the container, the current working directory is </span><span style="font-family: Courier New, Courier, monospace;">/var/lib/mongodb</span><span style="font-family: inherit;">, which is the directory I'm trying to import. This won't matter when I'm running the daemon properly as there won't be any shell.</span><br />
<span style="font-family: inherit;"><br /></span>
<span style="font-family: inherit;">The important thing this shows me is the SELinux context of the shell process within the container: </span><span style="font-family: Courier New, Courier, monospace;">system_u:system_r:svirt_lxc_net_t:s0</span><span style="font-family: inherit;"> . (note that I dropped off the MVC context, the "cc87,c763" on the end). That is the process which is being denied access to the working directory.</span><br />
<span style="font-family: inherit;"><br /></span>
<span style="font-family: inherit;">Given that list of AVCs I can feed them to </span><span style="font-family: Courier New, Courier, monospace;">audit2allow</span><span style="font-family: inherit;"> and get a big-hammer policy change to stop the AVCs.</span><br />
<span style="font-family: inherit;"><br /></span>
<br />
<pre class="brush:bash ; title: 'audit2allow for user_home_t AVCs' ; highlight: 1">sudo grep type=AVC /var/log/audit/audit.log | grep user_home_t| audit2allow
#============= svirt_lxc_net_t ==============
allow svirt_lxc_net_t user_home_t:dir { write remove_name add_name };
allow svirt_lxc_net_t user_home_t:file { write read create unlink open append };
</pre>
<br /></div>
<div>
This is a nice summary of what is happening and what fails. You could use this output to create a policy module which would allow this activity. <b>DON'T DO IT.</b> It's tempting to use <span style="font-family: Courier New, Courier, monospace;">audit2allow</span><span style="font-family: inherit;"> to just open things up when SELinux prevents things. Without understanding what you're changing and why, you risk creating holes you didn't mean to.</span><br />
<span style="font-family: inherit;"><br /></span>
<span style="font-family: inherit;">Instead I'm going to proceed by assigning a label to the directory tree which indicates what I mean to use it for (content for Docker containers). That is, by labeling the directory to allow Docker to mount and write it, it becomes evident to someone looking at it later what I meant to do.</span><br />
<span style="font-family: inherit;"><br /></span>
<br />
<h2>
<span style="font-family: inherit;">Labeling the MongoDB directory for use by Docker</span></h2>
</div>
<div>
<span style="font-family: inherit;">The processes running within Docker appear to have the SELinux context </span><span style="font-family: Courier New, Courier, monospace;">system_u:system_r:svirt_lxc_net_t</span><span style="font-family: inherit;">. From the example using the Docker internal storage for </span><span style="font-family: Courier New, Courier, monospace;">/var/lib/mongodb</span><span style="font-family: inherit;"> I know that the directory is labled </span><span style="font-family: Courier New, Courier, monospace;">system_u:object_r:docker_var_lib_t:s0</span><span style="font-family: inherit;">. If I apply that label to my working directory, the processes inside the container should be able to write to the directory and its children.</span></div>
<div>
<span style="font-family: inherit;"><br /></span></div>
<div>
<span style="font-family: inherit;">The SELinux tool for updating object (file) labels is </span><span style="font-family: Courier New, Courier, monospace;">chcon</span><span style="font-family: inherit;"> (for <i>change context</i>). It works much like </span><span style="font-family: Courier New, Courier, monospace;">chown</span><span style="font-family: inherit;"> or </span><span style="font-family: Courier New, Courier, monospace;">chmod</span><span style="font-family: inherit;">. Because I'm changing security labels that I don't own, I need to use sudo to make the change.</span></div>
<div>
<br />
<pre class="brush:bash ; title: 'Set SELinux labels (outisde)' ; highlight: [1,3,6,10]">sudo chcon -R system_u:object_r:docker_var_lib_t:s0 ~/mongodb
ls -dZ mongodb/
drwxrwxr-x. mongodb polkitd system_u:object_r:docker_var_lib_t:s0 mongodb/
ls -Z mongodb/
-rw-r--r--. mongodb polkitd system_u:object_r:docker_var_lib_t:s0 from_inside
-rw-rw-r--. mongodb polkitd system_u:object_r:docker_var_lib_t:s0 from_outside
getenforce
Enforcing
</pre>
<br />
Now the directory and all its contents have the correct ownership, permissions and SELinux label. SELinux is enforcing. I can try writing from inside the container again.
<br />
<br />
<pre class="brush: bash ; title: 'Write after labeling (inside)' ; highlight: [1,2]">sh-4.2$ touch from_inside_with_selinux
sh-4.2$ ls -l
total 0
-rw-r--r--. 1 mongodb mongodb 0 Oct 10 13:44 from_inside
-rw-r--r--. 1 mongodb mongodb 0 Oct 10 15:54 from_inside_with_selinux
-rw-rw-r--. 1 mongodb mongodb 0 Oct 9 20:46 from_outside
</pre>
<br />
That's it. Time to try running mongod inside the container.<br />
<br />
<h2>
Running the Mongodb Container</h2>
<div>
First I shut down and remove my existing mongod container. Then I can start one up for real. I switch from interactive (<span style="font-family: Courier New, Courier, monospace;">-it</span>) to daemon (<span style="font-family: Courier New, Courier, monospace;">-d</span>) mode and remove the <span style="font-family: Courier New, Courier, monospace;">--entrypoint</span><span style="font-family: inherit;"> argument.</span></div>
<div>
<br /></div>
<pre class="brush: bash ; title: 'Clean up test container and start mongodb' ; highlight: [1,4,7]">sh-4.2$ exit
exit
docker rm mongodb
mongodb
docker run -d --name mongodb --volume ~/mongodb:/var/lib/mongodb markllama/mongodb
9e203806b4f07962202da7e0b870cd567883297748d9fe149948061ff0fa83f0
</pre>
<br />
I should now have a running mongodb container<br />
<br />
<pre class="brush:bash ; title: 'Check for running container' ; highlight: 1">docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
9e203806b4f0 markllama/mongodb:latest "/usr/bin/mongod --c 34 seconds ago Up 33 seconds 27017/tcp mongodb
</pre>
<br />
<br />
I can check the container logs to see if the process is running and indicates a good startup.<br />
<br />
<pre class="brush: bash ; title: 'View startup logs' ; highlight: [1]">docker logs mongodb
note: noprealloc may hurt performance in many applications
Fri Oct 10 16:01:25.560 [initandlisten] MongoDB starting : pid=1 port=27017 dbpath=/var/lib/mongodb 64-bit host=9e203806b4f0
Fri Oct 10 16:01:25.562 [initandlisten]
Fri Oct 10 16:01:25.562 [initandlisten] ** WARNING: You are running on a NUMA machine.
Fri Oct 10 16:01:25.562 [initandlisten] ** We suggest launching mongod like this to avoid performance problems:
Fri Oct 10 16:01:25.562 [initandlisten] ** numactl --interleave=all mongod [other options]
Fri Oct 10 16:01:25.562 [initandlisten]
Fri Oct 10 16:01:25.562 [initandlisten] db version v2.4.6
Fri Oct 10 16:01:25.562 [initandlisten] git version: nogitversion
Fri Oct 10 16:01:25.562 [initandlisten] build info: Linux buildvm-12.phx2.fedoraproject.org 3.10.9-200.fc19.x86_64 #1 SMP Wed Aug 21 19:27:58 UTC 2013 x86_64 BOOST_LIB_VERSION=1_54
Fri Oct 10 16:01:25.563 [initandlisten] allocator: tcmalloc
Fri Oct 10 16:01:25.563 [initandlisten] options: { config: "/etc/mongodb.conf", dbpath: "/var/lib/mongodb", nohttpinterface: "true", noprealloc: "true", quiet: true, smallfiles: "true" }
Fri Oct 10 16:01:25.636 [initandlisten] journal dir=/var/lib/mongodb/journal
Fri Oct 10 16:01:25.636 [initandlisten] recover : no journal files present, no recovery needed
Fri Oct 10 16:01:27.469 [initandlisten] preallocateIsFaster=true 27.58
Fri Oct 10 16:01:29.329 [initandlisten] preallocateIsFaster=true 28.04
</pre>
<br />
It looks like the daemon is running.<br />
<br />
I can use <span style="font-family: Courier New, Courier, monospace;">docker inspect</span><span style="font-family: inherit;"> to find the assigned IP address for the container. With that I can connect the mongo client to the service and test database access.</span><br />
<br />
<pre class="brush: bash ; title: 'Connect to database service' ; highlight: [1,4]">docker inspect --format '{{.NetworkSettings.IPAddress}}' mongodb
172.17.0.110
echo show dbs | mongo 172.17.0.110
MongoDB shell version: 2.4.6
connecting to: 172.17.0.110/test
local 0.03125GB
bye
</pre>
<br />
<br />
I know the database is running and answering queries. The last check is to look inside the directory I created for the database. It should have the test files I'd created as well as the database and journal files which mongod will create on startup.<br />
<br />
<pre class="brush: bash ; title: 'List contents of mongodb directory (outside)' ; highlight: [1]">ls -lZ ~mongodb
-rw-r--r--. mongodb polkitd system_u:object_r:docker_var_lib_t:s0 from_inside
-rw-r--r--. mongodb polkitd system_u:object_r:docker_var_lib_t:s0 from_inside_with_selinux
-rw-rw-r--. mongodb polkitd system_u:object_r:docker_var_lib_t:s0 from_outside
drwxr-xr-x. mongodb polkitd system_u:object_r:docker_var_lib_t:s0 journal
-rw-------. mongodb polkitd system_u:object_r:docker_var_lib_t:s0 local.0
-rw-------. mongodb polkitd system_u:object_r:docker_var_lib_t:s0 local.ns
-rwxr-xr-x. mongodb polkitd system_u:object_r:docker_var_lib_t:s0 mongod.lock
drwxr-xr-x. mongodb polkitd system_u:object_r:docker_var_lib_t:s0 _tmp
</pre>
<br />
There they are.<br />
<br /></div>
<div>
<h2>
Summary
</h2>
</div>
<div>
It took a little work to get a Docker container running a system service using persistent host storage for the database files.<br />
<br />
I had to get the container running without extra storage first and examine the container to see what it expected. The file ownership, permissions and the SELinux context all affect the ability to write files.<br />
<br />
<h3>
Tweaking for Storage</h3>
<br />
On the host I had to create a directory with the right characteristics. The UID and GID on the host may not match those inside the container. If the container service creates a user and group they will almost certainly not exist on a generic Docker container host.<br />
<br />
The Docker service uses a special set of SELinux contexts and labels to run. Docker runs as root and it does lots of potentially dangerous things. The SELinux policies for Docker are designed to prevent contained processes from escaping, at least through the resources SELinux can control.<br />
<br />
Setting the directory ownership and the SELinux context require root access. This isn't a really big deal as Docker also requires root (or at least membership in the docker group) but it's another wart. It does mean that the ideal of running service containers in user space is an illusion. Once the directory is set up and running it will require root access to remove it as well. It's probably best not to place it in a user home directory as I did.<br />
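<br />
Pulled together, the host-side preparation from this post amounts to something like the following (a sketch using a directory under /srv rather than a home directory, per the note above; the UID/GID and SELinux label are the ones discovered earlier):<br />
<br />
<pre class="brush:bash ; title: 'Host-side preparation, condensed (sketch)'"># Create the directory, give it the container's UID/GID and the Docker label
sudo mkdir -p /srv/mongodb
sudo chown 184:998 /srv/mongodb
sudo chcon -R system_u:object_r:docker_var_lib_t:s0 /srv/mongodb

# Then hand it to the container
docker run -d --name mongodb --volume /srv/mongodb:/var/lib/mongodb markllama/mongodb
</pre>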
<br />
<h3>
Scaling up: Multiple Hosts and Network Storage?</h3>
<br />
It is possible to run Docker service containers with persistent external storage from the host. This won't scale up to multiple hosts. Kubernetes has no way of making the required changes to the host. It might be possible to use network filesystems like NFS, Gluster or Ceph so long as the user accounts are made consistent.<br />
<br />
The other possibility for shared storage is cloud storage. I'll talk about that some in the next post, though it's not ready for Docker and Kubernetes yet.<br />
<br />
<h3>
Pending Features: User Namespaces (SELinux Namespaces?)</h3>
<br />
The user mapping may be resolved by a pending feature addition to Linux namespaces and Docker: User namespaces. This would allow a UID inside a container to be mapped to a different UID on the host. The same would be true for GIDs. This would allow me to run a container which uses the mongodb UID inside the container but is able to access files owned by my UID on the host. I don't have a timeline for this feature and the developers still raise their eyebrows in alarm when I ask about it, but it is work in progress.<br />
<br />
A feature which does not exist to my knowledge is SELinux namespaces. This is the idea that an SELinux label inside a container might be mapped to a different label outside. This would allow the docker_var_lib_t label inside to be mapped to user_home_t outside. I suspect this would break lots of things and open up nasty holes so I don't expect it soon.<br />
<br />
<h3>
Next Up: Network (Cloud) Storage</h3>
Next up is some discussion (but not any demonstration at all) of the state of network storage.</div>
<div>
<h2>
References
</h2>
</div>
<div>
<br />
<ul>
<li><a href="http://www.docker.com/">Docker</a></li>
<ul>
<li><a href="https://docs.docker.com/reference/builder/#user">USER directive</a></li>
<li><a href="https://docs.docker.com/reference/builder/#volume">VOLUME directive</a></li>
<li><a href="https://docs.docker.com/reference/builder/#workdir">WORKDIR directive</a></li>
</ul>
<li><a href="http://mongodb.org/">MongoDB</a></li>
<li><a href="http://selinuxproject.org/page/Main_Page">SELinux</a></li>
<ul>
<li><a href="https://fedoraproject.org/wiki/SELinux/getenforce">getenforce</a></li>
<li><a href="https://fedoraproject.org/wiki/SELinux/setenforce">setenforce</a></li>
<li><a href="https://fedoraproject.org/wiki/SELinux/setenforce">ls</a> -Z</li>
<li><a href="http://man7.org/linux/man-pages/man1/ps.1.html">ps</a> -Z</li>
<li><a href="https://fedoraproject.org/wiki/SELinux/chcon">chcon</a></li>
<li><a href="https://fedoraproject.org/wiki/SELinux/audit2allow">audit2allow</a></li>
</ul>
</ul>
<br />
<br /></div>
<h1>
Storage Concepts in Docker: Shared Storage and the VOLUME directive</h1>
In the next few posts I'm going to take a break from the concrete work of creating images for Pulp in Docker. The next step in my project requires some work with storage and it's going to take a bit of time for exploration and then some careful planning. Note that when I get to moving them to Kubernetes I'll have to revisit some of this, as Kubernetes Pods place some constraints (and provide some capabilities) that Docker alone doesn't.<br />
<br />
This is going to take at least three posts:<br />
<br />
<ul>
<li>Shared Storage in Docker (this post)</li>
<li><a href="http://cloud-mechanic.blogspot.com/2014/10/storage-concepts-in-docker-persistent.html">Persistent Storage in Docker</a></li>
<li><a href="http://cloud-mechanic.blogspot.com/2014/10/storage-concepts-in-docker-network-and.html">Persistent Storage in Kubernetes</a></li>
</ul>
<div>
<br /></div>
<div>
It could take a fourth post for Persistent Storage in Kubernetes, but that would be a fairly short post because the answer right now is "you really can't do that yet". People are working hard to figure out how to get persistent storage into a Kubernetes cluster, but it's not ready yet.</div>
<div>
<br /></div>
<div>
For now I'm going to take them one large bite at a time.</div>
<br />
<h2>
Storage in a Containerized World</h2>
<div>
<br /></div>
<div>
The whole point of containers is that they don't leak. Nothing should escape or invade. The storage that is used by each container has a life span only as long as the container itself. Very quickly though one finds that truly closed containers aren't very useful. To make them do real work you have to punch some holes.<br />
<br />
The most common holes are network ports, both inbound and out. A daemon in a container listens for connections and serves responses to queries. It may also initiate new outbound queries to gather information to do its job. Network connections are generally point-to-point and ephemeral. New connections are created and dropped all the time. If a connection fails during a transaction, no problem, just create a new connection and resend the message. Sometimes though, what you really need is something that lasts.<br />
<br />
<h2>
Shared and Persistent Storage</h2>
<br />
Sometimes a process doesn't just want to send messages to other processes. Sometimes it needs to create an artifact and put it someplace that another process can find and use it. In this case network connections aren't really appropriate for trading that information. It needs disk storage. Both processes need access to the same bit of storage. The storage must be <i>shared.</i></div>
<div>
<br />
Another primary characteristic of Docker images (and containerized applications in general) is that they are 100% reproducible. This also makes them disposable. If it's trivial to make arbitrary numbers of copies of an image, then there's no problem throwing one away. You just make another.<br />
<br />
When you're dealing with shared storage the life span of a container can be a problem too. If the two containers which share the storage both have the same life span then the storage can be "private", shared just between them. When either container dies, they both do and the storage can be reclaimed. If the contents of the storage have a life span longer than the containers, or if the container processes have different life spans, then the storage needs to be <i>persistent.</i><br />
<i><br /></i>
<br />
<h2>
Pulp and Docker Storage</h2>
<i><br /></i>
The purpose of the Pulp application is to act as a repository for long-term storage. The payload consists of files mirrored from remote repositories and offered locally. This can minimize long-haul network traffic and allow for network boundary security (controlled proxies) which might prohibit normal point-to-point connections between a local client and a remote content server.<br />
<br />
Two processes work with the payload content directly. The Pulp worker process is responsible for scanning the remote repositories, detecting new content and initiating a sync to the local mirror. The Apache process publishes the local content out to the clients which are the customers for the Pulp service. It consumes the local mirror content that has been provided by the Pulp workers. These two processes must both have access to the same storage to do their jobs.<br />
<br />
For demonstration purposes, shared storage is sufficient. The characteristics of shared storage in Docker and Kubernetes are complex enough to start with, without trying to solve the problem of persistence as well. In fact, persistent storage is still largely an unsolved problem. This is because local persistent storage isn't very useful as soon as you try to run containers on different hosts. At that point you need a SAN/NAS or some other kind of network storage like OpenStack Cinder or AWS/EBS or Google Cloud Storage.<br />
<br />
So, this post is about the care and feeding of shared storage in Docker applications.<br />
<br /></div>
<h2>
Docker Image: the VOLUME directive</h2>
<div>
<br />
The Dockerfile has a number of directives which specify ways to poke holes in containers. <a href="https://docs.docker.com/reference/builder/#volume">The VOLUME directive</a> is used to indicate that a container wants to use external or shared storage.<br />
<br />
<div class="separator" style="clear: both; text-align: center;">
</div>
<div class="separator" style="clear: both; text-align: center;">
</div>
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="http://2.bp.blogspot.com/-nz-1tr6ruUM/VDM4bLLlhMI/AAAAAAAAFBQ/PxadSiOosa0/s1600/Pulp_Image_Volumes.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" src="http://2.bp.blogspot.com/-nz-1tr6ruUM/VDM4bLLlhMI/AAAAAAAAFBQ/PxadSiOosa0/s1600/Pulp_Image_Volumes.png" height="246" width="400" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Dockerfile: VOLUME directive</td></tr>
</tbody></table>
<br /></div>
<div>
The diagram above shows the effect of a VOLUME directive when creating a new image. It indicates that this image has two mount points which can be attached to external (to the container) storage.<br />
<br />
Here's the complete Dockerfile for the pulp-content image.<br />
<br />
<script src="https://gist.github.com/markllama/f411916593003fa1ce2f.js"></script>
<br />
<br />
Here's where the window metaphor breaks down. The VOLUME directive indicates a node in a file path where an external filesystem may be mounted. It's a dividing line, inside and outside. What happens to the files in the image that would fall on the outside?<br />
<br />
Docker places those files into their own filesystem as well. If the container is created without specifying an external volume to mount there, this default filesystem is mounted. The VOLUME directive defines a place where files can be imported <i>or exported.</i><br />
<br />
So what happens if you just start a container with that image, but don't specify an external mount?<br />
<br />
<h2>
Defaulted Volumes</h2>
<div>
<br /></div>
<div>
To continue with the flawed metaphor, every window has two sides. The VOLUME directive only specifies the boundary. It says "some filesystem may be provided to mount <i>here".</i> But if I don't provide a file tree to mount there (using the -v option) Docker mounts the file tree that was inside the image when it was built. I can run the <span style="font-family: Courier New, Courier, monospace;">pulp-content</span><span style="font-family: inherit;"> image with a shell and inspect the contents. I'll look at it both from the inside and the outside.</span><br />
<span style="font-family: inherit;"><br /></span>
<span style="font-family: inherit;">I'm going to start an interactive </span><span style="font-family: Courier New, Courier, monospace;">pulp-content</span><span style="font-family: inherit;"> container with a shell so I can inspect the contents.</span></div>
<div>
<br /></div>
<div>
<pre class="brush:bash ; title: 'Container Mounts with Volumes' ; highlight: [1,12,13]">docker run -it --name volume-demo markllama/pulp-content /bin/sh
sh-4.2# mount
/dev/mapper/docker-8:4-2758071-77b5c9ba618358600e5b59c3657256d1a748aac1c14e2be3d9c505adddc92ce3 on / type ext4 (rw,relatime,context="system_u:object_r:svirt_sandbox_file_t:s0:c585,c908",discard,stripe=16,data=ordered)
proc on /proc type proc (rw,nosuid,nodev,noexec,relatime)
tmpfs on /dev type tmpfs (rw,nosuid,context="system_u:object_r:svirt_sandbox_file_t:s0:c585,c908",mode=755)
shm on /dev/shm type tmpfs (rw,nosuid,nodev,noexec,relatime,context="system_u:object_r:svirt_sandbox_file_t:s0:c585,c908",size=65536k)
devpts on /dev/pts type devpts (rw,nosuid,noexec,relatime,context="system_u:object_r:svirt_sandbox_file_t:s0:c585,c908",gid=5,mode=620,ptmxmode=666)
sysfs on /sys type sysfs (ro,nosuid,nodev,noexec,relatime,seclabel)
/dev/sda4 on /etc/resolv.conf type ext4 (rw,relatime,seclabel,data=ordered)
/dev/sda4 on /etc/hostname type ext4 (rw,relatime,seclabel,data=ordered)
/dev/sda4 on /etc/hosts type ext4 (rw,relatime,seclabel,data=ordered)
/dev/sda4 on /var/lib/pulp type ext4 (rw,relatime,seclabel,data=ordered)
/dev/sda4 on /var/www type ext4 (rw,relatime,seclabel,data=ordered)
devpts on /dev/console type devpts (rw,nosuid,noexec,relatime,seclabel,gid=5,mode=620,ptmxmode=000)
proc on /proc/sys type proc (ro,nosuid,nodev,noexec,relatime)
proc on /proc/sysrq-trigger type proc (ro,nosuid,nodev,noexec,relatime)
proc on /proc/irq type proc (ro,nosuid,nodev,noexec,relatime)
proc on /proc/bus type proc (ro,nosuid,nodev,noexec,relatime)
tmpfs on /proc/kcore type tmpfs (rw,nosuid,context="system_u:object_r:svirt_sandbox_file_t:s0:c585,c908",mode=755)
</pre>
<br /></div>
<div>
So there is a filesystem mounted on those two mount points. But what's in them?
<br />
<br />
<pre class="brush: bash ; title: 'Default Volume Contents' ; highlight: [1,7]">sh-4.2# find /var/www
/var/www
/var/www/pub
/var/www/html
/var/www/cgi-bin
sh-4.2# find /var/lib/pulp
/var/lib/pulp
/var/lib/pulp/published
/var/lib/pulp/published/yum
/var/lib/pulp/published/yum/https
/var/lib/pulp/published/yum/http
/var/lib/pulp/published/puppet
/var/lib/pulp/published/puppet/https
/var/lib/pulp/published/puppet/http
/var/lib/pulp/uploads
/var/lib/pulp/celery
/var/lib/pulp/static
/var/lib/pulp/static/rsa_pub.key
</pre>
</div>
<br />
That's what it looks like on the inside. But what's the view from outside? I can find out using <span style="font-family: Courier New, Courier, monospace;"><a href="https://docs.docker.com/reference/commandline/cli/#inspect">docker inspect</a></span><span style="font-family: inherit;">.</span><br />
<br />
<pre class="brush: bash ; title: 'Docker Volume Configuration' ; highlight: [1]">docker inspect --format '{{.Config.Volumes}}' volume-demo
map[/var/lib/pulp:map[] /var/www:map[]]
</pre>
<br />
First I ask what the volume configuration is for the container. That result tells me that I didn't provide any mapping for the two volumes. Next I check what volumes are actually provided.<br />
<br />
<pre class="brush: bash ; title: 'Docker Volume Information' ; highlight: [1]">docker inspect --format '{{.Volumes}}' volume-demo
map[
/var/lib/pulp:/var/lib/docker/vfs/dir/3a11750bd3c31a8025f0cba8b825e568dafff39638fa1a45a17487df545b0f6a
/var/www:/var/lib/docker/vfs/dir/0a86bd1085468f04feaeb47cc32cfdb0c05fd10e5c7b470790042107d9c02b70
]
</pre>
<br />
These are the volumes that are actually mounted on the container. I can see that <span style="font-family: Courier New, Courier, monospace;">/var/lib/pulp</span> and <span style="font-family: Courier New, Courier, monospace;">/var/www</span> have something mounted on them and that the volumes are actually stored in the host filesystem under <span style="font-family: Courier New, Courier, monospace;">/var/lib/docker/vfs/dir</span>. Graphically, here's what that looks like:<br />
<br />
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="http://4.bp.blogspot.com/-gPZs_G2HdNY/VDPiT51wIII/AAAAAAAAFBo/9cBKuLa-sRY/s1600/Pulp%2BContent%2BContainer%2B-%2BNo%2Bexternal%2BVolumes%2B(1).png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" src="http://4.bp.blogspot.com/-gPZs_G2HdNY/VDPiT51wIII/AAAAAAAAFBo/9cBKuLa-sRY/s1600/Pulp%2BContent%2BContainer%2B-%2BNo%2Bexternal%2BVolumes%2B(1).png" height="219" width="640" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Default mounts with VOLUME directive</td></tr>
</tbody></table>
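<br />
If you have root on the Docker host you can also peek at those backing directories directly. This is only a sketch: the hash-named directory will be different on your system, and the contents shown are just the ones the <span style="font-family: Courier New, Courier, monospace;">find</span> output above turned up.<br />
<br />
<pre class="brush: bash ; title: 'Host view of a default volume (sketch)' ; highlight: [1]">sudo ls /var/lib/docker/vfs/dir/3a11750bd3c31a8025f0cba8b825e568dafff39638fa1a45a17487df545b0f6a
celery  published  static  uploads
</pre>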
<br />
So now I have a container running with some storage that is, in a sense, "outside" the container. I need to mount that same storage into another container. This is where the Docker <span style="font-family: Courier New, Courier, monospace;">--volumes-from</span> option comes in.<br />
<br />
<h2>
Shared Volumes in Docker</h2>
</div>
<div>
<br /></div>
<div>
Once a container exists with marked volumes it is possible to mount those volumes into other containers. Docker provides an option which allows all of the volumes from an existing container to be mounted one-to-one into another container.<br />
<br />
For this demonstration I'm going to just create another container from the <span style="font-family: Courier New, Courier, monospace;">pulp-content</span><span style="font-family: inherit;"> image, but this time I'm going to tell it to mount the volumes from the existing container:</span><br />
<br />
<pre class="brush:bash ; title: 'A container using --volumes-from' ; highlight: 1">docker run -it --name volumes-from-demo --volumes-from volume-demo markllama/pulp-content /bin/sh
</pre>
<br />
If you're following along you can use <span style="font-family: Courier New, Courier, monospace;">mount</span><span style="font-family: inherit;"> to show the internal mount points, and observe that they match those of the original container. From the outside I can use </span><span style="font-family: Courier New, Courier, monospace;">docker inspect</span><span style="font-family: inherit;"> to show that both containers are sharing the same volumes.</span></div>
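<div>
<br />
For the inside check, this is roughly what to run in the new container's shell (a sketch; the mount lines should match the ones shown for the original container above):<br />
<br />
<pre class="brush: bash ; title: 'Checking the shared mounts from inside (sketch)' ; highlight: [1]">mount | grep -e /var/lib/pulp -e /var/www
</pre>
</div>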
<div>
<br />
<pre class="brush:bash ; title: 'A container using --volumes-from' ; highlight: [1,7]">docker inspect --format '{{.Volumes}}' volume-demo
map[
/var/lib/pulp:/var/lib/docker/vfs/dir/3a11750bd3c31a8025f0cba8b825e568dafff39638fa1a45a17487df545b0f6a
/var/www:/var/lib/docker/vfs/dir/0a86bd1085468f04feaeb47cc32cfdb0c05fd10e5c7b470790042107d9c02b70
]
docker inspect --format '{{.Volumes}}' volumes-from-demo
map[
/var/lib/pulp:/var/lib/docker/vfs/dir/3a11750bd3c31a8025f0cba8b825e568dafff39638fa1a45a17487df545b0f6a
/var/www:/var/lib/docker/vfs/dir/0a86bd1085468f04feaeb47cc32cfdb0c05fd10e5c7b470790042107d9c02b70
]
</pre>
<br />
These two containers have the same filesystems mounted on their declared volume mount points.<br />
<br /></div>
<div>
<h2>
Shared Storage and Pulp</h2>
</div>
<div>
The next two images I need to create for a Pulp service are going to require shared storage. The Pulp worker process places files in <span style="font-family: Courier New, Courier, monospace;">/var/lib/pulp</span><span style="font-family: inherit;"> and symlinks them into </span><span style="font-family: Courier New, Courier, monospace;">/var/www</span><span style="font-family: inherit;"> to make them available to the web server. The Apache server needs to be able to read both the web repository in </span><span style="font-family: Courier New, Courier, monospace;">/var/www</span><span style="font-family: inherit;"> and the Pulp content in </span><span style="font-family: Courier New, Courier, monospace;">/var/lib/pulp</span><span style="font-family: inherit;"> so that it can resolve the symlinks and serve the content to clients. I can build the images using the VOLUME directive to create the "windows" I need and then use a content image to hold the files. Both the worker and apache containers will use the </span><span style="font-family: Courier New, Courier, monospace;">--volumes-from</span><span style="font-family: inherit;"> directive to mount the storage from the content container.</span></div>
<div>
<span style="font-family: inherit;"><br /></span></div>
<div>
Here's what that will look like in Docker:</div>
<div>
<br /></div>
<div class="separator" style="clear: both; text-align: center;">
</div>
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="http://3.bp.blogspot.com/-j2l192udbVI/VDPt96d2ncI/AAAAAAAAFCA/L-ck4aMjj48/s1600/Pulp%2BWorker%2B%2B%2BApache%2B%2B%2BContent%2B(1).png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" src="http://3.bp.blogspot.com/-j2l192udbVI/VDPt96d2ncI/AAAAAAAAFCA/L-ck4aMjj48/s1600/Pulp%2BWorker%2B%2B%2BApache%2B%2B%2BContent%2B(1).png" height="554" width="640" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Pulp Content Storage (Docker)</td></tr>
</tbody></table>
The content container will be created first. The content image uses <span style="font-family: Courier New, Courier, monospace;">pulp-base</span><span style="font-family: inherit;"> as its parent, so the file structure, ownership and permissions for the volume content will be initialized correctly. The worker and Apache containers will get their volumes from the content container.</span><br />
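<div>
<br />
As a sketch of the Docker commands involved (the <span style="font-family: Courier New, Courier, monospace;">pulp-worker</span> and <span style="font-family: Courier New, Courier, monospace;">pulp-apache</span> image names here are placeholders for images I haven't built yet):<br />
<br />
<pre class="brush: bash ; title: 'Sharing Pulp content between containers (sketch)' ; highlight: [2,5,6]"># The content container is created first; its default volumes hold the files.
docker run --name pulp-content markllama/pulp-content /bin/true
# The worker and Apache containers mount the same volumes from it.
# --volumes-from works even when the content container is not running.
docker run -d --name pulp-worker --volumes-from pulp-content markllama/pulp-worker
docker run -d --name pulp-apache --volumes-from pulp-content markllama/pulp-apache
</pre>
</div>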
<div>
<br /></div>
<div>
<h2>
Summary</h2>
<div>
<br /></div>
<div>
In this post I learned what Docker does with a VOLUME directive if no external volume is provided for the container at runtime. I also learned how to share storage between two or more running containers.</div>
<div>
<br /></div>
<div>
In the next post I'll show <a href="http://cloud-mechanic.blogspot.com/2014/10/storage-concepts-in-docker-persistent.html">how to mount (persistent) host storage into a container</a>.</div>
<div>
<br /></div>
<div>
In the final post before going back to building a Pulp service I'll demonstrate how to create a pod with storage shared between the containers and if there's space, how to mount host storage into the pod as well.</div>
<div>
<br /></div>
<h2>
<span style="font-family: inherit;">References</span></h2>
<div>
<ul>
<li><a href="http://www.docker.com/">Docker</a></li>
<ul>
<li>Dockerfile <a href="https://docs.docker.com/reference/builder/#volume">VOLUME directive</a></li>
<li>Docker CLI <a href="https://docs.docker.com/reference/commandline/cli/#run">run command</a> (see --volume and --volumes-from options)</li>
<li>Docker CLI <a href="https://docs.docker.com/reference/commandline/cli/#inspect">inspect command</a></li>
</ul>
<li><a href="http://pulpproject.org/">Pulp</a></li>
</ul>
</div>
<div>
<span style="font-family: inherit;"><br /></span></div>
</div>
markllamahttp://www.blogger.com/profile/14193184544557876514noreply@blogger.com0tag:blogger.com,1999:blog-5022186007695457923.post-64423837018827885552014-09-29T17:51:00.000-07:002014-09-30T05:50:43.410-07:00Docker: Re-using a custom base image - Pulp Resource Manager image.Here's the next step in the ongoing saga of containerizing the Pulp service in <a href="http://www.docker.com/">Docker</a> for use with <a href="https://github.com/GoogleCloudPlatform/kubernetes">Kubernetes</a>.<br />
<br />
<a href="http://cloud-mechanic.blogspot.com/2014/09/docker-building-and-using-base-image.html">In the last post</a> I spent a bunch of effort creating a base image for a set of Pulp service components. Then I only implemented one, the <a href="http://www.celeryproject.org/">Celery</a> beat server. In this (hopefully much shorter) post I'll create a second image from that base. This one is going to be the <a href="http://pulp-user-guide.readthedocs.org/en/latest/server.html#resource-manager">Pulp Resource Manager</a> service.<br />
<br />
A couple of recap pieces to start.<br />
<br />
The <a href="http://www.pulpproject.org/">Pulp service</a> is made up of <a href="http://pulp-user-guide.readthedocs.org/en/latest/server.html#components">several independent processes</a> that communicate using <a href="http://amqp.org/">AMQP</a> messaging (through a <a href="http://qpid.apache.org/">QPID message bus</a>) and by access to a MongoDB database. The QPID services and the <a href="http://mongodb.org/">MongoDB</a> services are entirely independent of the Pulp service processes and communicate only over TCP/IP. There are also a couple of processes that are tightly coupled, both requiring access to shared data. These will come later. What's left is the Pulp Resource Manager process and the Pulp Admin REST service.<br />
<br />
I'm going to take these in two separate posts to make them a bit more digestible than the last one was.<br />
<br />
<h2>
Extending the Base - Again</h2>
<div>
<br /></div>
<div>
As in the case with the Pulp Beat service, the Resource Manager process is a singleton. Each pulp service has exactly one. (Discussions of <a href="https://en.wikipedia.org/wiki/High_availability">HA</a> and <a href="https://en.wikipedia.org/wiki/Single_point_of_failure">SPOF</a> will be held for later). The Resource Manager process communicates with the other components solely through the QPID message broker and the MongoDB over TCP. There is no need for persistent storage.</div>
<div>
<br /></div>
<div>
In fact the only difference between the Beat service and the Resource Manager is the invocation of the Celery service. This means that the only difference between the Docker specifications is the name and two sections of the <i><span style="font-family: Courier New, Courier, monospace;">run.sh</span></i> file.</div>
<div>
<br /></div>
<div>
The Dockerfile is in fact identical in content to that for the Pulp Beat container:</div>
<div>
<br />
<script src="https://gist.github.com/markllama/5de9f000546de9304955.js"></script>
</div>
<div>
Now to the <span style="font-family: Courier New, Courier, monospace;">run.sh</span> script.<br />
<br />
<script src="https://gist.github.com/markllama/2acebf3bd58add6a1085.js"></script>
<br />
The first difference in the <span style="font-family: Courier New, Courier, monospace;">run.sh</span> is simple. The Beat service is used to initialize the database. The Resource Manager doesn't have to do that.</div>
<div>
<br /></div>
<div>
The second is also pretty simple: the exec line at the end starts the Celery service using the <i>resource_manager</i> entry point instead of the beat service.</div>
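<div>
<br />
For comparison, here are the two <span style="font-family: Courier New, Courier, monospace;">exec</span> lines as they appear in the container startup logs (the beat one from the previous post, the resource manager one from the log later in this post):<br />
<br />
<pre class="brush: bash ; title: 'The only functional difference between the two run.sh scripts' ; highlight: []"># pulp-beat
exec runuser apache -s /bin/bash -c '/usr/bin/celery beat --workdir=/var/lib/pulp/celery --scheduler=pulp.server.async.scheduler.Scheduler -f /var/log/pulp/celerybeat.log -l INFO'

# pulp-resource-manager
exec runuser apache -s /bin/bash -c '/usr/bin/celery worker -c 1 -n resource_manager@pulp.example.com --events --app=pulp.server.async.app --umask=18 --loglevel=INFO -Q resource_manager --logfile=/var/log/pulp/resource_manager.log'
</pre>
</div>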
<div>
<br /></div>
<div>
I do have one other note to myself. It appears that the <span style="font-family: Courier New, Courier, monospace;">wait_for_database()</span> function will be needed in every derivative of the pulp-base image. I should probably refactor that but I'm not going to do it yet.</div>
<div>
<br /></div>
<div>
<h2>
One Image or Many?</h2>
</div>
<div>
<br /></div>
<div>
So, if I hadn't been using shell functions, this really would come down to two lines different between the two. Does it really make sense to create two images? It is possible to pass a mode argument to the container on startup. Wouldn't that be simpler?<br />
<br />
It actually might be. It is possible to use the same image and pass an argument. The example from which mine are derived used that method.<br />
<br />
I have three reasons for using separate images. One is for teaching and the other two are development choices. Since one of my goals is to show how to create custom base images and then use derived images to create customizations I used this opportunity to show that.<br />
<br />
The deeper reasons have to do with human nature and the software development life cycle.<br />
<br />
People expect to be able to compose services by grabbing images off the shelf and plugging them together. Adding modal switches to the images means that they are not strongly differentiated by function. You can't just say "Oh, I need 5 functional parts, let me check the bins". You have to know more about each image than just how it connects to others. You have to know that this particular image can take more than one role within the service. I'd like to avoid that if I can. Creating images with so little difference feels like inefficiency, but only when viewed from the standpoint of the person producing the images. To the consumer it maintains the usage paradigm. Breaks in the paradigm can lead to mistakes or confusion.<br />
<br />
The other reason to use distinct images has to do with what I expect and hope will be a change in the habits of software developers.<br />
<br />
Developers of complex services currently feel a tension, when they are creating and packaging their software, between putting all of the code, binaries and configuration templates into a single package and splitting them across several. You only create a new package if the function is strongly <i>different. </i>This makes it simpler to install the software and configure it once. On traditional systems, where all of the process components would be running on the same host, there was no good reason to separate the code for distinct processes based on their function. There are clear cases where the separation does happen in host software packaging, notably in client and server software, which clearly will run on different hosts. Other cases, though, are not clear cut.<br />
<br />
The case of the Pulp service is in a gray area. Much of the code is common to all four Celery based components (beat, resource manager, worker and admin REST service). It is likely possible to refactor the unique code into separate packages for the components, though the value is questionable at this point.<br />
<br />
I want to create distinct images because it's not very expensive, and it allows for easy refactoring should the Pulp packaging ever be decomposed to match the actual service components. Any changes would happen when the new images are built, but the consumer would not need to see any change. This is a consideration to keep in mind whenever I create a new service with different components from the same service RPM.<br />
<br />
<h2>
Running and Verifying the Resource Manager Image</h2>
<div>
<br /></div>
<div>
The Pulp Resource Manager process makes the same connections that the Pulp Beat process does. It's a little harder to detect the Resource Manager access to the database since the startup doesn't make radical changes like the DB initialization. I'm going to see if I can find some indications that the resource manager is running though. The QPID connection will be much easier to detect. The Resource Manager creates its own set of queues which will be easy to see.</div>
<div>
<br /></div>
<div>
The resource manager requires the database service and an initialized database. Testing this part will start where the previous post left off, with running QPID and MongoDB and with the Pulp Beat service active.</div>
<div>
<br /></div>
<div>
NOTE: there's currently (20140929) a bug in Kubernetes where, during the period between the image download and the actual container start, <span style="font-family: Courier New, Courier, monospace;">kubecfg list pods</span><span style="font-family: inherit;"> will indicate that the pods have terminated. If you see this, give it another minute for the pods to actually start and transition to the running state.</span></div>
<div>
<br /></div>
<h3>
Testing in Docker</h3>
<div>
<br />
All I need to do using Docker directly is to verify that the container will start and run. The visibility in Kubernetes still isn't up to general dev and debugging.<br />
<br />
<pre class="brush: bash ; title: 'Start the pulp-resource-manager manually'; highlight: [1,2,3,4,5,6]">docker run -d --name pulp-resource-manager \
-v /dev/log:/dev/log \
-e PULP_SERVER_NAME=pulp.example.com \
-e SERVICE_HOST=10.245.2.2 \
markllama/pulp-resource-manager
0e8cbc4606cf8894f8be515709c8cd6a23f37b3a58fd84fecf0d8fca46c64eed
docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
0e8cbc4606cf markllama/pulp-resource-manager:latest "/run.sh" 9 minutes ago Up 9 minutes pulp-resource-manager
</pre>
<br />
Once it's running I can check the logs to verify that everything has started as needed and that the primary process has been executed at the end.<br />
<br />
<pre class="brush: bash ; title: 'Show the pulp-resource-manager logs' ; highlight: 1">docker logs pulp-resource-manager
+ '[' '!' -x /configure_pulp_server.sh ']'
+ . /configure_pulp_server.sh
++ set -x
++ PULP_SERVER_CONF=/etc/pulp/server.conf
++ export PULP_SERVER_CONF
++ PULP_SERVER_NAME=pulp.example.com
++ export PULP_SERVER_NAME
++ SERVICE_HOST=10.245.2.2
++ export SERVICE_HOST
++ DB_SERVICE_HOST=10.245.2.2
++ DB_SERVICE_PORT=27017
++ export DB_SERVICE_HOST DB_SERVICE_PORT
++ MSG_SERVICE_HOST=10.245.2.2
++ MSG_SERVICE_PORT=5672
++ MSG_SERVICE_USER=guest
++ export MSG_SERVICE_HOST MSG_SERVICE_PORT MSG_SERVICE_NAME
++ check_config_target
++ '[' '!' -f /etc/pulp/server.conf ']'
++ configure_server_name
++ augtool -s set '/files/etc/pulp/server.conf/target[. = '\''server'\'']/server_name' pulp.example.com
Saved 1 file(s)
++ configure_database
++ augtool -s set '/files/etc/pulp/server.conf/target[. = '\''database'\'']/seeds' 10.245.2.2:27017
Saved 1 file(s)
++ configure_messaging
++ augtool -s set '/files/etc/pulp/server.conf/target[. = '\''messaging'\'']/url' tcp://10.245.2.2:5672
Saved 1 file(s)
++ augtool -s set '/files/etc/pulp/server.conf/target[. = '\''tasks'\'']/broker_url' qpid://guest@10.245.2.2:5672
Saved 1 file(s)
+ '[' '!' -x /test_db_available.py ']'
+ wait_for_database
+ DB_TEST_TRIES=12
+ DB_TEST_POLLRATE=5
+ TRY=0
+ '[' 0 -lt 12 ']'
+ /test_db_available.py
Testing connection to MongoDB on 10.245.2.2, 27017
+ '[' 0 -ge 12 ']'
+ start_resource_manager
+ exec runuser apache -s /bin/bash -c '/usr/bin/celery worker -c 1 -n resource_manager@pulp.example.com --events --app=pulp.server.async.app --umask=18 --loglevel=INFO -Q resource_manager --logfile=/var/log/pulp/resource_manager.log'
</pre>
<br />
If it fails to start, especially with "file not found" or "no access" errors, check the /dev/log volume mount and the SERVICE_HOST value.<br />
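<br />
Two quick checks for that case (a sketch; <span style="font-family: Courier New, Courier, monospace;">--format</span> just pulls a single field out of the inspect output):<br />
<br />
<pre class="brush: bash ; title: 'Sanity checks for a failed start (sketch)' ; highlight: [2,4]"># Did the /dev/log bind mount make it into the container?
docker inspect --format '{{.Volumes}}' pulp-resource-manager
# Were the environment variables set the way I expected?
docker inspect --format '{{.Config.Env}}' pulp-resource-manager
</pre>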
<br />
I also want to check that the QPID queues have been created.<br />
<br />
<br />
<pre class="brush: bash ; title: '' ; highlight: [1,11,12,13]">qpid-config queues -b guest@10.245.2.4
Queue Name Attributes
======================================================================
04f58686-35a6-49ca-b98e-376371cfaaf7:1.0 auto-del excl
06fa019e-a419-46af-a555-a820dd86e66b:1.0 auto-del excl
06fa019e-a419-46af-a555-a820dd86e66b:2.0 auto-del excl
0c72a9c9-e1bf-4515-ba4b-0d0f86e9d30a:1.0 auto-del excl
celeryev.ed1a92fd-7ad0-4ab1-935f-6bc6a215f7d3 auto-del --limit-policy=ring --argument passive=False --argument exclusive=False --argument arguments={}
e70d72aa-7b9a-4083-a88a-f9cc3c568e5c:0.0 auto-del excl
e7e53097-ae06-47ca-87d7-808f7042d173:1.0 auto-del excl
resource_manager --durable --argument passive=False --argument exclusive=False --argument arguments=None
resource_manager@pulp.example.com.celery.pidbox auto-del --limit-policy=ring --argument passive=False --argument exclusive=False --argument arguments=None
resource_manager@pulp.example.com.dq --durable auto-del --argument passive=False --argument exclusive=False --argument arguments=None
</pre>
<br />
Line 8 looks like the Celery Beat service queue and lines 11, 12, and 13 are clearly associated with the resource manager. So far, so good.<br />
<br /></div>
<h3>
Testing in Kubernetes</h3>
<div>
<br /></div>
<div>
I had to reset the database between starts to test the Pulp Beat container. This image doesn't change the database structure, so I don't need to reset. I can just create a new pod definition and try it out.<br />
<br />
Again, the differences from the Pulp Beat pod definition are pretty trivial.<br />
<br />
<script src="https://gist.github.com/markllama/648671cd28e60e0b88e3.js"></script>
</div>
So here's what it looks like when I start the pod:<br />
<br />
<pre class="brush:bash ; title: '' ; highlight: [1,7,15]">kubecfg -c pods/pulp-resource-manager.json create pods
I0930 00:00:24.581712 16159 request.go:292] Waiting for completion of /operations/14
ID Image(s) Host Labels Status
---------- ---------- ---------- ---------- ----------
pulp-resource-manager markllama/pulp-resource-manager / name=pulp-resource-manager Waiting
kubecfg list pods
ID Image(s) Host Labels Status
---------- ---------- ---------- ---------- ----------
pulpdb markllama/mongodb 10.245.2.2/10.245.2.2 name=db Running
pulpmsg markllama/qpid 10.245.2.2/10.245.2.2 name=msg Running
pulp-beat markllama/pulp-beat 10.245.2.4/10.245.2.4 name=pulp-beat Terminated
pulp-resource-manager markllama/pulp-resource-manager 10.245.2.4/10.245.2.4 name=pulp-resource-manager Terminated
kubecfg get pods/pulp-resource-manager
ID Image(s) Host Labels Status
---------- ---------- ---------- ---------- ----------
pulp-resource-manager markllama/pulp-resource-manager 10.245.2.4/10.245.2.4 name=pulp-resource-manager Running
</pre>
There are two things of note here. Line 13 shows the pulp-resource-manager pod as <u>terminated</u>. Remember the bug note from above. The pod isn't actually terminated; it's in the window between the pause container downloading the image and the new container starting to execute.<br />
<br />
On line 15 I requested the information for that pod by name using the <i>get</i> command, rather than listing them all. This time it shows <u>running</u>, as it should.<br />
<br />
When you use <i>get</i>, all you get by default is a one-line summary. If you want details you have to consume them as JSON, but they're complete. In fact they use the same schema as the JSON used to create the pods in the first place (with a bit more detail filled in). While this can be hard for humans to swallow, it makes it AWESOME to write programs and scripts to process the output. Every command should offer some form of structured data output. Meanwhile, I wish Kubernetes would offer a --verbose option with nicely formatted plaintext. It will come (or I'll write it if I get frustrated enough).<br />
<br />
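Pulling a single field back out is a one-liner. This is just a sketch; the field names are the ones you'll see in the dump below.<br />
<br />
<pre class="brush: bash ; title: 'Picking a field out of the pod JSON (sketch)' ; highlight: [1]">kubecfg --json get pods/pulp-resource-manager | python -m json.tool | grep '"status"'
</pre>
<br />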
Get ready... Here it comes.<br />
<br />
<pre class="brush: jscript ; title: '' ; highlight: 1">kubecfg --json get pods/pulp-resource-manager | python -m json.tool
{
"apiVersion": "v1beta1",
"creationTimestamp": "2014-09-30T00:00:24Z",
"currentState": {
"host": "10.245.2.4",
"hostIP": "10.245.2.4",
"info": {
"net": {
"detailInfo": {
"Args": null,
"Config": null,
"Created": "0001-01-01T00:00:00Z",
"Driver": "",
"HostConfig": null,
"HostnamePath": "",
"HostsPath": "",
"ID": "",
"Image": "",
"Name": "",
"NetworkSettings": null,
"Path": "",
"ResolvConfPath": "",
"State": {
"ExitCode": 0,
"FinishedAt": "0001-01-01T00:00:00Z",
"Paused": false,
"Pid": 0,
"Running": false,
"StartedAt": "0001-01-01T00:00:00Z"
},
"SysInitPath": "",
"Volumes": null,
"VolumesRW": null
},
"restartCount": 0,
"state": {
"running": {}
}
},
"pulp-resource-manager": {
"detailInfo": {
"Args": null,
"Config": null,
"Created": "0001-01-01T00:00:00Z",
"Driver": "",
"HostConfig": null,
"HostnamePath": "",
"HostsPath": "",
"ID": "",
"Image": "",
"Name": "",
"NetworkSettings": null,
"Path": "",
"ResolvConfPath": "",
"State": {
"ExitCode": 0,
"FinishedAt": "0001-01-01T00:00:00Z",
"Paused": false,
"Pid": 0,
"Running": false,
"StartedAt": "0001-01-01T00:00:00Z"
},
"SysInitPath": "",
"Volumes": null,
"VolumesRW": null
},
"restartCount": 0,
"state": {
"running": {}
}
}
},
"manifest": {
"containers": null,
"id": "",
"restartPolicy": {},
"version": "",
"volumes": null
},
"podIP": "10.244.3.4",
"status": "Running"
},
"desiredState": {
"host": "10.245.2.4",
"manifest": {
"containers": [
{
"env": [
{
"key": "PULP_SERVER_NAME",
"name": "PULP_SERVER_NAME",
"value": "pulp.example.com"
}
],
"image": "markllama/pulp-resource-manager",
"name": "pulp-resource-manager",
"volumeMounts": [
{
"mountPath": "/dev/log",
"name": "devlog",
"path": "/dev/log"
}
]
}
],
"id": "pulp-resource-manager",
"restartPolicy": {
"always": {}
},
"uuid": "c73a89c0-4834-11e4-aba7-0800279696e1",
"version": "v1beta1",
"volumes": [
{
"name": "devlog",
"source": {
"emptyDir": null,
"hostDir": {
"path": "/dev/log"
}
}
}
]
},
"status": "Running"
},
"id": "pulp-resource-manager",
"kind": "Pod",
"labels": {
"name": "pulp-resource-manager"
},
"resourceVersion": 20,
"selfLink": "/api/v1beta1/pods/pulp-resource-manager"
}
</pre>
<br />
So there you go.<br />
<br />
I won't repeat the QPID queue check here because if everything's going well it looks the same.<br />
<br />
<h2>
Summary</h2>
<div>
<br /></div>
<div>
As designed there isn't really much to say. The only real changes were to remove the DB setup and change the exec line to start the resource manager process. That's the idea of cookie cutters.<br />
<br />
The next one won't be as simple. It uses the Pulp software package, but it doesn't run a Celery service. Instead it runs an Apache daemon and a WSGI web service to offer the Pulp Admin REST protocol. It connects to the database and the messaging service. It also needs SSL and a pair of external public TCP connections.</div>
<br />
<h2>
References</h2>
<br />
<ul>
<li><a href="http://www.docker.com/">Docker</a><br />Containerized Applications</li>
<li><a href="https://github.com/GoogleCloudPlatform/kubernetes">Kubernetes</a><br />Orchestration for Docker applications</li>
<li><a href="http://pulpproject.org/">Pulp</a><br />Enterprise OS and configuration content management</li>
<li><a href="http://celeryproject.com/">Celery</a><br />A distributed job management framework</li>
<li><a href="http://qpid.apache.org/">QPID</a><br />AMQP Message service</li>
<li><a href="http://mongodb.org/">MongoDB</a><br />NoSQL Database</li>
</ul>
</div>
markllamahttp://www.blogger.com/profile/14193184544557876514noreply@blogger.com0tag:blogger.com,1999:blog-5022186007695457923.post-6025456341423314022014-09-26T09:09:00.001-07:002014-10-20T19:31:04.742-07:00Docker: Building and using a base image for Pulp services in KubernetesMy stated goal in this series of posts is to create a working containerized <a href="http://www.pulpproject.org/">Pulp</a> service running in a <a href="https://github.com/GoogleCloudPlatform/kubernetes">Kubernetes</a> cluster. After, what is it, 5 posts, I'm finally actually ready to do something with pulp itself.<br />
<br />
The Pulp service proper is made up of a single <a href="http://www.celeryproject.org/">Celery</a> beat process, a single resource manager process, and some number of pulp worker processes. These together do the work of Pulp, mirroring and managing the content that is Pulp's payload. The service also requires at least one Apache HTTP server to deliver the payload but that comes later.<br />
<br />
All of the Pulp processes are actually built on Celery. They all require the same set of packages and much of the same configuration information. They all need to use the <a href="http://www.mongodb.org/">MongoDB</a> and <a href="https://qpid.apache.org/">QPID</a> services. The worker processes all need access to some shared storage, but the beat and resource manager do not.<br />
<br />
To build the <a href="http://www.docker.com/">Docker</a> images for these different containers, rather than duplicating the common parts, the best practice is to put those parts into a <i>base image</i> and then add one last layer to create each of the variations.<br />
<br />
In this post I'll demonstrate creating a shared base image for Pulp services and then I'll create the first image that will consume the base to create the Pulp beat service.<br />
<br />
The real trick is to figure out what the common parts are. Some are easy though so I'll start there.<br />
<br />
<h2>
Creating a Base Image</h2>
<div>
<br /></div>
<div>
For those of you who are coders, a base image is a little like an abstract class. It defines some important characteristics that are meant to be re-used, but it leaves others to be resolved later. The Docker community already provides a set of base images like the Fedora:20 image which have been hand-crafted to provide a minimal OS. Docker makes it easy to use the same mechanism for building our own images.</div>
<div>
<br /></div>
<div>
The list below enumerates the things that all of the Pulp service images will share. When I create the final images I'll add the final tweaks. Some of these will essentially be stubs to be used later.</div>
<div>
<br /></div>
<ul>
<li>Pulp Repo file<br />Pulp is not yet standard in the RHEL, CentOS or Fedora distributions</li>
<li>Pulp Server software</li>
<li>Communications Software (MongoDB and QPID client libraries)</li>
<li>Configuration tools: <a href="http://augeas.net/">Augeas</a></li>
</ul>
<br />
<div>
There is also some configuration scripting that will be required by all the pulp service containers:</div>
<div>
<br /></div>
<div>
<ul>
<li>A script to apply the customization/configuration for the execution environment</li>
<li>A test script to ensure that the database is available before starting the celery services</li>
<li>A test script to ensure that the message service is available</li>
</ul>
</div>
Given that start, here's what I get for the Dockerfile:<br />
<br />
<script src="https://gist.github.com/markllama/8ffb6e5148f6f766d22b.js"></script>
<br />
Lines 1 and 2 should be familiar already. There are no new directives here but a couple of things need explaining.<br />
<br />
<br />
<ul>
<li>Line 1: The base image</li>
<li>Line 2: Contact information</li>
<li>Line 4: A usage comment<br />Pulp uses <i>syslog</i>. For a process inside a container to write to syslog you either have to have a syslogd running or you have to have write access to the host's <span style="font-family: Courier New, Courier, monospace;">/dev/log</span> file. I'll show how this gets done when I create a real app image from this base and run it.</li>
<li>Line 6: Create a yum repo for the Pulp package content.<br />You can add files using a URL for the source.</li>
<li>Lines 9-12: Install the Pulp packages, QPID client software and Augeas to help configuration.</li>
<li>Lines 15-17: COMMENTED: Install and connect the Docker content plugin<br />This is commented out at the moment. It hasn't been packaged yet and there are some issues with dependency resolution. I left it here to remind me to put it back when the problems are resolved.</li>
<li>Line 20: Add an Augeas lens definition to manage the Pulp <span style="font-family: Courier New, Courier, monospace;">server.conf</span> file<br />Augeas is well suited for managing config values when a lens exists. More detail below.</li>
<li>Line 23: Add a script to execute the configuration<br />This will be used by the derived images, but it works the same for all of them</li>
<li>Line 27: Add a script which can test for access to the MongoDB<br />Pulp will just blindly try to connect, but will just hang if the DB is unavailable. This script allows me to decide to wait or quit if the database isn't ready. If I quit, Kubernetes will re-spawn a new container to try again.</li>
</ul>
<div>
<br /></div>
<h3>
The Pulp Repo</h3>
<div>
<br /></div>
<div>
The Pulp server software is not yet in the standard Fedora or EPEL repositories. The packages are available from the contributed repositories on the Fedora project. The repo file is also there, accessible through a URL.</div>
<div>
<br /></div>
<div>
The Dockerfile <a href="https://docs.docker.com/reference/builder/#add">ADD directive</a> can take a URL as well as a local relative file path.</div>
<div>
<br /></div>
<div>
Line 4 pulls the Pulp repo file down and places it so that it can be used in the next step.</div>
<br />
<h3>
Pulp Packages (dependencies and tools)</h3>
<div>
<br /></div>
<div>
The Pulp software is most easily installed as a YUM group. I use a Dockerfile RUN directive to install the Pulp packages into the base image. This will install most of the packages needed for the service, but there are a couple of additional packages that aren't part of the package group.</div>
<div>
<br /></div>
<div>
Pulp can serve different types of repository mirrors. These are controlled by content plugins. I add the RPM plugin, python-pulp-rpm-common. I also add a couple of Python QPID libraries. However, you can't run both <i><a href="https://docs.fedoraproject.org/en-US/Fedora/14/html/Software_Management_Guide/ch05s15s03.html">groupinstall</a></i> and the normal package<i> install</i> command in the same invocation, so the additional Python QPID libraries are installed in a second command.</div>
<div>
<br /></div>
<div>
I also want to install <a href="http://augeas.net/">Augeas</a>. This is a tool that enables configuration editing using a structured API or CLI command.<br />
<br /></div>
<h3>
Augeas Lens for Pulp INI files</h3>
<br />
<a href="http://augeas.net/">Augeas</a> is an attempt to wrangle the flat file databases that make up the foundation of most *NIX application configuration. It offers a way to access individual key/value pairs within well known configuration files without resorting to tools like <i>sed</i> or <i>perl</i> and <i>regular expressions</i>. With augeas each key/value pair is assigned a path and can be queried and updated using that path. It offers both API and CLI interfaces though it's not nearly as commonly used as it should be.<br />
<br />
The downside of Augeas is that it doesn't include a description (<i><a href="http://augeas.net/docs/lenses.html">lens</a></i> in Augeas terminology) for Pulp config files. Pulp is too new. The upside is that the Pulp config files are fairly standard INI format, and it's easy to adapt the stock <a href="http://augeas.net/docs/references/lenses/files/inifile-aug.html">IniFile lens</a> for Pulp.<br />
<br />
I won't include the lens text inline here, but I <a href="https://gist.github.com/cda21dbfac1dd8605945">put it in a gist</a> if you want to look at it.<br />
<br />
The ADD directive on line 20 of the Dockerfile places the lens file in the Augeas library where it will be found automatically.<br />
<br />
<h3>
Pulp Server Configuration Script</h3>
<div>
<br />
All of the containers that use this base image will need to set a few configuration values for Pulp. These reside in <span style="font-family: Courier New, Courier, monospace;">/etc/pulp/server.conf</span><span style="font-family: inherit;"> which is an <a href="https://en.wikipedia.org/wiki/INI_file">INI formatted</a> text file. These settings indicate the identity of the pulp service itself and how the pulp processes communicate with the database and message bus.</span><br />
<span style="font-family: inherit;"><br />If you are starting a Docker container by hand you could either pass these values in as environment variables using the </span><span style="font-family: Courier New, Courier, monospace;">-e</span><span style="font-family: inherit;"> (</span><span style="font-family: Courier New, Courier, monospace;">--env</span><span style="font-family: inherit;">) option or by accepting additional positional arguments through the CMD. You'd have to establish the MongoDB and QPID services then get their IP addresses from Docker and feed the values into the Pulp server containers.</span><br />
<br />
Since Kubernetes is controlling the database and messaging pods and has the Service objects defined, it knows how to tell the Pulp containers where to find these services. It sets a few environment variables for every new container that starts after the service object is created. A new container can use these values to reach the external services it needs.<br />
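An easy way to see what actually arrives, from a shell inside any container the kubelet started after the service objects existed (a sketch; the DB_ and MSG_ prefixes come from the service objects defined earlier in this series, and the values shown are the ones from my Vagrant cluster):<br />
<br />
<pre class="brush: bash ; title: 'Kubernetes service variables inside a container (sketch)' ; highlight: [1]">env | grep -E '(DB|MSG)_SERVICE_(HOST|PORT)'
DB_SERVICE_HOST=10.245.2.2
DB_SERVICE_PORT=27017
MSG_SERVICE_HOST=10.245.2.2
MSG_SERVICE_PORT=5672
</pre>
<br />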
<br />
Line 23 of the Dockerfile adds a short shell script which can accept the values from the environment variables that Kubernetes provides and configure them into the Pulp configuration.<br />
<br />
The script gathers the set of values it needs from the variables (or sets reasonable defaults) and then, using augtool (the CLI tool for Augeas), it updates the values in the <span style="font-family: Courier New, Courier, monospace;">server.conf</span> file.<br />
<br />
This is the snippet from the beginning of the <span style="font-family: Courier New, Courier, monospace;">configure_pulp_server.sh</span> script which sets the environment variables.<br />
<br />
<pre class="brush:bash ; title: 'Pulp Environment Variables from Kubernetes Services' ; highlight: []"># Take settings from Kubernetes service environment unless they are explicitly
# provided
PULP_SERVER_CONF=${PULP_SERVER_CONF:=/etc/pulp/server.conf}
export PULP_SERVER_CONF
PULP_SERVER_NAME=${PULP_SERVER_NAME:=pulp.example.com}
export PULP_SERVER_NAME
SERVICE_HOST=${SERVICE_HOST:=127.0.0.1}
export SERVICE_HOST
DB_SERVICE_HOST=${DB_SERVICE_HOST:=${SERVICE_HOST}}
DB_SERVICE_PORT=${DB_SERVICE_PORT:=27017}
export DB_SERVICE_HOST DB_SERVICE_PORT
MSG_SERVICE_HOST=${MSG_SERVICE_HOST:=${SERVICE_HOST}}
MSG_SERVICE_PORT=${MSG_SERVICE_PORT:=5672}
MSG_SERVICE_USER=${MSG_SERVICE_USER:=guest}
export MSG_SERVICE_HOST MSG_SERVICE_PORT MSG_SERVICE_NAME
</pre>
<br />
These are the values that the rest of the script will set into /etc/pulp/server.conf<br />
<br />
<span style="background-color: #999999; color: white;"><b>UPDATE:</b> As of the middle of October 2014 the SERVICE_HOST variable has been removed. Now each service gets its own IP address, so the generic SERVICE_HOST variable no longer makes sense. Each service variable must be provided explicitly when testing. Also, for testing the master host will provide a proxy to the service. However, as of this update the mechanism isn't working yet. I'll update this post when is working properly. If you are building from git source you can use a commit prior to 10/14/2014 and you can still use SERVICE_HOST test against the minions.</span><br />
<span style="background-color: #999999; color: white;"><br /></span></div>
<h3>
Container Startup and Remote Service Availability</h3>
<div>
<br /></div>
<div>
When the Pulp service starts up it will attempt to connect to a MongoDB and to a QPID message broker. If the database isn't ready, the Pulp service may just hang.<br />
<br />
Using Kubernetes it's best not to assume that the containers will arrive in any particular order. If the database service is unavailable, the pulp containers should just die. Kubernetes will notice and attempt to restart them periodically. When the database service is available, the next client container will connect successfully and... not die.<br />
<br />
I have added a check script to the base container which can be used to test the availability (and the correct access information) for the MongoDB. It also uses the environment variables provided by Kubernetes when the container starts.<br />
<br />
This script merely returns a shell <i>true</i> (return value: 0) if the database is available and <i>false</i> (return value: 1) if it fails to connect. This allows the startup script for the actual pulp service containers to check before attempting to start the pulp process and to cleanly report an error if the database is unavailable before exiting.<br />
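The calling pattern in the run scripts is a simple retry loop. This sketch is reconstructed from the startup logs later in this post rather than copied from the script, but the try count and poll rate match the values you'll see there:<br />
<br />
<pre class="brush: bash ; title: 'wait_for_database (sketch)' ; highlight: []">wait_for_database() {
    DB_TEST_TRIES=12
    DB_TEST_POLLRATE=5
    TRY=0
    while [ $TRY -lt $DB_TEST_TRIES ] ; do
        # test_db_available.py exits 0 as soon as MongoDB answers
        /test_db_available.py && return 0
        sleep $DB_TEST_POLLRATE
        TRY=$(($TRY + 1))
    done
    # Give up cleanly; Kubernetes will re-spawn the container and try again
    echo "ERROR: no connection to MongoDB after $DB_TEST_TRIES tries" >&2
    exit 1
}
</pre>
<br />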
<br />
I haven't included a script to test the QPID connectivity. So far I haven't seen a pulp service fail to start because the QPID service was unavailable when the client container starts.<br />
<br />
<h3>
Scripts are not executed in the base image</h3>
</div>
<div>
<br /></div>
<div>
The scripts listed above are provided in the base image, but the base image has no ENTRYPOINT or CMD directives. It is not meant to be run on its own.</div>
<div>
<br /></div>
<div>
Each of the Pulp service images that uses this base will need to have a run script which will call these common scripts to set up the container environment before invoking the Pulp service processes. That's next.</div>
<div>
<br /></div>
<h2>
Using a Base Image: The Pulp-Beat Component</h2>
<div>
<br /></div>
<div>
The Pulp service is based on Celery. Celery is a framework for creating distributed task-based services. You extend the Celery framework to add the specific tasks that your application needs.</div>
<div>
<br /></div>
<div>
The task management is controlled by a "beat" process. Each Celery based service has to have exactly one beat server which is derived from the Celery scheduler class.</div>
<div>
<br /></div>
<div>
The beat server is a convenient place to do some of the service setup. Since there can only be one beat server and because it must be created first, I can use the beat service container startup to initialize the database.</div>
<div>
<br /></div>
<div>
The Docker development best-practices encourage image composition by layering. Creating a new layer means creating a new build space with a Dockerfile and any files that will be pulled in when the image is built.<br />
<br />
In the case of the pulp-base image all of the content is there. The customizations for the pulp-beat service are just the run script which configures and initializes the service before starting. The Dockerfile is trivially simple:<br />
<br />
<script src="https://gist.github.com/markllama/4d895bca1d7b11ed2170.js"></script>
</div>
<br />
The real meat is in the run script, though even that is pretty anemic:<br />
<br />
<script src="https://gist.github.com/markllama/509b61f1acbb4111f94f.js"></script>
The main section starts at line 44 and it's really just four steps. Two are defined in the base image scripts and two more are defined here.<br />
<br />
<ol>
<li>Apply the configuration customizations from the environment<br />These include setting the PULP_SERVER_NAME and the access parameters for the MongoDB and QPID services</li>
<li>Verify that the MongoDB is up and accessible<br />With Kubernetes you can't be dependent on ordering of the pod startups. This check allows some time for the DB to start and become available. Kubernetes will restart the beat pod if this fails, but the checks here prevent some thrashing.</li>
<li>Initialize the MongoDB<br />This should only happen once. Within a pulp service the beat server is a singleton. I put the initialization step here so that it won't be confused later.</li>
<li>Execute the master process<br />This is a celery beat process customized with the Pulp master object</li>
</ol>
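<div>
<br />
Boiled down, the main section amounts to little more than this (a sketch, not the verbatim script; the function names are the ones you'll see echoed in the startup log below):<br />
<br />
<pre class="brush: bash ; title: 'run.sh main section (sketch)' ; highlight: []">. /configure_pulp_server.sh   # 1. apply configuration from the environment
wait_for_database             # 2. poll until MongoDB answers (or give up)
initialize_database           # 3. one-time pulp-manage-db run - beat only
run_celerybeat                # 4. exec the celery beat process
</pre>
</div>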
<div>
Even though the script line for each operation is fairly trivial I still put them into their own functions. This makes it easier for a reader to understand the logical progression and intent before going back to the function and examining the details. It also makes it easier to comment out a single function for testing and debugging.</div>
<div>
<br /></div>
<h2>
Testing the Beat Image (stand-alone)</h2>
<div>
<br /></div>
<div>
Since Kubernetes currently gives so little access to debug information for the container startup process, I'm going to test the Pulp beat container first as a regular Docker container. I have my Kubernetes cluster running in Vagrant and I know the IP addresses of the MongoDB and QPID services.<br />
<br />
The other reason to test in plain Docker is that I want to manually verify the code which picks up and uses the configuration environment variables. There are four variables that will be required and two others that will likely default.<br />
<ul>
<li>PULP_SERVER_NAME</li>
<li>SERVICE_HOST</li>
<li>DB_SERVICE_HOST</li>
<li>MSG_SERVICE_HOST</li>
</ul>
<div>
The defaulted ones will be</div>
<div>
<ul>
<li>DB_SERVICE_PORT</li>
<li>MSG_SERVICE_PORT</li>
</ul>
<div>
DB_SERVICE_HOST and MSG_SERVICE_HOST can be provided directly or can pick up the value of SERVICE_HOST. I want to test both paths.</div>
</div>
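<div>
<br />
The explicit form of that second path would look something like this sketch (same image and addresses as the SERVICE_HOST run shown below, just with the per-service variables spelled out):<br />
<br />
<pre class="brush: bash ; title: 'Explicit DB and MSG service variables (sketch)' ; highlight: []">docker run -d --name pulp-beat -v /dev/log:/dev/log \
    -e PULP_SERVER_NAME=pulp.example.com \
    -e DB_SERVICE_HOST=10.245.2.2 \
    -e MSG_SERVICE_HOST=10.245.2.2 \
    markllama/pulp-beat
</pre>
</div>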
<div>
<br /></div>
<div>
To test this I'm going to be running the Kubernetes Vagrant cluster on VirtualBox to provide the MongoDB and QPID servers. Then I'll run the Pulp beat server in Docker on the host. I know how to tell the beat server how to reach the services in the Kubernetes cluster (on 10.245.2.{2-4}).</div>
<div>
<br /></div>
<div>
I'm going to assume that both the pulp-base and pulp-beat images are already built. I'm also going to start the container the first time using /bin/sh so I can manually start the run script and observe what it does.</div>
<br />
<pre class="brush: bash ; title: 'Start the pulp-beat container manually' ; highlight: 1">docker run -d --name pulp-beat -v /dev/log:/dev/log \
> -e PULP_SERVER_NAME=pulp.example.com \
> -e SERVICE_HOST=10.245.2.2 markllama/pulp-beat
f16a6f2278e20e0b039cb665bc5f55de39b13a1045f00e25cdab5219652f1d80
</pre>
<br />
This starts the container as a daemon and mounts /dev/log so that syslog will work. It also sets the PULP_SERVER_NAME and SERVICE_HOST variables.
<br />
<br />
<pre class="brush:bash ; title: 'Docker logs for pulp-beat startup' ; highlight: 1">docker logs pulp-beat
+ '[' '!' -x /configure_pulp_server.sh ']'
+ . /configure_pulp_server.sh
++ set -x
++ PULP_SERVER_CONF=/etc/pulp/server.conf
++ export PULP_SERVER_CONF
++ PULP_SERVER_NAME=pulp.example.com
++ export PULP_SERVER_NAME
++ SERVICE_HOST=10.245.2.2
++ export SERVICE_HOST
++ DB_SERVICE_HOST=10.245.2.2
++ DB_SERVICE_PORT=27017
++ export DB_SERVICE_HOST DB_SERVICE_PORT
++ MSG_SERVICE_HOST=10.245.2.2
++ MSG_SERVICE_PORT=5672
++ MSG_SERVICE_USER=guest
++ export MSG_SERVICE_HOST MSG_SERVICE_PORT MSG_SERVICE_NAME
++ check_config_target
++ '[' '!' -f /etc/pulp/server.conf ']'
++ configure_server_name
++ augtool -s set '/files/etc/pulp/server.conf/target[. = '\''server'\'']/server_name' pulp.example.com
Saved 1 file(s)
++ configure_database
++ augtool -s set '/files/etc/pulp/server.conf/target[. = '\''database'\'']/seeds' 10.245.2.2:27017
Saved 1 file(s)
++ configure_messaging
++ augtool -s set '/files/etc/pulp/server.conf/target[. = '\''messaging'\'']/url' tcp://10.245.2.2:5672
Saved 1 file(s)
++ augtool -s set '/files/etc/pulp/server.conf/target[. = '\''tasks'\'']/broker_url' qpid://guest@10.245.2.2:5672
Saved 1 file(s)
+ '[' '!' -x /test_db_available.py ']'
+ wait_for_database
+ DB_TEST_TRIES=12
+ DB_TEST_POLLRATE=5
+ TRY=0
+ '[' 0 -lt 12 ']'
+ /test_db_available.py
Testing connection to MongoDB on 10.245.2.2, 27017
+ '[' 0 -ge 12 ']'
+ initialize_database
+ runuser apache -s /bin/bash /bin/bash -c /usr/bin/pulp-manage-db
Loading content types.
Content types loaded.
Ensuring the admin role and user are in place.
Admin role and user are in place.
Beginning database migrations.
Applying pulp.server.db.migrations version 1
Migration to pulp.server.db.migrations version 1 complete.
...
Applying pulp_rpm.plugins.migrations version 16
Migration to pulp_rpm.plugins.migrations version 16 complete.
Database migrations complete.
+ run_celerybeat
+ exec runuser apache -s /bin/bash -c '/usr/bin/celery beat --workdir=/var/lib/pulp/celery --scheduler=pulp.server.async.scheduler.Scheduler -f /var/log/pulp/celerybeat.log -l INFO'
</pre>
<br />
This shows why I set the -x at the beginning of the run script. It causes the shell to emit each line as it is executed. You can see the environment variables as they are set. Then they are used to configure the pulp server.conf values. The database is checked and then initialized. Finally it executes the celery beat process which replaces the shell and continues executing.<br />
<br />
When this script runs it should have several side effects that I can check. As noted, it creates and initializes the pulp database. It also connects to the QPID server and creates several queues. I can check them in the same way I did when I created the MongoDB and QPID images in the first place.<br />
<br />
The database has been initialized<br />
<br />
<pre class="brush: bash ; title: 'Check DB presence' ; highlight: 1">echo show dbs | mongo 10.245.2.2
MongoDB shell version: 2.4.6
connecting to: 10.245.2.2/test
local 0.03125GB
pulp_database 0.03125GB
bye
</pre>
<br />
And the celery beat service has added a few queues to the QPID service<br />
<br />
<pre class="brush: bash ; title: 'Check QPID Queues' ; highlight: 1">qpid-config queues -b guest@10.245.2.4
Queue Name Attributes
======================================================================
0b78268e-256f-4832-bbcc-50c7777a8908:1.0 auto-del excl
411cc98f-eed3-45f9-b455-8d2e5d333262:0.0 auto-del excl
aaf61614-919e-49ea-843f-d83420e9232f:1.0 auto-del excl
celeryev.de500902-4c88-4d5c-90f4-1b4db366613d auto-del --limit-policy=ring --argument passive=False --argument exclusive=False --argument arguments={}
</pre>
<br />
<h3>
But what if I do it wrong?</h3>
You can see that the output from a correct startup is pretty lengthy. When I'm happy that the image is stable I'll remove the shell -x setting (and make it either an argument or environment switch for later). There are several other paths to test.<br />
<br />
<br />
<ol>
<li>Fail to provide Environment Variables</li>
<ol>
<li>PULP_SERVER_NAME</li>
<li>SERVICE_HOST</li>
<li>DB_SERVICE_HOST</li>
<li>MSG_SERVICE_HOST</li>
</ol>
<li>Fail to import /dev/log volume</li>
</ol>
<div>
Each of these will have slightly different failure modes. I suggest you try each of them and observe how it fails. Think of others; I'm sure I've missed some.</div>
<div>
<br /></div>
<div>
For the purposes of this post I'm going to treat these as exercises for the reader and move on.<br />
<br /></div>
</div>
<h2>
Testing the Beat Image (Kubernetes)</h2>
<div>
<br /></div>
<div>
Now things get interesting. I have to craft a Kubernetes pod description that creates the pulp-beat container, gives it access to logging and connects it to the database and messaging services.<br />
<br />
<h3>
Defining the Pulp Beat pod</h3>
<div>
<br /></div>
Because of the way I crafted the base image and run scripts, this isn't actually as difficult or as complicated as you might think. It turns out that the only environment variable I have to actually pass in is the PULP_SERVER_NAME. The rest of the environment values are going to be provided by the kubelet as defined by the Kubernetes service objects (and served by the MongoDB and QPID containers behind them).<br />
<div>
<br /></div>
<br />
<br /></div>
The only really significant thing here is the volume imports.<br />
<br />
Pulp uses the python logging mechanism and that in turn by default requires the <i>syslog</i> service. On Fedora 20, syslog is no longer a separate process. It's been absorbed into the systemd suite of low level services and is known now as <i>journald</i>. (cat flamewars/systemd/{pro,con} >/dev/null).<br />
<br />
For me this means that for Pulp to run properly it needs the ability to write syslog messages. In Fedora 20 this amounts to being able to write to a special file <span style="font-family: Courier New, Courier, monospace;">/dev/log</span>. This file isn't available in containers without some special magic. For Docker that magic is <span style="font-family: Courier New, Courier, monospace;">-v /dev/log:/dev/log</span>. This imports the host's <span style="font-family: Courier New, Courier, monospace;">/dev/log</span> into the container at the same location. For Kubernetes this is a little bit more involved.<br />
<br />
The Kubernetes pod construct has some interesting side-effects. The purpose of pods is to allow the creation of sets of containers that share resources. The JSON reflects this in how the shared resources are declared.<br />
<br />
In the pod spec, <a href="https://gist.github.com/markllama/81c4333b298522ce6507#file-kubernetes_pulp_beat_pod-L14-L17">lines 14-17</a> are inside the container hash for the container named <i>pulp-beat</i>. They indicate that a volume named "devlog" (line 15) will be mounted read/write (line 16) on <span style="font-family: Courier New, Courier, monospace;">/dev/log</span> inside the container (line 17).<br />
<br />
Note that this section does not define the named volume or indicate where it will come from. That's defined at the pod level not the container.<br />
<br />
Now look at <a href="https://gist.github.com/markllama/81c4333b298522ce6507#file-kubernetes_pulp_beat_pod-L20-L23">lines 20-23</a>. These are at the pod level (the list of containers has been closed on line 19). The <i>volumes</i> array contains a set of volume definitions. I only define one, named "devlog" (line 21), and indicate that it comes from the host and that the source path is <span style="font-family: Courier New, Courier, monospace;">/dev/log</span><span style="font-family: inherit;">.</span><br />
<br />
<span style="font-family: inherit;">All that to replace the docker argument </span><span style="font-family: Courier New, Courier, monospace;">-v /dev/log:/dev/log</span><span style="font-family: inherit;">.</span><br />
<span style="font-family: inherit;"><br /></span>
<span style="font-family: inherit;">Right now this seems like a lot of work for a trivial action. Later this distinction will become very important. The final pod for Pulp will be made up of at least two containers. The pod will import two different storage locations from the host and both containers will mount them.</span><br />
<span style="font-family: inherit;"><br /></span>
One last time for clarity: the <span style="font-family: Courier New, Courier, monospace;">volumes</span><span style="font-family: inherit;"> list is at the pod level. It defines a set of external resources that will be made available to the containers in the pod. The </span><span style="font-family: Courier New, Courier, monospace;">volumeMounts</span><span style="font-family: inherit;"> list is at the container level. It maps entries from the </span><span style="font-family: Courier New, Courier, monospace;">volumes</span><span style="font-family: inherit;"> section in the pod to mount points inside the container using the value of the <i>name</i> as the connecting handle.</span><br />
<span style="font-family: inherit;"><br /></span>
<br />
<h3>
Starting the Pulp Beat Pod</h3>
<div>
<br /></div>
<div>
Starting the pulp beat pod is just like starting the MongoDB and QPID pods. At this point it does require that the Service objects have been created and that the service containers are running, so if you're following along and haven't done that, go do it. Since I'd run my pulp beat container manually and it had modified the MongoDB, I also removed the pulp_database before proceeding.</div>
<div>
<br />
<pre class="brush:bash ; title: 'clean up pulp_database' ; highlight: [1,6]">echo 'db.dropDatabase()' | mongo 10.245.2.2/pulp_database
MongoDB shell version: 2.4.6
connecting to: 10.245.2.2/pulp_database
{ "dropped" : "pulp_database", "ok" : 1 }
bye
echo show dbs | mongo 10.245.2.2
MongoDB shell version: 2.4.6
connecting to: 10.245.2.2/test
local 0.03125GB
bye
</pre>
<br />
To start the pulp beat pod we go back to kubecfg (remember, I aliased <span style="font-family: Courier New, Courier, monospace;">kubecfg=~/kubernetes/cluster/kubecfg.sh</span>).<br />
<br />
<pre class="brush:bash ; title: 'start pulp beat pod' ; highlight: [1,6]">kubecfg -c pods/pulp-beat.json create pods
ID Image(s) Host Labels Status
---------- ---------- ---------- ---------- ----------
pulp-beat markllama/pulp-beat / name=pulp-beat Waiting
kubecfg get pods/pulp-beat
ID Image(s) Host Labels Status
---------- ---------- ---------- ---------- ----------
pulp-beat markllama/pulp-beat 10.245.2.2/10.245.2.2 name=pulp-beat Waiting
</pre>
<br />
Now that I know the pod has been assigned to 10.245.2.2 (minion-1), I can log in there directly and examine the Docker container.
<br />
<br />
<pre class="brush:bash ; title: 'verify pulp beat pod' ; highlight: [1,3,5]">vagrant ssh minion-1
Last login: Fri Dec 20 18:02:34 2013 from 10.0.2.2
sudo docker ps | grep pulp-beat
2515129f2c7e markllama/pulp-beat:latest "/run.sh" 54 seconds ago Up 53 seconds k8s--pulp_-_beat.a6ba93e9--pulp_-_beat.etcd--d2a60369_-_458d_-_11e4_-_b682_-_0800279696e1--0b799f3d
sudo docker logs 2515129f2c7e
+ '[' '!' -x /configure_pulp_server.sh ']'
+ . /configure_pulp_server.sh
++ set -x
++ PULP_SERVER_CONF=/etc/pulp/server.conf
++ export PULP_SERVER_CONF
++ PULP_SERVER_NAME=pulp.example.com
++ export PULP_SERVER_NAME
++ SERVICE_HOST=10.245.2.2
++ export SERVICE_HOST
++ DB_SERVICE_HOST=10.245.2.2
++ DB_SERVICE_PORT=27017
++ export DB_SERVICE_HOST DB_SERVICE_PORT
++ MSG_SERVICE_HOST=10.245.2.2
++ MSG_SERVICE_PORT=5672
++ MSG_SERVICE_USER=guest
++ export MSG_SERVICE_HOST MSG_SERVICE_PORT MSG_SERVICE_NAME
++ check_config_target
++ '[' '!' -f /etc/pulp/server.conf ']'
++ configure_server_name
++ augtool -s set '/files/etc/pulp/server.conf/target[. = '\''server'\'']/server_name' pulp.example.com
Saved 1 file(s)
++ configure_database
++ augtool -s set '/files/etc/pulp/server.conf/target[. = '\''database'\'']/seeds' 10.245.2.2:27017
Saved 1 file(s)
++ configure_messaging
++ augtool -s set '/files/etc/pulp/server.conf/target[. = '\''messaging'\'']/url' tcp://10.245.2.2:5672
Saved 1 file(s)
++ augtool -s set '/files/etc/pulp/server.conf/target[. = '\''tasks'\'']/broker_url' qpid://guest@10.245.2.2:5672
Saved 1 file(s)
+ '[' '!' -x /test_db_available.py ']'
+ wait_for_database
+ DB_TEST_TRIES=12
+ DB_TEST_POLLRATE=5
+ TRY=0
+ '[' 0 -lt 12 ']'
+ /test_db_available.py
Testing connection to MongoDB on 10.245.2.2, 27017
+ '[' 0 -ge 12 ']'
+ initialize_database
+ runuser apache -s /bin/bash /bin/bash -c /usr/bin/pulp-manage-db
Loading content types.
Content types loaded.
Ensuring the admin role and user are in place.
Admin role and user are in place.
Beginning database migrations.
Applying pulp.server.db.migrations version 1
Migration to pulp.server.db.migrations version 1 complete.
...
Applying pulp_rpm.plugins.migrations version 16
Migration to pulp_rpm.plugins.migrations version 16 complete.
Database migrations complete.
+ run_celerybeat
+ exec runuser apache -s /bin/bash -c '/usr/bin/celery beat --workdir=/var/lib/pulp/celery --scheduler=pulp.server.async.scheduler.Scheduler -f /var/log/pulp/celerybeat.log -l INFO'
</pre>
<br />
If this is the first time running the image it may take a while for Kubernetes/Docker to pull it from the Docker Hub. There may also be an extra delay the first time through while the Kubernetes <i>pause</i> container image is pulled.
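<br />
If you want to watch the pull happen, a couple of quick checks work; this is a sketch and assumes the pod landed on minion-1 as above.<br />
<br />
<pre class="brush: bash ; title: 'watch the image pull (sketch)' ; highlight: [1,3]">vagrant ssh minion-1 -c 'sudo docker images | grep -E "pulp-beat|pause"'
# from the workstation, poll until the Status column changes from Waiting to Running
kubecfg get pods/pulp-beat
</pre>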
<br />
<br />
I can now run the same tests I did earlier on the MongoDB and QPID services to reassure myself that the pulp beat service is connected.<br />
<br />
<pre class="brush: bash ; title: 'verify pulp beat connectivity'; highlight: [1,8]">echo show dbs | mongo 10.245.2.2
MongoDB shell version: 2.4.6
connecting to: 10.245.2.2/test
local 0.03125GB
pulp_database 0.03125GB
bye
qpid-config queues -b guest@10.245.2.4
Queue Name Attributes
======================================================================
613f4b89-e63e-4230-9620-e932f5a777e5:0.0 auto-del excl
c990ea7b-3d7f-4603-80e5-176ebc649ff1:1.0 auto-del excl
celeryev.ffbc537b-1161-4049-b425-723487135fc2 auto-del --limit-policy=ring --argument passive=False --argument exclusive=False --argument arguments={}
e0155372-12ee-4c9a-9c4d-8f4863601b3a:1.0 auto-del excl
</pre>
<br />
After all that thought and planning the end result is actually kinda boring. Just the way I like it.<br />
<br />
<h2>
What's next?</h2>
<br />
The pulp-beat service is just the first real Pulp component. It runs in isolation from the other components, communicating only through the messaging and database services. There is another component like that, the <i>pulp-resource-manager</i>. This is another Celery process and it is created, started and tested just like the pulp-beat service. I'm going to do one much-shorter post on that for completeness before tackling the next level of complexity.<br />
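<br />
As a preview, the resource manager goes through the identical create-and-check cycle; the pod file name here is hypothetical since its JSON isn't shown yet.<br />
<br />
<pre class="brush: bash ; title: 'preview: pulp resource manager pod (sketch)' ; highlight: [1,2]">kubecfg -c pods/pulp-resource-manager.json create pods
kubecfg get pods/pulp-resource-manager
</pre>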
<br />
The two remaining components are the content pods, which require shared storage and which will have two cooperating containers running inside the pod. One will manage the content mirroring and the other will serve the content out to clients.<br />
<br />
I think before that though I will tackle the Pulp Admin service. This is a public facing REST service which accepts pulp admin commands to create and manage the content repositories.<br />
<br />
Both of these will require the establishment of encryption, which means placing x509 certificates within the containers. These are the upcoming challenges.</div>
<h2>
<br />
References</h2>
<div>
<div>
<ul>
<li><a href="http://www.docker.com/">Docker</a> - Containerized applications</li>
<li><a href="https://github.com/GoogleCloudPlatform/kubernetes">Kubernetes</a> - Orchestration for creating containerized services</li>
<li><a href="https://www.mongodb.org/">MongoDB</a> - A Non-relational database</li>
<li><a href="https://qpid.apache.org/">QPID</a> - an AMQP messaging service</li>
</ul>
</div>
<ul>
<li><a href="http://www.pulpproject.org/">Pulp</a> - An enterprise OS content mirroring system</li>
<li><a href="http://www.celeryproject.org/">Celery</a> - A Distributed Task Queue Framework</li>
<li><a href="http://augeas.net/">Augeas</a> - Structured queries and updates to (largely) unstructured configurations</li>
<li><a href="https://en.wikipedia.org/wiki/INI_file">INI Files</a> - A simple format for simple configurations</li>
</ul>
</div>
markllamahttp://www.blogger.com/profile/14193184544557876514noreply@blogger.com0tag:blogger.com,1999:blog-5022186007695457923.post-29919572608976120482014-09-23T13:19:00.001-07:002014-09-23T13:19:16.380-07:00Kubernetes Under The Hood: EtcdKubernetes is an effort which originated within Google to provide an orchestration layer above Docker containers. Docker operation is limited to actions on a single host. Kubernetes attempts to provide a mechanism to manage large sets of containers on a cluster of container hosts. Above that will eventually be job management services like Mesos or Aurora.<br />
<br />
<h2>
Anatomy of Kubernetes Cluster</h2>
<div>
<br /></div>
A Kubernetes cluster is made up of three major active components:<br />
<br />
<ol>
<li>Kubernetes <i>app-service</i></li>
<li>Kubernetes <i>kubelet</i> agent</li>
<li><i>etcd</i> distributed key/value database</li>
</ol>
<div>
<br /></div>
<div>
The <i>app-service</i> is the front end of the Kubernetes cluster. It accepts requests from clients to create and manage containers, services and replication controllers within the cluster. This is the control interface of Kubernetes.</div>
<div>
<br /></div>
<div>
The <i>kubelet</i> is the active agent. It resides on a Kubernetes cluster member host. It polls for instructions or state changes and acts to execute them on the host.</div>
<div>
<br /></div>
<div>
The <i>etcd</i> services are the communications bus for the Kubernetes cluster. The app-service posts cluster state changes to the etcd database in response to commands and queries. The kubelets read the contents of the etcd database and act on any changes they detect.<br />
<br />
There's also a <i>kube-proxy</i> process which does the Service network proxy work but that's not relevant to the larger operations.</div>
<div>
<br /></div>
<div>
This post is going to describe and play with the etcd.</div>
<div>
<br /></div>
<h2>
OK, so what is Etcd?</h2>
<div>
<br /></div>
<div>
<a href="https://coreos.com/using-coreos/etcd/">Etcd (or etcd) is a service</a> created by the <a href="https://coreos.com/">CoreOS</a> team to create a shared distributed configuration database. It's a replicated key/value store. The data are accessed using ordinary HTTP(S) GET and PUT queries. The status, metadata and payload are returned as members of a JSON data structure.</div>
<div>
<br />
Etcd has a companion CLI client for testing and manual interaction. This is called <i>etcdctl</i>. Etcdctl is merely a wrapper that hides the HTTP interactions and the raw JSON that is used as status and payload.<br />
<br /></div>
<h2>
Installing and Running Etcd</h2>
<div>
<br />
Etcd (and etcdctl, the CLI client) aren't yet available in RPM format from the standard repositories, or if they are they're very old. If you're running on 64 bit Linux you can pull the most recent binaries from<a href="https://github.com/coreos/etcd/releases"> the Github repository for CoreOS</a>. Download them, unpack the tar.gz file and place the binaries in your path.<br />
<br />
<br />
<pre class="brush: bash ; title: 'download and unpack etcd binaries' ; highlight: [1,7]">curl -s -L https://github.com/coreos/etcd/releases/download/v0.4.6/etcd-v0.4.6-linux-amd64.tar.gz | tar -xzvf -
etcd-v0.4.6-linux-amd64/
etcd-v0.4.6-linux-amd64/etcd
etcd-v0.4.6-linux-amd64/etcdctl
etcd-v0.4.6-linux-amd64/README-etcd.md
etcd-v0.4.6-linux-amd64/README-etcdctl.md
cd etcd-v0.4.6-linux-amd64
</pre>
<br />
<br />
Once you have the binaries, check out the <a href="https://github.com/coreos/etcd">Etcd</a> and <a href="https://github.com/coreos/etcdctl">Etcdctl</a> github pages for basic usage instructions. I'll duplicate here a little bit just to get moving.<br />
<br />
Etcd doesn't run as a traditional daemon. It remains connected to STDOUT and logs activity. I'm not going to demonstrate here how to turn it into a proper daemon. Instead I'll run it in one terminal session and use another to access it.<br />
<br />
NOTE 1: Etcd does not use standard longopts conventions. All of the options use single leading hyphens.<br />
NOTE 2: Etcdctl <u>does</u> follow the longopt conventions. Go figure.<br />
<br />
<pre class="brush:bash ; title: 'Starting an etcd' ; highlight: 1">./etcd
[etcd] Sep 23 10:36:04.655 WARNING | Using the directory myhost.etcd as the etcd curation directory because a directory was not specified.
[etcd] Sep 23 10:36:04.656 INFO | myhost is starting a new cluster
[etcd] Sep 23 10:36:04.658 INFO | etcd server [name myhost, listen on :4001, advertised url http://127.0.0.1:4001]
[etcd] Sep 23 10:36:04.658 INFO | peer server [name myhost, listen on :7001, advertised url http://127.0.0.1:7001]
[etcd] Sep 23 10:36:04.658 INFO | myhost starting in peer mode
[etcd] Sep 23 10:36:04.658 INFO | myhost: state changed from 'initialized' to 'follower'.
[etcd] Sep 23 10:36:04.658 INFO | myhost: state changed from 'follower' to 'leader'.
[etcd] Sep 23 10:36:04.658 INFO | myhost: leader changed from '' to 'myhost'.
</pre>
<br />
As you can see the daemon listens by default to the localhost interface on port 4001/TCP for client interactions and on port 7001/TCP for clustering communications. See the output of <span style="font-family: Courier New, Courier, monospace;">etcd -help</span><span style="font-family: inherit;"> for detailed options. You can also see the process whereby the new daemon attempts to connect to peers and determine its place within the cluster. Since there are no peers, this one elects itself leader.</span><br />
<br />
That output looks as if the etcd is running. I can check by querying the daemon version and some other information.<br />
<br />
<pre class="brush:bash ; title: 'Starting an etcd' ; highlight: 1">curl -s http://127.0.0.1:4001/version
etcd 0.4.6
</pre>
<br />
I can also get some stats from the daemon directly as well:<br />
<br />
<pre class="brush:bash ; title: 'Etcd stats' ; highlight: 1">curl -s -L http://127.0.0.1:4001/v2/stats/self | python -m json.tool
{
"leaderInfo": {
"leader": "myhost",
"startTime": "2014-09-23T10:37:04.839453766-04:00",
"uptime": "5h10m13.053046076s"
},
"name": "myhost",
"recvAppendRequestCnt": 0,
"sendAppendRequestCnt": 0,
"startTime": "2014-09-23T10:37:04.83945236-04:00",
"state": ""
}
</pre>
<br />
So now I know it's up and responding.<br />
<br /></div>
<div>
<div>
<h2>
Playing with Etcd</h2>
</div>
</div>
<div>
<br /></div>
<div>
Etcd responds to HTTP(S) queries both to set and retrieve data. All of the data are organized into a hierarchical key set (which for normal people means that the keys look like files in a tree of directories). The values are arbitrary strings. This makes it very easy to test and play with etcd using ordinary CLI web query tools like <i>curl</i> and <i>wget</i>. The binary releases also include a CLI client called <i>etcdctl</i> which simplifies the interaction, allowing the caller to focus on the logical operation and the result rather than the HTTP/JSON interaction. I'll show both methods where they are instructive, choosing the best one for each example.<br />
<br />
The examples here are adapted from the <a href="https://github.com/coreos/etcd#running-etcd">CoreOS examples</a> on Github. There's also a complete <a href="https://github.com/coreos/etcd/blob/master/Documentation/api.md">protocol document</a>.<br />
<br />
Once the etcd is running I can begin working with it. <br />
<br />
Etcd is a <i>hierarchical key=value store</i>. This means that each piece of stored data has a <i>key</i> which uniquely identifies it within the database. The key is <i>hierarchical</i> in that the key is composed of a set of elements that form a <i>path</i> from a fixed known starting point for the database known as the <i>root</i>. Any given element in the database can either be a branch (directory) or a leaf (value). Directories contain other keys and are used to create the hierarchy of data.<br />
<i><br /></i>
This is all formal gobbledy-gook for "it looks just like a filesystem". In fact a number of the operations that etcdctl offers are exact analogs of filesystem commands: mkdir, rmdir, ls, rm.<br />
<br />
The first operation is to look at the contents of the root of the database. Expect this to be boring because there's nothing there yet.<br />
<br />
<br />
<pre class="brush: bash ; title: 'list the root of an empty database' ; highlight: 1">./etcdctl ls /
</pre>
<br />
<br />
See? There's nothing there. Boring.<br />
<br />
It looks a little different when you pull it using <span style="font-family: Courier New, Courier, monospace;">curl.</span><br />
<br />
<pre class="brush: bash ; title: 'GET the root of the DB tree with curl' ; highlight: 1">curl -s http://127.0.0.1:4001/v2/keys/ | python -m json.tool
{
"action": "get",
"node": {
"dir": true,
"key": "/"
}
}
</pre>
<br />
The return payload is JSON. I use the python <i>json.tool</i> module to pretty print it.<br />
<br />
I can see that this is the response to a GET request. The <i>node</i> hash describes the query and result. I asked for the root key (/) and it's an (empty) directory.<br />
<br />
Life will be a little more interesting if there's some data in the database. I'll add a value and I'm going to put it well down in the hierarchy to show how the tree structure works.<br />
<br />
<pre class="brush:bash ; title: 'Set a value using etcdctl' ; highlight: 1">./etcdctl set /foo/bar/gronk "I see you"
I see you
</pre>
<br />
Now when I ask etcdctl for the contents of the root directory I at least get some output:<br />
<br />
<pre>./etcdctl ls /
/foo
</pre>
<br />
But that's much more interesting when I look using <span style="font-family: Courier New, Courier, monospace;">curl</span>.
<br />
<br />
<pre class="brush:bash ; title: 'Get the root directory using curl' ; highlight: 1">curl -s http://127.0.0.1:4001/v2/keys/ | python -m json.tool
{
"action": "get",
"node": {
"dir": true,
"key": "/",
"nodes": [
{
"createdIndex": 7,
"dir": true,
"key": "/foo",
"modifiedIndex": 7
}
]
}
}
</pre>
<br />
This looks very similar to the previous response with the addition of the <i>nodes</i> array. I can infer that this list contains the set of directories and values that the root contains. In this case it contains one other subdirectory named <span style="font-family: Courier New, Courier, monospace;">/foo</span><span style="font-family: inherit;">.</span><br />
Creating a new value is also more fun using curl:<br />
<br />
<pre class="brush:bash ; title: 'Create a new key/value pair using curl' ; highlight: 1">curl -s http://127.0.0.1:4001/v2/keys/fiddle/faddle -XPUT -d value="popcorn" | python -m json.tool
{
"action": "set",
"node": {
"createdIndex": 8,
"key": "/fiddle/faddle",
"modifiedIndex": 8,
"value": "popcorn"
}
}
</pre>
<br />
The return payload is the REST acknowledgement response to the PUT query. It looks similar to the GET query response, but not identical. The action is (appropriately enough) <i>set</i>. Only a single node is returned, not the node list you get when querying a directory and the value is provided as well. The REST protocol (and the etcdctl command) allow for a number of modifiers for queries. Two I'm going to use a lot are <i>sort</i> and <i>recursive</i>.<br />
<br />
If I want to see the complete set of nodes underneath a directory I can use <span style="font-family: Courier New, Courier, monospace;">etcdctl ls</span><span style="font-family: inherit;"> with the --<i>recursive</i> option:</span><br />
<span style="font-family: inherit;"><br /></span>
<br />
<pre class="brush:bash ; title: 'Query a directory tree recursively with etcdctl' ; highlight: 1">./etcdctl ls / --recursive
/foo
/foo/bar
/foo/bar/gronk
/fiddle
/fiddle/faddle
</pre>
<br />
<div>
That's a nice pretty listing. As you can imagine, this gets a bit messier if you use <span style="font-family: Courier New, Courier, monospace;">curl</span> for the query. This is probably the last time I'll use <span style="font-family: Courier New, Courier, monospace;">curl</span> for a query here. </div>
<br />
<pre class="brush:bash ; title: 'Query a directory tree recursively using curl' ; highlight: 1">curl -s http://127.0.0.1:4001/v2/keys/?recursive=true| python -m json.tool
{
"action": "get",
"node": {
"dir": true,
"key": "/",
"nodes": [
{
"createdIndex": 7,
"dir": true,
"key": "/foo",
"modifiedIndex": 7,
"nodes": [
{
"createdIndex": 7,
"dir": true,
"key": "/foo/bar",
"modifiedIndex": 7,
"nodes": [
{
"createdIndex": 7,
"key": "/foo/bar/gronk",
"modifiedIndex": 7,
"value": "I see you"
}
]
}
]
},
{
"createdIndex": 8,
"dir": true,
"key": "/fiddle",
"modifiedIndex": 8,
"nodes": [
{
"createdIndex": 8,
"key": "/fiddle/faddle",
"modifiedIndex": 8,
"value": "popcorn"
}
]
}
]
}
}
</pre>
<br />
<h2>
Clustering Etcd</h2>
<div>
<br />
Etcd is designed to allow database replication and the formation of clusters. When two etcds connect, they use a different port from the normal client access port. An etcd that intends to participate listens on that second port and also connects to a list of peer processes which also are listening.</div>
<div>
<br /></div>
<div>
You can set up peering (replication) using the command line arguments -peer-addr and -peers, or you can set the values in the configuration file /etc/etcd/etcd.conf.</div>
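<br />
As a minimal sketch, here is a two-member cluster on one machine, using the 0.4.x single-dash option names noted earlier and made-up member names.<br />
<br />
<pre class="brush: bash ; title: 'sketch: two-member etcd cluster' ; highlight: [2,4]"># first member (run in one terminal): clients on 4001, peers on 7001
./etcd -name node1 -addr 127.0.0.1:4001 -peer-addr 127.0.0.1:7001 -data-dir node1.etcd
# second member (in another terminal): different ports; -peers points at the first member's peer address
./etcd -name node2 -addr 127.0.0.1:4002 -peer-addr 127.0.0.1:7002 -peers 127.0.0.1:7001 -data-dir node2.etcd
</pre>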
<div>
<br /></div>
<div>
<a href="https://github.com/coreos/etcd/blob/master/Documentation/clustering.md">Complete clustering documentation</a> can be found on Github.</div>
<div>
<br /></div>
<h2>
Etcd and Security</h2>
</div>
<div>
<br /></div>
<div>
Etcd communications can be encrypted using SSL, but there is no authentication or access control. This makes it simple to use, but it makes it critical that you be careful never to place sensitive information like passwords or private keys into Etcd. It also means that you assume when using etcd that there are no malicious actors in the network space which has access. Any process with network access can both read and write any keys and values within the etcd. It is absolutely essential that access to etcd be protected at the network level because there's nothing else restricting access.</div>
<div>
<br />
Instructions for <a href="https://github.com/coreos/etcd/blob/master/Documentation/security.md">enabling SSL</a> to encrypt etcd traffic are also on Github.<br />
<br />
Etcd can be configured to restrict access to queries which use a client certificate but this provides very limited access control. Clients are either allowed full access or denied. There is no concept of a user, or authentication or access control policy once a connection has been allowed.<br />
<br />
<h2>
Additional Capabilities of Etcd</h2>
</div>
<div>
<br />
Don't make the mistake of thinking that Etcd is a simple networked filesystem with an HTTP/REST protocol interface. Etcd has a number of other important capabilities related to its role in configuration and cluster management.</div>
<div>
<br /></div>
<div>
Each directory or leaf node can have a <i><a href="https://github.com/coreos/etcd/blob/master/Documentation/api.md#using-key-ttl">Time To Live</a></i> or TTL value associated with it. The TTL indicates the lifespan of the key/value pair in seconds. When a value is set, if the TTL is also set then that key/value pair will expire when the TTL drops to zero. After that the value will no longer be available.</div>
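<br />
A quick sketch of setting a TTL, first with etcdctl and then with the raw API; the key name here is made up.<br />
<br />
<pre class="brush: bash ; title: 'sketch: keys with a TTL' ; highlight: [2,4]"># this key will expire 30 seconds after it is set
./etcdctl set /ephemeral "temporary" --ttl 30
# the equivalent REST call adds a ttl form value to the PUT
curl -s http://127.0.0.1:4001/v2/keys/ephemeral -XPUT -d value="temporary" -d ttl=30 | python -m json.tool
</pre>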
<div>
<br /></div>
<div>
It is also possible to create <i><a href="https://github.com/coreos/etcd/blob/master/Documentation/api.md#creating-a-hidden-node">hidden nodes</a>.</i> These are nodes that will not appear in directory listings. To access them the query must specify the correct path explicitly. Any node name which begins with an underscore character (_) will be hidden from directory queries.</div>
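<br />
For example (again with made-up keys):<br />
<br />
<pre class="brush: bash ; title: 'sketch: hidden nodes' ; highlight: [1,2,3]">./etcdctl set /_secrets/token "not listed"
./etcdctl ls /                   # the _secrets directory does not show up
./etcdctl get /_secrets/token    # but the explicit path still returns the value
</pre>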
<div>
<br /></div>
<div>
Most importantly it is possible for clients to <a href="https://github.com/coreos/etcd/blob/master/Documentation/api.md#waiting-for-a-change">wait for changes</a> to a key. If I issue a GET query on a key with the <i>wait</i> flag set then the query will block, leaving the query incomplete and the TCP session open. Assuming that the client doesn't time out the query will remain open and unresolved until the etcd detects (and executes) a change request on that key. At that point the waiting query will also complete and return the new value. This can be used as an event management or messaging system to avoid unnecessary polling.<br />
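<br />
A sketch of a waiting query, using the keys created earlier; run the curl in one terminal and the set in another.<br />
<br />
<pre class="brush: bash ; title: 'sketch: waiting for a change' ; highlight: [2,4]"># terminal 1: this request blocks until the key changes
curl -s "http://127.0.0.1:4001/v2/keys/foo/bar/gronk?wait=true" | python -m json.tool
# terminal 2: the write releases the waiting query, which returns the new value
./etcdctl set /foo/bar/gronk "now you don't"
</pre>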
<br /></div>
<h2>
Etcd in Kubernetes</h2>
<div>
<br />
Etcd is used by Kubernetes as both the cluster state database and as the communications mechanism between the app-server and the kubelet processes on the minion hosts. The app-server places values into the etcd in response to requests from the users for things like new pods or services, and it queries values from it to get status on the minions, pods and services.<br />
<br />
The kubelet processes also both query and update the contents of the database. They poll for desired state changes and create new pods and services in response. They also push status information back to the etcd to make it available to client queries.<br />
<br />
The root of the Kubernetes data tree within the etcd database is <span style="font-family: Courier New, Courier, monospace;">/registry</span><span style="font-family: inherit;">. Let's see what's there.</span><br />
<span style="font-family: inherit;"><br /></span>
<span style="font-family: inherit;"><br /></span>
<br />
<pre class="brush: bash ; title: 'etcd data tree from kubernetes vagrant cluster' ; highlight: 1">./etcdctl ls /registry --recursive
/registry/services
/registry/services/specs
/registry/services/specs/db
/registry/services/specs/msg
/registry/services/endpoints
/registry/services/endpoints/db
/registry/services/endpoints/msg
/registry/pods
/registry/pods/pulpdb
/registry/pods/pulpmsg
/registry/pods/pulp-beat
/registry/pods/pulp-resource-manager
/registry/hosts
/registry/hosts/10.245.2.3
/registry/hosts/10.245.2.3/kubelet
/registry/hosts/10.245.2.4
/registry/hosts/10.245.2.4/kubelet
/registry/hosts/10.245.2.2
/registry/hosts/10.245.2.2/kubelet
</pre>
</div>
<span style="font-family: inherit;"><br /></span>
<span style="font-family: inherit;">I'm running the Vagrant cluster on Virtualbox with three minions. </span><span style="font-family: inherit;">These are listed under the </span><span style="font-family: Courier New, Courier, monospace;">hosts</span><span style="font-family: inherit;"> subtree. </span><br />
<span style="font-family: inherit;"><br /></span>
<span style="font-family: inherit;">I've also defined two services, <i>db</i> and <i>msg</i> which are found under the </span><span style="font-family: Courier New, Courier, monospace;">services</span><span style="font-family: inherit;"> subtree. The service data is divided into two parts. The </span><span style="font-family: Courier New, Courier, monospace;">specs</span><span style="font-family: inherit;"> tree contains the definitions I provided for the two services. The </span><span style="font-family: Courier New, Courier, monospace;">endpoints</span><span style="font-family: inherit;"> subtree contains records which indicate the actual locations of the containers labeled to accept the service connections.</span><br />
<span style="font-family: inherit;"><br /></span>
<span style="font-family: inherit;">Finally I've defined four pods which make up the service I'm building (which happens to be a <a href="http://www.pulpproject.org/">Pulp</a> service). Each host is listed by its IP address at the moment. Work is on-going to allow the minions to be referred to by their host-name but that requires control of the nameservice which is available inside the containers. Without a universal nameservice for containers, IP addresses are the only way for processes inside a container to find hosts outside.</span><br />
<span style="font-family: inherit;"><br /></span>
<span style="font-family: inherit;">Some of the values here will look familiar to someone who has created pods and services using the <i>kubecfg</i> client. They are nearly identical to the JSON query and response payloads from the Kubernetes app-server.</span><br />
<span style="font-family: inherit;"><br /></span>
<span style="font-family: inherit;">I don't recommend making any changes or additions to the etcd database in a running Kubernetes cluster. I haven't looked deeply enough yet into how the app-server and kubelet interact with etcd and it would be very easy I think to upset them. For now I'm able to query etcd and confirm that my commands have or have not been initiated and compare what I see to what I expect.</span><br />
<span style="font-family: inherit;"><br /></span>
<br />
<h2>
Summary</h2>
<div>
<br />
Etcd is a neat tool for storing and sharing configuration data. It's only useful (so far) in limited cases where there are no malicious or careless users, but it's a very young project. I am speculating that etcd is a temporary component of Kubernetes. It provides the features needed to facilitate the development of the app-server and kubelet, which are the core functions of Kubernetes. Once those are stable, if others feel the need to use a more secure or scalable component, that can be done. The configuration payload can remain and only the communications mechanism will need to be replaced.</div>
<h2>
<br /></h2>
<h2>
References</h2>
<div>
<br /></div>
<div>
<ul>
<li><a href="https://coreos.com/">CoreOS</a> - A Docker hosting environment</li>
<li><a href="https://github.com/coreos/etcd">Etcd</a> - A distributed replicated key/value database with a REST access protocol</li>
<ul>
<li><a href="https://github.com/coreos/etcd/releases">Releases</a></li>
<li><a href="https://github.com/coreos/etcd/tree/master/Documentation">Documentation</a></li>
<ul>
<li><a href="https://github.com/coreos/etcd/tree/master/Documentation/api.md">API</a></li>
<li><a href="https://github.com/coreos/etcd/tree/master/Documentation/clustering.md">Clustering</a></li>
<li><a href="https://github.com/coreos/etcd/tree/master/Documentation/security.md">Security</a> (SSL)</li>
</ul>
</ul>
<li><a href="https://docker.com/">Docker</a> - Host based containerized application hosting</li>
<li><a href="https://github.com/GoogleCloudPlatform/kubernetes">Kubernetes</a> - Orchestration tools for Docker</li>
<li><a href="http://pulpproject.org/">Pulp</a> - An enterprise class file repository and mirror system</li>
</ul>
</div>
<div>
<br /></div>
markllamahttp://www.blogger.com/profile/14193184544557876514noreply@blogger.com3tag:blogger.com,1999:blog-5022186007695457923.post-13907784160609126842014-09-04T13:29:00.000-07:002014-12-16T09:00:42.474-08:00Kubernetes: Simple Containers and ServicesFrom previous posts I now have a <a href="http://cloud-mechanic.blogspot.com/2014/08/docker-simple-service-container-example.html">MongoDB image</a> and another which runs a <a href="http://cloud-mechanic.blogspot.com/2014/09/docker-qpid-message-broker-container.html">QPID AMQP broker</a>. I intend for these to be used by the <a href="http://www.pulpproject.org/">Pulp</a> service components.<br />
<br />
What I'm going to do this time is to create the subsidiary services that I'll need for the Pulp service within a Kubernetes cluster.<br />
<br />
<span style="background-color: #eeeeee; color: #e06666; font-family: Verdana, sans-serif; font-size: large;">UPDATE 12/16/2014: recently the <i>kubecfg</i> command has been deprecated and replaced with <i>kubectl</i>. I've updated this post to reflect the CLI call and output from kubectl.</span><br />
<br />
<h2>
Pre-Launch</h2>
<br />
A Pulp service stores its persistent data in the database. The service components, a <a href="http://www.celeryproject.org/">Celery Beat server</a> and a number of Celery workers, as well as one or more Apache web server daemons, all communicate using the AMQP message broker. They store and retrieve data from the database.<br />
<br />
In a traditional bare-metal or VM based installation all of these services would likely run on the same host. If they are distributed, then the IP addresses and credentials of the support services would have to be configured into the Pulp servers manually or using some form of configuration management. Using containers the components can be isolated, but the task of tracking them and configuring the consumer processes remains.<br />
<br />
Using just Docker, the first impulse of an implementer would be similar, to place all of the containers on the same host. This would simplify the management of the connectivity between the parts, but it also defeats some of the benefit of containerized applications: portability and non-locality. This isn't a failing of Docker. It is the result of conscious decisions to limit the scope of what Docker attempts to do, avoiding feature creep and bloat. And this is where a tool like Kubernetes comes in.<br />
<br />
As mentioned elsewhere, Kubernetes is a service designed to bind together a cluster of container hosts. These can be regular hosts running the etcd and kubelet daemons, or specialized images like Atomic or CoreOS, and they can be private or public services such as Google Cloud.<br />
<br />
For Pulp, I need to place a MongoDB and a QPID container within a Kubernetes cluster and create the infrastructure so that clients can find it and connect to it. For each of these I need to create a Kubernetes Service and a Pod (group of related containers).<br />
<br />
<h2>
Kicking the Tires</h2>
<div>
<br />
It's probably a good thing to explore a little bit before diving in so that I can see what to expect from Kubernetes in general. I also need to verify that I have a working environment before I start trying to bang on it.<br />
<br />
<h3>
Preparation</h3>
</div>
<div>
<br /></div>
<div>
If you're following along, at this point I'm going to assume that you have access to a running Kubernetes cluster. I'm going to be using the Vagrant test cluster as defined in the <a href="https://github.com/GoogleCloudPlatform/kubernetes">github repository</a> and described in the <a href="https://github.com/GoogleCloudPlatform/kubernetes/blob/master/docs/getting-started-guides/vagrant.md">Vagrant version</a> of the <a href="https://github.com/GoogleCloudPlatform/kubernetes/tree/master/docs/getting-started-guides">Getting Started Guides</a>.</div>
<div>
<br /></div>
<div>
I'm also going to assume that you've built the kubernetes binaries. I'm using the shell wrappers in the cluster sub-directory, especially <span style="font-family: Courier New, Courier, monospace;">cluster/kubectl.sh</span>. If you try that and you haven't built the binaries you'll get a message that looks like this:</div>
<div>
<br /></div>
<pre class="brush: bash">cluster/kubectl.sh
It looks as if you don't have a compiled kubectl binary.
If you are running from a clone of the git repo, please run
'./build/run.sh hack/build-cross.sh'. Note that this requires having
Docker installed.
If you are running from a binary release tarball, something is wrong.
Look at http://kubernetes.io/ for information on how to contact the
development team for help.
</pre>
<div>
<span style="font-family: Courier New, Courier, monospace;"><br /></span></div>
<div>
<span style="font-family: inherit;">If you see that, do as it says. If that fails, you probably haven't installed the <i>golang</i> package.</span></div>
<div>
<br />
<br />
<br />
For convenience I alias the kubectl.sh wrapper so that I don't need the full path.
<br />
<br />
<pre class="brush:bash ; title: 'Alias kubectl'; highlight: 1">alias kubectl=~/kubernetes/cluster/kubectl.sh
</pre>
<br />
Like most CLI commands now, if you invoke it with no arguments it prints usage.
<br />
<br />
<pre class="brush:bash ; title: 'kubectl usage' ; highlight: [1]">kubectl --help 2>1 | more
Usage of kubectl:
Usage:
kubectl [flags]
kubectl [command]
Available Commands:
version Print version of client and server
proxy Run a proxy to the Kubernetes API server
get [(-o|--output=)json|yaml|...] <resource> [<id>] Display one or many resources
describe <resource> <id> Show details of a specific resource
create -f filename Create a resource by filename or stdin
createall [-d directory] [-f filename] Create all resources specified in a directory, filename or stdin
update -f filename Update a resource by filename or stdin
delete ([-f filename] | (<resource> <id>)) Delete a resource by filename, stdin or resource and id
</pre>
<br />
The full <a href="https://github.com/GoogleCloudPlatform/kubernetes/blob/master/docs/cli.md#details">usage output</a> can be found in the<a href="https://github.com/GoogleCloudPlatform/kubernetes/blob/master/docs/cli.md"> CLI documentation</a> in the <a href="https://github.com/GoogleCloudPlatform/kubernetes/">Kubernetes Github repository</a>.<br />
<br /></div>
kubectl has one oddity that makes a lot of sense once you understand why it's there. The command is meant to produce output which is consumable by machines using UNIX pipes. The output is structured data formatted using JSON or YAML. To avoid strange errors in the parsers, the only output to STDOUT is structured data. This means that all of the human readable output goes to STDERR. This isn't just the error output though. This includes the help output. So if you want to run the help and usage output through a pager app like more(1) or less(1), you have to first redirect STDERR to STDOUT as I did above.
<br />
<br />
<h3>
Exploring the CLI control objects</h3>
<div>
<br /></div>
<h4>
</h4>
<div>
You can see in the usage output the possible operations: <i>get, list, create, delete, update</i>. It also shows the kinds of objects that the API can manage: <i>minions, pods, replicationControllers, services</i>.<br />
<br /></div>
<h4>
</h4>
<h4>
Minions</h4>
<div>
<br />
A <i>minion</i> is a host that can accept containers. It runs an <span style="font-family: Courier New, Courier, monospace;">etcd</span> and a <span style="font-family: Courier New, Courier, monospace;">kubelet</span><span style="font-family: inherit;"> daemon in addition to the Docker daemon.</span><span style="font-family: inherit;"> For our purposes a minion is where containers can go.</span></div>
<br />
I can list the minions in my cluster like this:
<br />
<br />
<div>
<pre class="brush: bash ; title: '' ; highlight: 1">kubectl get minions
NAME LABELS
10.245.2.4 <none>
10.245.2.2 <none>
10.245.2.3 <none>
</pre>
</div>
<br />
<div>
The only valid operations on minions using the REST protocol are the <i>list</i> and <i>get</i> actions. The get response isn't very interesting.</div>
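<br />
If you want to look anyway, a get on a single minion follows the same resource/id pattern used later for services and pods:<br />
<br />
<pre class="brush: bash ; title: 'get a single minion (sketch)' ; highlight: 1">kubectl get --output=json minions 10.245.2.2
</pre>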
<div>
<br /></div>
<div>
Until I add some of the other objects this is the most interesting query. It indicates that there are three minions connected and ready to accept containers.</div>
<div>
<br /></div>
<h4>
Pods</h4>
<div>
<br /></div>
<div>
A <i><a href="https://github.com/GoogleCloudPlatform/kubernetes/blob/master/docs/pods.md">pod</a></i> is the Kubernetes object which describes a set of one or more containers to be run on the same minion. While the point of a cluster is to allow containers to run anywhere within the cluster, there are times when a set of containers must run together on the same host. Perhaps they share some external filesystem or some other resource. See the golang specification for the <a href="https://godoc.org/github.com/GoogleCloudPlatform/kubernetes/pkg/api/v1beta1#Pod">Pod struct</a>.</div>
<div>
<br />
<pre class="brush: bash ; title: 'list pods'; highlight: 1">kubectl get pods
NAME IMAGE(S) HOST LABELS STATUS
</pre>
See? Not very interesting.<br />
<br /></div>
<div>
<h4>
Replication Controllers</h4>
</div>
<div>
<br /></div>
<div>
I'm going to defer talking about<i> replication controllers</i> in detail for now. It's enough to note their existence and purpose.<br />
<br />
Replication controllers are the tool to create HA or load balancing systems. Using a replication controller you can tell Kubernetes to create multiple running containers for a given image. Kubernetes will ensure that if one container fails or stops that a new container will be spawned to replace it.</div>
<div>
<br /></div>
<div>
I can list the replication controllers in the same way as minions or pods, but there's nothing to see yet.</div>
<br />
<h4>
Services</h4>
<br />
<div>
I think the term <i>service</i> is an unfortunate but probably unavoidable terminology overload.<br />
<br />
In Kubernetes, a service defines a TCP or UDP port reservation. It provides a way for applications running in containers to connect to each other without requiring that each one be configured with the end-point IP addresses. This both allows for abstracted configuration and for mobility and load balancing of the providing containers.<br />
<br />
When I define a Kubernetes service, the service providers (the MongoDB and QPID containers) will be labeled to receive traffic and the service consumers (the Pulp components) will be given the access information in the environment so that they can reach the providers. More about that later.<br />
<br />
I can list the services in the same way as I would minions or pods. And it turns out that creating a couple of Kubernetes services is the first step I need to take to prepare the Pulp support service containers.<br />
<br />
<h2>
Creating a Kubernetes Service Object</h2>
</div>
<div>
<br /></div>
<div>
In a cloud cluster one of the most important considerations is being able to find things. The whole point of the cloud is to promote non-locality. I don't care where things are, but I still have to be able to find them somehow.</div>
<div>
<br /></div>
<div>
A Kubernetes <a href="https://godoc.org/github.com/GoogleCloudPlatform/kubernetes/pkg/api#Service">Service</a> object is a handle that allows my MongoDB and QPID clients find the servers without them having to know where they <u>really</u> are. It defines a port to listen on and a way for clients to indicate that they want to accept the traffic that comes in. Kubernetes arranges for the traffic to be forwarded to the servers.<br />
<br />
Kubernetes both accepts and produces structured data formats for input and reporting. The two currently supported formats are JSON and YAML. The Service structure is relatively simple but it has elements which are shared by all of the top level data structures. Kubernetes doesn't yet have any tooling to make the creation of an object description easier than hand-crafting a snippet of JSON or YAML. Each of the structures <a href="https://godoc.org/github.com/GoogleCloudPlatform/kubernetes/pkg/api">is documented</a> in the godoc for Kubernetes. For now that's all you get.</div>
<div>
<br />
There are a couple of<a href="https://github.com/GoogleCloudPlatform/kubernetes/tree/master/examples"> provided examples</a> and these will have to do for now. The guestbook example demonstrates using ReplicationControllers and a master/slave implementation using Redis. The second shows how to perform a live update of the pods which make up an active service within a Kubernetes cluster. These are actually a bit more advanced than I'm ready for and don't give the detailed break-down of the moving parts that I mean to do.</div>
<div>
<br /></div>
<div>
<script src="https://gist.github.com/markllama/4766f846415d2b819542.js"></script>
This is a complete description of the service. Lines 5-8 define the actual content.<br />
<ul>
<li>Line 2 indicates that this is a Service object.</li>
<li>Line 3 indicates the object schema version.<br />v1beta1 is current<br />(note: my use of the term 'schema' is a loose one)</li>
<li>Line 4 identifies the Service object.<br />This must be unique within the set of services</li>
<li>Line 5 is the TCP port number that will be listening</li>
<li>Line 6 is for testing. It tells the proxy on the minion with that IP to listen for inbound connections.<br />I'll also use the publicIPs value to expose the HTTP and HTTPS services for Pulp</li>
<li>Lines 7-9 set the Selector<br />The selector is used to associate this Service object with containers that will accept the inbound traffic.<br />This will match with one of the label items assigned to the containers.</li>
</ul>
<div>
<br /></div>
<div>
When a new service is created, Kubernetes establishes a listener on an available IP address (one of the minions' addresses). While the service object exists, any new containers will start with a new set of environment variables which provide access information. The value of the <i>selector</i> (converted to upper case) is used as the prefix for these environment variables so that containers can be designed to pick them up and use them for configuration.</div>
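<br />
A hedged way to see this from a shell inside any container started after the services exist: list the variables carrying the db and msg prefixes (the exact names appear in the pod's JSON later on).<br />
<br />
<pre class="brush: bash ; title: 'sketch: service environment variables' ; highlight: 1">env | grep -E '^(DB|MSG)_'
</pre>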
<div>
<br /></div>
<div>
For now I just need to establish the service so that when I create the DB and QPID containers they have something to be bound to.</div>
<div>
<br /></div>
<div>
The QPID service is identical to the MongoDB service, replacing the port (5672) and the selector (msg).</div>
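<br />
A sketch of what that looks like, following the field layout described above; the file name is arbitrary, and publicIPs can be added the same way as for the db service if needed.<br />
<br />
<pre class="brush: bash ; title: 'sketch: the msg (QPID) service object' ; highlight: [1,10]">cat > qpid-service.json <<'EOF'
{
  "kind": "Service",
  "apiVersion": "v1beta1",
  "id": "msg",
  "port": 5672,
  "selector": { "name": "msg" }
}
EOF
kubectl create -f qpid-service.json
</pre>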
<div>
<br /></div>
<h3>
Querying a Service Object</h3>
<div>
<br /></div>
<div>
I've just created a Service object. I wonder what Kubernetes thinks of it? I can list the services as seen above. I can also get the object information using <span style="font-family: Courier New, Courier, monospace;">kubectl.</span><br />
<span style="font-family: Courier New, Courier, monospace;"><br /></span></div>
<div>
<pre class="brush:bash ; title: 'Query a Service Object: Plaintext' ; highlight: 1">kubectl get services db
NAME LABELS SELECTOR IP PORT
db <none> name=db 10.0.41.48 27017
</pre>
<span style="font-family: inherit;"><br /></span></div>
<div>
<span style="font-family: inherit;">That's nice. I know the important information now. But what does it look like <u>really</u>.</span><br />
<span style="font-family: inherit;"><br /></span>
<br />
<pre class="brush: bash ; title 'Query a Service Object: JSON' ; highlight: 1">kubectl get --output=json services db
{
"kind": "Service",
"id": "db",
"uid": "c040da3d-8536-11e4-a18b-0800279696e1",
"creationTimestamp": "2014-12-16T15:18:12Z",
"selfLink": "/api/v1beta1/services/db?namespace=default",
"resourceVersion": 13,
"apiVersion": "v1beta1",
"namespace": "default",
"port": 27017,
"protocol": "TCP",
"selector": {
"name": "db"
},
"publicIPs": [
"10.245.2.2"
],
"containerPort": 0,
"portalIP": "10.0.41.48"
}
</pre>
<br />
Clearly Kubernetes has filled out some of the object fields. Note the <span style="font-family: Courier New, Courier, monospace;">--output=json</span> flag for structured data.<br />
<br />
I'll be using this method to query information about the other elements as I go along.<br />
<br /></div>
</div>
<h2>
Describing a Container (Pod) in Kubernetes</h2>
<div>
<br /></div>
<div>
We've seen how to run a container on a Docker host. With Kubernetes we have to create and submit a description of the container with all of the required variables defined.</div>
<div>
<br /></div>
<div>
Kubernetes has an additional abstraction called a <i>pod</i>. While Kubernetes is designed to allow the operator to ignore the location of containers within the cluster, there are times when a set of containers needs to be co-located on the same host. A pod is Kubernetes' way of grouping containers when needed. When starting a single container it will still be referred to as a member of a pod.<br />
<br />
<br />
Here's the description of a pod containing the MongoDB service image I created earlier.<br />
<br />
<script src="https://gist.github.com/markllama/bcb37503e8488119b7eb.js"></script>
<br />
<br />
<br />
This is actually a set of nested structures, maps and arrays.<br />
<br />
<br />
<ul>
<li>Lines 1-21 define a <a href="https://godoc.org/github.com/GoogleCloudPlatform/kubernetes/pkg/api#Pod">Pod</a>.</li>
<li>Lines 2-4 are elements of an inline <a href="https://godoc.org/github.com/GoogleCloudPlatform/kubernetes/pkg/runtime#JSONBase">JSONBase</a> structure</li>
<li>Lines 5-7 are a map (hash) of strings assigned to the Pod struct element named Labels.</li>
<li>Lines 8-20 define a <a href="https://godoc.org/github.com/GoogleCloudPlatform/kubernetes/pkg/api#PodState">PodState</a> named DesiredState.<br />The only required element is the <a href="https://godoc.org/github.com/GoogleCloudPlatform/kubernetes/pkg/api#ContainerManifest">ContainerManifest</a>, named Manifest in the PodState.</li>
<li>A Podstate has a required Version and ID, though it is not a subclass of JSONBase.<br /> It also has a list of <a href="https://godoc.org/github.com/GoogleCloudPlatform/kubernetes/pkg/api#Container">Containers</a> and an optional list of <a href="https://godoc.org/github.com/GoogleCloudPlatform/kubernetes/pkg/api#Volume">Volumes</a></li>
<li>Lines 12-18 define the set of containers (only one in this case) that will reside in the pod.<br />A Container has a name and an image path (in this case to the previously defined mongodb image).</li>
<li>Lines 15-17 are a set of <a href="https://godoc.org/github.com/GoogleCloudPlatform/kubernetes/pkg/api#Port">Port</a> specifications.<br /> These indicate that something inside the container will be listening on these ports.</li>
</ul>
<br />
<br />
You can see how learning the total schema means fishing through each of these structure definitions in the documentation. If you work at it you will get to know them. To be fair they are really meant to be generated and consumed by machines rather than humans. Kubernetes is still the business end of the service. Pretty dashboards will be provided later. The only visibility I really <b>need</b> is for development and diagnostics. There are gaps here too, but finding them is what experiments like this are about.<br />
<br />
<h4>
A note on Names and IDs</h4>
<br />
There are several places where there is a key named "name" or "id". I could give them all the same value, but I'm going to deliberately vary them so I can expose which ones are used for what purpose. Names can be arbitrary strings. I believe that IDs are restricted somewhat (no hyphens).<br />
<br />
<h2>
Creating the first Pod</h2>
<div>
<br /></div>
<div>
Now I can get back to business.</div>
<div>
<br /></div>
<div>
Once I have the Pod definition expressed in JSON I can submit that to <span style="font-family: Courier New, Courier, monospace;">kubectl</span><span style="font-family: inherit;"> for processing.</span></div>
<div>
<span style="font-family: inherit;"><br /></span>
<br />
<pre class="brush: bash ; title: 'Create a Pod' ; highlight: 1">kubectl create -f pods/mongodb.json
pulpdb
</pre>
</div>
<div>
<br />
<b>TADA!</b> I now have a MongoDB running in Kubernetes.<br />
<br />
<h2>
But how do I know?</h2>
<br /></div>
Now that I actually have a pod, I should be able to query the Kubernetes service about it and get more than an empty answer.<br />
<br />
<pre class="brush: bash ; title: 'Query a Pod: Plaintext' ; highlight: 1">kubectl get pods pulpdb
NAME IMAGE(S) HOST LABELS STATUS
pulpdb markllama/mongodb 10.245.2.3/10.245.2.3 name=db Running
</pre>
<br />
Familiar and Boring. But I can get more from kubectl by asking for the raw JSON return from the query.<br />
<br />
<pre class="brush: jscript ; title: 'Query a Pod: JSON' ; highlight: 1">{
"kind": "Pod",
"id": "pulpdb",
"uid": "4bac8381-8537-11e4-a18b-0800279696e1",
"creationTimestamp": "2014-12-16T15:22:06Z",
"selfLink": "/api/v1beta1/pods/pulpdb?namespace=default",
"resourceVersion": 22,
"apiVersion": "v1beta1",
"namespace": "default",
"labels": {
"name": "db"
},
"desiredState": {
"manifest": {
"version": "v1beta2",
"id": "",
"volumes": [
{
"name": "devlog",
"source": {
"hostDir": {
"path": "/dev/log"
},
...
"pulp-db": {
"state": {
"running": {
"startedAt": "2014-12-16T15:27:04Z"
}
},
"restartCount": 0,
"image": "markllama/mongodb",
"containerID": "docker://8f21d45e49b18b37b98ea7556346095261699bc
3664b52813a533edccee55a63"
}
}
}
}
</pre>
<br />
It's <u>really</u> long, so I'm not going to include all of it inline. Instead I put it <a href="https://gist.github.com/markllama/b7a770b9e0e1e2af938b">into a gist</a>.<br />
<br />
If you fish through it you'll find the same elements I used to create the pod, and lots, lots more. The structure now contains both a <i>desiredState</i> and a <i>currentState</i> sub-structure, with very different contents.<br />
<br />
Now a lot of this is just noise to us, but lines <a href="https://gist.github.com/markllama/b7a770b9e0e1e2af938b#file-kube_pod_get_reply-L59-L72">59-72</a> are of particular interest. These show the effects of the Service object that was created previously. These are the environment variables and network ports declared. These are the values that a client container will use to connect to this service container.<br />
<br />
<h2>
Testing the MongoDB service</h2>
<div>
<br /></div>
<div>
If you've read my <a href="http://cloud-mechanic.blogspot.com/2014/08/docker-simple-service-container-example.html">previous blog post on creating a MongoDB Docker image</a> you'll be familiar with the process I used to verify the basic operation of the service.</div>
<div>
<br /></div>
<div>
In that case I was running the container using Docker on my laptop. I knew exactly where the container was running and I had direct access to the Docker CLI so that I could ask Docker about my new container.</div>
<div>
I'd opened up the MongoDB port and told Docker to bind it to a random port on the host and I could connect directly to that port.</div>
<div>
<br /></div>
<div>
In a Kubernetes cluster there's no way to know a priori where the MongoDB container will end up. You have to ask Kubernetes where it is. Further you don't have direct access to the Docker CLI.<br />
<br />
This is where that <span style="font-family: Courier New, Courier, monospace;">publicIPs</span><span style="font-family: inherit;"> key in the mongodb-service.json file comes in. I set the public IP value of the db service to an external IP address of one of the Kubernetes minions: 10.245.2.2. This causes the proxy on that minion to accept inbound connections and forward them to the db service pods where ever they are.</span></div>
<div>
<br /></div>
<div>
<div>
The minion host is accessible from my desktop so I can test the connectivity directly.<br />
<br /></div>
</div>
<div>
<pre class="brush: bash ; title: 'List databases in MongoDB' ; highlight: 1">echo "show dbs" | mongo 10.245.2.2
MongoDB shell version: 2.4.6
connecting to: 10.245.2.4/test
local 0.03125GB
bye
</pre>
<br /></div>
<h2>
And now for QPID?</h2>
</div>
<div>
<br /></div>
<div>
As with the Service object, creating and testing the QPID container within Kubernetes requires the same process. Create a JSON file which describes the QPID service and another for the pod. Submit them and test as before.</div>
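<br />
The sequence is the same as for MongoDB; here is a sketch with hypothetical file and pod names, where the broker address depends on the publicIPs chosen for the msg service.<br />
<br />
<pre class="brush: bash ; title: 'sketch: create and test the QPID pod' ; highlight: [1,2,4]">kubectl create -f pods/qpid.json
kubectl get pods pulpmsg
# check the broker through the service's public IP, as with MongoDB
qpid-config queues -b guest@10.245.2.2
</pre>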
<div>
<br /></div>
<h2>
Summary</h2>
<br />
Now I have two running network services inside the Kubernetes cluster. This consists of a Kubernetes Service object and a Kubernetes Pod which is running the image I'd created for each service application.<br />
<br />
I can prove to myself that the application services are running and accessible, though for some of the detailed tests I have to go under the covers of Kubernetes still.<br />
<br />
I have the information I need to craft images for the other Pulp services so that they can consume the database and messenger services.<br />
<br />
<h2>
Next Up</h2>
<div>
<br /></div>
<div>
In the next post I mean to create the first Pulp service image, the Celery Beat server. There are elements that all of the remaining images will have in common, so I'm going to first build a base image and then apply the last layer to differentiate the beat server from the Pulp resource manager and the pulp workers.</div>
<div>
<br /></div>
<h2>
References</h2>
<div>
<br /></div>
<div>
<ul>
<li>Docker<br /><a href="https://docker.com/">https://docker.com/</a></li>
<li>Kubernetes<br /><a href="https://github.com/GoogleCloudPlatform/kubernetes/">https://github.com/GoogleCloudPlatform/kubernetes/</a></li>
<li>Kubernetes Source Code Documentation<br /><a href="https://godoc.org/github.com/GoogleCloudPlatform/kubernetes">https://godoc.org/github.com/GoogleCloudPlatform/kubernetes</a></li>
<li>Pulp<br /><a href="http://www.pulpproject.org/">http://www.pulpproject.org/</a></li>
<li>Celery<br /><a href="http://www.celeryproject.org/">http://www.celeryproject.org/</a></li>
<li>JSON<br /> <a href="http://json.org/">http://json.org/</a></li>
<li>YAML<br /><a href="http://yaml.org/">http://yaml.org/</a></li>
<li>Pretty Printing JSON with Python<br /><a href="http://stackoverflow.com/questions/352098/how-can-i-pretty-print-json">http://stackoverflow.com/questions/352098/how-can-i-pretty-print-json</a></li>
</ul>
</div>
markllamahttp://www.blogger.com/profile/14193184544557876514noreply@blogger.com5tag:blogger.com,1999:blog-5022186007695457923.post-71135090615879280752014-09-01T17:30:00.002-07:002014-09-01T17:51:24.690-07:00Docker: A QPID Message Broker ContainerOK I lied. I realized I can't just move on to working with Pulp in Kubernetes without building the other sub-service Pulp needs.<br />
<br />
This one is merely an exposition of the QPID container, and it's actually simpler than the MongoDB container, so this will be a short post. A QPID service is simpler because (so long as you don't care about store-and-forward messages) you don't need persistent storage.<br />
<br />
Like the MongoDB container, I need to define the package set that will be installed on top of the base image. I also need to declare a TCP port for the QPID service. Finally I need to define the primary process that will be started when the container starts. This will be an invocation of the QPID service daemon.<br />
<br />
<h2>
QPID Dockerfile</h2>
<br />
Here's the Dockerfile for QPID on Fedora 20.<br />
<br />
<div>
<script src="https://gist.github.com/markllama/61b1b25cb384ad25689f.js"></script>
</div>
<br />
Let's walk through the Dockerfile directives quickly.<br />
<br />
Line 1: <a href="https://docs.docker.com/reference/builder/#from">FROM</a> - Just as in the MongoDB image, I'm using the stock Fedora 20 image as the base<br />
<br />
<div>
Line 2: <a href="https://docs.docker.com/reference/builder/#maintainer">MAINTAINER</a> - Indicate who to contact with problems (AND THANKS!)</div>
<div>
<br /></div>
Yeah, that's me.<br />
<br />
<div>
Line 7: <a href="https://docs.docker.com/reference/builder/#run">RUN</a> - Install the QPID packages</div>
<div>
<br />
I think there are several QPID servers. I'm using the one written in C++, hence the package names: qpid-cpp-server and qpid-cpp-server-store. </div>
<br />
Line 10: <a href="https://docs.docker.com/reference/builder/#add">ADD</a> - Create a location for the daemon to run. If you specify a file to add but there is no matching file in the build context directory, then Docker will create the target in the container as an empty directory.<br />
<br />
I'm creating /.qpidd for the daemon to run in.<br />
<br />
Line 12: <a href="https://docs.docker.com/reference/builder/#workdir">WORKDIR</a> - Set the location where the initial process will run. Here is where I tell Docker to run the daemon in the directory I created with the previous ADD directive.<br />
<br />
Line 14: <a href="https://docs.docker.com/reference/builder/#expose">EXPOSE</a> - QPID uses port 5672/TCP. This line declares the port so that connections from outside the container can reach it, and so that Docker will bind it to a host port when the container is run with --publish-all.<br />
<br />
<div>
Line 16: <a href="https://docs.docker.com/reference/builder/#entrypoint">ENTRYPOINT</a> - This indicates the binary or script that will be called when the container runs.<br />
<br />
The ENTRYPOINT and CMD directives are used to craft the invocation of the primary process of the container.<br />
<br />
<h3>
Explaining ENTRYPOINT and CMD</h3>
<div>
<br /></div>
<div>
I got some help for this from a Stackoverflow article: <a href="http://stackoverflow.com/questions/21553353/what-is-the-difference-between-cmd-and-entrypoint-in-a-dockerfile">What is the difference between CMD and ENTRYPOINT</a></div>
<div>
<br /></div>
When a Docker container is run, a single process is started inside the container. This process may spawn others, but it remains the anchor process for all of the others.<br />
<br />
The invocation of the container primary process is created by combining the values of the ENTRYPOINT and CMD directives. The ENTRYPOINT, if it is set, becomes the path of the binary to be executed. The value of the CMD directive is used as the arguments to the primary process.<br />
<br />
There are two twists on this.<br />
<br />
If no ENTRYPOINT is provided, then the CMD directive is run using <span style="font-family: Courier New, Courier, monospace;">/bin/sh -c</span><span style="font-family: inherit;">.</span><br />
Also, if the docker run command has any positional arguments following the image name, they will replace the CMD value.<br />
<br />
By setting the ENTRYPOINT to the QPID command, arguments to the daemon can be passed directly on the docker run line.<br />
<br />
If an image has an ENTRYPOINT directive then it can be overridden with the --entrypoint option to docker run.<br />
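<br />
For example, this hypothetical invocation appends a logging option after the image name, so it is handed straight to the daemon (assuming qpidd's --log-enable option):<br />
<br />
<pre class="brush: bash ; title: 'Passing arguments through the ENTRYPOINT' ; highlight: 1"># everything after the image name becomes arguments to /usr/bin/qpidd
docker run -d --name qpid-verbose --publish-all markllama/qpid --log-enable info+
</pre>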
<br /></div>
<h2>
Building the Image</h2>
<div>
<br />
<pre class="brush: bash ; title: 'Build the QPID container' ; highlight: 1">docker build -t markllama/qpid images/qpid
Sending build context to Docker daemon 2.56 kB
Sending build context to Docker daemon
Step 0 : FROM fedora:20
---> 88b42ffd1f7c
Step 1 : MAINTAINER Mark Lamourine &lt;markllama@gmail.com&gt;
---> Using cache
---> 95516239225e
Step 2 : RUN yum install -y qpid-cpp-server qpid-cpp-server-store python-qpid-qmf python-qpid && yum clean all
---> Running in 7fc6b6ed2128
Resolving Dependencies
--> Running transaction check
---> Package python-qpid.noarch 0:0.26-2.fc20 will be installed
--> Processing Dependency: python-qpid-common = 0.26-2.fc20 for package: python-qpid-0.26-2.fc20.noarch
---> Package python-qpid-qmf.x86_64 0:0.26-2.fc20 will be installed
--> Processing Dependency: qpid-qmf(x86-64) = 0.26-2.fc20 for package: python-qpid-qmf-0.26-2.fc20.x86_64
--> Processing Dependency: libqmf2.so.1()(64bit) for package: python-qpid-qmf-0.26-2.fc20.x86_64
---> Package qpid-cpp-server.x86_64 0:0.26-11.fc20 will be installed
--> Processing Dependency: qpid(client)(x86-64) = 0.26 for package: qpid-cpp-server-0.26-11.fc20.x86_64
--> Processing Dependency: qpid-proton-c(x86-64) >= 0.5 for package: qpid-cpp-server-0.26-11.fc20.x86_64
...
python-qpid-common.noarch 0:0.26-2.fc20
qpid-cpp-client.x86_64 0:0.26-11.fc20
qpid-proton-c.x86_64 0:0.7-3.fc20
qpid-qmf.x86_64 0:0.26-2.fc20
Complete!
Cleaning repos: fedora updates
Cleaning up everything
---> d7e61654fb92
Removing intermediate container 7fc6b6ed2128
Step 3 : ADD . /.qpidd
---> 10c44a5719a5
Removing intermediate container a8a37c5986a5
Step 4 : WORKDIR /.qpidd
---> Running in 2833da1629d9
---> 1963a2551db8
Removing intermediate container 2833da1629d9
Step 5 : EXPOSE 5672
---> Running in d0d92a1e58ad
---> 425ba5994308
Removing intermediate container d0d92a1e58ad
Step 6 : ENTRYPOINT ["/usr/bin/qpidd", "-t", "--auth=no"]
---> Running in e678dc1a4b66
---> ae30e626e215
Removing intermediate container e678dc1a4b66
Successfully built ae30e626e215
</pre>
</div>
<h2>
Verifying the Image</h2>
<div>
<br />
With respect to docker, verifying the image is the same as it was for the MongoDB image.</div>
<div>
<br /></div>
<div>
<pre class="brush: bash ; title: 'Start QPID Container' ; highlight: 1">docker run -d --name qpid1 --publish-all markllama/qpid
1b513bee6d8d5d4328059a059f9520c469ff405228b88370b91bb85ef659b708
</pre>
</div>
<br />
<h3>
Process information</h3>
<div>
<br />
<pre class="brush: bash ; title: 'List containers' ; highlight: 1">docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
1b513bee6d8d markllama/qpid:latest /usr/sbin/qpidd -t - 7 seconds ago Up 5 seconds 0.0.0.0:49157->5672/tcp qpid1
</pre>
<br /></div>
<h3>
Docker logs</h3>
<div>
<br />
<pre class="brush: bash ; title: 'List Container Log' ; highlight: 1">docker logs qpid1
2014-09-01 23:32:34 [Model] trace Mgmt create memory. id:amqp-broker
2014-09-01 23:32:34 [Broker] info Management enabled
2014-09-01 23:32:34 [Management] info ManagementAgent generated broker ID: dc7d2
473-58e4-4eea-a21b-46105345054e
...
2014-09-01 23:32:34 [Management] debug ManagementAgent added class org.apache.qp
id.broker:queueThresholdExceeded
2014-09-01 23:32:34 [Model] trace Mgmt create system. id:cfaf5a0f-1291-41e5-b0c0
-e5eb07c77c1e
2014-09-01 23:32:34 [Model] trace Mgmt create broker. id:amqp-broker
2014-09-01 23:32:34 [Model] trace Mgmt create vhost. id:org.apache.qpid.broker:b
roker:amqp-broker,/
2014-09-01 23:32:34 [Security] notice SSL plugin not enabled, you must set --ssl
-cert-db to enable it.
2014-09-01 23:32:34 [Broker] info Loaded protocol AMQP 1.0
2014-09-01 23:32:35 [Store] notice Journal "TplStore": Created
2014-09-01 23:32:35 [Store] notice Store module initialized; store-dir=//.qpidd
2014-09-01 23:32:35 [Store] info > Default files per journal: 8
2014-09-01 23:32:35 [Store] info > Default journal file size: 24 (wpgs)
2014-09-01 23:32:35 [Store] info > Default write cache page size: 32 (KiB)
2014-09-01 23:32:35 [Store] info > Default number of write cache pages: 32
2014-09-01 23:32:35 [Store] info > TPL files per journal: 8
2014-09-01 23:32:35 [Store] info > TPL journal file size: 24 (wpgs)
2014-09-01 23:32:35 [Store] info > TPL write cache page size: 4 (KiB)
2014-09-01 23:32:35 [Store] info > TPL number of write cache pages: 64
2014-09-01 23:32:35 [Model] trace Mgmt create exchange. id:
...
2014-09-01 23:32:36 [Model] trace Mgmt create exchange. id:qmf.default.direct
2014-09-01 23:32:36 [Broker] notice SASL disabled: No Authentication Performed
2014-09-01 23:32:36 [Security] info Policy file not specified. ACL Disabled, no
ACL checking being done!
2014-09-01 23:32:36 [Security] trace Initialising SSL plugin
2014-09-01 23:32:36 [Network] info Listening to: 0.0.0.0:5672
2014-09-01 23:32:36 [Network] info Listening to: [::]:5672
2014-09-01 23:32:36 [Network] notice Listening on TCP/TCP6 port 5672
2014-09-01 23:32:36 [Store] info Enabling management instrumentation for the sto
re.
...
2014-09-01 23:32:36 [Model] trace Mgmt create store. id:org.apache.qpid.broker:b
roker:amqp-broker
2014-09-01 23:32:36 [Management] debug Management object (V1) added: org.apache.
qpid.legacystore:store:org.apache.qpid.broker:broker:amqp-broker
2014-09-01 23:32:36 [Broker] notice Broker running
</pre>
<br />
The QPID logs will continue accumulating. With the default debug level it reports a lot of connection information.<br />
<br /></div>
<h3>
Connectivity</h3>
<div>
<br /></div>
<div>
To test connectivity to the QPID services I use the <span style="font-family: Courier New, Courier, monospace;">qpid-config</span> command from the <span style="font-family: Courier New, Courier, monospace;">qpid-utils</span> package on Fedora. Install that package to get the command.<br />
<br />
<br />
<pre class="brush: bash; title: 'Test Connectivity and List Queues' ; highlight: 1">qpid-config queues -b guest@127.0.0.1:49157
Queue Name Attributes
=================================================================
7783123e-9589-4814-8b7b-b976a576c853:0.0 auto-del excl
</pre>
<br />
<br />
This command lists the queues present on the broker. It connects using the guest account and specifies the localhost IPv4 address and the port indicated by the output of the <span style="font-family: Courier New, Courier, monospace;">docker ps</span> or <span style="font-family: Courier New, Courier, monospace;">docker port</span> commands.<br />
<br />
This is a very simple connectivity test. The single queue is the default for an unused AMQP server. Once the Pulp components connect they will create additional queues.<br />
<br /></div>
<div>
<h2>
Running A Shell in an image with an ENTRYPOINT</h2>
</div>
<div>
<br /></div>
<div>
Using an ENTRYPOINT directive has a couple of effects that you want to be aware of.</div>
<div>
<br /></div>
<div>
On the plus side you can add arguments to the entrypoint binary just by adding them after the image name on the invocation.</div>
<div>
<br /></div>
<div>
One gotcha is that you can't just put /bin/sh after the image to get a shell as you otherwise would. It is very common and convenient to examine an image by running it with a shell, overriding the CMD. Docker provides the --entrypoint option to allow overriding when necessary.</div>
<div>
<br />
<pre class="brush: bash ; title: 'Run a container with an ENTRYPOINT via a shell' ; highlight: 1">docker run -it --entrypoint /bin/sh markllama/qpid
sh-4.2# pwd
/.qpidd
sh-4.2# ls
Dockerfile
sh-4.2# exit
exit
</pre>
</div>
<div>
<br />
Now I have images for both of the secondary services that Pulp needs.<br />
<br />
Time to start playing with Kubernetes a bit.<br />
<br />
References:<br />
<br />
<br />
<ul>
<li><a href="http://www.docker.com/">Docker</a></li>
<li><a href="https://qpid.apache.org/">Apache QPID</a></li>
</ul>
</div>
markllamahttp://www.blogger.com/profile/14193184544557876514noreply@blogger.com0tag:blogger.com,1999:blog-5022186007695457923.post-15692928792502875222014-08-30T09:54:00.001-07:002014-08-30T09:54:15.616-07:00Docker: A simple service container example with MongoDBIn my previous post I said I was going to build, over time, a Pulp repository using a set of containerized service components and host it in a Kubernetes cluster.<br />
<br />
<h2>
A complete Pulp service</h2>
<br />
The Pulp service is composed of a number of sub-services:<br />
<br />
<ul>
<li>A MongoDB database</li>
<li>A QPID AMQP message broker</li>
<li>A number of Celery processes</li>
<ul>
<li>1 Celery Beat process</li>
<li>1 Pulp Resource Manager (Celery worker) process</li>
<li>>1 Pulp worker (Celery worker) process</li>
</ul>
<li>>1 Apache HTTPD - serves mirrored content to clients</li>
<li>>1 Crane service - Docker plugin for Pulp</li>
</ul>
<div>
<br /></div>
<div>
This diagram illustrates the components and connectivity of a Pulp service as it will be composed in Kubernetes using Docker containers.<br />
<br /></div>
<br />
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="http://2.bp.blogspot.com/-lyklyUQZwEQ/U_56tPRbw2I/AAAAAAAAEy4/jKvcACLcY9Y/s1600/Pulp%2BService%2BStructure%2Bin%2BDocker%2Bwith%2BKubernetes%2B-%2B2.0.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img alt="Diagram of Pulp service component structure" border="0" src="http://2.bp.blogspot.com/-lyklyUQZwEQ/U_56tPRbw2I/AAAAAAAAEy4/jKvcACLcY9Y/s1600/Pulp%2BService%2BStructure%2Bin%2BDocker%2Bwith%2BKubernetes%2B-%2B2.0.png" height="452" title="Pulp Service Component Structure" width="640" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Pulp Service Component Structure</td></tr>
</tbody></table>
<br />
<br />
The simplest images will be those for the QPID and MongoDB services. I'm going to show how to create the MongoDB image first.<br />
<br />
There are several things I will not be addressing in this simple example:<br />
<br />
<ol>
<li><u>HA and replication</u><br />In production the MongoDB would be replicated<br />In production the QPID AMQP service would have a mesh of brokers</li>
<li><u>Communications Security</u><br />In production the links from the other components to the MongoDB and the QPID message broker would be encrypted and authenticated.<br /><br />Key management is actually a real problem with Docker at the moment and will require its own set of discussions.</li>
</ol>
<br />
<h2>
A Docker container for MongoDB</h2>
<br />
This post essentially duplicates the instructions for creating a MongoDB image which are <a href="http://docs.docker.com/examples/mongodb/">provided on the Docker documentation site</a>. I'm going to walk through them here for several reasons. First is for completeness and for practice on the basics of creating a simple image. Second, the Docker example uses Ubuntu for the base image. I am going to use Fedora. In later posts I'm going to be doing some work with Yum repos and RPM installation. Finally I'm going to make some notes which are relevant to the suitability of a container for use in a Kubernetes cluster.<br />
<br />
<h4>
Work Environment</h4>
<br />
I'm working on Fedora 20 with the <span style="font-family: Courier New, Courier, monospace;">docker-io</span> package installed and the docker service enabled and running. I've also added my username to the docker group in <span style="font-family: Courier New, Courier, monospace;">/etc/group</span> so I don't need to use sudo to issue docker commands. If your work environment differs you'll probably have to adapt some.<br />
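<br />
If you want to make the same group change, here is a sketch of one way to do it (run as root, substitute your own username, then log out and back in):<br />
<br />
<pre class="brush: bash ; title: 'Add a user to the docker group' ; highlight: 1"># append the user to the docker group rather than editing /etc/group by hand
usermod -aG docker mark
</pre>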
<br />
<h4>
Defining the Container: Dockerfile</h4>
New docker images are defined in a <i>Dockerfile</i>. Capitalization matters in the file name. The Dockerfile must reside in a directory of its own. Any auxiliary files that the Dockerfile may reference will reside in the same directory.<br />
<br />
The <a href="http://docs.docker.com/reference/builder/">syntax for a Dockerfile</a> is documented on the Docker web site.<br />
<br />
This is the Dockerfile for the MongoDB image on Fedora 20:
<br />
<br />
<div>
<script alt="Dockerfile Gist" src="https://gist.github.com/markllama/829690622aacee395836.js"></script>
</div>
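<br />
If the gist doesn't render, here is approximately what it contains, reconstructed from the build transcript later in this post. The JSON comment on lines 4-5 and the exact blank-line layout are omitted.<br />
<br />
<pre class="brush: bash ; title: 'MongoDB Dockerfile (reconstructed)' ; highlight: 1">FROM fedora:20
MAINTAINER Mark Lamourine &lt;markllama@gmail.com&gt;

# Install the server package and clean the YUM cache so it doesn't bloat the layer
RUN yum install -y mongodb-server && yum clean all

# Prepare the storage directory so external storage can be mounted over it
RUN mkdir -p /var/lib/mongodb && touch /var/lib/mongodb/.keep && chown -R mongodb:mongodb /var/lib/mongodb

# A slightly tweaked configuration file
ADD mongodb.conf /etc/mongodb.conf

VOLUME [ "/var/lib/mongodb" ]
EXPOSE 27017
USER mongodb
WORKDIR /var/lib/mongodb

CMD [ "/usr/bin/mongod", "--quiet", "--config", "/etc/mongodb.conf", "run" ]
</pre>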
<br />
That's really all it takes to define a new container image. The first two lines are the only ones that are mandatory for all Dockerfiles. The rest form the description of the new container.<br />
<br />
<a href="https://docs.docker.com/reference/builder/#from">Dockerfile: FROM</a><br />
<br />
Line 1 indicates the base image to begin with. It refers to an <a href="https://registry.hub.docker.com/_/fedora/">existing image</a> on the official public <a href="https://registry.hub.docker.com/">Docker registry</a>. This image is offered and maintained by the Fedora team. I specify the Fedora 20 version. If I had left the version tag off, the Dockerfile would use the latest tagged image available.<br />
<br />
<a href="https://docs.docker.com/reference/builder/#maintainer">Dockerfile: MAINTAINER</a><br />
<br />
Line 2 gives contact information for the maintainer of the image definition.<br />
<br />
Diversion:<br />
<br />
Lines 4 and 5 are an unofficial comment. It's a fragment of JSON which contains some information about how the image is meant to be used.<br />
<br />
<a href="https://docs.docker.com/reference/builder/#run">Dockerfile: RUN</a><br />
<br />
Line 7 is where the real fun begins. The RUN directive indicates that what follows is a command to be executed in the context of the base image. It will make changes or additions which will be captured and used to create a new layer. In fact, every directive from here on out creates a new layer. When the image is run, the layers are composed to form the final contents of the container before executing any commands within the container.<br />
<br />
The shell command which is the value of the RUN directive must be treated by the shell as a single line. If the command is too long to fit in an 80 character line then shell escapes (\&lt;cr&gt;) and conjunctions (';' or '&&' or '||') are used to indicate line continuation just as if you were writing into a shell on the CLI.<br />
<br />
This particular line installs the <span style="font-family: Courier New, Courier, monospace;">mongodb-server</span><span style="font-family: inherit;"> package and then cleans up the YUM cache. This last step is required because any differences in the file tree from the initial state will be included in the next image layer. Cleaning up after YUM keeps the cached RPMs and metadata from bloating the layer and the image.</span><br />
<span style="font-family: inherit;"><br /></span><span style="font-family: inherit;">Line 10 is another RUN statement. This one prepares the directory where the MongoDB storage will reside. Ordinarily this would be created on a host when the MongoDB package is installed, with a little more done during the startup process for the daemon. These steps are here explicitly because I'm going to punch a hole in the container so that I can mount the data storage area from the host. The mount process can overwrite some of the directory settings. Setting them explicitly here ensures that the directory is present and the permissions are correct for mounting the external storage.</span><br />
<br />
<a href="https://docs.docker.com/reference/builder/#add">Dockerfile: ADD</a><br />
<br />
<span style="font-family: inherit;">Line 14 adds a file to the container. In this case it's a slightly tweaked </span><span style="font-family: Courier New, Courier, monospace;">mongodb.conf</span><span style="font-family: inherit;"> file. It adds a couple of switches which the Ubuntu example from the Docker documentation applies using CLI arguments to the </span><span style="font-family: Courier New, Courier, monospace;">docker run</span><span style="font-family: inherit;"> invocation. The ADD directive takes the input file from the directory containing the Dockerfile and will overwrite the destination file inside the container.</span><br />
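<span style="font-family: inherit;"><br /></span>
<span style="font-family: inherit;">For reference, the tweaked file amounts to something like the following. The option names are inferred from the mongod startup log shown later in this post (the --quiet switch comes from the CMD line, not from the file):</span><br />
<br />
<pre class="brush: bash ; title: 'mongodb.conf (approximate)' ; highlight: 1"># data lives where the VOLUME will be mounted
dbpath = /var/lib/mongodb
# switches the Docker documentation example passes on the mongod command line instead
nohttpinterface = true
noprealloc = true
smallfiles = true
</pre>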
<span style="font-family: inherit;"><br /></span>
<span style="font-family: inherit;"><br /></span>
<span style="font-family: inherit;"><br /></span>
<span style="font-family: inherit;">Lines 16-22 don't add new content but rather describe the run-time environment for the contents of the container.</span><br />
<span style="font-family: inherit;"><br /></span>
<span style="font-family: inherit;"><a href="https://docs.docker.com/reference/builder/#volume">Dockerfile: VOLUME</a></span><br />
<span style="font-family: inherit;"><br /></span>
Line 16 officially declares that the directory <span style="font-family: Courier New, Courier, monospace;">/var/lib/mongodb</span> will be used as a mountpoint for external storage.<br />
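<br />
When the container is run, host storage can be attached at that mountpoint with the -v option. The host path and container name here are just examples; the directory must exist and be writable by the container's mongodb user:<br />
<br />
<pre class="brush: bash ; title: 'Mounting host storage at the VOLUME (sketch)' ; highlight: 1"># /srv/mongodb is a hypothetical host directory for the database files
docker run -d --name mongodb-ext -v /srv/mongodb:/var/lib/mongodb --publish-all markllama/mongodb
</pre>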
<br />
<a href="https://docs.docker.com/reference/builder/#expose">Dockerfile: EXPOSE</a><br />
<br />
<span style="font-family: inherit;">Line 18 declares that TCP port 27017 will be exposed. This will allow connections from outside the container to reach the MongoDB inside.</span><br />
<span style="font-family: inherit;"><br /></span>
<span style="font-family: inherit;"><a href="https://docs.docker.com/reference/builder/#user">Dockerfile: USER</a></span><br />
<span style="font-family: inherit;"><br /></span>
<span style="font-family: inherit;">Line 20 declares that the first command executed will be run as the </span><span style="font-family: Courier New, Courier, monospace;">mongodb</span><span style="font-family: inherit;"> user.</span><br />
<span style="font-family: inherit;"><br /></span>
<span style="font-family: inherit;"><a href="https://docs.docker.com/reference/builder/#workdir">Dockerfile: WORKDIR</a></span><br />
<span style="font-family: inherit;"><br /></span>
<span style="font-family: inherit;">Line 22 declares that the command will run in </span><span style="font-family: Courier New, Courier, monospace;">/var/lib/mongodb</span><span style="font-family: inherit;">, the home directory for the </span><span style="font-family: Courier New, Courier, monospace;">mongodb</span><span style="font-family: inherit;"> user.</span><br />
<span style="font-family: inherit;"><br /></span>
<span style="font-family: inherit;"><a href="https://docs.docker.com/reference/builder/#cmd">Dockerfile: CMD</a></span><br />
<span style="font-family: inherit;"><br /></span>
<span style="font-family: inherit;">The last line of the Dockerfile traditionally describes the default command to be executed when the container starts.</span><br />
<span style="font-family: inherit;"><br /></span>
<span style="font-family: inherit;">Line 24 uses the CMD directive. The arguments are an array of strings which make up the program to be invoked by default on container start.</span><br />
<span style="font-family: inherit;"><br /></span>
<br />
<h2>
<span style="font-family: inherit;">Building the Docker Image</span></h2>
<span style="font-family: inherit;"><br /></span>
<span style="font-family: inherit;">With the Dockerfile and the mongodb.conf template in the image directory (in my case, the directory is </span><span style="font-family: Courier New, Courier, monospace;">images/mongodb</span><span style="font-family: inherit;">) I'm ready to build the image. </span><span style="font-family: inherit;">The transcript for the build process is pretty long. This one I include in its entirety so you can see all of the activity that results from the Dockerfile directives.</span><br />
<span style="font-family: inherit;"><br /></span>
<span style="font-family: inherit;"><br /></span>
<br />
<pre class="brush: bash ; title: 'build mongodb image' ; highlight: 1">docker build -t markllama/mongodb images/mongodb
Sending build context to Docker daemon 4.096 kB
Sending build context to Docker daemon
Step 0 : FROM fedora:20
Pulling repository fedora
88b42ffd1f7c: Download complete
511136ea3c5a: Download complete
c69cab00d6ef: Download complete
---> 88b42ffd1f7c
Step 1 : MAINTAINER Mark Lamourine &lt;markllama@gmail.com&gt;
---> Running in 38db2e5fffbb
---> fc120ab67c77
Removing intermediate container 38db2e5fffbb
Step 2 : RUN yum install -y mongodb-server && yum clean all
---> Running in 42e55f18d490
Resolving Dependencies
--> Running transaction check
---> Package mongodb-server.x86_64 0:2.4.6-1.fc20 will be installed
--> Processing Dependency: v8 for package: mongodb-server-2.4.6-1.fc20.x86_64
...
Installed:
mongodb-server.x86_64 0:2.4.6-1.fc20
...
Complete!
Cleaning repos: fedora updates
Cleaning up everything
---> 8924655bac6e
Removing intermediate container 42e55f18d490
Step 3 : RUN mkdir -p /var/lib/mongodb && touch /var/lib/mongodb/.keep && chown -R mongodb:mongodb /var/lib/mongodb
---> Running in 88f5f059c3ff
---> f8e4eaed6105
Removing intermediate container 88f5f059c3ff
Step 4 : ADD mongodb.conf /etc/mongodb.conf
---> eb358bbbaf75
Removing intermediate container 090e1e36f7f6
Step 5 : VOLUME [ "/var/lib/mongodb" ]
---> Running in deb3367ff8cd
---> f91654280383
Removing intermediate container deb3367ff8cd
Step 6 : EXPOSE 27017
---> Running in 0c1d97e7aa12
---> 46157892e3fe
Removing intermediate container 0c1d97e7aa12
Step 7 : USER mongodb
---> Running in 70575d2a7504
---> 54dca617b94c
Removing intermediate container 70575d2a7504
Step 8 : WORKDIR /var/lib/mongodb
---> Running in 91759055c498
---> 0214a3fbcafc
Removing intermediate container 91759055c498
Step 9 : CMD [ "/usr/bin/mongod", "--quiet", "--config", "/etc/mongodb.conf", "run"]
---> Running in 6b48f1489a3e
---> 13d97f81beb4
Removing intermediate container 6b48f1489a3e
Successfully built 13d97f81beb4
</pre>
<br />
You can see how each directive in the Dockerfile corresponds to a build step, and you can see the activity that each directive generates.
<br />
<br />
When Docker processes a Dockerfile, what it really does is put the base image in a container, run it, and execute a command in that container based on the first Dockerfile directive. Each directive causes some change to the contents of the container.<br />
<br />
A Docker container is actually composed of a set of file trees that are layered using a read-only union filesystem with a read/write layer on the top. Any changes go into the top layer. When you unmount the underlying layers, what remains in the read/write layer are the changes caused by the first directive. When building a new image the changes for each directive are archived into a tarball and checksummed to produce the new layer and the layer's ID.<br />
<br />
This process is repeated for each directive, accumulating new layers until all of the directives have been processed. The intermediate containers are deleted, the new layer files are saved and tagged. The end result is a new image (a set of new layers).<br />
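<br />
One way to see the result of this layering is the docker history command, which lists an image's layers along with the directive that created each one (output not shown here):<br />
<br />
<pre class="brush: bash ; title: 'Inspect the image layers' ; highlight: 1">docker history markllama/mongodb
</pre>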
<br />
<h2>
Running the Mongo Container</h2>
<div>
<br /></div>
<div>
The simplest test for the new container is to try running it and observing what happens.</div>
<div>
<br /></div>
<div>
<pre class="brush: bash ; title: 'Run the MongoDB container' ; highlight: 1">docker run --name mongodb1 --detach --publish-all markllama/mongodb
a90b275d00d451fde4edd9bc99798a4487815e38c8efbe51bfde505c17d920ab
</pre>
<br />
This invocation indicates that docker should run the image named <span style="font-family: Courier New, Courier, monospace;">markllama/mongodb</span>. When it does, it should detach (run as a daemon) and make all of the network ports exposed by the container available to the host. (that's the --publish-all). It will name the newly created container <span style="font-family: Courier New, Courier, monospace;">mongodb1</span> so that you can distinguish it from other instances of the same image. It also allows you to refer to the container by name rather than needing the ID hash all the time. If you don't provide a name, docker will assign one from some randomly selected words.<br />
<br />
The response is a hash which is the full ID of the new running container. Most of the time you'll be able to get away with a shorter version of the hash (as presented by docker ps; see below) or with the container name.<br />
<br />
<h2>
Examining the Running Container(s)</h2>
</div>
<div>
<br /></div>
<div>
So the container is running. There's a MongoDB waiting for a connection. Or is there? How can I tell, and how can I figure out how to connect to it?</div>
<div>
<br /></div>
<div>
Docker offers a number of commands to view various aspects of the running containers.<br />
<br />
<h3>
<span style="font-family: inherit;">Listing the Running Containers.</span></h3>
<div>
<span style="font-family: inherit;"><br /></span></div>
To list the running containers use <span style="font-family: Courier New, Courier, monospace;">docker ps</span><span style="font-family: inherit;">. </span><br />
<span style="font-family: inherit;"><br /></span>
<br />
<pre>docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
a90b275d00d4 markllama/mongodb:latest /usr/bin/mongod --qu 5 mins ago Up 5 min 0.0.0.0:49155->27017/tcp mongodb1
</pre>
<span style="font-family: inherit;"><br /></span></div>
<div>
This line will likely wrap unless you have a wide screen.<br />
<br />
In this case there is only one running container. Each line is a summary report on a single container. The important elements for now are the name, the ID and the ports summary. This last tells me that I should be able to connect from the host to the container's MongoDB using localhost:49155, which is forwarded to the container's exposed port 27017.<br />
<br />
<h3>
What did it do on startup?</h3>
<div>
<br /></div>
<div>
A running container has one special process which is sort of like the init process on a host. That's the process indicated by the CMD or ENTRYPOINT directive in the Dockerfile.</div>
<div>
<br /></div>
<div>
When the container starts, the STDOUT of the initial process is connected to the docker service. I can retrieve the output by requesting the logs.</div>
<div>
<br />
For Docker commands which apply to single containers the final argument is either the ID or name of a container. Since I named the mongodb container I can use the name to access it.
<br />
<br />
<pre class="brush: bash ; title: 'docker logs for mongodb' ; highlight: 1">docker logs mongodb1
Thu Aug 28 20:38:08.496 [initandlisten] MongoDB starting : pid=1 port=27017 dbpath=/var/lib/mongodb 64-bit host=a90b275d00d4
Thu Aug 28 20:38:08.498 [initandlisten] db version v2.4.6
Thu Aug 28 20:38:08.498 [initandlisten] git version: nogitversion
Thu Aug 28 20:38:08.498 [initandlisten] build info: Linux buildvm-12.phx2.fedoraproject.org 3.10.9-200.fc19.x86_64 #1 SMP Wed Aug 21 19:27:58 UTC 2013 x86_64 BOOST_LIB_VERSION=1_54
Thu Aug 28 20:38:08.498 [initandlisten] allocator: tcmalloc
Thu Aug 28 20:38:08.498 [initandlisten] options: { command: [ "run" ], config: "/etc/mongodb.conf", dbpath: "/var/lib/mongodb", nohttpinterface: "true", noprealloc: "true", quiet: true, smallfiles: "true" }
Thu Aug 28 20:38:08.532 [initandlisten] journal dir=/var/lib/mongodb/journal
Thu Aug 28 20:38:08.532 [initandlisten] recover : no journal files present, no recovery needed
Thu Aug 28 20:38:10.325 [initandlisten] preallocateIsFaster=true 26.96
Thu Aug 28 20:38:12.149 [initandlisten] preallocateIsFaster=true 27.5
Thu Aug 28 20:38:14.977 [initandlisten] preallocateIsFaster=true 27.58
Thu Aug 28 20:38:14.977 [initandlisten] preallocateIsFaster check took 6.444 secs
Thu Aug 28 20:38:14.977 [initandlisten] preallocating a journal file /var/lib/mongodb/journal/prealloc.0
Thu Aug 28 20:38:16.165 [initandlisten] preallocating a journal file /var/lib/mongodb/journal/prealloc.1
Thu Aug 28 20:38:17.306 [initandlisten] preallocating a journal file /var/lib/mongodb/journal/prealloc.2
Thu Aug 28 20:38:18.603 [FileAllocator] allocating new datafile /var/lib/mongodb/local.ns, filling with zeroes...
Thu Aug 28 20:38:18.603 [FileAllocator] creating directory /var/lib/mongodb/_tmp
Thu Aug 28 20:38:18.629 [FileAllocator] done allocating datafile /var/lib/mongodb/local.ns, size: 16MB, took 0.008 secs
Thu Aug 28 20:38:18.629 [FileAllocator] allocating new datafile /var/lib/mongodb/local.0, filling with zeroes...
Thu Aug 28 20:38:18.637 [FileAllocator] done allocating datafile /var/lib/mongodb/local.0, size: 16MB, took 0.007 secs
Thu Aug 28 20:38:18.640 [initandlisten] waiting for connections on port 27017
</pre>
<br />
This is just what I'd expect for a running mongod.</div>
<h3>
Just the Port Information please?</h3>
<br />
If I know the name of the container or its ID I can request the port information explicitly. This is useful when the output must be parsed, perhaps by a program that will create another container needing to connect to the database.<br />
<br /></div>
<div>
<pre class="brush:bash ; title: 'query port information' ; highlight: 1">docker port mongodb1 27017
0.0.0.0:49155
</pre>
<br /></div>
<div>
<h3>
But is it <i>working?</i></h3>
</div>
<div>
<i><br /></i></div>
<div>
Docker thinks there's something running. I have enough information now to try connecting to the database itself from the host.</div>
<div>
<br /></div>
<div>
The ports information indicates that the container port 27017 is forwarded to the host "all interfaces" port 49155. If the host firewall allows connections in on that port the database could be used (or attacked) from outside.</div>
<div>
<br /></div>
<div>
<br /></div>
<div>
<pre class="brush: bash ; title: 'check mongodb connectivity' ; highlight: 1">echo "show dbs" | mongo localhost:49155
MongoDB shell version: 2.4.6
connecting to: localhost:49155/test
local 0.03125GB
bye
</pre>
<br />
<h2>
What next?</h2>
<div>
<br /></div>
<div>
At this point I have verified that I have a running MongoDB accessible from the host (or outside if I allow).</div>
<div>
<br /></div>
<div>
There's lots more that you can do and query about the containers using the <span style="font-family: Courier New, Courier, monospace;">docker</span><span style="font-family: inherit;"> CLI command, but there's no need to detail it all here. You can learn more from the <a href="https://docs.docker.com/reference/commandline/cli/">Docker documentation web site</a></span></div>
<div>
<br /></div>
<div>
Before I start on the Pulp service proper I also need a QPID service container. This is very similar to the MongoDB container so I won't go into detail.<br />
<br />
Since the point of the exercise is to run Pulp in Docker with Kubernetes, the next step will be to run the MongoDB and QPID containers using Kubernetes.</div>
</div>
markllamahttp://www.blogger.com/profile/14193184544557876514noreply@blogger.com0tag:blogger.com,1999:blog-5022186007695457923.post-74704206966964922702014-08-27T16:47:00.000-07:002014-09-02T10:02:08.198-07:00Intro to Containerized Applications: Docker and Kubernetes<h2>
Application Virtualization</h2>
In a world where a Hot New Thing In Tech is manufactured by marketing departments on demand for every annual trade show, there's something that is stirring up interest all by itself (though it has its own share of marketing help). The idea of <i>application containers</i> in general and <i>Docker</i> specifically has become a big deal in the software development industry in the last year.<br />
<br />
I'm generally pretty skeptical of the hype that surrounds emerging tech and concepts (see <a href="https://en.wikipedia.org/wiki/DevOps">DevOps)</a> but I think Docker has the potential to be "disruptive" in the business sense of causing people to have to re-think how they do things in light of new (not yet well understood) possibilities.<br />
<br />
In the next few posts (which I hope to have in rapid succession now) I plan to go into some more detail about how to create Docker containers which are suitable for composition into a working service within a Kubernetes cluster. The application I'm going to use is <a href="http://www.pulpproject.org/">Pulp</a>, a software repository mirror with life-cycle management capabilities. It's not really the ideal candidate because of some of the TBD work remaining in Docker and Kubernetes, but it is a fairly simple service that uses a database, a messaging service and shared storage. Each of these brings out capabilities and challenges intrinsic in building containerized services.<br />
<h2>
TL;DR.</h2>
<div>
Let me say at the outset that this is a long post, more philosophical than technical. I'm going to get to the guts of these tools in all their gooey glory, but I want to set myself some context before I start. If you want to get right to tearing open the toys, you can go straight to the sites for Docker and Kubernetes:</div>
<div>
<br /></div>
<div>
<ul>
<li>Docker - Containerized applications<br /><a href="http://www.docker.com/">http://www.docker.com</a></li>
<li>Kubernetes - Clustering container hosts<br /><a href="https://github.com/GoogleCloudPlatform/kubernetes">https://github.com/GoogleCloudPlatform/kubernetes</a></li>
</ul>
<div>
<br /></div>
</div>
<h2>
The Obligatory History Lesson</h2>
<br />
For 15 years, since the introduction of VMware Workstation in 1999 [1], the primary mover of cloud computing has been the <i>virtual machine.</i> Once the idea was out there a number of other hardware virtualization methods were created: Xen and KVM for Linux, and Microsoft Hyper-V on Windows. Some of this software virtualization of hardware caused problems on the real hardware, so in 2006 both Intel and AMD introduced processors with special features to improve the performance and behavior of virtual machines running on their real machines. [2]<br />
<br />
All of these technologies have similar characteristics. They also have similar benefits and gotchas.<br />
<br />
The computer which runs all of the virtual machines (henceforth: VMs) is known as the <i>host</i>. Each of the VM instances is known as a <i>guest</i>. The guests each use one or more (generally very large) files in the host disk space which contain the entire filesystem of the guest. While each guest is running it typically consumes a single (again, very large) process on the host. Various methods are used to grant the guest VMs access to the public network, both for traffic out of and into the VM. The VM process simulates an entire computer so that for most reasonable purposes it looks and behaves as if it's a real computer.<br />
<br />
This is very different from what has become known as <i>multi-tenant</i> computing. This is the traditional model in which each computer has accounts and users can log into their account and share (and compete for) the disk space and CPU resources. They also often have access to the shared security information. The root account is special and it's a truism among sysadmins that if you can log into a computer you can gain root access if you try hard enough.<br />
<br />
Sysadmins have to work very hard in multi-tenant computing environments to prevent both malicious and accidental conflicts between their users' processes and resource use. If, instead of an account on the host, you give each user a whole VM, the VM provides a nice (?) clean (?) boundary (?) between each user and the sensitive host OS. <br />
<br />
Because VMs are just programs, it is also possible to automate the creation and management of user machines. This is what has made possible modern commercial cloud services. Without virtualization, on-demand public cloud computing would be unworkable.<br />
<br />
There <u>are</u> a number of down-sides to using VMs to manage user computing. Because each VM is a separate computer, each one must have an OS installed and then applications installed and configured. This can be mitigated somewhat by creating and using disk images. This is the equivalent of the ancient practice of creating a "gold disk" and cloning it to create new machines. Still each VM must be treated as a complete OS requiring all of the monitoring and maintenance by a qualified system administrator that a bare-metal host needs. It also contains the entire filesystem of a bare-metal server and requires comparable memory from its host.<br />
<br />
<h2>
Docker</h2>
<div>
<br />
For the buzzword savvy Docker is a software containerization mechanism. Explaining what that means takes a bit of doing. It also totally misses the point, because the enabling technology is totally unimportant. What matters is what it allows us to do. But first, for the tech weenies among you....<br />
<br />
<h3>
Docker Tech: Cgroups, Namespaces and Containers</h3>
<div>
<br /></div>
Ordinary Linux processes have a largely unobstructed view of the resources available from the operating system. They can view the entire file system (subject to user and group access control). They have access to memory and to the network interfaces. They also have access to at least some information about the other processes running on the host.<br />
<br />
Docker takes advantage of <i><a href="http://en.wikipedia.org/wiki/Cgroups">cgroups</a></i> and <i><a href="https://en.wikipedia.org/wiki/Linux_namespaces">kernel namespaces</a></i> to manipulate the view that a process has of its surroundings. A <i>container</i> is a view of the filesystem and operating system which is a carefully crafted subset of what an ordinary process would see. Processes in a container can be made almost totally unaware of the other processes running on the host. The container presents a limited file system tree which can entirely replace what the process would see if it were not in a container. In some ways this is like a traditional <i><a href="https://en.wikipedia.org/wiki/Chroot">chroot</a></i> environment but the depth of the control is much more profound.<br />
<br />
So far, this does look a lot like <a href="https://en.wikipedia.org/wiki/Solaris_Containers">Solaris Containers</a>[3], but that's just the tech, there's more.<br />
<br />
<h3>
The Docker Ecosystem</h3>
<div>
<br /></div>
<div>
The really significant contribution of Docker is the way in which containers and their contents are defined and then distributed.</div>
<div>
<br /></div>
<div>
It would take a Sysadmin Superman to manually create the content and environmental settings to duplicate what Docker does with a few CLI commands. I know some people who could do it, but frankly it probably wouldn't be worth the time spent even for them. Even I don't really want to get that far into the mechanics (though I could be convinced if there's interest). What you can do with it though is pretty impressive.</div>
<div>
<br /></div>
<div>
<b>Note</b>: Other people describe better than I could how to install Docker and prepare it for use. <a href="https://docs.docker.com/installation/#installation">Go there, do that, come back</a>.</div>
<div>
<br /></div>
<div>
<b>Hint</b>: on Fedora 20+ you can add your user to the "docker" line in <span style="font-family: Courier New, Courier, monospace;">/etc/group</span> and avoid a lot of calls to <span style="font-family: Courier New, Courier, monospace;">sudo</span> when running the <span style="font-family: Courier New, Courier, monospace;">docker</span> command.</div>
<div>
<br /></div>
<div>
To run a Docker container you just need to know the name of the image and any arguments you want to pass to the process inside. The simplest images to run are the <span style="font-family: inherit;"><i>ubuntu</i></span> and <i>fedora</i> images:</div>
<div>
<br />
<br /></div>
<div>
<pre class="brush: bash; title: 'docker Hello World'; highlight: 1">docker run fedora /bin/echo "Hello World"
Unable to find image 'fedora' locally
Pulling repository fedora
88b42ffd1f7c: Download complete
511136ea3c5a: Download complete
c69cab00d6ef: Download complete
Hello World
</pre>
</div>
<br />
Now honestly, short of a Java app that's probably the heaviest weight "Hello World" you've ever done. What happened was, your local docker system looked for a container image named "fedora" and didn't find one. So it went to the official Docker registry at docker.io and looked for one there. It found it, downloaded it and then started the container and ran the shell command inside, returning the STDOUT to your console.</div>
<br />
Now look at those three lines following the "<span style="font-family: Courier New, Courier, monospace;">Pulling repository</span>" output from the <span style="font-family: Courier New, Courier, monospace;">docker run</span> command.<br />
<br />
A docker "image" is a fiction. Nearly all images are composed of a number of <i>layers</i>. The base layer or <i>base image</i> usually provides the minimal OS filesystem content, libraries and such. Then layers are added for application packages or configuration information. Each layer is stored as a tarball with the contents and a little bit of metadata which indicates, among other things, the list of layers below it. Each layer is given an ID based on a hash of the tarball so that each can be uniquely identified. When an "image" is stored on the Docker registry, it is given a name and possibly a label so that it can be retrieved on demand.<br />
<br />
In this case Docker downloaded three image layers and then composed them to make the fedora image and then ran the container and executed <span style="font-family: Courier New, Courier, monospace;">/bin/echo</span> inside it.<br />
<br />
You can view the containers that are or have been run on your system with <span style="font-family: Courier New, Courier, monospace;">docker ps</span>.<br />
<br />
<br />
<pre class="brush: bash ; title: 'docker ps example'; highlight: 1">docker ps -l
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
612bc60ede7a fedora:20 /bin/echo 'hello wor 7 minutes ago Exited (0) 7 minutes ago naughty_pike
</pre>
<br />
<br />
Your output will very likely be wrapped around unless you have a very wide terminal screen open. The <span style="font-family: Courier New, Courier, monospace;">-l</span> switch tells docker only to print information about the last container created.<br />
<div>
<br /></div>
You can also run a shell inside the container so you can poke around. The <span style="font-family: Courier New, Courier, monospace;">-it</span><span style="font-family: inherit;"> switches indicate that the container will be run interactively and that it should be terminated when the primary process exits.</span><br />
<br />
<pre class="brush: bash ; title: 'simple docker run with shell' ; highlight: 1">docker run -it fedora /bin/sh
sh-4.2# ls
bin etc lib lost+found mnt proc run srv tmp var
dev home lib64 media opt root sbin sys usr
sh-4.2# ps -ef
PID TTY TIME CMD
1 ? 00:00:00 sh
8 ? 00:00:00 ps
sh-4.2# df -k
Filesystem 1K-blocks Used Available Use% Mounted on
/dev/mapper/docker-8:4-2758071-97e6230110ded813bff36c0a9a397d74d89af18718ea897712a43312f8a56805 10190136 429260 9220204 5% \
tmpfs 24725556 0 24725556 0% /dev
shm 65536 0 65536 0% /dev/shm
/dev/sda4 132492664 21656752 104082428 18% /etc/hosts
tmpfs 24725556 0 24725556 0% /proc/kcore
sh-4.2# exit
</pre>
<br />
<br />
That's three simple commands inside the container. The file system at / seems to be fairly ordinary for a complete (though minimal) operating system. It shows that there appear to be only two processes running in the container, though, and the mounted filesystems are a much smaller set than you would expect.<br />
<br />
Now that you have this <i>base image</i>, you can use it to create new images by adding layers of your own. You can also register with docker.io so that you can push the resulting images back out and make them available for others to use.<br />
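<br />
The publishing workflow is short. This sketch assumes you've already registered an account on docker.io, and markllama/myapp is a placeholder image name:<br />
<br />
<pre class="brush: bash ; title: 'Build and share an image (sketch)' ; highlight: 1">docker build -t markllama/myapp images/myapp   # add your own layers on top of a base image
docker login                                   # authenticate to the public registry
docker push markllama/myapp                    # make the image available for others to pull
</pre>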
<br />
These are the two aspects of Docker that make it truly significant.<br />
<br />
<h3>
From Software Packaging to Application Packaging</h3>
<div>
<br /></div>
Another historic diversion. Think about this: how do we get software?<br />
<br />
<h4>
Tarballs to RPMs (and Debs)</h4>
<br />
Back in the old days we used to pass around software using FTP and tarballs. We built it ourselves with a compiler. <span style="font-family: Courier New, Courier, monospace;">compress</span><span style="font-family: inherit;">, </span><span style="font-family: Courier New, Courier, monospace;">gzip</span><span style="font-family: inherit;">, </span><span style="font-family: Courier New, Courier, monospace;">configure</span><span style="font-family: inherit;"> and </span><span style="font-family: Courier New, Courier, monospace;">make</span><span style="font-family: inherit;"> made it lots faster but not easier. At least for me, Solaris introduced <a href="https://en.wikipedia.org/wiki/Package_(package_management_system)">software packages</a>, bundles of pre-compiled software which included dependency information so that you could just ask for LaTeX and you'd get all of the stuff you needed for it to work without having to either rebuild it or chase down all the broken loose ends.</span><br />
<span style="font-family: Arial, Helvetica, sans-serif;"><br /></span>
<span style="font-family: inherit;">Now, many people have problems with<a href="https://en.wikipedia.org/wiki/Package_management_system"> package management systems</a>. Some people have favorites or pets, but I can tell you from first hand experience, I don't care which one I have, but I don't want not to have one. (yes, I hear you <a href="http://www.gentoo.org/">Gentoo</a>, no thanks)</span><br />
<span style="font-family: inherit;"><br /></span>
<span style="font-family: inherit;">For a long time software binary packages were the only way to deliver software to an OS. You still had to install and configure the OS. If you could craft the perfect configuration and you had the right disks you could clone your working OS onto a new disk and have a perfect copy. Then you had to tweak the host and network configurations, but that was much less trouble than a complete re-install.</span><br />
<br />
<h4>
<span style="font-family: Arial, Helvetica, sans-serif;">Automated OS Installation</span></h4>
<br />
<span style="font-family: inherit;">Network boot mechanisms like <a href="https://en.wikipedia.org/wiki/Preboot_Execution_Environment">PXE</a> and software installation tools,<a href="http://docs.oracle.com/cd/E19253-01/817-5506/"> Jumpstart</a>, <a href="https://fedoraproject.org/wiki/Anaconda/Kickstart">Kickstart/Anaconda</a>, <a href="https://www.suse.com/documentation/sles11/singlehtml/book_autoyast/book_autoyast.html">AutoYAST</a> and others made the Golden Image go away. They let you define the system configuration and then would automate the installation and configuration process for you*. You no longer had to worry about cloning and you didn't have to do a bunch of </span>archaeology<span style="font-family: inherit;"> on your golden disk when it was out of date and you needed to make a new one. All of your choices were encapsulated in your OS config files. You could read them, tweak them and run it again.</span><br />
<span style="font-family: inherit;"><br /></span>
<span style="font-family: inherit;">* yes, I didn't mention <a href="https://en.wikipedia.org/wiki/Software_configuration_management">Configuration Management</a>, but that's really an extension of the boot/install process in this case, not a fundamentally different thing.</span><br />
<span style="font-family: inherit;"><br /></span>
<span style="font-family: inherit;">In either case though, if you wanted to run two applications on the same host, the possibility existed that they would collide or interfere with each other in some way. Each application also presented a potential security risk to the others. If you crack the host using one app you could fairly surely gain access to everything else on the host. Even inadvertent interactions could cause problems that would be difficult to diagnose and harder to mitigate.</span><br />
<span style="font-family: inherit;"><br /></span>
<br />
<h4>
<span style="font-family: inherit;">Virtual Disks and the Rebirth of the Clones</span></h4>
<span style="font-family: inherit;"><br /></span>
<span style="font-family: inherit;">With the advent of virtual machines, the clone was back, but now it was called a <i>disk image.</i> You could just copy the disk image to a host and boot it in a VM. If you want more you make copies and tweak them after boot time.</span><br />
<br />
<span style="font-family: inherit;">So now we had two different delivery mechanisms: Packages for software to be installed (either on bare metal or in a VM) and disk images for completed installations to be run in a VM. That is: unconfigured application software or fully configured operating systems.</span><br />
<span style="font-family: inherit;"><br /></span>
<span style="font-family: inherit;">You can isolate applications on one host by placing them into different VMs. But this means you have to configure not one, but three operating systems (the host plus one VM for each service) to build an application that requires two services. That's three ways to get reliability and security wrong. Three distinct moving parts that require a qualified sysadmin to manage them and at least two things which the Developer/Operators will need to access to make the services work.</span><br />
<span style="font-family: inherit;"><br /></span>
<span style="font-family: inherit;">Docker offers something new. It offers the possibility of distributing just the application. *</span><br />
<span style="font-family: inherit;"><br /></span>
<span style="font-family: inherit; font-size: x-small;">* Yeah, there's a bit more overhead than that, but nowhere near a complete VM, and layers can be shared.</span><br />
<br />
<h4>
Containerization: Application Level Software Delivery</h4>
<br />
Docker offers the possibility of delivering software in units somewhere between the binary package and the disk image. Docker containers have isolation characteristics similar to apps running in VMs without the overhead of a complete running kernel in memory and without all of the auxiliary services that a complete OS requires.<br />
<br />
Docker also offers the capability for developers of reasonable skill to create and customize the application images and then to compose them into complex services which can then be run on a single host, or distributed across many.<br />
<br />
The docker registry presents a well-known central location for developers to push their images and name them so that consumers can find them, download them and use them without additional interaction. Because the application has been tested in the container, the developer can be sure that she's identified all of the configuration information that might need to be passed in and out. She can explicitly document that, removing many opportunities for misconfiguration or adverse interactions between services on the same host.<br />
<br />
It's the dawning of a new day.<br />
<br />
If only it were that easy.<br />
<br />
<h3>
Here There Be Dragons</h3>
<div>
<br /></div>
<div>
When a new day dawns on an unfamiliar landscape it slowly reveals a new vista to the eye. If you're in a high place you might see far off, but nearer things could be hidden under the canopy of trees or behind a fold in the land, so that when you actually step down and begin exploring you encounter surprises.</div>
<div>
<br /></div>
<div>
Whenever a new technology appears people tend to try to use it the same way they're used to using their older tools. It generally takes a while to figure out the best way to use a new tool and to come to terms with its differentness. There's often a lot of exploring and a fair number of false starts and retraced steps before the real best uses settle out.</div>
<div>
<br /></div>
<div>
Docker does have some youthful shortcomings.</div>
<div>
<br /></div>
<div>
Docker is marvelously good at pulling images from a specific repository (known to the world as the Docker.io Registry) and running them on a specific host to which you are logged on. It's also good at pushing new images to the docker registry. These are both very localized point-to-point transactions.</div>
<div>
<br /></div>
<div>
<div>
Docker has no awareness of anything other than the host it is running on and the docker registry. It's not aware of other docker hosts nearby. It's not aware of alternate registries. It's not even aware of the resources in containers on the same host that other containers might want to share.</div>
</div>
<div>
<br /></div>
<div>
The only way to manage a specific container is to log onto its host and run the docker command to examine and manipulate it.</div>
<div>
<br /></div>
<div>
The first thing anyone wants to do when they create a container of any kind is to punch holes in it. What good is a container where you can't reach the contents? Sometimes people want to see in. Other times people want to insert things that weren't there in the first place. And they want to connect pipes between the containers, and from the containers to the outside world.</div>
<div>
<br /></div>
<div>
Docker does have ways of exposing specific network ports from a container to the host or to the host's external network interfaces.</div>
<div>
<br /></div>
<div>
It can import a part of the host filesystem into a container. It also has ways to share storage between two containers on the same host. What it doesn't have is a way to identify and use storage which can be shared between hosts. If you want to have a cluster of docker hosts where the containers can share storage, this is a problem.</div>
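<div>
<br /></div>
<div>
To make those single-host capabilities concrete, here is roughly what they look like on the command line; the image names, paths and container name are placeholders:</div>
<div>
<br />
<pre class="brush: bash ; title: 'Single-host plumbing Docker does provide (sketch)' ; highlight: 1"># expose a container port on a specific host port
docker run -d -p 8080:80 example/webapp
# import part of the host filesystem into a container
docker run -d -v /srv/data:/var/lib/data example/dbapp
# share the volumes of a running container (here named dbapp1) with another on the same host
docker run -d --volumes-from dbapp1 example/backup
</pre>
</div>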
<div>
<br /></div>
<div>
It also doesn't have a means to get secret information from... well anywhere... safely from its hidey hole into the container. Since it's trivial for anyone to push an image to the public registry, it's really important not to put secret information into any image, even one that's going to be pushed to a private registry.</div>
<div>
<br /></div>
<div>
As noted, Docker does what it does really well. The developers have been very careful not to try to overreach, and I agree with most of their decisions. The issues I listed above are not flaws in Docker, they are mostly tasks that are outside Docker's scope. This keeps the Docker development effort focused on the problems they are trying to solve so they can solve them well.</div>
<div>
<br /></div>
<div>
<span style="font-family: inherit;">To use Docker on anything but a small scale you need something else. Something that is aware of clusters of container hosts, the resources available to each host and how to bind those resources to new containers regardless of which host ends up holding the container. Something that is capable of describing complex multi-container applications which can be spread across the hosts in a cluster and yet be properly and securely connected. </span></div>
<br />
Read on.
<br />
<br />
<h2>
Kubernetes</h2>
<div>
<br /></div>
<div>
Who might want to run vast numbers of containerized applications spread over multiple enormous host clusters without regard to network topology or physical geography? Who else? Google.<br />
<br />
Kubernetes is Google's response to the problem of managing Docker containers on a scale larger than a couple of manually configured hosts. It, like Docker, is a young project and there are an awful lot of TBDs, but there's a working core and a lot of active development. Google and the other partners that have joined the Kubernetes effort have very strong motivation to make this work.<br />
<br />
Kubernetes is made up of two service processes that run on each Docker host (in addition to the <span style="font-family: Courier New, Courier, monospace;">dockerd</span><span style="font-family: inherit;">). The </span><span style="font-family: Courier New, Courier, monospace;">etcd</span><span style="font-family: inherit;"> binds the hosts into a cluster and distributes the configuration information. The </span><span style="font-family: Courier New, Courier, monospace;">kubelet</span><span style="font-family: inherit;"> daemon is the active agent on each container host which responds to requests to create, monitor and destroy containers. In Kubernetes parlance, a container host is known as a <i>minion<span style="font-family: inherit;">.</span></i></span><br />
<span style="font-family: inherit;"><br /></span>
<span style="font-family: inherit;">The </span><span style="font-family: Courier New, Courier, monospace;"><a href="https://coreos.com/blog/distributed-configuration-with-etcd/">etcd</a></span><span style="font-family: inherit;"> service is taken from <a href="https://coreos.com/">CoreOS </a>which is an attempt at application level software packaging and system management that predates Docker. CoreOS seems to be adopting Docker as its container format. </span><br />
<span style="font-family: inherit;"><br /></span>
<span style="font-family: inherit;">There is one other service process, the Kubernetes </span><span style="font-family: Courier New, Courier, monospace;">app-service</span><span style="font-family: inherit;"> which acts as the head node for the cluster. The app-service accepts commands from users and forwards them to the minions as needed. Any host running the Kubernetes </span><span style="font-family: Courier New, Courier, monospace;">app-server</span><span style="font-family: inherit;"> process is known as a <i>master</i>.</span><br />
<span style="font-family: inherit;"><br /></span>
<span style="font-family: inherit;">Clients communicate with the masters using the </span><span style="font-family: Courier New, Courier, monospace;">kubecfg</span><span style="font-family: inherit;"> command.</span><br />
<span style="font-family: inherit;"><br /></span>
<span style="font-family: inherit;">A little more terminology is in order.</span><br />
<span style="font-family: inherit;"><br /></span>
<span style="font-family: inherit;">As noted, container hosts are known as <i>minions</i>. Sometimes several containers must be run on the same minion so that they can share local resources. Kubernetes introduces the concept of a <i>pod</i> of containers to represent a set of containers that must run on the same host. You can't access individual containers within a pod at the moment (there are lots more caveats like this. It is a REALLY young project). </span><br />
<span style="font-family: inherit;"><br /></span>
<span style="font-family: inherit;">Installing Kubernetes is a bit more intense than Docker. Both Docker and Kubernetes are written in Go. Docker is mature enough that it is available as binary packages for both RPM and DEB packaged Linux distributions. (see your local package manager for <i>docker-io</i> and it's dependencies.)</span><br />
<span style="font-family: inherit;"><br /></span>
<span style="font-family: inherit;">The simplest way to get Kubernetes right now is to run it in VirtualBox VMs managed by Vagrant. I recommend the <a href="https://github.com/GoogleCloudPlatform/kubernetes/blob/master/docs/getting-started-guides/vagrant.md">Kubernetes Getting Started Guide for Vagrant</a> . There's a bit of assembly required.</span><br />
<span style="font-family: inherit;"><br /></span>
Hint: once it's built, I create an alias to the <span style="font-family: Courier New, Courier, monospace;">cluster/kubecfg.sh</span><span style="font-family: inherit;"> so I don't have to put it in my path or type it out every time.</span><br />
<span style="font-family: inherit;"><br /></span>
<span style="font-family: inherit;">I'm not going to show very much about Kubernetes yet. It doesn't really make sense to run any interactive containers like the "Hello World" or the Fedora 20 shell using Kubernetes. It's really for running persistent services. I'll get into it deeply in a coming post. For now I'll just walk through the Vagrant startup and simple queries of the test cluster.</span><br />
<span style="font-family: inherit;"><br /></span>
<br />
<pre class="brush: bash ; title: 'Starting a Kubernetes cluster with Vagrant'; highlight: 1">$ vagrant up
Bringing machine 'master' up with 'virtualbox' provider...
Bringing machine 'minion-1' up with 'virtualbox' provider...
Bringing machine 'minion-2' up with 'virtualbox' provider...
Bringing machine 'minion-3' up with 'virtualbox' provider...
==> master: Importing base box 'fedora20'...
...
master:
==> master: Summary
==> master: -------------
==> master: Succeeded: 44
==> master: Failed: 0
==> master: -------------
==> master: Total: 44
==> master:
==> minion-1: Importing base box 'fedora20'...
Progress: 90%
...
==> minion-3: Complete!
==> minion-3: * INFO: Running install_fedora_stable_post()
==> minion-3: disabled
==> minion-3: ln -s '/usr/lib/systemd/system/salt-minion.service' '/etc/systemd/system/multi-user.target.wants/salt-minion.service'
==> minion-3: INFO: Running install_fedora_check_services()
==> minion-3: INFO: Running install_fedora_restart_daemons()
==> minion-3: * INFO: Salt installed!
</pre>
<br />
At this point there are only three interesting commands. They show the set of minions in the cluster, the running pods and the services that are defined. The last two aren't very interesting because there aren't any pods or services.<br />
<br />
<pre class="brush: bash ; title: 'list minions' ; highlight: 1">$ cluster/kubecfg.sh list minions
minions
----------
10.245.2.2
10.245.2.3
10.245.2.4
$ cluster/kubecfg.sh list pods
Name Image(s) Host Labels
---------- ---------- ---------- ----------
$ cluster/kubecfg.sh list services
Name Labels Selector Port
---------- ---------- ---------- ----------
</pre>
</div>
<br />
We know about minions and pods. In Kubernetes a <i>service</i> is actually a port proxy for a TCP port. This allows Kubernetes to place service containers arbitrarily while still allowing other containers to connect to them by a well-known IP address and port. Containers that should accept traffic for that port are picked out by the service's <i>selector</i> value. The service will then forward traffic to those containers.<br />
<br />
Right now Kubernetes accepts requests and prints reports in structured data formats, JSON or YAML. To create a new pod or service, you describe the new object using one of these data formats and then submit the description with a "create" command.<br />
<br />
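To make that concrete, here's a rough sketch of what a minimal service description and the corresponding create call looked like at the time I wrote this. The id, port and selector values are invented for illustration, and both the JSON fields and the exact kubecfg arguments were changing from release to release, so treat this as a shape rather than a recipe.<br />
<br />
<pre class="brush: bash ; title: 'sketch: defining and creating a service'">$ cat webservice.json
{
  "id": "webservice",
  "port": 8000,
  "selector": { "name": "web" }
}
$ cluster/kubecfg.sh -c webservice.json create /services
</pre>
<br />
Pods are created the same way: write a JSON description of the pod and its containers, then <span style="font-family: Courier New, Courier, monospace;">create /pods</span> with it.<br />
<br />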
<h2>
Summary</h2>
<div>
I think software containers in general and Docker in particular have a very significant future. I'm not a big bandwagon person but I think this one is going to matter.</div>
<div>
<br /></div>
<div>
Docker's going to need some more work itself and it's going to need a lot of infrastructure around it to make it suitable for the enterprise and for public cloud use. Kubernetes is one piece that will make using Docker on a large scale possible.</div>
<br />
<br />
See you soon.<br />
<h2>
References</h2>
<br />
<ul>
<li>[1] VMWare - <a href="https://en.wikipedia.org/wiki/Vmware#History">https://en.wikipedia.org/wiki/Vmware#History</a></li>
<li>[2] X86 Hardware Virtualization - <a href="https://en.wikipedia.org/wiki/X86_virtualization">https://en.wikipedia.org/wiki/X86_virtualization</a></li>
<li>[3] Solaris Containers - <a href="https://en.wikipedia.org/wiki/Solaris_Containers">https://en.wikipedia.org/wiki/Solaris_Containers</a></li>
</ul>
markllamahttp://www.blogger.com/profile/14193184544557876514noreply@blogger.com8tag:blogger.com,1999:blog-5022186007695457923.post-55500121485357902502014-08-18T08:20:00.000-07:002014-08-18T08:23:38.512-07:00Hey! I'm Back (and the Cloud is Bigger than Ever)After a few months trying to do Businessy things that I don't think I'm very good at, it looks like I could be back in the software dev/sysadmin space again for a while.<br />
<br />
You'll notice a name change on the blog: It's no longer just about <a href="http://openshift.redhat.com/">OpenShift</a>. Red Hat is getting into a number of related and extremely innovative and promising projects and trying to make them work together. I'm working to assist on a number of these projects where an extra hand is needed and I get to learn all kinds of cool stuff in the process.<br />
<br />
The projects all revolve around one form of "virtualization" or another and all of the efforts are on taking these tools and using them to create enterprise class services.<br />
<br />
<h2>
OpenStack</h2>
<div>
<br /></div>
<a href="http://www.openstack.org/">OpenStack</a> is essentially Amazon Web Services(r) for on-premise use. To put it another way, it's an attempt to mechanize all of the functions of all of the groups in a typical enterprise IT department: networking, data center host management, OS and application provisioning, storage management, database services, user management and policies and more.<br />
<br />
Merely replacing all of the people in an organization that do these things would be boring (and counterproductive). What OpenStack really offers is the ability to push control of the resources closer to the real user, offering self-service access to things which used to require coordination between experts and representatives from a number of different groups with the expected long lead times. The ops people can focus on making sure there are sufficient resources to work, and the users, the developers and the applications admins can just take what they need (subject to policy) to do their work.<br />
<br />
Now that's nice for the end user. They get a snazzy dashboard and near-instant response to requests. But the life of the sysadmin hasn't really changed, just the parts they run. The sysadmin still has to create, monitor and support multiple complex services on real hardware. She also can't easily delegate the parts to the old traditional silos. The sysadmin can't be just concerned with hardware and OS and NIC configuration. The whole network fabric (storage too) all has to be understood by <b>everyone</b> on the design, deployment and operations team(s). Message to sysadmins: Don't worry one bit about job security, so long as you keep learning like crazy.<br />
<br />
<h2>
Docker</h2>
<div>
<br /></div>
<a href="http://www.docker.com/">Docker</a> (and more generally "containerization") is the current hot growth topic.<br />
<br />
Many people are now familiar with Virtual Machines. A virtual machine is a process running on a host machine which simulates another (possibly totally different) computer. The virtual machine software simulates a whole computer right down to mimicking hardware responses. From inside the virtual machine it looks like you have a complete real computer at your disposal.<br />
<br />
The downside is that VMs require the installation and management of a complete operating system within the virtual machine. VMs allow isolation but have a lot of heft to them. The host machine has to be powerful enough to contain whole other computers (sometimes many of them) while still doing its own job.<br />
<br />
Docker uses some newish ideas to offer a middle ground between traditional multi-tenant computing, where a number of unrelated (and possibly conflicting) services run as peers on a single computer and the total isolation (and duplication) that VMs require.<br />
<br />
The enabling technology is known as <a href="https://en.wikipedia.org/wiki/Cgroups"><i>cgroups</i></a> and specifically <i><a href="https://en.wikipedia.org/wiki/Cgroups#Namespace_isolation">kernel namespaces</a></i>. The names are unimportant really. What namespaces do is to allow the host operating system to provide each process with a distinct carefully tuned view of the parts of the host that the process needs to do its job. The view is called a <i>container</i> and any processes which run in the container can interact with each other as normal. However they are entirely unaware of any other processes running on the host. In a sense containers act as blinders, protecting processes running on the same host from each other by preventing them from even seeing each other.<br />
<br />
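A quick way to see these blinders in action (a sketch; fedora:20 is the stock Fedora 20 image and the output is abridged): a shell started inside a container believes it is process number 1 and can't see any of the host's other processes.<br />
<br />
<pre class="brush: bash ; title: 'sketch: process isolation inside a container'">host$ docker run --rm -i -t fedora:20 /bin/bash
bash-4.2# echo $$
1
bash-4.2# exit
</pre>
<br />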
Docker is a container service which standardizes and manages the development, creation and deployment of the containers and their contents in a clear and unified way. It provides a means to create a single-purpose container for, say, a database service and then allows the developer to publish that container so that others can find it, pull it and run it unchanged.<br />
<br />
<h2>
Kubernetes</h2>
<div>
<br /></div>
<div>
While Docker itself is cool, it really focuses on the environment on a single host and on individual images and containers. <a href="https://github.com/GoogleCloudPlatform/kubernetes">Kubernetes</a> is a project initiated at Google but adopted by a number of other software and service vendors. Kubernetes aims to provide a way for application developers to define and then deploy complex applications composed of a number of Docker containers and potentially spread over a number of container hosts.<br />
<br />
I think Kubernetes (or something like it) is going to have a really strong influence on the acceptance and use of containerized applications. It's likely to be the face most application operations teams see on the apps they deploy. It's going to matter for both the Dev and Ops elements because it's going to be critical to the design and deployment of complex applications.<br />
<br />
As a sysadmin this is where my strongest interest is. Docker and Atomic are parts, Kubernetes is the glue.<br />
<br /></div>
<h2>
Project Atomic</h2>
<div>
<br /></div>
<div>
And where do you put all those fancy complex applications you've created using Docker and defined using Kubernetes? <a href="http://www.projectatomic.io/">Project Atomic</a> is a Red Hat project to create a hosting environment specifically for containerized applications. </div>
<div>
<br /></div>
<div>
Rather than running (I mean: installing, configuring and maintaining) a general purpose computer running the Docker daemon and a Kubernetes agent and all of the other attendant internals, Project Atomic will provide a host definition tuned for use as a container host. A general purpose OS installation often has a number of service components which aren't necessary and may even pose a hazard to the container services. Project Atomic is building an OS image designed to do one thing: Run containers.</div>
<div>
<br /></div>
<div>
Atomic is itself a stripped-down build of a general-purpose OS. It can run on bare metal, or on OpenStack or even on public cloud services like AWS or Rackspace or Google Cloud.<br />
<br />
<br />
<h2>
Go(lang)</h2>
<div>
It's been a long time since I worked in a system level language. Go (or <i><a href="http://golang.org/">golang</a></i> to distinguish it from the venerable Chinese strategy board game) is a new environment created at Google by Robert Griesemer along with two luminaries of early Unix, Rob Pike and Ken Thompson. It aims to address some of the shortcomings of C in the age of distributed and concurrent programming, neither of which really existed when C was created.</div>
<div>
<br /></div>
<div>
Docker and several other significant new applications are written in Go and it's catching on with system level developers. I quickly bumped up on my scripting language habits when I started getting into Go and I was reminded of why system languages are still important. It's refreshing to know I can still think at that level.</div>
<div>
<br /></div>
<div>
I think Go is going to spread quickly in the next few years and I'm going to learn to work with it along with the common scripting environments.</div>
<div>
<br /></div>
<h2>
Look Up: There's more than one kind of cloud.</h2>
<div>
<br /></div>
<div>
In the past I've been focused on one product and one aspect of Cloud Computing. Make no mistake, Cloud Computing is still in its infancy and we're still learning what kind of thing it wants to grow up into. The range of enterprise deployment models is getting bigger. Applications can be delivered as traditional software, as VM images for personal or enterprise use (<a href="https://www.virtualbox.org/">VirtualBox</a> and <a href="http://www.vagrantup.com/">Vagrant </a>to OpenStack to AWS) and now as containers which sit somewhere in between. Each has its own best uses and we're still exploring the boundaries.</div>
<div>
<br /></div>
<div>
So now I'm going to branch out too and look at each of these and look at all of them. My focus is still going to be what's going on inside, the place where you can stick your hand in and lose fingers. Lots of other people are talking about the glossy paint job and the snazzy electronic dashboard. I'll leave that to them.</div>
<div>
<br /></div>
<div>
Tut Tut... it looks like rain....(but I like the rain)</div>
<div>
<br /></div>
<h2>
References</h2>
</div>
<div>
<br /></div>
<div>
<ul>
<li>OpenShift - "Platform as a Service" - Developer/App Ops environment<br /><a href="https://www.openshift.com/">https://www.openshift.com/</a></li>
<li>OpenStack - Automated Self-Service "Everything your IT Department Does"<br /><a href="http://www.openstack.org/">http://www.openstack.org/</a></li>
<li>Docker - Linux Application and Service containers - "intermediate virtualization"?<br /><a href="https://www.docker.com/">https://www.docker.com/</a></li>
<li>Project Atomic - A minimal tuned Linux image for running containerized applications<br /><a href="http://www.projectatomic.io/">http://www.projectatomic.io/</a></li>
<li>Kubernetes - Deployment orchestration for containerized applications<br /><a href="https://github.com/GoogleCloudPlatform/kubernetes">https://github.com/GoogleCloudPlatform/kubernetes</a></li>
<li>The Foreman - OS deployment (and much more!) service<br /><a href="http://theforeman.org/">http://theforeman.org/</a></li>
<ul>
<li>Pulp - Enterprise software content mirroring<br /><a href="http://www.pulpproject.org/">http://www.pulpproject.org/</a></li>
<li>Katello - Enterprise OS management<br /><a href="http://www.katello.org/">http://www.katello.org/</a></li>
</ul>
<li>Go(Lang) - A modern system-level programming language<br /><a href="http://golang.org/">http://golang.org/</a></li>
<li>Vagrant - managing a complex virtual development environment<br /><a href="http://www.vagrantup.com/">http://www.vagrantup.com/</a></li>
</ul>
</div>
markllamahttp://www.blogger.com/profile/14193184544557876514noreply@blogger.com1tag:blogger.com,1999:blog-5022186007695457923.post-75806244082789120972014-01-17T13:54:00.002-08:002014-01-17T13:54:42.156-08:00Hanging up my creeper and closing up shop.I dunno if I have any fans of these blog posts, but if I do, I want to let you know that there aren't likely to be any more here.
I've recently been moved into another group and I won't be working directly with OpenShift anymore.
Never fear, the rest of the team is working like crazy as they always have to bring you great stuff and there are some real cool things coming down the line.
Feel free to ask questions about what's here. I can certainly answer those until they become stale because the software has moved on.
It's been fun.
- Markmarkllamahttp://www.blogger.com/profile/14193184544557876514noreply@blogger.com4tag:blogger.com,1999:blog-5022186007695457923.post-88342787515108918482013-12-13T07:03:00.000-08:002013-12-14T06:17:01.668-08:00OpenShift Service Development: Building a Build BoxI found this week that I needed to have a build box so that I could repeatedly run the dev/build/install/test cycle. I've messed around with it on and off since I started working on OpenShift but I looked back and realized that I've never written a procedure for creating the build box. So here it is.<br />
<br />
The build process takes source code from a git repository and transforms it into packages. Finally it places the packages into an install repository so that they will be available to the target hosts via yum. The yum repository is published by a small web server. It doesn't need to be fancy as it's just flat files.<br />
<br />
The instructions here are for Fedora 18 or 19. There are some special considerations for RHEL6 or CentOS6. These are detailed <a href="#rhel6">in a section at the bottom of this post.</a> There are notes inline for when the process is different for RHEL.<br />
<br />
There are also some considerations for <a href="#ec2">creating a build box in AWS EC2</a> (or any managed hosting service).<br />
<br />
<h3>
Install the build/publish software</h3>
<div>
<br /></div>
On a minimal install of Fedora, install the base packages needed for the build service. Git to retrieve the source code, tito to build the RPMs, and thttpd to serve the YUM repository to the install targets.<br />
<br />
<pre class="brush: bash ; title: 'install build/publishing tools'">sudo yum install git tito thttpd firewalld
</pre>
<br />
(On RHEL6, enable <a href="https://fedoraproject.org/wiki/EPEL">EPEL repository</a> and skip firewalld)<br />
<br />
<h3>
Create the YUM repository root directory</h3>
<div>
<br /></div>
<div>
Next, create a location for the YUM repository. Place it in a space where thttpd will find it and make it writable by the build user (assumed to be the current user)<br />
<br />
<br /></div>
<pre class="brush: bash ; title: 'create yum repository space for publishing'">sudo mkdir /var/www/thttpd/tito
sudo chown $(id --name --user):$(id --name --group) /var/www/thttpd/tito
</pre>
<br />
<h3>
Enable Web Services</h3>
<div>
<br /></div>
<div>
I have to publish the packages to the install hosts once they're built. I need a web server and on Fedora, I need the firewall daemon running and configured to allow HTTP communications.</div>
<br />
<pre class="brush: bash ; title: 'enable and start the web server'">sudo systemctl enable thttpd
sudo systemctl start thttpd
sudo systemctl enable firewalld
sudo systemctl start firewalld
# Open the port for now
sudo firewall-cmd --zone public --add-service http
# Make the change persistent across reboots
sudo firewall-cmd --zone public --add-service http --permanent
</pre>
<br />
<h3>
Configure Tito Output Location</h3>
<div>
<br /></div>
Tito places the build results and RPMs in /tmp/tito by default. I can set the target location using the titorc file.<br />
<br />
<br />
<pre class="brush: bash ; title: 'set the target location for the tito output'">echo "RPMBUILD_BASEDIR=/var/www/thttpd/tito/" > $HOME/.titorc
</pre>
<br />
<h3>
Retrieve the Source Code Repository</h3>
<div>
<br /></div>
Now that the publication and build services are prepared, it's time to actually get the software source code.<br />
<br />
<pre class="brush: bash ; title: 'clone the OpenShift Origin git repo'">git clone https://github.com/openshift/origin-server.git
</pre>
<br />
If you are doing development, substitute your own fork and branch. If you are doing your editing on the build box (not really recommended, but slightly time saving) you can also use the git: (ssh) protocol and add your github user SSH key so that you can both pull and push changes.<br />
<br />
<h3>
Install Package Build Requirements</h3>
<div>
<br /></div>
Before you can build packages, you must also install any build requirements for the packages. The Bourne shell code snippet below will walk the source code tree, find each package root and install all of the build requirements it finds using yum-builddep.<br />
<br />
This triggers off the presence of a .spec file in the root of a package tree. It's critical as a package developer to note all build requirements in the .spec file.<br />
<br />
<br />
<pre class="brush: bash ; title: 'install all packages required for builds'">for SPECPATH in $(find origin-server -name \*.spec)
do
PKGDIR=$(dirname $SPECPATH )
SPECFILE=$(basename $SPECPATH)
(cd $PKGDIR ; sudo yum-builddep -y $SPECFILE )
done
</pre>
<br />
<h3>
Build All Packages</h3>
Now that all the build requirements are installed, it's time to build the software.<br />
<br />
The Bourne shell snippet below will walk the entire source tree and locate the root of each package tree and run tito to build the package. If you are building test packages, uncomment the TEST assignment line (and on RHEL6, the SCL line).<br />
<br />
<pre class="brush: bash; title: 'build all packages'"># TEST=--test
# SCL=--scl=ruby193 # for RHEL6
for SPECPATH in $(find origin-server -name \*.spec)
do
PKGDIR=$(dirname $SPECPATH )
SPECFILE=$(basename $SPECPATH)
(cd $PKGDIR ; tito build --rpm $TEST $SCL)
done
createrepo /var/www/thttpd/tito
</pre>
<br />
This snippet walks the source tree and runs <u>tito</u> to build each package. The last line rebuilds the YUM repository metadata from the packages present.<br />
<br />
You can build a single package by moving to the root of the package tree and running <u>tito</u> manually. You'll also have to re-run <u>createrepo</u> each time you update a package. If you've rebuilt a package but yum claims there's no update available, check that you've re-run <u>createrepo</u> first.<br />
<br />
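For example, to rebuild just one package and refresh the repository metadata (the directory name here is only an example; use whichever directory holds the .spec file you care about):<br />
<br />
<pre class="brush: bash ; title: 'rebuild a single package'">cd origin-server/controller
tito build --rpm
createrepo /var/www/thttpd/tito
</pre>
<br />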
Which reminds me, if you're using yum for frequent updates (more than once a day), you'll also have to clear the metadata on the client machine so that it sees the updated packages<br />
<br />
<pre class="brush:bash ; title : 'clean YUM metadata on client after package rebuilds'">
client# sudo yum clean metadata
</pre>
<br />
<h3>
Building Test Packages</h3>
<div>
<br /></div>
<div>
Tito builds from what is committed in git, not from the files sitting in your working tree. That is, it ignores files in the workspace which have not been committed. It also requires at least one initial tito tag to operate.</div>
<div>
<br /></div>
<div>
Tito builds test packages by creating a temporary commit and tag. This allows it to create a package with a unique name for each test build. Each time you make a change and rebuild, a serial number is auto-incremented so that yum will see the new package as an 'update' and accept it in preference to any currently installed version.</div>
<div>
<br /></div>
<h3>
Configuring a yum repo on the install host</h3>
<div>
<br /></div>
<div>
You can supersede the stock Fedora or RHEL OpenShift package repositories by placing a new repo file in /etc/yum.repos.d</div>
<div>
<br /></div>
<div>
<pre class="brush:bash ; title 'example test/build yum repo file'; highlight: 1">/etc/yum.repos.d/openshift_buildtest.repo
[openshift_buildtest]
name=OpenShift Build/Test repository
baseurl=http://build.example.com/tito/
enabled=1
gpgcheck=0
</pre>
<br />
<h3>
<a href="http://www.blogger.com/blogger.g?blogID=5022186007695457923" id="rhel6">Considerations for RHEL6</a></h3>
<div>
<br /></div>
<div>
<br /></div>
<div>
There are two significant differences between Fedora and RHEL6 when creating a build box.</div>
<div>
<br />
<h4>
Firewall and Services on RHEL6</h4>
<br /></div>
<div>
On RHEL6 <u>systemd</u> and <u>firewalld</u> are not available. Use <u>iptables</u> and <u>lokkit</u> instead of <u>firewalld</u> and <u>firewall-cmd</u> to open the TCP port for HTTP. Use <u>service</u> and <u>chkconfig</u> instead of <u>systemctl</u> to control services.<br />
<br />
<br />
<pre class="brush: bash ; title 'services and firewall for RHEL6'"># Open Firewall for HTTP
sudo lokkit --service=http
# start and enable thttpd
chkconfig thttpd on
service thttpd start
</pre>
</div>
<div>
<br /></div>
<div>
<h4>
RHEL6, OpenShift, Ruby/Rails versions and Software Collections</h4>
<br />
RHEL6 is .. special. OpenShift is written in Ruby 1.9.3 and Rails 3. These didn't exist or weren't stable when RHEL6 was created. Different Ruby versions don't play nicely on a single system (there have been at least 3 attempts I can find to get Ruby 1.8 and 1.9 to co-exist like Python 2 and 3. All have thrown up their hands in frustration). Given that, heroic measures were required to get them to run on RHEL6. Those heroic measures are called <i><a href="https://fedorahosted.org/SoftwareCollections/">Software Collections</a></i>, better known as <i>"SCL"</i>.</div>
<div>
<br /></div>
<div>
What SCL does is provide a means to repackage software and run it in a special environment that isolates it from the rest of the system. The SCL team has re-packaged over 500 packages to run in the ruby193 environment on RHEL6. These are all needed to run OpenShift on RHEL6. They're also needed to build OpenShift for RHEL6.</div>
<div>
<br /></div>
<div>
Fortunately, the SCL and OpenShift teams have kindly <a href="http://mirror.openshift.com/pub/openshift-origin/nightly/">provided a YUM repository for them</a>. When you add the OpenShift dependencies repository to your YUM repo configurations all of the build dependencies will resolve. They've also added a switch to <u>tito</u> so that it will run your builds inside the SCL environment.</div>
<div>
<pre class="brush: bash ; title 'YUM Repository file for OpenShift Dependencies on RHEL6' ; highlight: 1">/etc/yum.repos.d/openshift-dependencies.repo
[openshift-dependencies]
name=OpenShift Dependencies
baseurl=http://mirror.openshift.com/pub/openshift-origin/nightly/rhel-6/dependencies/$basearch/
enabled=1
gpgcheck=0
</pre>
</div>
<div>
<br />
There are other things in those dependencies repositories as well: a large number of updated packages which OpenShift needs but which have not yet appeared upstream. On Fedora you'll still need the dependencies YUM repository to create the runtime hosts, but you don't need it to build the OpenShift packages.<br />
<br /></div>
</div>
<h3>
<a href="http://www.blogger.com/blogger.g?blogID=5022186007695457923" id="ec2">Considerations for EC2 hosting</a></h3>
<div>
<br /></div>
<div>
If you're placing your build box in AWS EC2 there are a couple of additional things to consider:</div>
<div>
<br /></div>
<div>
<ol>
<li>EC2 security_policy must allow HTTP (port 80/TCP)<br />Your build instance must be created with a security policy which allows port 80/TCP for your install hosts.</li>
<li>Internal Hostname<br />EC2 hosts have an internal and external hostname. Both names are dynamic (unless you assign ElasticIP). If your install hosts are also on EC2 you can use the internal hostname and IP address for the security_policy.</li>
<li>External Hostname<br />If your install boxes are not hosted in EC2 then you must allow all hosts on port 80 TCP and note the EC2 public hostname so that the install hosts can access the build host web server.</li>
</ol>
<div>
<br /></div>
</div>
<h3>
OpenShift Source Code Repositories</h3>
<div>
<br /></div>
<div>
<ul>
<li>origin-server -<br />http://github.com/openshift/origin-server</li>
<li>rhc -<br />http://github.com/openshift/rhc</li>
<li>origin-dependencies (SRPM repository)<br />http://mirror.openshift.com/pub/openshift-origin/nightly/fedora-latest/dependencies/SRPMS/</li>
</ul>
<div>
<br /></div>
</div>
<h3>
OpenShift Dependencies RPM Repositories</h3>
<div>
<br /></div>
<div>
<ul>
<li>Fedora -<br />http://mirror.openshift.com/pub/openshift-origin/nightly/fedora-latest/dependencies/$basearch</li>
<li>RHEL6 -<br />http://mirror.openshift.com/pub/openshift-origin/nightly/rhel-6/dependencies/$basearch</li>
</ul>
<div>
<br /></div>
</div>
<h3>
References</h3>
<div>
<ul>
<li>git - http://git-scm.com/</li>
<li>github - https://github.com</li>
<li>thttpd - http://www.acme.com/software/thttpd/</li>
<li>firewalld - https://fedoraproject.org/wiki/FirewallD</li>
<li>tito - http://linux.die.net/man/8/tito</li>
<li>Software Collections (SCL) - https://fedorahosted.org/SoftwareCollections/</li>
</ul>
</div>
markllamahttp://www.blogger.com/profile/14193184544557876514noreply@blogger.com4tag:blogger.com,1999:blog-5022186007695457923.post-86727286930798374662013-10-11T12:30:00.002-07:002013-11-11T05:20:54.505-08:00Diversion: Kerberos (FreeIPA) in AWS EC2One of the things many people are asking for in OpenShift is alternate ways of authenticating SSH and git interactions with the application gears. Since I'm doing my development work in EC2, I thought that was surely the right place to try it out. Well as usual, it didn't work out quite as simply as I'd planned.<br />
<br />
This post isn't about OpenShift directly. It addresses what I found when I tried to implement FreeIPA in EC2 so that I could develop code to allow Kerberos authentication in OpenShift.<br />
<br />
<h3>
Kerberos in way too few words</h3>
<div>
<br /></div>
<div>
<a href="https://en.wikipedia.org/wiki/Kerberos_(protocol)">Kerberos</a> is an authentication protocol and service defined originally at MIT as part of <a href="https://en.wikipedia.org/wiki/Project_Athena">Project Athena</a> (along with things like the <a href="https://en.wikipedia.org/wiki/X_Window_System">X Windows System</a> and Zephyr, a predecessor to modern IM services). It is meant to provide authenticated on unencrypted and even untrusted networks. Perfect right? Well Kerberos has some quirks.</div>
<div>
<br /></div>
<div>
First, different people can run their own Kerberos services. To avoid conflicts, each service is given an identifier string known as a <i>realm</i>. By convention the realm string is the same as the enterprise DNS domain name. That is, if a company has DNS domain <code>example.com</code> then the Kerberos realm would be <code>EXAMPLE.COM</code>. Unlike DNS domain names, Kerberos realms are case sensitive.</div>
<div>
<br /></div>
<div>
Each participating host must be registered with the Kerberos server and each user must be added to the user list on the server as well. Hosts and users (and any other manageable entity) are identified with a <i>principal</i>. This is basically a name which is unique for each resource, err user, umm host.... thing. The important thing is that the host is identified by a string which is derived from its hostname.</div>
<div>
<br /></div>
<div>
Now wars have been fought over whether a hostname should be the <a href="https://en.wikipedia.org/wiki/Fully_Qualified_Domain_Name">Fully Qualified Domain Name</a> (FQDN) or just the host portion. For Kerberos there is only one answer: <b>FQDN</b>.</div>
<div>
<br /></div>
<div>
The <i>host principal</i> for a given host is composed of the hostname and the realm. When a client tries to log in, it needs to know the correct principal to request from the server. This is why the hostname must be the FQDN. When the user attempts to log in he must provide both his own principal and the host principal for the destination. The only way to know the destination's host principal is if it is related to the hostname <b>as viewed from the client host</b>.</div>
<div>
<br /></div>
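<div>
As an illustration (the names here are made up), a user principal and the matching host principal in that realm would look like this:</div>
<div>
<br /></div>
<pre class="brush: plain ; title: 'example principals'">user principal: alice@EXAMPLE.COM
host principal: host/host1.example.com@EXAMPLE.COM
</pre>
<div>
<br /></div>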
<div>
This is where life gets interesting in AWS EC2.</div>
<div>
<br /></div>
<div>
You see, AWS uses <a href="https://tools.ietf.org/rfc/rfc1918.txt">RFC 1918</a> and something like <a href="https://tools.ietf.org/rfc/rfc1631.txt">Network Address Translation</a> to create a <a href="https://en.wikipedia.org/wiki/Private_network">private network</a> for the virtual machines which make up the EC2 service. AWS also uses an internal DNS service to identify each virtual machine. This means that from the view of a host inside the private network, the destination host has a different IP address <b>and a different hostname</b> than when viewed from outside the private network. The upshot is that, to use Kerberos with EC2 I need some way to make sure that the user can determine a valid host principal to request regardless of where the user is located.</div>
<div>
<br />
<h3>
A Word about IPA and AWS</h3>
</div>
<div>
<br /></div>
<div>
IPA (and FreeIPA) is not a single service. It's a collection of services configured so that they work in concert to provide secure user and host access over untrusted networks. Kerberos is only one of the services, though it is probably the core one. LDAP, NTP and DNS are all support services which make the operation of Kerberos work. IPA wraps these services in such a way so that mere mortals don't necessarily need to know how the bindings work merely to get the service running.</div>
<div>
<br /></div>
<div>
In this post I'm dealing almost entirely with the Kerberos service within IPA and I'll refer to that component by name. Where I mention FreeIPA it will be in reference to the specific tools that FreeIPA provides to set up and manage the conglomerate service.</div>
<div>
<br /></div>
<div>
AWS (Amazon Web Services) is also a suite of services. The core of that is the EC2 virtual host service. Again, all of the AWS services generally work together, but I'm only dealing with EC2 instances in this post so I'll refer to EC2 specifically unless I'm referring to the full suite.</div>
<div>
<br />
<b>UPDATE: 2013-11-07 - AWS TOS do not permit open DNS recursion.</b><br />
<b><br /></b>
One other thing to be aware of when running IPA in AWS. Amazon terms of service do not allow users to create open recursive DNS services within AWS on the grounds that they can be abused.<br />
<br />
When setting up your AWS security policies and the <u>named</u> service on your IPA hosts, be sure to disable recursion and/or limit access to appropriate IP ranges for your DNS clients or you'll get a polite nastygram from Amazon.</div>
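<div>
For a BIND <u>named</u> server that boils down to a couple of lines in <code>/etc/named.conf</code>. This is only a sketch: the ACL name and address range are examples that need to match your actual client networks, and the FreeIPA-generated options already in the file should be left alone.</div>
<div>
<br /></div>
<pre class="brush: plain ; title: 'sketch: limiting recursion in named.conf'">acl "dnsclients" { 10.0.0.0/16; localhost; };

options {
        // ... existing FreeIPA-generated options ...
        allow-recursion { dnsclients; };
        allow-query-cache { dnsclients; };
};
</pre>
<div>
<br /></div>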
<h3>
Kerberos, Linux and SSH</h3>
<div>
<br /></div>
<div>
I want to use Kerberos with SSH so that I can avoid using SSH authorized_keys when pushing git updates to my applications on OpenShift (mostly ignoring EC2 for now). To do that I need several things set up:</div>
<div>
<br /></div>
<div>
<ul>
<li>A Kerberos (FreeIPA) server - IPA installed, configured</li>
<li>A set of users configured into the FreeIPA service</li>
<li>A target host (OpenShift node)</li>
</ul>
<div>
For SSH the most important things are that the Kerberos and LDAP configurations are set up properly. This includes configuring <i>sssd</i> and the <code>/etc/nsswitch.conf</code> settings. Luckily the FreeIPA <code>ipa-client-install</code> script (with the right inputs) will do all of that for me. I think there are ways to get it to tell me precisely what changes it's making but I haven't learned how yet. I do know that I can find the results in <code>/var/log/ipaclient-install.log</code>.</div>
</div>
<div>
<br /></div>
<div>
The other thing I need to do is to make sure that the SSH client and server both will at least try to use the GSSAPI protocol for managing the authentication process. On the server this means making sure that GSSAPIAuthentication is enabled.</div>
<div>
<br /></div>
<div>
On the client side, I may need to specify that I want to use the <i>gssapi-with-mic</i> authentication method. I may also need to specify the host principal to use to access the destination (as distinct from the hostname from the client's vantage point). More on these later.<br />
<br /></div>
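<div>
As a sketch, the relevant settings look something like this (the domain is just an example, and the exact option set you need may differ):</div>
<div>
<br /></div>
<pre class="brush: plain ; title: 'sketch: GSSAPI settings for sshd and the ssh client'"># on the server, in /etc/ssh/sshd_config
GSSAPIAuthentication yes

# on the client, in ~/.ssh/config
Host *.example.com
    GSSAPIAuthentication yes
    PreferredAuthentications gssapi-with-mic,publickey
</pre>
<div>
<br /></div>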
<h3>
EC2, cloud-init and resisting dynamic naming</h3>
<div>
<br /></div>
<div>
The network interface numbering and naming in EC2 are dynamic by design, both on the internal and external interfaces. EC2 does offer "elastic IP" which is really "static IP" for an instance and since I own a DNS zone I can assign a name to the address. Unfortunately this only offers control of the external IP address assigned to an instance. I have to find ways to manage the internal naming myself.<br />
<br />
<div class="separator" style="clear: both; text-align: center;">
</div>
<div class="separator" style="clear: both; text-align: center;">
<a href="http://2.bp.blogspot.com/-kkeRCPHOETY/UlbKzx1O5cI/AAAAAAAAB8E/t3aaKzK8bVo/s1600/kerberos_nat.png" imageanchor="1" style="clear: left; float: left; margin-bottom: 1em; margin-right: 1em;"><img border="0" height="380" src="http://2.bp.blogspot.com/-kkeRCPHOETY/UlbKzx1O5cI/AAAAAAAAB8E/t3aaKzK8bVo/s640/kerberos_nat.png" width="640" /></a></div>
<br /></div>
<br />
When a host registers with a Kerberos service it generally uses its own hostname as the identifier for the host principal. If this is the same as the DNS name associated with one or more of its IP addresses, this is just by convention. That is, Kerberos doesn't maintain the mapping. So if a host changes its hostname but no changes are made to the Kerberos database, the host can no longer identify itself by its principal. Also, if the name by which it is known from the outside changes (because the IP address and/or DNS name changed) then clients will no longer know what principal to use to request an access ticket.<br />
<div>
<br /></div>
<div>
There are two factors here: Making sure the host knows its own name, and making sure that users coming from remote hosts can determine the (a?) valid principal (based on the hostname) to request a ticket for.</div>
<div>
<br /></div>
<h3>
Maintaining Host Identity</h3>
<div>
<br /></div>
<div>
For Kerberos, the hostname is the anchor for a host principal. If the hostname changes on a registered host, it will no longer be able to properly communicate with the Kerberos server and clients. Luckily the Fedora and RHEL images in EC2 use <i>cloud-init</i> to initialize potentially dynamic information on startup.</div>
<div>
<br /></div>
<div>
<a href="https://launchpad.net/cloud-init">Cloud-init</a> is software which, when installed on a host, can take input from the cloud environment and customize the host to integrate it into the environment. It can do things like.. oh, say, set the IP address of network interfaces and hostnames, install SSH host keys, set device mount points and the like. It will also allow me to tell it <i>not</i> to update the hostname on each reboot.</div>
<div>
<br /></div>
<div>
The main configuration for cloud-init is <code>/etc/cloud/cloud.cfg</code>. I just need to add a line containing '<code>preserve_hostname: 1</code>' and set the hostname I want in <code>/etc/hostname</code>. From then on, restarts or reboots will keep the hostname I set. Given that value I have my anchor for registering the host with the kerberos server and maintaining the host/principal mapping.</div>
<div>
<br /></div>
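<div>
Concretely, the two changes are small (the hostname here is just the example name used in the rest of this post):</div>
<div>
<br /></div>
<pre class="brush: bash ; title: 'pin the hostname across reboots'">host1$ echo "preserve_hostname: 1" | sudo tee -a /etc/cloud/cloud.cfg
host1$ echo "host1.example.com" | sudo tee /etc/hostname
host1$ sudo hostname host1.example.com
</pre>
<div>
<br /></div>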
<div>
The host now always knows its own name: part one solved.</div>
<div>
<br /></div>
<h3>
The view from Inside/Outside</h3>
<div>
<br /></div>
<div>
You do learn something every day. In talking with some of the FreeIPA developer folks I learned something I hadn't known about how the Kerberos protocol works. Here's the important bit.</div>
<div>
<br /></div>
<div>
When a client wants to gain access to some resource, it sends a message to the Kerberos server saying "I am this principal and I want access to that one over there, ok?" The Kerberos server sends back a signed/encrypted <i>ticket</i> with both names (principals) wrapped inside it. The client then sends the ticket in an authentication request to the destination host, who verifies "yep, that's me, and I can see that that's you, let me check are you allowed?" and if the answer is "yes" the client request is granted.</div>
<div>
<br /></div>
<div>
What this means is that the client must know the name (principal) of the destination resource <u>before</u> attempting to connect to the resource. It must know a name that both the kerberos server and the resource host itself will recognize. When everyone uses DNS FQDNs to identify hosts <u>and</u> they have the same view of DNS, this works nicely. Accessing private network resources from a public network creates some issues.</div>
<div>
<br /></div>
<div>
Most tools, SSH included, assume that they can compose a host principal from the hostname given by the user. So if a client was using realm <code>EXAMPLE.COM</code> and tried to reach a remote host with FQDN 'destination.example.com' the principal would be <code>host/destination.example.com@EXAMPLE.COM</code>. But since the EC2 hosts have (not one but two) random hostnames assigned when they boot, it's impossible to know from the hostname alone what the principal of the destination is.</div>
<div>
<br /></div>
<div>
If I happen to know the mapping (i.e., what principal is associated with the destination host) then SSH allows me to specify that with <code>-oGSSAPIServerIdentity=&lt;principal&gt;</code> on the CLI or in a Host entry in my <code>.ssh/config</code> file. From the illustration above, to properly authenticate with the Kerberos Host I could do this:</div>
<div>
<br /></div>
<div>
<pre class="brush: bash">ssh -oPreferredAuthentications=gssapi-with-mic -oGSSAPIServerIdentity=host1.example.com random2.external
</pre>
</div>
<div>
<br /></div>
<div>
(this also assumes that my local username and the remote one are the same and that I've got a ticket-granting-ticket for the <code>EXAMPLE.COM</code> realm using <i>kinit</i>.)</div>
<div>
<br /></div>
<div>
What this says is to log into a host whose name (from this view) is <code>random2.external</code>, and whose principal is <code>host1.example.com</code>. With that the local client can send a query to the Kerberos server and get the right ticket back to hand to the destination host. It can say "yep, that's me and yep you're you, and yep you're allowed".</div>
<div>
<br /></div>
<h3>
The Many Faces of Kerberos</h3>
<div>
<br /></div>
<div>
It's totally a coincidence that Cerberus is the 3-headed dog that guards the landing in Hades on the river Styx and that I'm going to add two "faces" to my Kerberos clients. Totally.</div>
<div>
<br /></div>
<div>
I <u>think</u> that in the discussion above I've been careful to make it clear that a Kerberos principal is an <i>identifier</i>. That is, it is a handle which is used to refer to an object in the Kerberos database which corresponds to an object in reality. I have nicknames. Hosts in Kerberos can have them too, and this is going to solve my identity problem with random dynamic names and IP addresses.</div>
<div>
<br /></div>
<div>
I've managed to give each host a fixed hostname, thanks to cloud-init. Once I know the dynamic names both public and internal I should be able to inform the Kerberos server of both of the aliases.</div>
<div>
<br /></div>
<div>
If this works, here's what will happen when I try to log in from a host either inside or outside the private network: my SSH client will form a principal from the (DNS) name I offer. My client will send that to the Kerberos server and request an access ticket to the remote host using the alias principal. <b>And the Kerberos server will know which host that means.</b> It will create an access ticket which will grant me access to the destination host, which will examine it and, on finding everything in order, will allow my SSH connection.</div>
<div>
<br /></div>
<div>
It turns out that FreeIPA doesn't yet have a nice Web or CLI user interface to add principals to a registered host record, but the Kerberos database is stored in an LDAP server on the Kerberos master host. For now I (or a friend actually) can craft an LDAP update which will add the principals I need to the host record. This is assumed to be run on the FreeIPA (Kerberos) server itself, since it talks to the LDAP service on localhost.</div>
<div>
<br /></div>
<div>
<pre class="brush: plain; title='add-alias-principal.ldif'; highlight: 1">kerberos# ldapmodify -h localhost -x -D "cn=Directory Manager" -W @lt;@lt;EOF
dn: fqdn=host1.example.com,cn=computers,cn=accounts,dc=example,dc=com
changetype: modify
add: krbprincipalname
krbprincipalname: host/random2.external@EXAMPLE.COM
krbprincipalname: host/random2.internal@EXAMPLE.COM
EOF</pre>
</div>
<div>
<br />
The invocation above will request the Directory Manager password for the FreeIPA LDAP service. I'm sure there's a way to do it with Kerberos/GSSAPI, but I haven't got it yet.<br />
<br />
What that change does is add two Kerberos principal names to the host entry for <code>host1.example.com</code>. The principal names match what an SSH client would construct using the DNS name (internal or external) to reach the target host. Now when the Kerberos server gets a ticket request from clients either inside or outside the private network, the principal in the ticket request will be associated with a known host.<br />
<br />
<h3>
The Devil's in the Dynamics</h3>
<div>
<br /></div>
<div>
This is all fine so long as <code>host1.example.com</code> doesn't reboot. When it does, AWS will assign it a new internal and external IP address and new DNS names. It would be really nice if the host, when it boots, could inform the Kerberos service what its new internal and external principal names are.<br />
<br />
I don't currently know how to do this, but I suspect that I could add a module to cloud-init to do the job. The client is already configured to use the LDAP service on the Kerberos (FreeIPA) server. Once the server knows that all three principals refer to the same host life should be good.<br />
<br />
Now to learn some cloud-init finagling and enough Kerberos so that I can have the host update itself on reboot.</div>
<br />
<br />
<h3>
What does this mean for OpenShift?</h3>
<div>
<br /></div>
<div>
If you want to run an OpenShift service in AWS and you want to offer Kerberos authentication for SSH/git to the application gears, you'll have to do a little LDAP tweaking of the Kerberos principals associated with each host so that the Kerberos service will know which host you mean regardless of your view of the destination host.<br />
<br />
The first round of Kerberos integration code is going into OpenShift Origin as I write this (the pull request is submitted and getting commentary). By the next release it should be possible to manage developer access to gears with Kerberos and FreeIPA. Additional use cases will be added over time.<br />
<br /></div>
<h3>
Summary</h3>
<div>
<br /></div>
<div>
<ul>
<li>Cloud services like AWS and corporate networks often rely on private network spaces and Network Address Translation to manage dynamic hosts.</li>
<li>Cloud Init usually updates the hostname on each boot but this can be suppressed.</li>
<li>For a client trying to reach a host for SSH this poses a problem because the view of the destination from the client differs based on where the client sits in relation to the network boundary.</li>
<li>Kerberos can assign multiple principals to a single host, which allows authentication to work.</li>
</ul>
</div>
</div>
<div>
<br />
<h2>
References</h2>
<ul>
<li><a href="http://www.freeipa.org/">FreeIPA</a> - A component based single-sign-on service </li>
<li><a href="http://web.mit.edu/kerberos/">Kerberos</a> - The authentication component of FreeIPA and MIT Project Athena</li>
<li><a href="https://en.wikipedia.org/wiki/Generic_Security_Services_Application_Program_Interface">GSSAPI</a> - A standardized generic authentication and access control protocol</li>
<li><a href="https://en.wikipedia.org/wiki/Project_Athena">Project Athena</a> - 1980s MIT/DEC/IBM project to design network services and protocols</li>
<li><a href="https://tools.ietf.org/rfc/rfc1918.txt">RFC 1918</a> - Private non-routable IP address space reservations</li>
<li><a href="https://en.wikipedia.org/wiki/Network_address_translation">Network Address Translation</a> - Private network boundary system</li>
<li><a href="https://aws.amazon.com/articles/1346">AWS Elastic IP</a> - AWS static IP addresses for dynamic hosts</li>
<li><a href="https://launchpad.net/cloud-init">Cloud Init</a> - A service for customizing host configuration on reboot</li>
</ul>
</div>
markllamahttp://www.blogger.com/profile/14193184544557876514noreply@blogger.com0tag:blogger.com,1999:blog-5022186007695457923.post-77301932538450247512013-09-27T11:30:00.003-07:002013-09-27T11:30:45.636-07:00Broker-Node interaction and visibility - Debugging "missing" cartridges on a node.In the previous post I set up the end-point messaging for OpenShift. <code>(Broker -> Messaging -> Node)</code>. I showed a simple use of the MCollective mco command and where the MCollective log files are. The last step was to send an echo message to the OpenShift agent on an OpenShift node and get the response back.<br />
<br />
Now I have my OpenShift broker and node set up (I think) but something's not right and I have to figure out what.<br />
<br />
<b>DISCLAIMER: </b>this post isn't a "how to"; it's a mostly-stream-of-consciousness log of my attempt to answer a question and understand what's going on underneath. It's messy. It may cast light on some of the moving parts. It may also lead me to a confrontation with <a href="http://www.sacred-texts.com/neu/mphg/mphg.htm#Scene 24">The Old Man From Scene 24</a> and we all know <a href="http://www.sacred-texts.com/neu/mphg/mphg.htm#Scene 35">how that ends</a>. You have been warned.<br />
<br />
In the paragraphs below I include a number of CLI invocations and their responses. I include a prompt at the beginning of each one to indicate where (on which host) the CLI command is running.<br />
<br />
<ul>
<li>broker$ - the command is running on my OpenShift broker host</li>
<li>node$ - the command is running on my OpenShift node host</li>
<li>dev$ - the command is running on my laptop</li>
</ul>
<div>
<br /></div>
<div>
I've also got a copy of the <a href="https://github.com/openshift/origin-server">origin-server source code</a> checked out from the repository on Github.</div>
<br />
I've got my rhc client already configured for my test user (cleverly named 'testuser') and my broker (using the libra_server variable). See ~/.openshift/express.conf if needed.<br />
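For reference, a minimal <code>express.conf</code> along those lines looks something like this (the broker hostname is just an example):<br />
<br />
<pre class="brush: plain ; title: 'sketch: ~/.openshift/express.conf'">libra_server=broker.example.com
default_rhlogin=testuser
</pre>
<br />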
<h3>
What's going on here?</h3>
I started trying to access the Broker with the <code>rhc</code> CLI command to create a user, register a namespace and then create an application. I'd like to create a python app and I've installed the <code>openshift-origin-cartridge-python</code> package to provide that app framework. But when I try to create my app I'm told that Python is not available:<br />
<br />
<br />
<pre class="brush:bash ; title: 'attempt to create python application' ; highlight: 1">client$ rhc create-app testapp1 python
Short Name Full name
========== =========
There are no cartridges that match 'python'.
</pre>
<br />
So I figure I'll ask what cartridges ARE available:
<br />
<br />
<pre class="brush:bash ; title: 'list available cartridges' ; highlight: 1">client$ rhc cartridges
Note: Web cartridges can only be added to new applications.
</pre>
<br />
Now, when I look for cartridge packages on the node I get a different answer:
<br />
<br />
<pre class="brush:bash ; title: 'attempt to create python application' ; highlight: 1">node$ rpm -qa | grep cartridge
openshift-origin-cartridge-abstract-1.5.9-1.fc19.noarch
openshift-origin-cartridge-cron-1.15.2-1.git.0.aa68436.fc19.noarch
openshift-origin-cartridge-php-1.15.2-1.git.0.090a445.fc19.noarch
openshift-origin-cartridge-python-1.15.1-1.git.0.0eb3e95.fc19.noarch
</pre>
<br />
Somehow, when the broker is asking the node to list its cartridges, the node isn't answering correctly. Why?
<br />
<br />
I'm going to see if I can observe the broker making the query to list the nodes and then see if I can determine where the node is (or isn't) getting its answer.<br />
<br />
<h2>
Refresher: MCollective RPC and OpenShift</h2>
<div>
<br />
MCollective is really an RPC (Remote Procedure Call) mechanism. It defines the interface for a set of functions to be called on the remote machine. The client submits a function call which is sent to the server. The server executes the function on behalf of the client and then returns the result.</div>
<div>
<br /></div>
<div>
The OpenShift client adds one more level of indirection and I want to get that out of the way. I can look at the logs on the broker and node to see what activity was caused when the <code>rhc</code> command issued the cartridge list query.</div>
<div>
<br /></div>
<div>
The broker writes its logs into several files in <code>/var/log/openshift/broker</code>. You can see the REST queries arrive and resolve in the Rails log file <code>/var/log/openshift/broker/production.log</code>.</div>
<div>
<br /></div>
<div>
<pre class="brush:bash ; title: 'attempt to create python application' ; highlight: 1">broker$ sudo tail /var/log/openshift/broker/production.log
...
2013-09-26 17:54:06.445 [INFO ] Started GET "/broker/rest/api" for 127.0.0.1 at 2013-09-26 17:54:06 +0000 (pid:16730)
2013-09-26 17:54:06.447 [INFO ] Processing by ApiController#show as JSON (pid:16730)
2013-09-26 17:54:06.453 [INFO ] Completed 200 OK in 6ms (Views: 3.6ms) (pid:16730)
2013-09-26 17:54:06.469 [INFO ] Started GET "/broker/rest/api" for 127.0.0.1 at 2013-09-26 17:54:06 +0000 (pid:16730)
2013-09-26 17:54:06.470 [INFO ] Processing by ApiController#show as JSON (pid:16730)
2013-09-26 17:54:06.476 [INFO ] Completed 200 OK in 6ms (Views: 3.8ms) (pid:16730)
2013-09-26 17:54:06.504 [INFO ] Started GET "/broker/rest/cartridges" for 127.0.0.1 at 2013-09-26 17:54:06 +0000 (pid:16730)
2013-09-26 17:54:06.507 [INFO ] Processing by CartridgesController#index as JSON (pid:16730)
2013-09-26 17:54:06.509 [INFO ] Completed 200 OK in 1ms (Views: 0.4ms) (pid:16730)
</pre>
<br /></div>
<div>
From that I can see that my <code>rhc</code> calls are arriving and apparently the response is being returned OK.
</div>
<div>
<br />
The default settings for the MCollective client (on the OpenShift broker) don't go to a log file. I can check the OpenShift node though to see what's happened there and if it has received a query for the list of installed cartridges.
<br />
<br /></div>
<div>
<pre class="brush: bash ; title: 'mcollective action log'; highlight: 1">node$ sudo grep cartridge /var/log/mcollective.log | tail -3
I, [2013-09-26T17:10:23.696825 #9827] INFO -- : openshift.rb:1217:in `cartridge_repository_action' action: cartridge_repository_action, agent=openshift, data={:action=>"list", :process_results=>true}
I, [2013-09-26T17:29:10.768487 #9827] INFO -- : openshift.rb:1217:in `cartridge_repository_action' action: cartridge_repository_action, agent=openshift, data={:action=>"list", :process_results=>true}
I, [2013-09-26T17:29:24.957806 #9827] INFO -- : openshift.rb:1217:in `cartridge_repository_action' action: cartridge_repository_action, agent=openshift, data={:action=>"list", :process_results=>true}
</pre>
</div>
<div>
<br />
This too looks like the message was received, processed properly, and a response returned.
<br />
<br />
<h3>
Hand Crafting An mco Message</h3>
<br />
Here's where that MCollective RPC interface definition comes in. I can look at that to see how to generate the cartridge list query using mco so that I can observe both ends and track down what's happening.<br />
<br />
There are really two things to look for here:<br />
<br />
<ol>
<li>What message is sent (and how do I duplicate it)?</li>
<li>What action does the agent take when it receives the message?</li>
</ol>
<div>
<br />
For part one, MCollective defines the RPC interfaces in a file with a .ddl extension. Looking for one of those in the origin-server Github repository finds me this: <a href="https://github.com/openshift/origin-server/blob/master/msg-common/agent/openshift.ddl">origin-server/msg-common/agent/openshift.ddl</a> </div>
<div>
<br /></div>
<div>
Of particular interest are <a href="https://github.com/openshift/origin-server/blob/master/msg-common/agent/openshift.ddl#L390-L397">lines 390-397</a>. These define the cartridge_repository action and the set of operations it can perform: install, list, erase<br />
<br /></div>
<div class="gistLoad" data-id="GistID" id="6718972">
<a href="https://gist.github.com/markllama/6718972">Loading gist</a> </div>
<div>
<br />
Taking that, I can craft an mco rpc message to duplicate what the broker is doing when it queries the nodes:<br />
<br />
<pre class="brush: bash ; title: 'mco query for cartridges on a node'; highlight: 1">mco rpc openshift cartridge_repository action=list
Discovering hosts using the mc method for 2 second(s) .... 1
* [ ==========================================================> ] 1 / 1
ec2-54-211-74-85.compute-1.amazonaws.com
output:
Finished processing 1 / 1 hosts in 32.15 ms
</pre>
<br />
Yep, it still says "none". When I go back and look at the logs, it shows the same query I was looking at, so I think I got that right.<br />
<br />
<h3>
But What Does It DO?</h3>
</div>
<div>
<br />
Now that I can send the query message, I need to find out what happens on the other end to generate the response. My search begins in the node messaging plugin for MCollective, particularly in the agent module code (in <a href="https://github.com/openshift/origin-server/blob/master/plugins/msg-node/mcollective/src/openshift.rb">plugins/msg-node/mcollective/src/openshift.rb</a>). This defines a function <a href="https://github.com/openshift/origin-server/blob/master/plugins/msg-node/mcollective/src/openshift.rb#L1216-L1244">cartridge_repository_action</a> which... doesn't actually do the work, but points me to the next piece of code which actually does implement the function.</div>
<div>
<br /></div>
<div>
It appears that the OpenShift node implements a class <code>::OpenShift::Runtime::CartridgeRepository</code> which is a factory for an object that actually produces the answer. A quick look in the source repository shows me the file that defines the <code>CartridgeRepository</code> class.</div>
<div>
<br /></div>
<div>
<pre class="brush: bash ; title: 'search for the definition of CartridgeRepository'; highlight: [2,6]">dev$ cd ~/origin-server
dev$ find . -name \*.rb | xargs grep 'class CartridgeRepository'
./node/test/functional/cartridge_repository_func_test.rb: class CartridgeRepositoryFunctionalTest < NodeTestCase
./node/test/functional/cartridge_repository_web_func_test.rb:class CartridgeRepositoryWebFunctionalTest < OpenShift::NodeTestCase
./node/test/unit/cartridge_repository_test.rb:class CartridgeRepositoryTest < OpenShift::NodeTestCase
./node/lib/openshift-origin-node/model/cartridge_repository.rb: class CartridgeRepository
</pre>
<br /></div>
<div>
<br />
So, on the node, when a query is received for the list of cartridges that is present, the MCollective agent for OpenShift creates one of the <code>CartridgeRepository</code> objects and then asks <i>it</i> for the list.<br />
<br />
A quick look at the <a href="https://github.com/openshift/origin-server/blob/master/node/lib/openshift-origin-node/model/cartridge_repository.rb#L86">cartridge_repository.rb</a> file on Github is enlightening. First, the file has 60 lines of excellent commentary before the code starts. Line 86 indicates that the <code>CartridgeRepository</code> object will look for cartridges in <code>/var/lib/openshift/.cartridge_repository</code> (while noting that this location should be configurable in the <code>/etc/openshift/node.conf</code> someday). And <a href="https://github.com/openshift/origin-server/blob/master/node/lib/openshift-origin-node/model/cartridge_repository.rb#L170-L188">lines 170-189</a> define the <i>install</i> method which seems to populate the cartridge_repository from some directory which is provided as an argument.
<br />
<br />
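Before chasing the code any further, a quick check on the node is consistent with the empty answers I keep getting back; listing the repository location named in the source turns up nothing on my host:<br />
<br />
<pre class="brush: bash ; title: 'the node cartridge repository is empty (so far)' ; highlight: 1">node$ sudo ls -A /var/lib/openshift/.cartridge_repository 2>/dev/null
node$
</pre>
<br />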
But when does <code>CartridgeRepository.install</code> get invoked? Well, since <code>CartridgeRepository</code> is a factory and a Singleton (which provides the <i>instance()</i> method for initialization) I can look for where it's instantiated:<br />
<br />
<pre class="brush: bash ; title: 'Instantiate CartridgeRepository'; highlight: [1,4]">dev$ find . -type f | xargs grep -l OpenShift::Runtime::CartridgeRepository.instance | grep -v /test/
./plugins/msg-node/mcollective/src/openshift.rb
./node-util/bin/oo-admin-cartridge
./node/lib/openshift-origin-node/model/upgrade.rb
</pre>
<br />
Note that I remove all of the files in the test directories using <code>grep -v /test/</code>. What remains are the working files which actually instantiate a CartridgeRepository object. If I also check for a call to the <i>install()</i> method, the list is reduced to one file:<br />
<br />
<pre class="brush: bash ; title: 'files which install plugin metadata' ; highlight 1">find . -type f | xargs grep OpenShift::Runtime::CartridgeRepository.instance | grep -v /test/ | grep install
./plugins/msg-node/mcollective/src/openshift.rb: ::OpenShift::Runtime::CartridgeRepository.instance.install(path)
</pre>
<br /></div>
So, it looks like the node messaging module is what populates the OpenShift cartridge repository. When I looked earlier though, it didn't seem to have done that. Messaging is running, I've installed the cartridge RPMs, and I can successfully query for (what turns out to be) an empty database of cartridge information.<br />
<br />
Finally! When I look at plugins/msg-node/mcollective/src/openshift.rb <a href="https://github.com/openshift/origin-server/blob/master/plugins/msg-node/mcollective/src/openshift.rb#L26-L45">lines 26-45</a> I find what I'm looking for. CartridgeRepository.install is called when the MCollective openshift agent is loaded. That is: when the MCollective service starts.<br />
<br />
It turns out that I'd started and begun testing the MCollective service <u>before</u> installing any of the OpenShift cartridge packages. Restarting MCollective populates the .cartridge_repository directory and now my mco rpc queries indicate the cartridges I've installed.<br />
<br />
<h3>
Verifying the Change</h3>
<div>
So, I think, based on the code I've found, that when I restart the mcollective daemon on my OpenShift node, it will look in /usr/libexec/openshift/cartridges and it will use the contents to populate /var/lib/openshift/.cartridge_repository (not sure why that's hidden, but..).</div>
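<div>
<br />
The restart itself is nothing special (on this Fedora host the service command just redirects to systemd):<br />
<br />
<pre class="brush: bash ; title: 'restart mcollective to reload the OpenShift agent' ; highlight: 1">node$ sudo service mcollective restart
</pre>
</div>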
<div>
<br />
<pre class="brush: bash ; title: 'cartridges in node repository'; highlight: 1">node$ ls /var/lib/openshift/.cartridge_repository
redhat-cron redhat-php redhat-python
</pre>
<br /></div>
<div>
<b>DING!</b><br />
<b><br /></b>
Now when I query with mco, I should see those. And I do:<br />
<br />
<pre class="brush: bash ; title: 'cartridges in node repository'; highlight: 1">broker$ mco rpc openshift cartridge_repository action=list
Discovering hosts using the mc method for 2 second(s) .... 1
* [ ==========================================================> ] 1 / 1
ec2-54-211-74-85.compute-1.amazonaws.com
output: (redhat, php, 5.5, 0.0.5)
(redhat, python, 2.7, 0.0.5)
(redhat, python, 3.3, 0.0.5)
(redhat, cron, 1.4, 0.0.6)
Finished processing 1 / 1 hosts in 29.41 ms
</pre>
<br /></div>
I suspect that the OpenShift broker also caches these values, so I might have to restart the openshift-broker service on the broker host as well. Then I can use rhc in my development environment to see what cartridges I can use to create an application.<br />
<br />
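Just in case, I bounce the broker service first (the service name here matches the package name on my install; adjust if yours differs):<br />
<br />
<pre class="brush: bash ; title: 'restart the broker to clear any cached cartridge list' ; highlight: 1">broker$ sudo service openshift-broker restart
</pre>
<br />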
<pre class="brush: bash ; title: 'as an app developer, list available cartridges'; highlight: 1">dev$ rhc cartridges
php-5.5 PHP 5.5 web
python-2.7 Python 2.7 web
python-3.3 Python 3.3 web
cron-1.4 Cron 1.4 addon
Note: Web cartridges can only be added to new applications.
</pre>
<br />
And when I try to create a new application:<br />
<br />
<pre class="brush: bash ; title: 'cartridges in node repository'; highlight: 1">$ rhc app-create testapp1 python-2.7
Application Options
-------------------
Namespace: testns1
Cartridges: python-2.7
Gear Size: default
Scaling: no
Creating application 'testapp1' ... done</pre>
<pre class="brush: bash ; title: 'cartridges in node repository'; highlight: 1">
</pre>
<pre class="brush: bash ; title: 'cartridges in node repository'; highlight: 1">
</pre>
<pre class="brush: bash ; title: 'cartridges in node repository'; highlight: 1">Waiting for your DNS name to be available ...
</pre>
<br />
Well, that's better than before!<br />
<h3>
What I learned:</h3>
</div>
<div>
<ul>
<li>rhc is configured using ~/.openshift/express.conf</li>
<li>look at the logs</li>
<ul>
<li>OpenShift broker: /var/log/openshift/broker/production.log</li>
<li>mcollective server: /var/log/mcollective.log</li>
</ul>
<li>the mcollective client mco can be used to simulate broker activity</li>
<ul>
<li>mco plugin doc - list all plugins available</li>
<li>mco plugin doc openshift - list OpenShift RPC actions and parameters</li>
<li>mco rpc openshift cartridge_repository action=list<br />query all nodes for their cartridge repository contents</li>
</ul>
<li>source code is useful</li>
<ul>
<li>OpenShift source repository: https://github.com/openshift/origin-server</li>
<li>judicious use of find and grep can narrow problem searches</li>
</ul>
<li>cartridge RPMs are installed in /usr/libexec/openshift/cartridges</li>
<li>cartridges are "installed" in /var/lib/openshift/.cartridge_repository</li>
<li>adding cartridges to a node requires a restart of the mcollective service</li>
</ul>
<div>
<br /></div>
</div>
<script src="https://raw.github.com/moski/gist-Blogger/master/public/gistLoader.js" type="text/javascript"></script>markllamahttp://www.blogger.com/profile/14193184544557876514noreply@blogger.com0tag:blogger.com,1999:blog-5022186007695457923.post-24764449632914428142013-09-23T08:58:00.001-07:002013-11-25T06:48:53.072-08:00OpenShift Support Services: Messaging Part 2 (MCollective)About a year ago I did a series of posts on verifying the plugin operations for OpenShift Origin support services. I showed how to check the datastore (mongodb) and DNS updates and how to set up an ActiveMQ message broker , but I when I got to actually sending and receiving messages I got stuck.<br />
<br />
The <a href="http://cloud-mechanic.blogspot.com/2012/11/openshift-back-end-services-data-store.html">Datastore</a> and <a href="http://cloud-mechanic.blogspot.com/2012/11/openshift-back-end-services-dns.html">DNS</a> services use a single point-to-point connection between the broker and the update server. The messaging services use an<a href="http://cloud-mechanic.blogspot.com/2012/11/openshift-back-end-services-messaging.html"> intermediate message broker </a>(ActiveMQ, not to be confused with the OpenShift broker). This means that I need to configure and check not just one points, but three:<br />
<br />
<ul>
<li>Mcollective client to (message) broker (on OpenShift broker)</li>
<li>Mcollective server to (message) broker (on OpenShift node)</li>
<li>End to End</li>
</ul>
<div>
<br /></div>
<div>
I'm using the ActiveMQ message broker to carry MCollective RPC messages. The message broker is interchangeable. MCollective can be carried over any one of several messaging protocols. I'm using the generic Stomp connector for now, though MCollective is deprecating it in favor of the dedicated 'activemq' and 'rabbitmq' connectors. <br />
<br />
<div class="separator" style="clear: both; text-align: center;">
</div>
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="http://3.bp.blogspot.com/-z9YrkLuXrFo/UkBfPxKTiXI/AAAAAAAAB6w/vJiubxaTZvY/s1600/openshift_messaging.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img alt="OpenShift Messaging Components" border="0" height="186" src="http://3.bp.blogspot.com/-z9YrkLuXrFo/UkBfPxKTiXI/AAAAAAAAB6w/vJiubxaTZvY/s640/openshift_messaging.png" title="OpenShift Messaging Components" width="640" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">OpenShift Messaging Components</td></tr>
</tbody></table>
<br /></div>
<div>
<br /></div>
<div>
In a previous post I set up an <a href="http://cloud-mechanic.blogspot.com/2012/11/openshift-back-end-services-messaging.html">ActiveMQ message broker</a> to be used for communication between the OpenShift broker and nodes. In this one I'm going to connect the OpenShift components to the messaging service, verify both connections and then verify that I can send messages end-to-end.<br />
<br />
Hold on for the ride, it's a long one (even for me)</div>
<div>
<br />
<b>Mea Culpa</b>: I'm referring to what MCollective does as "messaging" but that's not strictly true. ActiveMQ, RabbitMQ, QPID are message broker services. MCollective uses those, but actually, MCollective is an RPC (Remote Procedure Call) system. Proper messaging is capable of much more than MCollective requires, but to avoid a lot of verbal knitting I'm being lazy and calling MCollective "messaging".<br />
<br />
<h2>
The Plan</h2>
<div>
Since this is a longer process than any of my previous posts, I'm going to give a little road-map up front so you know you're not getting lost on the way. Here are the landmarks between here and a working OpenShift messaging system:</div>
<div>
<ol>
<li>Ingredients: Gather configuration information for messaging setup.</li>
<li>Mcollective Client -<br />Establish communications between the Mcollective client and the ActiveMQ server<br />(OpenShift broker host to message broker host)</li>
<li>MCollective Server -<br />Establish communications between the MCollective server and the ActiveMQ server<br />(OpenShift node host to message broker host)</li>
<li>MCollective End-To-End -<br />Verify MCollective communication from client to server</li>
<li>OpenShift Messaging and Agent -<br />Install OpenShift messaging interface definition and agent packages on both OpenShift broker and node</li>
</ol>
<h2>
Ingredients</h2>
</div>
<table border="2">
<tbody>
<tr><th>Variable</th><th>Value</th></tr>
<tr>
<td>ActiveMQ Server</td><td>msg1.infra.example.com</td>
</tr>
<tr>
<td colspan="2" style="text-align: center;">Message Bus</td>
</tr>
<tr><td>topic username</td><td>mcollective</td></tr>
<tr><td>topic password</td><td>marionette</td></tr>
<tr><td>admin password</td><td>msgadminsecret</td></tr>
<tr>
<td colspan="2" style="text-align: center;">Message End</td>
</tr>
<tr><td>password</td><td>mcsecret</td></tr>
</tbody></table>
</div>
<div>
<ul>
<li>A running ActiveMQ service</li>
<li>A host to be the MCollective client (and after that an OpenShift broker)</li>
<li>A host to run the MCollective service (and after that an OpenShift node)</li>
</ul>
<div>
On the Mcollective client host, install these RPMs</div>
<div>
<ul>
<li>mcollective-client</li>
<li>rubygem-openshift-origin-msg-broker-mcollective</li>
</ul>
On the MCollective server (OpenShift node) host, install these RPMs</div>
<div>
<ul>
<li>mcollective</li>
<li>openshift-origin-msg-node-mcollective</li>
</ul>
<div>
<br /></div>
</div>
</div>
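<div>
On each host that works out to a single yum transaction, something like this (assuming the OpenShift Origin package repositories are already configured):<br />
<br />
<pre class="brush: bash ; title: 'install the messaging packages' ; highlight: [2,4]"># on the OpenShift broker host (the MCollective client)
sudo yum install mcollective-client rubygem-openshift-origin-msg-broker-mcollective
# on the OpenShift node host (the MCollective server)
sudo yum install mcollective openshift-origin-msg-node-mcollective
</pre>
</div>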
<h2>
Secrets and more Secrets</h2>
<div>
<br /></div>
<div>
As with all secure network services, messaging requires authentication. Messaging has a twist though. You need two sets of authentication information, because, underneath, you're actually using two services. When you send a message to an end-point, the end point has to be assured that you are someone who is allowed to send messages. It's like putting a secret code or signature on a letter so that the recipient can be sure it isn't forged.</div>
<div>
<br /></div>
<div>
Now imagine a special private mail system. Before the mail carrier will accept a letter, you have to give them the secret handshake so that they know you're allowed to send letters. On the delivery end, the mail carrier requires not just a signature but a password before handing over the letter.</div>
<div>
<br /></div>
<div>
That's how authentication works for messaging systems.</div>
<div>
<br /></div>
<div>
When I set up the ActiveMQ service I didn't create a separate user for writing to the queue (sending a letter) and for reading (receiving) but I probably should have. As it is, getting a message from the OpenShift broker to an OpenShift node through MCollective and ActiveMQ requires two passwords and one username.</div>
<div>
<br /></div>
<div>
<ul>
<li>mcollective endpoint secret</li>
<li>ActiveMQ username</li>
<li>ActiveMQ password</li>
</ul>
<div>
<br />
The ActiveMQ values will have to match those I set on the ActiveMQ message broker in the previous post. The MCollective end point secret is only placed in the MCollective configuration files. You'll see those soon.</div>
<div>
<br /></div>
<h2>
MCollective Client (OpenShift Broker)</h2>
</div>
<div>
<br /></div>
<div>
The OpenShift broker service sends messages to the OpenShift nodes. All of the messages (currently) originate at the broker. This means that the nodes need to have a process running which connects to the message broker and registers to receive MCollective messages. </div>
<div>
<br />
<h3>
Client configuration: client.cfg</h3>
<br /></div>
<div>
The MCollective client is (predictably) configured using the <code>/etc/mcollective/client.cfg</code> file. For the purpose of connecting to the message broker, only the connector plugin values are interesting, and for end-to-end communications I need the securityprovider plugin as well. The values related to logging are useful for debugging too.<br />
<br />
<br />
<div>
<pre class="brush: bash; title: '/etc/mcollective/client.cfg' ; highlight: [10,14,16,17]"># Basic stuff
topicprefix = /topic/
main_collective = mcollective
collectives = mcollective
libdir = /usr/libexec/mcollective
loglevel = log # just for testing, normally 'info'
# Plugins
securityprovider = psk
plugin.psk = mcsecret
# Middleware
connector = stomp
plugin.stomp.host = msg1.infra.example.com
plugin.stomp.port = 61613
plugin.stomp.user = mcollective
plugin.stomp.password = marionette
</pre>
</div>
<div>
<br />
<b>NOTE:</b> if you're running on RHEL 6 or CentOS 6 instead of Fedora, you're going to be using the SCL version of Ruby and hence MCollective. The file is then at the SCL location:<br />
<br />
<code>/opt/rh/ruby193/root/etc/mcollective/client.cfg</code><br />
<br />
Now I can test connections to the ActiveMQ message broker, though without any servers connected, it won't be very exciting (I hope).<br />
<br />
<h3>
Testing client connections</h3>
</div>
<div>
<br /></div>
<div>
MCollective provides a command line tool for sending messages: <code>mco</code>. mco is capable of several other 'meta' operations as well. The one I'm interested in first is 'mco ping'. With mco ping I can verify the connection to the ActiveMQ service (via the Stomp protocol).</div>
<div>
<br /></div>
<div>
The default configuration file is owned by root and is not readable by ordinary users. This is because it contains plain-text passwords (There are ways to avoid this, but that's for another time). This means I have to either run mco commands as root, or create a config file that is readable. I'm going to use sudo to run my commands as root.</div>
<div>
<br /></div>
<div>
The mco ping command connects to the messaging service and asks all available MCollective servers to respond. Since I haven't connected any yet, I won't get any answers, but I can at least see that I'm able to connect to the message broker and send queries. If all goes well I should get a nice message saying "no one answered".</div>
<div>
<br /></div>
<div>
<br />
<pre class="brush: bash ; title : 'mco ping - successful, no servers' ; highlight : 1">sudo mco ping
---- ping statistics ----
No responses received
</pre>
</div>
<div>
<br />
If that's what you got, feel free to skip down to the <a href="#mcserver">MCollective Server</a> section.<br />
<br />
<h3>
Debugging client-side configuration errors</h3>
<br /></div>
<div>
There are a couple of obvious possible errors:<br />
<ol>
<li>Incorrect broker host</li>
<li>broker service not answering</li>
<li>Incorrect messaging username/password</li>
</ol>
</div>
<div>
The first two will appear the same to the MCollective client. Check the simple stuff first. If I'm sure that the host is correct then I'll have to diagnose the problem on the other end (and write another blog post). Here's how that looks:<br />
<br />
<pre class="brush: bash ; title: 'Server Connection Failure' ; highlight: [1,5]">sudo mco ping
connect to localhost failed: Connection refused - connect(2) will retry(#0) in 5
connect to localhost failed: Connection refused - connect(2) will retry(#1) in 5
connect to localhost failed: Connection refused - connect(2) will retry(#2) in 5
^C
The ping application failed to run, use -v for full error details: Could not connect to Stomp Server:
</pre>
<br />
Note the message <u>Could not connect to the Stomp Server</u>.<br />
<br />
If you get this message, check these on the OpenShift broker host:<br />
<br />
<ol>
<li>The plugin.stomp.host value is correct</li>
<li>The plugin.stomp.port value is correct</li>
<li>The host value resolves to an IP address in DNS</li>
<li>The ActiveMQ host can be reached from the OpenShift Broker host (by ping or SSH)</li>
<li>You can connect to the Stomp port on the ActiveMQ broker host<br />telnet msg1.infra.example.com 61613 (yes, telnet is a useful tool) </li>
</ol>
<br />
If all of these are correct, then look on the ActiveMQ message broker host:<br />
<br />
<ol>
<li>The ActiveMQ service is running</li>
<li>The Stomp transport TCP ports match the plugin.stomp.port value</li>
<li>The host firewall is allowing inbound connections on the Stomp port (some quick checks follow this list)</li>
</ol>
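A couple of one-liners on the ActiveMQ host cover most of that. The exact service name and firewall tooling will vary, so treat these as a sketch:<br />
<br />
<pre class="brush: bash ; title: 'quick checks on the ActiveMQ host' ; highlight: [2,4,6]"># is the ActiveMQ service running?
sudo service activemq status
# is anything listening on the Stomp port?
sudo ss -tlnp | grep 61613
# is the port allowed through the firewall? (iptables view)
sudo iptables -L -n | grep 61613
</pre>
<br />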
<br />
The third possibility indicates an information or configuration mismatch between the MCollective client configuration and the ActiveMQ server. That will look like this:<br />
<br />
<pre class="brush: bash ; title: 'Bad Authentication' ; highlight: 1">sudo mco ping
transmit to msg1.infra.example.com failed: Broken pipe
connection.receive returning EOF as nil - resetting connection.
connect to localhost failed: Broken pipe will retry(#0) in 5
The ping application failed to run, use -v for full error details: Stomp::Error::NoCurrentConnection
</pre>
<br />
You can get even more gory details by changing the client.cfg to set the log level to <i>debug</i> and send the log output to the console:<br />
<br />
<pre class="brush: bash ; title: 'enable debug output to the console in client.cfg'">...
loglevel = debug # instead of 'log' or 'info'
logger_type = console # instead of 'file', or 'syslog' or unset (no logging)
...
</pre>
<br />
I'll spare you what that looks like here.<br />
<br /></div>
<h2>
<a href="http://www.blogger.com/blogger.g?blogID=5022186007695457923" id="mcserver">
MCollective Server (OpenShift Node)</a></h2>
<div>
<br />
The mcollective server is a process that connects to a message broker, subscribes to (registers to receive messages from) one or more topics and then listens for incoming messages. When it accepts a message, the mcollective server passes it to a plugin module for execution and then returns any response. All OpenShift node hosts run an MCollective server which connects to one or more of the ActiveMQ message brokers.<br />
<br />
<h3>
Configure the MCollective service daemon: server.cfg </h3>
<br />
I bet you have already guessed that the MCollective server configuration file is <code>/etc/mcollective/server.cfg</code>.</div>
<div>
<br />
<pre class="brush: bash ; title: '/etc/mcollective/server.cfg'; highlight: [6,7,8,12,13,21,23,24]"># Basic stuff
topicprefix = /topic/
main_collective = mcollective
collectives = mcollective
libdir = /usr/libexec/mcollective
logfile = /var/log/mcollective.log
loglevel = debug # just for setup, normally 'info'
daemonize = 1
classesfile = /var/lib/puppet/state/classes.txt
# Plugins
securityprovider = psk
plugin.psk = mcsecret
# Registration
registerinterval = 300
registration = Meta
# Middleware
connector = stomp
plugin.stomp.host = msg1.infra.example.com
plugin.stomp.port = 61613
plugin.stomp.user = mcollective
plugin.stomp.password = marionette
# NRPE
plugin.nrpe.conf_dir = /etc/nrpe.d
# Facts
factsource = yaml
plugin.yaml = /etc/mcollective/facts.yaml
</pre>
</div>
<div>
<br />
<b>NOTE:</b> again the mcollective config files will be in <code>/opt/rh/ruby193/root/etc/mcollective/</code> if you are running on RHEL or CentOS.<br />
<br />
The server configuration looks pretty similar to the client.cfg. The <i>securityprovider</i> plugin must have the same values, because that's how the server knows that it can accept a message from the clients. The plugin.stomp.* values are the same as well, allowing the MCollective server to connect to the ActiveMQ service on the message broker host. It's really a good idea for the <i>logfile</i> value to be set so that you can observe the incoming messages and their responses. The <i>loglevel</i> is set to <u>debug</u> to start so that I can see all the details of the connection process. Finally the <i>daemonize</i> value is set to 1 so that the mcollectived will run as a service.<br />
<br />
The mcollectived will complain if the YAML file does not exist or if the Meta registration plugin is not installed and selected. Comment those out for now. They're out of scope for this post.<br />
<br />
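Commented out, those two sections of my server.cfg end up looking like this:<br />
<br />
<pre class="brush: bash ; title: 'server.cfg - registration and facts disabled for now' ; highlight: 1"># Registration
#registerinterval = 300
#registration = Meta
...
# Facts
#factsource = yaml
#plugin.yaml = /etc/mcollective/facts.yaml
</pre>
<br />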
<h3>
Running the MCollective service</h3>
<div>
<br /></div>
When you're satisfied with the configuration, start the <i>mcollective</i> service and verify that it is running:<br />
<br />
<br />
<pre class="brush: bash ; title: 'start MCollective service' ; highlight: [1,3]">sudo service mcollective start
Redirecting to /bin/systemctl start mcollective.service
ps -ef | grep mcollective
root 13897 1 5 19:37 ? 00:00:00 /usr/bin/ruby-mri /usr/sbin/mcollectived --config=/etc/mcollective/server.cfg --pidfile=/var/run/mcollective.pid
</pre>
</div>
<div>
<br />
You should be able to confirm the connection to the ActiveMQ server in the log.<br />
<br />
<pre class="brush: bash ; title 'mcollective service startup log' ; highlight: [1,5]">sudo tail /var/log/mcollective.log
I, [2013-09-19T19:53:21.317197 #16544] INFO -- : mcollectived:31:in `&lt;main&gt;' The Marionette Collective 2.2.3 started logging at info level
I, [2013-09-19T19:53:21.349798 #16551] INFO -- : stomp.rb:124:in `initialize' MCollective 2.2.x will be the last to fully support the 'stomp' connector, please migrate to the 'activemq' or 'rabbitmq' connector
I, [2013-09-19T19:53:21.357215 #16551] INFO -- : stomp.rb:82:in `on_connecting' Connection attempt 0 to stomp://mcollective@msg1.infra.example.com:61613
I, [2013-09-19T19:53:21.418225 #16551] INFO -- : stomp.rb:87:in `on_connected' Conncted to stomp://mcollective@msg1.infra.example.com:61613
...
</pre>
<br />
If you see that, you can skip down again to the next section, <a href="#endtoend">MCollective End-to-End</a>
<br />
<br />
<h3>
Debugging MCollective Server Connection Errors</h3>
<br />
Again the two most likely problems are that the host or the stomp plugin are mis-configured.<br />
<br />
<br />
<pre class="brush: bash ; title: 'mcollective server connection failure' ; hightlight=[2]">sudo tail /var/log/mcollective.log
I, [2013-09-19T20:05:50.943144 #18600] INFO -- : stomp.rb:82:in `on_connecting' Connection attempt 1 to stomp://mcollective@msg1.infra.example.com:61613
I, [2013-09-19T20:05:50.944172 #18600] INFO -- : stomp.rb:97:in `on_connectfail' Connection to stomp://mcollective@msg1.infra.example.com:61613 failed on attempt 1
I, [2013-09-19T20:05:51.264456 #18600] INFO -- : stomp.rb:82:in `on_connecting' Connection attempt 2 to stomp://mcollective@msg1.infra.example.com:61613
...
</pre>
<br />
If I see this, I need to check the same things I would have for the client connection. On the MCollective server host:<br />
<br />
<ul>
<li>plugin.stomp.host is correct</li>
<li>plugin.stomp.port matches Stomp transport TCP port on the ActiveMQ service</li>
<li>Hostname resolves to an IP address</li>
<li>ActiveMQ host can be reached from the MCollective client host (ping or SSH)</li>
</ul>
<div>
<br /></div>
<div>
On the ActiveMQ message broker:</div>
<div>
<br /></div>
<div>
<ul>
<li>ActiveMQ service is running</li>
<li>Any firewall rules allow inbound connections to the Stomp TCP port</li>
</ul>
<div>
The other likely error is username/password mismatch. If you see this in your mcollective logs, check the ActiveMQ user configuration and compare it to your mcollective server plugin.stomp.user and plugin.stomp.password values.</div>
</div>
<div>
<br />
<pre class="brush: bash ; title: 'Stomp Authentication Errors' ; highlight: 1">...
I, [2013-09-19T20:15:13.655366 #20240] INFO -- : stomp.rb:82:in `on_connecting'
Connection attempt 0 to stomp://mcollective@msg1.infra.example.com:61613
I, [2013-09-19T20:15:13.700844 #20240] INFO -- : stomp.rb:87:in `on_connected'
Conncted to stomp://mcollective@msg1.infra.example.com:61613
E, [2013-09-19T20:15:13.729497 #20240] ERROR -- : stomp.rb:102:in `on_miscerr' U
nexpected error on connection stomp://mcollective@msg1.infra.example.com:61613: es_trans: transmit
to msg1.infra.example.com failed: Broken pipe
...
</pre>
</div>
<div>
<br /></div>
</div>
<h2>
<a href="http://www.blogger.com/blogger.g?blogID=5022186007695457923" id="endtoend">MCollective End-to-End</a></h2>
<div>
Now that I have both the MCollective client and server configured to connect to the ActiveMQ message broker I can confirm the connection end to end. Remember that 'mco ping' command I used earlier? When there are connected servers, they should answer the ping request.<br />
<br />
<pre class="brush: bash ; title: 'successful mco ping' ; highlight: [1,2]"> sudo mco ping
node1.infra.example.com time=138.60 ms
---- ping statistics ----
1 replies max: 138.60 min: 138.60 avg: 138.60
</pre>
<br /></div>
<h2>
OpenShift Node 'plugin' agent</h2>
<div>
Now I'm sure that both MCollective and ActiveMQ are working end-to-end between the OpenShift broker and node. But there's no "OpenShift" in there yet. I'm going to add that now.<br />
<br />
There are three packages that specifically deal with MCollective and interaction with OpenShift:<br />
<br />
<ul>
<li>openshift-origin-msg-common.noarch (misnamed, specifically mcollective)</li>
<li>rubygem-openshift-origin-msg-broker-mcollective</li>
<li>openshift-origin-msg-node-mcollective.noarch</li>
</ul>
<div>
<br /></div>
</div>
<div>
The first package defines the messaging protocol for OpenShift. It includes interface specifications for all of the messages, their arguments and expected outputs. This is used on both the MCollective client and server side to produce and validate the OpenShift messages. The broker package defines the interface that the OpenShift broker (a Rails application) uses to generate messages to the nodes and process the returns. The node package defines how the node will respond when it receives each message.<br />
<br />
<div>
The OpenShift node also requires several plugins that, while not required for messaging per se, will cause the OpenShift agent to fail if they are not present (a sample yum invocation follows the list):</div>
<ul>
<li>rubygem-openshift-origin-frontend-nodejs-websocket</li>
<li>rubygem-openshift-origin-frontend-apache-mod-rewrite</li>
<li>rubygem-openshift-origin-container-selinux</li>
</ul>
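<div>
On the node that works out to something like this (package names as listed above; the Origin repositories are assumed to be configured already):<br />
<br />
<pre class="brush: bash ; title: 'install the OpenShift agent packages and plugin dependencies' ; highlight: 1">sudo yum install openshift-origin-msg-common openshift-origin-msg-node-mcollective \
    rubygem-openshift-origin-frontend-nodejs-websocket \
    rubygem-openshift-origin-frontend-apache-mod-rewrite \
    rubygem-openshift-origin-container-selinux
</pre>
</div>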
When these packages are installed on the OpenShift broker and node, mco will have a new set of messages available. MCollective calls added sets of messages... (OVERLOAD!) 'plugins'. So, to see the available message plugins, use mco plugin doc. To see the messages in the openshift plugin, use mco plugin doc openshift.<br />
<br />
<h4>
Mcollective client: mco</h4>
</div>
<div>
<br /></div>
<div>
I've used mco previously just to send a ping message from a client to the servers. This just collects a list of the MCollective servers listening. The mco command can also send complete messages to remote agents. Now I need to learn how to determine what agents and messages are available and how to send them a message. Specifically, the OpenShift agent has an echo message which simply returns a string which was sent in the message. Now that all of the required OpenShift messaging components are installed, I should be able to tickle the OpenShift agent on the node from the broker. This is what it looks like when it works properly:<br />
<br />
<pre class="brush: bash ; title: 'OpenShift echo message via mco'; highlight: 1">sudo mco rpc openshift echo msg=foo
Discovering hosts using the mc method for 2 second(s) .... 1
* [ ========================================================> ] 1 / 1
node1.infra.example.com
Message: foo
Time: nil
Finished processing 1 / 1 hosts in 25.49 ms
</pre>
<br />
As you might expect, this has more than its fair share of interesting failure modes. The most likely thing you'll see from the mco command is this:<br />
<br />
<pre class="brush: bash ; title: 'mco: no nodes' ; highlight: 1">sudo mco rpc openshift echo msg=foo
Discovering hosts using the mc method for 2 second(s) .... 0
No request sent, we did not discover any nodes.
</pre>
<br />
This isn't very informative, but it does at least indicate that the message was sent and nothing answered. Now I have to look at the MCollective server logs to see what happened. After setting the loglevel to 'debug' in <code>/etc/mcollective/server.cfg</code>, restarting the mcollective service and re-trying the mco rpc command, I can find this in the log file:<br />
<br />
<br />
<pre class="brush: bash ; title: 'mcollective.log - failed to load OpenShift agent' ; highlight: [1,4,5]">sudo grep openshift /var/log/mcollective.log
D, [2013-09-20T14:18:05.864489 #31618] DEBUG -- : agents.rb:104:in `block in findagentfile' Found openshift at /usr/libexec/mcollective/mcollective/agent/openshift.rb
D, [2013-09-20T14:18:05.864637 #31618] DEBUG -- : pluginmanager.rb:167:in `loadclass' Loading MCollective::Agent::Openshift from mcollective/agent/openshift.rb
E, [2013-09-20T14:18:06.360415 #31618] ERROR -- : pluginmanager.rb:171:in `rescue in loadclass' Failed to load MCollective::Agent::Openshift: error loading openshift-origin-container-selinux: cannot load such file -- openshift-origin-container-selinux
E, [2013-09-20T14:18:06.360633 #31618] ERROR -- : agents.rb:71:in `rescue in loadagent' Loading agent openshift failed: error loading openshift-origin-container-selinux: cannot load such file -- openshift-origin-container-selinux
D, [2013-09-20T14:18:13.741055 #31618] DEBUG -- : base.rb:120:in `block (2 levels) in validate_filter?' Failing based on agent openshift
D, [2013-09-20T14:18:13.741175 #31618] DEBUG -- : base.rb:120:in `block (2 levels) in validate_filter?' Failing based on agent openshift
</pre>
<br />
It turns out that the reason those three additional packages are required is that they provide facts to MCollective. Facter is a tool which gathers a raft of information about a system and makes it quickly available to MCollective. The rubygem-openshift-origin-node package adds some Facter code, but those facts will fail to load if the additional packages aren't present. If you do the "install everything" route these dependencies resolve automatically, but if you install and test things piecemeal as I am, they show up as missing requirements.<br />
<br />
After I add those packages I can send an echo message and get a successful reply. If you can discover the MCollective servers from the client with mco ping, but can't get a response to an mco rpc openshift echo message, then the most likely problem is that the OpenShift node packages are missing or misconfigured. Check the logs and address what you find.<br />
<br />
<h2>
Finally! (sort of)</h2>
<div>
At this point, I'm confident that the Stomp and MCollective services are working and that the OpenShift agent is installed on the node and will at least respond to the echo message. I was going to also include testing through the Rails console, but this has gone on long enough. That's next.</div>
<br /></div>
<h2>
References</h2>
<div>
<ul>
<li><a href="http://activemq.apache.org/">ActiveMQ</a> - Message Broker</li>
<li><a href="http://www.rabbitmq.com/">RabbitMQ</a> - Message Broker</li>
<li><a href="https://qpid.apache.org/">QPID</a> - Message Broker</li>
<li><a href="http://puppetlabs.com/mcollective">MCollective</a> - RPC</li>
<ul>
<li>MCollective <a href="http://docs.puppetlabs.com/mcollective/configure/client.html">Client Configuration</a></li>
<li>MCollective <a href="http://docs.puppetlabs.com/mcollective/configure/server.html">Server Configuration</a></li>
</ul>
<li><a href="http://stomp.github.io/">Stomp</a> - messaging protocol</li>
<li><a href="http://www.amqp.org/">AMQP</a> (Advanced Message Queue Protocol)</li>
<li><a href="http://activemq.apache.org/openwire.html">OpenWire</a> - messaging protocol</li>
</ul>
</div>
</div>
markllamahttp://www.blogger.com/profile/14193184544557876514noreply@blogger.com1tag:blogger.com,1999:blog-5022186007695457923.post-10301839027635527192013-07-25T12:07:00.001-07:002013-07-26T06:09:31.115-07:00Installing OpenShift using Puppet, Part 1: Divide and ConquerIt's been quite a while since I posted last. I got stuck on three things<br />
<br />
<ol>
<li>I didn't (don't?) know Puppet</li>
<li>The layers of service and configuration were (are?) muddy.</li>
<li>There are several competing significant installation use cases to be considered.</li>
</ol>
<div>
It would be very Agile to just leap in and start coding things until I got a set of boxes that worked. But it would also likely lead to something which was difficult to adapt to new uses because it didn't respect the working boundaries between different layers and compartments which make up the OpenShift service.</div>
<div>
<br /></div>
<div>
So I learned Puppet, and started coding some top down samples and some bottom up samples, while at the same time writing philosophical tracts trying to justify the direction(s) I was going.</div>
<div>
<br /></div>
<div>
I'm not nearly done (having thrown out several attempts and restarted each time) but I think I've reached a point where I can express clearly *how* I want to go about developing a CMS reference implementation for OpenShift installation and configuration.</div>
<div>
<br /></div>
<div>
OK, you're not going to get away without some philosophy. Rather a lot actually this time.</div>
<div>
<br /></div>
<h2>
Where do Configuration Management Services (CMS) fit...</h2>
<div>
Up until now I've concentrated on reaching a point where I can start installing OpenShift. And I'm finally there. No. Wait. I'm at the point where I can start installing the <i>parts that make up</i> OpenShift. After that I have to configure each of the parts to run in their own way and <i>then</i> I have to configure the settings that OpenShift cares about.</div>
<div>
<br /></div>
<div>
See what happened there? It's layers.</div>
<div>
<br /></div>
<div class="separator" style="clear: both; text-align: center;">
</div>
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="http://4.bp.blogspot.com/-zp7FAdCHu78/UfFgLIOLhtI/AAAAAAAAB3U/iVHExM87628/s1600/os_control_layers_white_bg.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" height="640" src="http://4.bp.blogspot.com/-zp7FAdCHu78/UfFgLIOLhtI/AAAAAAAAB3U/iVHExM87628/s640/os_control_layers_white_bg.png" width="548" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Host and Service Configuration Management Layers</td></tr>
</tbody></table>
<div>
<br /></div>
<div>
<br /></div>
<div>
See where the CMS fits in? Between the running OS and all those configured hosts/services. That's where I am now.</div>
<div>
<br /></div>
<div>
Look at the top layer. Those vertical slices are individual hosts or services that have to be created. Only the ones in the middle are OpenShift. The others are operations support (for a running service) or development and testing stuff which isn't really OpenShift but is needed to create OpenShift.</div>
<div>
<br /></div>
<h2>
... and what do they need to do.</h2>
<div>
<br /></div>
<div>
I need to show you another complicated looking picture:</div>
<div>
<br /></div>
<div class="separator" style="clear: both; text-align: center;">
</div>
<div class="separator" style="clear: both; text-align: center;">
</div>
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="http://4.bp.blogspot.com/-7IlHiBkTgWI/UfFrkPzGg_I/AAAAAAAAB3k/SMseqKsRebo/s1600/openshift_origin_puppet_modules.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" height="632" src="http://4.bp.blogspot.com/-7IlHiBkTgWI/UfFrkPzGg_I/AAAAAAAAB3k/SMseqKsRebo/s640/openshift_origin_puppet_modules.png" width="640" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Draft OpenShift CMS Module Layout</td></tr>
</tbody></table>
<br />
<div class="separator" style="clear: both; text-align: center;">
</div>
<div class="separator" style="clear: both; text-align: center;">
</div>
<div>
As you can see, I need to learn <u>Inkscape</u> more, because <u>Dia</u> graphics just don't look as cool.<br />
<br /></div>
<div>
I'm a fan of big complicated looking graphics to help describe big complicated concepts. This is a very rough incomplete draft of a module breakdown for installing OpenShift using a CM system (Puppet, by name, though this should be applicable to any modular CM system). The three columns in the diagram represent different class uses.</div>
<div>
<br /></div>
<div>
The first column contains classes that are just used to hold information that will be used to instantiate other classes on the target hosts. None of these classes will be instantiated directly on any host. The second column shows an OpenShift Broker and an OpenShift Node. Each includes a class which describes the function of that host within the OpenShift service. Each also includes any support services which run on the same host. The third column contains the definitions of the hosts which run support services. They include a module for the support service itself, and then one which applies the OpenShift customizations to the service.<br />
<br />
OpenShift uses plugin modules for several support services. In the diagram, the plugins for each support service are grouped together. Only one would be instantiated for a given OpenShift installation. Which one is selected is controlled by a parameter of the Master Configuration class <i>::openshift</i>.</div>
<div>
<br /></div>
<div>
There is one lonely class at the bottom of the middle column: <i>::openshift::host</i>. This is currently a catch-all class which provides a single point of control for configuring common host settings such as SSH and firewall rules, the presence (or absence) of special YUM repositories and the like. It will be instantiated on every host which participates in the OpenShift service (for now) but can be customized using class parameters. This class could be broken up or other features added depending on how (in)coherent it becomes.<br />
<br />
<h2>
I showed you that diagram to show you this one.</h2>
<div>
<br /></div>
<div>
Now if you look back to the top diagram, in the top row there are a bunch of vertical items that are peers of a sort. Each blob represents a component service of OpenShift or a supporting service or task. In a fully distributed service configuration each one would represent an individual host.</div>
<div>
<br /></div>
<div>
Keep that in mind as you look at the middle and right side of the second diagram. Those (UML/Puppet) nodes there map to the blobs at the top of the first diagram. They show the internal structure of those blobs when installing OpenShift and support components. Each one contains at least one module which <i>installs</i> a support service or component and which doesn't have the word <u>openshift</u> in it. Each one also contains (at least) one OpenShift customization class. This latter uses the information classes from the first column to customize the software on the node and integrate it with the OpenShift service.</div>
<div>
<br /></div>
<div>
This is the key point:</div>
<div>
<br /></div>
<div>
There are layers here too.</div>
<div>
<b><br /></b></div>
<div>
The configuration management tools should be designed so that you can plug them together in a way that gets you the service you want to have, building up from the base to the completed service. But: <b>you should also be able to understand how the service is put together by looking at the configuration files.</b></div>
<div>
<b><br /></b></div>
<div>
By creating each (Puppet) node from the (Puppet) parts that define what a host does, you can see what the host does by looking at the Puppet node definition. Knowledge is maintained both ways.</div>
<div>
<br /></div>
<h2>
Outside-In Development</h2>
</div>
<div>
<br /></div>
<div>
Since I'm still learning specific CMS implementations (Puppet now, and Ansible soon) and trying to understand how best to express a configuration for OpenShift using these CMS, I'm working from the top a lot. At the same time, I'm trying to actually implement (or steal implementations of) modules to do things like set up the YUM repositories and install the packages. I like this kind of Outside-In development model because (if I'm careful not to thrash too much) it helps me keep both perspectives in mind and hopefully meet in the middle.</div>
<div>
<br /></div>
<div>
In the next installment I'll try putting some meat on the bones of this skeleton: Actually creating the empty class definitions in their hierarchical structure and then creating a set of node definitions that import and use the classes to at least pretend to install an OpenShift service. Hopefully it won't take me another couple of months.</div>
<div>
<br /></div>
<h2>
References</h2>
<h3>
CMS Software</h3>
<div>
<ul>
<li><a href="http://www.puppetlabs.com/">Puppet</a></li>
<li><a href="http://forge.puppetlabs.com/">PuppetForge</a> - puppet modules</li>
<li><a href="https://github.com/ansible/ansible">Ansible</a></li>
</ul>
</div>
<div>
<h3>
Drawing Software</h3>
</div>
<div>
<ul>
<li><a href="http://inkscape.org/">Inkscape</a></li>
<li><a href="https://wiki.gnome.org/Dia">Dia</a></li>
</ul>
</div>
<div>
<br /></div>
<div>
<br /></div>
<div>
<br /></div>
markllamahttp://www.blogger.com/profile/14193184544557876514noreply@blogger.com1tag:blogger.com,1999:blog-5022186007695457923.post-58752354564328018962013-06-05T13:30:00.000-07:002013-06-06T07:02:52.122-07:00OpenShift on AWS EC2, Part 5 - Preparing Configuration Management (with Puppet)I'm 5 posts into this and still haven't gotten to any OpenShift yet, except for doling out the instances and defining the ports and securitygroups for network communication. I did say "from the ground up" though, so if you've been here from the beginning, you knew what you were getting into.<br />
<br />
In this post I'm going to build and run the tasks needed to turn an EC2 base instance with almost nothing installed into a Puppet master, or a Puppet client. There are a number of little details that need managing to get puppet to communicate and to make it as easy as possible to manage updates.<br />
<br />
First a short recap for people just joining and so I can get my bearings.<br />
<br />
<h2>
Previously, our heros...</h2>
<div>
<br /></div>
<div>
<a href="http://cloud-mechanic.blogspot.com/2013/05/openshift-on-aws-ec2-part-1-from-wheels.html">In the first post</a> I introduced a set of tools I'd worked up for myself to help me understand and then automate the interactions with AWS.<br />
<br />
In <a href="http://cloud-mechanic.blogspot.com/2013/05/openshift-on-aws-ec2-part-2-being-seen.html">the second one</a> I registered a DNS domain and delegated it to the AWS Route53 DNS service.<br />
<br />
In <a href="http://cloud-mechanic.blogspot.com/2013/05/openshift-on-aws-ec2-part-3-getting-in.html">the third</a> I figured out what hosts (or classes of hosts) I'd need to run for an OpenShift service. Then I defined a set of network filter rules (using the AWS EC2 securitygroup feature) to make sure that my hosts and my customers could interact.<br />
<br />
Finally <a href="http://cloud-mechanic.blogspot.com/2013/05/openshift-on-aws-ec2-part-4-first.html">in the previous post</a> I selected an AMI to use as the base for my hosts, allocated a static IP address, added DNS A record, and started an instance for the puppet master and broker hosts. The remaining three (data1, message1, and node1) were left as an exercise for the reader.</div>
<div>
<br /></div>
<div>
So now I have five AWS EC2 instances running. I can reach them via SSH. The default account <i>ec2-user</i> has sudo ALL permissions. The instances are completely unconfigured.<br />
<br />
The next few sections are a bunch of exposition and theory. They explain some of what I'm doing and why, but don't contain a lot of <u>doing</u>. If you get bored, scan ahead<a href="http://cloud-mechanic.blogspot.com/2013/06/openshift-on-aws-ec2-part-5-preparing.html#goodstuff"> to the real stuff closer to the bottom.</a><br />
<br />
<h2>
The End of EC2</h2>
</div>
<div>
<br /></div>
<div>
With the completion of the 4th post, we're done with EC2. All of the interactions from here on occur over SSH. The only remaining interactions with Amazon will be with Route53. The broker will be configured to update the app.example.org zone when applications are added or removed.</div>
<div>
<br /></div>
<div>
You could reach this point with any other host provisioning platform, AWS cloudformation, libvirt, virtualbox, Hyper-V, VMWare, or bare metal, it doesn't matter. Each of those will have its own provisioning details but if you can get to networked hosts with stable public domain names you can pick up here and go on, ignoring everything but the first post.</div>
<div>
<br /></div>
<div>
The first post is still needed for the process I'm defining because the origin-setup tools written with Thor aren't just used for EC2 manipulation. If that's all they were for I would have used one of the existing EC2 CLI packages.</div>
<div>
<br /></div>
<div>
<h2>
Configuration Management: An Operations Religion</h2>
</div>
<div>
<br />
I mean this with every coloring and shade of meaning it can have, complete with schisms and dogma and redemption and truth.</div>
<div>
<br /></div>
<div>
Some small shop system administrators think that configuration management isn't for them, it isn't needed. I differ with that opinion. Configuration management systems have two complementary goals. Only one of them is managing large numbers of systems. The important goal is managing even one <b>repeatably</b>. This is the Primary Dogma of System Administration. If you can't do it 1000 times, you can't do it at all.</div>
<div>
<br /></div>
<div>
The service I'm outlining only requires four hosts (the puppet master will be 5). I could do it on one. That's how most demos until now have done it. I could describe to you how to manually install and tweak each of the components in an OpenShift system, but it's very unlikely that anyone would ever be able to reproduce what I described exactly. (I speak from direct experience here, following that kind of description in natural language is hard and writing it is harder.) Using a CMS it is possible to expose what needs to be configured specially and what can be defaulted, and to allow (if it's done well) for flexibility and customization.</div>
<div>
<br /></div>
<div>
The religion comes in when you try to decide <u>which one.</u></div>
<div>
<u><br /></u></div>
<div>
I'm going to go with sheep and expedience and choose Puppet. Other than that I'm not going to explain why.</div>
<div>
<br /></div>
<h2>
Brief Principals of Puppet</h2>
<div>
<br /></div>
<div>
Puppet is one of the currently popular configuration management systems. It is widely available and has a large knowledgeable user base. (that's why).</div>
<div>
<br /></div>
<h3>
The Master/Agent deployment</h3>
<div>
<br /></div>
<div>
The standard installation of puppet contains a <i>puppet master</i> and one or more puppet clients running the <i>puppet agent</i> service. The configuration information is stored on the puppet master host. The agent processes periodically poll the master for updates to their configuration. When an agent detects a change in the configuration spec the change is applied to the host.</div>
<div>
<br /></div>
<div>
The puppet master scheme has some known scaling issues, but for this scenario it will suit just fine. If the OpenShift service grows beyond what the master/agent model can handle, then there are other ways of managing and distributing the configuration, but they are beyond the scope of this demonstration.</div>
<div>
<br /></div>
<h3>
The Site Model Paradigm</h3>
<div>
<br /></div>
<div>
That's the only time you'll see me use that word. I promise.</div>
<div>
<br /></div>
<div>
The puppet configuration is really a description of every component, facet and variable you care about in the configuration of your hosts. It is a <i>model</i> in the sense that it represents the components and their relationships. The model can be compared to reality to find differences. Procedures can be defined to resolve the differences and bring the model and reality into agreement.</div>
<div>
<br /></div>
<div>
There are some things to be aware of. The model is, at any moment, static. It represents the current ideal configuration. The agents are responsible for polling for changes to the model and for generating the comparisons as well as applying any changes to the host. It is certain that when a change is made to the model, there will be a window of time when the site does not match. Usually it doesn't matter, but sometimes changes have to be coordinated. Later I may add MCollective to the configuration to address this. MCollective is Puppet's messaging/remote procedure call service and it allows for more timing control than the standard Puppet agent pull model.</div>
<div>
<br /></div>
<div>
Also, the model is only aware of what you tell it to be aware of. Anything that you don't specify is.... undetermined. Now specifying <b>everything</b> will bury you and your servers under the weight of just trying to stay in sync. It's important to determine what you really care about and what you don't. It's also important to look carefully at what you're leaving out to be sure that it's safe.</div>
<div>
<br /></div>
<div>
<h2>
Preparing the Puppet Master and Clients</h2>
</div>
<div>
<br /></div>
<div>
As usual, there's something you have to do before you can do the thing you really want to do. While puppet can manage pretty much anything about a system after it is set up, it can't set itself up from nothing.</div>
<div>
<br /></div>
<div>
<ul>
<li>The puppet master must have a well known public hostname (DNS). Check.</li>
<li>Each participating client must have a well known public hostname (DNS): Check</li>
<li>The master and clients must each know their own hostname (for identification to the master). Err.</li>
<li>The master and clients must have time sync. Ummm</li>
<li>The master and clients must have the puppet (master/client) software installed. Not Check.</li>
<li>The master must have any additional required modules installed.</li>
<li>The master must have a private certificate authority (CA) so that it can sign client credentials. Not yet</li>
<li>The clients must generate and submit a client certificate for the master to sign. Nope.</li>
<li>The master must have a copy of the site configuration files to generate the configuration model. No.</li>
</ul>
</div>
<div>
<br />
The first four points are generic host setup, and the first two are complete. Installing the puppet software <u>should</u> be simple, but I may need to check and/or tweak the package repositories to get the version I want. The last four are pure puppet configuration and the last one is the goal line.<br />
<br />
<h3>
Hostname</h3>
<div>
<br /></div>
<div>
Puppet uses the <i>hostname</i> value set on each host to identify the host. Each host should have its hostname set to the FQDN associated with the IP address on which it expects incoming connections.</div>
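<div>
<br /></div>
<div>
For reference, doing this by hand on the puppet master would look roughly like the sketch below: a rough manual equivalent of what the <code>remote:hostname</code> task does, assuming a Fedora-style host (RHEL6 keeps the name in <code>/etc/sysconfig/network</code> instead).</div>
<div>
<br />
<pre class="brush:bash ; title: 'set the hostname to the FQDN (manual sketch)' ; highlight: 1"># as root on the instance
echo "puppet.infra.example.org" > /etc/hostname   # persists across reboots
hostname puppet.infra.example.org                 # takes effect immediately
hostname --fqdn                                   # verify; should print the FQDN
</pre>
</div>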
<div>
<br /></div>
<h3>
Time Sync on Virtual Machines</h3>
<div>
<br /></div>
Time sync needs a little space here. On an ordinary bare-metal host I'd say "install an ntpd on every host". NTP daemons are lightweight and more reliable and stable than something like a cron job to re-sync. Virtual machines are special, though.<br />
<br />
On a properly configured virtual machine, the system time comes from the VM host. As the guest, you must assume that the host is doing the right thing. The guest VM has a simulated real-time clock (RTC) which is a pass-through either of the host clock or the underlying hardware RTC. In either case, the guest is not allowed to adjust the underlying clock.<br />
<br />
Typically a service like ntpd gets time information from outside and not only slews the system (OS) clock but it compares that to the RTC and tries to compensate for drift between the RTC and the "real" time. In the default case it will even adjust the RTC to keep it in line with the system clock and "real" time.<br />
<br />
As a guest, it's impolite to go around adjusting your host's clocks.<br />
<br />
So a virtual machine system like an IaaS is one of the few places I'd advise against installing a time server. If your VMs aren't in sync, call your provider and ask them why their hardware clocks are off. If they can't give you a good answer, find a new IaaS provider.<br />
<br />
<h3>
Time Zones and The Cloud</h3>
<div>
<br /></div>
I'm going to throw one more timey-wimey thing in here. I set the system timezone on every server host to UTC. If I ever have to compare logs on servers from different regions of the world (this is the <u>cloud</u>, remember?) I don't have to convert time zones. User accounts can always set their timezone to localtime using the TZ environment variable. The tasks offer an option so that you can override the timezone.<br />
<br />
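A rough manual equivalent, assuming a Fedora/RHEL style host (this is roughly what the <code>remote:timezone</code> task boils down to):<br />
<br />
<pre class="brush:bash ; title: 'set the system timezone to UTC (manual sketch)' ; highlight: 1"># as root: point /etc/localtime at the UTC zone definition
ln -sf /usr/share/zoneinfo/UTC /etc/localtime

# an individual user can still view local time without touching the system setting
TZ=America/New_York date
</pre>
<br />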
<h3 id="goodstuff">
Host Preparation vs. Software Configuration</h3>
<div>
<br /></div>
<div>
It would be fairly easy to write a single task that completes all of the bullet points listed above, but something bothers me about that idea. The first four are generic host tasks. The last four are distinctly puppet-configuration related. Installing the software packages sits on the edge of both. The system tasks are required on every host. Only the puppet master will get the puppet master service software and configuration. The puppet clients will get different software and a different configuration process.</div>
<div>
<br /></div>
<div>
I'm going to take advantage of the design of Thor to create three separate tasks to accomplish the job:</div>
<div>
<br /></div>
<div>
<ul>
<li><code>origin:prepare</code> - do the common hosty tasks</li>
<li><code>origin:puppetmaster</code> - prepare and then install and configure a master</li>
<li><code>origin:puppetclient</code> - prepare, and then install and register a client</li>
</ul>
<div>
<br />
So the <code>origin:prepare</code> task needs to set the hostname on the box to match the FQDN. I prefer also to enable the local firewall service and open a port for SSH to minimize the risk of unexpected exposure. This is also where I'd put a task to add a software repository for the puppet packages if needed.</div>
</div>
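<div>
<br /></div>
<div>
The firewall part of that, done by hand, would be something like the sketch below. I'm assuming the <i>lokkit</i> tool from the <i>system-config-firewall-base</i> package (which the tasks install); plain iptables rules would do just as well.</div>
<div>
<br />
<pre class="brush:bash ; title: 'prepare-step firewall sketch' ; highlight: 1"># enable the host firewall and allow only SSH to begin with
lokkit --enabled --service=ssh

# this is also where a puppet package repository would be added,
# if the stock repos don't carry the puppet version you want
</pre>
</div>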
<div>
<br /></div>
Each of the <code>origin:puppetmaster</code> and <code>origin:puppetclient</code> tasks will invoke the <code>origin:prepare</code> task first.<br />
<br />
<h2>
File Revision Control</h2>
<div>
<br /></div>
<div>
Since Configuration Management is all about control and repeatability, it also makes sense to place the configuration files themselves under revision control. For this example I'm going to place the site configuration in a GitHub repository. Changes can be made in a remote work space and pushed to the repository. Then they can be pulled down to the puppet master and the service notified to re-read the configurations. They can also be reverted as needed.</div>
<div>
<br /></div>
<div>
When the Puppet site configuration is created on the puppet master, it will be cloned from the git repo on github.</div>
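<div>
<br /></div>
<div>
The day-to-day workflow, sketched with plain git commands (the repository URL is the one used later in this post; substitute your own):</div>
<div>
<br />
<pre class="brush:bash ; title: 'site configuration change workflow (sketch)' ; highlight: 1"># in a working copy anywhere
git clone https://github.com/markllama/origin-puppet
cd origin-puppet
# ... edit manifests/ and modules/ ...
git commit -a -m "describe the change"
git push origin master

# then on the puppet master, in the cloned configuration directory
git pull
# notify or restart the puppet master service if your setup caches configurations
</pre>
</div>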
<div>
<br /></div>
<h2>
Initialize the Puppet Master</h2>
<div>
The puppet master process runs as a service on the puppet server. It listens for polling queries from puppet agents on remote machines. The puppet master service must read the site configurations to build the models that will define each host. The puppet service runs as a non-root user and group, each named "puppet". The default location for puppet configuration files is in /etc/puppet. This area is only writable by the root user. Other service files reside in /var/lib/puppet. This area is writable by the puppet user and group. Further, SELinux limits access by the puppet user to files outside these spaces.</div>
<div>
<br /></div>
<div>
On RHEL6, the EC2 login user is still root. The user and group settings aren't really needed there, but they are still consistent.</div>
<div>
<br /></div>
<div>
The way I choose to manage this is:</div>
<div>
<ol>
<li>Add the ec2-user to the puppet group</li>
<li>Place the site configuration in <code>/var/lib/puppet/site</code></li>
<li>Update the puppet configuration file (<code>/etc/puppet/puppet.conf</code>) to reflect the change</li>
<li>Clone the configuration repo into the local configuration directory</li>
<li>Symlink the configuration repo root into the ec2-user home directory.</li>
</ol>
</div>
<div>
This way the ec2-user has permission and access to update the site configuration.</div>
<div>
<br /></div>
<div>
Puppet uses x509 server and client certificates. The puppet master needs a server certificate and needs to self-sign it before it can sign client certificates or accept connections from clients.</div>
<div>
<br /></div>
<div>
Once the server certificate is generated and signed, I also need to enable and start the puppet master service. Finally, I need to add a firewall rule allowing inbound connections on the puppet master port, 8140/TCP.</div>
<br />
So the process of initializing the puppet master is this:<br />
<br />
<ul>
<li>install the puppet master software</li>
<li>modify the puppet config file to reflect the new site configuration file location</li>
<li>install additional puppet modules</li>
<li>generate server certificate and sign it</li>
<li>add ec2-user to puppet group (or root user on RHEL6)</li>
<li>create site configuration directory and set owner, group, permissions</li>
<li>clone the git repository into the configuration directory</li>
<li>start and enable the puppet master service</li>
</ul>
<div>
<br /></div>
</div>
<div>
<h3>
Installing Packages</h3>
<div>
<br /></div>
<div>
Since I'm using Thor, the package installation process is a Thor task. Each sub-task will only run once within the invocation of its parent. The <code>origin:puppetmaster</code> task calls the <code>origin:prepare</code> task and provides a set of packages needed for a puppet master in addition to any installed as part of the standard preparation (firewall management and augeas). For the puppet master, these additional packages are the <i>puppet-server</i> and <i>git</i> packages. Dependencies are resolved by YUM.</div>
<br />
<h3>
Adding user to Puppet group</h3>
<div>
<br /></div>
<div>
The puppet service is controlled by the root user, but runs as a role user and group both called <i>puppet.</i> I would like the login user to be able to manage the puppet site configuration files, but not to log in either as the root or puppet user. I'll add the <i>ec2-user</i> user to the puppet group, and set the group write permissions so that this user can manage the site configuration.</div>
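<div>
<br /></div>
<div>
The command itself is one line (the new group membership takes effect at the user's next login):</div>
<div>
<br />
<pre class="brush:bash ; title: 'add ec2-user to the puppet group' ; highlight: 1">usermod -a -G puppet ec2-user
id ec2-user    # verify the membership
</pre>
</div>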
<div>
<br /></div>
<h3>
Creating the Site Configuration Space</h3>
<div>
<br /></div>
<div>
As noted above, the ec2-user account will be used to manage the puppet site configuration files. The files must be writable by the ec2-user (through the puppet group) but they must also be readable by the puppet user and service. In addition, since these are service configurations rather than (local) host configuration files, I'd prefer that they not reside in /etc.</div>
<div>
<br /></div>
<div>
SELinux policy restricts the location of files which the puppet service processes can read. One of those locations is in <code>/var/lib/puppet</code>. Rather than update the policy, it seems easier to place the site configuration data within <code>/var/lib/puppet.</code></div>
<div>
<br /></div>
<div>
I create a new directory <code>/var/lib/puppet/site</code> and set the owner, group and permissions so that the puppet user and group can read and write the files. I also set the setgid bit on the directory so that new files will inherit the group. This way the ec2-user will have the needed access, and SELinux will not prevent the puppet master service from reading the files. In a later step I'll use git to clone the site configuration files into place.<br />
<br /></div>
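<div>
One way to do that by hand, assuming the puppet user and group described above:</div>
<div>
<br />
<pre class="brush:bash ; title: 'create the site configuration directory (sketch)' ; highlight: 1">mkdir /var/lib/puppet/site
chown puppet:puppet /var/lib/puppet/site
chmod 2775 /var/lib/puppet/site    # setgid: new files inherit the puppet group
ls -Zd /var/lib/puppet/site        # confirm the SELinux context is one puppet can read
</pre>
</div>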
<h3>
Install Service Configuration File (setting variables)</h3>
<br />
Moving the location of the site configuration files from the default (<code>/etc/puppet/manifests</code>) and adding a location for user defined modules requires updating the default configuration file. Currently I make three alterations to the default file:<br />
<br />
<br />
<ul>
<li>set the puppet master hostname as needed</li>
<li>set the location of the site configuration (manifests)</li>
<li>add a location to the modulepath</li>
</ul>
<div>
I use a template file, push a copy to the master and use <i>sed</i> to replace the values before copying the updated file into place.</div>
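<div>
<br /></div>
<div>
Something along these lines, where <code>puppet.conf.template</code> and the <code>@TOKEN@</code> placeholders are names I've made up for illustration; the real settings end up in <code>/etc/puppet/puppet.conf</code>:</div>
<div>
<br />
<pre class="brush:bash ; title: 'fill in the puppet.conf template (sketch)' ; highlight: 1">sed -e 's/@MASTER@/puppet.infra.example.org/' \
    -e 's|@MANIFESTDIR@|/var/lib/puppet/site/manifests|' \
    -e 's|@MODULEPATH@|/var/lib/puppet/site/modules:/etc/puppet/modules|' \
    puppet.conf.template > puppet.conf
sudo cp puppet.conf /etc/puppet/puppet.conf
</pre>
</div>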
<br />
<h3>
Installing Standard Modules</h3>
<div>
<br /></div>
<div>
Puppet provides a set of standard modules for managing common aspects of clients. These are installed from PuppetLabs' module site (the Puppet Forge) with the <code>puppet module install</code> command, before the master process is started.</div>
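<div>
<br /></div>
<div>
For example, the <i>ntp</i> module that the uber-task below pulls in:</div>
<div>
<br />
<pre class="brush:bash ; title: 'install a Forge module' ; highlight: 1">puppet module install puppetlabs-ntp
puppet module list    # confirm where it landed
</pre>
</div>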
<br />
<h3>
Unpacking Site Configuration (From git)</h3>
<br />
I already have a task for cloning a git repository on a remote host. Unpack the site configurations into the directory prepared previously. The git repo must have two directories at the top: <i>manifests</i> and <i>modules.</i> These will contain the site configuration and any custom modules needed for OpenShift. These locations are configured into the puppet master configuration above.<br />
<br />
<h3>
Adding Firewall Rules</h3>
<br />
The puppet master service listens on port 8140/TCP. I need to add an allow rule so that inbound connections to the puppet master will succeed.<br />
<br />
Just to be safe I also add an explicit rule to allow SSH (22/TCP) before restarting the firewall service.<br />
<br />
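Done by hand, with the same lokkit assumption as before (iptables rules would work just as well):<br />
<br />
<pre class="brush:bash ; title: 'open the puppet master port (sketch)' ; highlight: 1">lokkit --enabled --service=ssh --port=8140:tcp
</pre>
<br />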
These match the securitygroup rule definitions from the third post. Some people would question the need for running a host-based firewall when EC2 already provides network filtering. I would refer anyone who asks to read up on <a href="https://en.wikipedia.org/wiki/Defence_in_depth">Defense in Depth</a>.<br />
<br />
<h3>
Filtering the Puppet logs into a separate file</h3>
<div>
<br /></div>
<div>
It is much easier to observe the operation of the service if the logs are in a separate file. I add an entry to the <code>/etc/rsyslog.d/</code> directory and restart the rsyslog daemon to place puppet master logs in <code>/var/log/puppet-master.log</code></div>
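<div>
<br /></div>
<div>
Roughly, a drop-in file like the one below. The program name to match can vary with the puppet packaging, so check what actually appears in <code>/var/log/messages</code> before trusting it; older rsyslog versions use <code>& ~</code> instead of <code>& stop</code> to discard the matched messages.</div>
<div>
<br />
<pre class="brush:bash ; title: 'send puppet master logs to their own file (sketch)' ; highlight: 1">cat > /etc/rsyslog.d/puppetmaster.conf <<'EOF'
:programname, isequal, "puppet-master"    /var/log/puppet-master.log
& stop
EOF
systemctl restart rsyslog    # or: service rsyslog restart
</pre>
</div>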
<br />
<h3>
Enabling and Starting the Puppet Master Service</h3>
<div>
<br /></div>
<div>
Finally, when all of the puppet master host customization is complete, I can enable and start the puppet master service.</div>
<div>
<br /></div>
<h3>
What all that looks like</h3>
<div>
<br />
That's a whole long list and I created a whole set of Thor tasks to manage the steps. Then I created an uber-task to execute it all. It starts with the result of <code>origin:baseinstance</code> (run with the securitygroups default and puppetmaster). It results in a running puppet master waiting for clients to connect.</div>
<div>
<br />
<pre class="brush:bash ; title: 'initialize puppet master' ; highlight: 1">thor origin:puppetmaster puppet.infra.example.org --siterepo https://github.com/markllama/origin-puppet
origin:puppetmaster puppet.infra.example.org
task: remote:available puppet.infra.example.org
task: origin:prepare puppet.infra.example.org
task: remote:distribution puppet.infra.example.org
fedora 18
task: remote:arch puppet.infra.example.org
x86_64
task: remote:timezone puppet.infra.example.org UTC
task: remote:hostname puppet.infra.example.org
task: remote:yum:install puppet.infra.example.org puppet-server git system-config-firewall-base augeas
task: puppet:master:join_group puppet.infra.example.org
task: remote:git:clone puppet.infra.example.org https://github.com/markllama/origin-puppet
task: puppet:master:configure puppet.infra.example.org
task: puppet:master:enable_logging puppet.infra.example.org
task: puppet:module:install puppet.infra.example.org puppetlabs-ntp
task: remote:firewall:stop puppet.infra.example.org
task: remote:firewall:service puppet.infra.example.org ssh
task: remote:firewall:port puppet.infra.example.org 8140
task: remote:firewall:start puppet.infra.example.org
task: remote:service:start puppet.infra.example.org puppetmaster
task: remote:service:enable puppet.infra.example.org puppetmaster
</pre>
</div>
<br />
You can check that the puppet master has created and signed its own CA certificate by listing the puppet certificates like this:<br />
<br />
<pre class="brush:bash ; title: 'list all puppet certificates' ; highlight: 1">thor puppet:cert list puppet.infra.example.org --all
task puppet:cert:list puppet.infra.example.org
+ puppet.infra.example.org BD:27:A5:3B:AE:F5:1D:05:7E:8F:E7:E9:CA:BA:32:4B
</pre>
<br />
This indicates that there is now a single certificate associated with the puppet master. This certificate will be used to sign the client certificates as they are submitted.<br />
<br />
<h2>
Initializing a Puppet Client</h2>
</div>
<div>
<br />
The first part of creating a puppet client host is the same as for the master (almost). It involves installing some basic puppet packages (puppet, facter, augeas), setting the hostname and time zone and the rest of the hosty stuff. Then we get to the puppet client registration.<br />
<br />
The puppet agent runs on the controlled client hosts. It polls the puppet master periodically checking for updates to the configuration model for the host.</div>
<div>
<br /></div>
<div>
When the puppet agent starts the first time it generates an x509 client certificate and sends a signing request to the puppet master.</div>
<div>
<br /></div>
<div>
When the puppet master receives a signing request from an agent for the first time, it places the certificate in a list of those waiting to be signed. The user can then sign and accept each new client certificate, and the initial identification process is complete. From then on the puppet agent polls using its client certificate for identification, and the signature provides authentication.</div>
<div>
<br /></div>
<div>
The process, then, for installing and initializing the puppet client is this (a rough manual sketch follows the two lists below):</div>
<div>
<br /></div>
<div>
On the client:</div>
<div>
<ul>
<li>install the puppet agent package</li>
<li>configure the puppet master hostname into the configuration file</li>
<li>enable the puppet agent service</li>
<li>start the puppet agent service</li>
</ul>
<div>
Then on the puppet master:</div>
</div>
<div>
<ul>
<li>wait for the client unsigned certificate to arrive</li>
<li>sign the new client certificate</li>
</ul>
</div>
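<div>
<br /></div>
<div>
Here is that sequence as a rough manual sketch, with Fedora-style package and service names assumed and the <code>puppet cert</code> subcommands as they exist in Puppet 3.x:</div>
<div>
<br />
<pre class="brush:bash ; title: 'puppet client registration (manual sketch)' ; highlight: 1"># on the client
yum install -y puppet facter
# add "server = puppet.infra.example.org" to the [agent] (or [main]) section
# of /etc/puppet/puppet.conf, then enable and start the agent
systemctl enable puppet
systemctl start puppet

# on the puppet master, once the signing request arrives
puppet cert list                            # pending (unsigned) requests
puppet cert sign broker.infra.example.org   # accept this client
</pre>
</div>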
<div>
This is what it looks like for the broker host:<br />
<br />
<br />
<pre 1="" add="" agent="" broker="" class="brush:bash ; title: " highlight:="" host="" puppet="" the="" to="">thor origin:puppetclient broker.infra.example.org puppet.infra.example.org
origin:puppetclient broker.infra.example.org, puppet.infra.example.org
task: remote:available broker.infra.example.org
task: origin:prepare broker.infra.example.org
task: remote:distribution broker.infra.example.org
fedora 18
task: remote:arch broker.infra.example.org
x86_64
task: remote:timezone broker.infra.example.org UTC
task: remote:hostname broker.infra.example.org
task: remote:yum:install broker.infra.example.org puppet facter system-config-firewall-base augeas
task: puppet:agent:set_server broker.infra.example.org puppet.infra.example.org
task: puppet:agent:enable_logging broker.infra.example.org
task: remote:service:enable broker.infra.example.org puppet
task: remote:service:start broker.infra.example.org puppet
task: puppet:cert:sign puppet.infra.example.org broker.infra.example.org
</pre>
<br />
At this point the client can request its own configuration model and the master will confirm the identity of the client and return the requested information.<br />
<br />
<pre class="brush:bash ; title: 'list all certificates' ; highlight: 1">thor puppet:cert:list puppet.infra.example.org --all
task puppet:cert:list puppet.infra.example.org
+ broker.infra.example.org 09:97:22:B9:A9:16:AE:B1:32:93:EC:3A:6D:7A:CF:67
+ puppet.infra.example.org 70:B8:E0:C0:F8:5B:48:67:4E:92:91:D2:0D:E4:2B:F4
</pre>
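<br />
To see a round trip immediately, rather than waiting for the next scheduled poll, you can run the agent once by hand on the client:<br />
<br />
<pre class="brush:bash ; title: 'force an immediate agent run' ; highlight: 1">puppet agent --test --server puppet.infra.example.org
</pre>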
<br />
Repeat the <code>origin:puppetclient</code> step for the data1, message1 and node1 instances you created last time. You did create them, right? Check the certs as each one registers.<br />
<br />
The next step is to actually build a model for the client to request by creating a site manifest and a set of node descriptions.<br />
<br />
That means: we finally get to do some OpenShift.</div>
<div>
<ul>
</ul>
</div>
<div>
</div>
markllamahttp://www.blogger.com/profile/14193184544557876514noreply@blogger.com4tag:blogger.com,1999:blog-5022186007695457923.post-82999244800289923342013-05-30T11:04:00.003-07:002013-05-30T12:55:28.050-07:00OpenShift on AWS EC2, Part 4 - The First MachineThere's enough infrastructure in place now that I should be able to create the first instance for my OpenShift service. I'm going to be managing the configuration with a Puppet master, so that will be the first instance I create.<br />
<br />
The puppet master must have a public name and a fixed IP address. I need to be able to reach it via SSH, and the puppet agents need to be able to find it by name (oversimplification, go with me on this).<br />
<br />
With Route53 and EC2 configured, I can request a static (elastic) IP and associate it with a hostname in my domain. I can also associate it with a new instance after the instance is launched. I can specify the network filtering rules so I can access the host over the network.<br />
<br />
I actually have a task that does all this in one go, but I'm going to walk through the steps once so it's not magic.<br />
<br />
<b>NOTE</b>: if you haven't pulled down the <a href="https://github.com/markllama/origin-setup">origin-setup</a> tools from github and you want to follow along, you should go back to <a href="http://cloud-mechanic.blogspot.com/2013/05/openshift-on-aws-ec2-part-1-from-wheels.html">the first post in this series</a> and do so.<br />
<br />
This is not the only way to accomplish the goals set here. You can use the AWS <a href="https://console.aws.amazon.com/">web console</a>, <a href="http://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/Welcome.html">CloudFormation</a> or even tools like the <a href="https://github.com/mlinderm/vagrant-aws">control plugins for Vagrant</a>.<br />
<br />
<h2>
Instances and Images in EC2</h2>
<div>
<br /></div>
<div>
First, a little terminology. EC2 has a number of terms to disambiguate... things.</div>
<div>
<br /></div>
<div>
An <i><a href="http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/ComponentsAMIs.html">image</a></i> is a static piece of storage which contains an OS. It is the "gold copy" that we used to make when we still cloned hard disks to copy systems. An image cannot run an OS. It's storage. An image does have some metadata though. It has an associated machine architecture. It has instructions for how it is to be mounted when it is used to create an instance.</div>
<div>
<br /></div>
<div>
(Actually, this is a lie: an image <u>is</u> the metadata; the storage is really in a <i>snapshot</i> with a <i>volume</i>, but that's too much detail and not really important right now.)</div>
<div>
<br /></div>
<div>
An <a href="http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/Instances.html">instance</a> is a runnable copy of an image. It has a copy of the disk, but it also has the ability to start and stop. It is assigned an IP address when it starts. A copy of your RSA security key is installed when it starts so that you can log in.</div>
<div>
<br /></div>
<div>
When you want a new machine, you create an instance from an image. You select an image which uses the architecture and contains the OS that you want. You give the instance a name, a comment, and its security groups. There are other things you can specify as well, but they don't come into play here.</div>
<div>
<br /></div>
<h2>
Finding the Right Image</h2>
<div>
<br /></div>
<div>
People like their three letter abbreviations. On the web interface you'll see the term "AMI", which, I think, stands for "Amazon Machine Image", otherwise known as "an image" in this context. While the image IDs all begin with <i>ami-</i>, I'm going to continue to refer to them as "images".</div>
<div>
<br /></div>
<div>
For OpenShift I want to start with either a Fedora or a RHEL (or CentOS) image. I can't think of a reason anymore not to use a 64 bit OS and VM, so I'll specify that. You can easily find official RHEL images on the AWS web console or using the AWS Marketplace. You can find CentOS in the Marketplace. There are "official" Fedora images there too, though they're not publicized.</div>
<div>
<br /></div>
<div>
What I do is use the web interface to find a recommended image and then make a note of the owner ID of the image. From then on I can use the owner ID to find images using the CLI tools. It doesn't look like you can look up an owner's information from their owner ID.</div>
<div>
<br /></div>
<div>
New instances (running machines) are created from images. Conversely new images can be created from an instance. People can create and register and publish their images, so there can be lots of things that look like they're "official" which may have been altered. It takes a little sleuthing to find the images that come from the source you want. </div>
<div>
<br /></div>
<div>
Using the AWS console, I narrowed the Fedora x86_64 images down to this:</div>
<div>
<br /></div>
<div class="separator" style="clear: both; text-align: center;">
<a href="http://2.bp.blogspot.com/-UfqEp8DQAQo/UaZWUmvqmqI/AAAAAAAAB1M/bXJzf2x7-EU/s1600/ec2_image_search_fedora.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="152" src="http://2.bp.blogspot.com/-UfqEp8DQAQo/UaZWUmvqmqI/AAAAAAAAB1M/bXJzf2x7-EU/s640/ec2_image_search_fedora.png" width="640" /></a></div>
<div>
<br /></div>
<div>
I made a note of the owner ID, and the pattern for the names and I can search for them on the CLI like this:</div>
<div>
<br /></div>
<div>
<pre class="brush:bash ; title: 'list Fedora images' ; highlight : 1">thor ec2:image list --owner 125523088429 --name 'Fedora*' --arch x86_64
ami-2509664c 125523088429 x86_64 Fedora-x86_64-17-1-sda
ami-6f3b5006 125523088429 x86_64 Fedora-x86_64-19-Beta-20130523-sda
ami-b71078de 125523088429 x86_64 Fedora-x86_64-18-20130521-sda
</pre>
<br /></div>
Note that the <i>--name</i> search allows for globbing using the asterisk (*) character.<br />
<br />
<h2>
Launching the First Instance</h2>
<div>
<br /></div>
<div>
I think I have enough information now to fire up the first instance for my OpenShift service. The first one will be the puppet master, as that will control the configuration of the rest.</div>
<div>
<br /></div>
What I know:<br />
<br />
<ul>
<li>hostname - puppet.infra.example.org</li>
<li>base image - ami-b71078de</li>
<li>securitygroup(s) - default, allow SSH</li>
<li>SSH key pair name</li>
</ul>
<div>
<br /></div>
<div>
Later, I will also need a static (ElasticIP) address and a DNS A record. Both of those can be set after the instance is running.<br />
<br />
There is one last thing to decide. When you create an EC2 instance, you must specify the instance <i>type</i> which is a kind of sizing for the machine resources. AWS has a table of EC2 instance types that you can use to help you size your instances to your needs. Since I'm only building a demo, I'm going to use the <u>t1.micro</u> type. This has 7GB instance storage a single virtual core and enough memory for this purpose. The CPU usage is also free (Storage and unused elastic IPs still cost).<br />
<br />
<ul>
<li>size: t1.micro</li>
</ul>
<div>
<br />
So, here we go, with the CLI tools:</div>
<div>
<br /></div>
<pre class="brush:bash ; title : 'creating the first instance' ; highlight: 1">thor ec2:instance create --name puppet --type t1.micro --image ami-b71078de --key <mykeyname> --securitygroup default
task: ec2:instance:create --image ami-b71078de --name puppet
id = i-d8c912bb
</pre>
</div>
<div>
<br /></div>
<div>
That's actually pretty.... anti-climactic. I've got a convention that each task echoes the required arguments back as it is invoked. That way, as the tasks are composed into bigger tasks, you can see what's going on inside while it runs.<br />
<br />
All this one seemed to do was return an instance id. To see what's going on, I can request the status of the instance:<br />
<br />
<pre class="brush:bash ; title : 'instance status' ; highlight: 1">thor ec2:instance status --id i-d8c912bb
pending
</pre>
<br />
Since I'm impatient, I do that a few more times and after about 30 seconds it changes to this:
<br />
<br />
<pre class="brush:bash ; title : 'instance status: running' ; highlight: 1">thor ec2:instance status --id i-d8c912bb
running
</pre>
</div>
<div>
<br /></div>
<div>
I want to log in, but so far all I know is the instance ID. I can ask for the hostname.<br />
<br />
<pre class="brush:bash ; title : 'get instance external hostname'; highlight: 1">thor ec2:instance hostname --id i-d8c912bb
ec2-23-22-234-113.compute-1.amazonaws.com
</pre>
<br />
And with that I should be able to log in via SSH using my private key:
<br />
<br />
<pre class="brush:bash ; title : 'log in the first time'; highlight: 1">ssh -i ~/.ssh/<mykeyfile;>.pem ec2-user@ec2-23-22-234-113.compute-1.amazonaws.com
The authenticity of host 'ec2-23-22-234-113.compute-1.amazonaws.com (23.22.234.113)' can't be established.
RSA key fingerprint is 64:ec:6d:7d:af:ae:9a:70:78:0d:02:28:f1:c3:45:50.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added 'ec2-23-22-234-113.compute-1.amazonaws.com,23.22.234.113' (RSA) to the list of known hosts.
</pre>
</div>
<br />
It looks like I generated and saved my key pair right, and specified it correctly when creating the instance.<br />
<br />
The Fedora instances don't use the <i>root</i> user as the primary remote login. Instead, there's an <i>ec2-user</i> account which has <u>sudo ALL:ALL</u> permissions. That is, the ec2-user account can use <i>sudo</i> without providing a password. This really just gives you a little separation and forces you to <b>think</b> before you take some action as root.<br />
<br />
<h2>
Getting a Static IP Address</h2>
<div>
<br />
Now I have a host running and I can get into it, but the hostname is some long abstract string in the EC2 <i>amazonaws.com</i> domain. I want MY name on it. I also want to be able to reboot the host and have it get the same IP address and name. Well, it's not quite that simple.</div>
<div>
<br /></div>
<div>
Amazon EC2 has a curious and wonderful feature. Each running instance actually has two IP addresses associated with it. One is the internal IP address (the one configured in eth0). But that's in an RFC 1918 private network space. You can't route it. You can't reach it. You could even have a duplicate inside your corporate or home network.</div>
<div>
<br /></div>
<div>
The second address is an external IP address and this is the one you can see and can route to. Amazon works some router table magic at the network border to establish the connection between the internal and external addresses. What this means is that EC2 can change your external IP address without doing a thing to the host behind it. This is where Elastic IP addresses come in.</div>
<div>
<br /></div>
<div>
As with all of these things, you can do it from the web interface, but since I'm trying to automate things, I've made a set of tasks to manipulate the elastic IPs. I'm lazy and there's no other kind of IP in EC2 that I can change, so the tasks are in the <i>ec2:ip</i> namespace.</div>
<div>
<br /></div>
<div>
Creating a new IP is pretty much what you'd expect. You're not allowed to specify anything about it so it's as simple as can be:</div>
<div>
<br />
<pre class="brush:bash ; title : 'reserve an elastic IP' ; highlight: 1">thor ec2:ip create
task: ec2:ip:create
184.72.228.220
</pre>
<br />
Once again, not very exciting. Since each IP must be unique, the address itself serves as an ID. An address isn't very useful until it's associated with a running instance. The <i>ipaddress</i> task can retrieve the IP address of an instance. It can also set the external IP address (the address must be an allocated Elastic IP).<br />
<br /></div>
<div>
<pre class="brush:bash ; title: 'assign the IP adddress to an instance' ; highlight: 1">thor ec2:instance ipaddress 184.72.228.220 --id i-d8c912bb
task: ec2:instance:ipaddress 184.72.228.220
</pre>
<br /></div>
<div>
You can get the status and more information about an instance. You can also request the status using the instance name rather than the ID. For objects which have an ID and a name, you can query using either one, but you must specify it with an argument. For objects like the IP address which do not have a name, the ID is the first argument of any query.
<br />
<br />
<pre class="brush:bash ; title: 'getting instance status by name' ; highlight: 1">thor ec2:instance info --name puppet --verbose
EC2 Instance: i-d8c912bb (puppet)
DNS Name: ec2-184-72-228-220.compute-1.amazonaws.com
IP Address: 184.72.228.220
Status: running
Image: ami-b71078de
Platform:
Private IP: 10.212.234.234
Private Hostname: ip-10-212-234-234.ec2.internal
</pre>
<br />
<h2>
And now for something completely different: Route53 and DNS</h2>
</div>
<div>
I now have a running host with the operating system and architecture I want. It has a fixed address. But it has a really funny domain name.</div>
<div>
<br /></div>
<div>
When I created my Route53 zones, I split them in two. <i>infra.example.org</i> will contain my service hosts. <i>app.example.org</i> will contain the application CNAME records. The broker will only have permission to change the application zone. It won't be able to damage the infrastructure either through a compromise or a bug.</div>
<div>
<br /></div>
<div>
I'm going to call the puppet master puppet.infra.example.org. It will have the IP address I was granted above.</div>
<div>
<br /></div>
<div>
All of the previous tasks were in the <code>ec2:</code> namespace. Route53 is actually a different service within AWS, so it gets its own namespace.</div>
<div>
<br /></div>
<div>
An IP address record has four components:</div>
<div>
<br /></div>
<div>
<ul>
<li>type</li>
<li>name</li>
<li>value</li>
<li>ttl (time to live, in seconds)</li>
</ul>
<div>
<br /></div>
<div>
All of the infrastructure records will be A (address) records. The TTL has a regular default and there's no reason generally to override it. The value of an A record is an IP address.</div>
</div>
<div>
<br /></div>
<div>
The name in an A record is a Fully Qualified Domain Name (FQDN). It has both the domain suffix and the hostname and any sub-domain parts. To save some trouble parsing, the <code>route53:record:create</code> task expects the zone first, and the host part next as a separate argument. The last two arguments are the type and value.</div>
<div>
<br /></div>
<div>
<pre class="brush:bash; title: 'create the puppet A record'; highlight: 1">thor route53:record create infra.example.org puppet a 184.72.228.220
task: route53:record:create infra.example.org puppet a 184.72.228.220
</pre>
<br /></div>
<div>
Also pretty anti-climactic. This time though there will be an external effect.<br />
<br />
First, I can list the contents of the <i>infra.example.org</i> zone from Route53. Then I can also query the A record from DNS, though this may take some time to be available.<br />
<br />
<pre class="brush:bash ; title: 'view an A record from Route53' ; highlight: 1">thor route53:record:get infra.example.org puppet A
task: route53:record:get infra.example.org puppet A
puppet.infra.example.org. A
184.72.228.220
</pre>
<br />
And the same when viewed with <code>host</code>:
<br />
<br />
<pre>host puppet.infra.example.org
puppet.infra.example.org has address 184.72.228.220
</pre>
<br />
The SOA records for AWS Route53 have a TTL of 900 seconds (15 minutes). When you add or remove a record from a zone, you also cause an update to the SOA record serial number. Between you and Amazon there are almost certainly one or more caching nameservers and they will only refresh their cache when the SOA TTL expires. So you could experience a delay of up to 15 minutes from the time that you create a new record in a zone and when it resolves. I'm hoping this doesn't hold true for individual records, because it's going to cause problems for OpenShift.<br />
<br />
You can check the TTL of the SOA record by requesting the record directly using dig:<br />
<br /></div>
<div>
<br />
<pre class="brush:bash ; title : 'view zone SOA (and TTL)' ; highlight: 1">dig infra.example.org soa
; <<>> DiG 9.9.2-rl.028.23-P2-RedHat-9.9.2-10.P2.fc18 <<>> infra.example.org soa
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 60006
;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 1
;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 4096
;; QUESTION SECTION:
;infra.example.org. IN SOA
;; ANSWER SECTION:
infra.example.org. 900 IN SOA ns-1450.awsdns-53.org. awsdns-hostmaster.amazon.com. 1 7200 900 1209600 86400
;; Query time: 222 msec
;; SERVER: 172.30.42.65#53(172.30.42.65)
;; WHEN: Wed May 29 18:46:46 2013
;; MSG SIZE rcvd: 130
</pre>
</div>
<br />
The '900' on the first line of the answer section is the record TTL.<br />
<br />
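One way to check a freshly created record without waiting out an intermediate cache is to ask one of the zone's authoritative nameservers directly (the server name here is the one from the SOA answer above):<br />
<br />
<pre class="brush:bash ; title: 'query the authoritative nameserver directly' ; highlight: 1">dig @ns-1450.awsdns-53.org puppet.infra.example.org a +short
</pre>
<br />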
<h2>
Wrapping it all up.</h2>
<div>
<br /></div>
<div>
The beauty of Thor is that you can take each of the tasks defined above and compose them into more complex tasks. You can invoke each task individually from the command line or you can invoke the composed task and observe the process.</div>
<div>
<br /></div>
<div>
Because this task uses several others from both EC2 and Route53, I put it under a different namespace. All of the specific composed tasks will go in the <code>origin:</code> namespace.</div>
<div>
<br /></div>
<div>
The composed task is called <code>origin:baseinstance.</code> At the top I know the fully qualified domain name of the host, the image and securitygroups that I want to use to create the instance. Since I already have the puppet master this one will be the broker.</div>
<div>
<br /></div>
<div>
<ul>
<li>hostname: broker.infra.example.org</li>
<li>image: ami-b71078de</li>
<li>instance type: t1.micro</li>
<li>securitygroups: default, broker</li>
<li>key pair name: <mykeypair></li>
</ul>
</div>
<div>
<br />
<pre class="brush:bash ; title : 'create host, IP, DNS all in one go' ; highlight: 1">thor origin:baseinstance broker --hostname broker.infra.example.org --image ami-b71078de --type t1.micro --keypair <mykeypair> --securitygroup default broker
task: origin:baseinstance broker
task: ec2:ip:create
184.73.182.10
task: route53:zone:contains broker.infra.example.org
Z1PLM62Y00LCIN infra.example.org.
task: route53:record:create infra.example.org. broker A 184.73.182.10
- image id: ami-b71078de
task: ec2:instance:create ami-b71078de broker
id = i-19b1f576
task: remote:available ec2-54-226-116-229.compute-1.amazonaws.com
task: ec2:ip:associate 184.73.182.10 i-19b1f576
</pre>
<br /></div>
<div>
This process takes about two minutes. If you add <code>--verbose</code> you can see more of what is happening. There is a delay waiting for the A record creation to sync so that you don't accidentally create negative cache records which can slow propagation. Also you can see the <code>remote:available</code> task which polls a host for SSH login access. This allows time for the instance to be created, start running and reach multi-user network state.<br />
<br />
<pre class="brush:bash ; title : 'login with fully-qualified domain name' ; highlight: 1">ssh ec2-user@broker.infra.example.org
The authenticity of host 'broker.infra.example.org (184.73.182.10)' can't be established.
RSA key fingerprint is 8f:db:46:25:bf:19:2e:47:f5:f4:4a:23:a5:98:e3:5c.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added 'broker.infra.example.org,184.73.182.10' (RSA) to the list of known hosts.
Last login: Thu May 30 11:37:08 2013 from 66.187.233.206
</pre>
<br />
I will duplicate this process for the data and message servers, and for one node to begin.<br />
My tier of AWS only allows 5 Elastic IP addresses, so I'm at my limit. For a real production setup, only the broker, nodes and possibly the puppet master require fixed IP addresses and public DNS. The datastore and message servers could use dynamic addresses, but then they will require some tweaking on restart. I'm sure Amazon will give you more IP addresses for money, but I haven't looked into it.<br />
<br />
<h2>
Summary</h2>
</div>
<div>
<br /></div>
<div>
There's a lot packed into this post:</div>
<div>
<ul>
<li>Select an image to use as a base</li>
<li>Manage IP addresses</li>
<li>Bind IP addresses to running instances</li>
<li>Create a running instance.</li>
</ul>
<div>
All of this can be done with the AWS console. The ec2 and route53 tasks just make it a little easier, and the origin:baseinstance task wraps it all up so that creating new bare hosts is a single step.</div>
</div>
<div>
<br /></div>
<div>
In the next post I'll establish the puppet master service on the puppet server and install a puppet agent on each of the other infrastructure hosts. From then on, all of the service management will happen in Puppet and we can let EC2 fade into the background.</div>
<div>
<br /></div>
<h2>
References</h2>
<div>
<ul>
<li><a href="http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/concepts.html">EC2 Documentation</a></li>
<ul>
<li><a href="http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/AMIs.html">Image (AMI)</a></li>
<li><a href="http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/Instances.html">Instance</a></li>
</ul>
<li><a href="http://docs.aws.amazon.com/Route53/latest/DeveloperGuide/HowDoesRoute53Work.html">Route53 Documentation</a></li>
<ul>
<li><a href="http://docs.aws.amazon.com/Route53/latest/DeveloperGuide/AboutHostedZones.html">Zone</a></li>
<li><a href="http://docs.aws.amazon.com/Route53/latest/DeveloperGuide/AboutRRS.html">Record</a></li>
</ul>
<li><a href="http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/AccessingInstances.html">Access via SSH</a></li>
</ul>
</div>
markllamahttp://www.blogger.com/profile/14193184544557876514noreply@blogger.com7tag:blogger.com,1999:blog-5022186007695457923.post-52373475706776499602013-05-28T09:21:00.001-07:002013-05-28T09:21:57.294-07:00OpenShift on AWS EC2, Part 3: Getting In and Out (securitygroups)In the previous two posts, I talked about tools to manage AWS EC2 with a CLI toolset, and preparing AWS Route53 so that the OpenShift broker will be able to publish new applications. There is one more facet of EC2 that needs to be addressed before trying to start the instances which will host the OpenShift service components.<br />
<br />
AWS EC2 provides (enforces?) network port filtering. The filter rule sets are called <i><a href="https://console.aws.amazon.com/ec2/home?region=us-east-1#s=SecurityGroups">securitygroups</a></i>. AWS also offers two forms of EC2, "classic", and "VPC" (virtual private cloud). Managing securitygroups for classic and VPC are a little different. I'm going to present securitygroups in EC2-Classic. If you're going to use EC2-VPC, you'll need to read the Amazon documentation and adapt your processes to the VPC behaviors. Also note that securitygroups have a scope. They can be applied only in the region in which they are defined.<br />
<br />
In EC2-Classic you must associate all of the securitygroups with a new instance when you launch it (create it from an image). You cannot change the set of securitygroups associated with an instance later. You can change the rulesets in the securitygroups, and the new rules will be applied immediately to all of the members of the securitygroup.<br />
<br />
Amazon provides a default securitygroup which basically restricts all network traffic to the members (but not *between* members). To make OpenShift work we will need a set of security groups which allow communications between the OpenShift Broker and the back-end services, and between the broker and nodes (through some form of messaging). We will also need to allow external access to the OpenShift broker (for control) and to the nodes (for user access to the applications).<br />
<br />
The creation of the securitygroups probably does not need to be automated. The securitygroups will be created and the rulesets defined only once for a given OpenShift service. The web interface is probably appropriate for this.<br />
<br />
Since we'll be creating the instances with the CLI, it will be necessary to be able to list, examine, and apply the securitygroups to new instances there as well.<br />
<br />
<b>NOTE: These are not the security settings you are looking for.</b><br />
<b><br /></b>
The securitygroups and rulesets shown here are designed to demonstrate the securitygroup features and the user interface used to manage them. They are not designed with an eye to the best possible function and security for your service. You must look at your service design and requirements to create the best group and rulesets for your service.<br />
<br />
Most people focus on the inbound (ingress) filtering rules. I'm going to go with that. I won't be defining any outbound (egress) rule sets.<br />
<br />
I expect to need a different group for each type of host:<br />
<br />
<ul>
<li>OpenShift broker</li>
<li>OpenShift node</li>
<li>datastore</li>
<li>message broker</li>
<li>puppetmaster</li>
</ul>
<div>
<br /></div>
<div>
In addition I'm going to manage the service hosts with Puppet using a puppetmaster host. Each of the service hosts will be a puppet client. I don't think the puppet agent needs any special rules so I only have one additional securitygroup.</div>
<div>
<br /></div>
<div>
If I also planned to use an external authentication service on the broker, I would need a securitygroup for that. I could also extend this set to include build and test servers for development of OpenShift itself.</div>
<div>
<br /></div>
<h2>
Defining Securitygroups</h2>
<div>
Each of the groups below has only a single rule. To be rigorous I could add the SSH (22/TCP) rule to the node securitygroup. It is actually required for the operation of the node, not just for administrative remote access.<br />
<br />
<table border="2">
<tbody>
<tr>
<th>securitygroup</th>
<th>service</th>
<th>port/proto</th>
<th>source</th>
<th>comments</th>
</tr>
<tr>
<td>default</td>
<td>SSH</td>
<td>22/TCP</td>
<td>OpenShift Ops</td>
<td>remote access and control</td>
</tr>
<tr>
<td>puppetmaster</td>
<td>puppetmaster</td>
<td>8140/TCP</td>
<td>all managed hosts</td>
<td>configuration management</td>
</tr>
<tr>
<td>datastore</td>
<td>mongodb</td>
<td>27017/TCP</td>
<td>OpenShift Broker Hosts</td>
<td>NoSQL DB</td>
</tr>
<tr>
<td>messagebroker</td>
<td>activemq/stomp</td>
<td>61613/TCP</td>
<td>OpenShift broker and node hosts</td>
<td>carries MCollective</td>
</tr>
<tr>
<td>broker</td>
<td>httpd (apache2)</td>
<td>80/TCP, 443/TCP</td>
<td>OpenShift Ops and Users (unrestricted)</td>
<td>Ruby on Rails and Passenger</td>
</tr>
<tr>
<td rowspan="3">node</td>
<td>httpd (apache2)</td>
<td>80/TCP, 443/TCP</td>
<td>OpenShift Application Users (unrestricted)</td>
<td>HTTP routing</td>
</tr>
<tr>
<td>Web Sockets</td>
<td>8000/TCP, 8443/TCP</td>
<td>OpenShift App user</td>
<td>web sockets</td>
</tr>
<tr>
<td>SSH</td>
<td>22/TCP</td>
<td>OpenShift App Users (unrestricted)</td>
<td>shell and app control</td>
</tr>
</tbody></table>
</div>
<br />
<br />
Populating each securitygroup is a two step process. First create the empty security group. Then add the rules to the group. At that point, the group is ready to be applied to new instances.<br />
<br />
<h2>
Creating a Securitygroup</h2>
<div>
Each security group starts with a name and an optional description string. The restrictions on the names differ between EC2-Classic and EC2-VPC securitygroups. See the Amazon documentation for the differences. Simple upper/lower case strings with no white space are allowed in both. The descriptions are more freeform.</div>
<div>
<br /></div>
<div>
You can add new securitygroups on the <a href="https://console.aws.amazon.com/ec2/home?region=us-east-1#s=SecurityGroups">AWS EC2 console page</a>. Select the "Security Groups" tab on the left side and click "Create Security Group". Fill in the name and description fields, make sure that the VPC selector indicates "No VPC" and click "Yes, Create".</div>
<div>
<br /></div>
<div class="separator" style="clear: both; text-align: center;">
<a href="http://3.bp.blogspot.com/-sT_84XYaUCw/UaQAt5oPZ8I/AAAAAAAAB0A/uOvfbx8lzXQ/s1600/ec2_new_security_group.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="264" src="http://3.bp.blogspot.com/-sT_84XYaUCw/UaQAt5oPZ8I/AAAAAAAAB0A/uOvfbx8lzXQ/s640/ec2_new_security_group.png" width="640" /></a></div>
<div>
<br /></div>
<h2>
Adding Rulesets</h2>
<div>
Securitygroup rulesets are one of the more complex elements in EC2. When using the web interface, Amazon provides a set of pre-defined rules for things like HTTP and SSH and common database connections. You should use them when they're appropriate. The web interface also allows you to create custom rulesets. <br />
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="http://3.bp.blogspot.com/-iQ8GhSfz8ug/UaTCSo03wwI/AAAAAAAAB0Q/UsnaqvgzqzA/s1600/ec2_securitygroup_new_rule.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="275" src="http://3.bp.blogspot.com/-iQ8GhSfz8ug/UaTCSo03wwI/AAAAAAAAB0Q/UsnaqvgzqzA/s640/ec2_securitygroup_new_rule.png" width="640" /></a></div>
<br />
There are several things to note about this display. The default group has three mandatory rules (blue and white bars in the lower right). These allow all of the members of the group unrestricted access to each other.<br />
<br />
I'm adding the SSH rule which allows inbound port 22 connections. I'm leaving the source as the default 0.0.0.0/0. This is the IPv4 notation for "everything", so there will be no restrictions on the source of inbound SSH connections. If you want to restrict SSH access so that connections come only from your corporate network, you can set the exit address space for your company there.<br />
<br />
Since the members of the default group have unrestricted access to each other and since I'm going to apply the default group to all of my instances, it turns out that I only need special rules for access to hosts from the outside. I need to add the SSH rule above, and I need to allow web access to the broker and node hosts. I am going to create these as distinct groups because I can't change the assigned groups for an instance after it is launched. I'd like the ability to restrict access to the broker later.<br />
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="http://1.bp.blogspot.com/-4N5Sy-RrOlI/UaTVjN_PDVI/AAAAAAAAB0g/_ocfCXSm6U4/s1600/ec2_securitygroup_done.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="286" src="http://1.bp.blogspot.com/-4N5Sy-RrOlI/UaTVjN_PDVI/AAAAAAAAB0g/_ocfCXSm6U4/s640/ec2_securitygroup_done.png" width="640" /></a></div>
<br />
<br />
If I were to apply rigorous security to this setup, I would avoid using the default group. Instead I would create a distinct group for each service component. Then I would add rulesets which allow only the required communications. This would decrease the risk that a compromise of one host would grant access to the rest of the service hosts.<br />
<br />
Since it's a one-time task, I created both of my securitygroups and rulesets using the web interface. I have written Thor tasks to create and populate securitygroups:<br />
<br />
<pre class="brush: bash ; title: 'securitygroup tasks'; highlight: 1 "> thor help ec2:securitygroup
Tasks:
thor ec2:securitygroup:create NAME # create a new se...
thor ec2:securitygroup:delete # delete the secu...
thor ec2:securitygroup:help [TASK] # Describe availa...
thor ec2:securitygroup:info # retrieve and re...
thor ec2:securitygroup:list # list the availa...
thor ec2:securitygroup:rule:add PROTOCOL PORTS [SOURCES] # add a permissio...
thor ec2:securitygroup:rules # list the rules ...
Options:
[--verbose]
</pre>
<br />
The list of tasks is incomplete, as I have not needed to change or delete rulesets. If I find that I need those tasks, I'll add them.<br />
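<br />
In practice the two-step process from above maps onto the <code>create</code> and <code>rule:add</code> tasks. I haven't shown their exact argument formats here, so ask Thor before relying on them:<br />
<br />
<pre class="brush:bash ; title: 'securitygroup task usage' ; highlight: 1">thor ec2:securitygroup:list              # what groups exist in this region
thor help ec2:securitygroup:rule:add     # exact PROTOCOL/PORTS/SOURCES format
</pre>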
<br />
<h2>
Next Up</h2>
This is everything that must be done before beginning to create running instances for my OpenShift service. In the next post I'll select a base image to use for my host instances and begin creating running machines.<br />
<br /></div>
<h2>
References</h2>
<br />
<ul>
<li> <a href="http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/using-network-security.html">AWS Network Security (securitygroups)</a> http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/using-network-security.html</li>
</ul>
markllamahttp://www.blogger.com/profile/14193184544557876514noreply@blogger.com0tag:blogger.com,1999:blog-5022186007695457923.post-52113568045898913112013-05-26T18:44:00.000-07:002013-05-26T18:55:16.461-07:00OpenShift on AWS EC2, Part 2: Being Seen (DNS)OpenShift is, at least in part, a publication system. Developers create applications and OpenShift tells the world about them. This means that the very first thing you need to think about when you're considering creating an OpenShift service is "what do I call it?"<br />
<br />
I actually created two zones when setting up the DNS for OpenShift. The servers reside in one zone, and the user applications in another. The broker service will be making updates to the application zone. It doesn't seem like a good idea to have the server hostnames in the same zone where a bug or intrusion could alter or delete them. Something like this will do.<br />
<br />
<ul>
<li><code>infra.example.org</code> - contains the server hostnames</li>
<li><code>app.example.org</code> - contains the application records</li>
</ul>
<div>
<br /></div>
<h2>
Picking a Domain Name (and a Registrar)</h2>
<br />
In most cases your choice is going to be constrained by what domains you own or have access to. You may need (as I did) to purchase a domain from a domain registrar. Or you will have to have your corporate IT department delegate a domain for you (whether they run it or you do).<br />
<br />
When you register or delegate a domain your domain registrar will request a list of name servers which will be serving the content of your domain. Route53 won't tell you the nameservers until you tell them what domain they'll be serving for you. That means that creating a domain, if you don't have one, is a 3 step exchange:<br />
<br />
<ol>
<li>Request domain from a registrar</li>
<li>Tell Route53 to serve the domain for you</li>
<li>Tell your registrar which Route53 nameservers will be providing your domain</li>
</ol>
<div>
<br /></div>
<div>
These steps will happen so rarely that I haven't bothered to script them. I just use the web interface for each step.<br />
<br />
NOTE: there are technical differences between a <u>zone</u> and a <u>domain</u> but I'm going to treat them as synonyms for this process. When you're registering, it's called a domain. When you're going to change the contents it's called a zone.</div>
<div>
<br /></div>
<div>
Each registrar will have a different means for you to set your domain's nameserver records. You'll have to look them up yourself. If you're getting a domain delegated from your corporate IT department you'll have to give them the list of Route53 nameservers so that they can install the "<i>glue records"</i> into their service.<br />
<br />
So, pick your Registrar, search for an available domain, request, register, and pay. Then head over to the <a href="https://console.aws.amazon.com/route53/home">AWS Route53 console</a>.<br />
<br />
<h2>
Adding a zone to Route53</h2>
<div>
<br /></div>
<div>
On the web interface, click "Create Hosted Zone" in the top tool bar. You'll see this dialog on the right side.</div>
<div>
<br /></div>
<div class="separator" style="clear: both; text-align: center;">
<a href="http://3.bp.blogspot.com/-QQSw4iFd0tE/UaJRpn0qIEI/AAAAAAAABzk/iPpQbDjG2c4/s1600/route53_create_zone.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="268" src="http://3.bp.blogspot.com/-QQSw4iFd0tE/UaJRpn0qIEI/AAAAAAAABzk/iPpQbDjG2c4/s640/route53_create_zone.png" width="640" /></a></div>
<div>
<br /></div>
<div>
Fill in the values for your new domain and a comment, if you wish. Then click "Create Hosted Zone" at the bottom of the dialog and Route53 will create your zone and assign a set of nameservers.</div>
<div>
<br /></div>
<div class="separator" style="clear: both; text-align: center;">
<a href="http://1.bp.blogspot.com/-qRJyHvXJOEU/UaJbiVu6tmI/AAAAAAAABzw/Uzf_N74WOHw/s1600/route53_new_zone.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="262" src="http://1.bp.blogspot.com/-qRJyHvXJOEU/UaJbiVu6tmI/AAAAAAAABzw/Uzf_N74WOHw/s640/route53_new_zone.png" width="640" /></a></div>
<br />
Make a note of the "Delegation Set". This is the set of nameservers which you need to provide to your domain registrar. The registrar will provide some place to enter the nameserver list and then they will add the glue records to the top-level domain.<br />
<br />
Make a note as well of the "Hosted Zone ID". That's what you will use to select the zone to update when you send requests to AWS Route53.<br />
<br />
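A quick way to confirm that the registrar's delegation has taken effect is to query the zone's NS records from the shell. This is only a sketch using the placeholder zone from this post; your Route53 nameservers will differ:<br />
<br />
<pre class="brush: bash; title: 'check delegation'; highlight: 1">host -t ns app.example.org
app.example.org name server ns-131.awsdns-16.com.
app.example.org name server ns-860.awsdns-43.net.
app.example.org name server ns-2023.awsdns-60.co.uk.
app.example.org name server ns-1076.awsdns-06.org.
</pre>
<br />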
When the domain registrar completes adding the Route53 nameservers it's time to come back to the thor CLI tools installed in <a href="http://cloud-mechanic.blogspot.com/2013/05/openshift-on-aws-ec2-part-1-from-wheels.html">part one</a>.<br />
<br />
<h2>
Viewing the Route53 DNS information</h2>
<div>
<br /></div>
<div>
You certainly can view the DNS information on the Route53 console. If you've set up the AWS CLI tools indicated <a href="http://cloud-mechanic.blogspot.com/2013/05/openshift-on-aws-ec2-part-1-from-wheels.html">in the previous post</a> you can also view them on the CLI. <br />
<br />
<b>NOTE: </b>If you haven't followed the previous post, you should do so before you continue here.<br />
<br />
First list the zones you have registered.<br />
<br /></div>
<div>
<pre class="brush: bash; title: 'AWS zone information': highlight 1">thor route53:zone:list
task: route53:zone:list
id: <YOURZONEID> name: app.example.org. records: 3
</pre>
<br /></div>
<div>
Now you can list the records in the zone (indicating the zone by name)<br />
<br /></div>
<div>
<pre class="brush: bash; title: 'AWS zone initial records': highlight: 1">thor route53:record:list app.example.org
task: route53:record:list app.example.org
looking for zone id <YOURZONEID>
app.example.org. NS
ns-131.awsdns-16.com.
ns-860.awsdns-43.net.
ns-2023.awsdns-60.co.uk.
ns-1076.awsdns-06.org.
app.example.org. SOA
ns-131.awsdns-16.com. awsdns-hostmaster.amazon.com. 1 7200 900 1209600 86400
</pre>
</div>
<div>
<br /></div>
<h2>
Adding a DNS Record</h2>
<div>
<br /></div>
<div>
The real goal in all of this is that the OpenShift broker must be able to add and remove records for applications. OpenShift uses the <i>aws-sdk</i> rubygem. The Thor tasks also use that gem. You can call them from the command line or use them to compose more complex operations.</div>
<div>
<br /></div>
<div>
OpenShift currently uses CNAME records to publish applications rather than A records. This is largely to allow for rapid re-naming or re-numbering of nodes within AWS. The use of CNAME records (which are aliases to another FQDN, or <i>fully qualified domain name</i>) means that the node which hosts the applications can be renamed or renumbered without the need to update every DNS record for every application. If bulk updates of DNS were not expensive, I believe that OpenShift could use A records, though that would require significant recoding.</div>
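<div>
<br /></div>
<div>
As a sketch of why this matters (with hypothetical names, using the thor tasks from this series): several application records can alias the same node, so moving or renumbering that node means touching only the node's own record, not every application record.</div>
<div>
<pre class="brush: bash; title: 'several apps aliasing one node (hypothetical)'">thor route53:record:create app.example.org myapp-demo CNAME node1.infra.example.org
thor route53:record:create app.example.org otherapp-demo CNAME node1.infra.example.org
# If node1.infra.example.org later points at a different instance, only that one
# record has to change; myapp-demo and otherapp-demo keep resolving correctly.
</pre>
</div>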
<div>
<br /></div>
<div>
To verify your DNS domain has been properly configured, add a CNAME record. Thor provides a standard <i>help</i> option for each task.</div>
<div>
<br /></div>
<div>
<br />
<pre class="brush: bash ; title: 'help for route53:record:create'; highlight: 1">thor help route53:record:create
Usage:
  thor route53:record:create ZONE NAME TYPE VALUE
Options:
  [--ttl=N]
                  # Default: 300
  [--verbose]
  [--wait]
create a new resource record
</pre>
<br />
From this you can craft a command. This example includes the <i>--wait</i> and <i>--verbose</i> options so that you can observe the process. Without the <i>--wait</i> option, the task will complete and return, but there will be a propagation delay before the name will resolve. With the <i>--wait</i> option, the task polls the Route53 service until it reports that the DNS services have synched.<br />
<br /></div>
<div>
<pre class="brush: bash; title 'add a CNAME record'; highlight: 1">thor route53:record create app.example.org test1 CNAME test2.app.example.org --verbose --wait
task: route53:record:create app.example.org test1 CNAME test2.infra.example.org
update record = {:comment=>"add CNAME record test1.app.example.org", :changes=>[{:action=>"CREATE", :resource_record_set=>{:name=>"test1.app.example.org", :type=>"CNAME", :ttl=>300, :resource_records=>[{:value=>"test2.infra.example.org"}]}}]}
response = {:change_info=>{:id=>"/change/C2VQAFRSE6OXMY", :status=>"PENDING", :submitted_at=>2013-05-27 00:24:19 UTC, :comment=>"add CNAME record test1.app.example.org"}}
1) change id: /change/C2VQAFRSE6OXMY, status: UNKNOWN - sleeping 5
2) change id: /change/C2VQAFRSE6OXMY, status: PENDING - sleeping 5
3) change id: /change/C2VQAFRSE6OXMY, status: PENDING - sleeping 5
4) change id: /change/C2VQAFRSE6OXMY, status: PENDING - sleeping 5
5) change id: /change/C2VQAFRSE6OXMY, status: PENDING - sleeping 5
</pre>
</div>
<div>
<br /></div>
<div>
<br />
When this command completes the new record should resolve:<br />
<br />
<pre class="brush: bash ; title: 'check publication of new CNAME record'; highlight: 1">host -t cname test1.app.example.org
test1.app.example.org is an alias for test2.infra.example.org.
</pre>
Also, now if you list the zone records with <code>thor route53:record:list app.example.org</code> you'll see the new CNAME record.
<br />
<br />
<h2>
Deleting a DNS record</h2>
</div>
<div>
<br /></div>
<div>
Deleting a DNS record is nearly identical to adding one. To ensure that you are deleting the correct record, the delete task requires the same complete inputs as the create task.</div>
<div>
<br /></div>
<div>
<pre class="brush: bash ; title 'delete a DNS CNAME record'; highlight: 1">thor route53:record delete app.example.org test1 CNAME test2.infra.example.org --verbose --wait
task: route53:record:delete app.example.org CNAME test1
update record = {:comment=>"delete CNAME record test1.app.example.org", :changes=>[{:action=>"DELETE", :resource_record_set=>{:name=>"test1.app.example.org", :type=>"CNAME", :ttl=>300, :resource_records=>[{:value=>"test2.infra.example.org"}]}}]}
response = {:change_info=>{:id=>"/change/C3ORAEV7FTLPBJ", :status=>"PENDING", :submitted_at=>2013-05-27 00:58:25 UTC, :comment=>"delete CNAME record test1.app.example.org"}}
1) change id: /change/C3ORAEV7FTLPBJ, status: UNKNOWN - sleeping 5
2) change id: /change/C3ORAEV7FTLPBJ, status: PENDING - sleeping 5
3) change id: /change/C3ORAEV7FTLPBJ, status: PENDING - sleeping 5
4) change id: /change/C3ORAEV7FTLPBJ, status: PENDING - sleeping 5
5) change id: /change/C3ORAEV7FTLPBJ, status: PENDING - sleeping 5
</pre>
</div>
<div>
<br />
Again, with the <i>--verbose</i> and <i>--wait</i> options, the task will not complete until the DNS change has propagated. When it completes, the name will no longer resolve.
</div>
<div>
<br />
<pre class="brush:bash ; title: 'check that CNAME record is deleted' ; highlight: 1">host -t cname test1.app.example.org
Host test1.app.example.org not found: 3(NXDOMAIN)
</pre>
</div>
<br />
<h2>
Summary</h2>
<div>
Now that we've registered a domain, and arranged to have it served by Route53, we can add and remove names. When we configure OpenShift, it will be able to publish new application records.</div>
<div>
<br /></div>
<h2>
Next Time</h2>
<div>
We still have to create hosts to run the OpenShift service. On AWS that means creating <i>instances</i>, virtual machines in Amazon's cloud. Amazon applies some fairly restrictive network-level packet filtering. They use a feature called a <i>securitygroup</i> to define the filtering rules. In the next post, I'll discuss how to create and manage new securitygroups, and which groups we'll need to allow OpenShift to operate.</div>
<div>
<br /></div>
<h2>
Resources</h2>
</div>
<div>
<ul>
<li>AWS Route53 DNS management console - <a href="https://console.aws.amazon.com/route53/home">https://console.aws.amazon.com/route53/home</a></li>
<li>ICANN list of Accredited Registrars - <a href="http://www.icann.org/registrar-reports/accredited-list.html">http://www.icann.org/registrar-reports/accredited-list.html</a></li>
<li>DNS Zones - <a href="https://en.wikipedia.org/wiki/DNS_zone">https://en.wikipedia.org/wiki/DNS_zone</a></li>
</ul>
<div>
<br /></div>
</div>
markllamahttp://www.blogger.com/profile/14193184544557876514noreply@blogger.com0tag:blogger.com,1999:blog-5022186007695457923.post-64528934129848134672013-05-23T14:10:00.001-07:002013-05-23T14:13:17.340-07:00OpenShift on AWS EC2, Part 1: From the wheels upSomeone asked me recently how to build an <a href="https://github.com/openshift/origin-server">OpenShift Origin</a> service on<a href="https://aws.amazon.com/"> Amazon Web Services</a> EC2. My first thought was "easy, we do this all the time". I started going through what exists for our own testing, development and deployment. It clearly works, it's clearly the place to start, right? Just fire up a few instances, tweak the existing puppet configs and <i>zoom!</i> right?<br />
<br />
Then I started trying to figure out how to describe it and adapt it to general use, and I found myself adding more and more caveats, limitations and internal assumptions. It's grown organically to do what is needed, but what I have available isn't really designed for general use. Some of it I couldn't understand just from reading and observing (since I'm a hands-on, break-it-to-understand-it kind of guy). Time to start taking it apart so I can put it back together. When I can do that and it starts up when I turn the key, then I can claim to understand it.<br />
<br />
So I decided to go back to the fundamentals not of OpenShift, but of AWS EC2 itself.<br />
<br />
<h2>
Defining a Goal: Machines Ready To Eat</h2>
<br />
An OpenShift service consists of a number of component services. Ideally each component would have multiple instances for availability and scaling, but that's not required for initial setup. Only the OpenShift broker, console and nodes need to be exposed to the users.<br />
<br />
The host configuration is complex enough that even for a small service it is best to use a Configuration Management System (CMS) to configure and manage the system, but the CMS can't start work until the hosts exist and have network communications. The CMS itself must be installed and configured. Once the hosts exist and are bound together then the CMS can do the rest of the work and a clean boundary of control and access is established. This will later allow the bottom layer (establishing hosts and installing/configuring the CMS) to be replaced without affecting the actual service installation above.<br />
<br />
So the goal here is: create and connect hosts with a CMS installed using EC2. That's the base on which the OpenShift service will be built. If you run each of the component services on its own host using external DNS and authentication services, OpenShift requires a minimum of four hosts:<br />
<br />
<ul>
<li>OpenShift Broker</li>
<li>Data Store (mongodb)</li>
<li>Message Broker (activemq)</li>
<li>OpenShift Node</li>
</ul>
<div>
<br /></div>
<div>
Each of these can (theoretically, at least) be duplicated to provide high availability, but for now I'll start with one of each. The goal of this series of posts is to create the hosts on which these services will be installed. We won't come back to OpenShift itself until that's done.<br />
<br />
<h2>
AWS EC2: Getting the lay of the land</h2>
</div>
<div>
<br /></div>
<div>
If you're not familiar with AWS EC2, go check out <a href="https://aws.amazon.com/">https://aws.amazon.com/</a>. EC2 is the part of AWS which provides "virtual" hosts (for a fee, of course). There are free-to-try levels, but you are required to give a credit card to sign up, and you're very likely to start incurring charges for storage even if you stick to the "free" tier. Read, be informed, decide for yourself.</div>
<div>
<br />
<h3>
AWS without the "W"</h3>
<br /></div>
<div>
AWS presents a modern single-page web interface for all interactions, but I'm interested in command line or scripted interaction. Amazon does provide a REST protocol and has implemented libraries for a wide number of scripting languages. I'm using the <i><a href="https://aws.amazon.com/sdkforruby/">rubygem-aws-sdk</a></i> library (which is, surprisingly enough, written in Ruby) because I also want to use another Ruby tool called <i><a href="https://github.com/wycats/thor/wiki">Thor</a></i>. </div>
<div>
<br />
<h3>
Tasks and the Command Line Interface</h3>
<br /></div>
<div>
Thor is a ruby library which helps create really nice command line "tasks". The beauty of Thor is that you can use it both to define individual tasks and to compose those tasks into more complex task sequences. This allows you to test each step as a distinct CLI operation and also to debug only the step that fails when one inevitably does.</div>
<div>
<br /></div>
<div>
I'm going to use Thor and the aws-sdk to create a CLI interface to the AWS low level operations, and then compose them to create higher level tasks which, in the end, will leave me with a set of hosts ready to receive an OpenShift service.</div>
<div>
<br /></div>
<div>
I'm not going to try to create a comprehensive CLI interface to AWS. I'm only going to create the steps that I need to get this job done. A number of the steps will encapsulate operations which may seem trivial, but this will allow for better consistency and visibility of the operations. A primary goal is to have as little magic as possible. At the same time, I want to avoid overwhelming the user (me) with unnecessary detail when things are working as planned.</div>
<div>
<br /></div>
<div>
I'm not going to make you sit through the entire development process (which isn't complete). Instead I mean to show the tools that I've developed and use them to cleanly define the base on which an OpenShift service would sit.<br />
<br />
<div>
<h2>
</h2>
<h2>
AWS Setup</h2>
<br />
To work with AWS, you must have an established account. To use the REST API you need to have generated a set of access keys. To log into your EC2 instances you need to have generated an SSH key pair, placed the private key where your SSH client can find it (usually in $HOME/.ssh), and configured your SSH client to use that key when logging into EC2 instances (in $HOME/.ssh/config).<br />
<br />
<br />
<ul>
<li>AWS Access Keys</li>
<li>AWS SSH Key Pairs</li>
<li>SSH client configuration</li>
</ul>
<br />
<div>
You can learn about and generate both sets of keys on the AWS <a href="https://portal.aws.amazon.com/gp/aws/securityCredentials#access_credentials">Security Credentials</a> page<br />
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="http://3.bp.blogspot.com/-YsOxVVCmAB4/UZ6DSYxZ1AI/AAAAAAAABzU/QwpIVh15GUk/s1600/aws_access_credentials.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="353" src="http://3.bp.blogspot.com/-YsOxVVCmAB4/UZ6DSYxZ1AI/AAAAAAAABzU/QwpIVh15GUk/s640/aws_access_credentials.png" width="640" /></a></div>
<br /></div>
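<div>
<br /></div>
<div>
Once you have downloaded the private key for your key pair, the SSH client configuration piece might look something like the sketch below. It's only an illustration; the key file name is a placeholder for whatever you created in the console:</div>
<div>
<pre class="brush: bash; title: '${HOME}/.ssh/config (example)'"># Use the EC2 key pair and default user for Amazon-hosted instances
Host *.amazonaws.com
    User ec2-user
    IdentityFile ~/.ssh/my-ec2-keypair.pem
    IdentitiesOnly yes
</pre>
</div>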
<div>
<br /></div>
</div>
<br />
<h2>
Origin-Setup (really EC2 and SSH tools)</h2>
</div>
<div>
<br /></div>
<div>
The tool set is currently called <i>origin-setup</i> and it resides in a repository on Github. The name is a misnomer; there's not actually any OpenShift in most of it.</div>
<div>
<br /></div>
<div>
<ul>
<li>Github repo URL: <a href="https://github.com/markllama/origin-setup">https://github.com/markllama/origin-setup</a></li>
</ul>
<div>
<br /></div>
</div>
<div>
<h3>
Requirements</h3>
<br /></div>
<div>
The tasks are written in Ruby using the Thor library. They also require several other rubygems. All of them are available on Fedora 18 as RPMs.</div>
<div>
<br /></div>
<div>
<ul>
<li>ruby</li>
<li>rubygems</li>
<li>rubygem-thor</li>
<li>rubygem-aws-sdk</li>
<li>rubygem-parseconfig</li>
<li>rubygem-net-ssh</li>
<li>rubygem-net-scp</li>
</ul>
<div>
<br />
<h3>
Getting (and setting) the Bits</h3>
<br /></div>
<div>
Thor can be used to create stand-alone CLI commands, but I have not done that yet for these tasks. To use them you need to <code>cd</code> into the origin-setup directory and call thor directly. You will also need to set the RUBYLIB path to find a small helper library which manages the AWS authentication.<br />
<br />
<pre class="brush: bash">git clone https://github.com/markllama/origin-setup
cd origin-setup
export RUBYLIB=`pwd`/lib
thor list --all
</pre>
<br />
<h3>
AWS Again: configuring the toolset</h3>
<div>
<br />
The final step is to give the origin-setup toolset the information needed to communicate with the AWS REST interface.</div>
<div>
<br /></div>
<div>
<pre class="brush: bash; title: '${HOME}/.awscred'">AWSAccessKeyId=YOURKEYIDHERE
AWSSecretKey=YOURSECRETKEYHERE
AWSKeyPairName=YOURKEYPAIRNAMEHERE
RemoteUser=ec2-user
AWSEC2Type=t1.micro
</pre>
</div>
<br />
This file contains what is essentially the passwords to your AWS account. You should set the permissions on this file so that only you can read it and protect the contents as you would your credit card.<br />
<br />
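Locking the file down is a one-liner (assuming it lives at <code>${HOME}/.awscred</code> as shown above):<br />
<br />
<pre class="brush: bash; title: 'protect the AWS credentials file'; highlight: 1">chmod 600 ${HOME}/.awscred
</pre>
<br />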
The <i>RemoteUser</i> is the default user for SSH logins (F18+). For RHEL6 it would be <u>root</u>. The <i>AWSEC2Type</i> value defines the default instance "type" used when you create a new instance. The <i>t1.micro</i> instance type is small and it is in the free tier. You will need to choose a larger type for real use.<br />
<br />
<h3>
Turn the Key</h3>
<div>
You should be able to use the <i>thor</i> command to explore the list of available tasks. Thor allows the creation of <i>namespaces</i> to contain related tasks. Most of the important tasks to begin with are in the <i>ec2</i> namespace.<br />
<br />
You can see the available tasks with the <i>thor list</i> command:<br />
<br />
<pre class="brush: bash ; title: 'ec2 task namespace'; highlight: 1">thor list ec2 --all
ec2
---
thor ec2:image:create # Create a new imag...
thor ec2:image:delete # Delete an existin...
thor ec2:image:find TAGNAME # find the id of im...
thor ec2:image:info # retrieve informat...
thor ec2:image:list # list the availabl...
thor ec2:image:tag --tag=TAG # set or retrieve i...
thor ec2:instance:create --image=IMAGE --name=NAME # create a new EC2 ...
thor ec2:instance:delete # delete an EC2 ins...
thor ec2:instance:hostname # print the hostnam...
thor ec2:instance:info # get information a...
thor ec2:instance:ipaddress [IPADDR] # set or get the ex...
thor ec2:instance:list # list the set of r...
thor ec2:instance:private_hostname # print the interna...
thor ec2:instance:private_ipaddress # print the interna...
thor ec2:instance:rename --newname=NEWNAME # rename an EC2 ins...
thor ec2:instance:start # start an existing...
thor ec2:instance:status # get status of an ...
thor ec2:instance:stop # stop a running EC...
thor ec2:instance:tag --tag=TAG # set or retrieve i...
thor ec2:instance:wait # wait until an ins...
thor ec2:ip:associate IPADDR INSTANCE # associate and Ela...
thor ec2:ip:create # create a new elas...
thor ec2:ip:delete IPADDR # delete an elastic IP
thor ec2:ip:list # list the defined ...
thor ec2:securitygroup:create NAME # create a new secu...
thor ec2:securitygroup:delete # delete the securi...
thor ec2:securitygroup:info # retrieve and repo...
thor ec2:securitygroup:list # list the availabl...
thor ec2:securitygroup:rule:add PROTOCOL PORTS [SOURCES] # add a permission ...
thor ec2:snapshot:delete SNAPSHOT # delete the snapshot
thor ec2:snapshot:list # list the availabl...
thor ec2:volume:delete VOLUME # delete the volume
thor ec2:volume:list # list the availabl...
</pre>
<br />
<br />
It's time to see if you can talk to EC2. This first query requests a list of images produced by the Fedora hosted team:<br />
<br />
<pre class="brush: bash ; title: 'test AWS connectivity: list Fedora images'; highlight: 1">thor ec2:image list --name \*Fedora\* --owner 125523088429
ami-2509664c Fedora-x86_64-17-1-sda
ami-4b0b6422 Fedora-i386-17-1-sda
ami-6f640c06 Fedora-i386-18-20130521-sda
ami-b71078de Fedora-x86_64-18-20130521-sda
ami-d13758b8 Fedora-18-ec2-20130105-x86_64-sda
ami-dd3758b4 Fedora-18-ec2-20130105-i386-sda
ami-ed375884 Fedora-17-ec2-20120515-i386-sda
ami-fd375894 Fedora-17-ec2-20120515-x86_64-sda
</pre>
<div>
<br />
If instead you get a really long messy ruby error, then check the permissions and contents of your <i>~/.awscred</i> file.<br />
<br />
It's probably a good idea, before experimenting too much here, to get familiar with EC2 and Route53 using the web console a bit.<br />
<br />
Next post I'll establish the DNS zone in Route53 and show how to manage DNS records to prepare for my OpenShift service.<br />
<br /></div>
</div>
<h2>
References</h2>
</div>
</div>
<div>
<br /></div>
<div>
<ul>
<li><a href="https://console.aws.amazon.com/ec2/v2/home">AWS EC2 Console</a> - managing remote virtual machines</li>
<li><a href="https://console.aws.amazon.com/route53/home">AWS Route53 (DNS) Console</a> - managing DNS</li>
<li><a href="https://aws.amazon.com/sdkforruby/">rubygem-aws-sdk</a> - an implimentation of the AWS REST protocol in Ruby</li>
<li>SSH publickey - secure login without passwords</li>
<li><a href="https://github.com/wycats/thor/wiki">Thor</a> - A ruby gem to build command line interface "tasks"</li>
<li><a href="https://puppetlabs.com/puppet/puppet-open-source/">Puppet</a> - A popular Configuration Management System</li>
<li><a href="http://git-scm.com/">Git</a> - a popular Source Code Management system</li>
<li><a href="https://github.com/">Github</a> - a site for keeping Git repositories</li>
</ul>
</div>
<div>
<ul>
<li><a href="https://github.com/markllama/origin-setup">origin-setup</a> - a set of Thor tasks for managing AWS EC2 and Route53<br />With a goal of automating the creation of an OpenShift Origin service in EC2</li>
</ul>
</div>
markllamahttp://www.blogger.com/profile/14193184544557876514noreply@blogger.com6tag:blogger.com,1999:blog-5022186007695457923.post-90269949143521502142013-04-03T11:16:00.000-07:002013-04-03T13:45:21.537-07:00OpenShift Process Tools (for humans)<h2>
<span style="font-size: x-large;">Making Sausage: It ain't for the faint of stomach</span></h2>
Otto von Bismarck is known (among other things) for his <a href="http://www.brainyquote.com/quotes/quotes/o/ottovonbis161318.html">observations on law and sausage</a>. The observation could apply to software as well, but there are lots of us with the cast-iron stomachs needed to produce some really wonderful stuff. If you're undaunted, read on.<br />
<br />
I give a lot of attention to the parts running under an OpenShift service. I want people to be able to run their own service, to understand, configure, tune and diagnose problems with it. But I <b>REALLY</b> want people to understand that, if there's something they want it to do that it doesn't do yet, they don't have to wait for Red Hat to do it for them.<br />
<br />
I have a bit of a different position from most people with respect to OpenShift development. It came about by serendipity, but it suits me well and the management seem happy with it for now. I'm not actually in the product development hierarchy. I work on things mostly from the outside. I work to experience installing and configuring OpenShift as someone from the community, and I comment and report using the same channels the community members have. (I talk about that some in a <a href="http://bitmason.blogspot.com/2013/03/podcast-working-with-openshift-with.html">blog interview</a> with <a href="http://bitmason.blogspot.com/">Gordon Haff</a>)<br />
<br />
Red Hat has a vested interest in the success of OpenShift, but it defines that interest in terms that go beyond market penetration and adoption rates. Most of the developers currently work at Red Hat, but they are going out of their way to allow people to see not just the inner workings of the code, but of the process, and to form a community of contributors who help steer and shape what OpenShift becomes.<br />
<br />
Building a community is itself a process and there are learning steps so it's not all there yet, but you're invited already.<br />
<br />
<h2>
<span style="font-size: x-large;">Where's the Beef?</span></h2>
<div>
<br /></div>
<div>
When I started at Red Hat and began coding, one of the first things I asked was "where do I put my code". Coming from a background at proprietary software companies I was asking "where's the internal code repository?". I got a quizzical look for a second and then a reply: "Umm. <a href="http://github.com/">Github</a>?" like I'd just asked "Where's the bathroom" while standing at the mirror washing my hands. (Some people use bitbucket or sourceforge or a number of others)</div>
<div>
<br /></div>
<div>
<a href="http://github.com/openshift/origin-server">OpenShift has been on Github</a> for quite a while. It was one of the first big moves to bring OpenShift to the community (the fact that it made life easier for our folks was a warmly anticipated beneficial side effect). The hottest newest stuff is out there. It's not always pretty. If you look at the wiki and blog posts (especially from me) you'll find a number that are out of date because they contained hacks around warts and things have been changed or fixed (it's a good idea to check on the #openshift-dev channel on <a href="http://freenode.net/">freenode</a> if you're ever wondering about something). It's embarrassing sometimes to find something's gone stale, but the alternative is waiting or hiding things, and we're not doing that. (There are people working on the Official Documentation, but that's for Official Releases. We're talking bleeding edge here)</div>
<div>
<br /></div>
<div>
All of the developers (and by this, I include <i>you</i>) work by forking the repository, making changes on a branch, and then submitting a pull request to bring those changes back into the master repository. The pull requests get review and commentary and then get run through automated tests (discussed next), and when they're ready they are merged.</div>
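<div>
<br /></div>
<div>
If you've never worked this way, the mechanics are roughly as follows. This is only a sketch; substitute your own Github account for YOURUSER and pick whatever branch name fits your change:</div>
<div>
<pre class="brush: bash; title: 'fork-and-pull-request workflow (sketch)'"># work against your own fork of openshift/origin-server
git clone git@github.com:YOURUSER/origin-server.git
cd origin-server
git remote add upstream git://github.com/openshift/origin-server.git

# make the change on its own branch
git checkout -b my-fix
# ... edit, test, commit ...
git push origin my-fix
# then open a pull request against openshift/origin-server on Github
</pre>
</div>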
<div>
<br /></div>
<h2>
<span style="font-size: x-large;">Have it your way.</span></h2>
<div>
<br />
The newest move is the switch to using <a href="https://trello.com/">Trello</a> for planning. OpenShift has always done development using an <a href="https://en.wikipedia.org/wiki/Agile_software_development">Agile process</a> based on the <a href="http://www.scrumalliance.org/learn_about_scrum">Scrum framework</a>. The project <a href="https://trello.com/openshift">is now hosted on Trello</a> and you're invited to look and contribute.</div>
<div>
<br /></div>
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="float: left; margin-right: 1em; text-align: left;"><tbody>
<tr><td style="text-align: center;"><a href="http://1.bp.blogspot.com/-mfuZbSnXsM8/UVxtAbj9ZTI/AAAAAAAABsk/KHw4SdvtvXo/s1600/Trello_Broker_Board.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" height="371" src="http://1.bp.blogspot.com/-mfuZbSnXsM8/UVxtAbj9ZTI/AAAAAAAABsk/KHw4SdvtvXo/s640/Trello_Broker_Board.png" width="640" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">The OpenShift Origin Broker Scrum Board</td></tr>
</tbody></table>
<div>
<br /></div>
<div>
<br /></div>
<div>
Trello is free in the same way that Github is. To contribute, get an account and join the OpenShift organization. It's a web-based scrum task board system. New features are added to the board as "cards". The cards are used to define and track the tasks needed to complete the feature. They're moved along the board from inception to completion as the tasks are defined, filled in, assigned (or assumed) and completed. All of the planning and work happen in plain sight.</div>
<div>
<br /></div>
<div>
If you're new to Agile or Scrum you'll want to take some time to look up what they are and how they work. Scrum is known as a "framework" for a reason. It's a set of priorities and guidelines, not rules. Each team has its own conventions and etiquette. You'll get the best response from people if you observe a bit and dip your toe in slowly.</div>
<div>
<br /></div>
<div>
Check out Trello's <a href="http://help.trello.com/">Help site</a> and <a href="https://trello.com/tour">take the tour</a> for some idea of what Trello itself is and does, and how it works. Take a look too at <a href="https://trello.com/c/Fw7i4PIX">the first card</a> in the Broker board. It describes how the OpenShift team is expected to use Trello. Take a look at the <a href="https://trello.com/c/VQWeDg74">story template card</a>. It shows the skeleton of the questions a card should ask (and answer).</div>
<div>
<br /></div>
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="float: left; margin-right: 1em; text-align: left;"><tbody>
<tr><td style="text-align: center;"><a href="http://1.bp.blogspot.com/-9NZJtjeFrhs/UVxtH-bW5gI/AAAAAAAABss/4NtNHhjQBmU/s1600/Trello_Card_View.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" height="372" src="http://1.bp.blogspot.com/-9NZJtjeFrhs/UVxtH-bW5gI/AAAAAAAABss/4NtNHhjQBmU/s640/Trello_Card_View.png" width="640" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">A card, still capturing requirements.</td></tr>
</tbody></table>
<div>
<br /></div>
<div>
<br /></div>
<div>
The Trello site is a place for <b>contributors</b>. It's not a forum or a question-and-answer session. Those are better served on the OpenShift <a href="https://www.openshift.com/forums/openshift">fora</a>, on IRC or on the <a href="https://lists.openshift.redhat.com/openshiftmm/listinfo">mailing lists</a>. Reasonable suggestions are encouraged. See <a href="https://trello.com/c/fSWvzkPM">this one</a> requesting feedback on an update to the PHP cartridge as an example.</div>
<div>
<br />
<h2>
<span style="font-size: x-large;">Welcome to the kitchen</span></h2>
</div>
<div>
<br /></div>
<div>
Well, it's hot in here. The stainless steel doors with the windows in them so the servers don't crash into each other are flapping on their springs behind you. The knife rack and the prep counter are in front of you. You know your way around an industrial fridge? All of us bus dishes now and then. You're a bit overdressed, but get to work; we've got some hungry folks out there waiting. Check that card over your head and start cooking.</div>
<div>
<br /></div>
<h2>
<span style="font-size: x-large;">Resources</span></h2>
<div>
<ul>
<li>Git - Software Revision Control System - <a href="http://git-scm.com/">http://git-scm.com/</a></li>
<li>Github - an online service for Git - <a href="https://github.com/">https://github.com</a></li>
<li>Agile Software Development - https://en.wikipedia.org/wiki/Agile_software_development</li>
<li>The Agile Manifesto - <a href="http://www.agilemanifesto.org/">http://www.agilemanifesto.org/</a></li>
<li><a href="https://en.wikipedia.org/wiki/Scrum_(development)">The Scrum Framework</a> - <a href="https://en.wikipedia.org/wiki/Scrum_(development)">https://en.wikipedia.org/wiki/Scrum_(development)</a></li>
<li>Trello - online Scrum board service - <a href="https://www.trello.com/">https://www.trello.com</a></li>
<li>OpenShift community fora - <a href="https://www.openshift.com/forums/openshift">https://www.openshift.com/forums/openshift</a></li>
<li>OpenShift developer mailing list - <a href="https://lists.openshift.redhat.com/openshiftmm/listinfo/dev">https://lists.openshift.redhat.com/openshiftmm/listinfo/dev</a></li>
</ul>
</div>
<div>
<br /></div>
markllamahttp://www.blogger.com/profile/14193184544557876514noreply@blogger.com1tag:blogger.com,1999:blog-5022186007695457923.post-4287447977272146952013-03-22T13:31:00.000-07:002013-03-28T08:21:29.685-07:00Installing (but not configuring) the broker service by handI'm working through a totally(?) manual installation of the OpenShift Origin service on Fedora 18. The last post on this topic was about building the RPMs on your own Yum repository. This time I'm going to install the broker service and make a few tweaks that are still required.<br />
<br />
One seriously major thing to note is that <b>I don't recommend actually doing this.</b> I'm doing it to shed some light on some of the things still going on in the development process and to highlight the ways in which you can get some visibility into the installation and monitoring of the service.<br />
<br />
If you're interested in building and running your own development environment or service for real, I suggest starting by reading through Krishna Raman's article on <a href="http://www.krishnaraman.net/installing-openshift-origin-using-vagrant-and-puppet/" target="_blank">creating a development environment using Vagrant and Puppet</a> and the <a href="https://github.com/openshift/puppet-openshift_origin" target="_blank">puppet script sources</a> themselves to see what's involved. Finally there's a <a href="http://openshift.github.com/origin/file.install_origin_manually.html">comprehensive document</a> that describes the procedure with fewer warts.<br />
<br />
<br />
<h2>
<a href="http://www.blogger.com/blogger.g?blogID=5022186007695457923" id="ingredients">Ingredients</a></h2>
As usual, I start with a clean minimal install of Fedora 18. In addition this time I also have a yum repository filled with a bleeding-edge build from source <a href="http://cloud-mechanic.blogspot.com/2013/03/the-bleeding-edge-building-openshift.html" target="_blank">as I described previously</a>. Finally I have a prepared MongoDB server waiting for a connection.<br />
<br />
I'm replacing my real URLs and access information with dummies for demonstration purposes.<br />
<br />
<br />
<ul>
<li>Yum repo URL<br /><code>http://myrepo.example.com/origin-server</code></li>
</ul>
<ul>
<li>MONGO_HOST_PORT="mydbhost.example.com:27017"</li>
<li>MONGO_USER="openshift"</li>
<li>MONGO_PASSWORD="dontuseme"</li>
<li>MONGO_DB="openshift"</li>
</ul>
<div>
<br /></div>
<h2>
Preparation</h2>
<div>
Since I'm building my own packages from source and placing them in a Yum repository, I need to add that repo to the standard set. I'll add a new file to /etc/yum.repos.d referring to my yum server.</div>
<div>
<br /></div>
<div>
Even if you're building from your own sources, there are still some packages you need to get that aren't in either the stock Fedora repositories or in the OpenShift sources. These are generally packages with patches that are in the process of moving upstream or are in the acceptance process for Fedora. Right now a set is maintained by the OpenShift build engineers. I need to add the repo file for that too:</div>
<div>
<br /></div>
<div>
<pre class="brush: bash ; title: '/etc/yum.repos.d/origin-server.repo'">[origin-server]
name=OpenShift Origin Server
baseurl=http://myrepo.example.com/origin-server
enabled=1
gpgcheck=0
</pre>
</div>
<div>
<pre class="brush: bash ; title: '/etc/yum.repos.d/origin-extras.repo'">[origin-extras]
name=Custom packages for OpenShift Origin Server
baseurl=https://mirror.openshift.com/pub/openshift-origin/fedora-18/x86_64/
enabled=1
gpgcheck=0
</pre>
</div>
<div>
At this point you can install the <code>openshift-origin-broker</code> package.
</div>
<div>
<pre class="brush: bash ; title: 'install the broker package (and dependencies)'">yum install openshift-origin-broker
...
urw-fonts.noarch 0:2.4-14.fc18
v8.x86_64 1:3.13.7.5-1.fc18
xorg-x11-font-utils.x86_64 1:7.5-10.fc18
Complete!
</pre>
<br /></div>
<div>
<br /></div>
<div>
There are a set of Rubygems that are not yet packaged as RPMs. I need to install these as gems for now.</div>
<div>
<br /></div>
<div>
<pre class="brush: bash ; title: 'install non-RPM gems'; highlight: 1">gem install mongoid
Fetching: i18n-0.6.1.gem (100%)
Fetching: moped-1.4.4.gem (100%)
Fetching: origin-1.0.11.gem (100%)
Fetching: mongoid-3.1.2.gem (100%)
Successfully installed i18n-0.6.1
Successfully installed moped-1.4.4
Successfully installed origin-1.0.11
Successfully installed mongoid-3.1.2
3 gems installed
Installing ri documentation for moped-1.4.4...
Building YARD (yri) index for moped-1.4.4...
Installing ri documentation for origin-1.0.11...
Building YARD (yri) index for origin-1.0.11...
Installing ri documentation for mongoid-3.1.2...
Building YARD (yri) index for mongoid-3.1.2...
Installing RDoc documentation for moped-1.4.4...
Installing RDoc documentation for origin-1.0.11...
Installing RDoc documentation for mongoid-3.1.2...
</pre>
</div>
<div>
There are a number of gem version restrictions in the broker Gemfile which are not met by the current rubygem RPMs. I have to remove the version restrictions so that the broker application will use what <u>is </u>available. This risks breaking things due to interface changes, but will at least allow the broker application to start.</div>
<div>
<br /></div>
<div>
<pre class="brush: bash ; title: 'remove version restrictions for RPM gems'; highlight: 1">sed -i -f - <<EOF /var/www/openshift/broker/Gemfile
/parseconfig/s/,.*//
/minitest/s/,.*//
/rest-client/s/,.*//
/mocha/s/,.*//
/rake/s/,.*//
EOF
</pre>
</div>
<div>
<br />
<br />
For some reason, even with the <code>--without</code> clause for <code>:test</code> and <code>:development</code>, bundle still wants the <code>mocha</code> rubygem. This should not be required for production, but right now you need to install it so that the Rails application will start.<br />
<br />
<pre class="brush: bash ; title: 'install rubygem mocha and dependencies'; highlight: 1">
yum install rubygem-mocha
...
Installed:
rubygem-mocha.noarch 0:0.12.1-1.fc18
Dependency Installed:
rubygem-metaclass.noarch 0:0.0.1-6.fc18
</pre>
<br />
</div>
<h2>
Verifying The Dependencies</h2>
<div>
Now that all of the software dependencies have been installed (mostly by RPM requirements through Yum, and finally through gem requirements and some version tweaking of the Gemfile) I can check that all of them resolve when I start the application. Rails will call bundler when the application starts, so I'll call it explicitly beforehand. I'm only interested in the production environment, so I'll explicitly exclude development and test.</div>
<div>
<br /></div>
<div>
<pre class="brush: bash ; title: 'check gems with bundle' ; highlight: 1">cd /var/www/openshift/broker
bundle install --local --without development test
Using rake (0.9.6)
Using bigdecimal (1.1.0)
....
Using systemu (2.5.2)
Using xml-simple (1.1.2)
Your bundle is complete! Use `bundle show [gemname]` to see where a bundled gem is installed.
</pre>
</div>
<div>
<br /></div>
<div>
If I try to start the rails console now, though, I'll be sad. It won't connect to the database.</div>
<h2>
Configure MongoDB access/authentication</h2>
<div>
The OpenShift broker is (right now) tightly coupled to MongoDB. Recently it switched to using the rubygem-mongoid ODM module (which is a definite plus if you have to work on the code).</div>
<div>
<br /></div>
<div>
The last thing I need to do before I can fire up the Rails console with the broker application is to set the database connectivity parameters. One side effect of using an ODM is that it establishes a connection to the database the moment the application starts.</div>
<div>
<br /></div>
<div>
<b>NOTE:</b> when this is done I will <u>not</u> have a complete working broker server. I still need to configure the other external services: auth, dns and messaging.</div>
<div>
<br /></div>
<div>
Set the values listed in the <a href="#ingredients">Ingredients</a> into <code>/etc/openshift/broker.conf</code>.</div>
<div>
<br /></div>
<div>
<pre class="brush: bash ; title :'broker database configuration' ; highlight: 1">/etc/openshift/broker.conf
...
# Eg: MONGO_HOST_PORT="<host1:port1>,<host2:port2>..."
MONGO_HOST_PORT="mydbhost.example.com:27017"
MONGO_USER="openshift"
MONGO_PASSWORD="dontuseme"
MONGO_DB="openshift"
MONGO_SSL="false"
</pre>
</div>
<div>
...</div>
<div>
<br /></div>
<div>
Now I <i>can</i> try starting the rails console. It should connect to MongoDB and offer an irb prompt:</div>
<div>
<br /></div>
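<div>
Something along these lines should do it (a sketch, assuming the broker application lives under <code>/var/www/openshift/broker</code> as installed above):</div>
<div>
<pre class="brush: bash; title: 'start the broker rails console'; highlight: 1">cd /var/www/openshift/broker
RAILS_ENV=production bundle exec rails console
</pre>
</div>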
<div>
<br /></div>
<div>
To verify the database connectivity, take a look at <a href="http://cloud-mechanic.blogspot.com/2013/03/verifying-mongodb-datastore-with-rails.html" target="_blank">this recent blog post</a>.</div>
<div>
<br /></div>
<div>
Next up is configuring each plugin, one by one.<br />
<br />
<h2>
Gist Scripts</h2>
<div>
I'm trying something new. Rather than including code snippets inline, I'm going to post them as Github Gist entries.</div>
<div>
<br /></div>
<div>
<ul>
<li>Add Yum Repos - <a href="https://gist.github.com/markllama/5222366">oo-add-repo.sh</a></li>
<li>Fix broker requirements - <a href="https://gist.github.com/markllama/5222431">oo-broker-fix-requirements.sh</a></li>
</ul>
</div>
<h2>
References</h2>
</div>
<div>
<ul>
<li><a href="https://docs.fedoraproject.org/en-US/Fedora/16/html/System_Administrators_Guide/sec-Configuring_Yum_and_Yum_Repositories.html" target="_blank">yum repository configuration</a></li>
<li><a href="http://rubygems.org/" target="_blank">rubygems</a></li>
<li><a href="https://docs.fedoraproject.org/en-US/Fedora/16/html/System_Administrators_Guide/sec-Configuring_Yum_and_Yum_Repositories.html" target="_blank">gem</a></li>
<li><a href="http://gembundler.com/" target="_blank">bundler</a></li>
<li><a href="http://mongodb.org/" target="_blank">mongodb</a></li>
<li><a href="http://mongoid.org/" target="_blank">mongoid</a></li>
</ul>
</div>
markllamahttp://www.blogger.com/profile/14193184544557876514noreply@blogger.com2