
Thursday, September 4, 2014

Kubernetes: Simple Containers and Services

From previous posts I now have a MongoDB image and another which runs a QPID AMQP broker.  I intend for these to be used by the Pulp service components.

What I'm going to do this time is to create the subsidiary services that I'll need for the Pulp service within a Kubernetes cluster.

UPDATE 12/16/2014: recently the kubecfg command has been deprecated and replaced with kubectl. I've updated this post to reflect the CLI call and output from kubectl.

Pre-Launch


A Pulp service stores its persistent data in the database.  The service components (a Celery Beat server and a number of Celery workers) as well as one or more Apache web server daemons all communicate using the AMQP message broker, and they store and retrieve data from the database.

In a traditional bare-metal or VM based installation all of these services would likely run on the same host.  If they are distributed, then the IP addresses and credentials of the support services have to be configured into the Pulp servers manually or with some form of configuration management. Using containers, the components can be isolated, but the task of tracking them and configuring the consumer processes remains.

Using just Docker, the first impulse of an implementer would be similar: place all of the containers on the same host.  This would simplify the management of the connectivity between the parts, but it also defeats some of the benefits of containerized applications: portability and non-locality. This isn't a failing of Docker. It is the result of conscious decisions to limit the scope of what Docker attempts to do, avoiding feature creep and bloat.  And this is where a tool like Kubernetes comes in.

As mentioned elsewhere, Kubernetes is a service designed to bind together a cluster of container hosts. These can be regular hosts running the etcd and kubelet daemons, or specialized images like Atomic or CoreOS, and they can be private machines or part of a public service such as Google Cloud.

For Pulp, I need to place a MongoDB and a QPID container within a Kubernetes cluster and create the infrastructure so that clients can find and connect to them.  For each of these I need to create a Kubernetes Service and a Pod (a group of related containers).

Kicking the Tires


It's probably a good thing to explore a little bit before diving in so that I can see what to expect from Kubernetes in general.  I also need to verify that I have a working environment before I start trying to bang on it.

Preparation


If you're following along, at this point I'm going to assume that you have access to a running Kubernetes cluster.  I'm going to be using the Vagrant test cluster as defined in the github repository and described in the Vagrant version of the Getting Started Guides.

I'm also going to assume that you've built the kubernetes binaries.  I'm using the shell wrappers in the cluster sub-directory, especially cluster/kubectl.sh.   If you try that and you haven't built the binaries you'll get a message that looks like this:

cluster/kubectl.sh 
It looks as if you don't have a compiled kubectl binary.

If you are running from a clone of the git repo, please run
'./build/run.sh hack/build-cross.sh'. Note that this requires having
Docker installed.

If you are running from a binary release tarball, something is wrong. 
Look at http://kubernetes.io/ for information on how to contact the 
development team for help.

If you see that, do as it says. If that fails, you probably haven't installed the golang package.



For convenience I alias the kubectl.sh wrapper so that I don't need the full path.

alias kubectl=~/kubernetes/cluster/kubectl.sh

Like most CLI commands now, if you invoke it with no arguments (or with --help) it prints usage.

kubectl --help 2>&1 | more
Usage of kubectl:

Usage: 
  kubectl [flags]
  kubectl [command]

Available Commands: 
  version                                             Print version of client and server
  proxy                                               Run a proxy to the Kubernetes API server
  get [(-o|--output=)json|yaml|...] <resource> [<id>] Display one or many resources
  describe <resource> <id>                            Show details of a specific resource
  create -f filename                                  Create a resource by filename or stdin
  createall [-d directory] [-f filename]              Create all resources specified in a directory, filename or stdin
  update -f filename                                  Update a resource by filename or stdin
  delete ([-f filename] | (<resource> <id>))          Delete a resource by filename, stdin or resource and id

The full usage output can be found in the CLI documentation in the Kubernetes Github repository.

kubectl has one oddity that makes a lot of sense once you understand why it's there. The command is meant to produce output which is consumable by machines using UNIX pipes. The output is structured data formatted using JSON or YAML. To avoid strange errors in the parsers, the only output to STDOUT is structured data. This means that all of the human readable output goes to STDERR. This isn't just the error output though. This includes the help output. So if you want to run the help and usage output through a pager app like more(1) or less(1), you have to first redirect STDERR to STDOUT as I did above.
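The ampersand in 2>&1 matters: plain 2>1 would redirect STDERR to a file named "1". The idiom can be demonstrated without a Kubernetes cluster at all, using a stand-in function (help_on_stderr is hypothetical, not a real kubectl subcommand) that writes its help text to STDERR the way kubectl does:

```shell
# Mimic a tool that, like kubectl, sends human-readable help to STDERR.
help_on_stderr() {
    echo "usage: demo [flags]" >&2
}

# Without the redirect, the pipe sees nothing; the help bypassed STDOUT:
help_on_stderr | cat

# With 2>&1, STDERR is merged into STDOUT and reaches the pager:
help_on_stderr 2>&1 | cat
```

The second invocation is the pattern to use whenever you want kubectl's help or error text to flow through more(1) or less(1).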

Exploring the CLI control objects


You can see in the usage output the possible operations: get, create, update, delete and so on. The REST API behind them manages a small set of objects: minions, pods, replicationControllers, and services.

Minions


A minion is a host that can accept containers.  It runs an etcd and a kubelet daemon in addition to the Docker daemon. For our purposes a minion is where containers can go.

I can list the minions in my cluster like this:

kubectl get minions
NAME                LABELS
10.245.2.4          <none>
10.245.2.2          <none>
10.245.2.3          <none>

The only valid operations on minions through the REST protocol are the list and get actions, and the get response isn't very interesting.

Until I add some of the other objects this is the most interesting query.  It indicates that there are three minions connected and ready to accept containers.

Pods


A pod is the Kubernetes object which describes a set of one or more containers to be run on the same minion.  While the point of a cluster is to allow containers to run anywhere within the cluster, there are times when a set of containers must run together on the same host. Perhaps they share some external filesystem or some other resource.  See the golang specification for the Pod struct.

kubectl get pods
NAME                IMAGE(S)            HOST                    LABELS              STATUS

See? Not very interesting.

Replication Controllers


I'm going to defer talking about replication controllers in detail for now.  It's enough to note their existence and purpose.

Replication controllers are the tool to create HA or load balancing systems. Using a replication controller you can tell Kubernetes to create multiple running containers for a given image.  Kubernetes will ensure that if one container fails or stops that a new container will be spawned to replace it.

I can list the replication controllers in the same way as minions or pods, but there's nothing to see yet.

Services


I think the term service is an unfortunate but probably unavoidable terminology overload.

In Kubernetes, a service defines a TCP or UDP port reservation.  It provides a way for applications running in containers to connect to each other without requiring that each one be configured with the end-point IP addresses. This both allows for abstracted configuration and for mobility and load balancing of the providing containers.

When I define a Kubernetes service, the service providers (the MongoDB and QPID containers) will be labeled to receive traffic and the service consumers (the Pulp components) will be given the access information in the environment so that they can reach the providers. More about that later.

I can list the services in the same way as I would minions or pods. And it turns out that creating a couple of Kubernetes services is the first step I need to take to prepare the Pulp support service containers.

Creating a Kubernetes Service Object


In a cloud cluster one of the most important considerations is being able to find things.  The whole point of the cloud is to promote non-locality.  I don't care where things are, but I still have to be able to find them somehow.

A Kubernetes Service object is a handle that allows my MongoDB and QPID clients find the servers without them having to know where they really are. It defines a port to listen on and a way for clients to indicate that they want to accept the traffic that comes in. Kubernetes arranges for the traffic to be forwarded to the servers.

Kubernetes both accepts and produces structured data formats for input and reporting.  The two currently supported formats are JSON and YAML.  The Service structure is relatively simple but it has elements which are shared by all of the top level data structures. Kubernetes doesn't yet have any tooling to make the creation of an object description easier than hand-crafting a snippet of JSON or YAML.  Each of the structures is documented in the godoc for Kubernetes. For now that's all you get.

There are a couple of provided examples and these will have to do for now. The guestbook example demonstrates a master/slave Redis deployment managed by replication controllers.  The second shows how to perform a live update of the pods which make up an active service within a Kubernetes cluster. These are actually a bit more advanced than I'm ready for and don't give the detailed break-down of the moving parts that I mean to do.
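The service definition file itself doesn't survive in this copy of the post. Based on the line-by-line walk-through below and the kubectl query output later in the post, mongodb-service.json would have looked roughly like this (my reconstruction, arranged so the line numbers match the discussion, not the author's exact file):

```json
{
    "kind": "Service",
    "apiVersion": "v1beta1",
    "id": "db",
    "port": 27017,
    "publicIPs": ["10.245.2.2"],
    "selector": {
        "name": "db"
    }
}
```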

This is a complete description of the service. Lines 5-9 define the actual content.
  • Line 2 indicates that this is a Service object.
  • Line 3 indicates the object schema version.
    v1beta1 is current
    (note: my use of the term 'schema' is a loose one)
  • Line 4 identifies the Service object.
    This must be unique within the set of services
  • Line 5 is the TCP port number that will be listening
  • Line 6 is for testing.  It tells the proxy on the minion with that IP to listen for inbound connections.
    I'll also use the publicIPs value to expose the HTTP and HTTPS services for Pulp
  • Lines 7-9 set the Selector
    The selector is used to associate this Service object with containers that will accept the inbound traffic.
    This will match with one of the label items assigned to the containers.

When a new service is created, Kubernetes establishes a listener on an available IP address (one of the minions' addresses).  While the service object exists, any new containers will start with a set of environment variables which provide access information.  The value of the selector (converted to upper case) is used as the prefix for these environment variables so that containers can be designed to pick them up and use them for configuration.
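To make that concrete, here is a sketch of how a client container might consume those variables. The variable names follow the upper-cased prefix convention described above for the db service, but the exact names and values are injected by the cluster at container start; the assignments here are stand-ins for illustration only:

```shell
# Hypothetical values; in a real pod Kubernetes injects these at start-up.
DB_SERVICE_HOST="10.0.41.48"
DB_SERVICE_PORT="27017"

# A Pulp component could assemble its MongoDB address from them:
echo "connecting to mongodb at ${DB_SERVICE_HOST}:${DB_SERVICE_PORT}"
```

The point is that the client never hard-codes the server's location; it reads whatever the cluster handed it.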

For now I just need to establish the service so that when I create the DB and QPID containers they have something to be bound to.

The QPID service description is identical to the MongoDB one, replacing the port (5672) and the selector (msg).
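Following that recipe, the QPID service file would look something like this (a reconstruction based on the description above, not the original file):

```json
{
    "kind": "Service",
    "apiVersion": "v1beta1",
    "id": "msg",
    "port": 5672,
    "publicIPs": ["10.245.2.2"],
    "selector": {
        "name": "msg"
    }
}
```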

Querying a Service Object


I've just created a Service object. I wonder what Kubernetes thinks of it? I can list the services as seen above. I can also get the object information using kubectl.

kubectl get services db
NAME                LABELS              SELECTOR            IP                  PORT
db                                      name=db             10.0.41.48          27017


That's nice. I know the important information now.  But what does it look like, really?


kubectl get --output=json services db
{
    "kind": "Service",
    "id": "db",
    "uid": "c040da3d-8536-11e4-a18b-0800279696e1",
    "creationTimestamp": "2014-12-16T15:18:12Z",
    "selfLink": "/api/v1beta1/services/db?namespace=default",
    "resourceVersion": 13,
    "apiVersion": "v1beta1",
    "namespace": "default",
    "port": 27017,
    "protocol": "TCP",
    "selector": {
        "name": "db"
    },
    "publicIPs": [
        "10.245.2.2"
    ],
    "containerPort": 0,
    "portalIP": "10.0.41.48"
}


Clearly Kubernetes has filled out some of the object fields.  Note the --output=json flag for structured data.

I'll be using this method to query information about the other elements as I go along.

Describing a Container (Pod) in Kubernetes


We've seen how to run a container on a Docker host.  With Kubernetes we have to create and submit a description of the container with all of the required variables defined.

Kubernetes has an additional abstraction called a pod.  While Kubernetes is designed to allow the operator to ignore the location of containers within the cluster, there are times when a set of containers needs to be co-located on the same host.  A pod is Kubernetes' way of grouping containers when needed.  When starting a single container it will still be referred to as a member of a pod.


Here's the description of a pod containing the MongoDB service image I created earlier.
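The pod file (submitted later as pods/mongodb.json) doesn't render in this copy of the post. Reconstructed from the structure notes below, with lines arranged to match the numbering they reference, it would be close to this (the live object queried later in the post also shows a devlog volume, which I've left out here since the notes below don't cover it):

```json
{
    "kind": "Pod",
    "apiVersion": "v1beta1",
    "id": "pulpdb",
    "labels": {
        "name": "db"
    },
    "desiredState": {
        "manifest": {
            "version": "v1beta1",
            "id": "pulpdb",
            "containers": [{
                "name": "pulp-db",
                "image": "markllama/mongodb",
                "ports": [{
                    "containerPort": 27017
                }]
            }]
        }
    }
}
```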




This is actually a set of nested structures, maps and arrays.


  • Lines 1-21 define a Pod.
  • Lines 2-4 are elements of an inline JSONBase structure
  • Lines 5-7 are a map (hash) of strings assigned to the Pod struct element named Labels.
  • Lines 8-20 define a PodState named DesiredState.
    The only required element is the ContainerManifest, named Manifest in the PodState.
  • A PodState has a required Version and ID, though it is not a subclass of JSONBase.
      It also has a list of Containers and an optional list of Volumes
  • Lines 12-18 define the set of containers (only one in this case) that will reside in the pod.
    A Container has a name and an image path (in this case to the previously defined mongodb image).
  • Lines 15-17 are a set of Port specifications.
      These indicate that something inside the container will be listening on these ports.


You can see how learning the total schema means fishing through each of these structure definitions in the documentation.  If you work at it you will get to know them.  To be fair they are really meant to be generated and consumed by machines rather than humans.  Kubernetes is still the business end of the service. Pretty dashboards will be provided later.  The only visibility I really need is for development and diagnostics. There are gaps here too, but finding them is what experiments like this are about.

A note on Names and IDs


There are several places where there is a key named "name" or "id". I could give them all the same value, but I'm going to deliberately vary them so I can expose which ones are used for what purpose. Names can be arbitrary strings. I believe that IDs are restricted somewhat (no hyphens).

Creating the first Pod


Now I can get back to business.

Once I have the Pod definition expressed in JSON I can submit that to kubectl for processing.


kubectl create -f pods/mongodb.json 
pulpdb


TADA! I now have a MongoDB running in Kubernetes.

But how do I know?


Now that I actually have a pod, I should be able to query the Kubernetes service about it and get more than an empty answer.

kubectl get pods pulpdb
NAME                IMAGE(S)            HOST                    LABELS              STATUS
pulpdb              markllama/mongodb   10.245.2.3/10.245.2.3   name=db             Running


Familiar and boring. But I can get more from kubectl by asking for the raw JSON return from the query.

{
    "kind": "Pod",
    "id": "pulpdb",
    "uid": "4bac8381-8537-11e4-a18b-0800279696e1",
    "creationTimestamp": "2014-12-16T15:22:06Z",
    "selfLink": "/api/v1beta1/pods/pulpdb?namespace=default",
    "resourceVersion": 22,
    "apiVersion": "v1beta1",
    "namespace": "default",
    "labels": {
        "name": "db"
    },
    "desiredState": {
        "manifest": {
            "version": "v1beta2",
            "id": "",
            "volumes": [
                {
                    "name": "devlog",
                    "source": {
                        "hostDir": {
                            "path": "/dev/log"
                        },
...
            "pulp-db": {
                "state": {
                    "running": {
                        "startedAt": "2014-12-16T15:27:04Z"
                    }
                },
                "restartCount": 0,
                "image": "markllama/mongodb",
                "containerID": "docker://8f21d45e49b18b37b98ea7556346095261699bc
3664b52813a533edccee55a63"
            }
        }
    }
}


The full output is really long, so I've only excerpted it inline. I put the complete version into a gist.

If you fish through it you'll find the same elements I used to create the pod, and lots, lots more.  The structure now contains both a desiredState and a currentState sub-structure, with very different contents.

Now a lot of this is just noise to us, but lines 59-72 are of particular interest.  These show the effects of the Service object that was created previously.  These are the environment variables and network ports declared. These are the values that a client container will use to connect to this service container.

Testing the MongoDB service


If you've read my previous blog post on creating a MongoDB Docker image you'll be familiar with the process I used to verify the basic operation of the service.

In that case I was running the container using Docker on my laptop.  I knew exactly where the container was running and I had direct access to the Docker CLI so that I could ask Docker about my new container.
I'd opened up the MongoDB port and told Docker to bind it to a random port on the host and I could connect directly to that port.

In a Kubernetes cluster there's no way to know a priori where the MongoDB container will end up. You have to ask Kubernetes where it is.  Further you don't have direct access to the Docker CLI.

This is where the publicIPs key in the mongodb-service.json file comes in.  I set the public IP value of the db service to an external IP address of one of the Kubernetes minions: 10.245.2.2.  This causes the proxy on that minion to accept inbound connections and forward them to the db service pods wherever they are.

The minion host is accessible from my desktop so I can test the connectivity directly.

echo "show dbs" | mongo 10.245.2.2
MongoDB shell version: 2.4.6
connecting to: 10.245.2.4/test
local 0.03125GB
bye

And now for QPID?


As with the Service object, creating and testing the QPID container within Kubernetes requires the same process.  Create a JSON file which describes the QPID service and another for the pod.  Submit them and test as before.

Summary


Now I have two running network services inside the Kubernetes cluster. This consists of a Kubernetes Service object and a Kubernetes Pod which is running the image I'd created for each service application.

I can prove to myself that the application services are running and accessible, though for some of the detailed tests I still have to go under the covers of Kubernetes.

I have the information I need to craft images for the other Pulp services so that they can consume the database and messenger services.

Next Up


In the next post I mean to create the first Pulp service image, the Celery Beat server.  There are elements that all of the remaining images will have in common, so I'm going to first build a base image and then apply a last layer to differentiate the Beat server from the Pulp resource manager and the Pulp workers.

References


Saturday, August 30, 2014

Docker: A simple service container example with MongoDB

In my previous post I said I was going to build, over time, a Pulp repository using a set of containerized service components and host it in a Kubernetes cluster.

A complete Pulp service


The Pulp service is composed of a number of sub-services:

  • A MongoDB database
  • A QPID AMQP message broker
  • A number of Celery processes
    • 1 Celery Beat process
    • 1 Pulp Resource Manager (Celery worker) process
    • >1 Pulp worker (Celery worker) process
  • >1 Apache HTTPD - serves mirrored content to clients
  • >1 Crane service - Docker plugin for Pulp

This diagram illustrates the components and connectivity of a Pulp service as it will be composed in Kubernetes using Docker containers.


Pulp Service Component Structure


The simplest images will be those for the QPID and MongoDB services.  I'm going to show how to create the MongoDB image first.

There are several things I will not be addressing in this simple example:

  1. HA and replication
    In production the MongoDB would be replicated
    In production the QPID AMQP service would have a mesh of brokers
  2. Communications Security
    In production the links between components to the MongoDB and the QPID message broker would be encrypted and authenticated.

    Key management is actually a real problem with Docker at the moment and will require its own set of discussions.

A Docker container for MongoDB


This post essentially duplicates the instructions for creating a MongoDB image which are provided  on the Docker documentation site. I'm going to walk through them here for several reasons.  First is for completeness and for practice on the basics of creating a simple image.  Second, the Docker example uses Ubuntu for the base image.  I am going to use Fedora.  In later posts I'm going to be doing some work with Yum repos and RPM installation.  Finally I'm going to make some notes which are relevant to the suitability of a container for use in a Kubernetes cluster.

Work Environment


I'm working on Fedora 20 with the docker-io package installed and the docker service enabled and running.  I've also added my username to the docker group in /etc/group so I don't need to use sudo to issue docker commands. If your work environment differs you'll probably have to adapt some.

Defining the Container: Dockerfile

New docker images are defined in a Dockerfile. Capitalization matters in the file name. The Dockerfile must reside in a directory of its own. Any auxiliary files that the Dockerfile may reference will reside in the same directory.

The syntax for a Dockerfile is documented on the Docker web site.

This is the Dockerfile for the MongoDB image in Fedora 20:
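The file itself doesn't render in this copy of the post, but it can be reconstructed almost line for line from the build transcript later on. The comment on lines 4-5 is a placeholder (the original content is lost) and the maintainer's contact details are omitted; blank lines are arranged so the line numbers match the discussion that follows:

```dockerfile
FROM fedora:20
MAINTAINER Mark Lamourine

# { "Description": "MongoDB service container for Pulp",
#   "Usage": "placeholder for the original JSON comment" }

RUN yum install -y mongodb-server && \
    yum clean all

RUN mkdir -p /var/lib/mongodb && \
    touch /var/lib/mongodb/.keep && \
    chown -R mongodb:mongodb /var/lib/mongodb

ADD mongodb.conf /etc/mongodb.conf

VOLUME [ "/var/lib/mongodb" ]

EXPOSE 27017

USER mongodb

WORKDIR /var/lib/mongodb

CMD [ "/usr/bin/mongod", "--quiet", "--config", "/etc/mongodb.conf", "run"]
```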


That's really all it takes to define a new container image.  The first two lines are the only ones that are mandatory for all Dockerfiles.  The rest form the description of the new container.

Dockerfile: FROM

Line 1 indicates the base image to begin with. It refers to an existing image on the official public Docker registry.  This image is offered and maintained by the Fedora team.  I specify the Fedora 20 version.  If I had left the version tag off, the Dockerfile would use the latest tagged image available.

Dockerfile: MAINTAINER

Line 2 gives contact information for the maintainer of the image definition.

Diversion:

Lines 4 and 5 are an unofficial comment. It's a fragment of JSON which contains some information about how the image is meant to be used.

Dockerfile: RUN

Line 7 is where the real fun begins. The RUN directive indicates that what follows is a command to be executed in the context of the base image.  It will make changes or additions which will be captured and used to create a new layer.  In fact, every directive from here on out creates a new layer.  When the image is run, the layers are composed to form the final contents of the container before executing any commands within the container.

The shell command which is the value of the RUN directive must be treated by the shell as a single line.  If the command is too long to fit in an 80 character line then shell escapes (\<cr>) and conjunctions (';' or '&&' or '||') are used to indicate line continuation just as if you were writing into a shell on the CLI.

This particular line installs the mongodb-server package and then cleans up the YUM cache.  The cleanup is required because any differences in the file tree from the initial state will be included in the next image layer.  Cleaning up after YUM keeps the cached RPMs and metadata from bloating the layer and the image.

Line 10 is another RUN statement.  This one prepares the directory where the MongoDB storage will reside. Ordinarily this directory would be created on the host when the MongoDB package is installed, with a little more setup done during the daemon's startup. It's here explicitly because I'm going to punch a hole in the container so that I can mount the data storage area from the host.  The mount process can overwrite some of the directory settings.  Setting them explicitly here ensures that the directory is present and the permissions are correct for mounting the external storage.

Dockerfile: ADD

Line 14 adds a file to the container.  In this case it's a slightly tweaked mongodb.conf file.  It adds a couple of switches which the Ubuntu example from the Docker documentation applies using CLI arguments to the docker run invocation.  The ADD directive takes the input file from the directory containing the Dockerfile and will overwrite the destination file inside the container.
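The tweaked mongodb.conf is another file that doesn't survive in this copy. The switches it carries can be read back out of the mongod startup log later in the post (the options line shows nohttpinterface, noprealloc, smallfiles, quiet, and the dbpath), so a plausible reconstruction is:

```ini
# mongodb.conf -- reconstructed from the options echoed in the mongod startup log
dbpath = /var/lib/mongodb
nohttpinterface = true
noprealloc = true
smallfiles = true
quiet = true
```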



Lines 16-22 don't add new content but rather describe the run-time environment for the contents of the container.

Dockerfile: VOLUME

Line 16 officially declares that the directory /var/lib/mongodb will be used as a mountpoint for external storage.

Dockerfile: EXPOSE

Line 18 declares that TCP port 27017 will be exposed.  This will allow connections from outside the container to access the mongodb inside.

Dockerfile: USER

Line 20 declares that the first command executed will be run as the mongodb user.

Dockerfile: WORKDIR

Line 22 declares that the command will run in /var/lib/mongodb, the home directory for the mongodb user.

Dockerfile: CMD

The last line of the Dockerfile traditionally describes the default command to be executed when the container starts.

Line 24 uses the CMD directive.  The arguments are an array of strings which make up the program to be invoked by default on container start.


Building the Docker Image


With the Dockerfile and the mongodb.conf template in the image directory (in my case, the directory is images/mongodb) I'm ready to build the image. The transcript for the build process is pretty long. This one I include in its entirety so you can see all of the activity that results from the Dockerfile directives.



docker build -t markllama/mongodb images/mongodb
Sending build context to Docker daemon 4.096 kB
Sending build context to Docker daemon 
Step 0 : FROM fedora:20
Pulling repository fedora
88b42ffd1f7c: Download complete 
511136ea3c5a: Download complete 
c69cab00d6ef: Download complete 
 ---> 88b42ffd1f7c
Step 1 : MAINTAINER Mark Lamourine 
 ---> Running in 38db2e5fffbb
 ---> fc120ab67c77
Removing intermediate container 38db2e5fffbb
Step 2 : RUN  yum install -y mongodb-server &&      yum clean all
 ---> Running in 42e55f18d490
Resolving Dependencies
--> Running transaction check
---> Package mongodb-server.x86_64 0:2.4.6-1.fc20 will be installed
--> Processing Dependency: v8 for package: mongodb-server-2.4.6-1.fc20.x86_64
...
Installed:
  mongodb-server.x86_64 0:2.4.6-1.fc20                                          
...

Complete!
Cleaning repos: fedora updates
Cleaning up everything
 ---> 8924655bac6e
Removing intermediate container 42e55f18d490
Step 3 : RUN  mkdir -p /var/lib/mongodb &&      touch /var/lib/mongodb/.keep &&      chown -R mongodb:mongodb /var/lib/mongodb
 ---> Running in 88f5f059c3ff
 ---> f8e4eaed6105
Removing intermediate container 88f5f059c3ff
Step 4 : ADD mongodb.conf /etc/mongodb.conf
 ---> eb358bbbaf75
Removing intermediate container 090e1e36f7f6
Step 5 : VOLUME [ "/var/lib/mongodb" ]
 ---> Running in deb3367ff8cd
 ---> f91654280383
Removing intermediate container deb3367ff8cd
Step 6 : EXPOSE 27017
 ---> Running in 0c1d97e7aa12
 ---> 46157892e3fe
Removing intermediate container 0c1d97e7aa12
Step 7 : USER mongodb
 ---> Running in 70575d2a7504
 ---> 54dca617b94c
Removing intermediate container 70575d2a7504
Step 8 : WORKDIR /var/lib/mongodb
 ---> Running in 91759055c498
 ---> 0214a3fbcafc
Removing intermediate container 91759055c498
Step 9 : CMD [ "/usr/bin/mongod", "--quiet", "--config", "/etc/mongodb.conf", "run"]
 ---> Running in 6b48f1489a3e
 ---> 13d97f81beb4
Removing intermediate container 6b48f1489a3e
Successfully built 13d97f81beb4

You can see how each directive in the Dockerfile corresponds to a build step, and you can see the activity that each directive generates.

When docker processes a Dockerfile, what it really does first is put the base image in a container and run it, executing a command in that container based on the first Dockerfile directive.  Each directive causes some change to the contents of the container.

A Docker container is actually composed of a set of file trees that are layered using a read-only union filesystem with a read/write layer on the top.  Any changes go into the top layer.  When you unmount the underlying layers, what remains in the read/write layer are the changes caused by the first directive. When building a new image the changes for each directive are archived into a tarball and checksummed to produce the new layer and the layer's ID.

This process is repeated for each directive, accumulating new layers until all of the directives have been processed.  The intermediate containers are deleted, the new layer files are saved and tagged. The end result is a new image (a set of new layers).

Running the Mongo Container


The simplest test for the new container is to run it and observe what happens.

docker run --name mongodb1 --detach --publish-all  markllama/mongodb
a90b275d00d451fde4edd9bc99798a4487815e38c8efbe51bfde505c17d920ab

This invocation indicates that docker should run the image named markllama/mongodb. When it does, it should detach (run as a daemon) and make all of the network ports exposed by the container available on the host (that's the --publish-all).  It will name the newly created container mongodb1 so that you can distinguish it from other instances of the same image. The name also lets you refer to the container without needing the ID hash all the time. If you don't provide a name, docker will assign one composed of randomly selected words.

The response is a hash which is the full ID of the new running container. Most times you'll be able to get away with a shorter version of the hash (as presented by docker ps. See below) or by the container name.

Examining the Running Container(s)


So the container is running. There's a MongoDB waiting for a connection. Or is there? How can I tell, and how can I figure out how to connect to it?

Docker offers a number of commands to view various aspects of the running containers.

Listing the Running Containers.


To list the running containers use docker ps


docker ps
CONTAINER ID        IMAGE                      COMMAND                CREATED             STATUS              PORTS                      NAMES
a90b275d00d4        markllama/mongodb:latest   /usr/bin/mongod --qu   5 mins  ago         Up 5 min            0.0.0.0:49155->27017/tcp   mongodb1

This line will likely wrap unless you have a wide screen.

In this case there is only one running container. Each line is a summary report on a single container. The important elements for now are the name, id and the ports summary. This last tells me that I should be able to connect from the host to the container's MongoDB using localhost:49155, which is forwarded to the container's exposed port 27017.

What did it do on startup?


A running container has one special process which is sort of like the init process on a host. That's the process indicated by the CMD or ENTRYPOINT directive in the Dockerfile.

When the container starts, the STDOUT of the initial process is connected to the docker service. I can retrieve the output by requesting the logs.

For Docker commands which apply to single containers, the final argument is either the ID or name of a container. Since I named the mongodb container, I can use the name to access it.

docker logs mongodb1
Thu Aug 28 20:38:08.496 [initandlisten] MongoDB starting : pid=1 port=27017 dbpath=/var/lib/mongodb 64-bit host=a90b275d00d4
Thu Aug 28 20:38:08.498 [initandlisten] db version v2.4.6
Thu Aug 28 20:38:08.498 [initandlisten] git version: nogitversion
Thu Aug 28 20:38:08.498 [initandlisten] build info: Linux buildvm-12.phx2.fedoraproject.org 3.10.9-200.fc19.x86_64 #1 SMP Wed Aug 21 19:27:58 UTC 2013 x86_64 BOOST_LIB_VERSION=1_54
Thu Aug 28 20:38:08.498 [initandlisten] allocator: tcmalloc
Thu Aug 28 20:38:08.498 [initandlisten] options: { command: [ "run" ], config: "/etc/mongodb.conf", dbpath: "/var/lib/mongodb", nohttpinterface: "true", noprealloc: "true", quiet: true, smallfiles: "true" }
Thu Aug 28 20:38:08.532 [initandlisten] journal dir=/var/lib/mongodb/journal
Thu Aug 28 20:38:08.532 [initandlisten] recover : no journal files present, no recovery needed
Thu Aug 28 20:38:10.325 [initandlisten] preallocateIsFaster=true 26.96
Thu Aug 28 20:38:12.149 [initandlisten] preallocateIsFaster=true 27.5
Thu Aug 28 20:38:14.977 [initandlisten] preallocateIsFaster=true 27.58
Thu Aug 28 20:38:14.977 [initandlisten] preallocateIsFaster check took 6.444 secs
Thu Aug 28 20:38:14.977 [initandlisten] preallocating a journal file /var/lib/mongodb/journal/prealloc.0
Thu Aug 28 20:38:16.165 [initandlisten] preallocating a journal file /var/lib/mongodb/journal/prealloc.1
Thu Aug 28 20:38:17.306 [initandlisten] preallocating a journal file /var/lib/mongodb/journal/prealloc.2
Thu Aug 28 20:38:18.603 [FileAllocator] allocating new datafile /var/lib/mongodb/local.ns, filling with zeroes...
Thu Aug 28 20:38:18.603 [FileAllocator] creating directory /var/lib/mongodb/_tmp
Thu Aug 28 20:38:18.629 [FileAllocator] done allocating datafile /var/lib/mongodb/local.ns, size: 16MB,  took 0.008 secs
Thu Aug 28 20:38:18.629 [FileAllocator] allocating new datafile /var/lib/mongodb/local.0, filling with zeroes...
Thu Aug 28 20:38:18.637 [FileAllocator] done allocating datafile /var/lib/mongodb/local.0, size: 16MB,  took 0.007 secs
Thu Aug 28 20:38:18.640 [initandlisten] waiting for connections on port 27017

This is just what I'd expect for a running mongod.
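Those logs can also be used programmatically. The final line, "waiting for connections", is a reasonable readiness marker. A sketch (not part of the original workflow): in real use the log line would come from docker logs mongodb1; here it's hardcoded from the output above.

```shell
#!/bin/sh
# In real use: logline=$(docker logs mongodb1 2>&1 | tail -n 1)
# Hardcoded here from the log output shown above.
logline="Thu Aug 28 20:38:18.640 [initandlisten] waiting for connections on port 27017"

# Report whether the marker line has appeared yet.
check_ready() {
    case "$1" in
        *"waiting for connections"*) echo "mongod is ready" ;;
        *)                           echo "mongod is not ready yet" ;;
    esac
}

check_ready "$logline"
# -> mongod is ready
```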

Just the port information, please?


If I know the name of the container or its ID I can request the port information explicitly.  This is useful when the output must be parsed, perhaps by a program that will create another container needing to connect to the database.

docker port mongodb1 27017
0.0.0.0:49155
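That host:port string is easy to split for scripted use. A sketch: in real use the mapping would be captured from docker port; here it's hardcoded from the output above.

```shell
#!/bin/sh
# In real use: mapping=$(docker port mongodb1 27017)
# Hardcoded here from the output shown above.
mapping="0.0.0.0:49155"

host="${mapping%:*}"    # drop ':port' -> 0.0.0.0
port="${mapping##*:}"   # drop 'host:' -> 49155

echo "connect to ${host} on port ${port}"
# -> connect to 0.0.0.0 on port 49155
```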

But is it working?


Docker thinks there's something running, and I now have enough information to try connecting to the database itself from the host.

The ports information indicates that the container port 27017 is forwarded to port 49155 on all of the host's interfaces (0.0.0.0). If the host firewall allows connections in on that port, the database could be used (or attacked) from outside.


echo "show dbs" | mongo localhost:49155
MongoDB shell version: 2.4.6
connecting to: localhost:49155/test
local 0.03125GB
bye
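A container can report as running before mongod is actually accepting connections, so a script that depends on the database may want to poll rather than connect once. A hypothetical helper (the wait_for name and retry limits are my own; the commented-out probe mirrors the mongo invocation above):

```shell
#!/bin/sh
# Hypothetical helper: run the probe command "$@" until it succeeds,
# giving up after 10 attempts one second apart.
wait_for() {
    tries=0
    until "$@" > /dev/null 2>&1; do
        tries=$((tries + 1))
        [ "$tries" -ge 10 ] && return 1
        sleep 1
    done
    echo "ready after ${tries} retries"
}

# In real use, probe the database the same way as above:
#   wait_for sh -c 'echo "show dbs" | mongo localhost:49155'
wait_for true   # trivially succeeding probe, for illustration
# -> ready after 0 retries
```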

What next?


At this point I have verified that I have a running MongoDB accessible from the host (or from outside, if I allow it).

There's lots more that you can do and query about the containers using the docker CLI command, but there's no need to detail it all here. You can learn more from the Docker documentation web site.

Before I start on the Pulp service proper I also need a QPID service container. This is very similar to the MongoDB container so I won't go into detail.

Since the point of the exercise is to run Pulp in Docker with Kubernetes, the next step will be to run the MongoDB and QPID containers using Kubernetes.