Thursday, July 25, 2013

Installing OpenShift using Puppet, Part 1: Divide and Conquer

It's been quite a while since I posted last. I got stuck on three things:

  1. I didn't (don't?) know Puppet.
  2. The layers of service and configuration were (are?) muddy.
  3. There are several significant, competing installation use cases to consider.
It would be very Agile to just leap in and start coding things until I got a set of boxes that worked.  But it would also likely lead to something difficult to adapt to new uses, because it wouldn't respect the working boundaries between the different layers and compartments which make up the OpenShift service.

So I learned Puppet, and started coding some top-down samples and some bottom-up samples, while at the same time writing philosophical tracts trying to justify the direction(s) I was going.

I'm not nearly done (having thrown out several attempts and restarted each time) but I think I've reached a point where I can express clearly *how* I want to go about developing a CMS reference implementation for OpenShift installation and configuration.

OK, you're not going to get away without some philosophy.  Rather a lot actually this time.

Where do Configuration Management Services (CMS) fit...

Up until now I've concentrated on reaching a point where I can start installing OpenShift.  And I'm finally there.  No.  Wait.  I'm at the point where I can start installing the parts that make up OpenShift.  After that I have to configure each of the parts to run in their own way and then I have to configure the settings that OpenShift cares about.

See what happened there? It's layers.

Host and Service Configuration Management Layers

See where the CMS fits in? Between the running OS and all those configured hosts/services.  That's where I am now.

Look at the top layer.  Those vertical slices are individual hosts or services that have to be created. Only the ones in the middle are OpenShift. The others are operations support (for a running service) or development and testing stuff which isn't really OpenShift but is needed to create OpenShift.

... and what do they need to do?

I need to show you another complicated looking picture:

Draft OpenShift CMS Module Layout

As you can see, I need to learn Inkscape more, because Dia graphics just don't look as cool.

I'm a fan of big complicated looking graphics to help describe big complicated concepts. This is a very rough incomplete draft of a module breakdown for installing OpenShift using a CM system (Puppet, by name, though this should be applicable to any modular CM system). The three columns in the diagram represent different class uses.

The first column contains classes that are just used to hold information that will be used to instantiate other classes on the target hosts.  None of these classes will be instantiated directly on any host.  The second column shows an OpenShift Broker and an OpenShift Node.  Each includes a class which describes the function of that host within the OpenShift service.  Each also includes any support services which run on the same host.  The third column contains the definitions of the hosts which run support services.  They include a module for the support service itself, and then one which applies the OpenShift customizations to the service.
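To make the column layout concrete, here is a minimal sketch of what these classes might look like. All of the class and parameter names below are illustrative guesses based on the diagram, not a final module API:

```puppet
# Column 1: an information-only class.  It holds shared configuration
# data and is never applied directly to a host by itself.
# (Parameter names here are hypothetical.)
class openshift (
  $domain    = 'example.com',
  $datastore = 'mongodb',
) {
  # no resources: this class only carries data for other classes
}

# Column 2: a host-role class for an OpenShift Broker.  It pulls in
# the shared data plus the broker software itself.
class openshift::role::broker {
  include openshift          # the shared configuration data
  include openshift::broker  # the broker function of this host
}
```

The point of the split is that the information classes can be evaluated anywhere, while only the role classes in columns two and three are ever assigned to real hosts.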

OpenShift uses plugin modules for several support services.  In the diagram, the plugins for each support service are grouped together.  Only one would be instantiated for a given OpenShift installation; which one is used is selected by a parameter of the master configuration class, ::openshift.
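Selecting a plugin by parameter might look something like this. The parameter name, the plugin namespace, and the values are all placeholders for illustration:

```puppet
# Hypothetical: the master configuration class picks exactly one
# DNS plugin for this installation, based on a class parameter.
class openshift (
  $dns_plugin = 'bind',   # e.g. 'bind' or 'route53' -- placeholder values
) {
  # Instantiate only the selected plugin class by interpolating
  # the parameter into the class name.
  class { "openshift::plugin::dns::${dns_plugin}": }
}
```

Because the class name is built from the parameter, none of the other grouped plugin classes are ever declared on that installation.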

There is one lonely class at the bottom of the middle column:  ::openshift::host.  This is currently a catch-all class which provides a single point of control for configuring common host settings such as SSH, firewall rules, and the presence (or absence) of special YUM repositories.  It will be instantiated on every host which participates in the OpenShift service (for now) but can be customized using class parameters.  This class could be broken up, or other features added, depending on how (in)coherent it becomes.
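A first guess at what the catch-all class might contain, with hypothetical parameter names and a placeholder repository URL:

```puppet
# Sketch only: a catch-all class for common host settings,
# controllable through class parameters.
class openshift::host (
  $enable_extra_repos = true,
  $manage_firewall    = true,
) {
  if $enable_extra_repos {
    # Placeholder repo name and URL -- not a real mirror.
    yumrepo { 'openshift-extras':
      descr   => 'Extra packages for OpenShift hosts',
      baseurl => 'http://mirror.example.com/openshift-extras/',
      enabled => 1,
    }
  }

  if $manage_firewall {
    # Firewall and SSH resources would go here.
  }
}
```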

I showed you that diagram to show you this one.

Now if you look back to the top diagram, in the top row there are a bunch of vertical items that are peers of a sort.  Each blob represents a component service of OpenShift or a supporting service or task.  In a fully distributed service configuration each one would represent an individual host.

Keep that in mind as you look at the middle and right side of the second diagram.  Those (UML/Puppet) nodes map to the blobs at the top of the first diagram.  They show the internal structure of those blobs when installing OpenShift and its support components.  Each one contains at least one module which installs a support service or component and which doesn't have the word openshift in its name.  Each one also contains (at least) one OpenShift customization class.  The latter uses the information classes from the first column to customize the software on the node and integrate it with the OpenShift service.
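The pattern for such a node could be sketched like this (the hostname, the mongodb module, and the openshift::datastore class name are all assumptions for the sake of the example):

```puppet
# Hypothetical node definition for a host running the data store.
# One module installs the stock service; a second class applies
# the OpenShift-specific customization on top of it.
node 'datastore1.example.com' {
  include openshift::host        # common host settings
  include mongodb                # the support service itself (no 'openshift' in the name)
  include openshift::datastore   # OpenShift customization of that service
}
```

Reading the node definition top to bottom tells you what the host is for, which is exactly the two-way knowledge point made below.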

This is the key point:

There are layers here too.

The configuration management tools should be designed so that you can plug them together in a way that gets you the service you want, building up from the base to the completed service.  But you should also be able to understand how the service is put together by looking at the configuration files.

By creating each (Puppet) node from the (Puppet) parts that define what a host does, you can see what the host does by looking at the Puppet node definition.  Knowledge is maintained both ways.

Outside-In Development

Since I'm still learning specific CMS implementations (Puppet now, and Ansible soon) and trying to understand how best to express a configuration for OpenShift using them, I'm working from the top down a lot.  At the same time, I'm trying to actually implement (or steal implementations of) modules to do things like set up the YUM repositories and install the packages.  I like this kind of outside-in development model because (if I'm careful not to thrash too much) it helps me keep both perspectives in mind and, hopefully, meet in the middle.
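One of those bottom-up pieces, sketched: a YUM repository plus a package install, with the package depending on the repo. The repository name, URL, and package name are placeholders, not the real Origin mirror:

```puppet
# Illustrative bottom-up module content: define a YUM repo and
# install a package from it.  Names and URLs are placeholders.
yumrepo { 'openshift-origin':
  descr    => 'OpenShift Origin packages',
  baseurl  => 'http://mirror.example.com/origin/',
  gpgcheck => 0,
  enabled  => 1,
}

package { 'openshift-origin-broker':
  ensure  => installed,
  require => Yumrepo['openshift-origin'],  # repo must exist first
}
```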

In the next installment I'll try putting some meat on the bones of this skeleton: Actually creating the empty class definitions in their hierarchical structure and then creating a set of node definitions that import and use the classes to at least pretend to install an OpenShift service.  Hopefully it won't take me another couple of months.



1 comment:

  1. Hi Mark,

    Interesting read. I am currently looking into feasibility of using OpenShift (vs. a few others) for a large private PaaS installation and have also flagged the Puppet installation method as a serious area for development.

I currently have working PoCs of OpenShift Origin, built using 50% of the puppet modules available at; and 50% manually tweaking/editing either the modules or the configuration directly on the PoC itself (not ideal, but we all start somewhere).

This existing puppet module gives a good starting point for an 'all in one' host running all components, and looks to have parameters to allow the various OpenShift components to be broken out into multiple distinct nodes; however, in practice this does not seem to work at all, leaving much of the configuration incorrect on a multi-VM installation.

    So, we're both in the situation of writing a set of classes for a modular approach for OpenShift.

    Some thoughts of my own;

A. I think we should probably sync up on this. It's a small community and I'm relatively new to large puppet classes also; however, I do think there are things we can take from the existing Puppet modules, as I have been hacking around with them for a couple of weeks.

B. Another angle to this is the current versioning of the OpenShift Origin builds... The nightlies available in RPM form only stay on the mirror for a day or so, and you'll notice that, due to package/dependency changes, the existing puppet manifests get tweaked fairly often to un-break bits that have changed in the latest nightlies.

    Which gets me thinking about how best to write an installer for what is, currently, something of a moving target.

    A colleague of mine put it rather succinctly like this:
    "The Red Hat ecosystem is
    Rawhide -> Fedora -> RHEL / CentOS

    In the OpenShift ecosystem, Origin feels like Rawhide and Enterprise feels like RHEL, but there is nothing in between and no CentOS equivalent"

    I think it may be worth having a secondary level of stability between the nightlies and the out-of-our-control 'OpenShift Enterprise', which would allow people like us to focus on the surrounding requirements (such as CMS) without having the floor move every day.

    If you want to discuss more you can PM/Tweet me at @mattdashj or mail details on