While the OpenShift Online service has been up for... sheesh almost 2 years now? (corrections welcome) the development activity has only accelerated over time. More than ever the admin tasked with implementing an On-Premise OpenShift Origin service is shooting at a moving target. There are released RPMs in the Fedora 18 distribution and updates, but even the updates aren't keeping pace with the source changes. (This is good, it gives *some* stability).
The experimenter will often find now that the tiny feature she needs is already in the source tree but hasn't yet made it to the released packages. She may even find the need (desire?) to make changes and contribute them back to the base. In both of those cases she will have to be prepared to create local builds of the OpenShift packages for development and testing.
There is also a build toolset on Github for the origin-server package set. It's in a separate repository named origin-dev-tools. This follows the model of the original internal build and test environment. It's an all-in-one wrap-it-to-go kind of toolset. But this is the Under the Hood blog, so I'm going to crack the case open and see what's inside.
This post uses Fedora 18 but should be applicable to RHEL6+ as well.
Building a Build Site
If you're customizing the OpenShift Origin software, whether because you want to work with committed but pre-release software or because you're making changes on your own, the best way to manage the software life cycle is to create a proper build server. To start with I'll describe how to create the build server and how to make the RPM repo available to server boxes. There are tools to automate the build/test/publish process as well but I won't deal with them yet.
The goal of this post is to outline the requirements and process for creating your own build server.
Building on a Base
As usual I start on a minimal system. I add the software I need explicitly and let yum manage the dependencies. The build system will need a fixed IP address and a well-known DNS name so that you can reach it later from your OpenShift Origin servers.
When configured for my work environment (including Kerberos 5, and LDAP authentication) I start with about 245 packages.
The Tool Box of Modern Software Development
There's a lot of stuff that goes into making software packages appear automatically. The development and build process today commonly includes remote repository updating, automated testing and software tagging, on top of the compilers, interpreters and language libraries.
Note that the first set of tools listed below are just those needed to manage the build and packaging process. Each package will also have additional build requirements, but those will be dealt with later.
Once, long ago, building software meant having a compiler (which you built yourself from source code), tar for unpacking it and make to automate the process. Today the same tasks apply but there's a lot more formalism to the process. Collaboration has required the creation of distributed software revision control tools. Software testing has become everyone's job. People have recognized that software is never finished, it evolves and grows over time. Users need to be able to know what they're running and where to get updates. The modern tool set reflects these needs.
While most of the time these tools will just work, it's often important to know what tools are doing what jobs and how they interact. This is critical either when things don't go as planned, or when contributing new software packages to the set. First I'll take a look at which tools OpenShift uses and then demonstrate how to install them (which is actually pretty trivial).
Software Revision Control: Git and Github
A distributed project today requires some kind of remote software revision control system. This allows developers to work together without having to be in one place. The Revision Control System (RCS) manages changes and flags conflicts. It allows tagging of releases.
The OpenShift Origin project uses git for revision control. It uses the Github service to hold the master repository and development forks and branches. You can pull down a cloned copy of the source tree without having an account on Github. To manage your local changes and to contribute back you'll need an account of your own. There are a number of good books or sites on how to use git. See the Github site itself for help learning how to create your own development fork and branches.
Task Automation: Ruby, Rubygems and Rake
To automate the unit testing OpenShift uses a rubygem called rake, named after the original GNU make. Rake implements dependencies and tasks in a way that is similar in behavior to make but syntactically entirely different.
Rake is distributed as a rubygem, which is in turn the module packaging mechanism for Ruby code.
Unit Testing: Rspec 2
Many OpenShift components include unit tests written using the RSpec framework. RSpec is another rubygem. It has components for writing special expectations, mocks and hooks for testing Rails applications. rubygem-rspec-rails requires all of the other components, so we can install that and let yum handle the dependencies.
Build, Packaging and Release: Tito and rpm-build and rubygem-bundler
All of the software in OpenShift must be packaged for delivery in RPM format. This is both a requirement for inclusion in Fedora and RHEL releases as well as good general practice (use the native software packaging format). A number of components are also packaged as Rubygems. This adds the requirement for the rubygem-bundler package for building but these are not the deliverable format.
OpenShift uses a tool called tito to manage package builds and revision tags. Tito works with the standard RPM spec files and with rpmbuild and createrepo. When it runs successfully, tito not only builds the requested package, it increments the package version number and inserts it in a yum repository.
Documentation: rubygem-yard
The ruby community have created a set of tools which allow documentation to be automatically generated. The author of the code inserts specially formatted markup comments which the documentation generator uses to produce HTML or other documentation formats.
OpenShift is using the yard documentation tool to mark up and auto-generate documentation for the ruby packages. Yard is installed with the rubygem-yard RPM.
Publication: thttpd
Once the packages are built they're useless if your OpenShift Origin servers can't reach them. I typically use Apache2 for web service but these are static files, so a lightweight server like lighttpd or thttpd is in order. I'm going to use thttpd because I can configure it to serve the default yum repo location with a single sed command.
If you don't want to share the builds from the build server you can instead use a tool like rsync to push them where you need them to be for publication.
Installing The Software
I can compose the list now:
- git - revision control
- rake
- ruby
- rubygems
- rubygems-devel
- rubygem-rake
- rubygem-bundler
- rubygem-yard
- rspec
- rubygem-rspec-core
- rubygem-rspec-mocks
- rubygem-rspec-expectations
- rubygem-rspec-rails
- tito
- rpm-build
- thttpd
Note that RPM package dependencies make the actual install list fairly small if you pick carefully.
Now that I have my list, installing the toolset is easy enough:
yum install -y git rubygems-devel rubygem-rake rubygem-yard rubygem-rspec-rails tito rpm-build thttpd
This will actually cause the installation of almost 100 more packages due to dependencies.
When this software is all installed on my build system, the next step is to get myself a copy of the source code.
Getting the Source from Github
Git was created by Linus Torvalds himself to replace a proprietary software revision control system which had been used for years to manage the Linux kernel source tree. Since then a number of services have sprung up to offer a place for people to host their projects. OpenShift Origin is hosted on Github.
You can get the git URL for the OpenShift Origin service software without an account, but if you want to make modifications or contributions you'll need to register and then create your own project fork. Github has some great help and tutorials here:
https://help.github.com/
You will probably also want to look at the process for setting up SSH keys for Github so you don't have to type a password for every operation.
The OpenShift Origin server source code is here:
Cloning the Source Code Repository(s)
Once you've created your account and forked the origin-server project you should find a git@github.com: URL on your fork page. You can cut-and-paste that and use it to clone a local copy of your workspace. (In the example below, replace the URL with your own)
git clone git@github.com:/openshift/origin-server.git --tags
Now you've got everything that the build process needs, but not what the software you're building needs.
Task Automation
The current official process uses the origin-dev-tools and has a certain amount of overhead. It's made for rigorous, exhaustive build/test/release cycles.
What we need here is much simpler and self-contained.
The exploration that follows is captured in a Rakefile script named oo-rake that I put on gist.github.com. When it's placed at the top of the origin-server source tree and set executable, it will execute the tasks described below.
NOTE: the oo-rake script is not part of the official origin-server sources. It will likely not be maintained and comes with no warranty. Use at your own risk.
cd origin-server
wget http://gist.github.com/markllama/5225912/raw/abcbeebed584bc1aae56b9091fa977e8636c316c/oo-rake
chmod a+x oo-rake
./oo-rake --tasks
rake all:builddep[answer] # install all build requirements
rake all:rpm[repodir,test,yum] # generate all RPMs and create yum repository
rake all:testrpm[repodir,yum] # generate all test RPMs and create yum rep...
rake all:yard[destdir] # generate comprehensive documentation
Package Build Requirements
Building most software requires more than just the build tools. Most software depends on other tools or libraries for its own build process. Because OpenShift is set up to build into packages and because the RPM mechanism has a feature to allow developers to call out the dependencies, we can find out what's needed and install it.
Packages and ".spec" files
Every component of OpenShift Origin must be packaged as an RPM. It's just the way things are. This gives us a hook to help identify each package and ultimately, to find the set of build prerequisites for each package.
The contents of each package must reside in a directory within the source code tree. Each package must have exactly one RPM .spec file. We can search the directory tree for these files and we'll know both the names of the packages and their locations within the source tree.
Assuming you've just cloned the origin-server repository into your current working directory you can find the list of packages with a little shell snippet like this:
find origin-server -name \*.spec
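To see package names next to their directories, the same search can be wrapped in a small function. This is only a sketch: list_packages is a name I made up for illustration, and it prints the Name: tag exactly as written in the spec file, so any RPM macros in the name are shown unexpanded.

```shell
# list_packages ROOT  (hypothetical helper, not part of origin-server)
# Print "<package name><TAB><package directory>" for every .spec file
# found under ROOT (default: origin-server).
list_packages() {
    find "${1:-origin-server}" -name '*.spec' | sort | while IFS= read -r spec; do
        # Take the first "Name:" tag in the spec file, as written.
        name=$(sed -n 's/^Name:[[:space:]]*//p' "$spec" | head -n 1)
        printf '%s\t%s\n' "$name" "$(dirname "$spec")"
    done
}
```

Running list_packages origin-server from the directory that holds the clone prints one line per package.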
Build Requirements
Among other things, a package .spec file defines a set of packages that must be installed before the new package can be built. The required packages are specified with BuildRequires lines.
The yum-builddep program which is part of the yum-utils package will install the build requirements for a package:
yum-builddep <specfile> [<specfile>...]
This will install all of the build requirements for the listed packages.
The oo-rake script offers the all:builddep target. Invoking this task will install all of the build requirements for the packages under the tree.
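Before letting yum install anything it can be handy to preview what the spec files ask for. The sketch below just collects the BuildRequires lines with sed. The function name list_build_requires is my own invention, and the output is the raw text of each line (version constraints and macros included), so yum-builddep remains the authority on what actually gets installed.

```shell
# list_build_requires ROOT  (hypothetical helper, not part of origin-server)
# Print the unique BuildRequires entries declared by all .spec files
# under ROOT (default: origin-server), one entry per line.
list_build_requires() {
    find "${1:-origin-server}" -name '*.spec' -exec \
        sed -n 's/^BuildRequires:[[:space:]]*//p' {} + | sort -u
}
```

Calling list_build_requires origin-server gives a de-duplicated list to skim before running yum-builddep.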
Building the Packages
The packages (and yum repository) are built by tito. Tito has to run in the root directory for each package (where the .spec file resides.) Since we already know how to find all the spec files we can find the directories which contain them fairly simply:
find origin-server -name \*.spec | xargs -I {} dirname {}
This will produce a list of directories which contain potential packages. We can just loop over that and call tito in each one to build the packages.
for PKGDIR in $(find origin-server -name \*.spec | xargs -I {} dirname {}) ; do
(cd "$PKGDIR" && tito build --rpm)
done
This will change to each directory, build the RPM and place it in a yum repository at /tmp/tito. You can change where the output goes either by adding -o <directory> to the tito command or by setting a variable named RPMBUILD_BASEDIR in the build user's ~/.titorc file.
The oo-rake script provides a target, all:rpm, which will build all of the packages in the tree below it. You can provide arguments to rake targets. The first argument to the all:rpm task is the destination for the packages.
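A plain loop keeps going silently when one package fails to build. A slightly more defensive version of the same idea, sketched below, reports the directories that failed and returns a non-zero status. The function name build_all and its calling convention are my own assumptions and are not what oo-rake does internally. The build command defaults to tito build --rpm but can be replaced on the command line.

```shell
# build_all ROOT [COMMAND...]  (hypothetical helper, not part of origin-server)
# Run COMMAND (default: tito build --rpm) in every directory under ROOT that
# contains a .spec file. Print "FAILED: <dir>" for each directory where the
# command fails and return non-zero if any did.
build_all() {
    root="${1:-origin-server}"
    if [ $# -gt 0 ]; then shift; fi
    # With no command given, fall back to the release build.
    if [ $# -eq 0 ]; then set -- tito build --rpm; fi
    status=0
    # Word splitting on the find output breaks on paths containing spaces.
    for pkgdir in $(find "$root" -name '*.spec' -exec dirname {} \; | sort -u); do
        # The subshell keeps each cd from affecting the next iteration.
        if ! (cd "$pkgdir" && "$@"); then
            echo "FAILED: $pkgdir"
            status=1
        fi
    done
    return "$status"
}
```

For example, build_all origin-server runs the default build everywhere, while build_all origin-server tito build --rpm -o /srv/repo would send the output to /srv/repo (a made-up path).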
Git Tags and Test RPMs
Tito depends on git and specifically on release tags. If you get any messages indicating that a tag is missing for a package, fetch the tags from your git repository as well:
cd origin-server ; git fetch origin --tags
When you run the all:rpm target tito will build tagged release packages. That is, it will build from the last tagged commit. If you have checked in new versions of files, they will not be used.
To build packages from the head of the current branch, you want to build test packages.
The oo-rake script provides another target, all:testrpm, which will build test packages for the entire tree (and place them in the yum repository). Test packages get hashed names so that yum update will install the newer packages from the repository.
Publishing the Yum Repository
You don't have to publish the RPMs in the yum repository but you have to make the RPMs available somehow. I'm going to add a step here to make the yum repo available by HTTP using a lightweight http server, thttpd.
By default thttpd serves the contents of /var/www/thttpd. I want it to serve /tmp/tito. A single line sed command makes the adjustment:
sed -i -e 's|^dir=.*$|dir=/tmp/tito|' /etc/thttpd.conf
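If you'd rather keep a copy of the original configuration, the same substitution can be wrapped so that sed leaves a .bak file behind. This is a sketch under assumptions: set_thttpd_dir is a made-up name, the configuration file is assumed to live at /etc/thttpd.conf (check what your thttpd package installs), and the new directory must not contain a | character because that is used as the sed delimiter.

```shell
# set_thttpd_dir CONFIG NEWDIR  (hypothetical helper)
# Rewrite the "dir=" line of a thttpd config file in place, keeping a
# ".bak" copy of the original next to it.
set_thttpd_dir() {
    sed -i.bak -e "s|^dir=.*\$|dir=$2|" "$1"
}
```

On the build server this would be called as set_thttpd_dir /etc/thttpd.conf /tmp/tito, and a diff against /etc/thttpd.conf.bak shows exactly what changed.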
Fedora 18 comes with the firewall daemon limiting remote access. We have to open access to port 80 so that thttpd can answer queries.
firewall-cmd --zone public --add-service http
We just have to enable and start the thttpd service and servers will be able to pull from it.
systemctl enable thttpd
systemctl start thttpd
If you have a web server established you could instead use rsync or something like it to move the build results to the web server.
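Whichever way the repository is published, each OpenShift Origin server needs a yum repo definition that points at it. A minimal sketch follows. The function name write_repo_file, the repo id openshift-origin-local, the default file name and the host build.example.com are all placeholders of mine, and gpgcheck is turned off on the assumption that your locally built packages are unsigned. The base URL has to point at the directory that actually contains the repodata, which depends on where your build output ends up.

```shell
# write_repo_file BASEURL [OUTFILE]  (hypothetical helper)
# Write a yum repository definition pointing at BASEURL. OUTFILE defaults
# to a file under /etc/yum.repos.d, which requires root to write.
write_repo_file() {
    cat > "${2:-/etc/yum.repos.d/openshift-origin-local.repo}" <<EOF
[openshift-origin-local]
name=OpenShift Origin local builds
baseurl=$1
enabled=1
gpgcheck=0
EOF
}
```

On a server box, write_repo_file http://build.example.com/ followed by yum clean metadata should make the local builds visible to yum.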
What This Doesn't Include
This is just the barest minimum information to build OpenShift Origin RPMs on Fedora 18. There are a bunch of tasks that aren't handled:
- Triggering automatic re-build on developer commit
- Running unit tests
- Interpreting and handling build errors
- Handling new package build requirements
- Installing and configuring OpenShift servers
This should be enough though for someone who wants to extend or contribute to the OpenShift Origin project and needs to build their own packages.
References