Friday, November 16, 2012

OpenShift Back End Services: Data Store (MongoDB)

In the previous post, I listed the set of back end services which underpin an OpenShift Origin service.
In this one I'm going to detail the installation of the first of these back end services and initialize it for OpenShift Origin.

Data Persistence

The OpenShift Origin service needs a backing data store to contain persistent data.  It must keep track of the metadata for users' applications, the available nodes and their capabilities, and the SSH public keys provided by users to allow them git and ssh access to their apps.

Currently OpenShift Origin uses a MongoDB back end for data persistence.  Before you can start building your OpenShift Origin service you need to have a running MongoDB that can be reached by the broker service on the OpenShift Origin broker host(s).

There are lots of good resources about administration and use of MongoDB.  If you're going to run an OpenShift Origin service yourself you should keep them handy.  Check the References section at the end of this (and each) post.

In this post I'm going to walk through preparing the Data Storage host.  Except for a couple of commands to create the user accounts and an empty database, there isn't anything really databasey about this procedure.

Information Gathering

From the table in the last post, we have the hostname and IP address of the broker host and the MongoDB server.

We're going to install and configure the MongoDB service to permit the OpenShift Origin broker service to access, read and write the OpenShift database.

The MongoDB service runs on port TCP 27017.

I'm going to create a root account in the MongoDB admin database and enable authentication.  This will allow two-layer access control to the data.  I'll create the empty OpenShift Origin database so that it is ready for the broker to connect, and I will also create a role user in the openshift database for the OpenShift Origin broker to use.  I've chosen values for these below; you should choose different passwords.

I will also need the IP address of the broker host so that I can use the iptables firewall to limit inbound connections to only that host.

So here's the information we have:

MongoDB Setup Information

  Function              Hostname              IP Address
  Broker Host           broker1.example.com   192.168.5.11
  Data Storage Host     data1.example.com     192.168.5.8

  MongoDB TCP Port          27017
  OpenShift Database Name   openshift

  Account                   Username    Password
  MongoDB Privileged User   root        dbadminsecret
  OpenShift Database User   openshift   dbsecret

Preparing the Base

In each of these setups I assume that the hosts have a base operating system installed and configured.  I'm working with RHEL 6.3 and Fedora 17, but some of the packages are not yet publicly available for those (as of November 15, 2012).  They will be available with the release of Fedora 18 at the end of the month, and from either EPEL or the RHEL repositories.

I start with the Base package list, ntpd, policycoreutils-python and whatever packages are needed for central user control.  These are Kerberos 5 and LDAP in my case (krb5-workstation, pam_krb5 and openldap-clients).  For me this works out to between 300 and 425 packages.  Right now I have to add a set of yum repositories for the pre-release builds, but these should not be necessary once all of the packages are released for both RHEL and Fedora.
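As a sketch, the extras on top of the Base group can be pulled in with a single yum call.  The package names are the ones mentioned above; the Kerberos and LDAP clients reflect my central-auth setup, so swap in whatever yours needs:

```shell
# Install the extras described above: time sync, SELinux policy tools,
# and (in my case) the Kerberos/LDAP client packages for central users.
yum -y install ntp policycoreutils-python \
    krb5-workstation pam_krb5 openldap-clients
```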

I also generally bring my base system up to date before beginning any other work.

yum -y update

The Process

There are several steps to preparing MongoDB for the OpenShift Origin service. I have to make sure it will start correctly, that it is secured and that it is initialized so that when the broker first connects the database is ready to accept updates.  I also need to save the information that the broker will need to establish a connection as I'll need that to configure the broker plugin.

The steps look like this:
  • Install the MongoDB server software
  • Enable authentication
  • Add an administrator account
  • Add the empty OpenShift database
  • Add the OpenShift broker user account
  • Add firewall rules
  • Enable listening on external interfaces
  • Enable service restart on boot
Each of these steps is fairly small and the whole thing should be easily scriptable. It also should be fairly simple to script a set of checks to verify that the database service is properly configured and secured. These scripts can be used to check the underlying database service configuration in the event of a problem with the OpenShift service.

Note that I try to perform the steps in such a way that the service is never exposed to an external network until authentication is enabled and configured and the firewall rules have been established.  This prevents cracking attempts during the (admittedly tiny) window when anonymous access would succeed.

Installing the Software

The first thing to do is to install the mongodb-server software. This will pull in several pre-requisites as well.

yum -y install mongodb-server

Enable Authentication

The MongoDB configuration file is /etc/mongodb.conf. The configuration is a simple space/line delimited key/value pair format. Authentication is off by default. The auth section looks like this:


...
# Turn on/off security.  Off is currently the default
#noauth = true
#auth = true
...

The first thing to do is make a copy of the original configuration file in case I mess up.
I need to uncomment the "auth = ..." line. You can do this with an editor, but I generally do this for simple changes with a line editor like sed(1). I'm also going to sneak in one other tuning parameter here that OpenShift wants but isn't related to security or service management: The smallfiles parameter needs to be added to the end of the file and set to true.
cp /etc/mongodb.conf /etc/mongodb.conf.orig
sed -i -e 's/^#auth = true/auth = true/' /etc/mongodb.conf
echo "smallfiles = true" >> /etc/mongodb.conf

Throughout these pages you'll see patterns like that: "save a copy, make a change or two".
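A quick way to confirm both changes took effect before going any further.  Nothing MongoDB-specific here, just grep against the config file:

```shell
# Both greps should print a line; if either is silent, re-check the
# sed and echo commands above against /etc/mongodb.conf.
grep '^auth' /etc/mongodb.conf
grep '^smallfiles' /etc/mongodb.conf
```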

Add Accounts and Empty Database

To add accounts and initialize the database it must be running.  Once it is, I'll use the mongo(1) CLI client tool to make the changes I need.
The mongodb process needs a few seconds to start and begin listening, so there's a sleep after starting the daemon and before trying to access the database.
There are three steps here:

  • Create the administrator (root) account.
  • Create the OpenShift Origin server database
  • Create the OpenShift Origin broker role account.
MongoDB will actually create a database just by being told to use it even if it doesn't exist, so the steps are actually simpler. I'm also going to stop the database service once this is done so that I can open it (carefully) to external network access.

Note that the passwords and the OpenShift Origin database name come from the table of values at the beginning of this post.  You need to change the passwords, and you can select a different database name if you wish (keep track of it; you'll need it later).

# Start the mongod service (obviously)
service mongod start

# Wait for the service to be ready to listen
sleep 10

# Connect, create root account, authenticate, create database and role account.
mongo admin << EOMONGO
db.addUser('root', 'dbadminsecret');
db.auth('root', 'dbadminsecret');
use openshift;
db.addUser('openshift', 'dbsecret');
EOMONGO

# Stop the service again.
service mongod stop

In the past I would have also created a read-only user in the admin database. This would have allowed full db backups without read-write access. Now I should probably create a read-only user in the openshift database to back up just the one database.
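A sketch of that read-only backup user, assuming the MongoDB 2.2 shell where addUser() accepts an optional third readOnly argument.  The 'backup' username and its password here are made up for illustration:

```shell
# Authenticate as the openshift role user, then add a hypothetical
# read-only user to the openshift database; the third argument to
# addUser() marks the account read-only.
mongo openshift << EOMONGO
db.auth('openshift', 'dbsecret');
db.addUser('backup', 'backupsecret', true);
EOMONGO
```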

Open a small hole in the firewall

By default the mongod process only listens on the localhost interface. Since it also starts with no authentication this is a good thing for security. Now I want to allow the service to listen for outside connections. However I only want to allow connections from the OpenShift Origin broker host. I'll use an iptables entry to restrict inbound connections and then tell the daemon to listen on the external interface.

If you wanted to allow unrestricted inbound connections you could just use lokkit, but we want to restrict access to a single host (or a small set of inbound hosts), so we need to configure the IP tables deliberately.

It would be handy to have the iptables(8) man page open here for reference.

I want to allow inbound connections only from the broker host. I'll have to craft an iptables line which will do that. I have the IP address of the broker host and the TCP port to which mongod listens by default. I'll use those to craft the appropriate rule.

-A INPUT -s 192.168.5.11/32 -m state --state NEW -m tcp -p tcp --dport 27017 -j ACCEPT

This line means:

"Append an entry to the INPUT queue. Match NEW connections from 192.168.5.11. Accept TCP packets destined for port 27017"

I'd like to add this line to the end of the INPUT queue, but actually not the last line. The last line is the one that says "anything not matched by now, reject it". I want that to stay at the end. I want to insert my allow line just before that.

I could craft a clever sed or awk command to line edit the /etc/sysconfig/iptables file. iptables itself provides me with a better way. I can use the iptables control commands to determine which rule I want to insert at, add the line to the running tables and then have iptables dump the result to a file. I can save that as the new /etc/sysconfig/iptables file so that it will restore my new rule set at system startup.

I can use the iptables command first to list the existing ruleset. I can count the number of rules with wc -l and use expr to subtract one. That becomes the rule number for my insert command.

# Save a copy of the original
cp /etc/sysconfig/iptables /etc/sysconfig/iptables.orig

# Find the index of the next-to-last rule in the INPUT ruleset
INSERT_INDEX=$(expr $(iptables -S INPUT | wc -l) - 1)

# Add a new rule at N-1
iptables -I INPUT $INSERT_INDEX -s 192.168.5.11/32 -m state --state NEW -m tcp -p tcp --dport 27017 -j ACCEPT

# dump a copy of the current rules and save them for next reboot
iptables-save > /etc/sysconfig/iptables
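It's worth eyeballing the result before moving on.  Listing the chain with rule numbers should show the new 27017 entry sitting just above the final catch-all:

```shell
# Show the INPUT chain with rule numbers; the dpt:27017 ACCEPT rule
# should appear immediately before the final REJECT rule.
iptables -L INPUT -n --line-numbers
```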

Listen on external interfaces

Now that the authentication and firewall are in place, I can reconfigure mongod to listen on all interfaces. We're back to the easy stuff.

The bind_ip option in the /etc/mongodb.conf file specifies where the mongod will bind. By default this is 127.0.0.1. The IPv4 convention for "all addresses" is 0.0.0.0. I'll replace the value of bind_ip in /etc/mongodb.conf and I'll be ready to restart the service. This is a simple sed script again.

sed -i -e '/^bind_ip =/s/=.*/= 0.0.0.0/' /etc/mongodb.conf
service mongod start
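One quick check that the bind change stuck: ask the kernel what mongod is listening on.  After the restart it should show 0.0.0.0:27017 rather than 127.0.0.1:27017:

```shell
# Confirm mongod is now bound to all interfaces on its default port.
netstat -tlnp | grep 27017
```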

Access and Security Verification

Once the service is running I want to make attempts to connect to it from four different sources.
  1. From the datastore host (localhost) - allowed
  2. From the datastore host (data1.example.com) - rejected
  3. From the broker host - allowed
  4. From an unauthorized host - rejected
I also want to connect both to the admin database and to the openshift database from the two sources that should not be blocked by the firewall.

Note that, because the firewall rule did not include an allow clause for localhost as the source, you can only connect to the database using the localhost interface if you are on the datastore host.

The test commands below consist of an echo command which writes a string to standard output. The string is a single mongodb CLI command.  The string is piped as input to the mongo CLI client.  The argument to the mongo command is the host and database to be opened. 

# admin, root user on localhost
echo 'db.auth("root", "dbadminsecret");' | mongo localhost/admin
MongoDB shell version: 2.2.1
connecting to: localhost/admin
1
bye

# openshift, openshift user on localhost
echo 'db.auth("openshift", "dbsecret");' | mongo localhost/openshift
MongoDB shell version: 2.2.1
connecting to: localhost/openshift
1
bye

# admin, root user on external interface
echo 'db.auth("root", "dbadminsecret");' | mongo data1.example.com/admin
MongoDB shell version: 2.2.1
connecting to: data1.example.com/admin
Fri Nov 16 21:28:45 Error: couldn't connect to server data1.example.com:27017 src/mongo/shell/mongo.js:93
exception: connect failed

# openshift, openshift user on external interface.
echo 'db.auth("openshift", "dbsecret");' | mongo data1.example.com/openshift
MongoDB shell version: 2.2.1
connecting to: data1.example.com/openshift
Fri Nov 16 21:28:45 Error: couldn't connect to server data1.example.com:27017 src/mongo/shell/mongo.js:93
exception: connect failed
I would try those two tests on the broker host and on some external host to verify access. To be rigorous I should probably also add a test to each with an invalid user, and with a valid user but an invalid password.
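Those negative tests are easy to sketch.  With authentication enabled, db.auth() should print 0 instead of 1 when the credentials are bad (usernames come from the table above; the wrong password is obviously made up):

```shell
# Valid user, wrong password: db.auth() should print 0, not 1.
echo 'db.auth("root", "wrongpassword");' | mongo localhost/admin

# Nonexistent user: should also print 0.
echo 'db.auth("nosuchuser", "whatever");' | mongo localhost/admin
```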

Don't Do This (Without Re-Compiling with SSL)

Having gone through this exercise to try to show how the MongoDB back end can be disentangled from the OpenShift Origin service configuration, I have to say: Don't Do It.

The last step should have been to establish an encrypted pipe between the broker host and the datastore database.  It turns out that MongoDB and most NoSQL databases don't think they should be doing encryption, so they don't.  While they tout the benefits of distribution and sharding and other multi-host behaviors, I really can't recommend allowing MongoDB to play in the street with the other kids.  Both the authentication information and the data go over the wire in clear text.

In MongoDB: The Definitive Guide from O'Reilly, the authors recommend an SSH tunnel if you really need point-to-point encryption.  I don't find this very satisfying.
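For completeness, the SSH tunnel the book suggests would look something like this, run on the broker host (assuming you have a shell account on data1.example.com; the admin username is hypothetical).  The broker then connects to localhost:27017 and the traffic crosses the wire inside SSH:

```shell
# Forward local port 27017 through SSH to the mongod on the datastore
# host; -N runs no remote command, -f backgrounds after authentication.
ssh -f -N -L 27017:localhost:27017 admin@data1.example.com
```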

The MongoDB documentation web site has a page on using MongoDB with SSL.  It requires recompiling the package with an SSL build option.  The documentation does not give a detailed set of instructions for re-compiling; you have to work that out for yourself.

If you install the resulting package you then have to mark your yum repository so that it doesn't update the mongodb RPM.  Updates from the stock repositories will destroy your SSL configuration.  When an update is indicated, you'll have to download the new source RPM, rebuild again and manually update the package.

For now, if you don't recompile with SSL I'd suggest that OpenShift servers be restricted to a single broker with a single unreplicated database on a single host.

The exercise remains a good one.  You can follow the steps listed here using localhost for all of the datastore host addresses and you'll see the parts of the setup that are specific to the datastore as distinct from the OpenShift Origin service proper.

Next Up

Next up will be the Messaging service using ActiveMQ.

References

3 comments:

  1. Note that unfortunately, OpenShift didn't quite make it into Fedora 18. http://fedoraproject.org/wiki/Features/OpenShift_Origin

    "Don't do this" seems a bit harsh to me. I know ssh tunnels are one more point of failure, and I get that if someone has network access to listen in on the mongo connection, they can sniff everything. So, definitely consider network topology and security when installing a MongoDB replica set. But you have to provide access somehow. If you're worried about attack vectors, and you've locked down network access to the installed hosts (keeping the node hosts on a separate segment, using switched network, or other method of preventing sniffing of broker traffic), the broker itself seems like the main attack vector, and if someone were to crack that, they'd have access to everything else anyway.


  2. Yep. "Don't do this" is harsh. It's meant to be.

    Of course, it's just my paranoid opinion.

    For now, I wouldn't recommend placing the MongoDB anywhere but on the same host as the broker application. As you said, if the bad guy gets that they get everything anyway. Adding separate database hosts is just adding another attack vector.

    I think the idea of someone listening on the network connection is entirely likely. The idea of trusting firewalls or router security when configuring network services makes no sense to me. Previous experience leads me to assume that every network is untrusted. I expect the network mangers to assume that every host is cracked when they make their plans.

    Undoubtedly someone will accidentally connect these services over untrusted networks without encryption because the authors don't address it.

    I assume, when I configure a service for security at any level, that I *must* have missed something. It's my opinion that any critical data path between hosts must be encrypted regardless of the environment *because I make mistakes*.

    One of the promises of MongoDB is that the data can be distributed widely, and that implies distribution over untrusted networks. People whose primary job is not security will undoubtedly neglect to consider it. It's perfectly understandable and it's to be expected.

    That said, I'm sure there are reliable ways to encrypt the communication end to end that I'm not aware of. I'm still looking into how to do it. I expect that inserting an openssl wedge between the database and the external interface may do the trick.

    Regardless of the flaws in integrated SSL encryption, I still think it's the best possible option currently and to ignore it on the grounds that it has flaws while promoting a distributed service is... a problem. I'd love the MongoDB authors to add it.

    1. OK, Time to eat some of my words. A little more research shows that you can configure MongoDB with SSL but that SSL service is not compiled in by default. Adding it requires rebuilding the package from source. Any stock updates after that will overwrite the updates and will require another rebuild.

      I think I can still safely say, "don't do *this*, but there is another way if you're willing".

      I still find it sub-optimal.
