Sandpit in OpenShift v3

Introduction

I have been doing an OpenShift v3 PoC for a customer, and one member of the customer's staff came up with an interesting use case for OpenShift, one that differs from what customers usually use it for. The idea is to be able to quickly spawn a container to evaluate or validate a package or configuration and to have it destroyed afterwards. The customer understood perfectly well that all modifications are lost when the container terminates, and he was fine with that. All he wanted was a sandpit, similar to what can be done with this docker command:

# sudo docker run -i -t rhel7 /bin/bash

The command just starts a RHEL image with a bash process.

The rationale behind doing this in OpenShift rather than directly with docker is that, on the one hand, sudo rights are needed to run docker on a server, and these will only be granted to a limited number of employees. On the other hand, the team in charge of managing workstations may not be willing to provide support for docker on staff computers. OpenShift provides fine-grained management of user rights and allows privileged operations to be executed in a secure way.

The OpenShift way

A container running a bash process can easily be started in OpenShift with a command similar to the following one.

# oc run --restart=Never --attach --stdin --tty --image=rhel7 rhel7 -- /bin/sh

Where things get more complicated is when the user wants to install new packages. This is generally not possible for security reasons when the container gets started in OpenShift by a standard user. A way to work around that is to create a Dockerfile extending the base image with the required packages. The image build will be launched by the builder service account, which is able to run privileged containers for this purpose. Here is an example of such a Dockerfile.

# RHEL 7.2 image extension for my Sandpit
FROM rhel7.2
#MAINTAINER Your Name <yourname@yourcompany.com>

RUN INSTALL_PKGS="tar httpd" && \
    yum install -y $INSTALL_PKGS && \
    rpm -V $INSTALL_PKGS && \
    yum clean all

CMD ["/bin/bash"]

Once the image has been built you can “oc run” it as shown earlier. That’s it!
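One way to run the build inside OpenShift itself is a binary docker build, assuming the Dockerfile above sits in the current directory (the build name "sandpit" is just an example):

```shell
# Create a build configuration that builds a docker image from
# sources uploaded directly by the client ("binary" source).
oc new-build --binary --strategy=docker --name=sandpit

# Upload the current directory (containing the Dockerfile) and
# follow the build logs until completion.
oc start-build sandpit --from-dir=. --follow
```

The resulting image lands in the sandpit image stream of the current project and can then be started with "oc run" as shown earlier.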

Tips for automated performance tests

Introduction

Depending on the application field, non-functional requirements like response time, throughput or availability may be as important as functional ones. Not being able to fulfil them may make an application unfit for production.
Because of that, it may be critical to keep an eye on response times and throughput during the development and maintenance life cycles. This is best addressed by automated tests, which can be orchestrated by a continuous integration tool like Jenkins.
In this blog post I don't want to write about big designs, complex infrastructure or comparisons between tools. My aim is more humble: I just want to mention two pitfalls you may hit when integrating performance or load tests into your favourite CI tool, and to give possible workarounds.

Be on time

You want to start your load tests when your application has fully started, but not before. Sure, you can simply allow enough time for your application to start before launching your tests, but it is not much more complicated to watch for an event stating that the application has finished starting. Most applications write something to their logs at that point. On a Linux system like RHEL it is straightforward to watch the logs, so you may not need to make any change to your log monitoring system if you happen to use something like Logstash or Graylog. This short shell script does the job:

#!/bin/bash
# start-monitor.sh
FILE=${1:-server.log}
SEMAPHORE=${2:-JBAS015874}

if [ ! -f "$FILE" ]; then
    echo "File could not be found!"
    echo "Usage: start-monitor.sh FILE SEMAPHORE"
    exit 1
fi

tail -f "$FILE" | while read LOGLINE
do
    [[ "${LOGLINE}" == *"${SEMAPHORE}"* ]] && pkill -P $$ tail
done
echo "Server started, tests can begin!"

The script is mostly self-explanatory: the trick is to read from tail and check each line for the start-completed event. Note that with JBoss AS 7 / JBoss EAP 6 the corresponding message code is JBAS015874.
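As a minimal, self-contained illustration of the same tail-and-watch technique, the following sketch simulates a server log (file name and log lines are invented) and unblocks as soon as the semaphore appears:

```shell
#!/bin/bash
# Simulate a server log: an ordinary boot line, then the "started" marker.
LOG=$(mktemp)
SEMAPHORE=JBAS015874
echo "Booting the application server..." >> "$LOG"
echo "JBAS015874: server started (simulated)" >> "$LOG"

# Watch the log until the semaphore shows up, then kill tail to unblock.
tail -f "$LOG" | while read LOGLINE
do
    [[ "${LOGLINE}" == *"${SEMAPHORE}"* ]] && pkill -P $$ tail
done
STATUS=ready
echo "Server started, tests can begin!"
rm -f "$LOG"
```

In a CI job this is the point where the load test run would be launched.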

Quick but not too quick

So we are now able to start our load tests once we are sure that the application has started. This is already a good point. But some time ago I was doing performance testing at a customer site, and it reminded me that, from a statistical point of view, in a population of decent size a few elements may have a significant impact on the global outcome. Among telcos it is quite usual to monitor the 95th percentile of the round-trip time and other metrics; by doing so, extreme values get filtered out. What does this have to do with our load tests? We are already taking care of the fact that the application is fully started. Fully started, yes, but initialised?
An application server like JBoss EAP may, depending on its configuration, load modules on demand to reduce start-up time. In the same way, thread or connection pools may only get populated when the need arises. Are you using a second-level cache or some other type of cache? Values first need to be loaded into it before database round trips can be avoided.
Coming back to my customer: he was simulating 60 parallel user sessions, with 4 web services being called repeatedly.

Starting the tests right after server start-up gave the following results:

            number of runs   max time (ms)   average time (ms)
Service 1   ~50              31000           3750
Service 2   ~4000            23000           560
Service 3   ~2000            22000           310
Service 4   ~1500            11000           260

A second time, we started recording measures only after we had performed a few calls on each service, and the results looked quite different:

            number of runs   max time (ms)   average time (ms)
Service 1   ~50              1400            1150
Service 2   ~4000            2000            50
Service 3   ~2000            1600            50
Service 4   ~1500            850             45
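The filtering effect of the 95th percentile mentioned earlier can be sketched with a few lines of shell (the sample values and file name are invented; the percentile is a simple nearest-rank approximation that assumes a reasonably large sample):

```shell
#!/bin/bash
# Invented sample of response times in ms, one per line; the 31000 value
# simulates a single slow "first call" outlier.
printf '%s\n' 120 95 110 105 31000 100 98 102 115 108 > times.txt

# Nearest-rank style approximation of the 95th percentile: sort the
# samples and pick the value at 95% of the sample count.
P95=$(sort -n times.txt | awk '{ a[NR] = $1 } END { print a[int(NR * 0.95)] }')
MAX=$(sort -n times.txt | tail -1)

echo "max=${MAX}ms p95=${P95}ms"
rm -f times.txt
```

With these ten invented samples the maximum is dominated by the single 31000 ms outlier, while the 95th percentile stays at 120 ms, which is exactly the filtering effect described above.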

I am not saying that edge conditions should not be considered, but know what you are measuring, otherwise you may get surprised… as I did.