Combining Jenkins’ Job DSL and Shared Libraries for Docker Image Pipelines

This is another article about Docker pipelines with Jenkins.
I had already written about Jenkins shared libraries as a way to standardize pipelines across an organization. This article can be seen as a complement, in an OpenShift context.

OpenShift is a solution based on Kubernetes that brings additional features such as a portal, a catalog of solutions (including Jenkins), many security features and new K8s resources. Among these new resources is the notion of a build config. A build config object associates a source configuration with a Jenkins project. Basically, you write a YAML file indicating, for example, the location of a Git repository, the branch you want to build, and so on. You then pass it to the oc client (oc apply -f my.yaml) and it creates a Jenkins project. You will find more information about build configurations in the OpenShift documentation.
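
As an illustration, here is a minimal sketch of such a file, piped directly to the oc client. The repository URL and names are placeholders, and the exact fields may vary with your OpenShift version.

# Minimal BuildConfig using the JenkinsPipeline strategy (illustrative values).
cat <<'EOF' | oc apply -f -
apiVersion: build.openshift.io/v1
kind: BuildConfig
metadata:
  name: my-docker-project-pipeline
spec:
  source:
    git:
      uri: https://some.gitlab/path/to/a/repository.git
      ref: master
  strategy:
    type: JenkinsPipeline
    jenkinsPipelineStrategy:
      jenkinsfilePath: Jenkinsfile
EOF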

The problem I had with build configs is that the generated project in Jenkins is a simple one.
It does not support multiple branches. Generating a multi-branch pipeline would be much better, but it is not feasible with build configs. I was then advised to look at Jenkins’ Job DSL. It relies on a Jenkins plug-in that, from a seed project, can populate and configure Jenkins projects. Thanks to this plug-in, I quickly got my Jenkins configuration as code.

Using the Job DSL

The first thing to do is to create a seed project in Jenkins.
It is a free-style project with a build step that executes a Job DSL script. Everything is documented here.

The following script is stored in a Git repository.
I copy it into my seed’s configuration and that’s it. Here is what it looks like…

def gitBaseUrl = "https://some.gitlab/path/to/a/my/gitlab/group"
def gitRepos = [
	["docker-project-name-1":"path-to-project-1"],
	["docker-project-name-2":"path-to-project-2"]
]

for (gitRepo in gitRepos) {
	for ( e in gitRepo ) {

		// Create Jenkins folders and reference shared libraries
		folder("${e.value}") {
			properties {
				folderLibraries {
					libraries {
						libraryConfiguration {
							name("my-library-for-docker")
							defaultVersion('master')
							implicit(false)
							allowVersionOverride(true)
							includeInChangesets(true)
							retriever {
								modernSCM {
									scm {
										git {
											remote('https://some.gitlab/path/to/a/jenkins/library')
											credentialsId('my-credentials-id')
										}
									}
								}
							}
						}
					}
				}
			}
		}

		// Create multi-branch pipeline projects
		multibranchPipelineJob("${e.value}/${e.key}") {
			branchSources {
				branchSource {
					source { git {
						id("${e.value}-${e.key}")
						remote("${gitBaseUrl}/${e.value}/${e.key}.git")
						credentialsId('middleware-jenkins-gittoken')
					}}
					
					strategy {
						defaultBranchPropertyStrategy {
							props {
								// Do not trigger build on branch scan
								noTriggerBranchProperty()
							}
						}
					}
				}
			}
			
			// Listen to changes in branches and new tags
			configure { node ->
				node / sources / data / 'jenkins.branch.BranchSource' / source / traits {
					'jenkins.plugins.git.traits.BranchDiscoveryTrait'()
					'jenkins.plugins.git.traits.TagDiscoveryTrait'()
				}
			}

			// Verify new branches and new tags everyday
			triggers {
				cron('@daily')
			}

			// What to do with old builds?
			orphanedItemStrategy {
				discardOldItems {
					numToKeep(10)
					daysToKeep(-1)
				}
			}
		}
	}
}

Here, I explicitly reference the projects I want to generate.
I reuse Git’s folder structure and project it into Jenkins. Shared libraries can be defined globally in Jenkins, but also on folders. The Job DSL does not allow configuring global settings, but it does allow configuring them on folders. The multi-branch pipelines are quite easy to understand. We consider both branches and tags.

If you need to customize the script, you can find examples on GitHub and, above all, in your own Jenkins instance.

Once the script is set in your seed project, just build it in Jenkins and it will update your Jenkins projects. The build is idempotent. You can run it as many times as you want; it will overwrite the current settings. So, if you need to update the DSL script, just do it and run a new build, and everything will be updated. This is useful if you add new shared libraries. In the same way, the Jenkins plug-in tracks the projects it has created. So, if I remove a project from my list, the default behavior will not delete the related Jenkins project but only disable it (it becomes read-only).

I have investigated referencing the script directly instead of copying it.
It seems you cannot automatically propagate changes from the sources; you have to approve the changes first. I guess this is a security feature. I have not searched a lot about this; it is not a big matter for us for the moment.

Normalized Pipeline for Docker Images

The Job DSL defines a shared library at the folder level.
Here is a simple pipeline library (myDockerPipeline.groovy) for our Docker images. It performs the following steps:

1. Checkout the sources.
2. Verify some assertions on our Dockerfile.
3. Build the image.
4. Publish it in the right repository (not the same for branches and tags).
5. Perform an AQUA analysis of development images. We assume another pipeline handles images built from a tag (not shown here).

There are no tests in this pipeline, although we could add some.

Pipeline for Docker images

In Jenkins, it is achieved with…

// Shared library that defines the generic pipeline for Docker images.
def call( Map pipelineParams ) {

	// Basic properties
	def tokenString = pipelineParams.gitRepoPath.replace('/', '-') + "--" + pipelineParams.gitRepoName
	def imageName = pipelineParams.gitRepoName.replace('-dockerfile', '')
	def label = 'base'

	// Complex properties (configure build trigger through an URL and a token)
	properties([
		pipelineTriggers([
		[$class: 'GenericTrigger',
			genericVariables: [
				[key: 'ref', value: '$.ref'],
				[
					key: 'before',
					value: '$.before',
					expressionType: 'JSONPath',
					regexpFilter: '',
					defaultValue: ''
				]
			],
			genericRequestVariables: [
				[key: 'requestWithNumber', regexpFilter: '[^0-9]'],
				[key: 'requestWithString', regexpFilter: '']
			],
			genericHeaderVariables: [
				[key: 'headerWithNumber', regexpFilter: '[^0-9]'],
				[key: 'headerWithString', regexpFilter: '']
			],
			
			causeString: 'Triggered after a change on $ref',
			token: "${tokenString}",
			printContributedVariables: true,
			printPostContent: true,
			regexpFilterText: '$ref',
			regexpFilterExpression: 'refs/heads/' + BRANCH_NAME
		]
		])
	])

	podTemplate(label: label, cloud: 'openshift', containers: [
		containerTemplate(
			name: "jnlp",
			image: "my-jenkins-jnlp:v3.11",
			envVars: [
				envVar(key: 'ENV_DOCKER_HOST', value: 'remote-docker-engine'),
				envVar(key: 'ENV_LOCAL_IMG_NAME', value: 'my-team/' + imageName),
				envVar(key: 'ENV_DEV_IMG_NAME', value: 'my-team/dev/' + imageName),
				envVar(key: 'ENV_RELEASE_IMG_NAME', value: 'my-team/releases/' + imageName)
			]
		),
		containerTemplate(
			name: "aqua",
			image: "our-aqua-image:v3.11",
			command: 'cat', 
			ttyEnabled: true,
			envVars: [
				envVar(key: 'ENV_DEV_IMG_NAME', value: 'my-team/dev/' + imageName)
			]
		)
	],
	serviceAccount: "jenkins") {
		node(label) {
			container(name: 'jnlp') {

				// Checkout
				stage('Checkout') {
					checkout scm
				}
				
				// Lint
				stage('Linting') {
					// Do we have the right labels in the Dockerfile?
					verifyDockerfile()
				}

				// Build
				stage('Build') {
					sh 'docker -H "${ENV_DOCKER_HOST}" build -t "$ENV_LOCAL_IMG_NAME" .'
				}

				// Stages executed for a TAG
				if(env.TAG_NAME) {
					stage('Publish') {

						sh '''#!/bin/bash
						# Push to the releases repository
						docker -H "${ENV_DOCKER_HOST}" tag \
							"$ENV_LOCAL_IMG_NAME" \
							"${ENV_RELEASE_IMG_NAME}":"${TAG_NAME}"

						docker -H "${ENV_DOCKER_HOST}" push \
							"${ENV_RELEASE_IMG_NAME}":"${TAG_NAME}"
						'''

					}
				}

				// Simple branch
				else if(env.BRANCH_NAME) {
					stage('Publish') {

						sh '''#!/bin/bash
						# Push to the development repository
						docker -H "${ENV_DOCKER_HOST}" tag \
							"$ENV_LOCAL_IMG_NAME" \
							"${ENV_DEV_IMG_NAME}":"${BRANCH_NAME}"

						docker -H "${ENV_DOCKER_HOST}" push \
							"${ENV_DEV_IMG_NAME}":"${BRANCH_NAME}"
						'''

					}
				}
			}

			// Here, we use the AQUA plug-in to scan images
			// (we reference a remote AQUA installation).
			container(name: 'aqua') {
				if(env.BRANCH_NAME) {
					stage("Hosted : Aqua CI/CD Scan Image") {
						ansiColor('css') {
							aqua customFlags: '',
							hideBase: false,
							hostedImage: "my-team/dev/${imageName}:${env.BRANCH_NAME}", // same value as ENV_DEV_IMG_NAME in the pod template
							localImage: '',
							locationType: 'hosted',
							notCompliesCmd: 'echo "The AQUA scan has failed."',
							onDisallowed: 'fail',
							showNegligible: true, 
							registry: 'our-registry-id',
							register: true
						}
					}
				}
			}
		}
	}
}

The main thing to notice is what is commented as the complex properties.
By default, our Jenkins installation has a global listener activated: https://our-jenkins-url/generic-webhook-trigger/invoke
So, anyone sending an HTTP notification to this address could trigger something. The question is: how do we use it to trigger a specific job, for a specific branch? We use a simple token for that. Here, the token is based on the repository name and location. As an example, our Git repository “some-path/some-project” will be associated with the following token: some-path--some-project

So, if someone notifies https://our-jenkins-url/generic-webhook-trigger/invoke?token=some-path--some-project, the job’s configuration will catch it. The other properties allow filtering on the right branch, so that only the right Jenkins job is triggered.
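
As an example, a webhook configured in the Git forge (or a manual test with curl) could hit this endpoint. The payload below is an illustrative subset of a push event, just enough for the $ref filter shown above; real payloads contain many more fields.

# Illustrative manual trigger of the generic webhook listener.
curl -X POST \
	-H "Content-Type: application/json" \
	-d '{"ref": "refs/heads/master", "before": "0000000000000000000000000000000000000000"}' \
	"https://our-jenkins-url/generic-webhook-trigger/invoke?token=some-path--some-project"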

Another element to notice is the custom verifyDockerfile library.
Here is its code (verifyDockerfile.groovy).

def call() {
	def dockerfileContent = readFile('Dockerfile')
	assert dockerfileContent.contains('LABEL maintainer=') : "No maintainer was found."
	assert dockerfileContent.contains('"common-mailbox@our-domain.com"') : "The maintainer must be common-mailbox@our-domain.com"
	// OK, the check is somewhat basic
}

It allows us to verify some parts of the Dockerfile: here, that it contains a line such as LABEL maintainer="common-mailbox@our-domain.com".
Finally, here is an example of our pipeline (Jenkinsfile).

@Library('my-library-for-docker') _

myDockerPipeline(
    gitRepoPath: 'repo-path',
    gitRepoName: 'repo-name'
)

This way, the content of our Jenkinsfile is minimalist.
We can update our library at any time without having to update the Jenkinsfile. No matter how many Docker images you maintain, you are sure all of them follow the same pipeline.

As a reminder, all the Groovy libraries must be located under the vars directory of the shared library repository.
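
For our example, the repository of the shared library (the one referenced in the Job DSL above) would therefore contain at least:

vars/
├── myDockerPipeline.groovy
└── verifyDockerfile.groovy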

About the Branch Source Plug-ins

You must have noticed I referenced the projects by hand at the beginning of the seed’s script.
It is possible to avoid this and use a plug-in that scans your sources directly. There are existing ones for GitHub, Bitbucket and GitLab.

You have to define an organization folder and its properties.
Once the seed is built, it will scan the Git forge and create multi-branch pipeline projects (for the branches that have a Jenkinsfile). Here is a sample for GitLab.

organizationFolder('GitLab Organization Folder') {
    displayName('GitLAB')

    // "Projects"
    organizations {
        gitLabSCMNavigator {
            projectOwner("my-gitlab-group")
            credentialsId("personal-token")
            serverName("my-gitlab-url")
            traits {
                subGroupProjectDiscoveryTrait() // discover projects inside subgroups
                gitLabBranchDiscovery {
                    strategyId(3) // discover all branches
                }
            }
        }
    }
	
    // "Project Recognizers"
    projectFactories {
        workflowMultiBranchProjectFactory {
            scriptPath 'Jenkinsfile'
        }
    }
	
    // "Orphaned Item Strategy"
    orphanedItemStrategy {
        discardOldItems {
            daysToKeep(10)
            numToKeep(5)
        }
    }
	
    // "Scan Organization Folder Triggers" 
    triggers {
        periodicFolderTrigger {
            interval('60')
        }
    }
}

As you can see, it is a little bit less verbose.
We have not chosen this approach though. Overall, the manual declaration is suitable for now. We also noticed some glitches with the GitLab plug-in, mainly about character encoding and avatars. This is not a big issue in itself; fixes will come for that.

End-to-end tests for applications in Kubernetes

This article aims to introduce a small library that eases end-to-end testing of applications in Kubernetes environments.

An overview of what already existed

When typing “kubernetes e2e” or “kubernetes end to end” in Google, the first result I got was about testing a K8s cluster or component. It is what the Kubernetes team uses to test the development of the Kubernetes project itself. This is not what I wanted. My goal was to test an application I packaged for K8s, not K8s itself.

Terratest is another solution I found. It has the same goal, but viewing this project made me realize I did not want a solution involving advanced programming. We have DevOps engineers that can develop and maintain operational aspects, but they are not many and most hardly know the Go language. All the team members learned kubectl and Helm commands easily. A scripting solution would be better. This would avoid choosing a programming language (Go, Java…) and thus a lot of arguing and reinventing.

Since we mostly had Helm packages, used by several internal projects, I tried to focus on Helm. I immediately found the solution used by the official Helm project. There are interesting parts, such as the linting and version checks. This can be convenient if you set up an internal Helm repository and want to make every chart follow the same rules. There are also commands to check an installation. Anyway, this is tailored for a collection of Helm charts and still not adapted to what I wanted.

I then found the unit test plug-in for Helm. The principle is to create YAML files that contain sets of tests; the plug-in runs them against the chart and verifies the assertions. This is an interesting solution, but it mostly tests the templating of your chart, not the applicative behavior.

Testing an application in a Kubernetes environment means being able to deploy it, adapt the topology (scale replicas), verify everything works, execute scenarios and check assertions at various stages. The solution that best fit this requirement was EUFT. This small project relies on BATS (Bash Automated Testing System), a framework that allows writing and executing unit tests as scripts. EUFT is in fact a solution to deploy a small K8s cluster and run BATS tests inside. Examples of tests are available in this repository. I also found out afterwards that HashiCorp was using the same technique for some of their Helm packages.

While I liked the principle of BATS, the tests used by EUFT and HashiCorp are a little complex to maintain. Not everyone on our project is a scripting guru. Besides, we do not want to deploy a K8s cluster in our tests: we want to use an existing one, with the same settings as our production one. This is important because of permissions and network policies. Running e2e tests in an ephemeral K8s installation is too limited. However, EUFT gave me a direction, since I had not found anything else.

The DETIK library

I was not really inspired for a name…
DETIK stands for “DevOps End-to-End Testing In Kubernetes”. The idea is to write tests as scripts, run them with BATS, and have a simple syntax, almost in natural language, to write assertions about resources in Kubernetes. With kubectl or Helm commands, a little knowledge of scripting (Bash, Ruby, Python, whatever…) and this library, anyone should be able to write applicative tests and maintain them with very little effort.

In addition to performing actions on the cluster, I also wanted to support the execution of scenarios. Scenarios can involve topology adaptations, but also user actions. BATS can integrate with many solutions, such as Selenium or Cypress for end-user scenarios, or Gatling for performance tests. With all these tools, it becomes possible to test an application end-to-end in a K8s environment.

Example

The following example is taken from the Git repository.
It shows the test of a Helm package. Part of the syntax comes from BATS.

#!/usr/bin/env bats

###########################################
# An example of tests for a Helm package
# that deploys Drupal and Varnish
# instances in a K8s cluster.
###########################################

load "/home/testing/lib/detik.bash"
DETIK_CLIENT_NAME="kubectl"
pck_version="1.0.1"

function setup() {
	cd $BATS_TEST_DIRNAME
}

function verify_helm() {
 	helm template ../drupal | kubectl apply --dry-run -f -
}


@test "verify the linting of the chart" {

	run helm lint ../drupal
	[ "$status" -eq 0 ]
}


@test "verify the deployment of the chart in dry-run mode" {

	run verify_helm
	[ "$status" -eq 0 ]	
}


@test "package the project" {

	run helm package -d /tmp ../drupal
	# Verifying the file was created is enough
	[ -f /tmp/drupal-${pck_version}.tgz ]
}


@test "verify a real deployment" {

	[ -f /tmp/drupal-${pck_version}.tgz ]

	run helm install --name my-test \
		--set varnish.ingressHost=varnish.test.local \
		--set db.ip=10.234.121.117 \
		--set db.port=44320 \
		--tiller-namespace my-test-namespace \
		/tmp/drupal-${pck_version}.tgz

	[ "$status" -eq 0 ]
	sleep 10

	# PODs
	run verify "there is 1 pod named 'my-test-drupal'"
	[ "$status" -eq 0 ]

	run verify "there is 1 pod named 'my-test-varnish'"
	[ "$status" -eq 0 ]

	# Postgres specifics
	run verify "there is 1 service named 'my-test-postgres'"
	[ "$status" -eq 0 ]

	run verify "there is 1 ep named 'my-test-postgres'"
	[ "$status" -eq 0 ]

	run verify \
		"'.subsets[*].ports[*].port' is '44320' " \
		"for endpoints named 'my-test-postgres'"
	[ "$status" -eq 0 ]

	run verify \
		"'.subsets[*].addresses[*].ip' is '10.234.121.117' " \
		"for endpoints named 'my-test-postgres'"
	[ "$status" -eq 0 ]

	# Services
	run verify "there is 1 service named 'my-test-drupal'"
	[ "$status" -eq 0 ]

	run verify "there is 1 service named 'my-test-varnish'"
	[ "$status" -eq 0 ]

	run verify "'port' is '80' for services named 'my-test-drupal'"
	[ "$status" -eq 0 ]

	run verify "'port' is '80' for services named 'my-test-varnish'"
	[ "$status" -eq 0 ]

	# Deployments
	run verify "there is 1 deployment named 'my-test-drupal'"
	[ "$status" -eq 0 ]

	run verify "there is 1 deployment named 'my-test-varnish'"
	[ "$status" -eq 0 ]

	# Ingress
	run verify "there is 1 ingress named 'my-test-varnish'"
	[ "$status" -eq 0 ]

	run verify \
		"'.spec.rules[*].host' is 'varnish.test.local' " \
		"for ingress named 'my-test-varnish'"
	[ "$status" -eq 0 ]

	run verify \
		"'.spec.rules[*].http.paths[*].backend.serviceName' " \
		"is 'my-test-varnish' for ingress named 'my-test-varnish'"
	[ "$status" -eq 0 ]

	# PODs should be started
	run try "at most 5 times every 30s " \
		"to get pods named 'my-test-drupal' " \
		"and verify that 'status' is 'running'"
	[ "$status" -eq 0 ]

	run try "at most 5 times every 30s " \
		"to get pods named 'my-test-varnish' " \
		"and verify that 'status' is 'running'"
	[ "$status" -eq 0 ]

	# Indicate to other tests that the deployment succeeded
	echo "started" > tests.status.tmp
}


@test "verify the deployed application" {

	if [[ ! -f tests.status.tmp ]]; then
		skip " The application was not correctly deployed... "
	fi

	rm -rf /tmp/drupal.html
	curl -sL http://varnish.test.local -o /tmp/drupal.html
	[ -f /tmp/drupal.html ]

	grep -q "<title>Choose language | Drupal</title>" /tmp/drupal.html
	grep -q "Set up database" /tmp/drupal.html
	grep -q "Install site" /tmp/drupal.html
	grep -q "Save and continue" /tmp/drupal.html
}


@test "verify the undeployment" {

	run helm del --purge my-test --tiller-namespace my-test-namespace
	[ "$status" -eq 0 ]
	[ "$output" == "release \"my-test\" deleted" ]

	run verify "there is 0 service named 'my-test'"
	[ "$status" -eq 0 ]

	run verify "there is 0 deployment named 'my-test'"
	[ "$status" -eq 0 ]

	sleep 60
	run verify "there is 0 pod named 'my-test'"
	[ "$status" -eq 0 ]
}


@test "clean the test environment" {
	rm -rf tests.status.tmp
}

These tests include the linting of the chart and a dry-run deployment, but also a real deployment with a basic topology. After deploying it, we verify assertions on K8s resources. Once the application (a simple Drupal) is started, we get the content of the web site and make sure it contains some expected words and sentences. We could replace this with a Selenium scenario.

Executing the bats my-test-file.bats command starts the tests.
A successful run shows the following output:

bats my-test-file.bats
1..7

✓ 1 verify the linting of the chart
✓ 2 verify the deployment of the chart in dry-run mode
✓ 3 package the project
✓ 4 verify a real deployment
✓ 5 verify the deployed application
✓ 6 verify the undeployment
✓ 7 clean the test environment

The command "bats my-test-file.bats" exited with 0.

Errors appear like below.

...

✗ 1 verify the linting of the chart
    (in test file my-test-file.bats, line 14)
     `[ "$status" -eq 0 ]' failed

...

Library Principles

Assertions are used to generate kubectl queries.
The output is extracted and compared to the values given as parameters.
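
To give an idea, an assertion such as verify "there is 1 service named 'my-test-drupal'" roughly boils down to a query like the one below. This is a simplified sketch of the principle, not the library’s exact implementation.

# Count the services whose name matches the expected pattern.
kubectl get services -o custom-columns=NAME:.metadata.name --no-headers \
	| grep -c 'my-test-drupal'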

There are in fact very few queries.
However, they work with all kinds of Kubernetes resources. That includes native K8s objects (pods, services…) but also OpenShift elements (routes, templates…) and custom resources (e.g. the upcoming Helm v3 objects).

Queries can be run with kubectl or with oc (the OpenShift client).
You only have to specify the client name in the DETIK_CLIENT_NAME variable (and make sure the client is available in the path).
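
For example, in an OpenShift context, the top of a test file would simply become (the load path is the same illustrative one as in the example above):

load "/home/testing/lib/detik.bash"
DETIK_CLIENT_NAME="oc"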

With this, you can verify pre- and post-conditions when using a Kubernetes client, Helm or even operators.

Usage

The library is available as a single file.
It can be downloaded from this GitHub repository. The syntax is documented in the readme of the project.

A Dockerfile is provided as a basis in the project.
It embeds a kubectl client, a Helm client, BATS and the DETIK library. Depending on your cluster configuration, you might want to add other items (e.g. to log into your cluster).
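
Once such an image is built, running the tests locally could look like the command below. The image name and mount paths are hypothetical; they depend on how you build the image and how you authenticate against your cluster.

# Hypothetical invocation: mount the test files and a kubeconfig,
# then run BATS inside the container.
docker run --rm \
	-v "$PWD/tests:/home/testing/tests" \
	-v "$HOME/.kube/config:/home/testing/.kube/config" \
	my-detik-image bats /home/testing/tests/main.bats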

Continuous Integration

The project is documented and explains how to execute (and write) such tests on your own machine. But the real interest of such tests is to run them in the last stages of an automated pipeline.

Here is a simple Jenkinsfile (for a Jenkins pipeline).

def label = "${env.JOB_NAME}.${env.BUILD_NUMBER}".replace('-', '_').replace('/', '_')
podTemplate(label: label, containers: [
	containerTemplate(
			name: 'jnlp',
			image: 'jnlp-slave-alpine:3.27-1-alpine'), 
	containerTemplate(
			name: 'detik',
			image: 'detik:LATEST',
			ttyEnabled: true,
			alwaysPullImage: true, 
			envVars: [
				envVar(key: 'http_proxy', value: 'http://proxy.local:3128'),
				envVar(key: 'https_proxy', value: 'http://proxy.local:3128'),
				envVar(key: 'TILLER_NAMESPACE', value: 'my-test-namespace')
			])
]) {

	node(label) {
		container(name: 'jnlp') {
			stage('Checkout') {
				checkout scm
			}
		}

		container(name: 'detik') {
			stage('Login') {
				withCredentials([usernamePassword(
						credentialsId: 'k8s-credentials',
						passwordVariable: 'K8S_PASSWORD',
						usernameVariable: 'K8S_USERNAME')]) {

					echo 'log into the cluster...'
					// TODO: it depends on your cluster configuration
				}
			}

			stage('Build and Test') {
				sh 'bats tests/main.bats'
			}
		}
	}
}

It can easily be adapted to Travis CI or GitLab CI.
You will find more examples on GitHub.

News from April 2020: the project has joined the BATS Core organization on Github.