Containerizing a Clojure project
Our team had been using Dockerfiles for several years. We recently discovered that we could achieve the same packaging standards using only jib. Plus, there were some real advantages in doing so.
- Build images using just the clojure cli (no Dockerfile, or docker install required).
- Faster & reproducible builds
- Improved image pull times (first deployment)
- Track application dependencies by layer
- Index container workloads by package - useful for assessing new vulnerabilities
- Implement standard image LABEL metadata (e.g. org.opencontainer annotations)
We've encapsulated this in a tool that you can use on deps.edn projects.
Start with deps.edn
A deps.edn file, on its own, provides much of the information needed to package a clojure application into a container image. Consider the deps.edn file below. This describes an application which will require a classpath containing 28 jar files (this is automatically computed for you by the tools.build library).
{:paths ["src"]
 :deps {org.clojure/clojure {:mvn/version "1.10.3"}
        compojure/compojure {:mvn/version "1.6.2"}
        ring/ring-jetty-adapter {:mvn/version "1.9.4"}
        ring/ring-json {:mvn/version "0.5.1"}}
 :aliases
 {:jetty
  {:main-opts ["-m" "my.app.handler"]}}}
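The jar count mentioned above can be checked from the command line. The sketch below is only an illustration: count_jars is a hypothetical helper (not part of any tool discussed here), and the guarded call assumes the clojure cli is installed and that a deps.edn with a :jetty alias sits in the current directory.

```shell
#!/bin/sh
# Hypothetical helper: count the entries of a ':'-separated classpath
# that end in .jar (directories such as "src" are ignored).
count_jars() {
  printf '%s\n' "$1" | tr ':' '\n' | grep -c '\.jar$'
}

# Only ask the clojure cli for a classpath when it is actually available.
if command -v clojure >/dev/null 2>&1 && [ -f deps.edn ]; then
  # -Spath prints the computed classpath; -A:jetty applies the alias.
  count_jars "$(clojure -A:jetty -Spath)"
fi
```

The count excludes directory entries such as src, so it reflects only the jars that need to be copied into the image.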
The entrypoint for this application, if not aot-compiled, will need to reference these jar files when specifying the -cp argument. Other details about the image entrypoint can be extracted from the :main-opts value. In many deps.edn files, a :main-opts value is already present in the alias used to run the app locally. An example entrypoint is shown below.
java -Dclojure.main.report=stderr -Dfile.encoding=UTF-8 \
-cp lib/jetty-io-9.4.42.v20210604.jar:lib/instaparse-1.4.8.jar:lib/jetty-util-9.4.42.v20210604.jar:lib/jackson-core-2.10.2.jar:lib/clout-2.2.1.jar:lib/ring-servlet-1.9.4.jar:lib/compojure-1.6.2.jar:lib/jackson-dataformat-cbor-2.10.2.jar:lib/jackson-dataformat-smile-2.10.2.jar:lib/ring-codec-1.1.3.jar:lib/tools.macro-0.1.5.jar:lib/clojure-1.10.3.jar:lib/jetty-http-9.4.42.v20210604.jar:lib/javax.servlet-api-3.1.0.jar:lib/ring-json-0.5.1.jar:lib/jetty-server-9.4.42.v20210604.jar:lib/crypto-equality-1.0.0.jar:lib/tigris-0.1.2.jar:lib/core.specs.alpha-0.2.56.jar:lib/medley-1.3.0.jar:lib/ring-core-1.9.4.jar:lib/crypto-random-1.2.1.jar:lib/commons-fileupload-1.4.jar:lib/ring-jetty-adapter-1.9.4.jar:lib/cheshire-5.10.0.jar:lib/commons-codec-1.15.jar:lib/spec.alpha-0.2.194.jar:lib/src:lib/commons-io-2.10.0.jar:src \
clojure.main \
-m my.app.handler
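The -cp value in this entrypoint is just the packaged jar paths joined with colons. Here is a minimal sketch of that join, assuming the jars have already been copied into a single directory. The function name build_classpath is hypothetical, and this is not how the jib tool itself assembles the entrypoint.

```shell
#!/bin/sh
# Hypothetical helper: join every jar in a directory into a ':'-separated
# classpath, then append any extra entries (for example "src").
build_classpath() {
  dir=$1
  shift
  classpath=""
  for jar in "$dir"/*.jar; do
    # Skip the unexpanded glob when the directory holds no jars.
    [ -e "$jar" ] || continue
    classpath="${classpath:+$classpath:}$jar"
  done
  for extra in "$@"; do
    classpath="${classpath:+$classpath:}$extra"
  done
  printf '%s\n' "$classpath"
}

# Example: two placeholder jars in a scratch directory.
demo=$(mktemp -d)
touch "$demo/clojure-1.10.3.jar" "$demo/compojure-1.6.2.jar"
build_classpath "$demo" src
rm -rf "$demo"
```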
Given only the deps.edn file, we can use the jib tool to create a local docker image using just the clojure cli (a local docker install is not required).
# generate a test project using clj-new
# (un-comment the next line if you don't have it installed)
# clojure -Ttools install com.github.seancorfield/clj-new '{:git/tag "v1.2.381"}' :as clj-new
clojure -Tclj-new create :template '"http://github.com/slimslenderslacks/templates/clj-web-template@f0dd6683e8e775e1ae2c86b01bfe8794f210a7b0"' :name container/clojure-app
cd clojure-app
# install the tool on your machine
clojure -Ttools install io.github.atomisthq/jibbit '{:git/tag "v0.1.12"}' :as jib
# build local image
clojure -Tjib build :aliases '[:jetty]'
This used a library called jib (downloadable from maven, which is why you only need the clojure cli). If you run the example above, the output will, weirdly, just be an app.tar file.
You probably already know that images are just archives. However, we don't often interact with them in their archive form. You can explore the contents of an image using tools you probably already have installed (e.g. tar -tf app.tar for the archive just built). The archive contains json files that container runtimes use to know how to "run" whatever has been packaged, plus a nested set of compressed archives (the “layers”). These layers form the read-only parts of the container’s union filesystem. Docker runtimes can cache them individually, which is one of the reasons it's useful to package application code and dependencies into different layers. The dependencies layer, which is usually much larger, tends to change more slowly.
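A small sketch of that exploration follows. Both function names are hypothetical, and show_manifest assumes the archive follows the "docker save" layout, with a manifest.json entry at the top level. We have not confirmed that layout here, so treat it as an assumption.

```shell
#!/bin/sh
# List the top-level entries of an image archive.
list_entries() {
  tar -tf "$1"
}

# Print the manifest that tells a runtime which config and layers to load.
# Assumption: the archive has a top-level manifest.json entry.
show_manifest() {
  tar -xOf "$1" manifest.json
}

# Only run against a real archive if one is present in this directory.
if [ -f app.tar ]; then
  list_entries app.tar
  show_manifest app.tar
fi
```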
The example above will produce a seven-layer archive: five layers from the “base” image (the OS and the JVM in this case), and two from the local deps.edn project.
Application dependencies
The image layers above also represent different aspects of an application's overall software supply chain. When we run this clojure application, we are trusting that we have correctly packaged software coming from several distinct sources. This is also true for application maintenance, and security patching. We have to track updates, and security advisories, from several places.
Each layer has its own effective bill of materials. Rolled up, this determines the overall surface area for vulnerabilities, and risk assessment when we discover new exploits. Tracking your bill of materials layer by layer is also useful when managing security patches. For example, we all recently needed to query our production container workloads for image layers containing a vulnerable log4j-core package.
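To make the layer-by-layer query concrete, here is a rough sketch that reports which layers of a local image archive contain a file matching a package name. The function layers_containing is hypothetical, and it assumes layers are stored as nested .tar.gz entries inside the archive. A production query would more likely run against an index of deployed images than against tar files.

```shell
#!/bin/sh
# Hypothetical helper: print each layer of an image archive whose file
# listing matches a pattern (for example "log4j-core").
# Assumption: layers are stored as *.tar.gz entries inside the archive.
layers_containing() {
  archive=$1
  pattern=$2
  tar -tf "$archive" | grep '\.tar\.gz$' | while read -r layer; do
    # Stream the nested layer to stdout and list its files without unpacking.
    if tar -xOf "$archive" "$layer" | tar -tzf - | grep -q "$pattern"; then
      printf '%s\n' "$layer"
    fi
  done
}

# Only run against a real archive if one is present in this directory.
if [ -f app.tar ]; then
  layers_containing app.tar log4j-core
fi
```

Because the check works on file listings only, it finds a jar by name but says nothing about which version is vulnerable. That still needs the version from the file name or from the project's dependency data.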
For container images, many dependencies are buffered by the base image. In the example above, the base image defaulted to gcr.io/distroless/java:latest. It rolls up the JVM and O/S dependencies into one named thing.
Base images and image registries
Details about base and target images can be committed to a jib.edn file in the root of the project.
{:base-image {:type :registry
              :image-name "gcr.io/distroless/java:11@sha256:1d377403a44d32779be00fceec4803be0301c7f4a62b72d7307dc411860c24c3"}
 :aliases [:jetty]
 :target-image {:type :tar
                :image-name "<docker-namespace>/<repository-name>"}}
In the :base-image above, we reference a base image digest. We can also reference base images by tag; however, tags can be updated, and when they are, your application stack will suddenly change.
By including the digest, any updates to the stack will be backed by a commit to the application repository. Our tools can check for new versions of base images just like we check for new versions of library dependencies. And as long as we avoid horrible things like snapshot versions, then the entire application stack is determined by what is checked in. That's a good starting point.
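A check like this is easy to automate. The sketch below uses a hypothetical is_pinned function that succeeds only when an image reference ends in a sha256 digest. It is an illustration of the idea, not a feature of the jib tool.

```shell
#!/bin/sh
# Hypothetical check: succeed only when an image reference ends in a
# sha256 digest (64 lowercase hex characters), i.e. it is pinned.
is_pinned() {
  printf '%s\n' "$1" | grep -Eq '@sha256:[0-9a-f]{64}$'
}

ref="gcr.io/distroless/java:11@sha256:1d377403a44d32779be00fceec4803be0301c7f4a62b72d7307dc411860c24c3"
if is_pinned "$ref"; then
  echo "pinned"
else
  echo "floating tag"
fi
```

A reference such as gcr.io/distroless/java:latest fails the check, which makes it a simple candidate for a pre-commit or CI guard.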
A target-image of type :tar is not that useful. Change the :type to either :docker (to push the image to a local docker daemon) or to :registry to push the image to a remote image registry.
{:base-image {:type :registry
              :image-name "gcr.io/distroless/java:11@sha256:1d377403a44d32779be00fceec4803be0301c7f4a62b72d7307dc411860c24c3"}
 :aliases [:jetty]
 :target-image {:type :registry
                :image-name "<docker-namespace>/<repository-name>"
                :username "<docker-username>"
                :password "<docker-password>"}}
When running clojure -Tjib build now, the resulting image will be pushed to Docker Hub. See the project documentation for details on pushing to other registries (like gcr, or ecr), and for different ways of managing credentials. If the :type is switched to :docker, you can run the resulting container in a docker runtime using the docker cli.
docker run --rm -it -p 3000:3000 <docker-namespace>/<repository-name>
AOT compilation
If we want to aot compile clojure sources before packaging, the tool supports an :aot option. Add :aot and :main to the jib.edn config file.
{:base-image {:type :registry
              :image-name "gcr.io/distroless/java:11@sha256:1d377403a44d32779be00fceec4803be0301c7f4a62b72d7307dc411860c24c3"}
 :aliases [:jetty]
 :aot true
 :main my.app.handler
 :target-image {:type :docker
                :image-name "<docker-namespace>/<repository-name>"}}
Obviously, this will have an impact on what gets packaged in the application layer, and on the application startup time. We also use a different entrypoint for aot-compiled sources.
Next Steps
There are plenty of possible optimizations. Let us know if you have any suggestions. Included in our current list of potential features are the following.
- improved auto-selection of non-root users. For some base images, we know which non-root users are available, and can make good default choices. For example, the default User for the gcr.io/distroless base image will be a non-root user (with uid 65532). Official openjdk images come with the user nobody. Unfortunately, it's possible to select base images where the image User will still default to root.
- running test :aliases to automatically generate, and test, apparmor profiles, and to track unexpected writes to the top container layer.
- other ways to specify how resources should be copied into layers in the target image?
- what about whiteout files or layer squashing?
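Until non-root selection is automated, the root-user concern in the first item can be checked by hand on a built image. The sketch below is an illustration only. runs_as_root is a hypothetical helper, and the IMAGE environment variable is an assumption used to pass in an image name. It relies on the fact that an empty User field in an image config means the container runs as root.

```shell
#!/bin/sh
# Hypothetical helper: succeed when an image's configured User value
# means "run as root". An empty value also means root.
runs_as_root() {
  # Drop any ":group" suffix, so "0:0" is treated like "0".
  case "${1%%:*}" in
    ""|root|0) return 0 ;;
    *) return 1 ;;
  esac
}

# Only inspect an image when IMAGE is set and a docker cli is available.
if [ -n "${IMAGE:-}" ] && command -v docker >/dev/null 2>&1; then
  user=$(docker inspect --format '{{.Config.User}}' "$IMAGE")
  if runs_as_root "$user"; then
    echo "$IMAGE runs as root"
  else
    echo "$IMAGE runs as user $user"
  fi
fi
```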