Application containers are one of those great technologies that come along and reshape an entire industry. Historically, these kinds of disruptions have been rare; to witness in real time how a product like Docker can evolve from a seed of an idea to the must-have backbone of so much of today’s digital landscape is quite remarkable. My own career as a technologist has run parallel to the development and maturity of Docker and its greater container ecosystem. As containers and container platforms have evolved, the communities around them have grown, and container-based products have permeated our tech stacks. Yet, despite all of this, there’s still a bit of mystery around how containers actually work and the security implications they create for the applications that run inside them. This is the topic I want to try to tackle today.
Unpacking Docker Images
To start off, we need to understand what a Docker image is. Images, broadly speaking, are like container templates that can either be run on their own or built upon to create new images. On a more technical level, images are just .tar archives containing a filesystem. That’s it!
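To make the “just a .tar archive” point concrete, here is a minimal Python sketch that builds a tiny in-memory archive shaped like an image layer and lists what is inside it. The file names and contents are made up for illustration, and the helper names (build_layer, list_layer) are hypothetical; a real layer is much larger, but structurally it is the same kind of archive.

```python
import io
import tarfile


def build_layer(files: dict[str, bytes]) -> bytes:
    """Pack a mapping of path -> contents into an uncompressed tar archive."""
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w") as archive:
        for path, data in files.items():
            info = tarfile.TarInfo(name=path)
            info.size = len(data)
            archive.addfile(info, io.BytesIO(data))
    return buffer.getvalue()


def list_layer(layer: bytes) -> list[str]:
    """Return the paths stored in a layer archive, in archive order."""
    with tarfile.open(fileobj=io.BytesIO(layer), mode="r") as archive:
        return archive.getnames()


# Hypothetical layer contents, not taken from a real image.
layer = build_layer({
    "etc/hostname": b"demo\n",
    "bin/hello.sh": b"#!/bin/sh\necho hello\n",
})
print(list_layer(layer))
```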
Once the images are downloaded from an image repository (like hub.docker.com), they are unpacked and stored in the host’s filesystem. We can actually inspect these images on the filesystem by navigating to the image path (/var/lib/docker/overlay2 for hosts using the overlay2 storage driver). Each image layer gets its own directory named with a 64-character hexadecimal ID; here’s the base layer for the official Alpine image:
/var/lib/docker/overlay2/56abedeb5085c1ad962f3dec89d1e9bc6b584ee06d9bed0897221417bb496c56/diff# ls -la
total 72
drwxr-xr-x 18 root root 4096 Jan 24 16:41 .
drwx------  3 root root 4096 Jan 24 16:41 ..
drwxr-xr-x  2 root root 4096 Dec 20 22:25 bin
drwxr-xr-x  2 root root 4096 Dec 20 22:25 dev
drwxr-xr-x 15 root root 4096 Dec 20 22:25 etc
drwxr-xr-x  2 root root 4096 Dec 20 22:25 home
drwxr-xr-x  5 root root 4096 Dec 20 22:25 lib
drwxr-xr-x  5 root root 4096 Dec 20 22:25 media
drwxr-xr-x  2 root root 4096 Dec 20 22:25 mnt
dr-xr-xr-x  2 root root 4096 Dec 20 22:25 proc
drwx------  2 root root 4096 Dec 20 22:25 root
drwxr-xr-x  2 root root 4096 Dec 20 22:25 run
drwxr-xr-x  2 root root 4096 Dec 20 22:25 sbin
drwxr-xr-x  2 root root 4096 Dec 20 22:25 srv
drwxr-xr-x  2 root root 4096 Dec 20 22:25 sys
drwxrwxrwt  2 root root 4096 Dec 20 22:25 tmp
drwxr-xr-x  7 root root 4096 Dec 20 22:25 usr
drwxr-xr-x 11 root root 4096 Dec 20 22:25 var
As we can see, it really is just a directory that holds a filesystem. So how does a filesystem get turned into a running container?
See cgroup. See cgroup Run
Now, let’s talk about Linux cgroups. Cgroups, or “control groups”, are a feature of the Linux kernel for grouping processes and limiting the resources they can use. Combined with their sibling kernel feature, namespaces, they allow groups of processes to be isolated from the rest of a machine: a process can have a “virtual” filesystem, resource limits, firewalled networking, and a host of other features. If this sounds suspiciously like a Docker container, you’d be right! Docker leverages these kernel features to run its containers.
To run a container, the Docker daemon takes the following steps (roughly):
- Compiles a virtual filesystem from each image layer.
- Creates a new cgroup, along with the kernel namespaces that give the process its own view of the filesystem and network.
- Mounts the virtual filesystem as the container’s root filesystem.
- Sets the cgroup limits to those defined for the container (stored in a local DB).
- Sets up the container’s networking.
- Starts the process defined in the image’s Dockerfile.
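The first step above, compiling one filesystem out of several layers, can be sketched in a few lines of Python. This is a deliberate simplification: each layer is modeled as a plain dictionary of path to contents, whereas the overlay2 driver uses a kernel overlay mount rather than copying anything. The “.wh.” whiteout prefix is the convention image layer archives use to mark a deleted file; the function name and sample data are hypothetical.

```python
WHITEOUT_PREFIX = ".wh."  # marker used in layer archives for deleted files


def merge_layers(layers: list[dict[str, str]]) -> dict[str, str]:
    """Flatten layers (lowest first) into one path -> contents view."""
    merged: dict[str, str] = {}
    for layer in layers:
        for path, contents in layer.items():
            directory, _, name = path.rpartition("/")
            if name.startswith(WHITEOUT_PREFIX):
                # A whiteout hides the same-named file from lower layers.
                hidden = name[len(WHITEOUT_PREFIX):]
                target = f"{directory}/{hidden}" if directory else hidden
                merged.pop(target, None)
            else:
                # Upper layers win over lower layers for the same path.
                merged[path] = contents
    return merged


# Hypothetical layers: the second one edits one file and deletes another.
base = {"etc/motd": "welcome", "super.secret": "this is a secret"}
patch = {"etc/motd": "updated", ".wh.super.secret": ""}
print(merge_layers([base, patch]))
```

Note that the merged view no longer shows super.secret, yet the base layer still contains it. Deleting a file in a later layer hides it from the container without removing it from the image.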
Congratulations! You now have a running Docker container. In fact, if you want to inspect the running container, you can find its process by using ps:
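Once you have a PID from ps, one way to confirm that the process belongs to a container is to read /proc/<pid>/cgroup on the host. The sketch below pulls a container-ID-looking token out of that file’s text. It assumes the cgroup path embeds the full 64-character container ID, which is common but depends on the cgroup version and driver in use; the function name and sample lines are hypothetical.

```python
import re
from typing import Optional

# 64 hex characters, the form Docker uses for full container IDs.
CONTAINER_ID = re.compile(r"[0-9a-f]{64}")


def container_id_from_cgroup(cgroup_text: str) -> Optional[str]:
    """Return the first container-ID-looking token in /proc/<pid>/cgroup text."""
    for line in cgroup_text.splitlines():
        # Each line has the form "hierarchy-id:controllers:path".
        parts = line.split(":", 2)
        if len(parts) != 3:
            continue
        match = CONTAINER_ID.search(parts[2])
        if match:
            return match.group(0)
    return None


# Made-up sample text; the ID is a placeholder, not a real container.
fake_id = "ab" * 32
sample = f"12:memory:/docker/{fake_id}\n11:cpu,cpuacct:/docker/{fake_id}\n"
print(container_id_from_cgroup(sample))
```

On a Linux host, the input would come from something like `Path(f"/proc/{pid}/cgroup").read_text()`.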
Exploring How To Build Secure Containers
Docker, when you dig into how it does what it does, isn’t doing anything entirely revolutionary by itself. It simply leverages already supported kernel primitives like cgroups and packages them into a product that is simpler and more streamlined to use. But because Docker doesn’t build much on top of the actual kernel mechanisms, the containers it handles and applications it hosts are only as secure as the kernel itself. This means that the Docker daemon, its images, and its containers have no real built-in security features of their own that would allow for embedded secrets to stay secret or keep unauthorized third parties from gaining access to their processes and filesystems. We can test this by exploring a few of the more popular methods of hiding secrets in a container image: injecting secrets at image build and putting secrets in environment variables at container run time.
Test #1 Injecting secrets into containers at build
Let’s create a new Docker image. It’s rather simple, actually. All we need is a Dockerfile (assuming you’ve already got a working Docker installation).
FROM alpine:latest
ADD super.secret /super.secret
CMD /bin/sh
We’ll create a file called super.secret to inject into the container:
$> echo "this is a secret" > super.secret
# And now we can build the image.
~/docker# docker build -t secret:test .
Sending build context to Docker daemon  3.072kB
Step 1/3 : FROM alpine:latest
 ---> 3f53bb00af94
Step 2/3 : ADD super.secret /super.secret
 ---> 61ea9104ee5d
Step 3/3 : CMD /bin/sh
 ---> Running in 2a1b90e4b209
Removing intermediate container 2a1b90e4b209
 ---> c3713649d32e
Successfully built c3713649d32e
Successfully tagged secret:test
Great! We have an image that now contains our secret (password, certificate, really anything you want to keep secret). So what happens if we search the filesystem for that secret file?
~/docker# find / | grep super.secret
/root/docker/super.secret
/var/lib/docker/overlay2/11567f3bcc1b8e844e22ba37cfef2432ea319247e403707497022edebfd7a7ce/diff/super.secret
That’s not good. Not only did we find the file, but we can actually read that file back, too.
~/docker# cat /var/lib/docker/overlay2/11567f3bcc1b8e844e22ba37cfef2432ea319247e403707497022edebfd7a7ce/diff/super.secret
this is a secret
This is because, as we explored earlier, Docker images are just filesystems that exist on the Docker host. Now, in this case, we obviously didn’t expose anything because a) the secret is “this is a secret” and b) the image never left our local host. But, for the sake of argument, let’s say we pushed our image up to Docker Hub, or an internal enterprise repository like Artifactory. Now anyone with “pull” permissions to that image has access to our secret, meaning that secret isn’t really all that secret anymore.
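To illustrate how little effort that takes, here is a minimal Python sketch that reads a file straight out of a layer archive without ever starting a container. It assumes you already have a single layer’s .tar in hand (for example, extracted from the output of docker save, whose internal layout varies between Docker versions). The layer here is a stand-in built in memory, and the function names are hypothetical.

```python
import io
import posixpath
import tarfile
from typing import Optional


def read_from_layer(layer: bytes, wanted: str) -> Optional[bytes]:
    """Return the contents of `wanted` if the layer archive contains it."""
    with tarfile.open(fileobj=io.BytesIO(layer), mode="r") as archive:
        for member in archive.getmembers():
            # Normalize names such as "./super.secret" before comparing.
            name = posixpath.normpath(member.name).lstrip("/")
            if member.isfile() and name == wanted:
                extracted = archive.extractfile(member)
                return extracted.read() if extracted else None
    return None


def make_layer(name: str, data: bytes) -> bytes:
    """Build a one-file stand-in for a real image layer."""
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w") as archive:
        info = tarfile.TarInfo(name=name)
        info.size = len(data)
        archive.addfile(info, io.BytesIO(data))
    return buffer.getvalue()


layer = make_layer("./super.secret", b"this is a secret\n")
print(read_from_layer(layer, "super.secret"))
```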
So what about environment variables? Surely those are more secure?
Test #2 Injecting secrets in environment variables
Let’s go back and modify our Dockerfile to add an environment variable called SECRET:
FROM alpine:latest
# blank secret that we'll overwrite at run time
ENV SECRET=""
CMD /bin/sh
And now we can build this new image.
~/docker# docker build -t secret:testenv .
Sending build context to Docker daemon  3.072kB
Step 1/3 : FROM alpine:latest
 ---> 3f53bb00af94
Step 2/3 : ENV SECRET=""
 ---> Running in a1872c8078c3
Removing intermediate container a1872c8078c3
 ---> 178c24f06c44
Step 3/3 : CMD /bin/sh
 ---> Running in bccf052509b7
Removing intermediate container bccf052509b7
 ---> cca1f3bd8248
Successfully built cca1f3bd8248
Successfully tagged secret:testenv
So now all we’ve done is shift the vulnerability from the filesystem to the process environment. While this change keeps secrets from being disseminated via an image repository, it still doesn’t prevent other containers or users from accessing the secrets in the running container. Let’s explore how.
First, we need to run our new container. Since it defaults to running /bin/sh and wants user input, we can override the command with sleep so that the container runs in the background.
~/docker# docker run -d -e SECRET="this is a secret" secret:testenv sleep 300
This will make the container run for five minutes before exiting.
Now, just like we did with Nginx, we can find the sleep process that’s running in the container with ps:
~/docker# clear
~/docker# ps aux | grep sleep
root     12184  0.0  0.0   1516     4 ?        Ss   17:06   0:00 sleep 300
root     12342  0.0  0.0  12944   940 pts/1    S+   17:06   0:00 grep --color=auto sleep
Taking note of the PID (12184), we can then inspect the process environment by navigating to /proc/12184 and looking at the environ file:
~/docker# cat /proc/12184/environ
PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin HOSTNAME=6ddf6d9e588d SECRET=this is a secret HOME=/root
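The environ file is a sequence of KEY=VALUE records separated by NUL bytes, which is why cat prints it as one run of text. A minimal Python sketch for turning it into something readable follows; the function name is hypothetical, and the sample bytes mirror the values from the output above rather than being read from a live process.

```python
def parse_environ(raw: bytes) -> dict[str, str]:
    """Decode the NUL-separated KEY=VALUE records of /proc/<pid>/environ."""
    variables: dict[str, str] = {}
    for record in raw.split(b"\0"):
        if not record:
            continue  # the trailing NUL leaves an empty final record
        key, _, value = record.decode("utf-8", errors="replace").partition("=")
        variables[key] = value
    return variables


# Sample bytes shaped like the file's real contents.
sample = b"HOSTNAME=6ddf6d9e588d\0SECRET=this is a secret\0HOME=/root\0"
print(parse_environ(sample)["SECRET"])
```

On the Docker host, the same function could be fed `Path("/proc/12184/environ").read_bytes()` by root or by the user that owns the process.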
While this is slightly more secure (and definitely more obfuscated) than the previous method, it’s still not as secure as we would like it to be.
Secure Containers Require Secure Applications
Images and containers should never be treated as inherently secure -- safe and acceptable use of containers in a secure environment comes from properly implementing security at every level of the application stack. It follows, then, that security should be implemented at the application level. To do this properly, we need to consider some common pitfalls.
Pitfall #1: Hard-coding secrets into applications
Especially in a project with a fast release cycle, the temptation to just put all your secrets into your source code can be great, particularly when you can rotate them whenever a new release is deployed. However, this presents some problems. Chief among them is the fact that you have to put those secrets in the code.
Wherever that code goes, so do your secrets. Assuming that everyone uses some sort of version control system (VCS) like GitHub, Bitbucket, or GitLab, then your secrets will be in that VCS for the life of the project. Even worse, those secrets will continue to persist in the commit history of that project. Even if you rotate your secrets regularly, the commit history provides anyone willing to do the work with a pattern of how those secrets are generated, and once they understand the pattern they can try to predict what the next one will be.
Putting secrets into a VCS also means that their horizon (how far the secrets can travel from those who need to know them) is rather large, since everyone in an organization may well have read-only access to that organization’s VCS.
Pitfall #2: Using environment variables for secrets
As we’ve already discussed, you should never put secrets into environment variables. Enough said.
Pitfall #3: Not using unique secrets for each application and environment
One of the problems with having multiple accounts, applications, APIs, networks, and other systems is keeping track of the secrets they ultimately require. Accounts have passwords, applications have encryption keys and certificates, APIs have API keys, networks have packet flags or MAC addresses, the list goes on -- a list that now needs to be kept track of.
There is a very real temptation to simply use the same secret for everything so you only have to remember or keep track of one thing instead of several. This is not just a problem in the consumer space, where people keep using the same password over and over, sharing their one unique secret between their banking, shopping, tax returns, DMV portals, and so on. It is also a problem for developers and enterprise IT. Reusing secrets like certificates or API keys across applications creates a chain of vulnerability: if one application’s secrets are compromised and they’re shared by another application, then the second application is also vulnerable. The repeated use of secrets also increases their lifespan, and time is the enemy of secrecy. The longer a secret is used, the more likely it is to be compromised.
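As a small illustration of the alternative, the sketch below issues an independent random secret for every application and environment pair, using Python’s standard secrets module. The application and environment names are invented, and in practice the generated values would go straight into a secrets manager rather than being printed or kept in a script.

```python
import secrets
from itertools import product


def issue_secrets(apps: list[str], environments: list[str]) -> dict[tuple[str, str], str]:
    """Generate an independent random secret for every (app, environment) pair."""
    return {
        (app, env): secrets.token_urlsafe(32)  # 32 random bytes per secret
        for app, env in product(apps, environments)
    }


# Hypothetical applications and environments.
issued = issue_secrets(["billing", "search"], ["staging", "production"])
for (app, env) in issued:
    print(f"{app}/{env}: secret issued")
```

Because no two entries share a value, compromising the staging secret for one application reveals nothing about any other application or environment.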
Pitfall #4: Storing secrets in insecure places
This ties in with pitfall #1 in that you should never store your secrets in places that are not secure themselves. This includes shared storage, unencrypted files, version control systems, e-mail, chat applications, development planning apps, text messages, carrier pigeon, 12th century parchment, databases, and hand-written notebooks. Just to name a few. As we’ve seen recently, one common method of accidentally exposing data is through user error on Amazon’s S3 service, though through no fault of Amazon’s. Using S3 as a secure repository is a bad idea, as companies like Booz Allen Hamilton found out when they accidentally leaked geospatial intelligence imagery via a very poorly configured S3 bucket, as did Verizon and Accenture with their own mishaps. While pitfall #4 is arguably the least severe of the four, there are ways to secure secrets properly and in a way that works with applications of all types.
The solution here is to implement some sort of secrets management, either through a cloud provider (like AWS’s Secrets Manager and KMS), a custom application, or a third-party application like HashiCorp Vault, CyberArk, Salt, or similar. The end result should be a container infrastructure that has no concept of secrets, applications that dynamically acquire secrets on an as-needed basis, and secrets that are managed and rotated separately from the rest of the infrastructure.
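What “dynamically acquire secrets on an as-needed basis” might look like inside an application is sketched below. This is not any particular product’s API: the fetch callable is a hypothetical stand-in for a call to whichever secrets manager you use, and a plain dictionary plays that role here. The secret is requested at the moment of use and re-fetched after a short time-to-live, so a rotation made outside the application is picked up without a rebuild or redeploy.

```python
import time
from typing import Callable


class SecretCache:
    """Fetch a secret on demand and re-fetch it once a short TTL expires."""

    def __init__(self, fetch: Callable[[str], str], ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._fetch = fetch      # hypothetical call into a secrets manager
        self._ttl = ttl_seconds
        self._clock = clock
        self._values: dict[str, tuple[float, str]] = {}

    def get(self, name: str) -> str:
        now = self._clock()
        cached = self._values.get(name)
        if cached is not None and now - cached[0] < self._ttl:
            return cached[1]
        value = self._fetch(name)  # the value lives only in process memory
        self._values[name] = (now, value)
        return value


# Stand-in backend: a dict playing the role of the secrets manager.
backend = {"db-password": "first-value"}
cache = SecretCache(fetch=backend.__getitem__, ttl_seconds=0.0)
print(cache.get("db-password"))
backend["db-password"] = "rotated-value"  # rotation happens outside the app
print(cache.get("db-password"))
```

With this shape, nothing secret is baked into the image or passed through the container’s environment; the container only needs a way to authenticate to the secrets manager.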
It should be noted that applications that strictly adhere to the idea of the “12 Factor App” will not be secure if they use environment variables for all of their config. While the ideas proposed by the 12 Factor App are good starting points (Capital One’s own Jimmy Ray has an excellent write-up on the 12 Factor App and microservices), developers should never adhere to a set of rules without critical evaluation of those rules -- this is one of those cases. Perhaps there should be a thirteenth factor added that talks about secrets management?
An Ecosphere of Trust
Real application security should be handled at the application level, not the infrastructure level. When operating an application in an environment where secrets are required, developers should make an effort to leverage secrets management systems to ensure that those secrets don’t proliferate beyond their intended horizon.
We as developers and engineers should endeavor to write applications that we would want to use ourselves. Our role requires that we exist in an ecosphere of trust as we coexist with other developers and their work. We write software that our peers will use, that will manage our peers’ data, and in some cases, have real power over people’s lives. Part of that ecosphere of trust is the understanding that the software we write will be secure and won’t unnecessarily expose data. If those applications are running in containers, we have a duty to make sure those applications are as secure as possible.
But here’s the good news: it’s not that hard! Unlike many things in the DevOps/SRE world, securing containers is pretty simple once you understand how they work. I hope this blog post helped demystify some of that for you.