Curiosity
It’s clear we’ve reached a point where operations and DevOps teams are using Kubernetes (k8s for the rest of this blog) for more than developer-centric applications. It has also become common to see customers and enthusiasts running containerised operations tools for ops and admin tasks rather than reaching for VM deployments. Alongside this, we see quick-to-live, quick-to-die containerised apps that do their job fast while living at the whim of k8s orchestration. None of this is negative, but it may carry a security risk for anyone relying on containerised software that leans heavily on encryption and decryption.
Before we go any further: I've read about and understand just enough container security and encryption to be somewhat dangerous (in good humour, of course). Working through the needs of our customers, though, it’s obvious what matters first. Encrypting and safeguarding secrets sits high on their list, and I’m happy to say that Vault is doing its job well. And here's my question: how well can it work if we deploy it somewhere we’re not sure about entropy quality? How well does entropy-reliant Vault work, in k8s or non-k8s infrastructure?
Some Explanation
This can get dense, so I hope you have your learning and reading cap on. Let’s get some background out of the way before we get to the problem and solution.
Software, Hardware and Entropy
Some would argue that, to be the most secure, solutions such as HashiCorp Vault are best deployed on hardware first, then VMs, then system containers (LXC/LXD), and then (k8s) containers, and there’s good reasoning behind this.
“But the software works fine on k8s! What are you talking about?”
For software to generate large, highly unique keys (essentially the basis of all encryption algorithms), it relies on random number generation. Computers have a hard time creating truly random numbers because they are, by design, deterministic machines. Creating chaos inside a system programmed with strict rules is difficult, yet that is how computers provide software with adequately unique and sufficiently random strings of digits for keys and certificates. Imagine if software could only produce the same "random" string every time you needed a new encryption key. That wouldn’t be very secure, nor very useful.
Computer programmers solve this problem by introducing entropy. There’s plenty to read about entropy elsewhere online, but essentially Linux systems gather it from I/O events such as network activity and keyboard/mouse input to provide some real-world randomness. On laptops and desktops, there’s no shortage of that I/O to generate entropy. Linux also keeps a finite pool of entropy from which to create random strings of digits. On servers (which have no keyboard or mouse input), network I/O alone isn’t enough, so we rely on software such as ‘rngd’ (the random number generator daemon) to help populate the entropy pool of a running server. It takes time to fill that pool. When it’s full of random I/O entropy, we can safely assume that all randomly generated keys or certs are indeed as random as they can be (given the constraints).
{% include image name="entropy-comic_cjppa9" position="center" alt="entropy-comic" %} Credit to 'goneintorapture.com' for this excellent comic.
Containers and Entropy
Now, what do we know about k8s pods? They often have short lifespans, being rescheduled and restarted frequently across a cluster of servers. That’s a problem if the deployment doesn’t account for it. If we run Vault in a pod that has been up for eight seconds, it’s safe to assume there’s a very small pool of entropy to draw on versus a system that has been up for hours accumulating network I/O and thus more entropy... you see where I’m going with this. The odds of a noticeable pattern appearing in the random strings from a nascent container are probably much higher than from a VM or physical system running rngd for additional entropy. That doesn’t paint a flattering picture for any containerised, encryption-reliant software.
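You can check that assumption from inside a running pod (a quick sketch; `vault-0` is a hypothetical pod name here, so substitute your own):

```bash
# Read the kernel's entropy estimate as seen from inside the pod;
# the pod reads the same procfs interface a host shell would
kubectl exec vault-0 -- cat /proc/sys/kernel/random/entropy_avail
```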
Or at least that’s what I assumed at first...
See For Yourself
If you're on a server and you're curious about the quality of your entropy pool (which should instil confidence when generating random strings of digits for keys and certs), you can use the following to query it:
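On any modern Linux system, the kernel exposes a running estimate via procfs:

```bash
# Available entropy, estimated in bits; a consistently low value
# (say, in the low hundreds) suggests a starved pool
cat /proc/sys/kernel/random/entropy_avail
```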
If you're extra concerned and want to boost the entropy on your system, use the following to install rngd and enable it now and after reboots.
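Exact package and service names vary by distribution; on a RHEL/CentOS-flavoured system it looks roughly like this:

```bash
# Install rng-tools (which provides rngd), then start it now
# and have it come back after reboots
sudo yum install -y rng-tools
sudo systemctl enable --now rngd

# On Debian/Ubuntu the package is also rng-tools, though the
# service name can differ between releases
```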
In the case of containers, which sometimes live for only seconds or minutes, asking such a young instance of software to generate QUALITY entropy is a tall order. One way to solve for this is to configure your host k8s worker/app nodes to run rngd, which improves the quality of entropy by keeping the host’s entropy pool topped up.
Which Device?
When seeking a source of random characters, the usual advice is to go with /dev/urandom. It and /dev/random are not the same source: /dev/random can block while it waits for the kernel's entropy estimate to recover, whereas /dev/urandom never blocks. From a practical security perspective, /dev/urandom is the preferable place to gather character strings for Vault's uses.
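For example, a quick way to pull a printable 64-character string (48 random bytes encode to exactly 64 base64 characters):

```bash
# 48 bytes from the non-blocking pool -> a 64-character string
head -c 48 /dev/urandom | base64
```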
In Action
It's difficult to simulate, but if you have a container available to shell into, pull a long string (64 characters or so) from /dev/urandom, keep doing it several times over, and poll the available entropy as you go. If you see the entropy estimate drop below 200, you know the quality and size of the pool is poor, which handicaps any software that's supposed to receive truly random strings for key/cert generation. A rough sketch of that loop follows.
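Something like this, run inside the container (the iteration count and sleep are arbitrary, and how much the estimate moves varies by kernel version):

```bash
# Repeatedly pull 64 bytes from /dev/urandom, printing the kernel's
# entropy estimate after each read to watch the pool being drawn down
for i in $(seq 1 20); do
  head -c 64 /dev/urandom | base64 > /dev/null
  cat /proc/sys/kernel/random/entropy_avail
  sleep 1
done
```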
One of the greatest tolls on this pool of entropy is the openssl utility wherever generating new certs is involved. Go ahead and give it a try on a recently booted system versus a freshly started container. Both will appear to succeed for a short while, but the one with less entropy will eventually fail, complain, or show patterns, which is much less secure than we'd like.
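For instance, generate a throwaway self-signed cert and then check what the run left in the pool (the paths and subject here are arbitrary):

```bash
# Key generation is the randomness-hungry step
openssl req -x509 -newkey rsa:4096 -nodes -days 1 \
  -subj "/CN=entropy-test" \
  -keyout /tmp/key.pem -out /tmp/cert.pem

# Check the entropy estimate afterwards
cat /proc/sys/kernel/random/entropy_avail
```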
Tested and Observed... all good!
Having run something like HashiCorp's Vault on hardware, on VMs, and in Kubernetes, I compared the available entropy values across all three environments and saw little variance in the quality of entropy that the container images or VMs provide. This all assumes the Vault servers are part of recommended and supported Vault architectures, as clustered sets; I also tested single 'dev' instances of Vault with the same outcome. The same expected amount of entropy is available, and I have not yet been able to force Vault to generate identical random strings or keys.
This is good news both for Vault and for running it on Kubernetes, where I originally guessed we'd see a problem with entropy availability. I chalk this up as learning and find it very interesting. The hilarious side-effect is seeing how few people seem to care about entropy. I'll go back to my homelab and laugh it off; my concerns were far more "creative" than necessary, it seems.
TL;DR
Tested and observed that entropy values across hardware, VM, and container clusters of Vault are consistent and reliable; I have also spent far too much time thinking my way down this technological rabbit-hole.