People think container security is hard. It's not… if you think about it the right way. And that's exactly where people tend to go wrong, which is why they think it's hard.
So let’s follow a thought pattern…
First we need to consider what a container is and what distinguishes it from a virtual machine.
In general a container has the following properties:
- Shared kernel
- Segmented view of resources
  - Separate process ID space
  - Separate filesystem mount space
  - Separate IPC memory segments
  - Separate view of the network
  - …
- Multi-platform
  - Linux VServer (from 2001!)
  - OpenVZ
  - AIX WPARs
  - Solaris Containers/Zones (from Solaris 10, 2005)
  - Linux containers (LXC, cgroups, etc)
    - Docker
    - Warden
  - …
You can see that containers aren't even new. I was using VServer on a RedHat 6 build (not RHEL; the old old RedHat!) in 2001 to create "bastion servers" on my firewall machine; one bastion had an Apache instance, another handled incoming SMTP, another handled incoming ssh. (Yes, yes; overkill for a home network at the end of a 640/128K DSL connection!)
Solaris tended to treat its containers as lightweight VMs; you could install a whole copy of the OS, or you could create 'sparse zones' which used the parent OS image as a read-only copy inside the zone. There was a persistent filesystem (typically from ZFS) associated with the zone.
There are plenty of ultra-cheap “virtual machine” providers that use OpenVZ and similar to give you a “server”. It’s not a VM, but it’s mostly good enough.
So what goes into a container?
Almost anything. As we saw from Solaris, a container could contain a whole OS install, including an ssh daemon, an IP address dedicated to the container and so on (so now we have identity and access management issues, remote scanning issues, change management issues…).
On the other hand a container could contain the minimum necessary to run the code. My “web” bastion server from 2001 had httpd and the minimum necessary libraries and binaries to run it.
With a language such as Go (which compiles to a static binary), a container could contain as little as a single binary plus a few pseudo-devices to make it work. It could have an "internal to the machine" Ethernet address, with the parent OS forwarding traffic on a single port to it.
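To make that concrete, here is a rough sketch (the handler and port are invented for illustration): a Go service built with `CGO_ENABLED=0` is a single static binary, so the container needs no libc, no shell and no other userland at all.

```go
// main.go: a minimal single-binary service of the sort that can live
// alone in a container. Built with CGO_ENABLED=0 it is fully static,
// so the image can contain nothing but this binary.
package main

import (
	"fmt"
	"log"
	"net/http"
)

func main() {
	http.HandleFunc("/", func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprintln(w, "hello from a single-binary container")
	})

	// Listen on one port only; the parent OS (or container runtime)
	// forwards external traffic to this port.
	log.Fatal(http.ListenAndServe(":8080", nil))
}
```

Everything a normal OS image ships with (shells, package managers, editors) is attack surface this service never needed.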
Over on the docker blog, Mike Coleman argues that containers are not VMs. He's right that they shouldn't be. Unfortunately that's how some people use them. In an earlier post I highlighted how some people may treat a container as a VM and flagged that this is wrong.
If containers are so flexible, how do I secure them?
Let’s look at the worst case scenario. A container may have:
- A cut down (maybe!) version of the OS
  - Glibc, bash, web server, support libraries, tools
    - Maybe different versions than those running on the host!
- Filesystems
  - Private to the container?
  - Shared with the host?
- Access to the network
  - Bridged to the main network?
  - NAT?
- Processes that run
  - Network listeners
  - User logins
Despite Mike’s comments, this sounds like a whole operating system to me. The only difference is the shared kernel.
So, in the worst case, we need to control this the same as the OS. And deal with the unique challenges.
It’s easy to draw parallels. To start with:
| | Virtual Machine | Container | New threat |
|---|---|---|---|
| "Parent Access" | Hypervisor attack; one VM may be able to use hypervisor bugs to gain access to another VM | Parent OS attack; a process may be able to escape the container | More people may have access to the host OS than to the hypervisor: more threat actors. The shared kernel is bigger than a hypervisor: bigger attack surface. The kernel may allow dynamically loaded modules: a variable attack surface! |
| "Shared resources" | Rowhammer, noisy neighbour, overallocation, ... | Mostly the same | Resource separation is now in the kernel; not as well segregated, and not all resources may be segregated |
| "Unique Code" | Each VM may run a different OS, different patch revisions, different software | Each container may run different software versions, different patch levels, different libraries | Scaling issue; we can run many more containers than VMs |
| "Inventory management" | What machines are out there? What OS are they running? Are they patched? What services are they running? ... | Mostly the same | We typically have strong controls over who can create new VMs, but we allow anyone to spin up a new container 'cos it's so quick and easy! |
| "Rogue code" | Is the software secure? Acceptable license terms? Vulnerability scanned? | Mostly the same | Anyone can build a container; it may have different versions of core code (eg glibc) than the core OS. Developers may introduce buggy low-level routines without noticing |
I think it becomes clear that, in the worst case, there are no new problems. We see all these at the VM level.
So can we use the same solution?
Where are things different?
Unlike a VM, a container may have other characteristics:
- Very dynamic
  - Harder to track, maintain inventory
  - If you don't know what is out there, how can you patch?
  - How many containers may suffer shellshock, heartbleed or CVE-2015-7547 (glibc) issues because we don't know where they run?
- Technology specific issues
  - Docker daemon running as root?
  - SELinux interaction
  - Root inside a container may be able to escape (eg via /sys)
  - Kernel bugs
    - Ensure your parent OS is patched!
What this highlights is that existing solutions may not scale properly. What works for 1,000 (or even 10,000) VMs won't work for 100,000 or millions of containers.
Can your solutions deal with containers spinning up, running for a few minutes, then shutting down again? Probably not.
The easy way
The above is the hard way of doing it. You can manage your containers that way if you want. If you do then you’re just using the container as a lighter-weight VM. You’re likely to hit problems. If you want a VM then build VMs. Automate your VM build processes; don’t rely on docker repositories (I hope no enterprise is using external docker repos, anyway… that should go without saying).
Instead, if you want to make use of containers you must follow Mike’s advice and stop thinking about containers as light-weight VMs.
Now you can start to think of the flexibility containers can give you. You can scale bigger and better.
While container technology allows you to treat a container as a VM, you should think of a container as:
- Transient
  - There is no persistent storage
- Short lived
  - Anywhere from microseconds upwards
- Immutable
  - You don't change the running container; you build a new image and deploy that
- Untouchable
  - You NEVER LOG IN!
- Automated
  - If you can't log in, you'd better automate the builds
Now we can start to build a container security model. We have an application focus, rather than an OS focus. You don’t allow people to deploy containers; your “application delivery system” does it for them.
Now we’re on the path to elastic compute; if your app deployment is hands off then you can start to scale!
Minimizing attack surface
We can follow basic hygiene rules:
- Harden the kernel
  - You don't have to use the vendor provided one
  - Ensure the OS is patched
- Integrate into your CI/CD process
  - No manual startup of containers
  - Only those that have passed testing, vuln scanning, etc can be run (a minimal sketch of this gate follows the list)
  - No external code without approval
  - Standard code hygiene (provenance, licensing, etc)
- Automate, automate, automate!
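As a minimal sketch of that gate (the digest, the map and the function names are invented for illustration; a real pipeline would use a signed artifact store or an admission controller rather than a hard-coded map), the deployer refuses any image the CI/CD run didn't approve:

```go
// deploygate.go: hypothetical sketch of "only images that passed the
// pipeline can run". The CI/CD process records approved image digests;
// the deployer refuses anything else.
package main

import (
	"errors"
	"fmt"
)

// approved maps image digests to the pipeline build that approved them.
// In reality this would be a signed, queryable store, not a literal map.
var approved = map[string]string{
	"sha256:9f86d081884c7d659a2feaa0c55ad015a3bf4f1b2b0b822cd15d6c15b0f00a08": "build-1234",
}

func deploy(imageDigest string) error {
	build, ok := approved[imageDigest]
	if !ok {
		return errors.New("image not approved by CI/CD; refusing to deploy")
	}
	fmt.Printf("deploying %s (approved by %s)\n", imageDigest, build)
	// ...hand off to the scheduler/orchestrator here...
	return nil
}

func main() {
	if err := deploy("sha256:not-a-real-digest"); err != nil {
		fmt.Println(err)
	}
}
```

The point isn't this particular check; it's that the decision to run an image is made by automation with an audit trail, not by a person typing `docker run`.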
This gives big wins.
- How many licenses are you using? You know because your automation tools will tell you what is running
- A CVE has been found; you know what images are vulnerable and where they are running
- Your image is more lightweight so may not be vulnerable in the first place!
- Patching doesn’t exist; you just redeploy
And another benefit… if your app can be delivered this way, you can also start to deploy VMs this way. A Linux OS without any form of login? Let's see someone brute-force bad ssh passwords on that :-)
But remember
Container security is weaker than VM security; the kernel has a large surface area to be attacked.
VM security is weaker than physical machine security.
Network security is weaker than air-gap separation.
Air-gapped machines are weaker than a server covered in cement and dropped to the bottom of the ocean.
We’ve seen that using a container as a light-weight VM doesn’t make sense; you haven’t solved any security issues, you’ve just increased them!
Your security stance depends on your needs and risk evaluation. Pick the technology that is right for you and use it appropriately.
Summary
Google have been using containers for around a decade. Docker has made massive strides in bringing containers down to an easily automatable format.
But containers are not, and should not be, VMs.
Google Functions? Amazon Lambda? They're containers! A rules engine triggers on an event, starts up a container, runs your function, destroys the container. A container that spins up, runs and is destroyed in a fraction of a second. How cool is that?
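A rough sketch of that lifecycle (the image name and payload are invented, and it shells out to a local docker CLI purely for illustration; real FaaS platforms have their own schedulers): each event gets its own throwaway container, and `--rm` guarantees nothing survives it.

```go
// ephemeral.go: sketch of the "spin up, run, destroy" pattern.
package main

import (
	"log"
	"os/exec"
)

// handleEvent runs one short-lived container per event; --rm removes the
// container as soon as the function exits, so nothing persists.
func handleEvent(payload string) error {
	cmd := exec.Command("docker", "run", "--rm", "my-function-image", payload)
	out, err := cmd.CombinedOutput()
	log.Printf("function output: %s", out)
	return err
}

func main() {
	if err := handleEvent(`{"example": "event"}`); err != nil {
		log.Fatal(err)
	}
}
```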
A PaaS like Cloud Foundry; your app runs inside a container. The autoscaler will spin up new containers across the cloud. A VM dies, the PaaS will spin up a new copy. All you deal with is your app code; you don’t touch the OS.
Docker can build lightweight containers that can then be thrown into Mesosphere; a Swarm of apps can be started up in containers across the cloud quicker than you can read this post.
And they all work by following the basic rules:
- Transient
- Immutable
- Untouchable
- Shortlived
- Automated
I think this post is long enough; we still need to touch on lifecycle management to complete the security model.
But you already have that… right?