One of the golden rules of IT security is that you need to maintain an accurate inventory of your assets. After all, if you don’t know what you have then how can you secure it? This may cover a list of physical devices (servers, routers, firewalls), virtual machines, software… An “asset” is an extremely flexible term and you need to look at it from various viewpoints to ensure you have good knowledge of your environment.
Traditionally many of the tools we use to track this stuff are based around a relatively static concept of machine persistence; if it takes months to get tin onto the datacenter floor and the machine then stays there for three or more years, it makes sense to track physical assets that way. Similarly, if you build a VM that stays around for a year and runs a known application, we can use that VM as the basis for an inventory record. We can create tools that use the OS as the core record, scan the machine (e.g. ssh in), discover relationships (child VMs; OS version; software running; communications between servers; etc.) and automatically build out the inventory. This is good standard enterprise IT management.
The cloud, though, can change this. I can spin up a server in under a minute, run my application for 2 hours, then shut it down again. Tomorrow I can spin up another copy (with a new machine identity, new IP address, etc). The machine isn’t up long enough to be scanned, and even if we could scan it the records would quickly be stale and useless.
So we need to rethink how we deal with inventory in a cloud world.
IaaS
The first thing people typically think of, when considering cloud, is the IaaS model. Spin up an AWS VPC, create networks, VMs, install your app. We’re in the cloud! And this is fine. You’re basically treating the cloud as an outsourced data center, with the assets being an extension of your core environment. In this scenario your existing tools and processes may well be sufficient.
However, a primary feature of IaaS is that it can be a lot more dynamic than that. There’s an API to let you build new assets and these are available almost immediately. I could run a script that builds a new VPC, creates a three-tier network architecture, installs VMs and deploys my app. Within minutes I have a full stack.
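As a rough illustration (a boto3 sketch with made-up IDs, not production code), the whole “new VPC, subnet, VM” dance is only a handful of API calls:

```python
import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")

# Create a VPC and a single subnet; a real stack would add more tiers,
# routing, security groups and the application deployment itself.
vpc = ec2.create_vpc(CidrBlock="10.0.0.0/16")["Vpc"]
subnet = ec2.create_subnet(VpcId=vpc["VpcId"], CidrBlock="10.0.1.0/24")["Subnet"]

instance = ec2.run_instances(
    ImageId="ami-0123456789abcdef0",   # hypothetical AMI ID
    InstanceType="t3.micro",
    MinCount=1,
    MaxCount=1,
    SubnetId=subnet["SubnetId"],
)["Instances"][0]

print("Stack ready:", vpc["VpcId"], subnet["SubnetId"], instance["InstanceId"])
```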
This can lead to a shadow IT problem, so it becomes very important to control access to IaaS services. Central IT should provide the tools and the gateway services to the cloud, and not allow direct administrative access.
This can also remove the “need to scan” nature of existing systems; your tooling can capture the inventory data by requiring it to be specified up front (“I’m building 3 VMs with RedHat 7 to run application XYZ; it needs to listen on port 443 and speak to database ABC”). You have the data needed for your inventory, up front!
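As a sketch of what “specify it up front” might look like on AWS (the tag keys here are invented for illustration, not a standard), the build tooling can attach the inventory data as tags at creation time:

```python
import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")

# The tooling only builds what has been described up front; the metadata
# travels with the instances as tags, so the inventory is populated at birth.
ec2.run_instances(
    ImageId="ami-0123456789abcdef0",   # hypothetical RHEL 7 AMI
    InstanceType="t3.micro",
    MinCount=3,
    MaxCount=3,
    TagSpecifications=[{
        "ResourceType": "instance",
        "Tags": [
            {"Key": "Application", "Value": "XYZ"},
            {"Key": "OS", "Value": "RedHat 7"},
            {"Key": "ListenPort", "Value": "443"},
            {"Key": "TalksTo", "Value": "database-ABC"},
        ],
    }],
)
```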
The same can be true for other sorts of infrastructure (networking, WAFs, firewalls, storage). If you must permit this level of administrative access to the cloud provider, then ensure there’s an audit trail of activity (e.g. AWS CloudTrail) that can be used to track the API calls and maintain the inventory information accordingly.
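For example, on AWS a CloudTrail query along these lines (a boto3 sketch, not a complete solution) can pull back who created what, so the inventory can be reconciled against what was actually built:

```python
import boto3
from datetime import datetime, timedelta

cloudtrail = boto3.client("cloudtrail", region_name="us-east-1")

# Pull the last day of instance-creation API calls.
events = cloudtrail.lookup_events(
    LookupAttributes=[{"AttributeKey": "EventName", "AttributeValue": "RunInstances"}],
    StartTime=datetime.utcnow() - timedelta(days=1),
    EndTime=datetime.utcnow(),
)

for event in events["Events"]:
    print(event["EventTime"], event.get("Username"), event["EventName"])
```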
However we’re not quite finished… a lot of inventories are point-in-time snapshots: “this is what we have now”. Unfortunately a data breach may not be detected for quite a while (I’ve seen 200 days thrown around a lot). Since your cloud server may have been shut down, its IP address re-used for another application, and a new instance created elsewhere, we need the ability to “time travel”: maintain an audit trail that lets us recreate the state of the inventory at whatever time we care about.
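On AWS, one way to approximate this “time travel” is a configuration-history service such as AWS Config, which records how resources change over time; a minimal sketch of a point-in-time query (the instance ID and dates are hypothetical, and this assumes Config was recording in that account and region):

```python
import boto3
from datetime import datetime

config = boto3.client("config", region_name="us-east-1")

# Ask what a (hypothetical) instance looked like around the time of interest,
# e.g. when a breach is believed to have started.
history = config.get_resource_config_history(
    resourceType="AWS::EC2::Instance",
    resourceId="i-0123456789abcdef0",   # hypothetical instance ID
    laterTime=datetime(2017, 3, 1),
    earlierTime=datetime(2017, 1, 1),
)

for item in history["configurationItems"]:
    print(item["configurationItemCaptureTime"], item["configurationItemStatus"])
```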
And, bonus! This also helps speed up delivery of internal IaaS services.
PaaS
In some respects an off-premise PaaS provider is probably easier to manage than an IaaS. PaaS engagements are typically more “heavyweight” than just spinning up a few VMs, and the teams using them are more likely to follow corporate processes around vendor management (especially if the team wants a VPN into the PaaS environment); this helps ensure the application inventory is correctly maintained. At this point we can’t create an “application to host” mapping, because there is no fixed host, but we can easily see how the PaaS provider becomes an equivalent parent entity.
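To make that concrete, here’s a toy sketch of an inventory record where the provider, rather than a host, is the parent; the field names are invented purely for illustration, not a standard schema:

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class InventoryEntity:
    name: str
    entity_type: str                          # "host", "paas-provider", "application", ...
    parent: Optional["InventoryEntity"] = None

paas = InventoryEntity("ExamplePaaS Ltd", "paas-provider")

# No fixed host to point at, so the application hangs off the provider instead.
app = InventoryEntity("payments-service", "application", parent=paas)

print(app.name, "runs on", app.parent.name)
```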
Of course some teams may bypass the controls (another shadow IT issue), especially where the PaaS is directly reachable and managed via HTTPS. Some of these concerns can be addressed in a similar way to SaaS, which we’ll get to in a moment.
An on-premise PaaS can be a challenge for existing processes to handle. It’s tempting to treat it like an off-prem solution (“my application runs on the in-house PaaS”), but we lose some of the ability to determine the impact of a breach. Common post-breach investigation processes use the inventory to determine application scope, but with a PaaS almost any application in the environment could have been co-located on the same physical machine at the time an application was broken into. Inventory and audit trails don’t help so much here because elastic applications, especially 12-factor applications, are highly dynamic inside a PaaS environment. Yes, this also impacts off-prem PaaS, but there we’re accepting that those are opaque boxes; with an on-prem PaaS we have the ability to take the lid off the box… and this could hamper us!
I’d love to hear from any DFIR experts on how they handle (or plan to handle) PaaS breaches; the best I’ve heard, so far, is “comprehensive audit logs of everything that happens”. Please, feedback welcome!
SaaS
Now we come to the scariest part of the cloud, mostly because your organisation may be using SaaS services without your knowledge. Wolf Goerlich claims that your average organisation uses 928 cloud services. There’s also a claim that CISOs reckon they only use 40 cloud apps. Now there may be a discrepancy between “cloud app” and “cloud service” accounting for this massive gap but, even so, 928 is a lot, and most of these will be SaaS offerings.
We’re not talking about your centrally managed services (e.g. the corporate Office365, or corporate Google Apps); those we have pretty good control over: we can track their use and understand them as assets.
The problem is the unmanaged services. Many of these SaaS offerings are just ad-hoc web sites that people use without thinking. “I want to pretty-format my Perl code; ah, here’s a web site; let’s cut’n’paste into that form…”
Effectively this is almost all “shadow IT”, and it’s hard to identify.
You might wonder why that is a problem… well, that web site now has a copy of your code. Do they keep it? What terms and conditions were on that site? Many of them may claim ownership of the data submitted; has your developer now given away that code, just for the benefit of having it nicely formatted?
Even if the T&Cs are good, has the site ever been breached? How can you be sure there isn’t a bad guy on that service stealing data?
Then there are third-party repository services; developers may pull down modules or libraries from Docker Hub, or npm, or CPAN, or… These services may not be reliable (left-pad, anyone?), may have rogue code added, and can lead to non-repeatable builds.
How can you manage this if you don’t even know what is in use?
The first step is to identify what your teams are doing; which of those 928 services are being accessed. Then identify which of these may be considered high risk (e.g. T&Cs that you don’t like; poor reputation) and start blocking them at your border. You do force people to go via proxies, don’t you?
This starts to sound like a lot of work and is likely an area worth purchasing a service to handle; these can take your proxy logs and identify who is accessing what, whether it’s mostly downloads or if there are large uploads (which could identify data exfiltration), and provide reports on your risk posture. (And this can also help with the shadow IT PaaS issue, mentioned earlier).
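If you’d rather start small before buying something, even a crude script over the proxy logs gives you a first cut. This sketch assumes a CSV-style log with destination host and byte counts in columns named `dest_host` and `bytes_sent`; real proxy formats (Squid, Bluecoat, Zscaler, …) will differ:

```python
import csv
from collections import Counter, defaultdict

services = Counter()
uploads = defaultdict(int)

# Count hits per external service and total bytes uploaded to each.
with open("proxy.log") as fh:
    for row in csv.DictReader(fh):
        host = row["dest_host"]
        services[host] += 1
        uploads[host] += int(row["bytes_sent"])

print(f"{len(services)} distinct external services seen")

# Large outbound volumes to an unmanaged service are worth a closer look.
for host, sent in sorted(uploads.items(), key=lambda kv: kv[1], reverse=True)[:10]:
    print(f"{host}: {sent} bytes uploaded")
```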
Ultimately you may wish to create in-house alternatives; if you find lots of people hitting code-formatting sites then create your own! Create your own code repository service with curated libraries/code modules so you’re not dependent on external services. Or perhaps find a service provider that does what the ad-hoc service does and enter into a managed agreement.
Conclusion
There’s no one-size-fits-all solution to managing a cloud inventory. For managed services this can be done via central IT tooling (IaaS) or identified service agreements (SaaS); we need to change our technology processes to handle the more dynamic nature of these environments, but we can deal with that in various ways (e.g. dedicate a VPC to an application? Then VM spin-up/spin-down doesn’t matter so much; all VMs inside that VPC are dedicated to the app).
The bigger gap is the unmanaged SaaS footprint. The first step is identification. The second is control. This is made easier with strong egress controls on your environment, forcing traffic through choke and monitoring points. You want to do this anyway for Data Loss Prevention, malware scanning, blocking phishing sites and so on; the same infrastructure can be used to detect and limit risky SaaS usage… and to keep an inventory of what SaaS services are in use!