Force 3: Services want to be on
//
Central Idea: Take architectural steps that inherently reduce your attack surface to relentlessly counterbalance the inherent desire of software and services to be on; don’t just rely on so-called attack surface management tools, other than for real-time issue discovery.
Continuing our theme of exploring the 6 fundamental forces that shape information security risk, we will now look at Force 3: Services want to be on. As we did in the last post, we will move from treating the symptoms to getting to grips with the underlying force itself.
First, a reminder of how we state Force 3: Unless positively constrained, attack surfaces grow. Risk is proportional to attack surface. Unknown services are never checked. There is a Murphy’s Law corollary to this, which could be stated as: services want to be on, unless you really want them to be on, in which case they often fail.
The genesis of this force is that new assets, services, software and components are constantly being introduced in planned and unplanned ways. There is often an economic incentive for manufacturers and software / service vendors to add new features and capabilities for actual or perceived competitive reasons, whether new functions, manageability or efficiency of support. Many features are enabled by default to encourage use and to promote adoption and stickiness. This is the reality we live in, and despite some vendor initiatives and regulatory efforts to drive the necessary secure-by-default and secure-by-design approaches, Force 3 continues to be a fact of life.
To deal with this, there is a useful collection of tools and approaches commonly referred to as attack surface management. Let’s explore whether these approaches really do address the root causes or just tackle the symptoms. I won’t cover the resiliency corollary in this post and will save the broader topic of reliability and resilience for another day.
Treating Symptoms - dealing with what the force does
Attack Surface Management. The main tactical approach to dealing with this force is to constantly discover what your attack surface looks like. So, attack surface management tooling aims to continuously find, categorize and classify what is discovered and compare that to what is expected. This is difficult at multiple levels: the transience of some services (they’re not always on), the challenge of accurately profiling what is discovered, and having sufficient visibility to know whether the observable attack surface is the actual attack surface. We also have the meta-problem of truly understanding the expected surface that you’re seeking to use as the baseline.
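To make this concrete, here is a minimal sketch of the core comparison at the heart of that loop: diff what discovery observed against what inventory says should be there. The hosts, ports and service names are purely illustrative assumptions.

```python
# Minimal sketch: compare a discovered attack surface against an expected
# baseline. Hosts, ports and protocols here are illustrative assumptions.
from dataclasses import dataclass

@dataclass(frozen=True)
class Service:
    host: str
    port: int
    protocol: str  # what the scanner believes is listening

# What inventory / architecture says should be exposed (the baseline).
expected = {
    Service("app.example.com", 443, "https"),
    Service("mail.example.com", 25, "smtp"),
}

# What the latest discovery scan actually observed.
discovered = {
    Service("app.example.com", 443, "https"),
    Service("app.example.com", 21, "ftp"),       # nobody expected this
    Service("legacy.example.com", 3389, "rdp"),  # unknown host entirely
}

unexpected = discovered - expected  # candidates for "embrace or kill"
missing = expected - discovered     # transient or failed services (the corollary)

for svc in sorted(unexpected, key=lambda s: (s.host, s.port)):
    print(f"UNEXPECTED: {svc.protocol} on {svc.host}:{svc.port}")
for svc in sorted(missing, key=lambda s: (s.host, s.port)):
    print(f"MISSING:    {svc.protocol} on {svc.host}:{svc.port}")
```

The hard part, as noted above, is making both sets trustworthy in the first place.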
Discover, Embrace or Kill. This is all about finding errant services and closing them down. This discovery regimen, followed by embrace (realizing it is authorized) or kill (when it isn’t), can become one giant game of whack-a-mole. Many organizations then just accept this as an operational part of their security or vulnerability management operations.
Vulnerability Management. You could make a strong argument that attack surface management tools are just one part of vulnerability management. Indeed, vulnerability management could hardly be deemed successful unless it was able to look across the entire attack surface. The major issue I see in organizations where vulnerability management wholly encompasses attack surface management is the inherent tendency to simply make sure the presented attack surface has no exhibited vulnerabilities, as opposed to treating the mere presence of a particular part of the surface as the vulnerability itself. For example, you discover an unexpected FTP server (remember those?) but vulnerability management tells you that it is not vulnerable (patched, strongly authenticated, well configured) and so it doesn’t get flagged. Except: who put it there, why, and how did it get there in the first place?
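As a sketch of that distinction, the check below flags a service whose mere presence violates policy even when the scanner reports no open findings. The policy entries and scan results are illustrative assumptions, not any particular tool’s data model.

```python
# Minimal sketch of treating presence as the finding: flag services whose
# existence violates policy, even when the scanner reports zero CVEs.
# Policy entries and scan results below are illustrative assumptions.

DISALLOWED_PROTOCOLS = {"ftp", "telnet", "rdp"}  # never expected anywhere
ALLOWED_INTERNET_PORTS = {443}                   # externally reachable ports

scan_results = [
    # (host, port, protocol, open_cves, internet_facing)
    ("files.example.com", 21, "ftp", 0, True),   # patched, but why is it here?
    ("app.example.com", 443, "https", 0, True),
    ("build.example.com", 8080, "http", 2, False),
]

for host, port, protocol, open_cves, internet_facing in scan_results:
    findings = []
    if protocol in DISALLOWED_PROTOCOLS:
        findings.append("disallowed service class - presence is the vulnerability")
    if internet_facing and port not in ALLOWED_INTERNET_PORTS:
        findings.append("unexpected internet exposure")
    if open_cves:
        findings.append(f"{open_cves} unpatched CVEs")
    if findings:
        print(f"{host}:{port} ({protocol}): " + "; ".join(findings))
```

Note that the well-patched FTP server is still reported, which is precisely the point.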
Asset Management. Enterprise asset / inventory management and the basic discipline of configuration management are necessary to control what is present in your environment. There are whole categories of solutions as part of IT and service management operations. These can be effective but do fail when a particular asset (say an operating system build) is upgraded and the vendor has slipped in a new service with an open (and possibly vulnerable) port that suddenly represents a significant new attack surface.
Baseline Configurations. For that last issue, and many other reasons of course, it is important to establish baseline configurations of software and other services rather than simply deploying what the vendor gives you. A big part of this security hardening includes removing or disabling unneeded or risky components. Doing this well can be tricky since vendors will often (inadvertently?) undo your disabled settings during updates. Fortunately, there is more help now than ever before with approaches like the Center for Internet Security’s benchmarks. Also, many, but not enough, vendors are taking a secure-by-default and secure-by-design approach more seriously.
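A minimal sketch of that drift check follows, assuming an illustrative hardened baseline; the setting names are made up for the example rather than drawn from any specific benchmark.

```python
# Minimal sketch of hardening-baseline drift detection: compare a host's
# current settings against the hardened baseline and flag anything a vendor
# update may have silently re-enabled. Setting names are illustrative.

hardened_baseline = {
    "telnet_service": "disabled",
    "smbv1": "disabled",
    "remote_management_port": "closed",
    "password_min_length": "14",
}

# Observed state after an OS upgrade.
current_state = {
    "telnet_service": "disabled",
    "smbv1": "enabled",               # slipped back in by the update
    "remote_management_port": "open",  # new default in the upgraded build
    "password_min_length": "14",
}

drift = {
    setting: (expected, current_state.get(setting, "<missing>"))
    for setting, expected in hardened_baseline.items()
    if current_state.get(setting) != expected
}

for setting, (expected, actual) in drift.items():
    print(f"DRIFT: {setting} expected '{expected}' but found '{actual}'")
```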
Default Deny Networking. Defense in depth is needed (see also this post that explores this in more detail). In particular, use network segmentation and good old fashioned firewalls to configure cross-domain connectivity with explicitly allow-listed flows for service and application DMZ (perimeter) configuration. This should be augmented so that all zones use a similarly explicit allow-listing of ports, and an inadvertently opened interior service is not exposed beyond the organization boundary or a specific segment. But what about zero-trust principles and "de-perimeterization", you ask? Well, zero trust is not about eliminating the zones and segments that assert control. Rather, it is about not assuming they can be the sole line of defense. To move to an environment of zero implicit trust you need to create explicit trust through a combination of controls: continuous strong human and device authentication, endpoint integrity checking, end-to-end protocol security (authentication and authorization), and context-specific access controls, all implemented in policy enforcement points that include networking controls to funnel traffic to such points.
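As a sketch of the default-deny idea, the check below permits a flow only if its (source zone, destination zone, port) tuple is explicitly allow-listed; the zone names and flows are illustrative assumptions.

```python
# Minimal sketch of a default-deny flow check: a connection is permitted only
# if it matches an explicitly allow-listed (source zone, destination zone,
# port) tuple; everything else is denied. Zone names are illustrative.

ALLOWED_FLOWS = {
    ("internet", "dmz", 443),         # public web traffic terminates in the DMZ
    ("dmz", "app_zone", 8443),        # DMZ proxies to the application tier
    ("app_zone", "data_zone", 5432),  # application tier to the database
}

def flow_permitted(src_zone: str, dst_zone: str, port: int) -> bool:
    """Default deny: permit only explicitly allow-listed flows."""
    return (src_zone, dst_zone, port) in ALLOWED_FLOWS

# An interior service accidentally left listening is still unreachable from
# the internet because no flow to it was ever allow-listed.
print(flow_permitted("internet", "dmz", 443))      # True
print(flow_permitted("internet", "app_zone", 21))  # False - denied by default
print(flow_permitted("dmz", "data_zone", 5432))    # False - must traverse the app tier
```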
Treating Causes - dealing with the force itself
The Holistic Attack Surface. In many cases the tactics we’ve covered are good enough, especially if you can run an ever-faster OODA loop over your game of whack-a-mole. But how do we deal with the nature of the force itself? The first step is to look at the problem more holistically: what actually is our entire attack surface? Looking at the diagram above, if we think about attack surface discovery as probing from source to destination, we can see four possibilities. Scanning outside-in is classic attack surface management. Inside-in is your regular internal asset / inventory approach. Now, I don’t doubt I’ve just annoyed a number of vendors who will object to the big “?” in the two remaining quadrants. Yes, I know there are tools and techniques that fill those quadrants: outside-out, to discover and profile services beyond your classic organization boundary, and inside-out, to profile shadow IT or unexpected exfiltration paths from your core IT. But I’d argue these are not well-defined services and are not integrated with the approaches that cover the other quadrants. The whole point of this diagram is to show that there should be one approach to cover it all - to truly look at the world as attackers might see you. I would love to see attack surface management approaches merge with enterprise inventory, and not just focus on the core IT of an organization but look at everything.
Next Generation Vulnerability Management. Much of the underlying force can be dealt with by taking a more expansive view of the goal of vulnerability management. Consider four layers of vulnerability management that detect and resolve broader configuration and architectural problems as well as dealing with the more common patch / bug fix cycle.
Coverage completeness, criticality ranking and dependency mapping. Having a continuously defined, enumerated and verified inventory of all the objects in your domain (internal or external), understanding their relative criticality in the context of the organization's business processes as well as the dependencies between them. Identify dependency discrepancies: something ranked as highly critical that is intimately dependent on something ranked as not critical signals either an error or a need to understand why the dependency doesn’t propagate criticality (which may in fact be a good design) - there is a sketch of such a check after this list. You all know, this is hard to do, very hard.
Component flaw discovery and remediation. This is what most refer to as vulnerability management - the discovery (by various techniques) of flaws in software / other objects that can be exploited. These are remediated by fixes / patches, layered mitigation or compensating controls.
Configuration flaw discovery and remediation. A system that is free of component flaws (patched and up-to-date) can, of course, still be riddled with exploitable vulnerabilities due to its configuration, whether by erroneous design or by accident (drift from the expected configuration). Hence, it is important to ensure adherence to standards or baselines through continuous monitoring and / or continuous redeployment of assured, pristine builds, and by validating the overall system-wide configuration.
Architectural goal enumeration and enforcement. Defining and enforcing design patterns across an environment such that individual flaws or issues from layers 1, 2 or 3 have less potential effect or overall ‘blast radius’. This could be as simple as separation of services across security zones, service isolation, data desensitization, tokenization, immutable infrastructure patterns, and a myriad of others. There are two overall approaches to this:
Constraints. Developing rules for what potentially toxic arrangements of components should never exist. Scanning for these is as much a job for continuous vulnerability scanning as making sure unit components are patched and configured correctly (the sketch after this list includes an example of such a constraint check).
Obligations. Developing default architectural / design patterns for the deployment of common services and then monitoring for adherence to those as well as enforcing them as “policy as code” in various parts of the development and deployment lifecycle.
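To illustrate two of these layers together, here is a minimal sketch, under illustrative assumptions about assets and dependencies, of the layer 1 criticality-propagation check and a layer 4 “toxic arrangement” constraint (an internet-facing component directly depending on a sensitive data store).

```python
# Minimal sketch of two of the layers above over one illustrative model:
# layer 1 - criticality that does not propagate along dependencies, and
# layer 4 - a "toxic arrangement" constraint. Assets and edges are made up.

assets = {
    # name: (criticality 1-5, internet_facing, holds_sensitive_data)
    "web_frontend":    (4, True,  False),
    "payments_api":    (5, False, False),
    "customer_db":     (5, False, True),
    "message_queue":   (2, False, False),
    "batch_reporting": (2, False, False),
}

# Dependency graph: asset -> assets it depends on.
depends_on = {
    "web_frontend":    ["payments_api", "customer_db"],  # direct DB access
    "payments_api":    ["customer_db", "message_queue"],
    "customer_db":     [],
    "message_queue":   [],
    "batch_reporting": ["customer_db"],
}

# Layer 1: a highly critical asset depending on something ranked much lower
# is either a ranking error or a deliberate (and explainable) design choice.
for asset, deps in depends_on.items():
    crit = assets[asset][0]
    for dep in deps:
        if assets[dep][0] < crit - 1:
            print(f"DISCREPANCY: {asset} (criticality {crit}) depends on "
                  f"{dep} (criticality {assets[dep][0]})")

# Layer 4 constraint: nothing internet-facing may depend directly on a
# store of sensitive data.
for asset, deps in depends_on.items():
    if assets[asset][1]:          # internet facing
        for dep in deps:
            if assets[dep][2]:    # holds sensitive data
                print(f"CONSTRAINT VIOLATION: internet-facing {asset} "
                      f"depends directly on sensitive store {dep}")
```

The same checks could equally run as “policy as code” gates in the deployment pipeline rather than as after-the-fact scans.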
Continuous Discovery. Dealing with the force is about reducing the need to play the game of discovery whack-a-mole. However, discovery as a check and balance is still needed, and it should be done well across multiple dimensions, the most important being:
Coverage completeness: not just scanning network ranges and ports but looking at your other surfaces, like the identity perimeter of access configuration or the API verbs exposed within specific services.
Frequency: parts of your attack surface may be transient, so continuously operating, high-frequency discovery that can spot even the most fleeting availability of services is important.
Suspicion: deeply analyze the mechanisms of categorization to watch for categorization errors - bringing in other sources like traffic profiling and logs to validate categorization.
Seek Adjacent Benefits. I’ve covered the importance of deliberately driving adjacent benefits to your security program here. This space is no different, and a well-executed set of attack surface management and root cause mitigation activities can have multiple benefits, from reducing cost (spotting unused and expensive infrastructure, cloud or SaaS services) to identifying service conflicts that impact reliability. Also, having a more complete picture of your surface means you can more easily challenge the naive use of inaccurate external security ratings.
Software Defined Infrastructure (“Attack Surface Management-as-Code”). One of the better ways of dealing with the root cause of this force is to take advantage of the possibilities of software defined infrastructure. If you can declaratively specify secured and minimally exposed services, with complementary controls (controls-as-code) used to instantiate and constantly verify the actual implementation, then you are less likely to be surprised. In fact, in the limit, your attack surface management might be as much analysis of your configuration specification as it is real-time discovery of the observed effect of that configuration once deployed.
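A minimal sketch of analyzing the specification itself before anything is deployed; the spec structure and field names are illustrative assumptions rather than any particular infrastructure-as-code schema.

```python
# Minimal sketch of "attack surface management as code": statically analyse a
# declarative service specification before deployment rather than waiting to
# discover its effect in the wild. Field names are illustrative assumptions.

service_spec = {
    "web_frontend":  {"ports": [443], "public": True, "tls": True},
    "admin_console": {"ports": [8443, 22], "public": True, "tls": True},  # oops
    "customer_db":   {"ports": [5432], "public": False, "tls": True},
}

def analyse(spec: dict) -> list[str]:
    """Flag declared public exposure that violates the intended minimal surface."""
    findings = []
    for name, cfg in spec.items():
        if cfg["public"]:
            if not cfg.get("tls", False):
                findings.append(f"{name}: publicly exposed without TLS")
            for port in cfg["ports"]:
                if port != 443:
                    findings.append(
                        f"{name}: declares public exposure of port {port} - "
                        "review before this spec is ever deployed")
    return findings

for finding in analyse(service_spec):
    print(finding)
```

Runtime discovery then becomes a verification that the deployed reality matches the analyzed specification, not the first time you learn what exists.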
Architectural Secure by Default. It should not matter whether a specific service or feature is enabled or even vulnerable if you’ve made the correct architectural choices to shield production services or end-users from it even being reachable. One way to think about this is to ask what the “potential energy” of your environment is: what single change would unleash the “kinetic energy” of that attack surface being exposed? In other words, if your attack surface is one service or port configuration change away from being massively exposed to the kinetics of an attack, then you’re probably missing some architectural defenses.
Extended Supply Chain. Your holistic attack surface is now made up of not just your own directly managed environment and known 3rd-party vendors but also a potentially larger surface induced by the Nth parties in that extended supply chain. Driving a secure-by-default approach across this is important, but first, having visibility is key.
Triangulation. Broaden the sources of discovery to enable checks and balances that reduce the risk of surprise. Such fractal self-checking (peer-to-peer surprise detection), where every control point is also a sensor that can look for anomalous devices, services or flows in its immediate surroundings, is very useful. Performing triangulation (reconciliation) across many sources of inventory and other data, from open source intelligence, brand monitoring and DNS records through to inbound as well as outbound traffic flows, can reveal new insights.
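A small sketch of that reconciliation, assuming three illustrative sources: anything present on the wire or in DNS but absent from the authoritative inventory is a surprise worth investigating.

```python
# Minimal sketch of triangulation: reconcile asset names seen by independent
# sources so each acts as a check on the others. Hostnames and source names
# are illustrative assumptions.

sources = {
    "cmdb":        {"app.example.com", "mail.example.com"},
    "dns_records": {"app.example.com", "mail.example.com", "demo.example.com"},
    "netflow":     {"app.example.com", "demo.example.com", "10.1.2.3"},
}

all_assets = set().union(*sources.values())

for asset in sorted(all_assets):
    seen_by = [name for name, seen in sources.items() if asset in seen]
    missing_from = [name for name in sources if name not in seen_by]
    if missing_from:
        print(f"{asset}: seen by {seen_by}, missing from {missing_from}")
# e.g. demo.example.com is on the wire and in DNS but absent from the CMDB -
# exactly the kind of surprise triangulation is meant to surface.
```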
Path Modeling. Construct dependency maps and risk graphs from discovered assets and services, then reconcile against modeled inventories, policies and expected configuration. Understanding potential attack paths can not only aid in prioritizing resolution but can help improve the defense in depth of architectural controls. Remember, as John Lambert has said “Defenders think in lists. Attackers think in graphs. As long as this is true, attackers win.”
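To show the graph mindset, here is a minimal sketch that enumerates simple paths from internet-exposed assets to crown-jewel assets over an illustrative connectivity graph; the asset names and edges are assumptions.

```python
# Minimal sketch of thinking in graphs: enumerate simple paths from an
# internet-exposed entry point to crown-jewel assets over a discovered
# connectivity / dependency graph. Assets and edges are illustrative.
from collections import deque

edges = {
    "internet":     ["web_frontend", "vpn_gateway"],
    "web_frontend": ["payments_api"],
    "vpn_gateway":  ["jump_host"],
    "jump_host":    ["customer_db"],
    "payments_api": ["customer_db"],
    "customer_db":  [],
}
crown_jewels = {"customer_db"}

def attack_paths(graph: dict, start: str, targets: set) -> list[list[str]]:
    """Breadth-first enumeration of simple paths from start to any target."""
    paths, queue = [], deque([[start]])
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node in targets:
            paths.append(path)
            continue
        for nxt in graph.get(node, []):
            if nxt not in path:  # avoid cycles
                queue.append(path + [nxt])
    return paths

for path in attack_paths(edges, "internet", crown_jewels):
    print(" -> ".join(path))
# Each hop an attacker must traverse is a place to add or verify a control;
# a one-hop path is the kind of "potential energy" to design away.
```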
Policy and Compliance Surface. For many organizations, you don’t just have a security attack surface; you also have the potential for compliance issues. Your approach here can deal with that too. If you’re restricted from running services or holding data in specific countries, then monitoring for violations of that as part of your discovered attack surface will be important. For example, your exposed IT services in Syria might be perfectly well configured and secured, and that would all be fine except you’re not supposed to have any services in Syria.
Hidden Surfaces. Some parts of your attack surface might be hidden or embedded inside other services. For example, how much of an attack surface do your ML-trained or LLM-based services represent in the face of careful prompt engineering or training data discovery risks?
Bottom line: attack surface management should be more about reducing your attack surface at the source than about a never-ending cycle of discover, embrace or kill. But as you continue to do discovery, make sure it is fast, holistic and relentlessly feeds back into your architectural secure-by-default work to truly get to the root cause of Force 3.