The Cyber Safety Review Board's review of the log4shell incident

A useful summary of the events surrounding CVE-2021-44228, but not too much more.

Jul 16, 2022

The newly created Cyber Safety Review Board (CSRB) has released their report on the log4shell (CVE-2021-44228) incident.

I previously expressed some skepticism about the then-proposed plan for yet another federal organization with responsibility for analyzing cybersecurity issues, arguing that the diffusion of responsibility would only muddy the waters further. Although I stand by that position, I think the report is a generally useful document for those not deeply steeped in cybersecurity matters. It explains some of the big problems in the space that set the conditions for the log4shell incident, which is helpful for framing and context. Its recommendations, however, are vague and typical of such august bodies. Furthermore, they (surprise!) advocate for more such blue-ribbon panels to further study the tough problems.

With that said, I’ll review some of the key pieces of information from the report, grouped into themes. I won’t provide a timeline or summary of events (because the report does a great job with this); my goal is to summarize the key findings and identify where the recommendations could be improved.

Every hacker in the world came out of the woodwork

I think the most revealing statistic from the report was that “[f]ive days following the [log4j] flaw’s disclosure, Cloudflare observed 400 exploitation attempts per second…totaling millions of scanning attempts to identify vulnerable systems” (p. 4). Although many of these were probably security researchers or automated tools just scanning to find instances of the vulnerability, there is no doubt that many of these were bona fide malicious exploitation attempts.

Although the board notes that the actual damage appears to have been less than feared, it is easy to see how a similar event could have a catastrophic impact on global networks. Enterprises in general, and governments and insurers in particular, should pay attention to the systemic risk of every cyber criminal in the world simultaneously exploiting such a serious issue.

Oddly, from my perspective, the board noted “no authoritative source exists to understand exploitation trends across geographies, industries, or ecosystems” (p. v). Although this is literally true, I believe one could use the Exploit Prediction Scoring System (EPSS) to provide such an overview. Although it doesn’t provide a register of exploitations (companies obviously want to keep this information confidential), it does offer a regularly updated statistic regarding the probability of exploitation of a given CVE. You could potentially work backwards from this using historical data (e.g. from Shodan) to try to get an idea of the scale of exploitation events. It would be a rough number, but it would give you an idea of the magnitude of the problem. This omission also shows that awareness of the EPSS appears to be limited throughout the industry, a problem that I am trying to help solve (as a recently added contributing member to the project).
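A minimal sketch of that back-of-the-envelope approach might look like the following. Everything here is an assumption for illustration: the function name, the region labels, the host counts, and the probability are all hypothetical placeholders, not values from EPSS, Shodan, or the report. It also applies a CVE-level probability uniformly to every exposed host, which is a strong simplification that the EPSS model itself does not make.

```python
def rough_exploitation_estimate(exposed_hosts_by_region, epss_probability):
    """Scale exposed-host counts by an EPSS-style probability.

    Assumption: the CVE-level probability is applied uniformly to every
    exposed host. This yields an order-of-magnitude figure at best.
    """
    if not 0.0 <= epss_probability <= 1.0:
        raise ValueError("EPSS probability must be between 0 and 1")
    return {
        region: hosts * epss_probability
        for region, hosts in exposed_hosts_by_region.items()
    }


# Placeholder inputs for illustration only (hypothetical, not real data).
exposed = {"region_a": 1000, "region_b": 500}
estimate = rough_exploitation_estimate(exposed, epss_probability=0.9)

for region, expected in estimate.items():
    print(f"{region}: roughly {expected:.0f} hosts potentially exploited")
```

The output is only as good as the exposure data fed into it, so the result should be read as a magnitude, not a count.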

There was a huge amount of noise

Something that clearly struck me as true is the fact that the incident created a massive amount of “noise” that was hard to cut through (p. 13). Most illuminating was the fact that one federal department dedicated 33,000 hours to respond to log4shell (p. 17). The fact that a single government agency lost more than a dozen man-years dealing with this vulnerability demonstrates how big of an issue the simple communication aspect of these situations can be.
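For anyone who wants to check the conversion from hours to years, here is a quick sketch. The 33,000-hour figure is the one cited from the report; the 2,080-hour work year (40 hours times 52 weeks, with no leave or holidays) is my assumption.

```python
HOURS_REPORTED = 33_000        # figure cited from the report (p. 17)
HOURS_PER_WORK_YEAR = 40 * 52  # assumption: 2,080 hours, ignoring leave and holidays

person_years = HOURS_REPORTED / HOURS_PER_WORK_YEAR
print(f"{person_years:.1f} person-years")  # prints "15.9 person-years"
```

Any realistic allowance for leave and holidays would push the number higher, so "more than a dozen" is, if anything, conservative.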

Also of note were the various data sources (of varying reliability) that people needed to consume to stay up-to-date: Twitter (p. 5), CISA bulletins (p. 6), GitHub (p. 7), and Slack (p. 7), to name a few. Interestingly enough, the board found that “[d]ue to the urgency of the situation, CISA shared other unvalidated third-party resources and encouraged readers to independently verify the information” (p. 7). This represents a huge risk for “secondary” attacks, where hackers could seed malicious code into purportedly legitimate updates during a crisis situation. Given the frequency of typosquatting attacks during normal times, it seems reasonable to expect that this tactic would be even more effective during a crisis. Imagine what would happen if a government agency accidentally endorsed a corrupt piece of software that was rapidly deployed throughout the global technology ecosystem.

Another important piece that caught my eye was the observation that “CVSS scores do not provide insight into the impact of the vulnerability in the context of an organization’s business assets, exposure, or operations, or the specific intent of attackers exploiting the vulnerability. As impacted organizations attempted to prioritize these risks and determine remediations, they were forced to sort through bulletins, advisories, and guidance from vendors, media, partners, and government sources.” (p. 16)

Understanding the true risk of a vulnerability in your network is challenging to do under the best of circumstances. Stitching together various sources of information that all list the issue as being “10.0 out of 10.0” does not help the situation. I’ll repeat what I observed soon after the log4shell incident and implied before it: CVSS is dead and should be retired.

Stating the obvious

Although I understand that the writers produced it for a broad audience, some of the statements within the report suggest a degree of naivety on the part of the authors. I have said previously that some of the findings were no-brainers, e.g. “[o]rganizations should have a documented vulnerability response program.” Varun Badhwar made a fair point, though, that some of those reading the report wouldn’t necessarily view such recommendations in the same way. This is a reasonable position, but I suppose I expected more specificity (see the Deploy Securely store if you need templates for building such a procedure).

Most revealing, however, were the board’s comments about risk management. For example, the report notes that when the Apache Software Foundation “made upgrades for Log4j available, deploying them was itself a risk decision, forcing a tradeoff between possible operational disruption and timeliness, completeness, and compensating controls.” (p. iv). I made this exact point six months ago, but it appears that this is a novel observation to the board.

Furthermore, the report observes that “software developers, maintainers, vulnerability response teams, and the U.S. government commonly made risk trade-offs about software use and integration. For example, organizations made the decision to use Log4j, rather than develop a logging framework from scratch” (p. 10).

The report reads as if this is something of a revelation. As any practitioner at the “sharp end” knows, however, these trade-offs happen every single day. Due to the double bind that individual contributors find themselves in, they often make the best call possible with the information available. It’s good to see formal acknowledgement that risk acceptance is a fact of life, but I believe the surprised tone shows there are still many in government, including those at the highest levels, who refuse to acknowledge this reality.

Furthermore, the board notes that the “Log4j developers might not have introduced the vulnerability in 2013 if they had had access at the time to training in secure coding practices consistent with established secure development lifecycle tools and techniques” and that “the only way to reduce the likelihood of risk to the ecosystem caused by vulnerabilities in Log4j, and other widely used open source software, is to ensure that code is developed pursuant to industry-recognized secure coding practices and an attendant audit by security experts” (p. 12). These are painfully obvious problems (and observations) to anyone who has worked in the software industry.

Finally, there are a bunch of “feel-good” recommendations littered throughout the report, without any specific guidance as to how to implement them, such as:

“As vulnerability identification capabilities mature (see Recommendation 12 on SBOMs), organizations should be prepared to champion and adopt new technologies to enhance the speed of vulnerability mitigation.” (p. 21).

“CISA should expand its capability to develop, coordinate, and publish authoritative cyber risk information.” (p. 19).

Enterprises should “disrupt the traditional vulnerability discovery and software maintenance process, which currently relies on imperfect and incomplete scanning methods for vulnerable systems and software.” (p. 28)

The tough issues

For all of my critiques, the board does identify some of the thorny underlying problems that led to the log4shell crisis and proposes some potential solutions. For example, they recommend that companies building “commercial software that includes open source libraries or dependencies should commit financial resources toward the open source projects that they deploy” (p. 24). I agree that private sector companies need to do more to secure the open source projects they rely on. There are obvious free-riding problems here, in that some organizations will not contribute but continue to use the projects. But frankly, I think that should be an accepted cost of doing business for those that do help out.

As I have posted previously, I am a big fan of paying open source maintainers to better secure their projects. To make this practice more widespread, I think it makes sense for organizations to contractually require the same of their vendors. Many compliance frameworks require 4th-party security measures, and I think this recommendation is just a natural extension of the concept.

The board also described the state of the art of Software Bills of Materials (SBOMs), noting that no “representative groups for organizations currently using SBOMs in their environments…reported having leveraged them to identify vulnerable deployments of Log4j.” Furthermore, the board writes that closing “SBOM standardization gaps would support a faster software supply chain vulnerability response.” (p. 13). While I am an advocate of SBOMs in general, there has been a ton of noise on this topic, and much more work is needed.
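To illustrate what leveraging an SBOM for this purpose could look like, here is a minimal Python sketch. It is not any organization's actual tooling. It assumes a CycloneDX-style JSON document with a top-level "components" array, it only inspects top-level components (not nested ones), and it treats log4j-core releases from 2.0 up to but not including 2.15.0 as affected by CVE-2021-44228, ignoring backported fix releases for older Java versions. The function names and the sample SBOM are hypothetical.

```python
import json
import re

# Assumption: log4j-core 2.0 through 2.14.x is treated as affected;
# backported fix releases for older Java versions are ignored for brevity.
FIRST_AFFECTED = (2, 0, 0)
FIRST_FIXED = (2, 15, 0)


def parse_version(version):
    """Turn '2.14.1' or '2.0-beta9' into a comparable (major, minor, patch)."""
    core = version.split("-", 1)[0]  # drop qualifiers such as '-beta9'
    numbers = [int(n) for n in re.findall(r"\d+", core)][:3]
    numbers += [0] * (3 - len(numbers))  # pad '2.0' to (2, 0, 0)
    return tuple(numbers)


def find_vulnerable_log4j(sbom):
    """Return 'name@version' for each top-level log4j-core in the affected range."""
    hits = []
    for component in sbom.get("components", []):
        if component.get("name") != "log4j-core":
            continue
        version = component.get("version", "")
        if FIRST_AFFECTED <= parse_version(version) < FIRST_FIXED:
            hits.append(f"{component['name']}@{version}")
    return hits


# Hypothetical sample SBOM for illustration only.
example_sbom = json.loads("""
{
  "bomFormat": "CycloneDX",
  "components": [
    {"group": "org.apache.logging.log4j", "name": "log4j-core", "version": "2.14.1"},
    {"group": "org.apache.logging.log4j", "name": "log4j-core", "version": "2.17.1"},
    {"group": "org.slf4j", "name": "slf4j-api", "version": "1.7.32"}
  ]
}
""")

print(find_vulnerable_log4j(example_sbom))  # prints ['log4j-core@2.14.1']
```

Even a lookup this simple depends on every supplier emitting component names and versions in a consistent way, which is exactly the standardization gap the board describes.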

Bizarrely, though, the report claims that “Executive Order (EO) 14028 Improving the Nation’s Cybersecurity provides a roadmap for the inclusion of SBOMs when providing software to the federal government” (p. 24). As I have said and will continue to say, the EO provides no such roadmap, and in fact, opened many questions which remain unanswered. As Tom Alrich and others have noted, the lack of clear guidance on the topic will slow implementation and actually contributes to the noise that made the log4shell incident so challenging to deal with.

International reactions

The log4shell incident was a global phenomenon, and it’s good to see the board took a look at how other countries responded. Of interest, the

“Israel National Cyber Directorate (INCD) leveraged internal and perimeter vulnerability scanners alongside a comprehensive configuration management database (CMDB) and internet service provider (ISP) data to filter, aggregate, correlate, and enrich data for providing information about the vulnerability…[and] noted that it closed the vulnerability relatively quickly because it aligned dedicated resources to its critical infrastructure organizations” (p. 8).

Although self-reported and potentially prone to exaggeration, it’s interesting to see that at least one nation (albeit a highly technologically advanced one) had a unified response to the crisis. Please let me know if there are any good resources to review about the INCD or the broader Israeli response.

Across the globe, in China, the picture is more sinister. Although an engineer at the firm Alibaba was the first to identify and responsibly disclose to Apache the log4shell vulnerability (p. iv), the Chinese government appears to have punished the company as a result. According to the board, “several Western and Chinese-language media sources reported that [the Ministry of Industry and Information Technology (MIIT)] suspended Alibaba from a cybersecurity threat information sharing platform partnership for failing to report the Log4j vulnerability to MIIT in a timely manner” (p. 15). Based on the lack of follow-up to the board’s requests for information, it seems almost certain that the Chinese government would have preferred to have concealed the vulnerability and weaponized it for their own purposes.

While it would have been perhaps less widespread, a stealthy state-sponsored attack using log4shell could have been even more devastating than previous ones such as the one targeting the Office of Personnel Management (OPM) or the slew of federal agencies that used SolarWinds as a technology contractor.

More government panels

Having served in two of the three branches of the federal government, I am deeply skeptical of its ability to improve people’s lives outside of very narrow circumstances. And I am on record as advocating for a reduction in the size of certain executive branch organizations. One of the biggest problems with expanding the federal government, in my view, is that it then becomes better able and incentivized to continue expanding. Such a vicious circle leads to waste and potential infringement on civil liberties.

It comes as no surprise to me, unfortunately, that the CSRB advocates for such an expansion of government programs and projects. These include exploring “the feasibility of establishing a Software Security Risk Assessment Center of Excellence (SSRACE).” (p. 27) and “recommend[ing] the U.S. government’s National Academy of Sciences Cyber Resilience Forum undertake a study of incentive structures to build secure software” (p. 28).

The issues, in my mind, are clear, and self-replicating blue ribbon panels are not the way to fix them. I would have preferred to have seen specific recommendations for legislative or executive actions, or a simple statement that certain problems are too difficult to solve at present.

One final recommendation I found interesting was that the “U.S. government should continue to invest in and engage with higher education institutions and training programs…establishing and incentivizing cybersecurity curricula and certification programs” (p. 24). Applying a “rising tide lifts all boats” approach makes sense in general to me, but understanding exactly how this would be implemented is important. I have written about how certain federal standards are actually harmful to some organizations’ cybersecurity posture, and would be concerned about such faulty logic entering the water supply.

Conclusion

All in all, I think the CSRB report is a useful historical document detailing the genesis and consequences of the log4shell incident. I would have loved to have seen more detailed and specific recommendations, as well as fewer suggestions for yet more investigative panels, but no one consulted me. My critiques about the report’s shortcomings may be sharp, but I feel like what is ostensibly the most powerful organization in the world should be held to an appropriately high standard. With that said, if you weren’t deeply involved in the log4shell saga when it was happening and are interested in diving more deeply into the topic, by all means give the report a read. Additionally, and as always, I am here to help anyone in the U.S. government if they are interested in what I have to say.