Welcome to the latest edition of Malware Monthly, where our teams of security researchers and developer advocates bring you the latest discoveries of malicious packages in software registries.
Thankfully, the 2022 holiday season did not deliver the level of disruption caused by the previous year's Log4Shell zero-day vulnerability. But some developers and security professionals did receive an unwelcome gift from bad actors this past Christmas. Users of PyTorch, a popular machine learning framework, dealt with a malicious dependency posing as a legitimate library.
In December 2022, we found 422 malicious packages in the npm registry, most of which exfiltrate data through typosquatting or dependency confusion attacks, including: ajax-cuuu, angular-nanoscroller, angular-stateful-fastclick, arcgis-charts-shared-utils, arcgis-components, aws-postgres-rotator, aws-rfdk-project, bluebird.node, chart.js-latest, datagrid-text, datagrid-web, jfrog-alfheim, phup.js, react-router-susanin, reactjs-slick, yandex-html5-video-player, yandex-tjson… and many more.
We also caught 58 malicious packages in PyPI, including the heavily obfuscated Discord token stealers proxier-api and nitro-api66. Keep in mind that we cover here just a few of the hundreds of malicious packages our AI-enabled system flagged before they entered your build environment.
Some of the malicious packages detected by our AI targeted developers using Mac computers. As an example, in cobo-python-api, threat actors used dependency confusion to trick developers into downloading a tainted version of the crypto library Cobo Custody Restful.
They leveraged the fact that this package doesn't have an official distribution through the PyPI registry. By uploading a compromised version with the same name on PyPI, attackers expect that the package manager (pip) used by developers will prioritize the malicious version over the legit GitHub version.
And what are they hoping to achieve?
The technique is well-known and widely used: include the malicious code in the setup script setup.py so developers deploy the malware as soon as they run the command pip install.
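Because setup.py runs at install time, it is worth reading before you install anything unfamiliar. The following is a minimal sketch, not our detection system: it parses a setup script without executing it and reports calls that commonly appear in install-time malware. The watch list and the sample script are illustrative assumptions, and a hit is only a prompt for manual review.

```python
import ast

# Hypothetical watch list: call names that often appear in install-time malware.
SUSPICIOUS_CALLS = {"system", "popen", "Popen", "exec", "eval",
                    "fromhex", "b64decode", "urlopen"}

def suspicious_calls(source: str) -> list[tuple[int, str]]:
    """Return (line number, call name) pairs for watch-listed calls.

    The source is only parsed, never executed.
    """
    findings = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Call):
            func = node.func
            # Cover bare names (exec) and attribute access (os.system).
            name = func.id if isinstance(func, ast.Name) else getattr(func, "attr", None)
            if name in SUSPICIOUS_CALLS:
                findings.append((node.lineno, name))
    return sorted(findings)

# Benign stand-in for a setup script; the hex string decodes to "echo hi".
SAMPLE_SETUP = '''
import os, sys
from setuptools import setup

if sys.platform == "darwin":
    os.system(bytes.fromhex("6563686f206869").decode())

setup(name="demo")
'''

for line_number, call_name in suspicious_calls(SAMPLE_SETUP):
    print(f"line {line_number}: {call_name}()")
```

Matching on the call name alone is deliberately crude: it will also flag harmless calls that share a name, so treat the output as a list of lines to read.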
If we look at setup.py, we find code that checks the operating system of the computer and, if it is macOS, decodes the concatenation of four hexadecimal strings that represent two commands:
one that downloads the file slack-helper and makes it executable, and
one that runs the binary slack-helper in the background and discards its output. At that point, our AI system had gathered more than enough information to consider this package suspicious. The binary has been flagged by four other vendors as malware.
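Splitting a command into several hexadecimal strings hides it from a casual reader, but the obfuscation is easy to reverse. Here is a minimal sketch of how an analyst might decode such fragments for inspection without running anything. The fragments below are benign placeholders we made up, not the strings from the package.

```python
def decode_hex_fragments(fragments: list[str]) -> str:
    """Join hex fragments and decode them to text for review; nothing is executed."""
    return bytes.fromhex("".join(fragments)).decode("utf-8", errors="replace")

# Placeholder fragments (an assumption for illustration): together they
# decode to two harmless commands separated by a semicolon.
FRAGMENTS = ["6563686f20", "6f6e653b", "206563686f", "2074776f"]

decoded = decode_hex_fragments(FRAGMENTS)
commands = [part.strip() for part in decoded.split(";")]
print(commands)  # ['echo one', 'echo two']
```

Joining before decoding matters: the pieces can be cut at any point, so the hex is only guaranteed to be valid once the fragments are concatenated.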
Our system flagged six malicious packages attacking Python developers with a specific tactic, and we helped take them down. By combining the capabilities of a remote access trojan (RAT) and an information stealer, these packages are strange mutations we hadn’t seen before in the PyPI registry.
With names such as easytimestamp, pyrologin, discorder, discord-dev, style.py, and pythonstyles, the malicious packages launch a PowerShell script that fetches a ZIP file and, in RAT fashion, installs the libraries pynput, pydirectinput, and pyscreenshot, which allow the attacker to control the target’s mouse and keyboard and to take screenshots.
If you're not familiar with RATs, a RAT is a type of malware that allows an attacker to gain remote access to and control of an infected machine. It gives the attacker the ability to view, copy, and modify files, and to monitor the victim's activities. Once a RAT is installed on a system (in this case with a simple pip install command), it establishes a connection with a command and control (C&C) server owned by the attacker, from which the attacker can control not just one but several infected machines.
Additionally, these malicious packages are also stealers, with the ability to extract sensitive information such as saved passwords, cryptocurrency wallet data, and cookies. They also seek to install cloudflared, a command-line tool for Cloudflare Tunnel, which would allow remote access to the infected machine via a Flask-based app.
Execution of shell commands, downloading and executing remote files, exfiltrating files and directories, and even running arbitrary Python code are some of the features of this novel attack that Phylum describes in more technical detail.
Users of a nightly build in the PyPI code repository ended their holiday season with an unexpected gift: a malicious dependency chain compromise. PyTorch, a machine learning framework, disclosed the compromise, saying the attack exclusively affected users of the PyTorch-nightly build and not users of PyTorch stable packages.
Between December 25 and December 30, 2022, the nightly build downloaded a library called torchtriton — a dependency that executes a malicious binary when imported. PyTorch removed the counterfeit library as a dependency for nightly packages and replaced it with a dummy package registered on PyPI to prevent similar attacks.
Sonatype's Ilkka Turunen looked into the details of this incident as well as the mechanics of dependency confusion in supply chain attacks.
In the previous Malware Monthly post, we talked about a series of malicious packages that establish a backdoor connection between the attacker and the target using a technique called reverse shell. Last month our AI system caught additional packages such as aidoc-consul, aidoc.genmfa, and aidoc-e2e-utils that were leveraging this very same attack vector.
If we take a look at one of the packages, aidoc-consul, and decode the base64 string, we find the commands
The attacker here is attempting to create a TCP connection to the IP 3.221.152.203 on port 771 and run the shell command that is sent on that connection, as well as redirect the output of the command to /dev/null so it doesn't appear in the terminal. Another package, discord, used a similar tactic but attempted to download a malicious Python script to execute commands.
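Base64 is a common wrapper for commands like these, so a quick triage step is to pull long base64 literals out of a source file and decode them for reading. The sketch below is one way to do that without executing anything. The regular expression, the length threshold, and the sample string are our assumptions, and the sample payload is a harmless placeholder rather than the command from aidoc-consul.

```python
import base64
import binascii
import re

# Assumed heuristic: quoted runs of 16 or more base64 characters deserve a look.
B64_LITERAL = re.compile(r"""["']([A-Za-z0-9+/]{16,}={0,2})["']""")

def decode_b64_literals(source: str) -> list[str]:
    """Decode every base64-looking string literal in the source for review."""
    decoded = []
    for match in B64_LITERAL.finditer(source):
        try:
            raw = base64.b64decode(match.group(1), validate=True)
        except (binascii.Error, ValueError):
            continue  # looked like base64 but is not valid; skip it
        decoded.append(raw.decode("utf-8", errors="replace"))
    return decoded

# Build a benign sample so the encoded value is guaranteed to be correct.
PAYLOAD = base64.b64encode(b"echo placeholder").decode()
SAMPLE = f'import os, base64\nos.system(base64.b64decode("{PAYLOAD}"))\n'

print(decode_b64_literals(SAMPLE))  # ['echo placeholder']
```

Decoded text that contains an IP address, a port, or a shell invocation is a strong signal that the package deserves a closer look.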
Following the technique we described in last month's edition, our automated system caught a series of packages on PyPI that concealed an import statement in their setup.py files by using spaces to create the illusion that there’s no malicious code in sight.
To refresh your memory, when you zoom out the code editor window you can reveal the hidden malicious code:
A technique that might trick the human eye, but not the advanced eye of our AI tool.
Packages such as pywz, https-rot, aio3, pxhttp, Pycolorio, instantcolor, AutoRequirements, nitro-checker, SeleniumWebdriver, and BetterColors were flagged as suspicious by our system and later confirmed as malicious by our security research team. These packages were all trying to do the same thing: download and execute malicious code from a remote URL.
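The whitespace trick is simple to check for mechanically. As a minimal sketch, the function below flags any line where visible code follows a long run of spaces or tabs. The threshold of 40 characters and the sample line are assumptions for illustration; they are not taken from the packages listed above.

```python
import re

def find_padded_code(source: str, min_gap: int = 40) -> list[tuple[int, str]]:
    """Return (line number, trailing code) where code follows a long whitespace run."""
    pattern = re.compile(r"[ \t]{%d,}(\S.*)$" % min_gap)
    hits = []
    for number, line in enumerate(source.splitlines(), start=1):
        match = pattern.search(line)
        if match:
            hits.append((number, match.group(1).strip()))
    return hits

# Benign sample: an import pushed 200 columns to the right of ordinary code.
SAMPLE = (
    "from setuptools import setup" + " " * 200 + ";import hidden_module\n"
    "setup(name='demo')\n"
)

print(find_padded_code(SAMPLE))  # [(1, ';import hidden_module')]
```

Deeply indented but legitimate code can trip the check too, so the threshold is worth tuning against your own codebase.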
If you were to ask us what's a common goal for most of the packages our AI/ML system catches in near real time, we'd say exfiltrating data to malicious servers. Information such as the uptime of the machine, OS release, OS version, system name, and host IP details is usually exfiltrated as part of proofs of concept (PoCs) or bug bounty programs. And our system flags hundreds of these packages.
Although they use a simple technique that in most cases doesn't cause much harm, these packages must be kept out of your build environment anyway: some of the exfiltrated data can be sensitive or lead to more elaborate attacks.
The library hubplus, for example, is not only using dependency confusion to have priority over the official version uploaded to GitHub, but is also exfiltrating data. The malicious code in setup.py uses the "requests" library to send an HTTP GET request to the URL hxxps[:]//cdmfmb12vtc00005adp0g8mk8pryyyyyp[.]oast[.]fun in order to exfiltrate the hostname, current working directory, and username.
You don’t want your machine giving out information to someone else. We flagged the packages typing-extnesions, ulrlib3, btoocore, paquete-malicioso1, edenred, edenred-payments, python-edenred-payments, skdh, nnabla-dataset-uploader, hayhanuman, thisismanan, and oscscreen, which were trying to do just that.
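Several of these names sit one swapped pair of letters away from a well-known library, which is what makes typosquatting work. The sketch below shows one simple way to catch that: an edit distance that counts an adjacent swap as a single step, compared against a list of names you trust. The trusted list and the threshold are assumptions, and the three targets are our reading of which libraries the names appear to imitate.

```python
from typing import Optional

def osa_distance(a: str, b: str) -> int:
    """Edit distance where insert, delete, substitute, and adjacent swap each cost 1."""
    rows, cols = len(a) + 1, len(b) + 1
    d = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        d[i][0] = i
    for j in range(cols):
        d[0][j] = j
    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,         # deletion
                          d[i][j - 1] + 1,         # insertion
                          d[i - 1][j - 1] + cost)  # substitution
            # Adjacent transposition, e.g. "ne" typed as "en".
            if i > 1 and j > 1 and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]:
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)
    return d[-1][-1]

# Assumed trusted list; in practice this would come from your own manifests.
KNOWN_PACKAGES = ["typing-extensions", "urllib3", "botocore"]

def likely_typosquat(name: str, max_distance: int = 2) -> Optional[str]:
    """Return the trusted package this name closely resembles, if any."""
    for known in KNOWN_PACKAGES:
        if name != known and osa_distance(name, known) <= max_distance:
            return known
    return None

for candidate in ["typing-extnesions", "ulrlib3", "btoocore", "requests"]:
    print(candidate, "->", likely_typosquat(candidate))
```

A distance check like this only covers lookalike names. Dependency confusion uses the exact name of a private package, so it needs a separate control.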
Since 2019, we've discovered a total of 103,850 packages flagged as malicious, suspicious, or proof of concept. The ones mentioned above are just the tip of the iceberg.
Sonatype's system uses ML/AI techniques to recognize unusual attributes for newly published components in public repositories.
Data delivered via our tooling's near real-time detection capabilities helps prevent our customers from inadvertently consuming malicious components.
Users of Sonatype Repository Firewall can rest easy knowing that such malicious packages would automatically be blocked from reaching their development builds.