We have repeatedly come across cases of open source registries like npm and PyPI being flooded with thousands of packages in a short span of time. Typically, such surges in publishing activity are related to malware, dependency confusion proofs of concept (PoCs), or just annoying SEO spam leveraging these registries.
It's not every day, though, that we see a flood of virtually benign packages that do nothing dangerous. So why the flood?
Data scientist Cody Nash, part of Sonatype's release integrity team that powers our automated malware detection systems, noticed a spike this month in newly published npm packages that all appeared to be related to a single user.
Through several npm accounts, a dev who goes by the name One Dionys published upwards of 13,995 packages on the platform.
Each of these packages depends on several other packages published by the same person. The packages also contain minimal code cloned from legitimate open source packages, making them, in a way, capable of delivering minimal functionality for specific tasks. But their overall purpose remained unclear.
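One way to quantify this kind of interlinking is to measure how many of a package's dependencies point back into the same batch of packages. The following is a minimal sketch, not the method our detection systems use; the function name and the sample manifests are hypothetical, and it assumes each manifest is a parsed package.json with a standard "dependencies" object.

```python
from typing import Dict, Set


def internal_dependency_ratio(manifests: Dict[str, dict]) -> Dict[str, float]:
    """Return, per package, the share of its dependencies that are
    themselves members of the same batch of packages."""
    batch: Set[str] = set(manifests)
    ratios: Dict[str, float] = {}
    for name, manifest in manifests.items():
        deps = set(manifest.get("dependencies", {}))
        # A package with no dependencies gets a ratio of 0.0.
        ratios[name] = len(deps & batch) / len(deps) if deps else 0.0
    return ratios


# Hypothetical manifests, reduced to the one field used here.
sample = {
    "pkg-a": {"dependencies": {"pkg-b": "^1.0.0", "pkg-c": "^1.0.0"}},
    "pkg-b": {"dependencies": {"pkg-a": "^1.0.0", "lodash": "^4.17.21"}},
    "pkg-c": {"dependencies": {}},
}
ratios = internal_dependency_ratio(sample)
```

A batch in which most packages score close to 1.0 is depending largely on itself, which matches the pattern described above. A high ratio alone does not indicate malicious intent.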
At first, we wondered if this was an 'everything'-style attack, in which an npm package called 'everything' listed literally every single package on npmjs.com as its dependency. Since npm restricts unpublishing a package that other packages depend on, that made it virtually impossible for all other developers to delete their own packages. But that wasn't it.
An absence of any outright malicious or suspicious code in these packages also made it difficult to understand what the dev's motives were.
Our sharp-eyed security researcher Daniel Aguirre took note of harmless 'tea.yaml' files included in each of these packages that make their purpose clear.
YAML (a recursive acronym for "YAML Ain't Markup Language," originally "Yet Another Markup Language") is a human-readable data serialization language commonly used for configuration files and in applications where data is stored or transmitted. As such, it is quite easy to miss simple configuration text files like these when analyzing development components with plenty of code and executable files for suspicious signs.
The 'tea.yaml' included in Dionys' packages contains hexadecimal strings that resemble some sort of blockchain addresses or identifiers.
The link in the first line of the file leads to an FAQ that explains what the file is for: https://tea.xyz/what-is-this-file
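Because these files are easy to overlook, it can help to search an unpacked package tree for them explicitly. The sketch below walks a directory, finds every file named tea.yaml, and pulls out strings that look like hexadecimal identifiers. The function names are hypothetical, and the pattern (a "0x" prefix followed by at least eight hex digits) is our assumption about what such identifiers look like, not a definition taken from the tea protocol.

```python
import re
from pathlib import Path
from typing import Dict, List

# Assumption: identifiers are written as "0x" plus at least 8 hex digits.
HEX_ID = re.compile(r"0x[0-9a-fA-F]{8,}")


def extract_hex_ids(text: str) -> List[str]:
    """Return every substring of text that matches the assumed pattern."""
    return HEX_ID.findall(text)


def find_tea_files(root: str) -> Dict[str, List[str]]:
    """Map each tea.yaml found under root to the identifiers it contains."""
    results: Dict[str, List[str]] = {}
    for path in Path(root).rglob("tea.yaml"):
        content = path.read_text(encoding="utf-8", errors="replace")
        results[str(path)] = extract_hex_ids(content)
    return results


# Hypothetical line, not copied from any real tea.yaml.
sample_ids = extract_hex_ids("owner: '0x0123456789abcdef0123'")
```

Calling `find_tea_files("node_modules")` on an installed project would list which dependencies ship such a file. The same identifier appearing across many unrelated-looking packages suggests a single publisher behind them.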
These YAML files are associated with the decentralized 'tea' protocol, which is designed to reward open source software developers with tea "tokens" depending on the popularity and usage of their packages.
According to its website, the tea protocol is built on Base, the layer-2 blockchain launched by the cryptocurrency exchange platform Coinbase. It touts itself as a way for OSS developers to "build reputation for securing the software supply chain."
tea, which says it's "shaking up" the digital world by "addressing the long-standing issue of inadequate compensation" for OSS devs, recently launched a $250,000 grant to reward developers' hard work.
"Anyone can support any open-source project that’s integrated with the tea Protocol by staking TEA tokens to the software project. Staking tokens to a project enables you to earn rewards while also contributing to the security of the software supply chain."
We reached out to One Dionys to get their take on their hyperactive npm publishing activity.
The developer initially stated that their intention behind putting out these packages was to try "to create a package that might be useful for the open-source community. And I am struggling to popularize the package."
"I was wondering whether if I have a lot of dependencies, the package will rank number one in search engines or not," stated the developer, while adding that they have not had much luck.
"Then I found a useful package/project to copy and publish packages directly. I tried it, and it worked. After trying it for a few days, it turned out that the package I made did not match my expectations, which was not in the rankings or search engines. Finally, yesterday I stopped doing this."
They assured us that neither their intention nor the purpose of these packages was in any way to harm the open source ecosystem.
Sonatype understands that the developer was using a translator when responding to our emails, and we were told that some inconsistencies in communication may occur as a result.
We then asked the developer about the tea protocol, and whether they had been able to earn rewards in this manner, i.e. by attempting to artificially inflate their reputation on the tea platform. By their own admission, One Dionys did not earn any during this experiment.
The dev further told us:
"Regarding the tea protocol, this is just from my knowledge, you can research more on the tea website. As you can see on the website, tea is a platform to appreciate OSS developers. In the tea system, you can register a project that you have created for the open-source community and will get points if you maintain the project.
Because, sometimes OSS developers often create projects but after a few years they abandon the project, either because they get a job, are lazy to do maintenance, etc. And in the tea system, there will also be several tasks such as registering open-source projects, staking to our favorite projects such as dotenv, etc. When we complete the task we will get points, for what the points will be used for I still don't know for sure.
Other developers, yes you will find some tea.yaml files on their projects on github, because many OSS developers have registered their projects in the tea protocol."
This month, GitHub also took down several packages from another developer in the `@lbnqduy11805/` namespace. Rather than malware, these packages contained 'tea.yaml' files as well, exhibiting much the same pattern.
According to our Advanced Binary Fingerprinting (ABF) technology, there are upwards of 1,000 distinct packages published by this separate developer, some of which are listed below:
@lbnqduy11805/animated-doodle
@lbnqduy11805/bookish-sniffle
@lbnqduy11805/cautious-octo-rotary-phone
@lbnqduy11805/congenial-dollop
@lbnqduy11805/congenial-octo-guide
@lbnqduy11805/expert-waddle
@lbnqduy11805/friendly-doodle
@lbnqduy11805/ideal-octo-spork
@lbnqduy11805/legendary-octo-carnival
@lbnqduy11805/miniature-garbanzo
@lbnqduy11805/miniature-train
@lbnqduy11805/musical-doodle
@lbnqduy11805/potential-giggle
@lbnqduy11805/potential-octo-enigma
@lbnqduy11805/psychic-journey
@lbnqduy11805/psychic-waffle
@lbnqduy11805/redesigned-journey
@lbnqduy11805/refactored-eureka
@lbnqduy11805/refactored-octo-carnival
@lbnqduy11805/reimagined-happiness
@lbnqduy11805/shiny-rotary-phone
@lbnqduy11805/silver-parakeet
@lbnqduy11805/special-funicular
@lbnqduy11805/special-palm-tree
@lbnqduy11805/studious-memory
@lbnqduy11805/studious-octo-waddle
@lbnqduy11805/stunning-fishstick
@lbnqduy11805/stunning-octo-parakeet
@lbnqduy11805/sturdy-chainsaw
@lbnqduy11805/sturdy-waddle
@lbnqduy11805/super-duper-train
@lbnqduy11805/turbo-octo-memory
@lbnqduy11805/urban-octo-adventure
@lbnqduy11805/verbose-pancake
Sonatype security researcher Aguirre says this "seems like something that is spreading," referring to the growing trend, akin to the dependency confusion saga.
He noticed some of these copycat projects, named in a predictable pattern, served no purpose other than boosting the publishing developer's tea reputation.
"I was able to see that the names of the copies created by that user use this pattern (project-copy-name)-(project-another-user)," says Aguirre.
"At first I thought that was something random, but checking on some projects listed on that pattern they were using the tea protocol too and were weird too since those projects were forks or new projects of nonsensical or empty things."
While none of these packages appears to be harmful, and there's a very low chance of anyone willfully downloading them, they could still be considered Potentially Unwanted Applications and Packages (PUAs/PUPs), which is something you don't want lurking in your build.
Users of Sonatype Repository Firewall should therefore, in most cases, start noticing such packages getting blocked from entering their builds. At our discretion, we may periodically review and expand our blocklists with other similar packages on a case-by-case basis as our investigation into this growing trend progresses.