As part of our software package supply chain security efforts, we continuously scan for malware in newly released PyPI and npm packages.
In this post, we describe a particularly interesting cluster of malicious packages that we've identified.
In late 2022, we released GuardDog, a CLI-based tool that uses Semgrep and package metadata heuristics to identify malicious software packages based on common patterns.
A few months later, we started instrumenting GuardDog at scale to continuously scan the Python Package Index.
We've identified and manually triaged close to 1,500 malicious packages that we regularly publish as part of an open source dataset, which is one of the largest labeled datasets of malicious packages made publicly available.
Empty information: The package had an empty description, which is unusual for legitimate packages.
Single Python file: The package consisted of a single Python file, which is also slightly suspicious.
Command overwrite: The package was overwriting the install command, triggering code that gets automatically executed when someone pip installs it.
Code execution: The package was executing OS commands.
Although each of these rules individually only gave us a clue as to whether the package was malicious, these four pieces of information put together gave us a strong sense that we were looking at a malicious package.
After diving deeper into the packages, we confirmed that they contained malicious code.
The initial package that prompted our analysis was named reallydonothing and was published to PyPI on May 9, 2024.
As we'll see in the Detailed analysis section, this piece of malware targets specific systems and infects the victim's machine only if a specific, secret file is identified on the local file system.
These packages don't attempt to mimic or implement legitimate functionality.
The malicious code then searches for a secret file whose path, when hashed, matches a predetermined hardcoded value.
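From an analyst's point of view, this kind of check can be reproduced to test candidate paths against a hardcoded value extracted from a sample. The sketch below assumes SHA-256 and a salt prepended to the UTF-8 path; both are assumptions for illustration, as are the function names, and a real sample may use a different hash function or salting scheme.

```python
import hashlib
from typing import Iterable, Optional

def salted_path_digest(path: str, salt: bytes) -> str:
    """Hash a file path with a salt (assumed scheme: SHA-256 over salt + path)."""
    return hashlib.sha256(salt + path.encode("utf-8")).hexdigest()

def find_matching_path(candidates: Iterable[str], salt: bytes,
                       target_hex: str) -> Optional[str]:
    """Return the first candidate path whose salted digest equals target_hex."""
    target = target_hex.lower()
    for path in candidates:
        if salted_path_digest(path, salt) == target:
            return path
    return None
```

Because only the digest is embedded in the package, the targeted path cannot be read directly from the code; it can only be recovered by guessing candidate paths and hashing each one as above.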
The packages we've identified and analyzed look for different file patterns, use different hardcoded salt and binary values, and drop binaries in different locations.
It's likely that these packages are part of a broader campaign targeting a specific set of machines, based on either a specific configuration or markers left from a previous infection.
The malicious packages analyzed in this post were identified by GuardDog, an open source project that you can run on your own dependencies or on arbitrary PyPI and npm packages.
We'll update this post if we identify new malicious packages that exhibit similar behavior.
May 23, 2024: Added a reference to a newly published malicious package, released on May 23 after the initial publication of this post.
This Cyber News was published on securitylabs.datadoghq.com. Publication date: Mon, 27 May 2024 12:43:10 +0000