Python package index found crammed with AWS keys and malware • The Register

The Python Package Index, or PyPI, continues to surprise, and not in a good way.

Ideally a source of Python libraries that developers can include in their projects to save time, PyPI was again discovered hosting packages containing Amazon Web Services (AWS) keys and data-stealing malware.

Malicious packages, unfortunately, are nothing new to PyPI or to packaging systems like npm, RubyGems, crates.io, and the like. Supply chain attacks, whether through compromised software libraries or typosquatting, have been a problem for years, although they’ve gotten more attention recently with incidents like the SolarWinds compromise.

Despite the increased vigilance, these incidents still occur at an alarming rate. Just before the new year, the maintainers of the machine learning framework PyTorch warned that PyTorch-nightly, if installed on Linux via pip, included a compromised dependency available through PyPI called torchtriton.

Less than a week later, security firm Phylum said that in December it had identified a remote access trojan in a PyPI package called pyrologin. Another security company, ReversingLabs, discovered a malicious PyPI package that month: malware disguised as an SDK from the security company SentinelOne. And in November, dozens of newly published PyPI packages were found to include W4SP malware.

PyPI had a mass culling of malware in March 2021 that resulted in the removal of 3,653 malicious packages. But the weeds are back, not to mention the security issues that automated analysis identified a few months later in about half of PyPI’s libraries.

Aside from hacked libraries and half-decent code, what has PyPI ever done for us? Most recently, it has offered up keys that provide access to AWS computing resources and data used by Amazon, Intel, several US universities, the Australian government, US energy company Fusion Atomics, and Malaysia-based Top Glove, the world’s largest glove maker, among others.

A Brit finds the keys, again

UK-based software developer Tom Forbes on Friday published a blog post showing how he found 57 active AWS API access keys from the organizations listed above.

Forbes built a Rust tool that automatically checks all new packages released on PyPI for AWS API keys. It works well.

Forbes explains in his post that his scanner runs periodically using GitHub Actions and looks for AWS keys in new releases on PyPI, HexPM, and RubyGems. If it finds anything, it generates a report with the relevant details and commits it to the aws-credit-scanner repo.

“This report contains the keys that were found, as well as a public link to the keys and other metadata about the release,” Forbes said in his post. “Because these keys are committed to a public GitHub repository, GitHub’s secret scanning service kicks in and notifies AWS that the keys are leaked.”

As a result, AWS opens a support ticket to notify the offending developer and implements a quarantine policy to reduce the potential for key misuse.
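The scanning step can be sketched in a few lines of Python. This is an illustration only, not Forbes’s Rust tool: the regular expression is a common heuristic for AWS access key IDs (20 characters, starting with AKIA or ASIA), and the function names `find_key_ids` and `scan_wheel` are made up for this example. The test key is the placeholder that AWS uses in its own documentation.

```python
import io
import re
import zipfile

# Heuristic only: AWS access key IDs are 20 characters long and commonly
# start with AKIA (long-term) or ASIA (temporary). This pattern is an
# assumption, not the one used by Forbes's scanner or by GitHub.
ACCESS_KEY_ID = re.compile(
    r"(?<![A-Z0-9])(?:AKIA|ASIA)[A-Z0-9]{16}(?![A-Z0-9])"
)


def find_key_ids(text):
    """Return the distinct candidate access key IDs found in text."""
    return sorted(set(ACCESS_KEY_ID.findall(text)))


def scan_wheel(data):
    """Scan every file in a wheel (a zip archive) supplied as bytes.

    Returns a dict mapping archive member names to candidate key IDs.
    """
    hits = {}
    with zipfile.ZipFile(io.BytesIO(data)) as archive:
        for name in archive.namelist():
            # Binary files are decoded leniently; undecodable bytes are dropped.
            text = archive.read(name).decode("utf-8", errors="ignore")
            found = find_key_ids(text)
            if found:
                hits[name] = found
    return hits


if __name__ == "__main__":
    # Build a tiny fake wheel in memory and scan it.
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as wheel:
        wheel.writestr("pkg/settings.py", 'KEY = "AKIAIOSFODNN7EXAMPLE"\n')
    print(scan_wheel(buffer.getvalue()))
```

A match on the key ID alone does not prove a key is live. A scanner along these lines would also need the paired secret access key to confirm that a credential is active.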

The problem, of course, is that anyone less scrupulous could create a similar scan script for the purpose of exploitation and abuse. It would be amazing if it hadn’t already happened.

Forbes said in an email that AWS keys of this type can be misused.

“It depends on the exact permissions given to the key itself,” Forbes explained. “The key I found that was leaked by InfoSys [in November] had ‘full administrative access,’ which means it can do anything, and other keys I found in PyPI were ‘root keys,’ which are also allowed to do anything. An attacker with these keys would have full access to the AWS account associated with them.”

He said other keys may have limited but still excessive permissions. For example, he said it is common for a key intended to provide access to a single AWS S3 storage bucket to be erroneously provisioned with access to all S3 buckets associated with that account.
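The difference between the two cases is easy to see in the policy documents themselves. The sketch below, written as Python dictionaries in the shape of IAM policy JSON, contrasts a policy scoped to one bucket with one that covers every bucket. The bucket name `example-bucket` and the helper `allows_all_buckets` are hypothetical, and the check is deliberately simplistic: it looks only for the two wildcard resources shown and ignores conditions, `NotResource`, and partial wildcards.

```python
BUCKET = "example-bucket"  # hypothetical bucket name

# Scoped as intended: list one bucket and read its objects.
scoped_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": ["s3:ListBucket"],
            "Resource": [f"arn:aws:s3:::{BUCKET}"],
        },
        {
            "Effect": "Allow",
            "Action": ["s3:GetObject"],
            "Resource": [f"arn:aws:s3:::{BUCKET}/*"],
        },
    ],
}

# The over-provisioned variant: every S3 action on every resource.
broad_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {"Effect": "Allow", "Action": "s3:*", "Resource": "*"},
    ],
}


def as_list(value):
    """IAM policy JSON allows a single value or a list; normalize to a list."""
    return value if isinstance(value, list) else [value]


def allows_all_buckets(policy):
    """Return True if any Allow statement names a catch-all resource.

    Simplified on purpose: checks only for "*" and "arn:aws:s3:::*".
    """
    wildcards = {"*", "arn:aws:s3:::*"}
    for statement in as_list(policy["Statement"]):
        if statement.get("Effect") != "Allow":
            continue
        if wildcards & set(as_list(statement.get("Resource", []))):
            return True
    return False


if __name__ == "__main__":
    print(allows_all_buckets(scoped_policy))
    print(allows_all_buckets(broad_policy))
```

A leaked key bound to the first policy exposes one bucket. The same key bound to the second exposes all of the account’s S3 data.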

On the horns of a dilemma

Forbes pointed to GitHub’s automated key scanning, which also covers keys in npm packages, as an example of a useful defensive measure. But he said the company’s approach has limits.

“GitHub also cares a lot about supply chain security but they’ve dug a hole for themselves: the way they search for secrets involves a lot of collaboration with vendors who might reveal inside information about how keys are generated to GitHub,” he explained.

“This means that the regular expressions that GitHub uses to look up secrets cannot be published and are sensitive, which also means that third parties like PyPI are effectively unable to use this wonderful infrastructure without submitting every piece of code published to PyPI to GitHub.”

Forbes said that’s a shame because while PyPI can do more to enhance supply chain security, it’s hard to do a good job.

“GitHub has a whole team working on this while PyPI simply doesn’t have those kinds of resources,” he said. “I think there are improvements that need to be made in the Python ecosystem to help prevent keys (and tokens) from being accidentally committed and published to PyPI, and that could be a more efficient use of resources.”

A spokesperson for the Python Software Foundation did not immediately respond to a request for comment.

“I think a fair amount of blame can be placed on developers, but that sort of thing may not be part of their core competency – security is hard to get right at the best of times,” Forbes said. “AWS has some blame here too: IAM is notoriously hard to configure and debug, resulting in overly broad permissions being given to keys.”

Forbes also suggested that companies think carefully about their security policies.

“Policies might dictate that ‘nothing in S3 should be public,’ and when something is required to be public, it may be easier to make the IAM credentials public rather than trying to work through the security policies and get an exception. I’ve heard of this happening before.” ®
