Dozens of malicious PyPI packages discovered targeting developers

This page summarizes the projects mentioned and recommended in the original post on news.ycombinator.com

  • packj

    Packj stops :zap: Solarwinds-, ESLint-, and PyTorch-like attacks by flagging malicious/vulnerable open-source dependencies ("weak links") in your software supply-chain

  • This is exactly what Packj [1] scans packages for (30+ such risky attributes). Many packages use base64 for benign reasons, which is why no fully automated tool can be 100% accurate.

    Manual auditing is impractical, but Packj can quickly point out if a package accesses sensitive files (e.g., SSH keys), spawns a shell, exfiltrates data, is abandoned, lacks 2FA, etc. Alerts can be commented out if they don't apply.

    1. https://github.com/ossillate-inc/packj

    Disclaimer: I developed this.
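
The kind of static check described in the comment above can be sketched roughly as follows. This is a minimal illustration, not Packj's actual implementation: the rule tables (`RISKY_CALLS`, `RISKY_BUILTINS`, `SENSITIVE_PATH_HINTS`) and the `scan_source` function are hypothetical names chosen for this example. It walks a Python syntax tree and flags a few risky attributes.

```python
import ast

# Hypothetical rule tables for illustration only; not Packj's real rule set.
RISKY_CALLS = {
    ("base64", "b64decode"): "decodes base64 data",
    ("subprocess", "Popen"): "spawns a process",
    ("subprocess", "run"): "spawns a process",
    ("os", "system"): "spawns a shell",
}
RISKY_BUILTINS = {
    "exec": "executes dynamic code",
    "eval": "evaluates dynamic code",
}
SENSITIVE_PATH_HINTS = (".ssh", "id_rsa", ".aws/credentials")


def scan_source(source: str) -> list[tuple[int, str]]:
    """Return sorted (line number, reason) alerts for risky patterns."""
    alerts = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Call):
            func = node.func
            # Calls written as module.function(...)
            if isinstance(func, ast.Attribute) and isinstance(func.value, ast.Name):
                reason = RISKY_CALLS.get((func.value.id, func.attr))
                if reason:
                    alerts.append((node.lineno, reason))
            # Bare builtins such as exec(...)
            elif isinstance(func, ast.Name) and func.id in RISKY_BUILTINS:
                alerts.append((node.lineno, RISKY_BUILTINS[func.id]))
        elif isinstance(node, ast.Constant) and isinstance(node.value, str):
            # String literals that mention credential locations
            if any(hint in node.value for hint in SENSITIVE_PATH_HINTS):
                alerts.append((node.lineno, "references a sensitive path"))
    return sorted(alerts)
```

A benign snippet such as `data = base64.b64decode('aGk=')` still produces a "decodes base64 data" alert here, which is the false-positive problem the comment mentions: a rule like this can only point a reviewer at code worth reading, not decide on its own that a package is malicious.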

  • cli

    Command line interface for the Phylum API (by phylum-dev)

  • This is one of the projects we're working on (and open sourcing)!

    Currently allows you to specify allowed resources during the package installation in a way very similar to what you've outlined [1].

    The sandbox itself lives here [2] and can be integrated into other projects.

    1. https://github.com/phylum-dev/cli/blob/main/extensions/npm/P...

    2. https://github.com/phylum-dev/birdcage

  • birdcage

    Cross-platform embeddable sandboxing

  • W4SP-Stealer

    Discontinued w4sp Stealer official source code, one of the best python stealer on the web

  • Yep. You can read the source code for it here: https://github.com/loTus04/W4SP-Stealer

  • warehouse

    The Python Package Index

  • We tried doing this on PyPI a couple of years ago, and it produced a large number of false positives (too many to manually review).

    You can see the rules we tried here[1].

    [1]: https://github.com/pypi/warehouse/blob/main/warehouse/malwar...

  • Contents

    Community documentation, code, links to third-party resources, ... See the issues and pull requests for pending content. Contributions are welcome !

  • crev

    Socially scalable Code REView and recommendation system that we desperately need. See http://github.com/crev-dev/cargo-crev for real implementation.

  • I don't think it makes much sense to verify pypi authors. I mean you could verify corporations and universities and that would get you far, but most of the packages you use are maintained by random people who signed up with a random email address.

    I think it makes more sense to verify individual releases. There are tools in that space like crev [1], vouch [2], and cargo-vet [3] that facilitate this, allowing you to trust your colleagues or specific people rather than the package authors. This seems like a much more viable solution to scale trust.

    [1]: https://github.com/crev-dev/crev
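
The idea of trusting reviewed releases rather than package authors can be sketched as below. This is an assumption-laden illustration and does not reflect the actual data formats of crev, vouch, or cargo-vet: the `Review` record, `artifact_digest`, and `is_release_trusted` are hypothetical names. A release is accepted only when enough reviewers you already trust have vouched for the exact artifact, identified by its SHA-256 digest.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class Review:
    """Hypothetical review record; real tools define their own formats."""
    reviewer: str
    package: str
    version: str
    sha256: str  # digest of the exact artifact that was reviewed


def artifact_digest(data: bytes) -> str:
    """Hex SHA-256 digest of a release artifact's bytes."""
    return hashlib.sha256(data).hexdigest()


def is_release_trusted(package, version, artifact, reviews,
                       trusted_reviewers, min_reviews=1) -> bool:
    """True if enough distinct trusted reviewers vouched for this artifact."""
    digest = artifact_digest(artifact)
    vouching = {
        r.reviewer
        for r in reviews
        if r.reviewer in trusted_reviewers
        and r.package == package
        and r.version == version
        and r.sha256 == digest
    }
    return len(vouching) >= min_reviews
```

Binding each review to a digest means a later upload under the same version number, or a tampered mirror copy, does not inherit the trust that was given to the reviewed artifact.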

  • vouch

    A multi-ecosystem package code review system. (by vouch-dev)

  • cargo-vet

    supply-chain security for Rust

  • secimport

    eBPF Python runtime sandbox with seccomp (Blocks RCE).

  • There is also this, although I haven't tested it yet. The approach is interesting though. https://github.com/avilum/secimport

  • autobox

    A set of tools and libraries for automatically generating and initiating sandboxes for Rust programs

  • Once I'm done with (2) though I think I'll tackle (3).

    `autobox` is fun but I think it may be impractical without more language level support and no matter what I'd end up having to implement it in the compiler at some point, which means it would be unusable without nightly or a fork.

    I'm going to try to wrap up an autobox POC that handles branching and loops, publish it, and see if someone who does more compilery things is willing to pick it up. As for (2) and (3) I believe I can build practical implementations for both.

    [0] https://github.com/insanitybit/autobox/

  • LavaMoat

    tools for sandboxing your dependency graph

  • You are basically talking about LavaMoat. It provides tooling and policies for SES, which aims to make it into standards.

    https://github.com/LavaMoat/LavaMoat

  • security-wg

    Node.js Ecosystem Security Working Group

  • Node.js is building something very similar: Permission Model https://github.com/nodejs/security-wg/issues/791

  • lunasec

    LunaSec - Dependency Security Scanner that automatically notifies you about vulnerabilities like Log4Shell or node-ipc in your Pull Requests and Builds. Protect yourself in 30 seconds with the LunaTrace GitHub App: https://github.com/marketplace/lunatrace-by-lunasec/

  • It is possible to set your registry in NPM via the ".npmrc" file. That will make commands like "npm install" hit the HTTP server you specify.

    I know this is also possible for Python because we did it at Uber. I don't remember the specific details anymore though.

    In either case though, a lot of people have written proxies for this use case (I helped write one for NPM at Uber). Companies like Bytesafe and Artifactory also exist in this space.

    We're working on something similar that's on GitHub here: https://github.com/lunasec-io/lunasec

    Proxy support isn't built out yet but the data is all there already.
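
For the Python side that the comment above could not recall in detail, pip accepts an `--index-url` option (also settable through the `PIP_INDEX_URL` environment variable or an `index-url` entry in pip's configuration file) that points installs at a different package index. The sketch below only builds the command line; the helper name `pip_install_command` and the proxy URL in the usage note are hypothetical.

```python
import sys


def pip_install_command(package: str, index_url: str) -> list[str]:
    """Build a pip invocation that resolves packages from index_url
    instead of the default index."""
    return [
        sys.executable, "-m", "pip", "install",
        "--index-url", index_url,
        package,
    ]
```

Passing the result to `subprocess.run(cmd, check=True)` with, say, `https://pypi.internal.example/simple` as the index URL would route the install through an internal proxy, assuming that proxy serves the standard simple-index layout.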

  • cosmopolitan

    build-once run-anywhere c library

  • Indeed! If your dependencies are able to be command line programs that are shell scripted together, then you can in fact have an access policy on a per-dependency basis, using my pledge.com tool. So shell scripters of the world rejoice.

    But it gets better. If you build python.com in the Cosmopolitan Libc repository:

        git clone https://github.com/jart/cosmopolitan

  • wasmer

    🚀 The leading Wasm Runtime supporting WASIX, WASI and Emscripten

  • That's the main reason we should start using WebAssembly for distributing and using packages.

    Shameless plug: Wasmer [1] and WAPM [2] could help a lot on this quest!

    [1]: https://wasmer.io/

    [2]: https://wapm.io/

  • wapm-cli

    Discontinued 📦 WebAssembly Package Manager (CLI)

  • Code-Server

    VS Code in the browser

  • Darn. Maybe the solution is to use a VS Code client in the browser, like vscode.dev or https://github.com/coder/code-server? It limits which keyboard shortcuts and extensions are available, but at least it runs in a secure sandbox on the client side.

  • conductor

    Discontinued Conductor is a microservices orchestration engine.

  • Yeah, that's quite interesting reading from them, some sort of specialized appliance really.

    I, like the average Joe around me, am very far from Netflix's task of packing bytes from disk to network. A simple 2 vCPU VPS serving 4 Gbit/s without saturating system resources is quite often more than enough. Extra note: it's not even using kTLS.

    Moreover, even for Netflix, given that they know FreeBSD inside and out, do you know whether they use FreeBSD as a base OS beyond the distribution level, for running applications/services in particular?

    I've quickly checked their repos, like https://github.com/Netflix/conductor, and it looks like they use containers/Docker, which doesn't work on FreeBSD, so I very much doubt it's the OS of choice for them.
