Dozens of malicious PyPI packages discovered targeting developers

This page summarizes the projects mentioned and recommended in the original post on news.ycombinator.com

  • packj

    The vetting tool 🚀 behind our "dependency firewall" to block malicious/risky open-source packages in your software supply chain

    This is exactly what Packj [1] scans packages for (30+ such risky attributes). Many packages use base64 for benign reasons, which is why no fully automated tool can be 100% accurate.

    Manual auditing is impractical, but Packj can quickly point out if a package accesses sensitive files (e.g., SSH keys), spawns a shell, exfiltrates data, is abandoned, lacks 2FA, etc. Alerts can be commented out if they don't apply.

    1. https://github.com/ossillate-inc/packj

    Disclaimer: I developed this.
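
To make the idea of scanning for "risky attributes" concrete, here is a minimal sketch using Python's standard `ast` module. It is not Packj's implementation: the `RISKY_CALLS` table and the `scan_source` helper are hypothetical names invented for this illustration, and the rule set is deliberately tiny.

```python
import ast

# Hypothetical rule set for illustration only; not Packj's actual rules.
RISKY_CALLS = {
    "exec": "dynamic code execution",
    "eval": "dynamic code execution",
    "base64.b64decode": "base64 decoding (often benign)",
    "subprocess.Popen": "spawns a process",
    "subprocess.run": "spawns a process",
    "os.system": "spawns a shell",
}


def dotted_name(node):
    """Return 'a.b.c' for a Name/Attribute chain, or None for anything else."""
    parts = []
    while isinstance(node, ast.Attribute):
        parts.append(node.attr)
        node = node.value
    if isinstance(node, ast.Name):
        parts.append(node.id)
        return ".".join(reversed(parts))
    return None


def scan_source(source):
    """Return sorted (line, call, reason) tuples for risky calls in source."""
    findings = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Call):
            name = dotted_name(node.func)
            if name in RISKY_CALLS:
                findings.append((node.lineno, name, RISKY_CALLS[name]))
    return sorted(findings)


# The sample is only parsed, never executed.
sample = (
    "import base64, os\n"
    "payload = base64.b64decode('cHJpbnQoMSk=')\n"
    "exec(payload)\n"
    "os.system('id')\n"
)
for line, call, reason in scan_source(sample):
    print(f"line {line}: {call} -> {reason}")
```

A matcher this simple misses aliased imports (for example `from base64 import b64decode`) and flags every benign use of base64, which illustrates why such alerts need human review.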

  • cli

    Command line interface for the Phylum API (by phylum-dev)

    This is one of the projects we're working on (and open sourcing)!

    Currently allows you to specify allowed resources during the package installation in a way very similar to what you've outlined [1].

    The sandbox itself lives here [2] and can be integrated into other projects.

    1. https://github.com/phylum-dev/cli/blob/main/extensions/npm/P...

    2. https://github.com/phylum-dev/birdcage

  • birdcage

    Cross-platform embeddable sandboxing

  • W4SP-Stealer

    w4sp Stealer official source code, one of the best python stealer on the web (access to the repository is now blocked on GitHub)

    Yep. You can read the source code for it here: https://github.com/loTus04/W4SP-Stealer

  • warehouse

    The Python Package Index

    We tried doing this on PyPI a couple of years ago, and it produced a large number of false positives (too many to manually review).

    You can see the rules we tried here[1].

    [1]: https://github.com/pypi/warehouse/blob/main/warehouse/malwar...

  • Contents

    Community documentation, code, links to third-party resources, ... See the issues and pull requests for pending content. Contributions are welcome!

  • crev

    Socially scalable Code REView and recommendation system that we desperately need. See http://github.com/crev-dev/cargo-crev for real implementation.

    I don't think it makes much sense to verify PyPI authors. I mean, you could verify corporations and universities and that would get you far, but most of the packages you use are maintained by random people who signed up with a random email address.

    I think it makes more sense to verify individual releases. There are tools in that space like crev [1], vouch [2], and cargo-vet [3] that facilitate this, allowing you to trust your colleagues or specific people rather than the package authors. This seems like a much more viable solution to scale trust.

    [1]: https://github.com/crev-dev/crev
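
As a rough illustration of trusting individual releases rather than authors, the sketch below checks a downloaded artifact against a table of digests that someone you trust has reviewed. This is not how crev, vouch, or cargo-vet store or exchange reviews; the `trusted_reviews` mapping, the `is_reviewed_release` helper, and the package name are all hypothetical.

```python
import hashlib
import hmac


def sha256_hex(data: bytes) -> str:
    """Hex-encoded SHA-256 digest of the artifact bytes."""
    return hashlib.sha256(data).hexdigest()


def is_reviewed_release(name, version, artifact, trusted_reviews):
    """True only if this exact artifact matches a digest a trusted reviewer recorded.

    trusted_reviews is a hypothetical mapping of (name, version) -> SHA-256 hex digest.
    """
    expected = trusted_reviews.get((name, version))
    if expected is None:
        return False  # nobody we trust has reviewed this release
    return hmac.compare_digest(sha256_hex(artifact), expected)


artifact = b"example release bytes"
reviews = {("example-pkg", "1.0.0"): sha256_hex(artifact)}

print(is_reviewed_release("example-pkg", "1.0.0", artifact, reviews))         # True
print(is_reviewed_release("example-pkg", "1.0.0", artifact + b"!", reviews))  # False
print(is_reviewed_release("example-pkg", "1.0.1", artifact, reviews))         # False
```

The point of keying on the artifact digest is that trust attaches to the bytes that were actually reviewed, so a later upload under the same author account gains nothing from earlier reviews.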

  • vouch

    A multi-ecosystem package code review system. (by vouch-dev)

  • cargo-vet

    supply-chain security for Rust

  • secimport

    The only sandbox that manages privileges per-module in your code using eBPF and DTrace.

    There is also this, although I haven't tested it yet. The approach is interesting though. https://github.com/avilum/secimport

  • autobox

    A set of tools and libraries for automatically generating and initiating sandboxes for Rust programs

    Once I'm done with (2) though I think I'll tackle (3).

    `autobox` is fun, but I think it may be impractical without more language-level support. No matter what, I'd end up having to implement it in the compiler at some point, which means it would be unusable without nightly or a fork.

    I'm going to try to wrap up an autobox POC that handles branching and loops, publish it, and see if someone who does more compilery things is willing to pick it up. As for (2) and (3) I believe I can build practical implementations for both.

    [0] https://github.com/insanitybit/autobox/

  • LavaMoat

    tools for sandboxing your dependency graph

    You are basically talking about LavaMoat. It provides tooling and policies for SES, which aims to make it into standards.

    https://github.com/LavaMoat/LavaMoat

  • security-wg

    Node.js Ecosystem Security Working Group

    Node.js is building something very similar: Permission Model https://github.com/nodejs/security-wg/issues/791

  • lunasec

    LunaSec - Dependency Security Scanner that automatically notifies you about vulnerabilities like Log4Shell or node-ipc in your Pull Requests and Builds. Protect yourself in 30 seconds with the LunaTrace GitHub App: https://github.com/marketplace/lunatrace-by-lunasec/

    It is possible to set your registry in npm via the ".npmrc" file. npm will then hit the specified HTTP server whenever you run commands like "npm install".

    I know this is also possible for Python because we did it at Uber. I don't remember the specific details anymore though.

    In either case though, a lot of people have written proxies for this use case (I helped write one for NPM at Uber). Companies like Bytesafe and Artifactory also exist in this space.

    We're working on something similar that's on GitHub here: https://github.com/lunasec-io/lunasec

    Proxy support isn't built out yet but the data is all there already.
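
For the Python side, one way to route installs through a vetting proxy is pip's `--index-url` option (pip also reads the `PIP_INDEX_URL` environment variable). The sketch below only builds the command; the proxy URL and the `pip_install_command` helper are assumptions made up for this example, not part of LunaSec or any proxy product.

```python
import sys
from urllib.parse import urlsplit


def pip_install_command(package, index_url):
    """Build a pip invocation that resolves packages from index_url only."""
    if urlsplit(index_url).scheme != "https":
        raise ValueError("index URL should use https")
    return [sys.executable, "-m", "pip", "install", "--index-url", index_url, package]


# Hypothetical internal proxy URL, used purely for illustration.
cmd = pip_install_command("requests", "https://pypi.internal.example/simple")
print(" ".join(cmd))
```

Passing the resulting list to `subprocess.run(cmd, check=True)` would perform the install; it is left out here so that the example has no side effects.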

  • cosmopolitan

    build-once run-anywhere c library

    Indeed! If your dependencies are able to be command line programs that are shell scripted together, then you can in fact have an access policy on a per-dependency basis, using my pledge.com tool. So shell scripters of the world rejoice.

    But it gets better. If you build python.com in the Cosmopolitan Libc repository:

        git clone https://github.com/jart/cosmopolitan

  • wasmer

    🚀 The leading WebAssembly Runtime supporting WASI and Emscripten

    That's the main reason we should start using WebAssembly for distributing and using packages.

    Shameless plug: Wasmer [1] and WAPM [2] could help a lot on this quest!

    [1]: https://wasmer.io/

    [2]: https://wapm.io/

  • wapm-cli

    📦 WebAssembly Package Manager (CLI)

  • Code-Server

    VS Code in the browser

    Darn. Maybe the solution is to use a VS Code client in the browser, like vscode.dev or https://github.com/coder/code-server? It limits which keyboard shortcuts and extensions are available, but at least it's in a secure sandbox on the client side.

  • conductor

    Conductor is a microservices orchestration engine.

    Yeah, that's quite interesting reading from them, some sort of specialized appliance really.

    The average Joe around me and I are far from Netflix's task of packing bytes from disk to network. A simple 2 vCPU VPS serving 4 Gbit without saturating system resources is quite often more than enough. Extra note: it's not even using kTLS.

    Moreover, even for Netflix, given that they know FreeBSD inside and out, do you think (or have any info) that they use FreeBSD as a base OS beyond the distribution level, for running applications and services in particular?

    I quickly checked their repos, like https://github.com/Netflix/conductor, and it smells like they use containers/Docker, which doesn't work on FreeBSD, so I very much doubt it's their OS of choice.

NOTE: The number of mentions on this list indicates mentions on common posts plus user suggested alternatives. Hence, a higher number means a more popular project.
