What's up, Python? The GIL removed, a new compiler, optparse deprecated

This page summarizes the projects mentioned and recommended in the original post on news.ycombinator.com

  • jit-benchmarks

    Benchmark for interpreted languages implementations.

  • Here ya go. On these sometimes one is faster, sometimes the other. https://github.com/kostya/jit-benchmarks/blob/master/README....

  • docopt

    This project is no longer maintained. Please see https://github.com/jazzband/docopt-ng

  • If you aren't averse to using a third party package, on my personal projects I always found https://github.com/docopt/docopt to be nice.

    You can kill 2 birds with one stone by documenting your scripts while also providing the argument structure / parsing.

  • CPython

    The Python programming language

  • > every process keeps its own copy of hundreds of gigabytes of stuff. May be okay, depending on how many processes you spawn

    That depends on how you're using multiprocessing. If you're using the "spawn" start method (which, unfortunately, became the default on macOS in Python 3.8[1]), then every process restarts Python from the beginning of your program and does indeed have its own copy of anything not explicitly shared.

    However, the "fork" start method makes everything that was loaded in Python before your multiprocessing.Pool/Process/concurrent.futures.ProcessPoolExecutor was created accessible for "free" (really: via fork(2)'s copy-on-write semantics) in the child processes, without copying it up front. ("forkserver" is different: children are forked from a separate, clean server process, so they do not inherit the parent's state.) "fork" was the default start method on everything other than macOS/Windows[2] through Python 3.13; since Python 3.14 the default there is "forkserver".

    I find that those differing defaults are responsible for a lot of FUD about multiprocessing and memory management (some of which can be found in these comments!); folks who watch memory while using multiprocessing on macOS or Windows observe massively different memory consumption than folks on Linux/BSD (which includes folks validating in Docker on macOS/Windows). There's an additional source of FUD among folks who used Python on macOS before the default was changed from "fork" to "spawn" and who assume the prior behavior still exists when it does not.

    If you're on macOS (not Windows) and wish to use the "fork" behavior of multiprocessing for memory sharing, do "export OBJC_DISABLE_INITIALIZE_FORK_SAFETY=YES" in your shell before starting Python (modifying os.environ or calling os.putenv() in Python will not work), and then call "multiprocessing.set_start_method("fork", force=True)" in your entry point. Per the linked GitHub issue below, this can occasionally cause issues, but in my experience it does so rarely if ever.

    1. https://github.com/python/cpython/issues/77906

    2. https://docs.python.org/3/library/multiprocessing.html#conte...
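The comment's point about "fork" can be illustrated with a minimal sketch. It assumes a platform where the "fork" start method is available (Linux, BSD, or macOS with the environment variable mentioned above), and the names BIG_TABLE, _child, and read_in_forked_child are hypothetical, chosen only for this example. It uses multiprocessing.get_context("fork") instead of the global set_start_method() call from the comment, so the choice applies only to the objects created from that context.

```python
import multiprocessing as mp

# Hypothetical stand-in for a large, expensive-to-load dataset.
BIG_TABLE = None


def _child(conn, indices):
    # Runs in the forked child: BIG_TABLE is the parent's object as it
    # existed at fork time; nothing was pickled or reloaded to get it here.
    conn.send([BIG_TABLE[i] for i in indices])
    conn.close()


def read_in_forked_child(indices):
    global BIG_TABLE
    # Populated at run time, before the child process is created.
    BIG_TABLE = [n * n for n in range(1_000)]

    # Raises ValueError on platforms without fork (for example Windows).
    ctx = mp.get_context("fork")
    recv_end, send_end = ctx.Pipe(duplex=False)
    proc = ctx.Process(target=_child, args=(send_end, indices))
    proc.start()
    send_end.close()  # the parent only reads from the pipe
    result = recv_end.recv()
    proc.join()
    return result


if __name__ == "__main__":
    print(read_in_forked_child([2, 3, 4]))  # prints [4, 9, 16]
```

Under "spawn" the same child would re-import the module and find BIG_TABLE still set to None, because the assignment happens at run time in the parent. Note that copy-on-write sharing is not entirely free in CPython: reference-count updates write to the pages holding the objects, so some of them are copied even when the child only reads.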
