Julia is the better language for extending Python

This page summarizes the projects mentioned and recommended in the original post on news.ycombinator.com

  • Python-Complementary-Languages

    Just a small test to see which language is better for extending python when using lists of lists

  • Note that the OP is a Python user, not a Julia user. The GitHub profile is a bunch of Python packages and the Julia code wasn't even optimized (https://github.com/00sapo/cython_list_test/pull/5). If this test says anything, it at least says that an inexperienced Python user could pick up Julia and do pretty well, even if the code they write isn't great.

    Even if it doesn't say that, bashing people who use Julia for a repository made by a Python user is a new level of HN trolling.

  • simplification

    Very fast Python line simplification using either the RDP or Visvalingam-Whyatt algorithm implemented in Rust

  • Rust doesn’t need to copy the data. It’s trivial to pass e.g. Numpy arrays to Rust as slices via Cython (let alone originating in Cython!), modify them, and return them, or use them as input for a new returned struct.

    https://github.com/urschrei/simplification

    https://github.com/urschrei/lonlat_bng

    https://github.com/urschrei/pypolyline

    Each of those repos has links to the corresponding Rust “shim” libraries that provide FFIs for dealing with the incoming data, constructing Rust data structures from it, and then transforming it back on the way out.

    As a more general comment, using a GC language as the FFI target from a GC language is begging for difficult-if-not-impossible-to-debug crashes down the line.
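
A minimal sketch of the zero-copy idea described above, using only Python's standard `ctypes` and `array` modules instead of Cython and NumPy. The function `scale_in_place` is a hypothetical stand-in for a foreign function that takes a pointer and a length; a real Rust library would instead be loaded through `ctypes.CDLL` and expose its own exported symbols.

```python
import ctypes
from array import array


def scale_in_place(ptr, length, factor):
    """Hypothetical stand-in for a foreign function taking (double*, size_t, double)."""
    for i in range(length):
        ptr[i] *= factor


values = array("d", [1.0, 2.0, 3.0])

# from_buffer wraps the array's existing memory; no elements are copied.
c_view = (ctypes.c_double * len(values)).from_buffer(values)
ptr = ctypes.cast(c_view, ctypes.POINTER(ctypes.c_double))

# The ctypes view and the array share one address.
same_memory = ctypes.addressof(c_view) == values.buffer_info()[0]

# The callee writes through the pointer, so the caller's array changes.
scale_in_place(ptr, len(values), 2.0)
print(list(values), same_memory)
```

Because the callee writes straight into the caller's buffer, nothing needs to be converted back on the way out.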

  • lonlat_bng

    A multithreaded Rust library with FFI for converting WGS84 longitude and latitude coordinates into BNG (OSGB36) Eastings and Northings and vice versa (using OSTN15)

  • pypolyline

    Fast Google Polyline encoding and decoding using a Rust binary

  • shared_numpy

    A simple library for creating shared memory numpy arrays

  • There are also some libraries built on top of it that might be useful https://github.com/dillonalaird/shared_numpy
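
A rough sketch of what sharing an array through shared memory looks like, assuming the "it" in the comment above refers to the standard library's `multiprocessing.shared_memory` module (the comment does not say so explicitly). This uses plain `memoryview` objects rather than NumPy or shared_numpy, and the second handle, opened in the same process, stands in for a second process attaching by name.

```python
from multiprocessing import shared_memory

COUNT = 4       # number of doubles to share
ITEM_SIZE = 8   # bytes per C double

owner = shared_memory.SharedMemory(create=True, size=COUNT * ITEM_SIZE)
try:
    # View the raw bytes as doubles and fill them in place.
    writer = owner.buf[: COUNT * ITEM_SIZE].cast("d")
    for i in range(COUNT):
        writer[i] = i * 1.5
    writer.release()

    # A peer attaches to the same segment by name and sees the same data.
    peer = shared_memory.SharedMemory(name=owner.name)
    reader = peer.buf[: COUNT * ITEM_SIZE].cast("d")
    received = list(reader)
    reader.release()
    peer.close()
finally:
    owner.close()
    owner.unlink()  # free the segment once every handle is closed

print(received)
```

The views are released explicitly because a segment cannot be closed while a `memoryview` into it is still alive.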

  • iminuit

    Jupyter-friendly Python interface for C++ MINUIT2

  • Have you tried numba+numpy? In my experience, it is much faster than JAX and can compile to CUDA. It's not caveat-free, but it also removes the hassle of labeling arrays as donated in JAX.

    You may find this interesting https://github.com/scikit-hep/iminuit/blob/develop/tutorial/...

  • rust-numpy

    PyO3-based Rust bindings of the NumPy C-API

  • Given that it's via PyO3, you could even pass the NumPy arrays using https://github.com/PyO3/rust-numpy and get ndarrays on the other side.

    Same zero-copy approach, but slightly more user-friendly.

    Further criticism of the actual approach - even if we didn't do zero copy, there's no preallocation for the vector despite the size being known upfront, and nested vectors are very slow by default.

    So you could speed up the entire thing by passing it to ndarray, and then running a single call to sum over the 2D array you'd find at the other end. (https://docs.rs/ndarray/0.15.1/ndarray/struct.ArrayBase.html...)
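
The two points above (preallocate when the size is known, and prefer one flat 2D buffer over nested containers) can be sketched in Python with the standard `array` module. This is only an illustration of the idea, not the ndarray code the comment refers to, and `flatten` is a hypothetical helper.

```python
from array import array


def flatten(rows):
    """Copy a rectangular list of lists into one preallocated flat buffer."""
    n_rows = len(rows)
    n_cols = len(rows[0]) if rows else 0
    # The size is known up front, so allocate the whole buffer once.
    flat = array("d", [0.0]) * (n_rows * n_cols)
    for r, row in enumerate(rows):
        if len(row) != n_cols:
            raise ValueError("rows must all have the same length")
        flat[r * n_cols:(r + 1) * n_cols] = array("d", row)
    return flat, (n_rows, n_cols)


rows = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]
flat, shape = flatten(rows)

# One pass over contiguous memory instead of one loop per inner list.
total = sum(flat)
print(shape, total)
```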

  • py2many

    Transpiler of Python to many other languages

  • py2many doesn't care which language is faster as long as the source language is annotated Python 3.

    I don't know whether this particular benchmark transpiles correctly, but it should be possible to achieve a substantial speedup if you annotate your Python code properly.

    https://github.com/adsharma/py2many

    Looking for help with open issues.

  • julia

    The Julia Programming Language

  • I find it virtually certain that it does derive from it. Considering that one of the authors of Julia wrote Julia's front-end in Lisp (https://github.com/JuliaLang/julia/blob/master/src/julia-par... and some other files in the same directory), it would have been astonishing for CLOS not to have had a major impact on the design. There are also some relevant statements in a paper on Julia's design (https://dl.acm.org/doi/10.1145/3276490), in the part on multiple dispatch in section 7, Related Work, where CLOS and its "algebraic cousin" Dylan are mentioned.

    I got the impression that Julia's object system is basically CLOS without quite a few of CLOS's complexities, such as inheritance (which in CLOS necessitates some advanced extension facilities to cover corner cases where method lookup doesn't do what you want in a highly complex application model).

    The nice effect of those feature removals is that in many cases monomorphization of call sites in emitted native code is possible, which is presumably the other reason for removing them: suddenly even primitive operations such as +, * etc. can be generic functions without incurring (most of the time) dispatch cost at runtime. Primitive operations are not generic functions in CLOS, although that can very well be attributed to backwards-compatibility efforts in Common Lisp.

    Interestingly enough, in Julia's documentation, the section "Noteworthy Differences from other Languages" (https://docs.julialang.org/en/v1/manual/noteworthy-differenc...) compares Julia to several relevant languages, which are: Matlab, R, Python, C/C++, and...Common Lisp, of all things. I very strongly doubt that this is a coincidence.
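
For readers unfamiliar with multiple dispatch, here is a toy sketch of the concept in Python: the implementation is chosen from the runtime types of all arguments, not just the first. The `dispatch` decorator and its registry are hypothetical and resolve methods at call time by first match; this is not how Julia or CLOS implement dispatch, and it does none of the call-site specialization discussed above.

```python
_registry = {}  # function name -> list of (type signature, implementation)


def dispatch(*types):
    """Register an implementation for one combination of argument types."""
    def decorator(func):
        methods = _registry.setdefault(func.__name__, [])
        methods.append((types, func))

        def wrapper(*args):
            # Pick the first registered method whose signature matches every argument.
            for signature, impl in methods:
                if len(signature) == len(args) and all(
                    isinstance(arg, t) for arg, t in zip(args, signature)
                ):
                    return impl(*args)
            raise TypeError("no method matches the argument types")

        return wrapper
    return decorator


@dispatch(int, int)
def combine(a, b):
    return a + b


@dispatch(str, int)
def combine(a, b):
    return a * b


@dispatch(str, str)
def combine(a, b):
    return a + " " + b


print(combine(2, 3), combine("ab", 2), combine("a", "b"))
```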

  • cunumeric

    An Aspiring Drop-In Replacement for NumPy at Scale

  • Try dask

    Distribute your data and run everything as dask.delayed and then compute only at the end.

    Also check out legate.numpy from Nvidia, which promises to be a drop-in NumPy replacement that will use all your CPU cores without any tweaks on your part.

    https://github.com/nv-legate/legate.numpy
