|13 days ago||8 days ago|
|Mozilla Public License 2.0||MIT License|
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
Does Rust's performance advantage over python extend to numpy/pandas?
1 project | reddit.com/r/bioinformatics | 4 Jan 2022
It has some impressive benchmarks https://h2oai.github.io/db-benchmark/
Is Data Science 90% boring and 10% mega-interesting?
1 project | reddit.com/r/datascience | 28 Dec 2021
Programming languages for data roles Big Tech [OC]
1 project | reddit.com/r/learnmachinelearning | 22 Dec 2021
It’s not just speed but also memory usage and parallelization potential. Look at how many times pandas runs out of memory: https://h2oai.github.io/db-benchmark/
The polars dataframe library now also exposes bindings to NodeJS
2 projects | reddit.com/r/node | 22 Dec 2021
Polars is a blazingly fast DataFrame library. It's written in Rust and until now exposed bindings only for Python. It's one of the best-performing solutions in H2O.ai's db-benchmark.
Polars, lightning-fast DataFrame library
1 project | reddit.com/r/Python | 17 Dec 2021
Polars: Lightning-fast DataFrame library for Rust and Python
13 projects | news.ycombinator.com | 16 Dec 2021
Hmmm... in the linked benchmarks, DataFrames.jl (a Julia library) appears to be fairly competitive.
Rust and what it needs to gain space in computation-oriented applications
7 projects | reddit.com/r/rust | 24 Nov 2021
You should check out polars, datafusion, influxdb iox and databend, all written in native Rust and powered by the Apache Arrow format. Polars in particular is pretty damn fast and has bindings for Python.
Database-Like Ops Benchmark
1 project | news.ycombinator.com | 20 Nov 2021
A better dtypes for pandas dataframes pulled from Postgres
1 project | reddit.com/r/datascience | 14 Nov 2021
Here is a good comparison: https://h2oai.github.io/db-benchmark/
Which programming language or compiler is faster
7 projects | news.ycombinator.com | 18 Dec 2021
This is total misinformation, sorry. Julia may, depending on your setup, be slow to initially load, but the compiler is quite fast generally.
Also, there's a solution to precompile binaries with no JIT penalty...
Is rust good for mathematical computing?
4 projects | reddit.com/r/rust | 16 Nov 2021
*: This is possible in Julia using PackageCompiler.jl, but it ships the entire runtime, so the binaries are big, and the process isn't too smooth yet. In theory, it's definitely possible to compile small binaries that don't embed the runtime, or at least a big chunk of it, but nobody has worked on this enough yet.
Compile for faster execution?
4 projects | reddit.com/r/Julia | 10 Oct 2021
Simply importing a library takes 30-40 seconds?
1 project | reddit.com/r/Julia | 31 Aug 2021
Why Julia's multiple dispatch is so great, explained with Pokemons
2 projects | news.ycombinator.com | 20 Jul 2021
Julia is fairly fast: since its type system _only_ does dynamic/runtime typing, the JIT is optimized towards that. You'll experience some minor startup lag, typically due to the initial JIT'ing of any newly used functions. However, this has largely been remedied with a compiler backend that completely precomputes this behavior. https://julialang.github.io/PackageCompiler.jl/dev/
[D] fast.ai's Jeremy Howard on Why Python is not the future of machine learning - Gradient Dissent Clip
1 project | reddit.com/r/MachineLearning | 15 Jun 2021
Python has one performance advantage over other VM/JIT-based languages (including Java): its very short startup time, which is well suited for horizontal scaling. Otherwise your point is fair. In fact, Julia is better designed for AOT compilation, which means that eventually it will be much better suited for production than pure Dockerized Python.
Why not Julia?
11 projects | reddit.com/r/Julia | 1 May 2021
Livebook: A collaborative and interactive code notebook for Elixir
6 projects | news.ycombinator.com | 18 Apr 2021
I'm considering Rust, Go, or Julia for my next language and I'd like to hear your thoughts on these
12 projects | reddit.com/r/rust | 16 Apr 2021
Package load times were cut by roughly a factor of two, in my experience, but that doesn't bring the initialization overhead down to the point where it's usable as a standalone microservice. Your best options at this point are https://github.com/dmolina/DaemonMode.jl, which keeps a Julia process running in the background using a client/server model, or https://github.com/JuliaLang/PackageCompiler.jl, which allows for ~zero-overhead package loading (at the cost of some up-front complexity).
Julia 1.6 Highlights
9 projects | news.ycombinator.com | 25 Mar 2021
We call these "system images" and you can generate them with [PackageCompiler](https://github.com/JuliaLang/PackageCompiler.jl). Unfortunately, it's still a little cumbersome to create them, but this is something that we're improving from release to release. One possible future is where an environment can be "baked", such that when you start Julia pointing to that environment (via `--project`) it loads all the packages more or less instantaneously.
The downside is that generating system images can be quite slow, so we're still working on ways to generate them incrementally. In any case, if you're inspired to work on this kind of stuff, it's definitely something the entire community is interested in!
What are some alternatives?
arrow-datafusion - Apache Arrow DataFusion and Ballista query engines
polars - Fast multi-threaded DataFrame library in Rust | Python | Node.js
julia - The Julia Programming Language
LuaJIT - Mirror of the LuaJIT git repository
DataFramesMeta.jl - Metaprogramming tools for DataFrames
sktime - A unified framework for machine learning with time series
Apache Arrow - Apache Arrow is a multi-language toolbox for accelerated data interchange and in-memory processing
ModelingToolkit.jl - A modeling framework for automatically parallelized scientific machine learning (SciML) in Julia. A computer algebra system for integrated symbolics for physics-informed machine learning and automated transformations of differential equations
Genie.jl - The highly productive Julia web framework
csvs-to-sqlite - Convert CSV files into a SQLite database