|2 months ago||4 days ago|
|Mozilla Public License 2.0||MIT License|
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
Rust and what it needs to gain space in computation-oriented applications
7 projects | reddit.com/r/rust | 24 Nov 2021
You should check out polars, datafusion, influxdb iox and databend, all written in native Rust and powered by the Apache Arrow format. Polars in particular is pretty damn fast and has bindings for Python.
Database-Like Ops Benchmark
1 project | news.ycombinator.com | 20 Nov 2021
A better dtypes for pandas dataframes pulled from Postgres
1 project | reddit.com/r/datascience | 14 Nov 2021
Here is a good comparison: https://h2oai.github.io/db-benchmark/
Introducing tidypolars - a Python data frame package with syntax familiar to R tidyverse users
4 projects | reddit.com/r/datascience | 10 Nov 2021
The biggest difference with this one is that it's built on top of the polars package, which is probably the fastest data frame manipulation library out there. All of the other dplyr-to-python packages are built on top of pandas (which is very slow in comparison).
Introducing tidypolars - a Python data frame package for R tidyverse users
9 projects | reddit.com/r/rstats | 10 Nov 2021
I think having a basic understanding of pandas, given how broadly it's used, is beneficial. That being said, polars seems to be matching or beating data.table in performance, so I think it'd be very worth it to take it up. Wes McKinney, creator of pandas, has been quite vocal about architecture flaws of pandas -- which is why he's been working on the Arrow project. polars is based on Arrow, so in principle it's kinda like pandas 2.0 (adopting the changes that Wes proposed).
9 projects | reddit.com/r/rstats | 10 Nov 2021
tidypolars uses the polars package as a backend, which might be the fastest data frame manipulation library out there. (Faster even than R's data.table, which has been the king of speed for many years.)
Your perfect program/language for experience studies?
1 project | reddit.com/r/actuary | 4 Nov 2021
Julia has ExperienceStudies.jl to help with exposure calculations and MortalityTables.jl for mortality rate data. It also performs very well in data science benchmarks: https://h2oai.github.io/db-benchmark/
Comparing SQLite, DuckDB and Arrow
5 projects | news.ycombinator.com | 27 Oct 2021
this benchmark is more comprehensive for this type of analytical work:
1 project | reddit.com/r/datascience | 23 Oct 2021
Data too big to fit in memory can be handled in R too, using SparkR. I agree the documentation for something like PySpark is better, though. For data that fits in memory, data.table in R beats pandas. It loses to Polars (implemented in Rust, with bindings for Python), but that is not in use much as it's new: https://github.com/h2oai/db-benchmark.
Turning database into a searchable dashboard?
3 projects | reddit.com/r/datascience | 21 Oct 2021
Julia 1.7 release
1 project | reddit.com/r/programming | 1 Dec 2021
That's not true anymore. While it was the case initially, it's been advertised as a general language for a long time now. Take a look at the "Julia in a Nutshell" section of the website: https://julialang.org
Julia 1.7 has been released
15 projects | news.ycombinator.com | 30 Nov 2021
I'm extremely excited about this. But there are still many problems, such as a ton of tests failing. For example: https://github.com/JuliaLang/julia/issues/43164.
15 projects | news.ycombinator.com | 30 Nov 2021
Mutation is tricky, because basically the only way to do it sanely is to copy all data into the residual, but since most arrays aren't actually mutated, that's extremely wasteful. I've been hoping to address this by changing mutability in the language more generally (e.g. https://github.com/JuliaLang/julia/pull/42465) to make immutable arrays the default, at which point there wouldn't be a penalty anymore. I've had requests to do the mutable copying optionally, but it's a bit tricky, because it needs rule system integration and the rule system currently doesn't reason about mutation.
As for exceptions and recursion, shouldn't be a problem, just needs to be implemented.
Release v1.7.0 · JuliaLang/julia
2 projects | reddit.com/r/Julia | 30 Nov 2021
When it isn't crashing it flies and is wonderful to use, but right now https://github.com/JuliaLang/julia/issues/41440 prevents (my) using Apple Silicon for production runs. (i.e. library development works ok, especially if you can stand having to restart the REPL with some frequency, but long running / demanding simulations can't be trusted to finish without crashing).
2 projects | reddit.com/r/Julia | 30 Nov 2021
julia/NEWS.md at v1.7.0 · JuliaLang/julia
1 project | reddit.com/r/contextfree | 30 Nov 2021
Does Julia support currying? (Too happy)
4 projects | reddit.com/r/Julia | 29 Nov 2021
There's a very long ongoing discussion about syntax for currying: https://github.com/JuliaLang/julia/pull/24990
PyTorch: Where we are headed and why it looks a lot like Julia (but not exactly)
It's acknowledged. The full redefinition of any function in any context was solved by RuntimeGeneratedFunctions and we use it extensively throughout SciML, ModelingToolkit, Pumas, etc. so it's fair to say struct redefinitions are the only thing left. That said, "struct redefinitions" are done all of the time in many contexts: if you know you may need to do this, you can just use a named tuple which acts just like an anonymous type and obeys dispatch on a named form that you can then hijack at runtime. With smart engineering then you can be redefining everything on the fly, and we do this through MTK quite regularly. That said, it would be nice to do the last feat of full redefinition of non-anonymous structs, and the latest proposal for how to do struct redefinition is https://github.com/JuliaLang/julia/issues/40399. So we both built and provided solutions for how to do it, showed how to do it in libraries, tutorials, videos, etc. How is that not acknowledging the point of view and doing something about it?
You might want to try and acknowledge the other point of view where I note that, hey, we did make this work but it was a smaller deal than we thought because we legally cannot employ it in many production contexts. We're still going to work out a few details for fun and ease of debugging, but given that we have extensively looked into and thought deeply about the whole issue, we have noticed that the last little bits are less useful than we had thought (at least in the contexts and applications I have been discussing, like clinical trial analysis). That doesn't mean we won't finish the last pieces, but given how much of it you can already do and how many teaching materials show how to work around the issues in any real-world context, and how little of a real-world application the last few bits have, it shouldn't be surprising that the last pieces haven't been a huge priority. So instead of looking narrowly at one factor, I encourage you to take a more global view.
I would say that almost every version of Julia 1.x has gotten better in terms of code startup.
as in 1.7 > 1.6 > 1.5 > 1.4 > 1.3 > etc...
it's especially gotten way better since Julia 1.5, so really mostly in the last few years.
In julia 1.8, what's interesting to me is that the julia runtime will be separated from the llvm codegen; https://github.com/JuliaLang/julia/pull/41936
the immediate effect is to allow small static binaries without a huge runtime (namely the LLVM ORC), but the side effect is probably that the interpreter will also get better in cases where you don't want JIT.
What are some alternatives?
NetworkX - Network Analysis in Python
Numba - NumPy aware dynamic Python compiler using LLVM
rust-numpy - PyO3-based Rust binding of NumPy C-API
Lua - Lua is a powerful, efficient, lightweight, embeddable scripting language. It supports procedural programming, object-oriented programming, functional programming, data-driven programming, and data description.
arrow-datafusion - Apache Arrow DataFusion and Ballista query engines
Dagger.jl - A framework for out-of-core and parallel execution
py2many - Python to CLike languages transpiler
duckdf - 🦆 SQL for R dataframes, with ducks
awesome-lisp-companies - Awesome Lisp Companies
femtolisp - a lightweight, robust, scheme-like lisp implementation
polars - Fast multi-threaded DataFrame library in Rust and Python
DFTK.jl - Density-functional toolkit