dplyr vs axon

dplyr

dplyr: A grammar of data manipulation (by tidyverse)

R data-manipulation Grammar

dplyr.tidyverse.org

axon

Nx-powered Neural Networks (by elixir-nx)

Artificial intelligence Nx Elixir Deep Learning neural-networks Optimizers


Project         dplyr                                        axon
Mentions        40                                           15
Stars           4,645                                        1,439
Growth          0.6%                                         1.4%
Activity        7.4                                          7.8
Latest Commit   12 days ago                                  7 days ago
Language        R                                            Elixir
License         GNU General Public License v3.0 or later     Apache License 2.0

Mentions - the total number of mentions that we've tracked plus the number of user-suggested alternatives.
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
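
As a rough illustration of the Growth figure, the sketch below computes month-over-month growth as a simple percentage change between two star counts. This is an assumption: the page does not state its exact formula, and the function name and star counts are hypothetical.

```python
def month_over_month_growth(stars_last_month: int, stars_now: int) -> float:
    """Return the percentage change in stars between two monthly snapshots.

    Assumption: growth is the plain percentage change; the site's exact
    formula is not documented in this page.
    """
    if stars_last_month <= 0:
        raise ValueError("previous star count must be positive")
    return (stars_now - stars_last_month) / stars_last_month * 100.0


# Hypothetical snapshots, not taken from either project's history.
growth = month_over_month_growth(1000, 1014)
print(f"{growth:.1f}%")
```

With these made-up snapshots, a project that gained 14 stars on a base of 1,000 would show growth of about 1.4%.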

dplyr

Posts with mentions or reviews of dplyr. We have used some of these posts to build our list of alternatives and similar projects. The last one was on 2024-03-15.

Show HN: Open-source, browser-local data exploration using DuckDB-WASM and PRQL
11 projects | news.ycombinator.com | 15 Mar 2024

That's great feedback, thanks!
This tool definitely comes from a place of personal need - beyond just handling large files, I've also never really gelled well with the Excel/Google Sheet model of changing data in place as if you were editing text. I'm a Data Scientist and always preferred the chained data transforms you see in things like dplyr (https://dplyr.tidyverse.org/) or Polars (https://pola.rs/) and I feel this tool maps very closely to the chained model.
Also, thank you for the feature requests! Those would all be very useful - we'll put them on the roadmap.
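
To make the "chained data transforms" idea in the post above concrete, here is a minimal sketch in plain Python. It is not the dplyr or Polars API: the `Table` class, its method names (borrowed loosely from dplyr's verbs), and the sample rows are all hypothetical. The point is only that each step returns a new table instead of editing data in place.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, Tuple

Row = Dict[str, Any]


@dataclass(frozen=True)
class Table:
    """Hypothetical immutable table; every method returns a new Table."""

    rows: Tuple[Row, ...]

    def filter(self, predicate: Callable[[Row], bool]) -> "Table":
        # Keep only the rows for which the predicate is true.
        return Table(tuple(r for r in self.rows if predicate(r)))

    def mutate(self, **new_columns: Callable[[Row], Any]) -> "Table":
        # Add (or overwrite) columns computed from each row.
        return Table(
            tuple({**r, **{k: f(r) for k, f in new_columns.items()}} for r in self.rows)
        )

    def arrange(self, key: str, descending: bool = False) -> "Table":
        # Sort rows by a single column.
        return Table(tuple(sorted(self.rows, key=lambda r: r[key], reverse=descending)))


# Made-up sample data.
cars = Table((
    {"model": "a", "price": 10000, "km": 50000},
    {"model": "b", "price": 20000, "km": 10000},
    {"model": "c", "price": 5000, "km": 120000},
))

result = (
    cars
    .filter(lambda r: r["km"] < 100000)
    .mutate(price_per_km=lambda r: r["price"] / r["km"])
    .arrange("price_per_km", descending=True)
)
print([r["model"] for r in result.rows])
```

The original `cars` table is left unchanged, which is the contrast with the edit-in-place spreadsheet model the post describes.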

PSA: You don't need fancy stuff to do good work.
10 projects | /r/datascience | 9 May 2023

Before diving into advanced machine learning algorithms or statistical models, we need to start with the basics: collecting and organizing data. Fortunately, both Python and R offer a wealth of libraries that make it easy to collect data from a variety of sources, including web scraping, APIs, and reading from files. Key libraries in Python include requests, BeautifulSoup, and pandas, while R has httr, rvest, and dplyr.

osdc-2023-assignment1
5 projects | dev.to | 9 Jan 2023

Modern Polars: an extensive side-by-side comparison of Polars and Pandas
5 projects | news.ycombinator.com | 7 Jan 2023

It really can't be said enough how pandas is a mess. It has way too much surface area and no common thread pulling it all together. This gets obvious when you work with better dataframe libs like dplyr [1] or DataFramesMeta [2]. I've worked on production systems with all of these libs, this is not gratuitous bashing.
[1] https://dplyr.tidyverse.org/

How do I find R code for R functions?
2 projects | /r/rprogramming | 21 Nov 2022

There are two ways you can generally see the source code for packages. The simplest is to look for the GitHub repository for the package (assuming it exists). For dplyr, it's here. The easiest way to find these is to Google "r github" plus the name of the package; usually it'll be one of the first results. The GitHub repo is also usually linked on the package's CRAN page. However, be aware that this may be a development version of the package and not the version currently released on CRAN (e.g. dplyr on CRAN is version 1.0.10, but on GitHub it is listed as version 1.0.99.9000, which will probably become version 1.1.0 when it is released on CRAN).

People who live near other people vote for Democrats
4 projects | /r/dataisbeautiful | 9 Nov 2022

Tools used: various packages in R (tidycensus, dplyr, ggplot2, sf)

Used Cars Data Scraping - R & Github Actions & AWS
2 projects | dev.to | 11 Sep 2022

The idea came from wanting to combine data engineering with cloud services and automation. Because the pipeline would be automated, I needed a dynamic data source. I also wanted a site where retrieving data would not be a problem, so that I could practice with both rvest and dplyr. Once my experiments with Carvago ran without problems, I added the necessary data cleaning steps. Another aim of the project was to store the data in different forms in different environments. The raw (daily CSV) and processed data are written to the GitHub repo, and the processed data is also written to PostgreSQL on AWS RDS. In addition, I sync the raw and processed data to S3 so that it can be queried with Athena. As a matter of good practice, I split the GitHub Actions workflow into separate stages. For example, the first stage scrapes the data, cleans it, and prints a basic analysis to a simple log file, while synchronization with AWS S3 runs as a separate action. If all of this completes without errors, a further action builds a report with RMarkdown and publishes it on github.io. The result is an end-to-end data pipeline that takes data from the source and, with simple processing, offers basic reporting.

Quick candlestick summaries with Elixir's Explorer
8 projects | dev.to | 22 Aug 2022

The API is heavily influenced by Tidy Data and borrows much of its design from dplyr. The philosophy is heavily influenced by this passage from dplyr's documentation:

tidytable v0.8.1 is on CRAN - it also comes with a new logo! Need data.table speed with tidyverse syntax? Check out tidytable.
2 projects | /r/rstats | 22 Aug 2022

Also - I might have been the one that put in the request for .by in dplyr 😅

ibis-datasette: Query datasette servers without writing a line of SQL
2 projects | /r/Python | 18 Aug 2022

For my day job I work on ibis. ibis lets users write queries using a familiar dataframe-like API, and then execute those queries on a number of SQL (and non-SQL) backends. Think of it like dplyr but for Python.

axon

Posts with mentions or reviews of axon. We have used some of these posts to build our list of alternatives and similar projects. The last one was on 2023-04-14.

Data wrangling in Elixir with Explorer, the power of Rust, the elegance of R
7 projects | news.ycombinator.com | 14 Apr 2023

José from the Livebook team. I don't think I can make a pitch because I have limited Python/R experience to use as reference.
My suggestion is for you to give it a try for a day or two and see what you think. I am pretty sure you will find weak spots and I would be very happy to hear any feedback you may have. You can find my email on my GitHub profile (same username).
In general we have grown a lot since the Numerical Elixir effort started two years ago. Here are the main building blocks:
* Nx (https://github.com/elixir-nx/nx/tree/main/nx#readme): equivalent to Numpy, deeply inspired by JAX. Runs on both CPU and GPU via Google XLA (also used by JAX/Tensorflow) and supports tensor serving out of the box
* Axon (https://github.com/elixir-nx/axon): Nx-powered neural networks
* Bumblebee (https://github.com/elixir-nx/bumblebee): Equivalent to HuggingFace Transformers. We have implemented several models and that's what powers the Machine Learning integration in Livebook (see the announcement for more info: https://news.livebook.dev/announcing-bumblebee-gpt2-stable-d...)
* Explorer (https://github.com/elixir-nx/explorer): Series and DataFrames, as per this thread.
* Scholar (https://github.com/elixir-nx/scholar): Nx-based traditional Machine Learning. This one is the most recent effort of them all. We are treading the same path as scikit-learn but quite early on. However, because we are built on Nx, everything is derivable, GPU-ready, distributable, etc.
Regarding visualization, we have "smart cells" for VegaLite and MapLibre, similar to how we did "Data Transformations" in the video above. They help you get started with your visualizations and you can jump deep into the code if necessary.
I hope this helps!

Elixir and Rust is a good mix
10 projects | news.ycombinator.com | 13 Apr 2023

> I guess, why not use Rust entirely instead of as a FFI into Elixir or other backend language?
Because Rust brings none of the benefits of the BEAM ecosystem to the table.
I was an early Elixir adopter, not working currently as an Elixir developer, but I have deployed one of the largest Elixir applications for a private company in my country.
I know it has limits, but the language itself is only a small part of the whole.
Take ML: José Valim and Sean Moriarity studied the problem, made a plan to tackle it, and started solving it piece by piece [1] in a tightly integrated manner. It feels natural, as if Elixir always had those capabilities, in a way that no other language does. To put the icing on the cake, the community released Livebook [2] to interactively explore code and use the new tools in the simplest way possible, something that Python notebooks can only dream of, even after a decade of progress.
That's not to say that Elixir is superior as a language, but that the ecosystem is flourishing and the community is able to extract 100% of the benefits from the tools and create new, marvellously crafted ones that push the limits forward every time, in such a simple manner that it looks like magic.
And going back to Rust, you can write Rust if you need speed or for whatever reason you feel it's the right tool for the job, it's totally integrated [3][4], again in a way that many other languages can only dream of, and it's in fact the reason I've learned Rust in the first place.
The opposite is not true, if you write Rust, you write Rust, and that's it. You can't take advantage of the many features the BEAM offers, OTP, hot code reloading, full inspection of running systems, distribution, scalability, fault tolerance, soft real time etc. etc. etc.
But of course if you don't see any advantage in them, it means you probably don't need them (one other option is that you still don't know you want them :] ). In that case Rust is as good as any other language, but for a backend, even though I gently despise it, Java (or Kotlin) might be a better option.
[1] https://github.com/elixir-nx/nx https://github.com/elixir-nx/axon
[2] https://livebook.dev/
[3] https://github.com/rusterlium/rustler
[4] https://dashbit.co/blog/rustler-precompiled

Bumblebee: GPT2, Stable Diffusion, and More in Elixir
5 projects | news.ycombinator.com | 8 Dec 2022

I've trained models using Jupyter and Livebook (though only smaller toy models [1]) so I can deposit my 2 cents here. Small disclaimer that I started with Jupyter, so in some sense my mental model was biased towards Jupyter.
I think the biggest difference that'll trip you up coming from Jupyter is that Livebook enforces linear execution. You can't arbitrarily run cells in any order like you can in Jupyter - if you change an earlier cell all the subsequent cells have to be run in order. The only deviation from this is branches which allow you to capture the state at a certain point and create a new flow from there on. There's a section in [1] that explains how branching works and how you can use it when training models.
The other difference is that if you do something that crashes in a cell, you'll lose the state of the entire branch and have to rerun from the beginning of the branch. Iirc if you stop a long running cell, that forces a rerun as well. That can also be painful when running training loops that run for a while, but there are some pretty neat workarounds you can do using Kino. Using those workarounds does break the reproducibility guarantees though.
Personally while building NN models I find that I prefer the Jupyter execution model because for NNs, rerunning cells can be really time-consuming. Being able to quickly change some variables and run a cell out of order helps while I'm exploring/experimenting.
Two things I love about Livebook though are 1) the file format makes version control super easy and 2) Kino allows for real interactivity in the notebook in a way that's much harder to do in Jupyter. So in Livebook you can easily create live updating charts, images etc that show training progress or have other kinds of interactivity.
If you're interested to see what my model training workflow looks like with Livebook (and I have no idea if it's the best workflow!), check out the examples below [1][2]. Overall I'd say it definitely works well, you just have to shift your mental model a bit if you're coming from Jupyter. If I were doing something where rerunning cells wasn't expensive I would probably prefer the Livebook model.
[1] https://github.com/elixir-nx/axon/blob/main/notebooks/genera...

ElixirConf 2022 - That's a wrap!
7 projects | dev.to | 12 Sep 2022

Machine learning is rapidly expanding within the Elixir ecosystem, with tools such as Nx, Axon, and Explorer being used both by individuals and companies such as Amplified, as mentioned above.

What's your opinion on Elixir?
3 projects | /r/rust | 20 May 2022

It's been my professional daily driver since 2018, but I consider it an average-to-disappointing language and ecosystem on top of an incredible VM/runtime. For more specific thoughts, I posted some critique here back in 2020, and very few of those concerns have been addressed in the interim. There is a vestigial ML story around libraries like Nx/Axon. LiveView is inadvisable in practice but is sort of the banner marketing device right now, which disappoints me.

Recognize Digits Using ML in Elixir
2 projects | /r/elixir | 11 May 2022

Yeah, as Mark said, I think the problem is related to this issue https://github.com/elixir-nx/axon/issues/244

Do Elixir's benefits still hold when interfacing with another language?
2 projects | /r/elixir | 2 May 2022

Show HN: Dataframes in Elixir Backed by Rust
6 projects | news.ycombinator.com | 4 Nov 2021

What are some alternatives?

When comparing dplyr and axon you can also consider the following projects:

worldfootballR - A wrapper for extracting world football (soccer) data from FBref, Transfermarkt, Understat and FotMob

nx - Multi-dimensional arrays (tensors) and numerical definitions for Elixir

Rustler - Safe Rust bridge for creating Erlang NIF functions

ggplot2 - An implementation of the Grammar of Graphics in R

explorer - Series (one-dimensional) and dataframes (two-dimensional) for fast and elegant data exploration in Elixir

livebook - Automate code & data workflows with interactive Elixir notebooks

rmarkdown - Dynamic Documents for R

Pandas - Flexible and powerful data analysis / manipulation library for Python, providing labeled data structures similar to R data.frame objects, statistical functions, and much more

polars - Dataframes powered by a multithreaded, vectorized query engine, written in Rust

regression-js - Curve Fitting in JavaScript.