dataiter VS explorer

Compare dataiter and explorer and see how they differ.

explorer

Series (one-dimensional) and dataframes (two-dimensional) for fast and elegant data exploration in Elixir (by elixir-explorer)

             dataiter       explorer
Mentions     2              20
Stars        23             976
Growth       -              1.1%
Activity     7.8            9.4
Last commit  22 days ago    7 days ago
Language     Python         Elixir
License      MIT License    MIT License

The number of mentions indicates the total number of mentions that we've tracked plus the number of user-suggested alternatives.
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.

dataiter

Posts with mentions or reviews of dataiter. We have used some of these posts to build our list of alternatives and similar projects. The last one was on 2022-05-01.
  • Modern Pandas (Part 2): Method Chaining
    5 projects | news.ycombinator.com | 1 May 2022
    Here's another alternative. I wrote Dataiter specifically as I too was frustrated with Pandas. In my experience if you design a new API from scratch (and don't try to reimplement the Pandas API as many projects have done!) and have some vision and consistent principles, it's well possible to get a good intuitive API as a result. Two relevant issues remain: You're limited by NumPy's datatypes and their problems, such as memory-hogging strings and a lack of a proper missing value (NA), and secondly, limited by the Python language, so compared to e.g. dplyr's non-standard evaluation, you'll need to use lambda functions, which are unfortunately clumsy and verbose.

    https://github.com/otsaloma/dataiter

    Here's a comparison of dplyr vs. Dataiter vs. Pandas, which should give a quick overview of the similarities and differences.

    https://dataiter.readthedocs.io/en/latest/_static/comparison...
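
The trade-off described in the post above (method chains that need lambda functions where dplyr would accept bare column names) can be sketched with a toy class. This is a minimal illustration only: `Frame`, its methods, and the sample data are hypothetical and are not Dataiter's actual API.

```python
from statistics import mean


class Frame:
    """Hypothetical toy data frame: a list of row dicts. Not Dataiter's API."""

    def __init__(self, rows):
        self.rows = list(rows)

    def filter(self, predicate):
        # Keep only the rows for which predicate(row) is true.
        return Frame(row for row in self.rows if predicate(row))

    def modify(self, **columns):
        # Add or overwrite columns; each value is a function of the row.
        return Frame(
            {**row, **{name: fn(row) for name, fn in columns.items()}}
            for row in self.rows
        )

    def aggregate(self, by, **summaries):
        # Group rows by one column, then reduce each group with the given functions.
        groups = {}
        for row in self.rows:
            groups.setdefault(row[by], []).append(row)
        return Frame(
            {by: key, **{name: fn(group) for name, fn in summaries.items()}}
            for key, group in groups.items()
        )


# Made-up sample data for illustration.
cars = Frame([
    {"make": "a", "hp": 100, "kg": 1000},
    {"make": "a", "hp": 200, "kg": 1600},
    {"make": "b", "hp": 150, "kg": 1500},
    {"make": "b", "hp": 50, "kg": 900},
])

# In dplyr the first step would read filter(hp >= 100); without non-standard
# evaluation, Python needs a lambda that names the row explicitly.
result = (
    cars
    .filter(lambda r: r["hp"] >= 100)
    .modify(ratio=lambda r: r["hp"] / r["kg"])
    .aggregate("make", n=len, mean_ratio=lambda g: mean(r["ratio"] for r in g))
)
```

Each method returns a new `Frame`, which is what makes the chain possible; the enclosing parentheses are only there so the chain can span several lines.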

  • Polars: Lightning-fast DataFrame library for Rust and Python
    13 projects | news.ycombinator.com | 16 Dec 2021
    Agreed, dplyr is great.

    I built my own data frame implementation on top of NumPy specifically trying to accomplish a better API, similar to dplyr. It's not exactly the same naming or operations, but should feel familiar and much simpler and more consistent than Pandas. And no indexes or axes.

    Having done this, a couple of notes on what will unavoidably differ in Python:

    * It probably makes more sense in Python to use classes, so method chaining instead of function piping. I wish one could syntactically skip enclosing parentheses in Python though, method chains look a bit verbose.

    * Python doesn't have R's "non-standard evaluation", so you end up needing lambda functions for arguments in method chains and group-wise aggregation etc. I'd be interested if someone has a better solution.

    * NumPy (and Pandas) is still missing a proper missing value (NA). It's a big pain to try to work around that.

    https://github.com/otsaloma/dataiter
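
The missing-value complaint in the post above can be illustrated without NumPy, since plain Python floats follow the same IEEE 754 rules as NumPy's float arrays. The sketch below assumes NaN is used as the stand-in for "missing"; the variable names and sample values are made up for illustration.

```python
import math

# NaN used as a stand-in for "missing" in a float column.
weights = [61.5, math.nan, 70.0]

# NaN is not equal to itself, so an ordinary comparison never finds it.
found_by_equality = [w == math.nan for w in weights]
found_by_isnan = [math.isnan(w) for w in weights]

# NaN propagates through arithmetic, so aggregates need explicit filtering.
naive_total = sum(weights)
clean_total = sum(w for w in weights if not math.isnan(w))

# Integers have no NaN: marking a missing count forces the column to float
# (or to a sentinel such as -1 that can be mistaken for real data).
counts = [3, None, 5]
counts_as_float = [math.nan if c is None else float(c) for c in counts]
```

A dedicated NA value, as in R, avoids both workarounds: the column keeps its type and missing entries are skipped or reported consistently.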

explorer

Posts with mentions or reviews of explorer. We have used some of these posts to build our list of alternatives and similar projects. The last one was on 2024-01-08.
  • Polars
    11 projects | news.ycombinator.com | 8 Jan 2024
    The Explorer library [0] in Elixir uses Polars underneath it.

    [0] https://github.com/elixir-explorer/explorer

  • Unpacking Elixir: Concurrency
    9 projects | news.ycombinator.com | 25 Aug 2023
  • Elixir Livebook is a secret weapon for documentation
    12 projects | news.ycombinator.com | 6 Aug 2023
    To ensure you do not miss this: LiveBook comes with a Vega Lite integration (https://livebook.dev/integrations -> https://livebook.dev/integrations/vega-lite/), which means you get access to a lot of visualisations out of the box, should you need that (https://vega.github.io/vega-lite/).

    In the same "standing on giant's shoulders" stance, you can use Explorer (see example LiveBook at https://github.com/elixir-explorer/explorer/blob/main/notebo...), which leverages Polars (https://www.pola.rs), a very fast DataFrame library and now a company (https://www.pola.rs/posts/company-announcement/) with 4M$ seed.

  • Does anyone else hate Pandas?
    2 projects | /r/dataengineering | 11 Jun 2023
    Already exists. Check out https://github.com/elixir-nx/explorer which provides a tidyverse-like API in Elixir using polars as the back end.
  • Data wrangling in Elixir with Explorer, the power of Rust, the elegance of R
    7 projects | news.ycombinator.com | 14 Apr 2023
    José from the Livebook team. I don't think I can make a pitch because I have limited Python/R experience to use as reference.

    My suggestion is for you to give it a try for a day or two and see what you think. I am pretty sure you will find weak spots and I would be very happy to hear any feedback you may have. You can find my email on my GitHub profile (same username).

    In general we have grown a lot since the Numerical Elixir effort started two years ago. Here are the main building blocks:

    * Nx (https://github.com/elixir-nx/nx/tree/main/nx#readme): equivalent to NumPy, deeply inspired by JAX. Runs on both CPU and GPU via Google XLA (also used by JAX/TensorFlow) and supports tensor serving out of the box.

    * Axon (https://github.com/elixir-nx/axon): Nx-powered neural networks

    * Bumblebee (https://github.com/elixir-nx/bumblebee): Equivalent to HuggingFace Transformers. We have implemented several models and that's what powers the Machine Learning integration in Livebook (see the announcement for more info: https://news.livebook.dev/announcing-bumblebee-gpt2-stable-d...)

    * Explorer (https://github.com/elixir-nx/explorer): Series and DataFrames, as per this thread.

    * Scholar (https://github.com/elixir-nx/scholar): Nx-based traditional Machine Learning. This one is the most recent effort of them all. We are treading the same path as scikit-learn but quite early on. However, because we are built on Nx, everything is derivable, GPU-ready, distributable, etc.

    Regarding visualization, we have "smart cells" for VegaLite and MapLibre, similar to how we did "Data Transformations" in the video above. They help you get started with your visualizations and you can jump deep into the code if necessary.

    I hope this helps!

  • Would you still choose Elixir/Phoenix/LiveView if scaling and performance weren’t an issue to solve for?
    3 projects | /r/elixir | 7 Mar 2023
    There's a package in the Nx ecosystem called Explorer (https://github.com/elixir-nx/explorer). It uses bindings for the rust library, polars, which is much more betterer than Pandas.
  • Updated Erlport alternative ?
    3 projects | /r/elixir | 26 Oct 2022
    FWIW around April this year I started using erlport with python polars in a production ETL app because explorer didn't have the features I needed at the time.
  • ElixirConf 2022 - That's a wrap!
    7 projects | dev.to | 12 Sep 2022
    Machine learning is rapidly expanding within the Elixir ecosystem, with tools such as Nx, Axon, and Explorer being used both by individuals and companies such as Amplified, as mentioned above.
  • Dataframes but for Elixir
    1 project | news.ycombinator.com | 23 Aug 2022
  • Quick candlestick summaries with Elixir's Explorer
    8 projects | dev.to | 22 Aug 2022

What are some alternatives?

When comparing dataiter and explorer you can also consider the following projects:

dtplyr - Data table backend for dplyr

dplyr - dplyr: A grammar of data manipulation

dataframe-api - RFC document, tooling and other content related to the dataframe API standard

polars - Dataframes powered by a multithreaded, vectorized query engine, written in Rust

chain-ops-python - Simple chaining of operations (a.k.a. pipe operator) in python

axon - Nx-powered Neural Networks

data_algebra - Codd method-chained SQL generator and Pandas data processing in Python.

db-benchmark - reproducible benchmark of database-like ops

mito - The mitosheet package, trymito.io, and other public Mito code.

arrow2 - Transmute-free Rust library to work with the Arrow format

minimal-pandas-api-for-pola

wasmex - Execute WebAssembly from Elixir