Datamancer
polars
Metric | Datamancer | polars |
---|---|---|
Mentions | 7 | 144 |
Stars | 124 | 26,378 |
Growth | 2.4% | 3.4% |
Activity | 8.7 | 10.0 |
Latest commit | 3 months ago | about 20 hours ago |
Language | Nim | Rust |
License | MIT License | GNU General Public License v3.0 or later |
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
Datamancer
- Anyone attempted to make Nim serve R's role? How is it currently?
I have been using Nim for all of my recent data munging and analysis. There's https://github.com/Vindaar/ggplotnim for plots (among others) and everything else has just been normal code. There's also https://github.com/SciNim/Datamancer if you need something more like tidyverse.
- Nim Version 1.6.6 Released
- Is Nim right for me?
Check out Datamancer for your Pandas equivalent. If I recall correctly, it does have the ability to read/write CSV. If that doesn't suit you, there is a Python/Nim bridge called Nimpy. I do a lot of machine learning projects and have to use OpenCV and some other things from Python because they don't exist yet. It's a pretty damn cool library.
- daily report for Nim language
Worked on the roadmap https://github.com/nim-lang/Nim/pull/19388 (enable -d:nimPreviewFloatRoundtrip and -d:nimPreviewDotLikeOps) and found that one of the important_packages (datamancer) failed, so I made a PR (https://github.com/SciNim/Datamancer/pull/23). It is not a bug in nimPreviewFloatRoundtrip (it looks like a precision problem to me), so alternatively datamancer can be disabled temporarily.
- Which dataframe library to use?
There seem to be two major ones for Nim: NimData and Datamancer. Which one is better?
- Polars: Lightning-fast DataFrame library for Rust and Python
polars
- Why Python's Integer Division Floors (2010)
This is because 0.1 is in actuality the floating point value 0.1000000000000000055511151231257827021181583404541015625, and thus 1 divided by it is ever so slightly smaller than 10. Nevertheless, fpround(1 / fpround(1 / 10)) = 10 exactly.
I found out about this recently because in Polars I defined a // b for floats to be (a / b).floor(), which does return 10 for this computation. Since Python's correctly-rounded division is rather expensive, I chose to stick to this (more context: https://github.com/pola-rs/polars/issues/14596#issuecomment-...).
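The difference described in this comment can be reproduced in plain Python. The following is a minimal sketch, not Polars code: it compares Python's built-in float `//` with the `(a / b).floor()` approach the commenter describes, and uses `fractions.Fraction` (an addition for illustration only) to show the exact quotient of the two stored doubles.

```python
import math
from fractions import Fraction

a, b = 1.0, 0.1

# Exact rational quotient of the two stored doubles.
# Because the double nearest to 0.1 is slightly larger than 1/10,
# this quotient is slightly smaller than 10.
exact = Fraction(a) / Fraction(b)
print(exact < 10)            # True

# Python's float floor division follows the exact quotient, so it floors to 9.
print(a // b)                # 9.0

# Dividing first rounds the quotient to the nearest double (exactly 10.0),
# so flooring afterwards gives 10, as in the approach the comment describes.
print(math.floor(a / b))     # 10
```

The two results differ only when the rounded quotient lands on an integer that the exact quotient does not reach, which is why the cheaper divide-then-floor approach is usually an acceptable trade-off.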
- Polars
https://github.com/pola-rs/polars/releases/tag/py-0.19.0
- Stuff I Learned during Hanukkah of Data 2023
That turned out to be related to pola-rs/polars#11912, and this linked comment provided a deceptively simple solution - use PARSE_DECLTYPES when creating the connection:
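The snippet from the original post is not reproduced here. As a rough illustration only, and assuming the fix refers to the `sqlite3.PARSE_DECLTYPES` flag in Python's standard library, a connection that converts declared `DATE` columns into `datetime.date` objects might look like this. The in-memory database, the `orders` table, and the explicit converter are hypothetical and not taken from the post.

```python
import datetime
import sqlite3

# Explicit converter for columns declared as DATE (hypothetical setup;
# registering our own avoids relying on the deprecated default converters).
sqlite3.register_converter(
    "date", lambda raw: datetime.date.fromisoformat(raw.decode())
)

# PARSE_DECLTYPES tells sqlite3 to pick a converter from each column's declared type.
conn = sqlite3.connect(":memory:", detect_types=sqlite3.PARSE_DECLTYPES)
conn.execute("CREATE TABLE orders (id INTEGER, ordered DATE)")  # hypothetical table
conn.execute("INSERT INTO orders VALUES (1, '2023-12-07')")

row = conn.execute("SELECT id, ordered FROM orders").fetchone()
print(row)  # (1, datetime.date(2023, 12, 7))
conn.close()
```

Without `detect_types`, the same query would return the `ordered` value as a plain string, which is the kind of type mismatch a downstream dataframe library may then have to guess at.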
- Polars 0.20 Released
- Second language
- Polars: Dataframes powered by a multithreaded query engine, written in Rust
- Summing columns in remote Parquet files using DuckDB
- Polars 0.34 is released. (A query engine focussing on DataFrame front ends)
What are some alternatives?
nimpy - Nim - Python bridge
vaex - Out-of-Core hybrid Apache Arrow/NumPy DataFrame for Python, ML, visualization and exploration of big tabular data at a billion rows per second 🚀
dtplyr - Data table backend for dplyr
modin - Modin: Scale your Pandas workflows by changing a single line of code
nimskull - An in development statically typed systems programming language; with sustainability at its core. We, the community of users, maintain it.
datafusion - Apache DataFusion SQL Query Engine
Nim - Nim is a statically typed compiled systems programming language. It combines successful concepts from mature languages like Python, Ada and Modula. Its design focuses on efficiency, expressiveness, and elegance (in that order of priority).
DataFrames.jl - In-memory tabular data in Julia
ggplotnim - A port of ggplot2 for Nim
datatable - A Python package for manipulating 2-dimensional tabular data structures
NimData - DataFrame API written in Nim, enabling fast out-of-core data processing
Apache Arrow - Apache Arrow is a multi-language toolbox for accelerated data interchange and in-memory processing