polars

Open-source projects categorized as polars

Top 23 polars Open-Source Projects

  • polars

    Dataframes powered by a multithreaded, vectorized query engine, written in Rust

  • Project mention: Why Python's Integer Division Floors (2010) | news.ycombinator.com | 2024-02-28

    This is because 0.1 is in actuality the floating point value 0.1000000000000000055511151231257827021181583404541015625, and thus 1 divided by it is ever so slightly smaller than 10. Nevertheless, fpround(1 / fpround(1 / 10)) = 10 exactly.

    I found out about this recently because in Polars I defined a // b for floats to be (a / b).floor(), which does return 10 for this computation. Since Python's correctly-rounded division is rather expensive, I chose to stick to this (more context: https://github.com/pola-rs/polars/issues/14596#issuecomment-...).
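The difference described in this comment can be reproduced in plain Python. This is a minimal sketch using only the standard library; the variable names are illustrative, and the second computation mirrors the `(a / b).floor()` definition mentioned above rather than calling Polars itself.

```python
import math
from decimal import Decimal

# The double nearest to 0.1 is slightly larger than one tenth.
exact = Decimal(0.1)
print(exact)

# Python's // on floats floors the exact quotient of the two stored doubles,
# which is just below 10.
python_floor_div = 1 // 0.1

# Dividing first rounds the quotient to the nearest double (10.0),
# and flooring that rounded value keeps it at 10.
divide_then_floor = math.floor(1 / 0.1)

print(python_floor_div)   # 9.0
print(divide_then_floor)  # 10
```

The two results differ only because of where rounding happens: before the floor in the second case, and not at all in the first.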

  • Mimesis

    Mimesis is a powerful Python library that empowers developers to generate massive amounts of synthetic data efficiently.

  • InfluxDB

    Power Real-Time Data Analytics at Scale. Get real-time insights from all types of time series data with InfluxDB. Ingest, query, and analyze billions of data points in real-time with unbounded cardinality.

  • ibis

    the portable Python dataframe library

  • Project mention: Show HN: Hashquery, a Python library for defining reusable analysis | news.ycombinator.com | 2024-04-23

    I really don't understand the appeal of dbt vs a proper programming language. The templating approach leads to massive spaghetti. I look forward to trying out something like Ibis [0]

    0: https://ibis-project.org/

  • DataFrame

    C++ DataFrame for statistical, Financial, and ML analysis -- in modern C++ using native types and contiguous memory storage

  • Project mention: New multithreaded version of C++ DataFrame was released | news.ycombinator.com | 2024-02-13
  • qsv

    CSVs sliced, diced & analyzed.

  • Project mention: Qsv: Efficient CSV CLI Toolkit | news.ycombinator.com | 2023-12-22

    Thanks for the detailed feedback @snidane!

    As maintainer of qsv, here's my reply:

    - Given qsv's rapid release cycle (173 releases over three years), the auto-update check is essential at the moment. Once we reach 1.0, I'll turn it off. For now, given your feedback, I've only made it check 10% of the time.

    - Pivot is in the backlog and I'll be sure to add unpivot when I implement it. (https://github.com/jqnatividad/qsv/issues/799)

    - I'll add a dedicated summing command with the group by (-by) and window by (-over) capability (https://github.com/jqnatividad/qsv/issues/1514). Do note that `stats` has basic sum as @ezequiel-garzon pointed out.

    - With the `enum` command, qsv can achieve what you proposed with `laminate`. E.g. qsv enum --new-column newcol --constant newconstant mydata.csv --output laminated-data.csv

    - With the cat rowskey command, qsv can already concatenate files with mismatched headers.

    - Other file formats: qsv supports parquet, csv, tsv, excel, ods, datapackage, sqlite and more (see https://github.com/jqnatividad/qsv/tree/master#file-formats). Fixed-format is not supported yet, though it is quite interesting, and I have added it to the backlog (https://github.com/jqnatividad/qsv/issues/1515)

    - as to "enable embedding outputs of commands", qsv is composable by design, so you can use standard stdin/stdout redirection/piping techniques to have it work with other CLI tools like jq, awk, etc.

    Finally, just released v0.120.0 that already incorporates the less aggressive self-update check. https://github.com/jqnatividad/qsv/releases/tag/0.120.0

  • functime

    Time-series machine learning at scale. Built with Polars for embarrassingly parallel feature extraction and forecasts on panel data.

  • Project mention: functime: NEW Data - star count:616.0 | /r/algoprojects | 2023-11-08
  • jupysql

    Better SQL in Jupyter. 📊

  • Project mention: Show HN: JupySQL – a SQL client for Jupyter (ipython-SQL successor) | news.ycombinator.com | 2023-12-06

    Hey, HN community!

    We're stoked to launch JupySQL today! JupySQL is an open-source library that brings a modern SQL experience to Jupyter. JupySQL is compatible with all major databases, such as Snowflake, Redshift, PostgreSQL, MySQL, MariaDB, DuckDB, SQL Server, Clickhouse, Trino, and more!

    To get started, check out our tutorial: https://jupysql.ploomber.io/en/latest/quick-start.html

    SQL is the de facto language for data analysis; however, analysis often requires a mix of SQL and Python. JupySQL bridges this gap, allowing users to execute SQL queries seamlessly in Jupyter and continue their analysis in Python. Add %%sql to the top of your cell and start writing SQL.

    Here are some of JupySQL's main features:

    - Syntax highlighting

  • SaaSHub

    SaaSHub - Software Alternatives and Reviews. SaaSHub helps you find the best software and product alternatives

  • awesome-polars

    A curated list of Polars talks, tools, examples & articles. Contributions welcome !

  • Project mention: 👉 New Awesome Polars release! What's new in the world of Polars in June 2023 ? Let's find out! 🚀 | /r/dataengineering | 2023-06-28
  • geopolars

    Geospatial extensions for Polars

  • datacompy

    Pandas and Spark DataFrame comparison for humans and more!

  • Project mention: How to Check 2 SQL Tables Are the Same | news.ycombinator.com | 2023-07-26
  • r-polars

    Bring polars to R

  • Project mention: Polars R Package | news.ycombinator.com | 2024-02-08
  • nodejs-polars

    nodejs front-end of polars

  • Project mention: Using Deno with Jupyter Notebook to build a data dashboard | dev.to | 2024-01-17

    Polars: A blazingly fast DataFrame library written in Rust for data manipulation and analysis

  • rust-data-analysis

    Rust for data analysis encyclopedia (WIP).

  • Project mention: Ask HN: Rust Viable for Data Analytics? | news.ycombinator.com | 2024-02-01

    Rust still has some key pieces missing, but looks promising, see: https://github.com/wiseaidev/rust-data-analysis

    F# has a very decent data community: https://datascienceinfsharp.com

    And obviously Julia is also something to consider.

  • rust-mlops-template

    A work in progress to build out solutions in Rust for MLOPs

  • Project mention: Is anyone doing Machine Learning in Rust? | /r/rust | 2023-05-11
  • pyo3-polars

    Pyo3 extensions for polars

  • Project mention: Introducing polars expression plugins for python | /r/Python | 2023-10-26

    See the full examples here: https://github.com/pola-rs/pyo3-polars/tree/main/example/derive_expression

  • polars-xdt

    Polars plugin offering eXtra stuff for DateTimes

  • Project mention: Business day arithmetic in Polars is...easy! Just use the `polars-business` plugin | /r/datascience | 2023-11-14
  • biobear

    Work with bioinformatic files using Arrow, Polars, and/or DuckDB

  • cuallee

    Possibly the fastest DataFrame-agnostic quality check library in town.

  • Project mention: Show HN: Snowflake Data Quality Checks in Python | news.ycombinator.com | 2024-02-11
  • s2protocol-rs

    Starcraft 2 Protocol Replay Reader

  • Project mention: New version of s2protocol-rs SC2Replay parsing crate | /r/starcraft2 | 2023-10-06
  • fastexcel

    A Python wrapper around calamine (by ToucanToco)

  • time-series-streaming-analytics-template

    Template to quickstart streaming analytics using Apache Kafka for ingestion, QuestDB for time-series storage and analytics, Grafana for near real-time dashboards, and Jupyter Notebook for data science

  • Project mention: Show HN: Open-source template for end-to-end streaming analytics | news.ycombinator.com | 2024-02-08
  • lightning-mlflow-hf

    Use QLoRA to tune LLM in PyTorch-Lightning w/ Huggingface + MLflow

  • Project mention: Show HN: LoRA Tune LLM in Lightning on GPU | news.ycombinator.com | 2023-11-12
  • dply-rs

    A dataframe manipulation tool for parquet, csv, and json data.

  • Project mention: A dplyr interpreter powered by Polars | /r/rust | 2023-05-10

    I have added documentation for all supported functions here.

NOTE: The open source projects on this list are ordered by number of GitHub stars. The number of mentions indicates repo mentions in the last 12 months or since we started tracking (Dec 2020).

polars related posts

  • Polars R Package

    1 project | news.ycombinator.com | 8 Feb 2024
  • Using Deno with Jupyter Notebook to build a data dashboard

    5 projects | dev.to | 17 Jan 2024
  • 👉 New Awesome Polars release! What's new in the world of Polars in June 2023 ? Let's find out! 🚀

    1 project | /r/dataengineering | 28 Jun 2023
  • 👉 New Awesome Polars release! What's new in the world of Polars in June 2023 ? Let's find out! 🚀

    1 project | /r/u_damiendotta | 28 Jun 2023
  • 👉 New Awesome Polars release! What's new in the world of Polars in the last 3 weeks ? A polars-df gems to use Polars with Ruby. 🚀

    1 project | /r/ruby | 30 May 2023
  • 👉 New Awesome Polars release! What's new in the world of Polars in the last 3 weeks ? Let's find out! 🚀

    1 project | /r/Python | 30 May 2023
  • 👉 New Awesome Polars release (04-21-2023) ! 🚀 What's new in #Polars? Let's find out!

    1 project | /r/Python | 21 Apr 2023
  • A note from our sponsor - SaaSHub
    www.saashub.com | 9 May 2024
    SaaSHub helps you find the best software and product alternatives Learn more →

Index

What are some of the best open-source polars projects? This list will help you:

#   Project                                   Stars
1   polars                                    26,378
2   Mimesis                                   4,309
3   ibis                                      4,241
4   DataFrame                                 2,280
5   qsv                                       2,234
6   functime                                  914
7   jupysql                                   610
8   awesome-polars                            598
9   geopolars                                 487
10  datacompy                                 394
11  r-polars                                  390
12  nodejs-polars                             313
13  rust-data-analysis                        284
14  rust-mlops-template                       273
15  pyo3-polars                               198
16  polars-xdt                                153
17  biobear                                   122
18  cuallee                                   107
19  s2protocol-rs                             102
20  fastexcel                                 72
21  time-series-streaming-analytics-template  44
22  lightning-mlflow-hf                       44
23  dply-rs                                   38
