Metric | jupyter | polars
---|---|---
Mentions | 13 | 144
Stars | 14,735 | 26,378
Growth | 0.2% | 3.4%
Activity | 7.2 | 10.0
Last commit | 7 days ago | 3 days ago
Language | Python | Rust
License | BSD 3-clause "New" or "Revised" License | GNU General Public License v3.0 or later
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
jupyter
- Mastering Data Science: Top 10 GitHub Repos You Need to Know
  6. Jupyter: Jupyter is a collection of tools and applications designed for interactive computing and data visualization. At the heart of the Jupyter ecosystem is the Jupyter Notebook, an interactive web-based platform that allows you to create and share documents containing live code, equations, visualizations, and narrative text. It's an excellent tool for exploratory data analysis, model prototyping, and creating reproducible data science workflows.
- You can run Rust code in a Jupyter notebook
  How cool. This motivated a quick search; this could be fun:
  How to write your own kernel: https://jupyter-client.readthedocs.io/en/stable/kernels.html
  All the language kernels (a lot of them abandoned; the MariaDB one ('binder') takes a while to load, but it gives you SQL in Jupyter!): https://github.com/jupyter/jupyter/wiki/Jupyter-kernels
- Resource for interesting data science project notebooks
- Mathics: A free, open-source alternative to Mathematica
  There are Jupyter kernels for Python, Mathics, Wolfram, R, Octave, Matlab, xeus-cling, and allthekernels (the polyglot kernel). https://github.com/jupyter/jupyter/wiki/Jupyter-kernels
- How does 3[a] gives the element at index 3 in an array?
  Not only is there one, it is only a simple Google search away. To make it simpler: there are 3. https://github.com/jupyter/jupyter/wiki/Jupyter-kernels
- How to use Jupyter notebooks in a conda environment?
  It seems this is not quite straightforward, and many users have similar trouble.
- Semi-Weekly Discussion Thread - February 21, 2022
  Community-maintained kernels: https://github.com/jupyter/jupyter/wiki/Jupyter-kernels
- Node.js Notebooks
- Python Tutorials using Jupyter Notebook
  Derek Banas on YouTube is doing a "Python for Finance" course at the moment using Jupyter, and is making the files available. I believe he's done others too. Failing that, there's this Git repo: A gallery of interesting Jupyter notebooks
- Github Discussion: What is your favorite Data Science Repo?
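
The "How to write your own kernel" documentation linked in the Rust entry above describes kernel specs: a `kernel.json` file that tells Jupyter how to launch a kernel. As a rough sketch of what such a file might contain, the snippet below writes one into a temporary directory. The module name `echo_kernel`, the display name, and the helper `write_kernel_spec` are hypothetical placeholders, and the `-f` flag assumes a launcher that accepts a connection file that way, as ipykernel-based kernels do.

```python
import json
import sys
import tempfile
from pathlib import Path


def write_kernel_spec(spec_dir: Path, module: str, display_name: str, language: str) -> Path:
    """Write a kernel.json describing how Jupyter should start a kernel."""
    spec = {
        # Command line used to start the kernel. Jupyter replaces
        # {connection_file} with the path of a JSON file holding the
        # ports and key the kernel should use.
        "argv": [sys.executable, "-m", module, "-f", "{connection_file}"],
        # Name shown in the notebook UI.
        "display_name": display_name,
        # Language the kernel runs.
        "language": language,
    }
    spec_dir.mkdir(parents=True, exist_ok=True)
    path = spec_dir / "kernel.json"
    path.write_text(json.dumps(spec, indent=2), encoding="utf-8")
    return path


if __name__ == "__main__":
    # Hypothetical kernel called "echo"; nothing is installed, the file
    # is only written to a scratch directory and printed.
    target = Path(tempfile.mkdtemp()) / "echo"
    print(write_kernel_spec(target, "echo_kernel", "Echo", "text").read_text())
```

The sketch only produces the spec file. Registering the directory with Jupyter (for example via `jupyter kernelspec install`) and implementing the kernel process itself are separate steps covered by the linked documentation.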
polars
- Why Python's Integer Division Floors (2010)
  This is because 0.1 is in actuality the floating point value 0.1000000000000000055511151231257827021181583404541015625, and thus 1 divided by it is ever so slightly smaller than 10. Nevertheless, fpround(1 / fpround(1 / 10)) = 10 exactly.
  I found out about this recently because in Polars I defined a // b for floats to be (a / b).floor(), which does return 10 for this computation. Since Python's correctly-rounded division is rather expensive, I chose to stick to this (more context: https://github.com/pola-rs/polars/issues/14596#issuecomment-...).
- Polars
  https://github.com/pola-rs/polars/releases/tag/py-0.19.0
- Stuff I Learned during Hanukkah of Data 2023
  That turned out to be related to pola-rs/polars#11912, and this linked comment provided a deceptively simple solution - use PARSE_DECLTYPES when creating the connection:
- Polars 0.20 Released
- Segunda linguagem (Second language)
- Polars: Dataframes powered by a multithreaded query engine, written in Rust
- Summing columns in remote Parquet files using DuckDB
- Polars 0.34 is released. (A query engine focussing on DataFrame front ends)
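
The floating-point remark in the "Why Python's Integer Division Floors" entry above can be checked in plain Python. The sketch below uses only the standard library and does not call Polars; the two helper names are made up for the illustration. It contrasts Python's `//` on floats with flooring the already-rounded quotient, which is the definition the quoted comment describes.

```python
import math
from decimal import Decimal


def python_floor_div(a: float, b: float) -> float:
    """Python's built-in float floor division."""
    return a // b


def floor_of_rounded_quotient(a: float, b: float) -> int:
    """Divide first (the result is rounded to a double), then floor."""
    return math.floor(a / b)


if __name__ == "__main__":
    # Exact decimal expansion of the double nearest to 0.1; it is
    # slightly larger than one tenth.
    print(Decimal(0.1))

    # The exact quotient 1 / 0.1 is just below 10, so // yields 9.0.
    print(python_floor_div(1.0, 0.1))           # 9.0

    # 1 / 0.1 rounds to exactly 10.0 as a double, so flooring it gives 10.
    print(floor_of_rounded_quotient(1.0, 0.1))  # 10
```

The second helper is cheaper because it needs one division and one floor, at the cost of disagreeing with Python in edge cases like this one.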
What are some alternatives?
nteract - The interactive computing suite for you!
vaex - Out-of-Core hybrid Apache Arrow/NumPy DataFrame for Python, ML, visualization and exploration of big tabular data at a billion rows per second
cookiecutter-data-science - A logical, reasonably standardized, but flexible project structure for doing and sharing data science work.
modin - Modin: Scale your Pandas workflows by changing a single line of code
pyodide - Pyodide is a Python distribution for the browser and Node.js based on WebAssembly
datafusion - Apache DataFusion SQL Query Engine
vscode-python - Python extension for Visual Studio Code
DataFrames.jl - In-memory tabular data in Julia
quokka - Repository for Quokka.js questions and issues
datatable - A Python package for manipulating 2-dimensional tabular data structures
Kedro - Kedro is a toolbox for production-ready data science. It uses software engineering best practices to help you create data engineering and data science pipelines that are reproducible, maintainable, and modular.
Apache Arrow - Apache Arrow is a multi-language toolbox for accelerated data interchange and in-memory processing