Top 23 Python jupyter-notebook Projects
-
recommenders
-
plotly
Project mention: FiftyOne Computer Vision Model Evaluation Tips and Tricks – Feb 03, 2023 | dev.to | 2023-02-03
Because the confusion matrix is implemented in plotly, it is interactive! To interact visually with your data via the confusion matrix, attach the plot to a session launched with the dataset:
-
ydata-profiling
Project mention: pandas-profiling VS Rath - a user suggested alternative | libhunt.com/r/pandas-profiling | 2023-01-12
-
docker-stacks
Project mention: (RANT) I think I'll die trying to setup and run Spark with Python in my local environment | reddit.com/r/dataengineering | 2023-01-28
I use an image that allows me to run spark on jupyter notebooks and use files from my local machine to test code. Here is a good one https://github.com/jupyter/docker-stacks/tree/main/pyspark-notebook
-
Jupyter Notebook (IPython)
Project mention: MacBook: Jupyter Lab -- "ModuleNotFoundError: No module named 'pysqlite2'" | reddit.com/r/learnpython | 2022-09-18
Based on this conversation, this more or less means that your Python installation is kind of borked. By default it should use the built-in sqlite3, but apparently your installation is either ignoring it or it's missing for some reason.
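One way to check the diagnosis above is to ask the interpreter whether the standard-library `sqlite3` package and its C extension `_sqlite3` can be located at all. This is a minimal sketch and not something taken from the quoted thread; the helper name `sqlite3_status` is hypothetical.

```python
import importlib.util


def sqlite3_status():
    """Report whether the stdlib sqlite3 package and its C extension can be found."""
    status = {}
    for name in ("sqlite3", "_sqlite3"):
        # find_spec returns None when the module cannot be located
        status[name] = importlib.util.find_spec(name) is not None
    if all(status.values()):
        import sqlite3
        # Version string of the runtime SQLite library
        status["sqlite_version"] = sqlite3.sqlite_version
    return status


if __name__ == "__main__":
    print(sqlite3_status())
```

If `_sqlite3` is reported as missing, the interpreter was most likely built without SQLite support, which would be consistent with the "borked installation" explanation.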
-
jupytext
Automatically convert ipynb files to py when saving them on JupyterLab
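At its core, converting a notebook to a script means reading the `.ipynb` JSON and writing out its code cells. The sketch below uses only the standard library and is not how jupytext (or any other tool) is implemented; the function name `notebook_to_script` is hypothetical, and the `# %%` cell markers are an assumed output convention. It ignores markdown cells, metadata, and round-tripping back to a notebook.

```python
import json


def notebook_to_script(nb_json: str) -> str:
    """Return the code cells of an nbformat 4 notebook as one Python script."""
    nb = json.loads(nb_json)
    chunks = []
    for cell in nb.get("cells", []):
        if cell.get("cell_type") != "code":
            continue  # markdown and raw cells are skipped in this sketch
        source = cell.get("source", "")
        # nbformat stores source either as one string or as a list of lines
        if isinstance(source, list):
            source = "".join(source)
        chunks.append("# %%\n" + source.rstrip("\n") + "\n")
    return "\n".join(chunks)


if __name__ == "__main__":
    demo = json.dumps({
        "nbformat": 4,
        "cells": [
            {"cell_type": "markdown", "source": ["# Title"]},
            {"cell_type": "code", "source": ["x = 1\n", "print(x)"]},
        ],
    })
    print(notebook_to_script(demo))
```

For real projects a dedicated tool is the safer choice, since it also handles pairing the two files and keeping them in sync on save.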
-
learn-python3
-
deeplake
Data Lake for Deep Learning. Build, manage, query, version, & visualize datasets. Stream data real-time to PyTorch/TensorFlow. https://activeloop.ai
Project mention: Launch HN: Activeloop (YC S18) – Data lake for deep learning | news.ycombinator.com | 2022-11-15
Re: HF - we know them and admire their work (primarily, until very recently, focused on NLP, while we focus mostly on CV). As mentioned in the post, a large part of Deep Lake, including the Python-based dataloader and dataset format, is open source as well - https://github.com/activeloopai/deeplake.
Likewise, we curate a list of large open source datasets here -> https://datasets.activeloop.ai/docs/ml/, but our main thing isn't aggregating datasets (focus for HF datasets), but rather providing people with a way to manage their data efficiently. That being said, all of the 125+ public datasets we have are available in seconds with one line of code. :)
We haven't benchmarked against HF datasets in a while, but Deep Lake's dataloader is much, much faster in third-party benchmarks (see this https://arxiv.org/pdf/2209.13705 and here for an older version, that was much slower than what we have now, see this: https://pasteboard.co/la3DmCUR2iFb.png). HF under the hood uses Git-LFS (to the best of my knowledge) and is not opinionated on formats, so LAION just dumps Parquet files on their storage.
While your setup would work for a few TBs, scaling to PB would be tricky including maintaining your own infrastructure. And yep, as you said NAS/NFS would neither be able to handle the scale (especially writes with 1k workers). I am also slightly curious about your use of mmap files with image/video compressed data (as zero-copy won’t happen) unless you decompress inside the GPU ;), but would love to learn more from you! Re: pricing thanks for the feedback, storage is one component and customly priced for PB-scale workloads.
-
voila
A nifty little alternative to voila, one might say.
-
pandas-ta
Technical Analysis Indicators - Pandas TA is an easy to use Python 3 Pandas Extension with 130+ Indicators
-
evidently
Evaluate and monitor ML models from validation to production. Join our Discord: https://discord.com/invite/xZjKRaNp8b
Project mention: evidently: Evaluate and monitor ML models from validation to production | reddit.com/r/coolgithubprojects | 2022-12-08
-
geemap
A Python package for interactive mapping with Google Earth Engine, ipyleaflet, and ipywidgets.
Project mention: I'm building an IDE and open source library to make it easier to work with geospatial data using Python | reddit.com/r/Python | 2022-11-04
-
deepchecks
Tests for Continuous Validation of ML Models & Data. Deepchecks is a Python package for comprehensively validating your machine learning models and data with minimal effort.
Project mention: [D] DL Practitioners, Do You Use Layer Visualization Tools s.a GradCam in Your Process? | reddit.com/r/MachineLearning | 2022-10-28
-
nbdime
Project mention: Ask HN: Are there any good Diff tools for Jupyter Notebooks? | news.ycombinator.com | 2022-05-22
[5] ReviewNB for reviewing & diff'ing notebook PRs / Commits on GitHub
Disclaimer: While I’m the author of last two (GitPlus & ReviewNB), I’ve represented the overall landscape in an unbiased way. I've been working on this specific problem for 3+ years & regularly talk to teams who use GitHub with notebooks.
-
sahi
Project mention: How to convert big TIF image to smaller jpgs | reddit.com/r/computervision | 2023-01-12
i have the EXACT thing ! the libs github!
-
mercury
Project mention: Programmatically create presentation slides with data visualisation graphs in Python | reddit.com/r/datascience | 2022-12-12
If you would like to see slides while working on a notebook, you will need the RISE extension. If you would like to update slides periodically, serve them on the cloud (with authentication), or add interactive widgets, you can check the Mercury framework.
-
nbviewer
Project mention: Powering Jupyter’s nbviewer.org: Yuvi Panda on the values, and value, of the open internet | dev.to | 2023-01-11
Yuvi: Just like the rest of the Jupyter ecosystem, nbviewer is an open-source package you can contribute to or even run your own internal version of! The public instance at nbviewer.org is generously hosted by the European cloud provider OVH and deployed on Kubernetes via this helm chart.
-
leafmap
A Python package for interactive mapping and geospatial analysis with minimal coding in a Jupyter environment
Project mention: Any folks doing data engineering in the agriculture or energy space? | reddit.com/r/dataengineering | 2022-11-03
For USA & parts of world there is the national map collection & API. Leafmap has a wrapper for it.
-
sms-tools
Project mention: [Discussion] Is there any general software for recreating a short sound? (audio "tracing") | reddit.com/r/synthrecipes | 2022-11-20
I do know of a few non-VST tools that are closer to this goal, but they're a pain to use. SMSTools and Loris attempt to resynthesize sounds by building consistent tracks for harmonics; but they both require coding knowledge to get any real use out of them. (Sounds like you're a programmer yourself so you probably have that part covered.)
-
nannyml
Detecting silent model failure. NannyML estimates performance for regression and classification models using tabular data. It alerts you when and why it changed. It is the only open-source library capable of fully capturing the impact of data drift on performance.
Project mention: [HIRING][Full Time, Part Time, Temporary, Internship, Freelance] Data Science Intern (Remote) | reddit.com/r/jobbit | 2022-05-20
Description NannyML - creators of an Open Source Python library, are looking for multiple Data Science interns to help across research, prototyping, and product. Github: https://github.com/NannyML/nannyml About Us NannyML is an Open Source Python lib …
-
zero-to-jupyterhub-k8s
Project mention: Dynamically spin up VM (based on specific HTTPS request) and stop it once session is over? | reddit.com/r/devops | 2022-06-02
-
livelossplot
-
machine_learning_refined
Notes, examples, and Python demos for the 2nd edition of the textbook "Machine Learning Refined" (published by Cambridge University Press).
Python jupyter-notebook related posts
- euporie - Jupyter notebooks in the terminal
- (RANT) I think I'll die trying to setup and run Spark with Python in my local environment
- How to raise the quality of scientific Jupyter notebooks
- How we made Jupyter Docker Stacks multi-arch
- New library to develop streamlit apps in jupyter
- How to Do an EDA for Time-Series
- How to compare 2 datasets with pandas-profiling 🐼
Index
What are some of the best open-source jupyter-notebook projects in Python? This list will help you:
Rank | Project | Stars |
---|---|---|
1 | recommenders | 14,991 |
2 | plotly | 12,862 |
3 | ydata-profiling | 10,106 |
4 | docker-stacks | 7,092 |
5 | Jupyter Notebook (IPython) | 7,048 |
6 | jupytext | 5,792 |
7 | learn-python3 | 5,468 |
8 | deeplake | 5,197 |
9 | voila | 4,537 |
10 | pandas-ta | 3,283 |
11 | evidently | 3,121 |
12 | geemap | 2,520 |
13 | deepchecks | 2,362 |
14 | nbdime | 2,357 |
15 | sahi | 2,334 |
16 | mercury | 2,316 |
17 | nbviewer | 2,054 |
18 | leafmap | 1,474 |
19 | sms-tools | 1,470 |
20 | nannyml | 1,373 |
21 | zero-to-jupyterhub-k8s | 1,272 |
22 | livelossplot | 1,227 |
23 | machine_learning_refined | 1,221 |