Top 19 Python Datascience Projects
-
Two random examples I found from 30 seconds of googling: Here’s Netflix using it in their crisis management tool, and here’s Uber using it in their deep learning framework.
-
-
Project mention: In Need of Guidance: Implementing MLOps in a Complex Organization as a Junior Data Engineer | /r/mlops | 2023-06-12
-
Mimesis
Mimesis is a powerful Python library that empowers developers to generate massive amounts of synthetic data efficiently.
Project mention: Mimesis allows you to easily generate detailed dummy datasets. | /r/datascience | 2023-04-12
Mimesis has well-structured and comprehensive documentation: https://mimesis.name
-
Project mention: What python library you are using for interactive visualisation?(other than plotly) | /r/datascience | 2023-06-01
https://panel.holoviz.org/ It's a web app framework for Python similar to what Dash does for plotly. It plays nicely with bokeh visuals and I think the front-end is built using bokeh css elements.
-
Project mention: Python: Uncovering the Overlooked Core Functionalities | news.ycombinator.com | 2023-07-24
If you actually think this code is better there's a real library that does this: https://github.com/EntilZha/PyFunctional.
-
Fast-F1
FastF1 is a Python package for accessing and analyzing Formula 1 results, schedules, timing data and telemetry.
F1 broadcasts its live timing via the SignalR protocol. The endpoint itself is unauthenticated. You can look at FastF1’s implementation of the SignalR client, and the endpoints it connects to, in the code documentation here: FastF1 SignalR client
-
CleverCSV
CleverCSV is a Python package for handling messy CSV files. It provides a drop-in replacement for the builtin CSV module with improved dialect detection, and comes with a handy command line application for working with CSV files.
There’s things like this, but I consider the existence of messy, non standard CSV files (backed by a decade of experience dealing with the problem) a strong reason to not use the format ever.
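For context on what "dialect detection" means in the CleverCSV description above, here is a minimal sketch of the standard-library baseline that CleverCSV says it improves on. It uses only the builtin `csv.Sniffer`, not CleverCSV itself; the sample text and the helper name `read_with_detected_dialect` are invented for illustration.

```python
import csv
import io

# Invented sample: semicolon-delimited, which the default csv.reader
# (comma dialect) would read as a single column per row.
SAMPLE = "name;age;city\nAlice;30;Paris\nBob;25;Berlin\n"


def read_with_detected_dialect(text, candidates=";,\t|"):
    """Guess the dialect with csv.Sniffer, then parse the text with it.

    `candidates` restricts which characters may be chosen as the delimiter.
    """
    dialect = csv.Sniffer().sniff(text, delimiters=candidates)
    rows = list(csv.reader(io.StringIO(text), dialect))
    return dialect.delimiter, rows


delimiter, rows = read_with_detected_dialect(SAMPLE)
print(delimiter)
print(rows[1])
```

If the drop-in claim holds, the same flow should carry over to CleverCSV with little more than a changed import, but that is an assumption to check against its documentation rather than something shown here.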
-
-
-
socios-brasil
Captures data on the partners of Brazilian companies from the Receita Federal and exports it to a human-readable format
-
objectiv-analytics
Open-source product analytics infrastructure for data teams that want full control. Built for high quality data collection and ready to use for advanced analytics & ML.
-
-
-
Project mention: Simple, expressive pipeline syntax to transform and manipulate data with ease | news.ycombinator.com | 2023-01-24
-
I think this is perfectly natural - some ideas never manifest or are not convincing enough to publish, or sometimes you write code and it turns out not to be used - why produce production-ready code in that case? I have a Python package that I slowly developed over 5 years, step by step [1]. Every time I use it, I find many things that I could develop; some I do right then, others I leave for later. I also have a blog [2] - you can see three dates for each blog post:
- the time I first started working on it
- the first time I published it
- the last time it was updated
All of these dates are important. Think of doing things more like a process of chained events, not like a one-stop thing.
[1]: https://github.com/Sieboldianus/TagMaps
[2]: https://du.nkel.dev/
-
-
scrape-google-play-store-app
Single script to scrape Google Play Store App info without browser automation
GitHub Repository
-
Machine-Learning-Cyrillic-Classifier
This is a web app where you can draw a letter of the Russian alphabet and the ML algorithm will predict the letter that you drew.
Python Datascience related posts
- Python: Uncovering the Overlooked Core Functionalities
- Consume Live Timing/Telemetry From API During Race
- Does anyone know were I can find telemetry?
- Taipy – Robust Web Apps Programming Using Only Python
- Taipy: An Open-Source Web App Builder Made for Python
- Taipy: an Open-Source Python Library to create Web Apps with Python Only
- Taipy Studio: a VSCode Extension to create Graph-Based Data Pipelines
Index
What are some of the best open-source Datascience projects in Python? This list will help you:
# | Project | Stars |
---|---|---|
1 | ludwig | 9,745 |
2 | modin | 8,967 |
3 | metaflow | 6,977 |
4 | Mimesis | 4,045 |
5 | panel | 3,120 |
6 | PyFunctional | 2,229 |
7 | Fast-F1 | 1,817 |
8 | CleverCSV | 1,119 |
9 | streamlit-geospatial | 699 |
10 | DGFraud | 591 |
11 | socios-brasil | 542 |
12 | objectiv-analytics | 461 |
13 | Mobile-Phone-Dataset-GSMArena | 52 |
14 | gretel-python-client | 38 |
15 | viper | 14 |
16 | TagMaps | 6 |
17 | linkedin-connections-analyzer | 5 |
18 | scrape-google-play-store-app | 2 |
19 | Machine-Learning-Cyrillic-Classifier | 1 |