-
production-tools
A bare-bones repository demonstrating how to set up tools for data science projects that will help you write higher quality code.
-
PRAW
PRAW, an acronym for "Python Reddit API Wrapper", is a Python package that allows for simple access to Reddit's API.
-
coveo-python-oss
This collection of general purpose python magic was too good to keep for ourselves!
-
ubelt
A Python utility library with a stdlib like feel and extra batteries. Paths, Progress, Dicts, Downloads, Caching, Hashing: ubelt makes it easy!
-
dispatch
All of the ad-hoc things you're doing to manage incidents today, done for you, and much more!
-
Kedro
Kedro is a toolbox for production-ready data science. It uses software engineering best practices to help you create data engineering and data science pipelines that are reproducible, maintainable, and modular.
-
starlite
Discontinued. Light, Flexible and Extensible ASGI API framework | Effortlessly Build Performant APIs [Moved to: https://github.com/litestar-org/litestar]
Production Tools: a Python repo used for data science projects.
Requests: Simple HTTP library for Python.
I'd like to share a link to a micro framework that I use at work: Flask. I think it's a nice example to follow 😀
Check Streamlit out. You can use it to provide a nice and simple web interface to your data analysis/ML related projects.
I also heard of Dash which serves the same purpose I guess, but I think it has more to offer.
Flask is battle-tested, has 0 bugs (while for FastAPI there was a discussion about a potential memory leak), and has an incredible ecosystem with many complementary libraries (Flask-Admin, Flask-Migrate, Flask-SQLAlchemy, Flask-Security-Too, ...) that will speed up your development.
trio. the best code, the best documentation, awesome community.
I refer back to the smartsheet-python-sdk from time to time. I like the way they use __getattr__ to instantiate subclasses on the Smartsheet object.
The creation of the dropbox-sdk-python repo was almost certainly overseen by Guido van Rossum since he was working at Dropbox at the time. There is a note in the Smartsheet SDK repo that parts of it were developed by Dropbox as well.
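To illustrate the `__getattr__` idea mentioned above, here is a minimal sketch of a client object that builds its resource helpers lazily on first attribute access. This is not the Smartsheet SDK's actual code: the `Client`, `Sheets`, and `Users` classes, the `_resources` registry, and the example URL are all hypothetical names made up for illustration.

```python
class Sheets:
    """Hypothetical resource helper; stands in for a real SDK class."""

    def __init__(self, client):
        self.client = client

    def list(self):
        # A real SDK would issue an HTTP request; here we only describe it.
        return f"GET {self.client.base_url}/sheets"


class Users:
    """Second hypothetical resource helper."""

    def __init__(self, client):
        self.client = client

    def list(self):
        return f"GET {self.client.base_url}/users"


class Client:
    # Hypothetical registry mapping attribute names to resource classes.
    _resources = {"Sheets": Sheets, "Users": Users}

    def __init__(self, base_url):
        self.base_url = base_url

    def __getattr__(self, name):
        # Python calls __getattr__ only when normal attribute lookup fails,
        # so regular attributes such as base_url never reach this method.
        try:
            resource_cls = self._resources[name]
        except KeyError:
            raise AttributeError(
                f"{type(self).__name__} has no attribute {name!r}"
            ) from None
        instance = resource_cls(self)
        # Cache the instance on the object so later lookups skip __getattr__.
        setattr(self, name, instance)
        return instance


client = Client("https://api.example.test")
print(client.Sheets.list())
```

The appeal of this design is that the client class stays small while still exposing many resource groups, and each helper is only created when someone actually uses it. The trade-off is that editors and type checkers cannot see the dynamically created attributes without extra hints.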
the reddit API is actually really nice: https://github.com/praw-dev/praw
black is a very well-organized example of a generic Python project.
transformers is an excellent example of an ML project.
For best practices you can take a look at https://github.com/coveo/stew (disclaimer: I'm the author). It's a tool that works with Poetry and offers some freebies around Continuous Integration. It also comes with a GitHub Action to make this a free meal.
You can take a look at https://github.com/coveooss/coveo-python-oss for a monorepo that uses stew to test and ship several libraries to pypi.org.
I'm fairly happy with my ubelt library.
Someone else might have already mentioned it, but the source code for the Python standard library is done quite well (https://github.com/python/cpython/tree/main/Lib).
The Python-powered shell xonsh (https://xon.sh, open source) has good documentation, well-structured modules, and a continuous delivery approach to releases backed by unit testing. I wonder how such a small team covers all the use cases during development. The approaches aren't super modern, but as a whole it's a good example of an open source project.
Two random examples I found from 30 seconds of googling: Here’s Netflix using it in their crisis management tool, and here’s Uber using it in their deep learning framework.
You can also check out Kedro, it’s like the Flask for data science projects and helps apply clean code principles to data science code.
Well, I'm not objective, but I'd say Starlite (https://github.com/starlite-api/starlite) is a mighty fine codebase. You can also learn a lot about tooling and typing by going through it.
One other project whose code I enjoyed reading while integrating with it is Piccolo ORM (https://github.com/piccolo-orm/piccolo). The code is really readable.