Open-source projects categorized as Python

Top 23 Python Open-Source Projects

  • tensorflow

    An Open Source Machine Learning Framework for Everyone

    Latest mention: Intermittent RecvAsync is cancelled error - Keras Text classifier | reddit.com/r/tensorflow | 2021-01-26

    I raised a bug with TF on GitHub; all they could suggest was to try the nightly version. I'm planning to move to PyTorch completely. Since FB and Tesla use Torch, I have a feeling that it would be more stable than TF.

  • system-design-primer

    Learn how to design large-scale systems. Prep for the system design interview. Includes Anki flashcards.

    Latest mention: A Guide to UWaterloo CS/Software Engineering Co-ops | reddit.com/r/uwaterloo | 2021-01-13

    If the company you're interviewing for is known for asking systems design questions (e.g., Splunk), consider reviewing those too. https://github.com/donnemartin/system-design-primer is a good resource.

  • TheAlgorithms

    All Algorithms implemented in Python

    Latest mention: Looking for Data Structure and Algorithm resources in Python. | reddit.com/r/computerscience | 2021-01-02

    If you happen to find pseudocode for the things you want to learn, you can see how they are done in Python here. This has implementations in all languages.

  • awesome-python

    A curated list of awesome Python frameworks, libraries, software and resources

    Latest mention: Debugging | dev.to | 2021-01-18

    awesome-python: Debugging tools

  • thefuck

    Magnificent app which corrects your previous console command.

    Latest mention: Have you ever forgotten to sudo a command? | reddit.com/r/linuxadmin | 2021-01-26

  • Django

    The Web framework for perfectionists with deadlines.

    Latest mention: Importing CreateView | reddit.com/r/django | 2021-01-21

    You can see django.views.generic here https://github.com/django/django/tree/master/django/views/generic

  • Flask

    The Python micro framework for building web applications.

    Latest mention: Project Directory - Is there a reason not to use Blueprints? | reddit.com/r/flask | 2021-01-24

    You should take a look at Flaskr, it's the official tutorial of Flask, and it uses some blueprints.

  • Keras

    Deep Learning for humans

    Latest mention: Tensorflow .predict on Pandas rows | reddit.com/r/learnpython | 2020-12-22

    Then it may be a version bug; try updating to the latest TensorFlow and Keras versions. It seems to appear in this issue and hasn't been resolved, so maybe switch to PyTorch?

  • httpie

    As easy as /aitch-tee-tee-pie/ 🥧 Modern, user-friendly command-line HTTP client for the API era. JSON support, colors, sessions, downloads, plugins & more. https://twitter.com/httpie

    Latest mention: What the hell happened to Postman? | reddit.com/r/webdev | 2020-12-27

    httpie looks great too!

  • Ansible

    Ansible is a radically simple IT automation platform that makes your applications and systems easier to deploy and maintain. Automate everything from code deployment to network configuration to cloud management, in a language that approaches plain English, using SSH, with no agents to install on remote systems. https://docs.ansible.com.

    Latest mention: removing ':' from a dictionary key | reddit.com/r/ansible | 2021-01-26

    Interestingly enough Ansible doesn't use json when parsing the output of lscpu: https://github.com/ansible/ansible/blob/13d08d232c8b692ec79dba81c9eb83887ee554a2/lib/ansible/module_utils/facts/virtual/linux.py#L266

  • Pytorch

    Tensors and Dynamic neural networks in Python with strong GPU acceleration

    Latest mention: [P] Implementation of RealFormer using pytorch | reddit.com/r/MachineLearning | 2020-12-27

    Tip: Use torch.bmm instead of torch.einsum. The former is considerably faster. Take a look at PyTorch's own MHA implementation to see how you have to do the reshaping for it.

  • requests

    A simple, yet elegant HTTP library.

    Latest mention: Clean Architecture in Flask | reddit.com/r/flask | 2021-01-26

    requests is apparently supposed to be pretty elegant | source

  • scikit-learn

    scikit-learn: machine learning in Python

    Latest mention: explanation of two simple code lines | reddit.com/r/learnpython | 2021-01-13

    The return values in make_forge are from this line: X, y = make_blobs(centers=2, random_state=4, n_samples=30). The make_blobs function is imported from sklearn.datasets, and there the return values are as such:

  • project-based-learning

    Curated list of project-based tutorials

    Latest mention: Exercises for a beginner? | reddit.com/r/learnpython | 2021-01-24

    Project based learning

  • Scrapy

    Scrapy, a fast high-level web crawling & scraping framework for Python.

    Latest mention: Does anyone know good resources to build a web crawler? | reddit.com/r/computerscience | 2021-01-26

    I'd strongly recommend Scrapy over Selenium if OP cares about performance. It uses built-in concurrency and async operations while Selenium basically runs entire Chrome instances in the background (or FF, IIRC).

  • Home Assistant

    🏡 Open source home automation that puts local control and privacy first

    Latest mention: Zigbee Network Is Down | reddit.com/r/homeassistant | 2021-01-27

    Here. https://github.com/home-assistant/core/issues/45144

  • Face Recognition

    The world's simplest facial recognition api for Python and the command line

    Latest mention: How to run PoseNet model and save data points for multiple images? | reddit.com/r/MLQuestions | 2021-01-15

    You feed the 17 landmark pairs as 34 inputs into a feed forward regression network (last time I did this I just tweaked something like this) with 34 pairs. However, it can help a lot to also add in the face detection to speed this up. So if you were to combine the landmarks from face detection as well like this face recognition and expand up to include the landmarks which helps I've found.

  • Apache Superset

    Apache Superset is a Data Visualization and Data Exploration Platform

    Latest mention: Ask HN: What low-code “dashboarding“ SaaS would you recommend in 2021? | news.ycombinator.com | 2020-12-29

    Check out Superset. https://github.com/apache/incubator-superset

    It’s modern and easy to extend. From the same author as Apache Airflow.

  • superset

    Apache Superset is a Data Visualization and Data Exploration Platform

    Latest mention: Apache Superset Is a Data Visualization and Data Exploration Platform | news.ycombinator.com | 2021-01-26

  • manim

    Animation engine for explanatory math videos

    Latest mention: Looking for a partner who can animate my scripts (Unpaid) | reddit.com/r/animation | 2021-01-12

    Regardless, since most people here won't work just because it might take off and would recommend you do to it yourself, here is a link to the software 3Blue1Brown uses to make his videos: https://github.com/3b1b/manim

  • Apache Spark

    A unified analytics engine for large-scale data processing

    Latest mention: Ballista: New approach for 2021 | reddit.com/r/rust | 2021-01-11

    Yes, I think to the extent that the open-source Spark has support for columnar data exchange. I think some/much of the work has been done in the last 2 years (see https://github.com/apache/spark/pull/24795/files), but I don't know to what extent one could completely build out the execution part in Spark 3.0 or 3.1.

  • Pandas

    Flexible and powerful data analysis / manipulation library for Python, providing labeled data structures similar to R data.frame objects, statistical functions, and much more

    Latest mention: "Python is so much better than Matlab or anything else" they said. A rant in G minor | reddit.com/r/programminghorror | 2021-01-26

    It works fine on a local machine (I didn't use the cmd since I like seeing variables and stuff, like in Spyder). No fuss about backslashes or xlrd or xlsx support. Tried the same line of code in Colab and it doesn't work. For Spyder, it's still giving me the error that xlrd can't read xlsx files (although cmd doesn't, weird). Looking up this issue, xlrd was updated in December and xlsx is no longer supported, so I have to use openpyxl and specify it in the call, since read_excel doesn't automatically use openpyxl for xlsx files.

  • PythonDataScienceHandbook

    Python Data Science Handbook: full text in Jupyter Notebooks

    Latest mention: [OC] Intensity of the Interaction between Programming Languages in the top 100 GitHub projects | reddit.com/r/dataisbeautiful | 2020-12-28

    You are right, Jupyter is more of an environment, so not a language. On the other hand, GitHub added it and other frameworks in the "Language" section, so that is why it was picked up in the graphic. You can see an example here: https://github.com/jakevdp/PythonDataScienceHandbook
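The Pandas mention in the list describes xlrd dropping .xlsx support and the need to name openpyxl explicitly when calling read_excel. A minimal sketch of that workaround follows. It assumes pandas and openpyxl are installed; the helper pick_excel_engine, the ENGINE_BY_SUFFIX mapping, and the file name report.xlsx are hypothetical names used for illustration and are not part of pandas. The only pandas feature relied on is the engine argument of read_excel.

```python
from pathlib import Path

# Hypothetical mapping (not part of pandas): file extension -> name of
# the engine to hand to pandas.read_excel via its `engine` argument.
ENGINE_BY_SUFFIX = {
    ".xlsx": "openpyxl",  # xlrd 2.0 and later no longer read .xlsx
    ".xlsm": "openpyxl",
    ".xls": "xlrd",       # legacy format, still handled by xlrd
}


def pick_excel_engine(path):
    """Return the engine name to pass to pandas.read_excel for this file."""
    suffix = Path(path).suffix.lower()
    try:
        return ENGINE_BY_SUFFIX[suffix]
    except KeyError:
        raise ValueError(f"unsupported spreadsheet extension: {suffix!r}")


# Usage, assuming pandas and openpyxl are installed:
#   import pandas as pd
#   df = pd.read_excel("report.xlsx", engine=pick_excel_engine("report.xlsx"))
```

Choosing the engine explicitly keeps the call independent of which default a given pandas or xlrd version would pick, which is the difference the commenter saw between environments.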

NOTE: The open-source projects on this list are ordered by number of GitHub stars. The number of mentions indicates repo mentions in the last 12 months or since we started tracking (Dec 2020). The latest post mention was on 2021-01-27.


What are some of the best open-source Python projects? This list will help you:

 #  Project                      Stars
 1  tensorflow                 152,778
 2  system-design-primer       119,353
 3  TheAlgorithms               97,739
 4  awesome-python              92,792
 5  thefuck                     58,660
 6  Django                      55,059
 7  Flask                       53,637
 8  Keras                       50,597
 9  httpie                      49,556
10  Ansible                     46,663
11  Pytorch                     45,712
12  requests                    44,451
13  scikit-learn                43,902
14  project-based-learning      43,254
15  Scrapy                      39,563
16  Home Assistant              39,452
17  Face Recognition            38,299
18  Apache Superset             33,607
19  superset                    32,538
20  manim                       29,926
21  Apache Spark                28,590
22  Pandas                      28,207
23  PythonDataScienceHandbook   27,777