Python AI

Open-source Python projects categorized as AI

Top 23 Python AI Projects

  • GitHub repo spaCy

    💫 Industrial-strength Natural Language Processing (NLP) in Python

    Project mention: Topic modelling with Gensim and SpaCy on startup news | dev.to | 2022-01-17

    SpaCy is one of the most popular NLP libraries, and is very fast and flexible.

  • GitHub repo pytorch-lightning

    The lightweight PyTorch wrapper for high-performance AI research. Scale your models, not the boilerplate.

    Project mention: [D] Are you using PyTorch or TensorFlow going into 2022? | reddit.com/r/MachineLearning | 2021-12-14

    Is the problem the sheer number of options, or the fact that they are all together in one place? Would it be better if they were organized into the different trainer entrypoints (fit, validate, ...)? If that is the case, there was an RFC proposing this which you might find interesting, feel free to drop by and comment on the issue: https://github.com/PyTorchLightning/pytorch-lightning/issues/10444

  • GitHub repo MLflow

    Open source platform for the machine learning lifecycle

    Project mention: [D] Tips for ML workflow on raw data | reddit.com/r/MachineLearning | 2022-01-21

  • GitHub repo bert-as-service

    Mapping a variable-length sentence to a fixed-length vector using BERT model

  • GitHub repo dvc

    🦉Data Version Control | Git for Data & Models | ML Experiments Management

    Project mention: [D] Tips for ML workflow on raw data | reddit.com/r/MachineLearning | 2022-01-21

    Try to use a version control tool for ML, such as DVC.

  • GitHub repo mycroft-core

    Mycroft Core, the Mycroft Artificial Intelligence platform.

    Project mention: Google Home Mini - what to do? | reddit.com/r/degoogle | 2022-01-21

    Fixed Link: https://mycroft.ai/

  • GitHub repo cookiecutter-data-science

    A logical, reasonably standardized, but flexible project structure for doing and sharing data science work.

    Project mention: Personal Projects that are original | reddit.com/r/datascience | 2021-10-17

    Projects don't need to be 100% original; do the project in a way that other people have not done yet. There are plenty of datasets and notebooks available on Kaggle, but those are just a bunch of notebooks. Take inspiration from a notebook, build the project in a modular structure, and organize it into proper folders and modules. I am using this cookiecutter for building my portfolio projects: https://github.com/drivendata/cookiecutter-data-science

  • GitHub repo metaflow

    🚀 Build and manage real-life data science projects with ease!

    Project mention: Best job scheduler in 2022? (Airflow / Dagster / Prefect / Luigi / other) | reddit.com/r/dataengineering | 2022-01-18

    Can I give a plug for Metaflow? It's particularly well suited to data science and ML workflows, with great tooling that's basically just annotations on Python functions and gives you DAG orchestration, parallelism, cloud integration, and data flow through DAGs. Very useful, in my opinion, for data science teams trying to migrate their existing scripts to (and write new ones on) Metaflow.
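
    The idea of "annotations on Python functions" that yield DAG orchestration and data flow can be sketched with the standard library alone. This is not Metaflow's API: the `Flow` class, its `step` decorator, and the step names below are hypothetical, chosen only to illustrate how decorated functions can declare their upstream steps and receive those steps' outputs.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


class Flow:
    """Hypothetical mini-orchestrator; NOT Metaflow's API."""

    def __init__(self):
        self._steps = {}  # step name -> function
        self._deps = {}   # step name -> names of upstream steps

    def step(self, *after):
        """Decorator that registers a function and the steps it depends on."""
        def register(func):
            self._steps[func.__name__] = func
            self._deps[func.__name__] = set(after)
            return func
        return register

    def run(self):
        """Run every step after its dependencies, passing their results in."""
        results = {}
        for name in TopologicalSorter(self._deps).static_order():
            inputs = {dep: results[dep] for dep in self._deps[name]}
            results[name] = self._steps[name](**inputs)
        return results


flow = Flow()


@flow.step()
def load():
    return [1, 2, 3, 4]


@flow.step("load")
def total(load):
    return sum(load)


@flow.step("load")
def count(load):
    return len(load)


@flow.step("total", "count")
def mean(total, count):
    return total / count


results = flow.run()
```

    Here `total` and `count` depend only on `load`, so a real orchestrator could run them in parallel; this sketch runs them one after another.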

  • GitHub repo RobustVideoMatting

    Robust Video Matting in PyTorch, TensorFlow, TensorFlow.js, ONNX, CoreML!

    Project mention: [D] AI Background Removal: a quick comparison between RVM & BGMv2 | reddit.com/r/MachineLearning | 2021-10-10

  • GitHub repo snorkel

    A system for quickly generating training data with weak supervision

    Project mention: [D] A hand-picked selection of the best Python ML Libraries of 2021 | reddit.com/r/MachineLearning | 2021-12-21

  • GitHub repo Activeloop Hub

    Dataset format for AI. Build, manage, & visualize datasets for deep learning. Stream data real-time to PyTorch/TensorFlow & version-control it. https://activeloop.ai (by activeloopai)

    Project mention: The hand-picked selection of the best Python libraries released in 2021 | reddit.com/r/Python | 2021-12-21

  • GitHub repo autoscraper

    A Smart, Automatic, Fast and Lightweight Web Scraper for Python

    Project mention: Turn Any Website Into An API with AutoScraper and FastAPI | dev.to | 2021-04-24

    In this article, we will learn how to create a simple e-commerce search API with multiple platform support: eBay and Amazon. AutoScraper and FastAPI provide the ability to create a powerful JSON API for the data. With Playwright's help, we'll extend our scraper and avoid blocking by using ScrapingAnt's web scraping API.

  • GitHub repo haystack

    🔍 Haystack is an open source NLP framework that leverages Transformer models. It enables developers to implement production-ready neural search, question answering, semantic document search and summarization for a wide range of applications.

    Project mention: Show HN: Hello – A conversational search engine powered by transformers | news.ycombinator.com | 2022-01-14

  • GitHub repo frigate

    NVR with realtime local object detection for IP cameras

    Project mention: Can I use ESP32-Cam to post MQTT with Tensorflow Lite Person Detection? | reddit.com/r/esp32 | 2022-01-20

    You probably don't need to run detection on every frame. Frigate, for example, typically analyzes only 5 FPS, which is plenty. Having a detector that flips so quickly between on and off isn't very useful either.
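
    Both points in the comment above (analyze only a few frames per second, and avoid a detector that flips rapidly between on and off) can be sketched in plain Python. This is not Frigate's code: the function and class names, the 30 FPS camera rate, and the `hold` threshold are assumptions made for illustration.

```python
from dataclasses import dataclass


def sample_frames(frame_indices, camera_fps=30, detect_fps=5):
    """Yield only the frame indices that should be sent to the detector."""
    step = max(1, round(camera_fps / detect_fps))
    for index in frame_indices:
        if index % step == 0:
            yield index


@dataclass
class DebouncedDetector:
    """Change the reported state only after `hold` consecutive
    detector results that disagree with the current state."""
    hold: int = 3
    state: bool = False
    _streak: int = 0

    def update(self, detected: bool) -> bool:
        if detected == self.state:
            self._streak = 0          # result agrees with state: reset
        else:
            self._streak += 1         # count consecutive disagreements
            if self._streak >= self.hold:
                self.state = detected
                self._streak = 0
        return self.state


# One second of 30 FPS video reduced to 5 analyzed frames.
sampled = list(sample_frames(range(30)))

# A noisy raw detector output and its debounced version.
raw = [False, True, False, True, True, True, False, True]
detector = DebouncedDetector(hold=3)
stable = [detector.update(r) for r in raw]
```

    With these assumed values, single-frame blips in `raw` never reach `stable`; the reported state turns on only after three positive results in a row.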

  • GitHub repo BentoML

    Model Serving Made Easy

    Project mention: How to Build a Machine Learning Demo in 2022 | dev.to | 2022-01-16

    Using a general-purpose framework such as FastAPI involves writing a lot of boilerplate code just to get your API endpoint up and running. If deploying a model for a demo is the only thing you are interested in and you do not mind losing some flexibility, you might want to use a specialized serving framework instead. One example is BentoML, which will allow you to get an optimized serving endpoint for your model up and running much faster and with less overhead than a generic web framework. Framework-specific serving solutions such as Tensorflow Serving and TorchServe typically offer optimized performance but can only be used to serve models trained using Tensorflow or PyTorch, respectively.

  • GitHub repo polyaxon

    Machine Learning Management & Orchestration Platform (Monorepo for Polyaxon's MLOps Tools)

    Project mention: [D] Productionalizing machine learning pipelines for small teams | reddit.com/r/MachineLearning | 2021-08-08

    For running experiments, http://polyaxon.com/ is a really good free open-source package that has lots of nice integrations so you can quickly run experiments in k8s but it might be overkill in some cases.

  • GitHub repo clearml

    ClearML - Auto-Magical CI/CD to streamline your ML workflow. Experiment Manager, MLOps and Data-Management

    Project mention: [D] Drop your best open source Deep learning related Project | reddit.com/r/MachineLearning | 2021-12-30

    Hi there. ClearML is our open-source solution which is part of the PyTorch ecosystem. We would really appreciate it if you read our README and starred us if you like what you see!

  • GitHub repo Automagica

    AI-powered Smart Robotic Process Automation 🤖

    Project mention: Automagica VS ClointFusion - a user suggested alternative | libhunt.com/r/automagica | 2021-09-24

  • GitHub repo thinc

    🔮 A refreshing functional take on deep learning, compatible with your favorite libraries

    Project mention: good examples of functional-like python code that one can study? | reddit.com/r/functionalprogramming | 2021-06-29

    thinc: defining neural nets in a functional way. jax: a new deep learning framework that puts emphasis on functions rather than tensors. I've tested it for a couple of applications and it's really cool; you can write code like you'd write math expressions in papers, using numpy. That speeds up development significantly and makes code much more readable.

  • GitHub repo sunfish

    Sunfish: a Python Chess Engine in 111 lines of code

    Project mention: The Kilobyte's Gambit: Can you beat 1024 bytes of JavaScript [at chess]? | news.ycombinator.com | 2021-03-07

    Incomprehensible scheiße code. I looked around and I like this one because it has a "meta-level" definition of movements and a little bit of strategy. You could implement context-free chess games with varying rules for us congenitally lazy and dull-witted. https://github.com/thomasahle/sunfish/blob/master/sunfish.py

  • GitHub repo eiten

    Statistical and Algorithmic Investing Strategies for Everyone

    Project mention: Has anyone worked on problems involving "portfolio optimization theory"? | reddit.com/r/datascience | 2021-07-29

  • GitHub repo pytorch-forecasting

    Time series forecasting with PyTorch

    Project mention: When to go for an 'easy' time-series model vs. using a complex deep learning model (when having experience with the latter) | reddit.com/r/datascience | 2021-11-29

    I'm a data trainee at this organisation. I wrote my master's thesis about using an event clustering mechanism to enrich an existing dataset and improve short-term demand predictions, using PyTorch Forecasting (its temporal fusion transformer component) and LightGBM, and comparing the models with and without the event feature, so 4 runs in total.
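
    The "4 runs in total" follows from crossing two models with two feature settings. A minimal sketch of that experiment grid, where the string labels and dictionary keys are hypothetical placeholders rather than anything from the thesis:

```python
from itertools import product

# Hypothetical labels for the two models and the two feature settings.
models = ["temporal_fusion_transformer", "lightgbm"]
event_feature = [False, True]

# Every model is trained once without and once with the event feature.
runs = [
    {"model": model, "use_event_feature": flag}
    for model, flag in product(models, event_feature)
]
```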

  • GitHub repo AIF360

    A comprehensive set of fairness metrics for datasets and machine learning models, explanations for these metrics, and algorithms to mitigate bias in datasets and models.

    Project mention: Building a Responsible AI Solution - Principles into Practice | dev.to | 2022-01-10

    Besides the existing monitoring solution mentioned in the section above, we also took inspiration from continuous integration and continuous delivery (CI/CD) testing tools like Jenkins and CircleCI on the engineering front, and from existing fairness libraries like Microsoft's Fairlearn and IBM's AI Fairness 360 on the machine learning side.

NOTE: The open source projects on this list are ordered by number of GitHub stars. The number of mentions indicates repo mentions in the last 12 months or since we started tracking (Dec 2020). The latest post mention was on 2022-01-21.

Index

What are some of the best open-source AI projects in Python? This list will help you:

 #  Project                    Stars
 1  spaCy                      22,176
 2  pytorch-lightning          17,009
 3  MLflow                     11,064
 4  bert-as-service             9,823
 5  dvc                         9,134
 6  mycroft-core                5,565
 7  cookiecutter-data-science   5,334
 8  metaflow                    5,178
 9  RobustVideoMatting          5,113
10  snorkel                     4,996
11  Activeloop Hub              4,200
12  autoscraper                 4,184
13  haystack                    3,750
14  frigate                     3,750
15  BentoML                     3,132
16  polyaxon                    2,980
17  clearml                     2,932
18  Automagica                  2,617
19  thinc                       2,437
20  sunfish                     2,269
21  eiten                       2,195
22  pytorch-forecasting         1,665
23  AIF360                      1,612