[D] Looking for open source projects to contribute

This page summarizes the projects mentioned and recommended in the original post on /r/MachineLearning

  • Milvus

    A cloud-native vector database, storage for next generation AI applications

  • I am part of the vector database project Milvus. We welcome open-source contributors to work on Golang (the distributed database) and C++ (the ANN algorithms). https://github.com/milvus-io/milvus

  • bootcamp

    Dealing with all unstructured data, such as reverse image search, audio search, molecular search, video analysis, question and answer systems, NLP, etc. (by milvus-io)

  • For more beginner-friendly tasks associated with the Milvus vector database, you can contribute to the Bootcamp project (https://github.com/milvus-io/bootcamp), where we build many data-driven solutions using ML and the Milvus vector database, including reverse image search, recommender systems, etc.

  • Gorgonia

    Gorgonia is a library that helps facilitate machine learning in Go.

  • If you know Go, Gorgonia is a pure Go framework for doing deep learning and various other autograd related things. I'd see it as a bastard baby of PyTorch and TensorFlow. We're always looking for new contributors.

  • nn

    ๐Ÿง‘โ€๐Ÿซ 60 Implementations/tutorials of deep learning papers with side-by-side notes ๐Ÿ“; including transformers (original, xl, switch, feedback, vit, ...), optimizers (adam, adabelief, sophia, ...), gans(cyclegan, stylegan2, ...), ๐ŸŽฎ reinforcement learning (ppo, dqn), capsnet, distillation, ... ๐Ÿง 

  • Flux.jl

    Relax! Flux is the ML library that doesn't make you tensor

  • Hey! I highly suggest checking out: https://fluxml.ai ! There are so many impactful opportunities to contribute. Please ping me if you have any questions.

  • poutyne

    A simplified framework and utilities for PyTorch

  • Hi, I'm the author of Poutyne, a library that aims to simplify the use of PyTorch while keeping all its flexibility. Always looking for contributions. If you look at the issues on the GitHub repo, you'll find a few suggestions, but I'm always looking for other ideas to improve the library.

  • docarray

    Represent, send, store and search multimodal data

  • Hi, if you speak Python, check out https://github.com/jina-ai/docarray. It's a very new project and very easy to contribute to.

  • habitat-sim

    A flexible, high-performance 3D simulator for Embodied AI research.

  • There are plenty of them out there. I spend a lot of time contributing to open source projects like Habitat-Sim https://github.com/facebookresearch/habitat-sim and Habitat-Lab https://github.com/facebookresearch/habitat-lab, which have a ton of open issues and code maintenance work where we would welcome contributions.

  • habitat-lab

    A modular high-level library to train embodied AI agents across a variety of tasks and environments.

  • vosk-api

    Offline speech recognition API for Android, iOS, Raspberry Pi and servers with Python, Java, C# and Node

  • The Vosk speech recognition toolkit needs help as well. Check our GitHub: https://github.com/alphacep/vosk-api. We have a lot of ML tasks and simple programming tasks too.

  • kaggle-environments

  • imodels

    Interpretable ML package 🔍 for concise, transparent, and accurate predictive modeling (sklearn-compatible).

  • Our package imodels is expanding its sklearn-compatible set of interpretable models, and we are always looking for new contributors!

  • transformers

    🤗 Transformers: State-of-the-art Machine Learning for Pytorch, TensorFlow, and JAX.

  • HuggingFace's libraries are open source and everyone can contribute with features (and sorting issues). In particular, in the transformers library (https://github.com/huggingface/transformers), new architectures are welcome

  • dataqa

    Discontinued. Labelling platform for text using weak supervision.

  • Hey, I am the creator (and, as of today, the only contributor) of the open-source https://github.com/dataqa/dataqa, a Python library to explore and annotate documents. It uses weak supervision, is based on spaCy, and has a lot of opportunities to add more deep learning and ML functionality. I can guide you through it :-). This would be a great opportunity to be the first and lead contributor to an open-source library (outside the creator).

  • general

  • I created a dataset of GitHub projects.

NOTE: The number of mentions on this list counts mentions in common posts plus user-suggested alternatives. Hence, a higher number means a more popular project.
