Top 23 Machine Learning Open-Source Projects

tensorflow

221 182,456 10.0 C++

An Open Source Machine Learning Framework for Everyone

Project mention: TensorFlow-metal on Apple Mac is junk for training | news.ycombinator.com | 2024-01-16

transformers

175 125,021 10.0 Python

🤗 Transformers: State-of-the-art Machine Learning for Pytorch, TensorFlow, and JAX.

Project mention: Maxtext: A simple, performant and scalable Jax LLM | news.ycombinator.com | 2024-04-23

Is t5x an encoder/decoder architecture?
Some more general options:
- The Flax ecosystem (https://github.com/google/flax?tab=readme-ov-file) or dm-haiku (https://github.com/google-deepmind/dm-haiku), which were some of the best-developed communities in the JAX AI field.
- Perhaps the "trax" repo? https://github.com/google/trax
- Some HF examples: https://github.com/huggingface/transformers/tree/main/exampl...
- Sadly, it seems much of the work is proprietary these days, but one example could be Grok-1, if you customize the details: https://github.com/xai-org/grok-1/blob/main/run.py

Pytorch

336 77,783 10.0 Python

Tensors and Dynamic neural networks in Python with strong GPU acceleration

Project mention: My Favorite DevTools to Build AI/ML Applications! | dev.to | 2024-04-23

TensorFlow, developed by Google, and PyTorch, developed by Facebook, are two of the most popular frameworks for building and training complex machine learning models. TensorFlow is known for its flexibility and robust scalability, making it suitable for both research prototypes and production deployments. PyTorch is praised for its ease of use, simplicity, and dynamic computational graph that allows for more intuitive coding of complex AI models. Both frameworks support a wide range of AI models, from simple linear regression to complex deep neural networks.
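To make the "dynamic computational graph" idea concrete, here is a minimal pure-Python sketch of define-by-run automatic differentiation. It is not PyTorch's implementation and uses none of its API; the `Value` class and the `model` function are hypothetical names invented for illustration. The point it shows is that the graph is recorded while ordinary Python code runs, so an `if` statement can change the graph's shape from one call to the next.

```python
# Toy define-by-run autodiff. An illustration only, NOT how PyTorch is implemented.
class Value:
    """A scalar that records how it was computed so gradients can flow back."""

    def __init__(self, data, parents=()):
        self.data = float(data)
        self.grad = 0.0
        self._parents = parents
        self._backward_rule = None  # pushes this node's grad to its parents

    def __add__(self, other):
        other = other if isinstance(other, Value) else Value(other)
        out = Value(self.data + other.data, (self, other))

        def rule():
            self.grad += out.grad
            other.grad += out.grad

        out._backward_rule = rule
        return out

    def __mul__(self, other):
        other = other if isinstance(other, Value) else Value(other)
        out = Value(self.data * other.data, (self, other))

        def rule():
            self.grad += other.data * out.grad
            other.grad += self.data * out.grad

        out._backward_rule = rule
        return out

    def backward(self):
        # Topologically order the recorded graph, then apply the chain rule in reverse.
        order, seen = [], set()

        def visit(node):
            if id(node) not in seen:
                seen.add(id(node))
                for parent in node._parents:
                    visit(parent)
                order.append(node)

        visit(self)
        self.grad = 1.0
        for node in reversed(order):
            if node._backward_rule is not None:
                node._backward_rule()


def model(x):
    # Plain Python control flow decides which operations get recorded.
    y = x * x
    if y.data > 4.0:
        y = y * x  # this branch computes x**3
    else:
        y = y + x  # this branch computes x**2 + x
    return y


x_large = Value(3.0)
out_large = model(x_large)
out_large.backward()  # d(x**3)/dx at x=3

x_small = Value(1.0)
out_small = model(x_small)
out_small.backward()  # d(x**2 + x)/dx at x=1
```

The two calls to `model` build different graphs because the branch taken depends on the input value, which is the behavior the quoted post describes as more intuitive for complex models.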

Netdata

118 68,153 10.0 C

The open-source observability platform everyone needs

Project mention: A list of SaaS, PaaS and IaaS offerings that have free tiers of interest to devops and infradev | dev.to | 2024-02-05

netdata.cloud — Netdata is an open-source tool to collect real-time metrics. It's a growing product and can also be found on GitHub!

ML-For-Beginners

28 66,908 7.6 HTML

12 weeks, 26 lessons, 52 quizzes, classic Machine Learning for all

Project mention: Good coding groups for black women? | news.ycombinator.com | 2024-01-13

- https://github.com/microsoft/ML-For-Beginners
Also check out this list Pitt puts out every year:

cs-video-courses

58 64,788 7.3

List of Computer Science courses with video lectures.

Project mention: Need advice | /r/PAK | 2023-07-12

Computer science is a very vast field, but the fundamentals remain the same: learn the basics, data structures, and the concepts of object-oriented programming.

Keras

77 60,937 9.9 Python

Deep Learning for humans

Project mention: My Favorite DevTools to Build AI/ML Applications! | dev.to | 2024-04-23

As a beginner, I was looking for something simple and flexible for developing deep learning models and that is when I found Keras. Many AI/ML professionals appreciate Keras for its simplicity and efficiency in prototyping and developing deep learning models, making it a preferred choice, especially for beginners and for projects requiring rapid development.

scikit-learn

81 58,046 9.9 Python

scikit-learn: machine learning in Python

Project mention: AutoCodeRover resolves 22% of real-world GitHub in SWE-bench lite | news.ycombinator.com | 2024-04-09

Thank you for your interest. There are some interesting examples in the SWE-bench-lite benchmark which are resolved by AutoCodeRover:
- From sympy: https://github.com/sympy/sympy/issues/13643. AutoCodeRover's patch for it: https://github.com/nus-apr/auto-code-rover/blob/main/results...
- Another one from scikit-learn: https://github.com/scikit-learn/scikit-learn/issues/13070. AutoCodeRover's patch (https://github.com/nus-apr/auto-code-rover/blob/main/results...) modified a few lines below (compared to the developer patch) and wrote a different comment.
There are more examples in the results directory (https://github.com/nus-apr/auto-code-rover/tree/main/results).

tesseract-ocr

120 58,022 8.9 C++

Tesseract Open Source OCR Engine (main repository)

Project mention: one of the Codia AI Design technologies: OCR Technology | dev.to | 2024-02-14

You will also need to install the Tesseract OCR engine, which can be downloaded and installed from the following link: https://github.com/tesseract-ocr/tesseract
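Once the engine is installed, one way to check it from Python is to call the `tesseract` command-line tool directly. The sketch below assumes the binary is on `PATH`; the helper names `build_tesseract_command` and `run_ocr` and the file name `scan.png` are hypothetical. It relies only on the standard CLI form `tesseract <image> <outputbase> -l <lang>`, where the output base `stdout` sends the recognized text to standard output.

```python
import shutil
import subprocess


def build_tesseract_command(image_path, lang="eng"):
    """Build the CLI call; the output base 'stdout' prints text instead of writing a file."""
    return ["tesseract", str(image_path), "stdout", "-l", lang]


def run_ocr(image_path, lang="eng"):
    """Return the recognized text, or None if the tesseract binary is not installed."""
    if shutil.which("tesseract") is None:
        return None
    result = subprocess.run(
        build_tesseract_command(image_path, lang),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError if tesseract exits with an error
    )
    return result.stdout


# Hypothetical file name, shown only to illustrate the generated command.
command = build_tesseract_command("scan.png")
```

Calling `run_ocr("scan.png")` would then return the text Tesseract finds in that image, or `None` when the engine has not been installed yet.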

awesome-scalability

6 53,036 6.3

The Patterns of Scalable, Reliable, and Performant Large-Scale Systems

Face Recognition

34 51,755 0.0 Python

The world's simplest facial recognition api for Python and the command line

Project mention: Security Image Recognition | /r/computervision | 2023-12-10

Camera connected to a PI? Something like this could run locally: https://github.com/ageitgey/face_recognition

faceswap

10 49,178 8.0 Python

Deepfakes Software For All

Project mention: faceswap VS facefusion - a user suggested alternative | libhunt.com/r/faceswap | 2024-01-30

nn

26 48,004 7.7 Jupyter Notebook

🧑‍🏫 60 Implementations/tutorials of deep learning papers with side-by-side notes 📝; including transformers (original, xl, switch, feedback, vit, ...), optimizers (adam, adabelief, sophia, ...), gans(cyclegan, stylegan2, ...), 🎮 reinforcement learning (ppo, dqn), capsnet, distillation, ... 🧠

yolov5

129 46,921 8.8 Python

YOLOv5 🚀 in PyTorch > ONNX > CoreML > TFLite

Project mention: Easily classify dog and cat breeds with YOLOv5 | dev.to | 2024-04-15

Ref https://www.youtube.com/watch?v=0GwnxFNfZhM https://github.com/ultralytics/yolov5 https://dev.to/gfstealer666/kaaraich-yolo-alkrithuemainkaartrwcchcchabwatthu-object-detection-3lef https://www.kaggle.com/datasets/devdgohil/the-oxfordiiit-pet-dataset/data

julia

350 44,510 10.0 Julia

The Julia Programming Language

Project mention: Top Paying Programming Technologies 2024 | dev.to | 2024-03-06

34. Julia - $74,963

100-Days-Of-ML-Code

3 43,302 0.0

100 Days of ML Coding

Project mention: Top 10 GitHub Repositories Every Developer Should Bookmark in 2024 | dev.to | 2024-02-07

2) 100 Days of ML Code: Embark on a 100-day journey into the fascinating world of machine learning with this structured curriculum. Packed with bite-sized coding challenges and real-world projects, this repository will transform you from a coding novice to a confident ML enthusiast. (https://github.com/Avik-Jain/100-Days-Of-ML-Code)

TensorFlow-Examples

2 43,200 0.0 Jupyter Notebook

TensorFlow Tutorial and Examples for Beginners (support TF v1 & v2)

Open-Assistant

329 36,622 9.1 Python

OpenAssistant is a chat-based assistant that understands tasks, can interact with third-party systems, and retrieve information dynamically to do so.

Project mention: Best open source AI chatbot alternative? | /r/opensource | 2023-12-08

For open assistant, the code: https://github.com/LAION-AI/Open-Assistant/tree/main/inference

Made-With-ML

51 35,656 6.8 Jupyter Notebook

Learn how to design, develop, deploy and iterate on production-grade ML applications.

Project mention: [D] How do you keep up to date on Machine Learning? | /r/learnmachinelearning | 2023-08-13

Made With ML

Airflow

169 34,485 10.0 Python

Apache Airflow - A platform to programmatically author, schedule, and monitor workflows

Project mention: Building in Public: Leveraging Tublian's AI Copilot for My Open Source Contributions | dev.to | 2024-02-12

Contributing to Apache Airflow's open-source project immersed me in collaborative coding. Experienced maintainers rigorously reviewed my contributions, providing constructive feedback. This ongoing dialogue refined the codebase and honed my understanding of best practices.

gym

96 33,873 0.0 Python

A toolkit for developing and comparing reinforcement learning algorithms.

Project mention: OpenAI Acquires Global Illumination | news.ycombinator.com | 2023-08-16

A co-founder announced they disbanded their robots team a couple years ago: https://venturebeat.com/business/openai-disbands-its-robotic...
That was the same time they deprecated OpenAI Gym: https://github.com/openai/gym

Caffe

6 33,859 0.0 C++

Caffe: a fast open framework for deep learning.

Project mention: List of AI-Models | /r/GPT_do_dah | 2023-05-16

Tesseract.js

32 33,498 8.2 JavaScript

Pure Javascript OCR for more than 100 Languages 📖🎉🖥

Project mention: I am out of the loop. Is Next.js "the future" and something I should consider adding to my knowledge pool? | /r/webdev | 2023-07-05

What do you have against tesseract.js?

NOTE: The open-source projects on this list are ordered by number of GitHub stars. The number of mentions indicates repo mentions in the last 12 months or since we started tracking (Dec 2020).
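As a rough sketch of that ordering rule, the snippet below ranks four entries copied from this list by star count. It assumes the first figure under each project name is the mention count and the second is the star count; the `projects` and `ranked` names are illustrative only.

```python
# (name, mentions, stars) for a few entries from this list.
# Assumption: the first figure under each project is mentions, the second is stars.
projects = [
    ("Keras", 77, 60_937),
    ("tensorflow", 221, 182_456),
    ("Pytorch", 336, 77_783),
    ("transformers", 175, 125_021),
]

# Rank by stars, highest first; mentions play no part in the order.
ranked = sorted(projects, key=lambda project: project[2], reverse=True)

for rank, (name, mentions, stars) in enumerate(ranked, start=1):
    print(f"{rank}\t{name}\t{stars:,}")
```

Under that assumption, Pytorch has the most mentions of the four yet still ranks third, because only stars determine position.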

Machine Learning related posts

Building an Email Assistant Application with Burr
6 projects | dev.to | 26 Apr 2024
Voxel51 Is Hiring AI Researchers and Scientists — What the New Open Science Positions Mean
1 project | dev.to | 26 Apr 2024
Observations on MLOps–A Fragmented Mosaic of Mismatched Expectations
1 project | dev.to | 26 Apr 2024
Show HN: I made a ROS package for realtime semantic segmentation
1 project | news.ycombinator.com | 26 Apr 2024
The Nimble File Format by Meta
2 projects | news.ycombinator.com | 25 Apr 2024
Wouldn't it be cool to have a Supabase for SQLite?
3 projects | news.ycombinator.com | 25 Apr 2024
How to Estimate Depth from a Single Image
8 projects | dev.to | 25 Apr 2024

Index

What are some of the best open-source Machine Learning projects? This list will help you:

#	Project	Stars
1	tensorflow	182,456
2	transformers	125,021
3	Pytorch	77,783
4	Netdata	68,153
5	ML-For-Beginners	66,908
6	cs-video-courses	64,788
7	Keras	60,937
8	scikit-learn	58,046
9	tesseract-ocr	58,022
10	awesome-scalability	53,036
11	Face Recognition	51,755
12	faceswap	49,178
13	nn	48,004
14	yolov5	46,921
15	julia	44,510
16	100-Days-Of-ML-Code	43,302
17	TensorFlow-Examples	43,200
18	Open-Assistant	36,622
19	Made-With-ML	35,656
20	Airflow	34,485
21	gym	33,873
22	Caffe	33,859
23	Tesseract.js	33,498