ML

Top 23 ML Open-Source Projects

  • tensorflow

    An Open Source Machine Learning Framework for Everyone

  • Project mention: Side Quest Devblog #1: These Fakes are getting Deep | dev.to | 2024-04-29

    # L2-normalize the encoding tensors
    image_encoding = tf.math.l2_normalize(image_encoding, axis=1)
    audio_encoding = tf.math.l2_normalize(audio_encoding, axis=1)
    # Find euclidean distance between image_encoding and audio_encoding
    # Essentially trying to detect if the face is saying the audio
    # Will return nan without the 1e-12 offset due to https://github.com/tensorflow/tensorflow/issues/12071
    d = tf.norm((image_encoding - audio_encoding) + 1e-12, ord='euclidean', axis=1, keepdims=True)
    discriminator = keras.Model(inputs=[image_input, audio_input], outputs=[d], name="discriminator")
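The comment in the snippet above attributes the NaN to the linked TensorFlow issue. One likely explanation is that the gradient of the Euclidean norm, v / ||v||, becomes 0/0 when the two encodings are identical. The following is a minimal sketch of that arithmetic in plain Python rather than TensorFlow; the helper names `euclidean_norm` and `norm_gradient` are hypothetical and only illustrate the idea, they are not part of any library.

```python
import math

EPS = 1e-12  # same offset used in the snippet above


def euclidean_norm(v):
    """Return the Euclidean (L2) norm of a sequence of floats."""
    return math.sqrt(sum(x * x for x in v))


def norm_gradient(v):
    """Analytic gradient of the norm: d||v|| / dv_i = v_i / ||v||.

    At v == 0 this is 0/0. Plain Python raises ZeroDivisionError for
    float division by zero, so NaN is returned explicitly here to mirror
    IEEE 754 array arithmetic (an assumption made for illustration).
    """
    n = euclidean_norm(v)
    if n == 0.0:
        return [math.nan for _ in v]
    return [x / n for x in v]


# Difference vector when both encodings are identical.
identical = [0.0, 0.0, 0.0]
# The same vector with the small offset added to every component.
shifted = [x + EPS for x in identical]

grad_at_zero = norm_gradient(identical)  # every component is NaN
grad_shifted = norm_gradient(shifted)    # every component is finite

print(grad_at_zero)
print(grad_shifted)
```

With the offset, each gradient component is finite, so a training step on identical encodings would not be poisoned by NaN values. The offset slightly biases the distance, which is usually negligible at this magnitude.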

  • ML-For-Beginners

    12 weeks, 26 lessons, 52 quizzes, classic Machine Learning for all

  • Project mention: Good coding groups for black women? | news.ycombinator.com | 2024-01-13

    - https://github.com/microsoft/ML-For-Beginners

    Also check out this list Pitt puts out every year:

  • yolov5

    YOLOv5 🚀 in PyTorch > ONNX > CoreML > TFLite

  • Project mention: Classifying Dog and Cat Breeds the Easy Way with YoLoV5 | dev.to | 2024-04-15

    Ref:
    https://www.youtube.com/watch?v=0GwnxFNfZhM
    https://github.com/ultralytics/yolov5
    https://dev.to/gfstealer666/kaaraich-yolo-alkrithuemainkaartrwcchcchabwatthu-object-detection-3lef
    https://www.kaggle.com/datasets/devdgohil/the-oxfordiiit-pet-dataset/data

  • netron

    Visualizer for neural network, deep learning and machine learning models

  • Project mention: Your 14-Day Free Trial Ain't Gonna Cut It | news.ycombinator.com | 2024-05-06

    They're data-dependence graphs for a neural-network scheduling problem. Like this but way bigger to start with and then lowered to more detailed representations several times: https://netron.app/?url=https://github.com/onnx/models/raw/m... My home-grown layout engine can handle the 12k nodes for llama2 in its highest-level form in 20s or so, but it's not the most featureful, and they only get bigger from there. So I always have an eye out for potential tools.

  • handson-ml

    ⛔️ DEPRECATED – See https://github.com/ageron/handson-ml3 instead.

  • MindsDB

    The platform for customizing AI from enterprise data

  • Project mention: What’s the Difference Between Fine-tuning, Retraining, and RAG? | dev.to | 2024-04-08

    Check us out on GitHub.

  • MLflow

    Open source platform for the machine learning lifecycle

  • Project mention: Observations on MLOps–A Fragmented Mosaic of Mismatched Expectations | dev.to | 2024-04-26

    How can this be? The current state of practice in AI/ML work requires adaptivity, which is uncommon in classical computational fields. There are myriad tools that capture the work across the many instances of the AI/ML lifecycle. The idea that any one tool could sufficiently capture the dynamic work is unrealistic. Take, for example, an experiment tracking tool like W&B or MLFlow; some form of experiment tracking is necessary in typical model training lifecycles. Such a tool requires some notion of a dataset. However, a tool focusing on experiment tracking is orthogonal to the needs of analyzing model performance at the data sample level, which is critical to understanding the failure modes of models. The way one does this depends on the type of data and the AI/ML task at hand. In other words, MLOps is inherently an intricate mosaic, as the capabilities and best practices of AI/ML work evolve.

  • StableLM

    StableLM: Stability AI Language Models

  • Project mention: The Era of 1-bit LLMs: ternary parameters for cost-effective computing | news.ycombinator.com | 2024-02-28

    https://github.com/Stability-AI/StableLM?tab=readme-ov-file#...

  • best-of-ml-python

    🏆 A ranked list of awesome machine learning Python libraries. Updated weekly.

  • kubeflow

    Machine Learning Toolkit for Kubernetes

  • awesome-mlops

    A curated list of references for MLOps

  • ludwig

    Low-code framework for building custom LLMs, neural networks, and other AI models

  • Project mention: Show HN: Toolkit for LLM Fine-Tuning, Ablating and Testing | news.ycombinator.com | 2024-04-07

    This is a great project, a little bit similar to https://github.com/ludwig-ai/ludwig, but it includes testing capabilities and ablation.

    Questions regarding the LLM testing aspect: How extensive is the test coverage for LLM use cases, and what is the current state of this project area? Do you offer any guarantees, or is it considered an open-ended problem?

    Would love to see more progress toward this area!

  • dopamine

    Dopamine is a research framework for fast prototyping of reinforcement learning algorithms.

  • ML.NET

    ML.NET is an open source and cross-platform machine learning framework for .NET.

  • pycaret

    An open-source, low-code machine learning library in Python

  • MNN

    MNN is a blazing fast, lightweight deep learning framework, battle-tested by business-critical use cases in Alibaba

  • Project mention: [D][R] Deploying deep models on memory constrained devices | /r/MachineLearning | 2023-10-03

    However, I am looking at this subject through the problem of training/fine-tuning deep models on edge devices, which is becoming increasingly feasible. I am looking at tflite, Alibaba's MNN, mit-han-lab's tinyengine, etc.

  • deeplake

    Database for AI. Store Vectors, Images, Texts, Videos, etc. Use with LLMs/LangChain. Store, query, version, & visualize any AI data. Stream data in real-time to PyTorch/TensorFlow. https://activeloop.ai

  • Project mention: FLaNK AI Weekly 25 March 2025 | dev.to | 2024-03-25
  • metaflow

    :rocket: Build and manage real-life ML, AI, and data science projects with ease!

  • Project mention: FLaNK Stack 05 Feb 2024 | dev.to | 2024-02-05
  • unstructured

    Open source libraries and APIs to build custom preprocessing pipelines for labeling, training, or production machine learning pipelines.

  • Project mention: LlamaCloud and LlamaParse | news.ycombinator.com | 2024-02-20

    Be careful with unstructured:

    https://github.com/Unstructured-IO/unstructured/blob/d11c70c...

    from: https://github.com/open-webui/open-webui/issues/687

  • CoreML-Models

    Largest list of models for Core ML (for iOS 11+)

  • serving

    A flexible, high-performance serving system for machine learning models

  • Project mention: Llama.cpp: Full CUDA GPU Acceleration | news.ycombinator.com | 2023-06-12

    Yet another TEDIOUS BATTLE: Python vs. C++/C stack.

    This project gained popularity due to the HIGH DEMAND for running large models with 1B+ parameters, like `llama`. Python dominates the interface and training ecosystem, but prior to llama.cpp, non-ML professionals showed little interest in a fast C++ interface library. While existing solutions like tensorflow-serving [1] in C++ were sufficiently fast with GPU support, llama.cpp took the initiative to optimize for CPU and trim unnecessary code, essentially code-golfing and sacrificing some algorithm correctness for improved performance, which isn't favored by "ML research".

    NOTE: In my opinion, a true pioneer was DarkNet, which implemented the YOLO model series and significantly outperformed others [2]. It used basically the same trick as llama.cpp.

    [1] https://github.com/tensorflow/serving

  • llm

    An ecosystem of Rust libraries for working with large language models

  • Project mention: Open-sourcing a simple automation/agent workflow builder | /r/ChatGPTPro | 2023-10-07

    We're open-sourcing a project that lets you build simple automations/agent workflows that use LLMs for different tasks. Kinda like Zapier or IFTTT but focused on using natural language to accomplish your tasks. It's super early but we'd love to start getting feedback to steer it in the right direction. It currently supports OpenAI and local models through llm.

  • oneflow

    OneFlow is a deep learning framework designed to be user-friendly, scalable and efficient.

NOTE: The open source projects on this list are ordered by number of GitHub stars. The number of mentions indicates repo mentions in the last 12 months or since we started tracking (Dec 2020).

ML related posts

  • Show HN: LLM-powered NPCs running on your hardware

    4 projects | news.ycombinator.com | 30 Apr 2024
  • Observations on MLOps–A Fragmented Mosaic of Mismatched Expectations

    1 project | dev.to | 26 Apr 2024
  • Machine Learning with PHP

    3 projects | dev.to | 22 Apr 2024
  • Show HN: Open-source Google Docs for audio transcriptions (Whisper)

    2 projects | news.ycombinator.com | 17 Apr 2024
  • What’s the Difference Between Fine-tuning, Retraining, and RAG?

    1 project | dev.to | 8 Apr 2024
  • W3C discussions of impact of ML models on the web

    2 projects | news.ycombinator.com | 4 Apr 2024
  • Why do tree-based models still outperform deep learning on tabular data? (2022)

    3 projects | news.ycombinator.com | 5 Mar 2024

Index

What are some of the best open-source ML projects? This list will help you:

 #  Project            Stars
 1  tensorflow         182,693
 2  ML-For-Beginners    67,111
 3  yolov5              47,202
 4  netron              26,174
 5  handson-ml          25,094
 6  MindsDB             21,354
 7  MLflow              17,335
 8  StableLM            15,851
 9  best-of-ml-python   15,633
10  kubeflow            13,700
11  awesome-mlops       11,769
12  ludwig              10,845
13  dopamine            10,378
14  ML.NET               8,855
15  pycaret              8,450
16  MNN                  8,325
17  deeplake             7,729
18  metaflow             7,630
19  unstructured         6,515
20  CoreML-Models        6,241
21  serving              6,085
22  llm                  5,931
23  oneflow              5,731
