Machine Learning

Top 23 Machine Learning Open-Source Projects

  • tensorflow

    An Open Source Machine Learning Framework for Everyone

    Project mention: 🔥🚀 Top 10 Open-Source Must-Have Tools for Crafting Your Own Chatbot 🤖💬 | dev.to | 2023-11-06

    To get up to speed with TensorFlow, check their quickstart. Support TensorFlow on GitHub ⭐

  • transformers

    🤗 Transformers: State-of-the-art Machine Learning for Pytorch, TensorFlow, and JAX.

    Project mention: Fine-Tuned Llama2 Inserting Unnecessary Delimiters | /r/LocalLLaMA | 2023-11-04

    While it's tough to say something specific since we don't know exactly how you trained it, the prompt format of your training input, or how you are performing inference, one thing I found when I faced similar issues is that the model does not know when to stop. Part of the reason is that the fast Llama tokenizer does not add the end-of-sequence (EOS) token when encoding your inputs. So you can either add that token explicitly to the input text of each sample or use the slow Llama tokenizer. Check the llama_recipes GitHub repo for the exact issue: https://github.com/huggingface/transformers/issues/22794. The other thing you might want to check is whether the model.generate output also contains the exact input tokens. That is the expected behavior of some models (like llama2 or mpt), for example when you use vanilla transformers for inference.
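    The two checks described in this reply can be sketched without any model at all. The snippet below is a minimal illustration on plain lists of token ids, not the transformers API: the helper names `ensure_eos` and `strip_prompt`, the id `2` standing in for the end-of-sequence token, and all sample ids are assumptions made up for the example.

```python
from typing import List


def ensure_eos(token_ids: List[int], eos_id: int) -> List[int]:
    """Return a copy of token_ids that ends with eos_id (appended only if missing)."""
    if token_ids and token_ids[-1] == eos_id:
        return list(token_ids)
    return list(token_ids) + [eos_id]


def strip_prompt(output_ids: List[int], prompt_len: int) -> List[int]:
    """Drop the echoed prompt tokens from the front of a generated sequence."""
    return list(output_ids[prompt_len:])


# Hypothetical ids for illustration only; 2 stands in for the EOS id.
EOS_ID = 2

# Training side: make sure every encoded sample ends with EOS.
train_ids = ensure_eos([1, 306, 4966], EOS_ID)

# Inference side: a decoder-only model may return prompt + completion.
prompt_ids = [1, 306]
generated = prompt_ids + [4966, EOS_ID]
completion = strip_prompt(generated, len(prompt_ids))
```

    With a real tokenizer, the same idea would apply to the encoded ids of each training sample and to the sequence returned at inference time; the exact calls depend on the tokenizer and model in use.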

  • Pytorch

    Tensors and Dynamic neural networks in Python with strong GPU acceleration

    Project mention: Diving into the Deep: My Inaugural PyTorch Contribution Adventure! | dev.to | 2023-11-24
  • cs-video-courses

    List of Computer Science courses with video lectures.

    Project mention: Need advice | /r/PAK | 2023-07-12

    Computer science is a very vast field, but the fundamentals remain the same: learn the basics, data structures, and the concepts of object-oriented programming.

  • ML-For-Beginners

    12 weeks, 26 lessons, 52 quizzes, classic Machine Learning for all

    Project mention: FLaNK Stack Weekly for 20 Nov 2023 | dev.to | 2023-11-20
  • Keras

    Deep Learning for humans

    Project mention: Keras 3.0 | news.ycombinator.com | 2023-11-28

    All breaking changes are listed here: https://github.com/keras-team/keras/issues/18467

    You can use this migration guide to identify and fix each of these issues (and further, making your code run on JAX or PyTorch): https://keras.io/guides/migrating_to_keras_3/
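    As a rough sketch of what running on JAX or PyTorch involves: Keras 3 reads the `KERAS_BACKEND` environment variable when it is first imported. The helper name `select_keras_backend` below is made up for illustration, and the import is left commented out because it assumes Keras 3 and the chosen backend are installed.

```python
import os

# Backend names accepted here; treat this set as an assumption of the sketch.
SUPPORTED_BACKENDS = {"tensorflow", "jax", "torch"}


def select_keras_backend(name: str) -> str:
    """Set KERAS_BACKEND for this process; must run before `import keras`."""
    if name not in SUPPORTED_BACKENDS:
        raise ValueError(f"unsupported backend: {name!r}")
    os.environ["KERAS_BACKEND"] = name
    return name


chosen = select_keras_backend("jax")

# import keras  # uncomment once Keras 3 and the chosen backend are installed
```

    The variable has to be set before the first import; changing it afterwards does not switch the backend of an already imported Keras.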

  • scikit-learn

    scikit-learn: machine learning in Python

    Project mention: Contraction Clustering (RASTER): A fast clustering algorithm | news.ycombinator.com | 2023-11-27

  • tesseract-ocr

    Tesseract Open Source OCR Engine (main repository)

    Project mention: Marker: Convert PDF to Markdown quickly with high accuracy | news.ycombinator.com | 2023-11-30

    Last update was pretty recent, and the git mentions tesseract 5 as a dep. so it's likely moved on a bit from when you last tried it:

    https://github.com/tesseract-ocr/tesseract/releases

    I suppose it depends on your use-case. For personal tasks like this it should be more than sufficient, and won't need user details/cc or whatever to use it.

  • Face Recognition

    The world's simplest facial recognition api for Python and the command line

    Project mention: GitHub - ageitgey/face_recognition: The world's simplest facial recognition api for Python and the command line | /r/Python | 2023-11-05
  • awesome-scalability

    The Patterns of Scalable, Reliable, and Performant Large-Scale Systems

    Project mention: Ask HN: What are some of the best blog posts by software engineers? | news.ycombinator.com | 2023-04-10
  • faceswap

    Deepfakes Software For All

    Project mention: A beginner guide into deepfakes | dev.to | 2023-06-01

    Head over to deepfakes/faceswap, install everything it asks for, and then open a terminal within the faceswap environment from Anaconda.

  • julia

    The Julia Programming Language

    Project mention: Rust std:fs slower than Python | news.ycombinator.com | 2023-11-29

    https://github.com/JuliaLang/julia/issues/51086#issuecomment...

    So while this "fixes" the issue, it'll introduce a confusing time delay between you freeing the memory and you observing that in `htop`.

    But according to https://jemalloc.net/jemalloc.3.html you can set `opt.muzzy_decay_ms = 0` to remove the delay.

    Still, the musl author has some reservations against making `jemalloc` the default:

    https://www.openwall.com/lists/musl/2018/04/23/2

    > It's got serious bloat problems, problems with undermining ASLR, and is optimized pretty much only for being as fast as possible without caring how much memory you use.

    With the above-mentioned tunables, this should be mitigated to some extent, but the general "theme" (focusing on e.g. performance vs memory usage) will likely still mean "it's a tradeoff" or "it's no tradeoff, but only if you set tunables to what you need".
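    One way to apply the tunable mentioned above is jemalloc's `MALLOC_CONF` environment variable, which takes comma-separated `name:value` options. The sketch below only builds the environment for a child process; the helper name `with_jemalloc_decay` and the binary path in the comment are hypothetical, and the setting has an effect only if the launched program actually uses a jemalloc build that reads this variable (some builds use a prefixed name instead).

```python
import os
from typing import Dict, Optional


def with_jemalloc_decay(
    base_env: Optional[Dict[str, str]] = None, muzzy_decay_ms: int = 0
) -> Dict[str, str]:
    """Return a copy of the environment with muzzy_decay_ms added to MALLOC_CONF."""
    env = dict(os.environ if base_env is None else base_env)
    option = f"muzzy_decay_ms:{muzzy_decay_ms}"
    existing = env.get("MALLOC_CONF")
    # Keep any options that are already set and append ours.
    env["MALLOC_CONF"] = f"{existing},{option}" if existing else option
    return env


env = with_jemalloc_decay({"PATH": "/usr/bin"})

# Hypothetical launch of a jemalloc-linked program:
# subprocess.run(["./my_program"], env=env, check=True)
```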

  • yolov5

    YOLOv5 🚀 in PyTorch > ONNX > CoreML > TFLite

    Project mention: How would i go about having YOLO v5 return me a list from left to right of all detected objects in an image? | /r/computervision | 2023-11-13
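    A minimal sketch of one possible answer to that question: sort the detections by the left edge of each bounding box. The `Detection` tuple and the sample boxes are invented for illustration and are not YOLOv5's own output format, which would first need converting into (label, xmin, ymin, xmax, ymax) records.

```python
from typing import List, NamedTuple


class Detection(NamedTuple):
    label: str
    xmin: float
    ymin: float
    xmax: float
    ymax: float


def left_to_right(detections: List[Detection]) -> List[str]:
    """Return labels ordered by the left edge (xmin) of each bounding box."""
    return [d.label for d in sorted(detections, key=lambda d: d.xmin)]


# Hypothetical detections, not real model output.
boxes = [
    Detection("dog", 220.0, 40.0, 380.0, 300.0),
    Detection("cat", 15.0, 60.0, 130.0, 280.0),
    Detection("ball", 140.0, 200.0, 190.0, 250.0),
]
ordered = left_to_right(boxes)
```

    Sorting by the box center instead of `xmin` is an equally reasonable choice when boxes overlap heavily.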
  • TensorFlow-Examples

    TensorFlow Tutorial and Examples for Beginners (support TF v1 & v2)

  • 100-Days-Of-ML-Code

    100 Days of ML Coding

  • nn

    🧑‍🏫 60 Implementations/tutorials of deep learning papers with side-by-side notes 📝; including transformers (original, xl, switch, feedback, vit, ...), optimizers (adam, adabelief, sophia, ...), gans(cyclegan, stylegan2, ...), 🎮 reinforcement learning (ppo, dqn), capsnet, distillation, ... 🧠

    Project mention: Can't remember name of website that has explanations side-by-side with code | /r/learnmachinelearning | 2023-03-28

    Hey are you talking about https://nn.labml.ai/ ?

  • Made-With-ML

    Learn how to design, develop, deploy and iterate on production-grade ML applications.

    Project mention: [D] How do you keep up to date on Machine Learning? | /r/learnmachinelearning | 2023-08-13

    Made With ML

  • Caffe

    Caffe: a fast open framework for deep learning.

    Project mention: List of AI-Models | /r/GPT_do_dah | 2023-05-16

  • gym

    A toolkit for developing and comparing reinforcement learning algorithms.

    Project mention: OpenAI Acquires Global Illumination | news.ycombinator.com | 2023-08-16

    A co-founder announced they disbanded their robots team a couple years ago: https://venturebeat.com/business/openai-disbands-its-robotic...

    That was the same time they deprecated OpenAI Gym: https://github.com/openai/gym

  • Tesseract.js

    Pure Javascript OCR for more than 100 Languages 📖🎉🖥

    Project mention: I am out of the loop. Is Next.js "the future" and something I should consider adding to my knowledge pool? | /r/webdev | 2023-07-05

    What do you have against tesseract.js?

  • google-research

    Google Research

    Project mention: Translate to and from 400+ languages locally with MADLAD-400 | /r/LocalLLaMA | 2023-11-10

    Google released T5X checkpoints for MADLAD-400 a couple of months ago, but nobody could figure out how to run them. Turns out the vocabulary was wrong, but they uploaded the correct one last week.

  • PhotoPrism

    AI-Powered Photos App for the Decentralized Web 🌈💎✨

    Project mention: New Release 231128-f48ff16ef ⚙️🌈 | /r/photoprism | 2023-11-30
  • DeepSpeed

    DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.

    Project mention: DeepSpeed-FastGen: High-Throughput for LLMs via MII and DeepSpeed-Inference | news.ycombinator.com | 2023-11-04

NOTE: The open source projects on this list are ordered by number of GitHub stars. The number of mentions indicates repo mentions in the last 12 months or since we started tracking (Dec 2020). The latest post mention was on 2023-11-30.

Index

What are some of the best open-source Machine Learning projects? This list will help you:

 #  Project               Stars
 1  tensorflow            179,107
 2  transformers          116,187
 3  Pytorch                72,946
 4  cs-video-courses       61,879
 5  ML-For-Beginners       60,836
 6  Keras                  59,873
 7  scikit-learn           56,529
 8  tesseract-ocr          54,891
 9  Face Recognition       50,327
10  awesome-scalability    49,415
11  faceswap               47,683
12  julia                  43,572
13  yolov5                 43,570
14  TensorFlow-Examples    42,993
15  100-Days-Of-ML-Code    42,236
16  nn                     39,186
17  Made-With-ML           34,592
18  Caffe                  33,667
19  gym                    33,161
20  Tesseract.js           32,129
21  google-research        31,504
22  PhotoPrism             30,109
23  DeepSpeed              29,742