Top 23 Machine Learning Open-Source Projects
-
Project mention: 🔥🚀 Top 10 Open-Source Must-Have Tools for Crafting Your Own Chatbot 🤖💬 | dev.to | 2023-11-06
To get up to speed with TensorFlow, check their quickstart. Support TensorFlow on GitHub ⭐
-
While it's tough to say anything specific, since we don't know exactly how you trained it, the prompt format of your training input, or how you are performing inference, one thing I found when I faced similar issues is that the model does not know when to stop. Part of that is because the fast llama tokenizer does not add the end-of-sequence (EOS) token when encoding your inputs. So you can either add that token explicitly to the input text of each sample or use the slow llama tokenizer. Check llama_recipes github repo for the exact issue https://github.com/huggingface/transformers/issues/22794. The other thing you might want to check is whether the model.generate output also contains the exact input tokens. That is the expected behavior of some models (like llama2 or mpt), for example when you use vanilla transformers for inference.
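The two checks described in the comment above can be sketched on plain lists of token ids. This is a minimal illustration and not the Transformers API: `EOS_ID`, `append_eos`, and `strip_prompt` are hypothetical names, and the token ids are made up.

```python
from typing import List

# Hypothetical end-of-sequence id, for illustration only; with a real
# tokenizer, read the value from the tokenizer's eos_token_id attribute.
EOS_ID = 2


def append_eos(token_ids: List[int], eos_id: int = EOS_ID) -> List[int]:
    """Return the ids with exactly one trailing end-of-sequence id."""
    if token_ids and token_ids[-1] == eos_id:
        return list(token_ids)
    return list(token_ids) + [eos_id]


def strip_prompt(output_ids: List[int], prompt_ids: List[int]) -> List[int]:
    """Drop the echoed prompt from a generated sequence, if it is present."""
    n = len(prompt_ids)
    if output_ids[:n] == prompt_ids:
        return list(output_ids[n:])
    return list(output_ids)


# Made-up ids: a three-token prompt and a generation that echoes it.
prompt = [1, 15, 27]
generated = [1, 15, 27, 88, 91, 2]

print(append_eos(prompt))               # training sample with EOS appended
print(strip_prompt(generated, prompt))  # only the newly generated ids
```

Appending the id during preprocessing keeps the fix independent of which tokenizer implementation is in use, at the cost of one extra step per sample.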
-
Project mention: Diving into the Deep: My Inaugural PyTorch Contribution Adventure! | dev.to | 2023-11-24
-
Computer science is a very vast field, but the fundamentals remain the same: learn the basics, data structures, and the concepts of object-oriented programming.
-
-
All breaking changes are listed here: https://github.com/keras-team/keras/issues/18467
You can use this migration guide to identify and fix each of these issues (and, further, to make your code run on JAX or PyTorch): https://keras.io/guides/migrating_to_keras_3/
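As a rough sketch of the "run on JAX or PyTorch" part: Keras 3 reads the `KERAS_BACKEND` environment variable when it is first imported. The helper name `select_keras_backend` and the list of accepted names below are assumptions made for illustration; check the migration guide for what your installed version supports.

```python
import os

# Backend names assumed here for illustration; your installed Keras
# version may accept a different set.
SUPPORTED_BACKENDS = ("tensorflow", "jax", "torch")


def select_keras_backend(name: str) -> str:
    """Set KERAS_BACKEND; call this before the first `import keras`."""
    if name not in SUPPORTED_BACKENDS:
        raise ValueError(
            f"unknown backend {name!r}; expected one of {SUPPORTED_BACKENDS}"
        )
    os.environ["KERAS_BACKEND"] = name
    return name


select_keras_backend("jax")
# import keras  # would now pick up the JAX backend, if Keras 3 and JAX are installed
```

The variable has to be set before Keras is imported; changing it afterwards does not switch the backend of an already loaded session.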
-
Project mention: Contraction Clustering (RASTER): A fast clustering algorithm | news.ycombinator.com | 2023-11-27
-
Project mention: Marker: Convert PDF to Markdown quickly with high accuracy | news.ycombinator.com | 2023-11-30
Last update was pretty recent, and the git mentions tesseract 5 as a dep, so it's likely moved on a bit from when you last tried it:
https://github.com/tesseract-ocr/tesseract/releases
I suppose it depends on your use-case. For personal tasks like this it should be more than sufficient, and won't need user details/cc or whatever to use it.
-
Project mention: GitHub - ageitgey/face_recognition: The world's simplest facial recognition api for Python and the command line | /r/Python | 2023-11-05
-
Project mention: Ask HN: What are some of the best blog posts by software engineers? | news.ycombinator.com | 2023-04-10
-
Head over to deepfakes/faceswap and install everything it asks you to, then open a terminal within the faceswap env from Anaconda.
-
https://github.com/JuliaLang/julia/issues/51086#issuecomment...
So while this "fixes" the issue, it'll introduce a confusing time delay between you freeing the memory and you observing that in `htop`.
But according to https://jemalloc.net/jemalloc.3.html you can set `opt.muzzy_decay_ms = 0` to remove the delay.
Still, the musl author has some reservations against making `jemalloc` the default:
https://www.openwall.com/lists/musl/2018/04/23/2
> It's got serious bloat problems, problems with undermining ASLR, and is optimized pretty much only for being as fast as possible without caring how much memory you use.
With the above-mentioned tunables, this should be mitigated to some extent, but the general "theme" (focusing on e.g. performance vs memory usage) will likely still mean "it's a tradeoff" or "it's no tradeoff, but only if you set tunables to what you need".
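For reference, jemalloc tunables such as the one quoted above are usually passed as a comma-separated `key:value` string in the `MALLOC_CONF` environment variable. The sketch below only builds that string and an environment for a child process. The helper names are hypothetical, `dirty_decay_ms` is added here as a related option that the comment does not mention, and builds of jemalloc configured with a symbol prefix may read a differently named variable.

```python
import os
from typing import Dict, Mapping


def build_malloc_conf(options: Mapping[str, object]) -> str:
    """Format options as jemalloc's comma-separated `key:value` string."""
    return ",".join(f"{key}:{value}" for key, value in options.items())


def env_with_malloc_conf(options: Mapping[str, object]) -> Dict[str, str]:
    """Copy the current environment and add MALLOC_CONF to the copy."""
    env = dict(os.environ)
    env["MALLOC_CONF"] = build_malloc_conf(options)
    return env


# Ask jemalloc to return freed pages without the decay delay.
conf = build_malloc_conf({"dirty_decay_ms": 0, "muzzy_decay_ms": 0})
print(conf)
```

The dictionary returned by `env_with_malloc_conf` can be passed as the `env` argument of `subprocess.run` to start a jemalloc-linked program with these settings, leaving the parent process untouched.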
-
Project mention: How would i go about having YOLO v5 return me a list from left to right of all detected objects in an image? | /r/computervision | 2023-11-13
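One way to approach the question in that thread title, sketched under assumptions: once the detections are available as bounding boxes, sorting them by the left edge gives a left-to-right order. The `Detection` type, the `left_to_right` helper, and the sample boxes are all hypothetical and do not reflect the YOLOv5 output format.

```python
from typing import List, NamedTuple


class Detection(NamedTuple):
    """A hypothetical detection: class label plus pixel box corners."""
    label: str
    xmin: float
    ymin: float
    xmax: float
    ymax: float


def left_to_right(detections: List[Detection]) -> List[Detection]:
    """Return the detections ordered by the left edge of each box."""
    return sorted(detections, key=lambda d: d.xmin)


# Made-up boxes, listed in arbitrary order.
boxes = [
    Detection("dog", 310.0, 40.0, 420.0, 200.0),
    Detection("cat", 12.0, 55.0, 130.0, 210.0),
    Detection("ball", 150.0, 180.0, 190.0, 220.0),
]

print([d.label for d in left_to_right(boxes)])
```

Sorting on the box center, `(xmin + xmax) / 2`, is an alternative when wide and narrow objects overlap horizontally.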
-
-
-
nn
🧑‍🏫 60 Implementations/tutorials of deep learning papers with side-by-side notes 📝; including transformers (original, xl, switch, feedback, vit, ...), optimizers (adam, adabelief, sophia, ...), gans (cyclegan, stylegan2, ...), 🎮 reinforcement learning (ppo, dqn), capsnet, distillation, ... 🧠
Project mention: Can't remember name of website that has explanations side-by-side with code | /r/learnmachinelearning | 2023-03-28
Hey, are you talking about https://nn.labml.ai/ ?
-
Project mention: [D] How do you keep up to date on Machine Learning? | /r/learnmachinelearning | 2023-08-13
Made With ML
-
-
A co-founder announced they disbanded their robots team a couple years ago: https://venturebeat.com/business/openai-disbands-its-robotic...
That was around the same time they deprecated OpenAI Gym: https://github.com/openai/gym
-
Project mention: I am out of the loop. Is Next.js "the future" and something I should consider adding to my knowledge pool? | /r/webdev | 2023-07-05
What do you have against tesseract.js?
-
Project mention: Translate to and from 400+ languages locally with MADLAD-400 | /r/LocalLLaMA | 2023-11-10
Google released T5X checkpoints for MADLAD-400 a couple of months ago, but nobody could figure out how to run them. Turns out the vocabulary was wrong, but they uploaded the correct one last week.
-
-
DeepSpeed
DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.
Project mention: DeepSpeed-FastGen: High-Throughput for LLMs via MII and DeepSpeed-Inference | news.ycombinator.com | 2023-11-04
Machine Learning related posts
- Large Language Model Course
- Is anyone using self hosted LLM day to day and training it like a new employee
- We tried injecting hallucinogenics into vision models
- Q-Transformer
- Show HN: Taipy – Turns Data and AI algorithms into full web applications
- Ask HN: Which are some of the best open source courses for ML and Deep Learning?
- fast.ai Book in Rust - Chapter 2 - Part 1
Index
What are some of the best open-source Machine Learning projects? This list will help you:
# | Project | Stars |
---|---|---|
1 | tensorflow | 179,107 |
2 | transformers | 116,187 |
3 | Pytorch | 72,946 |
4 | cs-video-courses | 61,879 |
5 | ML-For-Beginners | 60,836 |
6 | Keras | 59,873 |
7 | scikit-learn | 56,529 |
8 | tesseract-ocr | 54,891 |
9 | Face Recognition | 50,327 |
10 | awesome-scalability | 49,415 |
11 | faceswap | 47,683 |
12 | julia | 43,572 |
13 | yolov5 | 43,570 |
14 | TensorFlow-Examples | 42,993 |
15 | 100-Days-Of-ML-Code | 42,236 |
16 | nn | 39,186 |
17 | Made-With-ML | 34,592 |
18 | Caffe | 33,667 |
19 | gym | 33,161 |
20 | Tesseract.js | 32,129 |
21 | google-research | 31,504 |
22 | PhotoPrism | 30,109 |
23 | DeepSpeed | 29,742 |