Top 23 AI Open-Source Projects

AutoGPT

180 161,096 10.0 JavaScript

AutoGPT is the vision of accessible AI for everyone, to use and to build on. Our mission is to provide the tools, so that you can focus on what matters.

Project mention: Accessible AI for Everyone | news.ycombinator.com | 2024-01-08

stable-diffusion-webui

2,808 129,299 9.9 Python

Stable Diffusion web UI

Project mention: Show HN: I made an app to use local AI as daily driver | news.ycombinator.com | 2024-02-27

* LLaVA model: I'll add more documentation. You are right, LLaVA cannot generate images. I don't have immediate plans for image generation, but check out these projects for local image generation.
- https://diffusionbee.com/
- https://github.com/comfyanonymous/ComfyUI
- https://github.com/AUTOMATIC1111/stable-diffusion-webui

ChatGPT

50 46,892 6.4 Rust

🔮 ChatGPT Desktop Application (Mac, Windows and Linux)

Project mention: What AI assistants are already bundled for Linux? | news.ycombinator.com | 2024-03-01

> I wouldn't hold my breath waiting for a native Linux AI-assisted assistant.
On Mac when I press Command + Space, it brings up Spotlight search
That can't easily be added to be the equivalent of some kind of LLM prompt on GNOME/KDE/XFCE?
I don't quite know what you'd ask it/do with it that would be of much value? Seems like a quicker way/a wrapper around either asking an LLM questions via CLI or basically Electron wrapping HTML (like this https://github.com/lencx/ChatGPT)?

generative-ai-for-beginners

8 42,394 9.8 Jupyter Notebook

18 Lessons, Get Started Building with Generative AI 🔗 https://microsoft.github.io/generative-ai-for-beginners/

Project mention: Build a serverless ChatGPT with RAG using LangChain.js | dev.to | 2024-04-10

Generative AI For Beginners: a collection of resources to learn about Generative AI, including tutorials, code samples, and more.

ColossalAI

42 37,836 9.7 Python

Making large AI models cheaper, faster and more accessible

Project mention: FLaNK AI-April 22, 2024 | dev.to | 2024-04-22

Kong

18 37,482 9.9 Lua

🦍 The Cloud-Native API Gateway and AI Gateway.

Project mention: Kong 3.6 with LLM Support | news.ycombinator.com | 2024-02-15

Open-Assistant

329 36,622 9.1 Python

OpenAssistant is a chat-based assistant that understands tasks, can interact with third-party systems, and retrieve information dynamically to do so.

Project mention: Best open source AI chatbot alternative? | /r/opensource | 2023-12-08

For open assistant, the code: https://github.com/LAION-AI/Open-Assistant/tree/main/inference

MockingBird

9 33,796 5.8 Python

🚀 AI voice cloning: clone a voice in 5 seconds to generate arbitrary speech in real time

gold-miner

1 33,382 7.4

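🥇 Juejin Translation Project: possibly the world's largest and best English-to-Chinese technical translation community, and the translation platform that best understands readers and translators.
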
google-research

98 32,804 9.6 Jupyter Notebook

Google Research

Project mention: Show HN: Next-token prediction in JavaScript – build fast LLMs from scratch | news.ycombinator.com | 2024-04-10

People on here will be happy to say that I do a similar thing, however my sequence length is dynamic because I also use a 2nd data structure - I'll use pretentious academic speak: I use a simple bigram LM (2-gram) for single next-word likeliness and separately a trie that models all words and phrases (so, n-gram). Not sure how many total nodes because sentence lengths vary in training data, but there are about 200,000 entry points (keys) so probably about 2-10 million total nodes in the default setup.
"Constructing 7-gram LM": They likely started with bigrams (what I use) which only tells you the next word based on 1 word given, and thought to increase accuracy by modeling out more words in a sequence, and eventually let the user (developer) pass in any amount they want to model (https://github.com/google-research/google-research/blob/5c87...). I thought of this too at first, but I actually got more accuracy (and speed) out of just keeping them as bigrams and making a totally separate structure that models out an n-gram of all phrases (e.g. could be a 24-token long sequence or 100+ tokens etc. I model it all) and if that phrase is found, then I just get the bigram assumption of the last token of the phrase. This works better when the training data is more diverse (for a very generic model), but theirs would probably outperform mine on accuracy when the training data has a lot of nearly identical sentences that only change wildly toward the end - I don't find this pattern in typical data though, maybe for certain coding and other tasks there are those patterns though. But because it's not dynamic and they make you provide that number, even a low number (any phrase longer than 2 words) - theirs will always have to do more lookup work than with simple bigrams and they're also limited by that fixed number as far as accuracy. I wonder how scalable that is - if I need to train on occasional ~100-word long sentences but also (and mostly) just ~3-word long sentences, I guess I set this to 100 and have a mostly "undefined" trie.
I also thought of the name "LMJS", theirs is "jslm" :) but I went with simply "next-token-prediction" because that's what it ultimately does as a library. I don't know what theirs is really designed for other than proving a concept. Most of their code files are actually comments and hypothetical scenarios.
I recently added a browser example showing simple autocomplete using my library: https://github.com/bennyschmidt/next-token-prediction/tree/m... (video)
And next I'm implementing 8-dimensional embeddings that are converted to normalized vectors between 0-1 to see if doing math on them does anything useful beyond similarity, right now they look like this:
  [nextFrequency, prevalence, specificity, length, firstLetter, lastLetter, firstVowel, lastVowel]
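
The approach described in the comment above (a bigram model for next-word likelihood, plus a separate trie that stores whole phrases) might be sketched roughly as follows. This is a minimal illustration in Python under stated assumptions, not the code of the next-token-prediction library or of the google-research project: every function name and the tiny corpus are hypothetical, and the trie here simply stores every contiguous phrase of each training sentence.

```python
from collections import Counter, defaultdict


def train_bigrams(sentences):
    """Count, for each word, how often each other word follows it."""
    following = defaultdict(Counter)
    for sentence in sentences:
        tokens = sentence.lower().split()
        for current, nxt in zip(tokens, tokens[1:]):
            following[current][nxt] += 1
    return following


def build_trie(sentences):
    """Build a nested-dict trie of every contiguous phrase in the data."""
    root = {}
    for sentence in sentences:
        tokens = sentence.lower().split()
        for start in range(len(tokens)):
            node = root
            for token in tokens[start:]:
                node = node.setdefault(token, {})
    return root


def phrase_in_trie(trie, tokens):
    """Return True if the token sequence is a path starting at the root."""
    node = trie
    for token in tokens:
        if token not in node:
            return False
        node = node[token]
    return True


def predict_next(following, trie, prompt):
    """If the prompt is a known phrase, return the most frequent
    bigram successor of its last token; otherwise return None."""
    tokens = prompt.lower().split()
    if not tokens or not phrase_in_trie(trie, tokens):
        return None
    candidates = following.get(tokens[-1])
    if not candidates:
        return None
    return candidates.most_common(1)[0][0]


# Hypothetical toy corpus, for illustration only.
corpus = [
    "the cat sat on the mat",
    "the cat ate the fish",
    "a dog sat on the rug",
]
following = train_bigrams(corpus)
trie = build_trie(corpus)
```

With this toy corpus, `predict_next(following, trie, "sat on")` looks up the phrase in the trie and then falls back to the bigram counts for "on". Because the phrase lookup is separate from the bigram table, the sequence length is not fixed in advance, which is the trade-off the commenter contrasts with a fixed n-gram order.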

PhotoPrism

510 32,590 9.9 Go

AI-Powered Photos App for the Decentralized Web 🌈💎✨

Project mention: Show HN: Memories, FOSS Google Photos alternative built for high performance | news.ycombinator.com | 2024-03-21

I have been using https://www.photoprism.app for a couple of years, and it works better than expected, with the latest updates it's actually quite fast and the face tagging works reasonably well.

AI-For-Beginners

8 31,046 6.7 Jupyter Notebook

12 Weeks, 24 Lessons, AI for All!

Project mention: FREE AI Course By Microsoft: ZERO to HERO! 🔥 | dev.to | 2024-03-18

🔗 https://github.com/microsoft/AI-For-Beginners 🔗 https://microsoft.github.io/AI-For-Beginners/

spaCy

106 28,704 9.2 Python

💫 Industrial-strength Natural Language Processing (NLP) in Python

Project mention: Step by step guide to create customized chatbot by using spaCy (Python NLP library) | dev.to | 2024-03-23

Hi Community, in this article I will demonstrate the steps below to create your own chatbot by using spaCy (spaCy is an open-source software library for advanced natural language processing, written in the programming languages Python and Cython):

Lobe Chat

6 28,579 9.9 TypeScript

LobeChat is an open-source, extensible (Function Calling), high-performance chatbot framework. It supports one-click free deployment of your private ChatGPT/LLM web application.

Project mention: The AI Revolution Is Crushing Thousands of Languages | news.ycombinator.com | 2024-04-25

Get your OpenAI API key and then use it on one of the hundreds of open source frontends available, such as: https://github.com/lobehub/lobe-chat

AI-Expert-Roadmap

30 28,388 0.0 JavaScript

Roadmap to becoming an Artificial Intelligence Expert in 2022

Project mention: Best AI ML DL DS Roadmap | /r/deeplearning | 2023-12-07

I.am.ai AI Expert Roadmap (https://i.am.ai/roadmap): This roadmap focuses more on AI and includes various aspects of machine learning and deep learning. It's suitable for those who want to delve deeper into AI, particularly in cutting-edge research and applications.

pytorch-lightning

8 26,883 9.9 Python

Pretrain, finetune and deploy AI models on multiple GPUs, TPUs with zero code changes.

Project mention: Lightning AI Studios – A persistent GPU cloud environment | news.ycombinator.com | 2023-12-14

upscayl

125 26,216 9.6 TypeScript

🆙 Upscayl - Free and Open Source AI Image Upscaler for Linux, macOS and Windows, built with a Linux-first philosophy.

Project mention: Why Does Windows Use Backslash as Path Separator? | news.ycombinator.com | 2024-04-24

Windows has caused us a lot of issues with Upscayl (https://upscayl.org).
I personally do not use Windows, but most of our errors are reported by Windows users, where sometimes path parsing is a problem or the drivers mess up the Vulkan configuration.

netron

32 26,040 9.9 JavaScript

Visualizer for neural network, deep learning and machine learning models

Project mention: Visualizer for neural network, deep learning and machine learning models | news.ycombinator.com | 2023-12-26

dify

11 23,073 9.9 TypeScript

Dify is an open-source LLM app development platform. Dify's intuitive interface combines AI workflow, RAG pipeline, agent capabilities, model management, observability features and more, letting you quickly go from prototype to production.

Project mention: Dify, a visual workflow to build/test LLM applications | news.ycombinator.com | 2024-04-22

> https://github.com/langgenius/dify/blob/main/LICENSE
everyone is apparently a license pioneer

MindsDB

78 21,223 10.0 Python

The platform for customizing AI from enterprise data

Project mention: What’s the Difference Between Fine-tuning, Retraining, and RAG? | dev.to | 2024-04-08

Check us out on GitHub.

learnopencv

6 20,363 8.6 Jupyter Notebook

Learn OpenCV : C++ and Python Examples

Project mention: YOLO-NAS Pose | /r/pytorch | 2023-11-16

Deci's YOLO-NAS Pose: Redefining Pose Estimation! Elevating healthcare, sports, tech, and robotics with precision and speed. Github link and blog link down below! Repo: https://github.com/spmallick/learnopencv/tree/master/YOLO-NAS-Pose

LocalAI

82 19,593 9.9 C++

🤖 The free, Open Source OpenAI alternative. Self-hosted, community-driven and local-first. Drop-in replacement for OpenAI running on consumer-grade hardware. No GPU required. Runs gguf, transformers, diffusers and many more model architectures. It can generate text, audio, video, and images, and also has voice cloning capabilities.

Project mention: Drop-In Replacement for ChatGPT API | news.ycombinator.com | 2024-01-24

chatbox

21 18,459 8.1 TypeScript

Chatbox is a desktop client for ChatGPT, Claude and other LLMs, available on Windows, Mac, and Linux.

Project mention: Chatbox (latest versions) is not open source; AskHN: anything similar? | news.ycombinator.com | 2024-02-08

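NOTE: The open-source projects on this list are ordered by number of GitHub stars. The number of mentions indicates repo mentions in the last 12 months or since we started tracking (Dec 2020). The stats line under each project name gives, in order, the number of mentions, the number of GitHub stars, an activity score, and (where available) the primary language.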

AI related posts

OpenAI vs Gemini : Function Calling & Autonomous Agent
1 project | dev.to | 26 Apr 2024
PawanOsman/ChatGPT: Access GPT-3.5.turbo for free via an API
1 project | news.ycombinator.com | 26 Apr 2024
Observations on MLOps–A Fragmented Mosaic of Mismatched Expectations
1 project | dev.to | 26 Apr 2024
Llama 3 with Function Calling and Code Interpreter
3 projects | dev.to | 25 Apr 2024
What is a Plugin Ecosystem and Why Does It Matter?
2 projects | dev.to | 25 Apr 2024
Show HN: Langtrace – OpenTelemetry-Based LLM App Observability
2 projects | news.ycombinator.com | 25 Apr 2024
Why Does Windows Use Backslash as Path Separator?
4 projects | news.ycombinator.com | 24 Apr 2024

Index

What are some of the best open-source AI projects? This list will help you:

#	Project	Stars
1	AutoGPT	161,096
2	stable-diffusion-webui	129,299
3	ChatGPT	46,892
4	generative-ai-for-beginners	42,394
5	ColossalAI	37,836
6	Kong	37,482
7	Open-Assistant	36,622
8	MockingBird	33,796
9	gold-miner	33,382
10	google-research	32,804
11	PhotoPrism	32,590
12	AI-For-Beginners	31,046
13	spaCy	28,704
14	Lobe Chat	28,579
15	AI-Expert-Roadmap	28,388
16	pytorch-lightning	26,883
17	upscayl	26,216
18	netron	26,040
19	dify	23,073
20	MindsDB	21,223
21	learnopencv	20,363
22	LocalAI	19,593
23	chatbox	18,459