Manage all types of time series data in a single, purpose-built database. Run at any scale in any environment in the cloud, on-premises, or at the edge. Learn more →
Top 23 Python AI Projects
-
Edit: linking to a GitHub issue that covers installing on various Linux flavors. You may need to add an EXPORT to the start script.
-
ColossalAI
Project mention: ColossalChat: An Open-Source Solution for Cloning ChatGPT with a RLHF Pipeline | news.ycombinator.com | 2023-04-04
> open-source a complete RLHF pipeline ... based on the LLaMA pre-trained model
I've gotten to where when I see "open source AI" I now know it's "well, except for $some_other_dependencies"
Anyway: https://scribe.rip/@yangyou_berkeley/colossalchat-an-open-so... and https://github.com/hpcaitech/ColossalAI#readme (Apache 2) can save you some medium.com heartache at least
-
MockingBird
🚀 AI voice cloning: clone a voice in 5 seconds to generate arbitrary speech in real-time
-
spaCy
Project mention: A beginner’s guide to sentiment analysis using OceanBase and spaCy | dev.to | 2023-10-25
In this article, I'm going to walk through a sentiment analysis project from start to finish, using open-source Amazon product reviews. However, using the same approach, you can easily implement mass sentiment analysis on your own products. We'll explore an approach to sentiment analysis with one of the most popular Python NLP packages: spaCy.
-
pytorch-lightning
Project mention: Best practice for saving logits/activation values of model in PyTorch Lightning | /r/deeplearning | 2023-07-19
I've been wondering what the recommended method is for saving logits/activations using PyTorch Lightning. I've looked at Callbacks, Loggers and ModelHooks, but none of their use cases seem to fit this kind of activity (even if I were to create my own custom variants of each utility). The ModelCheckpoint Callback makes me feel like custom Callbacks would be the way to go, but I'm not quite sure. This closed GitHub issue does address my issue to some extent.
-
MLflow
Project mention: cascade alternatives - clearml and MLflow | libhunt.com/r/Oxid15/cascade | 2023-11-01
-
SuperAGI
<⚡️> SuperAGI - A dev-first open source autonomous AI agent framework. Enabling developers to build, manage & run useful autonomous agents quickly and reliably.
-
-
frigate
Frigate https://frigate.video/ and ZoneMinder https://zoneminder.com/ come to mind. Blue Iris https://blueirissoftware.com/ is not open source but is what I prefer to use for my PoE systems ($80/yr)
-
haystack
🔍 LLM orchestration framework to build customizable, production-ready LLM applications. Connect components (models, vector DBs, file converters) to pipelines or agents that can interact with your data. With advanced retrieval methods, it's best suited for building RAG, question answering, semantic search or conversational agent chatbots.
-
-
-
-
cookiecutter-data-science
A logical, reasonably standardized, but flexible project structure for doing and sharing data science work.
I opened an Anaconda cmd window and ran `cookiecutter https://github.com/drivendata/cookiecutter-data-science`. I answered all the prompted questions. After searching for a while, I found where the project folder was created. However, how do I get this on GitHub? The only thing I can figure out is to create a brand new repo on GitHub with the exact same name, open it in GitHub Desktop, click "Show in Explorer", and then drag and drop all files from the Cookiecutter folder into the GitHub Desktop folder. However, to me this does not sound like the intended way to create a new project and put it on GitHub.
-
deeplake
Database for AI. Store Vectors, Images, Texts, Videos, etc. Use with LLMs/LangChain. Store, query, version, & visualize any AI data. Stream data in real-time to PyTorch/TensorFlow. https://activeloop.ai
Project mention: [P] I built a Chatbot to talk with any Github Repo. 🪄 | /r/MachineLearning | 2023-04-29
This repository contains two Python scripts that demonstrate how to create a chatbot using Streamlit, OpenAI GPT-3.5-turbo, and Activeloop's Deep Lake. The chatbot searches a dataset stored in Deep Lake to find relevant information and generates responses based on the user's input.
-
Project mention: In Need of Guidance: Implementing MLOps in a Complex Organization as a Junior Data Engineer | /r/mlops | 2023-06-12
-
Project mention: Stability AI releases its latest image-generating model, Stable Diffusion XL 1.0 | news.ycombinator.com | 2023-07-26
-
promptflow
Build high-quality LLM apps - from prototyping, testing to production deployment and monitoring.
-
Project mention: Ask HN: Is there any open source/open hardware Echo Dot alike? | news.ycombinator.com | 2023-08-11
-
-
BentoML
There are a few tools you can use as "cheat mode" shortcuts to give you a leg up as you're getting started. Here's one: https://github.com/bentoml/BentoML
-
autoscraper
Project mention: What are the best tools for web scraping and analysis of natural language to populate a dataset? | /r/datasets | 2023-04-12
See if something like autoscraper or mlscraper suits your needs.
-
Project mention: Show HN: Run LLM-generated code in sandboxed envs | news.ycombinator.com | 2023-09-27
Python AI related posts
- 80% faster, 50% less memory, 0% accuracy loss Llama finetuning
- Experimenting with LLM-Based Chunk Enhancement for Better RAG Results
- 80% faster, 50% less memory, 0% loss of accuracy Llama finetuning
- Show HN: Unsloth – finetune Llama 2x faster 50% less memory
- Unable to re add my server to HAOS integration
- SDXL Turbo: A Real-Time Text-to-Image Generation Model
- Tanuki: Alignment-as-Code for LLM Applications
-
A note from our sponsor - InfluxDB
www.influxdata.com | 1 Dec 2023
Index
What are some of the best open-source AI projects in Python? This list will help you:
# | Project | Stars |
---|---|---|
1 | stable-diffusion-webui | 110,759 |
2 | ColossalAI | 35,411 |
3 | MockingBird | 32,052 |
4 | spaCy | 27,636 |
5 | pytorch-lightning | 25,275 |
6 | MLflow | 15,889 |
7 | SuperAGI | 13,103 |
8 | dvc | 12,515 |
9 | frigate | 11,948 |
10 | haystack | 11,746 |
11 | facefusion | 8,930 |
12 | RobustVideoMatting | 7,770 |
13 | dream-textures | 7,236 |
14 | cookiecutter-data-science | 7,205 |
15 | deeplake | 7,171 |
16 | metaflow | 7,168 |
17 | fast-stable-diffusion | 6,892 |
18 | promptflow | 6,611 |
19 | mycroft-core | 6,378 |
20 | embedchain | 5,977 |
21 | BentoML | 5,937 |
22 | autoscraper | 5,658 |
23 | E2B | 5,603 |