Top 23 Python generative-ai Projects

LLaMA-Factory

2 20,248 9.9 Python

Unify Efficient Fine-Tuning of 100+ LLMs

Project mention: Show HN: GPU Prices on eBay | news.ycombinator.com | 2024-02-23

Depends what model you want to train, and how well you want your computer to keep working while you're doing it.
If you're interested in large language models, there's a table of VRAM requirements for fine-tuning at [1], which says you could do the most basic type of fine-tuning on a 7B parameter model with 8GB VRAM.
You'll find that training takes quite a long time, and as a lot of the GPU power is going on training, your computer's responsiveness will suffer - even basic things like scrolling in your web browser or changing tabs use the GPU, after all.
Spend a bit more and you'll probably have a better time.
[1] https://github.com/hiyouga/LLaMA-Factory?tab=readme-ov-file#...
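As a rough sanity check on figures like the one in the comment above, the memory needed just to hold a model's weights is the parameter count times the bytes per parameter. The sketch below is a back-of-the-envelope illustration under stated assumptions, not the method behind the LLaMA-Factory table: the function name `weight_memory_gb` is hypothetical, "7B" is taken as exactly 7 billion parameters, and gradients, optimizer state, activations, and framework overhead are all ignored, so real fine-tuning needs more than these numbers.

```python
# Rough estimate of the memory needed to hold model weights only.
# Assumption: gradients, optimizer state, activations and framework
# overhead are ignored, so real fine-tuning requires more than this.

def weight_memory_gb(num_params: float, bits_per_param: int) -> float:
    """Return weight storage in gigabytes (1 GB = 10**9 bytes)."""
    bytes_total = num_params * bits_per_param / 8
    return bytes_total / 1e9

seven_b = 7e9  # a "7B" model, taken here as exactly 7 billion parameters

for bits in (16, 8, 4):
    print(f"{bits:>2}-bit weights: {weight_memory_gb(seven_b, bits):.1f} GB")
```

Under these assumptions, 16-bit weights for a 7B model already come to about 14 GB, which suggests that an 8GB budget depends on quantized weights combined with a parameter-efficient fine-tuning method.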

jina

126 20,041 9.1 Python

☁️ Build multimodal AI applications with cloud-native stack

Project mention: Jina.ai: Self-host Multimodal models | news.ycombinator.com | 2024-01-26

haystack

55 13,633 9.9 Python

:mag: LLM orchestration framework to build customizable, production-ready LLM applications. Connect components (models, vector DBs, file converters) to pipelines or agents that can interact with your data. With advanced retrieval methods, it's best suited for building RAG, question answering, semantic search or conversational agent chatbots.

Project mention: Haystack DB – 10x faster than FAISS with binary embeddings by default | news.ycombinator.com | 2024-04-28

I was confused for a bit but there is no relation to https://haystack.deepset.ai/

NeMo

29 10,084 9.8 Python

A scalable generative AI framework built for researchers and developers working on Large Language Models, Multimodal, and Speech AI (Automatic Speech Recognition and Text-to-Speech)

Project mention: [P] Making a TTS voice, HK-47 from Kotor using Tortoise (Ideally WaveRNN) | /r/MachineLearning | 2023-07-06

I haven't tested WaveRNN, but of the ones I know, the best open-source option is FastPitch. And it's easy to use; here is the tutorial for voice cloning.

BentoML

16 6,537 9.8 Python

The most flexible way to serve AI/ML models in production - Build Model Inference Service, LLM APIs, Inference Graph/Pipelines, Compound AI systems, Multi-Modal, RAG as a Service, and more!

Project mention: Who's hiring developer advocates? (December 2023) | dev.to | 2023-12-04

Link to GitHub -->

krita-ai-diffusion

11 4,586 9.8 Python

Streamlined interface for generating images with AI in Krita. Inpaint and outpaint with optional text prompt, no tweaking required.

Project mention: A quick Krita/ComfyUI LCM live painting tip | /r/StableDiffusion | 2023-12-08

I have been playing a lot with Krita's SD plugin https://github.com/Acly/krita-ai-diffusion - that uses ComfyUI as its API source.

TaskingAI

1 4,421 9.4 Python

The open source platform for AI-native application development.

Project mention: TaskingAI: AI-native app development platform | news.ycombinator.com | 2024-01-30

h2o-llmstudio

13 3,583 9.3 Python

H2O LLM Studio - a framework and no-code GUI for fine-tuning LLMs. Documentation: https://h2oai.github.io/h2o-llmstudio/

Project mention: Paid dev gig: develop a basic LLM PEFT finetuning utility | /r/LocalLLaMA | 2023-06-02

llmware

9 3,127 9.8 Python

Providing enterprise-grade LLM-based development framework, tools, and fine-tuned models.

Project mention: More Agents Is All You Need: LLMs performance scales with the number of agents | news.ycombinator.com | 2024-04-06

I couldn't agree more. You should check out LLMWare's SLIM agents (https://github.com/llmware-ai/llmware/tree/main/examples/SLI...). It's focusing on pretty much exactly this and chaining multiple local LLMs together.
A really good topic that ties in with this is the need for deterministic sampling (I may have the terminology a bit incorrect) depending on what the model is intended for. The LLMWare team did a good 2 part video on this here as well (https://www.youtube.com/watch?v=7oMTGhSKuNY)
I think dedicated miniature LLMs are the way forward.
Disclaimer - Not affiliated with them in any way, just think it's a really cool project.
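The commenter notes that the terminology may be off, but one common reading of "deterministic sampling" is greedy decoding: always taking the highest-scoring token instead of drawing one at random. The sketch below contrasts the two on a toy score list. It is a minimal illustration in plain Python, not LLMWare's API; the function names and the example scores are hypothetical.

```python
import math
import random

def softmax(logits, temperature=1.0):
    """Convert raw scores to probabilities, scaled by temperature."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def pick_greedy(logits):
    """Deterministic: always return the index of the highest score."""
    return max(range(len(logits)), key=lambda i: logits[i])

def pick_sampled(logits, temperature=1.0, rng=None):
    """Stochastic: draw an index in proportion to its softmax probability."""
    if rng is None:
        rng = random.Random()
    probs = softmax(logits, temperature)
    return rng.choices(range(len(logits)), weights=probs, k=1)[0]

# Hypothetical next-token scores for a three-token vocabulary.
logits = [2.0, 1.0, 0.1]

print(pick_greedy(logits))                         # same index on every call
print(pick_sampled(logits, rng=random.Random(0)))  # repeatable only because the seed is fixed
```

Greedy selection gives the same answer for the same input, which matters when a small model is used for classification or function-calling style tasks. Sampling trades that repeatability for variety.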

jupyter-ai

9 2,857 9.2 Python

A generative AI extension for JupyterLab

Project mention: 🪄 Put magic in your Notebook w/ Jupyter-AI | dev.to | 2024-02-14

This notebook is dedicated to a (not so) short jupyterlab/jupyter-ai unboxing so anyone can enjoy this kind of magic (and much much more):

xTuring

31 2,523 8.4 Python

Build, customize and control your own LLMs. From data pre-processing to fine-tuning, xTuring provides an easy way to personalize open-source LLMs. Join our discord community: https://discord.gg/TgHXuSJEk6

Project mention: I'm developing an open-source AI tool called xTuring, enabling anyone to construct a Language Model with just 5 lines of code. I'd love to hear your thoughts! | /r/machinelearningnews | 2023-09-07

Explore the project on GitHub here.

YiVal

2 2,429 9.6 Python

Your Automatic Prompt Engineering Assistant for GenAI Applications

Project mention: YiVal——Unlocking Your Data's Power to Create Customized GenAI Apps | /r/u_YiVal | 2023-11-16

- 🤖Github:https://github.com/YiVal/YiVal/pull/189

dbrx

4 2,397 5.9 Python

Code examples and resources for DBRX, a large language model developed by Databricks

Project mention: Hello OLMo: A Open LLM | news.ycombinator.com | 2024-04-08

One thing I wanted to add and call attention to is the importance of licensing in open models. This is often overlooked when we blindly accept the vague branding of models as “open”, but I am noticing that many open weight models are actually using encumbered proprietary licenses rather than standard open source licenses that are OSI approved (https://opensource.org/licenses). As an example, Databricks’s DBRX model has a proprietary license that forces adherence to their highly restrictive Acceptable Use Policy by referencing a live website hosting their AUP (https://github.com/databricks/dbrx/blob/main/LICENSE), which means as they change their AUP, you may be further restricted in the future. Meta’s Llama is similar (https://github.com/meta-llama/llama/blob/main/LICENSE). I’m not sure who can depend on these models given this flaw.

SDV

59 2,141 9.4 Python

Synthetic data generation for tabular data

Project mention: Synthetic data generation for tabular data | news.ycombinator.com | 2024-02-27

Can someone help me understand the licensing of this?
https://github.com/sdv-dev/SDV/blob/main/LICENSE
It was MIT licensed up until 2022, when it was changed to what it is now, where they say that it will become MIT again 4 years after release... but is that from when the license was changed or from the first release of the software on GitHub?

coffee

4 1,341 8.8 Python

Build and iterate on your UI 10x faster with AI - right from your own IDE ☕️

Project mention: AI Grant Traction in OSS Startups | dev.to | 2024-02-01

Coframe

PyRIT

2 1,263 9.1 Python

The Python Risk Identification Tool for generative AI (PyRIT) is an open access automation framework to empower security professionals and machine learning engineers to proactively find risks in their generative AI systems. (by Azure)

Project mention: FLaNK 04 March 2024 | dev.to | 2024-03-04

openllmetry

2 1,271 9.8 Python

Open-source observability for your LLM application, based on OpenTelemetry

Project mention: Pydantic Logfire | news.ycombinator.com | 2024-04-30

I’m also aware of other OSS initiatives doing similar things, so I wouldn’t say no one has ever done what you’re doing.
[1] https://github.com/traceloop/openllmetry

LLMStack

20 1,112 9.9 Python

No-code platform to build LLM Agents, workflows and applications with your data

Project mention: Vanna.ai: Chat with your SQL database | news.ycombinator.com | 2024-01-14

We have recently added support to query data from SingleStore to our agent framework, LLMStack (https://github.com/trypromptly/LLMStack). Out of the box performance when prompting with just the table schemas is pretty good with GPT-4.
The more domain specific knowledge needed for queries, the harder it has gotten in general. We've had good success `teaching` the model different concepts in relation to the dataset, and giving it example questions and queries greatly improved performance.
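The approach described in this comment, prompting with table schemas and then adding example question/query pairs, can be sketched as a plain prompt builder. This is an illustrative sketch only and not LLMStack's implementation: the function `build_sql_prompt`, the instruction wording, and the `orders` schema and example pair are all hypothetical.

```python
def build_sql_prompt(schemas, examples, question):
    """Assemble a few-shot text-to-SQL prompt as a plain string."""
    parts = ["You translate questions into SQL for the tables below.", ""]
    parts.append("Schemas:")
    parts.extend(schemas)
    parts.append("")
    # Each example pairs a natural-language question with its SQL query.
    for example_question, example_sql in examples:
        parts.append(f"Question: {example_question}")
        parts.append(f"SQL: {example_sql}")
        parts.append("")
    # The final question is left open for the model to complete.
    parts.append(f"Question: {question}")
    parts.append("SQL:")
    return "\n".join(parts)

# Hypothetical schema and example pair, invented for illustration.
schemas = [
    "CREATE TABLE orders (id INT, customer_id INT, total DECIMAL(10, 2), created_at DATE);"
]
examples = [("How many orders are there?", "SELECT COUNT(*) FROM orders;")]

prompt = build_sql_prompt(schemas, examples, "What is the total revenue?")
print(prompt)
```

Passing an empty `examples` list gives the schema-only prompt the comment describes as the baseline. Adding domain-specific pairs is the "teaching" step.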

cognita

4 887 7.9 Python

RAG (Retrieval Augmented Generation) Framework for building modular, open source applications for production by TrueFoundry

Project mention: FLaNK AI Weekly for 29 April 2024 | dev.to | 2024-04-29

canopy

14 873 9.8 Python

Retrieval Augmented Generation (RAG) framework and context engine powered by Pinecone

Project mention: FLaNK AI Weekly for 29 April 2024 | dev.to | 2024-04-29

aiconfig

29 840 9.7 Python

AIConfig is a config-based framework to build generative AI applications.

Project mention: VS Code: Prompt Editor for LLMs (GPT4, Llama, Mistral, etc.) | news.ycombinator.com | 2024-03-08

doesn't collect prompts and there's a way to disable telemetry as well - https://github.com/lastmile-ai/aiconfig/blob/8a5a59d47cef474...

quix-streams

25 570 9.0 Python

A Python library for building containerized ML and Generative AI applications with Apache Kafka.

Project mention: Show HN: Streaming DataFrames–a Pandas-like syntax for real-time data | news.ycombinator.com | 2024-04-23

Copulas

1 505 8.1 Python

A library to model multivariate data using copulas.

NOTE: The open source projects on this list are ordered by number of GitHub stars. The number of mentions indicates repo mentions in the last 12 months or since we started tracking (Dec 2020).

Python generative-ai related posts

Show HN: Cognita – open-source RAG framework for modular applications
3 projects | news.ycombinator.com | 27 Apr 2024

Gemini API 102: Next steps beyond "Hello World!"
5 projects | dev.to | 24 Apr 2024

Show HN: Streaming DataFrames–a Pandas-like syntax for real-time data
1 project | news.ycombinator.com | 23 Apr 2024

Hello OLMo: A Open LLM
3 projects | news.ycombinator.com | 8 Apr 2024

Are you looking for free open source alternative to Midjourney and Bing images?
1 project | news.ycombinator.com | 25 Mar 2024

100% free Midjourney alternative. Plant trees as you generated realistic images
1 project | news.ycombinator.com | 20 Mar 2024

Are you looking for a green yet free Chatgpt4 Alternative?
1 project | news.ycombinator.com | 18 Mar 2024

Index

What are some of the best open-source generative-ai projects in Python? This list will help you:

#	Project	Stars
1	LLaMA-Factory	20,248
2	jina	20,041
3	haystack	13,633
4	NeMo	10,084
5	BentoML	6,537
6	krita-ai-diffusion	4,586
7	TaskingAI	4,421
8	h2o-llmstudio	3,583
9	llmware	3,127
10	jupyter-ai	2,857
11	xTuring	2,523
12	YiVal	2,429
13	dbrx	2,397
14	SDV	2,141
15	coffee	1,341
16	PyRIT	1,263
17	openllmetry	1,271
18	LLMStack	1,112
19	cognita	887
20	canopy	873
21	aiconfig	840
22	quix-streams	570
23	Copulas	505
