Python large-language-models

Open-source Python projects categorized as large-language-models

Top 23 Python large-language-model Projects

  • gpt_academic

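    A practical interactive interface for LLMs such as GPT and GLM, specially optimized for reading, polishing, and writing papers. Modular design; supports custom shortcut buttons and function plugins; project analysis and self-explanation for Python, C++, and other codebases; PDF/LaTeX paper translation and summarization; parallel queries to multiple LLMs; and local models such as chatglm3. Integrates Tongyi Qianwen (通义千问), deepseekcoder, iFlytek Spark (讯飞星火), ERNIE Bot (文心一言), llama2, rwkv, claude2, moss, and others.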

  • LLaMA-Factory

    Unified Efficient Fine-Tuning of 100+ LLMs (ACL 2024)

    Project mention: ORPO, DPO, and PPO: Optimizing Models for Human Preferences | dev.to | 2024-11-08

    Implementation: ORPO has been integrated into popular fine-tuning libraries like TRL, Axolotl, and LLaMA-Factory.

  • Chinese-LLaMA-Alpaca

    Chinese LLaMA & Alpaca large language models, with local CPU/GPU training and deployment

  • haystack

    AI orchestration framework to build customizable, production-ready LLM applications. Connect components (models, vector DBs, file converters) to pipelines or agents that can interact with your data. With advanced retrieval methods, it's best suited for building RAG, question answering, semantic search or conversational agent chatbots.

    Project mention: AI Engineer's Tool Review: Haystack | dev.to | 2024-12-03

    Are you curious about the NLP/GenAI/RAG framework for developers? Check out my opinionated developer review of Haystack, which emerges as a robust NLP/RAG framework that excels in search and retrieval applications: Read the review.

  • ChatGLM2-6B

    ChatGLM2-6B: An Open Bilingual Chat LLM

  • Qwen

    The official repo of Qwen (通义千问) chat & pretrained large language model proposed by Alibaba Cloud.

    Project mention: Deploying AI Projects Through a Jenkins Pipeline | dev.to | 2024-11-20

    After logging in to Jozu Hub, you can grab any of the available ModelKits from their package registry. Start by unpacking the Qwen model from Jozu Hub.

  • MOSS

    An open-source tool-augmented conversational language model from Fudan University

  • ml-engineering

    Machine Learning Engineering Open Book

    Project mention: Accelerators | news.ycombinator.com | 2024-02-22
  • LLMSurvey

    The official GitHub page for the survey paper "A Survey of Large Language Models".

    Project mention: Ask HN: Textbook Regarding LLMs | news.ycombinator.com | 2024-03-23

    Here’s another one - it’s older but has some interesting charts and graphs.

    https://arxiv.org/abs/2303.18223

  • LightRAG

    "LightRAG: Simple and Fast Retrieval-Augmented Generation"

    Project mention: LightRAG: Simple and Fast Retrieval-Augmented Generation | news.ycombinator.com | 2024-12-02
  • txtai

    💡 All-in-one open-source embeddings database for semantic search, LLM orchestration and language model workflows

    Project mention: Postgres for Everything (E/Postgres) | news.ycombinator.com | 2024-12-06

    I fully agree. Postgres has solved many of the problems that many are re-solving with GenAI related databases.

    With txtai (https://github.com/neuml/txtai), I've gone all in with Postgres + pgvector. Projects can start small with a SQLite backend then switch the persistence to Postgres. With this, you get all the years of battle-tested production experience from Postgres built-in for free.

  • petals

    🌸 Run LLMs at home, BitTorrent-style. Fine-tuning and inference up to 10x faster than offloading

    Project mention: Serving AI from the Basement – 192GB of VRAM Setup | news.ycombinator.com | 2024-09-08
  • optimate

    A collection of libraries to optimise AI model performances

  • deeplake

    Database for AI. Store Vectors, Images, Texts, Videos, etc. Use with LLMs/LangChain. Store, query, version, & visualize any AI data. Stream data in real-time to PyTorch/TensorFlow. https://activeloop.ai

    Project mention: Creation of the ApostropheCMS Documentation Chatbot | dev.to | 2024-08-29

    Finally, we stored these vectors in our chosen database: the activeloop DeepLake database. This database is open source, something near and dear to our own open-source hearts. We will cover some additional details in a further section, but it is specifically designed to handle vector data and perform efficient similarity searches, which is crucial for quick and accurate retrieval during the RAG process.

  • PentestGPT

    A GPT-empowered penetration testing tool

  • camel

    🐫 CAMEL: Finding the Scaling Law of Agents. The first and the best multi-agent framework. https://www.camel-ai.org (by camel-ai)

  • Baichuan-7B

    A large-scale 7B pretraining language model developed by BaiChuan-Inc.

  • openchat

    OpenChat: Advancing Open-source Language Models with Imperfect Data (by imoneoi)

  • Qwen-VL

    The official repo of Qwen-VL (通义千问-VL) chat & pretrained large vision language model proposed by Alibaba Cloud.

  • awesome-pretrained-chinese-nlp-models

    Awesome Pretrained Chinese NLP Models: a collection of high-quality Chinese pretrained models, large models, multimodal models, and large language models

  • tree-of-thought-llm

    [NeurIPS 2023] Tree of Thoughts: Deliberate Problem Solving with Large Language Models

    Project mention: Swarm, a new agent framework by OpenAI | news.ycombinator.com | 2024-10-11
  • marqo

    Unified embedding generation and search engine. Also available on cloud - cloud.marqo.ai

    Project mention: Pinecone integrates AI inferencing with vector database | news.ycombinator.com | 2024-12-04
  • AutoGPTQ

    An easy-to-use LLM quantization package with user-friendly APIs, based on the GPTQ algorithm.

NOTE: The open-source projects on this list are ordered by number of GitHub stars. The number of mentions indicates repo mentions in the last 12 months or since we started tracking (Dec 2020).


Python large-language-models related posts

  • Trillium TPU Is GA

    2 projects | news.ycombinator.com | 11 Dec 2024
  • LightRAG: Simple and Fast Retrieval-Augmented Generation

    1 project | news.ycombinator.com | 2 Dec 2024
  • textGrad: Automatic “Differentiation” via Text

    1 project | dev.to | 15 Nov 2024
  • Manage Permissions in a Langflow Chain for LLM Queries using Permit.io

    2 projects | dev.to | 14 Nov 2024
  • Hertz-dev, the first open-source base model for conversational audio

    7 projects | news.ycombinator.com | 3 Nov 2024
  • Are there any debugging tools similar to Jupyter available now?

    1 project | news.ycombinator.com | 3 Nov 2024
  • Pax: A Jax-based machine learning framework for training large scale models

    1 project | news.ycombinator.com | 13 Oct 2024

Index

What are some of the best open-source large-language-model projects in Python? This list will help you:

#   Project                                Stars
1   gpt_academic                           66,185
2   LLaMA-Factory                          35,732
3   Chinese-LLaMA-Alpaca                   18,498
4   haystack                               18,001
5   ChatGLM2-6B                            15,733
6   Qwen                                   14,678
7   MOSS                                   11,980
8   ml-engineering                         11,934
9   LLMSurvey                              10,586
10  LightRAG                               11,127
11  txtai                                  9,605
12  petals                                 9,281
13  optimate                               8,377
14  deeplake                               8,220
15  PentestGPT                             7,347
16  camel                                  5,792
17  Baichuan-7B                            5,675
18  openchat                               5,269
19  Qwen-VL                                5,170
20  awesome-pretrained-chinese-nlp-models  4,949
21  tree-of-thought-llm                    4,889
22  marqo                                  4,672
23  AutoGPTQ                               4,549

Did you know that Python is the 2nd most popular programming language based on number of mentions?