large-language-models

Open-source projects categorized as large-language-models

Top 23 large-language-model Open-Source Projects

  1. langflow

    Langflow is a powerful tool for building and deploying AI-powered agents and workflows.

    Project mention: CVE-2026-33017: How I Found an Unauthenticated RCE in Langflow by Reading the Code They Already Fixed | dev.to | 2026-03-19

    I reported this through Langflow's GitHub Security Advisory on February 25, 2026. The initial response took about two weeks and a couple of follow-up pings from my end. Once the team engaged, things moved quickly. They merged a fix in PR #12160, and the advisory was published on March 16, 2026.

  2. SaaSHub

    SaaSHub - Software Alternatives and Reviews. SaaSHub helps you find the best software and product alternatives

  3. LLMs-from-scratch

    Implement a ChatGPT-like LLM in PyTorch from scratch, step by step

    Project mention: DeepSeek Sparse Attention | news.ycombinator.com | 2026-05-24
  4. llm-course

    Course to get into Large Language Models (LLMs) with roadmaps and Colab notebooks.

    Project mention: 10 Best AI Engineering GitHub Repos to Build Real Systems | dev.to | 2026-01-07

    A hands-on, end-to-end course on building, evaluating, and deploying LLM applications. Ideal when you want a clear path from the spark of an idea to deployment. Link: https://github.com/mlabonne/llm-course

  5. LlamaFactory

    Unified Efficient Fine-Tuning of 100+ LLMs & VLMs (ACL 2024)

    Project mention: Llama-Factory: Unified, Efficient Fine-Tuning for 100 Open LLMs | news.ycombinator.com | 2025-09-18
  6. gpt_academic

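    Provides a practical interactive interface for LLMs such as GPT/GLM, specially optimized for reading, polishing, and writing papers. Modular design with support for custom shortcut buttons and function plugins; project analysis and self-explanation for Python, C++, and other languages; PDF/LaTeX paper translation and summarization; parallel querying of multiple LLMs; and local models such as chatglm3. Integrates Tongyi Qianwen (通义千问), deepseekcoder, iFlytek Spark (讯飞星火), ERNIE Bot (文心一言), llama2, rwkv, claude2, moss, and more.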

  7. Flowise

    Build AI Agents, Visually

    Project mention: I Tested Flowise, Dify, and n8n Across 30+ Client Deployments. Here Is My Verdict. | dev.to | 2026-04-07

    Citation Capsule: n8n's GitHub community reached 182,000+ stars across a 7-year development history, with 70+ AI-specific nodes added in 2024 to 2025. Source: n8n GitHub. Dify crossed 106,000 stars on GitHub with an Apache 2.0 license. Source: Dify GitHub. Flowise reached 51,000+ stars with MIT license. Source: Flowise GitHub. Dify's minimum recommended RAM is 4 GB versus Flowise's 1 GB and n8n's 300 MB. Source: Dify Docs.

  8. Ray

    Ray is an AI compute engine. Ray consists of a core distributed runtime and a set of AI Libraries for accelerating ML workloads.

    Project mention: GSoC 2026 Predictions: 30 NEW AI/ML/Security Organizations You Should Start Contributing to NOW! | dev.to | 2026-02-06

    Main: https://github.com/ray-project/ray ⭐ 34k+

  9. system_prompts_leaks

    Extracted system prompts from Anthropic - Claude Code, Claude Design, Opus 4.8, Sonnet 4.6. OpenAI - ChatGPT 5.5 Thinking, GPT 5.5 Instant, Codex, Google - Gemini - 3.5 Flash, 3.1 Pro, Antigravity, xAI - Grok, Cursor, Copilot, VS Code, Perplexity, and more. Updated regularly.

    Project mention: System_prompts_leaks: Anthropic/Claude-Opus-4.6.md | news.ycombinator.com | 2026-04-06
  10. langextract

    A Python library for extracting structured information from unstructured text using LLMs with precise source grounding and interactive visualization.

    Project mention: All Data and AI Weekly #203: 18-Aug-2025 | dev.to | 2025-08-18

    langextract: A tool for extracting language information.

  11. LightRAG

    [EMNLP2025] "LightRAG: Simple and Fast Retrieval-Augmented Generation"

    Project mention: Show HN: Query years of Ask HN and Show HN discussions as local knowledge graph | news.ycombinator.com | 2026-05-10

    I built lightrag-snkv. It uses LightRAG (https://github.com/HKUDS/LightRAG), which requires several storage backends: a key-value store, a graph database, and a vector database. I built a single embedded, file-based database that covers all of these requirements: https://github.com/hash-anu/snkv.

  12. langfuse

    🪢 Open source LLM engineering platform: LLM Observability, metrics, evals, prompt management, playground, datasets. Integrates with OpenTelemetry, Langchain, OpenAI SDK, LiteLLM, and more. 🍊YC W23

    Project mention: Three Budget-Guardrail Failure Modes That Matter More Than Model Quality (May 2026) | dev.to | 2026-05-19

    Source: https://github.com/langfuse/langfuse/issues/12614 (open, updated 2026-05-14)

  13. storm

    An LLM-powered knowledge curation system that researches a topic and generates a full-length report with citations.

  14. awesome-generative-ai-guide

    A one stop repository for generative AI research updates, interview resources, notebooks and much more!

    Project mention: 10 Best AI Engineering GitHub Repos to Build Real Systems | dev.to | 2026-01-07

    A one-stop repo for GenAI research updates, notebooks, interview prep, and more. Great for staying current while practicing with solid reference materials you can trust. Link: https://github.com/aishwaryanr/awesome-generative-ai-guide

  15. Hands-On-Large-Language-Models

    Official code repo for the O'Reilly Book - "Hands-On Large Language Models"

    Project mention: 10 Best AI Engineering GitHub Repos to Build Real Systems | dev.to | 2026-01-07

    It’s the full code from the book, with notebooks covering LLM basics, training, and fine-tuning. If you like a guided, notebook-first path from foundations to customization, this feels like a friendly trail map. Link: https://github.com/HandsOnLLM/Hands-On-Large-Language-Models

  16. agentscope

    Build and run agents you can see, understand and trust.

    Project mention: All Data and AI Weekly #207: 15 Sept 2025 | dev.to | 2025-09-15

    GitHub Link: https://github.com/agentscope-ai/agentscope Summary: Agentscope is an agent-oriented programming library that makes it easier to build LLM applications. It's designed to be "developer-centric" with features like asynchronous execution, parallel tool calls, and real-time steering. It offers a transparent approach where prompt engineering and API invocation are fully visible and controllable. Why it's important: Agentscope, along with its related libraries like agentscope-runtime and agentscope-studio, provides a comprehensive toolkit for not only developing but also deploying and visualizing agent-based applications.

  17. haystack

    Open-source AI orchestration framework for building context-engineered, production-ready LLM applications. Design modular pipelines and agent workflows with explicit control over retrieval, routing, memory, and generation. Built for scalable agents, RAG, multimodal applications, semantic search, and conversational systems.

    Project mention: Show HN: Haystack – Review pull requests like you wrote them yourself | news.ycombinator.com | 2025-09-11

    I immediately thought this was an update by Deepset and their Haystack framework. https://haystack.deepset.ai/

    Just FYI.

  18. Qwen

    The official repo of Qwen (通义千问) chat & pretrained large language model proposed by Alibaba Cloud.

  19. FinGPT

    FinGPT: Open-Source Financial Large Language Models! Revolutionize 🔥 We release the trained model on HuggingFace.

  20. Chinese-LLaMA-Alpaca

    Chinese LLaMA & Alpaca large language models, with local CPU/GPU training and deployment (Chinese LLaMA & Alpaca LLMs)

  21. ml-engineering

    Machine Learning Engineering Open Book

    Project mention: Real-time Nvidia GPU dashboard | news.ycombinator.com | 2025-10-06

    For kernel-level performance tuning you can use the occupancy calculator as pointed out by jplusqualt or you can profile your kernel with Nsight compute which will give you a ton of info.

    But for model-wide performance, you basically have to come up with your own calculation to estimate the FLOPs required by your model and based on that figure out how well your model is maxing out the GPU capabilities (MFU/HFU).

    Here is a more in-depth example on how you might do this: https://github.com/stas00/ml-engineering/tree/master/trainin...

  22. Awesome-Multimodal-Large-Language-Models

    ✨✨ Latest Advances on Multimodal Large Language Models

  23. camel

    🐫 CAMEL: The first and the best multi-agent framework. Finding the Scaling Law of Agents. https://www.camel-ai.org

    Project mention: Eigent: An open source Claude cowork alternative | news.ycombinator.com | 2026-01-14

    You can give it a try; almost all SOTA models are supported, all thanks to https://github.com/camel-ai/camel

  24. generative-ai

    Sample code and notebooks for Generative AI on Google Cloud, with Gemini Enterprise Agent Platform

    Project mention: Gemini Embedding 2: Our first natively multimodal embedding model | dev.to | 2026-03-10

    Learn how to use the model in our interactive Gemini API and Vertex AI Colab notebooks. You can also use it through LangChain, LlamaIndex, Haystack, Weaviate, QDrant, ChromaDB, and Vector Search.
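
The ml-engineering mention above describes estimating the FLOPs a model requires and comparing that figure against what the GPU can deliver (MFU/HFU). A minimal sketch of that arithmetic follows. It assumes the common approximation of about 6 FLOPs per parameter per token for training a dense transformer (forward plus backward pass). The function names, the throughput number, and the peak-FLOPs figure are illustrative assumptions, not values taken from the linked repository. HFU would additionally count recomputed operations (for example, from activation checkpointing), which this sketch does not model.

```python
def training_flops_per_token(num_params: float) -> float:
    """Approximate training FLOPs per token for a dense transformer.

    Assumption: ~6 FLOPs per parameter per token (2 for the forward pass,
    4 for the backward pass). Attention FLOPs are ignored in this estimate.
    """
    return 6.0 * num_params


def model_flops_utilization(
    num_params: float,
    tokens_per_second: float,
    num_gpus: int,
    peak_flops_per_gpu: float,
) -> float:
    """Return MFU: achieved model FLOPs/s divided by aggregate peak FLOPs/s."""
    achieved = training_flops_per_token(num_params) * tokens_per_second
    peak = num_gpus * peak_flops_per_gpu
    return achieved / peak


if __name__ == "__main__":
    # Hypothetical run: a 7B-parameter model processing 25,000 tokens/s
    # across 8 GPUs. The peak value is an example figure; use the number
    # from your hardware's datasheet for the precision you train in.
    mfu = model_flops_utilization(
        num_params=7e9,
        tokens_per_second=25_000,
        num_gpus=8,
        peak_flops_per_gpu=312e12,
    )
    print(f"MFU: {mfu:.1%}")  # prints "MFU: 42.1%"
```

Measured throughput (tokens per second) is the only input that comes from the training run itself; everything else is a property of the model or the hardware, which is why this kind of estimate is usually computed once per configuration.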

NOTE: The open-source projects on this list are ordered by number of GitHub stars. The number of mentions indicates repo mentions in the last 12 months or since we started tracking (Dec 2020).

large-language-models related posts

  • Gemma 4 12B: A unified, encoder-free multimodal model

    3 projects | news.ycombinator.com | 3 Jun 2026
  • How to track LLM costs per customer in production

    4 projects | dev.to | 2 Jun 2026
  • Train LLMs from Scratch, Hermes Agent WebUI, & Efficient OlmoEarth v1.1 for Local AI

    2 projects | dev.to | 31 May 2026
  • I scanned Langfuse. It observes its own LLM calls through its own platform.

    1 project | dev.to | 28 May 2026
  • DeepSeek Sparse Attention

    1 project | news.ycombinator.com | 24 May 2026
  • Per-user cost attribution for your AI APP

    1 project | dev.to | 21 May 2026
  • Orthrus-Qwen3: up to 7.8×tokens/forward on Qwen3, identical output distribution

    2 projects | news.ycombinator.com | 15 May 2026

Index

What are some of the best open-source large-language-model projects? This list will help you:

 #  Project                                    Stars
 1  langflow                                   149,263
 2  LLMs-from-scratch                           96,593
 3  llm-course                                  79,907
 4  LlamaFactory                                71,870
 5  gpt_academic                                70,836
 6  Flowise                                     53,317
 7  Ray                                         42,791
 8  system_prompts_leaks                        41,269
 9  langextract                                 36,808
10  LightRAG                                    36,193
11  langfuse                                    28,520
12  storm                                       28,323
13  awesome-generative-ai-guide                 26,976
14  Hands-On-Large-Language-Models              26,813
15  agentscope                                  26,238
16  haystack                                    25,466
17  Qwen                                        21,244
18  FinGPT                                      20,392
19  Chinese-LLaMA-Alpaca                        18,949
20  ml-engineering                              18,044
21  Awesome-Multimodal-Large-Language-Models    17,850
22  camel                                       17,122
23  generative-ai                               16,986
