contextgem vs LLMStack

| | contextgem | LLMStack |
| --- | --- | --- |
| Mentions | 3 | 23 |
| Stars | 1,248 | 1,983 |
| Growth | 17.8% | 1.8% |
| Activity | 8.9 | 9.9 |
| Latest commit | 8 days ago | 7 months ago |
| Language | Python | Python |
| License | Apache License 2.0 | GNU General Public License v3.0 or later |
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
contextgem
-
Transform DOCX into LLM-ready data
As part of work on my open-source project ContextGem, I've built a native, zero-dependency DOCX converter that transforms Word documents into LLM-ready data.
This custom-built converter processes Word XML directly, provides comprehensive content extraction, and covers what other open-source tools often miss or lack support for:
- Rich paragraph and sentence metadata for enhanced context
- Misaligned tables
- Comments, footnotes, and textboxes
- Embedded images
The converted document can then be easily used in ContextGem's LLM extraction workflows.
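For the curious, here is a minimal illustration of what "directly processes Word XML" means (a toy sketch, not ContextGem's actual converter): a .docx file is a ZIP archive whose main body lives in word/document.xml, where paragraphs are `<w:p>` elements containing `<w:t>` text runs.

```python
# Toy sketch of native DOCX parsing with the standard library alone.
import io
import zipfile
import xml.etree.ElementTree as ET

W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"

def extract_paragraphs(docx_bytes: bytes) -> list[str]:
    """Return the plain text of each paragraph in the document body."""
    with zipfile.ZipFile(io.BytesIO(docx_bytes)) as zf:
        root = ET.fromstring(zf.read("word/document.xml"))
    paragraphs = []
    for p in root.iter(f"{{{W_NS}}}p"):
        text = "".join(t.text or "" for t in p.iter(f"{{{W_NS}}}t"))
        if text:
            paragraphs.append(text)
    return paragraphs

# Build a tiny in-memory .docx to demonstrate.
doc_xml = (
    f'<w:document xmlns:w="{W_NS}"><w:body>'
    "<w:p><w:r><w:t>Hello</w:t></w:r></w:p>"
    "<w:p><w:r><w:t>World</w:t></w:r></w:p>"
    "</w:body></w:document>"
)
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    zf.writestr("word/document.xml", doc_xml)

print(extract_paragraphs(buf.getvalue()))  # ['Hello', 'World']
```

A real converter additionally walks tables, comments, footnotes, textboxes, and image relationships, which is where most open-source tools fall short.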
Perfect for developers building contract intelligence applications where precision matters. The converter preserves document structure and relationships, empowering LLMs to better understand and analyze document content.
Try it / share with your dev team today and see the difference in your document processing pipeline!
GitHub: https://github.com/shcherbak-ai/contextgem
All DocxConverter features: https://contextgem.dev/converters/docx.html
-
I Built an Open-Source Framework to Make LLM Data Extraction Dead Simple
After getting tired of writing endless boilerplate to extract structured data from documents with LLMs, I built ContextGem - a free, open-source framework that makes this radically easier.
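To illustrate the kind of boilerplate being replaced (a hedged sketch; the schema and LLM reply are made up for illustration): prompt an LLM for JSON, then parse and validate every field yourself, for every document type.

```python
# Hand-rolled extraction boilerplate: parse the model's JSON reply and
# check each expected field's type. Schema is hypothetical.
import json

SCHEMA = {"party_names": list, "effective_date": str}

def parse_extraction(raw: str) -> dict:
    """Parse an LLM's JSON reply and check it against the expected schema."""
    data = json.loads(raw)
    for field, typ in SCHEMA.items():
        if not isinstance(data.get(field), typ):
            raise ValueError(f"missing or mistyped field: {field}")
    return data

# Simulated LLM reply:
reply = '{"party_names": ["Acme Corp", "Widget LLC"], "effective_date": "2024-01-15"}'
print(parse_extraction(reply))
```

Multiply this by retries, prompt construction, and chunking logic, and the appeal of a framework that handles it becomes clear.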
- ContextGem: Easier and faster way to build LLM extraction workflows
LLMStack
-
Show HN: Langrocks – tools like computer access, browser etc., for LLM agents
We built tools like a web browser and code interpreter needed for LLM agents as part of our LLMStack project, and have now moved them into a single collection called langrocks. We're using this in LLMStack and thought others might find it useful. https://github.com/trypromptly/LLMStack/blob/main/llmstack/p... shows how we use langrocks with Anthropic's Claude with computer use to automate web browsing.
The coolest part is watching an LLM actually use a computer - you get a unique URL to view the virtual display, so you can see exactly what it's doing with tools like computer access and web browser. We've used this to automate complex workflows where the LLM needs to research across multiple sites, interact with web apps, or perform system operations.
-
Llama 3.1 Official Launch
You can already run these models locally with Ollama (`ollama run llama3.1:latest`), as well as on platforms like Hugging Face and Groq.
If you want a playground to test this model locally or want to quickly build some applications with it, you can try LLMStack (https://github.com/trypromptly/LLMStack). I wrote last week about how to configure and use Ollama with LLMStack at https://docs.trypromptly.com/guides/using-llama3-with-ollama.
Disclaimer: I'm the maintainer of LLMStack
-
txtai: Open-source vector search and RAG for minimalists
You can actually do this with LLMStack (https://github.com/trypromptly/LLMStack) quite easily in a no-code way. It lets you load all your files as a datasource and then build a RAG app over it. I put together a guide last week on using LLMStack with Ollama for local models: https://docs.trypromptly.com/guides/using-llama3-with-ollama.
For now it still uses OpenAI for embedding generation by default; we are updating that in the next couple of releases to support a local model for embedding generation before writing to a vector database.
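The core retrieval step such a pipeline performs can be sketched in a few lines (a toy bag-of-words embedding stands in for a real embedding model, whether OpenAI or a local one):

```python
# Minimal RAG retrieval: embed chunks and query, return the closest chunks.
import math
from collections import Counter

def embed(text: str) -> Counter:
    """Toy bag-of-words 'embedding'; a real pipeline calls an embedding model."""
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query: str, chunks: list[str], k: int = 2) -> list[str]:
    """Return the k chunks most similar to the query."""
    q = embed(query)
    return sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)[:k]

chunks = [
    "Ollama runs large language models locally.",
    "Vector databases store embeddings for similarity search.",
    "LLMStack is a low-code platform for LLM apps.",
]
print(retrieve("run models locally with ollama", chunks, k=1))
```

In a production pipeline the embeddings come from a model and live in a vector database; the retrieve-then-generate shape stays the same.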
Disclosure: I'm the maintainer of LLMStack project
-
Vanna.ai: Chat with your SQL database
We have recently added support to query data from SingleStore to our agent framework, LLMStack (https://github.com/trypromptly/LLMStack). Out-of-the-box performance when prompting with just the table schemas is pretty good with GPT-4.
The more domain-specific knowledge a query requires, the harder it has gotten in general. We've had good success `teaching` the model different concepts related to the dataset; giving it example questions and queries greatly improved performance.
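The prompting approach described above can be sketched as follows (the schema and example pair are made up for illustration): give the model the table schemas plus a few example question/query pairs before the user's question.

```python
# Few-shot text-to-SQL prompt builder: schemas, then example Q/SQL pairs,
# then the actual question.
def build_sql_prompt(schemas: list[str],
                     examples: list[tuple[str, str]],
                     question: str) -> str:
    parts = ["You write SQL for the following tables:"]
    parts += schemas
    for q, sql in examples:
        parts.append(f"Q: {q}\nSQL: {sql}")
    parts.append(f"Q: {question}\nSQL:")
    return "\n\n".join(parts)

prompt = build_sql_prompt(
    schemas=["CREATE TABLE orders (id INT, total DECIMAL, created_at DATE)"],
    examples=[("Total revenue last month?",
               "SELECT SUM(total) FROM orders WHERE created_at >= DATE '2024-04-01'")],
    question="How many orders were placed today?",
)
print(prompt)
```

Each added example pair is a cheap way to encode domain knowledge the schema alone doesn't carry.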
-
FFmpeg Lands CLI Multi-Threading as Its "Most Complex Refactoring" in Decades
This will hopefully improve the startup times for FFmpeg when streaming from virtual display buffers. We use FFmpeg in LLMStack (low-code framework to build and run LLM agents) to stream browser video. We use playwright to automate browser interactions and provide that as tool to the LLM. When this tool is invoked, we stream the video of these browser interactions with FFmpeg by streaming the virtual display buffer the browser is using.
There is a noticeable delay booting up this pipeline for each tool invoke right now. We are working on putting in some optimizations but improvements in FFmpeg will definitely help. https://github.com/trypromptly/LLMStack is the project repo for the curious.
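The capture pipeline described above looks roughly like this (a sketch; the display number, resolution, and output URL are placeholders, not LLMStack's actual invocation):

```python
# Build an ffmpeg command that grabs frames from an X11 virtual display
# (e.g. an Xvfb framebuffer on :99) and streams them out with low latency.
def ffmpeg_x11grab_cmd(display: str, size: str, fps: int, out_url: str) -> list[str]:
    return [
        "ffmpeg",
        "-f", "x11grab",          # capture from an X display
        "-video_size", size,
        "-framerate", str(fps),
        "-i", display,            # e.g. ":99" for a virtual framebuffer
        "-c:v", "libx264",
        "-preset", "ultrafast",   # minimize encoding latency
        "-tune", "zerolatency",
        "-f", "flv", out_url,
    ]

cmd = ffmpeg_x11grab_cmd(":99", "1280x720", 25, "rtmp://localhost/live/browser")
print(" ".join(cmd))
```

The per-invoke delay comes from spinning up this whole process (and its encoder) for every tool call, which is why faster FFmpeg startup matters here.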
-
Show HN: IncarnaMind-Chat with your multiple docs using LLMs
We built https://github.com/trypromptly/LLMStack to serve exactly this persona. A low-code platform to quickly build RAG pipelines and other LLM applications.
-
A Comprehensive Guide for Building Rag-Based LLM Applications
Kudos to the team for a very detailed notebook that goes into things like pipeline evaluation with respect to performance and cost. Even if we ignore the framework-specific bits, it is a great guide to follow when building RAG systems in production.
We have been building RAG systems in production for a few months and have been tinkering with different strategies to get the most performance out of these pipelines. As others have pointed out, a vector database may not be the right strategy for every problem. Similarly, there are issues like the "lost in the middle" problem (https://arxiv.org/abs/2307.03172) that one may have to deal with. We put together our learnings from building and optimizing these pipelines in a post at https://llmstack.ai/blog/retrieval-augmented-generation.
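One common mitigation for the lost-in-the-middle effect (a sketch of the general technique, not our exact pipeline): reorder retrieved chunks so the most relevant ones sit at the start and end of the context window, where models attend best.

```python
# Reorder relevance-ranked chunks (best first) so top results land at the
# edges of the context and the weakest land in the middle.
def reorder_for_context(chunks: list[str]) -> list[str]:
    front, back = [], []
    for i, chunk in enumerate(chunks):
        (front if i % 2 == 0 else back).append(chunk)
    return front + back[::-1]

ranked = ["best", "second", "third", "fourth", "fifth"]
print(reorder_for_context(ranked))  # ['best', 'third', 'fifth', 'fourth', 'second']
```

The most relevant chunk stays first and the second-most relevant moves to the end, pushing the least useful material toward the middle of the prompt.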
https://github.com/trypromptly/LLMStack is a low-code platform we open-sourced recently that ships these RAG pipelines out of the box with some app templates if anyone wants to try them out.
-
Building a Blog in Django
Django has been my go-to framework for any new web project I start for more than a decade. Its batteries-included approach means you can go pretty far with Django alone; the included admin interface and the views/templating setup were what first drew me to the project.
The Django project itself has kept pace with recent developments in web development. I still remember migrations being an external project, getting merged in, and the transition that followed. The ecosystem is pretty powerful too, with projects like drf, channels, and social-auth covering most of what we need to run in production.
https://github.com/trypromptly/LLMStack is a recent project I built entirely with Django. It uses django channels for websockets, drf for API and reactjs for the frontend.
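Django Channels builds on the ASGI interface; a minimal dependency-free ASGI websocket echo handler sketches what the consumer layer is doing underneath (illustrative only, not LLMStack's actual consumer):

```python
# Minimal ASGI websocket app: accept the connection, echo text frames back.
import asyncio

async def websocket_echo(scope, receive, send):
    assert scope["type"] == "websocket"
    await send({"type": "websocket.accept"})
    while True:
        event = await receive()
        if event["type"] == "websocket.disconnect":
            break
        if event["type"] == "websocket.receive" and "text" in event:
            await send({"type": "websocket.send", "text": event["text"]})

# Drive the app with scripted events to demonstrate; no server needed.
async def demo():
    inbox = [
        {"type": "websocket.receive", "text": "hello"},
        {"type": "websocket.disconnect"},
    ]
    sent = []

    async def receive():
        return inbox.pop(0)

    async def send(event):
        sent.append(event)

    await websocket_echo({"type": "websocket"}, receive, send)
    return sent

print(asyncio.run(demo()))
```

Channels wraps this event protocol in consumer classes, routing, and channel layers, which is what makes it pleasant to pair with the rest of Django.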
-
Show HN: Rivet – open-source AI Agent dev env with real-world applications
We recently open-sourced a similar platform for building workflows by chaining LLMs visually, along with LocalAI support.
Check it out at https://github.com/trypromptly/LLMStack. Like you said, LocalAI was fairly easy to integrate, and it's a great project.
-
Show HN: Retool AI
Would you mind expanding why it was tough to get started with Retool?
We are building https://github.com/trypromptly/LLMStack, a low-code platform to build LLM apps with a goal of making it easy for non-tech people to leverage LLMs in their workflows. Would love to learn about your experience with retool and incorporate some of that feedback into LLMStack.
What are some alternatives?
sdk - Lightfeed SDK to search and filter web data
anything-llm - The all-in-one Desktop & Docker AI application with built-in RAG, AI agents, No-code agent builder, MCP compatibility, and more.
validex - Simplifies the retrieval, extraction, and training of structured data from various unstructured sources.
langflow - Langflow is a powerful tool for building and deploying AI-powered agents and workflows.
dn-institute - Distributed Networks Institute
sample-app-aoai-chatGPT - Sample code for a simple web chat experience through Azure OpenAI, including Azure OpenAI On Your Data.