dify vs llama2.c

| | dify | llama2.c |
|---|---|---|
| Mentions | 12 | 13 |
| Stars | 25,645 | 15,942 |
| Growth | 29.1% | - |
| Activity | 9.9 | 9.2 |
| Latest Commit | 3 days ago | 5 days ago |
| Language | TypeScript | C |
| License | GNU General Public License v3.0 or later | MIT License |
Stars - the number of stars a project has on GitHub. Growth - month-over-month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
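The exact weighting isn't published, but the idea of discounting older commits is easy to illustrate. A minimal sketch in C, assuming a hypothetical exponential decay with a made-up half-life (nothing here reflects the site's real formula):

```c
/* Illustrative only: one plausible way to weight recent commits more
 * heavily than old ones, via exponential decay. The half-life is a
 * made-up parameter, not the site's actual formula. */
#include <math.h>

/* ages_in_days[i] = how many days ago commit i landed */
double activity_score(const double *ages_in_days, int n_commits,
                      double half_life_days) {
    double score = 0.0;
    for (int i = 0; i < n_commits; i++)
        score += pow(0.5, ages_in_days[i] / half_life_days);
    return score; /* higher means more recent activity */
}
```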
dify
- FLaNK AI Weekly for 29 April 2024
- Dify, a visual workflow to build/test LLM applications
> https://github.com/langgenius/dify/blob/main/LICENSE
everyone is apparently a license pioneer
- Dify, an end-to-end, visualized workflow to build/test LLM applications
- GreptimeAI + Xinference - Efficient Deployment and Monitoring of Your LLM Applications
Xorbits Inference (Xinference) is an open-source platform to streamline the operation and integration of a wide array of AI models. With Xinference, you’re empowered to run inference using any open-source LLMs, embedding models, and multimodal models either in the cloud or on your own premises, and create robust AI-driven applications. It provides a RESTful API compatible with OpenAI API, Python SDK, CLI, and WebUI. Furthermore, it integrates third-party developer tools like LangChain, LlamaIndex, and Dify, facilitating model integration and development.
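"Compatible with the OpenAI API" means any OpenAI-style client can talk to it over HTTP. As a rough sketch, here is a chat-completions request in C with libcurl; the host, port, and model name are placeholder assumptions, not values from the post:

```c
/* Hedged sketch: POSTing to an OpenAI-compatible chat-completions
 * endpoint such as the one Xinference exposes. The URL and model id
 * below are placeholders; adjust them for your deployment. */
#include <stdio.h>
#include <curl/curl.h>

int main(void) {
    CURL *curl = curl_easy_init();
    if (!curl) return 1;

    const char *body =
        "{\"model\": \"my-local-llm\","      /* placeholder model id */
        " \"messages\": [{\"role\": \"user\", \"content\": \"Hello\"}]}";

    struct curl_slist *headers = NULL;
    headers = curl_slist_append(headers, "Content-Type: application/json");

    curl_easy_setopt(curl, CURLOPT_URL,
                     "http://localhost:9997/v1/chat/completions");
    curl_easy_setopt(curl, CURLOPT_HTTPHEADER, headers);
    curl_easy_setopt(curl, CURLOPT_POSTFIELDS, body);

    CURLcode res = curl_easy_perform(curl);  /* response prints to stdout */
    if (res != CURLE_OK)
        fprintf(stderr, "request failed: %s\n", curl_easy_strerror(res));

    curl_slist_free_all(headers);
    curl_easy_cleanup(curl);
    return res == CURLE_OK ? 0 : 1;
}
```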
- Which LLM framework(s) do you use in production and why?
If you are looking to develop Q&A or chat-based apps, check out https://dify.ai. Do a quick check and see if it fits your requirements. You can integrate it with your app using the APIs it provides.
- New Discoveries in No-Code AI App Building with ChatGPT
As an AI newbie, I used to find coding apps from scratch an absolute nightmare! The learning curve was as steep as a ski slope, debugging took endless hours, and developing even a simple AI app nearly drove me insane! But since discovering Dify, my life has totally changed: it enables app development without any coding skills!
- FLaNK Stack Weekly for 14 Aug 2023
- Interesting LLMOps Tools Dify.ai
- Dify.ai – Simply create and operate AI-native apps based on GPT-4
- langgenius/dify: One API for plugins and datasets, one interface for prompt engineering and visual operation, all for creating powerful AI applications.
llama2.c
- Stuff we figured out about AI in 2023
For inference, less than 1 KLOC of pure, dependency-free C is enough (if you include the tokenizer and command-line parsing) [1]. This was a non-obvious fact for me: in principle, you could have run a modern LLM 20 years ago with just 1,000 lines of code, assuming you're fine with things potentially taking days to run, of course.
Training wouldn't be that much harder; Micrograd [2] is 200 LOC of pure Python, so 1,000 lines would probably be enough for training an (extremely slow) LLM. By "extremely slow", I mean that a training run that normally takes hours could probably take dozens of years, but the results would, in principle, be the same.
If you were writing in C instead of Python and used something like llama.cpp's optimization tricks, you could probably get somewhat acceptable training performance in 2 or 3 KLOC. You'd still be off by one or two orders of magnitude compared to a GPU cluster, but a lot better than naive, loopy Python.
[1] https://github.com/karpathy/llama2.c
[2] https://github.com/karpathy/micrograd
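For a sense of why so few lines suffice: inference time is dominated by a couple of tiny, dependency-free kernels repeated over the layers. A minimal sketch of two of them in plain C; the shapes and names are illustrative, not code copied from llama2.c:

```c
/* Illustrative sketch of the kind of standard-library-only kernels
 * that fill most of a ~1 KLOC C inference program like llama2.c. */
#include <math.h>

/* out = W x, with W stored row-major as d rows of n floats.
 * This matrix-vector product is where nearly all the time goes. */
void matmul(float *out, const float *x, const float *w, int n, int d) {
    for (int i = 0; i < d; i++) {
        float sum = 0.0f;
        for (int j = 0; j < n; j++)
            sum += w[i * n + j] * x[j];
        out[i] = sum;
    }
}

/* RMSNorm: scale x by 1/sqrt(mean(x^2) + eps), then by learned weights. */
void rmsnorm(float *out, const float *x, const float *weight, int n) {
    float ss = 0.0f;
    for (int i = 0; i < n; i++)
        ss += x[i] * x[i];
    float scale = 1.0f / sqrtf(ss / n + 1e-5f);
    for (int i = 0; i < n; i++)
        out[i] = weight[i] * (x[i] * scale);
}
```

Everything else (attention, the feed-forward block, sampling) is a thin layer of loops over kernels like these, which is why the whole program fits in a single readable file.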
- Minimal neural network implementation
A bit off topic, but ML guru Andrej Karpathy has implemented a state-of-the-art Llama 2 model in plain C with no dependencies on third-party libraries. See the repo.
- WebLLM: Llama2 in the Browser
Related. I compiled karpathy's llama2.c (https://github.com/karpathy/llama2.c) to WASM without modifications and ran it in the browser. It was a fun exercise to directly compare native vs. web performance. I'm getting 80% of native performance on my M1 MacBook Air and haven't spent any time optimizing the WASM side.
Demo: https://diegomarcos.com/llama2.c-web/
Code:
- Lfortran: Modern interactive LLVM-based Fortran compiler
Would be cool for there to be a `llama2.f`, similar to https://github.com/karpathy/llama2.c, to demo its capabilities
- Llama2.c L2E LLM – Multi OS Binary and Unikernel Release
This is a fork of https://github.com/karpathy/llama2.c
karpathy's llama2.c is like llama.cpp, but it is written in C and the Python training code is available in the same repo. llama2.c's goal is to be an elegant single-file C implementation of inference and an elegant Python implementation of training.
His goal is for people to understand how Llama 2 and LLMs work, so he keeps it simple and sweet; features and performance improvements will be added as the project progresses.
Currently it can infer the baby (small) story models trained by Karpathy at a fast pace. It can also infer Meta's Llama 2 7B models, but at a very slow rate, such as 1 token per second.
So currently this can be used for learning or as a tech preview.
Our friendly fork tries to make it portable, performant, and more usable (bells and whistles) over time. Since we mirror upstream closely, the inference capabilities of our fork are similar, but slightly faster if compiled with acceleration. What we try to do differently is make this bootable (not there yet) and portable. Right now you get binary portability: use the same run.com on any x86_64 machine running any OS and it will work (possible thanks to the Cosmopolitan toolchain). The other part that works is unikernels: boot this as a unikernel in VMs (possible thanks to the Unikraft unikernel and toolchain).
See our fork, for now, as a release-early, release-often toy tech demo. We plan to build it out into a useful product.
- FLaNK Stack Weekly for 14 Aug 2023
- Adding LLaMa2.c support for Web with GGML.JS
In my latest release of ggml.js, I've added support for Karpathy's llama2.c model.
- Beginner's Guide to Llama Models
I really enjoyed Andrej Karpathy's llama2.c project (https://github.com/karpathy/llama2.c), which runs through creating and running a miniature Llama 2 architecture model from scratch.
- How to scale LLMs better with an alternative to transformers
- https://github.com/karpathy/llama2.c
I think there may be some applications in this limited space that are worth looking into. You won't replicate GPT-anything, but it may be possible to solve some nice problems much more efficiently than one would expect at first.
- A simple guide to fine-tuning Llama 2
It does now: https://github.com/karpathy/llama2.c#metas-llama-2-models
What are some alternatives?
langchain-llm-katas - An open-source project designed to help you improve your AI engineering skills with LLMs and the langchain library
llama2.c - Llama 2 Everywhere (L2E)
litellm - Call all LLM APIs using the OpenAI format. Use Bedrock, Azure, OpenAI, Cohere, Anthropic, Ollama, Sagemaker, HuggingFace, Replicate (100+ LLMs)
fastGPT - Fast GPT-2 inference written in Fortran
chainlit - Build Conversational AI in minutes ⚡️
CML_AMP_Churn_Prediction_mlflow - Build a scikit-learn model to predict churn using customer telco data.
duet-gpt - A conversational semi-autonomous developer assistant. AI pair programming without the copypasta.
feldera - Feldera Continuous Analytics Platform
IncognitoPilot - An AI code interpreter for sensitive data, powered by GPT-4 or Code Llama / Llama 2.
awesome-data-temporality - A curated list to help you manage temporal data across many modalities 🚀.
jdbc-connector-for-apache-kafka - Aiven's JDBC Sink and Source Connectors for Apache Kafka®
api-for-open-llm - OpenAI-style API for open large language models, letting you use LLMs just like ChatGPT! Supports LLaMA, LLaMA-2, BLOOM, Falcon, Baichuan, Qwen, Xverse, SqlCoder, CodeLLaMA, ChatGLM, ChatGLM2, ChatGLM3, etc. (a unified backend interface for open-source large models)