DeepSeek-Coder vs llama3

| | DeepSeek-Coder | llama3 |
|---|---|---|
| Mentions | 8 | 21 |
| Stars | 5,728 | 22,014 |
| Stars growth (month over month) | 7.2% | 20.5% |
| Activity | 8.5 | 9.0 |
| Latest commit | 24 days ago | 7 days ago |
| Language | Python | Python |
| License | MIT License | GNU General Public License v3.0 or later |
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
DeepSeek-Coder
-
Meta Llama 3
deepseek-coder-instruct 6.7B still looks like it is better than Llama 3 8B on HumanEval [0], and deepseek-coder-instruct 33B is still within reach to run on a 32 GB MacBook M2 Max. Llama 3 70B, on the other hand, will be hard to run locally unless you really have 128 GB of RAM or more. But we will see in the coming days how it performs in real life.
[0] https://github.com/deepseek-ai/deepseek-coder?tab=readme-ov-...
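As a rough sanity check on those RAM claims, here is a minimal weight-only footprint sketch (quantization levels are assumptions; KV cache and runtime overhead are excluded, so real usage is higher):

```python
# Rough weight-only memory footprint; actual usage adds KV cache and overhead.
def weight_gb(params_billion: float, bits_per_weight: int) -> float:
    """Size of the raw weights in GB for a model with the given parameter count."""
    return params_billion * 1e9 * bits_per_weight / 8 / 1e9

# 33B at 4-bit quantization: 16.5 GB -> plausibly fits on a 32 GB machine.
# 70B at 4-bit: 35 GB; at fp16: 140 GB -> hence the "128 GB of RAM or more" remark.
print(weight_gb(33, 4), weight_gb(70, 4), weight_gb(70, 16))
```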
-
Mistral removes "Committing to open models" from their website
Deepseek (https://github.com/deepseek-ai/DeepSeek-Coder?tab=readme-ov-...) code is MIT and the model license is available too.
- FLaNK Stack 05 Feb 2024
-
Stable Code 3B: Coding on the Edge
https://github.com/deepseek-ai/deepseek-coder
33B Instruct doesn't beat 6.7B Instruct by much, but maybe those percentage improvements mean more for your usage.
I run 6.7B since I have 16GB RAM.
-
What the heck is so great about this model?
Deepseek Coder: https://github.com/deepseek-ai/DeepSeek-Coder (Best open source coding model right now)
- Deepseek Coder instruct – 6.7B model beats gpt3.5-turbo in coding
- FLaNK Stack Weekly for 13 November 2023
- DeepSeek-Coder: Has anyone tried this one?
llama3
-
How Meta trains large language models at scale
and deceptive, if not inaccurate. Meta's model cards specifically call out that the models were trained on publicly available datasets and NOT on any Meta user data.
For example: https://github.com/meta-llama/llama3/blob/main/MODEL_CARD.md
- Reproduce GPT-2 (124M) in llm.c in 90 minutes for $20
-
Hugging Face is sharing $10M worth of compute to help beat the big AI companies
I was curious so I tried to answer this question
---
Training Llama 3 models emitted 2290 tons CO2e (https://github.com/meta-llama/llama3/blob/main/MODEL_CARD.md), and took 7.7 million GPU hours. Those GPU hours are for H100s, which consume 700W. So the conversion is approximately 2290 / (7.7e6 * 3600 * 700 / 1e9) ~= 0.12 tons CO2e per GPU-gigajoule.
A100s (what Hugging Face offers) consume 400W (https://www.nvidia.com/content/dam/en-zz/Solutions/Data-Cent...) and cost $2.21/hour (e.g. on CoreWeave, https://www.coreweave.com/gpu-cloud-pricing). So $10 million in A100s buys you ($10e6 / $2.21/h * 3600s/h) * 400W ~= 6516 gigajoules of GPU energy.
So Huggingface's offering will emit ~781 tons CO2e. Less if they've inflated the value of the compute they provide, which they have an incentive to do, but let's round to 800 tons.
---
According to https://www.carbonindependent.org/22.html, one Boeing 737-400 flying 926 km emits 3.61 tons of fuel/flight * 3.15 (g CO2e / g fuel) = 11.37 tons CO2e.
So $10 million in compute is like ~70 Boeing 737-400 international flights.
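The arithmetic above can be re-run directly. A quick sketch using only the figures quoted in the comment (none of them independently verified here):

```python
# Reproduce the back-of-envelope CO2e estimate using the comment's own figures.
LLAMA3_CO2E_TONS = 2290        # reported training emissions for Llama 3
LLAMA3_GPU_HOURS = 7.7e6       # reported H100 GPU-hours
H100_WATTS, A100_WATTS = 700, 400
A100_PRICE_PER_HOUR = 2.21     # CoreWeave list price cited above
BUDGET_USD = 10e6

energy_gj = LLAMA3_GPU_HOURS * 3600 * H100_WATTS / 1e9   # ~19,404 GJ of training energy
tons_per_gj = LLAMA3_CO2E_TONS / energy_gj               # ~0.118 t CO2e per GPU-gigajoule

budget_gj = (BUDGET_USD / A100_PRICE_PER_HOUR) * 3600 * A100_WATTS / 1e9  # ~6,516 GJ
est_tons = tons_per_gj * budget_gj                       # ~769 t CO2e
flights = est_tons / 11.37                               # ~68 Boeing 737-400 flights
```

Carrying the unrounded 0.118 t/GJ through gives ~769 tons rather than ~781, so the flight count lands around 68-70 either way.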
-
International Scientific Report on the Safety of Advanced AI [pdf]
> It takes years to become competent at the math needed for AI
(Assuming that "AI" refers to large language models)
The best open source LLM fits in less than 300 lines of code and consists mostly of matrix multiplications. https://github.com/meta-llama/llama3/blob/main/llama/model.p...
Anyone with a basic grasp of linear algebra can probably learn to understand it in a week.
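For a sense of what "mostly matrix multiplications" means, here is a toy single-head self-attention in NumPy (dimensions invented for the example; this is not the real Llama code, just the shape of the idea):

```python
import numpy as np

def attention(x, wq, wk, wv):
    """Single-head self-attention: essentially five matrix multiplications."""
    q, k, v = x @ wq, x @ wk, x @ wv                 # project tokens to Q, K, V
    scores = q @ k.T / np.sqrt(k.shape[-1])          # scaled dot-product scores
    probs = np.exp(scores - scores.max(-1, keepdims=True))
    probs /= probs.sum(-1, keepdims=True)            # softmax over keys
    return probs @ v                                 # weighted sum of values

rng = np.random.default_rng(0)
x = rng.standard_normal((4, 8))                      # 4 tokens, embedding dim 8
out = attention(x, *(rng.standard_normal((8, 8)) for _ in range(3)))
```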
-
Llama3.np: pure NumPy implementation of Llama3
From the readme [0]:
> All models support sequence length up to 8192 tokens, but we pre-allocate the cache according to max_seq_len and max_batch_size values. So set those according to your hardware.
[0] https://github.com/meta-llama/llama3/tree/14aab0428d3ec3a959...
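A sketch of why those two values matter: the KV cache is allocated up front at its maximum size, so the memory is committed whether or not you use the full sequence length. Shapes follow the (batch, seq, heads, head_dim) layout; the head counts below are the 8B model's, used here purely for illustration:

```python
import numpy as np

# Pre-allocated per-layer KV cache, fp16, shaped (batch, seq, heads, head_dim).
max_batch_size, max_seq_len = 1, 8192
n_heads, head_dim = 32, 128

cache_k = np.zeros((max_batch_size, max_seq_len, n_heads, head_dim), dtype=np.float16)
cache_v = np.zeros_like(cache_k)

# 128 MiB per layer at these settings; roughly 4 GiB across 32 layers.
bytes_per_layer = cache_k.nbytes + cache_v.nbytes
```

Shrinking max_seq_len or max_batch_size shrinks this allocation proportionally, which is why the readme tells you to set them to match your hardware.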
-
Hindi-Language AI Chatbot for Enterprises Using Qdrant, MLFlow, and LangChain
Now, let's start building the next part of the chatbot. In this part, we will use an LLM from Ollama and integrate it with the chatbot. In particular, we will use the Llama 3 model. Llama 3 is Meta's latest and most advanced open-source large language model (LLM). It is the successor to Llama 2 and represents a significant improvement in performance across a variety of benchmarks and tasks. Llama 3 comes in two main versions: an 8-billion-parameter model and a 70-billion-parameter model. It supports context lengths of up to 8,000 tokens.
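Underneath the LangChain integration, Ollama exposes a local REST API. A minimal sketch of calling it directly from Python (assumes an Ollama server is running on the default port 11434 and the llama3 model has already been pulled):

```python
import json
import urllib.request

def build_ollama_request(prompt: str, model: str = "llama3") -> urllib.request.Request:
    """Build a non-streaming request for Ollama's /api/generate endpoint."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        "http://localhost:11434/api/generate",
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
    )

req = build_ollama_request("Explain vector databases in one sentence.")
# With a running server:
# answer = json.load(urllib.request.urlopen(req))["response"]
```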
- FLaNK AI-April 22, 2024
- Meta Llama 3 GitHub
- Mark Zuckerberg himself appears in the list of direct contributors to Llama 3
- Mark Zuckerberg: Llama 3, $10B Models, Caesar Augustus, Bioweapons [video]
What are some alternatives?
draw-a-ui - Draw a mockup and generate html for it
promptfoo - Test your prompts. Evaluate and compare LLM outputs, catch regressions, and improve prompt quality. [Moved to: https://github.com/promptfoo/promptfoo]
FT-Merge-Quantize-Infer-CML
llm - Access large language models from the command-line
cucim - cuCIM - RAPIDS GPU-accelerated image processing library
text-generation-inference - Large Language Model Text Generation Inference
wubloader
llama - Inference code for Llama models
linen.dev - Lightweight Google-searchable Slack alternative for Communities
incubator-xtable - Apache XTable (incubating) is a cross-table converter for lakehouse table formats that facilitates interoperability across data processing systems and query engines.
clipea - 📎🟢 Like Clippy but for the CLI. A blazing fast AI helper for your command line
FLiPStackWeekly - FLaNK AI Weekly covering Apache NiFi, Apache Flink, Apache Kafka, Apache Spark, Apache Iceberg, Apache Ozone, Apache Pulsar, and more...