DeepSeek-Coder vs llama3

| | DeepSeek-Coder | llama3 |
|---|---|---|
| Mentions | 8 | 21 |
| Stars | 5,728 | 22,014 |
| Stars growth (month over month) | 7.2% | 20.5% |
| Activity | 8.5 | 9.0 |
| Latest commit | 24 days ago | 7 days ago |
| Language | Python | Python |
| License | MIT License | GNU General Public License v3.0 or later |
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
DeepSeek-Coder
-
Meta Llama 3
deepseek-coder-instruct 6.7B still looks like it is better than Llama 3 8B on HumanEval [0], and deepseek-coder-instruct 33B is still within reach to run on a 32 GB MacBook M2 Max. Llama 3 70B, on the other hand, will be hard to run locally unless you really have 128 GB of RAM or more. But we will see in the coming days how it performs in real life.
[0] https://github.com/deepseek-ai/deepseek-coder?tab=readme-ov-...
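As a rough sanity check on those RAM claims, here is a minimal weight-only footprint sketch (quantization levels are assumptions; KV cache and runtime overhead are excluded, so real usage is higher):

```python
# Rough weight-only memory footprint; actual usage adds KV cache and overhead.
def weight_gb(params_billion: float, bits_per_weight: int) -> float:
    """Size of the raw weights in GB for a model with the given parameter count."""
    return params_billion * 1e9 * bits_per_weight / 8 / 1e9

# 33B at 4-bit quantization: 16.5 GB -> plausibly fits on a 32 GB machine.
# 70B at 4-bit: 35 GB; at fp16: 140 GB -> hence the "128 GB of RAM or more" remark.
print(weight_gb(33, 4), weight_gb(70, 4), weight_gb(70, 16))
```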
-
Mistral removes "Committing to open models" from their website
Deepseek (https://github.com/deepseek-ai/DeepSeek-Coder?tab=readme-ov-...) code is MIT and the model license is available too.
- FLaNK Stack 05 Feb 2024
-
Stable Code 3B: Coding on the Edge
https://github.com/deepseek-ai/deepseek-coder
33B Instruct doesn't beat 6.7B Instruct by much, but maybe those percentage improvements mean more for your usage.
I run 6.7B since I have 16GB RAM.
-
What the heck is so great about this model?
Deepseek Coder: https://github.com/deepseek-ai/DeepSeek-Coder (Best open source coding model right now)
- Deepseek Coder instruct – 6.7B model beats gpt3.5-turbo in coding
- FLaNK Stack Weekly for 13 November 2023
- DeepSeek-Coder: Has anyone tried this one?
llama3
-
How Meta trains large language models at scale
and deceptive, if not inaccurate. Meta's model cards specifically call out that the models were trained on publicly available datasets and NOT on any Meta user data.
For example: https://github.com/meta-llama/llama3/blob/main/MODEL_CARD.md
- Reproduce GPT-2 (124M) in llm.c in 90 minutes for $20
-
Hugging Face is sharing $10M worth of compute to help beat the big AI companies
I was curious so I tried to answer this question
---
Training Llama 3 models emitted 2290 tons CO2e (https://github.com/meta-llama/llama3/blob/main/MODEL_CARD.md), and took 7.7 million GPU hours. Those GPU hours are for H100s, which consume 700W. So the conversion is approximately 2290 / (7.7e6 * 3600 * 700 / 1e9) ~= 0.12 tons CO2e per GPU-gigajoule.
A100s (what Hugging Face offers) consume 400W (https://www.nvidia.com/content/dam/en-zz/Solutions/Data-Cent...) and cost $2.21/hour (e.g. on CoreWeave, https://www.coreweave.com/gpu-cloud-pricing). So $10 million in A100s buys you ($10e6 / $2.21/h * 3600s/h) * 400W ~= 6516 gigajoules of GPU energy.
So Huggingface's offering will emit ~781 tons CO2e. Less if they've inflated the value of the compute they provide, which they have an incentive to do, but let's round to 800 tons.
---
According to https://www.carbonindependent.org/22.html, one Boeing 737-400 flying 926 km emits 3.61 tons of fuel/flight * 3.15 (g CO2e / g fuel) = 11.37 tons CO2e.
So $10 million in compute is like ~70 Boeing 737-400 international flights.
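The arithmetic above can be re-run directly. A quick sketch using only the figures quoted in the comment (none of them independently verified here):

```python
# Reproduce the back-of-envelope CO2e estimate using the comment's own figures.
LLAMA3_CO2E_TONS = 2290        # reported training emissions for Llama 3
LLAMA3_GPU_HOURS = 7.7e6       # reported H100 GPU-hours
H100_WATTS, A100_WATTS = 700, 400
A100_PRICE_PER_HOUR = 2.21     # CoreWeave list price cited above
BUDGET_USD = 10e6

energy_gj = LLAMA3_GPU_HOURS * 3600 * H100_WATTS / 1e9   # ~19,404 GJ of training energy
tons_per_gj = LLAMA3_CO2E_TONS / energy_gj               # ~0.118 t CO2e per GPU-gigajoule

budget_gj = (BUDGET_USD / A100_PRICE_PER_HOUR) * 3600 * A100_WATTS / 1e9  # ~6,516 GJ
est_tons = tons_per_gj * budget_gj                       # ~769 t CO2e
flights = est_tons / 11.37                               # ~68 Boeing 737-400 flights
```

Carrying the unrounded 0.118 t/GJ through gives ~769 tons rather than ~781, so the flight count lands around 68-70 either way.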
-
International Scientific Report on the Safety of Advanced AI [pdf]
> It takes years to become competent at the math needed for AI
(Assuming that "AI" refers to large language models)
The best open source LLM fits in less than 300 lines of code and consists mostly of matrix multiplications. https://github.com/meta-llama/llama3/blob/main/llama/model.p...
Anyone with a basic grasp of linear algebra can probably learn to understand it in a week.
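For a sense of what "mostly matrix multiplications" means, here is a toy single-head self-attention in NumPy (dimensions invented for the example; this is not the real Llama code, just the shape of the idea):

```python
import numpy as np

def attention(x, wq, wk, wv):
    """Single-head self-attention: essentially five matrix multiplications."""
    q, k, v = x @ wq, x @ wk, x @ wv                 # project tokens to Q, K, V
    scores = q @ k.T / np.sqrt(k.shape[-1])          # scaled dot-product scores
    probs = np.exp(scores - scores.max(-1, keepdims=True))
    probs /= probs.sum(-1, keepdims=True)            # softmax over keys
    return probs @ v                                 # weighted sum of values

rng = np.random.default_rng(0)
x = rng.standard_normal((4, 8))                      # 4 tokens, embedding dim 8
out = attention(x, *(rng.standard_normal((8, 8)) for _ in range(3)))
```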
-
Llama3.np: pure NumPy implementation of Llama3
From the readme [0]:
> All models support sequence length up to 8192 tokens, but we pre-allocate the cache according to max_seq_len and max_batch_size values. So set those according to your hardware.
[0] https://github.com/meta-llama/llama3/tree/14aab0428d3ec3a959...
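A sketch of why those two values matter: the KV cache is allocated up front at its maximum size, so the memory is committed whether or not you use the full sequence length. Shapes follow the (batch, seq, heads, head_dim) layout; the head counts below are the 8B model's, used here purely for illustration:

```python
import numpy as np

# Pre-allocated per-layer KV cache, fp16, shaped (batch, seq, heads, head_dim).
max_batch_size, max_seq_len = 1, 8192
n_heads, head_dim = 32, 128

cache_k = np.zeros((max_batch_size, max_seq_len, n_heads, head_dim), dtype=np.float16)
cache_v = np.zeros_like(cache_k)

# 128 MiB per layer at these settings; roughly 4 GiB across 32 layers.
bytes_per_layer = cache_k.nbytes + cache_v.nbytes
```

Shrinking max_seq_len or max_batch_size shrinks this allocation proportionally, which is why the readme tells you to set them to match your hardware.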
-
Hindi-Language AI Chatbot for Enterprises Using Qdrant, MLFlow, and LangChain
Now, let's start building the next part of the chatbot. In this part, we will use an LLM from Ollama and integrate it with the chatbot. In particular, we will use the Llama 3 model. Llama 3 is Meta's latest and most advanced open-source large language model (LLM). It is the successor to Llama 2 and represents a significant improvement in performance across a variety of benchmarks and tasks. Llama 3 comes in two main versions: an 8-billion-parameter model and a 70-billion-parameter model. It supports context lengths of up to 8,000 tokens.
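Underneath the LangChain integration, Ollama exposes a local REST API. A minimal sketch of calling it directly from Python (assumes an Ollama server is running on the default port 11434 and the llama3 model has already been pulled):

```python
import json
import urllib.request

def build_ollama_request(prompt: str, model: str = "llama3") -> urllib.request.Request:
    """Build a non-streaming request for Ollama's /api/generate endpoint."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        "http://localhost:11434/api/generate",
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
    )

req = build_ollama_request("Explain vector databases in one sentence.")
# With a running server:
# answer = json.load(urllib.request.urlopen(req))["response"]
```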
- FLaNK AI-April 22, 2024
- Meta Llama 3 GitHub
- Mark Zuckerberg himself appears in the list of direct contributors to Llama 3
- Mark Zuckerberg: Llama 3, $10B Models, Caesar Augustus, Bioweapons [video]
What are some alternatives?
draw-a-ui - Draw a mockup and generate html for it
promptfoo - Test your prompts. Evaluate and compare LLM outputs, catch regressions, and improve prompt quality. [Moved to: https://github.com/promptfoo/promptfoo]
FT-Merge-Quantize-Infer-CML
llm - Access large language models from the command-line
cucim - cuCIM - RAPIDS GPU-accelerated image processing library
text-generation-inference - Large Language Model Text Generation Inference
wubloader
llama - Inference code for Llama models
linen.dev - Lightweight Google-searchable Slack alternative for Communities
incubator-xtable - Apache XTable (incubating) is a cross-table converter for lakehouse table formats that facilitates interoperability across data processing systems and query engines.
clipea - 📎🟢 Like Clippy but for the CLI. A blazing fast AI helper for your command line
FLiPStackWeekly - FLaNK AI Weekly covering Apache NiFi, Apache Flink, Apache Kafka, Apache Spark, Apache Iceberg, Apache Ozone, Apache Pulsar, and more...