Paris-Based Startup and OpenAI Competitor Mistral AI Valued at $2B

Our great sponsors

InfluxDB - Power Real-Time Data Analytics at Scale

WorkOS - The modern identity platform for B2B SaaS

SaaSHub - Software Alternatives and Reviews

Our great sponsors

transformers

175 125,021 10.0 Python

🤗 Transformers: State-of-the-art Machine Learning for Pytorch, TensorFlow, and JAX.

If you want to tinker with the architecture Hugging Face has a FOSS implementation in transformers: https://github.com/huggingface/transformers/blob/main/src/tr...
If you want to reproduce the training pipeline, you couldn't do that even if you wanted to because you don't have access to thousands of A100s.

ez-openai

2 16 6.9 Python

Ez API, ez life.

This is just tangential, but I wouldn't call their APIs "nice", I'd be far less charitable. I spent a few hours (because that's how long it took to figure out the API, due to almost zero documentation) and wrote a nicer Python layer:
https://github.com/skorokithakis/ez-openai/
With all that money, I would have thought they'd be able to design more user-friendly APIs. Maybe they could even ask an LLM for help.

InfluxDB

www.influxdata.com sponsored

Power Real-Time Data Analytics at Scale. Get real-time insights from all types of time series data with InfluxDB. Ingest, query, and analyze billions of data points in real-time with unbounded cardinality.
MixtralKit

4 750 8.1 Python

A toolkit for inference and evaluation of 'mixtral-8x7b-32kseqlen' from Mistral AI

> Mistral's latest just released model is well below GPT-3 out of the box
The early information I see implies it is above. Mind you, that is mostly because GPT-3 was comparatively low: for instance its 5-shot MMLU score was 43.9%, while Llama2 70B 5-shot was 68.9%[0]. Early benchmarks[1] give Mixtral scores above Llama2 70B on MMLU (and other benchmarks), thus transitively, it seems likely to be above GPT-3.
Of course, GPT-3.5 has a 5-shot score of 70, and it is unclear yet whether Mixtral is above or below, and clearly it is below GPT-4’s 86.5. The dust needs to settle, and the official inference code needs to be released, before there is certainty on its exact strength.
[0]: https://paperswithcode.com/sota/multi-task-language-understa...
[1]: https://github.com/open-compass/MixtralKit#comparison-with-o...

ollama

192 58,943 9.9 Go

Get up and running with Llama 3, Mistral, Gemma, and other large language models.

NOTE: The number of mentions on this list indicates mentions on common posts plus user suggested alternatives. Hence, a higher number means a more popular project.

Suggest a related project