Paris-Based Startup and OpenAI Competitor Mistral AI Valued at $2B

This page summarizes the projects mentioned and recommended in the original post on news.ycombinator.com.

  • transformers

    🤗 Transformers: State-of-the-art Machine Learning for PyTorch, TensorFlow, and JAX.

  • If you want to tinker with the architecture, Hugging Face has a FOSS implementation in transformers: https://github.com/huggingface/transformers/blob/main/src/tr...

    If you want to reproduce the training pipeline, you couldn't, because you don't have access to thousands of A100s.

  • ez-openai

    Ez API, ez life.

  • This is just tangential, but I wouldn't call their APIs "nice", I'd be far less charitable. I spent a few hours (because that's how long it took to figure out the API, due to almost zero documentation) and wrote a nicer Python layer:

    https://github.com/skorokithakis/ez-openai/

    With all that money, I would have thought they'd be able to design more user-friendly APIs. Maybe they could even ask an LLM for help.

  • MixtralKit

    A toolkit for inference and evaluation of 'mixtral-8x7b-32kseqlen' from Mistral AI

  • > Mistral's latest just released model is well below GPT-3 out of the box

    The early information I see implies it is above. Mind you, that is mostly because GPT-3 was comparatively low: for instance its 5-shot MMLU score was 43.9%, while Llama2 70B 5-shot was 68.9%[0]. Early benchmarks[1] give Mixtral scores above Llama2 70B on MMLU (and other benchmarks), thus transitively, it seems likely to be above GPT-3.

    Of course, GPT-3.5 has a 5-shot score of 70, and it is not yet clear whether Mixtral is above or below that, though it is clearly below GPT-4’s 86.5. The dust needs to settle, and the official inference code needs to be released, before there is certainty about its exact strength.

    [0]: https://paperswithcode.com/sota/multi-task-language-understa...

    [1]: https://github.com/open-compass/MixtralKit#comparison-with-o...

  • ollama

    Get up and running with Llama 3, Mistral, Gemma, and other large language models.
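
The benchmark reasoning in the MixtralKit comment above can be checked with a few lines of arithmetic. The sketch below is only an illustration: it uses the 5-shot MMLU figures quoted in that comment, treats Llama2 70B's score as a lower bound for Mixtral (the comment's reading of the early benchmarks), and the function and variable names are hypothetical.

```python
# 5-shot MMLU scores (percent) as quoted in the comment; not re-verified here.
MMLU_5_SHOT = {
    "GPT-3": 43.9,
    "Llama2 70B": 68.9,
    "GPT-3.5": 70.0,
    "GPT-4": 86.5,
}


def compare_with_lower_bound(lower_bound: float, reference: float) -> str:
    """Classify a model known only to score above `lower_bound`.

    If the lower bound already reaches the reference score, the model is
    above it; otherwise the lower bound alone cannot settle the question.
    """
    if lower_bound >= reference:
        return "above"
    return "undetermined"


# Assumption from the comment: early benchmarks place Mixtral above
# Llama2 70B, so Llama2 70B's score is a lower bound for Mixtral.
mixtral_lower_bound = MMLU_5_SHOT["Llama2 70B"]

verdicts = {
    name: compare_with_lower_bound(mixtral_lower_bound, score)
    for name, score in MMLU_5_SHOT.items()
}

for name, verdict in verdicts.items():
    print(f"Mixtral vs {name}: {verdict}")
```

The transitive step settles the comparison with GPT-3 but not with GPT-3.5, which matches the comment's caution. The lower bound also says nothing about GPT-4; the comment's statement that Mixtral is below GPT-4 rests on the early benchmark figures themselves, not on this inference.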

NOTE: The number of mentions on this list indicates mentions on common posts plus user-suggested alternatives. Hence, a higher number means a more popular project.
