I haven't used the llama2 models much in quite a while, because at this point they just aren't very good compared to other options. Mistral and Mixtral seem to have very little trouble responding in JSON when I ask for it. However, with LLMs that you run yourself, you can also enforce a grammar on the response if you want to, guaranteeing that it will respond with valid JSON and no extraneous text. Something potentially helpful here: https://github.com/ggerganov/llama.cpp/discussions/2494
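For anyone curious what that looks like in practice: llama.cpp accepts grammars in its GBNF format (e.g. via the `--grammar-file` flag). Below is a minimal sketch of a grammar constraining output to a flat JSON object of string/number values — it's a simplified illustration, not the full JSON grammar that ships in the llama.cpp repo's `grammars/` directory:

```
# Simplified GBNF grammar: a flat JSON object like {"name": "foo", "count": 3}
root   ::= "{" ws pair (ws "," ws pair)* ws "}"
pair   ::= string ws ":" ws value
value  ::= string | number
string ::= "\"" [a-zA-Z0-9_ ]* "\""
number ::= [0-9]+
ws     ::= [ \t\n]*
```

With the grammar applied, sampling simply cannot produce tokens outside these rules, so you never get "Sure, here's your JSON:" preambles. An example invocation might look like `./main -m mistral-7b.gguf --grammar-file json.gbnf -p "..."` (model filename here is a placeholder).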
If anyone wants to finetune their own Mistral 7b model 2.2x faster with 62% less memory, give our open source package Unsloth a try! https://github.com/unslothai/unsloth :)
Related posts
- AMD ROCm Software Blogs
- Show HN: We got fine-tuning Mistral-7B to not suck
- Has anyone tried out the ASPEN-Framework for LoRA Fine-Tuning yet and can share their experience?
- Show HN: 80% faster, 50% less memory, 0% loss of accuracy Llama finetuning
- Show HN: Unsloth – finetune Llama 2x faster 50% less memory