80% faster, 50% less memory, 0% loss of accuracy Llama finetuning

This page summarizes the projects mentioned and recommended in the original post on news.ycombinator.com

  • unsloth

    Finetune Llama 3, Mistral & Gemma LLMs 2-5x faster with 80% less memory
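
    As a rough illustration of what using it looks like, here is a minimal QLoRA-style finetuning sketch based on the FastLanguageModel entry point shown in the project's README - the model name, LoRA rank, and target modules below are illustrative defaults, not a definitive recipe:

        from unsloth import FastLanguageModel

        # Load a 4-bit quantized base model (name is illustrative)
        model, tokenizer = FastLanguageModel.from_pretrained(
            model_name="unsloth/llama-3-8b-bnb-4bit",
            max_seq_length=2048,
            load_in_4bit=True,
        )

        # Attach LoRA adapters; only these low-rank matrices are trained,
        # which is where most of the memory savings come from
        model = FastLanguageModel.get_peft_model(
            model,
            r=16,
            target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                            "gate_proj", "up_proj", "down_proj"],
            lora_alpha=16,
            use_gradient_checkpointing=True,
        )

        # The returned model drops into a standard Hugging Face / TRL
        # training loop (e.g. trl.SFTTrainer)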

  • M1 support will probably come in the future if there is enough demand - could you open an Issue at https://github.com/unslothai/unsloth? That would be much appreciated!

    I think there are a few Redditors from /r/localllama who also requested this, but for now the first priority is getting Mistral support!!

  • hyperlearn

    2-2000x faster ML algos, 50% less memory usage, works on all hardware - new and old.
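
    To make that kind of claim concrete, here is a generic sketch of one technique in this family - solving least squares via a Cholesky factorization of the normal equations instead of the usual SVD. This is illustrative only, not Hyperlearn's actual API; the trade-off is speed and memory against numerical conditioning, which a small ridge term mitigates:

        import numpy as np
        from scipy.linalg import cho_factor, cho_solve

        rng = np.random.default_rng(0)
        X = rng.standard_normal((100_000, 200))
        y = rng.standard_normal(100_000)

        # SVD-based baseline: robust but slow and memory-hungry
        beta_svd, *_ = np.linalg.lstsq(X, y, rcond=None)

        # Normal equations + Cholesky: one O(n*p^2) pass to form X^T X,
        # then a cheap triangular solve - much faster for tall-skinny X,
        # but it squares the condition number, so a tiny ridge term is
        # added for stability
        XtX = X.T @ X + 1e-8 * np.eye(X.shape[1])
        beta_chol = cho_solve(cho_factor(XtX), X.T @ y)

        assert np.allclose(beta_svd, beta_chol, atol=1e-4)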

  • Good point - the main issue is that we ran into this exact problem with our old package, Hyperlearn (https://github.com/danielhanchen/hyperlearn).

    I open-sourced all the code to the community - I'm actually an extremely open person and I love contributing to OSS.

    The issue was that the package got gobbled up by other startups and big tech companies with no credit. I didn't want any cash from it, but it stung hearing other startups and companies claim the speedups as their own when it was actually my work. As an OSS person, I don't want money - just some recognition for the work.

    I also used to help everyone write their startup's software, but I never got paid or even thanked - sadly, I didn't expect the world to be such a hostile place.

    So after a sad awakening, my brother and I decided that instead of open-sourcing everything, we would first open-source something that is still very good - 5x faster training is already very reasonable.

    I'm open to other suggestions on how we should approach this, though! There are no evil intentions - in fact, I insisted we OSS EVERYTHING, even the 30x faster algos, but after a level-headed discussion with my brother: we still have to pay life expenses, no?

    If you have other ways we can go about this, I'm all ears!! We're literally making stuff up as we go along!

  • segment-anything-fast

    A batched offline inference oriented version of segment-anything
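
    The core idea - amortizing the expensive ViT image encoder over a batch of images - can be sketched with the upstream segment-anything API. The repo itself layers further optimizations (torch.compile, scaled-dot-product attention, quantization) on top; batch size and checkpoint path below are placeholders:

        import torch
        from segment_anything import sam_model_registry

        # Load the base SAM model (checkpoint path is a placeholder)
        sam = sam_model_registry["vit_b"](checkpoint="sam_vit_b_01ec64.pth")
        sam = sam.eval().cuda()

        # Preprocessed images: (batch, 3, 1024, 1024), normalized as SAM expects
        batch = torch.randn(8, 3, 1024, 1024, device="cuda")

        # One batched pass through the ViT encoder instead of 8 single-image
        # calls - this is where batched offline inference wins
        with torch.no_grad(), torch.autocast("cuda", dtype=torch.bfloat16):
            embeddings = sam.image_encoder(batch)  # (8, 256, 64, 64)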

  • How does this compare to the PyTorch Labs optimizations for SAM and Llama 2?

    https://github.com/pytorch-labs/segment-anything-fast

    https://github.com/pytorch-labs/gpt-fast

  • gpt-fast

    Simple and efficient pytorch-native transformer text generation in <1000 LOC of python.
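
    Its central trick is keeping every tensor shape static (preallocated KV cache, one token in, one token out) so torch.compile with CUDA graphs can strip away Python and kernel-launch overhead. A minimal sketch of that decode-step pattern, where `model` is assumed to be any decoder that reads a static KV cache:

        import torch

        @torch.compile(mode="reduce-overhead", fullgraph=True)
        def decode_one_token(model, token, input_pos):
            # token: (batch, 1) ids; input_pos: position into the static
            # KV cache. Fixed shapes let the compiled graph be replayed
            # as a CUDA graph with near-zero launch overhead.
            logits = model(token, input_pos)
            return torch.argmax(logits[:, -1], dim=-1, keepdim=True)

        # Greedy decoding loop: every step hits the same compiled graph
        # (model construction, prompt prefill, and cache setup are assumed
        # to happen elsewhere)
        # next_token = decode_one_token(model, next_token, input_pos)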
