Llama-2-Onnx Alternatives
Similar projects and alternatives to Llama-2-Onnx
-
FLiPStackWeekly
FLaNK AI Weekly covering Apache NiFi, Apache Flink, Apache Kafka, Apache Spark, Apache Iceberg, Apache Ozone, Apache Pulsar, and more...
-
chatgpt-retrieval-plugin
The ChatGPT Retrieval Plugin lets you easily find personal or work documents by asking questions in natural language.
-
towhee
Towhee is a framework that is dedicated to making neural data processing pipelines simple and fast.
-
dify
Dify is an open-source LLM app development platform. Dify's intuitive interface combines AI workflow, RAG pipeline, agent capabilities, model management, observability features and more, letting you quickly go from prototype to production.
-
symmetric-ds
SymmetricDS is database replication and file synchronization software that is platform independent, web enabled, and database agnostic. It is designed to make bi-directional data replication fast, easy, and resilient. It scales to a large number of nodes and works in near real-time across WAN and LAN networks.
Llama-2-Onnx reviews and mentions
-
Show HN: Fine-tune your own Llama 2 to replace GPT-3.5/4
System: Here's some docs, answer concisely in a sentence.
YMMV on cost still; it depends on the cloud vendor. My intuition and viewpoint agree with yours: GPT-3.5 is priced low enough that there isn't a case where it makes sense to use another model.
It strikes me now that this is _very_ likely true, and not just our intuition: OpenAI's $/GPU hour is likely <= any other vendor's.
The next big step will come from formalizing the stuff rolling around the local LLM community. For months now it's been either one-off $X.c stunts that run on desktop, or porn-y stuff, which is where the vast majority of the _actual_ usage and progress is coming from, like all nascent tech.
Microsoft has LLaMa-2 ONNX available on GitHub[1]. There are budding but very small projects in different languages to wrap ONNX. Once there's a genuine cross-platform[2] ONNX wrapper that makes running LLaMa-2 easy, there will be a step change. It'll be "free"[3] to run your fine-tuned model that does as well as GPT-4.
It's not clear to me exactly when this will occur. It's "difficult" now, but only because the _actual usage_ in the local LLM community doesn't have a reason to invest in ONNX, and it's extremely intimidating to figure out exactly how to get LLaMa-2 running in ONNX. Microsoft kinda threw it up on GitHub and moved on; the sample code even still needs a PyTorch model. I see at least one very small company on HuggingFace that _may_ have figured out full ONNX.
[1] https://github.com/microsoft/Llama-2-Onnx
- FLaNK Stack Weekly for 14 Aug 2023
- Llama 2 on ONNX runs locally
Stats
microsoft/Llama-2-Onnx is an open source project licensed under the GNU General Public License v3.0 or later, which is an OSI-approved license.
The primary programming language of Llama-2-Onnx is Python.