ctransformers vs TornadoVM
| | ctransformers | TornadoVM |
|---|---|---|
| Mentions | 4 | 22 |
| Stars | 1,718 | 1,123 |
| Growth | - | 2.8% |
| Activity | 8.6 | 9.9 |
| Last commit | 4 months ago | 4 days ago |
| Language | C | Java |
| License | MIT License | Apache License 2.0 |
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
ctransformers
- Refact LLM: New 1.6B code model reaches 32% HumanEval and is SOTA for the size
  Does ctransformers (https://github.com/marella/ctransformers#supported-models) support running refact?
  I see that model type "gpt_refact" in https://huggingface.co/smallcloudai/Refact-1_6B-fim/blob/mai...
- How do I utilize these quantized models being uploaded?
  You can also use ctransformers with the ggml models if you want to use Python rather than C++.
- Langchain and self hosted LLaMA hosted API
  For ggml https://github.com/marella/ctransformers/ and https://github.com/abetlen/llama-cpp-python has a decent server. https://github.com/go-skynet/LocalAI is very active too.
- Also reconnecting with Scala. Interested in LLMs
TornadoVM
- Intel Gaudi 3 AI Accelerator
  You don't need to use C++ to interface with CUDA or even write it.
  A while ago NVIDIA and the GraalVM team demoed grCUDA which makes it easy to share memory with CUDA kernels and invoke them from any managed language that runs on GraalVM (which includes JIT compiled Python). Because it's integrated with the compiler the invocation overhead is low:
  https://developer.nvidia.com/blog/grcuda-a-polyglot-language...
  And TornadoVM lets you write kernels in JVM langs that are compiled through to CUDA:
  https://www.tornadovm.org
  There are similar technologies for other languages/runtimes too. So I don't think that will cause NVIDIA to lose ground.
- Java VectorAPI compatibility with TornadoVM GPU programming framework
- Java GPU pre/post processing with ONNX RT and TornadoVM
- FLaNK Stack 05 Feb 2024
- FLaNK 25 December 2023
- GPU Acceleration for Python, JavaScript, Ruby from Java with Truffle
- TornadoVM v1.0 Released
- TornadoVM 1.0
- From CPU to GPU and FPGAs: Supercharging Java Applications with TornadoVM [video]
  Presented by Juan Fumero, PhD & Research Fellow (The University of Manchester, UK) during the JVM Language Summit 2023 (Santa Clara CA).
  More information on TornadoVM can be found at https://www.tornadovm.org/
  Tags: #Java #JVMLS #GPU #FPGA #OpenJDK #GraalVM #AI
What are some alternatives?
llama-cpp-python - Python bindings for llama.cpp
Aparapi - The New Official Aparapi: a framework for executing native Java and Scala code on the GPU.
LangChain_PDFChat_Oobabooga - oobaboga -text-generation-webui implementation of wafflecomposite - langchain-ask-pdf-local
openapi4j - OpenAPI 3 parser, JSON schema and request validator.
text-generation-inference - Large Language Model Text Generation Inference
GraalVMREPL - REPL (read–eval–print loop) shell built on top of JavaFX and GraalVM stack, incorporating GraalJS, GraalPython, TruffleRuby and FastR
artificial-nose - Instructions, source code, and misc. resources needed for building a Tiny ML-powered artificial nose.
kattlo-cli - Kattlo CLI Project
kendryte-standalone-sdk - Standalone SDK for kendryte K210
junodb - JunoDB is PayPal's home-grown secure, consistent and highly available key-value store providing low, single digit millisecond, latency at any scale.
text-generation-webui - A Gradio web UI for Large Language Models. Supports transformers, GPTQ, AWQ, EXL2, llama.cpp (GGUF), Llama models.
jr - JR: streaming quality random data from the command line