twinny
pinferencia
twinny | pinferencia | |
---|---|---|
7 | 21 | |
1,750 | 558 | |
- | 0.4% | |
9.9 | 0.0 | |
5 days ago | about 1 year ago | |
TypeScript | Python | |
MIT License | Apache License 2.0 |
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
twinny
- Twinny: Locally hosted (or API hosted) AI code completion for Visual Studio Code
-
The lifecycle of a code AI completion
For those who might not be aware of this, there is also an open source project on GitHub called "Twinny" which is an offline Visual Studio Code plugin equivalent to Copilot: https://github.com/rjmacarthy/twinny
It can be used with a number of local model services. Currently for my setup on a NVIDIA 4090, I'm running both the base and instruct model for deepseek-coder 6.7b using 5_K_M Quantization GGUF files (for performance) through llama.cpp "server" where the base model is for completions and the instruct model for chat interactions.
llama.cpp: https://github.com/ggerganov/llama.cpp/
deepseek-coder 6.7b base GGUF files: https://huggingface.co/TheBloke/deepseek-coder-6.7B-base-GGU...
deepseek-coder 6.7b instruct GGUF files: https://huggingface.co/TheBloke/deepseek-coder-6.7B-instruct...
- Private Ollama GitHub Copilot Alternative with FIM and Chat
- Ollama AI code completion plugin for VSCode, 100% free and 100% private
- A new locally hosted AI code completion API and vscode extension. Like Copilot but totally free and best of all private.
- Continue with LocalAI: An alternative to GitHub's Copilot that runs locally
-
Locally hosted code completion API and vscode extension. 100% free and 100% private.
https://github.com/rjmacarthy/twinny - vscode extension https://github.com/rjmacarthy/twinny-api - python inference api
pinferencia
- Show HN: Pinferencia, Deploy Your AI Models with Pretty UI and REST API
-
Stop Writing Flask to Serve/Deploy Your Model: Pinferencia is Here
Go visit: Pinferencia (underneathall.app) for detailed examples.
- Looking for a reference design pattern for an image to image microservice
-
Google T5 Translation as a Service with Just 7 lines of Codes
**Pinferencia** makes it super easy to serve any model with just three extra lines.
-
Pre-trained Model with Fine Tuning/Transfer Learning or Design and Train from Scratch?
Hi, recently I'm writing some tutorials involving HuggingFace's models for my project Pinferencia.
-
[D] Pre-trained Model with Fine Tuning/Transfer Learning or Design and Train from Scratch?
Hi, I'm the creator of Pinferencia, recently I'm writer some tutorial involving HuggingFace's models.
-
GPT2 — Text Generation Transformer: How to Use & How to Serve
If you haven't heard of Pinferencia go to its github page or its homepage to check it out, it's an amazing library help you deploy your model with ease.
-
My first Udemy course on ML Ops deployment!
Please allow me to recommend another simple but serious deployment tools which is also compatible with triton, torchserve, kubeflow, tf serving: Pinferencia
-
Easiest Way to Deploy HuggingFace Transformers
Never heard of Pinferencia? It’s not late. Go to its GitHub to take a look. Don’t forget to give it a star if you like it.
-
what is the easiest way to deploy a nlp model?
Check this out https://github.com/underneathall/pinferencia
What are some alternatives?
code-llama-for-vscode - Use Code Llama with Visual Studio Code and the Continue extension. A local LLM alternative to GitHub Copilot.
server - The Triton Inference Server provides an optimized cloud and edge inferencing solution.
twinny-api - Locally hosted AI code completion server. Like Github Copilot but 100% free and 100% private.
budgetml - Deploy a ML inference service on a budget in less than 10 lines of code.
koboldcpp - A simple one-file way to run various GGML and GGUF models with KoboldAI's UI
deepsparse - Sparsity-aware deep learning inference runtime for CPUs
ollama - Get up and running with Llama 3, Mistral, Gemma, and other large language models.
llmware - Providing enterprise-grade LLM-based development framework, tools, and fine-tuned models.
aichat - All-in-one AI-Powered CLI Chat & Copilot that integrates 10+ AI platforms, including OpenAI, Azure-OpenAI, Gemini, VertexAI, Claude, Mistral, Cohere, Ollama, Ernie, Qianwen...
polyaxon - MLOps Tools For Managing & Orchestrating The Machine Learning LifeCycle
serving - A flexible, high-performance serving system for machine learning models
dslinter - `dslinter` is a pylint plugin for linting data science and machine learning code. We plan to support the following Python libraries: TensorFlow, PyTorch, Scikit-Learn, Pandas and NumPy.