| | llm-awq | amazon-bedrock-with-builder-and-command-patterns |
|---|---|---|
| Mentions | 7 | 16 |
| Stars | 1,902 | 11 |
| Growth | 10.9% | - |
| Activity | 8.0 | 6.1 |
| Latest commit | 8 days ago | about 2 months ago |
| Language | Python | Java |
| License | MIT License | MIT No Attribution |
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
llm-awq
- TinyChat: Large Language Model on the Edge
TinyChat is an efficient, lightweight, Python-native serving framework for 4-bit LLMs quantized with AWQ. It delivers a 2.3x generation speedup on an RTX 4090.
Code: https://github.com/mit-han-lab/llm-awq/tree/main/tinychat
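The memory savings behind 4-bit serving come from storing two weights per byte. As a rough illustration of just the storage layout (a toy NumPy sketch of nibble packing, not tinychat's actual fused kernels):

```python
import numpy as np

def pack_int4(q):
    """Pack pairs of signed 4-bit values (range [-8, 7]) into bytes."""
    u = (q.astype(np.int8) & 0x0F).astype(np.uint8)  # two's-complement nibbles
    return u[0::2] | (u[1::2] << 4)                  # low nibble first

def unpack_int4(packed):
    """Inverse of pack_int4: recover the signed 4-bit values."""
    out = np.empty(packed.size * 2, dtype=np.int16)
    out[0::2] = packed & 0x0F
    out[1::2] = packed >> 4
    return np.where(out > 7, out - 16, out).astype(np.int8)  # restore sign

weights = np.array([-8, -3, 0, 7], dtype=np.int8)
packed = pack_int4(weights)      # 4 weights -> 2 bytes
restored = unpack_int4(packed)
```

A dequantizing matmul kernel would unpack these nibbles and multiply by per-group scales on the fly, which is where the actual generation speedup is earned.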
- FLaNK Stack Weekly 23 Oct 2023
- New base model InternLM 7B weights released, with 8k context window.
I am having trouble finding any 8-bit GPTQ models at all; there don't seem to be any on HF. It's almost all 4-bit, with the odd 3-bit version of the big ones. I suspect I will have to make my own for eval purposes, but that's lower priority on my list than finding a 4-bit model that's GPU friendly but doesn't carry such a performance penalty... Looking at AWQ, they have 3- and 4-bit versions.
- Llama33B vs Falcon40B vs MPT30B
Using the currently popular GPTQ, 3-bit quantization hurts performance much more than 4-bit, but there's also AWQ (https://github.com/mit-han-lab/llm-awq) and SqueezeLLM (https://github.com/SqueezeAILab/SqueezeLLM), which are able to manage 3-bit without as much of a performance drop - I hope to see them used more commonly.
- New hardware-friendly quantization method
- Activation-Aware Weight Quantization for LLM Compression Outperforms GPTQ
Better quantization would have a direct and meaningful impact for everyone running local LLMs. The technique has already been applied to both Vicuna and the multimodal LLaMA variant LLaVA.
https://github.com/mit-han-lab/llm-awq
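The core trick, very loosely sketched below in NumPy, is to rescale weight channels by their average activation magnitude before round-to-nearest quantization, so the "salient" channels lose less precision. This is a hand-picked toy illustration only; the actual llm-awq implementation searches the scaling exponent per layer and uses grouped 4-bit quantization.

```python
import numpy as np

def quantize_int4(w):
    # symmetric round-to-nearest 4-bit quantization, per output column
    scale = np.abs(w).max(axis=0, keepdims=True) / 7.0
    return np.clip(np.round(w / scale), -8, 7) * scale   # dequantized weights

def awq_style_quantize(w, x, alpha=0.5):
    # scale up input channels with large average activations before
    # quantizing, then fold the inverse scale back into the weights
    s = np.abs(x).mean(axis=0) ** alpha
    return quantize_int4(w * s[:, None]) / s[:, None]

x = np.array([[100.0, 1.0]])   # input channel 0 is "salient"
w = np.array([[0.1], [1.0]])   # ...but its weight is small, so plain
                               # round-to-nearest quantizes it poorly
err_plain = np.abs(x @ quantize_int4(w) - x @ w).mean()
err_awq = np.abs(x @ awq_style_quantize(w, x) - x @ w).mean()
```

In this contrived example the activation-aware version reconstructs the salient channel's weight exactly, while plain quantization rounds 0.1 up to 1/7 and the large activations amplify that error.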
- New quantization method AWQ outperforms GPTQ in 4-bit and 3-bit with 1.45x speedup and works with multimodal LLMs
GitHub: https://github.com/mit-han-lab/llm-awq
amazon-bedrock-with-builder-and-command-patterns
- How to Build Your Own ChatGPT Clone Using React & AWS Bedrock
The second service is what’s going to make our application come alive and give it the AI functionality we need: AWS Bedrock, Amazon’s generative AI service launched in 2023. Bedrock offers multiple models you can choose from depending on the task you’d like to carry out; for us, we’re going to use Meta’s Llama V2 model, more specifically meta.llama2-70b-chat-v1.
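A minimal sketch of invoking that model from Python with boto3 (the region, defaults, and helper name are my assumptions; the request-body keys follow the documented Llama 2 format on Bedrock, and model access must be enabled in the account):

```python
import json

MODEL_ID = "meta.llama2-70b-chat-v1"

def build_llama2_body(prompt, max_gen_len=512, temperature=0.5, top_p=0.9):
    """Request body for the Llama 2 chat models on Bedrock."""
    return json.dumps({
        "prompt": prompt,
        "max_gen_len": max_gen_len,
        "temperature": temperature,
        "top_p": top_p,
    })

if __name__ == "__main__":
    import boto3  # needs AWS credentials with Bedrock access configured
    client = boto3.client("bedrock-runtime", region_name="us-east-1")
    response = client.invoke_model(modelId=MODEL_ID,
                                   body=build_llama2_body("Hello!"))
    print(json.loads(response["body"].read())["generation"])
```

The blog's React front end would call an API that wraps this server-side invocation rather than shipping credentials to the browser.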
- Breaking News: AWS Bedrock Lands in Sydney
Amazon Bedrock
- Implementing semantic image search with Amazon Titan and Supabase Vector
Amazon Bedrock is a fully managed service that offers a choice of high-performing foundation models (FMs) from leading AI companies like AI21 Labs, Anthropic, Cohere, Meta, Mistral AI, Stability AI, and Amazon. Each model is accessible through a common API which implements a broad set of features to help build generative AI applications with security, privacy, and responsible AI in mind.
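That "common API" point is the key design: every model goes through the same InvokeModel operation, and only the JSON body differs per provider. A hedged sketch (the builder map is my own illustration; the body shapes follow each provider's documented format):

```python
import json

# Each provider defines its own request-body schema, but all of them are
# sent through the same bedrock-runtime InvokeModel operation.
BODY_BUILDERS = {
    "amazon.titan-embed-image-v1": lambda text: {"inputText": text},
    "anthropic.claude-v2:1": lambda text: {
        "prompt": f"\n\nHuman: {text}\n\nAssistant:",
        "max_tokens_to_sample": 256,
    },
}

def build_request(model_id, text):
    """Return (model_id, body) ready for client.invoke_model(...)."""
    return model_id, json.dumps(BODY_BUILDERS[model_id](text))
```

For the semantic-search use case above, the Titan multimodal embedding model also accepts an inputImage field with base64-encoded image bytes alongside or instead of inputText.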
- Amazon Bedrock
- Build an AI image catalogue! - Claude 3 Haiku
- Comprehending JIRA Tickets with Amazon Bedrock
For those keeping track, Amazon Bedrock became generally available in September 2023. My team had access to a preview, so when AWS Comprehend entity analysis did not lend itself well to my use case and I didn't feel like training a model, I started to get familiar with Bedrock. The following post is a follow-on to the Community article above and fleshes out a few details that will help those newer to Amazon Bedrock navigate the product.
- Build a React genAI APP with Amazon Bedrock & AWS SDK
In this blog you will learn how to use Amazon Cognito credentials and IAM Roles to invoke Amazon Bedrock API in a react-based application with JavaScript and the CloudScape design system. You will deploy all the resources and host the app using AWS Amplify.
- Automatically validate your AWS Bedrock LLM responses
- AWS Bedrock Claude 2.1 - Return only JSON
Working with the AWS Bedrock API is an exhilarating experience! I came across an interesting business case where I needed to develop an AI MVP. The MVP generates JSON data based on a prompt and utilizes the anthropic.claude-v2:1 model in AWS Bedrock.
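One common way to get Claude 2.x to emit only JSON (not necessarily the post's approach, and the helper names here are mine) is to prefill the Assistant turn with an opening brace, so the model continues the JSON object instead of wrapping it in prose:

```python
import json

MODEL_ID = "anthropic.claude-v2:1"

def build_json_only_body(instruction, max_tokens=1024):
    # Prefilling the Assistant turn with "{" nudges Claude to continue
    # the JSON object rather than adding commentary around it.
    prompt = (
        f"\n\nHuman: {instruction} Respond with a single JSON object only."
        "\n\nAssistant: {"
    )
    return json.dumps({
        "prompt": prompt,
        "max_tokens_to_sample": max_tokens,
        "temperature": 0,
    })

def parse_completion(completion):
    # The returned completion starts mid-object, so re-attach the "{"
    # that was moved into the prompt prefill before parsing.
    return json.loads("{" + completion)
```

The body is passed to the bedrock-runtime invoke_model call with MODEL_ID, and the response's completion field goes through parse_completion.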
- Ask HN: Best Alternatives to OpenAI ChatGPT?
What are some alternatives?
SqueezeLLM - [ICML 2024] SqueezeLLM: Dense-and-Sparse Quantization
stable-audio-tools - Generative models for conditional audio generation
GPTQ-for-LLaMa - 4 bits quantization of LLaMA using GPTQ
pejorative-compounds - Analysing patterns in English noun-noun pejorative compounds on Reddit
Voyager - An Open-Ended Embodied Agent with Large Language Models
langchain4j-examples
lang2sql - A tutorial for setting an SQL code generator with the OpenAI API
CML_AMP_AI_Text_Summarization_with_Amazon_Bedrock
spreadsheet - Spreadsheet Builder
kafka-streams-dashboards - Showcases Grafana dashboards for Kafka Streams applications leveraging client JMX metrics.
BedrockConnect - Join any Minecraft Bedrock Edition server IP on Xbox One, Nintendo Switch, and PS4/PS5