Top 23 huggingface-transformer Open-Source Projects
-
machine-learning-articles
🧠💬 Articles I wrote about machine learning, archived from MachineCurve.com.
-
uform
Pocket-Sized Multimodal AI for content understanding and generation across multilingual texts, images, and 🔜 video, up to 5x faster than OpenAI CLIP and LLaVA 🖼️ & 🖋️
-
detoxify
Trained models & code to predict toxic comments on all 3 Jigsaw Toxic Comment Challenges. Built using ⚡ PyTorch Lightning and 🤗 Transformers. For access to our API, please email us at [email protected].
-
Multimodal-Toolkit
Multimodal model for text and tabular data with HuggingFace transformers as building block for text data
-
finetune-gpt2xl
Guide: Finetune GPT2-XL (1.5 Billion Parameters) and finetune GPT-NEO (2.7 B) on a single GPU with Huggingface Transformers using DeepSpeed
-
local-llm-function-calling
A tool for generating function arguments and choosing what function to call with local LLMs
-
quickai
QuickAI is a Python library that makes it extremely easy to experiment with state-of-the-art Machine Learning models.
-
monitors4codegen
Code and data artifact for the NeurIPS 2023 paper "Monitor-Guided Decoding of Code LMs with Static Analysis of Repository Context". `multilspy` is an LSP client library in Python intended to be used to build applications around language servers.
-
aws-lambda-docker-serverless-inference
Serve scikit-learn, XGBoost, TensorFlow, and PyTorch models with AWS Lambda container images support.
-
discus
A data-centric AI package for ML/AI. Get the best high-quality data for the best results. Discord: https://discord.gg/t6ADqBKrdZ
-
Extracting-Training-Data-from-Large-Langauge-Models
A re-implementation of the "Extracting Training Data from Large Language Models" paper by Carlini et al., 2020
Try this: https://github.com/refuel-ai/autolabel
Then the main challenge just becomes prompt design, which can sometimes be nebulous for NLP annotation.
There were some developments using LLMs in the timeseries domain which caught my attention.
I toyed with the Chronos forecasting toolkit [1], and the results were predictably off by wild margins [2].
What really caught my eye, though, was the "feel" of the predicted timeseries: this is the first time I've seen synthetic timeseries that look like the real thing. Stock charts have a certain quality to them; once you've been looking at them long enough, you can tell more often than not whether some unlabeled data is a stock price timeseries or not. It seems the Chronos LLM was able to pick up on that "nature" of the price movement and replicate it in its forecasts. Impressive!
1: https://github.com/amazon-science/chronos-forecasting
2: https://imgur.com/a/hTRQ38d
Project mention: CatLIP: Clip Vision Accuracy with 2.7x Faster Pre-Training on Web-Scale Data | news.ycombinator.com | 2024-04-25
Question: any good on-device-size image embedding models?
I tried https://github.com/unum-cloud/uform, which I do like, especially since they also support languages other than English. Any recommendations for other alternatives?
Project mention: Run and create custom ChatGPT-like bots with OpenChat | news.ycombinator.com | 2023-06-07
- https://github.com/r2d4/rellm
Project mention: [Machine Learning] [P] Implementation of text generation from keywords, Python module | /r/enfrancais | 2023-05-11
Project mention: Super JSON Mode: Up to 20x Faster JSON Generation from LLMs | news.ycombinator.com | 2024-02-06
Project mention: Tell HN: OpenAI still has a moat, it's called function calling and its API | news.ycombinator.com | 2024-02-21
hello? https://github.com/rizerphe/local-llm-function-calling
Project mention: Debugging Python Code in Amazon SageMaker Locally Using Visual Studio Code and PyCharm: A Step-by-Step Guide | dev.to | 2023-11-15
git clone https://github.com/aws-samples/amazon-sagemaker-local-mode/
cd amazon-sagemaker-local-mode/general_pipeline_local_debug
python3 -m venv .venv
source .venv/bin/activate
pip install jupyter
jupyter lab
I've been playing around with DALL-E 3 a lot recently. One of the things they do is expand a user's prompt to add a lot of detail. Via their API, you can see the expanded prompt (whereas you can't through the ChatGPT interface).
They obviously have the power of their LLM behind them and can generate some really interesting prompts. There is an open source implementation that the creator of Fooocus made which attempts to expand on prompts using some commonly used keywords[1] with some sort of basic context.
e.g., I typed in "Brisket on a table" and got: "Brisket on a table, product photography, Michelin star, award winning photo, 8k, trending, HD. High quality image, highly detailed, stunning lighting, flawless render, masterpiece, still from the movie directed by Denis Villeneuve with art direction"
You get a much better image with that prompt vs just the basic: "Brisket on a table"
[1] https://huggingface.co/spaces/daspartho/prompt-extend
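The keyword-appending idea described in this comment can be sketched in a few lines of Python. This is only an illustration of the general technique, not how prompt-extend or DALL-E 3 actually works: the `STYLE_KEYWORDS` list and the `expand_prompt` function are hypothetical names made up for this example, and the keywords are taken from the expanded prompt quoted above.

```python
import random
from typing import Optional

# Hypothetical keyword pool, borrowed from the example prompt in the comment.
# A real prompt expander would likely use a language model rather than a fixed list.
STYLE_KEYWORDS = [
    "product photography",
    "award winning photo",
    "8k",
    "highly detailed",
    "stunning lighting",
    "masterpiece",
]


def expand_prompt(prompt: str, n_keywords: int = 3, seed: Optional[int] = None) -> str:
    """Append up to n_keywords randomly chosen style keywords to a base prompt."""
    rng = random.Random(seed)  # seeded generator so results are reproducible
    n = min(n_keywords, len(STYLE_KEYWORDS))  # never ask for more than the pool holds
    chosen = rng.sample(STYLE_KEYWORDS, n)  # sample without repetition
    return ", ".join([prompt.strip(), *chosen])


if __name__ == "__main__":
    print(expand_prompt("Brisket on a table", seed=0))
```

Passing a fixed `seed` makes the expansion repeatable, which helps when comparing images generated from the same expanded prompt.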
Project mention: Show HN: Multilspy – A library to easily use language servers to analyze code | news.ycombinator.com | 2023-11-28
Project mention: LLM-Client – Python library for seamless integration with LLMs | news.ycombinator.com | 2023-07-27
Project mention: an open source package helping developers generate data for LLMs | /r/mlops | 2023-08-02
huggingface-transformers related posts
-
CatLIP: Clip Vision Accuracy with 2.7x Faster Pre-Training on Web-Scale Data
-
Multimodal Embeddings for JavaScript, Swift, and Python
-
Show HN: UForm v2 Featuring Multimodal Matryoshka, Multimodal DPO, and ONNX
-
UForm v1: Multimodal Chat in 1.5B Parameters
-
NLP Research in the Era of LLMs
-
ArtBot for Stable Diffusion
-
Show HN: I scraped 25M Shopify products to build a search engine
-
Index
What are some of the best open-source huggingface-transformer projects? This list will help you:
# | Project | Stars |
---|---|---|
1 | machine-learning-articles | 3,108 |
2 | autolabel | 1,788 |
3 | chronos-forecasting | 1,680 |
4 | uform | 885 |
5 | detoxify | 839 |
6 | Multimodal-Toolkit | 555 |
7 | transformers-bloom-inference | 548 |
8 | rellm | 488 |
9 | keytotext | 436 |
10 | finetune-gpt2xl | 421 |
11 | super-json-mode | 337 |
12 | local-llm-function-calling | 265 |
13 | amazon-sagemaker-local-mode | 229 |
14 | prompt-extend | 174 |
15 | quickai | 162 |
16 | monitors4codegen | 108 |
17 | aws-lambda-docker-serverless-inference | 92 |
18 | Romanian-Transformers | 83 |
19 | llm-client-sdk | 70 |
20 | discus | 62 |
21 | tasknet | 33 |
22 | predict-subreddit | 31 |
23 | Extracting-Training-Data-from-Large-Langauge-Models | 26 |