Our great sponsors
-
whisper-websocket-server
Self hosted high quality voice recognition for de-googled Android using whisper. Like Siri or OK Google.
-
rust-bert
Rust native ready-to-use NLP pipelines and transformer-based models (BERT, DistilBERT, GPT2,...)
-
InfluxDB
Power Real-Time Data Analytics at Scale. Get real-time insights from all types of time series data with InfluxDB. Ingest, query, and analyze billions of data points in real-time with unbounded cardinality.
I suspect something similar is possible with ChatGPT. Using the GPT-neo-125m model I've been able to get some really convincing (if lackluster) answers on 4 core ARM hardware and less than 2gb of memory. With enough sampling, you can get legible paragraph-length responses out in less than 10 seconds; that's pretty good for an offline program in my book.
I'm using rust-bert to serve it over a Discord bot, similar to one of their examples[0]. It's running on Oracle VCPUs right now, but with dedi hardware and ML acceleration I can imagine the field moving really quickly.
[0] https://github.com/guillaume-be/rust-bert/blob/master/exampl...