ChatGPT and Whisper APIs

This page summarizes the projects mentioned and recommended in the original post on news.ycombinator.com

Our great sponsors
  • InfluxDB - Power Real-Time Data Analytics at Scale
  • WorkOS - The modern identity platform for B2B SaaS
  • SaaSHub - Software Alternatives and Reviews
  • whisper.cpp

    Port of OpenAI's Whisper model in C/C++

  • I recently tried a number of options for streaming STT. Because my use case was very sensitive to latency, I ultimately went with https://deepgram.com/ - but https://github.com/ggerganov/whisper.cpp provided a great stepping stone while prototyping a streaming use case locally on a laptop.

  • ruby-openai

    OpenAI API + Ruby! 🤖❤️ Now with Assistants, Threads, Messages, Runs and Text to Speech 🍾

  • Added to the Ruby library here if any Rubyists interested! https://github.com/alexrudall/ruby-openai

  • InfluxDB

    Power Real-Time Data Analytics at Scale. Get real-time insights from all types of time series data with InfluxDB. Ingest, query, and analyze billions of data points in real-time with unbounded cardinality.

    InfluxDB logo
  • emacs-chatgpt-jarvis

    press F12 to record, use whisper to transcribe and chatgpt to answer

  • wow just in time, i just made https://github.com/jackdoe/emacs-chatgpt-jarvis which is chatgpt+whisper but using local whisper and chatgpt-wrapper which is a bit clunky

    since i integrated chatgpt with my emacs i use it at least 20-30 times a day

    i wonder if they will charge me per token if i am paying the monthly fee

  • pydub

    Manipulate audio with a simple and easy high level interface

  • I doubt it will matter if you're breaking up mid sentence if you pass in the previous as a prompt and split words. This is how Whisper does it internally.

    It's not absolutely perfect, but splitting on the word boundary is one line of code with the same package in their docs: https://github.com/jiaaro/pydub/blob/master/API.markdown#sil...

    25MB is also a lot. That's 30 minutes to an hour on MP3 at reasonable compression. A 2 hour movie would have three splits.

  • whisperX

    WhisperX: Automatic Speech Recognition with Word-level Timestamps (& Diarization)

  • ControlNet

    Let us control diffusion models!

  • Stable diffusion + ControlNet is fire! Nothing compares to it. ControlNet allows you to have tight control over the output. https://github.com/lllyasviel/ControlNet

  • Open-Assistant

    OpenAssistant is a chat-based assistant that understands tasks, can interact with third-party systems, and retrieve information dynamically to do so.

  • Yeah, might be worried about open, crowd sourced approaches like Open Assistant (https://open-assistant.io/).

  • WorkOS

    The modern identity platform for B2B SaaS. The APIs are flexible and easy-to-use, supporting authentication, user identity, and complex enterprise features like SSO and SCIM provisioning.

    WorkOS logo
  • grace

    LLM-based chatbot capable of interfacing with external systems for knowledge retrieval and command execution (by artmatsak)

  • `text-davinci-003` seems to offer more flexibility: You have complete freedom in providing interaction examples in the prompt (as in https://github.com/artmatsak/grace/blob/master/grace_chatbot...) and are not limited to the predefined three chat roles. But the requests being 10x cheaper means that we'll have to find ways around those limitations :)

  • openai-python

    The official Python library for the OpenAI API

  • Judging by this[0] the new structured format is immune to "injections":

    [0] https://github.com/openai/openai-python/blob/main/chatml.md

  • openai-cookbook

    Examples and guides for using the OpenAI API

  • Probably the embeddings API. This guide is what helped me understand the concept https://github.com/openai/openai-cookbook/blob/main/examples...

    tl;dr is that you can pre-process each chunk of your database and use embeddings to quickly look up which chunk is most similar to the user's query, and then prepend that chunk to the user's query before giving it to GPT, so that GPT has the relevant context to give an answer.

  • open-ai

    OpenAI PHP SDK : Most downloaded, forked, contributed, huge community supported, and used PHP (Laravel , Symfony, Yii, Cake PHP or any PHP framework) SDK for OpenAI GPT-3 and DALL-E. It also supports chatGPT-like streaming. (ChatGPT AI is supported)

  • Php library

    https://github.com/orhanerday/open-ai#chat-as-known-as-chatg...

  • chatgpt-api

    Node.js client for the official ChatGPT API. 🔥

  • I just pushed an update to the `chatgpt` NPM package with support for the official ChatGPT API: https://github.com/transitive-bullshit/chatgpt-api

NOTE: The number of mentions on this list indicates mentions on common posts plus user suggested alternatives. Hence, a higher number means a more popular project.

Suggest a related project

Related posts