| | openai-python | whisper |
| --- | --- | --- |
| Mentions | 71 | 372 |
| Stars | 27,279 | 84,697 |
| Growth | 1.7% | 2.7% |
| Activity | 9.7 | 7.4 |
| Latest commit | 4 days ago | 16 days ago |
| Language | Python | Python |
| License | Apache License 2.0 | MIT License |
- Stars: the number of stars a project has on GitHub.
- Growth: month-over-month growth in stars.
- Activity: a relative number indicating how actively a project is being developed; recent commits carry more weight than older ones. An activity of 9.0, for example, places a project among the top 10% of the most actively developed projects tracked here.
openai-python
- Structured Output with LangChain and Llamafile
- 🚀 Building an Azure OpenAI Chatbot: Challenges, Solutions & Why JavaScript Beats Python for the Web
Check the official migration guide for updates.
- xAI Has Acquired X
Okay, I know Tesla's extremely high P/E ratio is because its worth is not tied just to cars, and so xAI being priced at $20B more than Anthropic does not necessarily mean xAI's AI products are that much better than Anthropic's (presumably xAI's worth is tied to synergies with Tesla FSD, Optimus, and maybe even Neuralink)...but what products does xAI actually offer, other than Grok being an add-on for premium X subscriptions?
Not only does the Grok API not have access to Grok 3, which was released more than a month ago, it doesn't even have its own SDK? [0]
> Some of Grok users might have migrated from other LLM providers. xAI API is designed to be compatible with both OpenAI and Anthropic SDKs, except certain capabilities not offered by respective SDK. If you can use either SDKs, we recommend using OpenAI SDK for better stability.
(every code example has a call for `from openai import OpenAI`)
How would using Grok be viable for any enterprise? And if Grok's API is designed to be a drop-in replacement for OpenAI's, why can't they just use Grok to whip up their own SDK variant based on OpenAI's open-sourced SDK [1] and API spec? (A sketch of that drop-in usage follows the links.)
[0] https://docs.x.ai/docs/guides/migration
[1] https://github.com/openai/openai-python
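For context on the compatibility claim, here is a minimal sketch of pointing the official openai package at xAI's endpoint. The base URL is the one xAI documents; the model name is a placeholder to verify against their docs.

```python
from openai import OpenAI  # pip install openai

# xAI exposes an OpenAI-compatible endpoint, so only the key and base URL change.
client = OpenAI(
    api_key="<your xAI API key>",
    base_url="https://api.x.ai/v1",
)

# "grok-beta" is a placeholder model name; check xAI's docs for what's currently served.
response = client.chat.completions.create(
    model="grok-beta",
    messages=[{"role": "user", "content": "What products does xAI actually offer?"}],
)
print(response.choices[0].message.content)
```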
- New Tools for Building Agents
If you want to get an idea for the changes, here's a giant commit where they updated ALL of the Python library examples in one go from the old chat completions to the new resources APIs: https://github.com/openai/openai-python/commit/2954945ecc185...
- Build your next AI Tech Startup with DeepSeek
The API itself is pretty straightforward. You can use it with the OpenAI package from npm or pip, or make an HTTP request. Note that for this demo I will be using Node.js, working in an empty folder with an index.js file and a package.json file.
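The post works in Node.js, but since this page tracks the Python SDK, here is the equivalent Python sketch. The base URL and model name follow DeepSeek's public docs at the time of writing; verify both before relying on them.

```python
from openai import OpenAI  # pip install openai

# DeepSeek's API is OpenAI-compatible; only the key and base URL differ.
client = OpenAI(
    api_key="<your DeepSeek API key>",
    base_url="https://api.deepseek.com",
)

response = client.chat.completions.create(
    model="deepseek-chat",  # per DeepSeek's docs; confirm the current model list
    messages=[{"role": "user", "content": "Hello from my AI tech startup!"}],
)
print(response.choices[0].message.content)
```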
- Introduction to Using Generative AI Models: Create Your Own Chatbot!
To interact with the OpenAI API, you will install the openai package:
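The command itself did not survive extraction; it is `pip install openai`. A first call with the installed SDK might then look like the sketch below, where the model name is an illustrative choice rather than one from the article.

```python
# pip install openai
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# "gpt-4o-mini" is an illustrative model choice, not taken from the article.
reply = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "Introduce yourself as my chatbot."}],
)
print(reply.choices[0].message.content)
```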
- Exploring Job Market for Software Engineers
Python was chosen for its versatile libraries, particularly linkedin_jobs_scraper and openai. These packages streamlined the scraping and processing of job data.
- OpenAI adds new o1 models
- LLM Fine-Tuning: Domain Embeddings with GPT-3
The essential library for this project is openai, supported by two helper libraries. Install them with the Poetry dependency manager as shown:
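The command block is missing from the excerpt. With Poetry it would presumably be along these lines; the two helper libraries are not named in the excerpt, so only the main dependency is shown.

```bash
poetry add openai
```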
- The Stainless SDK Generator
whisper
- Show HN: TokenDagger – A tokenizer 2-4x faster than OpenAI's Tiktoken
> has a great package ecosystem
So great there are 8 of them. 800% better than all the rest!
> If you think Python is a bad language for AI integrations, try writing one in a compiled language.
I'll take this challenge, all day, every day, so long as I and the hypothetical 'move fast and break things' crowd have equal "must run in prod" and "must be understandable by some other human" qualifiers.
What type is `array`? Don't worry your pretty head about it, feed it whatever type you want and let Sentry's TypeError sort it out <https://github.com/openai/whisper/blob/v20250625/whisper/aud...> Oh, sorry, and you wanted to know what `pad_or_trim` returns? Well that's just, like, your opinion man
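For reference, the function being criticized is short. Below is a type-annotated sketch of its numpy path (the real version also accepts torch tensors), roughly the signature the commenter wishes were spelled out:

```python
import numpy as np

N_SAMPLES = 16000 * 30  # 30 seconds at 16 kHz, as defined in whisper/audio.py

def pad_or_trim(array: np.ndarray, length: int = N_SAMPLES, axis: int = -1) -> np.ndarray:
    """Zero-pad or trim `array` along `axis` to exactly `length` samples."""
    if array.shape[axis] > length:
        array = array.take(indices=range(length), axis=axis)
    if array.shape[axis] < length:
        pad_widths = [(0, 0)] * array.ndim
        pad_widths[axis] = (0, length - array.shape[axis])
        array = np.pad(array, pad_widths)
    return array
```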
- DeepSpeech Is Discontinued
It seems that the team that used to work on DeepSpeech went on to work on coqui-ai STT (https://github.com/coqui-ai/STT) and now recommends using OpenAI Whisper (https://github.com/openai/whisper).
- OpenAI Charges by the Minute, So Make the Minutes Shorter
It's a very simple change in a vanilla Python implementation. The encoder is a set of attention blocks, and the length of the attention can be changed without changing the calculation at all.
Here (https://github.com/openai/whisper/blob/main/whisper/model.py...) is the relevant code in the whisper repo. You'd just need to change the for loop to an enumerate and subsample the context along its length at the point you want. I believe it would be:
for i, block in enumerate(self.blocks):
    x = block(x)
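Extending that idea, a hedged sketch of what the edited loop in AudioEncoder.forward might look like; `subsample_at` and the stride of 2 are illustrative choices, not anything defined in the whisper codebase:

```python
subsample_at = len(self.blocks) // 2  # illustrative: halve the context midway through

for i, block in enumerate(self.blocks):
    if i == subsample_at:
        x = x[:, ::2, :]  # keep every other position along the sequence axis
    x = block(x)
```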
- Show HN: WhisperBuddy, Privacy-first AI-transcription app built after my layoff
I was laid off recently and, instead of looking for another job right away, I decided to build something I always wanted: a transcription tool that respects user privacy.
Existing transcription tools often require internet connectivity, send your private audio to cloud servers, or lock you into monthly subscriptions. I wanted something different—so I built WhisperBuddy, a privacy-first AI transcription app that runs entirely on your machine.
WhisperBuddy uses OpenAI’s Whisper model (see: https://github.com/openai/whisper). While the model may not always deliver the absolute best accuracy, it provides solid performance and, most importantly, 100% privacy—no audio ever leaves your device.
I optimized the app for local performance and usability, and I’m eager to hear feedback from the community.
Thanks for checking it out!
- Auto-Generating Clips for Social Media from Live Streams with the Strands Agents SDK
To accomplish this task, I decided to try out the new Strands Agents SDK. It's a fairly new framework for building agents that has a simple way to define tools that the agent can use to assist in responding to prompts. For this solution, we'll need FFMPEG and Whisper installed on the machine where the agent runs. I'll be working locally, but this could easily be converted to a server-based solution using FastAPI or another web framework and deployed to the cloud in a Docker/Podman container.
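As a rough illustration of that tool-definition style, here is a minimal sketch in the shape the Strands docs show; the `clip_length` tool is invented for the example.

```python
from strands import Agent, tool

@tool
def clip_length(start: float, end: float) -> float:
    """Return the duration in seconds of a proposed clip."""
    return end - start

# The agent can call clip_length while reasoning about which clips to cut.
agent = Agent(tools=[clip_length])
agent("Suggest a clip between 12.5s and 48.0s and tell me how long it runs.")
```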
- From Voice to Text: Exploring Speech-to-Text Tools and APIs for Developers
Link: Whisper GitHub
- 15 AI tools that almost replace a full dev team but please don’t fire us yet
Whisper: OpenAI’s speech-to-text.
- The ultimate open source stack for building AI agents
Start by hooking up speech-to-text (STT) using something like OpenAI’s Whisper if you’re going open source, or Deepgram if you want a super-accurate plug-and-play API.
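The Whisper side of that hookup is only a few lines; this mirrors the usage example in the project README, with the file name as a stand-in.

```python
import whisper  # pip install openai-whisper

model = whisper.load_model("base")          # model size trades speed for accuracy
result = model.transcribe("utterance.wav")  # ffmpeg handles the audio decoding
print(result["text"])
```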
- How to create video transcription with ffmpeg and whisper
# Install Homebrew if you don't have it
/bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/HEAD/install.sh)"
# Install ffmpeg
brew install ffmpeg
# Install Python (if needed)
brew install python
# Install Whisper
pip3 install --upgrade pip
pip3 install git+https://github.com/openai/whisper.git
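With everything installed, the transcription itself can be as simple as pulling the audio track out with ffmpeg and handing it to the whisper CLI. File names and the model choice here are illustrative:

```bash
# Extract a 16 kHz mono audio track from the video (file names are placeholders)
ffmpeg -i talk.mp4 -ar 16000 -ac 1 talk.wav

# Transcribe; whisper writes the transcript next to the audio file
whisper talk.wav --model small --output_format srt
```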
- Real-time in-browser speech recognition with Nuxt and Transformers.js
This is a barebones demo showing Transformers.js running Whisper in the browser for real-time audio-to-text transcription.
It uses [whisper-base](https://huggingface.co/onnx-community/whisper-base), a 290mb model, and is able to transcribe audio in all of the languages listed [here](https://github.com/openai/whisper/blob/248b6cb124225dd263bb9...).
What are some alternatives?
- Awesome-LLMOps - An awesome & curated list of best LLMOps tools for developers
- vosk-api - Offline speech recognition API for Android, iOS, Raspberry Pi and servers with Python, Java, C# and Node
- maelstrom - A workbench for writing toy implementations of distributed systems.
- whisperX - WhisperX: Automatic Speech Recognition with Word-level Timestamps (& Diarization)
- openai-node - Official JavaScript / TypeScript library for the OpenAI API
- silero-vad - Silero VAD: pre-trained enterprise-grade Voice Activity Detector