descript-audio-codec VS opus

Compare descript-audio-codec vs opus and see how they differ.

descript-audio-codec

State-of-the-art audio codec with 90x compression factor. Supports 44.1kHz, 24kHz, and 16kHz mono/stereo audio. (by descriptinc)

opus

Modern audio compression for the internet. (by xiph)
Metric | descript-audio-codec | opus
Mentions | 2 | 26
Stars | 917 | 2,129
Growth | 5.9% | 2.5%
Activity | 4.5 | 9.6
Last commit | about 2 months ago | 10 days ago
Language | Python | C
License | MIT License | GNU General Public License v3.0 or later
The number of mentions indicates the total number of mentions that we've tracked plus the number of user suggested alternatives.
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.

descript-audio-codec

Posts with mentions or reviews of descript-audio-codec. We have used some of these posts to build our list of alternatives and similar projects. The last one was on 2024-04-08.
  • Show HN: Sonauto – a more controllable AI music creator
    1 project | news.ycombinator.com | 10 Apr 2024
    Hey HN,

    My cofounder (four months ago, classmate) and I trained an AI music generation model and after a month of testing we're launching 1.0 today. Ours is interesting because it's a latent diffusion model instead of a language model, which makes it more controllable: https://sonauto.ai/

    Others do music generation by training a Vector Quantized Variational Autoencoder like Descript Audio Codec (https://github.com/descriptinc/descript-audio-codec) to turn music into tokens, then training an LLM on those tokens. Instead, we ripped the tokenization part off and replaced it with a normal variational autoencoder bottleneck (along with some other important changes to enable insane compression ratios). This gave us a nice, normally distributed latent space on which to train a diffusion transformer (like Sora). Our diffusion model is also particularly interesting because it is the first audio diffusion model to generate coherent lyrics!

    We like diffusion models for music generation because they have some interesting properties that make controlling them easier (so you can make your own music instead of just taking what the machine gives you). For example, we have a rhythm control mode where you can upload your own percussion line or set a BPM. Very soon you'll also be able to generate proper variations of an uploaded or previously generated song (e.g., you could even sing into Voice Memos for a minute and upload that!). @Musicians of HN, try uploading your songs and using Rhythm Control/let us know what you think! Our goal is to enable more of you, not replace you.

    For example, we turned this drum line (https://sonauto.ai/songs/uoTKycBghUBv7wA2YfNz) into this full song (https://sonauto.ai/songs/KSK7WM1PJuz1euhq6lS7 skip to 1:05 if impatient) or this other song I like better (https://sonauto.ai/songs/qkn3KYv0ICT9kjWTmins we accidentally compressed it with AAC instead of Opus, which hurt quality, though).

    We also like diffusion models because while they're expensive to train, they're cheap to serve. We built our own efficient inference infrastructure instead of using those expensive inference as a service startups that are all the rage. That's why we're making generations on our site FREE and UNLIMITED for as long as possible.

    We'd love to answer your questions. Let us know what you think of our first model! https://sonauto.ai/

  • TSAC: Low Bitrate Audio Compression
    4 projects | news.ycombinator.com | 8 Apr 2024
    Another useful model to compare to would be DAC https://github.com/descriptinc/descript-audio-codec

    This is the codec that TSAC extended, so it could be a nice comparison to see. I'd also echo Vocos (from sibling comment), it operates on the same Encodec tokens but generally has better reconstruction quality.

opus

Posts with mentions or reviews of opus. We have used some of these posts to build our list of alternatives and similar projects. The last one was on 2024-04-08.
  • TSAC: Low Bitrate Audio Compression
    4 projects | news.ycombinator.com | 8 Apr 2024
    Opus doesn't support 44.1 kHz because of compatibility and the effort/benefit ratio:

    https://github.com/xiph/opus/issues/43

    The browser audio limitation is presumably a workaround to some bug or performance limitation that was relevant at some point in history (the site was created in 2014).
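A minimal sketch of what the 44.1 kHz limitation means in practice: Opus encoders accept input at 8, 12, 16, 24, or 48 kHz, so 44.1 kHz material is typically resampled before encoding. The helper name `opus_target_rate` is hypothetical and only picks a target rate; the resampling itself is left to whatever tool is in use.

```python
# Sample rates the Opus encoder accepts as input, in Hz.
OPUS_SAMPLE_RATES = (8000, 12000, 16000, 24000, 48000)


def opus_target_rate(source_rate):
    """Pick the lowest Opus-supported rate that is >= source_rate.

    Hypothetical helper for illustration. Rates above 48 kHz fall
    back to 48 kHz, the highest rate Opus supports.
    """
    for rate in OPUS_SAMPLE_RATES:
        if rate >= source_rate:
            return rate
    return OPUS_SAMPLE_RATES[-1]


if __name__ == "__main__":
    for source in (16000, 44100, 96000):
        print(source, "->", opus_target_rate(source))
```

Under these assumptions, CD-rate audio at 44.1 kHz maps to 48 kHz, so no content is discarded by the rate change.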

  • Permutation Iteration and Random Access
    4 projects | news.ycombinator.com | 23 Aug 2023
    There is a pattern here (that also goes with the author's prior article on inverting Gauss's sum formula): generally, if you can make a formula that counts the combinations of things, you can convert that into code to encode and decode those combinations into indexes.

    So, for example, the Opus audio codec needs to encode/decode vectors of dimension n whose absolute values sum to k. https://github.com/xiph/opus/blob/master/celt/cwrs.c#L74

    Or this rolling cuckoo filter that optimally encodes/decodes four sorted numbers in a range 0..2N with the constraint that they span a range of N. https://github.com/sipa/bitcoin/blob/202006_cuckoo_filter/sr...

    If you're lucky there will be closed form expressions for the encoding and decoding equations. (There are for both of the above, at least for some parameters, but in both those examples the implementations use small tables because for the ranges involved the tables end up being faster than sqrts.)
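The counting-to-indexing pattern described in this post can be sketched for the Opus-style case: integer vectors of dimension n whose absolute values sum to k. The sketch below is an illustration under stated assumptions, not the code in `cwrs.c`. The function names (`count`, `encode`, `decode`) and the index ordering (each coordinate enumerated from -k up to k) are choices made here for clarity, and the real implementation may order codewords differently and uses tables for speed.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def count(n, k):
    """Number of length-n integer vectors whose absolute values sum to k."""
    if k == 0:
        return 1  # only the all-zero vector
    if n == 0:
        return 0  # no coordinates left to absorb a positive k
    # Recurrence: V(n, k) = V(n-1, k) + V(n, k-1) + V(n-1, k-1)
    return count(n - 1, k) + count(n, k - 1) + count(n - 1, k - 1)


def encode(vec, k):
    """Map a vector with sum(|x|) == k to an index in range(count(len(vec), k))."""
    if sum(abs(x) for x in vec) != k:
        raise ValueError("absolute values must sum to k")
    n = len(vec)
    index = 0
    for i, x in enumerate(vec):
        remaining = n - i - 1
        # Skip over every vector whose coordinate here is smaller than x.
        for v in range(-k, x):
            index += count(remaining, k - abs(v))
        k -= abs(x)
    return index


def decode(index, n, k):
    """Inverse of encode: recover the vector from its index."""
    if not 0 <= index < count(n, k):
        raise ValueError("index out of range")
    vec = []
    for i in range(n):
        remaining = n - i - 1
        for v in range(-k, k + 1):
            block = count(remaining, k - abs(v))
            if index < block:
                vec.append(v)
                k -= abs(v)
                break
            index -= block
    return vec


if __name__ == "__main__":
    n, k = 3, 2
    for i in range(count(n, k)):
        vec = decode(i, n, k)
        assert encode(vec, k) == i
        print(i, vec)
```

The counting formula does all the work: `encode` sums the sizes of the blocks of vectors that sort before the input, and `decode` walks the same blocks in the same order until the index falls inside one.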

  • A CPU in Sunvox
    1 project | news.ycombinator.com | 18 Aug 2023
    Too bad 10Hz is too slow to generate audio-rate bitops music.

    (e.g. https://github.com/xiph/opus/blob/master/tests/test_opus_enc... )

  • The future of the Hadopi law hangs on a decision by the European courts
    1 project | /r/france | 16 Apr 2023
  • Global Underground Disk Images
    1 project | /r/House | 22 Mar 2023
    Could anyone help me get disk image files for older Global Underground CDs? I encoded my old CDs into subpar mp3 files, and I'd now like to have high-quality Opus encodings and experiment across various bitrates.
  • Which is better Opus or AC3?
    1 project | /r/AskReddit | 18 Mar 2023
    Presumably, OP is referring to the Opus audio codec versus Dolby's AC3 codec.
  • HD: Opus?
    1 project | /r/pixelbuds | 8 Mar 2023
    Indeed. https://opus-codec.org/
  • Multiple tags with the same name in metadata
    2 projects | /r/ffmpeg | 8 Jan 2023
    If there are multiple tags with the same name, FFmpeg will only use the last tag. If you really need to have multiple tags with the same name in your Opus files, use opusenc instead (https://opus-codec.org/). Beware that some playback software does not display multiple artists gracefully.
  • I built a Zoom clone 100% IN RUST
    12 projects | /r/rust | 24 Oct 2022
    AFAIK ogg isn't really suitable for low latency audio streaming. Consider the Opus codec instead.
  • ffmpeg libopus producing larger file size for the same bitrate as compared to vorbis
    1 project | /r/opus | 12 Oct 2022
    I have also asked on GitHub (https://github.com/xiph/opus/issues/263) if anyone wants to respond there.

What are some alternatives?

When comparing descript-audio-codec and opus you can also consider the following projects:

libvorbis - Haskell binding for libvorbis, for decoding Ogg Vorbis audio files

go-m3u8 - Parse and generate m3u8 playlists for Apple HTTP Live Streaming (HLS) in Golang (ported from gem https://github.com/sethdeckard/m3u8)

argos-translate - Open-source offline translation library written in Python

vorbis - Reference implementation of the Ogg Vorbis audio format.

vgmstream - vgmstream - A library for playback of various streamed audio formats used in video games.

libopenaptx - Open Source implementation of Audio Processing Technology codec (aptX)

vgmstream - vgmstream - A library for playback of various streamed audio formats used in video games. [Moved to: https://github.com/vgmstream/vgmstream]

HanBaoBao - Mandarin Chinese text segmentation and mobile dictionary Android app (中文分词)

audiopus_sys - Rust FFI-binding of Opus.

AppleNeuralHash2ONNX - Convert Apple NeuralHash model for CSAM Detection to ONNX.

homebrew-macos-cross-toolchains - macOS cross compiler toolchains

concentus.oggfile - Implementing support for reading/writing .opus audio files using Concentus