Native Matrix VoIP with Element Call

This page summarizes the projects mentioned and recommended in the original post on news.ycombinator.com

  • matrix-spec-proposals

    Proposals for changes to the matrix specification

  • Yes, but the current UX for that is terrible. With the proposed MSC linked in the blog post, Matrix will additionally gain Teamspeak/Discord-like voice channels [1].

    With this we could finally have a proper Teamspeak/Mumble bridge for voice that gets properly represented on the Matrix side which is amazing :D

    Also kinda funny that Teamspeak only recently started using Matrix for their global chat feature [2].

    Maybe now after the gitter acquisition, Element should consider acquiring a certain voice-focused company ;)

    [1] https://github.com/matrix-org/matrix-spec-proposals/blob/mat...

    [2] https://community.teamspeak.com/t/teamspeak-5-beta-bug-repor...

  • matrix-js-sdk

    Matrix Client-Server SDK for JavaScript

  • Currently we don't force TURN, so in practice this means that voice packets go direct between the clients if possible, and so the IP addresses of the clients are necessarily exposed to each other.

    However, this is utterly trivial to fix: matrix-js-sdk already exposes https://github.com/matrix-org/matrix-js-sdk/blob/96ba061732b... and we simply haven't exposed it as a setting in Element Call yet. I've filed a bug for it at https://github.com/vector-im/element-call/issues/251 - thanks for bringing it up!
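As a rough illustration of what "forcing TURN" means at the WebRTC level: the standard `RTCConfiguration` dictionary has an `iceTransportPolicy` field, and setting it to `"relay"` restricts ICE to relayed candidates, so peers see the TURN server's address rather than each other's. The sketch below only builds such a configuration object. The `buildPeerConfig` helper, the local interfaces and the `turn.example.org` address are hypothetical and are not taken from matrix-js-sdk or Element Call.

```typescript
// Minimal local mirrors of the WebRTC dictionaries, so the sketch compiles
// without the DOM type library. Field names follow the WebRTC spec.
interface IceServer {
  urls: string | string[];
  username?: string;
  credential?: string;
}

interface PeerConfig {
  iceServers: IceServer[];
  iceTransportPolicy: "all" | "relay";
}

// Hypothetical helper: choose the ICE transport policy from a user setting.
// "relay" -> only TURN-relayed candidates are used (client IPs stay hidden from peers).
// "all"   -> direct peer-to-peer candidates are allowed when possible.
function buildPeerConfig(turnServers: IceServer[], forceTurn: boolean): PeerConfig {
  return {
    iceServers: turnServers,
    iceTransportPolicy: forceTurn ? "relay" : "all",
  };
}

// Example with a placeholder TURN server.
const exampleConfig = buildPeerConfig(
  [{ urls: "turn:turn.example.org", username: "user", credential: "secret" }],
  true,
);
```

In a browser, an object of this shape can be passed to `new RTCPeerConnection(exampleConfig)`. The trade-off is that every call then goes through the relay, which adds latency and TURN server load.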

    In terms of moderation: this is no different to moderation in Matrix as a whole, where we're already busy working on shared greylists (MSC2313 and friends) - https://matrix.org/blog/2020/10/19/combating-abuse-in-matrix... has more details at the end.

  • matrix-doc

    Discontinued. Proposals for changes to the matrix specification [Moved to: https://github.com/matrix-org/matrix-spec-proposals]

  • From my perspective, the really exciting thing about this is that it works equally well in mobile web browsers and on desktop web - clicking on a link on Mobile Safari should Do The Right Thing without having to install anything.

    Moreover, because it's built on Matrix, MSC3401 (https://github.com/matrix-org/matrix-doc/blob/matthew/group-...) means that we'll finally have decentralised cascading video/voice conferences once the SFU (selective forwarding unit) component is added into the mix. So, for instance, users on the same homeserver will get their video feeds relayed locally with minimal latency... and then users on another remote homeserver will also get mixed locally with minimal latency, trunking the two together. If the link dies or one homeserver dies, the conference will keep going - i.e. precisely the same semantics as normal Matrix.
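To make the "relayed locally per homeserver" idea concrete, here is a small sketch that groups call participants by the homeserver part of their Matrix user ID (`@localpart:server`). It is only an illustration of the grouping step, under the assumption that each group would be served by an SFU near its own homeserver. The `groupByHomeserver` function and the example user IDs are hypothetical, and nothing here reflects the actual MSC3401 event format.

```typescript
// Extract the server name from a Matrix user ID such as "@alice:example.org".
// The localpart cannot contain ":", so everything after the first ":" is the
// server name (which may itself include a port, e.g. "example.org:8448").
function homeserverOf(userId: string): string {
  const sep = userId.indexOf(":");
  if (!userId.startsWith("@") || sep < 0) {
    throw new Error(`not a Matrix user ID: ${userId}`);
  }
  return userId.slice(sep + 1);
}

// Hypothetical helper: bucket participants by homeserver, so that each bucket
// could be attached to an SFU close to that homeserver.
function groupByHomeserver(userIds: string[]): Map<string, string[]> {
  const groups = new Map<string, string[]>();
  for (let i = 0; i < userIds.length; i++) {
    const server = homeserverOf(userIds[i]);
    const members = groups.get(server);
    if (members) {
      members.push(userIds[i]);
    } else {
      groups.set(server, [userIds[i]]);
    }
  }
  return groups;
}

const exampleGroups = groupByHomeserver([
  "@alice:a.example",
  "@bob:a.example",
  "@carol:b.example:8448",
]);
```

In the cascading model described above, the SFUs serving the two groups would then be trunked together so that each feed crosses the inter-server link once.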

  • seshat

    A Matrix message database/indexer

  • The competitive gap with Discord in terms of media quality is probably something like:

    * Need a low-latency SFU. This should be very doable; not only are there a lot of good FOSS SFUs to build on top of these days, the history of the Matrix team is actually that we built VoIP stacks fulltime before we shifted focus to Matrix, and we've built MCUs and media servers of all flavours in the past. MSC3401 should also give us a competitive edge given latency will be automagically minimised by using the physically closest decentralised SFU, and letting anyone bring their SFU to the party.

    * Needs a SFU with good rate control (and/or FEC). This is probably the single most important thing to get right in terms of quality. Signal wrote up a good overview of why: https://signal.org/blog/how-to-build-encrypted-group-calls/

    * Excellent noise cancellation (and background noise elimination, microphone scratch noise elimination etc). Ideally you need something like https://krisp.ai/ or https://workspaceupdates.googleblog.com/2021/06/background-n... in the mix - but doing this in an E2EE-friendly and privacy preserving manner is Hard. However, just like we solved E2EE full text search by doing it clientside and making the indexes gossipable between your clients (https://github.com/matrix-org/seshat), we'll have a go at doing something similar for this problem too.

    * Excellent automatic gain control. Normalising/compressing everyone's audio so that all participants are at equivalent loudness is really important.
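The loudness-normalising idea in the last point can be sketched in a few lines. This is a deliberately simplified, per-frame version: it measures the RMS level of a frame of samples and scales it towards a target level, with a cap on the gain so that near-silence is not amplified into noise. Real AGC implementations smooth the gain over time and are considerably more involved. The function names and the `targetRms` and `maxGain` defaults are assumptions chosen for illustration, not values from Element Call or WebRTC.

```typescript
// Root-mean-square level of a frame of samples in the range [-1, 1].
function rms(frame: Float32Array): number {
  if (frame.length === 0) return 0;
  let sumSquares = 0;
  for (let i = 0; i < frame.length; i++) {
    sumSquares += frame[i] * frame[i];
  }
  return Math.sqrt(sumSquares / frame.length);
}

// Hypothetical per-frame normaliser: scale the frame so its RMS approaches
// targetRms, never applying more than maxGain, and clip to [-1, 1].
function normaliseFrame(
  frame: Float32Array,
  targetRms: number = 0.1,
  maxGain: number = 10,
): Float32Array {
  const out = new Float32Array(frame.length);
  const level = rms(frame);
  if (level === 0) return out; // pure silence stays silent
  const gain = Math.min(targetRms / level, maxGain);
  for (let i = 0; i < frame.length; i++) {
    out[i] = Math.max(-1, Math.min(1, frame[i] * gain));
  }
  return out;
}
```

With these defaults, a quiet speaker and a loud speaker both end up near the same RMS level, while a frame that is almost silent is boosted by at most a factor of ten.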

    We're also in the process of adding in spatial audio (unsure if Discord has that) which should help a tonne with distinguishing the different audio feeds.

    We can probably also be more bullish about supporting new audio codecs like Lyra.

  • element-call

    Group calls powered by Matrix

  • rnnoise-wasm

    rnnoise noise suppression library as a WASM module

  • FedCM

    A privacy preserving identity exchange Web API

  • Has support for IndieAuth and/or FedCM (when stable enough) been considered?

    https://aaronparecki.com/2018/07/07/7/oauth-for-the-open-web

    https://github.com/fedidcg/FedCM

  • https://gitlab.com/ptman/matrix-login - this could of course be modified to support OIDC

  • nnnoiseless

    Recurrent neural network for audio noise reduction

  • 1. So the SFUs we're currently looking at are yours, ion-sfu (and/or galene) and mediasoup. Honestly we haven't finished looking at how they compare for rate control, but the Pion team seems very interested in ensuring they have good rate control.

    2. From context I think you're talking about noise cancellation here? I assumed that some of the more exotic ML-based ones ran serverside, which obviously is incompatible with E2EE. It sounds like there are a bunch of options for running WASM-based intelligent noise cancellation clientside though, especially with MediaStreamTrackProcessor and friends. nnnoiseless as a pure Rust->WASM port of rnnoise looks fun, for instance: https://github.com/jneem/nnnoiseless

    3. True, although given Google are highly motivated to make AEC work properly in WebRTC, I guess I'm hoping that they'll continue improving it, much as they have been. I certainly never want to have to write or integrate one ever again :D
