How to train large deep learning models as a startup

This page summarizes the projects mentioned and recommended in the original post on news.ycombinator.com

  • determined

    Determined is an open-source machine learning platform that simplifies distributed training, hyperparameter tuning, experiment tracking, and resource management. Works with PyTorch and TensorFlow.

  • Check out Determined (https://github.com/determined-ai/determined) to help manage this kind of work at scale. Determined leverages Horovod under the hood, automatically manages cloud resources, can get you up and running on spot instances, T4s, etc., and will work on your local cluster as well. It also gives you additional features like experiment management, scheduling, profiling, a model registry, advanced hyperparameter tuning, etc.

    Full disclosure: I'm a founder of the project.

  • snowboy

    Future versions with model training module will be maintained through a forked version here: https://github.com/seasalt-ai/snowboy

  • Great question. This is technically referred to as "Wake Word Detection". You run a really small model locally that processes just 500 ms (for example) of audio at a time through a lightweight CNN or RNN. The idea is that it's just binary classification (vs. actual speech recognition).

    There are some open source libraries that make this relatively easy:

    - https://github.com/Kitt-AI/snowboy (looks to be shut down now)

  • pocketsphinx

    A small speech recognizer

  • https://github.com/cmusphinx/pocketsphinx

    Running the wake word model locally avoids having to stream audio 24/7 to a cloud model, which would be super expensive. That said, I'm pretty sure what Alexa does, for example, is send any positive wake word to a cloud model (one that is bigger and more accurate) to verify the prediction of the local wake word detection model.
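
    The two-stage flow described above can be sketched roughly as follows. This is only an illustration under stated assumptions: `local_score`, `cloud_verify`, the 16 kHz sample rate, the 50% window overlap and the threshold are all made-up stand-ins. A real device would run a small CNN or RNN locally, and nothing here reflects how Alexa is actually implemented.

```python
from typing import Callable, Iterator, List, Sequence

SAMPLE_RATE = 16_000          # assumed sample rate in Hz
WINDOW = SAMPLE_RATE // 2     # 500 ms of audio per window
HOP = WINDOW // 2             # 50% overlap between windows (assumption)


def windows(samples: Sequence[float], size: int = WINDOW,
            hop: int = HOP) -> Iterator[Sequence[float]]:
    """Yield fixed-size, overlapping windows over the audio samples."""
    for start in range(0, len(samples) - size + 1, hop):
        yield samples[start:start + size]


def local_score(window: Sequence[float]) -> float:
    """Hypothetical stand-in for the tiny on-device model.

    Returns the mean absolute amplitude clipped to [0, 1]. A real
    detector would return a wake-word probability from a small CNN/RNN.
    """
    return min(1.0, sum(abs(x) for x in window) / len(window))


def detect(samples: Sequence[float],
           cloud_verify: Callable[[Sequence[float]], bool],
           threshold: float = 0.5) -> List[int]:
    """Return start offsets of windows confirmed by both stages."""
    confirmed = []
    for i, window in enumerate(windows(samples)):
        # Stage 1: cheap binary decision, run locally on every window.
        if local_score(window) < threshold:
            continue
        # Stage 2: only local positives are sent to the bigger model.
        if cloud_verify(window):
            confirmed.append(i * HOP)
    return confirmed
```

    The point of the design is that `cloud_verify` (the expensive call) runs only on the rare windows the local model flags, rather than on the whole audio stream.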

  • Spoken-Keyword-Spotting

    In this repository, we explore using a hybrid system consisting of a Convolutional Neural Network and a Support Vector Machine for Keyword Spotting task.

  • The search term you're looking for is "Keyword Spotting" - and that's what's implemented locally for ~embedded devices that sit and wait for something relevant to come along so that they know when to start sending data up to the mothership (or even turn on additional higher-power cores locally).

    Here's an example repo that might be interesting (going by initial impressions; there are many more out there): https://github.com/vineeths96/Spoken-Keyword-Spotting
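
    As a rough sketch of the "sit and wait, then start sending" behaviour, the gating logic can be modelled as a two-state machine. Everything named here is hypothetical: `is_keyword` stands in for whatever spotter the device runs, and `stream_len` is an assumed number of frames to forward after each detection.

```python
from enum import Enum
from typing import Callable, Iterable, List


class State(Enum):
    LISTENING = "listening"   # low-power: only the keyword spotter runs
    STREAMING = "streaming"   # keyword heard: frames are forwarded upstream


def gate_frames(frames: Iterable[str],
                is_keyword: Callable[[str], bool],
                stream_len: int = 3) -> List[str]:
    """Return only the frames that would be sent upstream."""
    state, remaining, sent = State.LISTENING, 0, []
    for frame in frames:
        if state is State.LISTENING:
            if is_keyword(frame):
                # Wake up: forward the next `stream_len` frames.
                state, remaining = State.STREAMING, stream_len
            continue  # nothing leaves the device while listening
        sent.append(frame)
        remaining -= 1
        if remaining == 0:
            state = State.LISTENING  # drop back to low-power mode
    return sent
```

    For example, with frames `["a", "b", "hey", "c", "d", "e", "f"]` and a spotter that fires on `"hey"`, only `"c"`, `"d"` and `"e"` would be forwarded.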

  • xla

    Enabling PyTorch on XLA Devices (e.g. Google TPU)

NOTE: The number of mentions on this list indicates mentions on common posts plus user suggested alternatives. Hence, a higher number means a more popular project.
