Ask HN: What ML platform are you using?

This page summarizes the projects mentioned and recommended in the original post on news.ycombinator.com

  • Genann

    simple neural network library in ANSI C

  • > I am very much a beginner in the space of machine learning

    While the (precious and useful) advice here seems to cover mostly the bigger infrastructures, please note that you can do an important slice of machine learning work (study, personal research) with just a low-power, battery-efficient CPU (not a GPU), in runs on the order of minutes, on battery power. That comes before going to "Big Data".

    And there are lightweight tools: I am currently enamoured with Genann («minimal, well-tested open-source library implementing feedforward artificial neural networks (ANN) in C»), a single C file of 400 lines compiling to a 40 kB object, yet quite sufficient to solve a number of the problems you may meet.

    https://codeplea.com/genann // https://github.com/codeplea/genann

    After all, is it a good idea to have tools that automate process optimization while you are still learning the trade? Only partially. You should build - in general and even metaphorically - the legitimacy of your Python ops on solid C ground.

    And: note that you can also build ANNs in R (and other math or stats environments). If needed or comfortable...

    Also note, as a reminder, that the MIT lectures of Prof. Patrick Winston for the Artificial Intelligence course (classical AI with a few lectures on ANNs) are freely available. They cover the groundwork you need before climbing into the newer techniques.
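
To give a sense of how small a feedforward network can be, here is a minimal sketch in plain Python (standard library only). It is not Genann's API and involves no training: the function names and the hand-picked weights are assumptions made for illustration, chosen so that a 2-2-1 sigmoid network computes XOR.

```python
import math


def sigmoid(z):
    """Logistic activation, squashing any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def layer(inputs, weights, biases):
    """One dense layer: each output is sigmoid(dot(row, inputs) + bias)."""
    return [
        sigmoid(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(weights, biases)
    ]


# Hand-picked (not learned) weights, for illustration only.
# Hidden unit 1 approximates OR, hidden unit 2 approximates NAND,
# and the output unit approximates AND of the two, which yields XOR.
HIDDEN_W = [[20.0, 20.0], [-20.0, -20.0]]
HIDDEN_B = [-10.0, 30.0]
OUTPUT_W = [[20.0, 20.0]]
OUTPUT_B = [-30.0]


def xor_net(x1, x2):
    """Forward pass through the 2-2-1 network; returns a value in (0, 1)."""
    hidden = layer([x1, x2], HIDDEN_W, HIDDEN_B)
    return layer(hidden, OUTPUT_W, OUTPUT_B)[0]


if __name__ == "__main__":
    for a in (0, 1):
        for b in (0, 1):
            print(a, b, round(xor_net(a, b)))
```

A library such as Genann would learn weights like these from examples instead of having them written by hand; the forward pass itself stays about this simple.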

  • colab-vscode

    ✨ 1-Click Free GPU on VS Code with Google Colab

  • I created something that gets you a free GPU in VS Code through Google Colab with just one click. Have a look at https://github.com/DerekChia/colab-vscode

    This is my default go-to as a poor man's ML setup, with the environment and dependencies set up automatically via a bash script on startup.

  • hal9ai

    Discontinued Hal9 — Data apps powered by code and LLMs [Moved to: https://github.com/hal9ai/hal9]

  • If you want to build a web application on top of your ML project, give https://hal9.com a shot. We designed Hal9 with ease of use for deployment and maximum compatibility with web technologies that enable you to build ML apps with React, Vue, etc. We launched a couple months ago but could use some early feedback and users. Thank you!

  • orchest

    Build data pipelines, the easy way 🛠️

  • In case you want to start creating batch jobs too, I’d recommend checking out Orchest (www.orchest.io). It has a generous free tier and supports GPU instances. The platform itself is also self-hostable and open source (https://github.com/orchest/orchest).

    The main advantages are its interactive pipeline editor, support for Jupyter notebooks in the pipeline/DAG context, and a simple way to specify environment dependencies. It also supports automatic starting and stopping of instances, so you only pay for the compute necessary to run your data pipelines.

    Disclosure: I’m one of the creators.

  • ploomber

    The fastest ⚡️ way to build data pipelines. Develop iteratively, deploy anywhere. ☁️

  • One of my deal breakers when choosing tooling is how easy it is to move from a local environment to a distributed one. Ideally, you want to start locally and move to a distributed environment only if you need to. So choose a tool that lets you get started quickly and grow from there.

    As an example: one of the reasons I don't use Kubeflow is that it requires having a Kubernetes cluster up and running, which is overkill in many cases.

    Check out the project I'm working on: https://github.com/ploomber/ploomber

