> I am very much a beginner in the space of machine learning
While the (precious and useful) advice around seems to cover mostly the bigger infrastructures, please note that
you can effectively do an important slice of machine learning work (study, personal research) with just a power-efficient CPU (no GPU), in the order of minutes, running on battery. That comes before going to "Big Data".
And there are lightweight tools: I am currently enamoured with Genann («minimal, well-tested open-source library implementing feedforward artificial neural networks (ANN) in C»), a single C file of 400 lines compiling to a 40 kB object, yet well sufficient to solve a number of the problems you may meet.
https://codeplea.com/genann // https://github.com/codeplea/genann
After all, is it a good idea to have tools that automate process optimization while you are still learning the trade? Only partially. In general, and even metaphorically, you should build the legitimacy of your Python operations on a solid C foundation.
And note that you can also build ANNs in R (and other math or stats environments), if that is needed or more comfortable for you.
Also note, as a reminder, that Prof. Patrick Winston's MIT lectures for the Artificial Intelligence course (classical AI, with a few lectures on ANNs) are freely available. They cover the groundwork needed to climb into the newer techniques.
I created something that lets you get a free GPU in VS Code via Google Colab with a single click. Have a look at https://github.com/DerekChia/colab-vscode
This is my default go-to as a poor man's ML setup, with the environment and dependencies configured automatically via a bash script on startup.
If you want to build a web application on top of your ML project, give https://hal9.com a shot. We designed Hal9 for ease of deployment and maximum compatibility with web technologies, so you can build ML apps with React, Vue, etc. We launched a couple of months ago and could use some early feedback and users. Thank you!
In case you want to start creating batch jobs too, I'd recommend checking out Orchest (www.orchest.io). It has a generous free tier and supports GPU instances. The platform itself is also self-hostable and open source (https://github.com/orchest/orchest).
The main advantages are its interactive pipeline editor, support for Jupyter notebooks in the pipeline/DAG context, and a simple way to specify environment dependencies. It also supports automatic starting and stopping of instances, so you only pay for the compute needed to run your data pipelines.
Disclosure, I’m one of the creators.
One of my deal breakers when choosing tooling is how easy it is to move from a local environment to a distributed one. Ideally, you want to start locally and move to a distributed environment only if you need to, so choose a tool that lets you get started quickly and grow from there.
As an example: one of the reasons I don't use Kubeflow is that it requires having a Kubernetes cluster up and running, which is overkill in many cases.
Check out the project I'm working on: https://github.com/ploomber/ploomber