* training: history, comparisons, parameters, and hyperparameter tuning with Optuna, Hyperopt, or a custom optimizer (https://github.com/valohai/optimo), plus visualizations of training progress and hardware resource monitoring
Build data pipelines, the easy way 🛠️
Others have mentioned some cool projects in this space, but you mentioned self-hosted specifically, so I’ll share what we’re working on since it might match what you’re looking for.
As a new project, we are still figuring out some of the major topics you described.
In short, we built a data science pipeline tool that should fit well with existing workflows in machine learning and data science. We chose to embrace and integrate open source projects to create a simple and seamless experience with best-of-breed solutions for various tasks.
We are particularly happy with our deep integration of JupyterLab, which builds on the Jupyter Enterprise Gateway project from IBM (CODAIT) to connect kernels directly to your pipelines. For scheduling, we build on top of Celery combined with containerization primitives. For stable and well-defined dependency management, we built a small environment abstraction on top of Docker. It works really well in our experience!
Feel free to check out the project at https://github.com/orchest/orchest
Self-hosting should be as easy as running about two lines of code.
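For anyone unfamiliar with the general idea of a pipeline tool, here is a rough sketch in plain Python of running steps in dependency order. It is only an illustration, not the project's actual API: the step names, `run_step`, and `run_pipeline` are all hypothetical, and a real tool would hand each step to a scheduler and run it in a container rather than call a function in-process.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical pipeline: each step maps to the set of steps it depends on.
pipeline = {
    "load_data": set(),
    "clean_data": {"load_data"},
    "train_model": {"clean_data"},
    "evaluate": {"train_model", "clean_data"},
}


def run_step(name, results):
    """Placeholder for real work (e.g. a notebook or script run in a container)."""
    results[name] = f"{name} done"


def run_pipeline(steps):
    """Run every step after all of its dependencies have finished."""
    results = {}
    # static_order() yields each step only after its dependencies.
    order = list(TopologicalSorter(steps).static_order())
    for name in order:
        run_step(name, results)
    return order, results


order, results = run_pipeline(pipeline)
print(order)
```

The dependency graph is the only thing a user has to declare here; the ordering falls out of it, which is the part a pipeline tool automates.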
:rocket: Build and manage real-life data science projects with ease!
Has anyone done a comparison of ML pipelines from a DevOps-centric perspective?
For example, Metaflow doesn't support Kubernetes today - https://github.com/Netflix/metaflow/issues/16
So ultimately the scale-up story in most of these management tools is iffy.
I previously asked about Kubeflow here - https://news.ycombinator.com/item?id=24808090 . Seems people think it's pretty "horrendous". It seems most of these tools assume a very specialised DevOps team who will work around the ML tool, rather than the ML tool making this easy.
AWS Summit 2022 Australia and New Zealand - Day 2, AI/ML Edition
1 project | dev.to | 20 May 2022
Simplest way to run large batch jobs in the cloud?
1 project | reddit.com/r/dataengineering | 19 Feb 2022
2 projects | news.ycombinator.com | 23 Jan 2022
Help on understanding mlops tools.
1 project | reddit.com/r/mlops | 6 Oct 2021
A few reasons why internal product management is awesome and not a downgrade
1 project | reddit.com/r/ProductManagement | 23 Aug 2021