We were in almost exactly the same situation: workstations at the office and a GCP account. We started pushing for more remote work about two years ago, and we built our internal machine learning platform around our workflow, solving one problem after another. We built it ourselves precisely because we didn't want to be tied to one library/framework or to pollute our notebooks.
Here's what we're doing:
- No-setup, fresh, notebook environments with the most popular libraries pre-installed: this saves a lot of time, avoids having environments that break, etc. We start afresh with large Docker images, and then people can install libraries.
- Real-time collaboration on notebooks: so we can troubleshoot and pair-program on the same notebook at the same time, we can see cursors, selections, changes, etc. We also use that for our weekly calls where we have the agenda, code snippets, etc. all in the same place. We can talk through agenda items, edit in real-time, add snippets of code, and brainstorm.
- Multiple versions of your notebooks: there's a regression in JupyterLab that we fixed. We're working to make it available in upstream JupyterLab[0].
- Long-running notebook scheduling with output that survives closed tabs and network disruptions: we select the Docker image and the output file. This solves the common problem of launching a long-running notebook and then having to keep the browser tab open and pray the network connection doesn't break. Some people circumvent that problem by scheduling the notebook and saving the artifacts right from the notebook, but we wanted the output to be streamed whether the tab was closed or not. Bonus: we can watch the same notebook running on multiple laptops.
- Automatic experiment tracking: the platform detects your models, parameters, and metrics and saves them, without you having to remember to do so or pollute your notebook with tracking code.
- Easy model deployment: deploy your model and get a "REST endpoint", so data scientists don't have to tap on anyone's shoulder to deploy their model, and developers don't need to worry about ML dependencies or get dragged into the "ML realm" just to use the models. Each model also has a page where you can invoke it by entering JSON or uploading a CSV file and get predictions back.
- Build Docker images for your models and push them to a registry to use them wherever you want: currently we push to Docker Hub and GitLab.
- Monitor your models' performance on a live dashboard: requests, latency, performance, etc.
- Publish notebooks as AppBooks: automatically parametrize a notebook so clients can interact with it without exporting PDFs, building an application, or mutating the notebook. This is very useful when you want to expose some very domain-specific parameters to a domain expert. For example, some features in a nuclear engineering problem are very domain-specific, and a nuclear engineer can add a lot of value by tweaking the parameters themselves. It also spares our people from spinning up yet another VM on GCP, loading the model, creating a Flask application, setting up authentication, etc.
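To make the AppBooks idea a bit more concrete, here is a minimal sketch of how automatic parametrization could work. It is not our actual implementation. It assumes the notebook has a dedicated cell containing plain `name = literal` assignments, and it builds a separate override cell to run right after it, so the original notebook is never mutated. The function names (`find_parameters`, `build_override_cell`) and the example parameters (`enrichment`, `coolant`) are hypothetical.

```python
import ast


def find_parameters(cell_source: str) -> dict:
    """Return {name: value} for top-level `name = <literal>` assignments."""
    params = {}
    for node in ast.parse(cell_source).body:
        # Only consider simple single-target assignments such as `x = 1`.
        if (
            isinstance(node, ast.Assign)
            and len(node.targets) == 1
            and isinstance(node.targets[0], ast.Name)
        ):
            try:
                params[node.targets[0].id] = ast.literal_eval(node.value)
            except (ValueError, TypeError):
                # Not a plain literal (e.g. a function call): don't expose it.
                continue
    return params


def build_override_cell(cell_source: str, overrides: dict) -> str:
    """Return source for a new cell that re-assigns the chosen parameters.

    The original cell is left untouched; the returned source is meant to be
    executed immediately after it.
    """
    unknown = set(overrides) - set(find_parameters(cell_source))
    if unknown:
        raise KeyError(f"unknown parameters: {sorted(unknown)}")
    return "".join(f"{name} = {value!r}\n" for name, value in overrides.items())


# Hypothetical parameters cell from a notebook.
cell = "enrichment = 0.045\ncoolant = 'water'\nmodel = load_model()\n"

detected = find_parameters(cell)
override_cell = build_override_cell(cell, {"enrichment": 0.05})
```

Here `detected` contains only `enrichment` and `coolant`, since `load_model()` is not a literal, and `override_cell` is the one-line cell a domain expert's form input would turn into. A form for the expert could be generated from the names and value types in `detected`.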
There's much more on our roadmap. We're focusing only on actual problems we have faced while serving our clients, and problems we are facing now.
The notebook servers are run on an arbitrary cluster that happens to be ours for now, but we're pretty much approaching it with the following mindset: we are a company that uses this platform, and the platform just happens to be ours.