hopsworks
serverless-ml-course
| | hopsworks | serverless-ml-course |
|---|---|---|
| Mentions | 4 | 5 |
| Stars | 1,074 | 483 |
| Growth | 1.4% | 17.8% |
| Activity | 9.2 | 2.8 |
| Last commit | 6 days ago | about 2 months ago |
| Language | Java | Jupyter Notebook |
| License | GNU Affero General Public License v3.0 | Creative Commons Zero v1.0 Universal |
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
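As a rough illustration of the Growth figure, the sketch below assumes the usual definition of month-over-month growth: the change in stars over the month as a percentage of the previous month's count. The function name and the star counts are hypothetical; the exact formula used for the table above is not stated on this page.

```python
def month_over_month_growth(previous_stars: int, current_stars: int) -> float:
    """Return the percentage change in stars from one month to the next.

    Assumed definition (not confirmed by the page):
    (current - previous) / previous, expressed as a percentage.
    """
    if previous_stars <= 0:
        raise ValueError("previous_stars must be positive")
    return (current_stars - previous_stars) * 100.0 / previous_stars


# Hypothetical star counts, chosen only to show the arithmetic.
growth = month_over_month_growth(previous_stars=200, current_stars=250)
print(f"{growth:.1f}%")
```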
hopsworks
- Hopsworks: MLOps platform with Python-centric Feature Store
- Show HN: Feature Store and Model Registry; Hopsworks 3.0
- [D] Your 🫵 Preferred Feature Stores?
  Anyways -> https://github.com/logicalclocks/hopsworks
- Reflections on the Lack of Adoption of Domain Specific Languages [pdf]
  We built the first open-source feature store for ML, https://github.com/logicalclocks/hopsworks, when every existing proprietary feature store (Uber's Michelangelo and Bighead at Airbnb) was shouting about how its DSL for feature engineering was the future.
  Fast-forward 2 years and it is clear that Data Scientists want to work with Python, not with a DSL. We based our Feature Store on a Dataframe API for Python/PySpark. The DSL can never evolve at the same rate as libraries in a general-purpose programming language. So, your DSL is great for showcasing a Feature Store, but when you need to compute embeddings or train a GAN or do any type of feature engineering that is not a simple time-window aggregation, you pull out Python (or Scala/Java). I am old enough to have seen many DSLs in different domains (GUIs, aspect-oriented programming, feature engineering) have their day in the sun only to be replaced by general-purpose programming languages due to their unmatched utility.
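For readers unfamiliar with the "simple time-window aggregation" mentioned in the comment above, here is a minimal sketch in plain Python. It is not the Hopsworks API; the function `window_sum`, the event layout, and the sample data are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def window_sum(events, as_of, window):
    """Sum each entity's values over the interval (as_of - window, as_of].

    `events` is an iterable of (entity_id, timestamp, value) tuples.
    Returns a dict mapping entity_id to the windowed sum.
    """
    start = as_of - window
    totals = defaultdict(float)
    for entity_id, ts, value in events:
        # Keep only events inside the trailing window.
        if start < ts <= as_of:
            totals[entity_id] += value
    return dict(totals)


# Hypothetical purchase events: (customer_id, timestamp, amount).
events = [
    ("c1", datetime(2023, 1, 1, 9, 0), 20.0),
    ("c1", datetime(2023, 1, 1, 11, 30), 5.0),
    ("c2", datetime(2023, 1, 1, 11, 45), 12.5),
    ("c1", datetime(2023, 1, 1, 12, 0), 7.5),
]

# Feature: amount spent per customer in the last hour, as of 12:00.
features = window_sum(
    events,
    as_of=datetime(2023, 1, 1, 12, 0),
    window=timedelta(hours=1),
)
print(features)
```

Because this is ordinary Python, the same function can sit next to arbitrary library code (embedding models, custom transforms), which is the flexibility the comment argues a DSL struggles to match.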
serverless-ml-course
- Serverless Video Transcription inspired by Cyberpunk 2077
  https://github.com/featurestoreorg/serverless-ml-course
  Some of the students have built similar systems, for example chaining Whisper and ChatGPT or translation or sentiment analysis of transcribed text, such as here (transcribe Swedish and tell me the sentiment of the text):
- Best options for hosting datasets? (Currently using GitHub)
  On hopsworks.ai you get 10GB of free storage in their serverless feature store. You can find out more about it in this serverless-ml course: https://github.com/featurestoreorg/serverless-ml-course
- Don't build ML infra, build ML services in mins w Serverless ML – a course
- GitHub Action Transcription
  There is a serverless machine learning course that includes GitHub Actions to implement serverless feature pipelines and serverless batch inference pipelines.
  https://github.com/featurestoreorg/serverless-ml-course
  Disclaimer: I am involved in it.
- A free online course for serverless ML.
  GitHub: https://github.com/featurestoreorg/serverless-ml-course
What are some alternatives?
feathr - Feathr – A scalable, unified data and AI engineering platform for enterprise
Real-Time-Voice-Cloning - Clone a voice in 5 seconds to generate arbitrary speech in real-time
featureform - The Virtual Feature Store. Turn your existing data infrastructure into a feature store.
action-transcription-demo - A tool for creating a repository of transcribed videos
textX - Domain-Specific Languages and parsers in Python made easy http://textx.github.io/textX/
action-transcription - A tool for creating a repository of transcribed videos
feast - Feature Store for Machine Learning
llm-course - Course to get into Large Language Models (LLMs) with roadmaps and Colab notebooks.
OpenMLDB - OpenMLDB is an open-source machine learning database that provides a feature platform computing consistent features for training and inference.
cbp-translate
iwlearn - "Production First" Machine Learning Framework
excalidraw - Virtual whiteboard for sketching hand-drawn like diagrams