hopsworks
bytehub
Metric | hopsworks | bytehub
---|---|---
Mentions | 4 | 3
Stars | 1,074 | 57
Growth | 1.4% | -
Activity | 9.2 | 0.0
Last commit | 6 days ago | almost 3 years ago
Language | Java | Python
License | GNU Affero General Public License v3.0 | GNU General Public License v3.0 only
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
hopsworks
- Hopsworks: MLOps platform with Python-centric Feature Store
- Show HN: Feature Store and Model Registry; Hopsworks 3.0
- [D] Your 🫵 Preferred Feature Stores?
  Anyways -> https://github.com/logicalclocks/hopsworks
- Reflections on the Lack of Adoption of Domain Specific Languages [pdf]
  We built the first open-source feature store for ML, https://github.com/logicalclocks/hopsworks, when every existing proprietary feature store (Uber Michelangelo and Bighead at Airbnb) was shouting about how its DSL for feature engineering was the future.
  Fast-forward 2 years and it is clear that Data Scientists want to work with Python, not with a DSL. We based our Feature Store on a Dataframe API for Python/PySpark. The DSL can never evolve at the same rate as libraries in a general-purpose programming language. So, your DSL is great for showcasing a Feature Store, but when you need to compute embeddings, train a GAN, or do any type of feature engineering that is not a simple time-window aggregation, you pull out Python (or Scala/Java). I am old enough to have seen many DSLs in different domains (GUIs, aspect-oriented programming, feature engineering) have their day in the sun, only to be replaced by general-purpose programming languages due to their unmatched utility.
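For readers unfamiliar with the term, the "simple time-window aggregation" mentioned in the comment above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the event data, the `trailing_sum` helper, and the 3-day window are hypothetical, and the code does not use the Hopsworks or ByteHub APIs. It relies on the standard library alone to stay self-contained; a feature store's Dataframe API would typically express the same idea with pandas or PySpark.

```python
from collections import defaultdict
from datetime import datetime, timedelta

# Hypothetical events: (customer_id, timestamp, amount). Illustrative values only.
EVENTS = [
    (1, datetime(2024, 1, 1), 10.0),
    (1, datetime(2024, 1, 2), 20.0),
    (1, datetime(2024, 1, 5), 30.0),
    (2, datetime(2024, 1, 1), 5.0),
    (2, datetime(2024, 1, 3), 15.0),
]


def trailing_sum(events, window=timedelta(days=3)):
    """For each event, sum `amount` for the same customer over (ts - window, ts]."""
    # Group events per customer, ordered by customer and then by time.
    by_customer = defaultdict(list)
    for customer_id, ts, amount in sorted(events, key=lambda e: (e[0], e[1])):
        by_customer[customer_id].append((ts, amount))

    features = []
    for customer_id, rows in by_customer.items():
        for ts, _ in rows:
            # Window is open on the left and closed on the right.
            total = sum(a for t, a in rows if ts - window < t <= ts)
            features.append((customer_id, ts, total))
    return features


features = trailing_sum(EVENTS)
for customer_id, ts, total in features:
    print(customer_id, ts.date(), total)
```

Because the aggregation is an ordinary function, swapping the sum for any other Python computation (an embedding lookup, a model call) needs no change to the surrounding code, which is the flexibility the comment attributes to general-purpose languages.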
bytehub
- [D] Your 🫵 Preferred Feature Stores?
- ByteHub: simple timeseries data preparation in Python
  Hi everyone! We’ve been building a Python-based feature store called ByteHub. The aim is to make time series data easy to store, access, and transform when building machine-learning models. It’s available as an open-source library or as a low-cost cloud-hosted service.
- Show HN: Easy-to-use feature store for ML
What are some alternatives?
feathr - Feathr – A scalable, unified data and AI engineering platform for enterprise
featureform - The Virtual Feature Store. Turn your existing data infrastructure into a feature store.
fugue - A unified interface for distributed computing. Fugue executes SQL, Python, Pandas, and Polars code on Spark, Dask and Ray without any rewrites.
textX - Domain-Specific Languages and parsers in Python made easy http://textx.github.io/textX/
covalent - Pythonic tool for orchestrating machine-learning/high performance/quantum-computing workflows in heterogeneous compute environments.
feast - Feature Store for Machine Learning
OpenMLDB - OpenMLDB is an open-source machine learning database that provides a feature platform computing consistent features for training and inference.
Hyperactive - An optimization and data collection toolbox for convenient and fast prototyping of computationally expensive models.
iwlearn - "Production First" Machine Learning Framework
prosto - Prosto is a data processing toolkit radically changing how data is processed by heavily relying on functions and operations with functions - an alternative to map-reduce and join-groupby