| | SynapseML | flink-cdc |
|---|---|---|
| Mentions | 18 | 4 |
| Stars | 4,980 | 5,291 |
| Growth | 0.4% | 2.5% |
| Activity | 9.0 | 9.6 |
| Last commit | 7 days ago | 4 days ago |
| Language | Scala | Java |
| License | MIT License | Apache License 2.0 |
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
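The growth figure can be illustrated with a minimal sketch, assuming it is the plain percentage change in star count from one month to the next. The function name and the star counts below are hypothetical, chosen only for illustration; the page does not state the exact formula the tracker uses.

```python
def month_over_month_growth(stars_previous: int, stars_current: int) -> float:
    """Return the percentage change in stars from the previous month to the current one.

    Assumption: growth is a simple percentage change; the tracker's real
    formula is not given in the source.
    """
    if stars_previous <= 0:
        raise ValueError("stars_previous must be positive")
    return (stars_current - stars_previous) / stars_previous * 100.0


# Hypothetical star counts, not taken from either project.
previous, current = 1000, 1025
print(f"{month_over_month_growth(previous, current):.1f}%")  # prints "2.5%"
```

Under this reading, a project that gains the same number of stars as a larger one shows a higher growth percentage, so growth is best compared alongside the absolute star count.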
SynapseML
- FLaNK Stack Weekly for 12 September 2023
- Microsoft announces new tool for applying ChatGPT and GPT-4 at massive scales
  Release Notes: https://github.com/microsoft/SynapseML/releases/tag/v0.11.0
- Data science in Scala
  b) There are libraries around e.g. Microsoft SynapseML, LinkedIn Photon ML
- [N] Microsoft Announces New Integrations with OpenAI and MLFlow
- [N] Microsoft Releases new Integrations with OpenAI and MLflow as part of SynapseML
- [P] Microsoft releases SynapseML v0.9.5 with support for speech synthesis, anomaly detection, and geospatial analytics on large-scale data
  Link to Release Notes: https://github.com/microsoft/SynapseML/releases/tag/v0.9.5
- Microsoft releases SynapseML v0.9.5 for distributed geospatial analytics, speech synthesis, and anomaly detection in PySpark.
- [P] SynapseML v0.9.5 announces support for geospatial analytics, speech synthesis, and anomaly detection on large-scale datasets
- Microsoft releases SynapseML v0.9.5 with support for speech synthesis, anomaly detection, and geospatial analytics on Apache Spark
flink-cdc
- FLaNK Stack Weekly 12 February 2024
- FLaNK Stack Weekly for 12 September 2023
- Flink CDC / alternatives
  Info from: https://github.com/ververica/flink-cdc-connectors
- Flink CDC Connectors
What are some alternatives?
- mmlspark - Simple and Distributed Machine Learning [Moved to: https://github.com/microsoft/SynapseML]
- debezium - Change data capture for a variety of databases. Please log issues at https://issues.redhat.com/browse/DBZ.
- isolation-forest - A Spark/Scala implementation of the isolation forest unsupervised outlier detection algorithm.
- open-interpreter - A natural language interface for computers
- Tensorflow_scala - TensorFlow API for the Scala Programming Language
- flink-faker - A data generator source connector for Flink SQL based on data-faker.
- deequ - Deequ is a library built on top of Apache Spark for defining "unit tests for data", which measure data quality in large datasets.
- FLaNK-Halifax - Community over Code, Apache NiFi, Apache Kafka, Apache Flink, Python, GTFS, Transit, Open Source, Open Data
- Breeze - Breeze is a numerical processing library for Scala.
- hrequests - 🚀 Web scraping for humans
- azure-kusto-spark - Apache Spark Connector for Azure Kusto
- MmFLaNK - Mm FLaNK Stack (MXNet, MiNiFi, Flink, NiFi, Kafka, Kudu) for AI-IoT