spark-clickhouse-connector
Spark ClickHouse Connector built on the DataSourceV2 API (by ClickHouse)
mmlspark
Simple and Distributed Machine Learning [Moved to: https://github.com/microsoft/SynapseML] (by Azure)
| | spark-clickhouse-connector | mmlspark |
|---|---|---|
| Mentions | 1 | 2 |
| Stars | 193 | 2,489 |
| Growth | 1.0% | - |
| Activity | 7.2 | 9.3 |
| Last commit | 5 days ago | about 3 years ago |
| Language | Scala | Scala |
| License | Apache License 2.0 | MIT License |
Mentions - the total number of mentions that we've tracked plus the number of user-suggested alternatives.
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
spark-clickhouse-connector
Posts with mentions or reviews of spark-clickhouse-connector.
We have used some of these posts to build our list of alternatives
and similar projects. The last one was on 2023-01-30.
- SQL should be your default choice for data engineering pipelines
Agree with the OP that SQL will almost assuredly still be in use 20+ years from now, given the simplicity and flexibility of the declarative language, its standardization, and the fact that it is as applicable to our big data problems today as it was then.
Any discussion of SQL at scale must include ClickHouse [https://clickhouse.com/docs/en/install#self-managed-install], given its broad open-source use, the integrations available for Spark via JDBC [https://github.com/ClickHouse/clickhouse-jdbc/] or the open-source Spark-ClickHouse Connector [https://github.com/housepower/spark-clickhouse-connector], and its capability to scale SQL as a network service.
Disclosure: I work for ClickHouse
mmlspark
Posts with mentions or reviews of mmlspark.
We have used some of these posts to build our list of alternatives
and similar projects. The last one was on 2022-03-08.