handy_sql_queries
By gregw2hn
spark-extension
A library that provides useful extensions to Apache Spark and PySpark. (by G-Research)
| handy_sql_queries | spark-extension
---|---|---
Mentions | 2 | 1
Stars | 1 | 173
Growth | - | 5.2%
Activity | 4.5 | 8.3
Last commit | 10 months ago | 6 days ago
Language | - | Scala
License | The Unlicense | Apache License 2.0
The number of mentions indicates the total number of mentions that we've tracked plus the number of user suggested alternatives.
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
handy_sql_queries
Posts with mentions or reviews of handy_sql_queries.
We have used some of these posts to build our list of alternatives
and similar projects. The last one was on 2023-07-26.
- Data diffs: Algorithms for explaining what changed in a dataset (2022)
If you are looking for an easy way to compare two tables in SQL, whether every single row and every single column are the same, you can use the following technique:
https://github.com/gregw2hn/handy_sql_queries/blob/main/sql_...
- How to Check 2 SQL Tables Are the Same
This is part of why I don't use MINUS for table value comparisons... All you need is just GROUP BY/UNION ALL/HAVING, using the following technique:
https://github.com/gregw2hn/handy_sql_queries/blob/main/sql_...
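The GROUP BY/UNION ALL/HAVING approach mentioned in these posts can be sketched roughly as follows. This is a minimal illustration using Python's built-in `sqlite3` module, not necessarily the exact query in the linked repository; the `table_diff` helper, the table names `t_old` and `t_new`, and the sample rows are all hypothetical.

```python
import sqlite3


def table_diff(conn, left, right, columns):
    """Return rows whose occurrence count differs between two tables.

    Hypothetical helper: table and column names are interpolated directly
    into the SQL, so they must come from trusted input.
    """
    cols = ", ".join(columns)
    query = f"""
        SELECT {cols},
               SUM(in_left)  AS left_count,
               SUM(in_right) AS right_count
        FROM (
            -- Tag each row with the table it came from, then stack both tables.
            SELECT {cols}, 1 AS in_left, 0 AS in_right FROM {left}
            UNION ALL
            SELECT {cols}, 0 AS in_left, 1 AS in_right FROM {right}
        ) AS combined
        GROUP BY {cols}
        -- Keep only rows that do not appear the same number of times in both.
        HAVING SUM(in_left) <> SUM(in_right)
        ORDER BY {cols}
    """
    return conn.execute(query).fetchall()


# Assumed sample data: t_old has a duplicated row and one changed value.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t_old (id INTEGER, name TEXT);
    CREATE TABLE t_new (id INTEGER, name TEXT);
    INSERT INTO t_old VALUES (1, 'a'), (2, 'b'), (2, 'b'), (3, 'c');
    INSERT INTO t_new VALUES (1, 'a'), (2, 'b'), (3, 'x');
""")

diff = table_diff(conn, "t_old", "t_new", ["id", "name"])
for row in diff:
    print(row)
# (2, 'b', 2, 1)
# (3, 'c', 1, 0)
# (3, 'x', 0, 1)
```

An empty result means the two tables hold the same rows with the same multiplicities. Because the comparison is based on row counts, the duplicated `(2, 'b')` row is reported, which a set-based MINUS/EXCEPT comparison would not surface; GROUP BY also places NULL values in the same group, so rows containing NULLs are compared as well.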
spark-extension
Posts with mentions or reviews of spark-extension.
We have used some of these posts to build our list of alternatives
and similar projects. The last one was on 2023-07-26.
- Data diffs: Algorithms for explaining what changed in a dataset (2022)
We're doing an env migration and I've been using the Spark diff extension to reconcile data. It's amazing; we've discovered bugs in the data logic so quickly.
Here is the extension if anyone is interested: https://github.com/G-Research/spark-extension/blob/master/DI...
What are some alternatives?
When comparing handy_sql_queries and spark-extension you can also consider the following projects:
deep-diff2 - Deep diff Clojure data structures and pretty print the result
recidiffist - Diffs for structured data
pyspark-starter - Starter pyspark code with a working combination of all versions
datacompy - Pandas and Spark DataFrame comparison for humans and more!
macrobase-diff - Minimal implementation of Macrobase Diff
dbt-audit-helper - Useful macros when performing data audits
ExplainDaV
diffable-sql
Azure-Databricks-NYC-Taxi-Workshop - An Azure Databricks workshop leveraging the New York Taxi and Limousine Commission Trip Records dataset