compressio vs datacompy
| | compressio | datacompy |
|---|---|---|
| Mentions | 4 | 4 |
| Stars | 28 | 386 |
| Growth | - | 8.8% |
| Activity | 0.0 | 7.5 |
| Last commit | over 1 year ago | 2 days ago |
| Language | Python | Python |
| License | MIT License | Apache License 2.0 |
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
compressio
- Visions – User defined data type systems
  Visions is a Python library for working with user-defined data type systems. Out of the box, it provides type inference and automated data cleaning of sequence data, with backend-specific implementations for pandas, Spark, Python, and NumPy. We often use it as a first-pass cleaning step when working with tabular data, and to simplify the backend logic of both pandas-profiling and our tabular data compression library, compressio.
- Show HN: Visions – User defined data type systems
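The type inference and cleaning step described in the Visions post can be illustrated with a small standard-library sketch. This is not the visions API: `CANDIDATES`, `infer_type`, and `clean` are hypothetical names chosen for illustration, and the sketch only distinguishes integers, floats, and strings in a sequence of raw string values.

```python
# Hypothetical candidate types, ordered from narrowest to widest.
# This is an illustrative sketch, not how visions is implemented.
CANDIDATES = [
    ("integer", int),
    ("float", float),
]


def infer_type(values):
    """Return the narrowest candidate type that every non-empty value casts to."""
    present = [v for v in values if v.strip() != ""]
    if not present:
        return "string"
    for name, cast in CANDIDATES:
        try:
            for v in present:
                cast(v)
        except ValueError:
            continue  # at least one value does not fit; try the next type
        return name
    return "string"


def clean(values):
    """Cast values to the inferred type, mapping empty strings to None."""
    cast = dict(CANDIDATES).get(infer_type(values), str)
    return [cast(v) if v.strip() != "" else None for v in values]


column = ["1", "", "3"]
print(infer_type(column), clean(column))
```

Trying the narrowest type first means a column is only widened to float or string when at least one value requires it.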
datacompy
- How to Check 2 SQL Tables Are the Same
- Comparing 2 CSV files
  datacompy is a package to compare 2 pandas dataframes.
- Performing Data Tests on External Data/Complex Data Quality Checks
- Best Practice When Comparing Data Across Two SQL Servers in Python
  https://github.com/capitalone/datacompy will allow you to compare two tables/dataframes against one another and see detailed results on the differences.
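The posts above all describe the same task: joining two tables on a key and reporting where they differ. The sketch below illustrates that idea using only the standard library. It does not use datacompy and makes no claim about how datacompy works internally. The function `compare_tables` and the sample rows are hypothetical, and it assumes the key value is unique within each table.

```python
def compare_tables(left, right, key):
    """Compare two lists of row dicts joined on `key` (assumed unique per table)."""
    left_by_key = {row[key]: row for row in left}
    right_by_key = {row[key]: row for row in right}

    # Keys present on one side only.
    only_left = sorted(left_by_key.keys() - right_by_key.keys())
    only_right = sorted(right_by_key.keys() - left_by_key.keys())

    # For keys on both sides, compare every column the two rows share.
    mismatches = []
    for k in sorted(left_by_key.keys() & right_by_key.keys()):
        l_row, r_row = left_by_key[k], right_by_key[k]
        for col in sorted((l_row.keys() & r_row.keys()) - {key}):
            if l_row[col] != r_row[col]:
                mismatches.append((k, col, l_row[col], r_row[col]))

    return {"only_left": only_left, "only_right": only_right, "mismatches": mismatches}


# Hypothetical sample data: id 2 differs, id 3 and id 4 are unmatched.
left = [{"id": 1, "amount": 10}, {"id": 2, "amount": 20}, {"id": 3, "amount": 30}]
right = [{"id": 1, "amount": 10}, {"id": 2, "amount": 25}, {"id": 4, "amount": 40}]
print(compare_tables(left, right, key="id"))
```

Rows loaded with `csv.DictReader` or fetched from a SQL cursor as dicts could be passed in the same way. For large tables or tolerance-based numeric comparison, a dedicated library is likely the better choice.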
What are some alternatives?
ydata-profiling - 1 Line of code data quality profiling & exploratory data analysis for Pandas and Spark DataFrames.
koalas - Koalas: pandas API on Apache Spark
Dask - Parallel computing with task scheduling
data-science-ipython-notebooks - Data science Python notebooks: Deep learning (TensorFlow, Theano, Caffe, Keras), scikit-learn, Kaggle, big data (Spark, Hadoop MapReduce, HDFS), matplotlib, pandas, NumPy, SciPy, Python essentials, AWS, and various command lines.
visions - Type System for Data Analysis in Python
data-diff - Compare tables within or across databases
popmon - Monitor the stability of a Pandas or Spark dataframe ⚙︎
dbt-audit-helper - Useful macros when performing data audits
visualiza - A general-purpose dynamic data visualizer.
diffable-sql
merkle-tree-solidity - JS - Solidity sha3 merkle tree bridge. Generate proofs in JS; verify in Solidity.