pointblank
pandera
Metric | pointblank | pandera |
---|---|---|
Mentions | 3 | 7 |
Stars | 826 | 2,994 |
Growth | 2.5% | 4.8% |
Activity | 9.4 | 8.9 |
Last commit | about 1 month ago | 6 days ago |
Language | R | Python |
License | GNU General Public License v3.0 or later | MIT License |
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
pointblank
- R: Introduction to Data Science
  (1) You might want to check out https://github.com/t-kalinowski/Rapp by my colleague Tomasz
  (2) I think part of that is in scope for strict (https://github.com/hadley/strict). You might also be well served by adopting some more data validation tooling, e.g. pointblank (https://rstudio.github.io/pointblank/).
- Custom Formatting in pointblank
- Pointblank: R package for data validation
pandera
- Unit testing functions that input/output dataframes?
  I use Pandera, so I just need to define the expected input/output schemas (i.e. column names, types, and constraints on them), and Pandera automatically generates fake data for the unit tests, and validates the result: https://github.com/unionai-oss/pandera
- Great Expectations is annoyingly cumbersome
  Please DM me! Or we can discuss in this issue which I just created: https://github.com/unionai-oss/pandera/issues/1042
- Data validation for dashboards
  In my opinion for simple data validation tasks the best solution is always Pandera.
- Show HN: Pandera 0.8.0 – validate pandas, dask, modin, and koalas dataframes
  * adds support for mypy static type-linting if you need that extra type safety
  Repo: https://github.com/pandera-dev/pandera
- Pandera 0.8.0: Schema Validation for Pandas, Dask, Modin, and Koalas DataFrames. Oh, and also out-of-the-box Pydantic and Mypy support :)
  Repo: https://github.com/pandera-dev/pandera
- How heavily do you use Great Expectations?
  pandera
What are some alternatives?
allure-environment-writer - Java library that writes an environment.xml file into the allure-results directory.
soda-sql - Data profiling, testing, and monitoring for SQL accessible data.
allure-docker-service - This Docker container lets you see up-to-date reports by mounting your "allure-results" directory in the container (for a single project) or your "projects" directory (for multiple projects). Whenever new test results appear, Allure Docker Service detects the changes and generates a new report automatically (optionally, results can be sent and reports generated through the API); refresh your browser to see it.
Schematics - Python Data Structures for Humans™.
soda-core - Data quality testing for the modern data stack (SQL, Spark, and Pandas) https://www.soda.io
jsonschema - An implementation of the JSON Schema specification for Python
piperider - Code review for data in dbt
swifter - A package which efficiently applies any function to a pandas dataframe or series in the fastest available manner
DataProfiler - What's in your data? Extract schema, statistics and entities from datasets
dbt-expectations - Port(ish) of Great Expectations to dbt test macros
sweetviz - Visualize and compare datasets, target values and associations, with one line of code.
riptable - 64bit multithreaded python data analytics tools for numpy arrays and datasets