isp-data-pollution VS amazon-s3-find-and-forget

Compare isp-data-pollution and amazon-s3-find-and-forget to see how they differ.

isp-data-pollution

ISP Data Pollution to Protect Private Browsing History with Obfuscation (by essandess)

amazon-s3-find-and-forget

Amazon S3 Find and Forget is a solution to handle data erasure requests from data lakes stored on Amazon S3, for example, pursuant to the European General Data Protection Regulation (GDPR) (by awslabs)
              isp-data-pollution    amazon-s3-find-and-forget
Mentions      2                     3
Stars         566                   232
Growth        -                     2.6%
Activity      0.0                   7.2
Last commit   about 1 year ago      8 days ago
Language      Python                Python
License       MIT License           Apache License 2.0
The number of mentions indicates the total number of mentions that we've tracked plus the number of user-suggested alternatives.
Stars - the number of stars that a project has on GitHub. Growth - month-over-month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.

isp-data-pollution

Posts with mentions or reviews of isp-data-pollution. We have used some of these posts to build our list of alternatives and similar projects.

amazon-s3-find-and-forget

Posts with mentions or reviews of amazon-s3-find-and-forget. We have used some of these posts to build our list of alternatives and similar projects.
  • Deleting particular data from S3 External Tables
    1 project | /r/dataengineering | 31 Oct 2022
    Take a look at this: https://github.com/awslabs/amazon-s3-find-and-forget. We use it for GDPR compliance; it will open a file, delete a row, and pack it back. It modifies the file, so watch out if you are using Glue job bookmarks. Because you are using external tables, the manifest file will also have to be updated with the proper length for the new, updated file. If you have hundreds of tables and thousands of files, and you need to do this on a regular basis, this would be the scalable solution; but if you have only a few files, honestly I would do it manually.
  • Update S3 Files
    1 project | /r/aws | 27 Jan 2022
    Have a look at S3 Find and Forget
  • How to handle GDPR requests for data stored in S3 ?
    1 project | /r/dataengineering | 22 Nov 2021
    S3 Find and Forget is probably worth looking into, even if just to get ideas on how to implement a similar solution for yourself
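The "open a file, delete a row, pack it back" workflow described in the first quote can be sketched in a few lines. This is a simplified illustration, not the project's actual code: it uses JSON-lines records instead of the Parquet/JSON formats the real solution supports, and the field name `user_id` is just an example. It also returns the new byte length, since the quote notes that external-table manifests must record the rewritten object's size.

```python
import json

def find_and_forget(lines, erase_ids, id_field="user_id"):
    """Drop records whose id_field matches an erasure request,
    then return the rewritten body and its new byte length
    (needed to update an external table's manifest)."""
    kept = [line for line in lines
            if json.loads(line).get(id_field) not in erase_ids]
    new_body = "".join(kept)
    return new_body, len(new_body.encode("utf-8"))

# Example: erase user "u2" from a three-record object.
records = [
    '{"user_id": "u1", "event": "click"}\n',
    '{"user_id": "u2", "event": "view"}\n',
    '{"user_id": "u3", "event": "click"}\n',
]
body, new_length = find_and_forget(records, {"u2"})
```

In the real solution this rewrite happens per S3 object across the whole data lake, which is why the quoted commenter warns about Glue job bookmarks: the object's content (and ETag) changes even though its key stays the same.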

What are some alternatives?

When comparing isp-data-pollution and amazon-s3-find-and-forget you can also consider the following projects:

Social-Amnesia - Forget the past. Social Amnesia makes sure your social media accounts only show your posts from recent history, not from "that phase" 5 years ago.

DataEngineeringProject - Example end to end data engineering project.

gretel-python-client - The Gretel Python Client allows you to interact with the Gretel REST API.

awesome-aws - A curated list of awesome Amazon Web Services (AWS) libraries, open source repos, guides, blogs, and other resources. Featuring the Fiery Meter of AWSome.

tracardi - TRACARDI is a new HOME for your customer data. TRACARDI is a composable, API-first solution for any company that needs an inexpensive CDP to integrate with.

data-toolset - Upgrade from avro-tools and parquet-tools jars to a more user-friendly Python package.
s3-credentials - A tool for creating credentials for accessing S3 buckets

ReTube - ReImagine Tubing

Differential-Privacy-Guide - Differential Privacy Guide