amazon-s3-find-and-forget
Amazon S3 Find and Forget is a solution to handle data erasure requests from data lakes stored on Amazon S3, for example, pursuant to the European General Data Protection Regulation (GDPR).
Take a look at this: https://github.com/awslabs/amazon-s3-find-and-forget
We use it for GDPR compliance: it opens a file, deletes the row, and packs the file back. It modifies the file, so watch out if you are using Glue job bookmarks. Because you are using external tables, the manifest file will also have to be updated with the correct length for the new, rewritten file. If you have hundreds of tables and thousands of files and need to do this on a regular basis, this is the scalable solution, but if you only have a few files, honestly I would do it manually.
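If you do go the manual route, the manifest update could look roughly like the sketch below. It assumes the manifest uses the Redshift Spectrum style layout (an `entries` list where each item has a `url` and a `meta.content_length`), which the comment above does not spell out, so check it against your own manifest. The function name `refresh_manifest_lengths` and the `get_size` callback are made up for illustration.

```python
import json
from typing import Any, Callable, Dict


def refresh_manifest_lengths(
    manifest: Dict[str, Any],
    get_size: Callable[[str], int],
) -> Dict[str, Any]:
    """Return a copy of the manifest with meta.content_length refreshed.

    `get_size` is a hypothetical callback that takes the entry URL and
    returns the current object size in bytes.
    """
    updated = {**manifest, "entries": []}
    for entry in manifest.get("entries", []):
        new_entry = dict(entry)
        meta = dict(entry.get("meta", {}))
        # Overwrite the stale length with the size of the rewritten file.
        meta["content_length"] = get_size(entry["url"])
        new_entry["meta"] = meta
        updated["entries"].append(new_entry)
    return updated


if __name__ == "__main__":
    # Example data only; bucket and key names are placeholders.
    manifest = {
        "entries": [
            {"url": "s3://example-bucket/table/part-0000.parquet",
             "meta": {"content_length": 5956875}},
        ]
    }
    sizes = {"s3://example-bucket/table/part-0000.parquet": 5950112}
    print(json.dumps(refresh_manifest_lengths(manifest, sizes.__getitem__), indent=2))
```

In practice `get_size` would probably wrap an S3 call; for example, boto3's `head_object` response includes a `ContentLength` field you could return. Passing the lookup in as a callback just keeps the manifest logic testable without touching S3.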