Top 3 Python PII Projects
-
presidio
Context aware, pluggable and customizable data protection and de-identification SDK for text and images
-
metacrafter
Metadata and data identification tool and Python library. Identifies PII, common identifiers, language specific identifiers. Fully customizable and flexible rules
Perhaps de-identification before training could be helpful here.
Microsoft does seem active in this, e.g. https://microsoft.github.io/presidio/
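As a rough illustration of what de-identification before training could look like, here is a minimal sketch in plain Python. The `PATTERNS` table and the `deidentify` helper are hypothetical names made up for this example, and the regular expressions are deliberately simplistic. This is not Presidio's API; Presidio describes itself as context aware and pluggable, which goes well beyond what a few regexes can do.

```python
import re

# Hypothetical, deliberately simple patterns for illustration only.
# Real PII detection needs context and many more entity types.
PATTERNS = {
    "EMAIL": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "PHONE": re.compile(r"\b\d{3}[-.\s]\d{3}[-.\s]\d{4}\b"),
}


def deidentify(text: str) -> str:
    """Replace every match of each pattern with a <LABEL> placeholder."""
    for label, pattern in PATTERNS.items():
        text = pattern.sub(f"<{label}>", text)
    return text


sample = "Contact jane.doe@example.com or call 555-123-4567."
print(deidentify(sample))  # Contact <EMAIL> or call <PHONE>.
```

Replacing matches with typed placeholders rather than deleting them keeps the sentence structure intact, which may matter if the cleaned text is later used as training data.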
Project mention: LongRoPE: Extending LLM Context Window Beyond 2M Tokens | news.ycombinator.com | 2024-02-22
It's been possible to skip tokenization for a long time; my team and I did it here: https://github.com/capitalone/DataProfiler
For what it's worth, we were actually working with LSTMs with nearly a billion params back around 2016-2017. Transformers made it far more effective to train and execute, but ultimately LSTMs are able to achieve similar results, though they are slower and require more training data.
Project mention: Metacrafter – semantic data types detection Python lib | news.ycombinator.com | 2024-03-13
Python PII related posts
- You can't build a moat with AI
- Presidio – Data Protection and De-Identification SDK
- LongRoPE: Extending LLM Context Window Beyond 2M Tokens
- Data Profiler – What's in your data?
- Data Profiler 0.9.0 -- offering a massive improvement to memory usage during profiling of large datasets
Index
What are some of the best open-source PII projects in Python? This list will help you:
# | Project | Stars |
---|---|---|
1 | presidio | 3,077 |
2 | DataProfiler | 1,362 |
3 | metacrafter | 38 |