Top 7 PII Open-Source Projects
-
presidio
Context aware, pluggable and customizable data protection and de-identification SDK for text and images
Project mention: Presidio – Data Protection and De-Identification SDK | news.ycombinator.com | 2024-03-04
-
DataProfiler
Project mention: LongRoPE: Extending LLM Context Window Beyond 2M Tokens | news.ycombinator.com | 2024-02-22
It's been possible to skip tokenization for a long time; my team and I did it here: https://github.com/capitalone/DataProfiler
For what it's worth, we were actually working with LSTMs with nearly a billion params back around 2016-2017. Transformers made it far more effective to train and execute, but ultimately LSTMs are able to achieve similar results, though they are slower and require more training data.
-
metacrafter
Metadata and data identification tool and Python library. Identifies PII, common identifiers, language specific identifiers. Fully customizable and flexible rules
Project mention: Metacrafter – semantic data types detection Python lib | news.ycombinator.com | 2024-03-13
-
streamer-mode-for-firefox
Hides personal information from pages, similar to Discord's Streamer mode.
-
custom-log-marshaler
Attempt to R.I.P PII or unnecessary info in logs and reduce log ingestion costs in the process.
PII related posts
- Presidio – Data Protection and De-Identification SDK
- LongRoPE: Extending LLM Context Window Beyond 2M Tokens
- Data Profiler – What's in your data?
- Data Profiler 0.9.0 -- offering a massive improvement to memory usage during profiling of large datasets
Index
What are some of the best open-source PII projects? This list will help you:
Rank | Project | Stars |
---|---|---|
1 | presidio | 2,974 |
2 | DataProfiler | 1,349 |
3 | Databunker | 1,182 |
4 | slog-formatter | 79 |
5 | metacrafter | 37 |
6 | streamer-mode-for-firefox | 28 |
7 | custom-log-marshaler | 12 |