-
big-list-of-naughty-strings
The Big List of Naughty Strings is a list of strings which have a high probability of causing issues when used as user-input data.
This is usually the point to link to the Big List of Naughty Strings: https://github.com/minimaxir/big-list-of-naughty-strings
If your system can handle these, it can probably handle most global text.
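For anyone who wants to try that, here is a minimal sketch of a round-trip check. The samples are a few hand-picked strings in the spirit of the list, not copied from it, and `roundtrip` is a hypothetical stand-in for whatever your system actually does with user input. `failing_inputs` is likewise just an illustrative helper name.

```python
import json

# Hand-picked samples in the spirit of the list (assumption: NOT taken from it).
SAMPLES = [
    "",                                   # empty input
    "undefined",                          # reserved-looking word
    "\u03a9\u2248\u00e7\u221a\u222b",     # assorted non-ASCII symbols
    "\u7530\u4e2d\u3055\u3093",           # CJK text
    "\U0001F469\u200D\U0001F469\u200D\U0001F467",  # emoji joined with ZWJ
    "\u202etest\u202c",                   # right-to-left override
    "'; DROP TABLE users; --",            # SQL-injection shaped
    "<script>alert(1)</script>",          # markup-injection shaped
]


def roundtrip(text: str) -> str:
    """Hypothetical system under test: serialize to JSON bytes and back."""
    payload = json.dumps({"value": text}).encode("utf-8")
    return json.loads(payload.decode("utf-8"))["value"]


def failing_inputs(strings, func):
    """Return the strings that func raises on or fails to return unchanged."""
    failures = []
    for s in strings:
        try:
            if func(s) != s:
                failures.append(s)
        except Exception:
            failures.append(s)
    return failures


if __name__ == "__main__":
    print(failing_inputs(SAMPLES, roundtrip))
```

Swap `roundtrip` for a call into your own storage or rendering path, and feed it the real list from the repo instead of `SAMPLES`.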
Ugh, Unicode has been the bane of my existence while trying to write a text format spec. I started by trying to forbid certain characters to keep files editable and to avoid Unicode rendering exploits (like hiding text, or making structured text behave differently than it looks), but in the end it became so much like herding cats that I had to just settle on https://github.com/kstenerud/concise-encoding/blob/master/ct...
Basically, it allows everything except some separators, most control chars, and some lookalike characters (the lookalike list has to be updated as more characters are added to Unicode). It's not as clean as I'd like, but it's at least manageable this way.
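To make the "allow everything except a denylist" idea concrete, here is a rough sketch. It is not the actual rule set from the linked spec: the categories chosen, the `LOOKALIKES` and `BIDI_CONTROLS` sets, and the name `rejected_chars` are all assumptions for illustration.

```python
import unicodedata

# Assumption: control characters that stay allowed so files remain editable.
ALLOWED_CONTROLS = {"\t", "\n", "\r"}

# Assumption: bidi embedding/override/isolate controls that can hide or reorder text.
BIDI_CONTROLS = {chr(c) for c in range(0x202A, 0x202F)} | {
    chr(c) for c in range(0x2066, 0x206A)
}

# Assumption: a tiny illustrative lookalike set; a real list is longer and
# has to be revisited with each new Unicode version.
LOOKALIKES = {"\u00a0", "\u2044", "\u2215", "\uff1a"}


def rejected_chars(text):
    """Return (index, char, reason) for every character the denylist rejects."""
    bad = []
    for i, ch in enumerate(text):
        cat = unicodedata.category(ch)
        if cat == "Cc" and ch not in ALLOWED_CONTROLS:
            bad.append((i, ch, "control"))
        elif cat in ("Zl", "Zp"):  # line and paragraph separators
            bad.append((i, ch, "separator"))
        elif ch in BIDI_CONTROLS:
            bad.append((i, ch, "bidi"))
        elif ch in LOOKALIKES:
            bad.append((i, ch, "lookalike"))
    return bad


if __name__ == "__main__":
    print(rejected_chars("ok\u2028not ok\x00"))
```

Everything not explicitly rejected passes through, which is what keeps this manageable compared with trying to enumerate the allowed characters.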