-
dangerzone
Take potentially dangerous PDFs, office documents, or images and convert them to safe PDFs
start here: https://github.com/freedomofpress/dangerzone
i've never used it, but i've been meaning to check it out. at the least it should give you a jumping-off point for further investigation.
if that is insufficient, use Proofpoint.
for archives that are tickling bugs, you have to use a similar technique. it's not enough to analyze them and pass them on as-is. you have to unpack them in a sandbox, process each file with dangerzone or a dangerzone-like tool, then re-archive the results and let the user see only that new archive. (the sandbox will be detectable, no two ways about it, but the question is whether anyone will expend enough effort to detect it. no, not for your use case, seeing as how you're asking the question at all.)
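a rough sketch of that unpack / sanitize / re-archive loop, in case it helps. this is not dangerzone's own API: `rebuild_archive`, the `Sanitizer` callable, and the size cap are all made-up names and values for illustration, and the sanitizer is where you would plug in a call to dangerzone or a similar tool. it assumes zip input, and it assumes the whole thing already runs inside the sandbox, since the code itself provides no isolation.

```python
import tempfile
import zipfile
from pathlib import Path, PurePosixPath
from typing import Callable

# Hypothetical hook: read the untrusted file at src, write a safe PDF to dst.
# A wrapper around dangerzone (or a similar tool) would go here.
Sanitizer = Callable[[Path, Path], None]

# Arbitrary cap on the declared uncompressed size, as a crude zip-bomb guard.
MAX_TOTAL_BYTES = 100 * 1024 * 1024


def rebuild_archive(src: Path, dst: Path, sanitize: Sanitizer) -> list[str]:
    """Unpack src, sanitize every member, and write only the results to dst."""
    written = []
    with tempfile.TemporaryDirectory() as tmp:
        raw_dir, safe_dir = Path(tmp, "raw"), Path(tmp, "safe")
        raw_dir.mkdir()
        safe_dir.mkdir()
        with zipfile.ZipFile(src) as zin, \
                zipfile.ZipFile(dst, "w", zipfile.ZIP_DEFLATED) as zout:
            members = [m for m in zin.infolist() if not m.is_dir()]
            if sum(m.file_size for m in members) > MAX_TOTAL_BYTES:
                raise ValueError("archive expands past the size cap")
            for index, member in enumerate(members):
                # Do not trust member names as filesystem paths: keep only the
                # base name and prefix an index so names cannot collide.
                base = PurePosixPath(member.filename).name
                raw_path = raw_dir / f"{index}_{base}"
                raw_path.write_bytes(zin.read(member))
                safe_path = safe_dir / f"{raw_path.stem}.pdf"
                sanitize(raw_path, safe_path)
                # Only the sanitized file goes into the new archive.
                zout.write(safe_path, safe_path.name)
                written.append(safe_path.name)
    return written
```

the original archive is never forwarded: the user only gets `dst`, which contains nothing but sanitizer output. directory structure is flattened on purpose, since member paths are attacker-controlled.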