
-
It won't help with the more egregious scrapers, but this list is handy for the ones that respect robots.txt:
https://github.com/ai-robots-txt/ai.robots.txt
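For anyone who wants to see what such a rule set looks like, here is a minimal sketch that builds a blocking group and checks it with Python's standard `urllib.robotparser`. The three agent names are only an illustrative sample, not the full list from the repo, and `build_robots_txt` / `is_allowed` are hypothetical helper names.

```python
from urllib.robotparser import RobotFileParser

# Illustrative sample only; the linked repo maintains the full, current list.
SAMPLE_AI_AGENTS = ["GPTBot", "ClaudeBot", "CCBot"]


def build_robots_txt(agents):
    """Return one robots.txt group that disallows everything for the given agents."""
    lines = [f"User-agent: {agent}" for agent in agents]
    lines.append("Disallow: /")
    return "\n".join(lines) + "\n"


def is_allowed(robots_txt, agent, url="/"):
    """Check how a well-behaved crawler would interpret the rules for `agent`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


robots_txt = build_robots_txt(SAMPLE_AI_AGENTS)
print(robots_txt)
print("GPTBot allowed:", is_allowed(robots_txt, "GPTBot"))
print("Regular browser allowed:", is_allowed(robots_txt, "Mozilla/5.0"))
```

As the comment says, this only affects crawlers that actually read and honor robots.txt.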
-
I deployed a small dockerized app on GCP a couple months ago and these bots ended up costing me a ton of money for the stupidest reason: https://github.com/streamlit/streamlit/issues/9673
The majority of requests were from OpenAI…
-
Nice idea!
Btw, this kind of reverse slow-loris “attack” is called a tarpit. SSH tarpit example: https://github.com/skeeto/endlessh
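To make the tarpit idea concrete, here is a rough sketch with Python's `asyncio`. It is not how endlessh itself is implemented, just the same general trick: accept the connection, then drip a short junk line every few seconds so the client stays busy waiting. The port, the delay, and the names `tarpit_handler` and `serve` are all assumptions made for illustration.

```python
import asyncio
import functools
import random


async def tarpit_handler(reader, writer, delay=10.0):
    """Drip one short random line every `delay` seconds until the client gives up."""
    try:
        while True:
            await asyncio.sleep(delay)
            # Random hex digits: cheap to produce and never a valid "SSH-" banner.
            writer.write(b"%x\r\n" % random.getrandbits(32))
            await writer.drain()
    except (ConnectionError, OSError):
        pass  # the client disconnected
    finally:
        writer.close()


async def serve(host="127.0.0.1", port=2222, delay=10.0):
    """Listen on host:port (hypothetical defaults) and tarpit every connection."""
    handler = functools.partial(tarpit_handler, delay=delay)
    server = await asyncio.start_server(handler, host, port)
    async with server:
        await server.serve_forever()


# To try it locally: asyncio.run(serve())
```

Each trapped connection costs the server only an idle coroutine, while the client holds a socket open for as long as it is willing to wait.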