Metric | apache-ultimate-bad-bot-blocker | infinite-fake-website |
---|---|---|
Mentions | 3 | 3 |
Stars | 747 | 7 |
Growth | - | - |
Activity | 9.5 | 4.0 |
Last commit | 6 days ago | 11 months ago |
Language | C | PHP |
License | GNU General Public License v3.0 or later | MIT License |
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
apache-ultimate-bad-bot-blocker
- Ahrefs Saved US$400M in 3 Years by Not Going to the Cloud
I found Ahrefs a bit too aggressive. Thankfully they seem to honor robots.txt so it's not much of a problem, but across the wide range of sites I'm responsible for, the Ahrefs bots really stood out as being a problem. Out of curiosity, I just double-checked, and the only hits from them in the logs are "GET /robots.txt", which is what I would expect.
I recently found a regularly updated place for bot stuff:
https://github.com/mitchellkrogza/apache-ultimate-bad-bot-bl...
I don't think I'd call Ahrefs a "Bad Bot" but they are on that list. I'd call any bot ignoring robots.txt bad though.
- My VM gets thousands of hits from vulnerability crawlers (bots) each day. Is that normal?
- Added a Google Recaptcha to the Checkout page due to bot-fraud attempts, now it hinders legitimate customers
Most malicious bots are well known, and blocking them is often simple. Use the popular Bad Bot Blocker software: https://github.com/mitchellkrogza/apache-ultimate-bad-bot-blocker
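The log check described in the first mention above (confirming that a crawler only ever requests /robots.txt) could be sketched roughly as follows. This is a minimal sketch, assuming the server writes Apache's "combined" log format; the function name `paths_by_agent`, the sample log lines, and the use of `AhrefsBot` as the user-agent substring are illustrative assumptions, not something taken from the mentions themselves.

```python
import re
from collections import Counter

# Assumes Apache "combined" log format: request line and user agent are quoted fields.
LOG_LINE = re.compile(
    r'^\S+ \S+ \S+ \[[^\]]+\] "(?P<method>\S+) (?P<path>\S+)[^"]*" '
    r'\d{3} \S+ "[^"]*" "(?P<agent>[^"]*)"'
)


def paths_by_agent(lines, agent_substring):
    """Count requested paths for log lines whose user agent contains agent_substring."""
    counts = Counter()
    needle = agent_substring.lower()
    for line in lines:
        match = LOG_LINE.match(line)
        if match and needle in match.group("agent").lower():
            counts[match.group("path")] += 1
    return counts


# Hypothetical sample data using documentation-reserved IP addresses.
SAMPLE_LOG = [
    '203.0.113.5 - - [10/Oct/2023:13:55:36 +0000] "GET /robots.txt HTTP/1.1" 200 120 "-" "Mozilla/5.0 (compatible; AhrefsBot/7.0; +http://ahrefs.com/robot/)"',
    '198.51.100.7 - - [10/Oct/2023:13:56:01 +0000] "GET /index.html HTTP/1.1" 200 5120 "-" "Mozilla/5.0 (X11; Linux x86_64)"',
    '203.0.113.5 - - [10/Oct/2023:14:02:11 +0000] "GET /robots.txt HTTP/1.1" 200 120 "-" "Mozilla/5.0 (compatible; AhrefsBot/7.0; +http://ahrefs.com/robot/)"',
]

if __name__ == "__main__":
    # List every path the matching crawler requested, most frequent first.
    for path, hits in paths_by_agent(SAMPLE_LOG, "AhrefsBot").most_common():
        print(hits, path)
```

If the only path reported for a crawler is /robots.txt, it is at least fetching the file and not crawling past it, which matches what the commenter expected to see.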
infinite-fake-website
- Bootstrapping Ahrefs: Reaching $10M in Revenue in 5 Years
I snickered at the emphasis on quality. Ahrefs web scraper reacted very poorly to randomly generated "content": https://github.com/bediger4000/infinite-fake-website
SEO is best avoided. It might not be illegal, but neither is farting in church. They'll both never be popular.
- Ahrefs Saved US$400M in 3 Years by Not Going to the Cloud
I kind of wish they hadn't. Ahrefs claims to be in the "SEO" business, which as far as I'm concerned, has helped ruin web search, and indeed, the culture of the web in general.
Also, Ahrefs bot doesn't handle some things very well. I made an "infinite web site" in PHP a while back, and used Apache mod_rewrite to send every Ahrefs request to the infinite web site PHP program: https://github.com/bediger4000/infinite-fake-website
Ahrefs bot really freaked out, unlike some professional bots like Google's, and even Yandex' bot.
- Ahrefs hacked Medium to get 200K views for a low search-volume keyword
What is Ahrefs' business? Sure "SEO", but what does that mean, exactly?
They seem to have a poorly coded web crawler. I wrote an "infinite web site" PHP program that Ahrefs reacted to very poorly: https://github.com/bediger4000/infinite-fake-website
Since SEO is a major contributor to lousy search results, I'm now just trying to ban them all via robots.txt, but it doesn't seem mean enough.
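Banning a crawler via robots.txt, as the last mention describes, only helps against bots that honor the file. As a rough illustration of what such a rule looks like and how it can be checked, here is a minimal sketch using Python's standard `urllib.robotparser`. The `AhrefsBot` user-agent token, the helper name `is_allowed`, and the example.com URL are assumptions for illustration.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: block one crawler entirely, leave everything else open.
ROBOTS_TXT = """\
User-agent: AhrefsBot
Disallow: /

User-agent: *
Disallow:
"""


def is_allowed(robots_txt, user_agent, url):
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for agent in ("AhrefsBot", "Googlebot"):
        print(agent, is_allowed(ROBOTS_TXT, agent, "https://example.com/page"))
```

This only verifies how a compliant parser would read the rules; a crawler that ignores robots.txt has to be blocked at the server instead.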
What are some alternatives?
nginx-ultimate-bad-bot-blocker - Nginx Block Bad Bots, Spam Referrer Blocker, Vulnerability Scanners, User-Agents, Malware, Adware, Ransomware, Malicious Sites, with anti-DDOS, Wordpress Theme Detector Blocking and Fail2Ban Jail for Repeat Offenders
faker - Faker is a pure Elixir library for generating fake data.
htpw - htpw is a project to increase the security of your WordPress!
vue-content-placeholders - Composable components for rendering fake (progressive) content like facebook in vue
muenchhausen - Produce authentic fake data