Ahrefs Saved US$400M in 3 Years by Not Going to the Cloud

This page summarizes the projects mentioned and recommended in the original post on news.ycombinator.com

  • apache-ultimate-bad-bot-blocker

    Apache Block Bad Bots, (Referer) Spam Referrer Blocker, Vulnerability Scanners, Malware, Adware, Ransomware, Malicious Sites, Wordpress Theme Detectors and Fail2Ban Jail for Repeat Offenders

  • I found Ahrefs a bit too aggressive. Thankfully they seem to honor robots.txt so it's not much of a problem, but across the wide range of sites I'm responsible for, the Ahrefs bots really stood out as being a problem. Out of curiosity, I just double checked and the only hits from them in the logs are "GET /robots.txt", which is what I would expect.

    I recently found a regularly updated place for bot stuff:

    https://github.com/mitchellkrogza/apache-ultimate-bad-bot-bl...

    I don't think I'd call Ahrefs a "Bad Bot" but they are on that list. I'd call any bot ignoring robots.txt bad though.
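
    As a rough illustration of what "honoring robots.txt" means in the comment above, here is a minimal sketch using Python's standard urllib.robotparser. The robots.txt contents, the example.com URLs, and the helper name is_allowed are assumptions made up for this example; they are not taken from the commenter's sites. The "AhrefsBot" user-agent token is the one the crawler is generally known by, but treat it as an assumption and check your own logs.

    ```python
    from urllib.robotparser import RobotFileParser

    # Hypothetical robots.txt: block AhrefsBot everywhere,
    # and keep every other crawler out of /private/ only.
    ROBOTS_TXT = """\
    User-agent: AhrefsBot
    Disallow: /

    User-agent: *
    Disallow: /private/
    """.replace("    ", "")


    def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
        """Return True if a well-behaved crawler with this user agent may fetch url."""
        parser = RobotFileParser()
        parser.parse(robots_txt.splitlines())
        return parser.can_fetch(user_agent, url)


    if __name__ == "__main__":
        for agent in ("AhrefsBot", "Googlebot"):
            for url in ("https://example.com/page", "https://example.com/private/x"):
                print(agent, url, is_allowed(ROBOTS_TXT, agent, url))
    ```

    A crawler that respects these rules would only ever request /robots.txt itself, which matches the log pattern the commenter describes.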

  • infinite-fake-website

    Lure SEO spammers down a hole with no bottom

  • I kind of wish they hadn't. Ahrefs claims to be in the "SEO" business, which as far as I'm concerned, has helped ruin web search, and indeed, the culture of the web in general.

    Also, Ahrefs bot doesn't handle some things very well. I made an "infinite web site" in PHP a while back, and used Apache mod_rewrite to send every Ahrefs request to the infinite web site PHP program: https://github.com/bediger4000/infinite-fake-website

    The Ahrefs bot really freaked out, unlike more professional bots such as Google's, and even Yandex's.
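
    The idea of an "infinite web site" can be sketched roughly as follows. This is not the linked project's code (that one is PHP behind Apache mod_rewrite); it is a small Python illustration under assumed names. is_target_bot stands in for the rewrite rule that matches the crawler's user agent, and fake_page derives every page deterministically from its path, so each page links to further pages that never have to be stored.

    ```python
    import hashlib
    import html
    import random


    def is_target_bot(user_agent: str, token: str = "ahrefsbot") -> bool:
        """Hypothetical stand-in for the rewrite rule: match the bot by user agent."""
        return token in user_agent.lower()


    def fake_page(path: str, n_links: int = 5) -> str:
        """Build an HTML page whose links are derived from a hash of the path.

        The same path always yields the same page, and every link leads to
        another generated page, so the site has no bottom.
        """
        digest = hashlib.sha256(path.encode("utf-8")).digest()
        rng = random.Random(int.from_bytes(digest[:8], "big"))
        links = [f"/{rng.getrandbits(48):012x}.html" for _ in range(n_links)]
        items = "\n".join(f'<li><a href="{href}">{href}</a></li>' for href in links)
        title = html.escape(path)
        return f"<html><body><h1>{title}</h1>\n<ul>\n{items}\n</ul></body></html>"


    if __name__ == "__main__":
        if is_target_bot("Mozilla/5.0 (compatible; AhrefsBot/7.0)"):
            print(fake_page("/start.html"))
    ```

    Seeding the generator from the path keeps the output stable across requests without any storage, which is one simple way to make the fake site look consistent to a crawler.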

Related posts

  • My VM gets thousands of hits from vulnerability crawlers (bots) each day. Is that normal?

    1 project | /r/aws | 20 Jan 2023
  • Bootstrapping Ahrefs: Reaching $10M in Revenue in 5 Years

    1 project | news.ycombinator.com | 24 Aug 2023
  • Anybody using Crowdsec?

    1 project | /r/unRAID | 4 Mar 2023
  • Has anyone tried this on Blackboard online exams??

    1 project | /r/Professors | 21 Feb 2023
  • ModSecurity VS openappsec - a user suggested alternative

    2 projects | 11 Nov 2022