Ask HN: Let's build an HN uBlacklist to improve our Google search results?

This page summarizes the projects mentioned and recommended in the original post on news.ycombinator.com

  • code-search-blocklist

    A list of domains that host scraped code snippets and pollute search results, to be blocked.

  • I just made a PR of entries from this blacklist: https://github.com/jhchabran/code-search-blacklist, but it's based on my fork of jhchabran's repo, which includes some fixes. If jhchabran merges those changes, then it can be based on upstream again.

    I also added pintrest.com based on yesterday's discussion (and the SEO rankings data), but I can't find those source links anymore... I will keep looking.

  • ublock-origin-shitty-copies-filter

    Filter for DuckDuckGo and Google to remove spam websites that blatantly copy and paste content from well-known websites.

  • This was discussed around a month ago, leading me to this post:

    https://news.ycombinator.com/item?id=29546433#29549855

    and the resulting uBlock Origin list, which I'm using as the best solution so far for this problem:

    https://github.com/stroobants-dev/ublock-origin-shitty-copie...

    but it will need curation and updates over time, and I'm not sure the author is willing or has the time to do that.

  • ublacklist

    Blocks specific sites from appearing in Google search results

  • For the unaware, uBlacklist [0] is a browser extension that lets you blacklist sites from the Google search results page. You can blacklist sites right from the results page, by regex, or by linking lists hosted somewhere.

    The low quality of results has been a problem for a while now and has become worse lately thanks to all those StackOverflow and GitHub clones. So I was wondering if we could come together and contribute to a single blacklist hosted somewhere and then import it into each of our browsers. Who knows? We might end up improving the quality of the results we all get.

    Lists to get rid of the StackOverflow and GitHub clones already exist. [1]

    I would love to contribute to a project like this, but won't be able to be a maintainer due to time constraints. Would greatly appreciate it if someone could host this. A simple txt file on GitHub would do.

    What do you say, HN?

    [0]: https://github.com/iorate/ublacklist
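
    As a rough illustration of the "simple txt file" idea above, the sketch below merges several contributed domain lists into one de-duplicated file of match patterns of the form `*://*.example.com/*`, which is the kind of rule uBlacklist accepts. The function names and the `.example` domains are hypothetical, and skipping `#` lines is just a convention chosen for the input lists here.

```python
def to_match_pattern(domain: str) -> str:
    """Turn a bare domain into a match pattern covering the domain and its subdomains."""
    return f"*://*.{domain}/*"


def build_blacklist(*domain_lists: list[str]) -> str:
    """Merge several domain lists into one sorted, de-duplicated blacklist body."""
    domains = set()
    for domain_list in domain_lists:
        for entry in domain_list:
            entry = entry.strip().lower()
            # Skip blank lines and lines used as comments in the input lists.
            if entry and not entry.startswith("#"):
                domains.add(entry)
    return "\n".join(to_match_pattern(d) for d in sorted(domains)) + "\n"


if __name__ == "__main__":
    # Placeholder contributions; real entries would come from contributors.
    so_clones = ["so-clone.example", "# a comment", ""]
    gh_clones = ["gh-mirror.example", "SO-clone.example"]
    print(build_blacklist(so_clones, gh_clones), end="")
```

    The output could be saved as a text file in a repository and its raw URL added as a subscription in the extension, so every subscriber receives later updates.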

  • awesome-ublacklist

    Awesome list of uBlacklist subscriptions to block search results from Google, Bing, and DuckDuckGo.

  • [1]: https://github.com/rjaus/awesome-ublacklist

  • Yacy

    Distributed Peer-to-Peer Web Search Engine and Intranet Search Appliance

  • This looks like a big, time-consuming project that would rely on a private Google API that can change at any time. I don't think it's worth investing your effort in that. I wish more people would help improve the FLOSS peer-to-peer search engine YaCy instead: https://yacy.net.

  • hn-search-blacklist

    Discontinued. A list of SEO-spam sites curated by HN to be blacklisted from Google Search and other search engines. To be used with the uBlacklist extension.

  • I ended up creating a repo with blacklist.txt myself and will add to it for my own usage. I don't see anyone else who'd maintain this. Feel free to use it / contribute to it.

    https://github.com/sanketpatrikar/hn-search-blacklist

  • config-files

    My collection of .dotfiles, settings and snippets.

  • uBlock Origin supports blocking search results, so I don't require an additional browser extension. I maintain a blocklist for myself, targeting Google and DuckDuckGo [1]. Feel free to contribute more websites or use this list as a template for your own repository.

    [1] https://github.com/darekkay/config-files/blob/master/adblock...
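
    For readers wondering what such a uBlock Origin list might look like, here is a minimal sketch that generates cosmetic filters using the `:has()` operator to hide any search result containing a link to a given domain. The result-container selectors (`.g`, `.result`) are assumptions for illustration only and may not match the current markup of either search engine or the rules in the list linked above; the function name and the `.example` domain are hypothetical.

```python
# Assumed result-container selectors per search engine; real pages change
# their markup often, so treat these values as placeholders.
RESULT_SELECTORS = {
    "google.*": ".g",
    "duckduckgo.com": ".result",
}


def cosmetic_filters(domain: str) -> list[str]:
    """Build one uBlock Origin cosmetic filter per search engine for a domain."""
    return [
        f'{site}##{selector}:has(a[href*="{domain}"])'
        for site, selector in RESULT_SELECTORS.items()
    ]


if __name__ == "__main__":
    for rule in cosmetic_filters("spam.example"):
        print(rule)
```

    Each generated line can be pasted into the "My filters" pane or kept in a hosted list, which keeps everything inside one extension at the cost of depending on page markup.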

  • mwmbl

    An open-source, non-profit search engine implemented in Python

  • > I'm afraid this is potentially dangerously political.

    It only took 7 days for that other search engine project on HN last week (Mwmbl) to add hard-coded weights for certain news websites, so it does show how careful you have to be with this stuff.

    https://github.com/mwmbl/mwmbl/blob/a41088ca9ad7fdcac952a3be...

  • duckduckgo-locales

    Translation files for https://duckduckgo.com

  • This is a great example of why "Google sucks!!11" is mainly FUD. Let's say you're looking for the SO link, which is #2 for Google. Let's compare:

    Google ("code that protects users from accidentally invoking the script when they didn't intend to")

    Link: https://www.google.com/search?q=%22code+that+protects+users+...

    SO - #2

    Bing ("code that protects users from accidentally invoking the script when they didn't intend to")

    Link: https://www.bing.com/search?q=%22code+that+protects+users+fr...

    SO - #2

    Brave Search

    Link: https://search.brave.com/search?q=%22code+that+protects+user...

    SO - Not on page

    You.com

    Link: https://you.com/search?q=%22code%20that%20protects%20users%2...

    SO - Doesn't load

    DuckDuckGo:

    Link: https://duckduckgo.com/?q=%22code+that+protects+users+from+a...

    SO - #2 (seems to depend on refresh)

    Basically they're all the same.

NOTE: The number of mentions on this list indicates mentions on common posts plus user suggested alternatives. Hence, a higher number means a more popular project.
