ClearURLs – automatically remove tracking elements from URLs

This page summarizes the projects mentioned and recommended in the original post on news.ycombinator.com

  • ClearURLs-Addon

    ClearURLs is an add-on based on the new WebExtensions technology and will automatically remove tracking elements from URLs to help protect your privacy.

  • >I don't see why ClearURLs couldn't add this functionality.

    I think the problem is that, for security reasons, ClearURLs can't change URLs arbitrarily. It can only remove parts of them, so the actual URL would have to be a parameter. See [1] for a comment by the extension's author on this issue.

    [1] https://github.com/ClearURLs/Addon/issues/102#issuecomment-8...
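To illustrate what "the actual URL would have to be a parameter" means, here is a minimal Python sketch that unwraps a redirect link only when the destination is already embedded in its query string. This is not ClearURLs code; the function name, the default parameter name `url`, and the example host are all assumptions made for illustration.

```python
from urllib.parse import urlsplit, parse_qs


def unwrap_redirect(url: str, param: str = "url") -> str:
    """Return the target URL embedded in `param`, or the input unchanged.

    Hypothetical helper: nothing is invented or rewritten, the result is
    always a piece of the original URL (after percent-decoding).
    """
    query = parse_qs(urlsplit(url).query)
    for candidate in query.get(param, []):
        target = urlsplit(candidate)
        # Only accept values that look like absolute web URLs.
        if target.scheme in ("http", "https") and target.netloc:
            return candidate
    return url


wrapped = "https://redirect.example/out?url=https%3A%2F%2Fexample.org%2Fpage&sid=123"
print(unwrap_redirect(wrapped))
```

A shortener whose destination is only known to its server cannot be handled this way, which is the limitation the comment describes.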

  • Rules

    Rules database of the ClearURLs WebExtension. (by ClearURLs)

  • Yep, that's kind of the main problem :) Hence the need for some manual curation. (e.g. ClearURLs seems to do it here https://github.com/ClearURLs/Rules/blob/master/data.json) For 80% of sites just throwing away the query parameters works; for the rest, sadly, it's necessary to do more sophisticated normalizing.

    I'm also thinking that it might be possible by some simple machine learning, by looking at the corpus of existing URLs. E.g. if a human looks at a corpus of different URLs they would more or less guess what is useful, and what's tracking garbage, so perhaps it's possible to automate it with a high accuracy?

    Then, I also feel if it's paired with some UI to allow the user to 'fix' the algorithm for entity extraction (e.g. by pointing at the 'relevant' parts of the URL), it would already be good enough for the user -- they would fix the sites that are worst offenders for them. Then these fixes could be optionally contributed back and merged to the upstream 'rules database'.
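The two approaches mentioned in this comment, throwing away the whole query string versus removing only curated parameters, can be sketched in Python as follows. The parameter list is a small illustrative sample, not the contents of the ClearURLs rules database, and both function names are hypothetical.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

# Illustrative sample only; a curated rules database is far larger.
TRACKING_PARAMS = {
    "utm_source", "utm_medium", "utm_campaign", "utm_term", "utm_content",
    "fbclid", "gclid",
}


def drop_query(url: str) -> str:
    """Blunt approach: discard every query parameter."""
    return urlunsplit(urlsplit(url)._replace(query=""))


def strip_tracking(url: str, tracking=TRACKING_PARAMS) -> str:
    """Curated approach: remove only parameters named in `tracking`."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key not in tracking
    ]
    # Note: urlencode re-encodes the kept values, so unusual escaping in
    # the original query may be normalized.
    return urlunsplit(parts._replace(query=urlencode(kept)))


url = "https://example.com/article?id=42&utm_source=hn&fbclid=abc"
print(drop_query(url))
print(strip_tracking(url))
```

The example URL shows why the blunt approach fails for some sites: `id=42` identifies the page, so dropping the whole query would break the link.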

  • uBlock-issues

    This is the community-maintained issue tracker for uBlock Origin

  • uBlock

    uBlock Origin - An efficient blocker for Chromium and Firefox. Fast and lean.

  • AdguardBrowserExtension

    AdGuard browser extension

  • promnesia

    Another piece of your extended mind

  • I've had an idea for a project (which I call 'cannon')[0] which would 'canonicalize' URLs and extract semantic information from them, ideally just by looking at the URL, without doing any extra requests. For example, a tweet URL usually encodes the tweet author and tweet ID; and by extracting such entities one could determine 'relations' between URLs. I'm using a simple prototype in Promnesia [1], a browser extension aiming to make the web browsing history more useful and aid knowledge management.

    This effort really ought to be shared: it's potentially a lot of manual work, and it could benefit many projects. ClearURLs seems like one of the most promising existing projects doing similar stuff; I have been meaning to approach the devs, as it feels like something we could cooperate on. ClearURLs has a somewhat narrower scope, but I still feel there is potential to share.

    [0] https://beepb00p.xyz/exobrain/projects/cannon.html

    [1] https://github.com/karlicoss/promnesia#readme
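The tweet example in this comment can be sketched in Python as below. This is not code from cannon or Promnesia; the regular expression, the `Tweet` type, and the helper names are assumptions, and lowercasing the author assumes handles are case-insensitive.

```python
import re
from typing import NamedTuple, Optional


class Tweet(NamedTuple):
    author: str
    tweet_id: str


# Hypothetical pattern covering common twitter.com status URLs.
TWEET_RE = re.compile(
    r"^https?://(?:www\.|mobile\.)?twitter\.com/"
    r"(?P<author>[A-Za-z0-9_]{1,15})/status/(?P<tweet_id>\d+)"
)


def parse_tweet_url(url: str) -> Optional[Tweet]:
    """Extract (author, tweet_id) from the URL alone, with no network request."""
    match = TWEET_RE.match(url)
    if match is None:
        return None
    return Tweet(match["author"].lower(), match["tweet_id"])


def same_author(url_a: str, url_b: str) -> bool:
    """One simple 'relation' between URLs: both tweets share an author."""
    a, b = parse_tweet_url(url_a), parse_tweet_url(url_b)
    return a is not None and b is not None and a.author == b.author


print(parse_tweet_url("https://mobile.twitter.com/Jack/status/20?s=21"))
```

Because the tracking suffix `?s=21` and the `mobile.` prefix are ignored, different spellings of the same tweet map to the same entity, which is the canonicalization the comment is after.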

  • archiveis

    A simple Python wrapper for the archive.is capturing service

  • Another way to browse one-off sites you visit is to go through a mirror like https://archive.is/ (I almost always use a mirror when viewing content aggregators like Medium, Substack, and Buzzfeed; annoying news websites that download a gazillion files; and file-hosting websites like imgur).

  • user.js

    Firefox privacy, security and anti-tracking: a comprehensive user.js template for configuration and hardening

  • Those addons are very basic, just what I'd have done in 2010 --- before Snowden!

    Since you have Firefox, you could sync with the Arkenfox (previously GHacks) user.js [1], which seems to go much farther (and still not breaking much)! At least the settings privacy.resistFingerprinting and privacy.firstparty.isolate looked indispensable as soon as I learned what they do.

    And without FPI (first party isolation), not getting LocalCDN [2] (Decentraleyes successor) seems like a gross oversight. They have a great discussion on add-ons at the wiki [3].

    [1] https://github.com/arkenfox/user.js

    [2] https://addons.mozilla.org/en-US/firefox/addon/localcdn-fork...

    [3] https://github.com/arkenfox/user.js/wiki/4.1-Extensions

  • FilterLists

    The independent, comprehensive directory of filter and host lists for advertisements, trackers, malware, and annoyances.

  • Lobsters

    Computing-focused community centered around link aggregation and discussion

  • Thanks for sharing this. We filter some of these from submissions to Lobsters (https://github.com/lobsters/lobsters/blob/f25fc62d7603c1bf70...) and I'd be glad to expand it.

    In your second list, are those the names of query params? I'm puzzled by the inclusion of @ in many of them, maybe you're saying that '_encoding' is a tracking param on any amazon domain, 'sk' is a tracking param on bing.com? What does the $ in the first entry indicate?

  • Universal-Bypass

    Discontinued. Don't waste your time with compliance. Universal Bypass circumvents annoying link shorteners.

  • ClearUrls

  • AFAIK it is not possible to get access to only the URL. I think the "tabs" permission triggers the "Access your data for all websites" warning. The "tabs" permission is required to get a tab's URL and change it.

    Discussion about ClearURLs permissions: https://gitlab.com/KevinRoebert/ClearUrls/-/issues/159

    Extension permissions: https://developer.mozilla.org/en-US/docs/Mozilla/Add-ons/Web...

  • Neat-URL

    Neat URL cleans URLs, removing parameters such as Google Analytics' utm parameters.

  • Looks like the syntax used by NeatURL, see https://github.com/Smile4ever/Neat-URL#parameter-rules
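One possible reading of the `param@domain` entries discussed in this thread is that a parameter is removed only on hosts matching the domain part. The Python sketch below implements that reading for plain `name` and `name@domain` rules only; it is not Neat URL's actual matcher, the function names are hypothetical, and rules with other prefixes (such as the `$` asked about above) are not interpreted.

```python
from fnmatch import fnmatchcase
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode


def parse_rule(rule: str):
    """Split 'name@domain' into (name, domain); domain is None for global rules."""
    name, _, domain = rule.partition("@")
    return name, (domain or None)


def host_matches(host: str, pattern) -> bool:
    """Assumed semantics: match the bare domain or any subdomain; '*' is a wildcard."""
    if pattern is None:
        return True
    return fnmatchcase(host, pattern) or fnmatchcase(host, "*." + pattern)


def clean(url: str, rules) -> str:
    """Remove query parameters whose rule applies to the URL's host."""
    parts = urlsplit(url)
    host = parts.hostname or ""
    blocked = {
        name
        for name, pattern in map(parse_rule, rules)
        if host_matches(host, pattern)
    }
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key not in blocked
    ]
    return urlunsplit(parts._replace(query=urlencode(kept)))


rules = ["utm_source", "_encoding@amazon.*", "sk@bing.com"]
print(clean("https://www.amazon.com/dp/B000?_encoding=UTF8&sk=1&utm_source=x", rules))
```

Under this reading, `sk` survives on the Amazon URL because its rule is scoped to bing.com, while `_encoding` would be left alone on any non-Amazon host.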

NOTE: The number of mentions on this list indicates mentions on common posts plus user suggested alternatives. Hence, a higher number means a more popular project.
