dark-knowledge

😈📚 A curated library of research papers and presentations for counter-detection and web privacy enthusiasts. (by prescience-data)

Dark-knowledge Alternatives

Similar projects and alternatives to dark-knowledge

NOTE: The number of mentions on this list counts mentions in common posts plus user-suggested alternatives. Hence, a higher number means a better dark-knowledge alternative or higher similarity.

dark-knowledge reviews and mentions

Posts with mentions or reviews of dark-knowledge. We have used some of these posts to build our list of alternatives and similar projects. The last one was on 2023-04-28.
  • Share some articles you've saved
    2 projects | /r/privsec_dev | 28 Apr 2023
    "A curated library of research papers and presentations for counter-detection and web privacy enthusiasts": https://github.com/prescience-data/dark-knowledge
  • dark-knowledge: 😈📚 A curated library of research papers and presentations for counter-detection and web privacy enthusiasts - can be applied by this community to fingerprint various threat actors
    1 project | /r/blueteamsec | 11 Apr 2023
  • Can browser addons leak user data?
    1 project | /r/privacy | 1 Apr 2023
    What you said is contrary to the stance of Tor browser devs and not backed up by actual research. Here you can find a good collection of research papers: https://github.com/prescience-data/dark-knowledge
  • Choose your browser carefully | by Unix Sheikh
    1 project | /r/degoogle | 18 Jan 2023
    Browsers are complicated and you won't find a single analysis covering all aspects. For a security analysis Madaidan's blog is a good starting point. For a privacy analysis you need to learn the common tracking methods and which solutions or mitigations are available in which browser (and if they are properly implemented). You could start by learning about the different forms of tracking through state (cookies, cache, storage, ...), which is still one of the most used tracking methods. Of course you also need to check the easy things like telemetry. Then there is fingerprinting which is a huge topic on its own. This is where it's even more fun. You need to start reading research papers, not just one but many and you need to check mitigations used in browsers and their statistical implications (data about this is unfortunately very rare).
  • Will switching to linux make it easier to fingerprint my device?
    1 project | /r/privacy | 26 Dec 2022
    It's difficult to distinguish without having the knowledge yourself. Maybe Reddit is just not the right place, because most experts don't have the time to argue with non-experts on social media. You can look at what experts write in research papers, or in the bug trackers of browsers with fingerprinting mitigations like the Tor browser.
  • Ask HN: Have you ever used anti detect browsers for web scraping?
    2 projects | news.ycombinator.com | 18 Nov 2022
    Plus curated list of research papers here if you want to go deep on the subject matter: https://github.com/prescience-data/dark-knowledge
  • is there a need for addtl FF extensions ?
    1 project | /r/privacy | 15 Jun 2022
    They become a significant part of your browser fingerprint. If you have a lot of extensions installed, this alone could make you uniquely trackable. (See research papers here: https://github.com/prescience-data/dark-knowledge and https://fingerprintjs.com/blog/ad-blocker-fingerprinting/)
  • VPN with best adblock?
    2 projects | /r/PrivacyGuides | 4 Jun 2022
    https://github.com/prescience-data/dark-knowledge (search for "extension")
  • Browser Fingerprint
    1 project | /r/privacy | 4 Mar 2022
    A dedicated list of research papers on browser fingerprinting: https://github.com/prescience-data/dark-knowledge
  • Avoiding Bot Detection
    2 projects | /r/webscraping | 31 Jan 2022
    "I'm a noob and using python with selenium to do some basic scraping on StockX" and scraping a protected website like StockX with PerimeterX is not possible. It's all about reverse engineering, browser introspection, and fingerprinting (from hardware to software canvas). Then you still need tons of IPs to rotate and cool down. Finally, protections evolve over time and you have to redo most of the work to pass again.
    A company like Scrapfly exists because it's more expensive to build and maintain such a solution internally. Look at their public repositories on GitHub: low-level stuff, network spoofing stacks, packet manipulation, custom ANGLE libs. It takes a long time to learn vs something like `asp=true` from their docs https://scrapfly.io/docs/scrape-api/anti-scraping-protection
    If you have time and are more interested in this side, you could start by reading https://github.com/prescience-data/dark-knowledge and look at the https://github.com/berstend/puppeteer-extra/tree/master/packages/puppeteer-extra-plugin-stealth project to see how it works. Do not expect the stealth project to help you bypass at scale: it's public, so anti-bot companies are aware of it and spot it easily. Most of the time they don't block directly; they use the bad fingerprint it generates to recognize bots and map proxy IPs, collecting them to deduce the subnet or residential network.
    > My main question is, would it be better to try and make my script act "more human"
    It's a myth that anti-bot systems rely on detecting "human" behavior; this signal is not very important. Randomly moving the mouse or similar is fine. Having 0 input events is suspect, but not that much: touch devices do not trigger any events until you touch the screen, so it can't be a strong signal because of false positives. "Behavioral detection" is a big lie in the industry; you can experiment by doing dumb things and it still passes, even at scale ... and when they say "machine learning", it's just basic stats like a throttle does, but based on browser fingerprints rather than IP. If you hit certain paths, like login, registration, and payment, they can use some very heavy systems with GPU canvas and the like, but those are not used against scraping yet.
    > are other methods like switching drivers and using proxies the way to go?
    Using proxies, yes, but not with wrong fingerprints (Chrome headless, a browser running on server hardware, a browser in Docker, and so on). In fact, there is no magic: mixing drivers changes nothing, since they still drive a browser that has been spotted. Some are just more flexible than others at correctly spoofing certain parts, like JS worker interception to inject scripts and hook correctly, but that's all.
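The "Avoiding Bot Detection" post above claims that much of what is sold as "machine learning" detection is basic statistics, like a throttle keyed on the browser fingerprint rather than the IP. A minimal sketch of that idea follows. It is an illustration only, not any vendor's implementation: the attribute names, the `FingerprintThrottle` class, and the limits are all hypothetical.

```python
import hashlib
from collections import defaultdict, deque


def fingerprint_key(attributes: dict) -> str:
    """Hash browser attributes into a stable key (the attribute set is hypothetical)."""
    canonical = "|".join(f"{name}={attributes[name]}" for name in sorted(attributes))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


class FingerprintThrottle:
    """Sliding-window request counter keyed by fingerprint instead of IP."""

    def __init__(self, max_requests: int, window_seconds: float):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self._hits = defaultdict(deque)  # fingerprint key -> request timestamps

    def allow(self, attributes: dict, now: float) -> bool:
        hits = self._hits[fingerprint_key(attributes)]
        # Drop timestamps that have fallen out of the window.
        while hits and now - hits[0] >= self.window_seconds:
            hits.popleft()
        if len(hits) >= self.max_requests:
            return False
        hits.append(now)
        return True


throttle = FingerprintThrottle(max_requests=3, window_seconds=60.0)
# Hypothetical attributes of a headless browser running on server hardware.
headless = {"user_agent": "HeadlessChrome", "webgl_vendor": "SwiftShader", "cores": 2}
# The client IP is never consulted, so rotating proxies does not reset the counter.
decisions = [throttle.allow(headless, now=t) for t in (0.0, 1.0, 2.0, 3.0)]
```

Under these assumptions, the fourth request inside the window is refused no matter which proxy it arrives through, which matches the post's point that rotating IPs does not help while the fingerprint stays the same.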

Stats

Basic dark-knowledge repo stats
14
507
4.6
2 months ago