JavaScript Crawler

Open-source JavaScript projects categorized as Crawler

Top 17 JavaScript Crawler Projects

  • node-crawler

    Web Crawler/Spider for NodeJS + server-side jQuery ;-)

  • browser-fingerprinting

    Analysis of Bot Protection systems with available countermeasures 🚿. How to defeat anti-bot systems 👻 and get around browser fingerprinting scripts 🕵️‍♂️ when scraping the web?

    Project mention: A site that tracks the price of a Big Mac in every US McDonald's | news.ycombinator.com | 2024-01-13

    Yes, there is a lot written about it. Here is one link I have saved:

    https://github.com/niespodd/browser-fingerprinting

  • work_crawler

    Download tool for comics and novels: 腾讯漫画 大角虫漫画 有妖气 咪咕 SF漫画 哦漫画 看漫画 漫画柜 汗汗酷漫 動漫伊甸園 快看漫画 微博动漫 733动漫网 大古漫画网 漫画DB 無限動漫 動漫狂 卡推漫画 动漫之家 动漫屋 古风漫画网 36漫画网 亲亲漫画网 乙女漫画 webtoons 咚漫 ニコニコ静画 ComicWalker ヤングエースUP モアイ pixivコミック サイコミ; アルファポリス カクヨム ハーメルン 小説家になろう 起点中文网 八一中文网 顶点小说 落霞小说网 努努书坊 笔趣阁 → epub.

  • google-play-scraper

    Node.js scraper to get data from Google Play

  • article-extractor

    To extract main article from given URL with Node.js

    Project mention: ScrapeGraphAI: Web scraping using LLM and direct graph logic | news.ycombinator.com | 2024-05-07

    Agreed!

    Apify's Website Content Crawler[0] does a decent job of this for most websites in my experience. It allows you to "extract" content via different built-in methods (e.g. Extractus [1]).

    We currently use this at Magic Loops[2] and it works _most_ of the time.

    The long-tail is difficult though, and it's not uncommon for users to back out to raw HTML, and then have our tool write some custom logic to parse the content they want from the scraped results (fun fact: before GPT-4 Turbo, the HTML page was often too large for the context window... and sometimes it still is!).

    Would love a dedicated tool for this. I know the folks at Reworkd[3] are working on something similar, but not sure how much is public yet.

    [0] https://apify.com/apify/website-content-crawler

    [1] https://github.com/extractus/article-extractor

    [2] https://magicloops.dev/

    [3] https://reworkd.ai/

  • fakebrowser

    🤖 Fake fingerprints to bypass anti-bot systems. Simulate mouse and keyboard operations to make behavior like a real person.

  • sitemap-generator

    Easily create XML sitemaps for your website.

  • JSSoup

    JavaScript + BeautifulSoup = JSSoup

  • th-music-video-generator

    Touhou Project random music video generator/player, crawling image and video from websites to generate MV.

  • undetectable-crawler

    A Node.js script powered by Puppeteer for undetectable web scraping

    Project mention: Show HN: A Node.js script powered by Puppeteer for undetectable web scraping | news.ycombinator.com | 2024-01-17
  • images-downloader

    A Node.js module for downloading a single image or multiple images to disk from a given Url

  • selector-finder

    Find a CSS selector on a public site

    Project mention: SelectorHound: The tool for Sniffing out CSS Selectors | dev.to | 2024-02-29

    You can view the package on NPM and you can look at the code on Github

  • airbnb-scraper

    Apify public actor for scraping Airbnb homes.

  • CodexDrake

    An open source, privacy-first, self-hosting capable and blazing fast search engine written in JavaScript. Browse anonymously and safely without the need to pay third-party APIs. 👀

  • Netflix-Hotkeys

    A Chrome extension to enhance your Netflix binging experience!

  • tumblweed

    A simple cross-platform Tumblr blog downloader

  • finance-news-crawler

    Finance News Crawler uses News API to fetch some latest articles and generates a sentiment report with the OpenAI API or VADER

    Project mention: Cool ChatGPT Finance Sentiment Analysis | /r/ChatGPT | 2023-07-02

NOTE: The open source projects on this list are ordered by number of GitHub stars. The number of mentions indicates repo mentions in the last 12 months or since we started tracking (Dec 2020).

JavaScript Crawler related posts

  • Cool ChatGPT Finance Sentiment Analysis

    1 project | /r/ChatGPT | 2 Jul 2023
  • GitHub - simwai/finance-news-crawler: Finance News Crawler uses News API to fetch some latest articles and generates a sentiment report with the OpenAI API or VADER

    1 project | /r/algotrading | 2 Jul 2023
  • Would it be worth publishing my Chrome Extension even if I anticipate that it will have very few users?

    1 project | /r/developersIndia | 16 Jun 2023
  • Who here is developing extensions?

    2 projects | /r/webdev | 11 Jun 2023
  • Netflix Hotkeys: A Chrome Extension to enhance your Netflix Experience

    1 project | /r/javascript | 11 Jun 2023
  • Netflix Hotkeys: A Chrome Extension to enhance your Netflix Experience

    1 project | /r/coolgithubprojects | 10 Jun 2023
  • FAQs on my side project

    2 projects | /r/SideProject | 24 Oct 2022

Index

What are some of the best open-source Crawler projects in JavaScript? This list will help you:

#   Project                    Stars
1   node-crawler               6,648
2   browser-fingerprinting     3,938
3   work_crawler               2,944
4   google-play-scraper        2,264
5   article-extractor          1,445
6   fakebrowser                1,050
7   sitemap-generator            396
8   JSSoup                       366
9   th-music-video-generator     266
10  undetectable-crawler          23
11  images-downloader             19
12  selector-finder               20
13  airbnb-scraper                11
14  CodexDrake                     9
15  Netflix-Hotkeys                9
16  tumblweed                      6
17  finance-news-crawler           5

Did you know that JavaScript is the 3rd most popular programming language based on number of mentions?