JavaScript Scraper

Open-source JavaScript projects categorized as Scraper

Top 23 JavaScript Scraper Projects

  • node-ytdl-core

    YouTube video downloader in javascript.

  • Project mention: Simple Youtube Downloader in under 50 Javascript lines | dev.to | 2023-11-04

    ytdl-core for streaming from Youtube

  • scrape-it

    🔮 A Node.js scraper for humans.

  • browser-fingerprinting

    Analysis of Bot Protection systems with available countermeasures 🚿. How to defeat anti-bot system 👻 and get around browser fingerprinting scripts 🕵️‍♂️ when scraping the web?

  • Project mention: A site that tracks the price of a Big Mac in every US McDonald's | news.ycombinator.com | 2024-01-13

    Yes, there is a lot written about it. Here is one link I have saved:

    https://github.com/niespodd/browser-fingerprinting

  • freeDictionaryAPI

    There was no free Dictionary API on the web when I wanted one for my friend, so I created one.

  • google-play-scraper

    Node.js scraper to get data from Google Play

  • node-website-scraper

    Download website to local directory (including all css, images, js, etc.)

  • article-extractor

    Extract the main article from a given URL with Node.js

  • Project mention: ScrapeGraphAI: Web scraping using LLM and direct graph logic | news.ycombinator.com | 2024-05-07

    Agreed!

    Apify's Website Content Crawler[0] does a decent job of this for most websites in my experience. It allows you to "extract" content via different built-in methods (e.g. Extractus [1]).

    We currently use this at Magic Loops[2] and it works _most_ of the time.

    The long-tail is difficult though, and it's not uncommon for users to back out to raw HTML, and then have our tool write some custom logic to parse the content they want from the scraped results (fun fact: before GPT-4 Turbo, the HTML page was often too large for the context window... and sometimes it still is!).

    Would love a dedicated tool for this. I know the folks at Reworkd[3] are working on something similar, but not sure how much is public yet.

    [0] https://apify.com/apify/website-content-crawler

    [1] https://github.com/extractus/article-extractor

    [2] https://magicloops.dev/

    [3] https://reworkd.ai/

  • website-scraper-puppeteer

    Plugin for website-scraper which returns html for dynamic websites using puppeteer

  • fredy

    ❤️ Fredy - [F]ind [R]eal [E]states [D]amn Eas[y] - Fredy will constantly search for new listings on sites like Immoscout or Immowelt and send new results to you, so that you can focus on more important things in life ;)

  • obsidian-scrapers

    Get information from link for Obsidian

  • Project mention: Help regarding workflow | /r/ObsidianMD | 2023-12-07

    Templater scraper scripts

  • chorus

    Clone Hero-friendly Organized Repository of User-provided Songs (by Paturages)

  • amazon_scraper

    Amazon product scraper using rotating proxies and headless Chrome from ScrapingAnt

  • instagram-without-api-node

    Simple Node.js code to get unlimited public Instagram pictures from any user, without an API or credentials.

  • easy-reddit-downloader

    Simple headless Reddit post downloader

  • itchio-godot-scraper

    A scraper for Godot games hosted on https://itch.io.

  • trawler

    scraper for facebook, gab, google and tiktok (by niczem)

  • vlrgg-api

    An unofficial REST API for vlr.gg

  • XboxStoreAPI

    An API to retrieve a list of names & prices for Xbox games on sale.

  • html_tag_annotator

    A Machine Learning tool to create the training dataset very quickly & easily by using a smart chrome extension

  • nba-topshop-scraper

    Node script that will use Selenium to scrape card information from NBA Topshot including card names, rarity, and lowest cost at the moment. Data is scraped once per day.

  • awscraper

    A humble idea: fetch cloud resources to make them as inventoriable as possible.

  • reddit-in-valve-games

    Desktop app to automate getting text-only posts from reddit & binding them to keyboard keys in a Valve game to share them in the chat. Made with ElectronJS & Ruby.

  • tumblweed

    A simple cross-platform Tumblr blog downloader

NOTE: The open-source projects on this list are ordered by number of GitHub stars. The number of mentions indicates repo mentions in the last 12 months or since we started tracking (Dec 2020).

JavaScript Scraper related posts

  • Plug-in for formatting saved websites directly into Obsidian

    1 project | /r/ObsidianMD | 10 Nov 2023
  • Simple Youtube Downloader in under 50 Javascript lines

    1 project | dev.to | 4 Nov 2023
  • Nextjs ytdl-core youtube downloader

    1 project | /r/nextjs | 25 Jun 2023
  • Built a website to help you find... pocket knives!

    1 project | /r/reactjs | 5 Jun 2023
  • Feedback on new game

    1 project | /r/wordgames | 23 May 2023
  • How can I get all the words in this dictionary api?

    1 project | /r/AskProgramming | 12 Jan 2023
  • How can I get all the words in this dictionary api?

    1 project | /r/learnprogramming | 12 Jan 2023

Index

What are some of the best open-source Scraper projects in JavaScript? This list will help you:

#   Project                     Stars
1   node-ytdl-core              4,368
2   scrape-it                   3,990
3   browser-fingerprinting      3,938
4   freeDictionaryAPI           2,436
5   google-play-scraper         2,250
6   node-website-scraper        1,520
7   article-extractor           1,434
8   website-scraper-puppeteer   308
9   fredy                       207
10  obsidian-scrapers           129
11  chorus                      114
12  amazon_scraper              77
13  instagram-without-api-node  64
14  easy-reddit-downloader      58
15  itchio-godot-scraper        28
16  trawler                     22
17  vlrgg-api                   20
18  XboxStoreAPI                13
19  html_tag_annotator          12
20  nba-topshop-scraper         11
21  awscraper                   8
22  reddit-in-valve-games       7
23  tumblweed                   6
