JavaScript Scraper

Open-source JavaScript projects categorized as Scraper

Top 23 JavaScript Scraper Projects

  • node-ytdl-core

    YouTube video downloader in JavaScript.

  • Project mention: Simple Youtube Downloader in under 50 Javascript lines | dev.to | 2023-11-04

    ytdl-core for streaming from Youtube

  • scrape-it

    🔮 A Node.js scraper for humans.

  • Project mention: Built a website to help you find... pocket knives! | /r/reactjs | 2023-06-05

    https://github.com/IonicaBizau/scrape-it for simple code to target dom elements with classes/ids etc...

  • browser-fingerprinting

    Analysis of Bot Protection systems with available countermeasures 🚿. How to defeat anti-bot system 👻 and get around browser fingerprinting scripts 🕵️‍♂️ when scraping the web?

  • Project mention: A site that tracks the price of a Big Mac in every US McDonald's | news.ycombinator.com | 2024-01-13

    Yes, there is a lot written about it. Here is one link I have saved:

    https://github.com/niespodd/browser-fingerprinting

  • freeDictionaryAPI

    There was no free Dictionary API on the web when I wanted one for my friend, so I created one.

  • Project mention: Feedback on new game | /r/wordgames | 2023-05-23

    I see that the request to dictionaryapi.dev comes directly from the client. Looking at the github for that project, I see this:

  • google-play-scraper

    Node.js scraper to get data from Google Play

  • node-website-scraper

    Download website to local directory (including all css, images, js, etc.)

  • article-extractor

    To extract main article from given URL with Node.js

  • Project mention: How do Instapaper and Pocket apps extract the content of the articles? | /r/opensource | 2023-12-04

    Edit: I found this library in NodeJs useful for article extraction. Anyone looking for something like you can take a look. https://github.com/extractus/article-extractor

  • gogoanime-api

    Anime Streaming, Discovery API made with Cheerio and Express. Uses data from Gogoanime

  • Project mention: I created an anime website . | /r/developersIndia | 2023-06-29
  • website-scraper-puppeteer

    Plugin for website-scraper which returns html for dynamic websites using puppeteer

  • fredy

    ❤️ Fredy - [F]ind [R]eal [E]states [D]amn Eas[y] - Fredy will constantly search for new listings on sites like Immoscout or Immowelt and send new results to you, so that you can focus on more important things in life ;)

  • obsidian-scrapers

    Get information from link for Obsidian

  • Project mention: Help regarding workflow | /r/ObsidianMD | 2023-12-07

    Templater scraper scripts

  • chorus

    Clone Hero-friendly Organized Repository of User-provided Songs (by Paturages)

  • amazon_scraper

    Amazon product scraper using rotating proxies and headless Chrome from ScrapingAnt

  • instagram-without-api-node

    Simple Node.js code to get unlimited public Instagram pictures from any user, without the API and without credentials.

  • easy-reddit-downloader

    Simple headless Reddit post downloader

  • Project mention: Archiving Reddit without using an app that relies on their API | /r/DataHoarder | 2023-06-01

    Seeing that the move to make the Reddit API cost cash, I'm afraid that something like this won't function when I'm done seeing what I want to archive from this place. So, I'm asking to see what options from Website Mirroring tools I got, and how to set them up to fetch media files as well, other than just the website's layout and such.

  • itchio-godot-scraper

    A scraper for Godot games hosted on https://itch.io.

  • trawler

    scraper for facebook, gab, google and tiktok (by niczem)

  • vlrgg-api

    An unofficial REST API for vlr.gg

  • XboxStoreAPI

    An API to retrieve a list of names & prices for Xbox games on sale.

  • html_tag_annotator

    A machine learning tool for creating a training dataset quickly and easily with a smart Chrome extension

  • nba-topshop-scraper

    Node script that will use Selenium to scrape card information from NBA Topshot including card names, rarity, and lowest cost at the moment. Data is scraped once per day.

  • awscraper

    A humble idea: fetch cloud resources to make them as inventoriable as possible.

  • reddit-in-valve-games

    Desktop app to automate getting text-only posts from reddit & binding them to keyboard keys in a Valve game to share them in the chat. Made with ElectronJS & Ruby.

NOTE: The open source projects on this list are ordered by number of GitHub stars. The number of mentions indicates repo mentions in the last 12 months or since we started tracking (Dec 2020).

Index

What are some of the best open-source Scraper projects in JavaScript? This list will help you:

Rank  Project                      Stars
1     node-ytdl-core               4,280
2     scrape-it                    3,978
3     browser-fingerprinting       3,830
4     freeDictionaryAPI            2,335
5     google-play-scraper          2,218
6     node-website-scraper         1,504
7     article-extractor            1,375
8     gogoanime-api                  670
9     website-scraper-puppeteer      294
10    fredy                          202
11    obsidian-scrapers              115
12    chorus                         115
13    amazon_scraper                  75
14    instagram-without-api-node      63
15    easy-reddit-downloader          52
16    itchio-godot-scraper            27
17    trawler                         22
18    vlrgg-api                       18
19    XboxStoreAPI                    13
20    html_tag_annotator              12
21    nba-topshop-scraper             11
22    awscraper                        8
23    reddit-in-valve-games            7
