TypeScript Scraper

Open-source TypeScript projects categorized as Scraper

Top 5 TypeScript Scraper Projects

  • GitHub repo cheerio

    Fast, flexible, and lean implementation of core jQuery designed specifically for the server.

    Project mention: Automating email verification for online accounts using JavaScript | dev.to | 2021-11-29

    cheerio is used to parse the HTML structure of an incoming email and pick out the link we need to click to verify our email address by an attribute (in this case the text content of the link, but it could be any HTML attribute)

  • GitHub repo mwoffliner

    Scrape any online MediaWiki-powered wiki (like Wikipedia) to your local filesystem

    Project mention: Creating ZIM files for Kiwix by myself? | reddit.com/r/DataHoarder | 2021-10-28

    r/kiwix would be the place to ask, but at the end of the day it all comes down to heading out to openzim.org (or the corresponding github repo) and figuring it out. You can either grab zimit and run it locally, or access all the libraries that will help you build your own scraper (Nautilus will assemble documents and videos into a single file library, MWoffliner will do for wikis, youtube will do YouTube, etc.).

  • GitHub repo scraper

    Open source Node.js web scraper. It scrapes, stores and exports data. Use it from your own JavaScript/TypeScript code, via the command line or a Docker container. Supports multiple storage options: SQLite, MySQL, PostgreSQL. Supports multiple browser or DOM-like clients: Puppeteer, Playwright, Cheerio, jsdom.

    Project mention: A simple solution to rotate proxies or how to spin up your own rotation proxy server with Puppeteer and only a few lines of JS code | reddit.com/r/webscraping | 2021-03-05

    I'm currently implementing concurrency conditions at project/proxy/domain/session level in https://github.com/get-set-fetch/scraper . On each level you can define the maximum number of requests and the delay between two consecutive requests.

  • GitHub repo extension

    web scraping extension (by get-set-fetch)

    Project mention: Scraping Sub-page on a password protected site | reddit.com/r/webscraping | 2021-02-03

    One such open source extension is https://github.com/get-set-fetch/extension . Happy scraping :)

  • GitHub repo scraper

    Declarative web scraper in JavaScript primarily designed to extract linguistics data (by sergeyt)

    Project mention: LinguaBook - React app for learning Basic English | dev.to | 2021-01-24

    Under the hood it relies heavily on my Just-in-Time scraper (another open source project of mine). This scraper parses HTML pages from multiple sources in the browser and displays the parsed results on the page.
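
The get-set-fetch scraper mention above describes two limits: a maximum number of requests and a delay between two consecutive requests. The following is a minimal sketch of that idea in plain TypeScript, not the get-set-fetch API. The names `Throttle`, `ThrottleOptions`, `maxRequests`, `delayMs` and `sleep` are hypothetical, and the sketch assumes "maximum number of requests" means requests in flight at the same time.

```typescript
// Hypothetical sketch of request throttling; these names are NOT part of get-set-fetch.
interface ThrottleOptions {
  maxRequests: number; // assumed meaning: maximum number of requests in flight at once
  delayMs: number;     // minimum delay between two consecutive request starts
}

// Resolve after the given number of milliseconds.
const sleep = (ms: number): Promise<void> =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

class Throttle {
  private readonly options: ThrottleOptions;
  private active = 0;                      // slots currently taken
  private nextStart = 0;                   // earliest time the next task may start
  private waiters: Array<() => void> = []; // callers waiting for a free slot

  constructor(options: ThrottleOptions) {
    this.options = options;
  }

  async run<T>(task: () => Promise<T>): Promise<T> {
    // Wait until a slot is free; re-check after waking in case another caller took it.
    while (this.active >= this.options.maxRequests) {
      await new Promise<void>((resolve) => this.waiters.push(resolve));
    }
    this.active += 1;

    // Reserve a start time at least delayMs after the previously reserved one.
    const now = Date.now();
    const startAt = Math.max(now, this.nextStart);
    this.nextStart = startAt + this.options.delayMs;

    try {
      if (startAt > now) {
        await sleep(startAt - now);
      }
      return await task();
    } finally {
      // Free the slot and wake one waiting caller, if any.
      this.active -= 1;
      this.waiters.shift()?.();
    }
  }
}

// Usage (illustrative): at most 2 requests in flight, starts spaced 500 ms apart.
// const throttle = new Throttle({ maxRequests: 2, delayMs: 500 });
// const pages = await Promise.all(urls.map((url) => throttle.run(() => fetch(url))));
```

To apply limits at several levels (project, proxy, domain, session), one such throttle could be kept per level and a request wrapped in each of them in turn. Whether the actual project works this way is not stated in the mention.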

NOTE: The open source projects on this list are ordered by number of GitHub stars. The number of mentions indicates repo mentions in the last 12 months or since we started tracking (Dec 2020). The latest post mention was on 2021-11-29.

What are some of the best open-source Scraper projects in TypeScript? This list will help you:

#  Project                      Stars
1  cheerio                      24,534
2  mwoffliner                   132
3  scraper (get-set-fetch)      33
4  extension (get-set-fetch)    18
5  scraper (sergeyt)            2