TypeScript web-scraping

Open-source TypeScript projects categorized as web-scraping

Top 5 TypeScript web-scraping Projects

  • crawlee

    Crawlee—A web scraping and browser automation library for Node.js to build reliable crawlers. In JavaScript and TypeScript. Extract data for AI, LLMs, RAG, or GPTs. Download HTML, PDF, JPG, PNG, and other files from websites. Works with Puppeteer, Playwright, Cheerio, JSDOM, and raw HTTP. Both headful and headless mode. With proxy rotation.

  • Project mention: How to scrape Amazon products | dev.to | 2024-04-01

    In this guide, we'll be extracting information from Amazon product pages using the power of TypeScript in combination with the Cheerio and Crawlee libraries. We'll explore how to retrieve and extract detailed product data such as titles, prices, image URLs, and more from Amazon's vast marketplace. We'll also discuss handling potential blocking issues that may arise during the scraping process.

  • ayakashi

    :zap: Ayakashi.io - The next generation web scraping framework

  • SurveyJS

    Open-Source JSON Form Builder to Create Dynamic Forms Right in Your App. With SurveyJS form UI libraries, you can build and style forms in a fully-integrated drag & drop form builder, render them in your JS app, and store form submission data in any backend, inc. PHP, ASP.NET Core, and Node.js.

    SurveyJS logo
  • LeMondeRssReader

    :newspaper: Read RSS feed from LeMonde.fr and display news inside the App

  • scrapyteer

    Web crawling & scraping framework for Node.js on top of headless Chrome browser

  • Project mention: Low-code Node.js web scraping tool | /r/webscraping | 2023-07-07

    Hi guys, I've created an open-source low-code Node.js web scraping tool on top of the Puppeteer - https://github.com/miroshnikov/scrapyteer. It offers a small set of functions that are combined in pipelines to define a crawling workflow and a shape of output data. Maybe somebody will find it useful.

  • monster-hunter-now-events

    A tool that auto-generates calendar events for Monster Hunter Now by scraping web news articles, processing them with AI, and creating a convenient calendar subscription.

  • Project mention: I created a calendar feed for in game events | /r/MHNowGame | 2023-11-02
  • InfluxDB

    Power Real-Time Data Analytics at Scale. Get real-time insights from all types of time series data with InfluxDB. Ingest, query, and analyze billions of data points in real-time with unbounded cardinality.

    InfluxDB logo
NOTE: The open source projects on this list are ordered by number of github stars. The number of mentions indicates repo mentiontions in the last 12 Months or since we started tracking (Dec 2020).

TypeScript web-scraping related posts

Index

What are some of the best open-source web-scraping projects in TypeScript? This list will help you:

Project Stars
1 crawlee 12,129
2 ayakashi 197
3 LeMondeRssReader 25
4 scrapyteer 16
5 monster-hunter-now-events 7

Sponsored
The modern identity platform for B2B SaaS
The APIs are flexible and easy-to-use, supporting authentication, user identity, and complex enterprise features like SSO and SCIM provisioning.
workos.com