Top 23 TypeScript Scraper Projects

firecrawl

Rank: 1 | Mentions: 14 | Stars: 52,769 | Activity: 9.9 | Language: TypeScript

The Web Data API for AI - Turn entire websites into LLM-ready markdown or structured data 🔥

Project mention: Why we started sampleapp.ai | dev.to | 2025-06-23

Just a few days ago, Eric, the CEO of Firecrawl, announced that they were closing down their previous startup, Mendable, and Hassan was promoted to Director of Developer Relations. Both of them post sample applications they build on a daily basis. These recent posts are a testament to the impact of sample applications on the adoption of Firecrawl and Together.ai.
cheerio

Rank: 2 | Mentions: 58 | Stars: 29,714 | Activity: 9.7 | Language: TypeScript

The fast, flexible, and elegant library for parsing and manipulating HTML and XML.

Project mention: JavaScript package manager - How to fix Cannot find module 'cheerio' error with Enzyme in Yarn 1 projects | dev.to | 2025-06-11

Cheerio 1.0.0 is incompatible with enzyme 3.11.0. #3987
crawlee

Rank: 3 | Mentions: 47 | Stars: 19,240 | Activity: 9.7 | Language: TypeScript

Crawlee—A web scraping and browser automation library for Node.js to build reliable crawlers. In JavaScript and TypeScript. Extract data for AI, LLMs, RAG, or GPTs. Download HTML, PDF, JPG, PNG, and other files from websites. Works with Puppeteer, Playwright, Cheerio, JSDOM, and raw HTTP. Both headful and headless mode. With proxy rotation.

Project mention: Scraperr – A Self Hosted Webscraper | news.ycombinator.com | 2025-05-11

If you're a fan of Playwright check out Crawlee [0]. I've used it for a few small projects and it's been faster for me to get what I've needed done.
[0] https://crawlee.dev/
maxun

Rank: 4 | Mentions: 5 | Stars: 13,540 | Activity: 10.0 | Language: TypeScript

Easiest no code web data extraction platform. Instantly turn any website into API or spreadsheet.

Project mention: 👽 Extract Thousands of Rows of Data Without Writing Code (Open Source) | dev.to | 2025-07-17

Explore the project on GitHub: https://github.com/getmaxun/maxun
llm-scraper

Rank: 5 | Mentions: 5 | Stars: 5,974 | Activity: 7.5 | Language: TypeScript

Turn any webpage into structured data using LLMs

Project mention: Scraperr – A Self Hosted Webscraper | news.ycombinator.com | 2025-05-11

llm-scraper [1] does a decent job but it's still a bit fragile. The biggest problem I have is all the React CSS-in-JS libraries that use hashes in their class names, which the LLM isn't smart enough to ignore.
[1] https://github.com/mishushakov/llm-scraper
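One possible workaround for the hashed class name problem described above is to strip machine-generated class tokens from the HTML before handing it to the model. The following is a minimal sketch, not part of llm-scraper: the function names and the `css-`/`sc-` prefix pattern are assumptions chosen for illustration, and the pattern is a heuristic that would also drop a hand-written class such as `css-grid`.

```typescript
// Hypothetical heuristic: treat tokens like "css-1x2y3z" or "sc-bdVaJa"
// as generated by a CSS-in-JS library. Real projects may need other prefixes.
const HASHED_CLASS = /^(?:css|sc)-[A-Za-z0-9]+$/;

function isHashedClass(token: string): boolean {
  return HASHED_CLASS.test(token);
}

// Removes hashed tokens from every double-quoted class attribute.
// If no tokens remain, the whole attribute is dropped.
function stripHashedClasses(html: string): string {
  return html.replace(/\sclass="([^"]*)"/g, (_match: string, value: string) => {
    const kept = value
      .split(/\s+/)
      .filter((token) => token !== "" && !isHashedClass(token));
    return kept.length > 0 ? ` class="${kept.join(" ")}"` : "";
  });
}

const sample =
  '<div class="card css-1x2y3z"><span class="sc-bdVaJa">Hi</span></div>';
console.log(stripHashedClasses(sample));
// prints: <div class="card"><span>Hi</span></div>
```

A regex pass like this only handles double-quoted `class` attributes. For arbitrary markup, a real HTML parser would be the safer choice.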
DevDocs

Rank: 6 | Mentions: 2 | Stars: 1,868 | Activity: 8.5 | Language: TypeScript

Completely free, private, UI-based tech documentation MCP server. Designed with coders and software developers in mind. Easily integrates into Cursor, Windsurf, Cline, Roo Code, and the Claude Desktop App (by cyberagiinc)

Project mention: Show HN: We made an MCP Server so that Cursor can build anything from API Docs | news.ycombinator.com | 2025-03-24

Looks cool. The only similar one I've seen so far is: https://github.com/cyberagiinc/DevDocs
But every time I've tried to run DevDocs, I've had issues running it. Either the scraper or the MCP server fails to run.
api.consumet.org

Rank: 7 | Mentions: 0 | Stars: 1,470 | Activity: 5.2 | Language: TypeScript

A Modern Search Engine API for Anime, Movies/TVShows, Books, Light Novels, Manga, etc.
epublifier

Rank: 8 | Mentions: 6 | Stars: 798 | Activity: 7.0 | Language: TypeScript

Converts some webnovels to epub format

Project mention: Talking About Open Source - FAV0 Weekly #019 | dev.to | 2024-10-27

Converts websites to EPUB
linvo-scraper

Rank: 9 | Mentions: 6 | Stars: 622 | Activity: 3.0 | Language: TypeScript

LinkedIn automation bot with every possible kind of scraping. Valid for 2022; used by Linvo.io
HLTV

Rank: 10 | Mentions: 1 | Stars: 453 | Activity: 8.5 | Language: TypeScript

The unofficial HLTV Node.js API
mwoffliner

Rank: 11 | Mentions: 8 | Stars: 390 | Activity: 9.5 | Language: TypeScript

MediaWiki scraper: all your wiki articles in one highly compressed ZIM file

Project mention: Internet in a Box | news.ycombinator.com | 2025-04-27

Volunteer for Kiwix here (https://kiwix.org), we do a lot of offline Wikipedia stuff. I've personally worked on MWOffliner (https://github.com/openzim/mwoffliner) which scrapes MediaWikis, primarily Wikipedia.
We have apps for basically every platform. Our PWA even supports IE 11!
You can use the WP1 tool which I'm the primary maintainer of (https://wp1.openzim.org/#/selections/user) to create "selections" which let you have your own custom version of Wikipedia, using categories that you define, WikiProjects, or even custom SPARQL queries.
mkfd

Rank: 12 | Mentions: 2 | Stars: 186 | Activity: 8.7 | Language: TypeScript

RSS feed builder created with Bun 🥖 and Hono 🔥. Builds feeds from webpages, email folders, and REST API calls.

Project mention: Mkfd – RSS feed builder API created with Bun and Hono | news.ycombinator.com | 2024-11-17
scraper

Rank: 13 | Mentions: 12 | Stars: 114 | Activity: 0.0 | Language: TypeScript

Node.js web scraper. Includes a command-line interface, Docker container, Terraform module, and Ansible roles for distributed cloud scraping. Supported databases: SQLite, MySQL, PostgreSQL. Supported headless clients: Puppeteer, Playwright, Cheerio, JSDOM. (by get-set-fetch)
extension

Rank: 14 | Mentions: 1 | Stars: 84 | Activity: 0.0 | Language: TypeScript

Web scraping extension (by get-set-fetch)
vercel-metafy

Rank: 15 | Mentions: 1 | Stars: 52 | Activity: 2.3 | Language: TypeScript

Easily scrape metadata from websites as a service using Vercel.
freenom-auto-renew-domains

Rank: 16 | Mentions: 1 | Stars: 50 | Activity: 0.0 | Language: TypeScript

A scraper built with Puppeteer that auto-renews free domains on Freenom and sends a Discord message using a bot
webscraper-bot

Rank: 17 | Mentions: 2 | Stars: 29 | Activity: 4.7 | Language: TypeScript

Web scraping Discord bot that notifies you when a new item appears
botasaurus-starter

Rank: 18 | Mentions: 3 | Stars: 29 | Activity: 5.0 | Language: TypeScript

🚀 OFFICIAL STARTER TEMPLATE FOR BOTASAURUS SCRAPING FRAMEWORK 🤖
Philia

Rank: 19 | Mentions: 1 | Stars: 24 | Activity: 6.2 | Language: TypeScript

An easy to use imageboard scraper.
passport-appointment-bot

Rank: 20 | Mentions: 1 | Stars: 24 | Activity: 4.7 | Language: TypeScript

An automated bot designed to seamlessly book appointments for the renewal or creation of Swedish passports or national ID cards.
scrapyteer

Rank: 21 | Mentions: 1 | Stars: 19 | Activity: 4.0 | Language: TypeScript

Web crawling & scraping framework for Node.js built on top of the headless Chrome browser
forward-proxy-manager

Rank: 22 | Mentions: 1 | Stars: 13 | Activity: 2.8 | Language: TypeScript

Request distributor for web scraping
wallace-apple-dictionary

Rank: 23 | Mentions: 2 | Stars: 11 | Activity: 5.3 | Language: TypeScript

📖 macOS Dictionary for the readers of "Infinite Jest"

NOTE: The open-source projects on this list are ordered by number of GitHub stars. The number of mentions indicates repo mentions in the last 12 months or since we started tracking (Dec 2020).

TypeScript Scraper related posts

👽 Extract Thousands of Rows of Data Without Writing Code (Open Source)

1 project | dev.to | 17 Jul 2025
Why we started sampleapp.ai

1 project | dev.to | 23 Jun 2025
Scraperr – A Self Hosted Webscraper

6 projects | news.ycombinator.com | 11 May 2025
Show HN: Get structured website data with just a prompt

1 project | news.ycombinator.com | 20 Jan 2025
Show HN: Llms.txt Generator – Turn websites into a text file to feed to any LLM

2 projects | news.ycombinator.com | 21 Nov 2024
Maxun: Open-Source No-Code Web Data Extraction Platform

1 project | news.ycombinator.com | 8 Nov 2024
Maxun: Open-Source No-Code Web Data Extraction Platform

1 project | news.ycombinator.com | 3 Nov 2024

Index

What are some of the best open-source Scraper projects in TypeScript? This list will help you:

#	Project	Stars
1	firecrawl	52,769
2	cheerio	29,714
3	crawlee	19,240
4	maxun	13,540
5	llm-scraper	5,974
6	DevDocs	1,868
7	api.consumet.org	1,470
8	epublifier	798
9	linvo-scraper	622
10	HLTV	453
11	mwoffliner	390
12	mkfd	186
13	scraper	114
14	extension	84
15	vercel-metafy	52
16	freenom-auto-renew-domains	50
17	webscraper-bot	29
18	botasaurus-starter	29
19	Philia	24
20	passport-appointment-bot	24
21	scrapyteer	19
22	forward-proxy-manager	13
23	wallace-apple-dictionary	11