TypeScript Scraper

Open-source TypeScript projects categorized as Scraper

Top 23 TypeScript Scraper Projects

  1. firecrawl

    🔥 Turn entire websites into LLM-ready markdown or structured data. Scrape, crawl and extract with a single API.

    Project mention: Show HN: Get structured website data with just a prompt | news.ycombinator.com | 2025-01-20

    - Also, most of our work including /extract is open-source. Check it out here at https://github.com/mendableai/firecrawl

    That's all for now! Let us know any feedback on /extract.

  3. cheerio

    The fast, flexible, and elegant library for parsing and manipulating HTML and XML.

    Project mention: A JavaScript scraper for the Wikipedia Academy Award List. | dev.to | 2025-01-23

    Scraping the Academy Award winners listed on Wikipedia with cheerio and saving them to a CSV file.

  4. crawlee

    Crawlee—A web scraping and browser automation library for Node.js to build reliable crawlers. In JavaScript and TypeScript. Extract data for AI, LLMs, RAG, or GPTs. Download HTML, PDF, JPG, PNG, and other files from websites. Works with Puppeteer, Playwright, Cheerio, JSDOM, and raw HTTP. Both headful and headless mode. With proxy rotation.

    Project mention: Inside implementing SuperScraper with Crawlee. | dev.to | 2025-03-05

  5. maxun

    Open-source no-code web data extraction platform. Turn websites to APIs & spreadsheets with no-code robots in minutes.

    Project mention: Maxun: Open-Source No-Code Web Data Extraction Platform | news.ycombinator.com | 2024-11-08
  6. llm-scraper

    Turn any webpage into structured data using LLMs

    Project mention: llm-scraper VS parsera - a user suggested alternative | libhunt.com/r/llm-scraper | 2024-10-16
  7. api.consumet.org

    A Modern Search Engine API for Anime, Movies/TVShows, Books, Light Novels, Manga, etc.

  8. epublifier

    Converts some webnovels to epub format

    Project mention: Talking About Open Source - FAV0 Weekly #019 | dev.to | 2024-10-27

    Converts websites to EPUB

  10. linvo-scraper

    LinkedIn automation bot with extensive scraping support. Valid for 2022; used by Linvo.io.

  11. HLTV

    The unofficial HLTV Node.js API

  12. DevDocs

    Completely free, private, UI-based tech documentation MCP server. Designed with coders and software developers in mind. Easily integrates into Cursor, Windsurf, Cline, Roo Code, and the Claude Desktop App (by cyberagiinc)

    Project mention: Show HN: We made an MCP Server so that Cursor can build anything from API Docs | news.ycombinator.com | 2025-03-24

    Looks cool. The only similar one I've seen so far is: https://github.com/cyberagiinc/DevDocs

    But every time I've tried to run DevDocs, I've had issues: either the scraper or the MCP server fails to run.

  13. mwoffliner

    MediaWiki scraper: all your wiki articles in one highly compressed ZIM file

  14. scraper

    Node.js web scraper. Includes a command-line tool, Docker container, Terraform module, and Ansible roles for distributed cloud scraping. Supported databases: SQLite, MySQL, PostgreSQL. Supported headless clients: Puppeteer, Playwright, Cheerio, JSDOM. (by get-set-fetch)

  15. extension

    web scraping extension (by get-set-fetch)

  16. freenom-auto-renew-domains

    A scraper built with Puppeteer that auto-renews free domains on Freenom and sends a Discord message via a bot

  17. vercel-metafy

    Easily scrape metadata from websites as a service using Vercel.

  18. mkfd

    RSS feed builder created with Bun 🥖 and Hono 🔥. Builds feeds from webpages and/or REST API calls

    Project mention: Mkfd – RSS feed builder API created with Bun and Hono | news.ycombinator.com | 2024-11-17
  19. webscraper-bot

    Web scraping Discord bot that notifies you when a new item appears

    Project mention: I built my first SaaS - NotiFast | dev.to | 2024-06-22

    This is the second version of this bot. The first approach was webscraper-bot, which I built because I needed to be notified about new rental apartments quickly (more about that in this post). Some people started discovering the bot, and after a few months, I had around 100 users, but there was one big problem. Over 90% of the users didn't manage to create a single scraping bot because it required a query selector to be inserted. So how I interpreted it was:

  20. Philia

    An easy to use imageboard scraper.

  21. passport-appointment-bot

    An automated bot designed to seamlessly book appointments for the renewal or creation of Swedish passports or national ID cards.

  22. botasaurus-starter

    🚀 OFFICIAL STARTER TEMPLATE FOR BOTASAURUS SCRAPING FRAMEWORK 🤖

  23. scrapyteer

    Web crawling & scraping framework for Node.js on top of headless Chrome browser

  24. forward-proxy-manager

    Request distributor for web scraping

  25. wallace-apple-dictionary

    📖 macOS Dictionary for the readers of "Infinite Jest"

NOTE: The open-source projects on this list are ordered by number of GitHub stars. The number of mentions indicates repo mentions in the last 12 months or since we started tracking (Dec 2020).
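
Several entries above, webscraper-bot in particular, describe notifying a user when a new item appears on a page. One simple way this could work is to remember the identifiers of items already seen and report only unseen ones on each poll. The sketch below is an illustration under that assumption and is not taken from any listed project; `ScrapedItem`, `NewItemDetector`, and the sample ids are hypothetical names.

```typescript
// Hypothetical item shape; a real bot would fill this from scraped page elements.
interface ScrapedItem {
  id: string; // stable identifier, e.g. the listing URL (assumption)
  title: string;
}

// Tracks which item ids have been seen and reports only unseen ones.
class NewItemDetector {
  private seen = new Set<string>();

  // Returns the items whose ids were not present in any earlier call,
  // and records them as seen.
  detect(items: ScrapedItem[]): ScrapedItem[] {
    const fresh: ScrapedItem[] = [];
    for (const item of items) {
      if (!this.seen.has(item.id)) {
        this.seen.add(item.id);
        fresh.push(item);
      }
    }
    return fresh;
  }
}

const detector = new NewItemDetector();

// First poll: every item is unseen, so all of them are returned.
const firstPoll = detector.detect([
  { id: "/flat/1", title: "Two-room apartment" },
  { id: "/flat/2", title: "Studio" },
]);

// Second poll: only the item with an unseen id is returned.
const secondPoll = detector.detect([
  { id: "/flat/2", title: "Studio" },
  { id: "/flat/3", title: "Loft" },
]);

console.log(firstPoll.length, secondPoll.map((item) => item.title));
```

A real bot would likely persist the seen set between runs and might treat the first poll as a silent seed rather than sending a notification for every existing item.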

TypeScript Scraper related posts

  • Show HN: Get structured website data with just a prompt

    1 project | news.ycombinator.com | 20 Jan 2025
  • Show HN: Llms.txt Generator – Turn websites into a text file to feed to any LLM

    2 projects | news.ycombinator.com | 21 Nov 2024
  • Maxun: Open-Source No-Code Web Data Extraction Platform

    1 project | news.ycombinator.com | 8 Nov 2024
  • Maxun: Open-Source No-Code Web Data Extraction Platform

    1 project | news.ycombinator.com | 3 Nov 2024
  • Maxun: Open Source No-Code Web Data Extraction Platform⚡️

    1 project | dev.to | 30 Oct 2024
  • Firecrawl: Turn entire websites into LLM-ready Markdown or structured data

    1 project | news.ycombinator.com | 6 Oct 2024
  • Overcoming Common Web Scraping Challenges with Firecrawl, an open-source AI tool

    1 project | dev.to | 27 Sep 2024

Index

What are some of the best open-source Scraper projects in TypeScript? This list will help you:

#   Project                     Stars
1   firecrawl                   31,797
2   cheerio                     29,221
3   crawlee                     17,200
4   maxun                       9,604
5   llm-scraper                 4,635
6   api.consumet.org            1,330
7   epublifier                  774
8   linvo-scraper               612
9   HLTV                        419
10  DevDocs                     350
11  mwoffliner                  326
12  scraper                     110
13  extension                   81
14  freenom-auto-renew-domains  51
15  vercel-metafy               49
16  mkfd                        63
17  webscraper-bot              29
18  Philia                      25
19  passport-appointment-bot    24
20  botasaurus-starter          23
21  scrapyteer                  19
22  forward-proxy-manager       12
23  wallace-apple-dictionary    11

Did you know that TypeScript is the most popular programming language based on number of references?