Crawlee Alternatives

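Similar projects and alternatives to crawlee. Each entry shows its mention count, stars, activity score, and primary language (where available), followed by a short description.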

TypeScript

1,303 97,944 9.9 TypeScript crawlee VS TypeScript

TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
steam-for-linux

463 4,108 2.0 crawlee VS steam-for-linux

Issue tracking for the Steam for Linux beta client
axios

438 103,985 8.4 JavaScript crawlee VS axios

Promise based HTTP client for the browser and node.js
Playwright

379 61,568 9.9 TypeScript crawlee VS Playwright

Playwright is a framework for Web Testing and Automation. It allows testing Chromium, Firefox and WebKit with a single API.
puppeteer

359 86,773 9.9 TypeScript crawlee VS puppeteer

Node.js API for Chrome
jq

306 25,063 0.0 C crawlee VS jq

Discontinued Command-line JSON processor [Moved to: https://github.com/jqlang/jq] (by stedolan)
Scrapy

180 50,824 9.6 Python crawlee VS Scrapy

Scrapy, a fast high-level web crawling & scraping framework for Python.
Cypress

174 46,129 9.8 JavaScript crawlee VS Cypress

Fast, easy and reliable testing for anything that runs in a browser.
Selenium WebDriver

63 29,245 9.9 Java crawlee VS Selenium WebDriver

A browser automation framework and ecosystem.
SheetJS js-xlsx

61 34,479 2.4 JavaScript crawlee VS SheetJS js-xlsx

📗 SheetJS Spreadsheet Data Toolkit -- New home https://git.sheetjs.com/SheetJS/sheetjs
jsdom

55 19,954 7.8 JavaScript crawlee VS jsdom

A JavaScript implementation of various web standards, for use with Node.js
pup

52 7,998 0.0 HTML crawlee VS pup

Parsing HTML at the command line
cheerio

50 27,749 9.7 TypeScript crawlee VS cheerio

The fast, flexible, and elegant library for parsing and manipulating HTML and XML.
undetected-chromedriver

40 8,066 7.1 Python crawlee VS undetected-chromedriver

Custom Selenium Chromedriver | Zero-Config | Passes ALL bot mitigation systems (like Distil / Imperva / DataDome / Cloudflare IUAM)
colly

39 22,165 6.0 Go crawlee VS colly

Elegant Scraper and Crawler Framework for Golang
oclif

34 8,799 9.4 TypeScript crawlee VS oclif

CLI for generating, building, and releasing oclif CLIs. Built by Salesforce.
chrome-aws-lambda

12 3,136 0.0 TypeScript crawlee VS chrome-aws-lambda

Chromium Binary for AWS Lambda and Google Cloud Functions
estela

10 153 8.1 Python crawlee VS estela

estela, an elastic web scraping cluster 🕸
NectarJS

6 3,540 0.0 C++ crawlee VS NectarJS

🔱 Javascript's God Mode. No VM. No Bytecode. No GC. Just native binaries.
awesome-puppeteer

1 2,318 3.0 crawlee VS awesome-puppeteer

A curated list of awesome puppeteer resources.

NOTE: The number of mentions on this list counts mentions in common posts plus user-suggested alternatives. Hence, a higher number suggests a better crawlee alternative or a more similar project.

crawlee reviews and mentions

Posts with mentions or reviews of crawlee. We have used some of these posts to build our list of alternatives and similar projects. The last one was on 2024-04-01.

How to scrape Amazon products
4 projects | dev.to | 1 Apr 2024

In this guide, we'll be extracting information from Amazon product pages using the power of TypeScript in combination with the Cheerio and Crawlee libraries. We'll explore how to retrieve and extract detailed product data such as titles, prices, image URLs, and more from Amazon's vast marketplace. We'll also discuss handling potential blocking issues that may arise during the scraping process.
Automating Data Collection with Apify: From Script to Deployment
4 projects | dev.to | 17 Mar 2024

Previously, the Apify SDK offered a blend of crawling functionalities and Actor building features. However, a recent update separated these functionalities into two distinct libraries: Crawlee and Apify SDK v3. Crawlee now houses the web scraping and crawling tools, while Apify SDK v3 focuses solely on features specific to building Actors for the Apify platform. This distinction allows for a clear separation of concerns and enhances the development experience for various use cases.
Launching Crawlee Blog: Your Node.js resource hub for web scraping and automation.
1 project | dev.to | 26 Feb 2024

v3.1 added an error tracker for analyzing and summarizing failed requests.
Anything like scrapy in other languages?
1 project | /r/webscraping | 10 Dec 2023

Closest I found was https://crawlee.dev/ for Javascript/Typescript although still seems not on the level of scrapy. I didn't try it.
What is Playwright?
5 projects | dev.to | 11 Oct 2023

Also, you can go even further and develop your own web scraper with Crawlee, a Node.js library that helps you pass those challenges automatically using Puppeteer or Playwright. Crawlee helps you build reliable scrapers fast. Quickly scrape data, store it, and avoid getting blocked with headless browsers, smart proxy rotation, and auto-generated human-like headers and fingerprints.
Best web scraping framework to learn
1 project | /r/webscraping | 12 Jul 2023

https://crawlee.dev/ is very good. You can easily run your spiders in the cloud with Apify, and Node.js/Puppeteer has many advantages over Python/Selenium.
Deep diving into Apify world
1 project | /r/thewebscrapingclub | 2 Apr 2023

Apify is a web scraping platform that supports developers from the coding stage onward, having developed its own open-source Node.js web scraping library called Crawlee. On the platform, you can run and monitor your scrapers and also sell them in its store.
Build and run your Python web scrapers in the cloud with Apify SDK for Python
2 projects | /r/webscraping | 14 Mar 2023

You can use our open source tools (not only this one, but also Crawlee for example) to build your scrapers and run them on your computer, and then if you need to run them in the cloud, you can upload them to the Apify platform and run them there. Our free tier is good enough for smaller web scraping and automation projects, and if you need more compute resources or proxies, you can go for one of our paid tiers.
How to scrape the web with Puppeteer in 2023
5 projects | dev.to | 7 Mar 2023

Comfortable scraping and crawling with Puppeteer is better done together with another library. This library is called Crawlee, and it's also free and open-source, just like Puppeteer. Crawlee wraps Puppeteer and grants access to all of Puppeteer's functionality, but also provides useful crawling and scraping tools like error handling, queue management, storages, proxies or fingerprints out of the box.
What's the most advanced, best maintained, most fully featured web scraper for node.js
2 projects | /r/node | 11 Feb 2023

Stats

Basic crawlee repo stats

Mentions

Stars: 12,044

Activity: 9.8

Last Commit: 7 days ago

apify/crawlee is an open-source project licensed under the Apache License 2.0, which is an OSI-approved license.

The primary programming language of crawlee is TypeScript.
