Crawlee Alternatives
Similar projects and alternatives to crawlee
-
Playwright
Playwright is a framework for Web Testing and Automation. It allows testing Chromium, Firefox and WebKit with a single API.
-
SheetJS js-xlsx
📗 SheetJS Spreadsheet Data Toolkit -- New home https://git.sheetjs.com/SheetJS/sheetjs
-
undetected-chromedriver
Custom Selenium Chromedriver | Zero-Config | Passes ALL bot mitigation systems (like Distil / Imperva / DataDome / Cloudflare IUAM)
-
crawlee-python
Crawlee—A web scraping and browser automation library for Python to build reliable crawlers. Extract data for AI, LLMs, RAG, or GPTs. Download HTML, PDF, JPG, PNG, and other files from websites. Works with BeautifulSoup, Playwright, and raw HTTP. Both headful and headless mode. With proxy rotation.
-
pwa-asset-generator
Automates PWA asset generation and image declaration. Automatically generates icon and splash screen images, favicons and mstile images. Updates manifest.json and index.html files with the generated images according to Web App Manifest specs and Apple Human Interface guidelines.
crawlee discussion
crawlee reviews and mentions
-
Scraperr – A Self Hosted Webscraper
If you're a fan of Playwright check out Crawlee [0]. I've used it for a few small projects and it's been faster for me to get what I've needed done.
[0] https://crawlee.dev/
-
How to scrape TikTok using Python
uvx crawlee['cli'] create tiktok-crawlee --crawler-type playwright --http-client httpx --package-manager uv --apify --start-url 'https://crawlee.dev'
-
Inside implementing SuperScraper with Crawlee.
-
12 tips on how to think like a web scraping expert
If you like the blog so far, please consider giving Crawlee a star on GitHub. It helps us reach and help more developers.
-
11 best open-source web crawlers and scrapers in 2024
Check out Crawlee
-
Web scraping with GPT-4o: powerful but expensive
-
Current problems and mistakes of web scraping in Python and tricks to solve them!
Developed by Apify, it is a Python adaptation of their well-known JavaScript framework Crawlee, which was first released on Jul 9, 2019.
-
Show HN: Crawlee for Python – a web scraping and browser automation library
The main advantage (for now) is that the library has a single interface for both HTTP and headless browsers, and bundled auto scaling. You can write your crawlers using the same base abstraction, and the framework takes care of this heavy lifting. Developers of scrapers shouldn't need to reinvent the wheel, and just focus on building the "business" logic of their scrapers. Having said that, if you wrote your own crawling library, the motivation to use Crawlee might be lower, and that's fair enough.
Please note that this is the first release, and we'll keep adding many more features as we go, including anti-blocking, adaptive crawling, etc. To see where this might go, check https://github.com/apify/crawlee
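The "single interface" idea in the comment above can be illustrated with a small toy model. This is only a sketch of the general pattern, not Crawlee's actual API: `BaseCrawler`, `FakeHttpCrawler`, `FakeBrowserCrawler`, `Context`, and `attach_logic` are hypothetical names made up for this example, and the fetch steps are stand-ins that never touch the network.

```python
import asyncio
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import Awaitable, Callable, Optional


@dataclass
class Context:
    """What a handler receives, regardless of how the page was fetched."""
    url: str
    body: str
    fetched_with: str


Handler = Callable[[Context], Awaitable[None]]


class BaseCrawler(ABC):
    """Hypothetical shared abstraction: handler registration and dispatch."""

    def __init__(self) -> None:
        self._handler: Optional[Handler] = None
        self.results: list = []

    def default_handler(self, func: Handler) -> Handler:
        # Register the user's "business logic" callback.
        self._handler = func
        return func

    @abstractmethod
    async def _fetch(self, url: str) -> Context:
        """Backend-specific step: plain HTTP or a headless browser."""

    async def run(self, urls: list) -> None:
        if self._handler is None:
            raise RuntimeError("no handler registered")
        for url in urls:
            await self._handler(await self._fetch(url))


class FakeHttpCrawler(BaseCrawler):
    async def _fetch(self, url: str) -> Context:
        # Stand-in for an HTTP request; no network access here.
        return Context(url, f"<html>{url}</html>", "http")


class FakeBrowserCrawler(BaseCrawler):
    async def _fetch(self, url: str) -> Context:
        # Stand-in for rendering the page in a headless browser.
        return Context(url, f"<html>{url} (rendered)</html>", "browser")


def attach_logic(crawler: BaseCrawler) -> BaseCrawler:
    # The same handler is attached unchanged to either backend.
    @crawler.default_handler
    async def handle(ctx: Context) -> None:
        crawler.results.append(
            {"url": ctx.url, "via": ctx.fetched_with, "size": len(ctx.body)}
        )

    return crawler


def demo() -> list:
    out = []
    for crawler in (FakeHttpCrawler(), FakeBrowserCrawler()):
        attach_logic(crawler)
        asyncio.run(crawler.run(["https://crawlee.dev"]))
        out.extend(crawler.results)
    return out
```

Calling `demo()` runs the same handler through both toy backends. Only `_fetch` differs between them, which is the sense in which scraper authors can focus on the handler and leave the fetching strategy to the framework.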
-
Announcing Crawlee Python: Now you can use Python to build reliable web crawlers
import asyncio

from crawlee.playwright_crawler import PlaywrightCrawler, PlaywrightCrawlingContext


async def main() -> None:
    # Create a crawler instance
    crawler = PlaywrightCrawler(
        # headless=False,
        # browser_type='firefox',
    )

    @crawler.router.default_handler
    async def request_handler(context: PlaywrightCrawlingContext) -> None:
        data = {
            "request_url": context.request.url,
            "page_url": context.page.url,
            "page_title": await context.page.title(),
            "page_content": (await context.page.content())[:10000],
        }
        await context.push_data(data)

    await crawler.run(["https://crawlee.dev"])


if __name__ == "__main__":
    asyncio.run(main())
-
How Crawlee uses tiered proxies to avoid getting blocked
If you like reading this blog, we would be really happy if you gave Crawlee a star on GitHub!
Stats
apify/crawlee is an open source project licensed under the Apache License 2.0, which is an OSI-approved license.
The primary programming language of crawlee is TypeScript.