| | parsel | got-scraping |
|---|---|---|
| Mentions | 5 | 3 |
| Stars | 1,080 | 400 |
| Growth | 1.5% | 4.3% |
| Activity | 6.5 | 6.3 |
| Latest commit | 13 days ago | about 1 month ago |
| Language | Python | TypeScript |
| License | BSD 3-clause "New" or "Revised" License | - |
- Stars - the number of stars that a project has on GitHub.
- Growth - month-over-month growth in stars.
- Activity - a relative number indicating how actively a project is being developed; recent commits have higher weight than older ones. For example, an activity of 9.0 indicates that a project is among the top 10% of the most actively developed projects that we are tracking.
parsel
- What web scraping tools do y'all use?
An alternative for beautifulsoup is https://github.com/scrapy/parsel also from the scrapy team.
- 13 ways to scrape any public data from any website
```python
variable.css(".X5PpBb::text").get()                     # returns a text value
variable.css(".gs_a").xpath("normalize-space()").get()  # https://github.com/scrapy/parsel/issues/192#issuecomment-1042301716
variable.css(".gSGphe img::attr(srcset)").get()         # returns an attribute value
variable.css(".I9Jtec::text").getall()                  # returns a list of string values
variable.xpath("th/text()").get()                       # returns a text value using XPath
```
- Web Scraping With Python (An Ultimate Guide)
Something I don't see discussed when this topic is brought up is that Scrapy's HTML parsing library, parsel, can be installed separately from scrapy itself. You can use it in place of beautifulsoup and, imo, it's much easier to use.
- Looking for a nicer html parser to use with python other than BeautifulSoup4
- How to Crawl the Web with Scrapy
got-scraping
- How do I scrape external web pages and then insert them as records into KB table?
You could do the scraping yourself by hosting your own ServiceNow MID Server, making a bespoke scraping script on top of an existing library (example: got-scraping), then calling the scraper script via IntegrationHub and a Script Step.
- How to Crawl the Web with Scrapy
While I agree that Scrapy is a great tool for beginner tutorials and easy entry into scraping, it's becoming difficult to use it in real world scenarios because almost all the large players now employ some anti-bot or anti-scraping protection.
A great example above all is Cloudflare. You simply can't convince Cloudflare you're a human with Scrapy alone. Scrapy has only experimental support for HTTP/2 and does not support proxies over HTTP/2 (https://github.com/scrapy/scrapy/issues/5213). Yet, all browsers use HTTP/2 now, which means all normal users use HTTP/2... You get the point.
What we use now is Got Scraping (https://github.com/apify/got-scraping). It's a special-purpose extension of Got (an HTTP client with 18 million weekly downloads) that masks its HTTP communication as if it were coming from a real browser. Of course, this will not get you as far as Puppeteer or Playwright (headless browsers), but it improved our scraping tremendously. If you need a full crawling library, see the Apify SDK (https://sdk.apify.com), which uses Got Scraping under the hood.
- Show HN: Web scraping focused HTTP client for Node.js
What are some alternatives?
parsel-cli - cli for evaluating css and xpath selectors
google-search-results-php - Google Search Results PHP API via Serp Api
soupsieve - A modern CSS selector implementation for BeautifulSoup
header-generator - Node.js package for generating browser-like headers.
insomnia - The open-source, cross-platform API client for GraphQL, REST, WebSockets, SSE and gRPC. With Cloud, Local and Git storage.
Scrapy - Scrapy, a fast high-level web crawling & scraping framework for Python.
CSS-Minifier - This CSS Minifier tries to reduce the length of code by renaming class names and id names.
colly - Elegant Scraper and Crawler Framework for Golang
author-tools - Author Tools
puppeteer - Node.js API for Chrome
FnF-Spritesheet-and-XML-Maker - A Friday Night Funkin' mod making helper tool that allows you to generate XML files and spritesheets from individual PNGs
rtila-releases