js2xml
Convert JavaScript code to an XML document (by scrapinghub)
parsel
Parsel lets you extract data from XML/HTML documents using XPath or CSS selectors (by scrapy)
| js2xml | parsel |
---|---|---|
Mentions | 1 | 5 |
Stars | 183 | 1,080 |
Growth | 0.0% | 1.5% |
Activity | 3.2 | 6.5 |
Latest commit | about 2 years ago | 12 days ago |
Language | Python | Python |
License | MIT License | BSD 3-clause "New" or "Revised" License |
The number of mentions indicates the total number of mentions that we've tracked plus the number of user-suggested alternatives.
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
js2xml
Posts with mentions or reviews of js2xml.
We have used some of these posts to build our list of alternatives
and similar projects.
- How do you parse a javascript script with python?
I've used js2xml with great success in the past. You can feed its output to an HTML/XML parser (like parsel or beautifulsoup) and reliably extract JS object values.
parsel
Posts with mentions or reviews of parsel.
We have used some of these posts to build our list of alternatives
and similar projects. The last one was on 2022-10-07.
- What web scraping tools do ya'll use?
An alternative for beautifulsoup is https://github.com/scrapy/parsel also from the scrapy team.
- 13 ways to scrape any public data from any website
variable.css(".X5PpBb::text").get()  # returns a text value
variable.css(".gs_a").xpath("normalize-space()").get()  # https://github.com/scrapy/parsel/issues/192#issuecomment-1042301716
variable.css(".gSGphe img::attr(srcset)").get()  # returns an attribute value
variable.css(".I9Jtec::text").getall()  # returns a list of string values
variable.xpath('th/text()').get()  # returns a text value using XPath
- Web Scraping With Python (An Ultimate Guide)
Something I don't see discussed when this topic is brought up is that Scrapy's HTML parsing library, parsel, can be installed separately from scrapy itself. You can use it in place of beautifulsoup and, imo, it's much easier to use.
- Looking for a nicer html parser to use with python other than BeautifulSoup4
- How to Crawl the Web with Scrapy
What are some alternatives?
When comparing js2xml and parsel you can also consider the following projects:
Django_blog - A blog application made with Django and bootstrap
parsel-cli - cli for evaluating css and xpath selectors