Top 23 Xpath Open-Source Projects
-
HtmlAgilityPack
Html Agility Pack (HAP) is a free and open-source HTML parser written in C# to read/write DOM and supports plain XPATH or XSLT. It is a .NET code library that allows you to parse "out of the web" HTML files.
-
xidel
Command line tool to download and extract data from HTML/XML pages or JSON-APIs, using CSS, XPath 3.0, XQuery 3.0, JSONiq or pattern matching. It can also create new or transformed XML/HTML/JSON documents.
-
camaro
camaro is a utility to transform XML to JSON, using Node.js bindings to the native XML parser pugixml, one of the fastest XML parsers around.
-
dude
dude (uncomplicated data extraction): a simple framework for writing web scrapers using Python decorators
-
ftr-site-config
Site-specific article extraction rules to aid content extractors, feed readers, and 'read later' applications.
-
Meeseeks
An Elixir library for parsing and extracting data from HTML and XML with CSS or XPath selectors.
-
nokolexbor
High-performance HTML5 parser for Ruby based on Lexbor, with support for both CSS selectors and XPath.
-
internettools
XPath/XQuery 3.1 interpreter for Pascal with compatibility modes for XPath 2.0/XQuery 1.0/3.0, custom and JSONiq extensions, pattern matching, XML/HTML/JSON parsers and classes for HTTP/S requests
and already tried: pugixml
Project mention: Script invoking an Online Port Scan of your external IP, to test your firewall and port forwarder. | /r/PowerShell | 2023-07-06
Pretty straightforward. It uses an online port scanner, in this case https://www.speedguide.net/portscan.php, and parses the replies using HtmlAgilityPack.
Some time ago, I started a project called "xq", which is a command-line XML and HTML beautifier and content extractor written in Go. Using this project as an example, I want to show what I did to make it a little bit more discoverable and usable by other people.
You could try Xidel[1]. It supports JSON, XML and HTML using XPath/XQuery 3.1
It has some extensions to the standard that are pretty nice (JSONiq, CSS selectors, html “template” matching), but you can limit it to just standard XPath/XQuery if you like.
I recommend getting the nightly v .99 build if you give it a try; the stable .98 version is pretty old, and I've had no issues with .99.
Back in the day, when every OTA (online travel agent) and airline used XML for their APIs, we had to integrate them into an API gateway to unify their API schemas and workflows.
We wrote a small package[1] (using pugixml) to transform XML to JSON using a custom XPath template syntax. It made our job much easier.
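To illustrate the general idea of a path-template transform, here is a minimal Python sketch. It is not camaro's API or template syntax: the `transform` helper, the template shape (dicts for objects, `[path, sub_template]` pairs for arrays, strings for leaf paths), and the sample flight XML are all assumptions made up for this example. It also uses only the limited XPath subset supported by the standard library's `xml.etree.ElementTree`.

```python
import json
import xml.etree.ElementTree as ET


def transform(xml_text, template):
    """Fill a dict-shaped template whose leaves are ElementTree path expressions."""
    root = ET.fromstring(xml_text)
    return _apply(root, template)


def _apply(node, template):
    if isinstance(template, str):
        # A string leaf is a path: return the text of the first match, or None.
        return node.findtext(template)
    if isinstance(template, list):
        # [path, sub_template]: apply sub_template to every element matching path.
        path, sub = template
        return [_apply(child, sub) for child in node.findall(path)]
    # A dict maps output keys to nested templates.
    return {key: _apply(node, sub) for key, sub in template.items()}


# Hypothetical input document, invented for this sketch.
SAMPLE_XML = """
<flights>
  <flight><from>SGN</from><to>HAN</to></flight>
  <flight><from>HAN</from><to>DAD</to></flight>
</flights>
"""

# Hypothetical template: the output JSON shape mirrors the template shape.
TEMPLATE = {
    "flights": ["flight", {"origin": "from", "destination": "to"}],
}

result = transform(SAMPLE_XML, TEMPLATE)
print(json.dumps(result, indent=2))
```

The appeal of this style is that the template doubles as a description of the output schema, so unifying several differently shaped XML feeds mostly means writing one template per feed.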
Project mention: Webscraping beginner here ready to start leveling up to intermediate. Looking for some good webscraping repositories (e.g any of your GitHub repos/projects) that I can use as learning tools, and general recommendations for what to do next | /r/webscraping | 2023-05-08
Please check https://github.com/roniemartinez/dude
As far as full-text caching... maybe a self-hosted instance or paid version of the FiveFilters Full-Text RSS service would work. You can integrate that into whatever aggregator you want.
Project mention: Ruby 3.3's YJIT: Faster While Using Less Memory | news.ycombinator.com | 2023-12-18
Yes, we ended up replacing Nokogiri with Nokolexbor, our own port of the lexbor parser, which has almost full compatibility with the Nokogiri APIs while being around 5x faster: https://github.com/serpapi/nokolexbor
Not XPath, but for folks interested in querying (rather than walking) syntax trees for arbitrary nodes, this is also a cool feature of tree-sitter[1]. It uses a scheme-like syntax, and it’s impressively efficient.
And in terms of XPath, for folks using a JS stack, fontoxpath[2] supports a DOM facade adapter interface which allows for querying any arbitrary tree-like structure, so it could certainly handle the same use case.
1: https://tree-sitter.github.io/tree-sitter/using-parsers#patt...
Xpath related posts
- Xsel: A XPath 1.0 Go library/CLI that can query XML, HTML, and JSON documents
- Using XPath in 2023
- can someone suggest a good rss reader for android please?
- Script invoking an Online Port Scan of your external IP, to test your firewall and port forwarder.
- Script to test the state of certain ports on your firewall from the outside
- Fast XML to JSON using xpath templates in node
- Copy Pasting Email Content Issue
Index
What are some of the best open-source Xpath projects? This list will help you:
Rank | Project | Stars |
---|---|---|
1 | jsoup | 10,606 |
2 | PugiXML | 3,802 |
3 | Ono | 2,599 |
4 | HtmlAgilityPack | 2,545 |
5 | DiDOM | 2,173 |
6 | parsel | 1,074 |
7 | Fuzi | 1,057 |
8 | xq | 745 |
9 | htmlquery | 696 |
10 | xidel | 650 |
11 | xpath | 649 |
12 | camaro | 547 |
13 | dude | 412 |
14 | eXist | 408 |
15 | xmlquery | 402 |
16 | sweet_xml | 353 |
17 | ftr-site-config | 348 |
18 | Meeseeks | 308 |
19 | jsonquery | 240 |
20 | nokolexbor | 153 |
21 | fs2-data | 140 |
22 | fontoxpath | 125 |
23 | internettools | 115 |