jsoup

jsoup: the Java HTML parser, built for HTML editing, cleaning, scraping, and XSS safety. (by jhy)

Jsoup Alternatives

Similar projects and alternatives to jsoup

  1. rust

    2,958 jsoup VS rust

    Empowering everyone to build reliable and efficient software.

  2. SaaSHub

    SaaSHub - Software Alternatives and Reviews. SaaSHub helps you find the best software and product alternatives

  3. tachiyomi

    583 jsoup VS tachiyomi

    Discontinued. Free and open source manga reader for Android.

  4. LibreTranslate

    Free and Open Source Machine Translation API. Self-hosted, offline capable and easy to set up.

  5. gron

    73 jsoup VS gron

    Make JSON greppable!

  6. murex

    68 jsoup VS murex

    A smarter shell and scripting environment with advanced features designed for usability, safety and productivity (eg smarter DevOps tooling)

  7. Guava

    65 jsoup VS Guava

    Google core libraries for Java

  8. pup

    52 jsoup VS pup

    Parsing HTML at the command line

  9. Disruptor

    36 jsoup VS Disruptor

    High Performance Inter-Thread Messaging Library

  10. website

    32 jsoup VS website

    Javalin website source code (by javalin)

  11. libsodium

    A modern, portable, easy to use crypto library.

  12. xidel

    22 jsoup VS xidel

    Command line tool to download and extract data from HTML/XML pages or JSON-APIs, using CSS, XPath 3.0, XQuery 3.0, JSONiq or pattern matching. It can also create new or transformed XML/HTML/JSON documents.

  13. Apache PDFBox

    Mirror of Apache PDFBox

  14. JsonPath

    12 jsoup VS JsonPath

    Java JsonPath implementation

  15. AssertJ

    15 jsoup VS AssertJ

    Fluent testing assertions for Java and the JVM

  16. Apache Nutch

    Apache Nutch is an extensible and scalable web crawler

  17. playwright-java

    Java version of the Playwright testing and automation library

  18. Crawler4j

    3 jsoup VS Crawler4j

    Open Source Web Crawler for Java

  19. lol-html

    8 jsoup VS lol-html

    Low output latency streaming HTML parser/rewriter with CSS selector-based API

  20. microhttp

    Fast, scalable, self-contained, single-threaded Java web server

  21. Sparkler

    0 jsoup VS Sparkler

    Spark-Crawler: Apache Nutch-like crawler that runs on Apache Spark.

NOTE: The number of mentions on this list counts mentions in common posts plus user-suggested alternatives. Hence, a higher number means a better jsoup alternative or higher similarity.

jsoup reviews and mentions

Posts with mentions or reviews of jsoup. We have used some of these posts to build our list of alternatives and similar projects. The last one was on 2026-05-27.
  • Best Java Web Scraping Libraries
    4 projects | dev.to | 27 May 2026
    jsoup is usually the best first option for static HTML extraction. Its official website describes it as a Java HTML parser for real-world HTML and XML, supporting URL fetching, parsing, DOM traversal, CSS selectors, and XPath selectors (jsoup official documentation).
  • Jsoup - Java HTML parser and web scraping library
    1 project | dev.to | 30 Dec 2025
    Document doc = Jsoup.connect("http://jsoup.org").get();
    Element link = doc.select("a").first();
    String relHref = link.attr("href");     // "/"
    String absHref = link.attr("abs:href"); // "http://jsoup.org/"
    // or: link.absUrl("href")
  • Parsing HTML with Jsoup - a web scraping guide
    1 project | dev.to | 30 Dec 2025
  • Reverse-Engineering Cookies with Ktor and Ksoup
    4 projects | dev.to | 6 Apr 2025
    On top of this, because it scrapes HTML content, I also needed to use the Jsoup and Ksoup libraries to parse the HTML content and extract the necessary values. Because it's built on top of Ktor, it's available on multiple platforms, meaning I needed to find different libraries to parse the HTML content on different platforms, and use different Ktor engines. For example, on Android, it uses the OkHttp engine, while on iOS, it uses the Darwin engine, and on the JVM, it uses the Java engine introduced in JDK 11. You can look at the full build.gradle.kts file to see the dependencies and how they are set up.
  • Embed a Full HTML Document Inline Using Shadow DOM
    2 projects | dev.to | 18 Sep 2024
    We use ColdFusion/CFML to generate valid HTML documents for PDF generation using jsoup & WKHTMLTOPDF. If the generated HTML content is simply output onto an existing webpage, the webpage becomes invalid (due to double DOCTYPE "inception") and the website's global CSS styles will pollute the preview.
  • FLaNK Stack Weekly for 20 June 2023
    34 projects | dev.to | 20 Jun 2023
  • Russia news visualisation on steroids
    2 projects | /r/datascience | 23 May 2023
    2e. The HTML parsing library is in app-kt. It's called JSoup https://jsoup.org/
  • Looking for direction, guidance on in-home call button.
    1 project | /r/AndroidStudio | 26 Feb 2023
    For parsing the webpage in Java or Kotlin you can use Jsoup
  • Web Scraping Google With Java
    1 project | dev.to | 29 Jan 2023
    Jsoup — It is a Java library that can be used for both extracting and parsing HTML.
  • How I archived 100 million PDF documents... (Part 1)
    6 projects | dev.to | 11 Jan 2023
    Finally, at this point, I was able to go through a bunch of webpages (parsing them in the process with JSoup), grab all the links that contained PDF files based on the file extension, then download them. Unsurprisingly, most of the pages (~60-80%) ended up being unavailable (404 Not Found and friends). After a quick cup of coffee, I got the 10,000 documents on my hard drive. This is when I realized that I have one more problem to solve.
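
One of the excerpts above reads a link's href in both relative ("/") and absolute ("http://jsoup.org/") form. As a rough illustration of what that resolution step involves, here is a minimal sketch that uses only the JDK's java.net.URI instead of jsoup. The class name AbsUrlSketch and the helper resolve are hypothetical names chosen for this example, and the sketch makes no claim about how jsoup implements abs:href internally.

```java
import java.net.URI;

public class AbsUrlSketch {
    // Hypothetical helper: resolve an href against the page's base URI
    // using the JDK's standard reference resolution.
    static String resolve(String baseUri, String href) {
        return URI.create(baseUri).resolve(href).toString();
    }

    public static void main(String[] args) {
        // A root-relative href resolved against the site root.
        System.out.println(resolve("http://jsoup.org", "/"));
        // An href that is already absolute is returned unchanged.
        System.out.println(resolve("http://jsoup.org", "https://example.com/a"));
    }
}
```

When the page is already parsed with jsoup, the library's own attr("abs:href") or absUrl("href") is the more natural choice, since it uses the base URI recorded for the document.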

Stats

Basic jsoup repo stats
32
11,365
9.1
8 days ago
