Python HTML Manipulation

Open-source Python projects categorized as HTML Manipulation

Top 10 Python HTML Manipulation Projects

  • GitHub repo xmltodict

    Python module that makes working with XML feel like you are working with JSON

    Project mention: Dict or List to store table like data | 2021-11-26

  • GitHub repo bleach

    Bleach is an allowed-list-based HTML sanitizing library that escapes or strips markup and attributes

    Project mention: mutation XSS via allowed math or svg; p or br; and style, title, noscript, script, textarea, noframes, iframe, | 2021-02-05

  • GitHub repo pyquery

    A jquery-like library for python

  • GitHub repo lxml

    The lxml XML toolkit for Python

    Project mention: How do i go about building a vidoe conferencing app? | 2021-08-20

    Generally, I'm already using Python to glue together things like OpenCV or libxml, which do the heavy-lifting, and taking advantage of how things like Qt's QImage release Python's Global Interpreter Lock, allowing me to load and process images on a background thread, so the Python code itself is usually already I/O-bound, but yes. If the Python code would become a bottleneck, it helps with that too.

  • GitHub repo xhtml2pdf

    A library for converting HTML into PDFs using ReportLab

  • GitHub repo html5lib

    Standards-compliant library for parsing and serializing HTML documents and fragments in Python

    Project mention: Pydantic Factories | 2021-11-25

    Neither did html5lib.

  • GitHub repo gazpacho

    🥫 The simple, fast, and modern web scraping library

    Project mention: Ask HN: What are some tools / libraries you built yourself? | 2021-05-16

    I've been working on gazpacho [1] for last two years.

    It's a general purpose web scraping library for Python that replaces BeautifulSoup + requests for most projects.

    Just surpassed ~2K downloads every week!


  • GitHub repo untangle

    Converts XML to Python objects

  • GitHub repo MarkupSafe

    Safely add untrusted strings to HTML/XML markup.

  • GitHub repo xmldataset

    xmldataset: xml parsing made easy 🗃️

NOTE: The open-source projects on this list are ordered by number of GitHub stars. The number of mentions indicates repo mentions in the last 12 months or since we started tracking (Dec 2020). The latest post mention was on 2021-11-26.


What are some of the best open-source HTML Manipulation projects in Python? This list will help you:

 #  Project      Stars
 1  xmltodict    4,650
 2  bleach       2,276
 3  pyquery      2,069
 4  lxml         2,009
 5  xhtml2pdf    1,852
 6  html5lib       943
 7  gazpacho       614
 8  untangle       536
 9  MarkupSafe     443
10  xmldataset      72