HTML Manipulation

Open-source projects categorized as HTML Manipulation

Top 13 HTML Manipulation Open-Source Projects

  • xmltodict

    Python module that makes working with XML feel like you are working with JSON

  • bleach

    Bleach is an allowed-list-based HTML sanitizing library that escapes or strips markup and attributes

  • Project mention: What's your favorite alternative to bleach for sanitizing HTML? | /r/django | 2023-06-06

    I noticed via the changelog for Django 4.2.2 that bleach is deprecated (Django removed mention of it from their docs).

  • lxml

    The lxml XML toolkit for Python

  • pyquery

    A jQuery-like library for Python

  • xhtml2pdf

    A library for converting HTML into PDFs using ReportLab

  • react-zoom-pan-pinch

    🖼 React library to support easy zoom, pan, and pinch on various HTML DOM elements like <img> and <div>

  • html5lib

    Standards-compliant library for parsing and serializing HTML documents and fragments in Python

  • selectolax

    Python binding to Modest and Lexbor engines (fast HTML5 parser with CSS selectors).

  • Project mention: GitHub – GSA/code-gov: An informative repo for all Code.gov repos | news.ycombinator.com | 2023-09-09

    https://github.com/rushter/selectolax#simple-benchmark

    Apache Nutch is a Java-based web crawler which supports e.g. CommonCrawl (which backs various foundational LLMs): https://en.wikipedia.org/wiki/Apache_Nutch#Search_engines_bu... But extruct extracts more types of metadata and data than Nutch, AFAIU: https://github.com/scrapinghub/extruct

    datasette-graphql adds a GraphQL HTTP API to a SQLite database:

  • gazpacho

    🥫 The simple, fast, and modern web scraping library

  • untangle

    Converts XML to Python objects

  • MarkupSafe

    Safely add untrusted strings to HTML/XML markup.

  • xmldataset

    xmldataset: xml parsing made easy 🗃️

  • cssutils

NOTE: The open-source projects on this list are ordered by number of GitHub stars. The number of mentions indicates repo mentions in the last 12 months or since we started tracking (Dec 2020).
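
Several entries above (bleach, MarkupSafe) deal with putting untrusted strings into markup safely. As a rough illustration of the underlying idea only, here is a minimal sketch that uses Python's standard-library `html.escape` as a stand-in; it is not the API of MarkupSafe or bleach, and the `render_comment` helper is a hypothetical name introduced for this example.

```python
from html import escape


def render_comment(author: str, body: str) -> str:
    """Build an HTML snippet, escaping both untrusted inputs first.

    Hypothetical helper for illustration; not part of any listed library.
    """
    # escape() replaces &, <, > and (by default) both quote characters
    # with HTML entities, so the values cannot open or close tags.
    safe_author = escape(author)
    safe_body = escape(body)
    return f"<p><b>{safe_author}</b>: {safe_body}</p>"


if __name__ == "__main__":
    print(render_comment("Ann", "<script>alert(1)</script>"))
```

Escaping like this neutralizes all markup. An allow-list sanitizer such as bleach takes a different approach and keeps a chosen set of tags and attributes while escaping or stripping the rest.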

Index

What are some of the best open-source HTML Manipulation projects? This list will help you:

# Project Stars
1 xmltodict 5,380
2 bleach 2,615
3 lxml 2,571
4 pyquery 2,271
5 xhtml2pdf 2,177
6 react-zoom-pan-pinch 1,307
7 html5lib 1,095
8 selectolax 967
9 gazpacho 730
10 untangle 607
11 MarkupSafe 598
12 xmldataset 77
13 cssutils -
