Python Specific Formats Processing

Open-source Python projects categorized as Specific Formats Processing

Top 23 Python Specific Formats Processing Projects

  • WeasyPrint

    The awesome document factory

    Project mention: The Gemini protocol seen by this HTTP client person (curl dev) | /r/programming | 2023-05-30

    Well yes, but you can implement HTML+CSS. WeasyPrint did it from scratch, and independent implementations of HTML+CSS are considerably more numerous than those of HTML+CSS+JS.

  • PyPDF2

    A pure-python PDF library capable of splitting, merging, cropping, and transforming the pages of PDF files

    Project mention: Yara scanning PDF files | /r/computerforensics | 2023-06-01

  • csvkit

    A suite of utilities for converting to and working with CSV, the king of tabular file formats.

    Project mention: I wrote this iCalendar (.ics) command-line utility to turn common calendar exports into more broadly compatible CSV files. | /r/commandline | 2023-03-24

    CSV utilities (still haven't picked a favorite one...):

  • PDFMiner

    Python PDF Parser (Not actively maintained). Check out pdfminer.six.

    Project mention: Creating a python class for organizing courses I took in my education | /r/learnpython | 2022-10-15

    Technically this information is on my transcript, so I will try to use pdfminer to extract that data, if there is a way to use the class you recommend with that code.

  • tablib

    Python Module for Tabular Datasets in XLS, CSV, JSON, YAML, &c.

  • python-docx

    Create and modify Word documents with Python

    Project mention: See unknow person with a problem in Stackoverflow: writes a library for her | /r/ProgrammerHumor | 2023-03-13
  • XlsxWriter

    A Python module for creating Excel XLSX files.

    Project mention: Streamlining Data Export to Excel: A comprehensive guide to using Python, Nodejs, PHP. | 2023-01-11


  • Python-Markdown

    A Python implementation of John Gruber’s Markdown with Extension support.

    Project mention: Introducing AutoPyTabs: Automatically generate code examples for different Python versions in MkDocs or Sphinx based documentations | /r/Python | 2023-04-30

    AutoPyTabs lets you write code examples in your documentation targeting a single version of Python, then generates examples targeting higher Python versions on the fly and presents them in tabs, using popular tabs extensions. It comes packaged as a Markdown extension, an MkDocs plugin and a Sphinx extension, so it can easily be integrated with your documentation workflow.

  • borb

    borb is a library for reading, creating and manipulating PDF files in python.

    Project mention: Caffè Italia * 30/04/23 | /r/italy | 2023-04-30
  • xlwings

    xlwings is a Python library that makes it easy to call Python from Excel and vice versa. It works with Excel on Windows and macOS as well as with Google Sheets and Excel on the web.

    Project mention: Running python from excel | /r/learnpython | 2023-01-25


  • markdown2

    markdown2: A fast and complete implementation of Markdown in Python

    Project mention: Help converting markdown to HTML for CS50 Web Wiki Pset | /r/cs50 | 2023-02-05

    I remember I had some problems with converting as well, but there is a quick usage section, and this example worked for me. Please try it like this

  • unoconv

    Universal Office Converter - Convert between any document format supported by LibreOffice/OpenOffice.

  • Mistune

    A fast yet powerful Python Markdown parser with renderers and plugins.

  • pdftabextract

    A set of tools for extracting tables from PDF files helping to do data mining on (OCR-processed) scanned documents.

  • Camelot

    A Python library to extract tabular data from PDFs

    Project mention: Camelot: DeprecationError: PdfFileReader is deprecated and was removed in PyPDF2 3.0.0. Use PdfReader instead. | /r/learnpython | 2022-12-29

    here is the corresponding bug report in git:

  • python-pptx

    Create Open XML PowerPoint documents in Python

  • docxtpl

    Use a docx as a jinja2 template

  • pyexcel

    Single API for reading, manipulating and writing data in csv, ods, xls, xlsx and xlsm files

  • pymorphy2

    Morphological analyzer / inflection engine for Russian and Ukrainian languages.

    Project mention: Determine russian sentence parts. | /r/russian | 2023-05-11
  • mistletoe

    A fast, extensible and spec-compliant Markdown parser in pure Python.

    Project mention: python development on logseq md files | /r/logseq | 2023-01-20
  • unp

    Unpacks things.

    Project mention: A lifehack for your shell (link in comments) | /r/emacs | 2022-07-17

    I have used unp for years

  • vcspull

    Synchronize projects via yaml/json manifest. Built using `libvcs`.

  • Marmir

    Python powered spreadsheets (by brianray)

NOTE: The open source projects on this list are ordered by number of GitHub stars. The number of mentions indicates repo mentions in the last 12 months or since we started tracking (Dec 2020). The latest post mention was on 2023-06-01.

What are some of the best open-source Specific Formats Processing projects in Python? This list will help you:

Rank  Project          Stars
1     WeasyPrint       5,847
2     PyPDF2           5,752
3     csvkit           5,471
4     PDFMiner         5,021
5     tablib           4,251
6     python-docx      3,646
7     XlsxWriter       3,262
8     Python-Markdown  3,250
9     borb             3,032
10    xlwings          2,617
11    markdown2        2,477
12    unoconv          2,419
13    Mistune          2,267
14    pdftabextract    2,064
15    Camelot          1,985
16    python-pptx      1,735
17    docxtpl          1,579
18    pyexcel          1,125
19    pymorphy2        1,060
20    mistletoe        616
21    unp              406
22    vcspull          202
23    Marmir           170