Python Specific Formats Processing

Open-source Python projects categorized as Specific Formats Processing

Top 23 Python Specific Formats Processing Projects

  • pdfminer

    Python PDF Parser (Not actively maintained). Check out pdfminer.six.

  • csvkit

    A suite of utilities for converting to and working with CSV, the king of tabular file formats.

  • WeasyPrint

    The awesome document factory

  • tablib

    Python Module for Tabular Datasets in XLS, CSV, JSON, YAML, &c.

  • PyPDF2

    A utility to read and write PDFs with Python

  • python-docx

    Create and modify Word documents with Python

  • XlsxWriter

    A Python module for creating Excel XLSX files.

  • markdown

    A Python implementation of John Gruber’s Markdown with Extension support.

  • python-markdown2

    markdown2: A fast and complete implementation of Markdown in Python

  • unoconv

    Universal Office Converter - Convert between any document format supported by LibreOffice/OpenOffice.

  • pdftabextract

    A set of tools for extracting tables from PDF files helping to do data mining on (OCR-processed) scanned documents.

  • mistune

    A fast yet powerful Python Markdown parser with renderers and plugins.

  • xlwings

    xlwings is a BSD-licensed Python library that makes it easy to call Python from Excel and vice versa. It works with Microsoft Excel on Windows and macOS.

    Latest mention: Pandas, xlwings and timestamps | 2020-12-22

    No problem. My first guess would be a conversion issue, or a problem with how the dates are stored and then converted. This issue says that you need a datetime object for proper conversion; maybe that's the problem?

  • python-pptx

    Create Open XML PowerPoint documents in Python

  • camelot

    A Python library to extract tabular data from PDFs

  • python-docx-template

    Use a docx as a jinja2 template

  • pyexcel

    Single API for reading, manipulating and writing data in csv, ods, xls, xlsx and xlsm files

  • pymorphy2

    Morphological analyzer / inflection engine for Russian and Ukrainian languages.

  • unp

    Unpacks things.

  • mistletoe

    A fast, extensible and spec-compliant Markdown parser in pure Python.

  • vcspull

    Synchronize projects via YAML/JSON manifest. Built on libvcs.

  • mm

    Python powered spreadsheets

  • libvcs

    ⚙️ vcs abstraction layer

NOTE: The open-source projects on this list are ordered by number of GitHub stars. The number of mentions indicates repo mentions in the last 12 months or since we started tracking (Dec 2020).


What are some of the best open-source Specific Formats Processing projects in Python? This list will help you:

Rank Project Stars
1 pdfminer 4,472
2 csvkit 4,447
3 WeasyPrint 4,048
4 tablib 3,834
5 PyPDF2 3,448
6 python-docx 2,496
7 XlsxWriter 2,436
8 markdown 2,386
9 python-markdown2 2,129
10 unoconv 2,037
11 pdftabextract 1,886
12 mistune 1,883
13 xlwings 1,881
14 python-pptx 1,207
15 camelot 960
16 python-docx-template 906
17 pyexcel 882
18 pymorphy2 871
19 unp 376
20 mistletoe 356
21 vcspull 188
22 mm 162
23 libvcs 40