Top 23 Python Specific Formats Processing Projects
-
pdfminer
Python PDF Parser (Not actively maintained). Check out pdfminer.six.
-
csvkit
A suite of utilities for converting to and working with CSV, the king of tabular file formats.
-
WeasyPrint
The awesome document factory
-
tablib
Python Module for Tabular Datasets in XLS, CSV, JSON, YAML, &c.
-
PyPDF2
A utility to read and write PDFs with Python
-
python-docx
Create and modify Word documents with Python
-
XlsxWriter
A Python module for creating Excel XLSX files.
-
markdown
A Python implementation of John Gruber’s Markdown with Extension support.
-
python-markdown2
markdown2: A fast and complete implementation of Markdown in Python
-
unoconv
Universal Office Converter - Convert between any document format supported by LibreOffice/OpenOffice.
-
pdftabextract
A set of tools for extracting tables from PDF files helping to do data mining on (OCR-processed) scanned documents.
-
mistune
A fast yet powerful Python Markdown parser with renderers and plugins.
-
xlwings
xlwings is a BSD-licensed Python library that makes it easy to call Python from Excel and vice versa. It works with Microsoft Excel on Windows and macOS.
-
python-pptx
Create Open XML PowerPoint documents in Python
-
camelot
A Python library to extract tabular data from PDFs
-
python-docx-template
Use a docx as a jinja2 template
-
pyexcel
Single API for reading, manipulating and writing data in csv, ods, xls, xlsx and xlsm files
-
pymorphy2
Morphological analyzer / inflection engine for Russian and Ukrainian languages.
-
unp
Unpacks things.
-
mistletoe
A fast, extensible and spec-compliant Markdown parser in pure Python.
-
vcspull
Synchronize projects via a YAML/JSON manifest. Built on libvcs.
-
mm
Python powered spreadsheets
-
libvcs
⚙️ vcs abstraction layer
Index
What are some of the best open-source Specific Formats Processing projects in Python? This list will help you:
# | Project | Stars |
---|---|---|
1 | pdfminer | 4,472 |
2 | csvkit | 4,447 |
3 | WeasyPrint | 4,048 |
4 | tablib | 3,834 |
5 | PyPDF2 | 3,448 |
6 | python-docx | 2,496 |
7 | XlsxWriter | 2,436 |
8 | markdown | 2,386 |
9 | python-markdown2 | 2,129 |
10 | unoconv | 2,037 |
11 | pdftabextract | 1,886 |
12 | mistune | 1,883 |
13 | xlwings | 1,881 |
14 | python-pptx | 1,207 |
15 | camelot | 960 |
16 | python-docx-template | 906 |
17 | pyexcel | 882 |
18 | pymorphy2 | 871 |
19 | unp | 376 |
20 | mistletoe | 356 |
21 | vcspull | 188 |
22 | mm | 162 |
23 | libvcs | 40 |