Python Specific Formats Processing

Open-source Python projects categorized as Specific Formats Processing

Top 23 Python Specific Formats Processing Projects

Specific Formats Processing
  • PyPDF2

    A pure-Python PDF library capable of splitting, merging, cropping, and transforming the pages of PDF files

  • WeasyPrint

    The awesome document factory

  • Project mention: CSS Written in Pure Go | news.ycombinator.com | 2024-06-01

    Also see a full web rendering engine (modern HTML+CSS the whole layout engine) made in pure Python, that can export to PDFs: https://github.com/Kozea/WeasyPrint

  • csvkit

    A suite of utilities for converting to and working with CSV, the king of tabular file formats.

  • tablib

    Python Module for Tabular Datasets in XLS, CSV, JSON, YAML, &c.

  • PyMuPDF

    PyMuPDF is a high performance Python library for data extraction, analysis, conversion & manipulation of PDF (and other) documents.

  • Project mention: FLaNK Stack for 04 December 2023 | dev.to | 2023-12-04

  • python-docx

    Create and modify Word documents with Python

  • Project mention: What Would Go in Your Dream Documentation Solution? | /r/technicalwriting | 2023-12-09

    So, what I'd like to do is write a documentation package in Python to recreate what I've lost. I plan to build upon the fantastic python-docx and docxtpl packages, and I'll probably rely on pandas for much of the tabular stuff. Here are the features I intend to include:

  • Python-Markdown

    A Python implementation of John Gruber’s Markdown with Extension support.

  • Project mention: Show HN: (Python) Markdown Exec, execute code blocks and render their output | news.ycombinator.com | 2024-06-15

    Hey everyone, here's an extension I made for Python-Markdown (https://github.com/Python-Markdown/markdown). It builds on top of PyMDown Extensions' SuperFences (https://facelessuser.github.io/pymdown-extensions/extensions...), and allows Markdown writers to execute their Markdown code blocks to render the execution output in place of / in addition to the code blocks.

    Languages supported:

    - python/pycon

  • XlsxWriter

    A Python module for creating Excel XLSX files.

  • borb

    borb is a library for reading, creating and manipulating PDF files in Python.

  • xlwings

    xlwings is a Python library that makes it easy to call Python from Excel and vice versa. It works with Excel on Windows and macOS as well as with Google Sheets and Excel on the web.

  • Project mention: Microsoft to Deprecate VBScript | news.ycombinator.com | 2024-05-26

    AFAIK, Python in Excel is fine, but only if you don't use Microsoft's: https://www.xlwings.org/

  • Camelot

    A Python library to extract tabular data from PDFs

  • Project mention: Show HN: How do you OCR on a Mac using the CLI or just Python for free | news.ycombinator.com | 2024-01-02

    I had good repeated success extracting tables from PDFs using Camelot (Python, https://github.com/camelot-dev/camelot)

  • markdown2

    markdown2: A fast and complete implementation of Markdown in Python

  • unoconv

    Universal Office Converter - Convert between any document format supported by LibreOffice/OpenOffice.

  • Mistune

    A fast yet powerful Python Markdown parser with renderers and plugins.

  • python-pptx

    Create Open XML PowerPoint documents in Python

  • pdftabextract

    A set of tools for extracting tables from PDF files helping to do data mining on (OCR-processed) scanned documents.

  • docxtpl

    Use a docx as a jinja2 template

  • pyexcel

    Single API for reading, manipulating and writing data in csv, ods, xls, xlsx and xlsm files

  • Project mention: Advice on ETL and Data Sharing work process | /r/ETL | 2023-11-07

    You could try and write some simple python using the pyexcel and pandas libraries. I created a tool as a consultant with these packages that parsed spreadsheets with data from factories from all around the world. They did not lock down the Excel files used to submit data and it made it so much harder. If you go this route, I would recommend starting by putting your data into a SQLite database. Once you have your data in a database, you unlock the power of SQL for pulling reports. Also, you can port the data into a proper database if you ever need to. ChatGPT can probably get you a good chunk of the way there.
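
The workflow suggested in the mention above (parse spreadsheet data, load it into SQLite, then pull reports with SQL) might look roughly like the following minimal sketch. To stay self-contained it uses only the standard library's csv and sqlite3 modules in place of pyexcel and pandas; the table name, column names, helper functions, and sample rows are all hypothetical.

```python
import csv
import io
import sqlite3

# Hypothetical sample data standing in for a spreadsheet exported to CSV.
SAMPLE_CSV = """factory,country,units
Plant A,DE,120
Plant B,US,80
Plant C,DE,50
"""


def load_rows(conn, csv_text):
    """Create the (hypothetical) production table and load CSV rows into it."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS production "
        "(factory TEXT, country TEXT, units INTEGER)"
    )
    reader = csv.DictReader(io.StringIO(csv_text))
    rows = [(r["factory"], r["country"], int(r["units"])) for r in reader]
    conn.executemany("INSERT INTO production VALUES (?, ?, ?)", rows)
    conn.commit()
    return len(rows)


def units_by_country(conn):
    """Report total units per country using plain SQL."""
    cur = conn.execute(
        "SELECT country, SUM(units) FROM production "
        "GROUP BY country ORDER BY country"
    )
    return cur.fetchall()


conn = sqlite3.connect(":memory:")  # pass a file path for a persistent database
load_rows(conn, SAMPLE_CSV)
report = units_by_country(conn)
print(report)
```

Once the rows are in a table, further reports are just additional SQL queries, and the same schema could later be moved to a larger database if needed.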

  • pymorphy2

    Morphological analyzer / inflection engine for Russian and Ukrainian languages.

  • mistletoe

    A fast, extensible and spec-compliant Markdown parser in pure Python.

  • unp

    Unpacks things.

  • vcspull

    🔄 Synchronize projects via yaml/json manifest. Built using `libvcs`.

  • Marmir

    Python powered spreadsheets (by brianray)

NOTE: The open-source projects on this list are ordered by number of GitHub stars. The number of mentions indicates repo mentions in the last 12 months or since we started tracking (Dec 2020).

Index

What are some of the best open-source Specific Formats Processing projects in Python? This list will help you:

Rank Project Stars
1 PyPDF2 7,661
2 WeasyPrint 6,797
3 csvkit 5,875
4 tablib 4,542
5 PyMuPDF 4,401
6 python-docx 4,316
7 Python-Markdown 3,642
8 XlsxWriter 3,526
9 borb 3,331
10 xlwings 2,874
11 Camelot 2,746
12 markdown2 2,600
13 unoconv 2,514
14 Mistune 2,489
15 python-pptx 2,235
16 pdftabextract 2,152
17 docxtpl 1,908
18 pyexcel 1,180
19 pymorphy2 1,103
20 mistletoe 779
21 unp 416
22 vcspull 203
23 Marmir 172
