Specific Formats Processing

Open-source projects categorized as Specific Formats Processing
Related topics: #Office #Python #PDF #Markdown #YAML

Top 23 Specific Formats Processing Open-Source Projects

  • csvkit

    A suite of utilities for converting to and working with CSV, the king of tabular file formats.

    Project mention: Do you know of any website that has a bunch of files for CSV parser edge case testing? | reddit.com/r/vba | 2021-06-23
  • WeasyPrint

    The awesome document factory

    Project mention: QuestPDF 2021.10 - a new version of the open-source, MIT-licensed, C# library for generating PDF documents with fluent API, now with extended text capabilities. Please help me make it popular :) | reddit.com/r/csharp | 2021-10-06

    I’d recommend Weasyprint (.net core wrapper) instead of wkhtmltopdf. It supports CSS Paged Media which is pretty much required for everything but the simplest of HTML2PDF conversions.

  • PDFMiner

    Python PDF Parser (Not actively maintained). Check out pdfminer.six.

    Project mention: Desktop File Search, How does it work! python or C# library libraries? | reddit.com/r/learnpython | 2021-09-12
  • PyPDF2

    A pure-python PDF library capable of splitting, merging, cropping, and transforming the pages of PDF files

    Project mention: How do I re-arrange a pdf/docx? | reddit.com/r/learnpython | 2022-04-26
  • tablib

    Python Module for Tabular Datasets in XLS, CSV, JSON, YAML, &c.

    Project mention: Is this possible with Python? | reddit.com/r/learnpython | 2021-12-28

    Other than Pandas, you can also use tablib. I personally find tablib slightly easier, but it doesn't have as many features. For what you need, tablib might be best.

  • python-docx

    Create and modify Word documents with Python

  • XlsxWriter

    A Python module for creating Excel XLSX files.

    Project mention: How much time to expect for a Python programmer to learn basic Excel? | reddit.com/r/learnmachinelearning | 2022-03-02

    I've been in this situation a few times. I would just open the excel sheet in python using xlsxwriter: https://xlsxwriter.readthedocs.io/

  • Python-Markdown

    A Python implementation of John Gruber’s Markdown with Extension support.

    Project mention: Is it a good practice to use /admin to create manage the blog in production? | reddit.com/r/django | 2022-05-19

    Interesting, I also use Markdown, but hadn't heard of Django-Markdownx before today. What I do is create two fields, body_md and body_html, and on save use Python-Markdown to turn my Markdown into HTML.

  • borb

    borb is a library for reading, creating and manipulating PDF files in python.

    Project mention: fpdf2.5.2 : SVG support and comparison with borb | dev.to | 2022-04-24

    I will also perform a quick comparison with the borb library.

  • markdown2

    markdown2: A fast and complete implementation of Markdown in Python

    Project mention: Why I built another static site generator: A love story | dev.to | 2022-03-19

    First, I used django-microframework as inspiration for a simple app.py that could be used instead of the potentially overwhelming files Django normally uses for a site. Then, I added in automatic reading of .env files to override Django settings that shouldn't be committed. I used markdown2 to automatically render markdown files into HTML. And built a way to load data from JSON into templates to be used as variables (since a database is not available when generating a static site).

  • xlwings

    xlwings is a Python library that makes it easy to call Python from Excel and vice versa. It works with Excel on Windows and macOS as well as with Google Sheets and Excel on the web.

    Project mention: Error in excel running python code | reddit.com/r/learnpython | 2022-05-13
  • unoconv

    Universal Office Converter - Convert between any document format supported by LibreOffice/OpenOffice.

    Project mention: Converting xls to xlsx | reddit.com/r/golang | 2021-12-15
  • Mistune

    A fast yet powerful Python Markdown parser with renderers and plugins.

  • pdftabextract

    A set of tools for extracting tables from PDF files helping to do data mining on (OCR-processed) scanned documents.

  • Camelot

    A Python library to extract tabular data from PDFs

    Project mention: Camelot VS ExtractTable-py - a user suggested alternative | libhunt.com/r/camelot | 2022-02-02
  • python-pptx

    Create Open XML PowerPoint documents in Python

    Project mention: [HELP ]Does anyone know how to add audio to Powerpoint slides when using Python-PPTX? Help | reddit.com/r/learnpython | 2021-06-20

    Thank you. Please refer to https://github.com/scanny/python-pptx/issues/427

  • docxtpl

    Use a docx as a jinja2 template

    Project mention: I already have a degree and a master's in fields unrelated to IT, and I'm a public servant with a very high salary, but I fell in love with programming and want to change fields. Should I do an ADS degree or go straight to building things for my portfolio? | reddit.com/r/brdev | 2022-02-07
  • pyexcel

    Single API for reading, manipulating and writing data in csv, ods, xls, xlsx and xlsm files

  • pymorphy2

    Morphological analyzer / inflection engine for Russian and Ukrainian languages.

  • mistletoe

    A fast, extensible and spec-compliant Markdown parser in pure Python.

  • unp

    Unpacks things.

    Project mention: How to change fonts in terminal? | reddit.com/r/linux4noobs | 2022-04-01

    If the download is an archive, you can use the specific tool to extract the contents (e.g. tar), or convenience tools like atool or unp.

  • vcspull

    Synchronize projects via YAML/JSON manifest. Built using `libvcs`.

  • Marmir

    Python powered spreadsheets (by brianray)
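
The Python-Markdown mention above describes keeping two fields, body_md and body_html, and filling the second one on save. A minimal sketch of that idea follows, using a plain dataclass as a stand-in for a Django model. `Post`, `save`, and `render_markdown` are hypothetical names chosen for illustration; the only library call assumed is `markdown.markdown` from Python-Markdown.

```python
from dataclasses import dataclass
from typing import Callable


def render_markdown(text: str) -> str:
    """Render Markdown to HTML with Python-Markdown (imported lazily)."""
    import markdown  # third-party package: pip install markdown

    return markdown.markdown(text)


@dataclass
class Post:
    """Hypothetical stand-in for a blog post model with two body fields."""

    title: str
    body_md: str          # Markdown source, edited by the author
    body_html: str = ""   # rendered HTML, refreshed on every save

    def save(self, render: Callable[[str], str] = render_markdown) -> None:
        # Re-render on save so templates can output body_html directly
        # instead of converting Markdown on every page view.
        self.body_html = render(self.body_md)
```

With Python-Markdown installed, calling `Post("Hello", "# Hi").save()` should leave the rendered HTML in `body_html`. The `render` parameter is only there so another renderer (markdown2 or Mistune, for example) can be swapped in.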

NOTE: The open source projects on this list are ordered by number of GitHub stars. The number of mentions indicates repo mentions in the last 12 months or since we started tracking (Dec 2020). The latest post mention was on 2022-05-19.
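
The unp mention above contrasts format-specific tools such as tar with convenience tools that pick the extractor for you. As a rough Python analogue (not how unp itself works), the standard library's `shutil.unpack_archive` chooses an unpacker from the file extension. The helper name `unpack` is hypothetical.

```python
import shutil
from pathlib import Path


def unpack(archive: Path, dest: Path) -> list[str]:
    """Extract `archive` into `dest` and list the extracted files.

    shutil.unpack_archive infers the format (zip, tar, gztar, ...)
    from the archive's file extension, so no per-format code is needed.
    """
    dest.mkdir(parents=True, exist_ok=True)
    shutil.unpack_archive(str(archive), str(dest))
    # Return file paths relative to dest, sorted for stable output.
    return sorted(
        p.relative_to(dest).as_posix() for p in dest.rglob("*") if p.is_file()
    )
```

Only extract archives from sources you trust, since archive members can carry unexpected paths.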

What are some of the best open-source Specific Formats Processing projects? This list will help you:

 #  Project          Stars
 1  csvkit           4,973
 2  WeasyPrint       4,945
 3  PDFMiner         4,802
 4  PyPDF2           4,305
 5  tablib           4,114
 6  python-docx      3,110
 7  XlsxWriter       2,902
 8  Python-Markdown  2,868
 9  borb             2,627
10  markdown2        2,310
11  xlwings          2,304
12  unoconv          2,276
13  Mistune          2,063
14  pdftabextract    1,985
15  Camelot          1,544
16  python-pptx      1,519
17  docxtpl          1,257
18  pyexcel          1,018
19  pymorphy2          974
20  mistletoe          465
21  unp                400
22  vcspull            193
23  Marmir             167