Top 23 Specific Formats Processing Open-Source Projects
-
csvkit
A suite of utilities for converting to and working with CSV, the king of tabular file formats.
Project mention: Do you know of any website that has a bunch of files for CSV parser edge case testing? | reddit.com/r/vba | 2021-06-23
Project mention: QuestPDF 2021.10 - a new version of the open-source, MIT-licensed, C# library for generating PDF documents with fluent API, now with extended text capabilities. Please help me make it popular :) | reddit.com/r/csharp | 2021-10-06
I’d recommend WeasyPrint (.NET Core wrapper) instead of wkhtmltopdf. It supports CSS Paged Media, which is pretty much required for everything but the simplest of HTML-to-PDF conversions.
-
Project mention: Desktop File Search, How does it work! python or C# library libraries? | reddit.com/r/learnpython | 2021-09-12
-
PyPDF2
A pure-Python PDF library capable of splitting, merging, cropping, and transforming the pages of PDF files.
-
Other than pandas, you can also use tablib. I personally find tablib slightly easier to use, but it doesn't have as many features. For what you need, though, tablib might be best.
-
Project mention: How much time to expect for a Python programmer to learn basic Excel? | reddit.com/r/learnmachinelearning | 2022-03-02
I've been in this situation a few times. I would just open the Excel sheet in Python using XlsxWriter: https://xlsxwriter.readthedocs.io/
-
Project mention: Is it a good practice to use /admin to create manage the blog in production? | reddit.com/r/django | 2022-05-19
Interesting, I also use Markdown, but hadn't heard of Django-Markdownx before today. What I do is create two fields, body_md and body_html, and on save use Python Markdown to turn my Markdown into HTML.
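The two-field approach described in the comment above could look roughly like the following. This is a framework-free sketch, not the commenter's code: the `Post` class and the `render` parameter are hypothetical stand-ins for a Django model and its save hook, and the default renderer assumes the Python-Markdown package (`markdown`) is installed.

```python
from typing import Callable, Optional


def render_markdown(text: str) -> str:
    """Render Markdown to HTML with Python-Markdown (imported lazily)."""
    import markdown  # third-party; assumed installed via `pip install markdown`
    return markdown.markdown(text)


class Post:
    """Hypothetical stand-in for a model with body_md and body_html fields."""

    def __init__(self, body_md: str) -> None:
        self.body_md = body_md   # Markdown source edited by the author
        self.body_html = ""      # cached HTML, regenerated on save

    def save(self, render: Optional[Callable[[str], str]] = None) -> None:
        # Re-render the cached HTML from the Markdown source on every save.
        renderer = render or render_markdown
        self.body_html = renderer(self.body_md)
```

Storing the rendered HTML alongside the source means templates can output `body_html` directly, without converting Markdown on every page view.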
-
I will also perform a quick comparison with the borb library.
-
First, I used django-microframework as inspiration for a simple app.py that could be used instead of the potentially overwhelming files Django normally uses for a site. Then, I added in automatic reading of .env files to override Django settings that shouldn't be committed. I used markdown2 to automatically render markdown files into HTML. And built a way to load data from JSON into templates to be used as variables (since a database is not available when generating a static site).
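Two of the steps mentioned above, reading `.env` overrides and loading JSON data into templates, can be sketched with the standard library alone. This is only an illustration under stated assumptions: the function names `parse_env` and `render_page` are hypothetical, and `string.Template` stands in for the Django template engine the author actually used.

```python
import json
from string import Template


def parse_env(text: str) -> dict:
    """Parse KEY=VALUE lines, skipping blank lines and # comments."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop whitespace and any quote characters around the value.
        settings[key.strip()] = value.strip().strip("\"'")
    return settings


def render_page(template_text: str, json_text: str) -> str:
    """Fill $placeholders in a template with values from a JSON object."""
    context = json.loads(json_text)
    # safe_substitute leaves unknown placeholders untouched instead of raising.
    return Template(template_text).safe_substitute(context)
```

For example, `render_page("<h1>$title</h1>", '{"title": "Hello"}')` fills the placeholder from the JSON data, which plays the role a database query would play in a dynamic site.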
-
xlwings
xlwings is a Python library that makes it easy to call Python from Excel and vice versa. It works with Excel on Windows and macOS as well as with Google Sheets and Excel on the web.
-
unoconv
Universal Office Converter - Convert between any document format supported by LibreOffice/OpenOffice.
-
pdftabextract
A set of tools for extracting tables from PDF files, helping with data mining on (OCR-processed) scanned documents.
-
Project mention: Camelot VS ExtractTable-py - a user suggested alternative | libhunt.com/r/camelot | 2022-02-02
-
Project mention: [HELP ]Does anyone know how to add audio to Powerpoint slides when using Python-PPTX? Help | reddit.com/r/learnpython | 2021-06-20
Thank you. Please refer to https://github.com/scanny/python-pptx/issues/427
-
Project mention: I already have a degree and a master's in fields unrelated to IT, and I'm a civil servant with a very high salary, but I fell in love with programming and want to change fields. Should I study ADS or go straight to building things for my portfolio? | reddit.com/r/brdev | 2022-02-07
-
If the download is an archive, you can use the format-specific tool to extract the contents (e.g. tar), or convenience tools like atool or unp.
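The same "one command for any archive type" idea is available from Python's standard library. The sketch below is an illustration only and is not how atool or unp work internally; `extract_archive` is a hypothetical helper name.

```python
import shutil
from pathlib import Path


def extract_archive(archive: str, dest: str) -> list:
    """Unpack an archive into dest and return the extracted file paths."""
    dest_path = Path(dest)
    dest_path.mkdir(parents=True, exist_ok=True)
    # The format is inferred from the file extension (.tar, .tar.gz, .zip, ...).
    shutil.unpack_archive(archive, str(dest_path))
    # Report extracted files relative to the destination directory.
    return sorted(
        str(p.relative_to(dest_path)) for p in dest_path.rglob("*") if p.is_file()
    )
```

As with the command-line tools, this should only be run on archives from a trusted source, since archive members can carry unexpected paths.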
-
vcspull
Synchronize projects via YAML/JSON manifest. Built using `libvcs`.
-
Specific Formats Processing related posts
- Is it a good practice to use /admin to create manage the blog in production?
- Error in excel running python code
- Spell checking Markdown documents using a Github action
- How do I re-arrange a pdf/docx?
- fpdf2.5.2 : SVG support and comparison with borb
- This Week in Python
- borb, the open-source pure Python PDF engine
Index
What are some of the best open-source Specific Formats Processing projects? This list will help you:
# | Project | Stars |
---|---|---|
1 | csvkit | 4,973 |
2 | WeasyPrint | 4,945 |
3 | PDFMiner | 4,802 |
4 | PyPDF2 | 4,305 |
5 | tablib | 4,114 |
6 | python-docx | 3,110 |
7 | XlsxWriter | 2,902 |
8 | Python-Markdown | 2,868 |
9 | borb | 2,627 |
10 | markdown2 | 2,310 |
11 | xlwings | 2,304 |
12 | unoconv | 2,276 |
13 | Mistune | 2,063 |
14 | pdftabextract | 1,985 |
15 | Camelot | 1,544 |
16 | python-pptx | 1,519 |
17 | docxtpl | 1,257 |
18 | pyexcel | 1,018 |
19 | pymorphy2 | 974 |
20 | mistletoe | 465 |
21 | unp | 400 |
22 | vcspull | 193 |
23 | Marmir | 167 |