PDFtoTXT
pdf2docx
| | PDFtoTXT | pdf2docx |
|---|---|---|
| Mentions | 1 | 6 |
| Stars | 6 | 2,169 |
| Growth | - | 4.5% |
| Activity | 0.0 | 7.9 |
| Last commit | over 1 year ago | 29 days ago |
| Language | Python | Python |
| License | MIT License | GNU Affero General Public License v3.0 |
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
PDFtoTXT
pdf2docx
- Tensorflow PDF Extraction
  Try pdf2docx. Here is the source: https://github.com/dothinking/pdf2docx.
- Show HN: Doc Converter – Convert PDF docs to Word documents on your computer
  Does it include its source/dependency licensing post extraction? Some of these dependencies are under GPL/AGPL https://github.com/dothinking/pdf2docx/blob/master/LICENSE
- What should exist, but doesn’t?
  If you'd rather convert a PDF to .docx so you can easily edit it, there's a free Python tool that works, but it has no GUI: https://github.com/dothinking/pdf2docx
- How to deploy containerized Python and Django application on Heroku
  pdf2docx: This module helps to convert from pdf to docx
- Help with pictures in python-docx
  I found this post on github https://github.com/dothinking/pdf2docx/issues/54#issuecomment-715925252
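
Several of the posts above describe pdf2docx as a Python module with no GUI. A minimal sketch of what calling it from a script might look like follows. It assumes the `Converter` class with `convert()` and `close()` methods as shown in the project's README; the helper names `docx_path_for` and `convert_pdf` are hypothetical and exist only for this illustration.

```python
from pathlib import Path


def docx_path_for(pdf_path):
    """Return a sibling path with the extension swapped to .docx (hypothetical helper)."""
    return Path(pdf_path).with_suffix(".docx")


def convert_pdf(pdf_path, docx_path=None):
    """Convert one PDF to DOCX with pdf2docx and return the output path."""
    # Imported lazily so docx_path_for() works even when pdf2docx is not installed.
    from pdf2docx import Converter

    out = Path(docx_path) if docx_path else docx_path_for(pdf_path)
    cv = Converter(str(pdf_path))
    try:
        cv.convert(str(out))  # converts all pages by default
    finally:
        cv.close()  # release the underlying PDF file handle
    return out


if __name__ == "__main__":
    import sys

    # Usage (assumed): python convert.py input.pdf
    if len(sys.argv) > 1:
        print(convert_pdf(sys.argv[1]))
```

Wrapping the call in `try`/`finally` is a design choice of this sketch, so the input file is closed even if conversion fails partway through.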
What are some alternatives?
deepl-srt - Use Selenium to automate translation of a Chinese srt to English on Deepl website
django-convert-doc-to-pdf
J.A.R.V.I.S - Personal Assistant built using python libraries. It does almost anything which includes sending emails, Optical Text Recognition, Dynamic News Reporting at any time with API integration, Todo list generator, Opens any website with just a voice command, Plays Music, Wikipedia searching, Dictionary with Intelligent Sensing i.e. auto spell checking, Weather Reporting i.e. temp, wind speed, humidity, YouTube searching, Google Map searching, Youtube Downloading, etc.
borb - borb is a library for reading, creating and manipulating PDF files in python.
Screen-Translate - A Screen Translator/OCR Translator made using Python and Tesseract; the user interface is made using Tkinter. All code written in Python.
pdfsam - PDFsam, a desktop application to split, merge, mix, rotate PDF files and extract pages
tesseract-ocr - Tesseract Open Source OCR Engine (main repository)
pdf2docxConverter-PayalSasmal - This project is for converting pdf to docx and vice versa
Django - The Web framework for perfectionists with deadlines.
textshot - Python tool for grabbing text via screenshot
gunicorn - gunicorn 'Green Unicorn' is a WSGI HTTP Server for UNIX, fast clients and sleepy applications.