Metric | pdf2docx | PDFtoTXT
---|---|---
Mentions | 6 | 1
Stars | 2,169 | 6
Growth | 4.5% | -
Activity | 7.9 | 0.0
Last commit | 26 days ago | over 1 year ago
Language | Python | Python
License | GNU Affero General Public License v3.0 | MIT License
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
pdf2docx
- Tensorflow PDF Extraction
  Try pdf2docx. Here is the source: https://github.com/dothinking/pdf2docx.
- Show HN: Doc Converter – Convert PDF docs to Word documents on your computer
  Does it include its source/dependency licensing post extraction? Some of these dependencies are under GPL/AGPL https://github.com/dothinking/pdf2docx/blob/master/LICENSE
- What should exist, but doesn’t?
  If you'd rather convert a PDF to .docx so you can easily edit it, there's a free Python tool that works, but it has no GUI: https://github.com/dothinking/pdf2docx
- How to deploy containerized Python and Django application on Heroku
  pdf2docx: This module helps to convert from pdf to docx
- Help with pictures in python-docx
  I found this post on github https://github.com/dothinking/pdf2docx/issues/54#issuecomment-715925252
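Since several of the posts above point to pdf2docx as a script-only tool with no GUI, here is a minimal sketch of what calling it from Python might look like. It assumes the package is installed (`pip install pdf2docx`) and uses the `Converter` class from the project's README; the helper names `docx_path_for` and `convert_pdf_to_docx` are hypothetical and exist only for this illustration.

```python
from pathlib import Path


def docx_path_for(pdf_path):
    """Return the output path: same name as the PDF, with a .docx suffix."""
    return str(Path(pdf_path).with_suffix(".docx"))


def convert_pdf_to_docx(pdf_path, docx_path=None):
    """Convert every page of pdf_path to a Word document and return its path."""
    # Imported here so the path helper above works without pdf2docx installed.
    from pdf2docx import Converter

    docx_path = docx_path or docx_path_for(pdf_path)
    cv = Converter(pdf_path)
    try:
        cv.convert(docx_path)  # converts all pages by default
    finally:
        cv.close()  # release the underlying PDF file handle
    return docx_path


# Hypothetical usage, assuming a file named report.pdf exists:
# convert_pdf_to_docx("report.pdf")
```

Note that pdf2docx is AGPL-licensed, as the table above shows, which may matter if the conversion is bundled into a distributed application.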
PDFtoTXT
What are some alternatives?
django-convert-doc-to-pdf
deepl-srt - Use Selenium to automate translation of a Chinese srt to English on Deepl website
borb - borb is a library for reading, creating and manipulating PDF files in python.
J.A.R.V.I.S - Personal assistant built using Python libraries. Its features include sending emails, optical text recognition, dynamic news reporting via API integration, a to-do list generator, opening any website by voice command, playing music, Wikipedia search, a dictionary with auto spell checking, weather reporting (temperature, wind speed, humidity), YouTube search, Google Maps search, and YouTube downloading.
pdfsam - PDFsam, a desktop application to split, merge, mix, rotate PDF files and extract pages
Screen-Translate - A screen translator/OCR translator built with Python and Tesseract; the user interface is made with Tkinter. All code is written in Python.
pdf2docxConverter-PayalSasmal - This project is for converting pdf to docx and vice versa
tesseract-ocr - Tesseract Open Source OCR Engine (main repository)
Django - The Web framework for perfectionists with deadlines.
gunicorn - gunicorn 'Green Unicorn' is a WSGI HTTP Server for UNIX, fast clients and sleepy applications.
textshot - Python tool for grabbing text via screenshot