PDFMiner
Kaitai Struct
Metric | PDFMiner | Kaitai Struct
---|---|---
Mentions | 6 | 44
Stars | 5,179 | 3,839
Growth | - | 0.9%
Activity | 0.0 | 7.5
Last commit | over 1 year ago | 17 days ago
Language | Python | Shell
License | MIT License | GPL-3.0-or-later
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
PDFMiner
- Creating a python class for organizing courses I took in my education
  Technically this information is on my transcript, so I will try to use pdfminer to extract that data. Is there a class you would recommend when using that code? https://github.com/pdfminer/pdfminer.six
- Serious question about metadata!
- Desktop File Search: how does it work? Python or C# libraries?
- Add Texts to existing PDF using Python
  PDFMiner: for getting the fields of every possible PDF. Only needed for unusual cases where standard PDF reader methods do not work.
- I'm having trouble with the PDFminer library in Python. Whenever I try to call a certain function it says that something is missing in the library itself
  The GitHub page says it's been superseded by pdfminer.six. Perhaps try that instead.
- Extract specific data from multiple PDF files
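
Several of the mentions above ask how to pull structured data (for example, courses on a transcript) out of a PDF. A minimal sketch follows, assuming pdfminer.six is installed and that each course appears on its own line as `CODE Title Credits Grade`. The `Course` class, the `COURSE_LINE` pattern, and both function names are hypothetical; a real transcript layout will almost certainly need a different pattern.

```python
import re
from dataclasses import dataclass


@dataclass
class Course:
    code: str
    title: str
    credits: float
    grade: str


# Assumed (hypothetical) line layout: "CS101 Intro to Programming 3.0 A-"
COURSE_LINE = re.compile(
    r"^(?P<code>[A-Z]{2,4}\d{3})\s+"
    r"(?P<title>.+?)\s+"
    r"(?P<credits>\d+(?:\.\d+)?)\s+"
    r"(?P<grade>[A-F][+-]?)$"
)


def parse_courses(text):
    """Return a Course for every line that matches the assumed layout."""
    courses = []
    for line in text.splitlines():
        match = COURSE_LINE.match(line.strip())
        if match:
            courses.append(
                Course(
                    code=match["code"],
                    title=match["title"],
                    credits=float(match["credits"]),
                    grade=match["grade"],
                )
            )
    return courses


def load_transcript_text(pdf_path):
    """Extract the plain text of a PDF using pdfminer.six."""
    # Imported lazily so parse_courses() stays usable without pdfminer.six.
    from pdfminer.high_level import extract_text

    return extract_text(pdf_path)
```

With a real file, `parse_courses(load_transcript_text("transcript.pdf"))` would return the matched courses; the file name here is only a placeholder. Keeping extraction and parsing separate makes the parsing step easy to check against plain strings before touching any PDF.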
Kaitai Struct
- Reverse-engineering an encrypted IoT protocol
- Parsing an Undocumented File Format
  - ImHex [2], which has a pattern language [3] that allows parsing and seems more powerful than what Kaitai offers. I stumbled upon some limitations with it, but it was still useful.
  [1]: https://kaitai.io/
- Kaitai Struct – a declarative language used to describe binary data structures
- HTTPie Desktop: cross-platform API testing client for humans
  Beautiful. Didn't know something like this exists. Reminds me of Kaitai [0]
  [0] https://kaitai.io/
- Hacking the LG Monitor's EDID
  An EDID override like this would be helpful for macOS as well, where the monitors swapping around after standby is a real annoyance [0] [1]
  EDID rewrites are 99% of the time blocked by the monitor firmware: https://notes.alinpanaitiu.com/Decoding-monitor-EDID-on-macO...
  By the way, one tool that helped me navigate the EDID dump was Kaitai Struct [2]. It shows the hex view side by side with the EDID structure, and it highlights the hex values in real time as you navigate the structure. Unfortunately, it doesn't support the extension blocks that the author needs [3].
  [0] https://notes.alinpanaitiu.com/Weird-monitor-bugs
  [1] https://forums.macrumors.com/threads/external-displays-swapp...
  [2] https://kaitai.io/
  [3] https://github.com/kaitai-io/edid.ksy
- Kaitai Struct: new way to develop parsers for binary structures
- Fq: Jq for Binary Formats
  Kaitai Struct might be a good choice for that: https://kaitai.io/
- Ingesting, parsing and making sense of device log data
  For binary log formats, there's the excellent Kaitai Struct framework, which makes it very easy to generate parsers from a declarative schema.
- What is this tool? More info in comments
  kaitai
- Visual Programming with Elixir: Learning to Write Binary Parsers (2019)
  https://kaitai.io/
  Worth a look if you are writing binary parsers.
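
The mentions above describe Kaitai Struct as generating parsers from a declarative description of a binary format, with EDID dumps as one use case. The sketch below illustrates that general idea using only Python's standard `struct` module; it is not Kaitai Struct's API or its `.ksy` syntax. The field table covers the first 20 bytes of an EDID block, and the names `EDID_HEADER_FIELDS`, `parse_fields`, and `decode_manufacturer` are hypothetical.

```python
import struct

# Declarative description: (field name, struct format) in on-disk order.
# Covers only the first 20 bytes of an EDID base block.
EDID_HEADER_FIELDS = [
    ("magic", "<8s"),             # fixed pattern 00 FF FF FF FF FF FF 00
    ("manufacturer_raw", ">H"),   # three 5-bit letters, stored big-endian
    ("product_code", "<H"),
    ("serial", "<I"),
    ("week", "<B"),
    ("year_offset", "<B"),        # years since 1990
    ("version", "<B"),
    ("revision", "<B"),
]


def parse_fields(data, fields):
    """Walk the field table and unpack each field at the running offset."""
    offset = 0
    result = {}
    for name, fmt in fields:
        (result[name],) = struct.unpack_from(fmt, data, offset)
        offset += struct.calcsize(fmt)
    return result


def decode_manufacturer(raw):
    """Turn the packed 16-bit manufacturer ID into three letters (1 = 'A')."""
    letters = [(raw >> shift) & 0x1F for shift in (10, 5, 0)]
    return "".join(chr(ord("A") + value - 1) for value in letters)
```

The parsing loop never changes; supporting another structure means writing another field table. That separation between description and parsing code is what a declarative schema buys you, and tools like Kaitai Struct take it much further by generating the parser for you.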
What are some alternatives?
PyPDF2 - A pure-python PDF library capable of splitting, merging, cropping, and transforming the pages of PDF files
Protobuf - Protocol Buffers - Google's data interchange format
pdfplumber - Plumb a PDF for detailed information about each char, rectangle, line, et cetera — and easily extract text and tables.
csvkit - A suite of utilities for converting to and working with CSV, the king of tabular file formats.
pdfminer.six - Community maintained fork of pdfminer - we fathom PDF
Camelot - A Python library to extract tabular data from PDFs
tablib - Python Module for Tabular Datasets in XLS, CSV, JSON, YAML, &c.
WeasyPrint - The awesome document factory
PyYAML
ReportLab
rizin - UNIX-like reverse engineering framework and command-line toolset.