pdfalyzer
peepdf
Metric | pdfalyzer | peepdf
---|---|---
Mentions | 8 | 5
Stars | 211 | 1,195
Growth | - | -
Activity | 6.9 | 0.0
Last commit | 7 days ago | about 2 years ago
Language | Python | Python
License | GNU General Public License v3.0 only | GNU General Public License v3.0 only
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
pdfalyzer
- The Pdfalyzer is a tool for visualizing the inner tree structure of a PDF in large and colorful diagrams as well as scanning its internals for suspicious content
The Pdfalyzer is a command line tool (pdfalyze) as well as a library for working with, visualizing, and scanning the contents of a PDF. Motivation for the project was personal: I got hacked by a PDF that turned out to be hiding its maleficent instructions inside the font binary where it was missed by modern malware scanners (twitter thread) (more details)
- The Yaralyzer is a new tool for visualizing / force decoding YARA and regular expression matches in binary and text
A few weeks ago I made a post here about a PDF that evaded all malware detection and caused a security breach, almost certainly through PDF instructions hidden inside of an Adobe Type1 Font binary stream embedded within a PDF. At the time I posted a link to a tool I wrote called The Pdfalyzer that diagrams a PDF's internals and scans for various suspect content.
- Any useful cybersecurity software under $5k?
- Novel PDF malware: injecting JavaScript into the encrypted section of Adobe Type 1 font binaries is not detectable by malware scanners and doesn't interfere with decryption/decompilation of the font (along with a new tool for malicious PDF analysis)
FWIW someone on twitter was talking to me about how he couldn't get t1disasm to work on his m1 Mac - just wanted to throw out there I worked through the issues with compiling the tool from source and there's a script that should work to build them on m1 Macs in the pdfalyzer repo
I dramatically scaled up the binary data scouring and visualization in the pdfalyzer... can rip through every backtick/frontslash/single or double quoted/etc etc set of bytes in the binaries and try a bunch of aggressive approaches to force decode them.
Just posted some new screenshots of various less garbled looking attempts to guess an encoding for some of the stuff in the JS regions of the font binaries (pdfalyzer code is also updated)
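The "force decode" approach described in the comments above can be illustrated with a minimal sketch. This is not the pdfalyzer's actual implementation: the function names, the regular expression, and the list of candidate encodings are all assumptions made for illustration, and only the Python standard library is used. It pulls out byte runs enclosed in matching backticks or quotes and tries to decode each one under several encodings.

```python
import re

# Hypothetical list of encodings to try; not taken from the pdfalyzer source.
CANDIDATE_ENCODINGS = ["utf-8", "utf-16-le", "latin-1"]

# Match runs of bytes enclosed in matching backticks, single quotes, or double quotes.
QUOTED_BYTES = re.compile(rb"([`'\"])(.+?)\1", re.DOTALL)


def find_quoted_spans(data: bytes) -> list[bytes]:
    """Return the byte sequences found between matching quote characters."""
    return [match.group(2) for match in QUOTED_BYTES.finditer(data)]


def force_decode(chunk: bytes) -> dict[str, str]:
    """Try each candidate encoding and keep the ones that decode without error."""
    results = {}
    for encoding in CANDIDATE_ENCODINGS:
        try:
            results[encoding] = chunk.decode(encoding)
        except UnicodeDecodeError:
            # This encoding cannot represent the bytes; move on to the next one.
            continue
    return results


if __name__ == "__main__":
    # Made-up sample bytes standing in for a font binary stream.
    sample = b'\x00\x01"alert(1)"\xff`caf\xe9`'
    for span in find_quoted_spans(sample):
        print(span, force_decode(span))
```

A decode that succeeds is not necessarily meaningful (latin-1 accepts any byte sequence), so a real tool would still need some way to rank the candidate decodings.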
peepdf
- The Pdfalyzer is a tool for visualizing the inner tree structure of a PDF in large and colorful diagrams as well as scanning its internals for suspicious content
This tool was built to fill a gap in the PDF assessment landscape. Didier Stevens's pdfid.py and pdf-parser.py are still the best game in town when it comes to PDF analysis tools but they lack in the visualization department and also don't give you much to work with as far as giving you a data model you can write your own code around. Peepdf seemed promising but turned out to be in a buggy, out of date, and more or less unfixable state. And neither of them offered much in the way of tooling for embedded binary analysis. Thus I felt the world might be slightly improved if I strung together a couple of more stable/well known/actively maintained open source projects (AnyTree, PyPDF2, and Rich) into this tool.
- Pictures of the NOOK and Jacks email to Forrest June 5,2020!
If the images are originals and were objects added to the PDF, they can be extracted with specialized tools like peepdf or PDFStreamDumper. You could just try a right click, save image, and see if that works. Is the PDF available for download somewhere?
- PDF Forensics
Ok so I found a tool called "peepdf" https://github.com/jesparza/peepdf which did what I was looking for! Thank you all for the suggestions.
What are some alternatives?
pdfstreamdumper - research tool for the analysis of malicious pdf documents. make sure to run the installer first to get all of the 3rd party dlls installed correctly.
PyPDF2 - A pure-python PDF library capable of splitting, merging, cropping, and transforming the pages of PDF files
pypdfium2 - Python bindings to PDFium
Malware-IOCs
DidierStevensSuite - Please no pull requests for this repository. Thanks!
anytree - Python tree data library
rich - Rich is a Python library for rich text and beautiful formatting in the terminal.
yaralyzer - Visually inspect and force decode YARA and regex matches found in both binary and text data. With Colors.
CyberPipe - An easy to use PowerShell script to collect memory and disk forensics for DFIR investigations.
SysmonForLinux