DeepMalwareDetector
Unredactor
Metric | DeepMalwareDetector | Unredactor
---|---|---
Mentions | 1 | 1
Stars | 65 | 0
Growth | - | -
Activity | 0.0 | 10.0
Last commit | about 1 year ago | over 2 years ago
Language | Python | Python
License | - | GNU General Public License v3.0 only
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
DeepMalwareDetector
-
Looking for insight on labelling portable executable (PE) malware files using a VirusTotal API response report.
What brought me to this research were these studies [1] [2], which demonstrate how image-based malware classification can be done using a CNN (convolutional neural network). Since I had a bit of a background with malware, and I had recently completed a CNN model, I figured I would try to do something similar. It was only after investigating different materials that I hit a bit of a roadblock. I found this one dataset, malimg [3], which is made up of PE files that have already been converted into images. I didn't want to just use the images; I wanted to demonstrate how to get them. However, the method used to classify them, which is discussed in Section 4.2 of this paper [4], turned out to be a bit out of my depth, kind of like this whole project. There's also this set [5], which contains the pixel content for each file record. As for the static disassembly you mention, I think you are right: the training data might not exist. During my investigation, the best I could find was this study [6].
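For readers unfamiliar with the idea of turning a PE file into an image, here is a minimal sketch in Python. It is not the method from the cited studies or the malimg dataset; it only illustrates the general approach of treating each byte as one 8-bit grayscale pixel. The function names, the fixed width of 256, and the zero-padding of the last row are all assumptions made for illustration.

```python
import math


def bytes_to_image_rows(data: bytes, width: int = 256) -> list[list[int]]:
    """Lay out raw bytes as rows of 8-bit grayscale pixel values (0-255).

    The width of 256 is an illustrative assumption, not a value taken
    from the cited studies.
    """
    if width <= 0:
        raise ValueError("width must be positive")
    height = math.ceil(len(data) / width)
    # Zero-pad the final row so that every row has the same length.
    padded = data + bytes(height * width - len(data))
    return [list(padded[r * width:(r + 1) * width]) for r in range(height)]


def write_pgm(rows: list[list[int]], path: str) -> None:
    """Save the rows as a binary PGM (P5) grayscale image."""
    height = len(rows)
    width = len(rows[0]) if rows else 0
    with open(path, "wb") as f:
        f.write(f"P5\n{width} {height}\n255\n".encode("ascii"))
        for row in rows:
            f.write(bytes(row))
```

A file could be converted with something like `write_pgm(bytes_to_image_rows(open(path, "rb").read()), "out.pgm")`, where `path` is a placeholder for a sample on disk. The resulting image could then be resized and fed to a CNN, although that step is outside this sketch.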
Unredactor
-
Redacted and Sanitized
Interestingly, some years back (perhaps 12-15 years?) someone developed a program that would examine the font a physically redacted document was written in, along with the spacing, to try to unredact it. It had relatively decent success, as only a limited set of word/letter combinations could fill a specific redacted portion. Of course, the larger the redacted block, the harder it becomes. It was interesting nonetheless; I'm not sure what happened to it, though. This: https://github.com/gt0410/Unredactor is similar, but not what I was thinking of, and this: https://hackaday.com/2008/08/01/exposing-poorly-redacted-pdfs/ may also prove interesting for you.
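The width-matching idea described in the post can be sketched roughly as follows. This is not the program the poster remembers, nor the approach of the linked Unredactor project; it is a toy illustration that assumes a hypothetical table of per-character widths for a known font and keeps only the candidate words whose rendered width fits the redacted block. All names and width values here are made up.

```python
def text_width(text: str, char_widths: dict[str, float], default_width: float) -> float:
    """Sum the per-character widths of a string in a proportional font."""
    return sum(char_widths.get(ch, default_width) for ch in text)


def candidates_that_fit(
    candidates: list[str],
    redaction_width: float,
    char_widths: dict[str, float],
    default_width: float = 5.0,
    tolerance: float = 0.5,
) -> list[str]:
    """Keep the candidates whose rendered width matches the redacted block."""
    return [
        word
        for word in candidates
        if abs(text_width(word, char_widths, default_width) - redaction_width) <= tolerance
    ]


# Hypothetical widths: narrow letters are 2 units, wide ones 8, the rest 5.
EXAMPLE_WIDTHS = {"i": 2.0, "l": 2.0, "m": 8.0, "w": 8.0}
```

With these made-up widths, a 20-unit block matches "data" but not "will", "milk", or "mama". Real fonts also involve kerning and inter-word spacing, which is part of why larger redacted blocks leave too many possible fills.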
What are some alternatives?
avclass - AVClass malware labeling tool
awesome-gradient-boosting-papers - A curated list of gradient boosting research papers with implementations.
deepNOID - deepNOID, the binary music genre classifier which determines if what you're listening to really is NOIDED
wordview - A Python package for Exploratory Data Analysis (EDA) for text-based data.
tinysleepnet - TinySleepNet: An Efficient Deep Learning Model for Sleep Stage Scoring based on Raw Single-Channel EEG by Akara Supratak and Yike Guo from The Faculty of ICT, Mahidol University and Imperial College London respectively
torchextractor - Feature extraction made simple with torchextractor
easyesn - Python library for Reservoir Computing using Echo State Networks
obsei - Obsei is a low code AI powered automation tool. It can be used in various business flows like social listening, AI based alerting, brand image analysis, comparative study and more.
strelka - Real-time, container-based file scanning at enterprise scale
mljar-supervised - Python package for AutoML on Tabular Data with Feature Engineering, Hyper-Parameters Tuning, Explanations and Automatic Documentation
MDML - Malware Detection using Machine Learning (MDML)
orange - 🍊 📊 💡 Orange: Interactive data analysis