We gratefully acknowledge support from
the Simons Foundation and member institutions.
Full-text links:

Download:

Current browse context:

cs.CR

Change to browse by:

References & Citations

DBLP - CS Bibliography

Bookmark

(what is this?)
CiteULike logo BibSonomy logo Mendeley logo del.icio.us logo Digg logo Reddit logo

Computer Science > Cryptography and Security

Title: HAPSSA: Holistic Approach to PDF Malware Detection Using Signal and Statistical Analysis

Abstract: Malicious PDF documents present a serious threat to various security organizations that require modern threat intelligence platforms to effectively analyze and characterize the identity and behavior of PDF malware. State-of-the-art approaches use machine learning (ML) to learn features that characterize PDF malware. However, ML models are often susceptible to evasion attacks, in which an adversary obfuscates the malware code to avoid being detected by an Antivirus. In this paper, we derive a simple yet effective holistic approach to PDF malware detection that leverages signal and statistical analysis of malware binaries. This includes combining orthogonal feature space models from various static and dynamic malware detection methods to enable generalized robustness when faced with code obfuscations. Using a dataset of nearly 30,000 PDF files containing both malware and benign samples, we show that our holistic approach maintains a high detection rate (99.92%) of PDF malware and even detects new malicious files created by simple methods that remove the obfuscation conducted by malware authors to hide their malware, which are undetected by most antiviruses.
Comments: Submitted version - MILCOM 2021 IEEE Military Communications Conference
Subjects: Cryptography and Security (cs.CR); Machine Learning (cs.LG); Signal Processing (eess.SP)
Cite as: arXiv:2111.04703 [cs.CR]
  (or arXiv:2111.04703v1 [cs.CR] for this version)

Submission history

From: Tajuddin Manhar Mohammed [view email]
[v1] Mon, 8 Nov 2021 18:32:47 GMT (1263kb,D)

Link back to: arXiv, form interface, contact.