Living off the Analyst: Harvesting Features from Yara Rules for Malware Detection
Abstract
A strategy used by malicious actors is to "live off the land," where benign systems and tools already available on a victim's systems are used and repurposed for the malicious actor's intent. In this work, we ask if there is a way for anti-virus developers to similarly re-purpose existing work to improve their malware detection capability. We show that this is plausible via YARA rules, which use human-written signatures to detect specific malware families, functionalities, or other markers of interest. By extracting sub-signatures from publicly available YARA rules, we assembled a set of features that can more effectively discriminate malicious samples from benign ones. Our experiments demonstrate that these features add value beyond traditional features on the EMBER 2018 dataset. Manual analysis of the added sub-signatures shows a power-law behavior in a combination of features that are specific and unique, as well as features that occur often. A prior expectation may be that the features would be limited in being overly specific to unique malware families. This behavior is observed, and is apparently useful in practice. In addition, we also find sub-signatures that are dual-purpose (e.g., detecting virtual machine environments) or broadly generic (e.g., DLL imports).
- Publication:
-
arXiv e-prints
- Pub Date:
- November 2024
- arXiv:
- arXiv:2411.18516
- Bibcode:
- 2024arXiv241118516G
- Keywords:
-
- Computer Science - Cryptography and Security;
- Computer Science - Machine Learning
- E-Print:
- To appear in BigData'24 CyberHunt 2024