GAEDM: Genetic Algorithm-Enhanced Static Analysis for Detection of API Hashing Obfuscation in Malware

Abstract

Malware authors increasingly exploit API hashing to create "invisible" system calls, replacing explicit function names with dynamically computed hashes that evade detection systems. This obfuscation technique poses three challenges: accurately identifying hash functions within obfuscated code, linking computed hashes to their corresponding API calls, and detecting the growing diversity of hash algorithm variants. Existing rule-based approaches fail against these adaptive threats and cannot identify modern hash variants. We propose GAEDM, a framework that combines deep learning with program analysis to address these challenges. Its key innovation is the integration of static taint analysis with a genetic algorithm-enhanced assembly language model that generates diverse training variants, enabling robust detection of previously unseen obfuscation patterns. In our experimental evaluation, GAEDM achieves 91.9% MRR and 94.6% Recall@k in hash function identification, improvements of 18.4% and 8.2%, respectively, over state-of-the-art methods. GAEDM also detects sophisticated obfuscation patterns that completely evade existing approaches, helping security analysts uncover threats that were previously undetectable.
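To make the obfuscation concrete, the following minimal sketch shows how an API name can be replaced by a hash constant, and how an analyst might link such a constant back to a name with a precomputed dictionary. It is an illustration only and is not GAEDM's method: the rotate-right-by-13 additive hash is a simplified example of a commonly seen scheme, and `CANDIDATE_APIS`, `build_lookup`, and `resolve` are hypothetical names introduced here.

```python
def ror32(value: int, bits: int) -> int:
    """Rotate a 32-bit value right by `bits` positions."""
    value &= 0xFFFFFFFF
    return ((value >> bits) | (value << (32 - bits))) & 0xFFFFFFFF


def ror13_hash(name: str) -> int:
    """Simplified rotate-right-13 additive hash over an ASCII API name."""
    h = 0
    for byte in name.encode("ascii"):
        h = ror32(h, 13)
        h = (h + byte) & 0xFFFFFFFF
    return h


# Hypothetical candidate list; a real analysis would enumerate DLL exports.
CANDIDATE_APIS = ["LoadLibraryA", "GetProcAddress", "VirtualAlloc", "CreateFileA"]


def build_lookup(names, hash_fn):
    """Map each hash value to the API name that produces it."""
    return {hash_fn(n): n for n in names}


def resolve(constant: int, lookup: dict):
    """Return the API name for a hash constant, or None if it is unknown."""
    return lookup.get(constant)


lookup = build_lookup(CANDIDATE_APIS, ror13_hash)
# The binary would embed only this constant, never the string "CreateFileA".
embedded_constant = ror13_hash("CreateFileA")
print(hex(embedded_constant), "->", resolve(embedded_constant, lookup))
```

A dictionary like this only works when the hash function is already known, which is why identifying the hash function inside obfuscated code, and coping with new variants of it, are listed as separate challenges above.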
