USENIX Security '21 - Reducing Test Cases with Attention Mechanism of Neural Networks
Xing Zhang, Jiongyi Chen, Chao Feng, Ruilin Li, Yunfei Su, Bin Zhang, Jing Lei, and Chaojing Tang, National University of Defense Technology
As fuzzing techniques become more effective at triggering program crashes, triaging those crashes with less human effort has become increasingly important. To this end, test case reduction, which reduces a crashing input to its minimal form, plays an important role, especially when analyzing programs with random, complex, or large inputs. However, existing solutions rely on random algorithms or pre-defined rules, which are inaccurate and error-prone in many cases because of the implementation variance in program internals.
In this paper, we present SCREAM, a new approach that leverages neural networks to reduce test cases. In particular, by feeding the network with a program's crashing and non-crashing inputs, the network learns to approximate the computation from the program entry point to the crash point and implicitly identifies the input bytes that are significant to the crash. Because the trained network's parameters are opaque, we leverage the attention mechanism to explain the network, namely to extract the significance of each input byte to the crash. Finally, the significant input bytes are reassembled into the failure-inducing input.
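The final step described above, turning per-byte significance into a reduced input, can be illustrated with a minimal sketch. It is not SCREAM's implementation: the raw scores are assumed to come from some already-trained attention layer, and the names `softmax`, `reduce_input`, and the `keep` parameter are hypothetical and chosen only for illustration.

```python
import math

def softmax(scores):
    """Normalize raw attention scores into weights that sum to 1."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def reduce_input(data: bytes, scores, keep: int):
    """Keep the `keep` bytes with the highest attention weight.

    `scores` holds one raw score per input byte and is assumed to be
    produced by a trained model (not shown here). The selected bytes
    are reassembled in their original order.
    """
    if len(scores) != len(data):
        raise ValueError("expected one score per input byte")
    weights = softmax(scores)
    ranked = sorted(range(len(data)), key=lambda i: weights[i], reverse=True)
    kept_positions = sorted(ranked[:keep])
    reduced = bytes(data[i] for i in kept_positions)
    return reduced, kept_positions

# Toy example: the two bytes at offsets 4 and 5 receive high scores.
crashing_input = b"AAAA\xff\xfeBBBB"
toy_scores = [0.0] * 4 + [5.0, 5.0] + [0.0] * 4
reduced, positions = reduce_input(crashing_input, toy_scores, keep=2)
```

In practice a reduced input would still need to be replayed against the target program to confirm that it triggers the same crash; that check is omitted from this sketch.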
The cost of our approach lies in designing a proper dataset augmentation algorithm and a suitable network structure. To this end, we develop a unique dataset augmentation technique that can generate adequate and highly differentiable samples and expand the search space of crashing inputs. Highlights of our research also include a novel network structure that can capture the dependence among input blocks in long sequences.
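As a rough illustration of what mutation-based dataset augmentation can look like, the sketch below overwrites one random block of a crashing input per sample and labels the result with a crash oracle. This is a generic scheme under stated assumptions, not the augmentation algorithm from the paper: `toy_crashes` is a hypothetical stand-in for running the real target program, and `augment`, `block`, and `rng_seed` are illustrative names.

```python
import random

def toy_crashes(data: bytes) -> bool:
    # Hypothetical oracle: stands in for executing the target program
    # and observing whether it crashes on `data`.
    return data[4:6] == b"\xff\xfe"

def augment(seed_input: bytes, oracle, n_samples=100, block=2, rng_seed=0):
    """Generate labeled samples by mutating one random block per sample.

    Returns a list of (sample, label) pairs, where label is True when
    the oracle reports a crash for that sample.
    """
    rng = random.Random(rng_seed)  # fixed seed keeps the sketch reproducible
    samples = []
    for _ in range(n_samples):
        buf = bytearray(seed_input)
        start = rng.randrange(0, len(buf) - block + 1)
        for i in range(start, start + block):
            buf[i] = rng.randrange(256)  # overwrite with a random byte
        sample = bytes(buf)
        samples.append((sample, oracle(sample)))
    return samples

dataset = augment(b"AAAA\xff\xfeBBBB", toy_crashes, n_samples=50)
```

Mutations that leave the crash-relevant bytes intact keep the crashing label, while mutations that touch them usually flip it, which is what gives a classifier a signal about which positions matter.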
We evaluated SCREAM on 41 representative programs. The results show that SCREAM outperforms state-of-the-art solutions in both accuracy and efficiency. This improvement is made possible by the network's capability to summarize the significance of input bytes from multiple rounds of mutation, which tolerates the perturbations that occur in the random reduction of a single crashing input.
View the full USENIX Security '21 Program at https://www.usenix.org/conference/usenixsecurity21/technical-sessions