ELISE: A Storage Efficient Logging System Powered by Redundancy Reduction and Representation Learning
Hailun Ding, Shenao Yan, Juan Zhai, and Shiqing Ma, Rutgers University
Log is a key enabler of many security applications including but not limited to security auditing and forensic analysis. Due to the rapid growth of modern computing infrastructure size, software systems are generating more and more logs every day. Moreover, the duration of recent cyber attacks like Advanced Persistent Threats (APTs) is becoming longer, and their targets consist of many connected organizations instead of a single one. This requires the analysis on logs from different sources and long time periods. Storing such large sized log files is becoming more important and also challenging than ever. Existing logging systems are either inefficient (i.e., high storage overhead) or designed for limited security applications (i.e., no support for general security analysis). In this paper, we propose ELISE, a storage efficient logging system built on top of a novel lossless data compression technique, which naturally supports all types of security analysis. It features lossless log compression using a novel log file preprocessing and Deep Neural Network (DNN) based method to learn optimal character encoding. On average, ELISE can achieve 3 and 2 times better compression results compared with existing state-of-the-art methods Gzip and DeepZip, respectively, showing a promising future research direction.
View the full USENIX Security '21 Program at https://www.usenix.org/conference/usenixsecurity21/technical-sessions