Compiler Provenance Recovery for Multi-CPU Architectures Using a Centrifuge Mechanism
Abstract
Bit-stream recognition (BSR) has many applications, such as forensic investigations, detection of copyright infringement, and malware analysis. We propose the first BSR method that takes a bare input bit-stream and outputs a class label without any preprocessing. To achieve this, we introduce a centrifuge mechanism, in which the upstream layers (sub-net) capture global features and tell the downstream layers (main-net) to switch focus, even when part of the input bit-stream has the same value. We applied the centrifuge mechanism to compiler provenance recovery, a type of BSR, and achieved excellent classification performance. Additionally, downstream transfer learning (DTL), one of the learning methods we propose for the centrifuge mechanism, pre-trains the main-net using the sub-net's ground truth instead of the sub-net's output. We found that sub-predictions made by DTL tend to be highly accurate when the sub-label classification contributes to the essence of the main prediction.
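The data flow described in the abstract can be illustrated with a minimal sketch. It is not the authors' architecture: the single linear layers, the random untrained weights, the concatenation used to pass the sub-net signal to the main-net, and all names (`sub_net`, `main_net`, `forward`, `true_sub_label`) are assumptions made for illustration. It only shows the one distinction the abstract states: in normal operation the main-net is conditioned on the sub-net's output, while in DTL pre-training it is conditioned on the sub-net's ground truth.

```python
import math
import random


def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def one_hot(index, size):
    """Encode a ground-truth label as a one-hot vector."""
    return [1.0 if i == index else 0.0 for i in range(size)]


def linear(weights, inputs):
    """Apply one weight row per output unit (no bias, for brevity)."""
    return [sum(w * x for w, x in zip(row, inputs)) for row in weights]


def bytes_to_bits(data):
    """Expand raw bytes into a bare bit-stream, most significant bit first."""
    return [float((byte >> shift) & 1) for byte in data for shift in range(7, -1, -1)]


def sub_net(bits, sub_weights):
    """Upstream layers: predict a sub-label from the whole bit-stream."""
    return softmax(linear(sub_weights, bits))


def main_net(bits, sub_signal, main_weights):
    """Downstream layers: predict the main label from the bits plus the sub signal.

    Concatenation is an assumed conditioning scheme, chosen for simplicity.
    """
    return softmax(linear(main_weights, bits + sub_signal))


def forward(bits, sub_weights, main_weights, true_sub_label=None):
    """Run both nets.

    If true_sub_label is given (DTL-style pre-training), the main-net sees the
    ground-truth sub-label instead of the sub-net's own prediction.
    """
    sub_pred = sub_net(bits, sub_weights)
    if true_sub_label is None:
        signal = sub_pred
    else:
        signal = one_hot(true_sub_label, len(sub_pred))
    return sub_pred, main_net(bits, signal, main_weights)


# Hypothetical sizes: 3 sub-labels and 4 main labels.
N_SUB, N_MAIN = 3, 4
rng = random.Random(0)
bits = bytes_to_bits(b"\x7fELF")  # 4 bytes -> 32 bits, no preprocessing
sub_weights = [[rng.uniform(-1, 1) for _ in bits] for _ in range(N_SUB)]
main_weights = [
    [rng.uniform(-1, 1) for _ in range(len(bits) + N_SUB)] for _ in range(N_MAIN)
]

# Normal operation: main-net conditioned on the sub-net's output.
sub_pred, main_from_sub = forward(bits, sub_weights, main_weights)
# DTL-style pre-training step: main-net conditioned on the ground truth.
sub_pred_dtl, main_from_truth = forward(bits, sub_weights, main_weights, true_sub_label=1)
```

In this sketch the sub-prediction is identical in both modes; only the signal handed to the main-net changes. A real implementation would train both nets by gradient descent, which is omitted here.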
- Publication: IEEE Access
- Pub Date: 2024
- DOI: 10.1109/ACCESS.2024.3371499
- arXiv: arXiv:2211.13110
- Bibcode: 2024IEEEA..1234477O
- Keywords: Computer Science - Machine Learning; Computer Science - Cryptography and Security
- E-Print: 8 pages, 4 figures, 5 tables