Deep Semi-Supervised Anomaly Detection for Finding Fraud in the Futures Market
Abstract
Modern financial electronic exchanges are an exciting and fast-paced marketplace where billions of dollars change hands every day. They are also rife with manipulation and fraud. Detecting such activity is a major undertaking, which has historically been a job reserved exclusively for humans. Recently, more research and resources have been focused on automating these processes via machine learning and artificial intelligence. Fraud detection is overwhelmingly associated with the greater field of anomaly detection, which is usually performed via unsupervised learning techniques because of the lack of labeled data needed for supervised learning. However, a small quantity of labeled data does often exist. This research article aims to evaluate the efficacy of a deep semi-supervised anomaly detection technique, called Deep SAD, for detecting fraud in high-frequency financial data. We use exclusive proprietary limit order book data from the TMX exchange in Montréal, with a small set of true labeled instances of fraud, to evaluate Deep SAD against its unsupervised predecessor. We show that incorporating a small amount of labeled data into an unsupervised anomaly detection framework can greatly improve its accuracy.
- Publication: arXiv e-prints
- Pub Date: August 2023
- DOI: 10.48550/arXiv.2309.00088
- arXiv: arXiv:2309.00088
- Bibcode: 2023arXiv230900088D
- Keywords: Computer Science - Machine Learning; Quantitative Finance - Risk Management; 91G99
- E-Print: 8 pages, 3 figures