OCHADAI at SemEval-2022 Task 2: Adversarial Training for Multilingual Idiomaticity Detection
Abstract
We propose a multilingual adversarial training model for determining whether a sentence contains an idiomatic expression. Given that a key challenge with this task is the limited size of annotated data, our model relies on pre-trained contextual representations from different multi-lingual state-of-the-art transformer-based language models (i.e., multilingual BERT and XLM-RoBERTa), and on adversarial training, a training method for further enhancing model generalization and robustness. Without relying on any human-crafted features, knowledge bases, or additional datasets other than the target datasets, our model achieved competitive results and ranked 6th place in SubTask A (zero-shot) setting and 15th place in SubTask A (one-shot) setting.
- Publication:
-
arXiv e-prints
- Pub Date:
- June 2022
- DOI:
- 10.48550/arXiv.2206.03025
- arXiv:
- arXiv:2206.03025
- Bibcode:
- 2022arXiv220603025K
- Keywords:
-
- Computer Science - Computation and Language
- E-Print:
- arXiv admin note: substantial text overlap with arXiv:2105.05535