Comparative Analysis of Efficient Adapter-Based Fine-Tuning of State-of-the-Art Transformer Models

In this work, we investigate the efficacy of various adapter architectures on supervised binary classification tasks from the SuperGLUE benchmark as well as a supervised multi-class news category classification task from Kaggle. Specifically, we compare classification performance and time complexity of three transformer models, namely DistilBERT, ELECTRA, and BART, using conventional fine-tuning as well as nine state-of-the-art (SoTA) adapter architectures. Our analysis reveals performance differences across adapter architectures, highlighting their ability to achieve comparable or better performance relative to fine-tuning at a fraction of the training time. Similar results are observed on the news category classification task, further supporting our findings and demonstrating adapters as efficient and flexible alternatives to fine-tuning. This study provides insights and guidelines for selecting and implementing adapters in diverse natural language processing (NLP) applications.

Publication:

arXiv e-prints

Pub Date:

January 2025

arXiv:

arXiv:2501.08271

Bibcode:

2025arXiv250108271M

Keywords:

Computer Science - Computation and Language;
Computer Science - Artificial Intelligence
