Crosslingual Retrieval Augmented In-context Learning for Bangla
Abstract
The promise of Large Language Models (LLMs) in Natural Language Processing has often been overshadowed by their limited performance in low-resource languages such as Bangla. To address this, our paper presents a pioneering approach that utilizes cross-lingual retrieval augmented in-context learning. By strategically sourcing semantically similar prompts from a high-resource language, we enable multilingual pretrained language models (MPLMs), especially the generative model BLOOMZ, to successfully boost performance on Bangla tasks. Our extensive evaluation highlights that cross-lingual retrieval augmented prompts bring steady improvements to MPLMs over their zero-shot performance.
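The retrieval step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the sentiment task, the prompt template, the function names (`cosine`, `retrieve`, `build_prompt`), and the three-dimensional vectors are all hypothetical. In practice the vectors would come from a multilingual sentence encoder and the finished prompt would be passed to an MPLM such as BLOOMZ.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def retrieve(query_vec, pool, k=1):
    """Return the k pool examples most similar to the query vector."""
    ranked = sorted(pool, key=lambda ex: cosine(query_vec, ex["vec"]), reverse=True)
    return ranked[:k]


def build_prompt(query_text, retrieved):
    """Prepend retrieved labeled examples to the unlabeled Bangla query."""
    parts = [f"Review: {ex['text']}\nSentiment: {ex['label']}" for ex in retrieved]
    parts.append(f"Review: {query_text}\nSentiment:")
    return "\n\n".join(parts)


# Hypothetical high-resource (English) pool with made-up toy embeddings.
# Real embeddings would come from a multilingual encoder (assumption).
POOL = [
    {"text": "The food was wonderful.", "label": "positive", "vec": [0.9, 0.1, 0.0]},
    {"text": "The service was terrible.", "label": "negative", "vec": [0.1, 0.9, 0.0]},
]

# Bangla query ("The food was great.") with a made-up toy embedding.
query_text = "খাবারটা দারুণ ছিল।"
query_vec = [0.8, 0.2, 0.1]

prompt = build_prompt(query_text, retrieve(query_vec, POOL, k=1))
print(prompt)
```

The zero-shot baseline mentioned in the abstract corresponds to calling `build_prompt` with an empty list of retrieved examples, so the model sees only the Bangla input.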
- Publication: arXiv e-prints
- Pub Date: November 2023
- DOI: 10.48550/arXiv.2311.00587
- arXiv: arXiv:2311.00587
- Bibcode: 2023arXiv231100587L
- Keywords: Computer Science - Computation and Language
- E-Print: In The 1st Bangla Language Processing (BLP) Workshop, held in conjunction with The Conference on Empirical Methods in Natural Language Processing (EMNLP), December 2023