AceGPT, Localizing Large Language Models in Arabic

doi:10.48550/arXiv.2309.12053

This paper is devoted to the development of a localized Large Language Model (LLM) specifically for Arabic, a language with unique cultural characteristics that current mainstream models address inadequately. Significant concerns emerge around cultural sensitivity and local values. To address these concerns, the paper proposes a comprehensive solution that includes further pre-training with Arabic texts, Supervised Fine-Tuning (SFT) using native Arabic instructions paired with GPT-4 responses in Arabic, and Reinforcement Learning with AI Feedback (RLAIF) employing a reward model attuned to local culture and values. The goal is to cultivate culturally cognizant and value-aligned Arabic LLMs capable of accommodating the diverse, application-specific needs of Arabic-speaking communities. Comprehensive evaluations reveal that the resulting model, dubbed "AceGPT", sets the state-of-the-art standard for open Arabic LLMs across various benchmarks. Code, data, and models are available at https://github.com/FreedomIntelligence/AceGPT.

Publication: arXiv e-prints

Pub Date: September 2023

DOI: 10.48550/arXiv.2309.12053

arXiv: arXiv:2309.12053

Bibcode: 2023arXiv230912053H

Keywords: Computer Science - Computation and Language

E-Print: Accepted to NAACL main conference. https://github.com/FreedomIntelligence/AceGPT
