Machine Learning Model Attribution Challenge

doi:10.48550/arXiv.2302.06716

We present the findings of the Machine Learning Model Attribution Challenge. Fine-tuned machine learning models may derive from other trained models without obvious attribution characteristics. In this challenge, participants identify the publicly available base models that underlie a set of anonymous, fine-tuned large language models (LLMs) using only the textual output of the models. Contestants aim to correctly attribute as many of the fine-tuned models as possible, with ties broken in favor of contestants whose solutions use fewer calls to the fine-tuned models' API. The most successful approaches were manual: participants observed similarities between model outputs and developed attribution heuristics based on public documentation of the base models. Several teams also submitted automated, statistical solutions.
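The ranking rule described in the abstract (most correct attributions first, ties going to the contestant with fewer API calls) can be sketched as a two-key sort. This is only an illustration under assumptions: the `Submission` type, its field names, the team labels, and the numbers are all hypothetical and are not taken from the challenge's actual scoring code or results.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Submission:
    """Hypothetical record of one contestant's result (not from the paper)."""
    team: str        # made-up team label
    correct: int     # number of fine-tuned models attributed correctly
    api_calls: int   # total calls made to the fine-tuned models' API


def rank_submissions(submissions):
    """Sort by most correct attributions, then by fewest API calls."""
    # Negating `correct` sorts it in descending order, while `api_calls`
    # stays ascending so that cheaper solutions win ties.
    return sorted(submissions, key=lambda s: (-s.correct, s.api_calls))


# Illustrative, invented entries: team_a and team_b tie on correctness.
entries = [
    Submission("team_a", correct=7, api_calls=1200),
    Submission("team_b", correct=7, api_calls=300),
    Submission("team_c", correct=5, api_calls=50),
]
ranking = [s.team for s in rank_submissions(entries)]
print(ranking)
```

In this made-up example, `team_b` ranks ahead of `team_a` because both attribute the same number of models but `team_b` uses fewer API calls; `team_c` ranks last despite using the fewest calls, since correctness is compared first.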

Publication: arXiv e-prints

Pub Date: February 2023

DOI: 10.48550/arXiv.2302.06716

arXiv: arXiv:2302.06716

Bibcode: 2023arXiv230206716M

Keywords: Computer Science - Machine Learning; Computer Science - Computation and Language; Computer Science - Cryptography and Security
