Assessing LLMs Suitability for Knowledge Graph Completion

doi:10.48550/arXiv.2405.17249

Recent work has shown that Large Language Models (LLMs) can solve tasks related to Knowledge Graphs, such as Knowledge Graph Completion, even in Zero- or Few-Shot paradigms. However, they are known to hallucinate answers or to produce output non-deterministically, which leads to wrongly reasoned responses even when those responses satisfy the user's demands. To highlight opportunities and challenges in knowledge graph-related tasks, we experiment with three distinguished LLMs, namely Mixtral-8x7b-Instruct-v0.1, GPT-3.5-Turbo-0125 and GPT-4o, on Knowledge Graph Completion for static knowledge graphs, using prompts constructed following the TELeR taxonomy, in Zero- and One-Shot contexts, on a Task-Oriented Dialogue system use case. When evaluated with both strict and flexible metrics, our results show that LLMs could be fit for such a task if prompts encapsulate sufficient information and relevant examples.
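
The abstract mentions evaluating with both strict and flexible metrics but does not define them here. As a rough illustration only, the sketch below assumes that "strict" means exact string equality of predicted and gold triples, and that "flexible" means equality after lowercasing and removing punctuation and extra whitespace. The function names, the normalization rule, and the example triples are all hypothetical and are not taken from the paper.

```python
import string
from typing import Dict, Iterable, Tuple

Triple = Tuple[str, str, str]  # (subject, relation, object)


def normalize(text: str) -> str:
    """Lowercase, drop ASCII punctuation, collapse whitespace (assumed 'flexible' rule)."""
    cleaned = text.lower().translate(str.maketrans("", "", string.punctuation))
    return " ".join(cleaned.split())


def score(predicted: Iterable[Triple], gold: Iterable[Triple],
          flexible: bool = False) -> Dict[str, float]:
    """Precision, recall and F1 over sets of triples.

    flexible=False compares triples verbatim; flexible=True compares them
    after normalize(). Both definitions are assumptions made for this sketch.
    """
    if flexible:
        pred_set = {tuple(normalize(part) for part in t) for t in predicted}
        gold_set = {tuple(normalize(part) for part in t) for t in gold}
    else:
        pred_set, gold_set = set(predicted), set(gold)

    hits = len(pred_set & gold_set)
    precision = hits / len(pred_set) if pred_set else 0.0
    recall = hits / len(gold_set) if gold_set else 0.0
    f1 = 2 * precision * recall / (precision + recall) if hits else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}


# Hypothetical triples in the spirit of a task-oriented dialogue use case.
gold = [
    ("user", "wants", "Italian food"),
    ("restaurant", "located_in", "city centre"),
]
predicted = [
    ("User", "wants", "italian food."),           # differs only in case and punctuation
    ("restaurant", "located_in", "city centre"),  # verbatim match
]

strict_scores = score(predicted, gold)
flexible_scores = score(predicted, gold, flexible=True)
print(strict_scores)
print(flexible_scores)
```

Under these assumptions, the first predicted triple counts as a miss in the strict setting and as a hit in the flexible one, which is the kind of gap a two-mode evaluation is meant to expose.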

Publication:

arXiv e-prints

Pub Date:

May 2024

DOI:

10.48550/arXiv.2405.17249

arXiv:

arXiv:2405.17249

Bibcode:

2024arXiv240517249I

Keywords:

Computer Science - Computation and Language;
Computer Science - Artificial Intelligence

E-Print:

Accepted at the 18th International Conference on Neural-Symbolic Learning and Reasoning, NESY 2024. Evaluating Mixtral-8x7b-Instruct-v0.1, GPT-3.5-Turbo-0125 and GPT-4o on the Knowledge Graph Completion task with prompts formatted according to the TELeR taxonomy
