EVALUATION OF THE EFFECTIVENESS OF LARGE LANGUAGE MODELS IN THE TASK OF KNOWLEDGE GRAPH COMPLETION: METHODS AND PERSPECTIVES
Keywords:
large language models (LLM), knowledge graphs (KG), knowledge graph completion (KGC), in-context learning (ICL), chain-of-thought (COT), Zero-Shot and One-Shot learning, automated knowledge processing

Abstract
The automation of knowledge graph (KG) completion is a key task in artificial intelligence, with applications in dialogue systems, search engines, and analytical platforms. This study evaluates the effectiveness of large language models (LLMs) in knowledge graph completion (KGC) by analyzing three models: GPT-4o, GPT-3.5-Turbo-0125, and Mixtral-8x7b-Instruct-v0.1. The research assesses model performance in Zero-Shot and One-Shot scenarios and examines the impact of different prompting strategies, including in-context learning (ICL) and chain-of-thought (COT) reasoning. Two specialized datasets containing both explicit and implicit entity relationships are used, enabling assessment of the models' reasoning capabilities. The analysis employs a strict evaluation paradigm, which requires an exact match between predicted and reference triples, and a flexible paradigm, which allows partial matches and post-processing adjustments.

The findings demonstrate that LLMs can be effective in KGC tasks, yet their performance depends heavily on prompt quality, the presence of examples, and the precision of output formatting. Detailed prompts without examples do not consistently improve results, and Zero-Shot prompting proves less effective than One-Shot approaches. The GPT models adhere more closely to the given instructions, whereas Mixtral-8x7b tends to include additional explanatory text, making its integration into structured KG systems more challenging.

Despite these advances, LLMs still face limitations in output formatting, recognition of implicit relationships, and sensitivity to prompt formulation. Future research should focus on prompt optimization, refined learning methodologies, and the integration of LLMs into more complex KG systems to improve the accuracy and efficiency of automated knowledge completion.
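To make the prompting setups concrete, the following minimal Python sketch illustrates how a Zero-Shot prompt differs from a One-Shot (in-context learning) prompt that prepends a single worked example. This is an illustration only, not the study's actual prompts: the instruction wording and the example triple are assumptions.

    # Illustrative sketch (assumed wording): Zero-Shot vs. One-Shot prompts for KGC.
    INSTRUCTION = "Complete the knowledge-graph triple. Answer as (head, relation, tail)."

    def zero_shot_prompt(head: str, relation: str) -> str:
        # Zero-Shot: the model receives only the instruction and the query.
        return f"{INSTRUCTION}\nQuery: ({head}, {relation}, ?)"

    def one_shot_prompt(head: str, relation: str) -> str:
        # One-Shot (in-context learning): one solved example precedes the query.
        example = "Query: (Paris, capital_of, ?)\nAnswer: (Paris, capital_of, France)"
        return f"{INSTRUCTION}\n{example}\nQuery: ({head}, {relation}, ?)"

    print(one_shot_prompt("Kyiv", "capital_of"))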
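The two evaluation paradigms can be sketched in the same spirit. The normalization steps and the partial-match threshold below are illustrative assumptions, not the study's exact post-processing rules.

    # Illustrative sketch of the strict vs. flexible evaluation paradigms.
    def normalize(triple):
        # Flexible-paradigm post-processing (assumed): trim whitespace, lowercase.
        return tuple(part.strip().lower() for part in triple)

    def strict_match(predicted, reference):
        # Strict paradigm: the predicted triple must equal the reference exactly.
        return predicted == reference

    def flexible_match(predicted, reference, min_matching_parts=2):
        # Flexible paradigm: after normalization, count a prediction as correct
        # when enough of (head, relation, tail) agree; the threshold is hypothetical.
        pairs = zip(normalize(predicted), normalize(reference))
        return sum(p == r for p, r in pairs) >= min_matching_parts

    ref = ("Kyiv", "capital_of", "Ukraine")
    print(strict_match(("kyiv", "capital_of", "Ukraine"), ref))    # False: case differs
    print(flexible_match(("kyiv", "capital_of", "Ukraine"), ref))  # True after normalization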