EVALUATING OPEN-SOURCE IMAGE CAPTIONING MODELS WITH MULTIPLE METRICS ON THE IAPR TC-12 DATASET

Authors

  • K.H. NIKOGHOSYAN, National Polytechnic University of Armenia
  • T.B. KHACHATRYAN, National Polytechnic University of Armenia
  • E.A. HARUTYUNYAN, National Polytechnic University of Armenia
  • D.M. GALSTYAN, National Polytechnic University of Armenia

Keywords:

Artificial Intelligence, image captioning, natural language processing, evaluation metrics, computer vision

Abstract

In recent years, the development of image captioning AI models has been a focal point in computer vision and natural language processing (NLP). This paper presents a comparative analysis of several state-of-the-art open-source image captioning models using a range of evaluation metrics: CIDEr-D, BLEU-4, METEOR, ROUGE-L, SPICE, and Wu-Palmer similarity. The models are evaluated on the IAPR TC-12 dataset, a well-established benchmark for assessing visual content understanding. Using multiple evaluation metrics gives a multifaceted view of model performance that covers both the syntactic and the semantic dimensions of the generated captions. The comparative analysis shows that different metrics capture distinct facets of image captioning quality, with each shedding light on specific aspects of model performance.
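As an illustration of why overlap-based and semantic metrics can disagree, the following minimal sketch contrasts a clipped unigram precision (the building block of BLEU-style scores) with the Wu-Palmer formula, 2 * depth(LCS) / (depth(a) + depth(b)). It is not the evaluation code used in the paper: the tiny taxonomy `PARENT`, the helper names, and the example captions are all hypothetical, and a real evaluation would presumably rely on a lexical database such as WordNet rather than a hand-made hierarchy.

```python
from collections import Counter

# Hypothetical child -> parent links of a tiny is-a hierarchy.
# This stands in for a real lexical database and is an assumption.
PARENT = {
    "animal": "entity",
    "dog": "animal",
    "puppy": "dog",
    "cat": "animal",
    "vehicle": "entity",
    "car": "vehicle",
}


def path_to_root(word):
    """Return [word, parent, ..., root] by following PARENT links."""
    path = [word]
    while path[-1] in PARENT:
        path.append(PARENT[path[-1]])
    return path


def depth(word):
    """Depth counted in nodes, so the root has depth 1."""
    return len(path_to_root(word))


def wu_palmer(a, b):
    """Wu-Palmer similarity on the toy taxonomy.

    Assumes both words belong to the taxonomy and share its root.
    """
    ancestors_b = set(path_to_root(b))
    # Walking up from a, the first node shared with b is the
    # least common subsumer (LCS).
    lcs = next(node for node in path_to_root(a) if node in ancestors_b)
    return 2 * depth(lcs) / (depth(a) + depth(b))


def unigram_precision(candidate, reference):
    """Share of candidate tokens found in the reference, with clipped counts."""
    cand_tokens = candidate.lower().split()
    if not cand_tokens:
        return 0.0
    ref_counts = Counter(reference.lower().split())
    matched = sum(
        min(count, ref_counts[token])
        for token, count in Counter(cand_tokens).items()
    )
    return matched / len(cand_tokens)


reference = "a dog runs on the beach"
candidate = "a puppy runs on the beach"

overlap = unigram_precision(candidate, reference)
semantic = wu_palmer("puppy", "dog")
unrelated = wu_palmer("dog", "car")
```

With these toy inputs, the overlap score treats "puppy" as a plain mismatch (5 of 6 tokens match), while the Wu-Palmer score for "puppy" versus "dog" is 6/7, far above the 1/3 obtained for the unrelated pair "dog" and "car". This is the kind of difference between syntactic and semantic views that a multi-metric comparison is meant to expose.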

In summary, this paper offers a resource for researchers in computer vision and natural language processing. Its assessment of image captioning models with multiple evaluation metrics on the IAPR TC-12 dataset provides a deeper understanding of the current capabilities and limitations of AI-driven approaches for generating descriptive image captions, and it points to directions for future work in this rapidly evolving domain.

Published

25.02.2026

Section

Articles
