MangaUB

MangaUB is a benchmark for evaluating the manga comprehension abilities of Large Multimodal Models (Vision-Language Models). Built on Manga109 and its derivative datasets, it includes manga understanding tasks that vary in difficulty and content. MangaUB is designed to assess both the recognition and understanding of content within a single panel and the comprehension of information conveyed across multiple panels, enabling a fine-grained analysis of the capabilities required for manga understanding.

Contents:

Tasks:

Links: