Constructing Multilingual Visual-Text Datasets Revealing Visual Multilingual Ability of Vision Language ModelsPublished in arXiv, 2024Share on Twitter Facebook LinkedIn Previous Next