TY - GEN
T1 - How to read paintings
T2 - 15th European Conference on Computer Vision, ECCV 2018
AU - Garcia, Noa
AU - Vogiatzis, George
PY - 2019/1/29
Y1 - 2019/1/29
N2 - Automatic art analysis has mostly focused on classifying artworks into different artistic styles. However, understanding an artistic representation involves more complex processes, such as identifying the elements in the scene or recognizing author influences. We present SemArt, a multi-modal dataset for semantic art understanding. SemArt is a collection of fine-art painting images in which each image is associated with a number of attributes and a textual artistic comment, such as those that appear in art catalogues or museum collections. To evaluate semantic art understanding, we envisage the Text2Art challenge, a multi-modal retrieval task where relevant paintings are retrieved according to an artistic text, and vice versa. We also propose several models for encoding visual and textual artistic representations into a common semantic space. Our best approach is able to find the correct image within the top 10 ranked images in 45.5% of the test samples. Moreover, our models show remarkable levels of art understanding when compared against human evaluation.
KW - Art analysis
KW - Image-text retrieval
KW - Multi-modal retrieval
KW - Semantic art understanding
UR - http://www.scopus.com/inward/record.url?scp=85061817245&partnerID=8YFLogxK
UR - https://link.springer.com/chapter/10.1007%2F978-3-030-11012-3_52
U2 - 10.1007/978-3-030-11012-3_52
DO - 10.1007/978-3-030-11012-3_52
M3 - Conference publication
AN - SCOPUS:85061817245
SN - 9783030110116
VL - 11130
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 676
EP - 691
BT - Computer Vision – ECCV 2018 Workshops, Proceedings
A2 - Roth, Stefan
A2 - Leal-Taixé, Laura
PB - Springer
Y2 - 8 September 2018 through 14 September 2018
ER -