How to read paintings

Semantic art understanding with multi-modal retrieval

Noa Garcia, George Vogiatzis

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Automatic art analysis has mostly focused on classifying artworks into different artistic styles. However, understanding an artistic representation involves more complex processes, such as identifying the elements in the scene or recognizing author influences. We present SemArt, a multi-modal dataset for semantic art understanding. SemArt is a collection of fine-art painting images in which each image is associated with a number of attributes and a textual artistic comment, such as those that appear in art catalogues or museum collections. To evaluate semantic art understanding, we envisage the Text2Art challenge, a multi-modal retrieval task where relevant paintings are retrieved according to an artistic text, and vice versa. We also propose several models for encoding visual and textual artistic representations into a common semantic space. Our best approach finds the correct image within the top 10 ranked images for 45.5% of the test samples. Moreover, our models show remarkable levels of art understanding when compared against human evaluation.
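The top-10 figure quoted in the abstract is a recall-at-K style measure. The following is a minimal sketch of how such a score could be computed, not the authors' evaluation code. The function name `recall_at_k`, the toy similarity matrix, and the convention that text query i is paired with image i are assumptions made for illustration only.

```python
def recall_at_k(similarity, k):
    """Fraction of text queries whose paired image ranks within the top k.

    similarity[i][j] is a score between text query i and image j.
    Assumption: the correct image for query i is image i.
    """
    hits = 0
    for i, row in enumerate(similarity):
        # Rank of the correct image: one plus the number of images
        # that score strictly higher than it for this query.
        rank = 1 + sum(1 for score in row if score > row[i])
        if rank <= k:
            hits += 1
    return hits / len(similarity)


# Toy scores for three text queries against three images (made-up values).
toy_similarity = [
    [0.9, 0.1, 0.2],  # correct image ranked 1st
    [0.8, 0.3, 0.5],  # correct image ranked 3rd
    [0.1, 0.2, 0.7],  # correct image ranked 1st
]

print(recall_at_k(toy_similarity, 1))
print(recall_at_k(toy_similarity, 3))
```

With real data, the scores would come from comparing text and image embeddings in the common semantic space, for example by cosine similarity; that choice is likewise an assumption here.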

Original language: English
Title of host publication: Computer Vision – ECCV 2018 Workshops, Proceedings
Editors: Stefan Roth, Laura Leal-Taixé
Publisher: Springer
Pages: 676-691
Number of pages: 16
Volume: 11130
ISBN (Electronic): 978-3-030-11012-3
ISBN (Print): 9783030110116
DOIs: https://doi.org/10.1007/978-3-030-11012-3_52
Publication status: Published - 29 Jan 2019
Event: 15th European Conference on Computer Vision, ECCV 2018 - Munich, Germany
Duration: 8 Sep 2018 to 14 Sep 2018

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 11130 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 15th European Conference on Computer Vision, ECCV 2018
Country: Germany
City: Munich
Period: 8/09/18 to 14/09/18

Fingerprint

Painting
Retrieval
Semantics
Museums
Image Inpainting
Art
Encoding
Attribute
Evaluate
Evaluation
Model

Keywords

  • Art analysis
  • Image-text retrieval
  • Multi-modal retrieval
  • Semantic art understanding

Cite this

Garcia, N., & Vogiatzis, G. (2019). How to read paintings: Semantic art understanding with multi-modal retrieval. In S. Roth, & L. Leal-Taixé (Eds.), Computer Vision – ECCV 2018 Workshops, Proceedings (Vol. 11130, pp. 676-691). (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 11130 LNCS). Springer. https://doi.org/10.1007/978-3-030-11012-3_52
@inproceedings{cfc9c29bd0bb41b1a65d3fc25cf60b4f,
title = "How to read paintings: Semantic art understanding with multi-modal retrieval",
keywords = "Art analysis, Image-text retrieval, Multi-modal retrieval, Semantic art understanding",
author = "Noa Garcia and George Vogiatzis",
year = "2019",
month = "1",
day = "29",
doi = "10.1007/978-3-030-11012-3_52",
language = "English",
isbn = "9783030110116",
volume = "11130",
series = "Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)",
publisher = "Springer",
pages = "676--691",
editor = "Stefan Roth and Laura Leal-Taix{\'e}",
booktitle = "Computer Vision – ECCV 2018 Workshops, Proceedings",
address = "Germany",

}

TY - GEN

T1 - How to read paintings

T2 - Semantic art understanding with multi-modal retrieval

AU - Garcia, Noa

AU - Vogiatzis, George

PY - 2019/1/29

Y1 - 2019/1/29

KW - Art analysis

KW - Image-text retrieval

KW - Multi-modal retrieval

KW - Semantic art understanding

UR - http://www.scopus.com/inward/record.url?scp=85061817245&partnerID=8YFLogxK

UR - https://link.springer.com/chapter/10.1007%2F978-3-030-11012-3_52

U2 - 10.1007/978-3-030-11012-3_52

DO - 10.1007/978-3-030-11012-3_52

M3 - Conference contribution

SN - 9783030110116

VL - 11130

T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

SP - 676

EP - 691

BT - Computer Vision – ECCV 2018 Workshops, Proceedings

A2 - Roth, Stefan

A2 - Leal-Taixé, Laura

PB - Springer

ER -
