Bilbo-Val: Automatic Identification of Bibliographical Zone in Papers

Amal Htait; Sébastien Fournier; Patrice Bellot

Research output: Unpublished contribution to conference › Unpublished Conference Paper › peer-review

Abstract

In this paper, we present a method for automatically annotating the bibliographical reference zone in papers and articles in XML/TEI format. Our work proceeds in two phases. First, we use machine learning to classify paragraphs as bibliographical or non-bibliographical, by means of a model that was initially created to distinguish footnotes that contain bibliographical references from those that do not. This model is one of the features of BILBO, an open source software for the automatic annotation of bibliographic references. We also suggest some methods to reduce the margin of error. Second, we propose an algorithm to find the largest list of bibliographical references in the article. The improvements applied to our model increase its efficiency, reaching an accuracy of 85.89. In testing, we achieve an average success rate of 72.23% in detecting the bibliographical reference zone.
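The abstract does not spell out the second-phase algorithm, so the following is only an illustrative sketch of one way to find the largest list of references once each paragraph has a bibliographical / non-bibliographical label. It assumes the classifier output is available as a list of booleans in document order; the function name `largest_reference_zone` and the `max_gap` tolerance parameter are hypothetical and are not taken from the paper or from BILBO.

```python
from typing import List, Optional, Tuple


def largest_reference_zone(
    labels: List[bool], max_gap: int = 0
) -> Optional[Tuple[int, int]]:
    """Return (start, end) inclusive indices of the longest zone of
    paragraphs labelled bibliographical, or None if there is none.

    `max_gap` (an assumption of this sketch) is the number of consecutive
    non-bibliographical paragraphs tolerated inside a zone, to absorb
    occasional classifier errors.
    """
    best: Optional[Tuple[int, int]] = None
    start: Optional[int] = None  # first paragraph of the current zone
    gap = 0                      # consecutive non-bibliographical paragraphs seen

    for i, is_bib in enumerate(labels):
        if is_bib:
            if start is None:
                start = i
            gap = 0
            # A zone always ends on a bibliographical paragraph.
            if best is None or i - start > best[1] - best[0]:
                best = (start, i)
        elif start is not None:
            gap += 1
            if gap > max_gap:
                # Too many non-bibliographical paragraphs: close the zone.
                start = None
                gap = 0
    return best


labels = [False, True, False, False, True, True, True, False, True]
strict_zone = largest_reference_zone(labels)               # no tolerance
tolerant_zone = largest_reference_zone(labels, max_gap=1)  # one error tolerated
```

With no tolerance the sketch keeps only the longest unbroken run of bibliographical paragraphs; allowing a small gap lets a single misclassified paragraph inside a reference list be absorbed into the zone, at the cost of possibly merging two nearby lists.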

Original language	English
Number of pages	5
Publication status	Published - May 2016

Access to Document

Htailetal_2016__VoR
licensed on a Creative Commons Attribution 4.0 International License.
Final published version, 735 KB
Licence: CC BY 4.0

https://hal.archives-ouvertes.fr/hal-01771689/document
Licence: CC BY 4.0

Cite this

@conference{58718a0a7d98427b8ad829685a9cb730,

title = "Bilbo-Val: Automatic Identification of Bibliographical Zone in Papers",

abstract = "In this paper, we present the automatic annotation of bibliographical references' zone in papers and articles of XML/TEI format. Our work is applied through two phases: first, we use machine learning technology to classify bibliographical and non-bibliographical paragraphs in papers, by means of a model that was initially created to differentiate between the footnotes containing or not containing bibliographical references. The previous description is one of BILBO's features, which is an open source software for automatic annotation of bibliographic reference. Also, we suggest some methods to minimize the margin of error. Second, we propose an algorithm to find the largest list of bibliographical references in the article. The improvement applied on our model results an increase in the model's efficiency with an Accuracy equal to 85.89. And by testing our work, we are able to achieve 72.23% as an average for the percentage of success in detecting bibliographical references' zone.",

author = "Amal Htait and S{\'e}bastien Fournier and Patrice Bellot",

year = "2016",

month = may,

language = "English",

}

TY - CONF

T1 - Bilbo-Val: Automatic Identification of Bibliographical Zone in Papers

AU - Htait, Amal

AU - Fournier, Sébastien

AU - Bellot, Patrice

PY - 2016/5

Y1 - 2016/5

N2 - In this paper, we present the automatic annotation of bibliographical references' zone in papers and articles of XML/TEI format. Our work is applied through two phases: first, we use machine learning technology to classify bibliographical and non-bibliographical paragraphs in papers, by means of a model that was initially created to differentiate between the footnotes containing or not containing bibliographical references. The previous description is one of BILBO's features, which is an open source software for automatic annotation of bibliographic reference. Also, we suggest some methods to minimize the margin of error. Second, we propose an algorithm to find the largest list of bibliographical references in the article. The improvement applied on our model results an increase in the model's efficiency with an Accuracy equal to 85.89. And by testing our work, we are able to achieve 72.23% as an average for the percentage of success in detecting bibliographical references' zone.

AB - In this paper, we present the automatic annotation of bibliographical references' zone in papers and articles of XML/TEI format. Our work is applied through two phases: first, we use machine learning technology to classify bibliographical and non-bibliographical paragraphs in papers, by means of a model that was initially created to differentiate between the footnotes containing or not containing bibliographical references. The previous description is one of BILBO's features, which is an open source software for automatic annotation of bibliographic reference. Also, we suggest some methods to minimize the margin of error. Second, we propose an algorithm to find the largest list of bibliographical references in the article. The improvement applied on our model results an increase in the model's efficiency with an Accuracy equal to 85.89. And by testing our work, we are able to achieve 72.23% as an average for the percentage of success in detecting bibliographical references' zone.

UR - https://hal.archives-ouvertes.fr/hal-01771689

UR - https://aclanthology.org/L16-1576/

M3 - Unpublished Conference Paper

ER -
