In the context of forensic casework, are there meaningful metrics of the degree of calibration?

Geoffrey Stewart Morrison

doi:10.1016/j.fsisyn.2021.100157

Research output: Contribution to journal › Article › peer-review

Abstract

Forensic-evaluation systems should output likelihood-ratio values that are well calibrated. If they do not, their output will be misleading. Unless a forensic-evaluation system is intrinsically well-calibrated, it should be calibrated using a parsimonious parametric model that is trained using calibration data. The system should then be tested using validation data. Metrics of degree of calibration that are based on the pool-adjacent-violators (PAV) algorithm recalibrate the likelihood-ratio values calculated from the validation data. The PAV algorithm overfits on the validation data because it is both trained and tested on the validation data, and because it is a non-parametric model with weak constraints. For already-calibrated systems, PAV-based ostensive metrics of degree of calibration do not actually measure degree of calibration; they measure sampling variability between the calibration data and the validation data, and overfitting on the validation data. Monte Carlo simulations are used to demonstrate that this is the case. We therefore argue that, in the context of casework, PAV-based metrics are not meaningful metrics of degree of calibration; however, we also argue that, in the context of casework, a metric of degree of calibration is not required.
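The overfitting argument in the abstract can be illustrated with a small Monte Carlo sketch. This is not the paper's simulation; it is a minimal example under stated assumptions. It assumes the PAV-based metric is the difference between Cllr and the Cllr obtained after PAV recalibration on the same data (often written Cllr^cal; the abstract does not name the metric). It also assumes a synthetic system that is perfectly calibrated by construction: natural-log likelihood ratios drawn from N(+σ²/2, σ²) for same-source pairs and N(−σ²/2, σ²) for different-source pairs. All function names and parameter values are hypothetical.

```python
import numpy as np


def pav(labels_sorted):
    """Isotonic (non-decreasing) fit to 0/1 labels already sorted by score.

    Returns the fitted posterior probability for each item.
    Tied scores are not handled specially (they are negligible for
    continuous simulated scores).
    """
    means, counts = [], []
    for y in labels_sorted:
        means.append(float(y))
        counts.append(1)
        # Pool adjacent blocks while they violate monotonicity.
        while len(means) > 1 and means[-2] > means[-1]:
            n = counts[-2] + counts[-1]
            m = (means[-2] * counts[-2] + means[-1] * counts[-1]) / n
            means[-2:] = [m]
            counts[-2:] = [n]
    return np.repeat(means, counts)


def cllr_from_posterior(p, labels):
    """Cllr computed from posteriors at prior 0.5 (equivalent to using LRs)."""
    same = -np.log2(p[labels == 1]).mean()
    diff = -np.log2(1.0 - p[labels == 0]).mean()
    return 0.5 * (same + diff)


def simulate_once(rng, n_per_class=50, sigma=2.0):
    """One synthetic validation set from a perfectly calibrated system."""
    mu = sigma ** 2 / 2  # assumption: mean = variance / 2 gives calibrated ln LRs
    ln_lr = np.concatenate([
        rng.normal(mu, sigma, n_per_class),   # same-source pairs
        rng.normal(-mu, sigma, n_per_class),  # different-source pairs
    ])
    labels = np.concatenate([
        np.ones(n_per_class, dtype=int),
        np.zeros(n_per_class, dtype=int),
    ])
    order = np.argsort(ln_lr)
    ln_lr, labels = ln_lr[order], labels[order]

    p_system = 1.0 / (1.0 + np.exp(-ln_lr))  # posterior at prior 0.5
    p_pav = pav(labels)                      # PAV trained and tested on the same data

    cllr = cllr_from_posterior(p_system, labels)
    cllr_min = cllr_from_posterior(p_pav, labels)
    return cllr, cllr_min


def run(n_reps=200, seed=0):
    """Return an array of (Cllr - Cllr after PAV) values, one per replication."""
    rng = np.random.default_rng(seed)
    return np.array([np.subtract(*simulate_once(rng)) for _ in range(n_reps)])


if __name__ == "__main__":
    cllr_cal = run()
    print("mean Cllr - Cllr_PAV:", cllr_cal.mean())
    print("min  Cllr - Cllr_PAV:", cllr_cal.min())
```

Because equal numbers of same-source and different-source pairs are used, the PAV fit can never score worse than the system's own output on the data it was fitted to. The difference is therefore non-negative in every replication, even though the simulated system is calibrated by construction. In this sketch a positive value reflects sampling variability and PAV overfitting rather than miscalibration.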

Original language	English
Article number	100157
Journal	Forensic Science International: Synergy
Volume	3
DOIs	https://doi.org/10.1016/j.fsisyn.2021.100157
Publication status	Published - 12 Jun 2021

Bibliographical note

© 2021 The Author. This is an open access article under the CC BY-NC-ND license.

Funding: This research was supported by Research England's Expanding Excellence in England Fund as part of funding for the Aston Institute for Forensic Linguistics 2019–2022.

Keywords

Calibration
Forensic inference and statistics
Likelihood ratio
Metric

Access to Document

10.1016/j.fsisyn.2021.100157
Licence: CC BY-NC-ND 3.0

Cite this

@article{13254996891c46ad8eff47dea7fa6dc2,

title = "In the context of forensic casework, are there meaningful metrics of the degree of calibration?",

abstract = "Forensic-evaluation systems should output likelihood-ratio values that are well calibrated. If they do not, their output will be misleading. Unless a forensic-evaluation system is intrinsically well-calibrated, it should be calibrated using a parsimonious parametric model that is trained using calibration data. The system should then be tested using validation data. Metrics of degree of calibration that are based on the pool-adjacent-violators (PAV) algorithm recalibrate the likelihood-ratio values calculated from the validation data. The PAV algorithm overfits on the validation data because it is both trained and tested on the validation data, and because it is a non-parametric model with weak constraints. For already-calibrated systems, PAV-based ostensive metrics of degree of calibration do not actually measure degree of calibration; they measure sampling variability between the calibration data and the validation data, and overfitting on the validation data. Monte Carlo simulations are used to demonstrate that this is the case. We therefore argue that, in the context of casework, PAV-based metrics are not meaningful metrics of degree of calibration; however, we also argue that, in the context of casework, a metric of degree of calibration is not required.",

keywords = "Calibration, Forensic inference and statistics, Likelihood ratio, Metric",

author = "Morrison, {Geoffrey Stewart}",

note = "{\textcopyright} 2021 The Author. This is an open access article under the CC BY-NC-ND license. Funding: This research was supported by Research England's Expanding Excellence in England Fund as part of funding for the Aston Institute for Forensic Linguistics 2019–2022.",

year = "2021",

month = jun,

day = "12",

doi = "10.1016/j.fsisyn.2021.100157",