The impact in forensic voice comparison of lack of calibration and of mismatched conditions between the known-speaker recording and the relevant-population sample recordings

Geoffrey Stewart Morrison

doi:10.1016/j.forsciint.2017.12.024

Research output: Contribution to journal › Article › peer-review

Abstract

In a 2017 New South Wales case, a forensic practitioner conducted a forensic voice comparison using a Gaussian mixture model – universal background model (GMM-UBM). The practitioner did not report the results of empirical tests of the performance of this system under conditions reflecting those of the case under investigation. The practitioner trained the model for the numerator of the likelihood ratio using the known-speaker recording, but trained the model for the denominator of the likelihood ratio (the UBM) using high-quality audio recordings, not recordings which reflected the conditions of the known-speaker recording. There was therefore a difference in the mismatch between the numerator model and the questioned-speaker recording versus the mismatch between the denominator model and the questioned-speaker recording. In addition, the practitioner did not calibrate the output of the system. The present paper empirically tests the performance of a replication of the practitioner’s system. It also tests a system in which the UBM was trained on known-speaker-condition data and which was empirically calibrated. The performance of the former system was very poor, and the performance of the latter was substantially better.
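The two ideas in the abstract, a score formed from a speaker model (numerator) and a UBM (denominator), and a calibration step that converts scores to likelihood ratios, can be illustrated with a toy sketch. This is not the paper's system: the 1-D features, the function names (`gmm_logpdf`, `score`, `fit_calibration`), and the use of logistic regression fitted by plain gradient descent are all assumptions made for illustration only.

```python
import math

def gmm_logpdf(x, weights, means, sds):
    """Log density of a 1-D Gaussian mixture at x (log-sum-exp for stability)."""
    terms = [
        math.log(w) - 0.5 * math.log(2 * math.pi * s * s) - (x - m) ** 2 / (2 * s * s)
        for w, m, s in zip(weights, means, sds)
    ]
    top = max(terms)
    return top + math.log(sum(math.exp(t - top) for t in terms))

def score(frames, speaker_gmm, ubm):
    """Uncalibrated score: mean per-frame log(speaker model / UBM)."""
    total = sum(gmm_logpdf(x, *speaker_gmm) - gmm_logpdf(x, *ubm) for x in frames)
    return total / len(frames)

def fit_calibration(same_scores, diff_scores, steps=5000, lr=0.05):
    """Fit log LR = a * score + b by logistic regression.

    Each class gets equal total weight, so a * score + b is read as a
    log likelihood ratio rather than a log posterior odds (assumption of
    this sketch).
    """
    a, b = 1.0, 0.0
    for _ in range(steps):
        grad_a = grad_b = 0.0
        for scores, label in ((same_scores, 1.0), (diff_scores, 0.0)):
            for s in scores:
                p = 1.0 / (1.0 + math.exp(-(a * s + b)))
                grad_a += (p - label) * s / len(scores)
                grad_b += (p - label) / len(scores)
        a -= lr * grad_a
        b -= lr * grad_b
    return a, b

# Hypothetical scores from same-speaker and different-speaker test pairs.
same = [2.0, 3.0, 1.0, 4.0, 0.5]
diff = [-1.0, 0.0, 1.5, -2.0, 0.8]
a, b = fit_calibration(same, diff)
calibrated_log_lr = a * 2.5 + b  # natural-log likelihood ratio for a new score of 2.5
```

In this sketch the calibration scores would have to come from test pairs recorded under conditions reflecting those of the case, which is the point the abstract makes about both the UBM training data and the calibration.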

Original language	English
Pages (from-to)	e1-e7
Journal	Forensic Science International
Volume	283
Early online date	19 Dec 2017
DOIs	https://doi.org/10.1016/j.forsciint.2017.12.024
Publication status	Published - 1 Feb 2018

Keywords

Forensic voice comparison
Automatic speaker recognition
GMM-UBM
Likelihood ratio
Validation
Calibration
Admissibility

Access to Document

10.1016/j.forsciint.2017.12.024

The impact in forensic voice comparison of lack of
© 2017, Elsevier. Licensed under the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International.
Accepted author manuscript, 339 KB
Licence: CC BY-NC-ND 3.0

Cite this

@article{8972eff73e224e39b3facc47b8bb325a,

title = "The impact in forensic voice comparison of lack of calibration and of mismatched conditions between the known-speaker recording and the relevant-population sample recordings",

abstract = "In a 2017 New South Wales case, a forensic practitioner conducted a forensic voice comparison using a Gaussian mixture model – universal background model (GMM-UBM). The practitioner did not report the results of empirical tests of the performance of this system under conditions reflecting those of the case under investigation. The practitioner trained the model for the numerator of the likelihood ratio using the known-speaker recording, but trained the model for the denominator of the likelihood ratio (the UBM) using high-quality audio recordings, not recordings which reflected the conditions of the known-speaker recording. There was therefore a difference in the mismatch between the numerator model and the questioned-speaker recording versus the mismatch between the denominator model and the questioned-speaker recording. In addition, the practitioner did not calibrate the output of the system. The present paper empirically tests the performance of a replication of the practitioner{\textquoteright}s system. It also tests a system in which the UBM was trained on known-speaker-condition data and which was empirically calibrated. The performance of the former system was very poor, and the performance of the latter was substantially better.",

keywords = "Forensic voice comparison, Automatic speaker recognition, GMM-UBM, Likelihood ratio, Validation, Calibration, Admissibility",

author = "Morrison, {Geoffrey Stewart}",

note = "{\textcopyright} 2017, Elsevier. Licensed under the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International.",

year = "2018",

month = feb,

day = "1",

doi = "10.1016/j.forsciint.2017.12.024",

language = "English",

volume = "283",

pages = "e1--e7",

journal = "Forensic Science International",

issn = "0379-0738",

publisher = "Elsevier",

}

TY - JOUR

T1 - The impact in forensic voice comparison of lack of calibration and of mismatched conditions between the known-speaker recording and the relevant-population sample recordings

AU - Morrison, Geoffrey Stewart

PY - 2018/2/1

Y1 - 2018/2/1

N2 - In a 2017 New South Wales case, a forensic practitioner conducted a forensic voice comparison using a Gaussian mixture model – universal background model (GMM-UBM). The practitioner did not report the results of empirical tests of the performance of this system under conditions reflecting those of the case under investigation. The practitioner trained the model for the numerator of the likelihood ratio using the known-speaker recording, but trained the model for the denominator of the likelihood ratio (the UBM) using high-quality audio recordings, not recordings which reflected the conditions of the known-speaker recording. There was therefore a difference in the mismatch between the numerator model and the questioned-speaker recording versus the mismatch between the denominator model and the questioned-speaker recording. In addition, the practitioner did not calibrate the output of the system. The present paper empirically tests the performance of a replication of the practitioner’s system. It also tests a system in which the UBM was trained on known-speaker-condition data and which was empirically calibrated. The performance of the former system was very poor, and the performance of the latter was substantially better.

AB - In a 2017 New South Wales case, a forensic practitioner conducted a forensic voice comparison using a Gaussian mixture model – universal background model (GMM-UBM). The practitioner did not report the results of empirical tests of the performance of this system under conditions reflecting those of the case under investigation. The practitioner trained the model for the numerator of the likelihood ratio using the known-speaker recording, but trained the model for the denominator of the likelihood ratio (the UBM) using high-quality audio recordings, not recordings which reflected the conditions of the known-speaker recording. There was therefore a difference in the mismatch between the numerator model and the questioned-speaker recording versus the mismatch between the denominator model and the questioned-speaker recording. In addition, the practitioner did not calibrate the output of the system. The present paper empirically tests the performance of a replication of the practitioner’s system. It also tests a system in which the UBM was trained on known-speaker-condition data and which was empirically calibrated. The performance of the former system was very poor, and the performance of the latter was substantially better.

KW - Forensic voice comparison

KW - Automatic speaker recognition

KW - GMM-UBM

KW - Likelihood ratio

KW - Validation

KW - Calibration

KW - Admissibility

UR - https://www.scopus.com/record/display.uri?eid=2-s2.0-85039739231&origin=SingleRecordEmailAlert&dgcid=raven_sc_affil_en_us_email&txGid=0be8cbe21d6a44c270221676b8092e13

UR - http://linkinghub.elsevier.com/retrieve/pii/S0379073817305406

U2 - 10.1016/j.forsciint.2017.12.024

DO - 10.1016/j.forsciint.2017.12.024

M3 - Article

SN - 0379-0738

VL - 283

SP - e1

EP - e7

JO - Forensic Science International

JF - Forensic Science International

ER -
