In a 2017 New South Wales case, a forensic practitioner conducted a forensic voice comparison using a Gaussian mixture model – universal background model (GMM-UBM). The practitioner did not report the results of empirical tests of the performance of this system under conditions reflecting those of the case under investigation. The practitioner trained the model for the numerator of the likelihood ratio using the known-speaker recording, but trained the model for the denominator of the likelihood ratio (the UBM) using high-quality audio recordings, not recordings which reflected the conditions of the known-speaker recording. There was therefore a difference in the mismatch between the numerator model and the questioned-speaker recording versus the mismatch between the denominator model and the questioned-speaker recording. In addition, the practitioner did not calibrate the output of the system. The present paper empirically tests the performance of a replication of the practitioner’s system. It also tests a system in which the UBM was trained on known-speaker-condition data and which was empirically calibrated. The performance of the former system was very poor, and the performance of the latter was substantially better.
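As a rough sketch of the quantities described above, the GMM-UBM score and its calibration might be written as follows. The notation is assumed here for illustration and is not taken from the paper: $x_1, \dots, x_T$ are feature vectors extracted from the questioned-speaker recording, $\lambda_{k}$ is the model trained on the known-speaker recording (numerator), $\lambda_{\mathrm{UBM}}$ is the background model (denominator), and $s$ is the uncalibrated score. The abstract does not say which calibration method was used; the affine mapping with parameters $a$ and $b$ (for example, fitted by logistic regression on same-speaker and different-speaker scores obtained under case-like conditions) is an assumption.

```latex
% Uncalibrated score: mean per-frame log-likelihood ratio
s = \frac{1}{T} \sum_{t=1}^{T}
    \left[ \log p(x_t \mid \lambda_{k}) - \log p(x_t \mid \lambda_{\mathrm{UBM}}) \right]

% Calibrated log likelihood ratio (assumed affine calibration)
\log \mathrm{LR} = a + b\, s
```

Under this reading, if $\lambda_{\mathrm{UBM}}$ is trained on high-quality recordings while $\lambda_{k}$ is trained on the known-speaker recording, the two terms inside the sum are affected differently by recording-condition mismatch with $x_t$, so $s$ may be biased; without the calibration step, $s$ would also not be directly interpretable as a log likelihood ratio.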
Bibliographical note
© 2017, Elsevier. Licensed under the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
- Forensic voice comparison
- Automatic speaker recognition
- Likelihood ratio