Phoneme aware speech recognition through evolutionary optimisation

Jordan J. Bird; Elizabeth Wanner; Anikó Ekárt; Diego R. Faria

doi:10.1145/3319619.3321951

Research output: Chapter in Book/Published conference output › Conference publication

Abstract

Phoneme awareness provides the path to high resolution speech recognition to overcome the difficulties of classical word recognition. Here we present the results of a preliminary study on Artificial Neural Network (ANN) and Hidden Markov Model (HMM) methods of classification for Human Speech Recognition through Diphthong Vowel sounds in the English Phonetic Alphabet, with a specific focus on evolutionary optimisation of bio-inspired classification methods. A set of audio clips was recorded by subjects from the United Kingdom and Mexico. For each recording, the data were pre-processed using Mel-Frequency Cepstral Coefficients (MFCC) at a sliding window of 200 ms per data object, as well as a further MFCC timeseries format for forecast-based models, to produce the dataset. We found that an evolutionarily optimised deep neural network achieves 90.77% phoneme classification accuracy, as opposed to the best HMM of 150 hidden units achieving 86.23% accuracy. Many of the evolutionary solutions take substantially longer to train than the HMM; however, one solution scoring 87.5% (+1.27%) requires fewer resources than the HMM.

Original language	English
Title of host publication	GECCO 2019 Companion - Proceedings of the 2019 Genetic and Evolutionary Computation Conference Companion
Publisher	ACM
Pages	362-363
Number of pages	2
ISBN (Electronic)	9781450367486
DOIs	https://doi.org/10.1145/3319619.3321951
Publication status	Published - 13 Jul 2019
Event	2019 Genetic and Evolutionary Computation Conference, GECCO 2019 - Prague, Czech Republic
Duration	13 Jul 2019 → 17 Jul 2019

Publication series

Name	GECCO 2019 Companion - Proceedings of the 2019 Genetic and Evolutionary Computation Conference Companion

Conference

Conference	2019 Genetic and Evolutionary Computation Conference, GECCO 2019
Country/Territory	Czech Republic
City	Prague
Period	13/07/19 → 17/07/19

Keywords

Artificial Neural Networks
Computational Linguistics
Evolutionary Optimisation
Phoneme Awareness
Speech Recognition

Access to Document

10.1145/3319619.3321951

Cite this

Bird, J. J., Wanner, E., Ekárt, A., & Faria, D. R. (2019). Phoneme aware speech recognition through evolutionary optimisation. In GECCO 2019 Companion - Proceedings of the 2019 Genetic and Evolutionary Computation Conference Companion (pp. 362-363). (GECCO 2019 Companion - Proceedings of the 2019 Genetic and Evolutionary Computation Conference Companion). ACM. https://doi.org/10.1145/3319619.3321951

@inproceedings{208c09b5adc84a35a40a88e99a5db8f6,

title = "Phoneme aware speech recognition through evolutionary optimisation",

abstract = "Phoneme awareness provides the path to high resolution speech recognition to overcome the difficulties of classical word recognition. Here we present the results of a preliminary study on Artificial Neural Network (ANN) and Hidden Markov Model (HMM) methods of classification for Human Speech Recognition through Diphthong Vowel sounds in the English Phonetic Alphabet, with a specific focus on evolutionary optimisation of bio-inspired classification methods. A set of audio clips are recorded by subjects from the United Kingdom and Mexico. For each recording, the data were pre-processed, using Mel-Frequency Cepstral Coefficients (MFCC) at a sliding window of 200ms per data object, as well as a further MFCC timeseries format for forecast-based models, to produce the dataset. We found that an evolutionary optimised deep neural network achieves 90.77% phoneme classification accuracy as opposed to the best HMM of 150 hidden units achieving 86.23% accuracy. Many of the evolutionary solutions take substantially longer to train than the HMM, however one solution scoring 87.5% (+1.27%) requires fewer resources than the HMM.",

keywords = "Artificial Neural Networks, Computational Linguistics, Evolutionary Optimisation, Phoneme Awareness, Speech Recognition",

author = "Bird, {Jordan J.} and Wanner, Elizabeth and Ek{\'a}rt, Anik{\'o} and Faria, {Diego R.}",

year = "2019",

month = jul,

day = "13",

doi = "10.1145/3319619.3321951",

language = "English",

series = "GECCO 2019 Companion - Proceedings of the 2019 Genetic and Evolutionary Computation Conference Companion",

publisher = "ACM",

pages = "362--363",

booktitle = "GECCO 2019 Companion - Proceedings of the 2019 Genetic and Evolutionary Computation Conference Companion",

address = "United States",

note = "2019 Genetic and Evolutionary Computation Conference, GECCO 2019 ; Conference date: 13-07-2019 Through 17-07-2019",

}

Bird, JJ, Wanner, E, Ekárt, A & Faria, DR 2019, Phoneme aware speech recognition through evolutionary optimisation. in GECCO 2019 Companion - Proceedings of the 2019 Genetic and Evolutionary Computation Conference Companion. GECCO 2019 Companion - Proceedings of the 2019 Genetic and Evolutionary Computation Conference Companion, ACM, pp. 362-363, 2019 Genetic and Evolutionary Computation Conference, GECCO 2019, Prague, Czech Republic, 13/07/19. https://doi.org/10.1145/3319619.3321951

Phoneme aware speech recognition through evolutionary optimisation. / Bird, Jordan J.; Wanner, Elizabeth; Ekárt, Anikó et al.
GECCO 2019 Companion - Proceedings of the 2019 Genetic and Evolutionary Computation Conference Companion. ACM, 2019. p. 362-363 (GECCO 2019 Companion - Proceedings of the 2019 Genetic and Evolutionary Computation Conference Companion).

TY - GEN

T1 - Phoneme aware speech recognition through evolutionary optimisation

AU - Bird, Jordan J.

AU - Wanner, Elizabeth

AU - Ekárt, Anikó

AU - Faria, Diego R.

PY - 2019/7/13

Y1 - 2019/7/13

N2 - Phoneme awareness provides the path to high resolution speech recognition to overcome the difficulties of classical word recognition. Here we present the results of a preliminary study on Artificial Neural Network (ANN) and Hidden Markov Model (HMM) methods of classification for Human Speech Recognition through Diphthong Vowel sounds in the English Phonetic Alphabet, with a specific focus on evolutionary optimisation of bio-inspired classification methods. A set of audio clips are recorded by subjects from the United Kingdom and Mexico. For each recording, the data were pre-processed, using Mel-Frequency Cepstral Coefficients (MFCC) at a sliding window of 200ms per data object, as well as a further MFCC timeseries format for forecast-based models, to produce the dataset. We found that an evolutionary optimised deep neural network achieves 90.77% phoneme classification accuracy as opposed to the best HMM of 150 hidden units achieving 86.23% accuracy. Many of the evolutionary solutions take substantially longer to train than the HMM, however one solution scoring 87.5% (+1.27%) requires fewer resources than the HMM.

KW - Artificial Neural Networks

KW - Computational Linguistics

KW - Evolutionary Optimisation

KW - Phoneme Awareness

KW - Speech Recognition

UR - http://www.scopus.com/inward/record.url?scp=85069182849&partnerID=8YFLogxK

UR - https://dl.acm.org/citation.cfm?doid=3319619.3321951

U2 - 10.1145/3319619.3321951

DO - 10.1145/3319619.3321951

M3 - Conference publication

T3 - GECCO 2019 Companion - Proceedings of the 2019 Genetic and Evolutionary Computation Conference Companion

SP - 362

EP - 363

BT - GECCO 2019 Companion - Proceedings of the 2019 Genetic and Evolutionary Computation Conference Companion

PB - ACM

T2 - 2019 Genetic and Evolutionary Computation Conference, GECCO 2019

Y2 - 13 July 2019 through 17 July 2019

ER -

Bird JJ, Wanner E, Ekárt A, Faria DR. Phoneme aware speech recognition through evolutionary optimisation. In GECCO 2019 Companion - Proceedings of the 2019 Genetic and Evolutionary Computation Conference Companion. ACM. 2019. p. 362-363. (GECCO 2019 Companion - Proceedings of the 2019 Genetic and Evolutionary Computation Conference Companion). doi: 10.1145/3319619.3321951
