This study presents the results of acoustic modeling for a large vocabulary continuous speech recognition system. The acoustic models were developed using a large, phonetically controlled corpus of contemporary spoken Polish. Evaluation experiments showed that relatively good speech recognition results can be obtained with adequate training material that takes into account: (a) the presence of lexical stress; (b) speech styles (a variety of segmental and prosodic structures, various degrees of spontaneity (spontaneous vs. read speech), pronunciation variants and dialects); (c) the influence of sound level and background noise. The evaluation results reported here were obtained with the Sclite assessment software. The article also describes the structure and contents of the speech corpus and briefly outlines the design and architecture of the automatic speech recognition system.
Early diagnosis of congenital hearing disorders creates new challenges for a multidisciplinary team of paedoaudiologists, ear, nose and throat specialists, and speech therapists. The cross-modality matching method is based on objective and subjective techniques for evaluating hearing thresholds in children. Electrical response audiometry provides information about the response of the brainstem to acoustic stimulation, whereas behavioural audiometry provides information about perception and central associative processes in the auditory pathway. The paediatric fitting procedure relies on a solid foundation of behavioural measurement to ensure the validity of hearing aid and cochlear implant fitting. This study assessed the perception of phonemes in children with cochlear implants and the possibilities of applying acoustic solutions to audiologic evaluation. The authors also examined the possibilities of applying digital audio processing algorithms in clinical practice. Self-developed, computer-controlled diagnostic stations were used and tested. Speech perception was assessed on the basis of Erber's categories, using detection, discrimination and identification tests based on the five Ling sounds (the phonemes aa, uu, ii, ss, and sh). The sample comprised 23 implanted children, aged 3-6 years, who had received a cochlear implant when they were 18 to 30 months old. Detection thresholds and discrimination and identification scores were assessed. The results indicated significant correlations between pure tone audiometry results and the phoneme detection thresholds (in dB SPL). The identification score in this group was 95-100%.
In this article the authors investigate and present statistical models of acoustic phenomena observed within realizations of phonemes, and the correlations of these acoustic properties with functional features such as accents and sentence boundaries. Two databases were used: the first contained separately produced sentences, and the second contained phrases extracted from larger, continuous stretches of natural speech. The authors statistically analyzed selected features of the realizations of Polish phonemes (the duration, energy and power of the phonemes, and the fundamental frequency of voiced phones) in order to detect their relations with the location of the phone in a sentence. Additionally, the authors built probabilistic models and proposed evaluation methods for quantitatively assessing phenomena known from the phonetic literature. The authors identified pre-boundary lengthening of phones and a decrease of energy and pitch as markers of sentence endings. In accented syllables, they observed a significant increase of total energy and power, accompanied by a local increase of F0. Finally, the authors indicate possible applications of the results in speech technology.