Tehničko veleučilište u Zagrebu · Zagreb

System for automatic assignment of lexical stress in Croatian

Original scientific paper

Category journal article
Type original scientific paper
Year 2022
Journal Electronics (Basel)
Volume 11
Issue 22
Pages 3687, 14
DOI 10.3390/electronics11223687
EISSN 2079-9292
Status published

Abstract

Voice interfaces are now widely integrated into IoT devices. Pronunciation and proper prosody play a major role in the intelligibility and naturalness of synthesized voices, and each language has its own prosodic characteristics. In this paper, we present the results of a study that tested the applicability of methods for modelling and predicting the prosodic features of the Croatian language. We investigated the extent to which the performance of these methods can be improved by incorporating linguistic features and peculiarities specific to Croatian. Tree classification was used to train the models, with a lexicon of 1,011,785 word forms as the training set. Separate models were created for predicting the position of lexical stress in a word and the type of lexical stress. The results improved significantly after rules for atonic words (clitics) were applied. A hybrid approach combining a rule‐based approach and a modelling approach was also proposed. The final accuracy of lexical stress assignment using the hybrid approach was 95.3%.

Keywords

voice interface; automatic lexical stress assignment; lexical stress detection; lexical stress in Croatian; text‐to‐speech synthesis