Lavrynenko O. Methods of increasing the efficiency of semantic coding of speech signals


Thesis for the degree of Candidate of Sciences (CSc)

State registration number

0421U103322

Thesis Registration Form

0421U103322.pdf

Applicant

Lavrynenko Oleksandr Yu.

Specialization

05.12.02 - Telecommunication systems and networks

Date of defense

27-08-2021

Specialized Academic Board

Д 26.062.19

National Aviation University

Abstract

The thesis addresses a pressing scientific and practical problem in telecommunication systems: increasing the throughput of a channel that carries semantic speech data by coding that data more efficiently. The central question is this: what is the minimum bit rate at which the semantic features of speech signals can be encoded for a given probability of their error-free recognition? The question is timely given the growing trend toward remote interaction between people and robotic equipment through speech, where the accuracy of such systems depends directly on the effectiveness of the semantic coding of speech signals. The thesis investigates the well-known method of semantic coding of speech signals based on mel-frequency cepstral coefficients, which consists in finding the average values of the coefficients of the discrete cosine transform of the logarithm of the energy of the discrete Fourier transform spectrum after it has been passed through triangular filters on the mel scale. This method does not satisfy the condition of adaptability, which motivates the main scientific hypothesis of the study: the efficiency of semantic coding of speech signals can be increased by applying an adaptive empirical wavelet transform followed by Hilbert spectral analysis. Here, coding efficiency means a reduction in the information transmission rate for a given probability of error-free recognition of the semantic features of speech signals, which significantly reduces the required passband and thereby increases the throughput of the communication channel.
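The mel-frequency cepstral coefficient pipeline described above (discrete Fourier transform, triangular mel-scale filters, logarithm of the energy, discrete cosine transform, averaging) can be sketched roughly as follows. This is an illustrative sketch only, not the code used in the thesis: the function names, the frame length, the Hamming window, the number of filters and coefficients, and the common mel formula 2595·log10(1 + f/700) are all assumptions chosen for the example.

```python
import cmath
import math

def hz_to_mel(f):
    # Common mel-scale formula (an assumption; the abstract does not state one).
    return 2595.0 * math.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def power_spectrum(frame):
    """One-sided power spectrum computed with a direct O(N^2) DFT."""
    n = len(frame)
    spec = []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                  for i, x in enumerate(frame))
        spec.append(abs(acc) ** 2 / n)
    return spec

def mel_filterbank(n_filters, n_fft, fs):
    """Triangular filters whose edges are equally spaced on the mel scale."""
    mel_max = hz_to_mel(fs / 2.0)
    edges = [mel_to_hz(mel_max * i / (n_filters + 1))
             for i in range(n_filters + 2)]
    bank = []
    for j in range(1, n_filters + 1):
        lo, mid, hi = edges[j - 1], edges[j], edges[j + 1]
        row = []
        for k in range(n_fft // 2 + 1):
            f = k * fs / n_fft  # centre frequency of DFT bin k in Hz
            w = min((f - lo) / (mid - lo), (hi - f) / (hi - mid))
            row.append(max(0.0, w))
        bank.append(row)
    return bank

def dct2(values, n_coeffs):
    """Unnormalised DCT-II, keeping the first n_coeffs coefficients."""
    n = len(values)
    return [sum(v * math.cos(math.pi * q * (i + 0.5) / n)
                for i, v in enumerate(values))
            for q in range(n_coeffs)]

def mean_mfcc(signal, fs, frame_len=64, n_filters=10, n_coeffs=5):
    """Average the per-frame cepstral coefficients over all frames."""
    bank = mel_filterbank(n_filters, frame_len, fs)
    window = [0.54 - 0.46 * math.cos(2 * math.pi * i / (frame_len - 1))
              for i in range(frame_len)]
    frames = []
    for start in range(0, len(signal) - frame_len + 1, frame_len // 2):
        frame = [signal[start + i] * window[i] for i in range(frame_len)]
        power = power_spectrum(frame)
        # Log filter-bank energies; the small floor avoids log(0).
        log_e = [math.log(sum(w * p for w, p in zip(row, power)) + 1e-12)
                 for row in bank]
        frames.append(dct2(log_e, n_coeffs))
    return [sum(col) / len(frames) for col in zip(*frames)]

# Usage: a synthetic 440 Hz tone sampled at 8 kHz stands in for speech.
tone = [math.sin(2 * math.pi * 440 * t / 8000) for t in range(256)]
features = mean_mfcc(tone, 8000)
```

The filter bank here is fixed in advance by the mel scale regardless of the signal, which is the lack of adaptability that the abstract identifies in this method.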
In proving this hypothesis, the following results were obtained: 1) for the first time, a method of semantic coding of speech signals based on the empirical wavelet transform was developed; it differs from existing methods in that it constructs a set of adaptive bandpass Meyer wavelet filters and then applies Hilbert spectral analysis to find the instantaneous amplitudes and frequencies of the intrinsic empirical mode functions, which makes it possible to determine the semantic features of speech signals and to increase the efficiency of their coding; 2) for the first time, the adaptive empirical wavelet transform was proposed for problems of multiscale analysis and semantic coding of speech signals, which increases the efficiency of spectral analysis by decomposing high-frequency speech oscillations into their low-frequency components, namely the intrinsic empirical modes; 3) the method of semantic coding of speech signals based on mel-frequency cepstral coefficients was further developed by applying the basic principles of adaptive spectral analysis using the empirical wavelet transform, which increases the efficiency of that method. Experiments carried out in MATLAB R2020b showed that the developed method of semantic coding of speech signals based on the empirical wavelet transform reduces the coding rate from 320 to 192 bit/s and the required passband from 40 to 24 Hz at a probability of error-free recognition of about 0.96 (96%) and a signal-to-noise ratio of 48 dB, making it 1.6 times more efficient than the existing method.
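The Hilbert spectral analysis step named in result 1 can be illustrated with a minimal sketch that computes the instantaneous amplitude and frequency of a single mode through its analytic signal. This is not the thesis implementation: the empirical wavelet transform with Meyer filters is not reproduced here, a synthetic single tone stands in for one extracted mode, and all function names are hypothetical.

```python
import cmath
import math

def dft(x, inverse=False):
    """Direct O(N^2) discrete Fourier transform (forward or inverse)."""
    n = len(x)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        acc = sum(v * cmath.exp(sign * 2j * math.pi * k * i / n)
                  for i, v in enumerate(x))
        out.append(acc / n if inverse else acc)
    return out

def analytic_signal(x):
    """Analytic signal: zero the negative frequencies, double the positive."""
    n = len(x)
    gain = [0.0] * n
    gain[0] = 1.0  # keep the DC term unchanged
    if n % 2 == 0:
        gain[n // 2] = 1.0  # keep the Nyquist term unchanged
        for k in range(1, n // 2):
            gain[k] = 2.0
    else:
        for k in range(1, (n + 1) // 2):
            gain[k] = 2.0
    spec = dft(x)
    return dft([s * g for s, g in zip(spec, gain)], inverse=True)

def instantaneous_amplitude_frequency(x, fs):
    """Return (amplitudes, frequencies in Hz) of a mono-component signal."""
    z = analytic_signal(x)
    amplitude = [abs(v) for v in z]
    # The phase step between neighbouring samples is taken from the
    # conjugate product, which avoids explicit phase unwrapping.
    frequency = [cmath.phase(z[i + 1] * z[i].conjugate()) * fs / (2 * math.pi)
                 for i in range(len(z) - 1)]
    return amplitude, frequency

# Usage: a 0.5-amplitude, 8 Hz cosine sampled at 64 Hz stands in for a mode.
fs = 64
mode = [0.5 * math.cos(2 * math.pi * 8 * i / fs) for i in range(64)]
amp, freq = instantaneous_amplitude_frequency(mode, fs)
```

For this test tone the amplitude stays at about 0.5 and the frequency at about 8 Hz for every sample. In the method described above, the same analysis would be applied to each mode produced by the adaptive filter bank rather than to a synthetic tone.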
The results obtained in the thesis can be used to build systems for remote interaction between people and robotic equipment using speech technologies, such as speech recognition and synthesis, voice control of technical objects, low-rate coding of speech information, and voice translation from foreign languages.

Thesis supervisor

Konahovych Georgiy

Official opponents

Klymash Mykhailo Mykolayovych
Saiko Volodymyr Hryhorovych

Files

autoreferat-Автореферат_Лавриненко (2).pdf

Дисертація_Лавриненко (1).pdf

Similar theses

0524U000090

Mykola Nesterenko

Methodology for management throughput and quality of service real-time traffic in heterogeneous electronic communication networks with bypass paths

0524U000084

Nameer Hashim Qasim Qasim

Methodology for ensuring the quality of IoT service in the 5G network

0523U100269

Volodymyr Lysechko

Methods and models for improving the noise immunity of wireless intelligent telecommunication systems based on complex signal-code constructions

0423U100140

Liudmyla P. Klobukova

Method of quasi-orthogonal frequency division of channels in cognitive radio networks

0423U100089

Yakymchuk Nataliia Mykolaivna

Methods of combating congestion of telecommunication networks of new generations by forming flows of heterogeneous network traffic