Oliinyk Y. Text stream analysis information technology

Thesis for the degree of Candidate of Sciences (CSc)

State registration number

0420U102411

Applicant for

Specialization

  • 05.13.06 - Information technologies

11-12-2020

Specialized Academic Board

Д 67.052.01

Kherson National Technical University

Abstract

The research is devoted to the development of an information technology for text stream analysis. The object of research is the processes of intelligent analysis of text stream data. The purpose of the thesis is to improve the quality and performance of the analysis of Ukrainian-language text data streams through the development of new methods and an information technology for text stream analysis. The research methods are based on the principles of computational linguistics, probability theory, mathematical statistics, and data mining. The reliability and validity of the obtained results rest on the correct use of the mathematical apparatus and are confirmed by the results of computational experiments. Modern approaches and methods were analyzed, and the characteristics of text stream analysis were identified. A text data stream model based on a sliding window was formalized, which made it possible to extend the use of text mining and machine learning methods to text data streams. For the first time, a method for detecting anomalies in text data streams was proposed. The method is based on the Isolation Forest algorithm and, unlike existing ones, supports the sliding-window text data stream model as well as preprocessing and summarization phases for text data. The text stream sentiment analysis method, based on a combination of Gradient Boosting and rule-based algorithms and on the sliding-window text stream model, was improved. For the first time, an information technology for text stream analysis was proposed that supports the anomaly detection method and the text stream sentiment analysis method. Unlike existing ones, this information technology supports data mining of Ukrainian-language texts, the developed sliding-window text stream model, high-performance computing, and online visualization. The scope of application covers processes in educational and scientific institutions and in Ukrainian organizations and enterprises.
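
The sliding-window stream model mentioned in the abstract can be illustrated with a minimal Python sketch. This is not the thesis's formalization: the class name `SlidingWindow`, the count-based window size, and the whitespace tokenizer are all assumptions made for illustration. The sketch keeps only the most recent documents and maintains term frequencies over them, which is the kind of bounded view that batch text mining methods could then be applied to.

```python
from collections import Counter, deque


class SlidingWindow:
    """Hypothetical count-based sliding window over a stream of texts."""

    def __init__(self, size):
        if size <= 0:
            raise ValueError("size must be positive")
        self.docs = deque(maxlen=size)  # the oldest document is evicted automatically
        self.term_counts = Counter()    # term frequencies over the current window

    def push(self, text):
        """Add one document and return the updated window term counts."""
        tokens = text.lower().split()   # naive whitespace tokenizer (assumption)
        if len(self.docs) == self.docs.maxlen:
            # Subtract the terms of the document that is about to leave the window.
            self.term_counts.subtract(self.docs[0])
            self.term_counts += Counter()  # drops zero and negative entries
        self.docs.append(tokens)
        self.term_counts.update(tokens)
        return self.term_counts
```

Each call to `push` updates the window incrementally, so per-window features (for example, inputs to an anomaly detector or a sentiment classifier) can be recomputed without rescanning the whole stream.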
