Savastianov V. Supporting foresight using textual analytics for semistructural data.

Thesis for the degree of Candidate of Sciences (CSc)

State registration number

0421U102952

Applicant for

Specialization

  • 01.05.04 - System analysis and theory of optimal decisions

14-05-2021

Specialized Academic Board

Д 26.002.03

Educational and Scientific Complex "Institute for Applied System Analysis" of National Technical University of Ukraine "Igor Sikorsky Kyiv Polytechnic Institute"

Abstract

The thesis proposes to treat the foresight process in the presence of semistructured data as a whole, in which uncertainty is gradually reduced on the way from the start of the study to the desired future. To implement this concept, a system approach to supporting the foresight process is proposed, based on textual analytics, which is the most modern and most powerful tool for analysing semistructured data written in natural language. The system approach consists of four stages that are repeated continuously throughout the foresight life cycle, and its results are reused in all other foresight sessions. In the first stage, the subject area is studied, the features of the desired future are analysed, and the models, methods and their metadata are determined. The conceptual model for supporting the foresight process is defined. A notion of the foresight process and the foresight horizon is formed. Factors that increase and reduce uncertainty on the way to the forecast horizon are identified. An information model of the foresight process is introduced: a representation of subject areas using the set-theoretic concepts of general systems theory. Constraints on the connections of the information model are introduced, options for representing knowledge as a hierarchical classifier or an ontology are considered, and their advantages and disadvantages are outlined. The concept of the existence of knowledge in time is considered. Integrated time-dependent awareness indicators are introduced to measure changes in the knowledge base over time and/or as a function of the amount of new knowledge. New knowledge is registered as metadata classified according to the developed classifiers. Awareness indicators are continuously calculated and analysed during the foresight process. In the second stage, a model of and an approach to extracting knowledge from natural-language texts are introduced and applied.
The work modifies the general model of fact extraction from natural-language texts to meet the requirements of extracting the metadata of the foresight information model, introducing universal lexical template restrictions that make it possible to compose more powerful metadata extraction rules. In the third stage, the information model for supporting the foresight process is introduced and the classes of input data are defined. In the fourth stage, the semistructured data processing modules are adapted and scaled as part of the foresight process support system. A number of cases demonstrate the application of the system approach to supporting the foresight process in the presence of semistructured data using textual analytics. The developed system approach is applied throughout the life cycle of a foresight session. Artifacts created by the end of the support process (classifiers, lexical restrictions, rules, knowledge) can be reused in subsequent and new foresight sessions. The introduced system approach reduces the resources needed to supply data to the internal subprocesses of the system and improves the quality of the processes: it speeds up the processing of input data about the foresight process, provides analysts and experts with tools for rapid analysis of input data and with information on progress in the form of awareness indicators, and enables the knowledge and artifacts produced by the models, algorithms and approaches to be reused in subsequent foresight sessions. A number of practical cases confirmed the effectiveness, efficiency and scalability of the proposed concept, with the integrity of the foresight process preserved when the proposed system approach is applied.
Keywords: systems analysis, foresight methodology, text analytics, natural language processing, data mining, foresight process support, sentiment analysis, foresight awareness indicators, information model, conceptual model, model of knowledge extraction from texts in natural language, classifiers, synthesis of classification rules, foresight process metadata.
