Shubkina O. Methods and models of semantic annotation of text documents using artificial neural networks

Thesis for the degree of Candidate of Sciences (CSc)

State registration number

0412U000073

Applicant for

Specialization

  • 05.13.23 - Systems and Means of Artificial Intelligence

21-12-2011

Specialized Academic Board

Д 64.052.01

Kharkiv National University of Radio Electronics

Abstract

The object of research is the process of extracting knowledge from textual information in intelligent document processing systems. The subject of research is methods and models for the semantic annotation of text documents using artificial neural networks. The author applies methods of artificial neural networks and computational intelligence, Semantic Web technologies, and principles of natural language processing. The work solves a relevant scientific problem: improving the efficiency of semantic annotation generation for text documents processed in sequential mode. The novelty of the results consists of the following: 1) for the first time, a hierarchical radial basis function neural network with a multilayer architecture is developed; every node is a radial basis function network of the same type, which reduces the number of attributes fed to the input of each layer and makes it possible to generate semantic annotations of text documents from a limited training set; 2) for the first time, probabilistic neural networks of a special form, namely modified and competitive ones, are developed as hybrids of the standard probabilistic neural network, the generalized regression neural network, and self-organizing Kohonen maps; they determine the probability that an input text object belongs to each of the possible classes of the domain ontology, process text documents sequentially as they become available, and are simple to implement and fast; 3) for the first time, a probabilistic semantic annotation model is proposed that introduces a probabilistic component into the RDF model, which makes it possible to supply text documents with metadata based on the probability that a text object belongs to an ontology concept and provides a measure of how closely the text data relate to the current ontology; 4) the binary semantic annotation model is further developed; it differs from existing models in that it uses the outputs of the artificial neural network represented in binary form, and it allows semantic annotations to be added to text documents when the training sample is limited. The results have been implemented at "SMIT Company Ltd." and at the Artificial Intelligence Department of Kharkiv National University of Radio Electronics. The results are applicable in organizations that develop artificial intelligence systems for text data processing, and in education in the field of intelligent data processing.
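
The idea of scoring a text object against every ontology class and attaching the resulting probabilities as metadata can be illustrated with a minimal sketch. It assumes a textbook probabilistic neural network (a Gaussian Parzen-window classifier), not the modified or competitive networks developed in the thesis. The class name `SequentialPNN`, the kernel width `sigma`, the two-element feature vectors, the `ex:` concept names, and the `ex:belongsTo` predicate are all hypothetical and chosen only for illustration.

```python
import math
from collections import defaultdict


class SequentialPNN:
    """Textbook probabilistic neural network (illustrative, not the thesis model)."""

    def __init__(self, sigma=0.5):
        self.sigma = sigma                  # Gaussian kernel width (assumed value)
        self.patterns = defaultdict(list)   # class label -> stored feature vectors

    def add(self, vector, label):
        # Sequential mode: each labelled document is stored as a new pattern unit,
        # so no retraining pass over earlier data is needed.
        self.patterns[label].append(list(vector))

    def probabilities(self, vector):
        # Summation layer: mean Gaussian kernel activation per class.
        scores = {}
        for label, stored in self.patterns.items():
            total = 0.0
            for p in stored:
                dist2 = sum((a - b) ** 2 for a, b in zip(vector, p))
                total += math.exp(-dist2 / (2 * self.sigma ** 2))
            scores[label] = total / len(stored)
        norm = sum(scores.values())
        if norm == 0.0:
            # All activations underflowed: fall back to a uniform distribution.
            return {label: 1.0 / len(scores) for label in scores}
        # Normalise so the class scores sum to one.
        return {label: s / norm for label, s in scores.items()}


def annotate(doc_id, probs):
    # One (subject, predicate, object, probability) tuple per ontology concept,
    # ordered from most to least probable. The tuple layout is an assumption,
    # not the RDF extension defined in the thesis.
    ranked = sorted(probs.items(), key=lambda kv: -kv[1])
    return [(doc_id, "ex:belongsTo", concept, round(p, 3)) for concept, p in ranked]


net = SequentialPNN(sigma=0.5)
net.add([1.0, 0.0], "ex:NeuralNetworks")
net.add([0.9, 0.1], "ex:NeuralNetworks")
net.add([0.0, 1.0], "ex:SemanticWeb")

probs = net.probabilities([0.8, 0.2])
annotation = annotate("ex:doc1", probs)
for triple in annotation:
    print(triple)
```

In this sketch every concept receives a score rather than only the winning class, which mirrors the abstract's point that the annotation carries a measure of how strongly the text relates to each concept of the ontology.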
