Petrenko M. Methods of integration of heterogeneous bibliographic data

Thesis for the degree of Candidate of Sciences (CSc)

State registration number

0421U104038

Applicant for

Specialization

  • 01.05.03 - Mathematical and software support of computing machines and systems

21-12-2021

Specialized Academic Board

К 26.139.03

Higher Education Institution "Open International University of Human Development 'Ukraine'"

Abstract

The study is relevant because methods for integrating and cleaning heterogeneous bibliographic data have high practical value for the digital transformation of Ukraine's library sector and, consequently, for better meeting citizens' information needs. This relevance is confirmed by the state program and strategies of Ukraine. A distinctive feature of the work is that it examines problems found in real data from the public libraries of Kyiv, the discrepancies between data obtained from different sources, and methods of overcoming them. The thesis analyzes how the lack of tools for preventing visually imperceptible errors affects the searchability and integration of bibliographic data. It identifies and describes the specific features and regularities of bibliographic data use within the developed methodology. New methods have been developed: a method for comparing values of the "Pages" field that is robust to different approaches to counting pages; a method for comparing values of the "Publishing house" field that is robust to different kinds of abbreviation; a method for detecting anomalies in ISBN data; and a method for minimizing the influence of ISBN anomalies on the data integration process. The thesis also describes a five-level software package developed during the study that automates every stage of work with disparate bibliographic data: retrieval via the World Wide Web, cleaning, integration experiments, and automatic generation of information resources for readers.
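The abstract does not say how the ISBN anomaly detection method works, so the following is only a minimal sketch of one plausible ingredient: normalizing an ISBN string and verifying the standard ISBN-10 and ISBN-13 check digits. The function names, the returned labels, and the look-alike mapping (Cyrillic "Х" read as Latin "X", as one kind of visually imperceptible error) are assumptions made for illustration and are not taken from the thesis.

```python
import re
import unicodedata

# Assumption for illustration: Cyrillic "Х"/"х" typed instead of Latin "X"
# in the ISBN-10 check digit is treated as a visually invisible error.
LOOKALIKES = str.maketrans({"\u0425": "X", "\u0445": "X"})


def normalize_isbn(raw: str) -> str:
    """Map look-alike letters to Latin and drop everything except 0-9 and X."""
    text = unicodedata.normalize("NFKC", raw).translate(LOOKALIKES).upper()
    # Removes hyphens, spaces, and invisible characters such as U+200B.
    return re.sub(r"[^0-9X]", "", text)


def is_valid_isbn10(isbn: str) -> bool:
    """Standard ISBN-10 check: weights 10..1, sum divisible by 11."""
    if not re.fullmatch(r"[0-9]{9}[0-9X]", isbn):
        return False
    total = sum(
        (10 - i) * (10 if ch == "X" else int(ch)) for i, ch in enumerate(isbn)
    )
    return total % 11 == 0


def is_valid_isbn13(isbn: str) -> bool:
    """Standard ISBN-13 check: alternating weights 1 and 3, sum divisible by 10."""
    if not re.fullmatch(r"[0-9]{13}", isbn):
        return False
    total = sum(int(ch) * (1 if i % 2 == 0 else 3) for i, ch in enumerate(isbn))
    return total % 10 == 0


def classify_isbn(raw: str) -> str:
    """Return a hypothetical label: 'valid', 'invalid', or 'bad_length'."""
    isbn = normalize_isbn(raw)
    if len(isbn) == 10:
        return "valid" if is_valid_isbn10(isbn) else "invalid"
    if len(isbn) == 13:
        return "valid" if is_valid_isbn13(isbn) else "invalid"
    return "bad_length"
```

Under these assumptions, records labelled "invalid" or "bad_length" could be excluded from ISBN-based matching and compared on other fields instead. This is one possible way to limit the influence of ISBN anomalies on integration, not necessarily the approach taken in the thesis.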
