Modern medicine increasingly depends on artificial intelligence (AI) systems. Despite the success of deep learning methods, classical statistical approaches have reached the limits of their effectiveness owing to a shortage of labelled data, domain shift, and the semantic gap between pixels and clinical concepts. Models that ignore medical knowledge are vulnerable, poorly interpretable, and prone to topological and logical errors. An urgent task is therefore to improve the accuracy and validity of AI by moving to hybrid architectures that integrate expert knowledge at the segmentation, classification, and natural language processing stages.
The object of the study is the processes of knowledge integration and data processing in artificial intelligence systems of medical diagnostic complexes.
The subject of the study is the methods and tools for integrating expert knowledge into deep learning models for segmenting and classifying medical images and for analysing textual clinical data.
The aim of the study is to increase the accuracy and clinical validity of the decision-making process in medical diagnostic complexes through the creation of methods and tools for integrating expert knowledge into artificial intelligence models.
The scientific novelty of the study is as follows:
1) An improved method of adaptive knowledge distillation from teacher models to a student model is proposed. It differs from analogues in using a dynamic ensemble of teacher models with a specialised model for adversarial domain adaptation and a selective filtering mechanism. This makes it possible to accumulate experience from different clinical domains and transfer it to a compact student model, thereby increasing decision-making accuracy under varying input data.
2) An improved method for establishing semantic relationships in medical texts is developed. It combines the integration of ontological knowledge with explicit encoding of sentiment and negation information, which increases the accuracy of clinical record interpretation and ensures the logical consistency of conclusions.
3) A new method for segmenting cardiac magnetic resonance images is developed. It combines an expert-guided attention mechanism, which focuses on complex areas, with a specialised loss function whose topological constraints are based on the signed distance. This makes it possible to encode the nesting and contiguity of anatomical structures explicitly, improving the accuracy of organ boundary delineation and eliminating topological artefacts.
4) A new method for identifying heart pathologies from magnetic resonance images using a knowledge-oriented graph convolutional network (GCN) is developed. It implements relational reasoning on graphs whose nodes combine hybrid visual and morphological features. The adjacency matrix is formed as a superposition of spatial relationships and clinical correlations drawn from medical guidelines, which increases diagnostic classification accuracy and ensures the interpretability of decisions.
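The adjacency construction described in item 4 can be illustrated with a minimal sketch. It assumes that the superposition is a weighted sum of two matrices and that the result is normalised in the standard GCN manner; the mixing weight `alpha`, the three-node graph, and all numeric values are hypothetical and are not taken from the dissertation. Plain Python lists are used only to keep the sketch self-contained.

```python
import math


def combine_adjacency(spatial, clinical, alpha=0.5):
    """Weighted superposition of two adjacency matrices.

    alpha is a hypothetical mixing weight (assumption, not from the source).
    """
    n = len(spatial)
    return [
        [alpha * spatial[i][j] + (1.0 - alpha) * clinical[i][j] for j in range(n)]
        for i in range(n)
    ]


def normalize_adjacency(adj):
    """Symmetric normalisation D^{-1/2} (A + I) D^{-1/2} with self-loops."""
    n = len(adj)
    a_hat = [
        [adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)] for i in range(n)
    ]
    deg = [sum(row) for row in a_hat]
    return [
        [a_hat[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)]
        for i in range(n)
    ]


def propagate(adj_norm, features):
    """One propagation step without learned weights: mix neighbour features."""
    n = len(adj_norm)
    d = len(features[0])
    return [
        [sum(adj_norm[i][k] * features[k][j] for k in range(n)) for j in range(d)]
        for i in range(n)
    ]


# Hypothetical 3-node graph: node 0 and node 2 are each spatially adjacent
# to node 1, while a made-up clinical correlation links node 0 and node 2.
spatial = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
clinical = [[0, 0, 1], [0, 0, 0], [1, 0, 0]]

combined = combine_adjacency(spatial, clinical, alpha=0.5)
adj_norm = normalize_adjacency(combined)

# A single scalar feature per node; only node 0 is "active".
features = [[1.0], [0.0], [0.0]]
mixed = propagate(adj_norm, features)
print(mixed)
```

In this toy graph the clinical edge lets information from node 0 reach node 2 in a single step, which a purely spatial adjacency would not allow. A full GCN layer would additionally apply a learned weight matrix and a non-linearity after the propagation step.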
The "IDK Medical AI" software complex was created. The distillation method increased the area under the ROC curve (AUC-ROC) on the target domain to 81.45% on 500 samples (+8.8%). The NLP method achieved an accuracy of 81.14%. The segmentation method reduced the 95th-percentile Hausdorff distance (HD95) for the myocardium to 6.5 mm and increased the Dice coefficient to 95.5%. The GCN-based classification method achieved an accuracy of 94.0% (+9.0%).
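For readers unfamiliar with the Dice coefficient reported above, a minimal sketch of its standard definition for binary masks follows. The function name, the flat-list mask representation, and the convention of returning 1.0 for two empty masks are assumptions made for illustration; the toy masks are not data from the study.

```python
def dice_coefficient(pred, target):
    """Dice = 2 * |P ∩ T| / (|P| + |T|) for binary masks given as flat 0/1 sequences."""
    if len(pred) != len(target):
        raise ValueError("masks must have the same number of elements")
    intersection = sum(p * t for p, t in zip(pred, target))
    total = sum(pred) + sum(target)
    if total == 0:
        # Both masks are empty; treating this as perfect agreement is an assumed convention.
        return 1.0
    return 2.0 * intersection / total


# Toy masks (illustrative only): 3 foreground pixels each, 2 of them shared.
pred_mask = [1, 1, 1, 0, 0, 0]
true_mask = [0, 1, 1, 1, 0, 0]

dice = dice_coefficient(pred_mask, true_mask)
print(f"Dice = {dice:.4f}")
```

Dice measures region overlap, whereas HD95 measures boundary distance, so the two metrics are complementary: a segmentation can overlap well overall and still contain a distant outlier fragment that only a distance-based metric penalises.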
The results have been adopted in the educational process at Lviv University of Trade and Economics, in clinical practice at Khmelnytskyi Infectious Diseases Hospital, in the production process at "KC NEURON" LLC, and in the state budget research project of Khmelnytskyi National University (No. 0124U004665).
The introduction substantiates the relevance of the research. Chapter 1 analyses AI technologies in radiology and natural language processing. Chapter 2 presents the methods for knowledge distillation and for the analysis of medical texts. Chapter 3 is devoted to the segmentation and classification methods. Chapter 4 describes the architecture of the "IDK Medical AI" software and the experimental results. The conclusions summarise the results.
The dissertation comprises 182 pages (main text: 130 pages), 54 figures, 27 tables, and 142 sources. Twelve works were published: 1 indexed in Scopus, 3 in category "B" professional journals, 6 abstracts, 1 monograph chapter, and 1 copyright registration.