Tytarenko A. Deep reinforcement learning for caregiving robotics

Thesis for the degree of Doctor of Philosophy (PhD)

State registration number

0825U001678

Applicant for

Specialization

  • 124 - System Analysis

Specialized Academic Board

PhD 8714

National Technical University of Ukraine «Igor Sikorsky Kyiv Polytechnic Institute»

Abstract

The dissertation is dedicated to the development and study of control algorithms based on deep reinforcement learning and imitation learning for automated care tasks. The research problem is highly relevant given global demographic changes: an aging population and a shortage of personnel able to provide comprehensive patient care. The aim of the research is to create algorithms capable of controlling robotic systems in care tasks while ensuring high adaptability, safety, and efficiency in unpredictable conditions. The thesis consists of seven chapters, which cover both the theoretical foundations of the methods and the practical aspects of their implementation in robotic care systems.

The first chapter considers general issues of robotic care in the context of global demographic changes and modern challenges. Particular attention is given to the needs of Ukraine, where the war has significantly increased the number of people requiring long-term care and rehabilitation. Key technical and social barriers to implementing automated systems are identified, such as high cost, technical complexity, and the need to ensure safety during physical interaction with patients.

The second chapter focuses on the development of neural network policies that ensure the robustness and stability of robotic systems. New methods based on diffusion policies and reinforcement learning algorithms are proposed to reduce the risk of errors in robot behavior. Significant attention is paid to approaches for optimizing objective functions that enable systems to perform tasks efficiently even with a limited amount of training data.

The third chapter explores methods for training vision-based neural policies for controlling care systems. The challenge is to handle incomplete or inaccurate sensor data, which is characteristic of real-world robot operation. A neural network architecture is proposed that ensures stable control based on visual information without privileged data. Simulation results in Assistive Gym environments demonstrated the high efficiency of the trained policies.

The fourth chapter is dedicated to methods for early detection of abnormal behavior in neural policies, with the goal of enhancing the safety of care systems. Ensembles of predictive models are among the main approaches to risk estimation in policies, and the chapter begins by exploring such models, their variations, and modifications.

The fifth chapter focuses on the class of methods for learning controllable environments (LCE). These methods generally reduce the dimensionality of the environment's state space so that the resulting latent state space exhibits smooth or locally linear dynamics.

The sixth chapter focuses on action encoding problems and representation optimization for controlling care systems. Approaches based on action encoding consistency are proposed to stabilize system behavior and ensure robust control, even in dynamic environments with complex action spaces.

The seventh chapter is dedicated to developing a comprehensive multi-component control system for robotic care tasks based on neural policies. For the first time, a physical feeding control system based on imitation learning methods is proposed and implemented. Mechanisms for trajectory smoothing and behavior correction are introduced for task adaptation. The resulting decision support system is end-to-end: control is implemented by a neural network that uses sensor signals and the outputs of other neural networks. This reduces system cost by minimizing dependency on expensive components (precision actuators, lidars, multiple cameras) and demonstrates safety capabilities in unpredictable environments involving patients and robots.

All these capabilities were assessed in an empirical study in which the method is applied to simulated assistive feeding and arm manipulation tasks as well as to a physical assistive feeding system. The success rates of the compared algorithms are studied, along with the accuracy of the early anomaly detection system at different thresholds.

The practical significance of the results lies in the ability to use the proposed methods to create efficient and accessible robotic care systems for rehabilitation centers, medical institutions, and home care. Implementing such systems will significantly reduce the burden on medical personnel and ensure high-quality patient care. A physical implementation of the proposed system has also been developed on the basis of the dissertation's results.
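
The ensemble-based risk estimation and thresholded early anomaly detection mentioned in the abstract can be illustrated with a minimal Python sketch. It is not the dissertation's implementation: the function names, the choice of variance across ensemble members as the disagreement measure, and the toy values are all assumptions made for illustration.

```python
from statistics import pvariance
from typing import Optional, Sequence


def ensemble_disagreement(predictions: Sequence[Sequence[float]]) -> float:
    """Average, over action dimensions, of the variance across ensemble members.

    predictions[m][d] is the d-th action component predicted by model m.
    (Hypothetical risk score; the thesis may use a different measure.)
    """
    per_dimension = zip(*predictions)  # regroup the values by action dimension
    variances = [pvariance(values) for values in per_dimension]
    return sum(variances) / len(variances)


def first_alarm_step(scores: Sequence[float], threshold: float) -> Optional[int]:
    """Return the index of the first step whose score exceeds the threshold."""
    for step, score in enumerate(scores):
        if score > threshold:
            return step
    return None  # no anomaly flagged during the rollout


# Toy rollout: three hypothetical ensemble members, two action dimensions.
rollout = [
    [[0.10, 0.20], [0.10, 0.20], [0.10, 0.20]],  # members agree
    [[0.00, 0.00], [1.00, 1.00], [2.00, 2.00]],  # members diverge
]
scores = [ensemble_disagreement(step) for step in rollout]
alarm = first_alarm_step(scores, threshold=0.5)
print(scores, alarm)
```

Lowering the threshold in such a scheme flags anomalies earlier at the cost of more false alarms, which is the kind of trade-off a study of detection accuracy at different thresholds would examine.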

Research papers

Tytarenko, Andrii. "Multi-step prediction in linearized latent state spaces for representation learning." System research and information technologies 3 (2022): 139-148. DOI: https://doi.org/10.20535/SRIT.2308-8893.2022.3.09

Tytarenko, Andrii. "Action Encoding in Algorithms for Learning Controllable Environment." System Analysis and Artificial Intelligence. Cham: Springer Nature Switzerland, 2023. 271-287. DOI: https://doi.org/10.1007/978-3-031-37450-0_16

Tytarenko, Andrii. "Reducing Risk for Assistive Reinforcement Learning Policies with Diffusion Models." System research and information technologies 3 (2024): 148-154. DOI: https://doi.org/10.20535/SRIT.2308-8893.2024.3.09

Tytarenko, Andrii. "Detecting unsafe behavior in neural network imitation policies for caregiving robotics." System research and information technologies 4 (2024): 86-96. DOI: https://doi.org/10.20535/SRIT.2308-8893.2024.4.07

Kurniawan, Erick and Jukl, Alexander and Kalajian, Michael and Sytyi, Mykyta and Tytarenko, Andrii and Yurchenko, Oleg and Shkurka, Olha and Tsyba, Yevhen and Carstoiu, Gabriel. Techniques for generating motion information for videos. US Patent 17219558. Dec 24, 2024. US patent for an invention.
