Taran V. Method of deep neural networks adaptation for hardware with specialized architecture

Thesis for the degree of Doctor of Philosophy (PhD)

State registration number

0823U100075

Specialization

  • 123 - Computer Engineering

08-02-2023

Specialized Academic Board

ДФ 26.002.05

National Technical University of Ukraine "Kiev Polytechnic Institute".

Abstract

The thesis is devoted to the development of a complex method for adapting deep neural networks, which makes it possible to increase the productivity and efficiency of deep neural network applications on hardware with specialized architecture. A complex method of deep neural network adaptation for specialized hardware was developed. A method of adaptive iterative pruning for reducing the size of a neural network model was developed. It is based on gradually reducing the model size by removing redundant channels in convolutional layers, followed by additional training of the model to recover accuracy. In the proposed method, the model hyperparameters are changed after every iteration to compensate for the accuracy loss and to reduce the duration of a data processing iteration. A method for improving the efficiency of neural network data processing on specialized accelerators was developed. It is based on the technical aspects of deep neural network data processing on hardware with specialized architectures, for example the data processing iteration, and makes it possible to determine processing parameters that reduce the influence of overheads. A method for improving the efficiency of the neural network processing infrastructure was developed. It makes it possible to optimize the hardware and software configuration of the target system in order to increase the productivity of deep neural network data processing.

Testing software for medical diagnostics in the context of edge computing was developed. It uses the developed deep neural network adaptation method and the Coral Edge TPU specialized accelerator. The results of applying the deep neural network adaptation method, which includes the adaptive iterative pruning method, the data processing efficiency improvement method, and the computational infrastructure efficiency improvement method, were analyzed. A speedup factor of up to 32.2 and an accuracy of 96.2% were achieved after 10 iterations of the developed adaptive iterative pruning method.
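The pruning loop described in the abstract (remove redundant channels, retrain, adjust hyperparameters, repeat) can be illustrated with a minimal sketch. It is not the thesis implementation. The abstract does not state how redundant channels are selected or how hyperparameters are changed, so the L1-norm ranking, the `fraction` parameter, the `Hyperparams` fields, the learning-rate halving, and the `fine_tune` callback are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

Channel = List[float]  # flattened weights of one convolution output channel


@dataclass
class Hyperparams:
    # Hypothetical hyperparameters; the abstract does not name the real ones.
    learning_rate: float
    epochs: int


def l1_norm(channel: Channel) -> float:
    """Sum of absolute weights, used here as an assumed redundancy score."""
    return sum(abs(w) for w in channel)


def prune_channels(layer: List[Channel], fraction: float) -> List[Channel]:
    """Drop the lowest-norm `fraction` of channels, always keeping at least one."""
    n_remove = max(0, min(int(len(layer) * fraction), len(layer) - 1))
    ranked = sorted(range(len(layer)), key=lambda i: l1_norm(layer[i]))
    keep = sorted(ranked[n_remove:])  # preserve the original channel order
    return [layer[i] for i in keep]


def adaptive_iterative_pruning(
    layer: List[Channel],
    iterations: int,
    fraction: float,
    hp: Hyperparams,
    fine_tune: Callable[[List[Channel], Hyperparams], List[Channel]],
) -> Tuple[List[Channel], List[int]]:
    """Alternate pruning and retraining; return the layer and channel counts."""
    history = []
    for _ in range(iterations):
        layer = prune_channels(layer, fraction)
        layer = fine_tune(layer, hp)  # additional training for accuracy recovery
        history.append(len(layer))
        # Assumed schedule: smaller steps and one more epoch after each iteration.
        hp = Hyperparams(hp.learning_rate * 0.5, hp.epochs + 1)
    return layer, history
```

In a real setting `fine_tune` would retrain the pruned model on the training data; here it is a placeholder so that the control flow of the loop stays visible.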
An analysis of the technical properties of data processing on specialized architectures identified factors that influence data processing. Considerable speedups of the TPU over the GPU were achieved on the later data processing iterations (>3) with deep neural network models, when initialization overheads had little influence on accelerator performance. This factor should be taken into account when improving the efficiency of deep neural network data processing on accelerators with specialized architecture. An analysis of the factors in the deep neural network processing infrastructure that influence processing productivity showed a considerable difference in productivity between different combinations of software and hardware in the processing infrastructure. The achieved speedup factor was up to 8.7. The developed methods are parts of the complex deep neural network adaptation method, which makes it possible to prepare a selected neural network model for use on an accelerator with specialized architecture.
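The observation about initialization overheads suggests measuring speedup only on iterations after a warm-up phase. The following is a minimal sketch of such a measurement, not the benchmark used in the thesis. The function names, the `run_once` callback, and the default of three warm-up iterations (taken from the ">3" remark above) are assumptions for illustration.

```python
import time
from statistics import mean
from typing import Callable, List, Sequence


def time_iterations(run_once: Callable[[], None], iterations: int) -> List[float]:
    """Measure the wall-clock duration of each data processing iteration."""
    timings = []
    for _ in range(iterations):
        start = time.perf_counter()
        run_once()
        timings.append(time.perf_counter() - start)
    return timings


def steady_state_mean(timings: Sequence[float], warmup: int = 3) -> float:
    """Average iteration time after discarding the first `warmup` iterations."""
    if len(timings) <= warmup:
        raise ValueError("need more iterations than warm-up runs")
    return mean(timings[warmup:])


def speedup(baseline: Sequence[float], accelerated: Sequence[float],
            warmup: int = 3) -> float:
    """Ratio of steady-state iteration times (baseline / accelerated)."""
    return steady_state_mean(baseline, warmup) / steady_state_mean(accelerated, warmup)
```

With `warmup=0` the slow first iterations of an accelerator are included in the average, which can hide a speedup that only appears once initialization is complete.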
