Yakovlev V. Algebraic Methods for Discovering of the Vulnerabilities in Binary Code

Українська версія

Thesis for the degree of Candidate of Sciences (CSc)

State registration number

0421U101749

Applicant for

Specialization

  • 01.05.03 - Математичне та програмне забезпечення обчислювальних машин і систем

23-04-2021

Specialized Academic Board

Д 26.194.02

V.M. Glushkov Institute of Cybernetics of National Academy of Sciences of Ukraine

Essay

Considered possibilities of increasing of effectiveness of application of the symbolic methods in the cybersecurity tasks, in particular, the tasks of discovering of the vulnerabilities in the binary code. The “traditional” methods of identifying the code vulnerabilities are insufficient. At the same time, the implementation of the symbolic methods is restricted, as these methods require exhaustive amount of computer resources, and, generally, very slow. To reduce this disadvantage, the hybrid methods and technologies are being developed. These methods implement vulnerabilities detection in several steps, where the symbolic modeling is used after some steps, which are based on another approaches, such as fuzzing, code instrumentation, neural networks etc. In this work, the approach based on the theory of insertion modelling and the behavior algebra is proposed. As a theoretical background, determined the semantics for the program model, and the semantics of low-level Assembler Intel x86 language. Considered the formalization of the model behavior by means of building of the system of behavioral equations, also the formalization of the code vulnerabilities in the form of special patterns in terms of behavior algebra. Proposed the methodology and technology of creation of such patterns. Represented the algorithm of bringing of an equation system in the terms of behavior algebra to the canonical form, is the basis of the algorithm of algebraic matching, proven its correctness and complexity. Also proven the vulnerability reachability in the program model, where a vulnerability is represented as a behavioral pattern. Developed the linear algebraic matching algorithm, which, in conjunction with the symbolic modeling algorithms, significantly increases the effectiveness of the vulnerabilities discovering in the binary code due to the restriction of the search space by the “suspected” behaviors. Proven the linear complexity of this algorithm. Represented the procedure of creating the exploit on the basis of the behaviors obtained from the algebraic matching, and the conditions that leads to a vulnerability. Represented the implementation of the linear algebraic matching algorithm in the prototype of the system for the vulnerabilities discovering. The system performs the binary code analysis in 4 steps: disassembling, translation of assembly code to the behavior algebra, algebraic matching, and symbolic modeling, which proves the reachability of the traces obtained on the matching step. Considered the ways of possible application of the technologies based on the algebraic approach for the discovering of the program backdoors.

Files

Similar theses