The thesis solves a pressing scientific and practical problem: increasing the effectiveness of countering computer linguistic steganography by developing and implementing a method of lossy semantic compression of textual information based on discourse analysis. A method of semantic text compression for counteracting computer linguistic steganography is presented. It mounts an attack on the linguistic stegosystem that removes or destroys the main part of a stegomessage by semantically compressing the text, taking into account both the wide range of steganographic means and the initial semantic structure of the text. The method consists of five stages: automated linguistic analysis of the text, evaluation of its meaningfulness and extraction of the basic meaning, compression, modification, and formation of the final text after the changes have been made. The method provides comprehensive steganalysis of textual data on the basis of discourse analysis and is distinguished by its ability to detect and compress meaningless and artificially generated texts. The new concept of discourse analysis makes it possible to study texts of any subject and style. The text abstracting methods on which the compression is based, and the morphological and syntactic analysis methods on which the study of discourse is based, are adapted to account for the possible use of steganographic means. Mathematical methods of attacking the stegosystem are combined with linguistic methods of analysis by means of elements of formal grammars of natural languages and intensional logic; the specifics of their use define the developed concept of discourse analysis.
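The five-stage flow described above can be illustrated with a minimal sketch. It is not the method of the thesis: a simple word-frequency sentence ranking stands in for discourse analysis, whitespace normalization stands in for the modification stage, and all function names and the `keep_ratio` parameter are hypothetical.

```python
import re
from collections import Counter


def analyze(text):
    """Stage 1 (toy): split the text into sentences and lowercase word tokens."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    sentences = [p.strip() for p in parts if p.strip()]
    return [(s, re.findall(r"[a-z']+", s.lower())) for s in sentences]


def score_sentences(parsed):
    """Stage 2 (toy stand-in): score each sentence by the average
    frequency of its words across the whole text."""
    freq = Counter(w for _, words in parsed for w in words)
    return [
        sum(freq[w] for w in words) / len(words) if words else 0.0
        for _, words in parsed
    ]


def compress(parsed, scores, keep_ratio):
    """Stage 3: keep the highest-scoring sentences in their original order."""
    keep = max(1, int(len(parsed) * keep_ratio))
    ranked = sorted(range(len(parsed)), key=lambda i: -scores[i])
    return [parsed[i][0] for i in sorted(ranked[:keep])]


def modify(sentences):
    """Stage 4 (toy stand-in): normalize whitespace inside each sentence,
    which would erase any spacing-based embedding."""
    return [" ".join(s.split()) for s in sentences]


def form_final_text(sentences):
    """Stage 5: assemble the final text."""
    return " ".join(sentences)


def semantic_compress(text, keep_ratio=0.5):
    """Run the five toy stages in sequence."""
    parsed = analyze(text)
    if not parsed:
        return ""
    scores = score_sentences(parsed)
    return form_final_text(modify(compress(parsed, scores, keep_ratio)))
```

In this sketch the compression is deliberately lossy: sentences that score low are dropped entirely, so anything embedded in them is removed together with the text.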
On the basis of the developed method, a software complex was implemented that detects traces of text modification by means of linguistic steganography and alters the text through compression and modification, without losing its semantic structure and semantic load, in order to remove a possible stegomessage. This automates the examination of a text for the presence of a hidden message and significantly increases the efficiency of processing large arrays of English-language textual data. The software complex includes overload protection that prevents the execution time from exceeding 20,000 milliseconds. Its modular structure makes it possible to cover a wide range of threats and to adapt the software to specific conditions. Flexible two-stage configuration makes it possible to distribute executable functions between the subsystems and to increase the efficiency of the system depending on the practical task at hand.
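The abstract does not say how the overload protection is implemented. One possible approach, shown here purely as an assumption, is a cooperative time budget checked between work items; only the 20,000 ms limit comes from the text, while `process_with_budget`, `handler`, and `clock` are hypothetical names.

```python
import time

TIME_BUDGET_MS = 20_000  # execution-time limit stated in the abstract


def process_with_budget(items, handler, budget_ms=TIME_BUDGET_MS,
                        clock=time.monotonic):
    """Apply handler to each item until all are done or the budget runs out.

    items must be a list. Returns (results, unprocessed), where unprocessed
    holds the items that were skipped because the deadline had passed.
    """
    deadline = clock() + budget_ms / 1000.0
    results = []
    for index, item in enumerate(items):
        # The deadline is checked only between items, so a single slow
        # handler call can still overshoot the budget.
        if clock() >= deadline:
            return results, list(items[index:])
        results.append(handler(item))
    return results, []
```

A call such as `process_with_budget(texts, analyze_text)` would return whatever was finished within the budget along with the remaining texts, so the caller can decide whether to reschedule them. A hard limit on a single long-running call would need a separate process that can be terminated.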
An experimental study was carried out that confirmed the efficiency and effectiveness of the method and of the software complex developed on its basis. Compression efficiency was found to range from 64% for individual sentences to 79% for meaningful texts and 93% for meaningless texts. At the same time, the average probability of stegomessage removal is not lower than 98.65%. Owing to the comprehensive approach and the wide range of threats covered, the effectiveness of the attack on the stegosystem increases, although as the volume of text grows, so does the probability of steganalysis errors of the first type (an empty container perceived as filled) and of the second type (a filled container perceived as empty). In addition, the developed steganalysis system covers a much wider spectrum of examined elements, in line with the tasks of the research, and provides a significantly higher stegoattack efficiency index than comparable steganalyzers. Comparison with existing abstracting systems confirmed the suitability of the developed system for steganography tasks and also revealed a higher text compression ratio. The method takes into account the known methods of textual steganography and is effective in counteracting current threats.
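The abstract does not define how these figures are computed. The sketch below shows one plausible reading, stated as an assumption: compression efficiency as the fraction of characters removed, and the first- and second-type error rates as the usual false-positive and false-negative rates of a detector. The function names are hypothetical.

```python
def compression_efficiency(original, compressed):
    """Fraction of the original text removed, measured in characters
    (an assumed definition; the thesis may measure words or sentences)."""
    if not original:
        raise ValueError("original text must not be empty")
    return 1.0 - len(compressed) / len(original)


def error_rates(is_filled, flagged_filled):
    """Return (type_1_rate, type_2_rate) for a steganalysis detector.

    is_filled:      ground truth, True if the container holds a stegomessage
    flagged_filled: detector output, True if it reported a filled container
    Type 1: an empty container perceived as filled.
    Type 2: a filled container perceived as empty.
    """
    pairs = list(zip(is_filled, flagged_filled))
    empty = sum(1 for truth, _ in pairs if not truth)
    filled = len(pairs) - empty
    false_pos = sum(1 for truth, flag in pairs if not truth and flag)
    false_neg = sum(1 for truth, flag in pairs if truth and not flag)
    type_1 = false_pos / empty if empty else 0.0
    type_2 = false_neg / filled if filled else 0.0
    return type_1, type_2
```

Under this reading, a 100-character text compressed to 21 characters would have a compression efficiency of 0.79.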
The comprehensive approach to steganalysis and compression opens up a wide range of applications for the software complex that implements the method in solving many practical cybersecurity tasks.
Keywords: computer linguistic steganography, semantic text compression, textual steganalysis, automated steganalysis, determination of text meaningfulness, attack on the linguistic stegosystem, counteraction to computer linguistic steganography, linguistic stegosystem, computer steganalysis, attack by compression.