Alexandra A. Investigation of Bayesian pattern recognition procedures on homogeneous Markovian chains.

Thesis for the degree of Candidate of Sciences (CSc)

State registration number

0404U003978

Applicant for

Specialization

  • 01.05.01 - Theoretical foundations of informatics and cybernetics

22-10-2004

Specialized Academic Board

Д 26.194.02

V.M. Glushkov Institute of Cybernetics of National Academy of Sciences of Ukraine

Abstract

The dissertation investigates Bayesian pattern recognition procedures for homogeneous Markovian chains. The upper bound on the error estimate is polynomial in the input data. A significant result is proved: any procedure may work incorrectly when one of the classes is absent from the learning sample, and the error estimate is strictly positive. Statistical inference for estimates of the transition probabilities is developed. The limiting joint normal distribution of the estimates, together with their variances and covariances, is presented. The ergodic properties of estimates of the transition probabilities, defined as frequency ratios, are examined. Statistical analysis of our chromosomes and proteins in the human genome is presented. On the basis of a test of the hypothesis that the chain is of a given order, it is deduced that the homogeneous Markovian chain is the best model for analysing DNA sequence data. For large proteins, the Markov chain is homogeneous under the alternative hypothesis that the amino acids are independent. The ergodic properties of the stochastic matrices of DNA and proteins are suggested.

Key words: Bayesian pattern recognition procedure, error estimation, upper bound, homogeneous Markovian chain, ergodic property.
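The frequency-ratio estimates of transition probabilities mentioned in the abstract can be illustrated with a minimal sketch. It assumes a first-order homogeneous chain over the DNA alphabet and the usual ratio n_ij / n_i (transitions from state i to state j divided by visits to state i that have a successor). The function name, the alphabet argument, and the sample sequence are hypothetical and are not taken from the dissertation, whose exact estimator may differ.

```python
from collections import Counter


def estimate_transition_probabilities(sequence, alphabet="ACGT"):
    """Frequency-ratio estimates p_ij = n_ij / n_i for a first-order chain.

    Illustrative assumption: n_ij counts adjacent pairs (i, j) and n_i counts
    occurrences of i that are followed by another symbol.
    """
    # n_ij: number of observed transitions i -> j
    pair_counts = Counter(zip(sequence, sequence[1:]))
    # n_i: number of times state i occurs with a successor
    state_counts = Counter(sequence[:-1])

    estimates = {}
    for i in alphabet:
        n_i = state_counts[i]
        # Rows for states that were never observed are left as zeros
        estimates[i] = {
            j: (pair_counts[(i, j)] / n_i if n_i else 0.0) for j in alphabet
        }
    return estimates


if __name__ == "__main__":
    # Hypothetical toy sequence, used only to show the call
    matrix = estimate_transition_probabilities("ACGTACGA")
    for state, row in matrix.items():
        print(state, row)
```

Each row of the estimated matrix sums to one whenever the corresponding state was observed with a successor, so the result is a stochastic matrix on the observed states.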
