This paper presents a pattern recognition approach to quantitative structure-property relationship (QSPR) classification for the CYP2C19 isoform. QSPR is the correlative computer modelling of the properties of chemical molecules and is widely used in cheminformatics and the pharmaceutical industry. Predicting whether a particular chemical will be metabolized by CYP2C19 is of primary importance to the pharmaceutical industry. The task poses several challenges. First, the analyzed data are characterized by significant biological noise. Second, the training set is unbalanced, with objects from the negative class outnumbering the positives four to one. The presented solution addresses both problems and additionally incorporates thorough feature selection to improve the stability of the obtained results. Strong emphasis is placed on outlier detection and proper model validation to achieve the best predictive power.