Search results

1

On the Consistency of Multithreshold Entropy Linear Classifier

100%

Czarnecki W. M.

Schedae Informaticae

|

2015

|

vol. 24

PL

Multithreshold Entropy Linear Classifier (MELC) is a recent classifier idea which employs information theoretic concept in order to create a multithreshold maximum margin model. In this paper we analyze its consistency over multithreshold linear models and show that its objective function upper bounds the amount of misclassified points in a similar manner like hinge loss does in support vector machines. For further confirmation we also conduct some numerical experiments on five datasets.

2

Fast Optimization of Multithreshold Entropy Linear Classifier

63%

Józefowicz R., Czarnecki W. M.

Schedae Informaticae

|

2014

|

vol. 23

PL

Multithreshold Entropy Linear Classifier (MELC) is a density based model which searches for a linear projection maximizing the Cauchy-Schwarz Divergence of dataset kernel density estimation. Despite its good empirical results, one of its drawbacks is the optimization speed. In this paper we analyze how one can speed it up through solving an approximate problem. We analyze two methods, both similar to the approximate solutions of the Kernel Density Estimation querying and provide adaptive schemes for selecting a crucial parameters based on user-specified acceptable error. Furthermore we show how one can exploit well known conjugate gradients and L-BFGS optimizers despite the fact that the original optimization problem should be solved on the sphere. All above methods and modifications are tested on 10 real life datasets from UCI repository to confirm their practical usability.

3

Analysis of Compounds Activity Concept Learned by SVM Using Robust Jaccard Based Low-dimensional Embedding

63%

Jastrzębski S., Czarnecki W. M.

Schedae Informaticae

|

2015

|

vol. 24

PL

Support Vector Machines (SVM) with RBF kernel is one of the most successful models in machine learning based compounds biological activity prediction. Unfortunately, existing datasets are highly skewed and hard to analyze. During our research we try to answer the question how deep is activity concept modeled by SVM. We perform analysis using a model which embeds compounds’ representations in a low-dimensional real space using near neighbour search with Jaccard similarity. As a result we show that concepts learned by SVM is not much more complex than slightly richer nearest neighbours search. As an additional result, we propose a classification technique, based on Locally Sensitive ashing approximating the Jaccard similarity through minhashing technique, which performs well on 80 tested datasets (consisting of 10 proteins with 8 different representations) while in the same time allows fast classification and efficient online training.

Refine search results

3 Schedae Informaticae

2 2015

1 2014

3 Czarnecki W. M.

1 Jastrzębski S.

1 Józefowicz R.

On the Consistency of Multithreshold Entropy Linear Classifier

Fast Optimization of Multithreshold Entropy Linear Classifier

Analysis of Compounds Activity Concept Learned by SVM Using Robust Jaccard Based Low-dimensional Embedding