The selection of data representation and metric for a given data set is one of the most crucial problems in machine learning since it affects the results of classification and clustering methods. In this paper we investigate how to combine a various data representations and metrics into a single function which better reflects the relationships between data set elements than a single representation-metric pair. Our approach relies on optimizing a linear combination of selected distance measures with use of least square approximation. The application of our method for classification and clustering of chemical compounds seems to increase the accuracy of these methods.
JavaScript is turned off in your web browser. Turn it on to take full advantage of this site, then refresh the page.