Lorant Bodis (1), Alfred Ross (2), and Ernö Pretsch (1). (1) Department of Chemistry and Applied Biosciences, ETH Zurich, ETH Hönggerberg, HCI E 312, Zurich, CH-8093, Switzerland; (2) Pharmaceuticals Division, F. Hoffmann-La Roche Ltd, Grenzacherstr, Basel, CH-4070, Switzerland
Most available vector comparison methods, such as the correlation coefficient and the Tanimoto coefficient, detect only point-wise similarity. Similarity criteria for comparing spectra should also include information about the neighborhood of corresponding items so that shifted signals can be identified as well. So far, only a few such methods have been described. A recent one is based on a locally weighted cross-correlation function normalized by the geometric mean of the individual autocorrelation functions. A much better performance has been achieved with a novel similarity criterion. The two vectors to be compared are divided into i bins (i = 1, ..., N), and for each division the integral of each bin is calculated. Similarity indices are derived by comparing the corresponding integrals, and the mean of the normalized similarity indices serves as the similarity criterion. The presented similarity criteria are characterized with contingency tables and histograms obtained from tests on simple artificial 1H NMR spectra having different degrees of similarity. Furthermore, they are applied to the comparison of measured and estimated spectra from a complex real-life database. Although the novel procedure has so far been tested only with one-dimensional 1H NMR spectra, the generality of the approach makes its application to spectra of two or more dimensions, including image analysis, straightforward.
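The binning scheme described above can be illustrated with a minimal Python sketch. The abstract does not give the formula for the per-division similarity index or its normalization, so the choices here are assumptions made purely for illustration: both vectors are scaled to unit total area, and the index for a division is taken as one minus half the summed absolute difference of the bin integrals. The function name `binned_similarity` and the parameter `n_max` (standing in for N) are hypothetical.

```python
import numpy as np


def binned_similarity(a, b, n_max):
    """Mean similarity index over divisions into 1..n_max bins.

    Illustrative sketch only: the per-division index below is an
    assumed form, not the formula used by the authors.
    """
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    if a.ndim != 1 or a.shape != b.shape:
        raise ValueError("a and b must be 1-D vectors of equal length")
    if not 1 <= n_max <= a.size:
        raise ValueError("n_max must lie between 1 and the vector length")
    if (a < 0).any() or (b < 0).any() or a.sum() == 0 or b.sum() == 0:
        raise ValueError("vectors must be non-negative with non-zero area")

    # Assumption: scale both vectors to unit total area before comparing.
    a = a / a.sum()
    b = b / b.sum()

    indices = []
    for i in range(1, n_max + 1):
        # Integral (here: plain sum) of each of the i bins.
        int_a = np.array([chunk.sum() for chunk in np.array_split(a, i)])
        int_b = np.array([chunk.sum() for chunk in np.array_split(b, i)])
        # Assumed index: 1 for identical bin integrals, 0 for no overlap.
        indices.append(1.0 - 0.5 * np.abs(int_a - int_b).sum())
    return float(np.mean(indices))


# Three single-peak test vectors: a reference, a slightly shifted copy,
# and a strongly shifted copy.
ref = np.zeros(100)
ref[20] = 1.0
near = np.zeros(100)
near[22] = 1.0
far = np.zeros(100)
far[50] = 1.0

print(binned_similarity(ref, ref, 100))
print(binned_similarity(ref, near, 100))
print(binned_similarity(ref, far, 100))
```

Under these assumptions, coarse divisions place a slightly shifted peak in the same bin as the reference peak, so the small shift scores higher than the large one. A point-wise measure would treat the two shifted vectors almost identically, since neither overlaps the reference.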