36. MIXED TEXT AND STRUCTURE SEARCHING OF CHEMICAL DATABASES.
Presenting Author: Jack Delany (1)
Secondary Author(s): John Bradshaw, Martyn Ford, Mike Lipkin, Fergus Lippi, David Salt, and Roger Sayle
(1) Daylight Chemical Information Systems, Santa Fe, NM 87501 USA
Searching of chemical databases is traditionally restricted to finding superstructures of a given target molecule. The aim of this work is to create an environment in which compounds can be ranked by similarity to a target using not only structural data but also non-structural data (e.g., text) associated with the compounds of interest. Given the powerful tools already available for comparing binary strings (fingerprints), we have concentrated on investigating how to convert other data types, especially text strings, into a suitable form. We have developed two coding systems that represent the non-structural and structural information in a common format, so that standard information retrieval and chemical similarity ranking methods can be used to relate compounds. This paper reports our latest results.
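To illustrate the general idea of putting text and structure into one binary format, the following is a minimal sketch and not the coding systems reported in this work, which the abstract does not specify. It assumes that text is hashed word by word into a fixed-width bit string, that a structural fingerprint is already available as an integer bitmask, and that the Tanimoto coefficient is used for ranking. All names here (`text_fingerprint`, `combine`, `rank`, `N_BITS`) are hypothetical.

```python
import hashlib
import re

N_BITS = 256  # hypothetical width of the text portion of the fingerprint


def text_fingerprint(text, n_bits=N_BITS):
    """Hash each lowercased word token to one bit of an n_bits-wide bitmask."""
    fp = 0
    for token in re.findall(r"[a-z0-9]+", text.lower()):
        digest = hashlib.md5(token.encode("utf-8")).digest()
        bit = int.from_bytes(digest[:4], "big") % n_bits
        fp |= 1 << bit
    return fp


def combine(struct_fp, text_fp, text_bits=N_BITS):
    """Concatenate a structural bitmask (assumed given) with a text bitmask."""
    return (struct_fp << text_bits) | text_fp


def tanimoto(a, b):
    """Tanimoto coefficient: shared set bits divided by bits set in either."""
    union = bin(a | b).count("1")
    if union == 0:
        return 0.0
    return bin(a & b).count("1") / union


def rank(target_fp, candidates):
    """Return candidate names ordered from most to least similar to the target."""
    return sorted(
        candidates,
        key=lambda name: tanimoto(target_fp, candidates[name]),
        reverse=True,
    )
```

In this sketch, a target would be encoded with `combine(struct_fp, text_fingerprint(notes))` and passed to `rank` together with a dictionary mapping compound names to fingerprints encoded the same way. Simple concatenation weights each set bit equally, whether it comes from text or structure; a real system would likely need to balance the two parts.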