22. Data mining of toxic chemicals and database-based toxicity prediction. Jiansuo Wang and Luhua Lai, Institute of Physical Chemistry, Peking University, Beijing 100871, China, Fax: 86-10-62751725, wjs@mdl.ipc.pku.edu.cn
In the early stage of drug discovery, especially in computer-aided drug design, a
large number of molecules are proposed as potential leads, and the bioactivity
risk of these molecules should be evaluated prior to synthesis. Rule-based
expert systems have been used for this purpose, while mining large amounts of
toxicological data offers a promising alternative.
From the viewpoint of pharmacologists and toxicologists, toxicants are drugs that cause serious harm. The biochemical basis of toxic chemicals is therefore the same as that of drugs, and toxicant-receptor systems exist just as drug-receptor systems do. On this basis, we apply concepts and technologies developed in drug design to toxic chemicals and carry out the following work.
I. We have studied the structural features of toxic chemicals in the RTECS database that are associated with specific toxicities. Potential active frameworks, groups, and structure patterns for each specific toxicity are obtained by computational chemistry approaches. These structural features help in understanding the activities of toxic chemicals and are useful for predicting the toxicity of chemicals, especially in the early stage of drug design.
II. We take a two-step strategy to explore noncongeneric toxic chemicals from the RTECS database: screening of structure patterns, followed by generation of a detailed relationship between structure and activity. The performance of the overall procedure shows that such a stepwise scheme is feasible and effective for mining a database of toxic chemicals.
III. We have developed a programme that serves as a database-based toxicity predictor of chemicals (dbToxPre). For a query molecule whose activity is to be predicted, the programme first retrieves a set of structurally related molecules by quick shape comparison with the molecules in the toxicological database; it then carries out a detailed structure-toxicity relationship analysis on this set and produces the toxicity prediction for the query molecule. The programme mainly includes four parts: a) fast and efficient clustering of molecules based on molecular shape; b) field-based similarity computation of molecular structures based on the shape clusters; c) flexible CoMFA analysis of molecules based on the shape clusters; d) a database of toxic chemicals suitable for this procedure.
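The retrieve-then-analyse workflow described in III can be illustrated with a minimal Python sketch. It is not the dbToxPre implementation. The Molecule record, the numeric shape descriptors, the Euclidean shape_distance, and the similarity-weighted average are all assumptions made for illustration; in particular, the weighted average is a deliberately simple stand-in for the field-based similarity and flexible CoMFA analysis named above.

```python
from dataclasses import dataclass
from math import sqrt
from typing import List, Tuple


@dataclass
class Molecule:
    """Hypothetical database record (not the RTECS schema)."""
    name: str
    shape: Tuple[float, ...]  # assumed numeric shape descriptor
    toxicity: float           # assumed numeric toxicity value


def shape_distance(a: Tuple[float, ...], b: Tuple[float, ...]) -> float:
    """Euclidean distance between two shape descriptors (an assumption)."""
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def retrieve_related(query_shape: Tuple[float, ...],
                     database: List[Molecule],
                     k: int = 3) -> List[Molecule]:
    """Step 1: quick shape comparison, keeping the k closest molecules."""
    ranked = sorted(database, key=lambda m: shape_distance(query_shape, m.shape))
    return ranked[:k]


def predict_toxicity(query_shape: Tuple[float, ...],
                     database: List[Molecule],
                     k: int = 3) -> float:
    """Step 2: local analysis on the retrieved set.

    A similarity-weighted mean stands in here for the detailed
    structure-toxicity analysis; it is not CoMFA.
    """
    neighbours = retrieve_related(query_shape, database, k)
    weights = [1.0 / (1.0 + shape_distance(query_shape, m.shape))
               for m in neighbours]
    weighted_sum = sum(w * m.toxicity for w, m in zip(weights, neighbours))
    return weighted_sum / sum(weights)


if __name__ == "__main__":
    # Toy data with made-up descriptors and toxicity values.
    toy_db = [
        Molecule("A", (1.0, 0.0), 2.0),
        Molecule("B", (0.0, 1.0), 4.0),
        Molecule("C", (5.0, 5.0), 9.0),
    ]
    print(predict_toxicity((0.5, 0.5), toy_db, k=2))
```

Restricting the analysis to the retrieved set is what allows a noncongeneric database to be handled: the detailed model only ever sees molecules that already resemble the query in shape.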