Division of Chemical Information Sponsored Symposia
American Chemical Society National Meeting, April 13-17, 1997
SUNDAY MORNING, APRIL 13, 1997
SECTION A
8:30 AM
1. RAPID DIVERSITY ANALYSIS IN COMBINATORIAL LIBRARIES USING MARKUSH STRUCTURE TECHNIQUES. John M. Barnard, Geoff M. Downs, David B. Turner, Simon M. Tyrrell, Peter Willett, Barnard Chemical Information Ltd., 46 Uppergate Road, Stannington, Sheffield S6 6BX UK, and Department of Information Studies, University of Sheffield, Sheffield S10 2TN, United Kingdom.
A number of techniques and algorithms have been developed to handle Markush structures from chemical patents without enumerating the (often extremely large) sets of compounds described. This paper describes the application of such techniques to diversity analysis in combinatorial libraries. We present a data structure which shows both the chemical nature of the monomer units in a combinatorial library, and the logical relationships between them. We describe the use of algorithms to generate structural fingerprints for the compounds in the library directly from this data structure, and we also discuss its use for rapid calculation of numerical measures of library diversity.
9:00 AM
2. A COMPARISON OF DISSIMILARITY METHODOLOGIES IN CONSTRUCTING REPRESENTATIVE COMPOUND LIBRARIES. Michael S. Lajiness, Computer-Aided Drug Discovery, Pharmacia & Upjohn, Inc., Kalamazoo, Michigan 49007-4940.
There has been considerable interest in structural diversity and how it relates to pharmaceutical lead finding and development. Several different approaches have been proposed and utilized for the selection of structurally diverse subsets of compounds. An important question that is often not addressed, however, is how these approaches compare. Are they equally effective in terms of distinguishing a steroid from a prostaglandin? Is one method better at locating biologically active compounds? This paper will compare several different methods for selecting structurally diverse subsets of compounds. The effectiveness of these methods in locating active compounds will be assessed using results from several different biological assays conducted at Pharmacia & Upjohn. The methods currently under study are from Tripos, Arris Pharmaceuticals, and Pharmacia & Upjohn.
9:30 AM
3. THE WELL TAILORED LIBRARY: BEYOND MERE DIVERSITY. Eric J. Martin, Roger E. Critchlow, Chiron Corp., Emeryville, CA 94608.
Combinatorial library design attempts to choose the best substituent set for a combinatorial synthetic scheme to maximize the chances of finding useful compounds such as drug leads. Initial efforts focused primarily on maximizing diversity, perhaps allowing bias through the inclusion of a small, fixed, set of pharmacophoric groups. However, many factors besides diversity impact good library design. A library can be better "tailored" by creating categories such as polar, pharmacophoric, rigid, low molecular weight, inexpensive, etc. The most diverse designs matching desired profiles of these characteristics are generated. Comparing the diversity scores among design profiles reveals tradeoffs between high diversity and physical property distributions, synthetic difficulty, expense, pharmacophoric bias, etc. Tailored library design requires close interactive effort between computational and medicinal chemists, so specialized programs were developed to integrate substructure searching, display, and statistics to facilitate the design of well tailored libraries.
10:00 AM
4. THE DIMENSIONS OF CHEMICAL SIMILARITY SPACE. Robin W. Spencer,
Pfizer Central Research, Groton, CT 06333.
The Tanimoto distance function is an efficient measure for database searching
and diversity analysis, and may be taken to define a "chemistry space." Because
this space is based on a comparison of hundreds of molecular fragments, it has
been presumed to have high information content and high dimensionality. Yet a
generalized measure of dimension shows that it is mostly less than 10
dimensional. Analysis of the space surrounding single compounds as well as a
simulation of an ideal space shows how chemistry space depends on the presence
of true analogs, the size of the fragment library, and the size of the compound
collection.
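As a point of reference for the distance function named in this abstract, the following is a minimal sketch of the Tanimoto coefficient on binary fragment fingerprints. Representing a fingerprint as a Python set of "on" bit positions, the function names, and the empty-fingerprint convention are all assumptions made for illustration; they are not taken from the paper.

```python
def tanimoto_similarity(a, b):
    """Tanimoto coefficient of two fingerprints given as sets of 'on' bit positions."""
    if not a and not b:
        return 1.0  # convention assumed here: two empty fingerprints count as identical
    shared = len(a & b)
    # shared bits divided by the number of bits set in either fingerprint
    return shared / (len(a) + len(b) - shared)


def tanimoto_distance(a, b):
    """Distance form used for diversity work: 1 minus the similarity."""
    return 1.0 - tanimoto_similarity(a, b)


# Hypothetical fingerprints: bit positions stand for fragments present in each molecule.
fp_x = {1, 4, 7, 9}
fp_y = {1, 4, 8}
print(tanimoto_distance(fp_x, fp_y))
```

Because only the shared and total bit counts enter the formula, many different fragment patterns map to the same distance, which is one intuition for why the effective dimensionality can be lower than the fingerprint length.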
10:30 AM
5. VALIDATING METRICS AND METHODS FOR SELECTING DIVERSE
CHEMICAL SUBSETS. Robert D. Clark, Richard D. Cramer, Jon T. Swanson;
Tripos, Inc., 1699 South Hanley Road, St. Louis, Missouri 63144.
Combinatorial synthesis for lead discovery has sharply increased the need to
systematically select diverse representative subsets from virtual databases of
compounds, because making all possible combinations of all possible reagents is
usually neither practical nor desirable. The task is complicated by the fact
that relevant diversity is in terms of physiological response, but only
descriptors derived from structural information are generally available.
Descriptors which perform well in the context of QSAR analysis or similarity
searching are not necessarily well-behaved for selecting diverse subsets.
Moreover, how well a particular descriptor performs can depend upon the
(dis)similarity measure (e.g., Euclidean vs. Tanimoto) used as well as on the
selection algorithm employed. General procedures for quantitatively assessing
the usefulness of methods for finding diverse subsets will be discussed, along
with results for some particular combinations - some common and some novel - of
descriptors, measures, and selection algorithms.
11:00 AM
6. A RIGOROUS EVALUATION OF THE NEIGHBORHOOD PROPERTIES OF
DIVERSITY METRICS. J. B. Kinney, C. J. Eyermann, DuPont and DuPont/Merck
Pharmaceuticals, Stine-Haskell Research Center, Newark, Delaware 19714-0030.
Assessing the validity of a diversity metric is an important step in the
design and analysis of combinatorial libraries. One of the important properties
of a good diversity metric is the ability to predict the properties of a
compound based on its neighbors' properties. This paper will present a detailed
statistical analysis of the quantitative performance of a variety of common
metrics using data from several scouting and lead optimization programs. The
discussion will focus on practical aspects of the neighborhood properties using
a variety of statistical and graphical performance assessments.
11:30 AM
7. MODAL FINGERPRINTS AND TOPOLOGICAL DIVERSITY. C. J.
Blankley, Department of Chemistry, Parke-Davis Pharmaceutical Research
Division, Warner-Lambert Company, 2800 Plymouth Road, Ann Arbor, MI 48105.
A new method has recently become available (Stigmata; Shemetulskis et al.,
J. Chem. Inf. Comput. Sci. (1996), 36, 862-871) for extracting the common
element, termed the modal fingerprint, from a set of molecular fingerprints.
Molecular fingerprints based on a topological description of the molecule
capture atom and bond path information in binary form which is readily amenable
to comparison. The modal fingerprints for a set of molecules, extracted at
maximum, median or minimum stringencies, offer a profile of the degree of
topological similarity (or dissimilarity) within a given collection of
compounds. Approaches to using this tool to measure chemical diversity within
and between chemical datasets and relating the derived metrics to qualitative
chemical notions of diversity will be illustrated by considering data
collections of various origins typical of those encountered by medicinal
chemists. Some comparisons with other proposed diversity measures will also be
offered.
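One plausible reading of a modal fingerprint is sketched below: the set of bits that are on in at least a chosen fraction of the input fingerprints. Treating "stringency" as this fraction, and the function name `modal_fingerprint`, are assumptions for illustration only; the Stigmata method cited in the abstract may define both differently.

```python
def modal_fingerprint(fingerprints, threshold):
    """Return the bits set in at least `threshold` (0..1) of the fingerprints.

    Each fingerprint is a set of 'on' bit positions. This is an illustrative
    definition, not the published Stigmata algorithm.
    """
    if not fingerprints:
        return set()
    counts = {}
    for fp in fingerprints:
        for bit in fp:
            counts[bit] = counts.get(bit, 0) + 1
    needed = threshold * len(fingerprints)
    return {bit for bit, n in counts.items() if n >= needed}


# Hypothetical collection of three molecules.
collection = [{1, 2, 3}, {1, 2, 4}, {1, 5}]
strict = modal_fingerprint(collection, 1.0)  # bits common to every molecule
loose = modal_fingerprint(collection, 0.5)   # bits present in at least half
print(sorted(strict), sorted(loose))
```

Under this reading, a collection whose strict and loose modal fingerprints are both small shares few features and is topologically diverse, while a large strict modal fingerprint indicates a tight analog series.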
SUNDAY AFTERNOON, APRIL 13, 1997
SECTION A
2:00 PM
8. DESIGNING COMBINATORIAL LIBRARIES USING AUTOMATED DOCKING
METHODS AND 3D-QSAR. M. G. Bures, Abbott Laboratories, D47E AP
10-2, 100 Abbott Park Road, Abbott Park IL 60064-3500.
Increasing emphasis is being placed on using structural information, along
with diversity analysis, to help design focused combinatorial libraries. Using
an experimental or modeled structure of a representative library member or
analog, we use docking programs such as DOCK and LUDI to orient and score a set
of possible substituents in their proposed binding region. The score, or
estimated binding energy, of the substituent is used as an indication of the
predicted potency of the resulting library compound. In addition, when
appropriate, we use CoMFA to generate a 3D-QSAR model for the compounds actually
synthesized and tested and use the model to forecast the potency of new
sets of substituents. The mechanics of this approach and results of validation
studies will be discussed.
2:30 PM
9. 2D VERSUS 3D SIMILARITY: USE OF MOLECULAR SHAPE-BASED 3D
SEARCHING TECHNIQUES FOR IDENTIFYING NOVEL COMPOUNDS. Osman F. Güner,
Matthew Hahn, Hong Li, and Moises Hassan, Molecular Simulations, Inc., San
Diego, CA 92121.
Steric shape plays a crucial role in receptor-ligand binding and a new drug
candidate must first fit inside a receptor active site before it has a chance to
bind to it and exert a biological effect. Therefore, shape-based 3D
searching techniques complement the traditional pharmacophore-based 3D
searching techniques well. Since shape-based 3D search retrieves compounds that are
"similar" in shape, a comparison with an established 2D similarity method
reveals interesting differences. The comparative study was accomplished by
performing 2D similarity and 3D shape searches on the same database using the
same query molecule. The most similar 20 hits from both searches are compared
and analyzed. While the 2D similarity method retrieved compounds with similar
topology but different size, the 3D shape search retrieved compounds similar in
shape but with diverse topology. The advantages and limitations of each method
are presented.
3:00 PM
10. DESIGNING PHARMACOPHORICALLY DIVERSE LIBRARIES. D.
Pickett, Rhône-Poulenc Rorer S.A., Centre de Recherche de Vitry-
Alfortville, 13 Quai Jules Guesde, BP14, 94403 Vitry-sur-Seine, France.
In recent years, the pharmaceutical industry has become increasingly
concerned with methods for diversity analysis, driven by the needs of
combinatorial chemistry. The results will depend critically on the measure of
diversity selected. Methods have been developed which utilize the pharmacophores
presented by a molecule as a descriptor (S.D. Pickett, J.S. Mason, I.M. McLay,
J. Chem. Inf. Comput. Sci. in press). As the pharmacophoric properties of an
individual molecule within a library will depend upon the interaction between
different R-groups making up the molecule, reagent selection should be performed
as far as possible on the properties of the final products. The difficulty lies
in the combinatorial nature of the problem - selection of one reagent
immediately specifies a number of products. Strategies have been developed to
aid in reagent selection which address this problem. The interdependency of the
groups means that it is not possible to select one set of reagents suitable for
all situations. Rather, the selection process should be repeated for each
library of interest.
3:30 PM
11. DIVERSITY SELECTION OF REAGENTS FOR COMBINATORIAL
CHEMISTRY BY A 3D DOCKING APPROACH. H. Briem, Boehringer Ingelheim KG,
Med. Chem. Dept., D-55216 Ingelheim, Germany.
Selection of reagents is a crucial step in the generation of a compound
library by combinatorial chemistry. Ideally, the selection should retain as much
molecular diversity in 3D space as possible. In this paper a new diversity
metric for chemical reagents will be described. The approach includes docking of
an assembly of aligned compounds into different receptor binding pockets. The
common substructures of the assemblies are held fixed at different grid points
within the pocket. Each reagent at each position is scored by interaction energy
with the protein. After data reduction by principal components analysis,
different selection and clustering algorithms may be applied in order to
generate a diverse subset of reagents. The paper will describe the procedure and
some examples will be given.
4:00 PM
12. ASYMMETRIC SIMILARITY AND MOLECULAR DIVERSITY. G.M.
Maggiora, J. Mestres*, T.R. Hagadone, and M.S. Lajiness,
Computer-Aided Drug Discovery, Pharmacia & Upjohn, 301 Henrietta Street,
Kalamazoo, MI 49007.
[*Permanent address: Institute of Computational Chemistry, University of
Girona, 17071 Girona, Catalonia, Spain].
A measure of similarity of the molecules within a given set is essential to
any method for evaluating the molecular diversity of the set. Most similarity
measures in use today are symmetric, that is X is as similar to Y as Y is to X.
A new class of asymmetric similarity measures will be presented, and how these
measures provide molecular information not provided by symmetric measures will
be described. An assessment of molecular diversity based upon a Shannon-like
entropy function will also be presented, along with a comparative analysis of
the performance of symmetric and asymmetric similarity measures based upon the
Shannon diversity function.
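The two ideas in this abstract can be illustrated with a short sketch. The abstract does not give the authors' formulas, so both functions below are stand-ins chosen for illustration: `asymmetric_similarity` is a simple containment ratio (the fraction of X's features found in Y), and `shannon_diversity` is the standard Shannon entropy over class labels such as cluster assignments.

```python
import math
from collections import Counter


def asymmetric_similarity(x, y):
    """Fraction of x's features that also occur in y.

    In general asymmetric_similarity(x, y) != asymmetric_similarity(y, x).
    Illustrative stand-in, not the measure presented in the paper.
    """
    if not x:
        return 0.0
    return len(x & y) / len(x)


def shannon_diversity(labels):
    """Shannon entropy (natural log) of the distribution of class labels."""
    counts = Counter(labels)
    total = len(labels)
    return -sum((n / total) * math.log(n / total) for n in counts.values())


# Hypothetical feature sets: a small fragment and a larger molecule containing it.
small = {1, 2}
large = {1, 2, 3, 4}
print(asymmetric_similarity(small, large), asymmetric_similarity(large, small))

# Hypothetical cluster labels for a four-compound set.
print(shannon_diversity(["a", "a", "b", "b"]))
```

The asymmetry captures a substructure-like relationship that a symmetric coefficient averages away: the small molecule is fully contained in the large one, but not the reverse.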
4:30 PM
13. FAST LIGAND DOCKING INTO RECEPTOR CAVITIES. Akbar
Nayeem, Tad Hurst, Joe Leonard, Tripos, Inc., 1699 So. Hanley Rd., St.
Louis, MO 63144.
When the 3D structure of an important biological receptor is known,
researchers would like to use that information to find novel potential drug
candidates. This has fueled a high level of interest in computational methods of
ligand-receptor docking. At the same time, combinatorial chemistry has expanded
the number of structures which can be produced by medicinal research groups by
orders of magnitude. Thus, there is a desire to screen the extremely large
libraries of compounds which could be made for structures which are maximally
likely to bind to the receptor using computational techniques. This process is
called Virtual High Throughput Screening (VHTS), and requires ligand-receptor
docking tools which are extremely fast. In this presentation we will discuss the
efforts to produce a high-quality flexible docking system which is appropriate
for screening databases of one million structures or more.
MONDAY MORNING, APRIL 14, 1997
SECTION A
9:00 AM
14. REDUCED DIMENSIONAL REPRESENTATIONS OF MOLECULES AND
MOLECULAR SIMILARITY. W. Graham Richards, Daniel D. Robinson, Physical and
Theoretical Chemistry Laboratory, Oxford University, South Parks Road, Oxford
OX1 3QZ, United Kingdom.
Experienced molecular modelers are accustomed to displaying molecular
structures from databases as three-dimensional representations on graphics
terminals. These displays are however frequently not as easily recognized by
non-specialists who think of their chemistry in terms of molecular structures
drawn on a flat page. Using the technique of non-linear mapping we have
developed a way of displaying three-dimensional structures in two dimensions
whilst retaining the information contained in the three-dimensional distance
matrix. At the same time the figures are recognizable in classical terms. Once
derived, the two-dimensional representation has major advantages in searching
for structural similarities. For two-dimensional diagrams we can take advantage
of methods developed for pattern recognition. This holds out the promise of
calculating mutual similarities between all pairs of molecules in large data
sets derived from high throughput synthesis or combinatorial chemistry and hence
quantifying diversity.
9:30 AM
15. SIMULATED ANNEALING GUIDED EVALUATION (SAGE) OF DIVERSITY:
A NOVEL COMPUTATIONAL TOOL FOR DIVERSE CHEMICAL LIBRARY DESIGN AND DATABASE
MINING. Alexander Tropsha1, Weifan Zheng1, Sung Jin Cho1, Ceris L.
Waller2
[1Laboratory for Molecular Modeling, School of Pharmacy, University of North
Carolina, Chapel Hill, NC 27599-7360, 2Oncogene Science, Inc., Uniondale, N.Y.
11553.]
We have developed a program for Simulated Annealing Guided Evaluation (SAGE)
of molecular diversity. SAGE has been implemented for both the rational design
of diverse chemical libraries and database mining. Several large simulated data
sets were generated and used to evaluate the effectiveness of the method. Two
different diversity functions were designed and compared in terms of maximizing
the diversity while maintaining the uniformity of distribution of selected
objects in the descriptor space. The best diversity function was analogous to
the Coulomb law. A Kohonen self-organizing map was used for both preprocessing the
datasets and visualizing the results. We propose SAGE as a general tool for
diversity analysis and database mining in the context of new drug discovery.
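The general idea of annealing toward a diverse subset can be sketched as follows. This is not the SAGE program: the 1/distance "Coulomb-like" energy, the single-swap move, the geometric cooling schedule, and every name and parameter value are assumptions chosen for illustration.

```python
import math
import random


def coulomb_energy(points, chosen):
    """Sum of 1/distance over all pairs of chosen points; lower means more spread out."""
    energy = 0.0
    for i in range(len(chosen)):
        for j in range(i + 1, len(chosen)):
            d = math.dist(points[chosen[i]], points[chosen[j]])
            energy += 1.0 / max(d, 1e-9)  # guard against coincident points
    return energy


def anneal_subset(points, k, steps=5000, t_start=1.0, t_end=1e-3, seed=0):
    """Pick k point indices with low Coulomb-like energy by simulated annealing."""
    rng = random.Random(seed)
    chosen = rng.sample(range(len(points)), k)
    energy = coulomb_energy(points, chosen)
    best, best_energy = list(chosen), energy
    for step in range(steps):
        # geometric cooling from t_start toward t_end
        t = t_start * (t_end / t_start) ** (step / steps)
        unselected = [i for i in range(len(points)) if i not in chosen]
        if not unselected:
            break
        # propose swapping one selected point for an unselected one
        trial = list(chosen)
        trial[rng.randrange(k)] = rng.choice(unselected)
        trial_energy = coulomb_energy(points, trial)
        delta = trial_energy - energy
        # Metropolis rule: always accept improvements, sometimes accept worse moves
        if delta <= 0 or rng.random() < math.exp(-delta / t):
            chosen, energy = trial, trial_energy
            if energy < best_energy:
                best, best_energy = list(chosen), energy
    return sorted(best), best_energy


# Hypothetical 2D descriptor space: four corner points plus a tight central cluster.
points = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0),
          (0.5, 0.5), (0.45, 0.5), (0.5, 0.55)]
subset, energy = anneal_subset(points, k=4)
print(subset, round(energy, 3))
```

A repulsive 1/distance term penalizes close pairs much more heavily than it rewards distant ones, which tends to favor uniformly spread selections over ones that merely maximize the largest distances.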
10:00 AM
16. STOCHASTIC ALGORITHMS FOR EXPLORING MOLECULAR DIVERSITY.
D.K. Agrafiotis, E.P. Jaeger, 3-Dimensional Pharmaceuticals, Inc.,
665 Stockton Drive, Suite 104, Exton, Pennsylvania 19341.
A common problem in the emerging field of combinatorial drug design is the
selection of an appropriate subset of compounds for chemical synthesis and
biological evaluation. In this paper, we introduce a new family of algorithms
that combine a stochastic search engine with a user-defined objective function
that encodes any desirable selection criterion. The method is applied to the
problem of maximizing molecular diversity using a novel diversity metric, and
the results are visualized using self-organizing neural networks and Sammon's
nonlinear mapping algorithm. Because the search method and the performance
metric are treated as independent entities, the method can be easily extended to
perform multi-objective selections in advanced experimental design systems.
10:30 AM
17. THE USE OF SUBTEMPLATES AND SUPERTEMPLATES IN DRUG
DISCOVERY. Charles J. Eyermann, John M. Geremia, The DuPont Merck
Pharmaceutical Company, Chemical and Physical Sciences, Experimental Station,
Wilmington, Delaware 19880-0500.
Significant research on which metrics are useful in analyzing molecular
diversity has been recently reported. The use of 2D fingerprints based on
chemical fragments or atom types and bond paths has emerged as a metric which
is computationally fast and has reasonable neighborhood properties.
Clustering based on these 2D fingerprints has therefore been used to help select
reagents for combinatorial chemistry as well as compounds to acquire from
external sources. While clusters based on 2D fingerprints are useful for
grouping compounds in a database they do have some limitations. Here we present
an alternative method for analyzing a molecular graph based on ring templates as
well as user-defined templates. Results and examples of how these templates can
be used in molecular diversity analysis and as a synthetic feasibility filter
will be presented.
11:00 AM
18. DIVERSITY MEASURES AND THEIR INTEGRATION WITH COMPANY
DATABASES. Colin Edge, Stephen H. Calvert, Darren G. Jones, SmithKline
Beecham Pharmaceuticals, New Frontiers Science Park, Third Avenue, Harlow,
Essex, CM19 5AW, United Kingdom.
Various measures of chemical diversity have been used in the design of
combinatorial arrays. Clustering methods of the theoretically derived properties
of molecules are discussed. These clusters have been integrated with corporate
databases, using an ISIS/BASE system, allowing the analysis of physicochemical
and mass-encoded diversity, the design of new chemical arrays, the ordering of
reagents and the automatic registration of the products in the corporate
database.
11:30 AM
19. HQSAR - A HIGHLY PREDICTIVE QSAR TECHNIQUE BASED ON
MOLECULAR HOLOGRAMS. Tad Hurst, Trevor Heritage, Tripos, Inc., 1699 So.
Hanley Rd., St. Louis, MO 63144.
CoMFA has proven to be an extremely valuable predictive tool for
computational chemists in the medicinal chemistry field. It is most valuable for
small sets of similar structures (10-50). The necessity of aligning the
structures prior to the use of CoMFA and its large memory requirements make it
difficult to use this technique for the larger datasets now being produced by
combinatorial chemistry and high-throughput screening. Hologram QSAR (HQSAR) is
a new technique which uses specialized fragment "fingerprints" called Molecular
Holograms as predictive variables for predicting biological activities. This
presentation will detail the results, which in many cases are as good as those
from CoMFA or better. Also discussed will be the generation and use of "Chiral Fingerprints"
in HQSAR.
MONDAY AFTERNOON, APRIL 14, 1997
SECTION A
1:15 PM
20. THE UNTIDY COLLECTION OF INFORMATION BY A JOURNALIST.
Rudy M. Baum, Chemical & Engineering News, 1155 16th St., NW,
Washington, DC 20036.
Journalists have different information requirements than other professionals
investigating a disease such as HIV/AIDS. For a journalist, a primary source is
the scientist who carried out research that is the focus of a story. The primary
scientific literature and secondary sources like newspaper accounts and review
articles are the background a journalist uses to prepare for interviews. In the
course of the HIV/AIDS epidemic, access to sources has evolved as the disease
became better known by the public and the general media.
1:45 PM
21. HIV/AIDS INFORMATION: MEETING DIVERSE NEEDS IN A
UNIVERSITY SCIENTIFIC RESEARCH LIBRARY. M. D. O'Rourke, Blommer
Science Library, Georgetown University, Washington, DC 20057.
University science libraries encounter multiple challenges trying to satisfy
the essential research demands of faculty and students for HIV/AIDS information.
Typical queries extend from immunochemical, pharmacological, and biochemical
aspects of HIV/AIDS to health education, policy matters, ethics, epidemiological
modeling, biostatistics, national and international health care economics, and
business. Adding to the complexity is the need to deliver quickly, in
coordination with other university libraries and research centers, evaluated,
refereed, current life sciences information in many formats, at multiple campus
and off-campus sites. Included in the presentation will be practical examples of
HIV/AIDS inquiries demonstrating the spectrum of services a fine scientific
research library must make available, plus suggestions for keeping abreast of
the HIV/AIDS literature and its delivery to researchers.
2:15 PM
22. TRENDS IN PATENT INFORMATION ON HIV/AIDS. Andrew H.
Berks, Wyeth Ayerst Research, Pearl River, NY 10965.
This talk will discuss trends in patenting behavior in the area of HIV and
AIDS treatment, diagnosis, and prevention. Also discussed will be patents
relevant to diseases common in AIDS patients, such as hepatitis B, Kaposi's
sarcoma, P. carinii. This talk will include breakouts by inventors, corporate
source, nature of the invention, and regional and national trends. A comparison
of patents and other literature as sources of alerting and competitive
information will be discussed.
2:45 PM
23. COMPUTERIZED HIV AND OI'S INFORMATION DATABASE SYSTEMS.
The Division of AIDS (DAIDS) supports research to identify and develop
therapeutic agents for the prevention and treatment of infections with the human
immunodeficiency virus (HIV) and associated opportunistic infections (OI's)
including Mycobacterium tuberculosis (TB). DAIDS has established computerized
databases containing chemical structures and biological data, designed to be
the most up-to-date information source on current research into experimental
therapies for HIV, OI's and TB. The databases are currently managed using
ISIS/Base and ISIS/Host software from MDL Information Systems, Inc. The
databases are used to: (1) acquire and prioritize compounds for biological
evaluation in contracts operated by DAIDS and avoid duplicate testing; (2)
track developments through literature surveillance and abstraction of data on
experimental chemotherapies of HIV and OI's; (3) serve as a knowledge base for
the NIAID and the scientific community; and (4) prepare reviews on
structure-activity relationships.
3:15 PM
24. HIV CHEMICAL INFORMATION: THERAPEUTIC AGENTS, TARGETS, AND
ACTIVE SITES. Charles E. Gragg, 1649 Glengarry Drive, Cary, NC
27511-5771.
Therapeutic agents for treatment of Human Immunodeficiency Virus (HIV)
infection are well known by the acronyms AZT, ddI, ddC, 3TC and d4T. Further
information on these Reverse Transcriptase (RT) inhibitors, and inhibitors of
Protease, Integrase, and other HIV enzyme targets expressed by the nine HIV
genes can be gathered by following the chemical information.
3:45 PM
25. AIDS INFORMATION: AN FDA PERSPECTIVE. Norman R.
Schmuff, FDA, Center for Drug Evaluation and Research, Office of
Pharmaceutical Science, Division of New Drug Chemistry-III, HFD-530, 5600
Fishers Lane, Rockville, MD 20857.
Regulatory sources of AIDS information will be discussed from the viewpoint
of the Food and Drug Administration. The range of information available through
the web, email, consultants and personal contact will be discussed. A general
description of the drug development process from pre-clinical through market
approval will be discussed as it relates to FDA requirements for AIDS drugs. A
general picture of IND and NDA requirements will be presented with an emphasis
on CMC (Chemistry, Manufacturing and Controls) requirements and available
guidance.
MONDAY EVENING, APRIL 14, 1997
SCI-MIX
26. MAKING AVAILABLE CHEMICALS AVAILABLE. Phil J.
McHale, Rebecca Franke, Gary Marquart, Richard Coad, Bryan Host,
MDL Information Systems Inc., 14600 Catalina Street, San Leandro, CA 94577.
Efficient chemical sourcing is becoming increasingly important as companies
strive to make their discovery processes more productive. The ability to find
suppliers for required compounds and to place orders in a streamlined manner can
assist in expediting chemical synthesis and biological screening. A searchable,
comprehensive, well-indexed, and detailed database of chemical suppliers'
catalogs with an integrated ordering function can offer significant advantages
over traditional means of finding suppliers for chemicals and placing orders,
and we will discuss how MDL's Available Chemicals Directory is being developed
for use as a chemical sourcing tool both on the Internet via WWW and within
companies' own corporate intranets.
27. CHEMCATS: COMMERCIALLY AVAILABLE CHEMICALS FROM CAS.
To support the chemist's role in finding and synthesizing new substances, CAS
began building a database of commercially-available chemicals, called CHEMCATS.
CHEMCATS is international in scope and includes peptides, proteins, catalysts,
polymers, inorganics, and organometallics, as well as organic chemicals. One of
the goals for CHEMCATS has been to provide a quality source of information that
is extremely current. Toward this end, CAS is building close relationships with
catalog suppliers, accepting input to CHEMCATS in standard formats, and building
the capability for weekly updates to the database. CAS is adding CAS Registry
Numbers to the substances in CHEMCATS and will continue to do so to ensure
seamless integration of CHEMCATS with other products and services. CHEMCATS is
currently available via STN and SciFinder. Future plans for CHEMCATS, including
new classes of substances to be added and new distribution mechanisms will be
discussed.
28. COMPETITIVE INTELLIGENCE VALUE OF PATENTS VS. OTHER
LITERATURE SOURCES FOR DRUG COMPOUNDS. Andrew H. Berks, Wyeth
Ayerst Research, Pearl River, NY 10965.
Quantifying the uniqueness of data in patents, compared to other
literature sources, is difficult. This poster will present anecdotal cases of
several drugs and outline the history of their development and public
disclosure. The intent is to demonstrate that important molecules are often
disclosed in patents months or years before their disclosure elsewhere. If true,
this would provide evidence that unique data is present in patents that is not
available, or is delayed, in other literature sources. Such unique data has
substantial competitive intelligence value.
TUESDAY MORNING, APRIL 15, 1997
SECTION A
9:00 AM
29. ONLINE PESTICIDE RESOURCES AT THE TOXICOLOGY AND
ENVIRONMENTAL HEALTH INFORMATION PROGRAM. G.F. Hazard, Jr., V. W. Hudson,
National Library of Medicine, Bethesda, Maryland, 20894.
The Toxicology and Environmental Health Information Program (TEHIP) of the
National Library of Medicine (NLM) provides online access to toxicological and
other biomedical data. The databases that deliver these data contain a great
deal of pesticide related information. Researchers may access these databases
through the NLM ELHILL and TOXNET online systems. Recently, retrieval mechanisms
based on the World Wide Web (WWW) have been developed to offer new methods of
access. In this presentation, statistics and major data elements of interest to
pesticide researchers will be highlighted. The resources discussed will range
from a chemical dictionary file (ChemID), to secondary literature files
(TOXLINE), to evaluated or peer-reviewed data files (HSDB and IRIS). The TEHIP WWW site (http://sis.nlm.nih.gov) will
also be discussed. It contains background information about these online files
and also points to other internet resources of potential utility to the
pesticide research community.
9:30 AM
30. NATIONAL PESTICIDE TELECOMMUNICATIONS NETWORK (NPTN).
T.L. Miller, J.J. Jenkins, and S.L. Wagner, Agricultural Chemistry
Extension, Oregon State University, Corvallis, Oregon 97331-6502.
NPTN is a unique national resource which provides pesticide-related
information to individuals in the United States, Puerto Rico, and the Virgin
Islands. Access to NPTN is via toll-free telephone number (800/858-7378) and via
the World Wide Web (http://www.ace.orst.edu/info/nptn/).
NPTN provides objective, science-based information about a wide variety of
pesticide-related subjects, including: pesticide products, recognition and
management of pesticide poisoning, toxicology, and environmental chemistry. NPTN
can: help callers interpret and understand toxicology and environmental
chemistry information about pesticides; access over 300 pesticide resources;
access pesticide label information; direct callers to help with pesticide incident
investigation, emergency human and animal treatment, safety practices, clean-up
and disposal, and laboratory analyses; and supply information on regulation of
pesticides in the United States.
10:00 AM
31. USDA PESTICIDE USE INFORMATION. V. B. Johnson,
Estimates Division, National Agricultural Statistics Service, 1400 Independence
Avenue, S.W., Room 5801-S, Washington, DC, 20250-2000.
National Agricultural Statistics Service (NASS), USDA, is responsible for
collecting on-farm chemical use information to support the evaluation of water
quality and food safety issues. The information is obtained through annual
grower surveys. Published data are available from a series of surveys targeting
selected field, vegetable and fruit crops in major producing States. Data are
available in printed as well as electronic form, and text of the published
reports can be accessed through the internet. The Economic Research Service
(ERS) provides analytical research on the impact of alternative pesticide
regulations, policies and practices. Research reports and databases are
available in a variety of formats.
10:30 AM
32. THE EXPOSURE MODELS LIBRARY: A SELECTION OF FATE,
TRANSPORT AND ECOLOGICAL MODELS FOR EXPOSURE ASSESSMENTS. Paul L. Zubkoff,
U. S. Environmental Protection Agency OPPTS/OPP/BPPD, Washington DC
20460-0001; Lawrence A. Burns, U. S. Environmental Protection Agency
ORD/NERL/ERD, Athens, GA 30605-2720; Richard Walentowicz, U.S. Environmental
Protection Agency, ORD/NCERQA, Washington DC 20460-0001.
The Exposure Models Library (EML) and Integrated Model Evaluation System
(IMES) CD-ROM demonstrates the use of the CD-ROM technology for distributing
exposure and assessment models, their documentation, a model selection system
for use in exposure and risk assessment, and other tools. The EML: more than 100
models with source codes, manuals and data are available for determining fate
and transport in various media: air, soil, groundwater and surface water. These
models were developed primarily for use by various EPA offices and other federal
agencies and are in the public domain. Selections of the PIRANHA (Pesticide
& Industrial chemical Risk ANalysis & Hazard Assessment) model will be
illustrated. The IMES: developed for exposure and risk assessors who use
environmental fate models, the Selection Module assists users in choosing
appropriate fate models from the user's response to queries of site
characteristics and model capabilities. The Validation Module retrieves
background information on models and their validation status. The Uncertainty
Module compares model predictions with field data sets and presents
information from the uncertainty studies using an easily understood graphical
relationship. An interface for easy access to the IMES and the model directories
indicates the amount of space required for downloading the model files, and
allows for viewing text files in the model documentation directories.
11:00 AM
33. ESTIMATING FATE AND EFFECTS WITH THE AQUATIC ECOSYSTEMS
MODEL, AQUATOX. Richard A. Park1, Jonathan S. Clough2, Marjorie Coombs
Wellman2, David A. Mauriello3.
[1Eco Modeling, 20302 Butterwick Way, Gaithersburg, MD 20879-4358, 2Office of
Science & Technology and 3Office of Pollution Prevention & Toxics, U.S.
Environmental Protection Agency, Washington DC 20460-0001.]
Toxicity and ecological effects data can be integrated with environmental
fate data to estimate trophic level responses (direct and indirect effects) of
pollutants on aquatic ecosystems with the user-friendly AQUATOX model. Effects
of toxic organics, mercury, nutrients, flow, sediments, and temperature are
represented for complex food webs in ponds, streams, reservoirs, and lakes.
Differential mortality, loss of prey, release of predation, disruptions in
nutrient cycling, anoxic conditions, and changes in turbidity and sedimentation
are all considered in this unique model. Steady-state and kinetic responses to
single, sporadic, and chronic releases of pollutants over both short and long
time periods are simulated with coupled differential equations. The risk
assessor can evaluate the possible impacts of a stressor on representative
ecosystems, or, because of the object-oriented code, one can easily add or
delete compartments and change species and site data to simulate site-specific
pollution problems. AQUATOX runs under WINDOWS™ with results presented in
tables and graphs or in several database formats for export.
TUESDAY MORNING, APRIL 15, 1997
SECTION B
34. THE CLEARINGHOUSE FOR CHEMICAL INFORMATION INSTRUCTIONAL
MATERIALS (CCIIM). Gary Wiggins, Chemistry Library, Indiana University,
Bloomington, Indiana 47405-4002.
Sponsored by the ACS Division of Chemical Information and the Special
Libraries Association Chemistry Division, the CCIIM contains a collection of
materials developed to assist in teaching about chemical information sources.
Many of the items available from the clearinghouse are on the web (http://www.indiana.edu/~cheminfo/
35. DESIGNING A WEB PAGE FOR FREQUENTLY ASKED REFERENCE
QUESTIONS. Ann D. Bolek, Science-Technology Library, The University of
Akron, Akron, OH 44325-3907.
As the World Wide Web becomes more and more popular, more opportunities exist
for creating personal home pages which can be useful to others. In the chemical
information arena, questions often arise about Chemical Abstracts, other
chemical databases, patents, and what useful information is available on the
Web. The author will provide examples of Web pages which answer these questions
in her setting. The Web pages, which can be updated frequently and easily,
replace handouts and many verbal explanations of years past.
36. SEARCHING DATABASES TO SUPPORT COLLECTION DEVELOPMENT
WORK: TIPS AND TECHNIQUES. Grace Baysinger, Swain Library of Chemistry
and Chemical Engineering, Stanford University, Stanford, CA 94305-5080.
This poster will include sources and search strategies to aid collection
development and management work. By performing database searches in several key
files, it is possible for chemistry and chemical engineering librarians to
better understand programmatic needs of their departments, identify newly
available resources that might be of interest to their users, and find out which
journals their faculty and graduate students publish in and which journals they
cite most frequently. While focusing primarily on printed resources, this poster
will also highlight selected resources for identifying electronic resources.
37. INTEGRATION OF ACTIVE LEARNING IN THE CHEMICAL INFORMATION
CLASSROOM. Nancy J. Butkovich, Physical Science Library, 230 Davey
Laboratory, Penn State University, University Park, PA 16802.
At Penn State, CHEM 400 (The Chemical Literature) has traditionally been
taught using the lecture format. During the last five years different methods of
instruction have been attempted with the goal of making the course more
intellectually appealing while preserving course content. As a result, several
lectures have evolved into active learning modules of different types,
thus allowing students to become partners in the learning process.
Coinciding with this has been an effort to go beyond the "how to use this
source" part of the course. Through the use of collaborative in-class exercises
and homework assignments, students are presented with questions which require
them to synthesize information rather than merely reciting it. Preliminary
assessment of these changes has been sufficiently satisfactory to warrant
revision of the rest of the course.
38. THE CHEMICAL INFORMATION INSTRUCTOR. Arleen N.
Somerville, Carlson Library, University of Rochester, Rochester, NY 14627-0236.
This Journal of Chemical Education column provides instructors with practical
information on a wide range of topics related to teaching information searching
skills. Contributions aim to provide information needed by instructors to
replicate similar experiences in their institutions. Information is provided in
print, on the Web, and via a moderated discussion forum on the Internet. Topics
include but are not limited to: courses, workshops, integration of instruction
into courses, integration of WWW sources into instruction, specific types of
information (e.g., inorganic), materials (e.g., patents), searches (e.g.,
citation indexing, structure), specific sources (e.g., Beilstein), ways to stay
current, and teaching techniques.
39. REDEFINING INFORMATION ACCESS: TOWARD A NEW TOPOLOGY OF
SCIENTIFIC AND TECHNICAL INFORMATION. Denise A. D. Bedford & Julie
Kwan, University Libraries, William P. Weber, Department of
Chemistry, University of Southern California, Los Angeles, CA 90089; Clifford
Bedford, Naval Air Warfare Center, U. S. Department of the Navy, China Lake, CA
93555.
Two core competencies of academic and technical libraries have been
collecting information and providing intellectual access to that information.
This paper builds upon a project commissioned by the Library of Congress to
develop a topology of scientific and technical information systems and a field
test of the topology at the University of Southern California. The topology
attempts to include all types of scientific and technical information, not just
those traditionally collected by libraries. Coupled with the increase of
nontraditional information resources available through the Internet, the
topology provides a new way to look at collection development and, through an
associated interface, redefines information access for the end user. This paper
focuses on an initial application of the topology and will illustrate how a
World Wide Web interface could be developed using examples pertinent to
chemists.
40. AN AMERICAN CHEMISTRY LIBRARIAN BECOMES AN EASTENDER.
A six-month job exchange took a chemistry librarian from the University of
Pennsylvania in Philadelphia to Queen Mary and Westfield College in the East End
of London. This poster session will outline the differences and similarities
between libraries on the two sides of the Atlantic.
41. FROM 300 BAUD TO STN EASY: FAMILIARIZING CHEMISTRY
STUDENTS WITH ON-LINE LITERATURE SEARCHING FROM 1980 TO 1996 AT A CANADIAN
UNDERGRADUATE UNIVERSITY. Brian M. Lynch, Department of Chemistry, St.
Francis Xavier University, P.O. Box 5000, Antigonish, Nova Scotia B2G 2W5,
Canada.
Over the past 17 years, the Department of Chemistry of St. Francis
Xavier University has offered informal and formal courses aimed at developing
student skills in accessing primary and secondary chemical literature sources.
Many graduate students have referred to such course exposure as very valuable
preparation for graduate school, and have acted as quasi-missionaries in
spreading the digital word at their chosen doctoral schools. However, only about
10% of Canadian University Chemistry Departments offer similar course exposure
taught by chemists, rather than by librarians. My poster will illustrate the
current course status and will provide details of the form of problem
assignments designed to aid in student research.
42. ACADEMIC LIBRARIES IN TRANSITION. Susanne J.
Redalje, Chemistry Library, University of Washington, Seattle, WA
98195-1700.
Libraries, as always, stand on the edge of past, present, and future,
generally an exciting but dangerous place to be. The future clearly includes
electronic sources but will also include paper and plastic and all the other
forms information has come in over the years. The University of Washington
Libraries is involved in several activities that seek to help its users survive
and thrive in this exciting and changing environment. These include UWired, which
helps freshmen and others get a head start in the electronic world of today and
tomorrow; WILLOW™, a graphical user interface; article delivery projects; locally
mounted and CD-ROM LAN databases; and traditional bibliographic instruction.
43. CROSSFIRE COMES TO DUKE. Kitty Porter, Duke
University Chemistry Library, Durham, NC 27708-0355.
In October, Duke Chemistry Library traded in Beilstein Current Facts in
Chemistry, the Gmelin Handbook, a few other reference sources, and several
journals for Crossfire-Minerva. Although the spread of its use has been
hampered by the requirement for 20MB RAM for successful operation, it is gaining
satisfied users among organic research groups, physical chemistry lab students,
and librarians in quest of answers to questions both virtual and real.
44. MAKING DO: CREATING DOCUMENTATION WITH COMMON TOOLS.
Andrea Twiss-Brooks, The John Crerar Library and Chemistry Library, The
University of Chicago, Chicago, IL 60637.
Desktop publication packages and graphic design programs are plentiful, but
not always inexpensive. Potential authors with no budget for specialized
software packages need not despair. It is possible to create useful,
professional documentation using a handful of common desktop applications, plus
a few shareware or freeware programs. This poster will describe the creation of
a series of user's guides at The University of Chicago Library intended for
users of the Beilstein CrossFire database searching system. The use of Microsoft
Windows, Microsoft Windows Paintbrush, LView for Windows, and FTP applications
will be described. (A web version of one of these guides may be found at http://www.lib.uchicago.edu/~atbrooks/beilfact.html)
TUESDAY AFTERNOON, APRIL 15, 1997
SECTION A
2:00 PM
45. EXTOXNET: AN INTERNET PESTICIDE INFORMATION RESOURCE FOR
NON-SPECIALISTS. Michael A. Kamrin, Michigan State University, East
Lansing, MI 48824; Arthur Craigmill, University of California, Davis, CA 95616;
Terry Miller and Jeffrey Jenkins, Oregon State University, Corvallis, OR 97331;
Donald Rutz, Cornell University, Ithaca, NY 14853.
EXTOXNET is a collaborative multi-university program that is designed to
provide information about the toxicology and environmental chemistry of
environmental contaminants in lay language. Initial efforts of this consortium
focused on pesticides and led to the development of profiles for almost 200
active ingredients and short summaries of important toxicology and environmental
chemistry concepts. The profiles contain information about human and wildlife
toxicology, environmental fate, physical properties and regulations governing
each pesticide. This information has been published in hard copy and on the EXTOXNET WWW site. This site
receives over 20,000 hits/month and is linked to a large number of other sites.
The division of information into modular units describing individual concepts
and chemicals will form the basis of an expansion of EXTOXNET capabilities to
other chemicals and new issues.
2:30 PM
46. TEAM BUILDING ON THE WEB: AN OVERVIEW OF COUPLING
COST-BENEFIT AND ENVIRONMENTAL RISK ASSESSMENT MODELS. F. R. Hall, Laboratory
for Pest Control Application Technology, The Ohio State University, Wooster, OH
44691.
Improved risk information content and clarity for pesticide users and policy
makers should enhance planning skills, benefit the policy and decision-making
agenda, help defuse the current climate of crisis surrounding the use of
pesticides and aid the transition towards more efficient pesticide use patterns.
Pesticide use and delivery information from scarce and declining global
multidisciplinary sources represents complex and expensive information. New web
site linkages of these scarce global resources, which promote research
collaboration, data merging/sharing and the building of research teams using web
sites, lists, discussion groups and overall DB sharing, are discussed. This
overview also discusses the range from simple lists and DBs to the more complex
cost-benefit and environmental DS models. Coupling this wide array of
interacting data into a meaningful DS format is a critical step for ease of
strategy assessment as well as tactical implementation and the building of
successful global partnerships to enhance the use efficiency of crop protection
agents in agriculture.
3:00 PM
47. FLOW OF INFORMATION IN AND OUT OF A UNIVERSITY PESTICIDE
INFORMATION CENTER. Sheila D. Merrigan, Paul B. Baker, Pesticide
Information and Training Office, University of Arizona, Tucson, AZ 85719.
The Pesticide Information and Training Office (PITO) at the University of
Arizona provides information and education on pesticide-related issues to the
public, the university community, and government agencies. Information flows
into the office from many sources, including PITO staff, faculty in several
departments, federal and state government agencies, and chemical companies.
Information flows out of the office in written and verbal formats including
reports, bulletins, brochures, training manuals, training workshops, fairs, and
telephone conversations. The PITO information center was created to help
facilitate this flow of information. This paper will discuss where and how the
information center obtains information, how the information is managed, and how
and to whom information is distributed.
3:30 PM
48. THE NATIONAL PESTICIDE INFORMATION RETRIEVAL SYSTEM.
The National Pesticide Information Retrieval System (NPIRS) is a collection
of six pesticide-related databases available through subscription to the Center
for Environmental and Regulatory Information Systems (CERIS) at Purdue
University. The Pesticide Product database contains label information obtained
from the U.S. Environmental Protection Agency on over 88,000 active, canceled,
transferred, and suspended pesticide products. In addition, state pesticide
registration data is available from 39 states. The Pesticide Document Management
System (PDMS) contains bibliographic citations of documents submitted to EPA in
support of pesticide registration. Other databases include EPA Chemical Fact
Sheets, pesticide residue tolerance information, C&P Press Material Safety
Data Sheets, and a Federal Register archive of over 115,000 documents.
4:00 PM
49. USING DATA FROM THE NATIONAL PESTICIDE INFORMATION
RETRIEVAL SYSTEM (NPIRS) TO ASSIST IN PESTICIDE RESEARCH. Susan E.
Branchick,
The Pesticide Product Database and the Pesticide Document Management System
Database (PDMS), available through NPIRS, can be used to assist in agrochemical
research and registration. The query methods and various output options will be
looked at in detail. Additionally, examples will be presented on how the
information can be used in designing studies for pesticide registration,
locating suppliers of technical material, identifying competitive products,
determining a product's registration status and for monitoring a competitor's
activities.
WEDNESDAY MORNING, APRIL 16, 1997
SECTION A
8:30 AM
50. A COMPREHENSIVE SOFTWARE SYSTEM FOR MANAGING NMR DATA.
V.L. Shilay, D.F. Mitushev, A. A. Petrauskas, Advanced Chemistry
Development Inc., 141 Adelaide St. West, Suite 1501, Toronto, Ontario, M5H 3L5,
Canada.
Advanced Chemistry Development, Inc. has developed a comprehensive software
system for managing NMR data. It includes the following: 1) importing raw FID
NMR data from spectrometers, 2) data processing (FT, phasing, baseline
correction, peak picking), 3) converting to tables of fully assigned chemical
shifts and coupling constants, 4) updating a database which can be searched
according to substructure, formula, MW, chemical shifts and coupling constants,
5) accurate prediction of new spectra based on the previously accumulated
experimental data, 6) Web-based Java applets and plug-ins allowing access to the
system via a company intranet. The purpose of this system is to provide a complete
corporate solution for collecting, interpreting, searching, predicting and
exchanging NMR data, all fully automated and easily customizable.
9:00 AM
51. SPECTRUM PREDICTION IN C-13 NMR SPECTROSCOPY: THE
IMPORTANCE OF STEREOCHEMICAL INFORMATION. W. Robien, Institute of Organic
Chemistry, University of Vienna, A-1090, Austria.
Spectrum prediction of C-13 NMR spectra is a versatile tool during structure
elucidation. A wide range of methods including increment calculation, HOSE-code
derived correlation tables and neural network technology has been described in
the literature during the last four decades. The basic concepts of these
algorithms will be discussed using examples from natural product chemistry. Most
of the methods are restricted to a two-dimensional model of structure
description neglecting stereochemical features which contribute substantially to
chemical shift values. Our approach of deriving steric interactions from a
2.5-dimensional structure representation (up/down-bonds) and utilizing this
information during spectrum prediction will be shown. Some useful applications
based on this algorithm and also some statistical evaluations derived from our
database holding 116,000 C-13 NMR-spectra will be presented.
9:30 AM
52. A 3D APPROACH TO STRUCTURE - INFRARED SPECTRA SIMULATION
AND ANALYSIS. J. Gasteiger, P. Selzer, L. Steinhauer, V. Steinhauer,
Computer-Chemie-Centrum, Universitaet Erlangen-Nuernberg, Naegelsbachstr. 25,
D-91052 Erlangen, Germany.
An empirical approach to the modeling of the relationships between structure
and infrared spectra is highly attractive, particularly when large molecules or
large datasets have to be treated. We will show that powerful neural network
techniques such as a counterpropagation network can model the relationships
between structure and IR spectra. Central to this approach is a transformation
of the 3D structure of molecules to a novel structure code (1). This approach
allows the simulation of IR-spectra over the entire frequency range as shown
with a variety of examples. The counterpropagation network can also be used in
reverse mode; by input of an infrared spectrum the 3D structure of a molecule
can be predicted. The first examples of 3D structures derived directly from the
IR spectrum will be given.
(1) J.H. Schuur, P. Selzer, J. Gasteiger, J. Chem. Inf. Comput. Sci. 36, 334
(1996)
10:00 AM
53. MASS SPECTRA INTERPRETATION BY CHEMOMETRIC METHODS TO
SUPPORT SYSTEMATIC STRUCTURE ELUCIDATION. K. Varmuza, Dept. of
Chemometrics, Technical University Vienna, Getreidemarkt 9/152, A-1060 Vienna,
Austria.
Computer-assisted structure elucidation of organic compounds is mainly based
on NMR data. However, in many analytical problems NMR data cannot be measured
because concentrations are too low or the mixtures are too complex. In these
cases MS and IR have to be used for the identification of unknowns. Chemometric
classifiers have been developed for low-resolution mass spectra to recognize
presence or absence
of substructures in a molecule. Classification is based on numerical
transformation of spectral data and multivariate discriminant methods.
Classification results are transformed to a good-list and bad-list for direct
use by isomer generator programs. Examples demonstrate that mass spectral
classifiers often provide complementary structural information to other
spectroscopic data. Cluster analysis of the structure candidates gives insight
into their structural diversity.
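The step from classifier results to a good-list and bad-list can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function name, the substructure labels, the score values, and both cutoffs are hypothetical, and it simply assumes each classifier returns a score between 0 and 1 for the presence of one substructure.

```python
from typing import Dict, List, Tuple


def build_lists(scores: Dict[str, float],
                present_cutoff: float = 0.9,
                absent_cutoff: float = 0.1) -> Tuple[List[str], List[str]]:
    """Split classifier scores into a good-list and a bad-list.

    Hypothetical convention: a score near 1 means the substructure is
    probably present, a score near 0 means it is probably absent.
    Scores between the two cutoffs are treated as "no decision".
    """
    goodlist = sorted(s for s, p in scores.items() if p >= present_cutoff)
    badlist = sorted(s for s, p in scores.items() if p <= absent_cutoff)
    return goodlist, badlist


# Illustrative (made-up) classifier outputs for one unknown spectrum.
scores = {"phenyl": 0.97, "carbonyl": 0.55, "nitro": 0.03, "methoxy": 0.92}
goodlist, badlist = build_lists(scores)
print("good-list:", goodlist)
print("bad-list:", badlist)
```

Leaving uncertain answers out of both lists reflects one possible design choice: a structure generator constrained by a wrong entry would exclude the correct candidate, so only confident answers are passed on.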
10:30 AM
54. ANALYTICAL INFORMATION REQUIREMENTS IN COMBINATORIAL
CHEMISTRY. William L. Fitch, Affymax Research Institute, 3410
Central Expressway, Santa Clara, CA 95051.
The advent of combinatorial methods of molecular discovery places new demands
on analytical data collection and information handling. New high throughput
spectroscopic and chromatographic techniques are being developed and only the
most automatable analytical measurements will be made in this environment. There
are unmet needs for new methods of automated spectral interpretation and
information display.
11:00 AM
55. COMBINATORIAL CHEMISTRY - A NEW CHALLENGE FOR THE
SPECTROSCOPIC LABORATORY. Reinhard Neudert, Chemical Concepts,
Boschstrasse 12, D-69469, Weinheim, Germany.
About two years ago, chemical research groups began to develop a modified
version of the classical approach to lead discovery, that is
synthesis-screening-identification-optimization. The new approach, summarized
under the name combinatorial chemistry, uses automation techniques to
accelerate the synthesis and separation of candidates in lead discovery. Since
the structure of the compounds is known, the process is actually reduced to one
of verification of structure proposals. The classical way to verify involves
human resources to a large extent. The new synthesis techniques result in the
generation of tremendous numbers of new compounds. To handle the analytical task
at reasonable costs, the following steps in the laboratory need to be
rationalized:
- Sample preparation and data acquisition
- Data management
- Archiving and automatic generation of knowledge bases
- Verification
11:30 AM
56. ACHIEVING DATABASE QUALITY - WITH SPECIAL EMPHASIS ON THE
DELIVERY OF CHEMICAL INFORMATION. Dorothy M. Blakeslee, John R. Rumble,
National Institute of Standards and Technology, 820 West Diamond Avenue, Room
113, Gaithersburg, Maryland 20899.
The Standard Reference Data Program (SRDP) at the National Institute of
Standards and Technology (NIST) has long maintained a program of data
evaluation, and when computer databases began to be developed, the maintenance
of quality received new attention. Today, computer databases are the primary
distribution mechanism for NIST standard reference data, but it is a nontrivial
task to make sure that what users find when using those databases is what the
data evaluator intended. To that end, NIST SRDP has established a careful
program of quality control to ensure NIST standard reference databases are of
the highest quality. This program will be described in detail. Special emphasis
will be placed on the maintenance of quality when delivering chemical
information both in PC-based databases and over the Internet.
11:50 AM
57. DATA MINING FOR LEAD IDENTIFICATION AND EXPLOSION.
Sheila Ash, Scott Gothe, Tripos, Inc., 1699 So. Hanley Rd., St. Louis, MO
63144.
Successful drug design programs need to ensure a continued flow of new
lead candidates. Data mining techniques enable drug designers to capitalize on
the various data sources available to them. This paper exemplifies the use of
these techniques and sources for lead identification and explosion.
12:10 PM
58. COMBINATORIAL CHEMISTRY - ITS STRUCTURE, RELATIONSHIPS,
PERFORMANCE AND OUTLOOK. W. Gregg Wilcove, Wilcove Associates, Inc., 14
Medford Road, Morris Plains, NJ 07950.
The science of combinatorial chemistry can be defined, measured, and
characterized to reveal how it is organizing itself. We will present a
structural analysis of its research universe that shows (1) its defining
characteristics, (2) the relationships between applications, (3) where the
momentum is, and (4) the emerging work that will determine its future direction,
emphasis, and impact.
WEDNESDAY AFTERNOON, APRIL 16, 1997
SECTION A
1:30 PM
59. COMBINATORIAL SYNTHETIC DESIGN. Paul A. Bartlett,
Matthew A. Marx, Anne-Laure Grillot, Samuel J. Gillett, Mark R. Spaller, Eric D.
Turtle, Department of Chemistry, University of California, Berkeley, CA
94720-1460.
The simultaneous synthesis of a library of compounds must be carried out
within a different set of constraints than a synthesis directed to a single
target. Restrictions on isolation or purification steps, and the differing
strategies for identification of individual structures after screening a
library, obviate many conventional approaches to synthesis. Sequences
appropriate for parallel or combinatorial syntheses begin with starting
materials that are available with diverse functionality; they are relatively
short, and, in many instances, they are carried out on solid support. It is also
generally the case that a single variable is introduced in any step. In addition
to these criteria, it is our own prejudice that cyclic or conformationally
constrained molecules offer the most interesting targets for development of
library syntheses and as screening leads. Synthetic sequences devised on the
basis of these principles will be described.
2:00 PM
60. INFORMATION REQUIREMENTS FOR PLANNING A COMPOUND
LIBRARY. Guenter Grethe, Maurizio Bronzetti, MDL Information
Systems, Inc., 14600 Catalina Street, San Leandro, CA 94577.
Careful planning is the most critical step in the process of generating
libraries of small organic molecules. After identifying a biological target and
generating a potential scaffold for the desired library, the possible metabolic
fate of compounds to be synthesized has to be considered and viable synthetic
pathways amenable to combinatorial synthesis have to be developed. Facile access
to relevant information about synthetic methodologies in solution as well as on
solid support is essential. The planned synthesis and the desired diversity of
the library influence the selection of starting materials. Effective selection
from available sources is critical. We will demonstrate the planning process and
the efficient use of information resources for the generation of a library of
small organic compounds.
2:30 PM
61. MINING A REACTION DATABASE TO SUPPORT COMBINATORIAL
SYNTHESIS. Glenn J. Myatt, Paul E. Blower, Jr., Mike Petras, Chemical
Abstracts Service, 2540 Olentangy River Road, P.O. Box 3012, Columbus, OH
43210-0012.
CAS has analyzed the CASREACT reaction database in terms of functional group
transformations. The results are presented in tabular form which a chemist can
browse to find promising reactions for combinatorial synthesis of small
organics. This paper will give details of the analysis and describe a computer
interface that provides tools to navigate the analysis tables and reaction sets.
3:00 PM
62. INFORMATION MANAGEMENT FOR AUTOMATED PARALLEL SYNTHESIS.
David Nickell, S.H. DeWitt, E.M. Hogan, Diversomer Technologies, Inc.,
and Parke-Davis Pharmaceutical Research, Division of Warner-Lambert Co., Ann
Arbor, MI 48105.
The ability to rapidly screen many thousands of chemical entities in high
throughput biological assays has raised the issue as to how the necessary number
of compounds will be obtained for testing. The problem has been approached from
a number of directions by different organizations. Managing the information
necessary to automate these systems at the enterprise, laboratory, or desktop
levels will require creative solutions which may involve paradigm shifts in the
way chemical synthesis is performed in the laboratory. The development of fully
integrated systems to address the growing demand for automated organic synthesis
is an area which is receiving much attention. The first generation information
systems will support individual automated synthetic workstations. Later systems
will include interfaces to pick-and-place robots, proprietary reaction
equipment, purification and analysis tools as well as existing databases.
Modularly designed information systems will have the capability to grow with the
expanding needs of the automated synthesizers.
3:30 PM
63. DATA MANAGEMENT FOR COMBINATORIAL TECHNOLOGIES AT
SELECTIDE/HMR, INC. R.F.D. Stansfield, J.D. Heddles, C.V. Summers
and K.F. Wertman, Selectide Corporation, Hoechst Marion Roussel, Inc., 1580 East
Hanley Blvd., Tucson, AZ 85737-9525.
Combinatorial technologies at Selectide are applied to synthesis, analysis
and screening for biological activity. Automation in synthesis and in screening
is key. The data management requirements cover the gamut of a traditional
pharmaceutical research organization - product management, test and results
management, and decision support - but with an additional, combinatorial slant.
Our approach is based on an analysis of the lead generation and optimization
processes and a judicious selection of commercial software tools. We are
currently building applications and databases which provide integrated views of
information for chemists and biologists. The integration is done across
different products and technologies for managing, respectively, relational
(alphanumeric) data, discrete structures and combinatorial libraries.
4:00 PM
64. REACTION-CENTERED INFORMATICS FOR COMBINATORIAL CHEMISTRY.
David Chapman, Afferent Systems, Inc., 442A Collingwood St., San
Francisco CA 94114.
Combinatorial libraries may be best represented as a series of steps, each of
which either distributes a set of reactants or performs a reaction. This
representation is used advantageously both to generate product structure
databases, and to drive synthetic instrumentation, in a tightly integrated
system: Myriad. Database generation proceeds by "virtual chemistry" simulation
of the actual synthesis. Virtual chemistry generates all and only the expected
products of chemistries such as cycloadditions, which the familiar "generic
structure" approach cannot. Because virtual reaction vessels correspond with
physical ones, including physical product in the database is easy. Myriad
includes a high-throughput, instrument-independent synthesis controller, which
transforms a graphical library definition (consisting simply of sets of
reactants and a series of reactions) into the thousands of robot actions needed
to actually make the library. It can increase instrument throughput several-fold
by interleaving multiple product batches.
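The idea of a library as a series of steps, each of which either distributes reactants or performs a reaction, can be sketched in a few lines of Python. This is only an illustration under stated assumptions and not the Myriad system: the class names `Distribute` and `React`, the function `enumerate_library`, and the use of plain text labels in place of chemical structures are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple, Union


@dataclass
class Distribute:
    """Hypothetical step: split every vessel, adding one reactant to each copy."""
    reactants: List[str]


@dataclass
class React:
    """Hypothetical step: turn a vessel's contents plus added reactant into a product."""
    name: str
    combine: Callable[[str, Optional[str]], str]


Step = Union[Distribute, React]


def enumerate_library(steps: List[Step]) -> List[str]:
    """Simulate the synthesis on virtual vessels and return the product labels."""
    # Each virtual vessel holds (current product label, reactant awaiting reaction).
    vessels: List[Tuple[str, Optional[str]]] = [("", None)]
    for step in steps:
        if isinstance(step, Distribute):
            vessels = [(cur, r) for cur, _ in vessels for r in step.reactants]
        else:
            vessels = [(step.combine(cur, r), None) for cur, r in vessels]
    return [cur for cur, _ in vessels]


# Toy two-step library: labels stand in for real structures.
steps: List[Step] = [
    Distribute(["A1", "A2"]),
    React("load", lambda cur, r: r),
    Distribute(["B1", "B2", "B3"]),
    React("couple", lambda cur, r: f"{cur}-{r}"),
]
library = enumerate_library(steps)
print(library)
```

Because the list of virtual vessels is updated in the same order as the steps, each entry in the result corresponds to one vessel, which mirrors the correspondence between virtual and physical vessels that the abstract mentions.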
4:30 PM
65. SEARCHING FOR PATENTS IN DERWENT'S WORLD PATENTS INDEX ON
COMBINATORIAL CHEMISTRY PROCESSES AND PRODUCTS. Donald W. Walter, Derwent
Information, Suite 525, McLean, VA 22102.
How can you search for patents on a particular combinatorial chemistry
product or process when the target patent may involve a mixture of millions of
amino acid, nucleotide or other sequences? Derwent's indexing of chemicals
provides a way to focus on the particular combinatorial products of interest,
and the flexibility to search narrow questions or broad ones. This paper will
present Derwent's indexing practice and philosophy on the subject, and
illustrate some techniques for searching for patents involving combinatorial
chemistry.
THURSDAY MORNING, APRIL 17, 1997
SECTION A
9:00 AM
66. EXPERIENCE WITH CHEMSPACE™: FINDING ONE COMPOUND AMONG A
BILLION. Richard D. Cramer, David E. Patterson, Robert C. Glen, Allan M.
Ferguson, Michael Lawless, Peter Hecht, Tripos, Inc. 1699 So. Hanley Rd., St.
Louis, MO 63144.
New informatics paradigms are required to exploit the orders of magnitude
increase in the accumulation of qualitative SAR data. Our focus on molecular
similarity, using several metrics "validated" as predictive of similar
biological properties, has yielded, among other things, techniques for searching
large combinatorial "virtual libraries" at rates over 100,000,000
structures/hour. This capability suggested the possibility of a "universal
database," containing "all" structures available in a few steps from
commercially available reagents. Structures identified within this database
using validated similarity metrics would be the most logical candidates for
following up a newly discovered hit from a random screening program. Some
results from the first nine months of experience with such a database will be
presented and discussed.
9:30 AM
67. NEW LEADS BY SELECTIVE SCREENING OF COMPOUNDS FROM LARGE
DATABASES. Alberto Gobbi, Dieter Poppinger, Bernhard Rohde, Ciba-Geigy
AG, R-1045.1.20, Postfach, CH-4002 Basel, Switzerland.
At Ciba, a large database of over 500,000 commercially available compounds
was built. Several methods for selecting compounds from this database for
screening have been compared using an existing dataset that includes biological
activity. Using a genetic algorithm, many of the most active compounds were
found by screening only 1,200 out of 76,000 compounds.
10:00 AM
68. SCAM: STATISTICAL CLASSIFICATION OF ACTIVITIES OF
MOLECULES USING RECURSIVE PARTITIONING. Andrew Rusinko III, Mark W.
Farmen, Christophe G. Lambert, and S. Stanley Young, Research Information
Resources, Glaxo Wellcome Inc., Research Triangle Park, NC 27709.
Combinatorial chemistry and high-throughput screening have revolutionized the
drug discovery process in the pharmaceutical industry. Large numbers of
structures and vast quantities of biological assay data are rapidly being
accumulated which overwhelm traditional chemical/biological analysis
technologies. Recursive partitioning is a method for statistically determining
the rules that classify objects into similar categories or, in this case,
structures into groups of active or inactive molecules. SCAM is a computer
program designed to make use of this methodology in an extremely efficient
manner. Rules explaining biological data for thousands of compounds can be
determined in a matter of a few CPU minutes. A dataset of 1,650 monoamine
oxidase inhibitors was used in this investigation. Substructural rules that lead
to a general classification of structures were obtained and compared to
clustering of structures via their aggregate chemical descriptors alone.
Advantages and disadvantages of this methodology are presented.
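As an illustration of recursive partitioning in general, the sketch below splits a set of molecules on the presence or absence of substructures. It is not the SCAM program: the Gini impurity criterion, the function names, and the toy data are assumptions made for this example, and the abstract does not say which splitting statistic SCAM uses.

```python
from typing import Dict, List, Optional, Tuple

# One row per molecule: (substructure name -> present?, is_active)
Row = Tuple[Dict[str, bool], bool]


def gini(rows: List[Row]) -> float:
    """Gini impurity of the active/inactive labels in a group."""
    if not rows:
        return 0.0
    p = sum(1 for _, active in rows if active) / len(rows)
    return 2.0 * p * (1.0 - p)


def best_split(rows: List[Row], features: List[str]) -> Optional[str]:
    """Return the substructure whose presence/absence split lowers impurity most."""
    best, best_score = None, gini(rows)
    for f in features:
        yes = [r for r in rows if r[0][f]]
        no = [r for r in rows if not r[0][f]]
        if not yes or not no:
            continue  # this feature does not separate the group
        score = (len(yes) * gini(yes) + len(no) * gini(no)) / len(rows)
        if score < best_score - 1e-12:
            best, best_score = f, score
    return best


def partition(rows: List[Row], features: List[str],
              depth: int = 0, max_depth: int = 3) -> dict:
    """Recursively split a non-empty group until no split helps or max_depth is hit."""
    feature = best_split(rows, features) if depth < max_depth else None
    if feature is None:
        n_active = sum(1 for _, active in rows if active)
        return {"n": len(rows), "active_fraction": n_active / len(rows)}
    yes = [r for r in rows if r[0][feature]]
    no = [r for r in rows if not r[0][feature]]
    return {
        "split": feature,
        "present": partition(yes, features, depth + 1, max_depth),
        "absent": partition(no, features, depth + 1, max_depth),
    }


# Made-up toy data, not taken from the monoamine oxidase dataset.
toy_rows: List[Row] = [
    ({"amine": True, "ring": True}, True),
    ({"amine": True, "ring": False}, True),
    ({"amine": False, "ring": True}, False),
    ({"amine": False, "ring": False}, False),
]
tree = partition(toy_rows, ["amine", "ring"])
```

Each internal node of the resulting tree reads as a substructural rule ("amine present" versus "amine absent"), and each leaf reports the fraction of active molecules in that group.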
10:30 AM
69. DATA MINING USING PROBABILISTIC STRUCTURE ANALYSIS.
James A. Morrell, Monsanto Co., GG3K, 700 Chesterfield Village
Pkwy, St. Louis, MO 63198.
The presentation describes a technique we are developing for utilizing data
in either commercial or proprietary chemical information databases to determine
a probabilistic measure of how various functional groups impact biological
activity and specificity. A precursor of the technique, which has traditionally
been used for capturing an organization's collective knowledge of toxicological
activity, has been extended to provide an additional tool for combinatorial
library design. With regard to combinatorial library building, the technique
has potential as either a pre-processing (building block selection) or
post-processing (library selection) step.
11:00 AM
70. EXPERIMENTAL TECHNIQUES FOR THE DATA MINING OF CAS DATA
FOR SUBSTANCE-USE RELATIONSHIPS. W. Fisanick, T. E. Bangert, Research
Unit, Chemical Abstracts Service, 2540 Olentangy River Road, P. O. Box 3012,
Columbus, OH 43210.
CAS is experimenting with a variety of techniques for the data mining of
Registry and CA File data for substance-use relationships. Included are enhanced
similarity and clustering techniques for structure data and techniques that
extract, summarize, and infer substance use information from text data. The
structure similarity capabilities incorporate a composite or class similarity
search for a set of substances. The structure clustering techniques include a
Jarvis-Patrick method that has been enhanced with a screening mechanism to
improve the compute performance and two cluster relationship techniques that
allow for related cluster/substance navigation and cluster overlaps. A
partitioning technique based on common substructures is also used. The common
substructures are identified in a series of similarity and substructure
searches. Of significance in the text handling is an initial version of an
inference technique that establishes the substance-to-use correlation. This
paper will discuss and illustrate selected techniques and their experimental use
for a particular type of use such as a class of bioactivity.
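For readers unfamiliar with the Jarvis-Patrick method mentioned above, the following is a minimal sketch of the basic, unenhanced algorithm. It does not include the CAS screening mechanism or cluster relationship techniques, which the abstract does not describe. The Tanimoto similarity on fingerprint bit sets, the parameter names `k` and `k_min`, and the toy fingerprints are assumptions for illustration.

```python
from typing import Dict, List, Set


def tanimoto(a: Set[int], b: Set[int]) -> float:
    """Tanimoto similarity between two fingerprints given as sets of on-bits."""
    union = len(a | b)
    return len(a & b) / union if union else 1.0


def nearest_neighbors(fps: Dict[str, Set[int]], k: int) -> Dict[str, Set[str]]:
    """For each structure, the names of its k most similar other structures."""
    names = list(fps)
    nn: Dict[str, Set[str]] = {}
    for x in names:
        # Sort by descending similarity; ties are broken by name for determinism.
        others = sorted((y for y in names if y != x),
                        key=lambda y: (-tanimoto(fps[x], fps[y]), y))
        nn[x] = set(others[:k])
    return nn


def jarvis_patrick(fps: Dict[str, Set[int]], k: int, k_min: int) -> List[List[str]]:
    """Join two structures when each is in the other's neighbor list and
    they share at least k_min neighbors; clusters are the connected groups."""
    nn = nearest_neighbors(fps, k)
    parent = {x: x for x in fps}

    def find(x: str) -> str:
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    names = list(fps)
    for i, x in enumerate(names):
        for y in names[i + 1:]:
            if x in nn[y] and y in nn[x] and len(nn[x] & nn[y]) >= k_min:
                parent[find(x)] = find(y)

    clusters: Dict[str, List[str]] = {}
    for x in names:
        clusters.setdefault(find(x), []).append(x)
    return sorted(sorted(members) for members in clusters.values())


# Made-up fingerprints for six hypothetical structures.
toy_fps = {
    "a": {1, 2, 3}, "b": {1, 2, 4}, "c": {1, 2, 3, 4},
    "d": {7, 8, 9}, "e": {7, 8, 10}, "f": {7, 8, 9, 10},
}
```

With `k=2` and `k_min=1`, the toy data fall into two clusters of three structures each. Raising `k_min` to 2 leaves every structure in its own cluster, which shows how strongly the result depends on the two parameters.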
11:30 AM
71. MINESET: AN INTERACTIVE DATA ANALYSIS AND
EXPLORATION TOOLSET. Mario Schkolnick, Silicon Graphics Computer
Systems, Mountain View, CA 94043-1389.
MineSet is a new data mining and visualization product from Silicon Graphics.
By integrating the functions of data access, data transformation, data analysis
and presentation of results, the task of mining data is supported in a very
interactive way. This talk will discuss the organization of the product and will
demonstrate its main features.
Mohamed E. Nasr, Division of AIDS, National Institute of
Allergy and Infectious Diseases, NIH, Bethesda, MD 20852.
Roger J. Schenck, CAS, 2540 Olentangy River Road, P. O. Box 3012,
Columbus, OH 43210-0012.
cciimnro.html).
In addition to locally developed items, the lists contain references to
instructional materials developed by the producers of chemical information
tools, many of which are supplied free by the producer. A selection of
representative materials and search techniques will be presented in the paper.
Carol Carr, Chemistry Library, University of Pennsylvania,
Philadelphia, PA 19104-6323.
Victoria J. Cassens, Eileen M. Luke, Peggy J. Hoover, Center
for Environmental and Regulatory Information Systems, Purdue University, 1231
Cumberland Ave., Suite A, West Lafayette, IN 47906.
Carol A. Duane, Ricerca, Inc., P. O. Box 1000, Painesville,
OH 44077-1000, Victoria J. Cassens, Center for Environmental and Regulatory
Information Systems, Purdue University, 1231 Cumberland Ave., Suite A, West
Lafayette, IN 47906.