ACS Chemical Information Division (CINF)
Spring, 1999 ACS National Meeting
Anaheim, CA (March 21-25)
Convention Center A6
|Data Mining Chemical Information Databases 1: Structure and Structure/Text Focus|
|W. Fisanick, Organizer, Presiding|
EXTRACTION OF STRUCTURE-ACTIVITY RELATIONSHIPS FOR BIODEGRADABILITY AND MUTAGENICITY OF NONCONGENERIC COMPOUNDS USING STRUCTURAL REGRESSION TREES
C. Helma (1,2), E. Gottmann (2), B. Pfahringer (3) and S. Kramer (3), (1) Institute for Tumor Biology--Cancer Research, Borschkegasse 8a, A-1090 Vienna, Austria, (2) Institute for Environmental Hygiene, Kinderspitalgasse 15, A-1095 Vienna, Austria, (3) Austrian Research Institute for Artificial Intelligence, Schottengasse 3, A-1010 Vienna, Austria
Predicting the environmental fate and adverse human health effects of organic compounds presents a formidable challenge for programs concerned with knowledge discovery, even if the databases are relatively small. We developed an Inductive Logic Programming (ILP) algorithm called Structural Regression Trees (SRT) and applied it to learn rules for the prediction of biodegradability and mutagenicity. These predictions are, in contrast to other Machine Learning Programs (e.g. Neural Networks), based on explicit rules which are understandable and interpretable by human experts. We applied SRT to learn the half-rate of the aerobic surface water biodegradation. The dataset contained 62 compounds and we performed 10-fold cross-validation to estimate the accuracy of the predictions. We were able to correctly estimate the decomposition rates of 78% of the compounds, which is significantly better than the performance of other Machine Learning programs on the same dataset. The SRT theory suggests that the presence of aromatic and nonaromatic halogens is an important factor reducing the biodegradability of organic compounds. For the mutagenicity experiments, we used 188 nitroaromatic compounds. Using 10-fold cross-validation, SRT correctly predicted the activities of 90% of the compounds. The accuracy of SRT was slightly higher than that of other ILP programs and than that of Artificial Neural Networks. The rules generated by SRT reveal that the mutagenicity of nitroaromatic compounds is determined by their logP and LUMO values and the presence or absence of certain ring structures (e.g. phenanthrene). These examples clearly demonstrate that it is possible to automatically extract Structure-Activity Relationships from toxicological databases which are predictive and interpretable.
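The 10-fold cross-validation mentioned in this abstract can be sketched as below. This is only an illustration of the splitting scheme, not the authors' SRT code: the function names and the fixed random seed are assumptions, and a real study would train a model on each training split and score it on the matching test split.

```python
import random


def k_fold_indices(n_items, k=10, seed=0):
    """Split range(n_items) into k disjoint test folds of near-equal size."""
    indices = list(range(n_items))
    random.Random(seed).shuffle(indices)
    # Fold i takes every k-th shuffled index starting at i,
    # so fold sizes differ by at most one.
    return [indices[i::k] for i in range(k)]


def cross_validation_splits(n_items, k=10, seed=0):
    """Yield (train, test) index lists; every item is tested exactly once."""
    folds = k_fold_indices(n_items, k, seed)
    for i, test in enumerate(folds):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


# 62 items, matching the size of the biodegradability dataset in the abstract.
splits = list(cross_validation_splits(62))
```

With 62 compounds and 10 folds, two folds hold 7 compounds and eight hold 6, so each model is trained on 55 or 56 compounds.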
TECHNIQUES AND STRATEGIES IN 3D DATABASE MINING
Osman F. Güner, Rémy Hoffmann, Hong Li, Molecular Simulations Inc. 9685 Scranton Road, San Diego, CA 92121, USA.
Three-dimensional database searching has become a fundamental technology used in Computer-Aided Drug Design. With the primary objective of identifying new leads and new classes of active compounds, 3D searching has also become an important companion for computational combinatorial library design and analysis tools, particularly for focusing combinatorial libraries. In this presentation, we discuss different techniques and strategies utilized in 3D searching. Search results obtained from various queries are compared and analyzed. Detailed comparisons cover flexible vs rigid searching; shape vs pharmacophore vs mixed query searching; receptor- vs ligand-based queries; performance of automated vs manual pharmacophores vs SAR based hypotheses; techniques involving clustering and merging of pharmacophore models; and training set selection for pharmacophore model generation. Each analysis is followed by recommendations on which techniques to use under which conditions.
VISUALISAR: A TOOL FOR VISUALIZING TOPOLOGICAL COMMONALITIES IN CHEMICAL DATASETS.
David J. Wild, C. John Blankley, Department of Chemistry, Parke-Davis Pharmaceutical Research, 2800 Plymouth Road, Ann Arbor, Michigan 48105
To assist in structure-activity relationship studies of compounds passing through our high volume screening labs, we are evolving a tool called VisualiSAR for the detection of structurally-related groups and common or activity-related structural features in sets of compounds that are too large to efficiently analyze by hand. In this presentation, we shall describe how we have brought together 2D fingerprinting, Ward's clustering, modal fingerprint analysis, 2D depiction of compounds, Stigmata coloring, and a Web-based interface to provide an effective analysis tool. VisualiSAR presents as much information as possible using 2D chemical structure depictions, which means that trends can be detected visually and the tool can be used by chemists with little training.
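As a rough illustration of two of the ingredients named above, the sketch below computes a Tanimoto similarity between 2D fingerprints and a modal fingerprint for a cluster. The abstract does not define VisualiSAR's modal fingerprint; the majority-threshold definition used here, the set-of-bits representation, and the example bit positions are all assumptions.

```python
def tanimoto(fp_a, fp_b):
    """Tanimoto coefficient for fingerprints given as sets of on-bit positions."""
    if not fp_a and not fp_b:
        return 1.0
    common = len(fp_a & fp_b)
    return common / (len(fp_a) + len(fp_b) - common)


def modal_fingerprint(fingerprints, threshold=0.5):
    """Bits set in more than `threshold` of the fingerprints (assumed definition)."""
    counts = {}
    for fp in fingerprints:
        for bit in fp:
            counts[bit] = counts.get(bit, 0) + 1
    cutoff = threshold * len(fingerprints)
    return {bit for bit, n in counts.items() if n > cutoff}


# Hypothetical cluster of three compounds, each a set of on-bit positions.
cluster = [{1, 4, 7, 9}, {1, 4, 7}, {1, 4, 8}]
modal = modal_fingerprint(cluster)
scores = [tanimoto(modal, fp) for fp in cluster]
```

Scoring database compounds against `modal` rather than against each cluster member is one plausible way to retrieve further compounds that share the cluster's common features.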
APPLICATIONS OF VISUALISAR TO STRUCTURE-ACTIVITY TREND ANALYSIS.
C. John Blankley, David J. Wild, Parke-Davis Pharmaceutical Research, 2800 Plymouth Road, Ann Arbor, MI 48105
The utility of VisualiSAR, a new program for visually comparing sets of structures, in identifying SAR trends in large datasets of diverse compounds will be illustrated with examples from the NCI AIDS SCREEN dataset. Prior clustering of such collections into smaller sets of structurally related compounds and then coloring each of these by regions of commonality or difference allows important topological features to be readily discerned. This tool is particularly useful when used in conjunction with data analysis techniques which classify or partition datasets into groups based on property descriptors. Modal fingerprint scoring allows compounds related to those in interesting clusters to be retrieved from a larger database for further comparison and analysis. Easy interactivity with spreadsheet programs and a web interface add flexibility and convenience.
A KNOWLEDGE BASE FOR MINING OF ENDOCRINE DISRUPTOR DATA.
R. Perkins (2), J. Anson (1), W. Tong (2), W. Branham (1), R. Blair (1), B. Hass (1), H. Fang (1), L. Shi (2), Y. Chen (2), J. Meehan (2), R. Nossaman (2), W. Welsh (3), D. Sheehan (1), (1) FDA National Center for Toxicological Research, (2) R.O.W. Sciences, Jefferson, AR, (3) University of Missouri, St. Louis, MO.
The Endocrine Disruptor Knowledge Base (EKB) is intended to assist in risk assessment and regulatory decisions for exogenous compounds that may alter responses and disrupt vertebrate endocrine systems. This integrated multidisciplinary program comprises three primary elements: 1) in vitro and in vivo experimentation; 2) an Internet-based bibliography and bioactivity database; and 3) computer-based predictive models. The cross-linked database and bibliography with a searchable capability through an Internet browser provide an environment for virtual collaboration and knowledge mining facilitating identification of data gaps and research hypotheses, and development of QSAR and simulation predictive models.
UNSUPERVISED LEARNING IN REACTION DATABASES.
Johann Gasteiger, Oliver Sacher, Computer-Chemie-Centrum, University of Erlangen-Nuremberg, Naegelsbachstrasse 25, D-91052 Erlangen, Germany
Reaction databases provide a cornucopia of information on chemical reactions that could be used to derive knowledge for the prediction of the course and products of chemical reactions as well as for the design of organic syntheses. In order to reach this goal, the essential features of chemical reaction instances have to be recognized and generalized. This is achieved by a classification of a set of reactions by unsupervised learning techniques such as self-organizing neural networks and Bayes classifiers. In this approach, reactions are characterized by physicochemical features directly derived by computations from the constitution of the starting materials or products of a reaction. The knowledge thus derived is integrated into the reaction prediction system EROS and the synthesis design system WODCA.
Convention Center A6
|Data Mining Chemical Information Databases 2: Text Focus|
|W. Fisanick, Organizer, Presiding|
PHARMACOLOGICAL AGENT DATA MINING IN CAS DATABASES
W. Fisanick, Qiong Yuan, Chemical Abstracts Service, 2540 Olentangy River Road, Columbus, Ohio, 43202-1505
Chemical Abstracts Service (CAS) has been experimenting with text and substance data mining techniques for the browsing and prediction of pharmacological agents based on data in the CAS Registry and CA databases. The text mining establishes the substance-bioactivity relationships. The browsing of pharmacological agents on classified substance databases is accomplished by 2D and 3D similarity searching and by accessing generic-specific relationships in linked substance clusters. The prediction of pharmacological agents given an input substance is based primarily on algorithms that utilize 2D and 3D structural fragments. Experimental prototypes are being developed to illustrate the browsing and prediction capabilities. This paper will discuss and illustrate several data mining techniques for pharmacological agents.
DISCOVERY OF CHEMICAL RELATIONSHIPS IN PATENTS-A COMBINATION OF NATURAL SEARCHING, WITH CLUSTERING AND VISUALIZATIONS.
W.C. Hauser, L. Martin, F. Passero and L. Schilling, Manning & Napier Information Services, 1100 Chase Square, Rochester, New York 14604
Natural Language searching of concepts combined with visualizations is a powerful combination of software utilities and features available through the Internet for scientists, attorneys and others interested in patent information. MAPIT (Management and Analysis of Patent Information in Text) is a data mining tool that facilitates the discovery of relationships between patents. These software products from Manning & Napier Information Services (www.mnis.net) provide powerful data judgment provisions that help rapidly reveal the most relevant clusters of patents or claims that are most closely related to your needs. Entire patents, individual claims, Markush structures or arbitrary English sentences may be searched, eliminating the need for difficult keyword queries. Example searches and visualizations are presented.
TEXT MINING OF CHEMICAL AND PATENT LITERATURE.
M. Hehenberger, P. Coupet, M. Stensmo and C. Huot, Text Mining Solutions, IBM, Route 100, Somers, NY 10589
Text mining tools, used to navigate databases of chemical and patent documents, can be shown to increase the productivity of scientists and patent attorneys. Text mining concepts are explained, and it is shown how tools such as Advanced Search, Clustering, and Visualization can help solve problems in chemical / biomedical research and patent analysis. By providing the solution over the Internet, or over a company Intranet, it is made readily available to a large user population.
COMPUTER-ASSISTED SEARCH FOR NOVEL IMPLICIT CONNECTIONS IN TEXT DATABASES.
Don R. Swanson, University of Chicago, 1010 E. 59th St., Chicago, IL 60637
Useful scientific information can go undiscovered if it is not made explicit within any single article, but can be inferred only by considering together two or more separate articles. We have developed software, called ARROWSMITH, to identify and construct suggestive juxtapositions of biomedical article titles, the purpose being to help researchers detect new and useful implicit relationships. We have analyzed eight pairs of literatures in which the two members of each pair had developed independently of one another, and yet were complementary in that together they led to inferences that could not have been reached by a study of either literature alone. In several such studies our findings were later corroborated experimentally. Here we report a study of the relationship between endothelium-derived nitric oxide and insulin resistance, a problem that has been identified in several biomedical articles. Our purpose is to show how ARROWSMITH could be helpful in guiding researchers toward a solution.
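The idea of juxtaposing titles from two separate literatures can be sketched as follows. This is not ARROWSMITH itself: the stopword list, the tokenizer, the function names, and the placeholder titles are all assumptions made for illustration. The sketch finds words that occur in titles of both literatures, excluding the words that define each literature, and pairs up the titles they connect.

```python
import re

STOPWORDS = {"a", "an", "and", "by", "for", "in", "of", "on", "the", "to", "with"}


def title_terms(title):
    """Lowercased words in a title, minus a small stopword list."""
    words = re.findall(r"[a-z][a-z\-]+", title.lower())
    return {w for w in words if w not in STOPWORDS}


def linking_terms(titles_a, titles_c, topic_terms):
    """Map each term shared by both literatures to the title pairs it connects."""
    index_a, index_c = {}, {}
    for index, titles in ((index_a, titles_a), (index_c, titles_c)):
        for title in titles:
            # Drop the words that merely name the two literatures.
            for term in title_terms(title) - topic_terms:
                index.setdefault(term, []).append(title)
    shared = index_a.keys() & index_c.keys()
    return {t: [(a, c) for a in index_a[t] for c in index_c[t]] for t in sorted(shared)}


# Placeholder titles, invented purely to exercise the code.
titles_a = ["Alpha exposure alters membrane viscosity", "Alpha intake in healthy adults"]
titles_c = ["Membrane viscosity changes in gamma syndrome", "Gamma syndrome prevalence survey"]
links = linking_terms(titles_a, titles_c, {"alpha", "gamma", "syndrome"})
```

Each entry in `links` is a candidate bridge between the two literatures; deciding whether a bridge is meaningful remains a task for the researcher.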
QUANTITATIVE COMPETITIVE INTELLIGENCE.
Norman J. Santora, Chemical Forecasting And Searching Technology, 1323 Partridge Road, Roslyn, PA 19001
A viable approach to analyzing the competitive intelligence gathered for a department of a pharmaceutical company will be demonstrated. This analysis will provide direction for management in deciding the future course of a proposed research project in a timely manner. An example will be given in which Markov Chain Analysis predicts whether a particular compound should be studied as a potential drug candidate. It is expected that Competitive Intelligence personnel and research managers will find this an attractive approach, especially since combinatorial chemistry is providing so many drug candidates in such a rapid manner.
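The abstract does not say how the Markov chain is set up, so the sketch below only shows the basic mechanics under invented assumptions: the three states and every transition probability are hypothetical. It propagates a starting distribution through a transition matrix to estimate where a compound is likely to be after a given number of review cycles.

```python
def step(distribution, transition):
    """One Markov step: new[j] = sum over i of distribution[i] * transition[i][j]."""
    n = len(transition)
    return [sum(distribution[i] * transition[i][j] for i in range(n)) for j in range(n)]


def distribution_after(distribution, transition, steps):
    """Apply `steps` Markov steps to the starting distribution."""
    for _ in range(steps):
        distribution = step(distribution, transition)
    return distribution


# Hypothetical states and probabilities; each row sums to 1.
STATES = ["screening", "lead", "dropped"]
TRANSITION = [
    [0.5, 0.25, 0.25],  # from screening
    [0.0, 1.0, 0.0],    # lead is treated as absorbing
    [0.0, 0.0, 1.0],    # dropped is treated as absorbing
]

result = distribution_after([1.0, 0.0, 0.0], TRANSITION, 2)
```

In practice the transition probabilities would have to be estimated from the gathered competitive intelligence, which is the part the presentation addresses.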
CONVERTING DATA TO KNOWLEDGE: EMPOWERING THE SCIENTIST.
N. Jones, Oxford Molecular, Oxford OX4 4GA, UK
It is clear that for a whole organization to ride the ever growing wave of information, no individual can afford to be swamped. Scientists who make decisions about research direction on a daily basis need to be able to access and infer knowledge from masses of complex data. In order for non-specialists to use these data mining tools it is not enough that they are easy to use; it is important that techniques yield clear and intuitive answers. For this reason, we have developed a desktop application which can retrieve any corporate data and allow the scientist to make quick reasoned decisions using a series of mutually complementary techniques tailored to fit the modern discovery processes of high-throughput screening and combinatorial chemistry. Techniques range from the simplest viewing of data as colored patterns to a graphical implementation of FIRM, a recursive partitioning technique. Examples will be given of how these visual, linked and intuitive techniques can allow a scientist at the benchtop, in a few minutes, to discern knowledge from masses of data.
Convention Center A2/3
|Information at the Cutting Edge in Catalysis, Petroleum and Polymer Chemistry|
|S. Kaback, Organizer, Presiding|
CURRENT AWARENESS IN SINGLE-SITE POLYMERIZATION CATALYSIS.
Gregory G. Hlatky, Equistar Chemicals LP, 11530 Northlake Dr. Cincinnati, OH 45249
"Single-site" olefin polymerization catalysts have invigorated the mature field of Ziegler-Natta catalysts. New papers, patents, and published patent applications on catalyst structures, polymerization processes, and polymer products and end uses are appearing in ever increasing numbers. The intense commercial interest in these new systems, as well as their intellectual property aspects, generates other business and legal disclosures. This presentation outlines how Equistar Chemicals maintains current awareness for scientists, management, and attorneys in this exciting, ever changing area.
METALLOCENE CATALYSTS AND POLYOLEFINS: THE RECORD IN JAPANESE PATENTS.
A.K. Engel, ISTA, Inc., 551 W. Lancaster Ave., Suite 212, Haverford, PA 19041
Traditional and new patent mapping techniques are used to visualize metallocene technology strategies and flows in Japan. Particular attention is paid to scientific inputs and application outputs.
GOOD IP TOOLS INCREASE THE IMPACT AND FUN OF BEING IN R&D.
Paul B. Germeraad, Aurigin Systems, Inc., 1975 Landings Dr., Mt. View, CA 94043
The use of good Intellectual Property Management tools greatly increases both the creativity and productivity of R&D scientists. These tools include searching methods, analysis methods, graphic tables and maps, and electronic capture of insights. The stage / gate model of product development is used as a template for presenting the best methods to use for each stage's work. Ways to expand upon initial ideas and concepts in the preliminary assessment phase are, for instance, contrasted to the figures and maps used to focus a project in late development and scale-up stages. Integrating Intellectual Property into the R&D process has been shown to speed up development time, improve project quality, and frankly make projects more fun to work on.
DERWENT'S ENHANCED KEYWORD INDEXING.
Donald Walter and Paul Sayer, Derwent Information, Suite 250, 1725 Duke Street, Alexandria VA 22314
Derwent indexing has a well deserved reputation for its power in retrieving information from the patent literature, and is particularly powerful in aiding retrieval from chemical patents. Unfortunately, powerful systems are often complicated systems, accessible only after a prolonged apprenticeship. The power of Derwent indexing is being augmented by the development of a comprehensive keyword indexing system, which will make sophisticated search and retrieval an option in a wider range of situations. We will describe this simple, comprehensive indexing system, and discuss its utilisation with reference to the petrochemical and polymer industries. The simplicity of the system may be useful in end user applications, and we will provide some pointers as to how this might be achieved.
APIPAT AND APILIT, THE (FORMER) AMERICAN PETROLEUM INSTITUTE TECHNICAL DATABASES: CURRENT STATUS, STRONG POINTS.
Nancy Lambert, Chevron BP&S, P.O. Box 1627, Richmond, CA 94802-0627
The American Petroleum Institute's patent and technical literature databases, APIPAT and APILIT, are valuable information resources not only for the petroleum industry but also for certain areas of the chemical industry, particularly in hydrocarbons and oxygen-containing organics. These are some of the best-indexed databases available; but their usage has traditionally suffered from API's restrictive subscription policy and lack of publicity. However, the new owners of the databases should promote and enable their use to a wider range of customers. This paper will address the databases' current ownership and distribution and discuss some of their strong points.
MEETING THE CHALLENGE OF CATALYSTS FROM AN INDEXING AND SEARCHING PERSPECTIVE.
William B. Catus III and Linda S. Toler, Chemical Abstracts Service, 2540 Olentangy River Rd, Columbus, OH 43202
The CAplus database on STN is the richest single online source for catalyst information. With in-depth subject and substance indexing, exceptional currency, and coverage of both patents and journal literature from around the world, CAplus is an important source for information on catalyst compositions and applications. Subject indexing of catalyst information, including the new thesaurus hierarchy and improved coverage of catalysts for petroleum refining, asymmetric synthesis, and polymerization (e.g., metallocene single-site catalysts), will be addressed. Indexing of catalyst components by CAS Registry Number, subject terms, and CAS roles will be described along with coverage of common classes of catalysts, such as metals, oxides, and sulfides, heteropoly acids, clays, zeolites, and molecular sieves. Search techniques illustrating the advantages of using the CAplus and REGISTRY databases on STN for catalyst information will also be presented.
THE ROLE OF PATENT ANALYSIS IN CHARACTERIZING "MATURE" INDUSTRIES.
J.P. Wineburg, DuPont Automotive Finishes, Marshall R&D Laboratory, 3401 Grays Ferry Ave., Philadelphia, PA 19146
In "high-technology" industries, new discoveries appear at a rapid rate. Patent monitoring is essential for those who want to stay knowledgeable. Patents are a unique source of technical information and competitive intelligence. Patent analysis is a widely used business intelligence tool. In "mature" industries, new discoveries tend to appear at a more sporadic rate. Patent monitoring is still essential for those who want to stay current. However, patent analysis can provide potentially misleading business intelligence information. The patents of competitors in "mature" industries may not reflect business strategy, R&D strategy, etc. The patents may not even correspond to products they sell in the marketplace. Case studies involving two different "mature" industries illustrate ways to identify, and deal with, these potential pitfalls. In the first case, patent analysis provides highly predictive business intelligence information. In the second case, patent analysis provides potentially misleading business intelligence information.
Convention Center A6
|G. Baysinger, Organizer, Presiding|
DIGITAL LIBRARIES AND SCHOLARLY COMMUNICATION: AN OVERVIEW.
Donald J. Waters, Director, Digital Library Federation, Council on Library and Information Resources, 205 Church Street, Third Floor, New Haven, CT, 06510-1805
Emerging digital libraries already support research and learning in a variety of disciplines. High-quality, cost-effective support requires continuing development of a variety of features of digital libraries. This presentation will provide an overview from the perspective of the Digital Library Federation of the organizational, economic, and technical developments planned and underway.
BUILDING A DIGITAL LIBRARY IN THE SCIENCES.
Susan Starr, Associate University Librarian-Sciences & Director, Biomedical Library, University of California, San Diego, La Jolla, CA, 92093-0175
In the Fall of 1996 the University of California embarked on a new venture: the creation of the first statewide digital collection in what came to be known as the California Digital Library (CDL). This first collection was called the Science, Technology and Industry Collection (STIC), and it was designed to 1) provide scholarly material in a convenient and timely way to faculty and students in the sciences, 2) serve as an opportunity to learn and plan for future CDL collections, and 3) serve as an opportunity to develop partnerships with business and industry. The presentation will describe our progress to date in building the collection and some of the lessons we have learned along the way. The following links provide more information about the California Digital Library:
Library Planning and Action Initiative Advisory Task Force. FINAL REPORT. University of California, March 1998.
California Digital Library
UC Systemwide Library Planning
LIBRARY WITHOUT WALLS PROJECT - THE FIRST 3 GENERATIONS.
Frances L. Knudson, Richard E. Luce, Doris K. Ford, Los Alamos National Laboratory, P.O. Box 1663, MS-P362, Los Alamos, NM, 87545
The Library Without Walls (LWW) Project at Los Alamos National Laboratory Research Library has existed for 4 years. We will discuss the successes and challenges of the three generations of LWW products. The first generation included delivery of scientific databases via the web to the researchers' desktop and digitization of Los Alamos technical reports. The second generation centers on searching the databases simultaneously as one mega-database; enhancing the search capabilities to deliver a truly multidisciplinary product to the researcher. The third generation will focus on techniques to aid the researcher in processing scientific information. Techniques will include data mining and visualization.
LINKING TO FULL-TEXT: THE ISI EXPERIENCE.
Chris Leonard, Manager, New Product Development, New and Corporate Products, Institute for Scientific Information, 3501 Market Street, Philadelphia, PA, 19104
Maximum access to a wide variety of integrated electronic resources is key to building the digital libraries envisioned by researchers, scholars and information professionals. ISI Links is an initiative from the Institute for Scientific Information that establishes navigational links between core Web-based research tools such as the Web of Science and the high-quality content required by users. Over the past two years, with its ISI Links initiative, ISI has gained valuable insights and experience in establishing agreements with publishers of electronic full-text titles and vendors of information services such as chemical and patent database producers. This presentation will cover technical challenges encountered, the types of content linked, trends in electronic publishing, and key concerns for participants in the chain of scholarly communication along with insights into the tools currently available to researchers and the realities of our current digital environment.
INNOVATIVE INFORMATION RESOURCES ON THE INTRANET.
Silvia E. Lee, Information Specialist, Catalytica, Inc., 430 Ferguson Drive, Mountain View, CA, 94043
One of the challenges Corporate Information Centers encounter today is how to ensure that critical information is readily and easily available to those who need it. Your company's Intranet will increasingly become an indispensable tool in accomplishing this goal. The focus of this presentation will be on two innovative and cost saving services that make information available over Catalytica, Inc.'s Intranet. The Web-Based Electronic Table of Contents (eTOC) Service allows users to quickly scan a journal's TOC and also allows them to conveniently order a copy of an article, again over the Intranet. The second tool is a web-based Library Catalog, which lets the user search, reserve or order books right from their desktop.
UC IRVINE SCIENCE LIBRARY CHEMISTRY RESOURCE LOCATOR.
April Love, John Sisson, Science Library R&I, University of California, Irvine, CA, 92623
The impact of the Web on library reference service is that you do not know from where or when clients will need chemistry information. In the UCI Science Library we created an "extended" pathfinder for chemistry called the Chemistry Resource Locator (CRL). The process of developing the CRL incorporated both finding locally created sources of information and identifying appropriate Web-based sources. We identified our users' most frequent chemistry information needs and how a web guide could meet those needs. This information already existed in various forms and needed to be consolidated. We also looked at similar chemistry information sources on the Web to see which sources they found most valuable. The implementation of the CRL has shown there is a need for an accessible source of different types of chemistry information. The CRL enhances the use of our existing reference collections.
Convention Center A6
|Techniques in Pharmacophore Development|
|O. Güner, Organizer, Presiding|
WHAT MAKES A GOOD PHARMACOPHORE?
Edmond Abrahamian, S. Ling Chan, Robert D. Clark, Trevor W. Heritage, Tripos, Inc., 1699 South Hanley Road, St. Louis, MO 63144
An experienced molecular modeler can often look at a particular pharmacophore-based query and know whether it is likely to produce useful results when used to carry out a 3D search - i.e., whether it will be so general that it "hits" too many structures, or so specific that it "hits" too few. The recent introduction of partial match capabilities in commercial 3D searching packages has rendered such evaluation considerably more difficult, but it is still humanly possible. In carrying out recent work on automated pharmacophore hypothesis generation, it became clear that a priori evaluation of the information content and/or discrimination power of a 3D query is in fact quite a complex task. Nonetheless, certain general relationships do exist between query structure and information content which can allow quantitative analysis which goes beyond simple rules of thumb. Results of such analyses will be presented.
AUTOMATIC PHARMACOPHORE GENERATION USING CATALYST.
Jon Sutter, Osman Güner, Rémy Hoffmann, Hong Li, Marvin Waldman, Molecular Simulations Inc., 9685 Scranton Road, San Diego, CA 92121-3752
Catalyst is a program suite that automatically generates pharmacophores using only experimentally-obtainable ligand information: 2D structure and biological activity. Catalyst calculates a conformational model for each molecule, creates initial pharmacophores that are common among the active compounds but not among the inactive compounds, and optimizes the pharmacophores using simulated annealing. Each pharmacophore is scored using a cost function that accounts for its complexity, the deviation from ideal chemical weights for each chemical feature, and the differences between the predicted and measured activities. The pharmacophores can be used to predict the activity of unknown compounds or to search for new possible leads contained in 3D chemical databases. In this presentation, validation studies involving new developments in Catalyst, including the use of variable weights and tolerances, will be discussed.
EVALUATION OF AUTOMATED METHODS FOR PHARMACOPHORE MODEL GENERATION.
Berith Bjørnholm, Klaus Gundertofte, Morten Langgård, H. Lundbeck A/S. Ottiliavej 9. DK-2500 Copenhagen-Valby. Denmark
A pharmacophore model of the serotonin reuptake site has been developed using classical pharmacophore modelling tools. (1) The model can explain enantioselectivity as well as selectivity towards norepinephrine and has been validated against several test compounds. Newer, automated techniques for pharmacophore model generation, Catalyst, Cerius2 and Flo96, have been applied to the serotonin reuptake site. The generated pharmacophore models have been compared to our classical model. Differences in performance of the applied methods will be discussed.
(1) Gundertofte, K., Bøgesø, K.P. and Liljefors, T., A Stereoselective Pharmacophoric Model of the Serotonin Reuptake Site. In H. van de Waterbeemd, B. Testa and G. Folkers (Eds.), Computer-Assisted Lead Finding and Optimization. WILEY-VCH, Weinheim, 1997, pp. 443-459.
SUBSTRATE MAPPING THROUGH A MULTIPLE COPY SIMULTANEOUS SEARCH : A NEW METHODOLOGY APPLIED TO PEPTIDE DEFORMYLASE.
Luc Patiny, Ecole Polytechnique, Laboratoire DCSO, route de Saclay, 91126 Palaiseau CEDEX, FRANCE
The multiple copy simultaneous search method (MCSS) was developed by Karplus in order to create functionality maps of binding sites for proteins having a known catalytic site. In this communication we present the use of MCSS to quickly determine the position of a catalytic site, given a substrate and the structure of the protein. To reach this goal we use the substrate directly in the minimisation. To circumvent the difficulties linked to the degrees of freedom, the substrate is reduced, in a first step, to 50%. After the MCSS, we let the substrate inflate in the protein. We will present results that were obtained on peptide deformylase and compare them to experimental results.
EXPLOITING PHARMACOPHORES USING ORACLE.
Keith Davies, Treweren Consultants Ltd, Evesham WR11 5LU; Richard Postance, Oxford Molecular Ltd, Oxford OX4 4GA
The 3D pharmacophore searching technology developed by Chemical Design in the early 1990s was adapted to provide a quantitative measure of pharmacophore diversity for Combinatorial Chemistry Library Design. This used the concepts of pharmacophore fingerprints or keys, which have also been used to identify pharmacophores which may be associated with activity. Recent advances in computer speed and the falling costs of disk space have allowed the pharmacophores for individual molecules to be stored in an ORACLE database. This enables much more rapid development and testing of hypotheses for the pharmacophore responsible for a given activity, which in turn may significantly reduce the time taken for lead explosion and lead optimisation. This paper includes examples using this technology to select molecules to exhibit a desired selectivity of biological response.
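The idea of storing per-molecule pharmacophore keys in a relational database can be sketched as below. Several things here are assumptions rather than the authors' method: the keys are simplified two-point keys (a pair of feature types plus a binned distance), SQLite from the Python standard library stands in for ORACLE, and the table name, feature names, and coordinates are invented.

```python
import itertools
import math
import sqlite3


def pharmacophore_keys(features, bin_width=1.0):
    """Two-point keys: (type_1, type_2, distance bin) for every feature pair."""
    keys = set()
    for (type_1, pos_1), (type_2, pos_2) in itertools.combinations(features, 2):
        first, second = sorted((type_1, type_2))  # order-independent key
        keys.add((first, second, int(math.dist(pos_1, pos_2) // bin_width)))
    return keys


# Hypothetical molecules: lists of (feature type, 3D coordinates in angstroms).
molecules = {
    "mol_a": [("donor", (0.0, 0.0, 0.0)), ("acceptor", (3.2, 0.0, 0.0)),
              ("aromatic", (0.0, 4.5, 0.0))],
    "mol_b": [("donor", (1.0, 1.0, 1.0)), ("acceptor", (1.0, 1.0, 4.6))],
}

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE pharm_keys (mol TEXT, t1 TEXT, t2 TEXT, dist_bin INTEGER)")
for name, features in molecules.items():
    rows = [(name, *key) for key in pharmacophore_keys(features)]
    conn.executemany("INSERT INTO pharm_keys VALUES (?, ?, ?, ?)", rows)

# Which molecules contain a donor-acceptor pair 3 to 4 angstroms apart?
query = "SELECT mol FROM pharm_keys WHERE t1 = ? AND t2 = ? AND dist_bin = ? ORDER BY mol"
hits = [row[0] for row in conn.execute(query, ("acceptor", "donor", 3))]
```

Because the keys are precomputed, testing a new hypothesis becomes a database query rather than a fresh 3D search, which is the speed-up the abstract points to.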
Convention Center Halls A/B/C
|Chemical Information Sources on the WWW|
|A. H. Berks, Organizer, Presiding|
ONLINE INFORMATION SERVICES FOR THE PHYSICAL SCIENCES.
S. Calcari, S. Nannapaneni, and L. X. Payne, Internet Scout Project, Computer Sciences Dept., University of Wisconsin-Madison, Madison, WI 53706
ACD/ILAB: CHEMICAL PROPERTY PREDICTIONS OVER THE INTERNET USING JAVA-ENABLED BROWSERS OR WINDOWS CLIENT SOFTWARE.
Valeri Kulkov, Antony Williams, Advanced Chemistry Development, Inc., 133 Richmond Street West, Suite 605, Toronto, Ontario M5H 2L3, CANADA
ACD/ILab is a Web-based gateway to chemical property prediction programs and databases located at http://www.acdlabs.com/ilab/ . Our recently announced Open Server Interface provides an easy way to connect chemical information resources to the ILab, reusing the existing Web interface and server utilities and thus dramatically reducing the effort needed to deploy a chemical information resource on the Internet. A unified Web browser interface allows instant access to ILab resources without having to install any additional software on a client computer. In this talk we will focus on our innovative Windows client software for the ILab - ChemSketch Online. We will show advantages and disadvantages of both clients and provide examples of successful utilization of the ILab in both academic and industrial environments.
NAME/STRUCTURE: AUTOMATED STRUCTURE GENERATION FROM CHEMICAL NAMES.
J.S. Brecher, CambridgeSoft Corporation, 100 CambridgePark Drive, Cambridge, MA 02140
A single substance may be represented by many chemical names, from the systematic ones recommended by the International Union of Pure and Applied Chemistry (IUPAC) or the Chemical Abstracts Service (CAS) to the semi-systematic ones that have evolved over the past century of common usage. Chemists in different fields or different countries may prefer yet other names. A new software routine has been developed that can interpret the majority of chemical names and generate the corresponding chemical structure with an accuracy of over 99%. Applications of this technology toward database preparation and searching will also be discussed.
NAVIGATING CHEMISTRY ONLINE WITH CHEMCENTER.
Louise Voress, Sarah Blendermann, John P. Fallon, David Koran, Beth Weston, 1155 Sixteenth Street, N.W., Washington, DC, 20036
This paper summarizes ChemCenter's approach to helping chemistry professionals, as well as educators, students, and the general public, navigate the World Wide Web for information about chemistry. This "virtual society" is a central source for information and interaction with ACS, its programs, and activities. It affords users the opportunity to learn about and link to other Web sites; highlights information of importance to its audience from ACS and other credible sources; offers unique features and original content; helps users organize the vast resources available electronically; and serves as a reliable Web starting point that can be accessed internationally, 24 hours a day.
SERVING SCIENTIFIC INFORMATION TO HETEROGENEOUS COMMUNITIES.
Louis J. Culot, CambridgeSoft Corporation, 100 CambridgePark Drive, Cambridge, MA 02140
Although the Web has put a common data interface on every desk, it doesn't solve the cultural problems which exist across disciplines needing access to the same information. Chemists, biologists, and non-scientists often access the same information sources, but ask different questions in different ways. Serving scientific information to these disparate groups poses unique problems in both data-architecture and user-interface design. A general overview of the problems, along with two WWW and data case studies, will be discussed. We will focus attention on combining the data-driven architectures needed for long-term maintainability with user-friendly interfaces that effectively serve broad populations. All case studies and examples will involve chemically-searchable (i.e., structure-searchable) data.
INTEGRATED ACCESS TO WEB-BASED CHEMISTRY RESOURCES IN A CORPORATE ENVIRONMENT.
Steve Boyle, Global Research, Nalco Chemical Company, Naperville, Illinois 60563-1198
Industrial researchers need rapid access to Web-based information for product development and customer support. Providing universal access at the desktop can be especially challenging in a corporate environment, where concerns about data security and online productivity must be addressed early on. It is also important to combine external Web resources with equal access to internal collections of the company's proprietary reports, project files and other core intellectual property. This paper will discuss an industrial research organization's program to establish and support Web access in a corporate environment, along with intermediary online searching, SciFinder, in-house databases and traditional library reference services. Topics will include: user training, data security, policy administration, and the use of an intranet to facilitate Web searching and ensure its integration with internal company resources.
THE SCIENTIFIC AND TECHNICAL INFORMATION NETWORK (STN) USES WEB TECHNOLOGY TO PROVIDE UNIQUE ACCESS TO TRADITIONAL ONLINE INFORMATION.
Chris McCue, Leni Helmes (with William Bartelt), FIZ-Karlsruhe, P.O. Box 2465, D-76012 Germany
The popularity of the Web has influenced the expectations of virtually everyone who needs information. The convenient desktop access and easy to use interaction that are hallmarks of the Web have created a demand for browser access to the scientific information on STN. This demand presented a unique challenge because the power of STN was built using traditional online technology. Web technology has enabled STN to create a Web service for information professionals that not only provides parallel capabilities, but also takes advantage of the strengths of the Web. This is accomplished using a variety of technologies, including stateful Web applications, search and display engines, structure searching and image servers. STN, combined with the power of the Web, now provides the professional searcher with Web browser access to the information they need.
WEDNESDAY AM / PM
Convention Center A6
|Techniques in Pharmacophore Development|
|O. Güner, Organizer, Presiding|
FIELD-BASED SIMILARITY FORCING: A CONFORMATIONALLY-FLEXIBLE APPROACH TO PHARMACOPHORE PERCEPTION.
James R. Blinn, Gerald M. Maggiora, and Douglas C. Rohrer, Pharmacia & Upjohn, 301 Henrietta Street, Kalamazoo, MI 49007-4940.
A new molecular field-based similarity forcing method for matching conformationally flexible molecules is presented. The method extends earlier work on field-based similarity matching of molecules based upon the MIMIC program, by directly coupling the similarity matching function to a molecular mechanics force-field. Thus conformational energetics are now fully accounted for within the similarity-matching process, and conformational searching constrained by the similarity function ('similarity forcing') can be carried out concurrently. This method is quite similar to that used in NMR-based structure determination with NOE distance constraints. Although a Monte Carlo search procedure is used in the present work, any type of search procedure can be employed. After the best matchings have been determined, a series of post-matching analyses is performed to obtain information on the underlying pharmacophoric patterns. These analyses include an evaluation of appropriate similarity fields and inter-molecular atom-atom similarities. An example from several HIV-1 RT inhibitors will be presented to illustrate the salient features of the method.
EXTRACTION OF THE MAXIMUM 3D COMMON SUBSTRUCTURE FROM A SET OF LIGANDS.
Johann Gasteiger, Sandra Handschuh, Markus Wagener, Computer-Chemie-Centrum, University of Erlangen-Nuremberg, Naegelsbachstrasse 25, D-91052 Erlangen, Germany
Ligands binding to the same receptor are superimposed in order to extract the essential three-dimensional structure necessary for ligand binding. In this process, geometry changes due to conformational flexibility are explored to maximize overlap. The entire process uses a combination of a genetic algorithm with a quasi-Newton optimizer. The method can be applied to a set of ligands even if the 3D structure of the receptor is not known, and to hits from high-throughput screening.
VECTOR AND TENSOR PHARMACOPHORES FROM QUANTUM MECHANICAL CALCULATIONS.
Tim Clark, Matthias Hennemann, Bodo Martin, Computer-Chemie-Centrum, Universitaet Erlangen-Nuernberg, Naegelsbachstrasse 25, D-91052 Erlangen, Germany.
The detailed electrostatic and polarizability information available from even very simple (in this case AM1 semiempirical MO) quantum mechanical calculations can be converted into hydrogen bonding vectors, linear quadrupoles for aromatic ring binding sites and group polarizability tensors to characterize lipophilic groups. The spatial relationships between the different sorts of binding site on the molecule can be used to derive 3D-pharmacophores both from limited (20-50 compound) activity data and from larger numbers of compounds that have been identified to be active. The pharmacophore models thus derived can be used to scan entire 3D-databases at a speed of about 2,500 molecules per processor minute to obtain quantitative estimates of individual activities. Examples of the results of such searches will be presented.
THE CONCEPT OF PHARMACOPHORE AND ANTI-PHARMACOPHORE SHIELDING IN DRUG DESIGN AND STRUCTURE-PROPERTY RELATIONSHIPS.
Isaac B. Bersuker*†, Suleyman Bahceci*, James E. Boggs*, Robert S. Pearlman†, *Department of Chemistry and Biochemistry, †College of Pharmacy, The University of Texas at Austin, Austin, TX 78712
The widely employed concept of Pharmacophore (Pha) is complemented by the notions of Basic Skeleton (BS) and Anti-Pharmacophore Shielding (APS). The Pha is assumed to be the necessary element that produces the activity under consideration, while the BS is formed by adding to the Pha the maximum possible number of atomic groups that do not influence the activity (they fill the cavities in the ligand-receptor interaction), and the APS is defined as additional (to the BS) groups that hinder the proper Pha-receptor docking, thereby diminishing (partially or completely) the activity. In combination with our Electron-Conformational method of pharmacophore identification, suggested earlier, the concepts of BS and APS simplify the search for the Pha and allow for approximate numerical predictions of activities. These novel, computer-implemented ideas are applied to two problems: rice blast activity and angiotensin-converting enzyme inhibitors.
ADVANCED PHARMACOPHORE KEYS.
Stephen J. Cato, Oxford Molecular Group, Inc., 8380 Miramar Mall, Suite 224, San Diego, CA, 92121.
Chem-X software allows study of the potential pharmacophores exhibited by a collection of molecules and expresses the results as a pharmacophoric fingerprint, called a pharmacophore key. It recognizes up to seven types of interaction centers automatically and finds all possible permutations of three of these as potential pharmacophores in each low energy conformation, setting the appropriate bits in the key. This paper compares and contrasts two extensions to the standard pharmacophore keys. The use of four-center pharmacophores (rather than three) allows the pharmacophores found to be space filling (rather than planar) and creates a much larger potential pharmacophore space. Alternatively, 'profiling' allows a count of the pharmacophores (as opposed to a single bit for each) to be stored in the key on either a molecular or conformational counting basis. The application to the diversity comparison of libraries and diverse subset selection is examined.
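The three-center key described above can be illustrated with a minimal sketch. It is not the Chem-X implementation: the distance bin edges, the center type labels, and the use of a Python set in place of a bit string are all assumptions made for illustration.

```python
from itertools import combinations
from math import dist

# Hypothetical distance bin edges in angstroms (an assumption, not Chem-X's binning).
BIN_EDGES = [2.0, 4.0, 6.0, 8.0, 10.0]


def distance_bin(d):
    """Return the index of the first bin whose upper edge is >= d (last bin is open-ended)."""
    for i, edge in enumerate(BIN_EDGES):
        if d <= edge:
            return i
    return len(BIN_EDGES)


def conformation_key(centers):
    """Build the key for one conformation.

    centers: list of (type_label, (x, y, z)) tuples.
    Every triplet of centers contributes one "bit", identified by its three
    edges, each recorded as (sorted type pair, binned distance).
    """
    key = set()
    for a, b, c in combinations(centers, 3):
        edges = []
        for (t1, p1), (t2, p2) in ((a, b), (a, c), (b, c)):
            edges.append((tuple(sorted((t1, t2))), distance_bin(dist(p1, p2))))
        # Sorting the edges makes the bit independent of the order of the centers.
        key.add(tuple(sorted(edges)))
    return key


def molecule_key(conformations):
    """Union of the keys of all conformations of one molecule."""
    key = set()
    for centers in conformations:
        key |= conformation_key(centers)
    return key
```

Replacing the set with a `collections.Counter` would give a rough analogue of the 'profiling' variant, in which occurrences are counted rather than recorded as single bits.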
GENEFOLD: PROTEIN SEQUENCE TO STRUCTURE TO RECEPTOR PHARMACOPHORE.
Akbar Nayeem, Tripos, Inc., 1699 South Hanley Road, St. Louis, MO 63144
When a given protein sequence does not exhibit sufficient sequence identity to known proteins, threading methods have often proven successful in identifying protein folds most likely to be compatible with that sequence. Here we examine how far we can go beyond fold recognition. The application of GeneFold, a threading program developed by Godzik and Skolnick at Scripps, to developing a pharmacophore model for the binding site of an enzyme is shown. The value of the model as a basis for searching chemical libraries is also discussed.
NOVEL STRUCTURE BASED APPROACHES TO PHARMACOPHORE MODEL GENERATION AND LIBRARY FOCUSING.
C. M. Venkatachalam, Paul Kirchhoff, Jeff Jiang, Marvin Waldman, Molecular Simulations Inc, 9685 Scranton Rd, San Diego, CA 92121.
Given a library of compounds, identifying those that can satisfactorily dock and interact with a known protein active site is a challenging problem. The ability to select potentially interesting compounds from a library using known information about the three-dimensional structure of the active site is crucial in generating leads, especially when using combinatorial libraries. We present two different but complementary approaches. One approach employs a fast docking algorithm that finds low energy conformations of a ligand within an active site. The other approach analyzes the active site to develop a set of features (such as hydrogen bond donors, acceptors and lipophilic sites) that are required for attractive interactions within the active site, and uses the feature set to derive pharmacophore models that are then employed in database searching to identify hits.
PHARMACOPHORES INCLUDING MULTIPLE EXCLUDED VOLUMES DERIVED FROM X-RAY CRYSTALLOGRAPHIC TARGET STRUCTURES TO BE USED IN 3D-DATABASE SEARCHING.
M. Gillner and P. Greenidge. Karo Bio AB, Novum, Huddinge S-141 57, Sweden.
We have optimised the method of using many (>100) excluded volumes with 3D-pharmacophores in database searching with respect to specificity and speed (J.Med.Chem. 41:2503-2512, 1998). Structure-based pharmacophores were supplemented with excluded volumes positioned at the coordinates of the protein atoms delineating the ligand binding site. The search speed obtained makes this method practically feasible and is significantly faster than that reported for other software. The method effectively pruned the obtained hit-lists of unspecific hits (by 70-75%). Experimental verification showed that the remaining hits were specific (had micromolar affinities). We now show for structure-based pharmacophores that this method also improves the correlation between predicted and measured affinities for a structurally diverse set of estrogen receptor ligands. The resulting regression equation may be used for scoring of database hits.
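The pruning role of excluded volumes can be sketched as a simple geometric filter. This is only an illustration of the idea, not the authors' method: it assumes each hit has already been aligned to the pharmacophore frame, and the function names and the 1.5 angstrom radius are hypothetical.

```python
from math import dist


def passes_excluded_volumes(ligand_atoms, excluded_centers, radius=1.5):
    """True if no ligand atom lies within `radius` of any excluded-volume center.

    ligand_atoms and excluded_centers are sequences of (x, y, z) tuples in the
    same coordinate frame; `radius` is an assumed value in angstroms.
    """
    return all(
        dist(atom, center) >= radius
        for atom in ligand_atoms
        for center in excluded_centers
    )


def prune_hits(hits, excluded_centers, radius=1.5):
    """Keep only the hits (name -> atom coordinates) that avoid every excluded volume."""
    return {
        name: atoms
        for name, atoms in hits.items()
        if passes_excluded_volumes(atoms, excluded_centers, radius)
    }
```

With more than 100 excluded volumes per query, a production search would likely use a spatial index rather than this all-pairs loop.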
DOCKING-DERIVED PHARMACOPHORES.
Renate Griffith, J. B. Bremner, B. Coban, Department of Chemistry, University of Wollongong, Wollongong, NSW 2522, Australia
In the absence of any detailed 3D structure of adrenergic receptors, models have been constructed for the alpha1A and alpha1B subtypes of these important members of the G-protein-coupled family of membrane-bound receptors. Docking of the endogenous ligand, adrenaline, and also of a rigid, alpha1A-selective antagonist (IQC), developed in our laboratory, into these models will be presented. We have, in a novel application of the Catalyst software by MSI, constructed docking-derived pharmacophores from the docking results. These will be presented, as well as comparisons with traditional, ligand-based pharmacophores, which were developed in Catalyst using a set of subtype-selective antagonists, not including IQC. The common features of "ligand"- and "structure"-based pharmacophores can be superimposed strikingly well. This new approach offers further opportunities for tailored ligand design.
A DYNAMIC PHARMACOPHORE MODEL FOR HIV-1 INTEGRASE.
Heather A. Carlson, Kevin M. Masukawa, Roberto D. Lins, James M. Briggs, William L. Jorgensen, and J. Andrew McCammon, Department of Chemistry and Biochemistry, Department of Pharmacology, University of California, San Diego, 9500 Gilman Drive, La Jolla, California, 92093-0365
50-ps snapshots of a 1-ns molecular dynamics simulation of the catalytic domain of HIV-1 Integrase were used to represent the conformational flexibility inherent to the active site. Each of these protein configurations was subjected to a Multi-Unit Search for Interacting Conformers (MUSIC) simulation (a Monte Carlo, multiple-copy method available in the BOSS program). The MUSIC studies were used to develop a dynamic pharmacophore model based on the sampling of conserved binding sites for hydroxyl and ammonium functionalities in the active site of each protein model. The use of this method was validated by accurately predicting the binding site for the first of two required magnesium ions. Variations of our pharmacophore model were used to search the Available Chemicals Database.
PHARMACOPHORE DEFINITION OF RETINOID-X-RECEPTOR MODULATORS.
S.K. White, Ligand Pharmaceuticals, San Diego, CA, 92121
The Retinoid-X-Receptors (RXR) are homodimeric and heterodimeric partners with a variety of intracellular receptors, including PPAR, TR and VitD. While high affinity binding to the RXRs is important, we use a functional cotransfection assay to differentiate the role of the RXR ligand. The criteria for good structure-activity correlation depend on proper choice of ligands according to biological function, pharmacophore sampling and 2D similarity. A general method of selecting ligands via biological activity profile and 2D descriptors for training sets in pharmacophore analysis will be presented. By using 2D chemical structure information to bin our ligands, we are able to create 3D pharmacophore descriptions which correlate to the activity data with R² > 0.90.
THURSDAY AM / PM
Convention Center A6
|Modelling and Analysis through the Internet|
|O. Güner, Organizer, Presiding|
INTERNET-BASED MODELLING AND COMPUTATIONAL CHEMISTRY TOOLS.
Henry S. Rzepa, Department of Chemistry, Imperial College, London, SW7 2AY, UK.
The rapid and frequently chaotic progress made over the last five years in chemical utilisation of the Internet will be reviewed. Three issues in particular will be addressed in detail: the development of structured documents capable of storing and expressing chemical data and semantics, the use of the model rather than the image media type as an integral part of chemical documents, and issues surrounding the authentication of documents and chemical objects using digital certificates.
DISSERTATIONS FROM CHEMISTRY ON THE INTERNET.
Johann Gasteiger, Wolf-Dietrich Ihlenfeldt, Robert Hoellering, Computer-Chemie-Centrum, University of Erlangen-Nuremberg, Naegelsbachstrasse 25, D-91052 Erlangen, Germany
An essential part of academic chemical research is performed by graduate students. Dissertations therefore offer a wealth of information that, quite often, only partly and slowly finds its way into publications in scientific journals. The implementation of dissertations on the internet provides rapid access to this cornucopia of chemical information. It can be accessed by the search technology employed for chemical databases such as name, structure, and substructure searching. Furthermore, the information in dissertations can be enriched by data derived from computations and modeling. Computer programs can be linked to dissertations to provide active contents, further enriching the importance of dissertations.
A STRUCTURE BASED DRUG DESIGN DEGREE COURSE DELIVERED ON THE INTERNET.
Catherine Burt, Peter Murray-Rust, Tim Brailsford, Colin Edge, Christine Richardson, John Overington, Pfizer Central Research, Sandwich, Kent, CT13 9NJ, UK
The paradigm of Structure-Based Drug Design (SBDD) holds the potential to increase the efficiency of the drug discovery and development process. Unfortunately, the rapid development and highly multidisciplinary nature of the process have led to a situation where many staff in the pharmaceutical industry are ill prepared to exploit these techniques to the full. The Internet and the technologies of the World Wide Web (WWW) have revolutionised the way in which information may be communicated and represent an ideal platform to deliver courses to "retool" working scientists in new disciplines. The Virtual School of Molecular Sciences (VSMS) at the University of Nottingham is at the forefront in establishing virtual collaborative courses and has launched an Internet Structure-Based Drug Design Course. An overview of the course and issues related to the delivery of distance learning courses will be presented.
SHARING CHEMICAL INFORMATION OVER THE INTERNET.
Luc Patiny, ChemExper sprl, 36 clos de Profondval, 1490 Court-St-Etienne, Belgium
Creating a common database over the internet to which everybody can easily contribute is difficult because it requires a lot of management. In order to circumvent this limitation, we have developed a new web server and client program allowing research laboratories to submit chemical information (name, structure, bp, mp, NMR, etc.) automatically, which they will still be able to modify at a later date. This information is directly available to the rest of the scientific community. This database, containing the chemical information, can be queried from any web browser by substructure (requires JAVA), molecular formula, bp, mp, NMR shift, infra-red maxima, etc. During this presentation we will explain the concept behind this new server as well as show the way it works in practice (submitting and retrieving information). For more information, please see: http://www.chemexper.com/.
MODELING AND VISUALIZATION ON THE WEB.
Mathew Hahn, Molecular Simulations Inc., San Diego, CA 92121
The Web provides an ideal medium for allowing broader access to molecular modeling and molecular visualization. A key goal is to provide tools that non-computational scientists find easy to use. Ease of use, however, must be balanced with a level of sophistication such that the tools and the information they provide are not perceived as trivial. This talk describes various approaches towards building such balanced modeling and visualization environments, and describes our experiences with one such environment, WebLab.
WEBTABLES: COMPUTING MOLECULAR PROPERTIES ON THE WEB.
TJ O'Donnell, John Blankley, Christine Humblet, Parke-Davis, 2800 Plymouth Road, Ann Arbor, MI 48105
We have developed a web-based interface to assist in the task of computing and tabulating molecular properties for large sets of compounds. Data are manipulated in the form of a table with each row representing one molecule and each column representing one type of molecular property. The table is presented as an HTML page with GIF depictions of selected structures and text display of selected data columns. The major function of WebTables is to compute new columns of data. All computations are carried out on a web server Unix machine with the results returned using standard CGI methods. Data sets containing as many as 10,000 compounds are accommodated, although sizes nearer 1000 are more commonly used. The table may be output as a tab-delimited file. Data are input from a variety of input file types. We will show typical web page output for a sample session, list all available molecular property computations and discuss the underlying CGI scripting and the use of the Daylight toolkit in WebTables.
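The table model described above (one row per molecule, one column per property, new columns computed on demand, tab-delimited output) can be sketched as follows. This is a minimal illustration, not the WebTables code: the function names and the derived column are hypothetical, and a real system would call a chemistry toolkit rather than a simple arithmetic function.

```python
import csv
import io


def add_column(rows, name, func):
    """Append a computed column to every row (each row is a dict for one molecule)."""
    for row in rows:
        row[name] = func(row)
    return rows


def to_tab_delimited(rows):
    """Serialize the rows as tab-delimited text with a header line."""
    buf = io.StringIO()
    writer = csv.DictWriter(
        buf, fieldnames=list(rows[0].keys()), delimiter="\t", lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()


# Two small molecules; the derived column is a stand-in for a real property computation.
table = [
    {"id": "ethanol", "mw": 46.07, "heavy_atoms": 3},
    {"id": "benzene", "mw": 78.11, "heavy_atoms": 6},
]
add_column(table, "mw_per_heavy_atom", lambda r: round(r["mw"] / r["heavy_atoms"], 2))
print(to_tab_delimited(table))
```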
CHEMMART: ONE-STOP WEB SHOPPING OF STRUCTURAL DATABASES.
TJ O'Donnell, Tom Doman, G. D. Searle, 4901 Searle Parkway, Skokie, IL 60077
We have developed a web-based application to assist in the searching of structural databases. It allows for exact lookups, sub-structural, neighbor and SMARTS searches. In addition to searching, ChemMart can perform computations on input structures, such as clogP, "Pfizer Rule of Five" estimates, Concord 3-D coordinate generation, and polar surface area computation. Input and output are managed on a Web page using standard HTML. Structures may be sketched in, read from input files, or located by compound id number in the databases. Output may be saved in local files on the web-client computer. All searches and computations are carried out on a web server Unix machine with the results returned using standard CGI methods. We will show typical web page output for a sample session, discuss searching strategies and molecular property computations and explain the underlying CGI scripting. We will discuss the use of the ChemDraw plug-in, the Daylight toolkit, and the interface to helper applications in ChemMart.
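As an illustration of the "Rule of Five" estimate mentioned above, the sketch below applies the commonly cited thresholds (molecular weight over 500, logP over 5, more than 5 hydrogen-bond donors, more than 10 hydrogen-bond acceptors) to descriptor values supplied by the caller. It is not ChemMart's code: the function name is hypothetical, and computing the descriptors themselves (for example clogP) is assumed to be done elsewhere.

```python
def rule_of_five_violations(mol_weight, logp, h_donors, h_acceptors):
    """Return a list describing which Rule of Five thresholds are exceeded.

    All four descriptor values are assumed to be precomputed by the caller.
    """
    violations = []
    if mol_weight > 500:
        violations.append("molecular weight > 500")
    if logp > 5:
        violations.append("logP > 5")
    if h_donors > 5:
        violations.append("H-bond donors > 5")
    if h_acceptors > 10:
        violations.append("H-bond acceptors > 10")
    return violations


# A compound is usually flagged when it breaks more than one threshold.
flags = rule_of_five_violations(mol_weight=650.0, logp=6.2, h_donors=2, h_acceptors=5)
print(len(flags) > 1, flags)
```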
WEB-BASED INTEGRATION OF COMPUTATIONAL EXPERIMENTS WITH BIOLOGICAL AND CHEMICAL INFORMATION.
Ajay Shah, David Huhta, Hong Li, Molecular Simulations, Inc., 9685 Scranton Rd, San Diego, CA 92121
Abstract not available.
EXTENDING CHIMERA FOR COLLABORATIVE MOLECULAR VISUALIZATION.
T.E. Ferrin, C. Huang, T.E. Klein
Chimera is a new, highly extensible computer application for visualizing and interactively manipulating molecular structures that is under development in the UCSF Computer Graphics Laboratory. Chimera provides for the addition of new functionality through use of a high-level programming language called Python. We have used this approach to add a prototype "collaboratory" module to Chimera for carrying out interactive three-dimensional studies of molecular structures among collaborating scientists at distant locations. With our collaboratory, multiple scientists can interactively manipulate images of shared, complex three-dimensional molecular models and interact with one another much as they would in a traditional "face-to-face" collaboration. When fully implemented, our collaboratory will not only facilitate collaborative research projects, but will provide the capability to establish interactive training sessions on molecular modeling techniques.
DOES THE WEB IMPACT COMPUTER-ASSISTED DRUG DESIGN?
S. Krystek, W. Langton
At Bristol-Myers Squibb, Web-based tools have been successfully integrated into the process for both structure- and ligand-based drug design. Web-based tools are being utilized in distributed environments for virtual screening, lead generation processing, compound selection, as well as for data management and the transfer of information among team members. A practical example will be presented detailing the use of Web-based tools in day-to-day activities of computer-assisted drug design.