#221 - Abstracts

221st National Meeting, San Diego, CA April 1-5, 2001 Robert Snyder, Program Chair


Hyatt Regency, Regency C

Web-Based Information Sources I
R. W. Snyder and J. C. Holt, Organizers
8:30 1 Keynote Address: Chemical information and the Web: Past, present, and future.
Stephen R. Heller, MDL Information Systems, 2413 Lillian Drive, Silver Spring, MD 20902, Fax: 301-946-2704, steveh@mdli.com - SLIDES
In the 1970s timesharing was the rage, and providing chemical information to users connected via dial-up networks led to the development of DIALOG, ORBIT, BRS, Data-Star (Switzerland), JICST (Japan), DARC (France), STN (USA and Germany), ONLINE, Pergamon Online (UK), TDS, CIS, and others. Thirty years later most of these information systems are gone, replaced by a few large web-based sources of chemical information and literally hundreds of small web-based resources.

This lecture will update 5- and 8-year-old presentations of where we have been, where we are, and where we may be going with respect to chemical information.

9:00 2 Universal approach to Web-based chemistry using XML and CML.
Peter Murray-Rust1, Henry S. Rzepa2, and Michael Wright2. (1) School of Pharmaceutical Sciences, Nottingham University, University Park, Nottingham, United Kingdom, Peter@ursus.demon.co.uk, (2) Chemistry, Imperial College.
... and domain-specific information over networks. In many cases this information represents persistent information objects which can be re-used in other applications and other contexts. Such messages and objects are likely to become the medium of chemical information flow, replacing the incompatible "legacy file formats" currently used in most applications.

This talk will describe how XML documents, and CML (Chemical Markup Language) document fragments in particular, can be used as a primary means of representing information in computers.

The talk will be illustrated with many working examples, available at http://www.xml-cml.org/
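As a rough illustration of treating a CML fragment as a reusable information object, the sketch below reads a small molecule fragment with Python's standard XML parser. The element and attribute names (`molecule`, `atomArray`, `atom`, `elementType`, `bondArray`, `bond`, `atomRefs2`, `order`) follow conventions from later published CML schemas and are assumptions here; the fragment and the function name `summarize_molecule` are illustrative and are not taken from the talk.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Hypothetical CML-style fragment for water (illustrative only).
CML_FRAGMENT = """
<molecule id="water">
  <atomArray>
    <atom id="a1" elementType="O"/>
    <atom id="a2" elementType="H"/>
    <atom id="a3" elementType="H"/>
  </atomArray>
  <bondArray>
    <bond atomRefs2="a1 a2" order="1"/>
    <bond atomRefs2="a1 a3" order="1"/>
  </bondArray>
</molecule>
"""


def summarize_molecule(cml_text):
    """Return the molecule id, an element count, and the bonded atom-id pairs."""
    root = ET.fromstring(cml_text.strip())
    elements = Counter(atom.get("elementType") for atom in root.iter("atom"))
    bonds = [tuple(bond.get("atomRefs2").split()) for bond in root.iter("bond")]
    return {"id": root.get("id"), "elements": dict(elements), "bonds": bonds}


summary = summarize_molecule(CML_FRAGMENT)
print(summary)
```

Because the fragment is ordinary XML, the same text could be embedded in a larger document or passed between applications without a format-specific reader.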

9:30 3 ChemGuide and PublishersGuide: A novel portal concept.
René Deplanque, Jost T. Bohlen, and Michael Langner, FIZ CHEMIE BERLIN, Franklinstrasse 11, D-10587 Berlin, Germany, Fax: 49 30 39977133, deplanque@fiz-chemie.de
The ChemGuide and PublishersGuide Internet search engines represent a new portal design offering a unique combination of well-defined and comprehensibly pre-selected Web sites with an easy-to-use database interface featuring sophisticated retrieval functions. Web robots programmed in Perl automatically collect, store, and index all web pages from relevant servers. In addition, they identify further potentially interesting servers and update the database content regularly. Click-access to narrower areas of interest can easily be created by implementing standard queries leading to the appropriate pages. The dynamic concept minimizes manual maintenance requirements and is applicable to all subjects. Various kinds of databases can easily be included in the search engine. The software package can be used to create topic-specific portals, to organize a company's large Inter- or Intranet offering, or to integrate both.
10:00 4 From custom R&D web implementations to fully operational e-commerce sites: Technology and examples.
Sheila Ash, and Judith Bandy, Oxford Molecular Ltd, The Medawar Centre, Oxford Science Park, OX10 9NL Sandford-on-Thames, OXON, United Kingdom, Fax: +44 1865 784602, sash@oxmol.co.uk - SLIDES
Ease of access to data has a direct and profound effect on research productivity. Different groups of users need specific views of particular data-types to support decision-making and a web-based system enables efficient access to necessary information. This session will outline the use of an easy, yet powerful way to develop and deploy a variety of intranet applications for dealing with chemical and biological data stored in Oracle. Its scope will be illustrated with working examples, ranging from custom implementations to manage chemistry and screening in pharmaceutical organisations to fully-operational e-commerce sites.
10:30 5 Traditional online services vs. the web: Do you get what you pay for?
Rebecca A. Wolff, and Eileen Shanbrom, Product Marketing Management, Chemical Abstracts Service, 2540 Olentangy River Road, Columbus, OH 43202-1505, Fax: 614-461-7149, rwolff@cas.org - SLIDES
This presentation will focus on the differences between using free search engines (such as Google, AltaVista and Yahoo) on the WWW versus professionally designed products such as STN Easy or STN on the Web, which provide access to databases built by scientists. The features that create the product experience will be explored and contrasted: the GUI, indexing, database content, and the power of the search engine. Further, the role of customer support and service will be explored. How important is this component? Do people who are searching the Web expect customer service? If they do, should it be available on the Web or via more traditional means such as the telephone? What technologies might impact customer support in the future? Recent feedback sheds some light on what online users want and value in research information services.
11:00 6 NIST computational chemistry comparison and benchmark database.
Russell D. Johnson III, Computational Chemistry Group, National Institute of Standards and Technology, 100 Bureau Drive Stop 8380, Gaithersburg, MD 20899, Fax: 301-869-4020, russell.johnson@nist.gov -  SLIDES
The CCCBDB (http://srdata.nist.gov/cccbdb) is a collection of experimental and theoretical thermochemical properties for 580 neutral gas-phase species. The goal of the database/website is to provide a benchmark set of molecules and reactions for the evaluation of ab initio computational methods and to allow comparison between different ab initio computational methods and experiment for the prediction of thermochemical properties. Users can evaluate the accuracy of ab initio methods applied to thermochemistry by using the data at the site. The experimental and computational data are available (enthalpies of formation in kJ/mol, computed energies in hartrees, lists of vibrational frequencies, geometries) and can be used in comparisons. For example, the enthalpy of a user-specified reaction can be displayed for experiment and different levels of theory.
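The reaction-enthalpy comparison described above rests on simple bookkeeping: the enthalpy of reaction is the stoichiometry-weighted sum of the enthalpies of formation of the products minus that of the reactants. The sketch below illustrates this under stated assumptions: the formation enthalpies are approximate textbook values at 298 K, not data retrieved from the CCCBDB, and the function names are hypothetical. A helper for converting computed energies from hartrees to kJ/mol is included because the abstract mentions both units.

```python
HARTREE_TO_KJ_PER_MOL = 2625.4996  # approximate conversion factor

# Approximate textbook gas-phase enthalpies of formation at 298 K, in kJ/mol.
# Assumed values for illustration; they were not taken from the CCCBDB.
FORMATION_ENTHALPY = {"CH4": -74.6, "O2": 0.0, "CO2": -393.5, "H2O": -241.8}


def weighted_sum(side, table):
    """Sum of coefficient * enthalpy of formation for one side of a reaction."""
    return sum(coeff * table[species] for species, coeff in side.items())


def reaction_enthalpy(reactants, products, table):
    """Products minus reactants, each weighted by its stoichiometric coefficient."""
    return weighted_sum(products, table) - weighted_sum(reactants, table)


def hartree_to_kj_per_mol(energy_hartree):
    """Convert a computed energy (or energy difference) from hartrees to kJ/mol."""
    return energy_hartree * HARTREE_TO_KJ_PER_MOL


# CH4 + 2 O2 -> CO2 + 2 H2O (all gas phase)
dh = reaction_enthalpy({"CH4": 1, "O2": 2}, {"CO2": 1, "H2O": 2}, FORMATION_ENTHALPY)
print(f"{dh:.1f} kJ/mol")
```

The same `reaction_enthalpy` function could be applied to a table of computed values (after conversion to kJ/mol and any thermal corrections) to compare a level of theory against experiment.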


Section A
Hyatt Regency, Regency C

Electronic Chemistry Publishing
S. Lin and P. F. Rusch, Organizers
1:30 7 Chemistry preprint server: A revolution in chemistry communication.
James R. Weeks, and Bryan A. Vickery, ChemWeb.com, 84 Theobalds Road, Holborn, London WC1X 8RR, United Kingdom, Fax: + 44 20 7611 4301, james.weeks@chemweb.com
The Chemistry Preprint Server (CPS, http://preprint.chemweb.com/) is a major new initiative for the chemistry community, hosted by ChemWeb.com. It is a freely available and permanent web archive and a distribution medium for research articles in the chemical field. The CPS was developed by closely following the Los Alamos archives, which cover physics and related disciplines. Submission to the CPS is free and open to all, and can include fully prepared articles or works in progress. This paper will review this ChemWeb.com experiment in effective scientific communication, and focus on how the CPS was developed, how it can be utilised by scientists, the response from the chemical community and the scope for future development.
2:00 8 Electronic journals from the Royal Society of Chemistry.
Jamie S. Humphrey, and Robert Parker, Royal Society of Chemistry, Thomas Graham House, Science Park, Milton Road, Cambridge UK CB4 0WF, United Kingdom, Fax: 44 1223 420 247, humphreyj@rsc.org
Three electronic journals (CrystEngComm, Geochemical Transactions and PhysChemComm, http://www.rsc.org/is/journals/current/e-only/e-only.htm ) from the Royal Society of Chemistry have been launched in recent years to provide a venue for chemists to publish material that cannot be published in print. The editorial and technical experience that we have gained will be discussed.
2:30 9 Electronic chemistry conferences: Seven years of CONFCHEM.
Scott E. Van Bramer, Department of Chemistry, Widener University, One University Place, Chester, PA 19013, Fax: 610-499-4496, svanbram@science.widener.edu, Donald Rosenthal, Department of Chemistry, Clarkson University, and Brian Tissue, Department of Chemistry, VPI & State University -  SLIDES
CONFCHEM, the online Conferences on Chemistry organized by the ACS Division of Chemical Education's Committee on Computers in Chemical Education (CCCE), provides an opportunity for the online presentation of papers with extensive discussion by the conference participants. The emphasis of these conferences is on chemical education. Fifteen conferences have been offered and several new conferences run each year. The online presentation of the papers has changed considerably since the first conference in 1993, when participants used gopher to download papers and images. This talk will outline the development of CONFCHEM, the organization and structure of the conferences, the technical challenges, and the benefits of online conferences. Information and archives of current, upcoming, and past conferences are available at http://www.ched-ccce.org/confchem/ .
3:00 10 Electronic chemistry publishing: A librarian's perspective.
Dana L. Roth, Millikan Library 1-32, Caltech, 1200 E. California Blvd., Pasadena, CA 91125, Fax: 626-792-7540, dzrlib@library.caltech.edu, and Kimberly Douglas, Sherman Fairchild Library 1-43, Caltech
Provision of electronic chemistry journals in an academic setting will be described. The talk will include a historical review, discussion of licensing problems, an analysis of user acceptance and future projections. Caltech's Online Journals list can be browsed or searched on the Library's web page: http://library.caltech.edu/


Integrated web publications for crystallography.
Brian McMahon, International Union of Crystallography, 5 Abbey Square, Chester CH1 2HU, United Kingdom, Fax: 44(1244)314888, bm@iucr.org
The International Union of Crystallography (IUCr) is an international scientific union responsible for the promotion of crystallography. Since 1948 it has produced its own journals, which have set standards, especially for the reporting of crystal structures. Its journals have been available via the Web (http://www.iucr.org/iucr-top/journalsonline.html ) since 1999, and its first online-only journal Structure Reports Online appeared in 2001. Papers received in electronic formats are edited and converted to SGML, from which hard copy, PDF and hyperlinked HTML versions are derived. The navigable HTML version is particularly rich in added value, including hyperlinks to figures and references; to supplementary documents (often in multimedia formats); to external publications via citations databases (including PubMed, PubSci and CrossRef); and to crystal structure databases. For structural publications, data sets are supplied by the author in the standard CIF format developed by the IUCr. This is the mandatory submissions format for two of the journals. Comprehensive checks are performed to ensure the integrity and self-consistency of the data. The data sets may also be downloaded, for visualisation, modelling, or seeding solution searches for related compounds. Such checking is also provided by the IUCr as a service to other journals, including ACS titles.
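To give a flavor of what a self-consistency check on a CIF data set can look like, the sketch below parses single-line data items from a small fragment and compares the reported unit-cell volume with the volume computed from the cell lengths and angles. The data names come from the core CIF dictionary, but the values, the tolerance, and the function names are hypothetical, and the check is far simpler than the comprehensive checks the abstract refers to.

```python
import math
import re

# Hypothetical CIF fragment; values are invented for illustration.
CIF_TEXT = """
data_example
_cell_length_a     5.000(1)
_cell_length_b     6.000(1)
_cell_length_c     7.000(1)
_cell_angle_alpha  90.00
_cell_angle_beta   90.00
_cell_angle_gamma  90.00
_cell_volume       210.0(1)
"""


def parse_simple_items(cif_text):
    """Collect single-line '_name value' items; loops and text fields are ignored."""
    items = {}
    for line in cif_text.splitlines():
        parts = line.split(None, 1)
        if len(parts) == 2 and parts[0].startswith("_"):
            items[parts[0]] = parts[1].strip()
    return items


def numeric(value):
    """Drop a trailing standard uncertainty such as '(1)' and convert to float."""
    return float(re.sub(r"\(\d+\)$", "", value))


def cell_volume(items):
    """General triclinic cell volume from three lengths and three angles."""
    a, b, c = (numeric(items[f"_cell_length_{x}"]) for x in "abc")
    ca, cb, cg = (
        math.cos(math.radians(numeric(items[f"_cell_angle_{x}"])))
        for x in ("alpha", "beta", "gamma")
    )
    return a * b * c * math.sqrt(1 - ca**2 - cb**2 - cg**2 + 2 * ca * cb * cg)


def volume_is_consistent(items, tolerance=0.5):
    """True if the reported volume matches the computed one within the tolerance."""
    return abs(cell_volume(items) - numeric(items["_cell_volume"])) <= tolerance


items = parse_simple_items(CIF_TEXT)
print(volume_is_consistent(items))
```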


Section B
Hyatt Regency, Regency D

Web-Based Information Sources II
R. W. Snyder and J. C. Holt, Organizers


SPARK: A tool for discovering structure-property-activity relationship knowledge.
Yvonne Martin, Jerry Delazzer, and Elizabeth Danaher, D-47E, AP10/2, Abbott Laboratories, 100 Abbott Park Road, Abbott Park, IL 60065-6100, Fax: 847-937-2625, yvonne.c.martin@abbott.com
A frustrating problem in the design of biologically active compounds is the lack of integration of the software useful for various aspects of the task. We have developed SPARK to solve this problem. To the user, SPARK looks like a single application. It searches databases or reads files to identify structures; retrieves biological data from databases or files; performs similarity and substructure searches on collections of compounds; calculates and looks up physical properties of molecules; provides plotting and statistical analysis capabilities; and exports HTML reports and files for use by other programs. Under the covers, SPARK is a distributed object application built using JavaBeans and RMI. This has allowed us to combine contributions from NetGenics, Daylight, KLGroup, and Statistical Graphics Corporation as well as a host of UNIX and web-server programs from a great variety of vendors. We just wrap and integrate!
1:30 13 Text and (con)text in molecular visualization.
Francis T. Marchese, Computer Science Department, Pace University, 1 Pace Plaza, New York, NY 10038, Fax: 212-346-1863, fmarchese@pace.edu -  SLIDES
Molecular visualization software and the WWW have transformed the way chemical information is derived, organized, accessed, and represented. In contrast to traditional print media, where images are subordinate to precisely articulated text, dynamic, colorful web-based imagery has assumed an equal and complementary role. Increased reliance on pictures to convey information and its meaning raises significant questions about the role of chemical illustration, its underlying text, and how it is perceived and understood.

The purpose of this presentation is to analyze the uses of illustration and iconography in the display of molecular information. Topics to be covered include: the problem of imagery, the problem of context, uses of imagery, uses of iconography, scientific image content, the relationship between text and image, and the ways text and image are linked or bound.

This presentation will draw upon examples from Renaissance and contemporary art, poetry, chemical iconography, molecular graphics, and chemical web pages.

2:00 14 Dymond linking: Point-and-click structure and reaction searching.
Alexander Lawson, MDL Frankfurt, Theodor-Heuss Allee 108, D-60486 Frankfurt, Germany, ALawson@mdli.com, and Christopher Leonard, IT & Business Development, Elsevier Science, Sara Burgerhartstraat 25, 1055 KV Amsterdam, Netherlands, Fax: 00-31-20-485-2812, c.leonard@elsevier.nl -  SLIDES
In the electronic age, the concept of 'what is a journal' is becoming increasingly blurred with the concept of the database. Although database-driven querying of journal material is now routine, the complementary instance of journal-driven querying of databases has, until now, been rare and, where it has existed, limited to linking from citations to abstract databases. We present a new linking technology which represents a paradigm shift in terms of chemical information provision. Chemical objects in online journals (such as bold numerals, citations and graphical representations of structures and reactions) are activated in such a way that when they are clicked, a structured-search query is passed to a third-party database. From the database, the user can explore that particular structure or reaction further, going forwards and backwards in time and across journal titles from all publishers indexed in the database. There is no limit to the number of databases to which this linking mechanism can be applied. In addition, it is also possible to predict properties (physical and chemical) of structures which appear in such online journals. Thus the user has a true 'point and click' facility to investigate interesting chemical information in online journals.


MapMaker: A tool for product-based library optimization.
John I. Manchester1, Ryszard A. Czerminski1, Jean Patterson2, and David S. Hartsough1. (1) Computational Design & Informatics, ArQule, Inc, 19 Presidential Way, Woburn, MA 08101, jmanchester@arqule.com, (2) Process Chemistry, ArQule, Inc
MapMaker is a web-based library optimization tool now integrated into Mapping Array production at ArQule. It uses a genetic algorithm to select, based on computed properties of the full enumerated library, optimal subsets from candidate reagent lists that give rise to libraries that are internally diverse, unique with respect to existing compounds in the ArQule repository, and exhibit acceptable physicochemical and predicted ADME profiles. Optimized libraries roughly correspond to space-filling designs in the “drug-like” chemical space accessible to the full virtual library, where space occupied by existing repository compounds is excluded from the design. Constrained reagent selection and multiple design runs are supported. Thus, factors such as synthetic feasibility, reagent availability and chemical intuition can be included in the design at any time prior to synthesis.
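The general idea of using a genetic algorithm to pick a diverse subset can be sketched with a toy example. This is not MapMaker's algorithm: it assumes a single hypothetical numeric property per candidate, scores a subset by its smallest pairwise distance (a crude space-filling criterion), and ignores repository exclusion and ADME filters. All names and values are illustrative.

```python
import random
from itertools import combinations


def diversity(subset, points):
    """Fitness: smallest pairwise distance in the subset (larger is more spread out)."""
    return min(abs(points[i] - points[j]) for i, j in combinations(subset, 2))


def mutate(subset, n_candidates, rng):
    """Swap one selected candidate for one that is not currently selected."""
    chosen = set(subset)
    chosen.remove(rng.choice(sorted(chosen)))
    chosen.add(rng.choice([i for i in range(n_candidates) if i not in chosen]))
    return tuple(sorted(chosen))


def crossover(parent_a, parent_b, k, rng):
    """Draw k distinct candidates from the union of both parents."""
    pool = sorted(set(parent_a) | set(parent_b))
    return tuple(sorted(rng.sample(pool, k)))


def select_subset(points, k, generations=200, population_size=30, seed=0):
    """Evolve index subsets of size k (k >= 2) toward high diversity."""
    rng = random.Random(seed)
    n = len(points)
    population = [tuple(sorted(rng.sample(range(n), k))) for _ in range(population_size)]
    for _ in range(generations):
        population.sort(key=lambda s: diversity(s, points), reverse=True)
        survivors = population[: population_size // 2]  # keep the fitter half
        children = []
        while len(survivors) + len(children) < population_size:
            a, b = rng.sample(survivors, 2)
            children.append(mutate(crossover(a, b, k, rng), n, rng))
        population = survivors + children
    return max(population, key=lambda s: diversity(s, points))


# Hypothetical computed property values for ten candidate reagents.
REAGENT_PROPERTY = [0.1, 0.2, 0.25, 1.4, 1.5, 2.9, 3.0, 3.1, 4.4, 4.5]
best = select_subset(REAGENT_PROPERTY, k=4)
print(best, diversity(best, REAGENT_PROPERTY))
```

Keeping the fitter half of each generation unchanged means the best subset found never gets worse from one generation to the next; a production design tool would replace the one-dimensional distance with a multi-property fitness function.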
3:00 16 Web tools for library design.
Mary Bradley, Scynexis Chemistry & Automation, Inc, P.O. Box 12878, Research Triangle Park, NC 27709
A web-based system for library design is presented. An interactive stepwise guide to the design of lead generation libraries allows synthetic chemists access to expert tools without the need for extensive expertise in their use. Tripos Unity, ChemEnlighten, Selector, and Sybyl executables are used to compute Drugability, ADME, and Toxicity filters, as well as scoring functions for library diversity and reagent suitability. The optimal configuration for synthesis of matrix libraries with desired properties is provided to the chemists in graphical format.
3:30 17 Web-based tools for compound selection, library design, and compound acquisition.
Andrew R. Leach, Computational Chemistry and Informatics, Glaxo Wellcome Research and Development, Medicines Research Centre, Gunnels Wood Road, Stevenage SG1 2NY, United Kingdom, Fax: 44-1438-764918, arl22958@glaxowellcome.co.uk -  SLIDES
The web offers many advantages for delivering chemoinformatics and computational chemistry tools to end-users:

1. No training may be required, as all potential end-users are familiar with the use of a web browser.
2. Calculations can be "packaged" into logical sequences, providing a natural navigational pathway through a complex task.
3. "Expert" tools can be provided to casual or non-expert users.
4. Much of the underlying software infrastructure (browsers and web servers) is readily available; releasing new versions of the software is a trivial task.

We will describe some of our recent developments in the area of web-based computational tools, which are primarily designed to provide bench scientists access to the emerging methods in library design, compound selection and structure-based design. We also discuss how the analysis of data automatically collected by the web server can provide information about the current procedures and highlight areas for future development.


Hyatt Regency, Regency C

Electronic Chemistry Publishing
S. Lin and P. F. Rusch, Organizers
9:00 18 Future of electronic chemistry publication.
Steven M. Bachrach, Department of Chemistry, Trinity University, 715 Stadium Drive, San Antonio, TX 78212, Fax: 210-999-7569, sbachrach@trinity.edu
This talk will present a speculation on the future of electronic publication in chemistry. What are the barriers to its success? What developments are necessary to surmount these barriers? Technological and cultural/societal issues will be analyzed.
9:30 19 Science of synthesis: Transformation of a classical tertiary reference work for synthetic chemistry into a comprehensive electronic source of evaluated information.
Guido F. Herrmann, Science of Synthesis, Houben-Weyl, Georg Thieme Verlag Stuttgart, New York, Ruedigerstrasse 14, Stuttgart 70469, Germany, Fax: +49 711 8931-777, Guido.Herrmann@thieme.de
The talk describes the development of an integrated electronic system for the generation of chemical information, the evaluation procedures and the user software of Science of Synthesis, Houben-Weyl Methods of Molecular Transformations. The series METHODEN DER ORGANISCHEN CHEMIE (Houben-Weyl Methods of Organic Chemistry) was established in 1909. The comprehensive description of preparative methods in a consistent style and their critical evaluation by leading experts is the philosophy on which Houben-Weyl was founded. Facing dramatic developments in chemistry during the last few decades which have provided chemists with a wealth of new reagents and reactions, the need for a new, comprehensive, and critical treatment of synthetic chemistry has become apparent. This new edition is entitled Science of Synthesis, Houben-Weyl Methods of Molecular Transformations and is edited by D. Bellus (Switzerland), S. V. Ley (UK), R. Noyori (Japan), M. Regitz (Germany), P. J. Reider (USA), E. Schaumann (Germany), I. Shinkai (Japan), E. J. Thomas (UK), and B. M. Trost (USA). Science of Synthesis will benefit from more than 90 years of experience and will continue the tradition of excellence in publishing organic chemistry reference works. It will offer a truly comprehensive, critical treatment of synthetic organic and organometallic chemistry. Science of Synthesis will cover the whole field of organic chemistry based on all published and readily available sources from the early 1800s until the year of publication. Authors will provide chemists with the most reliable methods to solve their synthesis problems. For each method a detailed experimental procedure will be included. To best meet the needs of the scientific community, Science of Synthesis will be published as an electronic version and also in print (http://www.science-of-synthesis.com/). The electronic version is being developed under the guidance of an advisory board comprising A. Barth (Germany), G. Baysinger (USA), A. Mullen (Germany), H. Rzepa (UK), and E. Zass (Switzerland).
10:00 20 Stanford University’s HighWire Press: Continuing to raise the bar in electronic journal publishing.
Vicky Reich, HighWire Press - Stanford University Libraries, 1454 Page Mill Road, Palo Alto, CA 94304-6004, Fax: 650-725-6553, vreich@stanford.edu, and Grace A. Baysinger, Swain Library of Chemistry & Chemical Engineering, Stanford University, Organic Chemistry Building, 364 Lomita Drive, Stanford, CA 94305-5080, Fax: 650-725-2274, graceb@stanford.edu
Stanford University's non-profit HighWire (HW) Press (http://highwire.stanford.edu/ ) ensures that its partners - scientific societies and responsible publishers - excel in the use of web-based technologies for scientific communication. The journals focus on scientific and medical research and are among the highest-impact in their fields. HW provides an interactive dimension to the information in the printed journals by providing: links among authors, articles and citations; advanced searching capabilities; high-resolution images and multimedia - across a wide community of scholarly publications. Many offer free back issues online. It's easy to demonstrate that the online version is the canonical version of the research record. Services, additional content, and content that's published online well before the print issue date drive readers to use the online version. For many titles, the print edition no longer provides even an archive for peer-reviewed research. This presentation will review these issues and discuss the approach HighWire Press is taking through research, experimentation and innovation.
10:30 21 Supplementing full text journals with factual databases.
John Rumble Jr., Angela Y. Lee, Dorothy Blakeslee, and Shari Young, Standard Reference Data, NIST, 100 Bureau Drive MS 2310, Gaithersburg, MD 20899-2310, Fax: 301-926-0416, john.rumble@nist.gov
The Journal of Physical and Chemical Reference Data, published by the National Institute of Standards and Technology and the American Institute of Physics, is a publication with a large number of tables and graphs in its articles, often hundreds of pages long. A database version of this journal is considerably different from a full text online version. The Journal contains a diversity of data that require a multi-faceted numerical database. The goal is to build a web-based database, with the most modern scientific database technology, tailored to scientists with features such as searches across articles and advanced search and display capability. Journal subscribers will then have access to three different versions: the print Journal, a full-text online Journal (now available at http://ojps.aip.org/jpcrd/ ) and an electronic numerical database, with appropriate links established and maintained between the two online products. This paper describes aspects of this unique linkage of modern electronic publications.
11:00 22 Electronic chemistry publication in China.
Wenyuan Zhao1, Ruiya Xu2, and Shu-Kun Lin1. (1) Department of Chemistry, Qingdao University, HongKong East Road 7, Qingdao, Shandong 266071, China, Fax: 86-532-5877687, wyzhao@qdu.edu.cn, (2) Chinese Academy of Sciences, Beijing Institute of Chemistry
Electronic publications have been particularly beneficial to chemists in developing countries. During the past 5 years, thanks to the large-scale, nationwide installation of optical fiber networks, internet services have improved tremendously in China. We review the online journals and online-accessible chemical databases in China. Almost all chemistry journals in China now have an online edition. Currently, online access to the full texts of almost all these online editions is free. There are only two online-only chemistry journals in China. Chemistry Online, the online edition of Huaxue Tongbao, was launched in 1997. Since January 1999, Chemistry Online has published review and research papers in Chinese in HTML form at http://china.chemistrymag.org/ as a separate journal. Chemical Journal on Internet (http://chemistrymag.org/ ), launched in 1999, publishes monthly online issues in English. Both of these monthly online-only journals are free of charge to readers. The advantages of such online journals in China will be discussed.


Hyatt Regency, Regency C

Electronic Chemistry Publishing
S. Lin and P. F. Rusch, Organizers
1:30 23 Electronic conference on synthetic organic chemistry.
Shu-Kun Lin, Andrey Gutnov, Sen Lin, and Carl David Nager, MDPI Center, Molecular Diversity Preservation International, Sanegergasse 25, Basel CH-4054, Switzerland, Fax: 004161 302 8918, lin@mdpi.org
ECSOC-4, the 4th yearly Electronic Conference on Synthetic Organic Chemistry, was held from September 1 to September 30, 2000. About 150 contributions from all over the world are available at the conference website http://www.mdpi.org/ecsoc/ . Full-text search of all papers and a bulletin board for discussion were provided. Participation has been free for visitors and for chemists who presented papers. ECSOC has been supported financially by several pharmaceutical and chemical companies. The ECSOC experience shows that experimental chemists are also very interested in online conferences. Plans for further improvement will be discussed.
2:00 24 Hunting for free chemical journal articles on the World Wide Web.
Song Yu, Libraries, Purdue University, Mellon Library of Chemistry, 1538 Wetherill, West Lafayette, IN 47907-1538, Fax: 765-494-1579, syu@purdue.edu
In contrast to other fields in science and technology, it is very difficult to locate freely available journal articles in Chemistry from the World Wide Web. In most cases, libraries and institutions need to pay for access to online journal articles. On the other hand, people are more and more critical of the "free" stuff that they can get off the web. Information professionals have to check for the reliability, accuracy, and currency of the information before they present it to their users. However, finding free electronic chemical journal articles is not a "Mission Impossible". This talk will present some ways to find journal articles that are freely accessible from the web, and discuss why the information is free. A citation analysis is used on selected journal titles to test the impact of these free journals.
2:30 25 Ullmann's Encyclopedia of Industrial Chemistry: From print to electronic.
Anette Eckerle, Ullmann's Encyclopedia, Wiley-VCH, Germany, Pappelallee 3, Weinheim 69469, Germany, Fax: 0049-6201-606-500, ullmanns@wiley-vch.de
Ullmann's Encyclopedia of Industrial Chemistry (http://www.ullmanns.de/) is a reference work detailing all areas of industrial chemistry. After over 80 years of publishing in print, Ullmann's became available on a fully networkable CD-ROM in 1997, and ONLINE through the WWW in summer 2000. This transition has not only changed the ways Ullmann's can be used, but also had an impact on the editorial office, authors, and users. As Ullmann's is among the first and biggest scientific publications to go electronic, the editors had the chance to learn from the users in order to adapt the product to their needs. The talk will provide insight into how the decision-making process took place, and how the actual transfer was achieved. In addition, details from behind the scenes at the editorial office, about the contents, technical and access issues, license models, prices, and plans for the near future will be presented.
4:00 26 Division Business Meeting.
No Business Meeting in San Diego. Next Business Meeting scheduled for Chicago, August 2001.
4:30 27 Open Meeting: Committees on Publications and on Chemical Abstracts Service.
Robert W. Snyder, MDL Information Systems, Inc, 14600 Catalina Street, San Leandro, CA 94577, Fax: 510-483-4738, bobs@mdli.com


Convention Center, Sails Pavilion

Sci-Mix, 9:00 to 11:00
R. W. Snyder, Organizer
28 Artificial neural network model of MDCK cell permeability.
Boyd Steere1, Robert Fraczkiewicz1, Lori Takahashi2, Laurent Salphati2, and Michael B. Bolger3. (1) Life Sciences, Simulations Plus, Inc, 1220 W. Avenue J, Lancaster, CA 93534, Fax: 661-723-5524, boyd@simulations-plus.com, (2) Analytical Chemistry, Affymax Research Institute, (3) Pharmaceutical Sciences, USC School of Pharmacy and Simulations Plus, Inc.
Purpose: To develop an accurate method for ultra high-throughput estimation of Madin-Darby canine kidney (MDCK) epithelial cell apparent permeability (Papp) from molecular properties. Such models may be useful for providing rapid estimates of absorptive behaviour. Methods: Experimental Papp values were determined for 291 drugs using the cells-on-sheet (COS, Affymax Res. Inst., Santa Clara, CA) methodology for MDCK cells. A subset of molecular descriptors calculated by QMPRPlus TM (Simulations Plus, Inc., Lancaster, CA), were analysed for fitness and incorporated into an in silico model estimating log Papp using a genetic algorithm linked to artificial neural network training. Results: The resulting model was tested using an external test set of 56 compounds. The root mean square error for all data was 0.38 log units. Conclusion: An accurate method for estimation of epithelial cell permeability has been developed and implemented in an in silico ultra high-throughput method.
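The root mean square error quoted above is a standard summary of how far predictions fall from measurements. A minimal sketch of the calculation follows; the log Papp values are hypothetical and are not data from the study.

```python
import math


def rmse(predicted, observed):
    """Root mean square error between paired predicted and observed values."""
    if len(predicted) != len(observed) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(p - o) ** 2 for p, o in zip(predicted, observed)]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical log Papp values for four compounds (illustration only).
observed_log_papp = [-5.0, -4.5, -6.0, -5.5]
predicted_log_papp = [-5.5, -4.5, -5.5, -5.5]

error = rmse(predicted_log_papp, observed_log_papp)
print(f"RMSE = {error:.2f} log units")
```

Because the error is computed on log Papp, an RMSE of 0.38 log units corresponds to predictions that are typically within a factor of about 2.4 (10^0.38) of the measured permeability.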
29 Application of the electron-conformational method of pharmacophore identification and bioactivity prediction to group I metabotropic glutamate receptors.
Eran Rosines, Isaac B. Bersuker, and James E. Boggs, Department of Chemistry and Biochemistry, The University of Texas at Austin, Austin, TX 78712, eran@mail.utexas.edu
The electron-conformational (EC) method, developed earlier, has been applied to the problem of group I metabotropic glutamate receptor (mGluR1) agonists. A training set consisting of 12 active and 13 inactive compounds was used for the pharmacophore (Pha) identification and quantitative activity prediction. Using the EC method's electronic structure calculations, conformational analysis, and matrix processing, we found that the Pha of mGluR1 agonists is a four-point skeleton containing three oxygen atoms and one nitrogen atom at certain interatomic distances and with restricted atomic interaction indexes. The influence of the anti-Pha shielding and other auxiliary groups was parametrized and weighted by eight constants, their values being obtained from a least-squares regression. The results were tested with a validation set consisting of 4 compounds. The R² factor for the calculated versus experimental activities in the training set is >0.9, while the R² factor for the validation set is 0.82.
30 Bringing decision-making computational chemistry tools to the combinatorial chemist's desktop.
Manton R. Frierson III, Computational Chemistry and Informatics, Advanced SynTech, LLC, 9800 Bluegrass Parkway, Louisville, KY 40299, Fax: 561-258-5783, m.frierson@advsyntech.com
Combinatorial chemistry, as practiced in the pharmaceutical industry, has become much more selective in terms of the properties of compounds accepted for testing in biological screening programs. These properties may include calculation of the properties used in Lipinski's "Rule-of-Five", presence or absence of toxicophores and pharmacophores, and measures of diversity. Computational chemistry and informatics tools exist which can be used for the pre-evaluation of many of these properties for proposed libraries. However, the tools are most effectively applied very early in the design of such libraries. Inhibitions to the early use of the tools are many and include issues regarding ease of use and the speed and convenience with which the practicing combinatorial chemist can obtain results. One approach to these issues is to place access to these types of calculations on the chemist's desktop for use by the chemist in one convenient location: the web browser. This approach is described starting with the chemist's (virtual) enumeration of libraries up to the actual production of them.
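As an illustration of the kind of pre-evaluation mentioned above, the following minimal sketch counts Rule-of-Five violations from properties that are assumed to have been calculated already. The class, function names, and example values are hypothetical and are not part of the tools described in the abstract; the thresholds are the commonly cited ones (molecular weight 500, log P 5, 5 hydrogen-bond donors, 10 hydrogen-bond acceptors), and tolerating one violation is a common convention rather than a fixed rule.

```python
from dataclasses import dataclass


@dataclass
class Properties:
    """Precomputed properties for one proposed library member (hypothetical)."""
    mol_weight: float        # molecular weight in daltons
    log_p: float             # calculated octanol/water log P
    h_bond_donors: int
    h_bond_acceptors: int


def rule_of_five_violations(p):
    """Count how many of the four commonly cited thresholds are exceeded."""
    checks = [
        p.mol_weight > 500,
        p.log_p > 5,
        p.h_bond_donors > 5,
        p.h_bond_acceptors > 10,
    ]
    return sum(checks)


def passes_filter(p, max_violations=1):
    """Accept a compound with at most max_violations violations (assumed policy)."""
    return rule_of_five_violations(p) <= max_violations


# Made-up property values for two virtual products.
small = Properties(mol_weight=310.4, log_p=2.1, h_bond_donors=2, h_bond_acceptors=5)
large = Properties(mol_weight=650.0, log_p=6.1, h_bond_donors=2, h_bond_acceptors=8)
accepted = [p for p in (small, large) if passes_filter(p)]
```

A filter of this kind is cheap enough to run on every enumerated product before any synthesis is planned, which is the early-stage use the abstract argues for.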
31 CoMFA and CoMSIA 3-D QSAR studies of epidermal growth factor receptor tyrosine kinase inhibitors.
John K. Buolamwini, and Haregewein Assefa, Department of Pharmaceutical Sciences, University of Tennessee, 847 Monroe Avenue, Suite 327, Memphis, TN 38163, Fax: 901-448-6828, jbuolamwini@utmem.edu
The overexpression or mutation of the epidermal growth factor receptor (EGFR) tyrosine kinase has been implicated in a host of human cancers, and shown to affect proliferation, angiogenesis and metastasis. Consequently, it is under intense investigation as a novel anticancer drug design target. We have performed comparative molecular field analysis (CoMFA) and comparative molecular similarity analysis (CoMSIA) 3D QSAR studies on two series of EGFR kinase inhibitors comprising 105 and 51 anilinoquinazoline and anilinoquinoline derivatives, respectively. Predictive 3D QSAR models with q2 values up to 0.80 and R2 values up to 0.94 were obtained. Implications of electrostatic and steric PLS coefficient maps, as well as hydrophobic and hydrogen bond acceptor and donor coefficient maps, for EGFR tyrosine kinase inhibitor design will be discussed. A comparison of the performance of CoMFA to that of the more recent 3D QSAR method, CoMSIA, will also be presented.
32 Online journal Molecules: Five years experience.
Shu-Kun Lin, MDPI Center, Molecular Diversity Preservation International, Sanegergasse 25, Basel CH-4054, Switzerland, Fax: 004161 302 8918, lin@mdpi.org, and Derek J. McPhee, Uniroyal Chemical Co, 25 Erb St., Elmira, ON N3B 3A3, Canada, Fax: 519-669-1679, mcphee@mdpi.org
MDPI currently publishes 12 online issues of the chemistry journal Molecules (http://www.mdpi.org/molecules/ ) a year. The first volume was published in collaboration with Springer in 1996, but since 1997 MDPI has been solely responsible for all aspects of the project. Presently online access to all papers published by MDPI is free of charge. Molecules' editorial policy is to publish reviews, communications and full research papers in all aspects of synthetic and natural products chemistry, although other types of contributions such as conference proceedings and special issues dedicated to a particular topic are also published. Our aim is to encourage chemists to publish full synthetic procedures and characterization information where possible. MolBank (launched in April 1997) publishes one-compound-per-page experimental notes, which can be searched by substructure. In addition, text search functions have been available since 1999 (cf. Lin, S.-K.; Patiny, L. MolBank: First Fully Web-Based Publication of Chemical Reaction Data of Individual Molecules with Structure Search and Submission. Internet Journal of Chemistry, 2000, 3, 1. http://www.mdpi.org/lin/molbank/ ). Another special feature is the deposit and exchange of chemical samples described in the papers. The advantages and the possible problems of such online E-journals will be discussed.
33 Pharmacophore models for identification of environmental estrogens.
Qian Xie1, Weida Tong1, Hong Fang1, Huixiao Hong1, Roger Perkins1, and Daniel M. Sheehan2. (1) Computational Chemistry Group, R.O.W. Sciences Inc., National Center for Toxicological Research, Food and Drug Administration, 3900 NCTR Rd, MC-910, Jefferson, AR 72079, Fax: 870-543-7382, qxie@nctr.fda.gov, (2) Division of Genetic Toxicology, National Center for Toxicological Research, Food and Drug Administration
There are many environmental chemicals that have estrogenic activity and affect animal reproduction and human health. Under the legislation, some 80,000 existing chemicals and new chemicals will undergo various hormonal activity tests. It is very costly and time-consuming to screen this large number of compounds experimentally. We proposed an integrated computational screening approach, the Four-Phase System, to set priority for these chemicals. Pharmacophore models are one of the important components in the system. We developed a set of three-dimensional pharmacophores based on the crystal structures of the natural estrogen (17β-estradiol) and antagonistic estrogen (raloxifene) from the ER-ligand complexes. Each pharmacophore is composed of several key features that are important for a compound to exhibit estrogenic activity. These pharmacophores are complementary to one another in distinguishing active compounds from inactive compounds. Moreover, the number of pharmacophores that hit a compound is correlated with its activity. This group of pharmacophores was used to screen other data sets. The results show high efficiency and accuracy.


Section A
Hyatt Regency, Regency C

Technical Intelligence
T. Trippe, Organizer
8:30 34 New technologies supporting technical intelligence.
Anthony J. Trippe, Aurigin Systems Inc, 10710 N. Tantau Ave., Cupertino, CA 95014, tony@trippe.com -  SLIDES
Traditionally, Technical Intelligence has centered around the use of a network of technical experts within and outside of an organization. The members of these networks were often referred to as Gatekeepers and they would be responsible for monitoring new advances in an area where they had technical expertise and were responsible for making sure that this information was utilized and shared within their respective organizations.

Today, with the advent of more powerful, smaller computers and new software applications that utilize linguistic algorithms and advanced artificial intelligence routines, the work of a Gatekeeper network can be significantly enhanced by taking advantage of new software tools for Technical Intelligence.

This presentation will concentrate on new technologies that make it possible for thousands of open source or confidential internal documents to be analyzed and trends from these items visualized in a fraction of the time previously needed for this type of exercise.

9:15 35 Role of technical intelligence in research selection and assessment: A study of twenty chemical firms.
Mark E. Rerek, Skin Care R&D, International Specialty Products, 1361 Alps Road, Wayne, NJ 07470, Fax: 973-628-3401, mrerek@ispcorp.com -  SLIDES
Technical intelligence (TI) is providing a new set of tools to address the fundamental questions raised by R&D managers. Questions such as "where do I invest my research dollars?" and "what are the expected risks and returns from this investment?" are traditionally addressed by experts. TI provides an alternative view of these issues (Norling et al., 2000). A dialectical process can then be used to synthesize these two views.

Examples of science strategies in twenty large global chemical firms are used to illustrate these TI techniques. We will show where they are choosing to invest (both scientifically and geographically) and discuss the performance of that research against a global baseline of scientific performance. We will also show how this information can be used for competitive intelligence (CI).

10:00 36 Coaching, mentoring, consulting roles in technical intelligence.
Suzanne P. Cristina, Information Network at Hamilton Sundstrand, United Technologies, MS 1-3-BC52, One Hamilton Road, Windsor Locks, CT 06096, Fax: 860-654-3689, cristsp@hsd.utc.com -  SLIDES
The information professional can go beyond traditional expectations in providing technical intelligence to their organizations by coaching and mentoring groups/teams to support and integrate technical intelligence into their processes. An information savvy professional with a solid grounding in the technical knowledge of their organization ensures the reliability / integrity of external technical intelligence and facilitates the necessary integration of external technical information with internal group goals. This integration and consideration of external forces is critical in positioning the organization in the marketplace.
10:45 37 Market strategy using patent data analysis.
Mark D. Bauer, Online Sales, Derwent Information, 1725 Duke Street, Suite 250, Alexandria, VA 22314, Fax: 703-519-5838, mark.bauer@derwentus.com -  SLIDES
Patents are a unique resource of hard data. They describe more than simply the technology that defines a company. Patents contain data that answer several market strategy questions. Where is the technology or product used or sold? How long has it been in that country? How long has the competitor been doing business there? Understanding the marketplace for a given technology can give a sharp advantage to the competition. Commercial patent databases make simple work of questions like these by offering a consolidated resource of the data. A case study using such systems to define a competitor’s portfolio will be reviewed in this session.


Section B
Hyatt Regency, Regency D

Advances in 3D Searching and Pharmacophores: Applications
Cosponsored with Division of Computers in Chemistry, and Division of Medicinal Chemistry
O. F. Güner, Organizer
8:30 38 Development and application of pharmacophore model of BK openers and blockers.
Ki H. Kim, Department of Structural Biology, Abbott Laboratories, Abbott Park, IL 60064, Fax: 847-937-2625, ki.h.kim@abbott.com
Potassium channels have attracted considerable attention due to their potential application for the treatment of various diseases of cardiac and smooth muscles and of neural tissues. The two most interesting types of potassium channel are the ATP-sensitive potassium channel and the calcium-activated potassium channel. There are at least three distinct families of calcium-activated potassium channels: small potassium channels (SK), intermediate conductance potassium channels (IK), and big potassium channels (BK). This work will summarize the pharmacophore model of BK openers as well as BK blockers and its application in drug research.
9:00 39 Combiphores: Combining ligand-based and structure-based pharmacophores.
Renate Griffith1, John B. Bremner2, and Paul A. Keller2. (1) School of Biological and Chemical Sciences, University of Newcastle, Callaghan NSW 2308, Australia, Fax: 61 2 49 216 923, renate_griffith@uow.edu.au, (2) Department of Chemistry, University of Wollongong
Novel drug design methods are under development using α1-adrenergic receptors (ARs) and HIV-1 Reverse Transcriptase (RT) as target proteins with the aim of designing subtype-selective, as well as pathway-selective, AR ligands and mutation-resistant RT inhibitors.

Structure-based pharmacophores have been developed from models of receptor-ligand complexes (for ARs) and from RT crystal structures. These novel pharmacophores represent the features on the ligand which are involved in interactions with the target protein, as well as the areas in space around the ligand which are occupied by the protein.

Classical ligand-based pharmacophores have also been developed for RT inhibitors and for AR ligands.

From the protein-ligand complexes it is also possible to derive information about all interactions which ligands could potentially form with the binding site. These describe a "superligand" and the superligand coordinates will be combined with the classical and the structure-based pharmacophores to form "combiphores".

9:30 40 Pharmacophore modeling investigation of all-trans retinoic acid inhibitors.
Omoshile O. Clement, Molecular Simulations Inc, 9685 Scranton Road, San Diego, CA 92121, Fax: 858-458-0136, oclement@msi.com, and Vincent C.O. Njar, Department of Pharmacology and Experimental Therapeutics, University of Maryland
We report a 3D pharmacophore modeling study of all-trans retinoic acid (ATRA) inhibitors based on azole derivatives of retinoic acid. Using MSI’s proprietary rational drug design program (Catalyst), the study examines the binding features common to this class of compounds, their location in 3D space, as well as the geometries adopted by these ligands in their 3D alignment with the pharmacophoric features.
10:00 41 Pharmacophore-based molecular docking.
Diane Joseph-McCarthy, Bert E. Thomas III, and Juan C. Alvarez, Biological Chemistry, Wyeth Research, 87 CambridgePark Drive, Cambridge, MA 02140, Fax: 617-665-8993, djoseph@genetics.com
Accurate virtual screening of large three-dimensional molecular databases requires consideration of the conformational flexibility of the ligand molecules. A pharmacophore-based docking method whereby conformers of the same or different molecules are overlaid and simultaneously docked by their largest three-dimensional pharmacophore is discussed. In addition, the use of an automated script to generate chemically labeled site points with the program MCSS is presented. These target-derived theoretical pharmacophore points can be directly matched to the database pharmacophores. The effect of site point set size and composition on sampling is investigated.

10:30 42 Using molecular dynamics to explain and predict the differences in affinities of two series of radiometal-cyclized hormone analogs.
Alexander L. Perryman, Biomedical Sciences, University of California, San Diego, 9500 Gilman Drive, La Jolla, CA 92093-0365, Fax: 858-534-7042, aperryma@mccammon.ucsd.edu, and Thomas P. Quinn, Biochemistry, University of Missouri-Columbia
To understand the structural basis for the observed differences in affinities of three generations of cyclic alpha-Melanocyte Stimulating Hormone analogs for B16 F1 murine melanoma cells, molecular dynamics studies were performed using the SYBYL platform. The first generation alpha-MSH analog (ApoMSH) was disulfide-cyclized, while the second and third generations (ReMSH and ReCCMSH) were cyclized through their site-specific coordination with technetium or rhenium. The metal coordination sphere was parameterized using values from a published study of Tc(V)-oxo-containing crystal structures. The electronegativity values of rhenium and technetium were calculated and utilized in the Gasteiger-Marsili charge calculation algorithm. The observed differences in affinities of the alpha-MSH analogs were explained by analyzing the display pattern and flexibility of the pharmacophore. The same modeling procedure and analysis were then applied to the technetium-cyclized somatostatin analogs designed in this lab in an attempt to predict their relative affinities for the type II somatostatin receptor.
11:00 43 Multiple binding modes in CoMFA.
Nancy E. Lambert, Business Products and Services, Chevron, P. O. Box 1627, Bldg. 50-1214a, Richmond, CA 94802. Email: nela@chevron.com
Stefan Balaz, and Viera Lukacova, Department of Pharmaceutical Sciences, North Dakota State University, College of Pharmacy, Sudro Hall 108, Fargo, ND 58105, Fax: 701-231-7606, stefan_balaz@ndsu.nodak.edu.
11:30 44 Design and synthesis of a novel nonsteroidal anti-inflammatory drug of high selectivity to human cyclooxygenase-2 on the basis of QSAR studies.
Tarek Ma'amon El-Gogary1, Yousrey Sherif2, Gamal Dahab2, Mohaamed Attia2, and Mohamed Kabil2. (1) Chemistry, Mansoura University, Egypt, 34517 Domyat Al-Gideda, Domyat al-Gideda, Egypt, Fax: +2 057 403868, asmasomy@muum.mans.eun.eg, (2) Mansoura University
Quantum chemical Quantitative Structure Activity Relationships (QSAR) expressions for a set of 39 Nonsteroidal Anti-inflammatory Drugs (NSAIDs) towards human Cyclooxygenase-1 (COX I) and Cyclooxygenase-2 (COX II) have been developed on the basis of semi-empirical quantum mechanical calculations. Eighteen physical and structural parameters (descriptors) were computed at the level of the AM1 method. The data were subjected to a classical multiple regression analysis. Surface area, volume and log P were found to play the major role in the contribution to the activity towards COX I and COX II. This is consistent with the X-ray results, which state that the difference between COX I and COX II is mainly a conformational difference in which the area of the active site in COX II is larger than that in COX I. The activity of the studied set has been computed and compared with the experimental results. The efficiency of seventeen compounds has been tested on the basis of the developed QSAR equations. One novel drug has been designed and synthesized which shows better predicted efficiency than the well-known marketed drug, meloxicam.


Section A
Hyatt Regency, Regency C

Technical Intelligence
T. Trippe, Organizer
2:00 45 Application of benchmarking, gap, and technology analysis.
Mathias M. Coburn, The Fusfeld Group, Inc, 136 Beverly Drive, Kennett Square, PA 19348, Fax: 610-444-6700, ememcee@aol.com, Martha M. Matteo, Director, Technology Assessment, Boehringer Ingelheim Pharmaceuticals, Inc, and Daniel J. Greenwood, Principal Analyst - Technology Assessment, Boehringer Ingelheim Pharmaceuticals, Inc
Boehringer Ingelheim Pharmaceuticals, Inc. (BIPI) was seeking internal process improvement in a key and fast moving competency in the area of drug discovery. To achieve this, a study was initiated to examine the state of the art within the pharmaceutical industry, as revealed through public sources. The study consisted of initially benchmarking the current state and future trends of this competency within the pharmaceutical industry. Five technology gaps were identified. A strategic planning framework was used to prioritize these gaps. This counterintuitive approach resulted in identifying certain gaps as "urgent", in contrast to the intuitive conclusion that the company had reached prior to the exercise. The BIPI team was able to formulate prioritized recommendations in accordance with these findings. Despite cross-cultural differences around the company, the findings spurred discussion which led to action. In this case, the critical success factors included knowing the industry, the technologies and surrounding issues, as well as having a process for performing and displaying the analysis and its associated assumptions.
2:45 46 Do patent models reveal technological capabilities?
Judith Klavans, Center for Research on Information Access, Columbia University, 535 West 114th St, Room 511, New York, NY 10027, Fax: 212-854-9099, klavans@cs.columbia.edu -  SLIDES
Patent models may reveal the hidden technological capabilities and intentions of a firm. Different techniques are used: analysis of co-inventor, citation, IPC category, and linguistic patterns in a firm's patents. In this study, the outcomes from these techniques are compared to a set of development programs in a firm. The results suggest which analytical algorithms work best in this specific technological domain, and can serve as a standard for comparing results in other domains. The process is replicable by technical intelligence professionals in other firms.
3:30 47 Patent analysis as science and art: Why you should hire a consultant to help with your patent analysis.
Mary Ellen Mogee, Mogee Research & Analysis Associates, 11701 Bowman Green Drive, Reston, VA 20190, Fax: 703-478-3253, mogee@mogee.com, and Anthony Breitzman, CHI Research, Inc, 10 White Horse Pike, Haddon Heights, NJ 08035, Fax: 856-546-9633, abreitz@chiresearch.com -  SLIDES
The premise of this paper is simple. Patent analysis is not new. Consulting firms have been developing the conceptual bases and practice of patent analysis for over 20 years. In doing so they have developed a tremendous body of experience and expertise. Now that the value of patent analysis is being more widely recognized, companies would do well to take advantage of this experience and expertise. Computer software packages and systems are being offered commercially to help companies manage and organize their patent portfolios, many of them with analytical and visual capabilities. Use of such tools without a thorough understanding of the conceptual framework and practice of patent analysis, however, may lead to mistaken decisions. The paper will discuss the "scientific" aspects of patent analysis that were funded initially by the National Science Foundation, as well as the "art" of patent analysis in practice.
4:15 48 Identifying undervalued companies via patent analysis as a means of highlighting merger/acquisition targets.
Anthony F. Breitzman Sr., and Patrick Thomas, CHI Research, Inc, 10 White Horse Pike, Haddon Heights, NJ 08035, Fax: 856-546-9633, abreitz@chiresearch.com -  SLIDES
Mergers and acquisitions take place for a variety of reasons, including strategic growth, expansion into new markets, or because the target company is attractively priced and contains know-how the suitor would like to obtain. In this paper we discuss how patent analysis may be used for targeting acquisition candidates. We also describe how this technique may be used to locate attractive investment opportunities, identify companies with undervalued technology, or choose among several potential merger targets. As an example of this technique, we show how, on August 15, 2000, Acuson was found to be one of the most undervalued companies in the ultrasound industry based on quantitative, objective analysis of its patent portfolio. On September 27, 2000, Siemens announced its intention to purchase Acuson at a price 50% higher than its share price on August 15.


Section B
Hyatt Regency, Regency D

Advances in 3D Searching and Pharmacophores: Novel Approaches
O. F. Güner, Organizer
1:30 49 Identification of molecular reactive sites with an interactive volume rendering tool.
Preston J. MacDougall, Department of Chemistry, Middle Tennessee State University, 1301 E. Main St., Murfreesboro, TN 37132, pmacdougall@mtsu.edu, and Christopher E. Henze, Data Analysis Group, NASA Ames Research Center, MS T27A, Moffett Field, CA 94035, chenze@nas.nasa.gov
A visualization tool is presented that employs volume rendering techniques. By interactively exploring a molecule’s electronic charge density for characteristic topological features, one can identify a diverse menu of chemically reactive sites. Several features make this tool particularly well-suited for visualizing pharmacophores. To visualize reactive sites, we investigate the Laplacian of the electronic charge density that is either calculated with quantum chemical methods, or alternatively obtained from high-resolution X-ray diffraction. We will present computed results for several biomolecules that contain a variety of reactive sites: cisplatin, penamecillin and isomers of nitrogen bases from DNA. We will show that a single tool is useful in identifying inner-shell features (cisplatin), polar, nonpolar and aromatic regions (penamecillin), hydrogen bonding sites of varying strength (nitrogen bases and penamecillin), and other reactive sites (strained heterocycle in penamecillin).
2:00 50 Shape and feature based approach to virtual library and database screening.
Santosh Putta1, Christian Lemmen1, Jonathan Greene2, and Paul Beroza3. (1) Chemical and Physical Sciences R&D, Dupont Pharmaceuticals Research Labs, 150 California Street, Suite 1100, San Francisco, CA 94111, Fax: 415-732-7170, santosh.k.putta@dupontpharma.com, (2) Adaptive Silicon, (3) Not available
Both the shape of a bioactive molecule and the chemical features displayed by a ligand are important for biological activity. This paper presents a model-building strategy that incorporates both molecular shape and chemical feature presentation during the development of a model. This model is then refined with the use of biological activities. A database search using that model is shown to be useful for identification of novel lead structures. A diverse set of conformations is taken as input for all the molecules under consideration.

The model generation is split into three steps. In the first step, the shapes of ligand molecules are encoded by canonically embedding their conformations onto a 3D grid. Typically, a very large number of such shapes result from known active and inactive molecules. These are collected into a shape catalog. If no filtering is performed, the shape catalog can become quite large. Therefore, closely related shapes are eliminated from the shape catalog during catalog generation, resulting in a diverse selection of molecular shapes. In the second step, a set of known active and inactive compounds is compared to all the shapes in the catalog. The comparison is implemented as a shape-matching procedure followed by a similarity assessment on the basis of the grid occupancy. In the third step, a signature of each molecule is generated. A signature is a bit-string obtained by encoding the grid positions of the chemical features a compound presents when matched to the shapes in the catalog. Bits in the signature (hypotheses or models) are rank-ordered based on their information content (the ability to distinguish actives from inactives). A collection, or set, of the most informative models (hypotheses) is used as an ensemble model for scoring of virtual libraries.

We present validation experiments on thrombin as a model target. The procedure has proven to be effective in distinguishing between actives and inactives.
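The rank-ordering of signature bits described in the third step can be sketched as follows. The abstract does not say how information content is measured, so this sketch assumes Shannon information gain as a stand-in; the function names and the toy signatures are hypothetical and are not thrombin data.

```python
import math


def entropy(positives, total):
    """Binary Shannon entropy (in bits) of a set with the given class counts."""
    if total == 0 or positives in (0, total):
        return 0.0
    p = positives / total
    return -(p * math.log2(p) + (1 - p) * math.log2(1 - p))


def information_gain(bit_column, labels):
    """Reduction in label entropy obtained by splitting on one signature bit."""
    n = len(labels)
    on = [lab for bit, lab in zip(bit_column, labels) if bit]
    off = [lab for bit, lab in zip(bit_column, labels) if not bit]
    conditional = (len(on) / n) * entropy(sum(on), len(on)) \
        + (len(off) / n) * entropy(sum(off), len(off))
    return entropy(sum(labels), n) - conditional


def rank_bits(signatures, labels):
    """Return bit indices ordered from most to least informative."""
    n_bits = len(signatures[0])
    gains = [(information_gain([s[i] for s in signatures], labels), i)
             for i in range(n_bits)]
    return [i for _, i in sorted(gains, key=lambda t: (-t[0], t[1]))]


# Toy data: four molecules, three signature bits, labels 1 = active, 0 = inactive.
signatures = [[1, 0, 1], [1, 1, 1], [0, 0, 1], [0, 1, 1]]
labels = [1, 1, 0, 0]
ranking = rank_bits(signatures, labels)
```

In this toy set only the first bit separates actives from inactives, so it ranks first; a bit that is set for every molecule carries no information and would not be kept in the ensemble of top-ranked hypotheses.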

2:30 51 Modeling of ion complexation and extraction using substructural molecular fragments method.
Alexandre Varnek1, Georges Wipff1, and Vitaly Soloviev2. (1) Department of Chemistry, Louis Pasteur University, 4, rue B. Pascal, Strasbourg 67000, France, Fax: +33-3-88416104, varnek@chimie.u-strasbg.fr, (2) Institute of Physiologically Active Compounds, Russian Academy of Sciences
A Substructural Molecular Fragment (SMF) method has been developed to model the relationships between the structure of organic molecules and their thermodynamical parameters of complexation or extraction. The method is based on the splitting of a molecule into fragments, and on calculations of their contributions to a given property. It uses two types of fragments: atom/bond sequences, and “augmented atoms” (atoms with their nearest neighbours). The SMF approach is tested on physical properties of C2 – C9 alkanes (boiling point, molar volume, molar refraction, heat of vaporisation, surface tension, melting point, critical temperature, and critical pressure) and on octanol/water partition coefficients. Then, it is applied to the assessment of (i) complexation stability constants of alkali cations with crown-ethers and phosphoryl-containing podands, and of β-cyclodextrins with mono- and 1,4-disubstituted benzenes, (ii) solvent extraction constants for the complexes of uranyl cation by phosphoryl-containing ligands, and (iii) distribution coefficients of Hg, In and Pt extracted by 26 phosphoryl-containing monopodands, and of uranium extracted by 32 mono- and tripodands or by 22 monoamides.
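The fragment-splitting idea above can be sketched for the atom/bond sequence type of fragment. This is an illustrative reading of the abstract, not the authors' implementation: the function names, the graph encoding (heavy atoms only, bonds as index pairs with a bond symbol), and the sequence-length limits are all assumptions, and the additive estimate is shown only in its generic form with contributions supplied by the caller.

```python
from collections import Counter


def sequence_fragments(atoms, bonds, min_len=2, max_len=3):
    """Count linear atom/bond sequences containing min_len..max_len atoms."""
    adjacency = {i: [] for i in range(len(atoms))}
    for i, j, symbol in bonds:
        adjacency[i].append((j, symbol))
        adjacency[j].append((i, symbol))

    counts = Counter()

    def walk(path, tokens):
        if len(path) >= min_len:
            # A sequence reads the same from either end; keep one spelling.
            key = min(tuple(tokens), tuple(reversed(tokens)))
            counts["".join(key)] += 1
        if len(path) == max_len:
            return
        for neighbour, symbol in adjacency[path[-1]]:
            if neighbour not in path:
                walk(path + [neighbour], tokens + [symbol, atoms[neighbour]])

    for start in range(len(atoms)):
        walk([start], [atoms[start]])
    # Every sequence was reached once from each of its two ends.
    return {fragment: n // 2 for fragment, n in counts.items()}


def additive_estimate(fragment_counts, contributions, intercept=0.0):
    """Property estimate as intercept plus the sum of count times contribution."""
    return intercept + sum(n * contributions.get(fragment, 0.0)
                           for fragment, n in fragment_counts.items())


# Ethanol as a heavy-atom graph: C-C-O.
ethanol = sequence_fragments(["C", "C", "O"], [(0, 1, "-"), (1, 2, "-")])
```

In a fitted model the contributions would come from regression against measured properties of a training set; none are given here because the abstract reports no numerical values.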
3:00 52 Drug-receptor interaction features in pharmacophore identification.
Isaac B. Bersuker, Institute for Theoretical Chemistry, Department of Chemistry & Biochemistry, The University of Texas at Austin, Austin, TX 78712, Fax: 512-471-8696, bersuker@eeyore.cm.utexas.edu
Two novel ideas are employed in order to improve the description of molecular systems in their interaction with the bioreceptor when the structure of the latter is unknown. First, an atomic index is suggested which goes beyond electrostatic interactions and takes into account orbital controlled donor-acceptor properties of the atom-in-molecule. Second, based on the known kinetic equilibrium properties of substrate-enzyme (drug-receptor) interaction and the orders of magnitude of their bonding energy in comparison with the energy difference between significantly populated conformations, it is shown that only one conformation of the drug molecule, namely that with pharmacophore and best bonding to the receptor, should be taken into account in quantitative bioactivity prediction. With these two innovations, the improved electron-conformational method of pharmacophore identification and bioactivity prediction is applied to the problem of group I metabotropic glutamate receptor (mGluR1) agonists and angiotensin converting enzyme inhibitors.
3:30 53 Topomer shape similarity searching of familiar compound databases.
Richard D. Cramer, and Robert Jilek, Tripos Inc, 1699 South Hanley Road, St. Louis, MO 63144, Fax: 314-647-9241, cramer@tripos.com
Topomer shape similarity has been so effective in searching vast virtual libraries, for structures biologically similar to a query structure, that it seems obvious to address the most accessible compounds of all, those already on hand or for sale. However, within such structurally "heterogeneous" databases, topomer combinatorics affect search speeds unfavorably rather than favorably. A consequent need for new pre-screening algorithms has produced a new "2D" descriptor, the "aggregate count vector". Its "neighborhood behavior" in a standard neighborhood validation assay seems comparable and complementary to such other widely used neighborhood metrics as 2D fingerprints and atom pairs.

Initial evaluation of these new capabilities will involve large (~500K) sets of HTS primary screening data. We plan to present a critical assessment of the results.

4:00 54 Enhancements in Catalyst conformational model generation: Scientific and testing considerations.
Clive Freeman, Jiabo Li, Jon Sutter, and Marvin Waldman, Molecular Simulations Inc, 9685 Scranton Road, San Diego, CA 92121, clive@msi.com
Conformational models are of central importance in 3D database searching using pharmacophores. Conformational models are used to populate multi-conformer databases and to derive pharmacophoric descriptions of active chemical conformations. In both these applications, it is of central importance that diverse and stereochemically reasonable pharmacophore configurations are sampled. Recent developments in conformational model generation in the Catalyst software suite will be highlighted, including algorithmic enhancements to conformational similarity analysis and refinements in the sampling of stereoisomers. An additional and important aspect of the work has been the development of a test suite and automated testing procedure. Illustrative results, showing improvements in conformational models as measured by conformational diversity and database searching will be presented.
4:30 55 On validating 3-D diversity methods: Introducing total pharmacophore diversity.
Gergely Makara, and Edward Wintner, Department of Chemistry, NeoGenesis Drug Discovery, 840 Memorial Drive, Cambridge, MA 02139, Fax: 617-868-1515, gregm@neogenesis.com
Validation of pharmacophore-derived metrics for quantifying molecular diversity usually involves comparison of 2D and 3D techniques. Such studies have often reached the rather surprising and counterintuitive conclusion that 2D fingerprints elucidate experimental data more reliably than 3D methods. This presentation details several pitfalls that should be avoided in constructing sets of molecules to be used in diversity validation. Molecules erroneously expected to be similar can have a major impact on the observed performance of diversity methods and are compared by 2D Unity and Total Pharmacophore Diversity (ToPD) fingerprints. ToPD, a new distance-based 3D method, is also demonstrated to consistently and significantly outperform 2D binary fingerprints in different validation tests. The ToPD algorithm can be used for rapid evaluation of diversity in large screening libraries or generation of focused libraries as well as ligand-based virtual screening.
5:00 56 OSPPREYS: An oriented substituent pharmacophore property space.
Eric J. Martin, and Thomas J. Hoeffel, Chiron Corporation, 4560 Horton St, Emeryville, CA 94608, Fax: 510-923-2010, martine@chiron.com, hoeffelt@chiron.com
Numerous virtual products, numerous conformations per product, and explicit scaffold dependence limit 3- and 4-point pharmacophore analysis of enumerated combinatorial libraries in 5 ways: to only small virtual libraries, 3- or 4-point pharmacophores, inadequate conformational sampling, simplistic diversity measures, and recalculations for every library. "Oriented substituent pharmacophores" add two additional points to each ordinary substituent pharmacophore. This recaptures orienting information lost from fragmenting the enumerated products. This necessitates the "combinatorial conformer" and "template alignment" assumptions. In return, however, they include up to 9-point product pharmacophores. Furthermore, the small number of substituents, and few rotatable bonds per substituent, reduce the number of structures by up to 10^10, permitting thorough conformational sampling of large libraries. The few pair-wise substituent similarities allow creation of a Euclidean property space, not just counting set bits in a library union fingerprint. Finally, being scaffold independent and transferable, oriented substituent fingerprints are reusable between libraries.


Section A
Hyatt Regency, Manchester C

Technical Intelligence
T. Trippe, Organizer
8:30 57 Application of text mining in strategic technical planning.
Paul Frey, Search Technology, Inc, 4960 Peachtree Industrial Blvd., Suite 230, Norcross, GA 30071, Fax: 770-263-0802, paulf@searchtech.com, Nils Newman, Intelligent Information Services Corp, Robert J. Watts, U.S. Army Tank-automotive and Armaments Command, and Alan L. Porter, Technology Policy and Assessment Center, Georgia Institute of Technology -  SLIDES
An application of text mining in the decision processes surrounding strategic technical planning is described. The U.S. Army Tank-automotive and Armaments Command (TACOM) conducted a study to determine the feasibility of using ceramics for diesel engine components. This study used a combination of expert opinion and text-mining-derived knowledge to complete a comprehensive assessment of the topic area. The text-mined data included bibliographic databases covering basic research, applied development, and intellectual property. Knowledge extracted from these sources led to related research in the electronics industry and to experts and organizations that enabled a transfer of those applications to the mechanical domain. The recommendations of the study led to the launch of a multi-million dollar program to develop and test ceramic coating technologies for reconditioning engine parts. Text mining of large bibliographic databases provided an important perspective on the history, current state, and future of the related science and technology.
9:00 58 Beyond searching: Organizing, analyzing, and presenting patents with BizInt Smart Charts for Patents.
Diane Q. Webb, and John A. Willmore, BizInt Solutions, 650 N. Costello Pl, Orange, CA 92869, Fax: 714-744-1316, dqw@bizcharts.com
What does the patent analyst do when even a carefully constructed search (such as "Who’s citing our patents?", "Who are our competitors in this area?", or "What does our competitor’s portfolio look like?") yields hundreds (or even thousands) of records? Overloading the reader with information can be more dangerous than providing incomplete results. Information must be organized and presented to suit each reader’s needs.

The BizInt Smart Charts for Patents software developed by BizInt Solutions provides several techniques to organize, analyze, and present patent information more effectively. A tabular report can help the reader scan through many patents to identify those of interest. Statistics can be collected on fields such as patent assignees and class codes. These statistics can be presented graphically to assist in identifying patterns and trends. This presentation will use selected case studies to demonstrate how different techniques are useful for different types of technical intelligence problems.

9:30 59 Pharmaceutical intelligence: Disinformation from press releases?
Peter R. Steele, Current Patents Ltd, Middlesex House, 34-42 Cleveland Street, London W1T 4LB, United Kingdom, Fax: 44 20 7631 9927, peter@current-patents.com -  SLIDES
Patent documents are very rich in data and information, but transforming that content into useful intelligence is often very time-consuming. Press releases are a potential catalyst in this laborious process, but often fail to give the pertinent facts. In the course of constructing a comprehensive patents database for the pharma/biotech sector, analysts have researched the intellectual property (IP) background to several thousand press releases, and have categorized the information given. Most commonly, the statements fail to report the true origin of IP rights which are changing hands. Document numbers, if cited at all, are often in a non-ideal form. At best, informative press releases allow competitive intelligence professionals to link deals, including the licensing of platform technologies, to specific patents.
10:00 60 Patent analysis via "morphogenetic patent sets."
T. E. Clifton III, Eastport Consulting Group, 1521 Fenway Road, Crofton, MD 21114, tip@eastportcg.com, and David J. Pratt, M-CAM, Inc
Abstract text not available.
10:30 61 Using IFI's Concordance of IPC to US Patent Classification to enhance patent analysis.
Darlene K. Slaughter, and Harry M. Allcock, IFI CLAIMS Patent Services, 3202 Kirkwood Highway, Wilmington, DE 19808, Fax: 302-998-0733, darlene.slaughter@aspenpubl.com
The growing number of data visualization tools enables business analysts to produce attractive displays of charts and topological maps. A chart, however, is simply a tool, and the quality of information it provides depends upon the skill of the analyst who creates and uses it. When analyzing patents in a particular technology, there are several ways of selecting the sample of patent data. In this case study, we compare the results using patent records gathered by simple text searching to those collected by a more comprehensive classification search. We used IFI’s electronic Concordance of IPC to USPTO Classifications to design a complete and accurate classification search strategy. Although classifications can be a powerful search tool, they are often ignored by those who are not familiar with the patent classification systems. This example shows the risks of choosing a more familiar, but less effective method.
11:00 62 Patent analysis tools to explore potential R&D projects: Are lead-free products an important new thrust for the electronics marketplace?
Lawrence D. Schwartz, Business Development, Aurigin Systems, 10710 N. Tantau Ave., Cupertino, CA 95014, Fax: 408-257-9133, lschwartz@aurigin.com
Historically, using the patent literature to assess technical opportunities has been difficult because it required reading hundreds or thousands of documents. New analytical tools developed by Aurigin Systems allow for the visualization of the patent landscape, forward and backward citations, and claims. Additional tools that display groups of assignees, inventors, patent classifications, and most cited patents offer the investigator insight in minutes that previously could take months. The paper will apply these tools to the question of whether a researcher should consider developing lead-free materials for the electronics marketplace. Which companies are currently in this market, and how dominant are they? Is the velocity of invention stagnant or growing? What technologies have been exploited? Is there sufficient "white space" for invention and protected products to be developed? What areas warrant further investigation?


Section B
Hyatt Regency, Manchester D

Structure-Based Data Mining
Cosponsored with Division of Computers in Chemistry, and Division of Medicinal Chemistry
R. W. Snyder, Organizer
8:00 63 Automated mining of protein structural similarities using reduced variable representations.
Valeri Karlov1, Dmitri Beglov1, Eric Koistinen1, Dejan Timotijevic1, Carlos Padilla1, Jianhua Zheng2, and Kal Ramnarayan2. (1) SBI-Moldyn, Inc, 955 Massachusetts Ave., Cambridge, MA 02139, Fax: 617-491-4522, kar@moldyn.com, (2) Structural Bioinformatics Inc
This work presents a general methodology and fast data mining algorithm for detecting structural similarities between protein structures. The corresponding technology, dubbed Structure Mining and Activity Recognition Technology (SMART), is automated to screen a large 3D protein database in a high-throughput mode and find all structures similar to a specified target. Protein structural comparisons are formalized via novel reduced-variable representations (descriptors), which uniquely identify each protein. SMART offers several reduced-variable representations, providing a tradeoff in terms of speed/storage and accuracy of prediction. The detected structural similarity serves as a basis for ab initio identification of protein function (activity), e.g. via detection of important spatial arrangements of functional groups in diverse sets of proteins. SMART provides a sequence-independent inference of structural similarities and protein function. Preliminary testing demonstrates that SMART clusters the similar structures according to their function faster and better than one of the state-of-the-art methods - the NIH’s VAST code.
8:30 64 Structure-based datamining software for correlating compound classes and gene expression.
Paul E. Blower, and Chihae Yang, LeadScope, Inc, 1275 Kinnear Rd, Columbus, OH 43212, pblower@leadscope.com -  SLIDES
The National Cancer Institute routinely tests the growth inhibition and cytotoxicity of compounds against a panel of 60 human cancer cell lines (NCI60). To date more than 70,000 compounds have been tested. Recently, researchers at the NCI and their collaborators published a study in which they used cDNA microarrays to generate gene expression profiles for the NCI60 and then used bioinformatics techniques to correlate those profiles with drug activity patterns of tested compounds. We have developed structure-based datamining and visualization software that can assist researchers in exploring this rich data set. In this talk, we will describe statistical techniques used to select genes with characteristic expression patterns and illustrate how we used the software to identify several compound classes that are well-correlated with the expression patterns of selected genes.
9:00 65 Application of statistical methods to the prediction of B3LYP-optimized polyhedral water cluster geometries.
David J. Anick, McLean Hospital, Harvard Medical School, Bowditch Bldg, 115 Mill St., Belmont, MA 02478, Fax: 617-855-3722, David.Anick@gte.net -  SLIDES
A method is described for rapid prediction of optimized geometries for polyhedral (H2O)n clusters having each O in three H-bonds. A database of polyhedral structures (1089 H-bonds, range 251 - 305 pm) was generated by optimizing 63 (H2O)n clusters, 8 ≤ n ≤ 20, via B3LYP/6-311++g**. Descriptors were selected (p < .001) correlating the O - O lengths, O - O - O angles and H2O orientation parameters, with local and global cluster description parameters. The method uses the resulting formulas for predicted H-bond lengths and angles to generate an oxygen skeleton, on which the hydrogens are then positioned. Thirteen new (H2O)n clusters (12 ≤ n ≤ 24, 321 H-bonds) were tested. The lower energy structures came within 25 cal/mol per H2O of the B3LYP local minimum, and all estimates came within 50 cal/mol per H2O. Range of RMS errors for O - O distances was 0.8 - 1.6 pm, and for O - O - O angles, 0.7° - 2.2°. Issues addressed include splitting up data sets for more meaningful analysis and handling near-symmetry.
9:30 66 Asymmetric similarity in action.
C. John Blankley, and David J. Wild, Ann Arbor Laboratories, Pfizer Global Research and Development, 2800 Plymouth Road, Ann Arbor, MI 48105, Fax: 734-622-2782, John.Blankley@pfizer.com -  SLIDES
The concept of asymmetric similarity for database searching was introduced several years ago. We illustrate a particularly useful application of this in the context of analyzing screening hits from HTS experiments. Building on our previous work with modal fingerprints, we have added a third similarity measure, PMOD, to the two we have previously used. Complementing MSIM, the Tanimoto similarity of the modal to a particular compound, and MODP, the fraction of common bits present in the modal, PMOD is the fraction of common bits present in a particular compound. These latter two measures are modal equivalents of the "Tversky" asymmetric similarity discussed by Bradshaw (1997). We have shown previously that a modal search for analogous compounds is best conducted using an ordering in MODP. We now illustrate that a modal search for candidate templates for a compound cluster is best conducted using ordering in PMOD. This type of analysis is easily automatable for use with cluster analysis of HTS results.
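The three measures named in this abstract can be sketched on toy fingerprints. This is a minimal illustration, not the authors' implementation: fingerprints are represented as sets of "on" bit positions, and MODP and PMOD are read here as the number of common bits divided by the bit count of the modal and of the compound, respectively. That reading, and the function names, are assumptions.

```python
def tanimoto(a: frozenset, b: frozenset) -> float:
    """Symmetric Tanimoto similarity between two sets of 'on' bit positions."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def modal_similarities(modal: frozenset, compound: frozenset) -> dict:
    """Return MSIM, MODP and PMOD for one modal/compound pair.

    Assumed definitions (hypothetical reading of the abstract):
      MSIM = Tanimoto(modal, compound)
      MODP = |common bits| / |modal bits|
      PMOD = |common bits| / |compound bits|
    """
    common = len(modal & compound)
    return {
        "MSIM": tanimoto(modal, compound),
        "MODP": common / len(modal) if modal else 0.0,
        "PMOD": common / len(compound) if compound else 0.0,
    }


# Toy data: a small modal fingerprint and a larger compound fingerprint.
modal = frozenset({1, 2, 3, 4})
compound = frozenset({2, 3, 4, 5, 6, 7})
scores = modal_similarities(modal, compound)
```

Because the compound carries more bits than the modal, MODP and PMOD differ even though the number of common bits is the same, which is the asymmetry the abstract exploits when ordering search results.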
10:00 67 Effective analysis of data mining results.
Marvin Waldman, and Osman F. Güner, Molecular Simulations Inc, 9685 Scranton Road, San Diego, CA 92121, marvin@msi.com -  SLIDES
One of the important aspects of data mining is an objective way of evaluating the results of a search. Availability of such an objective function lends itself to various automated procedures, such as query optimization, data clustering, prioritization, etc. These automated processes can be extremely beneficial since the databases are getting larger constantly due to the evolution of combinatorial chemistry and HTS, and therefore the need for effective filtering of the data has been increasing.

The quality of a hit list can be measured via multiple criteria, and the employment of a function that uses only one such factor has clear limitations. We focus on two important criteria: “selectivity” and “coverage.” We show that combined representation of these two parameters provides a function, the so-called GH-score, which can be productively used in structure-based data mining for drug design. The GH-score function represents a weighted linear combination of selectivity and coverage. In this presentation, we compare and contrast different analysis techniques for data mining with the GH-score and provide examples to highlight the benefits and shortcomings of each approach.
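The idea of scoring a hit list by a weighted linear combination of selectivity and coverage can be sketched as follows. The abstract does not give the definitions or the weights, so this sketch assumes selectivity is the fraction of the hit list that is active, coverage is the fraction of all known actives that the hit list retrieves, and the weight is a free parameter. The function name is hypothetical, and this is not presented as the authors' GH-score formula.

```python
def weighted_hit_list_score(hits_active: int, hits_total: int,
                            actives_total: int, weight: float = 0.5) -> float:
    """Weighted linear combination of selectivity and coverage.

    Assumed definitions (not taken from the abstract):
      selectivity = hits_active / hits_total     (purity of the hit list)
      coverage    = hits_active / actives_total  (share of actives retrieved)
    `weight` is the share given to selectivity; 1 - weight goes to coverage.
    """
    if hits_total <= 0 or actives_total <= 0:
        raise ValueError("hit list and active set must be non-empty")
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie in [0, 1]")
    selectivity = hits_active / hits_total
    coverage = hits_active / actives_total
    return weight * selectivity + (1.0 - weight) * coverage


# Toy example: 50 hits, 20 of them active, 40 known actives in the database.
balanced = weighted_hit_list_score(20, 50, 40)
selectivity_heavy = weighted_hit_list_score(20, 50, 40, weight=0.75)
```

A single number of this kind is what makes automated query optimization possible: two candidate queries can be ranked without inspecting their hit lists by hand.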

10:30 68 Inductive identification of good partial match queries for 3-D flex searching.
Robert D. Clark, Edmond Abrahamian, Peter Fox, and Trevor W. Heritage, Research, Tripos, Inc, 1699 South Hanley Road, St. Louis, MO 63144, Fax: 314-647-9241, bclark@tripos.com
Although flexible 3D searching has proven itself a valuable tool in drug discovery research, constructing good pharmacophoric queries is still as much an art as it is a science. Automated methods for deducing a consensus pharmacophore from a data set of confirmed actives have existed for some time, but these all use deductive methodology and so do not deal well with false positives, nor with situations where two or more distinct but potentially overlapping pharmacophores exist within a data set. Moreover, available methods are restricted to consideration of a relatively small number of conformations for each compound in the data set. These shortcomings sharply limit the usefulness of such methods for working with the sorts of data sets typically generated by high-throughput screening (HTS) programs.

We have recently developed an inductive approach which utilizes a genetic algorithm (GA) and fully flexible 3D searching to generate ensembles of good partial coverage/partial match queries applicable to just such data sets. The challenges addressed in the course of this work included how to identify good "seed" queries from which to start the GA; how to score query fitness; and how to apply ensemble - as opposed to individual - selection effectively.

11:00 69 Feature selection for chemical structure data mining using MDL keys.
Douglas R. Henry1, Thomas M. Albert2, David E. Nassau1, and Joseph L. Durant1. (1) Product Development, MDL Information Systems, Inc, 14600 Catalina St., San Leandro, CA 94577, Fax: 510-614-3616, dough@mdli.com, (2) Technical Communications, MDL Information Systems, Inc -  SLIDES
In data mining, it is common to convert raw data into features that are more suitable for analysis. Chemical substructure keys (SSKeys) are descriptors that provide one way of converting a special type of raw data (chemical structures) into a form that is useful for chemical database searching, mining, and diversity calculation. We propose that further processing of SSKey descriptors, by combining and customizing them, can improve the results scientists obtain.

In this talk, we investigate several approaches to improve the utility of SSKey descriptors to predict the bioactivity of commercial and patented drug structures. Starting with standard and expanded key sets, we identify relevant key subsets by applying database statistics and data mining techniques. We give examples of combining SSKeys to generate hybrid features. Finally, we discuss some methods and strategies to support specific applications by generating custom keys.

11:30 70 Analyzing reaction information for combinatorial chemistry.
Johann Gasteiger, Oliver Sacher, and Achim Herwig, Department of Organic Chemistry, Computer-Chemie-Centrum, University of Erlangen-Nuremberg, Naegelsbachstrasse 25, 91052 Erlangen, Germany, Fax: +49-9131.8526566, gasteiger@chemie.uni-erlangen.de -  SLIDES
A large part of the endeavor in combinatorial chemistry has to be devoted to the optimization of reaction conditions. On the other hand, combinatorial chemistry and parallel synthesis provide information that could enhance our understanding of chemical reactions. We will show how self-organizing neural networks can be used to store information on chemical reactions and use this information for making predictions. Foremost is the representation of chemical reactions which is based on physicochemical descriptors calculated for the atoms and bonds at the reaction site. We will show how the diversity of chemical reactions can be explained and how information gathered from reactions can be used for modeling selectivity and thus allow a more focussed planning of reactions.


Section A
Hyatt Regency, Manchester C

Technical Intelligence
T. Trippe, Organizer
1:30 71 Defensive publishing: The key to gaining and keeping the competitive advantage.
Robert Cantrell, Director of Marketing, IP.com, 150 Lucius Gordon Drive, Rochester, NY 14586, Fax: 716-427-8183 -  SLIDES
A strong intellectual property portfolio can make or break an organization in today’s economy. To remain competitive in this fast-paced world of technology, a company must keep up the pace or get lost in the crowd. Building a formidable IP portfolio can help leverage business and products, yet it can be a challenge in and of itself to maintain. Few companies have the time or resources to patent every innovation they create. For this reason, defensive publishing is a natural and economical complement to the patenting process. Robert Cantrell will discuss the defensive publication strategy highlighting 21 tactics technology companies can utilize to increase their competitive edge. Although labeled defensive, it’s more offensive in nature. These are business advancement tools. Defensive publishing allows companies to put patentable innovations into the public domain so that other companies cannot patent those same innovations and devalue the core innovation. Core patents, surrounded by publications, create an unassailable base from which to market products that fund the next wave of business advancing innovations. This compares, in an intellectual property sense, to owning an ocean by owning the straits leading to its access.

2:15 72 Ugly symmetry: Revised information theory and its application.
Shu-Kun Lin, MDPI Center, Molecular Diversity Preservation International, Sanegergasse 25, Basel CH-4054, Switzerland, Fax: 004161 302 8918, lin@mdpi.org -  SLIDES
A library of 100000 copies of the same book would have the highest permutation symmetry but very little value. An electronic file of a text of high symmetry can be compressed to a much smaller size. According to Lewis, entropy is defined as information loss. Any static structure with information recorded has high diversity (e.g., the text of this abstract has the diversity of all kinds of alphabets) and low symmetry. Any highly symmetric structure cannot have information (e.g., a perfect crystal is symmetric, but it is just like a piece of blank paper without any information). "Symmetry is beauty" or "symmetry is a measure of beauty" as a scientific conception is very misleading and wrong. Symmetry is in principle ugly because it is associated with the information loss or entropy increase, based on a new theory, mainly the logarithmic relations of entropy and symmetry for both static and dynamic systems and the similarity principle (see Lin, S. -K. Correlation of Entropy with Similarity and Symmetry. J. Chem. Inf. Comp. Sci., 1996, 36, 367-376, downloadable at http://www.mdpi.org/lin/lin-rpu.htm ). The Greek word symmetry means the "sameness measure". It is therefore closely related to distinguishability or similarity. Symmetric structure is stable but not necessarily beautiful. All spontaneous processes lead to the highest symmetry which is the equilibrium or a state of "death". Life is beautiful but full of asymmetry. It has certain symmetry for the stability reason. Molecules important in life (e.g., D-sugars and L-amino acids) are not symmetric. None of the famous drugs is symmetric. Molecular diversity preservation needs global collaboration among chemists which will provide new opportunities. The beauty of molecular diversity will be discussed. A new version of information theory will be presented.
This revised information theory of structural stability and process spontaneity is a quantitative relation of the five concepts: higher symmetry, higher similarity, higher entropy, less information and less diversity; and they are all related to higher stability.
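The abstract's remark that a highly symmetric text compresses to a much smaller file can be checked with a short sketch. It uses the compressed-to-raw size ratio from Python's standard `zlib` module as a rough stand-in for information content. That proxy, the toy texts, and the function name are assumptions made for illustration and are not part of the author's theory.

```python
import random
import zlib


def compressed_ratio(text: str) -> float:
    """Compressed size divided by raw size (smaller means more redundant)."""
    raw = text.encode("utf-8")
    return len(zlib.compress(raw, 9)) / len(raw)


# A maximally repetitive ("symmetric") text and a diverse one of equal length.
rng = random.Random(0)  # fixed seed so the example is reproducible
alphabet = "abcdefghijklmnopqrstuvwxyz "
uniform_text = "a" * 10_000
diverse_text = "".join(rng.choice(alphabet) for _ in range(10_000))

uniform_ratio = compressed_ratio(uniform_text)
diverse_ratio = compressed_ratio(diverse_text)
```

The repetitive text shrinks to a small fraction of its original size, while the diverse text of the same length does not, in line with the abstract's pairing of higher symmetry with less information.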
3:00 73 Intelligent queries by using uncertainty knowledge base system.
Madjid Fathi, Department Electrical and Computer Eng, NASA ACE Center, UNM-EECE Building Rm. 133, Albuquerque, NM 87131, Fax: 505-277-4681, fathi@eece.unm.edu, and Roya Hosseini, University of Dortmund, M.S - Student
Many new information technologies presented in the last decade show that handling uncertainty in information technology based on relational database systems is unavoidable. Uncertainty plays a particular role in enabling a database system (DBS) to cope with imprecise and complex queries. Fortunately, there are some extended forms of SQL, for example MetaSQL (A. Becks and U. Tuben), for formulating queries to information systems that provide multidimensional information such as chemical, technical, and medical data. We have developed a concept that models imprecise queries by using an uncertainty knowledge base system. We have also improved the quality of query results, which is very important in the domains mentioned, because we need more and more specialized queries that assume relational database systems (SQL). To retrieve relevant facts from administrative data, the application of end-user templates may be sufficient, since the working process in administration is well known. In other cases, for example when a chemist wants to select all data with special features, one has to use a relational query language such as SQL. The problem with such crisp relational languages is that the level of abstraction is not high enough to provide an adequate man-machine interface for the end user. First, if one has a more or less vague information need, conventional systems offer too severe a differentiation between tuples belonging to the result set and tuples that do not. Second, the way of formulating queries is rather inconvenient, since a lot of knowledge about the structure of the database is required. These disadvantages become important in the chemist's working day: on the one hand, the chemist needs fast and easy access to the stored data without having to worry about too many technical and logical problems. On the other hand, the chemical and medical domains are characterized by a certain vagueness.
In previous work we extended relational query languages to fuzzy query languages for a clinical application.
3:45 74 Who's researching the researchers: Practicing safe surf.
Josh Duberman, Pivotalinfo LLC, 4038 Factoria Blvd. SE #303, Bellevue, WA 98006-5200, Fax: 425-746-2542, pivotalinfo@usa.net
Can anyone look over your shoulder when you're online? What could your competitors tell about your company if they could track your online research? Many companies are currently tracking internet browsing habits for marketing or competitive intelligence purposes. Josh Duberman will review some of the case histories, technologies and issues involved, with special emphasis on the needs of online researchers. Basic privacy self-defense methods, including cookie management, proxies and anonymizing services, will also be discussed. This session will update some of the topics mentioned in Josh Duberman's recent co-authored article, "Privacy Perspectives for Online Searchers: Confidentiality with Confidence?", (Searcher, 7/00), available online in full text at: http://www.infotoday.com/searcher/jul00/duberman&beaudet.htm


Section B
Hyatt Regency, Manchester D

Structure-Based Data Mining
R. W. Snyder, Organizer
1:00 75 Unlocking corporate data stores with a decision analytics framework for technical decision making.
Shawn Kenner, Spotfire, Inc, 60 Hampshire, Cambridge, MA 02139, shawn@spotfire.com -  SLIDES
Research teams are producing and analyzing not only more data, but more kinds of data, earlier in the research cycle. In order to put all this data to work for them, researchers need simple access to it, and flexible methods that allow—even encourage—inspection of the experimental results in new and fruitful ways. I will discuss a decision analytics framework that allows researchers to retrieve, link, explore and make sense of a wide variety of data from multiple sources. Examples will be drawn from high throughput screening, lead identification and lead optimization to illustrate how the framework supports technical decision-making within the data-intensive, fast-paced environment of pharmaceutical discovery and development.
1:30 76 Virtual screening: How are we doing?
Mark E. Snow, James Dunbar, Lakshmi Narasimhan, and Christine Humblet, Discovery Technologies, Pfizer Global R & D, Ann Arbor, MI 48105, Fax: 734-622-2782, mark.snow@pfizer.com -  SLIDES
The use of 3D database searching and high throughput docking methods for the virtual screening of large libraries (actual or virtual) offers an efficient method for the identification of those compounds most likely to be active against a given target. How well do these methods perform? We will compare the results of virtual and experimental screens for a number of therapeutic targets.
2:00 77 New perspectives in virtual high throughput screening.
Jacques R. Chretien, Marco Pintore, and Frederic Ros, Lab. Chemometrics & BioInformatics, University of Orleans (France), BP 6759, ORLEANS Cedex 2 45 067, France, Fax: + 33 2 38 41 72 21, jacques.chretien@univ-orleans.fr
A database mining software package exploiting molecular diversity was developed to search for new leads with the help of virtual High Throughput Screening. Optimization of the number of pertinent descriptors to perform any classification is done with the help of genetic algorithm based procedures. A preliminary analysis of this hyperspace is obtained by Kohonen Self Organizing Maps (SOM) that allow a direct 2D representation. The development of different Fuzzy Logic procedures offers new insights for an in-depth analysis of molecular diversity via SOM or, more fruitfully, in the original hyperspace, without losing information content. These procedures include different algorithms: (i) Fuzzy Clustering, (ii) Fuzzy Partition and (iii) Adaptive Fuzzy Partition (AFP). All these points will be illustrated by different demonstrative examples. Not only are active/inactive compounds predicted but also, simultaneously, their mechanism of action for anti-carcinoma compounds or the involved receptor for CNS active compounds.
2:30 78 Virtual screening: Is data mining up to the challenge?
George S. Cowan, Discovery Technology, Pfizer Global Research and Development, 2800 Plymouth Road, Ann Arbor, MI 48105, Fax: 734-622-5996, George.Cowan@pfizer.com, Alain Calvet, Discovery Computing and Drug Design, Pfizer Global Research and Development, and Kjell Johnson, Biostatistics Discovery and Early Development, Pfizer Global Research and Development
Virtual screening attempts to identify compounds with a biological activity by using only the computerized representation of the compounds. One data mining approach to virtual screening is to train statistical or machine learning methods using a data set of biological measurements. Building a virtual screening method requires coordinated work between chemists, statisticians, and machine learning experts. There are several obstacles that the unwary chemist, statistician, or machine learning expert may stumble over in attempting to build virtual screens. Some of these obstacles are extreme versions of common problems while others are unique to virtual screening. We discuss 13 of these obstacles with examples from the literature or from research experience on public or proprietary compound libraries. Some suggestions for dealing with the issues and pointers to the literature will be given.
3:00 79 Automated database tool for analyzing screening hits.
Jian Shen, Aventis Pharmaceuticals Inc, Route 202-206, P.O. Box 6800, Bridgewater, NJ 08807-0800, Fax: 908-231-3360, jian.shen@aventis.com
HTS usually generates hundreds to thousands of active compounds. It can take weeks to collect structural information and molecular properties for each compound and review them in order to select potential drug leads. The task depends highly on the individual, and human errors can easily be introduced. To overcome these hurdles, we have developed an integrated tool, Hits Analysis Database (HAD). HAD is an ISIS/Base database containing compound structures, screening activities, calculated properties such as clogP, hazard fragment labels, and structure classifications. All this information is generated by other software and retrieved automatically. In addition to search capabilities, HAD provides an overview of chemical structural classes and corresponding activity statistics. Structures can be sorted by maximum common structure clustering. The ease of use and the minimal technical support needed make HAD an efficient data-mining tool in early drug discovery.
3:30 80 SIV: A synergistic approach to the analysis of high-throughput screening data.
Andrew R. Leach, Darren V. S. Green, Michael M. Hann, Gavin Harper, and Andrew R. Whittington, Computational Chemistry and Informatics, Glaxo Wellcome Research and Development, Medicines Research Centre, Gunnels Wood Road, Stevenage SG1 2NY, United Kingdom, Fax: 44-1438-764918, arl22958@glaxowellcome.co.uk -  SLIDES
We will describe a variety of approaches to the analysis of HTS data that attempt to make full use of the whole data set, including not only the most active molecules but also those samples that are less potent (but which may nevertheless represent one or more novel lead series) and inactive compounds. The overall aim of these techniques is to maximise the number of viable lead series provided for medicinal chemistry optimisation. We present comparisons of these techniques across a variety of screens, using various types of 2D and 3D descriptors. In addition, we will indicate the role of interactive visualisation and highlight the role of expert medicinal chemists in the overall process which we term SIV (Selection by Interactive Visualisation). We demonstrate that the synergy of several computational/data mining techniques, interactive graphics and expert chemical knowledge proves to be more effective than any single approach alone.
4:00 81 Structure based data mining of high throughput screening data.
Stephan Reiling, Research Department, Tripos, Inc, 1699 South Hanley, St. Louis, MO 63144, Fax: 314-647-9241, sreiling@tripos.com -  SLIDES
The talk will present our latest results in the field of molecular descriptors and algorithms research for the analysis of High Throughput Screening (HTS) data. Results will be presented on the use of a K-nearest neighbor classifier to compare various molecular descriptors. The descriptors evaluated include topological indices (E-states, Chi, Kappa), Unity Fingerprints, HQSAR, and AtomPair Fingerprints (2D and 3D). The talk will also briefly introduce a new software package that we have developed to analyze HTS data (SARNavigator). The software centers around an interactive spreadsheet and a horizon modified Non Linear Mapping (NLM) display of the structural space.
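As an illustration of the kind of classifier mentioned above, the following is a minimal sketch of K-nearest neighbor classification over binary fingerprints. It is not the authors' implementation: the use of Tanimoto similarity, the representation of a fingerprint as a set of "on" bit indices, the function names, and the toy data are all assumptions made for this example.

```python
from collections import Counter


def tanimoto(a: frozenset, b: frozenset) -> float:
    """Tanimoto similarity of two fingerprints given as sets of 'on' bit indices."""
    if not a and not b:
        return 0.0  # convention chosen here for two empty fingerprints
    shared = len(a & b)
    return shared / (len(a) + len(b) - shared)


def knn_predict(query: frozenset, training: list, k: int = 3) -> str:
    """Majority vote among the k training compounds most similar to the query."""
    ranked = sorted(training, key=lambda item: tanimoto(query, item[0]), reverse=True)
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Hypothetical toy data: (fingerprint, activity label) pairs.
training = [
    (frozenset({1, 2, 3, 4}), "active"),
    (frozenset({1, 2, 3, 5}), "active"),
    (frozenset({2, 3, 4, 6}), "active"),
    (frozenset({10, 11, 12}), "inactive"),
    (frozenset({10, 11, 13}), "inactive"),
]

prediction = knn_predict(frozenset({1, 2, 3, 6}), training, k=3)
print(prediction)
```

Comparing descriptors in this setting would amount to repeating the same prediction with fingerprints derived from each descriptor type and comparing the resulting classification accuracy on held-out compounds.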
4:30 82 Impacting PhysChem property prediction and chromatography simulation with user training.
Robert S. DeWitte, Michael McBrien, and Eduard Kolovanov, Advanced Chemistry Development, 702-90 Adelaide St. West, Toronto, ON M5H 3V9, Canada, Fax: 416-368-5596, rob@acdlabs.com -  SLIDES
A fundamental limitation of physical property prediction methods is the functional scope of the training set: in other words, has the algorithm ever seen anything like my compound? If the answer is no, then nearly all methods are likely to produce unsatisfactory results in some cases. Advanced Chemistry Development provides a solution to this limitation in a feature called "user training". This approach seamlessly employs measurements performed on proprietary compounds to enhance the predictive accuracy of LogP, LogD and pKa calculations. By coupling the relationships among these properties, user training on any one property positively impacts the predictions of the others. With ACD software, scientists can gain insight into entire families of highly novel structures by doing experiments on a very few examples. Physicochemical interactions form the basis for chromatographic separation. Since these parameters can be predicted from chemical structures, it follows that structure-based chromatography simulation is possible. User Training offers the ability to predict chromatographic retention times for novel species based on physicochemical data for related compounds, offering the method development chromatographer a unique evaluative/development tool.


Hyatt Regency, Regency C

General Papers
R. W. Snyder, Organizer
9:00 83 Adaptive evolutionary design of "fast-follower" structures.
Gisbert Schneider, Pharmaceuticals Division, F. Hoffmann-La Roche Ltd, CH-4070 Basel, Switzerland, Fax: +41-6168-87408, gisbert.schneider@roche.com
High-throughput screening has emerged as a driving force in the identification of potential novel lead structures. Virtual screening complements these efforts by a more rational, model-based molecular design and selection process. One particular aim is to identify unique molecular architectures exhibiting substantial bioactivity that is comparable to the activity of previously known drugs or leads ("fast-follower" design). For this purpose an evolutionary de novo design algorithm was developed (TOPAS). This stochastic fragment-based method aims at generating a new molecular architecture mimicking a template structure. Various concepts of virtual molecule "fitness" may be applied to drive the selection process, e.g. pharmacophore models or topological similarity measures. The idea of adaptive library diversity during the design process was consequently implemented in TOPAS. Its efficiency has been demonstrated in several drug discovery projects at F.Hoffmann-La Roche.
9:30 84 Automated computational method for forward synthesis.
William J. Mydlowec, Jessen Yu, and Guido Lanza, Pharmix Corporation, PO Box 215, Palo Alto, CA 94302, Fax: 650-618-1522, bill@pharmix.com
We have developed the first automated computational method to produce forward synthesis routes. Given only the desired final product, our method uses a randomized search algorithm and runs on a 1000-processor supercomputer. Specifically, it starts with a database of readily-available compounds and uses genetic programming paired with a reaction predictor to generate synthesis routes with reagents and conditions. It uses no expert knowledge to direct the search; instead, it considers only 2D topological similarity to the desired final product. Our method has been run on several drug compounds and has produced synthesis routes using the reaction mechanisms available in the CAMEO program.
10:00 85 Comparison of the DOCK and CHARMm-dock in various protein-ligand complexes.
Mehran Jalaie, Michal Vieth, Daniel H. Robertson, and Jon A. Erickson, Computational Chemistry and Molecular Structure Research, Eli Lilly and Company, Indianapolis, IN 46285, Fax: 317-276-6545, jalaie@lilly.com -  SLIDES
Most drugs are small molecules that owe their biological activity to a specific interaction with a receptor. The ability to observe these crucial interactions has been made possible through the field of x-ray crystallography. Several thousand of these ligand-protein complex structures have been solved over the past ten years, providing invaluable information to the drug design community. In order to utilize these structures for drug design, computational tools have been developed for reproducing the orientation of a drug or ligand relative to its protein. This technique is known as docking.

This study presents the results of a comprehensive comparison of two docking methods used in virtual screening. Over one hundred crystal structures of serine protease-based and other protein-ligand complexes, obtained from the Protein Data Bank, were docked using both the CHARMm-dock and DOCK algorithms. These methods were compared in terms of successful docking rates (reproducing the X-ray binding mode) and the timing of the process. Although the CPU requirements of the two methods differ, initial results indicate that the rate of docking in both algorithms is comparable. Moreover, the role of conformational changes in protein structures on docking results was investigated.
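A successful docking rate of the kind described above can be sketched as follows. This is only an illustration under stated assumptions, not the procedure used in the study: it assumes success is judged by heavy-atom RMSD between the docked pose and the crystal pose in the same receptor frame (so no superposition is applied), that atoms are listed in matching order, and that a 2.0 Å cutoff defines success. The cutoff, function names, and coordinates are hypothetical.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two matched lists of (x, y, z) tuples.

    No superposition is performed: both poses are assumed to share the
    receptor's coordinate frame, and atoms are assumed to be in the same order.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    total = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b)
    )
    return math.sqrt(total / len(coords_a))


def success_rate(pairs, cutoff=2.0):
    """Fraction of (docked, crystal) pose pairs whose RMSD is within the cutoff.

    The default cutoff (in Angstroms) is an assumption made for this sketch.
    """
    hits = sum(1 for docked, crystal in pairs if rmsd(docked, crystal) <= cutoff)
    return hits / len(pairs)


# Hypothetical two-atom ligand: one pose close to the crystal pose, one far away.
crystal = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
near_pose = [(0.5, 0.0, 0.0), (1.5, 0.0, 0.0)]
far_pose = [(3.0, 0.0, 0.0), (4.0, 0.0, 0.0)]

rate = success_rate([(near_pose, crystal), (far_pose, crystal)])
print(rate)
```

A real comparison would also need to handle symmetry-equivalent atoms and would report the rate per docking program over the same set of complexes.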

10:30 86 Identifying common mechanisms of toxicity groups for pesticides.
Pauline M. Wagner and Randolph B. Perfetti, Environmental Protection Agency, Office of Pesticide Programs, Ariel Rios Building, 7509C, 1200 Pennsylvania Ave. NW, Washington, DC 20460, Fax: 703-308-7157, Wagner.Pauline@epa.gov -  SLIDES
The Food Quality Protection Act of 1996 stipulates that EPA perform a combined risk assessment for chemicals that produce adverse effects by a common mechanism of toxicity. In response to this mandate, the Agency issued a guidance document titled "Guidance for Establishing a Common Mechanism of Toxicity for Use in Combined Risk Assessment" on February 11, 1997. The guidance relies heavily on structure-based analysis as a first cut to identify the potential members of a group. When the first structure-based analysis is complete, the guidance then turns to toxicity and/or mechanistic studies for further selection of the candidate chemicals.

The Office of Pesticide Programs in EPA holds an immense amount of information on the toxicological properties of pesticidal chemicals. Unfortunately, most of the information is not contained in readily searchable databases. Structures of the pesticidal chemicals are also not in a searchable format. Our intention is to populate ISIS with both the chemical structures and the toxicological data and use ISIS to search for both similar structures and similar toxicities. These results will be collated and additional searches may be executed to produce specific subsets so that a thorough analysis can be performed. In order to be successful in this effort, we are developing a strategy to enter the enormous amount of data this Office holds in a timely manner and are designing search strategies that will yield the most fruitful results. These strategies include: identifying sources of the chemical structures that can be downloaded by batch, abstracting and loading toxicological data into ISIS for analysis using EXCEL spreadsheets as the means to transfer the data, and exploring methods of refining the initial searches to produce the desired results. This paper will focus on the details of how we plan to accomplish this task and the progress to date.

11:00 87 Validation study of conformer generators using PDB ligand structures.
Daniel H. Robertson1, Mehran Jalaie1, David J. Cummins2, and Michal Vieth1. (1) Computational Chemistry and Molecular Structure Research, Eli Lilly and Company, Indianapolis, IN 46285, Fax: 317-276-6545, drobert@lilly.com, (2) Statistical/Math Sciences, Eli Lilly & Company
An underlying principle used in drug discovery is the ability of a ligand to interact with a receptor, thereby enhancing or inhibiting some specific biological process. In modeling these protein-ligand complexes, the modeler typically uses x-ray protein structures and generates the ligand structures using various software. The accuracy and predictive power of the modeled complex are affected by how correctly the ligand conformation is generated. This talk presents a validation study of the ability of conformer generators to reproduce the "bioactive" form of the ligand as represented by the x-ray determined protein-ligand complexes in the PDB repository. Our results suggest that the performance of conformer generators is not necessarily improved by generating more structures or by taking more CPU time. Fast single-conformer generators can outperform slower multi-conformer generators. These results are important when developing and validating new computer technologies for rapid screening and docking of drug candidates by structure-based methods or pharmacophore modeling.


Hyatt Regency, Regency C

General Papers
R. W. Snyder, Organizer
1:00 88 Impact of aggregation, navigation, and the new economy on research information.
David John Brown, ingenta plc, 73 Banbury Road, Oxford OX2 6PE, United Kingdom, dbrown@ingenta.com
Changes occurring as a result of the impact of the Internet and the new economy are having profound effects on the functions and role of information intermediaries and aggregators. This presentation adapts some of the business principles of the new economy to the practical situation facing research journal aggregators. It is the intention to show that, as a result of the ‘melting of the glue’ which separates ‘stuff’ from its associated ‘metadata’, a new set of operational requirements is being created. The future prospects for intermediaries appear good as long as they adapt their service provision to these new needs and to the increasingly computer-literate and wired-up global network of researchers.

Intermediaries, and particularly aggregators, perform an increasingly relevant service in attracting the eyeballs of a broad and disparate audience of end users to a sticky site where the full range of their research and academic information needs will be satisfied. In doing this they extend the market available to publishers for their electronic journal articles and produce new revenue streams which have hitherto been latent. This is a different set of functions from those performed by traditional intermediaries and aggregators, though several of the larger subscription agents are making the transition to this new cybermediation, and they are being joined by totally new players with no legacy in the traditional research communication process.

1:30 89 B2B Electronic commerce and the chemical industry.
David John Brown, ingenta plc, 73 Banbury Road, Oxford OX2 6PE, United Kingdom, dbrown@ingenta.com -  SLIDES
This paper explores the evolution of business buying and selling by using the electronic medium of the Internet. There is a brief review of static web sites and isolated transaction sites such as laboratory catalogs. The majority of the paper discusses secure integrated purchasing packages in a dynamic marketplace. This marketplace involves a market maker, buyers, service providers and suppliers of goods. An actual marketplace will be described along with the time and costs of creating that marketplace. Reports of results of the marketplace to date, including supplier and buyer experiences, are reviewed.
2:00 90 Advances in CML application.
Alk Dransfeld, Kwantumchemie, KUL, Celestijnenlaan 200F, Leuven B-3001, Belgium, Fax: +32-16-32-7992, Alk.Dransfeld@chem.kuleuven.ac.be
Automatically comparing calculated chemical shifts, δ, with measured data dramatically increases the reliability of benchmark calculations. CML/XMLpath provides a program-independent solution for this problem. ... details later.
2:30 91 LABTrack: Introducing a legal electronic lab notebook.
Richard M. Stember, AVATAR Consulting, 26861 Trabuco Rd # 147, Mission Viejo, CA 92691, avatar@labtrack.com -  SLIDES
Over the past three years this novel software has developed into a legal replacement for paper lab notebooks. Recent advances in the use of digital notarization, electronic signatures and biometric identification have made possible what many scientists have eagerly anticipated - doing away entirely with paper notebooks.

LABTrack™ provides functions analogous to all of the common uses of lab notebooks including pasting and embedding graphics, providing legally irrefutable notebook entries, automatic date-time stamping and secure user identification. Additionally the common rules associated with lab notebooks are built in using a novel user interface based upon a word processor paradigm. Users can line-out incorrect entries, store reasons for the correction and insert replacement data directly on a notebook page. The software validates the identity of the user, digitally notarizes and date-time stamps changes.

The use of electronic lab notebooks offers substantial benefits. These include fast searching, correlating and reporting; secure distribution and backup of notebooks; and the simultaneous sharing of notebooks within groups of users, to name a few.

3:00 92 Migrating successful CD products to the web: Challenge or opportunity?
Fiona Macdonald, CRC Press, 235 Southwark Bridge Road, London SE1 6LY, United Kingdom, Fax: +44 20 7407 7336, fmacdonald@crcpress.com, and David Proctor, Hampden Data Services
Abstract text not available.