#214 Abstracts

ACS Chemical Information Division (CINF)
Fall, 1997 ACS National Meeting
Las Vegas (Sept. 8 - 11)


Biotechnology Patent Information
Organized by: A. H. Berks
Presiding: A. H. Berks


Introductory Remarks
8:45 1 The challenge of biotech patent searching.
A. Shanler, American Cyanamid ARC, Princeton, NJ 08543-0400
There are several tools for searching chemical patents but few for biotechnology and genetic engineering patents. This talk will discuss tools available for searching biotechnology patents, including US Patent Classifications, IPCs, Derwent Manual Codes, and free-text searching. The problems of using classification and coding systems include broad subject categories, inconsistent application, changes over time, and availability to subscribers only. Free-text searching is dependent upon the development of standardized terms. Standardized terms are frequently used, but not always (e.g., promoter and regulatory element). Different files need to be searched because of the variation in the abstracts' content and any additional indexing.
9:15 2 Searching for prior art - A collaboration between researchers and information specialists.
R. Diaz, Library and Information Services, Genentech, Inc., South San Francisco, CA 94080.
The web has made it easier to access scientific and patent information. This session will discuss some of the interfaces that are designed for end-user searching. There are strengths and weaknesses to end-user interfaces when compared to the full command-level interfaces that have been available to information specialists. A collaborative effort between the scientist and information specialist in designing the literature search and using various search tools can result in more precise and comprehensive search results.


Finding and evaluating on-line patent information in the biotechnical arts: Case study, the polymerase chain reaction.
R. Snead, Knight-Ridder Information, Mountain View, California
The invention of the Polymerase Chain Reaction (PCR) has revolutionized molecular biology and accelerated the discovery and development of valuable genetic sequences and processes. Some say PCR is to biotechnology what the transistor was to the electronics industry. It is fitting that the inventors in both instances were awarded Nobel prizes. Significant inventions such as these give rise to an array of valuable new ideas, as evidenced in the explosion of published patent applications and issued patents. I will illustrate the breadth of inventions PCR has spawned through the analysis of on-line information found on DIALOG. Patents lend themselves to information indexing and database design. Titles, specifications, claims, assignees, patent numbers, application numbers, filing dates, issue dates, cited references and patent classifications are just a few of the parts of a patent that are indexed. This presentation will describe how such indexing is used for retrieval and analysis of on-line patent information related to PCR.
10:15 4 Sequence submissions and sequence searching for biotechnology patent applications.
A. Shah, US Patent and Trademark Office, Washington DC
As of October 1, 1990, for all patent applications which contain nucleotide and/or amino acid sequence information, applicants are required to submit this information electronically on disk. This electronic form is called the CRF (Computer Readable Form). The submission format is described in the Federal Register Notice, 37 CFR Part 1, Vol. 55, No. 84, Tuesday, May 1, 1990 entitled "Requirements for Patent Applications Containing a Nucleotide Sequence and/or Amino Acid Sequence Disclosures". The CRF is processed to check for format errors and then the information is entered into the ABSS system (Automated Biotechnology Sequence Search System). The system is used to perform sequence searching on various databases. Once an application becomes an issued patent, the sequence information becomes available to the public via various sources.
10:45 5 Biosequence data in patents. Processing, retrieval, and dissemination. The practice at the EPO.
J. A. Descamps, European Patent Office, The Hague, Netherlands
The number of patents concerning DNA or polypeptides has grown explosively due to modern tools available in genetic engineering, such as cloning and sequencing, and to the economic importance of the products and methods concerned. Processing patent sequences to ensure rapid and complete retrieval for novelty and inventivity searches, and further dissemination, is the main preoccupation of biotechnology departments of patent offices. The following items will be discussed: a. Content and type of claims in biosequence related patent applications. Patentable matter. b. Adaption of the internal classification in the biotechnology field to the explosive growth. c. Standard representation of nucleotide and amino acid sequences in European and PCT applications, and filing on electronic carriers. d. Cooperation with the USPTO, JPO, and the public database producers. e. Dissemination and accessibility of the patent sequences. f. STRAND, the integrated program developed at the EPO allowing an automated search. g. Future developments
11:15   Lunch Break.
2:00 13 Cloning Dolly and beyond.
E. Tsevdos, Kenyon & Kenyon, One Broadway, New York, NY 10004
In February 1997, Ian Wilmut and his colleagues in Scotland set the world on fire when they disclosed the cloning of a sheep named Dolly. According to the process identified by Dr. Wilmut, this is the first time an animal could be fully reproduced by cloning. In order to realize the commercial value of Dr. Wilmut's invention, the cloned animal, Dolly, must not only be patentable, but any patents granted on Dolly also must be realistically enforceable. Therefore, this presentation will look into the issues of the patentability and enforceability of the inventions surrounding Dr. Wilmut's disclosure to the world.
2:30 14 Derwent - a unique source of value added biotechnology patent information.
R. V. Buckley, Y. Kim, J. D. Myers, Derwent Information Ltd., 14 Great Queen Street, London, WC2B 5DF, UK
Over the past 20 years, the importance of patents in the biological sciences has risen dramatically. Biotechnology companies are becoming increasingly aware of the value and importance of patents as a unique source of data and are demanding patent information to supplement their literature searches of scientific journals and databases. This paper describes how Derwent has met the challenge of disseminating biotechnology patent information by developing value-added, focused databases such as Derwent Biotechnology Abstracts, Derwent GeneSeq, and the Derwent Gene Therapy Database, all cross-referenced to the Derwent World Patents Index. These four databases may be used independently or collectively to retrieve the biotechnological content and claims of patents from over 40 patent-issuing authorities. The application of statistical analysis to this data is also a powerful way to detect current trends in the biotechnology arena.
3:00 15 Challenges in providing biotechnology patent information.
K. G. Stanley, B. M. Benjamin, K. S. Dunwoody, Technical Services Representative and Biochemistry Dept, Chemical Abstracts Service, Columbus OH 43210
CAS covers patents from 29 nations and 2 international organizations. Patents are challenging, since both legal language and scientific content must be addressed, while maintaining and improving currency. Biotechnology brings additional challenges, including the dynamic nature of patentable biotechnology ideas and the rapid increase in the volume of genetic information. Novel and claimed biosequence information is added to the Registry File, which currently contains over 420,000 unique protein/peptide sequences and over 1,420,000 unique nucleic acid sequences. The large volume of material has led CAS to develop a variety of tools to automate sequence information handling. CAS also provides significant search tools to enhance patent retrieval via STN, STN Easy, and Chemical Patents Plus. The CAS web site provides support in identifying new controlled vocabulary terms required for comprehensive retrieval of biotechnology concepts.
3:30 16 Statistical analyses of international patent data in biotechnology.
R. G. Kolar, Mogee Research & Analysis Associates, 11701 Bowman Green Drive, Reston, VA 20190
Better information on biotech patenting is needed, both by public policy makers for moral and ethical debates over biotech patenting, and by private companies to help them formulate better R&D and patenting strategies. This paper will focus on the statistical analysis of international patent data in biotechnology for competitor intelligence and informed public policy. The difference between the statistical analysis of patent data and the use of patents as sources of legal or technical information will be discussed. Results from research on the use of international patent family and patent citation data for technology monitoring and competitor intelligence will be presented.
4:00   CINF Business Meeting
4:15   Intermission
4:30   Joint Board/Council Committee on CAS concurrent with Society Committee on Publications open meeting.



Reaction Databases: Content and Applications
Organized by: G. Grethe
Presiding: G. Grethe
8:30 6 Increasing the potential of chemical reaction databases as valued information resources for chemists
J. Hayward, Synopsys Scientific Systems, 175 Woodhouse Lane, Leeds, LS2 3AR, UK.
The use of electronic means for the storage, retrieval and manipulation of chemical information has revolutionised our approach to chemical synthesis over the past 20 years. Long-winded, time-consuming literature searches in the library have been superseded by fast, structure-based searches on convenient desktop PCs. The nature of the substructure search, using 'the language of chemistry', makes it quick and easy for the user to specify the query and retrieve suitable answers. However, database users should be wary of placing too much reliance on substructure searching; there are numerous important features of a chemical reaction which cannot be described purely in structural terms. The usefulness of a database therefore depends not only on its quality but also on its design, its content of non-structural data and the functionality offered by the searching software. The ability of a database to be an educational tool or to be a source of serendipitous discoveries is also dependent on such factors. The author describes progress in each of these areas.
9:00 7 Using the biocatalysis database from Synopsys
M. D. Grim, Altus Biologics, Inc., Cambridge, MA 02139-4211.
Searching the literature for biocatalytic reactions can often be a time consuming, frustrating experience since this information is often buried in synthesis papers and spread across a large number of journals in chemistry, biochemistry, biology and other disciplines the organic chemist may not cover with his usual reading. Synopsys Scientific Systems has eliminated this frustration with the introduction of its BioCatalysis Database. Altus Biologics is a catalyst company developing a new type of biocatalyst known as cross-linked enzyme crystals or CLEC catalysts. The value and effective use of the BioCatalysis Database will be discussed using case studies as illustrations.


The evolution of reaction retrieval in synthesis planning and its future in high throughput discovery
R. W. Snyder, G. Grethe, and D. Hounshell, MDL Information Systems, 14600 Catalina Street, San Leandro, California 94577
Synthesis planning has played an important role in chemical research and development. Perhaps the greatest developments in synthesis planning have occurred in reaction retrieval, from William Theilheimer's early reaction classifications to today's electronic databases of reaction methodologies. Whether the focus is traditional chemistry and the synthesis of novel compounds or combinatorial chemistry and the synthesis of libraries, understanding the scope and limitations of a reaction methodology is paramount. The requirements and use of reaction retrieval in the developing area of high throughput discovery will also be presented.
10:00 9 Reaction searching with RXL Browser for synthesis planning.
S-S. Tseng, American Cyanamid Company, Agricultural Products Research Division, PO Box 400, Princeton, NJ 08543-0400.
As a result of an enormous increase of accessible reaction databases and the introduction of client-server-based database access systems during the past few years, the computer-assisted searching of reaction databases for synthesis planning has become an important part of the intense efforts to increase efficiency and productivity in chemical research. This paper will illustrate the effective use of the MDL RXL Browser as a reaction searching tool to derive desirable synthesis plans for target molecules.
10:30 10 Using the ISISBase/RXL Browser to locate information about chemical transformations
U. Iserloh, Department of Chemistry, University of Pittsburgh, PA 15260.
The RXL Browser is a useful tool to simultaneously search several reaction databases, providing an array of information about chemical transformations. Examples illustrating the features and usefulness of the RXL Browser will be presented in the context of organic synthesis.
11:00 11 Chemical reaction databases from the Institute for Scientific Information.
M. Clark, N. S. Kopelev, The Institute for Scientific Information, 3501 Market Street, Philadelphia, PA 19104.
The Current Chemical Reactions database was created by ISI in 1985 along with a consortium of pharmaceutical companies. Its goal is to speed organic synthesis by providing a large, general, and current resource of synthetic methods. It draws from ISI's resource of over 8,000 journal titles. It is available on several hardware and software platforms. The addition of cited reference searching to create the Reaction Citation Index significantly increases access to chemical information. The 2 million articles referenced make it one of the largest reaction data sources available. The selection criteria, structure, growth, and case studies will be presented. In addition, trends in organic chemistry, as reflected in the databases, will be discussed.
11:30 12 Derwent's reaction service - Fifty years of reaction information.
D. G. Penn, U. E. Whelan, Derwent Information, 14 Great Queen Street, London WC2B 5DF UK.
The availability of reaction information has increased dramatically over the past few years. This paper will cover the principles and selection criteria used to produce Derwent's Journal of Synthetic Methods and Theilheimer's Synthetic Methods of Organic Chemistry. We will discuss the strengths and weaknesses of using a selective reaction database as opposed to a comprehensive one, the advantages and disadvantages of keywording and graphical searching, and strategies for retrieving patent information.
12:00   Lunch Break.
2:00 17 Practical uses of large reaction databases
K. P. Cross, B. P. Cannan, G. J. Myatt, Chemical Abstracts Service, 2540 Olentangy River Road, Columbus, OH 43202
The size and content of the CASREACT database, when coupled with current commercial availability data and links to the primary and secondary literature, provide a powerful combination for researching synthetic pathways and synthesis planning. This presentation will illustrate several novel approaches for exploiting the content of CASREACT.
2:30 18 Applications of Daylight reaction-processing tools to chemical information problems
J. J. Delany III, Daylight Chemical Information Systems, 419 East Palace Avenue, Santa Fe, NM 87501.
Recently, the Daylight system has been expanded to incorporate reaction-processing tools. These capabilities allow the exploration of new approaches to handling problems in chemical information. Progress with respect to the representation of combinatorial mixtures, synthesis planning and handling large databases of reactions will be discussed.
3:00 19 Novel reaction database search engine using reaction types.
B. Rohde, Novartis Pharma AG, RIM, Basel, Switzerland, CH-4002.
The new Daylight Reaction Toolkit provides the basis for a novel reaction database search method. The method classifies reactions by their Reaction Type. Those Reaction Types are formulated as Daylight reaction SMARTS queries, which are used to build an index database for the novel search technique. The result of the search is a set of reactions, classified by Reaction Type, which are examples of transformations that could have yielded the target query molecule given appropriate starting materials. This can be considered "computer assisted synthesis planning by reaction examples". The paper will compare the search results in different collections of reactions and discuss the scalability of the method to large databases.
3:30 20 Automatic derivation of rules and knowledge for reaction prediction using reaction database.
H. Satoh (1), K. Funatsu (2), T. Nakata (1). (1) Institute of Physical and Chemical Research (RIKEN), 2-1 Hirosawa, Wako, Saitama 351-01, Japan. (2) Toyohashi University of Technology, 1-1 Tampaku, Toyohashi, Aichi 441, Japan.
We have been trying to derive rules and knowledge for reaction prediction by utilizing the large number of reactions stored in databases. In the first step we automatically classified and systematized reactions, focusing on changes in the electronic features of oxygen atoms at reaction sites. A method combining a Kohonen network with principal component analysis was used for the classification. The resulting distribution of reactions is closely related to the similarity of substructural transformations and to known reaction types, about which we have reported. We then sorted the classified reactions by considering reaction conditions. In addition, we discuss an application of the results to the SOPHIA system, which we have been developing for the purpose of predicting reactions of arbitrary reactants under arbitrary conditions.



C. Gragg, Organizer, Presiding
7:00 - 9:00
  23 The synthetic chemist's periodic table.
O. B. Ramsay, D. P. Ryan, W. V. Metanomski, Chemical Concepts Corp., Ann Arbor, MI 48104; Hofstra University, Hempstead, NY 11150; Chemical Abstracts Service, Columbus, OH 43210.
A periodic table of elements is presented in which each element is associated with its chemical compound that plays an important role, as a catalyst or reactant, in a current synthetic reaction. For each such reaction, a brief description and bibliographic reference is given. The usefulness of such a table for educational purposes is highlighted.
  22 The multicolored arrows: Chromatic aspects of the signal flow diagram.
B. S. Tice, Advanced Human Design, PO Box 2214, Cupertino, CA 95015-2214.
The signal flow diagram is a graphic method of depicting a set of linear algebraic equations that, when it represents a physical system, graphs the flow of signals from one point of the system to another. Signal flow diagrams are common in fields such as engineering, and practical use can be made of them in chemistry. The use of color in the nodes and branches of the signal flow diagram can enhance the amount of information conveyed as well as its clarity.
  21 On-line searching for enzymes and proteins.
J. H. Wittorf, Chemical Abstracts Service, Columbus, OH 43210.
On-line searching of databases and documents has profoundly changed the way scientists access information and stay current in their fields. The recent rapid growth of the World-Wide Web (WWW) segment of the Internet has resulted in a proliferation of academic, government, and commercial sites which make available to users a variety of hypertext documents and databases within a relatively short time via "point-and-click" access. Commercial vendors such as STN International also provide rapid access to previous research with databases such as CAplus, CAS Registry, and BIOSIS. WWW enzyme/protein databases for researchers in biochemical/biotechnological areas are surveyed and examples of on-line searching for enzymes and proteins in CAS databases are described.



Skolnik Award Symposium
J. Gasteiger, Organizer, Presiding
8:30 24 Making the computer understand chemistry
J. Gasteiger, Computer-Chemie-Centrum, Institute of Organic Chemistry, University of Erlangen-Nuremberg, Nägelsbachstrasse 25, D-91052 Erlangen
In the last 25 years my research group has been developing software for reaction prediction, synthesis design, spectra simulation, and lead discovery. Central to this endeavor was the development of algorithms for the representation of explicit chemical models, such as the structural formula, partial atomic charges, inductive, resonance and polarizability effects, and 3D models. Recent work concentrates on the development of methods that derive chemical knowledge directly from data. Applications of the systems developed in our group will be given.
9:00 25 From chemical distance to CrossFire: How a Ph. D. thesis can affect a career.
C. Jochum, Beilstein, Varrentrappstrasse 40-42, D-60486 Frankfurt, Germany.
Algorithms were developed in the late seventies to check via computer whether multi-step reactions follow the principle of minimum chemical distance. Similar algorithms are used for computer-assisted reaction center detection. This paper describes how the author's research carried out 20 years ago with this year's recipient of the Herman Skolnik Award influenced the development of the Beilstein Database under the CrossFire chemical database management system.
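As a rough illustration of the quantity involved, the sketch below assumes the common definition of chemical distance as the sum of absolute differences between the bond-electron matrices of two atom ensembles, minimized over atom mappings that preserve element type. The function names, the brute-force search, and the toy matrices are assumptions made for illustration; they are not taken from the work described in the abstract.

```python
from itertools import permutations

def chemical_distance(b, e):
    """Sum of absolute differences between two bond-electron matrices."""
    n = len(b)
    return sum(abs(b[i][j] - e[i][j]) for i in range(n) for j in range(n))

def permute(matrix, perm):
    """Relabel atoms so that new atom i is old atom perm[i]."""
    n = len(matrix)
    return [[matrix[perm[i]][perm[j]] for j in range(n)] for i in range(n)]

def minimum_chemical_distance(b, e, elements_b, elements_e):
    """Brute-force minimum over element-preserving mappings (small n only)."""
    n = len(b)
    best = None
    for perm in permutations(range(n)):
        # Skip mappings that would pair atoms of different elements.
        if any(elements_b[i] != elements_e[perm[i]] for i in range(n)):
            continue
        d = chemical_distance(b, permute(e, perm))
        if best is None or d < best:
            best = d
    return best

# Toy matrices (hypothetical, not a real reaction): the same O-H bond,
# written with the two hydrogen atoms numbered differently.
elements = ["O", "H", "H"]
educt = [[4, 1, 0], [1, 0, 0], [0, 0, 0]]
product = [[4, 0, 1], [0, 0, 0], [1, 0, 0]]
print(chemical_distance(educt, product))
print(minimum_chemical_distance(educt, product, elements, elements))
```

The brute-force loop over permutations is only practical for a handful of atoms; it is shown here because it makes the role of the atom mapping explicit.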
9:30 26 Creation of 'Super Chemist' by using computers in chemical industry
I. Dohgane, Kazumi Yuki, Organic Synthesis Research Laboratory, Sumitomo Chemical Co., Ltd., Osaka, Japan.
We should acknowledge that organic synthesis research must be conducted by a division of work between computers and chemists, which complement each other. A super-chemist is defined as a chemist who understands the role of computers and is able to make the best use of their power in the most efficient way in his or her research. Computers already play very important roles in assisting chemists in the areas of computer-aided synthesis planning, automated chemical synthesis and computer-assisted spectral analysis.
10:00 27 Areas where error back-propagation and Kohonen networks touch.
J. Zupan, National Institute of Chemistry, Hajdrihova 19, SLO 1000, Ljubljana, Slovenia
The first area of application for Artificial Neural Networks (ANNs) is mapping complex multivariate objects onto two-dimensional planes, using e.g. so-called self-organizing Kohonen ANNs. Kohonen mapping is analogous to Principal Component Analysis (PCA), a standard statistical technique. The second area of ANN application is modeling. Modeling is a supervised problem and requires input/target pairs, {Xs,Ts}, for learning. For solving supervised problems, mainly error back-propagation ANNs are used. In comparison with standard statistical modeling techniques, ANN methods have a distinct advantage: they do not require the modeling function to be known at all. In the past, error back-propagation for modeling and Kohonen for mapping were strictly separated. In this paper, the ideas of how to use both ANN methods in both problems, i.e. Kohonen ANN for modeling and error back-propagation for mapping, are explained.
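The mapping idea can be sketched with a single training step of a self-organizing (Kohonen) map: find the neuron closest to an input object, then pull that neuron and its grid neighbours toward the object. This is a minimal sketch of the standard textbook scheme, not the method of the paper; the Gaussian neighbourhood, the learning rate, and all names are assumptions.

```python
import math

def best_matching_unit(weights, x):
    """Return (row, col) of the neuron whose weight vector is closest to x."""
    best, best_d = None, float("inf")
    for r, row in enumerate(weights):
        for c, w in enumerate(row):
            d = sum((wi - xi) ** 2 for wi, xi in zip(w, x))
            if d < best_d:
                best, best_d = (r, c), d
    return best

def train_step(weights, x, rate=0.5, radius=1.0):
    """Move the winner and its grid neighbours toward x (updates in place)."""
    br, bc = best_matching_unit(weights, x)
    for r, row in enumerate(weights):
        for c, w in enumerate(row):
            grid_d2 = (r - br) ** 2 + (c - bc) ** 2
            # Gaussian neighbourhood: full update at the winner, less farther away.
            h = math.exp(-grid_d2 / (2 * radius ** 2))
            row[c] = [wi + rate * h * (xi - wi) for wi, xi in zip(w, x)]
    return br, bc

# A 2 x 2 map of three-dimensional weight vectors (illustrative values).
som = [[[0.9, 0.1, 0.0], [0.0, 1.0, 0.0]],
       [[0.0, 0.0, 1.0], [0.5, 0.5, 0.5]]]
winner = train_step(som, [1.0, 0.0, 0.0])
print(winner, som[0][0])
```

After training on many objects, the (row, col) position of each object's winning neuron serves as its coordinate on the two-dimensional map.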
10:30 28 Multidimensional NMR with neural networks.
D. Ziessow, P. Arlt, M. Sielaff, C. Tegeler, Technische Universitaet Berlin, 10623 Berlin, Germany.
A new approach to multidimensional (nD) NMR spectroscopy is presented which promises a considerable reduction of measurement time vs. common nD NMR. The spins are excited with a long sequence of constant-angle r.f. pulses with random pulse intervals. The resulting NMR signal is, after digitization, stored on disk (about M = 500 000 data values y(m)). Then the next sample is put into the magnet. Each value y(m) is related to the excitation value x(m) and L past values x(m-l) by the unknown gross transfer (GT) function which characterizes the spin system. We succeeded in numerically representing this GT function with a neural network: its output value z(m) for L+1 input values x(m), x(m-1),..., x(m-L) is fitted to y(m) with the back-propagation algorithm (M-L sets). Then, using the trained neural network, time data are calculated for conventional nD pulse sequences, including t1 quadrature detection and soft-COSY, and processed as usual.
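The pairing of each response value y(m) with the current and L past excitation values can be sketched as follows. Only the construction of the M-L training sets is shown; the network and its back-propagation fit are omitted, and the function name and sample data are hypothetical.

```python
def lagged_training_sets(x, y, L):
    """Pair each y(m) with the inputs x(m), x(m-1), ..., x(m-L)."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    sets = []
    # The first L points lack a full history, which leaves M - L sets.
    for m in range(L, len(x)):
        inputs = [x[m - lag] for lag in range(L + 1)]
        sets.append((inputs, y[m]))
    return sets

# Illustrative data only: M = 6 values and L = 2 past excitations.
x = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
y = [10.0, 11.0, 12.0, 13.0, 14.0, 15.0]
for inputs, target in lagged_training_sets(x, y, 2):
    print(inputs, target)
```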
11:00 29 Evaluation of high-throughput screening hits by means of Kohonen neural networks
G. Barnickel, S. Anzali, Medicinal Chemistry Research Department, Merck KGaA, Darmstadt, Germany.
The advent of combinatorial chemistry and high-throughput screening has created a strong demand for computational tools to support the conversion of hits into lead molecules. Methods are needed to analyze the active compounds obtained in terms of structural and property variations given within the screened compounds. Furthermore the chemists require a ranking of these hits to be able to determine which compounds to make next. For this purpose the Kohonen map proved to be a promising approach. This presentation will discuss the advantages of this approach and a comparison with other methods will be presented.
11:30   Society Committee on Copyrights Open Meeting
12:00   Lunch Break
2:00 30 Chemists and computers: concord or discord?
V. J. van Geerestein, Department of Computational Medicinal Chemistry, N.V. Organon, 5340BH Oss, The Netherlands
Over the last decade we have seen an increasing number of new developments taking place at the borderline between the fields of Chemical Information and Computational Chemistry. The former field has become more computational and the latter field relies more and more on knowledge rather than being based solely on calculations from first principles. Both science and method development are driven by a large number of enthusiastic scientists, in academia as well as in industry. Within the pharmaceutical industry, Computer Chemistry has now become accepted as a significant discipline in the quest for new drug development candidates. Nevertheless, despite tremendous developments in methods and user friendliness of software, the typical Medicinal Chemist is still rather reluctant to use computer methods to the full extent. This talk will discuss various developments in the field of Computer Chemistry and the impact these have had on the thinking and practice of (Medicinal) Chemists.
2:30 31 Prediction of chemical properties of organic compounds from molecular structure
P. C. Jurs, 152 Davey Laboratory, Chemistry Department, Penn State University, University Park, PA 16802.
Relationships between the molecular structures of organic compounds and their chemical properties can be investigated using quantitative structure-property relationship (QSPR) methods. This approach uses induction to seek generalities by examining large sets of training set compounds. Such QSPR studies involve two major activities: representation and mapping. Representation involves encoding the compounds with topological, geometrical, electronic, and hybrid molecular structure descriptors. Descriptor selection follows, which involves selecting the most informative subsets of descriptors, often using the genetic algorithm. Mapping involves analysis of the descriptors using multivariate statistical methods or computational neural networks to build mathematical models linking the descriptors directly to the chemical property under investigation. Recent investigations involving computational neural networks and genetic algorithms will be described as examples of the application of the QSPR methods. Several specific, recent QSPR studies will be discussed, including studies of prediction of the normal boiling points and aqueous solubilities of organic compounds.
3:00 32 Database searching using molecular fields
P. Willett, D. J. Wild, Department of Information Studies, University of Sheffield, Western Bank, Sheffield S10 2TN, UK.
This paper discusses the use of similarity measures based on electrostatic, hydrophobic and steric fields for similarity searching in databases of 3D structures, where the fields representing a target structure and a database structure are aligned by means of a genetic algorithm. Searches of databases of bioactive molecules demonstrate that the field-based measures result in the retrieval of active molecules that are structurally more diverse than those resulting from conventional 2D and distance-based 3D similarity measures.
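The abstract does not name the similarity coefficient used. As one plausible illustration, the sketch below computes a cosine-like overlap (often called the Carbo index) between two fields sampled at the same grid points. It assumes the structures have already been aligned, so the genetic-algorithm alignment step is not shown, and the function name and sample values are hypothetical.

```python
import math

def carbo_similarity(field_a, field_b):
    """Cosine-like overlap of two fields sampled on the same grid points."""
    if len(field_a) != len(field_b):
        raise ValueError("fields must be sampled on the same grid")
    overlap = sum(a * b for a, b in zip(field_a, field_b))
    norm_a = math.sqrt(sum(a * a for a in field_a))
    norm_b = math.sqrt(sum(b * b for b in field_b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("a field is zero at every grid point")
    return overlap / (norm_a * norm_b)

# Illustrative electrostatic potentials at four grid points (made-up values).
target = [0.2, -0.5, 0.1, 0.4]
candidate = [0.1, -0.4, 0.0, 0.5]
print(carbo_similarity(target, candidate))
```

In an alignment search, a score of this kind would be recomputed for each trial orientation of the database structure, with the search keeping the orientation that maximizes it.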
3:30 33 Chemical information and robotics in the analytical spectroscopy lab.
J. R. Richert, BASF AG, Main Laboratory, ZHV/S - B 9, D-67056 Ludwigshafen, Germany
New methodologies such as combinatorial chemistry or high-throughput screening (HTS) lead to increased demands on quality and efficiency of an industrial spectroscopy laboratory. While laboratory robotics and sophisticated LIMSes accelerated data acquisition and handling, structure elucidation remained largely a "manual" process. At BASF several new approaches to automated structure elucidation have been developed over the years. Among them, SPECSOLV, which represents a new module for the multidimensional spectroscopic interpretation system SPECINFO, is a self-learning, artificially intelligent approach based exclusively on 13C-NMR. Another development is the automated interpretation of HPLC/ESI-MS. In conjunction with modern chromatographic materials, HPLC/ESI-MS provides very fast qualitative and quantitative information on reaction mixtures, and finds use in quality control, combinatorial chemistry and HTS.
4:00 34 SESAMI: An integrated desktop structure elucidation tool.
M. E. Munk, M. Madison, K. P. Schulz, A. Korytko, Y. Huang, Department of Chemistry and Biochemistry, Arizona State University, Tempe, AZ 85287-1604.
The current generation of personal computers is approaching a level of power to seriously consider the platform for running the kind of sophisticated scientific software now limited to more powerful, expensive desktop workstations. The low cost of these units makes it possible to consider one a part of the standard equipment of every laboratory. The ease of accessibility to such a useful tool can facilitate "multi-tasking" by the chemist further enhancing productivity. Realizing this goal requires the adaptation of some existing software and the development of new programs for this platform. Structure determination is one commonly encountered problem in the laboratory of the organic and medicinal chemist. Many computer-based tools for this purpose have been described, with those directed toward complex organic compounds typically requiring more powerful workstations. A working PC version of SESAMI, a comprehensive system of computer-enhanced structure elucidation which is an ongoing developmental project, has been created which offers some new features.
4:30 35 Generation of refined PLS models by automatic annihilation of variables.
M. Marsili, Department of Chemistry, University of L'Aquila, 67100 L'Aquila, Coppito2, Italy.
In search of inner relations between a block of covariant variables and a block of response variables, some rotated principal components are extracted from the blocks. The modified PLS algorithm implemented in SPECTRE uses a cross-validation technique to determine the correct number of components for best prediction, linked to simultaneous evaluation of the importance of each variable. Elimination of noise variables allows higher predictability by PLS models. The algorithm, due to natural covariance in the data, also spans all possible models with predictive power higher than that obtained with all variables contained in the model equation. This technique allows summation of occurrences of any variable in an accepted model, thus giving a subset of variables that are relevant for prediction. The application of the algorithm to real clinical and industrial data is shown by examples that highlight the capability of the system to select subsets of strongly modeling variables.
5:00 36 Electronic Chemistry Journals for the 21st Century.
Stephen R. Heller, USDA, ARS, Bldg. 005, Beltsville, MD 20705.
This lecture will predict and describe what electronic journals in chemistry will be able to provide to the scientific community over and above their counterparts in print. With the growth of the Internet, electronic journals are often being hyped, both in positive and negative ways, beyond any reasonable degree of rational expectation by both the entrenched vested interests of the publishers and the desires of the working chemists. A number of fundamental issues related to electronic publishing and electronic journals will be discussed, including peer review, copyright, economics and pricing, social, employment, infrastructure, and where abstracting and indexing services belong in this new system. Lastly, current efforts by both existing publishers and new endeavors in the area of electronic journals and electronic editions of print journals will be discussed.
5:30 37 MJOLLNIR-II -- A chemical informatics server for the internet.
D. Weininger, Daylight Chemical Information Systems, Inc., Santa Fe, NM
Mjollnir is an on-going project to address the scientific, technical, commercial and sociological aspects of chemical information exchange. The overall design is based on the sociological concept of a forum. An early version (Mjollnir-I) implements universal access to data via low-performance (e-mail) retrieval and has been operational since August 1993. Mjollnir-II raises system performance and introduces more features of the forum. One distinguishing feature of this system is that users' identities and search requests are available and searchable. Users can find out who else is interested in similar information, but the system is not suitable for proprietary queries in its public implementation. Other design features: Database search engine supporting tens to hundreds of searches/min; Java-based interfaces with low per-user bandwidth requirements; Large databases (7M structures and reactions); Free access; Available as a supported product for high-performance, in-house use; Restrictions against database dumping are automatically enforced.



Skolnik Award Symposium
J. Gasteiger, Organizer, Presiding
8:30 38 The synthesis planning in AIPHOS - Recent development.
K. Funatsu, H. Yoshino, Department of Knowledge-Based Information Engineering, Toyohashi University of Technology, Tempaku, Toyohashi 441, Japan.
To make the organic synthesis design system AIPHOS more practical, a method has been developed for automatic selection of suitable starting materials and for reduction of strategic sites and the corresponding reaction paths. The skeletons of the starting material selected from a library and of the target structure are overlapped. When strategic sites already obtained by other modules of AIPHOS fall within the overlapped skeleton, those strategic sites are discarded because they would break the starting material skeleton. It has been shown that the program prepared in this study can generate useful synthesis paths that keep the skeleton of the selected starting material.
9:00 39 Electronic information for organic synthesis - how to make the best use of it.
G. Grethe, D. Hounshell, R. W. Snyder, MDL Information Systems, Inc., 14600 Catalina Street, San Leandro, CA 94577
Over the last few years the amount of reaction information available electronically in-house or online from large databases has increased dramatically. Combined with data from smaller specialty databases, such as those for protecting groups or solid-phase synthesis, this information becomes increasingly difficult to manage by the end-user chemists. This paper will discuss the problems associated with reaction retrieval and steps taken to alleviate some of the problems. These steps include improvements in user interface to help both novice and expert users and classification of reactions and data clustering to increase the effectiveness of managing search results. Examples taken from the very recent literature will be used to demonstrate these improvements.
9:30 40 Browsing in reaction databases: Creating order out of chaos.
J. R. Rose, Department of Computer Science, University of South Carolina, Columbia, S.C. 29206
The explosive growth in the size of reaction databases has resulted in a new set of problems. One of the most pressing problems is not how the data is stored but how the user navigates through such vast amounts of information. Query methods that were adequate for reaction databases comprising tens of thousands of reactions are woefully inadequate when the database grows by one or two orders of magnitude. An approach to browsing based on interactive hierarchical classification will be presented.
10:00 41 Interactive and complex visualization of chemical reaction schemes.
R. Deplanque, FIZ-Chemie GmbH, Postfach 12 60 50, D-10593 Berlin, Germany
The usual method to describe chemical reactions graphically is to combine structural formulas of reactants and products in a two-dimensional static form. In the ChemInform database the total synthetic pathway is available. It is not possible to include the kinetic and thermodynamic aspects of the reaction. Important factors such as stereoselectivity cannot be shown properly. New ways of visualising chemical reactions allow interactions with the complex parameters of the total reaction within 3D hyperspace, thus increasing one's understanding of the underlying chemistry and shortening the development time of new compounds. Using this new visual approach we will discuss the advantages and the limitations of multi- and hypermedia methods in science and to what extent a combination of modern computer tools can help the scientist in the understanding of nature.
10:30 42 Software to aid in drug design, synthesis, and evaluation
Jorgensen, Department of Chemistry, Yale University, New Haven, Connecticut 06520-8107
The CAMEO and MCPRO programs are being applied to aid drug discovery. CAMEO is used to predict products of organic reactions given the starting materials and conditions. This allows evaluation of the feasibility of proposed synthetic routes, provides SAR data, and assists in the design of optimal combinatorial libraries. CAMEO is also being used to evaluate the stability of drug candidates by analyzing the outcomes of degradative reactions including hydrolyses, thermolyses, photolyses, and oxidations. MCPRO is used at the design level to evaluate the structures and free energies of binding for protein-ligand complexes. Monte Carlo statistical mechanics is employed with both free energy perturbation and linear response approaches. Some systems that have been treated include thrombin with sulfonamide inhibitors, FKBP and FK506 mimics, and cyclophilin with cyclosporin derivatives.
11:00 43 The confluence of rational and combinatorial design methodologies.
S. D. Kahn, Molecular Simulations, Inc., 9685 Scranton Road, San Diego, California 92121 USA
Merging rational drug design techniques that employ analog information to develop 3D structure-activity hypotheses, and using these hypotheses to focus combinatorial libraries to optimize activity, leverages the wealth of tools already developed to identify diverse chemical templates for the design of entirely new libraries optimized on a given biological response. Results will be presented that mimic multiple iterations of the drug design process to validate the methodology, and in so doing argue for the use of 3D information in the combinatorial process. Throughout the process, necessary links to corporate information (e.g., stored within MDL's ISIS databases) will be highlighted.
11:30 44 Computer-aided drug design: Current state and future perspectives.
H. Kubinyi, H. J. Boehm, Drug Design, BASF AG, D-67056 Ludwigshafen, Germany, and Computational and Structural Chemistry, Hoffmann-La Roche AG, CH-4070 Basle, Switzerland.
With the ongoing progress in protein crystallography and NMR, rational approaches in drug design become more and more important. Many successful lead optimizations prove the value of structure-based approaches. After the pioneering work of the Kuntz group at UCSF (program DOCK) and Howe and Moon at Upjohn (program GROW), the de novo-design program LUDI was developed at BASF. It incorporates and combines 3D searches of structural databases with growing and linking modes. In addition, it is capable of considering the flexibility of a ligand. A scoring function provides a rank order of the results. Some practical examples of the successful application of LUDI in the design of enzyme inhibitors and other protein ligands will be presented. Further developments of de novo-design programs will consider the synthetic accessibility of the proposed ligands and will build ligands within the binding site, following the principles of combinatorial chemistry. These options are important steps towards the goal of an automatic design of biologically active ligands.
12:00   Lunch Break
1:30 45 A field-based approach to molecular similarity with applications to small molecules and proteins
G. M. Maggiora, J. Mestres, D. C. Rohrer, Computer-Aided Drug Discovery, Pharmacia & Upjohn, 301 Henrietta St., Kalamazoo, MI 49007
Field-based approaches to molecular similarity rely primarily upon matching the steric and electrostatic fields of the set of molecules being compared. Essentially all studies to date have been confined to small molecules. Recently, however, work in our laboratory has shown that field-based approaches can be applied to proteins as well. In fact, the results obtained thus far suggest that field-based similarity matching of proteins provides a robust procedure for aligning three-dimensional protein structures that avoids the bias introduced in many other procedures based upon sequence alignment. The presentation will summarize our approach to small-molecule field-based similarity matching, as implemented in the program MIMIC, and its application to the matching of protein structures. Examples of the three-dimensional alignment of proteins from several families such as matrix metalloproteinases will be presented.
2:00 46 Descriptors that outperform substructures in diversity analysis.
Y. C. Martin, R. D. Brown, E. A. Danaher, J. DeLazzer, I. Lico, Abbott Laboratories; D-47E, AP10/2; Abbott Park, IL
In our earlier evaluation of molecular descriptors for diversity analysis, we found that traditional substructure keys more successfully distinguished active from inactive molecules than did 3D descriptors generated from the distances between atoms. This report describes 3D descriptors generated from the location of hypothetical site points complementary to pharmacophoric atoms in the molecules. Classification of atoms was accomplished with hundreds of Daylight SMARTS targets to describe Hbond donors & acceptors, positively & negatively charged atoms, and hydrophobic centers. The location of the site points was determined from crystallographic packing or 6-31G** calculations. These new descriptors more successfully distinguish active from inactive molecules in our most diverse dataset. Additionally, using them in a clustering analysis results in larger bioequivalent clusters compared to the best 2D descriptors.
2:30 47 Pharmacophores in Drug Discovery.
G. W. A. Milne, S. Wang, M. C. Nicklaus, Laboratory of Medicinal Chemistry, NCI, NIH, Bethesda, MD 20892.
Chemicals bind to enzymes with a variety of forces such as hydrogen bonds, and the atoms in a substrate which bind to the enzyme constitute the pharmacophore - that part of the molecule which is responsible for the biological activity. Any molecule which contains that pharmacophore is capable in principle of binding to the enzyme. A compound that can bind in the active site of an enzyme can behave as an inhibitor because it can compete for that site with the enzyme's normal substrate. If the reaction that is inhibited is essential to a disease process, the inhibitor is a potential lead drug. This has led to the technique of pharmacophore searching in which the NCI database is examined for compounds which contain a pharmacophore - defined constitutionally and geometrically. Such compounds can be retrieved from the NCI repository and bioassayed. The proportion of the compounds retrieved by pharmacophore searches which are biologically active ranges from 10% to 50%. The method has been applied successfully to the discovery of inhibitors of HIV protease and integrase, of activators of protein kinase C and a variety of other bioactive compounds.
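A pharmacophore "defined constitutionally and geometrically" can be pictured as a set of feature types plus distance ranges between them. The sketch below is only an illustration under that assumption: the feature labels, coordinates, distance ranges, and function name are hypothetical, and it does not represent the NCI search software.

```python
from itertools import permutations
from math import dist

def matches_pharmacophore(features, query, constraints):
    """Check one conformer against a distance-based pharmacophore.

    features:    list of (feature_type, (x, y, z)) for the molecule
    query:       list of required feature types, e.g. ["donor", "acceptor"]
    constraints: {(i, j): (d_min, d_max)} distance ranges between
                 query points i and j, in the same units as the coordinates
    Returns True if some assignment of distinct molecule features to the
    query points has the right types and satisfies every distance range.
    """
    for combo in permutations(range(len(features)), len(query)):
        # Constitutional part: feature types must match the query.
        if any(features[k][0] != query[i] for i, k in enumerate(combo)):
            continue
        # Geometric part: every constrained distance must fall in range.
        if all(d_min <= dist(features[combo[i]][1], features[combo[j]][1]) <= d_max
               for (i, j), (d_min, d_max) in constraints.items()):
            return True
    return False

# Hypothetical three-point query and two hypothetical molecules.
query = ["donor", "acceptor", "hydrophobe"]
constraints = {(0, 1): (2.5, 3.5), (1, 2): (4.0, 6.0), (0, 2): (5.0, 7.0)}
mol_a = [("donor", (0, 0, 0)), ("acceptor", (3, 0, 0)), ("hydrophobe", (3, 5, 0))]
mol_b = [("donor", (0, 0, 0)), ("acceptor", (3, 0, 0)), ("hydrophobe", (3, 1, 0))]
print(matches_pharmacophore(mol_a, query, constraints))
print(matches_pharmacophore(mol_b, query, constraints))
```

A real search would also have to handle conformational flexibility and much larger feature sets, which this brute-force enumeration ignores.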
3:00 48 Molecular similarity using two-dimensional representations of structures.
W. G. Richards, D. D. Robinson, Physical and Theoretical Chemistry Laboratory, Oxford University, South Parks Road, Oxford OX1 3QZ, UK.
The introduction of high throughput synthesis and combinatorial chemistry has necessitated a huge reduction in the time required to compute molecular similarities. One way in which this may be achieved is to work in two dimensions. At the same time the details of three-dimensional shape are so important that they must be incorporated. Representations produced by non-linear mapping techniques have the required properties. As with three-dimensional similarity, the molecules to be compared must be superimposed and optimally aligned. Techniques such as invariant moments permit this and also offer novel ways in which similarity may be computed. The procedures are based on techniques originally developed for optical character recognition and offer similar possibilities for speed.
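To illustrate what an invariant moment is, the sketch below computes the first of Hu's moment invariants for a small 2D intensity grid and checks that it is unchanged by a 90-degree rotation. Hu's invariants are a standard tool in character recognition, but whether they are the specific moments used by the authors is an assumption, as are the function names and the toy image.

```python
def first_hu_invariant(image):
    """Return Hu's first moment invariant (eta20 + eta02) of a 2D grid.

    image is a list of rows of non-negative intensities; x is the column
    index and y is the row index.
    """
    m00 = sum(sum(row) for row in image)
    x_bar = sum(x * v for row in image for x, v in enumerate(row)) / m00
    y_bar = sum(y * v for y, row in enumerate(image) for v in row) / m00
    # Central second-order moments (translation invariant).
    mu20 = sum((x - x_bar) ** 2 * v for row in image for x, v in enumerate(row))
    mu02 = sum((y - y_bar) ** 2 * v for y, row in enumerate(image) for v in row)
    # Normalisation: eta_pq = mu_pq / m00 ** (1 + (p + q) / 2); here p + q = 2.
    return (mu20 + mu02) / m00 ** 2

def rotate90(image):
    """Rotate the grid by 90 degrees clockwise."""
    return [list(row) for row in zip(*image[::-1])]

# Hypothetical L-shaped binary image.
shape = [
    [1, 0, 0],
    [1, 0, 0],
    [1, 1, 1],
]
print(first_hu_invariant(shape))
print(first_hu_invariant(rotate90(shape)))
```

Because such a descriptor does not depend on orientation, two representations can be compared by their moment values without first searching over rotations.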
3:30 49 3D Structures and the bioinformatics revolution
D. H. Smith, MDL Information Systems, Inc., 14600 Catalina St., San Leandro, CA 94577
Research in genomics is now a strategic component of life sciences research and discovery. This revolution requires us to examine the role of three-dimensional structures and their possible receptors with a new set of objectives. I will discuss the emerging role of structural bioinformatics in the process of target selection, and the attendant requirements for high quality structural information on candidates screened in the virtual laboratory.
4:00 50 Challenges and progress in structure-based ligand design.
A. P. Johnson, University of Leeds, Leeds, LS2 9JT, UK
The SPROUT program provides a set of tools designed to aid the synthesis of synthetically accessible ligands which either fit a pharmacophore hypothesis or are predicted to bind strongly to a particular 3-D structure of a protein. The HIPPO module identifies potential target sites where suitable ligand atoms might be positioned for binding to the protein by H-bonding, covalent interactions, metal ligand interactions and hydrophobic interactions. Other modules dock small fragments to these sites and connect them together by an exhaustive growing process. The ALLIGATOR module provides clustering and ranking tools which permit the user to navigate through the answer set, including estimation of structural complexity as well as an empirical estimate of binding affinity. In the final step, the CAESA module is used to provide an estimate of the ease of synthesis of each of the suggested molecules. Examples of the application of SPROUT to inhibitor design will be presented.
4:30 51 Second generation de novo drug design methodology.
W. T. Wipke, Brian Goldman, Michael Kappler, Brett Kislin, John Lawton, Jim Arnold, Molecular Engineering Laboratory, Dept. of Chemistry, UCSC, Santa Cruz, CA 95064, wipke@chemistry.ucsc.edu
The first generation of structure-based de novo design of drug candidate molecules was based on a pharmacophore model, lead compound, and/or receptor three-dimensional structure. We have developed an automated design system, INVENTON, that incorporates these design approaches. Our recent work has been bringing INVENTON to what we call the second generation level of design concepts. These are concepts that are difficult for chemists, e.g., protein and inhibitor flexibility, potential mutations that may occur in the protein, alternative binding modes, and water molecules in the active site. We hypothesize that appropriate consideration of these factors will lead to superior designs for drug candidates.
5:00 52 De novo screening - A search method for the discovery of novel ligands.
P. W. Rose, T. J. Marrone, Agouron Pharmaceuticals, Inc., 3301 North Torrey Pines Court, La Jolla, California 92037.
De novo screening is the process of computationally discovering fragments that specifically interact with biological targets whose three-dimensional structure is known. Fragments used in this process are either generated computationally, obtained by fragmenting existing molecules, or are small molecules that are commercially available. The fragments are docked into the target protein and the binding free energy is estimated by an empirical scoring function derived from the analysis of a large number of protein-ligand complexes. The top scoring fragments can be elaborated into ligands or they can serve as core structures in combinatorial libraries.
5:30 53 Years living with molecular topology.
J-E. Dubois, ITODYS, 1 rue Guy de la Brosse, 75005 Paris, France.
From the intrusion of computing in daily theoretical chemistry and information retrieval in 1965 to the present situation where micro computing and Internet modify the working habits of the chemist, many drastic changes have occurred in the chemist's activities. The classical vision of "constitutive fragments" of a structure faced a strong competition of different captures of information with the "molecular topological paradigm". Structural topology builds a graph vision of a structure with its matrix description. As a structural paradigm it leads to new languages and a global vision of environmental aspects of a site in a structure. At first it seems to oppose the "molecular fragmentation paradigm" but if computing leads to excellent topological tools it also enables a combination of the two paradigms as often complementary sources of information and modeling tools for CAD design. These changes have played a major role professionally and institutionally in various ways and at different levels.



Electronic Notebooks and INTRANETS
R. Lysakowski, Organizer, Presiding
8:30 54 Driving creation and acceptance for collaborative electronic notebooks.
R. Lysakowski, The Collaborative Electronic Notebook Systems Association, Woburn, MA 01801 USA
The Collaborative Electronic Notebook Systems Association (CENSA) is a new professional and trade association. The major project of CENSA is now the Collaborative Electronic Notebook Systems (CENS) Consortium. The CENS Consortium has many large end users and vendors that are defining and funding development of collaborative electronic notebook system hardware and software for their own applications in research, product development and product testing in high-throughput screening, analytical chemistry, biotechnology and related areas. Applications of these technologies will rapidly spill over into many other industries as the CENS products and standards make their way to the open market. This paper will provide an overview of CENSA, the Consortium, and its progress in its first year of operation. It will also discuss some cogent issues being addressed by CENSA's global product R&D program.
9:00 55 DOE2000 electronic notebook architecture.
N. M. Nachtigal, G. A. Geist, J. D. Myers, S. R. Sachs, Oak Ridge National Laboratory, Oak Ridge, Tennessee, 37831.
We describe an architecture for extensible, inter-operable, electronic scientific notebooks being developed collaboratively by Lawrence Berkeley National Laboratory, Oak Ridge National Laboratory, and Pacific Northwest National Laboratory for the US Department of Energy's DOE2000 National Collaboratory Initiative. The architecture, based on the World Wide Web and Java, defines a base set of functionality to support text, drawing, image, and other general annotation types, and provides extension mechanisms for supporting specific types of scientific data, such as mass spectra or 3-D molecular structures. The current state of the project will be described, with an emphasis on the notebook architecture being developed and the features that are provided to support record-keeping requirements of collaborating teams of scientists and engineers. Uses of the currently available prototypes built on this architecture will be shown.
9:30 56 Electronic notebooks freedom & power.
P. A. Lofty, Morton International, Tustin, CA 92680
Scientists within a modern laboratory must have the power to manage information and the freedom to use this information where they work. Wireless technology can now provide the mobility to work locally and collaborate globally. FREEDOM: The wireless Electronic Lab Notebook provides the scientist the freedom to move from lab to lab and building to building, while directly using all the functions of the systems available. POWER: Laboratory Information Management Systems fully integrated with other systems (document filing system, data warehouse, e-mail, statistical tools, ...) will provide the team scientist with tools for project data management and collaboration. Actual examples of this freedom and power will be presented.
10:00 57 The EMSL publisher: A new tool for collaborative and distributed journal authoring.
C. I. Parkinson, D. B. Rex, R. A. Bair, Environmental Molecular Sciences Laboratory, Battelle Pacific Northwest National Laboratories, Richland, WA, 99352
In order to address the needs of the authoring scientist, a team at PNNL is creating a new software application that enables technical papers to be written and styled with ease, even when collaborative authoring is required, or results are distributed across a wide computing environment. This new software, "The Publisher", is written entirely in the Java language, making it portable across many different computing platforms. In addition to 'standard' word-processing features, the Publisher provides automatic journal styling, reference database management, and the ability to embed 3rd-party Java components within the text, such as 2D chemical sketchers, mathematical equations, and rotatable 3D molecular models. Although this system is primarily designed for high-quality journal printing, it also enables papers to be published electronically for the Web, in a way that preserves the interactivity of objects like the 3D molecule viewer.
10:30 58 ChemSymphony: A tool for publishing using HTML and on the WWW.
P. Tebbutt, A. Hodgkin, A. Kassavine, Cherwell Scientific Publishing, Magdalen Centre, Oxford Science Park, Oxford OX4 4GA, UK.
ChemSymphony is a platform-independent set of interactive Java applets that allows 3-D molecular structures to be easily incorporated into HTML documents. Thus structures can be published on the World Wide Web or on any network or PC using the HTML base. Structures can be manipulated in real time, rendered in a variety of styles (wireframe, ball and stick, ribbons, etc.), and can be edited by the user in addition to the publisher. The choices available to the user are, however, defined by the publisher. At present ChemSymphony recognises a variety of structure formats: PDB, XYZ, MDL mol files, and Gaussian and MOPAC z-matrix. The software may be used as an electronic publishing tool, for teaching and research.
11:00 59 What the FDA's final rule on electronic signatures and electronic records means to designers and users of automation systems.
R. Lysakowski, The Collaborative Electronic Notebook Systems Association, Woburn, MA 01801 USA
The US FDA's Final Rule on Electronic Signatures and Electronic Records has just become law. This new regulation has profound implications for designers and implementors of systems used in regulated scientific and technical environments. This changes the rules of the game and causes evolution far beyond the simple knowledge, data, information or document management that automation systems do today, to an entirely new breed of systems for recordkeeping that emphasize security, audit trailing, and very long-term archiving. New issues must be addressed by vendors and users in chemical, pharmaceutical and biotechnology R&D and product testing. This paper will present a thoughtful and seasoned interpretation of automation systems design now that "the electronic recordkeeping cat is out of the bag."



General Papers
C. Gragg, Organizer, Presiding
1:30 60 Searching the Beilstein bibliographic file on the internet: the NetFire system.
J. L. Wisniewski, Beilstein Informationssysteme GmbH, Varrentrappstr. 40-42, Frankfurt/Main, Germany.
A new literature-based database and search system, called NetFire, has recently been made available on the Internet by Beilstein Information Systems. NetFire is a powerful new way for scientists in industry, government and academia to access titles, abstracts and authors of reports published in over 140 top journals in organic and medicinal chemistry. Specifically designed to take advantage of features found in the most popular web browsers available today, NetFire provides users with a tool for searching and displaying literature chemical information. The easy-to-use query-by-form mechanism permits the formulation of a great variety of inquiries. One may search by author, and by words included in the abstract or title. A search can also be restricted to a specific journal or time range. The paper will discuss in detail the design, implementation, and performance of this innovative resource from Beilstein.
1:50 61 Reengineering a corporate chemical information system using the latest client/server technologies: A chemist/end-user approach
P. Perraudin, Questel-Orbit, Le Capitole, 55 avenue des Champs Pierreux, 92029 Nanterre Cedex, France.
Chemical and pharmaceutical companies can no longer consider their in-house DBMSs for chemical structures and reactions as isolated, standalone, independent systems. Chemists need access to external chemical and other information, too, without interrupting their primary jobs. Working with five major European chemical and pharmaceutical companies, we provided their chemists a very intuitive tool, part of the development of future ChemPath Structure and Reaction System, by specifying, testing and validating the new client/server user-interface they expected. This also provided custom links to the rest of the information management system in their respective R&D centers. This paper will show how each succeeded in adapting ChemPath to its own needs and how they see the future of their global information system and end-user Windows/intranet interface.
2:10 62 The Derwent crop protection file, a valuable underutilized source of pesticide information.
J. D. Myers, M. D. Bauer, 1725 Duke Street, Alexandria, VA, 22314
In a world becoming more and more concerned with air, water and soil pollution, information on effective but environmentally friendly pesticides becomes more important. This paper shows how valuable information from published literature, scientific meetings and patents on pesticides can be obtained from the Derwent Crop Protection File (DCPF). Examples of searches on biological control techniques available for use in green-house grown ornamental plants, effects of certain popular pesticides on vertebrates, and novel pesticides currently under development are included. We will also show which companies are the major players in development of new pesticides. Statistical analysis on search results will recognize companies and scientists that are major contributors.
2:30 63 The Merck Index: Assessing usage patterns and impact on scientific communication through citation analysis.
S. Budavari, The Merck Index, Merck Research Laboratories, Rahway, New Jersey 07065
The Merck Index has been highly cited over the years and on numerous occasions has even been referred to as the "chemist's bible". This talk will present a brief overview of the results of a recent investigation of citations of The Merck Index and what they suggest with respect to how the publication is being used by scientists. In addition, inferences as to the role of the "Index" in the realm of scientific literature will be discussed, as well as the implications for future editorial policy.
2:50 64 SLIMS: A spectral laboratory information management system based on web technology.
A. J. Williams, A. Petrauskas, P. Jurgutis, D. Ross, V. Kulkov, Advanced Chemistry Development, Inc., 133 Richmond Street, Suite 605, Toronto, Ontario, CANADA M5H 2L3.
At ACD we have designed and implemented a system to allow corporate-wide access to analytical information, focussing specifically on spectral information. A number of efforts have been made over the years to implement flexible Laboratory Information Management Systems (LIMS). These include filing systems for data, paper trails and log books, simple spreadsheets containing sample identifiers and information, and software-based, vendor-supplied LIMS. Software LIMS have failed to address the flexibility of interface and features required in an analytical environment that requires access to molecular structures and graphics-intensive spectral displays. As a result of collaborative efforts with Eastman Kodak Company we have developed a web-based LIMS for managing spectral and associated molecular structure information. This user-friendly system links a unique sample identifier to sample information, a chemical structure or structures, associated spectra and final reports of analysis.
3:10 65 Internet access to genetic sequence databases.
D. R. Jourdan, Kraft Foods, 801 Waukegan Rd., Glenview, IL 60025. Djourdan@kraft.com
In the last decade the number of genetic sequences being submitted to various databases has dramatically increased. The tools used to access these sequences have recently grown much more powerful. Email autoresponders and web databases allow information professionals and researchers to quickly locate sequences of interest based on fields such as author, organism, or sequence accession number. These records may also provide substantial structural, bibliographic, or taxonomic information. Additionally, tools such as BLAST (Basic Local Alignment Search Tool) allow for the identification of biologically similar sequences.
3:30 66 Comparing diversity selection methods against random selection.
T. Hurst, Tripos, Inc., 1699 South Hanley Road, St. Louis, MO 63144.
Many methods for selection of diverse and representative sets of structures from large collections have been developed over the last few years. These methods vary greatly. Some involve intense statistical methods for clustering, others simple 2D similarity comparisons, and others include 3D information such as pharmacophore triplets. In all of these methods, it is presumed that a diverse subset will increase the efficiency of biological screening. In this presentation, we will answer the pervasive question of this field: "How do I know that screening this selection is better than random screening?"
3:50 67 Secondary research based on the technical literature: The time factor.
M. P. Bigwood, International Technology Information, PO Box 58, Oreland, PA 19075-0058.
Content analysis of the technical literature has been used as the foundation of many technology assessments. Bibliometric analysis methods, on the other hand, are very powerful tools in areas such as strategic technology management and technical competitive intelligence. The argument has been made, however, that these methods are of little value because of the long delays that separate the time a significant technology development takes place in the laboratory and the time it appears in the published literature. In this presentation we want to share data we have collected that provides a quantitative estimate of the time required for a technical development to reach the published domain and compare that timing with a few examples of technology cycle times. As fast as today's technology might be moving, by historical standards, we will see that, for specific cases we looked at, analysis of the published literature in general, and of the patent literature in particular, is still a source of valuable information.