#222 - Abstracts

ACS National Meeting
August 26-30, 2001
Chicago, IL

SUNDAY MORNING

Section A

Chemical Identifiers: Names and Structures. Kurt Loening Memorial Symposium
McCormick Place South Bldg
W.A. Warr and A.D. McNaught, Organizers
9:00 1 Introduction to the symposium "Chemical Identifiers - Names and Structures" (honoring Kurt L. Loening).
W. V. Metanomski, CAS, 2540 Olentangy River Road, P. O. Box 3012, Columbus, OH 43210, Fax: 614-447-3713, wvmetan@cas.org - SLIDES

A short biography of Kurt L. Loening, considered by many the world's foremost expert and leader in chemical nomenclature, will be presented. His professional achievements as well as his personal traits will be highlighted. To put his legacy in a proper perspective, a brief review of milestones in the development and application of chemical nomenclature will be offered. The main purpose of the symposium will not be delving into the past, but providing insight into the future handling of information on chemical structures. To identify the chemical compounds, both the computer representations, including the development of a standard chemical identifier, and names algorithmically derived will be discussed. Using various aspects of computer representations of full structures and substructures, insight will be provided into the identification and searching for chemical compounds in combinatorial libraries and databases containing data on properties and reactions.

9:15 2 Computer aided organic nomenclature: AutoNom(tm) as effective tool for automatic naming at registration and publication of chemical structures.
J. L. Wisniewski, MDL Information Systems GmbH, Theodor-Heuss-Allee 108, 60486 Frankfurt/Main, Germany, Fax: +49-69-50504276, JWisniewski@mdli.com - SLIDES

The design and practical implementation of algorithms and routines for generating systematic IUPAC-sanctioned nomenclature directly from connection tables of organic compounds are discussed. New additions to the newly upgraded AutoNom 2000 system are described, including unambiguous and efficient calculation of the spatial distribution of atoms with reference to a double bond (E,Z) and to a chiral center (R,S) within automatically generated organic nomenclature. The inclusion in the AutoNom naming procedure of IUPAC-sanctioned CAS ring system nomenclature, as an alternative (or addition) to the "native" Beilstein ring system nomenclature, is discussed and illustrated. Advantages of the AutoNom package as a DLL are discussed. The integration of the software into a company compound database registration system is presented in detail.

9:45 3 Completing the cycle of relating systematic names and chemical structures.
A. J. Williams, A. Yerin, Advanced Chemistry Development, 90 Adelaide Street West, Suite 702, Toronto, ON M5H 3V9, Canada, Fax: 416-368-5586, tony@acdlabs.com - SLIDES

Systematic Nomenclature is predisposed to software generation since rules-based systems are ideal tasks for computers to handle. In an ideal world there would be a single static, non-language specific systematic nomenclature accepted by chemists and in general usage. With general acceptance, rigorous application of systematic rules would produce fully reversible chemical names from which chemical structures could be generated. Of course there are multiple systematic nomenclature systems and chemical names found in the literature often are only close approximations to the correct names. We will report on a single integrated software suite which allows the generation of names using the two common standards of IUPAC and CAS Index rules as well as the ability to convert trivial names, IUPAC names and CAS Index names directly into chemical structures. We will review the success of a web-based IUPAC naming service which is accessed worldwide and presently generates hundreds of systematic names per day for chemists worldwide.

10:15   Intermission
10:30 4 Is nomenclature obsolete?
Andrew G. McDonald1, William B. Wise2, Maxwell Richardson2, and Toni Kazic2. (1) Dept. of Biochemistry, Trinity College, Dublin, Dublin 2, Ireland, Fax: 353-1-677-2400, amcdonld@tcd.ie, (2) Institute for Biomedical Computing, Washington University, 700 South Euclid Avenue, St. Louis, MO 63110, Fax: 314-362-0234, toni@athe.wustl.edu

The meaning of a molecule's name is its structure, and the semantics of a reaction is composed of those of the participating molecular species and the chemical transformations involved. Historically, mapping molecular names to structures has been done by establishing nomenclature schemes, assuming that standardized nomenclature facilitates communication among biochemists and others. But such standardization remains more a goal than a reality. There is an efflorescence of persistent, nonidentical naming schemes in biochemistry, and these are being rapidly enshrined in databases and web sites. Perhaps reconciling these schemes is illusory, and all one can do is build tables of synonyms and their structures. If so, then nomenclature would be obsolete.

We have been confronting these issues as we build the Enzyme Nomenclature Database (END) of reactions derived from the IUPAC-IUBMB Joint Commission on Biochemical Nomenclature. In this talk we will describe how we are resolving these difficulties and discuss the relevance of nomenclature to databases.

11:00 5 Towards the development of a standard chemical identifier.
Stephen E. Stein, Stephen R. Heller, and Dmitrii V. Tchekhovskoi, Physical and Chemical Properties Division, NIST, Gaithersburg, MD 20899, steve.stein@nist.gov

The development of a method for generating a unique digital representation of a chemical substance has been of interest for many years. While many of the technical challenges have been overcome, no openly-available standard exists. Such a standard is now of particular value for effective chemical communication on the Web. This presentation will describe progress made towards the development of such a standard under the auspices of IUPAC - the IUPAC Chemical Identifier, IChI.

The first phase of this program concerns the representation of well-defined organic compounds. The identifier is composed of several 'layers' of structural information as derived from 'connection table' input. The primary layer represents the basic connectivity of the substance, with additional layers representing tautomeric, stereochemical, and isotopic information. Since most necessary algorithms have been published, our task was to select and integrate available methods and then develop a practical implementation for further testing and refinement. Since this will be an open standard, we invite participation from all interested parties.
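
The layered design described above can be illustrated with a minimal sketch. It assumes a hypothetical text format: the "/" separator, the one-letter layer prefixes, and the placeholder layer contents are invented for illustration and are not the IChI syntax, which the abstract does not specify.

```python
from typing import Optional


def compose_identifier(connectivity: str,
                       tautomer: Optional[str] = None,
                       stereo: Optional[str] = None,
                       isotope: Optional[str] = None) -> str:
    """Join a mandatory connectivity layer with optional extra layers.

    The "/" separator and the one-letter layer prefixes are hypothetical,
    chosen only for illustration; they are not the IChI syntax.
    """
    if not connectivity:
        raise ValueError("the connectivity layer is required")
    layers = [connectivity]
    for prefix, value in (("t", tautomer), ("s", stereo), ("i", isotope)):
        if value:  # omit layers that carry no information
            layers.append(prefix + ":" + value)
    return "/".join(layers)


# Placeholder layer contents: two records that differ only in
# stereochemistry share the same primary (connectivity) layer.
plain = compose_identifier("C4H8O2;1-2,2-3")
chiral = compose_identifier("C4H8O2;1-2,2-3", stereo="2R")
```

One consequence of such a layout is that comparing only the first layer groups together records that differ in stereochemical or isotopic detail.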

11:30 6 Developing CAS services for substructure searching by chemists.
L. S. Toler, CAS, 2540 Olentangy River Road, Columbus, OH 43210, ltoler@cas.org - SLIDES

To identify substances for the CAS Chemical Registry, CAS developed a system to give each unique compound a single CA index name, CAS Registry Number, and connection table. A rigorous system of structuring conventions was developed to address the wide variety of organic and inorganic compounds encountered by CAS in the journal and patent literature. Information professionals who search the Registry file have to apply those conventions when constructing substructure search queries. However, when more chemists (information "end users") began doing their own searches, they were interested in getting answers, not in learning structuring conventions. Clearly, software had to be created to provide more sophistication at the backend in order to give chemists more freedom in creating structure queries. The authors trace the development of CAS's end-user research tools, SciFinder and SciFinder Scholar, of intelligent systems that account for structuring conventions and tools that help chemists convert structure search results into knowledge.

12:00   Lunch Break

SUNDAY AFTERNOON

Section A

Chemical Identifiers: Names and Structures - Kurt Loening Memorial Symposium
McCormick Place South Bldg
W. A. Warr and A. D. McNaught, Organizers
1:30 7 Structure searching: what you get is what you wanted.
Keith T Taylor, Doug Hounshell, Jim Nourse, Brad Christie, and Burt Leland, Product Marketing, MDL Information Systems Inc, 14600 Catalina Street, San Leandro, CA 94577, keitht@mdli.com - SLIDES

Chemists universally understand the 2D graphical chemical structure representation but it only approximates to reality. It represents structures in terms of atoms and bonds whereas today reality is considered to be atomic nuclei and electron density. In addition the actual chemical structure often depends on the environment that the structure is in – tautomerism is a common example. Chemists are expert at interpreting the representation but this is a difficult task for computer systems, which deal with yes/no decisions. At first sight the EXACT match search appears to be unambiguous but on further investigation it soon becomes clear that an exact match depends on the viewpoint of the chemist. Software and search technologies will be discussed that allow chemists to retrieve the structures that they expect in the form that they expect.

2:00 8 Structure searching using SMILES and relational databases.
Roger A. Sayle, and John J Delany III, Daylight CIS, 441 Greg Avenue, Santa Fe, NM 87501, Fax: 505-989-1200, roger@metaphorics.com

String-based languages for representing complex objects are a powerful tool for chemical information management. The combination of these languages with relational database technology creates a new platform for managing large databases. Several challenges in structural searching of large databases within the relational database environment will be discussed and are addressed with new algorithms for processing large amounts of data.
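
As a minimal sketch of the general idea of keeping string notations in a relational table, the following uses Python's built-in sqlite3 module. The table name, column names, and the assumption that strings were canonicalized by a toolkit before loading are illustrative only; the example covers exact string lookup, not the substructure-search algorithms the abstract refers to.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# The UNIQUE constraint also gives the column an index for fast lookup.
conn.execute(
    "CREATE TABLE compound (id INTEGER PRIMARY KEY, smiles TEXT NOT NULL UNIQUE)"
)
# Assumption: these strings were canonicalized by a toolkit before loading.
conn.executemany(
    "INSERT INTO compound (smiles) VALUES (?)",
    [("CCO",), ("CC(=O)O",), ("c1ccccc1",)],
)


def find_exact(connection, smiles):
    """Return the row id for an exact string match, or None."""
    row = connection.execute(
        "SELECT id FROM compound WHERE smiles = ?", (smiles,)
    ).fetchone()
    return row[0] if row else None
```

A query such as `find_exact(conn, "OCC")` finds nothing even though "OCC" and "CCO" denote the same molecule, which is why a plain string match depends on consistent canonicalization upstream.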

2:30   Intermission
2:45 9 Identifying and finding compounds in combinatorial libraries.
John M. Barnard, Annette von Scholley-Pfab, and Geoff M. Downs, Barnard Chemical Information Ltd, 46 Uppergate Road, Stannington, Sheffield S6 6BX, United Kingdom

Extremely large "virtual" libraries, encompassing millions or billions of individual compounds, often need to be handled when designing appropriate subset libraries to synthesise. Fast algorithms are therefore required to process them efficiently. Such libraries can be compactly represented as Markush structures, and intermediate data structures derived from these can be used for a variety of purposes (Barnard et al., J. Mol. Graph. Model., 2000, 18, 452-463). This paper describes extremely fast enumeration of compact SMILES notations for library members, and compound "names" reflecting the reagents required for their synthesis. Algorithms based on "reduced graph" representations of the Markush can be used to identify the overlap (compounds in common) between different libraries, and their potential to form the basis for a combinatorial library registration and search system is discussed.
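
The enumeration idea can be sketched in a few lines, assuming a hypothetical amide scaffold with two variable positions and invented reagent labels. This is plain string substitution, not the Markush or reduced-graph algorithms of the paper, and the resulting SMILES are not canonicalized.

```python
from itertools import product

# Hypothetical scaffold: an amide with two variable positions.
SCAFFOLD = "{R1}C(=O)N{R2}"
# Hypothetical reagent labels mapped to SMILES fragments.
R_GROUPS = {
    "R1": {"A1": "C", "A2": "CC"},
    "R2": {"B1": "C", "B2": "c1ccccc1"},
}


def enumerate_library(scaffold, r_groups):
    """Yield (name, smiles) for every combination of substituents."""
    positions = sorted(r_groups)
    choices = [sorted(r_groups[p].items()) for p in positions]
    for combo in product(*choices):
        # The "name" records which reagent was used at each position.
        name = "-".join(label for label, _ in combo)
        fragments = {p: frag for p, (_, frag) in zip(positions, combo)}
        yield name, scaffold.format(**fragments)


library = dict(enumerate_library(SCAFFOLD, R_GROUPS))
```

Because the generator yields members one at a time, a very large library can be streamed without holding every member in memory; the `dict` call here is only for convenient inspection of this four-member example.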

3:15 10 Accessing and exploiting chemical and biological data through Chemlink.
Michael S. Lajiness, and Thomas Hagadone, Computer-Aided Drug Discovery, Pharmacia, 301 Henrietta Street, Kalamazoo, MI 49007, Fax: 616-833-9183, michael.s.lajiness@pharmacia.com

Chemlink is a recently developed cheminformatic system at Pharmacia. It is based on the highly regarded Cousin system, in operation at Pharmacia/Upjohn for over 20 years. Chemlink provides an environment that supports many of the typical chemistry-focused search and display functionalities. In addition, it is an extremely flexible and powerful interface for accessing and exploiting biological data. Currently, Chemlink can access millions of rows of biological data records in addition to many millions of structures. More important than mere access, however, Chemlink enhances the ability to do drug discovery through the use of a sophisticated but easy-to-use interface. This presentation will illustrate some of the more interesting features of Chemlink. Specifically, this presentation will show how one can use Chemlink to

  • Perform multiple full and partial structure searches based on sets of molecules
  • Perform similarity searches
  • Perform dissimilarity searches
  • Search for compounds using structural browsing indices
  • Search for similar molecules that produce different biological effects

3:45 11 Rapid retrieval of molecular geometry information from a crystallographic database
R Taylor1, J C Cole1, M Kessler1, J Luo1, B R Smith1, S E Harris2, and A G Orpen2. (1) CCDC, 12 Union Road, Cambridge CB2 1EZ, United Kingdom, Fax: +44 1223 336033, taylor@ccdc.cam.ac.uk, (2) School of Chemistry, University of Bristol

The Cambridge Structural Database (CSD) contains the results of 230,000 crystal-structure determinations. It has long been a source of information on molecular dimensions and conformational preferences. We are developing a new program, Mogul, which provides easier access to this type of data. The user selects a bond length, valence angle or torsion in a molecule. Mogul generates a search substructure that describes the environment of the selected molecular feature. This substructure is then used to retrieve all matching molecules from the CSD. Search speeds are optimised by use of a search tree indexed on keys which capture atom- and bond-property information. Traversal of the tree corresponds to an exact substructure search without the need for graph matching. Branches at the bottom of the tree point to the appropriate bond-length, valence-angle or torsion distributions. Histograms and summary statistics for retrieved distributions are displayed in the program interface.
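
The key-indexed search tree can be pictured with a minimal sketch. The key strings, the class names, and the numeric values below are all invented for illustration (they are not CSD data or Mogul's actual keys); the point is only that lookup walks one tree level per key, with no graph matching.

```python
from statistics import mean


class Node:
    """One tree node: children indexed by key, values stored at leaves."""

    def __init__(self):
        self.children = {}
        self.values = []


class KeyTree:
    def __init__(self):
        self.root = Node()

    def insert(self, keys, value):
        """Walk or create one level per key, then store the value."""
        node = self.root
        for key in keys:
            node = node.children.setdefault(key, Node())
        node.values.append(value)

    def lookup(self, keys):
        """Follow the keys; return the stored values or an empty list."""
        node = self.root
        for key in keys:
            node = node.children.get(key)
            if node is None:
                return []
        return node.values


def summarize(values):
    """Summary statistics for a retrieved distribution (None if empty)."""
    if not values:
        return None
    return {"n": len(values), "mean": mean(values),
            "min": min(values), "max": max(values)}


tree = KeyTree()
# Invented bond lengths in angstroms, keyed by hypothetical property keys.
tree.insert(("C.sp3", "single", "O.hydroxyl"), 1.43)
tree.insert(("C.sp3", "single", "O.hydroxyl"), 1.42)
tree.insert(("C.sp2", "double", "O.carbonyl"), 1.21)
```

Calling `summarize(tree.lookup(("C.sp3", "single", "O.hydroxyl")))` then returns the count, mean, minimum and maximum of the matching distribution.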

SUNDAY AFTERNOON

Section B

Chemical Information Instruction
Hyatt Regency Chicago Riverside Center
A. B. Twiss-Brooks, Organizer
Cosponsored with Division of Chemical Education
1:30-3:30 12 SciFinder as an intellectual property tool.
Pamela J. Scott, Information Resources, Pfizer, Inc, Eastern Point Road, MS 118W-04, Groton, CT 06340, Fax: 860-715-7353, Pamela_J_Scott@groton.pfizer.com

Teaching SciFinder's capabilities as an intellectual property (IP) tool is a new venture for Pfizer, beginning with text & compound searching, alerts, and patent families, culminating with exporting patent numbers to Aurigin's Aureka product. Our IP curriculum has expanded to bring this incredibly flexible product into the limelight. The combination of several features of SciFinder provides not only full-text access, but the extended capability for further analysis & visualization of patent data. Curriculum and teaching outcomes will be reviewed.

1:30 13 Teaching advanced Beilstein / CrossFire: One approach, two coasts!
Pam Kubiak, Library Services, Agouron Pharmaceuticals, 3565 General Atomics Court, San Diego, CA 92121, and Pamela J. Scott, Information Resources, Pfizer, Inc, pamela_j_scott@groton.pfizer.com

We have developed an aggressive, hands-on training class covering the following: default settings, tautomers, data searching, stereochemistry, atom lists and generic groups, both as individual concepts and in reaction searching with CrossFire 2000. Through many hands-on examples, students quickly master these techniques by successive query development & execution in the classroom setting. Curriculum and materials will be reviewed.

2:00 14 Broadening horizons: Patents in the undergraduate chemistry curriculum
Mary Ellen Teasdale, Department of Physical and Life Sciences - Chemistry, Texas A&M University - Corpus Christi, 6300 Ocean Drive, CS 130, Corpus Christi, TX 78412, Fax: 361-825-2742, mary.teasdale@mail.tamucc.edu, and Brian B. Carpenter, Science & Technology Reference, Sterling C. Evans Library, Texas A&M University - College Station

Patents account for 75% of the chemical literature. Students and newer researchers, however, frequently discount patents that appear in their literature searches because patents have less developed experimental sections than are found in journal articles, may take too long to obtain, and may require translation services. To increase chemists' comfort level with patents, we suggest exposing undergraduate chemistry students to patents during advanced inorganic laboratory or during second semester organic laboratory. The most commonly used chemistry resources are presented as classwork in these laboratories. Consequently, including a discussion of patents is a small but important amendment to existing presentations designed to make students chemically literate. This poster will present TAMU's website for patent instruction, including teaching aids for identifying key points of patent composition and their relationship to journal composition. We will also present strategies for incorporation of this material into the curriculum, and evaluate our experiences with this instruction.

2:30 15 Building bridges: Developing freshman students' research skills with resources in chemistry.
Mary Ellen Teasdale, Department of Physical and Life Sciences - Chemistry, Texas A&M University - Corpus Christi, 6300 Ocean Drive, CS 130, Corpus Christi, TX 78412, Fax: 361-825-2742, mary.teasdale@mail.tamucc.edu, and Edward Kownslar, Mary and Jeff Bell Library, Texas A&M University - Corpus Christi

Nonchemistry majors enrolled in freshman chemistry laboratory bring with them a fear of the subject material, a general resentment at having to take chemistry, and often express the belief that chemistry is irrelevant to their major. To address the intimidation factor, to encourage use of library resources, and to demonstrate that reading and writing skills are applicable to any discipline, library assignments were implemented into the laboratory curriculum. Exercises are prepared collaboratively by reference librarians and laboratory instructors each semester. Exercises highlight a wide variety of print and electronic resources including SciFinder Scholar. Primary objectives of the exercises are to develop critical thinking skills and to discover the relevancy of chemistry to real-world situations. Specific tasks accomplished by students include developing search strategies, evaluating responses, reading and synthesizing information, and writing citations. This poster will present example exercises and strategies for incorporation into laboratory curriculum, and evaluate our experiences.

3:00 16 Chemical information for chemists: An outline of the graduate chemical information class at the University of Pennsylvania.
Judith N. Currano, Chemistry Library, University of Pennsylvania, 3301 Spruce St., 5th Floor, Philadelphia, PA 19104-6323, Fax: 215-898-0741, currano@pobox.upenn.edu

The poster contains details of the University of Pennsylvania's graduate chemical information class, required of all first-year doctoral chemistry students, including a class syllabus and examples of assignments. It began in 1995 as eight 1.5-hour sessions taught in the summer between the first and second years of study. It has evolved into a ten-week, laboratory class that is discipline specific. The students are separated into four sections according to their research interests, and they spend the duration of the course learning information-finding techniques and being introduced to major print and electronic resources in their subdisciplines. Each week, after a lecture of forty or fifty minutes, they are given an assignment to work in class, followed by a longer homework assignment. In order to pass the course, a student must receive a cumulative score of 70% or higher and complete all homework assignments and a term project, a guide to the literature on a subject of the student's choice.

3:30 17 HOUBEN-WEYL - 100 years of chemical reference works.
Thomas Krimmer, and Guido F Herrmann, Thieme Chemistry, Georg Thieme Verlag, Ruedigerstrasse 14, Stuttgart 70469, Germany, Fax: +49-711-8931777, thomas.krimmer@thieme.de - SLIDES

The series METHODEN DER ORGANISCHEN CHEMIE (Houben-Weyl Methods of Organic Chemistry) was established in 1909 by the German chemist Theodor Weyl and continued in 1913 by Heinrich J. Houben. The comprehensive description of preparative methods in a consistent style and their critical evaluation by leading experts is the philosophy on which Houben-Weyl was founded. The 4 volumes of the second edition were published between 1921 and 1924. The third edition, again consisting of 4 volumes, was published between 1924 and 1941. The fourth edition began in 1952, was continued from 1975, and ended in 1986 with a total of 67 volumes and 3 index volumes. The series was updated with 23 additional and supplementary volumes (in 93 single books) which placed emphasis on the treatment of important classes of compounds and significant preparative methods. Since 1990, Houben-Weyl has been published in English, thus making it accessible to chemists worldwide. Facing dramatic developments in chemistry during the last few decades which have provided chemists with a wealth of new reagents and reactions, the need for a new, comprehensive, and critical treatment of synthetic chemistry has become apparent. This new edition is entitled Science of Synthesis, Houben-Weyl Methods of Molecular Transformations. Science of Synthesis started in 2000 and will comprise a total of 48 volumes in 2007. It benefits from more than 90 years of experience and continues the tradition of excellence in publishing organic chemistry reference works. Science of Synthesis covers the whole field of organic chemistry based on all published and readily available sources from the early 1800s until the year of publication. To best meet the needs of the scientific community, Science of Synthesis will, starting in 2001, also be published as an electronic version. 
The poster outlines the evolution of one of the most esteemed reference works for synthetic organic chemists from a 2-volume first edition in 1909 to a sophisticated electronic database in 2001 and further on, thus covering a whole century in chemical information science.

18 Intro to Daylight, a 3-day cheminformatics course.
Jeremy J. Yang, Daylight Chemical Info Systems Inc, 441 Greg Ave., Santa Fe, NM 87501, Fax: 505-989-1200, jj@daylight.com

A course introducing the software of Daylight Chemical Information Systems Inc. is described. This three-day course consists of lecture and hands-on laboratory exercises. The curriculum is organized into sections covering underlying theory and languages (e.g., SMILES, SMARTS, SMIRKS), application software, administration, and toolkit programming. An attempt to address the individual needs of students is achieved by flexible use of lab time and sufficient lab instructors. In addition, all curriculum materials, including interactive lab exercises, are presented via web pages and applications, and made available after the course. This course has been offered for four years at Daylight Summer School, with very favorable results and student feedback.

19 Mine-field data mining: chemical information searching course.
Valerian M. Khutoretsky, Scientific Information Department, Zelinsky Institute of Organic Chemistry, 47 Leninsky prospect, 119992 Moscow, Russia, Fax: 7095-135-5328, khutor@ioc.ac.ru

The goal of the course at the Higher Chemical College RAS is to teach insight, not technique. With a basic comprehension of the main concepts, students easily acquire practical skills when necessary. Underlining the major concepts is crucial: value-added information and the fuzzy meaning of subject concepts. Training includes understanding how cautious a searcher should be in comprehensive data mining. The standard question “Polycarbonates from General Electric” contains, in our view, a mine or snare: poly decamethylene carbonate will not be found using the controlled term polycarbonat?/BI. Searches for “radioactive wastes” that omit the terms “atomic” or “nuclear” are marked no higher than B. We organize class work as a competition among the students: who will find more answers? At the same time, they are permitted to discuss their strategies informally during tests, but not during the exam. Besides the visible task of finding answers, the training has a supertask: to teach mutual understanding between information specialist and end user.
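
The polycarbonate "mine" can be reproduced with a small sketch. It assumes that the "?" in the search term means unlimited right truncation and models it as a regular expression; the /BI field qualifier is ignored, and the two record titles are hypothetical.

```python
import re


def truncation_to_regex(term):
    """Translate a right-truncated term such as 'polycarbonat?' to a regex."""
    stem = term.rstrip("?")
    # Match the stem at a word start, followed by any word characters.
    return re.compile(r"\b" + re.escape(stem) + r"\w*", re.IGNORECASE)


pattern = truncation_to_regex("polycarbonat?")
# Hypothetical record titles.
titles = [
    "Polycarbonates for optical discs",
    "Synthesis of poly(decamethylene carbonate)",
]
hits = [t for t in titles if pattern.search(t)]
```

Only the first title is retrieved: the second names a polycarbonate structurally, but the stem "polycarbonat" never appears as a contiguous string, so a searcher relying on the truncated term alone misses it.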

20 One session in a science library: the need for conceptual schemes and scientific habits of a mind.
Svetlana Korolev, Science and Engineering Library, Wayne State University, 5048 Gullen Mall, Detroit, MI 48202, ac7109@wayne.edu - SLIDES

From the perspective of a relatively new science librarian, experience is shared regarding the strategies and content of a library session that was developed and integrated into a range of graduate-level chemistry courses. The session was held in the library's advanced computing facility. To stress the importance of hands-on searching experience, handouts and assignments were distributed. Within this framework, the session focused on an overview of the major information resources, services and awareness programs. Because conceptual models in science constitute the pinnacle of explanation and are classed among the greatest of intellectual achievements, an attempt was made to integrate "information resources" schemes, the scientific method approach and the information literacy concept. The purpose of this paper is to discuss the need for designing handy schemes, charts and tables for a library session. Such models will help science students find order in the exploding universe of information resources. What stimuli and conceptual schemes could library instructors use so that an individual comes to appreciate the elements of self-investigation, proper questioning and logical reasoning in developing scientific habits of a mind?

21 Poster sessions as part of a chemical information course
F. Bartow Culp, Mellon Library of Chemistry, Purdue University, 310 Wetherill, West Lafayette, IN 47907-1538, Fax: 765-494-1579, bculp@purdue.edu

In teaching a course in chemical information, the instructor is forever reconciling the opposing forces of content inclusion with class time availability. This presentation describes the use of a student poster session at the end of the semester to achieve several instructional aims. These include maximizing the use of class time, introducing students to collaborative work, and involving students in an ACS-type poster session.

22 Two programs for delivering chemical information instruction in a pharmaceutical research setting.
Anne Marie Clark, Melvin Budzol, Jill F. Pritts, Leona J. Williams, and Judith L. Johnson Philipsen, Information Management, Pfizer Global Research and Development, 2800 Plymouth Road, Ann Arbor, MI 48105, Fax: 734-622-7008, Judith.Johnson@Pfizer.com

“Outpost” and “Lunch & Learn” programs have been successful in providing chemical, biological and clinical information at the Pfizer Ann Arbor Laboratories. During Outpost sessions, members of the Information Research and Analysis group provide searching services and instruction at the reading rooms of the Chemistry and Pharmaceutical Sciences Departments on a regularly scheduled basis. Personal training of scientists in SciFinder, Crossfire, Merck Index, and other electronic resources enables them to effectively use these resources to assist them in their research. In Lunch & Learn, scientists attend demonstration sessions during their lunch hour. These sessions provide overviews of selected electronic resources and their utility for scientists in a relaxed atmosphere. Targeted promotion, required registration, refreshments, detailed handouts, and evaluations all contribute to the success of these sessions. The “Outpost” and “Lunch & Learn” programs increase awareness of electronic resources by scientists as well as the visibility and value of the Information Management department.

23 Weaving instruction into the web.
Janice E. Mears, and William A. Weida, Marketing, Chemical Abstracts Service, 2540 Olentangy River Rd., P. O. Box 3012, Columbus, OH 43210, Fax: 614-447-3837, jmears@cas.org

In today's information-driven world, chemical education must offer instruction in acquiring and using chemical information. But there are as many ways to teach chemical information techniques as there are modes of communication. Face to face instruction represents one end of the scale; on the other is documentation, which can offer detailed information but seems less personal. Fortunately the web offers the benefits of both personal and documentary modes of communication. CAS has used webcasts to disseminate product information and tutorials effectively. Techniques for using web video technology and interactive communication to full advantage are discussed along with advice for avoiding technical pitfalls in this new instructional medium. Managers involved in producing webcasts will also present ideas for using archive files to extend their value even to customers unable to tune in to the original webcast.

2:00 24 What a difference a decade makes: From the eighties to the noughties in an academic chemical information course.
Charles F. Huber, Davidson Library, University of California - Santa Barbara, Santa Barbara, CA 93106, Fax: 805-893-8620, huber@library.ucsb.edu

The author has taught a chemical literature course for graduate students and upper-division undergraduates at the University of California - Santa Barbara annually from 1988 to the present. Dramatic changes have occurred over that time: in methods of presentation, in topics covered, in the background of the students coming into the course, and in their expectations. This poster will trace some of the major trends that have shaped the course from the end of the Eighties to the beginning of a new century.

MONDAY MORNING

Section A

Science Portals on the Internet - The Producers
McCormick Place, South Bldg.
R.W. Snyder, Organizer
8:30 25 Expanding horizons of the STM information landscape
Eileen M. Shanbrom, and Harry F. Boyle, Marketing, CAS, 2540 Olentangy River Rd., P. O. Box 3012, Columbus, OH 43210, eshanbrom@cas.org - SLIDES

First there were online databases, then the Web opened new vistas of interlinked research sources. This rapidly evolving environment is dissolving old boundaries; e.g., between primary and secondary sources, and is disrupting familiar relationships. Something new is emerging out of the interaction of publishers, database producers, online services, search engine providers, authors and consumers of STM information. It's not what you click that matters, so much as whether your relationships click (i.e., whether you form relationships that really work). Scientists now expect to follow their own research highways and branch off into productive byways. Information providers must develop relationships required to build highways offering scientists relevant content in the context of their research. The authors will discuss challenges information providers face in building an integrated digital research environment and new initiatives to help scientists solve problems and gain new knowledge.

9:00 26 ScienceDirect
Jonathan Clark, Elsevier Science B.V, Molenwerf 1, 1014 AG Amsterdam, Netherlands, j.clark@sciencedirect.com

ScienceDirect is Elsevier Science's web-based initiative for the electronic distribution of scientific information. The platform hosts both full-text and abstract databases. All journal articles are available as PDFs and full-text HTML. All header information and references are in HTML format. Users can navigate by browsing the full text or by searching the journal articles and the abstract databases. There are personalisation features that allow users to set up personal journal lists and various alerting services. Nevertheless, a proportion of the 5 million articles that are accessed each month is a result of linking into ScienceDirect from external portals. This paper discusses the role of portals in driving usage to full-text platforms and the impact of cross-platform linking.

9:30 27 Information portals for the chemical community.
William G Town, and Jan Kuras, ChemWeb Inc, 84 Theobalds Road, London WC1X 8RR, United Kingdom, bill.town@chemweb.com - SLIDES

ChemWeb.com maintains an innovative portal for chemists to support their daily research activities, whilst creating a worldwide virtual community and exploiting web technology. This presentation will provide a comparison between various scientific portal sites and highlight some of the services and functionality each has developed. The portal concept offers the capability of searching and browsing scientific journals from a range of publishers, and accessing chemical information databases and supplying links to journal full text. The role of preprint servers in maintaining a permanent archive and distribution medium for authors to post pre-publication research articles in chemistry will also be discussed.

10:00 28 Introduction of Intelligent Broker between User and Search Engine for Patent-Retrieval from Internet.
Dawei Pan, and Dasheng Chu, New Century Net, Inc, 6519 Coachlight Way, West Chester, OH 45069, Fax: 513-779-3479, panwwus@yahoo.com - SLIDES

The conventional paradigm for retrieving patents of interest from the Internet relies on the keyword(s) the user inputs. The biggest disadvantage of this paradigm is that a keyword can introduce ambiguity because of different formats and synonyms of the same keyword, and because its context is ignored. As a result, a “messy” mass of patents can pop up on the screen.

The intelligent broker architecture we introduce here is capable of key information acquisition, processing, analysis, and heuristic search. Multiple brokers can also be created to perform different tasks and cooperate with each other. This highly autonomous broker actively interacts with both the user and the search engine, not only initially but also during searching. The strategy designed for the broker’s actions has been optimized by a genetic algorithm and other rule-based reasoning algorithms. Thus, the efficiency and hit rate can be dramatically improved.

10:30 29 Scirus: a search engine for scientific information only covering both web and database sources.
Femke G.C.M. Markus, Department of Electronic Publishing, Elsevier Science, Molenwerf 1, Amsterdam 1014 AG, Netherlands, Fax: +31.20.485.3354, f.markus@elsevier.nl - SLIDES

Scirus (http://www.scirus.com/) is a search engine dedicated to scientific information covering 16 scientific subject areas including Chemistry and Chemical Engineering. Scirus is specifically designed for finding highly relevant scientific information. Using the latest in search engine technology – developed by Fast Search & Transfer, ASA (FAST) – Scirus pinpoints both web and database scientific information sources that conventional, generic search engines cannot find. Scirus recognises document types, which allows users to search for specific kinds of information (e.g. patents, scientists' homepages).

A study was performed to compare the Scirus search engine with two other search platforms (including eScience). The basic approach was to ask university researchers to apply a scientific query from their own area of research to each of the search platforms in turn and to evaluate the results obtained. The users were asked to rate ease of use, usefulness and usability. The results of the study will be presented, and the presenter will elaborate on the technologies used by the different search platforms to support and explain the outcome.

11:00 30 "What are Chemists Looking for on the Internet? An analysis of search engine queries."
Yudie Fishman, and Kirk Brattkus, ChemIndustry.com, 730 E. Cypress Ave., Monrovia, CA 91016, Fax: 626-930-0102, yudie@fishman.org - SLIDES

It is known that chemists and chemical engineers make widespread use of Internet search engines to supplement proprietary databases. General-purpose search engines have given way in recent times to discipline specific search engines and directories. An analysis of the log files of one such specialized search engine has been performed. The results provide some insight into the use of Internet resources by professional chemists.

11:30 31 The ChemGuide access to chemistry information on the web.
Jost T. Bohlen, René Deplanque, and Michael Langner, FIZ CHEMIE BERLIN, Franklinstrasse 11, D-10587 Berlin, Germany, Fax: 49 30 39977135 - SLIDES

The ChemGuide Internet Search Engine offers a unique combination of well-defined and comprehensibly pre-selected Web sites with an easy-to-use database interface featuring sophisticated retrieval functions. Thus, it substantially increases the quality of Internet search results. ChemGuide and related Guides offer full-text searches of all pages of the pre-selected sites. Retrieved documents can be viewed within the search engine and all hit terms are highlighted on screen. In addition, individual alerting services (SDIs) covering fully customized and complex search profiles are available.

MONDAY MORNING

Section B

Information Challenges in CombiChem/HTS Era Theory
McCormick Place, North Bldg., Level 1
Cosponsored with Division of Computers in Chemistry, and Division of Medicinal Chemistry
O. F. Güner, Organizer
8:30 32 Characterizing Property and Activity Landscapes Using an Information-Theoretic Approach
Veerabahu Shanmugasundaram, and Gerald M Maggiora, Computer-Aided Drug Discovery, Pharmacia Corporation, 301 Henrietta Street, Kalamazoo, MI 49007-4940, Fax: 616-833-9183, V.Shanmugasundaram@Pharmacia.com - SLIDES

The use of multi-dimensional chemistry spaces to represent large compound collections has become widespread in pharmaceutical research. In such spaces compounds are treated as points. Every compound point can be associated with various properties or bioactivities whose values can be represented as an additional dimension or “height” above the chemistry space; each property or bioactivity gives rise to a corresponding property or bioactivity landscape. If, for example, similar compounds exhibit similar biological activity in a given assay, then the activity landscape for that assay will appear as gentle rolling hills and valleys. If, however, some similar compounds have very different biological activities, then the landscape will take on a steeper, more rugged appearance with many cliffs. A global index that provides a suitable measure of the topographic character of property and activity landscapes, based upon our earlier information-theoretic analysis of chemistry spaces, will be described, and several examples illustrating the approach will be presented.

9:00 33 Implementation of a global computing scenario in science.
Werner Dubitzky1, Thomas F. Kochmann1, Ruediger M. Flaig2, and Roland Eils1. (1) Intelligent Bioinformatics Systems, German Cancer Research Center, Im Neuenheimer Feld 280, D-69120 Heidelberg, Germany, Fax: +49 6221 42-3620, w.dubitzky@dkfz-heidelberg.de, t.kochmann@dkfz-heidelberg.de, (2) Institute for Pharmaceutical Technology and Biopharmacy, University of Heidelberg

Today, most scientific knowledge is implicit, that is, held in the heads of scientists or in written documents, paper-based or electronic. Thus, the knowledge is not directly amenable to automatic computational processing. We argue that the progress of science will depend on making explicit the scientific information that is currently lying dormant. Once made explicit, scientific information and knowledge could be made available on the global computing grid for manipulation and processing by automated means. Such a global computing scenario in science would possess virtually unlimited power, opening up a range of possibilities for systems with radically new properties. Such systems could usher in a new era of science by allowing completely new ways of organizing, using, and evolving scientific knowledge. Two major implications of the depicted scenario are: (1) the need for a new and comprehensive computational framework capable of handling this information, and (2) a radical shift of culture within science, as the way science is taught and conducted would undergo fundamental changes. In this article we address the first issue. Firstly, we argue that there will be three major categories of scientific information systems, namely, databases, information bases, and knowledge bases. Secondly, we propose an agent-based computational methodology for tackling the resulting global computing challenge. Thirdly, we outline an algorithmic framework that is largely based on emergent, evolutionary, and neural computing [1].

[1] The 4T2 Consortium, European Union Fifth Framework IST Proposal (under FET Call: Global Computing), “Breeding Creative Information Societies”, submitted April 2001.

9:30 34 Protein and Ligand Classification using 3D Fingerprints
W. Todd Wipke1, Debra Schumacher1, and David Rogers2. (1) Department of Chemistry and Biochemistry, University of California, Santa Cruz, Molecular Engineering Laboratory, Santa Cruz, CA 95064, Fax: 831-459-2935, wipke@chemistry.ucsc.edu, (2) SciTegic, Inc

There is great interest in being able to automatically classify proteins by their three-dimensional attributes and to rapidly screen small molecules to find potential inhibitors. We have used 3D fingerprints of proteins in the new data flow system, Pipeline Pilot, to develop models that can recognize proteins in a variety of classes. The automatic learning component of Pipeline Pilot greatly facilitated this study. We applied the same method to the ligands of the proteins to try to recognize the class of the ligand and recognize other potential ligands of that class. Our results from these studies will be discussed.

10:00 35 Multiscale Bayesian approaches to extract gene sets from HTS genomics data.
Chihae Yang1, Paul E. Blower Jr.1, Limin Yu1, Bhavik Bakshi2, and James F. Rathman2. (1) LeadScope, Inc, 1275 Kinnear Rd, Columbus, OH 43212, cyang@leadscope.com, (2) Department of Chemical Engineering, The Ohio State University

Tremendous amounts of data are produced by the high throughput screening methods currently employed in drug discovery and product development. A typical cDNA microarray or genechip experiment easily generates over 10,000 data points for each array or chip. The challenge of then inferring meaningful information is formidable given the size of the data set. Most published data handling techniques include clustering of the gene sets for sub-categorization and mapping the classifications for visualization. In this paper, multiscale Bayesian approaches including principal component analysis (PCA) and wavelet transformation (WT) methods are used to extract subsets and to visualize the data in multidimensions for comparisons. Data available from the National Cancer Institute (NCI) are used to demonstrate the new methods. These include gene expression data from cDNA microarray studies on 60 cancer cell lines, and the effects of various drug compounds on activity for the same 60 cell lines. Similarity in cell lines and compound-gene correlations are effectively visualized and quantitatively compared by PCA and WT.
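As an illustration of the PCA step only (the multiscale Bayesian and wavelet parts are not covered), the following is a minimal sketch that finds the first principal component of a small expression-like matrix. It assumes rows are cell lines and columns are genes; the function name `first_principal_component` and the toy data are hypothetical, and plain power iteration stands in for whatever eigen-solver the authors actually used.

```python
def first_principal_component(rows, iterations=200):
    """Return (direction, scores) for the first principal component.

    rows: list of equal-length lists (samples x variables).
    Uses power iteration on the sample covariance matrix. The start
    vector is all ones, so this sketch fails if that vector happens to
    be orthogonal to the leading component.
    """
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    # Sample covariance matrix (d x d).
    cov = [
        [sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
        for i in range(d)
    ]
    v = [1.0] * d
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = sum(x * x for x in w) ** 0.5
        v = [x / norm for x in w]
    # Project each centered sample onto the component.
    scores = [sum(c[j] * v[j] for j in range(d)) for c in centered]
    return v, scores


# Toy data: four "cell lines" measured on two perfectly correlated "genes".
toy = [[1.0, 2.0], [2.0, 4.0], [3.0, 6.0], [4.0, 8.0]]
direction, scores = first_principal_component(toy)
```

Plotting the scores of the first two or three components is one common way to visualize similarity between cell lines; a production analysis would normally use a numerical library rather than this hand-rolled loop.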

10:30 36 Towards hermeneutic knowledge management in science.
Thomas F. Kochmann1, Werner Dubitzky1, Ruediger M. Flaig2, and Roland Eils1. (1) Intelligent Bioinformatics Systems, German Cancer Research Center, Im Neuenheimer Feld 280, D-69120 Heidelberg, Germany, Fax: +49 6221 42-3620, t.kochmann@dkfz-heidelberg.de, w.dubitzky@dkfz-heidelberg.de, (2) Institute for Pharmaceutical Technology and Biopharmacy, University of Heidelberg

Knowledge in the life sciences and molecular sciences has become increasingly complex. As a consequence, most scientists are forced to specialize in a narrow field. This problem has recently triggered a new discussion on novel ways to organize, manage, and advance scientific knowledge. In addition to its inherent complexity and increasing degrees of specialization, a critical dimension of future knowledge management in science is that science has become a global endeavor. This calls for an entirely new approach to scientific knowledge management. We propose an approach to a computational global framework for knowledge management based on the distributed/intelligent agent paradigm and emergent processes able to synthesize symbolic and subsymbolic information within a multiple-level scheme.

11:00 37 An extension of recursive partitioning for mining large screening sets.
Paul E. Blower Jr.1, Jeff Bjoraker1, Denise Fiacco1, Joseph Verducci2, and Michael Fligner2. (1) LeadScope, Inc, 1275 Kinnear Rd, Columbus, OH 43212, Fax: 614-675-3732, pblower@leadscope.com, jbjoraker@leadscope.com, (2) Ohio State University

Statistical datamining methods have proven to be powerful tools for investigating correlations between molecular structure and biological activity. Recursive partitioning, in particular, offers several advantages in mining large, diverse data sets resulting from high throughput screening. We use simulated annealing to find sets of structural features whose simultaneous presence or absence best separates the largest group of most active compounds. The search is incorporated into a recursive partitioning design to produce a regression tree for biological activity on the space of structural fingerprints. Each node is characterized by some specific combination of structural features, and the terminal nodes with high average activities correspond, roughly, to different classes of compounds. In this talk, we will describe the statistical techniques used in this new method and illustrate its application in mining a large dataset.
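The basic recursive partitioning idea can be sketched as follows. This is not the authors' method: it assumes binary structural fingerprints, and it replaces the simulated-annealing search over feature combinations with an exhaustive search over single features. The names `sse`, `best_split`, and `grow_tree` and the toy data are hypothetical.

```python
def sse(values):
    """Sum of squared deviations from the mean (0.0 for an empty list)."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def best_split(fingerprints, activities):
    """Return (feature_index, sse_reduction) for the best single feature, or None."""
    total = sse(activities)
    best = None
    for f in range(len(fingerprints[0])):
        present = [a for fp, a in zip(fingerprints, activities) if fp[f]]
        absent = [a for fp, a in zip(fingerprints, activities) if not fp[f]]
        if not present or not absent:
            continue  # this feature does not separate the compounds
        gain = total - sse(present) - sse(absent)
        if best is None or gain > best[1]:
            best = (f, gain)
    return best


def grow_tree(fingerprints, activities, depth=2):
    """Build a small regression tree as nested dicts."""
    node = {"n": len(activities), "mean": sum(activities) / len(activities)}
    split = best_split(fingerprints, activities) if depth > 0 else None
    if split is None or split[1] <= 1e-12:
        return node  # terminal node
    f = split[0]
    node["feature"] = f
    for label, keep in (("present", True), ("absent", False)):
        rows = [(fp, a) for fp, a in zip(fingerprints, activities)
                if bool(fp[f]) == keep]
        node[label] = grow_tree([r[0] for r in rows],
                                [r[1] for r in rows], depth - 1)
    return node


# Toy data: feature 0 marks the active series, feature 1 is mostly noise.
fps = [(1, 0), (1, 0), (1, 1), (0, 1), (0, 0), (0, 1)]
acts = [9.0, 8.0, 9.5, 1.0, 2.0, 1.5]
tree = grow_tree(fps, acts)
```

Terminal nodes with a high `mean` then correspond, roughly, to classes of active compounds characterized by the features on the path from the root.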

11:30 38 Informative library design: overview and method.
Jennifer L. Miller, Consultant, 935 College Avenue, Menlo Park, CA 94025, and Erin K. Bradley, Chemical and Physical Sciences, DuPont Pharmaceuticals Research Laboratories

One can view screening a compound library as performing a collection of parallel experiments. As with any set of scientific experiments, one wishes to learn as much as possible with minimum effort. Yet, by being based on diversity, contemporary screening libraries do not take full advantage of the information gain possible in this parallel setting. Informative library design was developed explicitly to select the set of compounds (set of experiments) that maximizes the information gain per assay. This information theoretic approach will be discussed in detail including a discussion of the design of both combinatorial and discrete libraries.
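One way to make "information gain per assay" concrete is sketched below. It assumes a finite set of equally likely candidate activity hypotheses, each predicting active (1) or inactive (0) for every compound, and a noise-free assay; under those assumptions, the information gained from a compound set is the Shannon entropy of the partition it induces on the hypotheses. The function names, the greedy selection, and the toy predictions are hypothetical and not taken from the abstract.

```python
from collections import Counter
from math import log2


def partition_entropy(predictions, chosen):
    """Entropy in bits of the hypothesis partition induced by `chosen`.

    predictions maps compound -> tuple of 0/1 outcomes, one per hypothesis.
    Hypotheses with identical outcome patterns cannot be told apart.
    """
    n_hyp = len(next(iter(predictions.values())))
    patterns = Counter(
        tuple(predictions[c][h] for c in chosen) for h in range(n_hyp)
    )
    return -sum((k / n_hyp) * log2(k / n_hyp) for k in patterns.values())


def greedy_select(predictions, size):
    """Greedily pick `size` compounds that maximize partition entropy."""
    chosen = []
    for _ in range(size):
        remaining = [c for c in predictions if c not in chosen]
        best = max(
            remaining,
            key=lambda c: partition_entropy(predictions, chosen + [c]),
        )
        chosen.append(best)
    return chosen


# Four hypotheses; B duplicates A, and D is active under every hypothesis.
toy_predictions = {
    "A": (1, 1, 0, 0),
    "B": (1, 1, 0, 0),
    "C": (1, 0, 1, 0),
    "D": (1, 1, 1, 1),
}
picked = greedy_select(toy_predictions, 2)
```

In this toy case, a set chosen purely for structural diversity might still include the uninformative compound D, whereas the entropy criterion prefers the pair that distinguishes all four hypotheses.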

MONDAY AFTERNOON

Section A

Information Challenges in CombiChem/HTS Era Theory
McCormick Place North Bldg, Level 1
Cosponsored with Division of Computers in Chemistry, and Division of Medicinal Chemistry
O. F. Güner, Organizer
1:30 39 Chemistry 2000: resolving finer detail
Jonathan M Goodman, and Stephen C Allen, Department of Chemistry, Cambridge University, Lensfield Road, Cambridge CB2 1EW, United Kingdom, Fax: +44 1223 336362, jmg11@cam.ac.uk - SLIDES

Chemistry 2000 is an index of university chemistry sites world wide (http://www.ch.cam.ac.uk/ChemSitesIndex.html) that is automatically updated monthly to maintain its reliability. We have investigated how the data can be used to extend the range of the index. Can we go beyond chemistry departments to chemists? Can this be done automatically? We have developed a program that uses our database to search for individual academic chemists, to build up a new index of people. The paper will present the successes and problems of this approach.

2:00 40 Portals, Special Libraries, and integration for scientists.
Martin Braendle, Arun Kumar, and Engelbert Zass, Chemistry & Biology Information Center, ETH Zuerich, ETH Hoenggerberg - HCI, CH-8093 Zuerich, Switzerland, braendle@chem.ethz.ch, zass@chem.ethz.ch - SLIDES

End-user searching, databases on the net, electronic journals, and the World Wide Web diminish the role of traditional libraries offering printed holdings and mediated database searches, and even pose a potential threat to their existence. On the other hand, libraries served the function of today's portals long before this term was coined. The ETHZ Chemistry & Biology Information Center (http://www.infochembio.ethz.ch/) is drawing the attention of its clients to existing portals and web communities. We also offer portal services ourselves using the web interface to our integrated library system CLICAPS (Chemistry Library Information Control and Presentation System, http://www.clicaps.ethz.ch/) that currently contains links to about 50'000 electronic issues of more than 930 journals in chemistry, biology, and physics. This system also serves as a "switch" to link citations from chemistry databases like CrossFire Beilstein and Gmelin or CA on CD (in the near future, SciFinder Scholar) to both electronic and printed journals. An existing structured list with about 7000 links in the areas of chemistry and biology is offered on our Web site. A "peer-reviewed" selection of these links is being integrated into the library system, which will enable a user to retrieve local holdings (books, printed journals, CD-ROMs etc.), electronic journals and e-books as well as these links for the topic searched.

2:30 41 Use of portals by academic chemists and chemistry students.
Gary D. Wiggins, Chemistry Library, Indiana University, 800 E. Kirkwood Avenue, Chemistry Building Room C003, Bloomington, IN 47405-7102, Fax: 812-855-6611, wiggins@indiana.edu - SLIDES

A portal is a gateway on the World Wide Web that serves as a major starting site for users. It serves as an anchor site where users may find such things as a directory of relevant Web sites, a facility to search for other sites, and even a community forum where relevant issues can be discussed. Increasingly, academic institutions are developing portals that permit a degree of customizability of the look and feel of the portal, as well as the ability to select the information sources that appear on the main screen. Academic faculty and students use a variety of methods to locate information on the Web. Results of a survey of academic user sentiments about the existing chemistry portals will be presented.

3:00 42 Science portals on the Internet: the business case.
Wendy A. Warr, Wendy Warr & Associates, 6 Berwick Court, Holmes Chapel, Cheshire CW4 7HZ, United Kingdom, Fax: +44 1477 533837, wendy@warr.com - SLIDES

In this summary paper, we aim to produce an objective analysis of science portals on the Internet. We will discuss the concept of a portal, and classify some examples in terms of their aims, features, and uses. We will consider the market for these products and their hopes for future profitability (it being assumed that most portals currently run at a loss). Commercial aspects and business models will be considered. How many people use these tools? Are the vendors over-ambitious? Will any one of the portals discussed today actually become the “one port of call” for chemists? Which, if any, of these “vortals” are likely to become indispensable? We aim to answer these questions in a general sense, not by doing a feature-by-feature comparison of well known portals.

3:30   Intermission.
4:00 43 CINF Division Business Meeting.
Andrea B. Twiss-Brooks, John Crerar Library, University of Chicago, 5730 S. Ellis, Chicago, IL 60637, Fax: 773-702-7429, atbrooks@midway.uchicago.edu
4:30 44 Open Meeting: Committees on Publications and on Chemical Abstracts Service.
Robert J. Massie, Chemical Abstracts Service, P.O. Box 3012, Columbus, OH 43210, Fax: 614-447-3765, rmassie@cas.org, and Robert D. Bovenschulte, Publications Division, American Chemical Society

MONDAY AFTERNOON

Section B

Information Challenges in CombiChem/HTS Era Applications
McCormick Place North Bldg, Level 1
Cosponsored with Division of Computers in Chemistry, and Division of Medicinal Chemistry
O. F. Güner, Organizer
1:30 45 Informative Library Design: application to lead generation and optimization.
Erin K. Bradley1, Jennifer L. Miller2, and Peter D. J. Grootenhuis1. (1) Chemical and Physical Sciences, DuPont Pharmaceuticals Research Laboratories, 150 California Street, Ste #1100, San Francisco, CA 94111, Fax: 415-732-7170, ebradley@combichem.com, (2) Consultant

We present an approach for the design of chemical libraries (combinatorial and/or general screening) that yields maximum information on the binding characteristics of the target receptor. The process is iterative, with the data captured by inactive compounds being crucial to the refinement of an activity model. This refinement produces a more focused library design for each subsequent round of synthesis. The potential power of this approach will be demonstrated with the results from three sequential rounds of informative design and combinatorial library assay results. We will also compare the performance of the informative design method to standard methods of library design and database searching.

1:50 46 In silico ADME in drug discovery
Julie E. Penzotti1, Peter D.J. Grootenhuis1, Paul Labute2, Jayashree Srinivasan1, Robyn A. Rourick1, Daniel B. Kassel1, and Kelly M. Jenkins1. (1) Chemical & Physical Sciences, DuPont Pharmaceutical Research Labs, 150 California St., Suite 1100, San Francisco, CA 94111, Fax: 415-732-7170, Julie.E.Penzotti@dupontpharma.com, (2) Chemical Computing Group, Inc

Pharmaceutical companies have incorporated high throughput in vitro profiling of compounds for properties related to absorption, distribution, metabolism, and excretion (ADME) with the aim to identify development liabilities earlier and thereby accelerate drug discovery. This has created large data sets to which computational methods can be applied to derive in silico filters and models for ADME properties. While computational technologies have been applied successfully to model potency against a target, many complex mechanisms are involved in ADME, requiring new computational strategies and descriptors. We will describe the in silico filters and classification models that we have developed to guide the design and selection of libraries likely to have more favorable ADME profiles and to assist in the prioritization of chemical series.

2:10 47 A Program for Inductive Identification of "Good" Partial Match/Partial Coverage 3D Flex Queries.
Robert Clark, Edmond Abrahamian, Peter Fox, Alexander Strizhev, and Trevor Heritage - SLIDES

We have recently developed a program that combines a genetic algorithm with UNITY 3D flexible searching to identify ensembles of spatially constrained partial match pharmacophore hypotheses. The individual queries in these ensembles are mutually (though often not completely) complementary, representing multiple pharmacophoric classes or alternative binding modes that occur in the training set. Particular queries of the ensemble may be selected to maximize class specificity or coverage over all known actives, while maintaining discrimination from a random druglike dataset. The methodology developed is particularly well suited for working with data from high-throughput screening (HTS), where error rates (false positives and false negatives) tend to be high. This talk will describe the kind of query ensembles produced for model systems, and will illustrate how those results change when system parameters are modified.

2:30 48 Virtual Screening based Experimental Design for Combichem / HTS Era.
Jacques R. Chretien, Marco Pintore, and Frederic Ros, University of Orleans, CBI / Chemometrics & BioInformatics, BP 6759, 45067 ORLEANS Cedex, France, Fax: 33 2 38 41 72 21, jacques.chretien@univ-orleans.fr

A new Data Base Mining software package (DBM Soft) was developed to search for new leads. The concepts are based on molecular diversity analysis with the help of original hybrid systems involving a complex combination of artificial neural networks, genetic algorithms and fuzzy logic. DBM Soft exhibits enhanced capabilities in virtual High Throughput Screening (v-HTS). Any Combichem/HTS strategy has its own advantages and limitations. A Global Predictive Strategy (GPS) based on v-HTS/CombiChem/HTS offers new insights at different levels: (i) more rational selection of the building blocks in CC, (ii) search for chemical analogues of better bio-availability after an HTS trial, (iii) scaffold changes or modification, (iv) cost and delay optimization. Such a GPS strategy aims to add value in a multi-step process by including the maximum of rationality and knowledge at every step, rather than relying on a purely random approach to huge libraries.

2:50 49 Fast and accurate enumeration of combinatorial libraries.
Keith A Harrington, Julian Hayward, and Roger Upton, Accelrys Inc, 9685 Scranton Road, San Diego, CA 92121-3752, Fax: (858) 458-0136, keith@accelrys.com - SLIDES

Combinatorial chemistry has shown the need for the enumeration of 2-D chemical structures within libraries ranging in size from several hundred to millions of compounds. The requirement for a quality software solution for the enumeration of the structures has set considerable challenges for software developers. Clearly it is vital to ensure that the structures enumerated are free from errors and are produced at a rate fast enough for large virtual libraries.

Comparisons of different approaches to the problem will be given to demonstrate how this important area has developed and where the future lies. A toolkit-based approach and end-user applications will be discussed in detail that are both accurate for the determination of the molecular structure and sufficiently fast for the enumeration of large virtual libraries.
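The combinatorial core of enumeration can be sketched with plain string substitution, as below. This is only an illustration: the scaffold template, the R-group lists, and the function `enumerate_library` are hypothetical, the fragments are written as SMILES-like strings, and no valence checking or canonicalization is done. A real enumerator would rely on a chemistry toolkit to guarantee that each product is a valid structure.

```python
from itertools import product


def enumerate_library(scaffold, rgroups):
    """Yield one product string per combination of R-group fragments.

    scaffold: template with named placeholders such as "{R1}".
    rgroups:  dict mapping placeholder name -> list of fragment strings.
    """
    sites = sorted(rgroups)
    for combo in product(*(rgroups[s] for s in sites)):
        yield scaffold.format(**dict(zip(sites, combo)))


# Hypothetical amide scaffold with two variation sites.
amide_scaffold = "O=C({R1})N{R2}"
amide_rgroups = {
    "R1": ["C", "c1ccccc1"],
    "R2": ["C", "CC", "C1CC1"],
}
library = list(enumerate_library(amide_scaffold, amide_rgroups))
```

Because the generator yields products lazily, the same loop can in principle stream through millions of combinations without holding the whole virtual library in memory; the library size is simply the product of the R-group list lengths.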

3:10 50 Needles in hayfields: Strategies for rapid HTS triage analysis.
Jack Andrew Bikker1, James B. Dunbar Jr.1, Dirk Bornemeier1, David J. Wild2, Alain Calvet1, and Christine Humblet1. (1) Pfizer Global Research & Development, Ann Arbor Laboratories, 2800 Plymouth Road, Ann Arbor, MI 48105, Fax: 734-622-2782, jack.bikker@pfizer.com, (2) Ann Arbor Laboratories, Pfizer Global Research and Development - SLIDES

With the growth of corporate compound collections, the requirement for efficient strategies for HTS triage analysis has also grown. The typical data package arising from the HTS experiment can consist of hundreds of thousands of compounds, with typically several thousand actives. Software capable of organizing these data into series and singletons is becoming available. However, this alone is inadequate to facilitate an organized attrition of HTS-derived series and singletons. Strategies to eliminate series based on legacy data (promiscuity, pharmacokinetic liabilities), unwanted chemical functionality, and rational prioritization are being developed and applied. Recent experience with software and strategies will be discussed.

3:30 51 cSLNs--extending the SLN language to describe variable, combinatorial libraries.
Malcolm Cline, Development, Tripos, Inc, 1699 South Hanley, St. Louis, MO 63144, Fax: 314-647-9241, mac@tripos.com, Robert Clark, Research and Development, Tripos, Inc, and Web Homer, Software Development, Tripos, Inc

SYBYL Line Notation (SLN) has been shown (Ash et al., JCICS, 1997) to be a versatile language for chemical structure representation. Extensions to the language allow full representation of combinatorial libraries in a precise, terse format. Applications to the specification of chirality within combinatorial libraries, database searching, and rapid computation of property data will be discussed.

MONDAY EVENING

Sci-Mix
Hyatt Regency Chicago, Riverside Center
R.W. Snyder, Organizer
9:00-11:00
52 Applications of Virtual Screening based Experimental Design in Combichem/HTS
Marco Pintore1, Frederic Ros1, Jacques R. Chretien1, and Natalia Rozhkova2. (1) University of Orleans, CBI / Chemometrics & BioInformatics, BP 6759, 45067 ORLEANS Cedex 2, France, Fax: 33 2 38 41 72 21, jacques.chretien@univ-orleans.fr, (2) Plant Protection Chemical Research Institute

A new Data Base Mining software package (DBM Soft) was developed to search for new leads and/or to estimate the bio-activity profile of tested compounds from various library types. It appeared recently that strategies in Experimental Design (ED) useful for Combichem/HTS procedures might be derived with the help of DBM Soft. Different examples will be selected to show how a Global Predictive Strategy (GPS) based on v-HTS/CombiChem/HTS, inside a multi-step process including the maximum of rationality and knowledge, might be envisaged. This GPS offers new insights at different levels such as: (i) more rational selection of the building blocks in CC, (ii) search for chemical analogues after an HTS trial, (iii) scaffold changes or modification, (iv) speed and cost optimization.

53 Canonical-representation of stereochemistry (CAST) coding method for highly accurate prediction of 13C NMR chemical shifts for organic molecules.
Hiroko Satoh, Multimedia Information Research Division, National Institute of Informatics, JST PRESTO, Hitotsubashi 2-1-2, Chiyoda-ku, Tokyo 101-8430, Japan, Fax: +81-3-3556-1916, hsatoh@nii.ac.jp, Hiroyuki Koshino, Molecular Characterization Division, RIKEN, and Tadashi Nakata, Synthetic Organic Chemistry Lab, RIKEN

The canonical representation of stereochemistry (CAST) coding method and its application to 13C NMR chemical shift prediction are presented. The CAST method was developed for describing and retrieving stereochemical information in databases. Information on absolute/relative configurational and conformational, as well as planar structural, environments can be correctly retrieved from a database where chemical structures are described by CAST. We have constructed a new database of 3D chemical structures with 13C NMR chemical shift data, and developed a new system, CAST/CNMR, which can distinguish stereochemical similarities and differences for a carbon whose chemical shift is to be predicted, and predict 13C NMR chemical shifts with high accuracy. Applications of the CAST/CNMR system to some terpenoids, polyethers, and synthetic compounds demonstrate the utility of CAST/CNMR.

54 Descriptor-based HTS data analysis and SAR model extraction using recursive partitioning approach.
Guyan Liang, Computational Chemistry, Aventis Pharmaceuticals, P. O. Box 6800, Bridgewater, NJ 08807, Fax: 908-231-3605, Guyan.Liang@Aventis.com, and Shaoyi Li, Suite 2004

The present study applied Recursive Partitioning (RP) analysis to HTS activity data and descriptors derived from molecular structures (2D/3D) to build SAR models. The structural descriptors used include Daylight fingerprints, ISIS keys, atom-pair fingerprints, and 3D pharmacophores. The advantage of using 2D descriptors is that they avoid the difficulty of conformational space sampling. In the past two years, it has been reported in the literature that 2D descriptors had more stable performance in generating SARs and differentiating active molecules from inactives. The advantage of 3D pharmacophores is that they can be easily linked to molecular structures and that the knowledge learned from data mining can be directly used in guiding chemical structure optimization. Several comparisons were made in the present study using Aventis HTS data to demonstrate the performance of different descriptors. During the RP model building process, one major concern is that the model may over-fit the training data. As the number of nodes (size of the tree) increases, the misclassification rate goes down for the training sample, but the rate for the testing samples goes back up after reaching a minimum. As a result, the overall predictability is lower. In our process, a carefully designed cross validation was used to determine the optimum number of nodes so that it gives the best performance for the testing samples rather than for the training samples. Several sets of HTS data were used to demonstrate the feasibility and performance of this approach. Those HTS data ranged from reliable sources (confirmed IC50) to highly noisy ones. The performance of 2D versus 3D descriptors was compared for different types of HTS data.

55 Knowledge management of spectral data
Marie Scandone, and Gregory M. Banik, Informatics Division, Sadtler Software & Databases, Bio-Rad Laboratories, 3316 Spring Garden Street, Philadelphia, PA 19104-2596, Fax: 215-662-0585, marie_scandone@bio-rad.com - SLIDES

An informatics solution is presented that permits organic and analytical chemists to combine the ability to build, analyze, and access analytical databases with the ability to create, manage, and communicate knowledge from those databases. Software modules are described to manage various analytical methods, build and search databases of spectra and chemical structures, interpret and predict spectra, draw chemical structures, and annotate structures and spectra. Spectral data can be processed to optimize search results, and searching can be done on names, structure, substructure, properties and analytical data. Either commercial or user-created databases can be managed and searched. Through the use of a single user interface in an integrated informatics environment, switching between multiple programs and the training issues that this creates are eliminated. Data from multiple analytical techniques--including 1H and 13C NMR, IR, NIR, Raman, MS, GC, and UV-Vis--can be accessed, searched, processed or analyzed as well as presented in reports or web publications.

56 New MS and BS chemical informatics programs
Gary D. Wiggins, and Sonia Gupta, Chemistry Library, Indiana University, 800 E. Kirkwood Avenue, Chemistry Building Room C003, Bloomington, IN 47405-7102, Fax: 812-855-6611, wiggins@indiana.edu - SLIDES

New academic programs in chemical informatics have been started in several universities in the last two years. An overview of the course of study at three universities shows the common topics covered in the programs at UMIST, the University of Sheffield, and Indiana University.

57 Results of the fall 2000 ACS Committee on Professional Training library survey.
Kevin P. McCue, Office of Professional Training, American Chemical Society, 1155 15th Street NW, Washington, DC 20036, Fax: 202-872-6066, kpm98@acs.org, Jeanne E. Pemberton, CPT Chair (University of Arizona), and Sally Chapman, CPT Consultant (Barnard College)

In October 2000, department chairs at the 617 ACS-approved undergraduate chemistry programs were mailed a four-page survey on chemistry library holdings, access, budgets, and use. The goals of this survey were to assess the current chemistry library situation and to determine how recent developments in electronic forms of chemical information are impacting undergraduate education in chemistry. Librarians familiar with chemistry holdings and budgets were asked to complete half of the survey. The overwhelming 67% response indicates the interest of chemistry departments and the chemical information community in these issues. The results from this survey will be presented and will be broken down by institution size and highest degree offered for comparison. The implications of these findings for undergraduate chemistry education will be discussed.

58 Unified Chemical Research Interface: A New Concept of Information Exchange via the Internet.
Yan He, Department of Chemistry, University of Iowa, Iowa City, IA 52242, Fax: 319-335-1270, yan-he@uiowa.edu - SLIDES

59 Virtual screening using “LibSign”
Lars Nærum, Leif Nørskov-Lauritsen, Preben H. Olesen, and Søren B. Padkjær, Health Care Discovery, Novo Nordisk A/S, Novo Nordisk Park, Måløv DK-2760, Denmark, Fax: +45 4466 3450, lnae@novonordisk.com, sbp@novonordisk.com

One of the challenges in creating an efficient virtual screening environment is to provide a large source of virtual compounds that can readily be synthesized and tested in a corresponding assay. A computer system called LibSign that integrates solid phase chemistry, scaffold design, reagent selection and virtual screening has been built. It provides a common platform for bench, scaffold and computational chemists to enter their chemical knowledge into a database and allows for virtual screening of scaffolds using a variety of pharmacophore-based methods. Virtual hits from verified scaffolds can be synthesized and tested within a few weeks. The presentation will contain a description of the LibSign system and an example of its use in identifying new hits by lead hopping.

TUESDAY MORNING

Computer-Assisted Applications for the Practicing Chemist Herman Skolnik Award Symposium
McCormick Place, South Bldg. Level 4
G. Grethe, Organizer
8:40 60 Thirty years of computer-assisted applications for the synthetic chemist: Experiences of a non-programmer.
Guenter Grethe, Marketing/Scientific Applications, MDL Information Systems, Inc, 14600 Catalina Street, San Leandro, CA 94577, Fax: 510-614-3616, guenter@mdli.com - SLIDES

The dramatic changes in the area of chemical information that have taken place over the last 30 years have affected both the way we handle data and their use by specialists and occasional users alike. This retrospective on the use of computer-assisted applications for the synthetic chemist will travel along a route from the early days, characterized by large computers with limited memory, to today's increasingly complex discovery process employing powerful machines. This talk will emphasize the changes that turn end users increasingly into proficient users of chemical information. The development of user-friendly programs, the rapid advancements in web technology, and efficient manipulation of data all play an important role in changing the chemical information scene.

9:10 61 Automating the design of molecules.
W. Todd Wipke, Department of Chemistry and Biochemistry, University of California, Santa Cruz, Molecular Engineering Laboratory, Santa Cruz, CA 95064, Fax: 831-459-2935, wipke@chemistry.ucsc.edu

Today, almost every piece of chemical research software can run on the chemist's desktop computer with high resolution true color graphical display. We have been interested not only in the potential of the computer to enhance the creativity of the chemist, but also in the potential of the computer to be innovative and creative on its own. This paper focuses on INVENTON, a program to explore the capability of the computer to invent new chemical structures to fit criteria specified by the chemist. What might we expect in Chemistry now that the Deep Blue computer beat the World Chess Champion? What are the limitations in computer creativity?

9:40 62 Question, query and relevant response: Pick any two
Alexander Lawson, MDL Information Systems GmbH, Theodor-Heuss Allee 108, D-60486 Frankfurt, Germany, ALawson@mdli.com - SLIDES

The history of access to chemical information has been dominated for almost three decades by the availability of major secondary indexing services in computerized form. The archetypal expectation for any literature research is obviously a “relevant response” to any particular “question”. The importance of the “query” as intermediate in this scenario is widely regarded as axiomatic, but closer examination shows that this paradigm is increasingly under pressure. The present paper will trace the decline in the importance of the concept of the query as the world finally moves to a more integrated view of the chemical literature, stimulated by the onset of primary sources in electronic form. The particular instance of information on chemical reactions will be examined as a typical case of the general postulate given in the title.

10:10 63 Databases and documents: Breaking down the barriers.
W. Douglas Hounshell, MDL Information Systems, 14600 Catalina St., San Leandro, CA 94577, Fax: 510-614-3616, doug@mdli.com - SLIDES

The ability to easily follow ideas across documents and databases is becoming increasingly important to scientists who are researching prior art or maintaining current awareness. The advent of Internet access to documents (e.g. eJournals) and to databases affords the opportunity to seamlessly shift context from document-to-database, database-to-database, database-to-document, and document-to-document. Specific examples that implement some of these interconnections will be examined, including Dymond (document-to-database), Compound Warehouse (database-to-database), and LitLink (database-to-document and document-to-document).

10:40 64 Networking of information sources for the future bench chemist.
René Deplanque, Jost T. Bohlen, and Richard C. Huber, FIZ CHEMIE BERLIN, Franklinstrasse 11, D-10587 Berlin, Germany, Fax: 49 30 39977133, deplanque@fiz-chemie.de

During the last two decades numerous traditional as well as newly developed information sources have become electronically available to chemists. In addition, various tools for modeling, simulation and visualization are now routinely used due to the steadily increasing computing power of standard lab and office environments. However, until now the user has had to deal with several applications and diverse interfaces on various platforms. In the future the bench chemist will have access to an integrated laboratory environment which combines all relevant sources of information, from primary literature to reference works and textbooks as well as factual and reaction databases, with all necessary tools for simulation, visualization, and the management of internal and external data.

11:10 65 Evolution of research informatics.
Mick Savage, Savage Consulting, 13586 Penfield Point, San Diego, CA 92130, msavage1@san.rr.com

For many years, computational methods have played a critical role in accelerating the drug discovery process. However, the revolution of automated laboratory methods has altered the research environment to require an even greater dependence upon computation. Robotics, multi-well systems, and newly entrenched technologies such as combinatorial chemistry and bioinformatics continue to generate massive quantities of data - overloading the conventional computing infrastructure. Today's discovery information comes from hundreds of sources and consists of many different types of data. This presents tremendous challenges in the integration, management, and analysis of all this valuable data. Unfortunately, the nature of database systems requires that data inquiries be restricted to that which has been pre-conceived and already calculated in the system. There is a growing recognition in the life sciences community of the need for tools and systems capable of manipulating and analyzing huge quantities of various types of data in real time. This presentation will focus on the emergence of new approaches for addressing this emerging crisis in research informatics.

TUESDAY AFTERNOON

Computer-Assisted Applications for the Practicing Chemist Herman Skolnik Award Symposium
McCormick Place, South Bldg. Level 4
G. Grethe, Organizer
2:30 66 Decision support systems for the practicing medicinal chemist.
Peter Gund, Pharmacopeia Inc, CN 5350, Princeton, NJ 08543, pgund@pharmacop.com - SLIDES

The medicinal chemist has traditionally struggled to acquire and digest all project-related information. He or she must understand and act on in-house and external chemical, biological, and computational data; design and perform appropriate experiments; analyze those results and interpret them in context; and transmit conclusions to appropriate team members and management. Project leaders and especially research directors have the more daunting tasks of integrating all project information, rationally allocating research resources, and deciding when to initiate new projects and/or terminate current ones. Research decisions at all levels will be no better than the completeness and correctness of available information. The emerging high-throughput discovery paradigm depends on integrated informatics systems, decision support capabilities, and workflows to enable significantly more rational project management. Progress on developing such systems will be reviewed.

3:00 67 Exploring structure databases
Robert W. Snyder, MDL Information Systems, Inc, 14600 Catalina Street, San Leandro, CA 94577, Fax: 510-483-4738, bobs@mdli.com - SLIDES

Structure databases hold the key to leveraging existing information in chemical research. Inherent in the chemical structure is knowledge on biological activity, synthetic feasibility, metabolic fate, and toxicological profile. Yet current tools focus on the searching and retrieval of chemical structures and their related information and not on the exploration of knowledge existing in structure databases. Novel ways for comparing the information level of reaction databases will be presented. Also, new ways to explore knowledge based on structural similarity will be discussed.

3:30 68 Reaction knowledge from reaction database: The derivation and application to synthesis design.
Kimito Funatsu, Department of Knowledge-based Information Engineering, Toyohashi University of Technology, Tempaku, Toyohashi 441-8580, Japan, Fax: +81-532-47-9315, funatsu@tutkie.tut.ac.jp

Reaction databases potentially include much information about chemical reactions (e.g., reaction conditions, yields, catalysts, reagents, and references) and are continuously updated from year to year. Reaction databases have been widely accepted and used in many chemical research laboratories. In database-oriented synthesis planning, however, users must already have contrived the general outline of the synthesis before they search whether a planned reaction scheme has an apparent connection to a literature precedent in databases. Empirical knowledge base-oriented synthesis planning systems are therefore attractive for chemists because they can propose retrosynthetic paths based on their knowledge bases without the user's framework of the synthesis of the desired target molecule. Considering these issues, we have developed a novel empirical synthesis design system by use of knowledge bases that are free from the disadvantages of a transform-knowledge base. This system is called KOSP (Knowledge base-Oriented system for Synthesis Planning). The aim of KOSP is to adjust the reaction knowledge base. To achieve this purpose, four functions are required: strategic site pattern perception, retrosynthetic scheme generation, retrosynthetic scheme evaluation, and retrosynthetic analysis termination. One of the most important advantages of KOSP is that the knowledge base can be immediately derived from reaction databases when reaction data increase; KOSP can thereby consider novel and effective reactions which are being developed in the organic synthesis field. KOSP can thus use reaction data efficiently.

4:00 69 Chemist and the Web.
Stephen R. Heller, MDL Information Systems, Sushi House, 2413 Lillian Drive, Silver Spring, MD 20902, steveh@mdli.com - SLIDES

The internet and the World Wide Web contain a great deal of useful and valuable information for the practicing chemist. While some of the information is free, the vast majority of the most valuable information requires payment for access.

This presentation will provide an overview of what is on the web, both free and fee-based, and provide examples of what chemists do on an everyday basis when they access the web. Examples will be taken from the most popular web resources such as patent sites, ChemWeb, Chemindustry.com, chemical societies, and primary and secondary publishers.

4:30 70 Reacting to chemists' needs: Reaction information sources, their providers and users.
Engelbert Zass, Chemistry & Biology Information Center, ETH Zuerich, ETH Hoenggerberg - HCI, CH-8093 Zuerich, Switzerland, zass@chem.ethz.ch - SLIDES

In the eye of the beholder, reaction information retrieval is the pivotal discipline ("Königsdisziplin") in chemical information. The number of reactions is by definition even larger than the number of chemical compounds. Reaction searching may involve substructure, text, and numeric data searching alone or a combination of these, and is very important in the everyday work of a chemist. Using past developments in reaction databases as a background, the present situation is outlined from the point of view and experience of a chemist turned information specialist. Future improvements and developments needed in reaction retrieval will be identified.

5:00 71 Supporting chemical information needs at Stanford University.
Grace Baysinger, Swain Chemistry and Chemical Engineering Library, Stanford University, 364 Lomita Drive, Organic Chemistry Building, Stanford, CA 94305-5080, Fax: 650-725-2274, graceb@stanford.edu - SLIDES

The mission of the Swain Chemistry and Chemical Engineering Library is to support the research and teaching needs of chemists and chemical engineers at Stanford University. To accomplish this mission, Swain has an extensive collection of resources, offers specially tailored services, is in close proximity to its primary clientele, and is accessible 24 hours a day to researchers. Swain's reference, instruction, and outreach services include in-depth reference assistance, weekly orientation tours, class lectures, and database searching workshops. Swain publishes self-help web pages and guides, a new book list, and a library newsletter. Alerting services exist for key databases. Free document delivery services are offered for items not owned by Stanford. While most resources are focused on assisting graduate level research, every Fall Quarter Swain also works intensively with a sophomore level organic chemistry class. This presentation will highlight library resources and services that support chemical information needs at Stanford University.

WEDNESDAY MORNING

Section A

E-Libraries
McCormick Place, South Bldg. Level 4
L. Solla, Organizer
9:05 72 Building an e-print service: Addressing the social challenge in environmental management science.
Lorrie A. Johnson, Office of Scientific and Technical Information, U.S. Department of Energy, P.O. Box 62, Oak Ridge, TN 37831, Fax: 865-576-3589, johnsonl@osti.gov, and Gail M. Hodge, Information International Associates, Inc - SLIDES

Building e-libraries is often as much a social challenge as a technical one. This is particularly true within a field that bridges multiple scientific disciplines, that incorporates multiple information types, that relies on both public and private sectors, and that spans multiple government agencies. The Enviro-Science e-Print Service, developed by a coalition of U.S. federal agencies involved in environmental management science, has been challenged to operate in just such an environment. The e-print service will be described briefly. Efforts to encourage contributions, to obtain feedback from potential contributors and users, and to encourage information sharing will be discussed.

9:35 73 Interactive cross-linked chemical references from major publishers.
Christopher Malcolm Forbes, CEO, knovel, 13 Eaton Ave, Norwich, NY 13815, Fax: 607-337-5090, cforbes@knovel.com - SLIDES

This paper will discuss the progress in delivering comprehensive, data-intensive chemical reference information on the web, using www.knovel.com as its primary example. Progress is being made by integrating full text and database searches with interactive tables, graphs, equations, and structured information to empower users to solve problems and spur productivity. knovel is the leading on-line publisher of technical reference information, bringing hundreds of reference works, handbooks, databases, and analysis tools to users’ fingertips. knovel offers some of the most important chemistry and chemical engineering and applied science reference publications, including the Handbook of Chemistry and Physics, Perry’s Handbook of Chemical Engineering and Lange’s Handbook of Chemistry.

10:05 74 Designing a new web OPAC at the MIT Libraries.
Erja Kajosalo, Libraries, Massachusetts Institute of Technology, 77 Massachusetts Avenue, Room 14S-134, Cambridge, MA 02155, Fax: 617-253-6365, kajosalo@mit.edu - SLIDES

In the fall of 2000 the MIT Libraries selected Ex Libris Aleph 500 as their new library management system. Many MIT community members use library resources remotely, and replacing our current telnet and web versions of the OPAC with this new, highly customizable web catalog will make it easier to find out about and access the library resources. This paper will discuss the principles and challenges of customizing a web-based OPAC for the MIT community, and will highlight some of the advanced patron functions available with this product.

10:35 75 Electronic reserves at the University of Illinois at Urbana-Champaign (UIUC) Chemistry Library.
Tina E. Chrzastowski, Chemistry Library, University of Illinois at Urbana-Champaign, 255 Noyes Laboratory, 505 S. Mathews, Urbana, IL 61801, Fax: 217-333-9208, chrz@uiuc.edu - SLIDES

Today a "remote" user is any user found outside the physical space of the library. Delivering information in electronic form is limited only by digital boundaries (IP address, authentication), not physical ones. This philosophy has been successfully applied to the class reserve collection at the UIUC Chemistry Library, where most users are on campus, but still appreciate access from any computer at any time. Electronic reserves were implemented in 1998 following years of classic reserve problems: missing pages, too few copies, stolen articles, and insufficient library hours. Electronic reserves are the solution. Basic equipment includes a computer, scanner, and software. A web site and rudimentary knowledge of file storage, linking, and servers complete the picture. This presentation will address student use of electronic reserves and will present results from a survey detailing student likes, dislikes, and the behavioral changes inspired by 24/7 availability of course reserve material.

11:05 76 Remote user support within a web-based community.
Davina Heaven, Bryan A Vickery, and Kristina Thrower, ChemWeb Inc, 84 Theobalds Road, London WC1X 8RR, United Kingdom, davina.heaven@chemweb.com - SLIDES

Remote users make up the complete membership of ChemWeb.com. ChemWeb Member Services provides support and information for remote users. Each remote user has a unique member name and user profile to authenticate access to full record databases and journals, available for purchase with an online credit card account. Users can choose to purchase subscriptions offline with subscription keys. These can also be used for providing institutional access. ChemWeb.com has established a number of society deals, often providing remote users with free access to online journals. ChemWeb.com is the premier online community for chemistry researchers and now has a membership of over 260,000 remote users from a diversity of countries worldwide. ChemWeb.com hosts chemical information services, made up of over 220 chemistry journals and more than 30 databases, all fully searchable and browseable. These include structure searchable databases, abstracts, full patents and dictionaries.

11:35 77 Outreaching to users - the Web as a platform for support, training, and education.
Engelbert Zass, Martin Braendle, and Spartaco A. Bizzozero, Chemistry & Biology Information Center, ETH Zuerich, ETH Hoenggerberg - HCI, CH-8093 Zuerich, Switzerland, zass@chem.ethz.ch - SLIDES

Although rapid changes in chemical information have taken place in the recent past, it is often not an obligatory part of the chemistry curriculum. Thus, traditional methods of teaching are not always efficient and accepted by students. Likewise, users who access electronic sources via the net at their workplace and no longer wish to come to the library to use the local holdings cannot be supported in the traditional way. Since the beginning of our web presence in 1995, we have offered support (and client installation) pages for our major chemistry databases, namely, CrossFire, CA on CD, ISIS, and SpecInfo. Further, we began in 1998 to augment chemical information lectures (taught formally since 1984) by using a Web-based course system. Recently, we have started to replace conventional one-hour instruction courses for individual information sources with multimedia self-teaching modules. We use these electronic resources not only locally, but also for national and international collaborations in teaching chemical information retrieval.

WEDNESDAY MORNING

Section B

Careers in Chemical Information
McCormick Place, South Bldg. Level 4
T. Wright and P. Barnett, Organizer - Cosponsored with Younger Chemists Committee
9:00 78 Chemical information careers or life outside the lab
Anne Marie Clark, Information Management, Pfizer Global Research and Development, 2800 Plymouth Road, Ann Arbor, MI 48105, Fax: 734-622-7008 - SLIDES

Shaken too many separatory funnels, watched too many gels or made too many oils? Consider a career in chemical information. Approximately 3000 chemistry documents are published every day. Who keeps this information from being lost? Chemical information professionals make sure that the information in those documents gets to bench chemists, project managers, patent attorneys and decision-makers. Who are chemical information professionals? They are usually degreed scientists who have made the leap from the lab. They are scientific librarians, scientific indexers, technical information specialists, competitive intelligence consultants, patent information specialists, market researchers or management consultants, technical publishers or editors, software developers, or computer programmers. If you are interested in reading and understanding technical documents but not necessarily doing the experiments, this may be a good career for you. In general, chemical information professionals need a solid knowledge of chemistry, good communication and computer skills and an eye for detail. How do you make the career transition? Some people move from the laboratory into chemical information positions, some become scientific indexers or editors, and some get an M.L.S. degree. Myself, I became a scientific information analyst with CAS after finishing my Ph.D. From there, I joined Warner Lambert, now Pfizer, as a patent/chemistry information scientist. Career information and resources are available from the Chemical Information division of the ACS http://www.acs.org/divisions/, Chemistry division of the Special Libraries Association http://www.sla.org/division/dche/chemdiv.html, Patent Information Users Group www.piug.org, American Society of Indexers http://www.asindexing.org/index.html, and the Society of Competitive Intelligence Professionals http://www.scip.org/

9:30 79 How to juggle chemistry, computers, and business interests: perspectives on the career transition from information buyer to information supplier.
Gregory M. Banik, Bio-Rad Laboratories, Informatics Division, 3316 Spring Garden Street, Philadelphia, PA 19104, gregory_banik@bio-rad.com - SLIDES

The ability to juggle interests in chemistry, computers, and business as it relates to the career development process in chemical information is illustrated. Direct experience in leveraging chemistry and computer skills to shift from academia to industry is related. Finally, navigating a career transition from chemical informatics buyer to supplier is described via first-hand experience managing proprietary and published information at Abbott Laboratories, developing and marketing new chemical information products at ISI, managing business development at MSI, and running the Informatics Division at Bio-Rad.

10:00 80 Careers in chemistry patent information
Andrew H. Berks, Merck & Co, 126 E. Lincoln Ave RY60-35, Rahway, NJ 07065-0900, Fax: 732-594-5832 - SLIDES

Patent information management and patent searching are critical but little known job functions in research based organizations. This talk will present an overview of this field, including required and desirable skill sets, common responsibilities of patent information professionals, typical work assignments, training opportunities, and migration paths. A brief biography of the speaker will also be presented.

10:30 81 From information to intelligence: Changing thrust in patent data management.
Susan E. Cullen, Aurigin Consulting, Aurigin Systems, Inc, 10710 North Tantau Av, Cupertino, CA 95014, scullen@aurigin.com - SLIDES

Today’s challenge in making informed decisions is not that there is lack of information, but that there is more information than can easily be managed. The ability to convert knowledge to usable intelligence is vital to well-made R&D decisions or business decisions. In the chemical patent area there is a new demand for information experts who not only can search, but who can make the cream rise to the top. The skills in demand are not only subject matter expertise in science and patent matters, but aptitude for critical thinking and ability to present information in ways that are clear, provoke discussion, and display choices. Software tools to aid information management are valuable, but only when used with critical insight. Jobs are available for people with such skills in large companies, in law firms, and in consulting organizations. This work may be especially engaging for early retirees or career changers.

11:00 82 Wanted: Academic chemistry librarians at research institutions.
Grace Baysinger, Swain Chemistry and Chemical Engineering Library, Stanford University, 364 Lomita Drive, Organic Chemistry Building, Stanford, CA 94305-5080, Fax: 650-725-2274, graceb@stanford.edu - SLIDES

While the mission of an academic research library to support research and teaching remains the same, how this is accomplished is undergoing rapid transformation. Collections, services, facilities, and staffing in libraries are all changing. It is an exciting yet challenging time to be a chemistry librarian at a major research university. To thrive in this complex environment, academic chemistry librarians must be knowledgeable about information resources for chemists, understand local programmatic needs, and have strong analytical and problem-solving skills, fund management skills, excellent interpersonal and communication skills, advanced online searching skills, and technical expertise in using computers. They must also be flexible and adapt easily to change. An academic degree in chemistry or in the sciences is also highly desired. Many interesting jobs exist for academic chemistry librarians but they are difficult to fill. This presentation will highlight activities, issues, and opportunities that exist for academic chemistry librarians at research institutions.

11:30 83 Making the jump: Moving from research into chemical informatics.
David A. Evans, Product Marketing, MDL Information Systems, Inc, 14600 Catalina Street, San Leandro, CA 94577, Fax: 510-614-3651, davide@mdli.com - SLIDES

Information is the backbone of everything that we do. Without a good chemical informatics solution, critical data, experiments, and results may be needlessly repeated, misunderstood, ignored or, even worse, lost forever. There are a variety of jobs available in the informatics arena, e.g. working directly for a chemical company, software vendor or publishing house.

So why make the change from Research into Chemical Information? This talk will outline some of the requirements, desirable skill sets and some of the opportunities. The speaker will also present a personal perspective on making the jump.

WEDNESDAY AFTERNOON

Section A

E-Libraries
McCormick Place, South Bldg., Level 4
L. Solla, Organizer
1:35 84 Mt. Saint Helens and the rise of digital imaging in academic libraries
Susanne J. Redalje, Chemistry Library, University of Washington, Box 351700, Seattle, WA 98195-1700, Fax: 206-543-3863, curie@u.washington.edu - SLIDES

Electronic is the format of choice for most users of the academic library today. It is assumed by many users that anything of value to them is already on the web and is available 24/7. Librarians know this is not the case, of course, but are working, with others, to bring about the day when it might be true. Image databases greatly increase the access to materials that many users never knew existed, often materials which are fragile or had inadequate indexing tools. The University of Washington uses CONTENTdm, a multimedia package developed on campus, as one of its primary tools to produce image databases for use by UW students and researchers and the world at large. CONTENTdm allows easy production and management of image and multimedia databases. This paper discusses some of the issues involved in the development of one such database, covering the eruption of Mt. Saint Helens.

2:05 85 Introducing the National Science Digital Library (NSDL) Program and the "Site for Science".
John M. Saylor, Engineering Library, Cornell University, Carpenter Hall, Ithaca, NY 14853, Fax: 607-255-0278, jms1@cornell.edu - SLIDES

To stimulate and sustain continual improvements in the quality of science, mathematics, engineering, and technology (SMET) education, the National Science Foundation (NSF) has launched the National Science, Mathematics, Engineering, and Technology Education Digital Library (NSDL) program (see "A Progress Report" by Lee L. Zia, Lead Program Director, NSDL Program, Division of Undergraduate Education, D-Lib Magazine, October 2000, Volume 6, Number 10, http://www.dlib.org/dlib/october00/zia/10zia.html). A public version of the Library is due to launch in the fall of 2002.

"SITE for Science" (http://www.siteforscience.org/) is currently in development at Cornell University. "SITE for Science" is a prototype for the NSDL Central System, the central core of the national science digital library for SMET education envisioned by the NSDL program.

The key project goal is to be comprehensive in the approach to information science architecture, portal design, and production system administration of "SITE for Science": to embrace every digital resource, for every level of education, in every field of science, mathematics, engineering and technology; to accommodate diverse content, metadata, protocols, formats, authentication and business practices; and to support students and instructors from the most junior to the expert.

"SITE for Science" supports several levels of interoperability: high-quality federations of NSDL members; harvesting metadata from digital repositories; and web crawlers to gather information from scientific web sites.

This talk will describe and demonstrate the Cornell prototype and the NSDL project at large and discuss how you can get involved.

John M. Saylor, Director, Engineering and Computer Science Library, http://www.englib.cornell.edu/jms/

2:35 86 Chemical structure and text hyperlinking in a web-based e-library.
James R. Weeks and Bryan A. Vickery, ChemWeb, Inc, 84 Theobalds Road, London WC1X 8RR, United Kingdom, Fax: +44 20 7611 4301, james.weeks@chemweb.com - SLIDES

Organisation of disparate chemical information sources into a valuable resource is not easy, especially when they are produced by many different publishers on many different platforms.

ChemWeb.com employs advanced text and structure search and retrieval technologies designed to make the life of the information searcher easier. From these results it is possible to link to other records within ChemWeb.com or to other delivery mechanisms via LitLink. Dynamic linking via hot 'structures' through the latest DYMOND technology will also be demonstrated.

The paper will also discuss the linking of the Chemistry Preprint Server to other preprint servers and open repositories via the Open Archive protocol for interoperability.
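As an illustration of the kind of repository linking mentioned above, the following is a minimal sketch of how a harvesting request might be composed. It assumes the protocol in question resembles the Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH) as later standardized; the base URL and the helper name list_records_url are hypothetical and are not taken from ChemWeb.com.

```python
from urllib.parse import urlencode


def list_records_url(base_url, metadata_prefix="oai_dc", from_date=None, set_spec=None):
    """Build a ListRecords harvesting request for a repository.

    The parameter names (verb, metadataPrefix, from, set) follow OAI-PMH;
    this is an assumption about the protocol the abstract refers to.
    """
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if from_date is not None:
        params["from"] = from_date  # harvest only records changed since this date
    if set_spec is not None:
        params["set"] = set_spec  # restrict harvesting to one named subset
    return base_url + "?" + urlencode(params)


# Hypothetical repository address, used only for illustration.
url = list_records_url("http://example.org/oai", from_date="2001-08-01")
print(url)
```

The sketch only composes the request; fetching and parsing the XML response is left out.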

WEDNESDAY AFTERNOON

Section B

Materials Science Informatics
McCormick Place, South Bldg. Level 4
M. J. Doyle, Organizer - Cosponsored with Division of Inorganic Chemistry, and Division of Polymeric Materials: Science and Engineering
1:30 87 Charge transport in conducting polymers.
Tim Clark, Friedrich-Alexander-Universität Erlangen-Nürnberg, Computer-Chemie-Centrum, Nägelsbachstrasse 25, D-91052 Erlangen, Germany, Fax: +49-9131-8526565, clark@chemie.uni-erlangen.de

Semiempirical molecular orbital theory can be used, both for geometry optimizations and molecular dynamics simulations, to investigate quite large oligomer models for conducting polymers such as polythiophene. Pure MO and hybrid QM/MM results will be presented for calculations on the nature of the charge carriers in polythiophene and for soliton conductance simulations using molecular dynamics in a homogeneous electric field.

2:00 88 Data analysis in combinatorial catalysis research
David R Dorsett Jr, Symyx Technologies, 3100 Central Expressway, Santa Clara, CA 9551, Fax: 408-748-1221, ddorsett@symyx.com

A comprehensive informatics system supporting combinatorial and high-throughput catalysis research is presented. Challenges in ad hoc data analysis, querying, and visualization resulting from use of the system in several different research projects will be presented along with future directions.

2:30 89 Data management system for high-throughput experimentation in materials science.
J. Tucker, G. Loewenhauser, J.-R. Hill, and B. E. Eichinger, Consulting Group, Molecular Simulations Inc, 9685 Scranton Road, San Diego, CA 92121-3752, Fax: 858 458 0136, jtucker@msi-eu.com, bruce@msi.com

High-throughput experimentation requires a data management system that is capable not only of storing the massive amounts of data produced during experimentation, but also of retrieving, searching, and mining these data. Such data management systems currently exist for use in pharmaceutical research and development, but high-throughput experimentation in materials science is sufficiently different from its pharmaceutical counterpart that reuse of existing data management systems is impractical. We have developed a generally applicable data management system that can handle the challenges found in materials science high-throughput experimentation. Features unique to our system are: (i) samples can be prepared by applying any sequence of user-defined processing operations; (ii) processing operations can have any number of user-defined parameters; (iii) samples can be prepared from chemicals or other samples; (iv) samples exist independently of plates; (v) samples can be tested using any number of user-defined tests; (vi) tests can be run either as high-throughput or as conventional experiments; (vii) tests can have different phases; (viii) tests can produce any amount of data; (ix) the formats of these data are free; (x) all data in the system can be searched and graphically visualized; and (xi) the user can configure the user interface to see only relevant data. The system that accomplishes all this consists of six fully integrated data management modules, which will be described. Illustrative examples will be provided.
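The feature list above can be pictured as a small data model. The following is only a sketch under stated assumptions: every class, field, and function name (Chemical, ProcessingStep, AssayRun, Sample, find_samples) is hypothetical and does not describe the authors' actual six-module system.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional, Union


@dataclass
class Chemical:
    name: str


@dataclass
class ProcessingStep:
    operation: str  # user-defined operation name
    parameters: Dict[str, Any] = field(default_factory=dict)  # any number of parameters


@dataclass
class AssayRun:
    test_name: str  # user-defined test
    high_throughput: bool  # high-throughput or conventional experiment
    phases: Dict[str, List[Any]] = field(default_factory=dict)  # free-format data per phase


@dataclass
class Sample:
    sample_id: str
    sources: List[Union[Chemical, "Sample"]] = field(default_factory=list)
    steps: List[ProcessingStep] = field(default_factory=list)
    tests: List[AssayRun] = field(default_factory=list)
    plate_position: Optional[str] = None  # None: the sample exists without a plate


def find_samples(samples, operation):
    """Return the samples whose preparation history includes the given operation."""
    return [s for s in samples if any(step.operation == operation for step in s.steps)]


# A sample made from chemicals, then a second sample made from the first.
s1 = Sample("S1", sources=[Chemical("TiO2"), Chemical("ZrO2")],
            steps=[ProcessingStep("mix", {"ratio": 0.5})])
s2 = Sample("S2", sources=[s1],
            steps=[ProcessingStep("anneal", {"temperature_C": 450, "hours": 2})],
            tests=[AssayRun("conductivity", True, {"warm-up": [0.1], "steady": [0.8, 0.9]})])
all_samples = [s1, s2]
print([s.sample_id for s in find_samples(all_samples, "anneal")])
```

Keeping the plate position optional and letting a sample list other samples among its sources are one way to mirror points (iii) and (iv); a production system would more likely use database tables than in-memory objects.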

THURSDAY MORNING

General Papers
R. W. Snyder, Organizer
9:00 90 Linking Context-Similar Information.
Timothy Hoctor, MDL Information Systems, Inc, 14600 Catalina Street, San Leandro, CA 94577, Fax: 510-614-3651, timh@mdli.com - SLIDES

The Internet is delivering information to the desktop at increasing speed, volume, and diversity. The Internet initially used hypertext links among documents, but is now evolving to use metadata to integrate various types of information, for example structures, bibliographic citations, intellectual property information, and other data types. As a result, there is an increasing need to interlink not just referenced text information, but also context-similar information. Interlinking "information in context" has been implemented in several information systems; examples will be presented and possible future developments discussed.

9:30 91 Computer-assisted mechanism-of-action analysis of large databases, including 250,000 chemical compounds registered by NCI.
Marc C. Nicklaus1, Wolf-Dietrich Ihlenfeldt2, Dmitrii Filimonov3, and Vladimir V. Poroikov3. (1) Laboratory of Medicinal Chemistry, National Cancer Institute - Frederick Cancer Research and Development Center, National Institutes of Health, Building 376, Room 207, 376 Boyles Street, Frederick, MD 21702, Fax: 301-846-6033, mn1@helix.nih.gov, (2) Computer Chemistry Center, Institute of Organic Chemistry, University of Erlangen-Nuremberg, (3) Russian Academy of Medical Science, Institute of Biomedical Chemistry - SLIDES

The program PASS (Prediction of Activity Spectra for Substances), which currently predicts probabilities for activity as well as inactivity for 625 pharmacological effects and biochemical mechanisms-of-action (including toxicities) on the basis of a compound's structural formula (http://www.ibmh.msk.su/PASS), was used to calculate predictions for nearly all of the 250,251 structures of the Open NCI Database. A total of 64,188,212 predicted values have been made available, and searchable, on the Web servers of the Erlangen/Bethesda Data and Online Services, run jointly by the CADD MiniCore Facility of the Laboratory of Medicinal Chemistry, NCI, NIH (http://cactus.nci.nih.gov/ncidb2), and the Computer Chemistry Center of the University of Erlangen-Nuremberg, Germany (http://www2.chemie.uni-erlangen.de/ncidb2). This very large dataset, freely and easily searchable with just a Web browser, is intended to provide a valuable resource to enlarge the current knowledge about structure-activity relationships in pharmaceutical development. This work is supported by CRDF (Grant RC1-2064).

10:00 92 AIDS: An Information Deficiency in Society
Chinnappan Baskar1, Crasta Carmelina Karen2, Sockalingam Nachamma3, Jagannathan Arockiam1, and Peter Natesan Pushparaj4. (1) Department of Chemistry, National University of Singapore, 3 Science Drive 3, Singapore 117543, Singapore, Fax: (65) 779 1691, scip8330@nus.edu.sg, (2) Institute of Molecular Agrobiology, National University of Singapore, (3) Department of Microbiology, National University of Singapore, (4) Department of Biochemistry, National University of Singapore

Even though science and technology have advanced in the new millennium, the human population still faces a wide variety of health problems in day-to-day life. The current major threat to humanity is the Acquired Immunodeficiency Syndrome (AIDS), which is due to the destruction of immune cells such as CD4+ lymphocytes by the Human Immunodeficiency Virus (HIV). Since the outbreak of the epidemic, HIV has infected more than 47 million people and so far 14 million people have died of AIDS. In 1988 alone, AIDS deaths totalled about 2.5 million, whereas the other top five killers, including malaria, caused only over a million deaths. The problem could be attributed to the communication gap between the populace and the scientific community. This talk will focus on how to bridge that communication gap between the scientific community and the masses.

10:30 93 Potential role of chemical information technology in 21st century science.
Peter Natesan Pushparaj1, Jagannathan Arockiam2, Chinnappan Baskar2, and P Kangueane3. (1) Department of Biochemistry, National University of Singapore, Faculty of Medicine, National University of Singapore, Singapore 119260, Singapore, bchpnp@nus.edu.sg, (2) Department of Chemistry, National University of Singapore, (3) Bioinformatics Centre, Department of Biochemistry, National University of Singapore - SLIDES

Advancements in information technology, nanotechnology, pharmacogenomics, etc. will pave the way for an information revolution in science. The proposed marriage between science and information technology is cardinal for knowledge discovery from information repositories. Informatics plays a crucial role in data manipulation, data curation and knowledge extraction, thus bridging the gap between disparate information sources for subsequent model building, refinement and validation. From genomics to combinatorial chemistry, scientific advances are poised to revolutionize drug discovery and health care. The successful sampling of drug targets from a pool of chemical entities using computational tools will result in faster and more effective treatment of diseases. Knowledge generated using informatics tools will serve as input parameters for in silico biomedical simulation. This talk will focus on how chemical information could be used to unravel the myriad problems of 21st century science.

11:00 94 Information and Organic Molecules: Structure Considerations Via Integer Statistics.
Daniel J. Graham, Department of Chemistry, Loyola University Chicago, 6525 North Sheridan Road, Chicago, IL 60626, Fax: 773-508-3086, dgraha1@luc.edu - SLIDES

Information in relation to organic molecules was investigated in a previous work (Graham and Schacht, J. Chem. Information and Computer Sci. 2000, 43, 187). The topic is given further consideration here using integer statistics. Discussed are the ramifications of an integer variable which quantifies the total number of binding complexions in an organic molecule. Offered is a statistical view of the maximum allowed number of independent regions D expressed by the molecule. Illustrated are the distribution properties of D along with upper limit estimates of the regio-information content. Integer statistics based on elementary number theory establish all of the key distribution properties. In so doing, the traits distinguishing high regio-information molecules are highlighted. The statistical approach encompasses all possible molecules and conditions, not just those reported to date in chemical databases. The aim here is to view regio-information and molecules in an alternative and general way.

11:30 95 Application of graph theory for automated mechanism generation.
Artur T. Ratkiewicz, Department of Chemistry, University of Utah, 315 South 1400 East, Room 2020, Salt Lake City, UT 84112, Fax: 801-581-4353, artur@mercury.hec.utah.edu, and Thanh N. Truong, Department of Chemistry, Henry Eyring Center for Theoretical Chemistry, University of Utah - SLIDES

We present an application of a chemical graph theory approach to generating the elementary reactions of complex systems. Molecular species are naturally represented by graphs, identified by their vertices and edges, where vertices are atom types and edges are bonds. The mechanism is generated using a set of reaction patterns (sub-graphs) that are the internal representations for a given class of reaction, thus eliminating a priori the possibility of generating unimportant product species. Furthermore, each molecule is canonically represented by a set of topological indices (Connectivity Index, Balaban Index, Schulz TI Index, WID Index, and others), thus eliminating the possibility of generating the same species twice. The theoretical background and an application to the combustion of hydrocarbon systems are presented.
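The duplicate-species check described above can be sketched in a few lines. This is an illustrative example only, not the authors' implementation: it uses a single index (the Randic connectivity index on a hydrogen-suppressed graph) plus the atom composition as the key, whereas the abstract mentions a set of several indices. A single index can coincide for different molecules, so a key this small is an assumption made for brevity. The names randic_index, species_key, and SpeciesRegistry are hypothetical.

```python
from math import sqrt


def randic_index(edges):
    """Connectivity (Randic) index: sum over bonds of 1/sqrt(deg(u) * deg(v))."""
    degree = {}
    for u, v in edges:
        degree[u] = degree.get(u, 0) + 1
        degree[v] = degree.get(v, 0) + 1
    return sum(1.0 / sqrt(degree[u] * degree[v]) for u, v in edges)


def species_key(atom_types, edges):
    """Labelling-independent key: sorted atom types plus the rounded index."""
    composition = tuple(sorted(atom_types.values()))
    return composition, round(randic_index(edges), 6)


class SpeciesRegistry:
    """Remembers which species have already been generated."""

    def __init__(self):
        self._seen = set()

    def add(self, atom_types, edges):
        """Return True if the species is new, False if it was seen before."""
        key = species_key(atom_types, edges)
        if key in self._seen:
            return False
        self._seen.add(key)
        return True


carbons = {i: "C" for i in range(4)}
registry = SpeciesRegistry()
first = registry.add(carbons, [(0, 1), (1, 2), (2, 3)])       # n-butane skeleton
relabelled = registry.add(carbons, [(2, 0), (0, 3), (3, 1)])  # same chain, renumbered
branched = registry.add(carbons, [(0, 1), (0, 2), (0, 3)])    # isobutane skeleton
print(first, relabelled, branched)
```

Because the key depends only on vertex degrees and atom types, renumbering the atoms of a molecule does not change it, which is what lets the registry reject the relabelled chain.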

12:00 96 Taming the wild world of web resources: How to manage online activities for a large number of multi-section chemistry courses.
Christopher Todd Jones, Department of Chemistry, University of Illinois at Urbana-Champaign, 601 S. Matthews A2, Urbana, IL 61801, ctjones@uiuc.edu - SLIDES

Design and management of web activities for even a small chemistry course can be a large task. So how can universities with large multi-section chemistry courses effectively utilize the web? An orderly design of a multi-course web portal can simplify its management while continuing to support a variety of teaching styles. A clear understanding of your motivations for having web activities should guide the design structure and management techniques. Focus will be on the barriers and opportunities related to the tasks of managing web activities for a large number of multi-section chemistry courses. Topics such as design strategy, content development, and assessment of web activities will be discussed.

12:30 97 Indexing and searching chemical structures and reactions with stereo-selectivity
Andrew Lemon, Chemistry Development, ID Business Solutions Ltd, 2 Occam Court, Surrey Research Park, Guildford, GU2 7QB, United Kingdom, Fax: 44 1483 595001, alemon@idbs.co.uk - SLIDES

The market share of chiral drugs is increasing year on year. The FDA has introduced new guidelines to address the stereoisomeric composition of drugs. The stereoisomeric composition of a drug with a chiral center must be known, as must the quantitative isomeric composition of the material used in pharmacologic, toxicologic, and clinical studies. Stereo-selective synthetic methodologies are being developed to allow large-scale, cost-efficient production of chiral drugs. There is a strong need for cheminformatics systems to keep pace with this progress in the nature of the chemical structures that such systems must accommodate. We present a chemical indexing and searching system designed to support the representation of enantiomers, diastereoisomers, racemic mixtures and meso isomers in both chemical structures and reactions.

 

 
