Abstracts, 228th ACS National Meeting
Philadelphia, PA, August 22-26, 2004

Titles link to slides when available. Please note: Presentations given at CINF symposia have been posted to the CINF website with express permission granted by the authors who retain the original copyright. These presentations are for information purposes only and cannot be further disseminated without the author's prior written permission.

CINF 1:  Where we've come from and where we're going
Lorrin R. Garson, Publications Division, American Chemical Society (retired), 1155 Sixteenth Street, N.W., Washington, DC 20036, 9929garson@verizon.net

Abstract
The development of electronic communications starting in the mid-19th century, followed by the advent of computers in the mid-20th century, has led to the nearly ubiquitous electronic availability of chemical information at the start of the 21st century. At present, the great majority of scientific journals are available online. Some of the key events that led to the development of electronic journals, challenges facing scientific-technical-medical publishing, and what the scientific community can expect in the near future will be discussed.

CINF 2:  State of electronic publishing today: Tools, trends, and technology
Barry Bealer, Really Strategies Inc, 618 South Broad Street, 2nd Floor, Lansdale, PA 19446, Fax: 215-631-9358, bbealer@reallysi.com

Abstract
Publishers face many challenges when producing electronic content and products. Today, publishers accept the fact that content plus technology equals value and that growth from digital products will continue to increase revenue. However, digital products operate under a different set of assumptions and business models than print products. A variety of enticing tools, technology, and trends can enamor publishers as they forge into electronic publishing initiatives. Identifying what is hype and what is publishing reality in specific publishing environments is key to seeing a return on technology investment. Really Strategies is a full-service firm that helps publishers bring strategy, content, and technology together.

CINF 3:  Journal of Chemical Education Digital Library
Jon L. Holmes, Journal of Chemical Education, University of Wisconsin-Madison, 209 N. Brooks St., Madison, WI 53715-1116, Fax: 608-262-7145, and J.W. Moore, Department of Chemistry, University of Wisconsin-Madison

Abstract
The Journal of Chemical Education has a long tradition of providing chemistry teachers with the information they need to perform their craft. Extending that tradition into the digital realm of the Internet has posed some challenges. To help meet those challenges, JCE has joined with the National Science Digital Library (NSDL) project of the NSF and is building the JCE Digital Library.

Developing a digital library collection has taught JCE, the publisher, several lessons in how to play the role of a library. The lessons include: choosing a meta data standard, developing a controlled vocabulary for our meta data, making decisions about the granularity of our meta data, devising a system for integrating the assignment of meta data into our publishing workflow, deploying a system to make our meta data available to meta data repositories, and constructing a new area of JCE Online, the JCE WWW site, to house our collection.

Here, we will discuss the challenges of constructing a digital library collection and present our solutions to some of the problems we have encountered along the way. We will look at the process of constructing a library collection from the perspective of a small publisher of an academic journal.

CINF 4:  Document acquisition: Lessons learned in providing the ChemPort linking service
Stephen A. Renner, CAS, 2540 Olentangy River Road, Columbus, OH 43202-1505, Fax: 614-447-5470, srenner@cas.org

Abstract
It is no secret that the world of STM scholarly publishing is in turmoil. It is a chaotic mix of electronic full text, myriad purchasing arrangements, A&I databases, search engines, visualization features, portals, link resolvers, new terminology such as "DOI" and "SFX," new library automation software, new standards and protocols, and associated client/server and client-based applications. Amidst this chaos one constant from the print era remains: the need for the scientist to obtain copies of journal articles and patents of interest. CAS introduced the ChemPort full-text linking service in 1997 with the mission of "connecting the user to electronic full text at the publishers' sites." We soon learned that linking scientists to full text at publishers' sites is only one of many ways, and sometimes not the best way, to help scientists and information professionals obtain documents. This presentation will share the challenges and lessons learned in building and operating ChemPort to meet the needs of scientists, information professionals, librarians, and publishers, and will offer a glimpse into the future of document acquisition.

CINF 5:  25 Year trends in information seeking and reading patterns of chemists
Donald W. King, University of Pittsburgh School of Information Sciences, Sara Fine Institute for Interpersonal Behavior & Technology, 600 IS Building, 135 North Bellefield Avenue, Pittsburgh, PA 15260, dwking@pitt.edu, and Carol Tenopir, School of Information Sciences, College of Communication and Information, University of Tennessee

Abstract
This paper presents evidence of chemists' information seeking and reading patterns observed by extracting chemists' survey responses from 42 readership surveys conducted over the past 25 years. In particular, trends show changes in how chemists learn about articles read, where they obtain them and the format of these articles. Five surveys conducted since 2000 show the appreciable changes brought about by the emergence of electronic journals. The survey results also show the usefulness and value of reading scholarly journal articles.

CINF 6:  Chemists online: Analysis of usage statistics and referral URLs of ACS electronic journals
Philip M. Davis and Leah R. Solla, Collection Development, Cornell University, Mann Library, Ithaca, NY 14850-2501, Fax: 607-255-0318, pmd8@cornell.edu, lrm1@cornell.edu

Abstract
Publishers and librarians are increasingly interested in measuring the use of electronic journals. Most usage reports, however, provide only the aggregate number of article requests, and tell us very little about individual user behavior. This presentation will summarize two recent studies using IP-level data to represent individual user behavior. The first study describes the distribution of ACS article downloads over a university community, and suggests that the majority of requests come from a small number of users. The second study details how scientists were referred to ACS articles using a transaction log analysis and indicates that while individuals tend to use few methods to link to articles, collectively researchers utilize many different pathways to access the same literature. These studies represent non-obtrusive methods to better understand the information seeking and usage behavior of chemists.
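
The concentration finding described above (a majority of requests coming from a small number of users) can be illustrated with a short calculation. The following is a minimal Python sketch, not the authors' method: the function name `share_from_top_users`, the `top_fraction` parameter, and the sample log are all hypothetical, and it assumes each IP address stands in for one user, which is only an approximation.

```python
from collections import Counter


def share_from_top_users(request_ips, top_fraction=0.1):
    """Return the fraction of all requests made by the most active users.

    request_ips: one IP address per article request (hypothetical log format).
    top_fraction: fraction of distinct IPs treated as the "top" users.
    """
    # Count requests per IP and sort the counts from heaviest to lightest user.
    counts = sorted(Counter(request_ips).values(), reverse=True)
    if not counts:
        return 0.0
    # Always include at least one user in the top group.
    n_top = max(1, round(len(counts) * top_fraction))
    return sum(counts[:n_top]) / sum(counts)


# Invented sample log: one heavy user and nine occasional users.
sample_log = ["10.0.0.1"] * 91 + [f"10.0.0.{n}" for n in range(2, 11)]
print(share_from_top_users(sample_log, top_fraction=0.1))
```

A value close to 1 for a small `top_fraction` would indicate that downloads are concentrated among a few addresses; shared or proxied addresses would distort the figure.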

CINF 7:  SPARC: Model projects and strategies for changing the scholarly communication system
Julia C. Blixrud, Scholarly Publishing and Academic Resources Coalition, 21 Dupont Circle, Ste 800, Washington, DC 20036, Fax: 785-841-5576, jblix@arl.org

Abstract
The Scholarly Publishing and Academic Resources Coalition (SPARC) has been providing support for new models of scholarly communication since 1998. Several projects have reached maturity and can be evaluated as to their success. Others are still in development, but have been based on experiences gained by those who have preceded them in the use of new technologies as well as new business models needed to transform scholarly communication. This presentation will summarize some of SPARC's projects and examine whether they are achieving their stated objectives. Other, non-SPARC electronic projects also will be reviewed to determine how they are changing the nature of scholarly and scientific communication. Patterns determining successful projects will be identified.

CINF 8:  Results of survey: Academic efforts to address the scholarly communications crisis
Randall K. Ward1, David J. Michaelis2, Robert Murdoch3, Brian Roberts3, and Julia C. Blixrud4. (1) Harold B. Lee Library, Brigham Young University, 2320 HBLL, Provo, UT 84602, Fax: 801-422-0466, randy_ward@byu.edu, (2) Chemistry and Biochemistry, Brigham Young University, (3) Lee Library, Brigham Young University, (4) Scholarly Publishing and Academic Resources Coalition (SPARC), Association of Research Libraries

Abstract
During the 2002-03 school year, a nationwide telephone survey of academic institutions was conducted from Brigham Young University to determine the extent of efforts to address the “scholarly communications crisis”, often characterized as the spiraling costs of scholarly journals. A total of 170 universities were surveyed, all of them members of the Scholarly Publishing and Academic Resources Coalition. The statistical results of the 20-question survey will be presented, along with some commonly mentioned anecdotal recommendations. The statistics gathered could serve as a benchmark for progress in addressing the crisis if the survey were repeated.

CINF 9:  Change as opportunity
Grace Baysinger, Swain Library of Chemistry and Chemical Engineering, Stanford University Libraries, 364 Lomita Drive, Organic Chemistry Building, Stanford, CA 94305-5080, Fax: 650-725-2274, graceb@stanford.edu

Abstract
Large academic research libraries are experiencing major changes in collections, staffing, facilities, and infrastructure. While reviews and re-examination are healthy, the rate of change that is occurring is both invigorating and humbling. While most library and information science programs introduce students to topics such as management and building library collections, few programs focus on the complexities facing science libraries and the rapid evolution that is occurring in them. Thus, mentoring, collaborating, and communicating with colleagues are vital for a successful career as a science information specialist. Understanding the culture of an institution and finding ways to influence decisions are essential. Taking leadership roles locally and nationally in professional organizations such as ACS enables participants to better understand, support, and shape these environments. This presentation will highlight some of the changes that are occurring in large academic research libraries and the opportunities that are arising as a result of these changes.

CINF 10:  New latitude and attitude in the information industry
Suzan Brown, Vice President, Marketing and Sales, Chemical Abstracts Service, 2540 Olentangy River Road, Columbus, OH 43202, sbrown@cas.org

Abstract
Chemistry, like the other physical sciences, began as an activity dominated by male practitioners, which is perhaps a reflection of the cultural and educational norms that prevailed over the centuries of scientific development. In any case, the demographics of professional chemists have changed slowly but steadily toward greater inclusiveness. By contrast, women have been better represented in the information profession and in electronic publishing in general. What obstacles, advantages, and opportunities does the marketing of information and electronic publishing offer women today? The experience and impressions of one female marketing executive who has worked for two prominent information providers serving very different markets, Mead Data Central and CAS, will be presented. These experiences may be representative or at the very least informative for individuals considering a career in this field.

CINF 11:  The degree is only the first step in reaching for the stars
Lori Kumar, VP Oral Care R&D, Pfizer Inc, 201 Tabor Road, Morris Plains, NJ 07950, lori.kumar@pfizer.com

Abstract
Dr. Lori Kumar has over 18 years of experience in various leadership roles at Pfizer Consumer Healthcare, located in New Jersey. She began her career as a scientist and gradually progressed to her current position as the vice president of oral care research and development, overseeing all global research and development.

She led the development of major new oral care products such as Listerine PocketPaks, Tartar Control Listerine and most recently Natural Citrus Listerine Mouthwash. From the laboratory bench to leadership positions in research, she has been actively involved in every phase of product development, from inception through commercialization. Lori also leads the clinical group which develops and supports new claims for existing as well as new oral care products under development. Her team’s most recent accomplishments include proving that “Listerine is as effective as flossing”.

Lori received her Ph.D. in Analytical Chemistry from New Mexico State University.

CINF 12:  Allowing serendipity in your career planning
Tracy C. Williamson, Office of Pollution Prevention and Toxics (7406), U.S. Environmental Protection Agency, 401 M. Street, SW, Washington, DC 20460, Fax: 202-260-0816

Abstract
The speaker will describe her education path and career choices, both planned and serendipitous, from college to graduate school to staff chemist and then manager in the Federal Government. She will discuss her current job responsibilities at her place of employment, the US EPA. She will also describe the career challenges and opportunities for women scientists at the US EPA and across the Federal Government. This will include a discussion of technical assignments, intra- and inter-agency collaborative opportunities, academic and industry partnerships, and international programs.

CINF 13:  Chemistry: A solid foundation for alternative careers
Janice L. Fleming, V.P., Planning & Development, Cadmus KnowledgeWorks, 940 Elkridge Landing Road, Linthicum, MD 21010, flemingj@cadmus.com

Abstract
Career directions can change for a variety of reasons, and our core skills and talents as chemists actually prepare us in many ways for alternative careers. Some of the skills I've found to be most valued by employers will be highlighted. Many were based in my chemistry background, while others were developed with time and attention. I'll also address the career transition process when it involves stepping into new career areas. The connections from one job to the next may surprise you, and how little you really need to know initially about the work at hand might surprise you even more. Your deeper value is often tied to the skills and creativity born of your chemical education and training.

CINF 14:  Can editorial peer review survive in a digital environment?
Ann C. Weller, Library of the Health Sciences, University of Illinois at Chicago, 1750 W. Polk St., Chicago, IL 60612, Fax: 312-996-9584, acw@uic.edu

Abstract
The digital environment presents the opportunity for scientists to obtain electronic access to research output through a number of nontraditional routes: institutional repositories, researchers’ websites, or journals whose contents are freely available online. Some options present challenges for journals that traditionally published research results after vetting through the editorial peer review process. In addition to preservation and authentication of electronic information, electronic publishers must also assure the integrity of the research results themselves. New publishing endeavors like PubMed Central, physics e-prints, and SPARC (e.g., Organic Letters) may all impact the peer review process. This paper will focus on several aspects of peer review in this new environment: review the standard model, present some working examples of alternatives to the traditional model, evaluate some proposed models, examine the potential impact of electronic publishing, and speculate about the future of peer review and scientific publishing.

CINF 15:  Peer review in the Open Access era
Stevan Harnad, Centre de neuroscience cognitive, Université du Québec à Montréal, Montreal, QC H3C 3P8, Fax: 514-987-8952, harnad@uqam.ca

Abstract
Classical peer review will not change in its essentials in the online/open-access age, but it will become faster, cheaper and more efficient to implement. Papers will be submitted by depositing them on the journal's website or the author's institutional website. Many authors will elect to make their preprints publicly accessible, either on the journal's website or on their institutional website. Referee selection will be aided by online searches of the open-access literature and sometimes by calls-for-referees to targeted specialist lists. Referees will access the manuscripts online and submit their reports online, and editorial dispositions will be done online. Self-selected commentary may sometimes supplement (but not substitute for) the editor-appointed referees. After publication, a similar system can be used for open peer commentary and authors' responses. Article download and citation impact will be tracked from the preprint onward by online scientometric engines.

CINF 16:  Peer review is the worst form of manuscript assessment except for all the other forms that have been tried
Wendy A Warr, Wendy Warr & Associates, 6 Berwick Court, Holmes Chapel, Cheshire CW4 7HZ, United Kingdom, Fax: 011 44 1477 533837

Abstract
"High quality, high impact" is the current mantra of ACS Publications but how does an editor define "high quality" and ensure that high standards are maintained? Peer review is only one part of the quality control process but it is the critical first step and it is inevitably subjective. Editors are only human and so are their reviewers. While protecting reviewers’ anonymity, this paper will briefly examine some of the practices and foibles of an editor and her reviewers, and will outline some lessons learned over the years. Early results of a more general study of the strengths and weaknesses of peer review in chemistry publishing will also be reported. With apologies to Winston Churchill, "No-one pretends that peer review is perfect or all-wise. Indeed...peer review is the worst form of assessment except for all the other forms that have been tried".

CINF 17:  Scientific quality assurance by interactive peer review and public discussion
Ulrich Pöschl1, Kenneth S. Carslaw2, Thomas Koop3, Rolf Sander4, William T. Sturges5, Jonathan P. D. Abbatt6, John T. Jayne7, and Douglas R. Worsnop7. (1) Institute of Hydrochemistry, Technical University of Munich, Marchioninistr. 17, Munich D-81377, Germany, Fax: +49-89-70957999, ulrich.poeschl@ch.tum.de, (2) School of the Environment, University of Leeds, (3) Physical Chemistry II, University of Bielefeld, (4) Air Chemistry Department, Max-Planck Institute of Chemistry, (5) School of Environmental Sciences, University of East Anglia, (6) Lash Miller Chemical Laboratories, University of Toronto, (7) Aerodyne Research, Inc

Abstract
The traditional ways of scholarly publishing and peer review do not live up to the needs of efficient communication and quality assurance in today’s rapidly developing and highly diverse world of science. Substantial improvement can be achieved by an open access two-stage publication process with interactive peer review and public discussion. It enables rapid publication and dissemination of new results in discussion papers, followed by thorough and transparent peer review which is open for input from the scientific community (permanently archived and fully citable), and it leads to final revised papers with maximum quality assurance and information density. This approach has been successfully realized and applied in the interactive scientific journal Atmospheric Chemistry and Physics (ACP, www.atmos-chem-phys.org), which is edited by a globally distributed network of scientists and was launched in 2001. The achievements of ACP confirm that the opportunities for interactive peer review and public discussion in a high quality journal are very much appreciated by authors, referees, and the scientific community. Reference: U. Pöschl, Interactive journal concept for improved scientific publishing and quality assurance, Learned Publishing, 17, 105-113, 2004.

CINF 18:  An insider's view of peer review and publishing on the web
Carol Carr, Chemistry Dept. Univ. of PA, Organic Letters, 231 S. 34th ST, Philadelphia, PA 19104, Fax: 215-573-8256, carrca@sas.upenn.edu

Abstract
Insights on publishing and peer review in the electronic era will be presented from the perspective of the managing editor of a chemistry journal, Organic Letters, the first American Chemical Society journal to accept submissions via the web. These insights are colored by the editor's previous experience on the consumer side of publishing, as a chemistry librarian.

CINF 19:  Comprehensive sampling of diverse chemistry space: A key driver for primary screening libraries
Ying Zhang, Jean Patterson, Libing Yu, and Carmen M. Baldino, Department of Chemistry, ArQule, Inc, 19 Presidential Way, Woburn, MA 01801, Fax: 781-994-0678, yzhang@arqule.com

Abstract
Diversity-oriented synthesis (DOS) provides an efficient strategy for accessing an almost infinite amount of diverse chemistry space with complex and functionally dense natural product-like structures. However, the challenge of rapidly developing the required chemical methods to synthesize these high value libraries still remains. Herein, we describe our standardized chemistry development approach that leverages the systematic sampling of the desired chemistry space to provide much needed efficiencies. Application of this strategy has afforded the efficient development of high throughput chemistry protocols delivering significant collections of purified screening libraries that embody the desired balance of physicochemical properties to support early stage drug discovery programs.

CINF 20:  Modeling a business process in drug discovery with WebSphere Business Integration Modeler to assist solutions development
Jean Wang and Jay Zhao, Healthcare and Life Sciences, IBM, Boca Raton, FL 33487, Fax: 877-209-0085

Abstract
Using a drug discovery scenario, this paper illustrates how IBM WebSphere Business Integration (WebSphere BI) Modeler offerings can be used to create scientific business process models that bring the scientific business process and solutions development together. WebSphere BI Modeler provides a collaboration environment where domain experts, process managers, architects and IT developers share and contribute to the life cycle of solutions design and development.

CINF 21:  MS5 – A general interface to ADME and search results
Matthew J. Walker1, Richard D. Hull2, Suresh B. Singh3, Robert P. Sheridan1, and J. Christopher Culberson1. (1) Molecular Systems, Merck Research Laboratories, 126 E. Lincoln Ave., RY50SW-100, Rahway, NJ 07065, Matthew_Walker@Merck.com, (2) Axontologic, Inc, (3) Concurrent Pharmaceuticals

Abstract
The pharmaceutical industry is under significant pressure to produce more and more medicines while at the same time reducing costs to the patient. This translates to pressure on scientists to decrease the time and expense of the research and development process. As a result, modelers are continually looking for new ways to produce and consume information about chemical entities. To this end, the MS5 tool was originally developed to facilitate structural similarity searching by providing an integrated environment and a similar look-and-feel to several searching methods. As new workflows emerged, this tool was expanded and now covers ADME property lookup and prediction, site of metabolism prediction, and Lipinski-inspired filtering. Implementation of the MS5 web tool, as well as prospects for its future development, are herein discussed.
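
As an illustration of what a Lipinski-style filter might look like, here is a minimal Python sketch based on the commonly cited "rule of five" thresholds (molecular weight at most 500, logP at most 5, at most 5 hydrogen-bond donors, at most 10 hydrogen-bond acceptors). The abstract does not say which rules or thresholds MS5 applies, so the class `Descriptors`, both function names, the `max_violations` parameter, and the example values are assumptions made for illustration only. Descriptors are assumed to be precomputed elsewhere.

```python
from dataclasses import dataclass


@dataclass
class Descriptors:
    """Precomputed properties for one compound (hypothetical container)."""
    mol_weight: float   # molecular weight in daltons
    logp: float         # octanol-water partition coefficient
    h_donors: int       # hydrogen-bond donor count
    h_acceptors: int    # hydrogen-bond acceptor count


def rule_of_five_violations(d: Descriptors) -> int:
    """Count how many of the four rule-of-five thresholds are exceeded."""
    return sum([
        d.mol_weight > 500,
        d.logp > 5,
        d.h_donors > 5,
        d.h_acceptors > 10,
    ])


def passes_filter(d: Descriptors, max_violations: int = 1) -> bool:
    """Keep a compound if it exceeds at most `max_violations` thresholds."""
    return rule_of_five_violations(d) <= max_violations


# Illustrative values only: a small drug-like molecule and a large lipophilic one.
small = Descriptors(mol_weight=180.2, logp=1.2, h_donors=1, h_acceptors=4)
large = Descriptors(mol_weight=900.0, logp=7.0, h_donors=6, h_acceptors=12)
print(passes_filter(small), passes_filter(large))
```

Allowing one violation by default follows the usual reading of the rule; a stricter screen would set `max_violations=0`.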

CINF 22:  Dealing with the concepts of “chemical compound” vs “chemical structure”
Robert S. Pearlman, Yubin Wu, Karl M. Smith, and Brian B. Masek, Optive Research, Inc, 12331-A Riata Trace Parkway -- Suite 110, Austin, TX 78727, bob.pearlman@optive.com

Abstract
Chemical compounds can exist in various protonation states and in various tautomeric states – two aspects of what shall, henceforth, be referred to as the "protomeric state" of a compound. Different protomeric states correspond to different connection tables – hence, different structures – of a compound. A compound’s environment (solvent, membrane, receptor, etc.) determines the protomeric state which the compound is most likely to adopt. Experimentally measured properties reflect Mother Nature’s choice of which structure(s) predominates and determines the measured property value. Regrettably, we often tend to forget these facts when generating or assessing or storing predicted properties of compounds obtained from calculations based on a particular structure we or some other human chose to associate with the compound.

We need to be able to enumerate and consider the various structures which a compound might exhibit in different natural environments. We need to be able to associate measured data with compounds while associating computed data with the particular structures used for the computations. We need a robust method to associate any given structure with its corresponding, canonically identified compound. This presentation will introduce algorithms and software tools which address these needs and others.

CINF 23:  Tapping into the InnoCentive global solver community to obtain innovative solutions for R&D
Poonam Narula, Yi Shi, Peter Lohse, Eugene Ivanov, and Jill Panetta, InnoCentive Inc, Andover, MA 01810, pnarula@innocentive.com

Abstract
We have developed a global scientific community to provide leading pharma, chemical and consumer product companies with solutions to their scientific challenges in chemical and applied sciences. Our communication with chemistry experts all around the globe is facilitated by the networking power of the internet. We have developed a secure website which regulates information and intellectual property transfer. Solution seekers contract with InnoCentive in order to post scientific questions on InnoCentive.com. Each scientific challenge includes a detailed description and requirements, a deadline, and an award amount for the best solution. Scientists worldwide are eligible to register on the website as "Solvers"; registered Solvers currently come from over 150 countries.

This presentation will provide examples of problems in chemical and applied sciences that have been posted as challenges to our solver community, and will show the power of this forum to provide solutions to multinational companies.

CINF 24:  Screening molecules for their drug-like index
Anwar Rayan, Andrea Scaiewicz, Inbal Geva-Dotan, Dinorah Barasch, and Amiram Goldblum, Department of Medicinal Chemistry and the David R. Bloom Center for Pharmacy, Hebrew University of Jerusalem, School of Pharmacy, Jerusalem 91120, Israel, Fax: 972-2-675-8925, anvarr@md.huji.ac.il, amiram@vms.huji.ac.il

Abstract
A new drug-like index (DLI) is presented. It is formed by applying the Iterative Stochastic Elimination (ISE) algorithm (1-4) for constructing a set of options to differentiate between drugs and non-drugs (CMC/ACD) with appropriate training and test sets. The set of best solutions forms the basis for constructing DLI, as a sum over the relative contributions of true and false negatives and positives to each of the solutions. The best k-descriptor combinations out of some 150 descriptors have been picked by ISE, as well as their optimal limits for differentiating between drugs and non-drugs. DLI has been applied to several groups of the MDDR database, resulting in several implications for DLI values of different phases in clinical trials. The use of DLI for constructing combinatorial libraries will be demonstrated.

References: (1) Glick et al., PNAS 2002, 99, 703-708. (2) Glick et al., Proteins 2000, 38, 273-287. (3) Rayan et al., Curr Med Chem 2004, 11, 675-692. (4) Rayan et al., J Mol Graph Model 2004, 22, 319-333.

CINF 25:  A self-organizing algorithm for generating biologically active conformations
Sergei Izrailev1, Huafeng Xu2, and Dimitris K. Agrafiotis1. (1) 3-Dimensional Pharmaceuticals, Inc, 8 Clarke Drive, Cranbury, NJ 08512, (2) Department of Pharmaceutical Chemistry, University of California San Francisco

Abstract
Conformational sampling of small molecule structures has been a widely used technique in structure-based drug design and virtual screening. A stochastic algorithm for conformational sampling is presented. The algorithm generates molecular conformations that are consistent with a set of geometric constraints, which include interatomic distance bounds and chiral volumes derived from the molecular connectivity table. The algorithm repeatedly selects individual geometric constraints at random and updates the respective atomic coordinates toward satisfying the chosen constraint. The ability of the algorithm to generate low-energy and biologically active conformations is discussed.
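
The update loop described above can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the authors' implementation: it handles only interatomic distance bounds (chiral volume constraints are omitted), the function name `refine`, the linearly decreasing step size, and all parameter values are hypothetical choices, and the toy example uses three points rather than a real molecule.

```python
import math
import random


def refine(coords, bounds, steps=20000, rate=1.0, seed=0):
    """Adjust 3D coordinates until pairwise distances fall within bounds.

    coords: list of [x, y, z] starting positions.
    bounds: list of (i, j, lower, upper) distance constraints.
    """
    rng = random.Random(seed)
    coords = [list(p) for p in coords]  # work on a copy
    for step in range(steps):
        # Select one geometric constraint at random.
        i, j, lower, upper = rng.choice(bounds)
        delta = [a - b for a, b in zip(coords[i], coords[j])]
        dist = math.sqrt(sum(d * d for d in delta)) or 1e-9
        if lower <= dist <= upper:
            continue  # constraint already satisfied
        # Move both atoms along their connecting line toward the nearest bound.
        target = lower if dist < lower else upper
        step_size = rate * (1.0 - step / steps)  # assumed decreasing schedule
        scale = 0.5 * step_size * (target - dist) / dist
        for k in range(3):
            coords[i][k] += scale * delta[k]
            coords[j][k] -= scale * delta[k]
    return coords


# Toy example: three points that should end up roughly 1.0 apart from each other.
start = [[0.0, 0.0, 0.0], [2.0, 0.0, 0.0], [0.0, 3.0, 0.0]]
triangle = [(0, 1, 0.9, 1.1), (0, 2, 0.9, 1.1), (1, 2, 0.9, 1.1)]
for point in refine(start, triangle):
    print([round(x, 3) for x in point])
```

Because each step touches only one pair of atoms, the cost per iteration is constant regardless of molecule size; different random seeds or starting coordinates would yield different conformations consistent with the same bounds.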

CINF 26:  Open Access publishing: An overview
Bill Town, Kilmorie Consulting, 24A Elsinore Road, London SE23 2SL, England, bill.town@kilmorie.com

Abstract
Open Access, the free online accessibility of research papers, is already one of the most heated topics in the field of scholarly communications and is currently the focus of a Parliamentary Inquiry in the UK. Open Access can be achieved in two ways: either author self-, institutional- or subject-based archiving of papers in parallel with publication in traditional subscription-based journals, or the conversion of journals themselves to a free-to-access business model, where costs are covered by payment on behalf of the author rather than on behalf of the reader. A brief historical overview of developments in Open Access will be given as an introduction to the two sessions dedicated to considering aspects of Open Access.

CINF 27:  Open Access publishing: The promise and the reality for libraries
Michael Leach, Physics Research Library, Harvard University, 17 Oxford St., Cambridge, MA 02138, Fax: 617-495-0416, leach@eps.harvard.edu

Abstract
Open Access, as a new publishing model, promises to deliver a number of advantages for readers and librarians, including little or no cost for libraries, free access to any interested reader, liberal copyright agreements and extensive permissions for authors. Some envision Open Access as the magic bullet that will solve the serials crisis of the past decade. Certain challenges, though, have arisen already, including the potential high cost of article page charges, which can hinder adoption by research communities; the initiation of "support fees" paid by libraries in lieu of article charges; numerous long-term preservation and archiving issues; and an untried economic model. Other issues are just beginning to arise: "fiscal aggregation" of article fees via libraries; implementation of LOCKSS (Lots Of Copies Keep Stuff Safe) models to enhance persistence of digital articles; the impact on collection development and technical services in libraries; and integration with institutional repositories. This presentation will address these challenges, focusing on the near-future impact of Open Access publishing on library collections, budgets and services.

CINF 28:  How Open Access will affect the small society publisher
Sarah Cooney, Society of Chemical Industry, 15 Belgrave Square, London SW1X 8PS, United Kingdom, Fax: +44 (0)20 7235 0887, sarah.cooney@soci.org

Abstract
SCI is a unique international forum where science meets business on independent impartial ground. Members include consumer representatives, environmentalists, industrialists and academic researchers, and the Society offers a chance to share information between sectors as diverse as food and agriculture, pharmaceuticals, biotechnology and chemicals. Founded in London in 1881, today SCI is a registered charity with members in more than 70 countries. SCI’s charter requires that the Society shall “advance …science for public benefit by…publishing appropriate journals, books and other communications”. SCI is a small publisher, owning 4 peer-reviewed journals that are well-respected niche titles (Journal of the Science of Food & Agriculture is ranked second in its field category). Journals provide a large proportion of the Society’s gross revenue; the surplus (after funding associated editorial and other services) is invested back into the Society. This surplus is vital for funding a range of educational activities, awards and bursaries in both the developed and developing world. This talk will explore some of the issues that face small societies like SCI in the light of the Open Access movement.

CINF 29:  Open Access and scholarly publishing
Peter S Gregory and Robert Parker, Royal Society of Chemistry, Thomas Graham House, Science Park, Milton Road, CB4 0WF Cambridge, United Kingdom, gregoryp@rsc.org

Abstract
Regular science publishing started with the Royal Society's Philosophical Transactions nearly 340 years ago. Today, many of the world's leading scientific societies have publishing operations and many of the best known science journals are published by societies. Societies have a commitment to the dissemination of science, the maintenance of ethical and scientific standards, and the need to guarantee the scientific record, but many societies also rely on the surpluses generated by their publishing operations to fund these and other charitable activities. Do open-access models provide a boost to the activities of societies or do they gnaw away at the very fibre of science?

CINF 30:  Open Access: Early stages of clinical trials
Robert D. Bovenschulte, American Chemical Society Publications Division, 1155 16th Street NW, Washington, DC 20036, Fax: (202) 872-6060, rbovenschulte@acs.org

Abstract
While open access has many definitions, the central meaning is that anyone can have free online access to any scientific or other research-based information, particularly the most up-to-date content of technical journals on the Web. Proponents of open access tout its benefits and advantages over the established subscription model for scholarly publishing. Less often, however, do they examine the various assumptions, concerns, issues, and ramifications that a shift to this new model may entail. This presentation, in a spirit of objectivity, will review the promise of open access and the obstacles that it must overcome to be successful. The presentation will also consider the possibility of unintended and deleterious consequences that may ensue if open access becomes dominant. The message being advanced here is that open access, like a new pharmaceutical drug, should be thoroughly tested in an analog to clinical trials before it is widely adopted. A limited and protracted experiment will allow the scholarly community to determine whether open access can be effective and salutary for scientific communication and publishing.

CINF 31:  Elsevier: A commercial publisher's perspectives on Open Access
Karen Hunter, Elsevier, 360 Park Avenue South, New York, NY 10010, k.hunter@elsevier.com

Abstract
The publishing industry, academia, and scientific research itself have gone through a tidal wave of change since the emergence of the internet. During the early days of the transition to online publishing, many perceived a revolution of science in the making. Today, usage of scientific journals online has doubled year on year, indicating that scientific information is reaching users like never before. At the same time, library budgets continue to be reduced and libraries are forced to make difficult decisions about collection development and access. Various forms of "pay to publish" models are surfacing, as well as alternative distribution models. Now once again, revolution is in the air. This presentation will include proprietary Elsevier research and focus on Elsevier's view, as a commercial publisher, on Open Access and related activities, such as Open Archiving and institutional repositories, as well as the general outlook for the future.

CINF 32:  The gold and the green roads to Open Access
Stevan Harnad, Canada Research Chair, Universite du Quebec a Montreal, Montreal, QC H3C 3P8, Canada, Fax: 514-987-8952, harnad@uqam.ca

Abstract
Open Access (OA) is optimal and inevitable for the research community because it maximises research usage and impact, hence also progress and productivity. There are two roads to OA, however, and Open Access Journal Publishing, the "golden road," is only one of them, and neither the fastest nor the surest, because its cost-recovery model has not yet been tested enough for sustainability and because it would take a long time to convert 24,000 peer-reviewed journals into OA Journals. (Only about 1000 of them, <5%, are OA so far.) The green road to OA is to continue publishing in non-OA journals when there is no suitable gold journal, but to self-archive one's articles in one's own OAI-compliant institutional Eprint Archive. Over 80% of journals have already given their green light to author self-archiving, but self-archiving is still far from using its full potential to provide immediate OA. What are needed are empirical demonstrations of the dramatic causal effect of OA on citation impact along with a formal extension of universities' existing publish-or-perish policies to include providing OA for all the university's refereed research article output.

CINF 33:  DSpace as an institutional repository
Erja Kajosalo and Margret Branschofsky, MIT Libraries, Massachusetts Institute of Technology, 14S-M48, 77 Massachusetts Ave, Cambridge, MA 02139-4307, Fax: 617-253-6365, kajosalo@mit.edu

Abstract
DSpace is an open source software platform that enables the establishment of institutional repositories for digital collections of academic research and educational material. It provides for the capture, description, distribution and preservation of digital materials of various formats. This presentation will briefly describe the benefits of an institutional repository, how the DSpace functionality supports these ideas, as well as emerging policy issues, such as user education, intellectual property, and faculty response. Worldwide implementation of the DSpace software and the future of the DSpace federation will be discussed.

CINF 34:  Citation impact of Open Access journals
Marie E. McVeigh, Product Development Manager, Thomson-ISI, 3501 Market Street, Philadelphia, PA 19104, Fax: 215-387-4706, marie.mcveigh@thomson.com, and James Testa, Director, Editorial Development, Thomson Scientific

Abstract
Open Access (OA) represents one of the most significant changes to scientific publishing in recent years. But has Open Access fundamentally changed the dynamics of scholarly citation? We have studied the citation behavior of OA journals compared to journals of similar size and scope that maintain a traditional subscription model. Year 2002 data on 191 OA journals in the ISI citation databases suggest that offering unrestricted access to journal contents on the web does not, itself, ensure higher citation activity to the journal. We will present data from the year 2003 Journal Citation Reports™, comparing journal citation metrics between OA and non-OA publications. A particular focus will be placed on Chemistry journals, with several aspects of their citation dynamics examined in detail.

CINF 35:  Scholarly communication in the digital environment: Chemistry and chemical engineering
Ian Rowlands, Department of Information Science, City University London, Centre for Information Behaviour and the Evaluation of Research (CIBER), Northampton Square, London EC1V 0HB, United Kingdom, Fax: +44 (0207) 040 8584, ir@soi.city.ac.uk

Abstract
This paper will present the key findings of an international survey of senior journal authors carried out by CIBER, in association with the UK Publishers' Association and NOP, in early 2004. Responses were received from nearly 4,000 corresponding authors, making this possibly the largest survey of author attitudes to open access and other publishing initiatives so far. The main findings reveal a surprising lack of knowledge of open access: 82% of authors claimed to know 'nothing' or 'just a little' about this movement. Authors are generally very reluctant to contribute towards the costs of commercial open publishing (can't pay, won't), and their willingness to pay is probably nowhere near the level required for sustainable commercial services. The paper contrasts these general findings with those obtained in the fields of chemistry and chemical engineering.

CINF 36:  Open Access: Medium and long term implications for academic libraries
David Goodman, Palmer School in Manhattan, New York University, Bobst Library, Room 707, 70 Washington Square South, New York, NY 10012-1379, Fax: 212-995-4072, dgoodman@liu.edu

Abstract
Academic librarians usually agree about a single aspect of journal articles in chemistry and other sciences: there is one system which is not viable in the long run--the present system. There is less agreement about remedies. There are many potential candidate systems but very little experience, especially about long-term viability and compatibility. Those committed to any of the alternatives can construct strong arguments, but in the absence of reliable theory or sufficient experiment, we are left with opinion--and prejudice. This talk shall summarize what I regard as the range of plausible future prospects, with mention of the views of those who see it differently.

CINF 37:  Cheminformatics' role in the pharmaceutical industry
Randal Chen, Abbott Laboratories, 100 Abbott Park Road, Dept. R42T, Building AP10-2, Abbott Park, IL 60064, randal.chen@abbott.com

Abstract
The productivity gap in the pharmaceutical industry is typified by the statistic that only 10% of the compounds that enter development will make it to the marketplace. Many factors determine success or failure, but a key and common strategy of the industry is to effectively employ computational methods, such as cheminformatics, to enhance the odds of success. The presentation will cover: 1) Key technological issues – what cheminformatics technologies are currently emphasized. 2) Technological framing of the problems – what types of questions the industry needs to answer. 3) Types of individuals and skill sets needed to tackle these problems – what background and training emphasis schools should provide their students. 4) Influence of organizational models and structures – how organizational factors (such as size, culture, financial state) affect an employer’s approach to a problem and how that affects the role.

CINF 38:  Masters level training in chem(o)informatics in the UK
Peter Willett, Department of Information Studies, University of Sheffield, Western Bank, Sheffield S10 2TN, United Kingdom, p.willett@sheffield.ac.uk, and Helen Cooke, GlaxoSmithKline, 709 Swedeland Road, King of Prussia, PA 19406, helen.2.cooke@gsk.com

Abstract
In the late 1990s the Engineering and Physical Sciences Research Council (EPSRC) in the UK began to seek proposals for funding for development of Masters Training Package (MTP) courses in areas where a shortage of suitably qualified graduates was perceived. One area identified was cheminformatics, and both the Department of Information Studies at the University of Sheffield and the Chemistry Department at the University of Manchester Institute of Science and Technology (UMIST) were awarded funding for a five-year period. While both courses embrace the MTP philosophy of engaging industry and exploring new delivery methods to assist the sustainability of the courses, they have their individual “flavours”, reflecting the differing research and teaching interests in the respective departments. This presentation examines the content and evolution of the courses and reports on the careers of some of the students, attitudes of industry and future prospects.

CINF 39:  Chemical informatics and bioinformatics programs at Indiana University
Gary D. Wiggins, School of Informatics, Indiana University, 901 East Tenth Street, Bloomington, IN 47408-3912, Fax: (812) 856-4764, wiggins@indiana.edu

Abstract
The graduate program of instruction in informatics at Indiana University includes paths to an MS degree in bioinformatics and chemical informatics. The latter has a specialized track on the Indianapolis campus in laboratory informatics. In existence only since 1999, the IU School of Informatics trains students in the application of computer science to various disciplines. The programs will be described and plans for a PhD program in science informatics, to be implemented in the fall of 2005, will be outlined.

CINF 40:  Advances in enterprise-wide management of spectral data
Marie Scandone1, Gregory M. Banik2, Deborah Kernan3, and Victoria Rafalovsky2. (1) Informatics Division, Bio-Rad Laboratories, Inc, 3316 Spring Garden Street, Philadelphia, PA 19104, Fax: 215-662-0585, marie_scandone@bio-rad.com, (2) Bio-Rad Laboratories, Informatics Division, (3) Informatics Division, Bio-Rad Laboratories

Abstract
Proper management of proprietary data is complex, yet crucial. In the pharmaceutical, chemical, and petrochemical industries, samples are typically analyzed and characterized using multiple techniques including, but not limited to, IR, NMR, UV/Vis, GC, MS, Raman, and NIR. As an added complication to the problem of organizing multi-technique data, data within a given technique may be taken from instruments supplied by more than one vendor. The result is a complex matrix of data types and vendor formats.

Organizations that must store and serve this data face overlapping issues involving data cross-referencing, data reporting, correlation with structure and property information, user training, and software control & management.

This paper will discuss these challenges and describe a solution to integrate data types; search, retrieve, and analyze the data; and consequently communicate information to colleagues via sophisticated reporting features.

CINF 41:  AQUIRE: ArQule's united repository and information exchange system
Rojnuckarin Atipat and Sergio H. Rotstein, ArQule, Inc, 19 Presidential Way, Woburn, MA 01801, Fax: 781-376-6019, arojnuckarin@arqule.com

Abstract
The informatics infrastructure at ArQule consists of a set of highly specialized scientific database systems and a central information repository that unifies the distributed information. The central repository is ArQule's United Repository and Information Exchange (AQUIRE) system, which integrates the data from array production, reagent procurement and inventory, analytical and biological database systems. It gives our scientists the ability to query and report on data spanning multiple systems from a single application. Automatic propagation of data ensures that information (e.g. plates, compounds) entered in one system will also be available in the other systems. A role-based security scheme allows the system to enforce any necessary data access restrictions.

CINF 42:  Molecular docking using ArgusLab: An efficient shape-based search algorithm and an enhanced XScore scoring function
Mark A. Thompson, Planaria Software, Seattle, WA 98155, mark@planaria-software.com

Abstract
We have developed an efficient grid-based docking method in ArgusLab that approximates an exhaustive search. We employ a simple geometric fit of a flexible ligand in the binding site at carefully chosen search points within the free volume of the binding site cavity, along with incremental construction of the ligand’s torsions in a breadth-first order that maximizes the early rejection of unproductive pose fragments and greatly enhances the efficiency of the conformational search. We have coupled this with a simple scoring function, based on an enhancement of the XScore(HP) method of Wang and coworkers. Our enhancements allow the scoring function to be used as the objective function during docking and to include waters found in the X-ray structure. Using the 100-target/100-pose sets of Wang et al., we obtain 80% agreement of the lowest scoring pose being within 2.5 Angstrom RMSD of the X-ray structure, and a correlation coefficient of 0.61 between binding scores and experimentally determined binding affinities. Typical docking times for ligands with 10-20 torsions are 30-90 seconds on a 2.4 GHz laptop computer. The 1cbx/benzylsuccinate structure (5 ligand torsions) docks in less than 10 seconds, which is typical for systems of this size. ArgusLab’s docking method is implemented for both interactive and virtual high-throughput screening of ligand databases. Simplicity in the sample preparation is stressed in our design, as well as a configurable parameter set for the scoring function that allows the user to select modified parameters at runtime. ArgusLab implements a rich graphical presentation of the results of a docking calculation, including easy navigation between the poses in the final set located in the search (typically, the 50 lowest energy poses are retained for analysis).
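
As an illustration of the accuracy criterion quoted above (lowest scoring pose within 2.5 Angstrom RMSD of the X-ray structure), the following minimal sketch computes an RMSD and a success rate. It is not ArgusLab code: the function names, the coordinate format (lists of (x, y, z) tuples with atoms in the same order and the same reference frame) and the toy coordinates are assumptions made for the example.

```python
import math

def rmsd(pose, reference):
    """RMSD between two equally ordered lists of (x, y, z) atoms.

    No superposition is applied: a docked pose and the crystal ligand are
    assumed to share the receptor's coordinate frame.
    """
    if not pose or len(pose) != len(reference):
        raise ValueError("pose and reference need the same, non-zero atom count")
    squared = sum((a - b) ** 2
                  for p, r in zip(pose, reference)
                  for a, b in zip(p, r))
    return math.sqrt(squared / len(pose))

def success_rate(results, cutoff=2.5):
    """Fraction of (top pose, crystal ligand) pairs with RMSD <= cutoff."""
    hits = sum(1 for pose, ref in results if rmsd(pose, ref) <= cutoff)
    return hits / len(results)

if __name__ == "__main__":
    # Toy two-atom "ligands"; real data would come from docking output.
    crystal = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
    near = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)]   # shifted 1 Angstrom in z
    far = [(0.0, 0.0, 3.0), (1.0, 0.0, 3.0)]    # shifted 3 Angstrom in z
    print(success_rate([(near, crystal), (far, crystal)]))
```

The 2.5 Angstrom default mirrors the threshold in the abstract; any other cutoff can be passed explicitly.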

CINF 43:  Write new books or buy/translate famous books in developing countries?
Stefan Perisanu, Laboratory of General Chemistry, Polytechnic University of Bucharest, 1 Polizu str, Bucharest 78126, Romania, Fax: 40-21-3111796, s_perisanu@chim.upb.ro

Abstract
In developing countries such as Romania, universities do not have enough money to buy the best books, and especially not enough copies, to provide students with the necessary documentation. Given this situation, most professors write their own book for each course. Although many good books have been written in my country (some of them are appreciated even by students studying in western countries), alternative sources of information are more than necessary, not only for students but also for teachers. A different selection of information, more up-to-date information, or a more appropriate pedagogical approach are some of the reasons for this need. Some solutions to this problem are envisaged: copyright transfer and/or free Internet access to some classical textbooks, documentation stages for students and professors, and co-operation between libraries. The author welcomes a discussion to find the best alternative(s).

CINF 44:  Development of protein moment descriptors and pH-dependent descriptors for prediction of protein affinity in hydrophobic interaction chromatography systems
Qiong Luo1, Asif Ladiwala2, Dechuan Zhuang1, N Sukumar1, Curt M Breneman1, and Steve M. Cramer2. (1) Department of Chemistry, Rensselaer Polytechnic Institute, Cogswell 306, 110 8th St, Troy, NY 12180, Fax: 518 276-4045, luoq@rpi.edu, brenec@rpi.edu, (2) Department of Chemical and Biological Engineering, Rensselaer Polytechnic Institute

Abstract
Hydrophobic Interaction Chromatography (HIC) is commonly employed in the biotech industry for the downstream processing of proteins and other biomolecules. The selectivity of this technique can be optimized by varying the composition of the stationary phase as well as the pH of the mobile phase. In the present work, the effect of resin chemistry on the binding affinity of proteins in HIC is investigated using high-throughput experimentation and Quantitative Structure-Retention Relationship (QSRR) modeling. Linear gradient experiments were carried out for 36 proteins on four different HIC resins having different backbone and ligand chemistries – namely Phenyl Sepharose, Butyl Sepharose, Phenyl 650M and Butyl 650M. A number of sets of novel protein descriptors are developed in this study, including moment descriptors and pH-dependent descriptors, which are based on the RECON/TAE method and MOE descriptors. In the development of protein moment descriptors, moments of various physico-chemical property distributions of proteins up to and including second order are calculated based on protein crystal structures using either all the protein atoms or only surface atoms identified by Delaunay Tessellation. Restricting the descriptors to surface atoms eliminates the contributions of atoms on deeply buried residues. Support Vector Machine (SVM) regression has been employed to obtain predictive QSRR models. The predictive ability of these models is verified for a randomly selected test set of proteins not included in the training of the model. The relative importance of each selected descriptor in the final models is provided by star plot analysis and correlation matrices. Once these predictive models have been validated, they can be used as an automated prediction tool for Virtual High-Throughput Screening (VHTS).
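
To make "moments of a property distribution up to and including second order" concrete, here is a minimal sketch under stated assumptions. The abstract does not give the authors' exact definitions, so the choice of origin (the geometric center of the atoms), the function name and the toy input are all hypothetical, and this is not the RECON/TAE implementation.

```python
def property_moments(coords, values):
    """Zeroth, first and second moments of a per-atom property.

    coords: list of (x, y, z) atom positions
    values: per-atom property values (for example partial charges)
    Moments are taken about the geometric center of the atoms, which is an
    assumption of this sketch rather than a definition from the abstract.
    """
    n = len(coords)
    if n == 0 or n != len(values):
        raise ValueError("coords and values need the same, non-zero length")
    center = [sum(c[k] for c in coords) / n for k in range(3)]
    shifted = [[c[k] - center[k] for k in range(3)] for c in coords]
    # Zeroth moment: total amount of the property.
    m0 = sum(values)
    # First moment: a dipole-like vector.
    m1 = [sum(v * r[k] for v, r in zip(values, shifted)) for k in range(3)]
    # Second moment: a symmetric 3x3 tensor.
    m2 = [[sum(v * r[j] * r[k] for v, r in zip(values, shifted))
           for k in range(3)] for j in range(3)]
    return m0, m1, m2

if __name__ == "__main__":
    # Toy two-atom system with opposite property values.
    m0, m1, m2 = property_moments([(-1.0, 0.0, 0.0), (1.0, 0.0, 0.0)], [-1.0, 1.0])
    print(m0, m1, m2[0][0])
```

Passing only the surface atoms instead of all atoms would give the surface-restricted variant that the abstract mentions.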

CINF 45:  Building classification models for DMSO solubility: Comparison of five methods
Jing Lu and Gregory A. Bakken, Scientific Computing Group, Groton Computational Chemistry, Pfizer Global R&D - Groton Labs, Eastern Point Road, Groton, CT 06340, Fax: (860) 7153149, jing.j.lu@pfizer.com

Abstract
It is now increasingly recognized that DMSO solubility is a problem at least as serious as compound stability in combinatorial libraries, since it may cause artifacts in library screening, and thereby negatively impact screening efficiency. It is desirable to have an effective in silico model for estimation of DMSO solubility to reveal, prior to screening runs, any poorly soluble compounds that are incompatible with assay protocols. In this study DMSO solubility data at 30 mMol were gathered for 33,329 Pfizer compounds. Five linear and nonlinear classification methods were evaluated and compared on the data set using a set of 200 2D descriptors. Five predictive binary classification models for estimation of the DMSO solubility class of organic compounds were derived and validated. The results show high accuracy for ensembles of decision trees (specifically, boosting and random forests). Additionally, methods like LDA and BinaryQSAR, when used in conjunction with feature selection methods, provide accurate models. While not quantitative in nature, models such as these are effective for screening compounds to be stored in DMSO for potential solubility problems.

CINF 46:  Docking studies on hERG model
Anna Maria Capelli1, Aldo Feriani2, Frank E Blaney3, Diego Dal Ben1, Giovanna Tedesco1, and Alfonso Pozzan4. (1) Computational, Analytical and Structural Sciences, GlaxoSmithKline, Medicines Research Centre, Via Alessandro Fleming 4, 37135 Verona, Italy, Fax: +39 0459218196, Anna-Maria.M.Capelli@gsk.com, (2) GlaxoSmithKline Research Centre, (3) Computational, Analytical & Structural Studies, GlaxoSmithKline, (4) Chemistry Department, Computational Chemistry and Compound Diversity Unit, Verona Research Centre, Glaxo Smith Kline S.p.A

Abstract
Drug-induced QT interval prolongation, which is a major risk factor for torsades de pointes arrhythmia, can be related to the inhibition of the K+ channel encoded by the human ether-a-go-go related gene (hERG). As a consequence, evaluation of potential pharmacological liability associated with hERG is an important aspect of the drug discovery process. In this study, docking experiments in an in-house hERG receptor model were performed to study a set of hERG standards. The poses obtained were then validated against site-directed mutagenesis experiments reported in the literature, and the affinity of the ligands was correlated with empirical scoring functions. The method was then used to perform docking experiments on in-house ligands and to drive drug design activities.

CINF 47:  LCOLI -- efficient generation of diverse combinatorial libraries
Rong Chen and Alan Long, Department of Chemistry and Chemical Biology, Harvard University, 12 Oxford St., Cambridge, MA 02138, chen@lhasa.harvard.edu, aklong@fas.harvard.edu

Abstract
The program LCOLI (an acronym for LHASA for Compound Libraries) is a new module for the LHASA (Logic and Heuristics Applied to Synthetic Analysis) suite that predicts possible reactions between input starting materials and generates virtual combinatorial libraries. The libraries created using this knowledge-based diversity-oriented synthesis (DOS) approach achieve a high degree of chemical diversity and complexity, and come with rich information on synthetic accessibility and predictions of potential toxicity. Library generation is highly efficient and holds great promise for new drug discovery. Samples of analyses will be presented.

CINF 48:  Science literacy and information literacy in a middle school
Jennifer E. Lewis1, Troy D. Sadler2, Teresa Eckart1, and Katherine M. Whitley3. (1) Department of Chemistry, University of South Florida, 4202 E. Fowler Ave SCA400, Tampa, FL 33620-5250, Fax: 813-974-3203, jlewis@chuma1.cas.usf.edu, (2) School of Education, Indiana University, (3) Tampa Library, University of South Florida

Abstract
Chemistry literacy and information literacy were integrated in a project involving University of South Florida faculty, graduate students and a local community middle school. The project was a pilot plan for enhancing science and information literacy in middle school students along with their families. The project introduced students and their families to the use of scientific information to make decisions about a socioscientific issue--namely, global warming. In addition to seeking and evaluating scientific information pertinent to the issue, students and parents performed laboratory activities investigating the properties of gases. The pilot was successful; interviews revealed that both parents and students valued the experience and that the activities were seen as an integrated whole. Additional study is required to assess the effectiveness of the activities within the project, specifically with regard to the integration of subject-specific science literacy and information literacy.

CINF 49:  Dock odysseys. I. The creation of 3D searchable small molecule databases with consideration of molecular chirality
Zengjian Hu and William M. Southerland, Department of Biochemistry and Molecular Biology, Howard University College of Medicine and the Howard University Drug Discovery Unit, 520 West Street, Northwest, Room 324, Washington, DC 20059, zhu@howard.edu

Abstract
High-throughput docking (HTD) is an important source of new leads in the drug discovery process. The quality of HTD-generated lead compounds is limited by the chemical database used to generate the candidate molecules. In addition to structural diversity, molecular chirality should be considered when creating a 3D searchable chemical database. The consideration of molecular chirality is an intuitive and simple, but valuable, approach to improving the quality of chemical databases as well as HTD, since molecular chirality has a major influence on the pharmacological, pharmacokinetic, and toxicological actions of therapeutic agents. As far as we know, although there is rapid growth of public, commercial, and proprietary small-molecule databases available for HTD, there have been no published reports on the investigation and creation of chiral chemical databases until now. In this report, we present for the first time the creation of 3D searchable small molecule databases with the consideration of molecular chirality (*This work is supported by grant RCMI-NIH 2G12RR03048).

CINF 50:  Dock odysseys. II. The evaluation of AutoDock program and its comparison with other docking programs
Zengjian Hu1, Shaomeng Wang2, and William M. Southerland1. (1) Department of Biochemistry and Molecular Biology, Howard University College of Medicine and the Howard University Drug Discovery Unit, 520 West Street, Northwest, Room 324, Washington, DC 20059, zhu@howard.edu, (2) Intel Med, The University of Michigan

Abstract
In recent years, computational high-throughput docking (HTD) has emerged as a very powerful tool for identifying novel lead compounds. In principle, HTD should discover all of the ligands of interest in a database, but in practice, HTD suffers from false positives and false negatives. In this study, we evaluated the AutoDock program for its quality and accuracy in identifying and predicting ligand binding modes. AutoDock is one of the most widely used docking programs in computational binding studies. Our results show that AutoDock is able to predict protein-ligand complex structures with reasonable accuracy and speed. We also compared the AutoDock program with three other popular programs, DOCK, FlexX and GOLD. We found its performance better than that of these three programs. Its use in HTD should enhance efficiency in the discovery of lead compounds (* This work is supported in part by grant RCMI-NIH 2G12RR03048).

CINF 51:  Separation and determination of urea and methyl carbamate by reversed phase high performance liquid chromatography
Fulin Mao, Tinghua Wu, Ya Liu, Zhuoqun Zheng, Qineng Zhang, and Fei Chen, Institute of Physical Chemistry, Zhejiang Normal University, Jinhua 321004, China, Fax: 086-579-2282595, mflchina@163.com, wth3907@163.com, sky48@mail.zjnu.net.cn, zhengzhuoqun1980@yahoo.com, zpy22@163.com, xinyue9877@163.com

Abstract
Reversed-phase high performance liquid chromatography (RP-HPLC) was employed in the separation and determination of urea and methyl carbamate (MC). To separate urea from methyl carbamate, two C18 columns connected in series, rather than a single column, were used. The experiment was run with a mobile phase of V (methanol): V (water) = 1:1, a flow rate of 0.5 mL/min and an injection volume of 5 µL. The two components were quantified by the external standard method at a wavelength of 215 nm. Under the conditions listed above, the plot of peak area (A) versus mass percentage (X%) was linear, with the linear regression equation A = 148565X - 39384 (R² = 0.9995) for urea and A = 69055X - 90493 (R² = 0.9985) for MC. The method may therefore be suitable for analyzing the synthesis of methyl carbamate by alcoholysis of urea.
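
As a worked illustration of the external standard quantification, the two regression lines reported above can be inverted to recover a mass percentage from a measured peak area. This is a minimal sketch: the slopes and intercepts are those quoted in the abstract, while the function name and the peak areas in the example are hypothetical.

```python
# Calibration lines quoted in the abstract: A = slope * X + intercept,
# where A is the peak area and X is the mass percentage.
CALIBRATION = {
    "urea": (148565.0, -39384.0),
    "methyl carbamate": (69055.0, -90493.0),
}

def mass_percent(compound, peak_area):
    """Invert A = slope * X + intercept to get X from a measured peak area."""
    slope, intercept = CALIBRATION[compound]
    return (peak_area - intercept) / slope

if __name__ == "__main__":
    # Hypothetical peak areas, chosen only to exercise the formula.
    for name, area in [("urea", 703441.0), ("methyl carbamate", 254782.0)]:
        print(f"{name}: X = {mass_percent(name, area):.2f}%")
```

The inversion is only meaningful inside the concentration range over which the calibration was measured, which the abstract does not state.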

CINF 52:  From synthesis planning to combinatorial chemistry – applications of the LHASA suite
Alan Long1, Rong Chen2, Craig A. Marby2, Alexander P. Sukharevsky3, and Keith Ohm2. (1) Harvard University, 1414 Massachusetts Ave., Room 430, Cambridge, MA 02138, aklong@fas.harvard.edu, (2) Department of Chemistry and Chemical Biology, Harvard University, (3) Aventis Pharmaceuticals

Abstract
The LHASA (Logic and Heuristics Applied to Synthetic Analysis) program for target-oriented retrosynthetic analysis has evolved since the early 1970s into a suite of programs that includes DEREK (for toxicology prediction), PROTECT (for protective-group analysis), APSO (for teaching organic synthesis), and, most recently, LCOLI (for diversity-oriented generation of compound libraries). Improvements in hardware and software have shifted the emphasis in synthetic analysis from interactive, user-driven strategy and tactic selection to non-interactive processing requiring intelligent screening of results. One approach to optimal route selection using analysis of synthetic accessibility and complexity will be discussed. The availability of multiple tools in the suite also allows a more comprehensive approach to synthesis planning. For example, the virtual libraries generated by LCOLI using its knowledge-based diversity-oriented synthesis (DOS) approach can be screened automatically and ranked according to synthetic accessibility and potential toxicity of the products. Applications to drug design will be discussed.

CINF 53:  Models for computer reasoning under uncertainty
Philip N. Judson, LHASA Ltd, Department of Chemistry, University of Leeds, Leeds LS2 9JT, United Kingdom, Fax: +44 (0) 113 343 6535, judson@dircon.co.uk

Abstract
Human decision makers appear to reach their conclusions through reasoning based on weighing the arguments for and against propositions. In spite of their fallibilities, humans perform well enough by this means for evolutionary success and so their methods deserve consideration as models for computer reasoning. Building on earlier work on the logic of argumentation, at LHASA we have developed models for reasoning about the likelihood that an event or circumstance will come about and about whether some events are more or less likely than others. This talk will outline the background to our work and present some of the key features of the models we have developed. We use them in programs for predicting toxicity and xenobiotic metabolism but they are suitable for any area where predictions have to be based on uncertain information.

CINF 54:  Electronic documents in chemistry, from ChemDraw 1.0 to present
Stewart D. Rubenstein, CambridgeSoft Corp, 100 CambridgePark Dr, Cambridge, MA 02140, Fax: 617-588-9380, srubenstein@cambridgesoft.com

Abstract
Although graphical user interfaces had been used for a number of years, the development of the Macintosh and the laser printer in the mid-80's made it possible to develop ChemDraw and other programs which became widely available to non-specialist chemists. More recently, the deployment of high-speed wide-area networks has made possible global sharing of electronic documents and other information. This talk will trace the history of some of these developments, and discuss opportunities today for deeper collaboration through the use of secure, scalable, global information systems now available.

CINF 55:  A novel method for optimizing subgraph isomorphism algorithms such that 2D stereochemical descriptors are efficiently processed in a molecule or reaction retrieval system
Anthony P Cook1, A Peter Johnson1, and Daniel G Thomas2. (1) School of Chemistry, University of Leeds, Leeds LS2 9JT, United Kingdom, Fax: 44 113 3436465, tony@bci.gb.com, (2) BCI Ltd

Abstract
Subgraph isomorphism algorithms are an essential part of molecule and reaction retrieval systems. As well as solving queries that express molecular constitution, an additional requirement is that the absolute or relative configuration of stereocentres expressed in the query must also be considered when determining a graph match. In the algorithms we present in this paper, this problem has been addressed in three ways: a) the development of a useful and generic stereochemical descriptor that simplifies the comparison of most types of stereochemical geometry; b) the incorporation of the stereochemical descriptor comparison step directly into the graph isomorphism algorithm; c) a new feature that uses a “best first” planning algorithm to determine dynamically the most efficient order in which query atoms and stereo descriptors are tried in the subgraph isomorphism algorithm.
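
The idea of choosing the next query atom dynamically can be illustrated with a minimal backtracking sketch. It is not the authors' algorithm: it ignores bond orders and stereochemical descriptors entirely, and it substitutes a simple "fewest candidates first" heuristic for the "best first" planner mentioned in the abstract. The graph encoding (a dict mapping atom id to an element symbol and a set of neighbour ids) and the function name `subgraph_match` are assumptions made for illustration.

```python
def subgraph_match(query, target):
    """Return one mapping query atom -> target atom, or None if there is none.

    Both graphs are dicts: atom_id -> (element, set_of_neighbour_ids).
    Hypothetical encoding; bond orders and stereochemistry are not modelled.
    """

    def candidates(q_atom, mapping):
        q_elem, q_nbrs = query[q_atom]
        used = set(mapping.values())
        found = []
        for t_atom, (t_elem, t_nbrs) in target.items():
            if t_atom in used or t_elem != q_elem:
                continue
            # Every already-mapped query neighbour must land on a target neighbour.
            if all(mapping[n] in t_nbrs for n in q_nbrs if n in mapping):
                found.append(t_atom)
        return found

    def search(mapping):
        if len(mapping) == len(query):
            return dict(mapping)
        # Dynamic ordering: recompute candidates and try the most constrained atom next.
        options = {q: candidates(q, mapping) for q in query if q not in mapping}
        q_atom = min(options, key=lambda a: len(options[a]))
        for t_atom in options[q_atom]:
            mapping[q_atom] = t_atom
            result = search(mapping)
            if result is not None:
                return result
            del mapping[q_atom]  # backtrack
        return None

    return search({})


# Ethanol heavy-atom skeleton C-C-O as target, C-O as query.
ethanol = {0: ("C", {1}), 1: ("C", {0, 2}), 2: ("O", {1})}
c_o = {0: ("C", {1}), 1: ("O", {0})}
match = subgraph_match(c_o, ethanol)
```

Here the oxygen is tried first because it has only one candidate in the target, which immediately restricts the carbon to the atom bonded to it.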

CINF 56:  Strategies and challenges in predictive toxicology
Glenn J. Myatt, Paul E. Blower, Kevin P. Cross, Wayne P. Johnson, and Chihae Yang, Leadscope, Inc, 1393 Dublin Road, Columbus, OH 43215, Fax: 614 675 3732, gmyatt@leadscope.com

Abstract
There are numerous challenges to the development of a comprehensive strategy for predictive toxicology. Access to high-quality data from which accurate predictive models can be generated continues to be a major impediment. An approach to domain-intelligent integration of disparate sources, both electronic and non-electronic, will be described. A toxicology controlled vocabulary, ToxML, based on the XML standard is central to the integration. Prior to building any predictive model, an assessment of the data is required to transform complex hierarchical XML data into decision point data. This process usually involves an expert judgment. Only once the data has been selected, integrated and assessed can model building start, and even then there is no guarantee that the data can be modeled. An approach to building and applying predictive models will be described across the stages of the workflow: data integration, assessment, and subsetting to prepare a modelable data set.

CINF 57:  Virtual screening and similarity searching using binary kernel discrimination
Peter Willett, Department of Information Studies, University of Sheffield, Western Bank, S10 2TN Sheffield, United Kingdom, Fax: +44-114-2780300, p.willett@sheffield.ac.uk

Abstract
Binary kernel discrimination (BKD) is a machine learning technique that has recently been suggested for use in virtual screening. A molecule is scored by calculating its similarities with sets of known active and known inactive molecules, the number of these similarities contributing to the overall score for that molecule being determined by an optimisable parameter. This paper reports the use of BKD in simulated virtual screening experiments with public and corporate datasets in which the molecules are characterised by 2D fragment bit-strings. Our results suggest that BKD is fully competitive with existing approaches to 2D virtual screening in terms of its ability to prioritise compounds for biological testing. We also demonstrate that a simple modification of the method provides an effective way of carrying out similarity searches when multiple reference structures are available.
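
The scoring idea can be sketched as a ratio of summed kernel similarities to the active and inactive training sets. The sketch below is an illustration under stated assumptions, not the implementation used in the paper: the kernel form (a binomial-style function of the Hamming distance between bit-strings), the smoothing parameter `lam`, and the names `kernel` and `bkd_score` are assumptions, and the abstract does not specify which parameter is optimised.

```python
def hamming(a, b):
    """Number of differing bits between two fingerprints stored as ints."""
    return bin(a ^ b).count("1")


def kernel(a, b, n_bits, lam):
    """Assumed kernel: lam**(bits that agree) * (1 - lam)**(bits that differ)."""
    d = hamming(a, b)
    return (lam ** (n_bits - d)) * ((1.0 - lam) ** d)


def bkd_score(query, actives, inactives, n_bits, lam=0.9):
    """Ratio of summed similarity to actives over summed similarity to inactives.

    lam is a hypothetical smoothing parameter with 0.5 < lam < 1; in practice
    it would be tuned on training data.
    """
    if not 0.5 < lam < 1.0:
        raise ValueError("lam must lie strictly between 0.5 and 1")
    num = sum(kernel(query, x, n_bits, lam) for x in actives)
    den = sum(kernel(query, x, n_bits, lam) for x in inactives)
    return num / den


# Toy 8-bit fingerprints, invented purely for illustration.
actives = [0b11110000, 0b11100000]
inactives = [0b00001111, 0b00000111]
score_like_active = bkd_score(0b11110000, actives, inactives, n_bits=8)
score_like_inactive = bkd_score(0b00001111, actives, inactives, n_bits=8)
```

A molecule resembling the actives scores well above 1 and one resembling the inactives scores well below 1, so ranking a database by this score prioritises compounds for testing.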

CINF 58:  Virtual screening using reduced graphs
Valerie J. Gillet, Information Studies, University of Sheffield, Regent Court, 211 Portobello Street, Sheffield S1 4DP, United Kingdom, Fax: +44 (0) 114 2780 300, v.gillet@sheffield.ac.uk

Abstract
Many different ligand-based virtual screening methods have been developed using both 2D and 3D descriptors. Both types of descriptors have their limitations. The 2D methods have a tendency to select structural analogues and thus do not easily permit the identification of new lead series. The 3D methods, on the other hand, have been shown to result in greater diversity in the hitlists; however, they are limited by the need to handle conformational flexibility. We have developed virtual screening methods in which the molecules are characterised by reduced graphs which summarise the features of the molecules while retaining the topology between the features. Thus reduced graphs can be thought of as topological pharmacophores. Two different approaches have been investigated for quantifying the similarity of reduced graphs. In one approach, the reduced graphs are mapped to fingerprints before calculating the similarity; in the other, graph-matching methods are applied directly to the reduced graphs. Here, the performance of the reduced graphs is compared with conventional descriptors in simulated screening experiments.
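
The fingerprint route can be illustrated with a short sketch. The abstract does not say which similarity coefficient is applied to the reduced-graph fingerprints; the Tanimoto coefficient is used here only as a common choice, and the feature labels and the name `tanimoto` are invented for illustration.

```python
def tanimoto(fp_a, fp_b):
    """Tanimoto coefficient between two fingerprints given as sets of features."""
    a, b = set(fp_a), set(fp_b)
    if not a and not b:
        return 0.0  # convention chosen here for two empty fingerprints
    return len(a & b) / len(a | b)


# Hypothetical reduced-graph features: pairs of node types and their separation.
mol_1 = {("donor", "ring", 2), ("acceptor", "ring", 3), ("ring", "ring", 1)}
mol_2 = {("donor", "ring", 2), ("acceptor", "ring", 3), ("acceptor", "donor", 4)}
similarity = tanimoto(mol_1, mol_2)
```

Two of the four distinct features are shared, so the similarity is 0.5.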

CINF 59:  Conformational sampling in a protein-ligand complex environment
Zsolt Zsoldos, Research and Development, SimBioSys Inc, 135 Queen's Plate Dr, Suite 520, Toronto, ON M9W 6V1, Canada, Fax: 416-741-5084, zsolt@simbiosys.ca

Abstract
Adequate conformational sampling of small molecule ligands is of high importance in flexible ligand docking as well as in de novo ligand design applications. There are software tools available to generate low energy conformers of ligands, i.e. distinct local minima of the conformational strain energy function of the ligand. Docking and de novo design software often use such sets of conformers or generate them on the fly by use of dihedral angles normally associated with low energy conformers. Conformational statistics of over 5000 high resolution (less than 2.5 Å) crystal structure complexes from the PDB will be presented. The experimental data demonstrates much wider conformational variation in a protein-ligand complex environment. Requirements are derived from the data for the necessary conformational sampling space and resolution to reproduce the experimental data within acceptable bounds for steric violations. The efficiency of various algorithms to provide the adequate conformational sampling will be compared.

CINF 60:  Informatics aiding drug discovery - ADME evaluation
William L. Jorgensen, Department of Chemistry, Yale University, New Haven, CT 06520-8107, Fax: 203-432-6299, william.jorgensen@yale.edu

Abstract
Computational tools have been developed for the rapid prediction of properties of organic molecules that are relevant to their potential as drugs. A small number of physically significant descriptors, especially surface area components and hydrogen-bonding potentials, are computed from an input three-dimensional structure. Simple linear regression equations have been developed from experimental datasets using these descriptors for the accurate prediction of a variety of properties including aqueous solubility, octanol/water partition coefficient, free energy of hydration, Caco-2 and MDCK cell permeabilities, serum protein binding, and brain/blood partitioning. Other developments include a rule-based system for prediction of primary metabolites, while a more quantitative approach has been used in the computation of pKa values (acidities). The algorithms have formed the basis of the QikProp program.

Prediction of Drug Solubility from Structure. W. L. Jorgensen and E. M. Duffy, Adv. Drug Delivery Reviews, 54, 355-366 (2002).

CINF 61:  A role for chemoinformatics in structure-based de novo ligand design
A. Peter Johnson, Krisztina Boda, Tamas Lengyel, Shane Weaver, and Aniko Vigh, School of Chemistry, University of Leeds, Leeds LS2 9JT, United Kingdom, Fax: 44-113-2336465, a.p.johnson@chemistry.leeds.ac.uk

Abstract
The SPROUT program for de novo ligand design benefits from extensive use made of chemical information extracted from a variety of databases and knowledge bases. Databases of x-ray structures such as CSD and the Brookhaven file provide support for the generation of conformations which should be of relatively low energy because they correspond to ones frequently found in these databases. Supplier catalogues, combined with retrosynthetic fragmentation of MDDR provide starting materials for the SynSPROUT variation in which structures are built up by virtual synthesis using common synthetic reactions stored in a reaction knowledge base. An alternative fragmentation of MDDR provides drug like entities which are flood docked to all possible target sites in a protein cavity. Analysis of the estimated binding affinity scores for all poses allows an informed choice of a subset of target sites for further structure generation.

CINF 62:  Chemical information instruction in an industrial environment
Catherine Lyons Misner, Rohm and Haas Company, 727 Norristown Road, P.O. Box 0904, Spring House, PA 19477-0904, Fax: 215 – 641 – 7811, cmisner@rohmhaas.com

Abstract
The need for chemical information and awareness of new resources grows exponentially for research chemists and (chemical) engineers who move from the academic to the industrial arena. Continuing effective instruction in retrieving and evaluating available chemical and engineering information can be a challenge to the corporate information professional. This poster will address the issues we in the Knowledge Center face, and will outline some solutions we at Rohm and Haas Company are employing to meet the challenge. Industrial sector scientists are constrained by the time they can devote to the amount of available data and information. We have determined that a key factor in effective use of the plethora of resources we teach is the manner in which we present our training sessions and materials. The focus of our instruction has migrated from “here’s what we have” to “here’s what you will get” out of the resource or “here’s what you can do” with an information resource. We run training sessions to enable scientists to create their own accelerated information analyses (in a kilosecond). Our presentations have to address wide ranges in educational levels and computer skills, as well as an increasingly global audience. We will describe how we leverage existing technology in overcoming some of these obstacles. We have created quick reference guides linked from our web page, and collaborated with vendors to translate presentations into other languages. We are creating brief, focused online tutorials, and we use collaboration tools (like Sametime™ or WebEx™) for providing training to remote locations. We attempt to balance our charter to provide appropriate information to our customers via self-service with our role as highly trained consultants and partners.

CINF 63:  Information literacy for the physical scientist
Leah R. Solla, Physical Sciences Library, Cornell University, 293 Clark Library, Cornell University, Ithaca, NY 14853-2501, Fax: 607-255-5288, lrm1@cornell.edu

Abstract
Known locally as Chem602, Information Literacy for the Physical Scientist is a one-credit graduate course offered every spring at Cornell University. The semester-long series of classes introduces physical scientists to the extensive resources available in chemistry, the physical sciences and the life sciences. Indexes, abstracts, handbooks, databases, librarians and other fugitive resources in print and electronic formats are considered. Each lecture is organized around like resources, specific searching skills and common question types, with carefully prepared demonstrations and exercises. Graduate and undergraduate students, faculty, staff and librarians are encouraged to enroll or drop in on individual sessions for hands-on training. This poster will illustrate the approach of the curriculum, provide sample lectures, exercises and evaluation criteria, and outline the evolution of the course with the changing nature of information, my growing experience and feedback from students.

CINF 64:  Library and database assignments for undergraduate chemistry majors
Ann D. Bolek, Science-Technology Library, The University of Akron, Akron, OH 44325-3907, Fax: 330-972-7033, bolek@uakron.edu

Abstract
At The University of Akron, undergraduate chemistry majors are given library and database assignments during their junior year in their Advanced Chemistry Laboratory classes. During the first semester, they are assigned searches in SciFinder Scholar, whereas during the second semester, they are assigned searches in Beilstein and Gmelin CrossFire, the Web of Science, the Cambridge Structural Database, various Web resources, and printed reference books. This poster will list the sources used and provide examples of some of the searches assigned.

CINF 65:  Team Green: An integrated approach to teaching analytical chemistry and chemical literacy
Mary Ellen Teasdale, James G. Gee Library - Science Reference, Texas A&M University - Commerce, 2600 South Neal, Commerce, TX 75429, Fax: 903-886-5723, libmt@tamu-commerce.edu, and Anita I. Zvaigzne, Department of Chemistry, Texas A&M University - Commerce

Abstract
How long would it take a person to die from inhalation of a poisonous gas? When the gas cylinder spews into the open lab area, LD-50 and risk assessment should already have been considered. Therefore, to develop students’ appreciation for problems encountered by analytical chemists and for using the chemical literature, Team Green was devised by a chemistry instructor and science reference librarian. The Team Green problem assigned to the students was to generate a mock Chemical Response Plan (CRP) for first responders and community officials encountering chemical warfare agents in a small rural community. This poster highlights how analytical chemistry and chemical information can be integrated. The purpose of this exercise is to teach undergraduate chemistry students chemical information strategies in order to address the kinds of problems that might be encountered in working in an industrial or community setting.

CINF 66:  Using poster sessions in a chemical information course
F. Bartow Culp, Mellon Library of Chemistry, Purdue University, West Lafayette, IN 47907, bculp@purdue.edu

Abstract
In teaching a course in chemical information, the instructor is forever reconciling the opposing forces of content inclusion with class time availability. Even a full semester course is rarely more than a one-credit/one-class-per-week offering - barely enough time to teach the sources and skills necessary to fulfill minimal course objectives. Having students prepare and present a poster session is one way to maximize the use of limited class time. At Purdue University, a student poster project has been a popular part of the chemical information course for several years. In addition to the time savings mentioned above, there are other instructional advantages to this program.

CINF 67:  Balancing theory and practice in chemical information instruction
Charles F. Huber, Davidson Library, University of California - Santa Barbara, Santa Barbara, CA 93106, Fax: 805-893-8620, huber@library.ucsb.edu

Abstract
Most undergraduate and graduate students in chemistry don't intend to become chemical information professionals. They only want the nuts and bolts of searching the literature as easily and quickly as possible with the tools at hand. But sometimes more theoretical concepts or historical background can help make the students better searchers. Where to strike the balance? Examples from my chemical literature course at UC-Santa Barbara will be used to illustrate one approach.

CINF 68:  Developing the better mousetrap: Creating chemistry course and subject guides in a content management system
Teri M. Vogel, University Library, Georgia State University, 100 Decatur Street SE, MSC 8E0705, Atlanta, GA 30033-3202, Fax: 404-651-4315, tmvogel@gsu.edu

Abstract
In 2003 Georgia State University Library began implementing a content management system (CMS) for the fifteen liaison librarians to create Web pages for their faculty and students. A CMS offers a number of advantages for any library, most importantly the use of forms and templates to make global changes and to create a uniform structure and look among pages managed by multiple contributors. For the Chemistry Liaison, the CMS has greatly streamlined the process of developing content-rich subject and class guides to support library instruction. Standard instruction pages can be created once, and then included in multiple guides with a mouse click. The form/template style makes it easier to divide these guides into smaller units, a standard rule of Web usability principles. The result is the ability to develop highly sophisticated resource guides for specialized topics like SciFinder and Beilstein, while still creating pages that patrons can easily navigate and use.

CINF 69:  Experiments in teaching information skills to chemical & engineering students
Erja Kajosalo, Libraries, Massachusetts Institute of Technology, Building 14S-134, 77 Massachusetts Ave, Cambridge, MA 02139, kajosalo@mit.edu

Abstract
It is hard to get the attention of busy chemistry faculty and students, and as a chemistry librarian I am constantly trying different approaches to informing chemistry and chemical engineering faculty and students of the existence and use of the libraries' vast chemical information resources. There are individual research appointments, and hands-on sessions on major chemistry databases. A library instruction session might be part of the course syllabus, or we create a course-specific web page. Our users can attend our open labs and lectures about chemistry resources, and we teach administrative staff how to find articles for faculty from their desktop. The latest addition is a seminar series on chemical information introducing several databases to graduate students and postdocs in different departments. This poster will highlight these different approaches and what has worked for us.

CINF 70:  Printed Beilstein Handbook: An enduring resource in organic chemistry
Philip Barnett, Science/Engineering Library, City College of New York (CUNY), Convent Avenue at 138th Street, New York, NY 10031, Fax: 212-650-7626, pbarnett@ccny.cuny.edu

Abstract
For over a century, the printed Beilstein Handbook of Organic Chemistry has been a nonpareil source of data on properties of organic compounds. Until about a decade ago, many organizations collected most or all of the printed volumes. While many users now access Beilstein on either the subscription based CrossFire® or the online Beilstein database, many other users are bound to the printed handbook because they either cannot afford CrossFire® or they lack a sufficient budget for extensive searching of the database. Moreover, the venerable printed version has a unique feature: one can browse to see how compounds related to a compound of interest are prepared, even if the compound of interest has not yet been synthesized. New users of the massive printed handbook are often intimidated by the difficulty of learning the complex and multi-layered rules for locating compounds in the printed volumes. However, tutorials and user aids like the ones in the Clearinghouse for Chemical Information Instructional Materials, and a small computer program, SANDRA®, from the Beilstein Institute, enable new and occasional users to readily locate substances in these books. One such tutorial (http://www.indiana.edu/~cheminfo/33-16.html) explains both the layout of the handbook and how to use SANDRA®. This tutorial, accompanied by an appropriate assignment (such as finding some physical properties of a given compound in the main and all supplementary volumes) will teach mastery of the handbook. At the same time this regimen will reassure students that they can find any desired substance in this printed work.

CINF 71:  Publishing in the Chemical Information Instructor feature of the Journal of Chemical Education
Andrea Twiss-Brooks, John Crerar Library, University of Chicago, 5730 S. Ellis Ave, Chicago, IL 60637-1403, atbrooks@uchicago.edu

Abstract
Current ACS Committee on Professional Training guidelines require teaching of “the systematic use of chemical information.” 1 Skilled and inventive instructors are developing new strategies and programs to fulfill this guideline and produce chemists that know how to find, critically evaluate, and use chemical information. The Chemical Information Instructor feature of the Journal of Chemical Education provides instructors with a forum to share practical information related to teaching chemical information literacy and skills with colleagues who face the same challenges. Information is provided in print, and on the Web via JCE Online. Topics of submissions include integration of information instruction into one or more courses, integration of WWW sources into instruction, instruction on specific types of information (e.g., organic reactions) or specific types of materials (e.g., patents) or specific sources and databases (e.g., Chemical Abstracts), and teaching techniques. A brief history of the feature, a bibliography of previously published articles, and information on submission of articles will be presented.

1) American Chemical Society. Committee on Professional Training. Undergraduate Professional Education in Chemistry: Guidelines and Evaluation Procedures. Spring 2003, p.9

CINF 72:  Kinases, homology modeling, and high throughput docking
David J. Diller, Molecular Modeling, Pharmacopeia, Box 5350, Princeton, NJ 08543-5350, Fax: 609-655-4187, ddiller@pharmacop.com

Abstract
Over the past few years we have developed molecular docking software intended for discovery combinatorial library design. The two main considerations in the design were thus speed and utility with homology models. Significant effort was put into validating the approach in the context of homology models for kinase targeted library design. In this talk we briefly discuss the philosophy behind the design and the validation studies. Furthermore, we discuss how we have used the results of the study in kinase targeted library design. Finally, we discuss the success of the libraries designed with this procedure.

CINF 73:  Virtual screening for kinase inhibitors
Paul D. Lyne, Cancer Discovery, AstraZeneca, 35 Gatehouse Drive, Waltham, MA 02451, paul.lyne@astrazeneca.com

Abstract
A virtual screen of a subsection of our corporate collection was performed for checkpoint-1 kinase using a knowledge-based strategy. This involved initial filtering of the compound collection by application of generic physical properties followed by removal of compounds with undesirable functionality. Subsequently a 3-D pharmacophore screen for compounds with a kinase binding motif was applied. The remaining compounds were docked and rescored, resulting in 103 compounds being tested. This yielded 36 hits in the IC50 range of 110 nM to 68 µM, corresponding to four chemical classes.

CINF 74:  EA-Inventor: Using vHTS scoring functions for de novo design
Robert S. Pearlman and Karl M. Smith, Optive Research, Inc, 12331-A Riata Trace Parkway -- Suite 110, Austin, TX 78727, bob.pearlman@optive.com

Abstract
Virtual HTS involves application of a scoring function to a specified, pre-determined set of structures. De novo design involves application of a scoring function to a dynamically evolving set of structures. De novo design also involves the application of a “structure modifying engine” to generate new structures and control the process by which structures with improved scores are evolved. Previous efforts to develop de novo design packages have focused, primarily, on developing “the best possible scoring function.” This is unfortunate for two reasons. First, the choice of “best” scoring function varies greatly from one discovery project to the next (and from one scientist to the next). Second, insufficient attention has been devoted to developing “the best possible structure modifying engine.”

We will describe an exceptionally complete and robust structure modifying engine (EA-Inventor) which offers two extremely important features. First, the nature of the structural modifications (and, hence, resulting structures) can be “tuned” to best suit the needs of particular discovery projects. Second, the engine can easily be used in conjunction with literally any scoring function or composite scoring function – whatever function is deemed most appropriate for the particular discovery project.

CINF 75:  High-throughput molecular docking for lead discovery
Diane Joseph-McCarthy and Juan C. Alvarez, Chemical and Screening Sciences, Wyeth Research, 200 CambridgePark Drive, Cambridge, MA 02140, Fax: 617-665-5682, DJoseph@wyeth.com

Abstract
High-throughput virtual screening of large three-dimensional molecular databases enables the identification of novel small molecule drug leads for biologically relevant targets. Accurate molecular docking of small molecules to a target structure requires adequate sampling and accurate scoring of each library ligand in the target binding site. Our pharmacophore-based docking approach allows for efficient sampling of the ligand conformations and orientations in the target structure. The use of a fast scoring filter followed by a more rigorous scoring function to rank selected hits will be discussed. In particular, the utility of various scoring schemes for identifying viable leads in test cases as well as in a therapeutic target screen will be presented.

CINF 76:  Automating and improving virtual screening of large compound databases
Niu Huang1, John Irwin2, Chakrapani Kalyanaraman2, Brian Shoichet2, and Matthew P Jacobson2. (1) Department of Biopharmaceutical Sciences, University of California, San Francisco, 600 16th St., Suite N474E, San Francisco, CA 94143, nhuang@salilab.org, (2) Department of Pharmaceutical Chemistry, University of California, San Francisco

Abstract
Despite well-known weaknesses, molecular docking is now one of the most practical techniques to leverage structure for ligand discovery. We have developed a docking and rescoring protocol to computationally screen chemical databases containing millions of compounds with minimal user intervention. Docking was performed using the DOCK 3.5.54 program with a grid-based electrostatic and van der Waals interaction energy evaluation including a partial ligand desolvation energy correction. This fully automated docking approach was evaluated by the extent to which known binders were enriched against a background of drug-like decoys and compared favorably with enrichments obtained by an expert. The binding poses of top scoring compounds from docking were submitted to further refinement and rescoring using an all-atom force field (OPLS AA) and implicit solvent model (Generalized Born); the use of a rapid multi-scale Truncated Newton energy minimization algorithm enabled this refinement stage to be completed in less than one minute per ligand. Significant improvement in enrichment was observed for most of the systems studied.
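
Enrichment of known binders over decoys is commonly summarised as an enrichment factor: the fraction of binders in the top-ranked slice of the database divided by the fraction of binders in the whole database. The abstract does not state which enrichment measure was used, so the sketch below shows this common definition only; the function name `enrichment_factor` and the toy ranking are illustrative assumptions.

```python
def enrichment_factor(ranked_is_binder, top_fraction):
    """Enrichment factor for a ranked list (best score first).

    ranked_is_binder: list of bools, True where the compound is a known binder.
    top_fraction: fraction of the ranked list to inspect, e.g. 0.01 for the top 1%.
    """
    n_total = len(ranked_is_binder)
    n_binders = sum(ranked_is_binder)
    if n_total == 0 or n_binders == 0:
        raise ValueError("need a non-empty list with at least one binder")
    if not 0.0 < top_fraction <= 1.0:
        raise ValueError("top_fraction must be in (0, 1]")
    n_top = max(1, int(round(n_total * top_fraction)))
    binders_in_top = sum(ranked_is_binder[:n_top])
    # (hit rate in the top slice) / (hit rate expected from random selection)
    return (binders_in_top / n_top) / (n_binders / n_total)


# Toy ranking: 2 binders among 10 compounds, both ranked in the top 20%.
ranking = [True, True] + [False] * 8
ef_top20 = enrichment_factor(ranking, 0.2)
```

A value of 1 corresponds to random selection; in this toy case both binders fall in the top two positions, giving an enrichment factor of 5.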

CINF 77:  Navigating high-throughput docking results
Keana Scott, Noel Southall, Trung Nguyen, and Dr Ajay, Informatics, Celera Genomics, 45 W. Gude Drive, Rockville, MD 20850, Fax: 240-453-3303, keana.scott@celera.com

Abstract
High-throughput docking results are often subjected to strict filters that are based on multiple scoring functions and applied in linear fashion to reduce the docked poses to a manageable number for visual inspection. Although filters reduce the number of false positives, they also decrease statistical power in the docking exercise by increasing the number of false negatives. Instead, we have developed a flexible tool that 1) allows the user to navigate through the entire binding mode hypothesis space rather than a small subset of individual poses, 2) does not preclude interesting unanticipated binding modes, 3) incorporates multiple in-house developed scoring functions, and 4) enables us to leverage a modeler’s intuition. This flexibility is achieved through a Java-based user interface that allows for mathematical/logical operations in hypothesis space and works hand-in-hand with PyMol for visualization and an Oracle backend for data storage.

CINF 78:  Beyond the limits in early ADME prediction to boost v-HTS
Jacques R. Chretien1, Han van de Waterbeemd2, Nadege Piclin1, Christophe Wechman3, and Marco Pintore1. (1) BioChemics Consulting, Innovation Center, 16 L. de Vinci, 45074 Orleans cedex 2, France, Fax: + 33 2 38 41 72 21, jacques.chretien@univ-orleans.fr, (2) PDM, Department of Drug Metabolism, Pfizer Global Research and Development, (3) LBLGC / CBI, UPRES EA 1207, University of Orleans

Abstract
Appropriate pharmacokinetic properties are important for the success of a drug discovery program. There is a need to incorporate ADME considerations in the earliest phases of drug discovery, particularly in virtual design and virtual screening (v-HTS). Such procedures will be able to predict ADME properties of any molecule beyond the limits of the Lipinski rules. Recently, we have developed new computational methods based on Genetic Algorithms and Fuzzy Logic, which have allowed us to build a number of early ADME predictors [1]. In this contribution, their application to the main pharmacokinetic properties, i.e. oral absorption, bioavailability, volume of distribution and clearance, will be discussed. All models generated were validated by cross-validation, test set and Y-sampling procedures, and most of them were able to predict ADME properties correctly with prediction rates higher than 65-70%. Moreover, the proposed techniques showed robustness and a prediction power higher than those derived from other comparable methods.

References: [1] Pintore M, van de Waterbeemd H, Piclin N, Chrétien J., Prediction of oral bioavailability by adaptive fuzzy partitioning, Eur J Med Chem (2003), 38, 427-431.

CINF 79:  Improving the workflow and accuracy of in silico ADME/Tox prediction
Gregory M. Banik and Michelle D'Souza, Informatics Division, Sadtler Software & Databases, Bio-Rad Laboratories, 3316 Spring Garden Street, Philadelphia, PA 19104-2596, Fax: 215-662-0585

Abstract
In silico ADME/Tox prediction can shorten the research-to-market cycle and eliminate wasted effort in pharmaceutical R&D through the identification and evaluation of possible problems with potential lead compounds. This presentation will demonstrate how drug discovery professionals can generate ADME/Tox profiles for potential lead compounds and accelerate lead generation with: simultaneous side-by-side prediction of multiple properties, consensus predictions using multiple models for more accurate results, validation tools to verify models, seamless data mining, and an integrated toolset for improved workflow. This session will also demonstrate how the software can be used not only as an evaluation tool for assessing fundamental ADME/Tox parameters, but also as an informatics system for mining, managing, searching, and communicating the knowledge obtained from such assessments.

CINF 80:  Classpharmer and the quest for privileged substructures
Dora Schnur, Computer Assisted Drug Design, Bristol-Myers Squibb, P.O. Box 5400, Princeton, NJ 08543, and Mark A. Hermsmeier, New Leads Chemistry, Bristol-Myers Squibb

Abstract
With the onslaught of data that has arisen from solving the human genome, creation of libraries and screening decks that are directed toward families of receptors such as GPCRs, kinases, nuclear hormones, etc. has replaced generation of libraries and screening decks based primarily on diversity. Although diversity-based design still plays a role, particularly for orphan receptors and for receptors with no known small molecule ligands, more knowledge-based approaches are required for target class design. A standard approach involves the use of privileged substructures. These “target class active” fragments or substructures may be found by various methods. This presentation focuses on the use of Classpharmer(TM) to find such substructures for target class compound sets from the MDDR as derived from Schuffenhauer, Jacoby, et al: “An Ontology for Pharmaceutical Ligands….”, JCICS 2002, 42, 947-955. It also examines the validity of the concept of "privileged substructure".

CINF 81:  Chemometric approaches to virtual screening
Alexander Tropsha, Scott Oloff, Shuxing Zhang, and Min Shen, Laboratory for Molecular Modeling, School of Pharmacy, University of North Carolina at Chapel Hill, CB # 7360, Beard Hall, School of Pharmacy, Chapel Hill, NC 27599-7360, Fax: 919-966-0204, alex_tropsha@unc.edu

Abstract
We discuss novel chemometric approaches to both ligand and receptor based virtual screening, which characterize both ligands and receptors (if available) in a multidimensional space of chemical descriptors. In ligand based screening, we employ rigorously validated QSAR models to mine chemical databases for compounds with high predicted activity. We demonstrate that this approach yields an exceptionally high experimental hit rate in identifying anticonvulsant compounds from a set of 250,000 molecules. We also report on a novel approach to identifying Complementary Ligands Based on Receptor Information (CoLiBRI). CoLiBRI transforms the chemical structures of both ligands and their complementary active sites into the high-dimensional descriptor space and uses specially developed chemical similarity metrics to mine a target’s complementary ligands from large databases. The results illustrate that CoLiBRI is capable of identifying all known ligands of 260 test binding sites within the top 1% of the database of ca. 60,000 compounds in 95% of all cases.

CINF 82:  Functional group fingerprints: Augmenting hit and lead identification
James R. Arnold, Charles L. Lerman, and James R. Damewood, CNS Chemistry, AstraZeneca, 1800 Concord Pike, Wilmington, DE 19850, Fax: 302-886-5382, james.arnold@astrazeneca.com

Abstract
It has been estimated that roughly 70% of drug discovery projects must be approached by ligand-based methods, as many targets are not currently amenable to structural studies. We present a novel ligand-based method called Functional Group Fingerprinting. This method classifies medicinally relevant functional groups in molecules using approximately 400 defined functional groups. It creates bitstrings that are used to calculate similarity scores between known actives and either databases or libraries of compounds. In this presentation we will show completeness and orthogonality of the functional group assignments in medicinally relevant compounds. Functional Group Fingerprinting also recovers different sets of actives than those recovered with other fingerprint methods. We will show the enrichment rates observed with this method in greater than 500 target classes within the MDDR are comparable or superior to existing methods. The method recovers on average greater than 60% of the active compounds in less than 1% of the ranked MDDR target classes. This permits fewer compounds to be screened in the hit or lead identification stages in order to identify a sufficient number of chemical classes for lead generation to progress on a given target. That results in reduced depletion of the corporate compound collection, more targets being evaluated, and more efficient identification of lead series for prioritization.
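
The bitstring and similarity-score steps described above can be sketched in a few lines of Python. This is only an illustration: the six-entry group vocabulary stands in for the roughly 400 defined functional groups, the Tanimoto coefficient is assumed as the similarity score (the abstract does not name one), and the compound names and group assignments are invented.

```python
# Hypothetical vocabulary; the method described uses approximately 400 groups.
GROUPS = ["amide", "carboxylic_acid", "ester", "primary_amine", "sulfonamide", "phenol"]
INDEX = {name: i for i, name in enumerate(GROUPS)}

def fingerprint(groups_present):
    """Encode a set of functional-group names as an integer bitstring."""
    bits = 0
    for name in groups_present:
        bits |= 1 << INDEX[name]
    return bits

def tanimoto(a, b):
    """Tanimoto coefficient: shared bits divided by bits set in either string."""
    union = bin(a | b).count("1")
    if union == 0:
        return 0.0
    return bin(a & b).count("1") / union

def rank_by_similarity(query, library):
    """Return (name, score) pairs sorted from most to least similar to the query."""
    scored = [(name, tanimoto(query, fp)) for name, fp in library.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)

# Invented example: one known active and a three-compound library.
known_active = fingerprint({"amide", "primary_amine"})
library = {
    "cpd_A": fingerprint({"amide", "primary_amine", "phenol"}),
    "cpd_B": fingerprint({"ester"}),
    "cpd_C": fingerprint({"amide"}),
}
ranking = rank_by_similarity(known_active, library)
print(ranking)
```

Screening only the top-ranked fraction of such a list is what allows fewer compounds to be tested; the enrichment figures quoted in the abstract refer to the authors' own method and data, not to this toy example.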

CINF 83:  Steric multiplet fingerprints as screens in high-throughput screening
Essam Metwally1, Robert Clark2, and Peter C. Fox1. (1) Tripos Inc, 1699 South Hanley Road, Saint Louis, MO 63144, emetwall@tripos.com, (2) Discovery Software, Tripos Inc

Abstract
Rapid screening of large chemical databases for potential lead compounds has become ever more important as the number of known biological targets has risen dramatically. Fully flexible 3D searching on a large database is too time consuming to be used routinely as a screening method, thus a rapid screen that allows the majority of compounds to be filtered out before a 3D search is run is highly desirable. A number of methods have been developed to rapidly screen databases utilizing fragment-based fingerprints, and more recently pharmacophoric feature based fingerprints. The steric or shape component of each molecule is disregarded in these types of fingerprints. Screening on a fingerprint descriptor that encodes the shape information of a query may be a valuable addition to the fragment or pharmacophore-based screens.

We have extended Tripos' tuplet technology to include the ability to use steric multiplet fingerprints, which encode flexible shape information of the database molecules, either alone or in concert with pharmacophoric feature multiplets. The definition, creation and storage methodologies for these fingerprints will be discussed. Examples of the use of these fingerprints as database screens will also be covered.

CINF 84:  From gene to lead: In silico warfare on the West Nile virus
Luke S. Fisher, Dana Haley-Vicente, and Anne Marie Quinn, Lead Identification and Optimization, Accelrys Inc, 596 Midnight Pass, Antioch, IL 60002, Fax: 240-248-3096, lfisher@chemist.com

Abstract
Can we produce reliable structural models of the proteins encoded by the West Nile virus genome even when sequence identity is low among homologs? Such structural information is critical to further efforts to design drugs to fight the onslaught of disease. Here we show that the Discovery Studio® (DS) GeneAtlas pipeline can be used to produce reliable structural and functional annotation of the proteins encoded by the West Nile virus genome. The 3D homology models generated have been used as the biological targets for lead finding experiments that include a combination of docking and de novo design. New chemotypes identified have also been prioritized based on their 'drug-like' characteristics and synthetic feasibility.

CINF 85:  Automation and deployment of virtual screening to the discovery organization
Robert D Brown, Andrei Caracoti, and Rahim Lila, SciTegic, Inc, 9665 Chesapeake Dr. #401, San Diego, CA 92123, Fax: 858 279 8804, rbrown@scitegic.com

Abstract
For virtual screening to have a tangible impact on the Discovery process, an informatics infrastructure must be developed to underpin the scientific algorithms. This allows the virtual screening methodologies to be transformed from manual processes restricted to a computational chemistry lab into automated processes deployed to the wider discovery organization. Data pipelining provides a paradigm that allows for the automation of the virtual screening process and allows web deployment of virtual screens to the chemists and biologists that need to apply the results. We will describe this approach and its advantages with reference to the integration of GOLD molecular docking from the Cambridge Crystallographic Data Centre and FlexX from Tripos Inc into an automated data pipelining process, and we will show the deployment of that process through a Web-based tool.

CINF 86:  Ultra-high throughput vHTS with neural networks
Victor S. Lobanov, 3-Dimensional Pharmaceuticals, Inc, 665 Stockton Dr., Suite 104, Exton, PA 19341, Fax: 610-458-8249, victor.lobanov@3dp.com

Abstract
3D pharmacophore- and docking-based virtual screening involves intensive computations that impose severe limitations on the number of compounds that can be screened within a reasonable period of time. In contrast, artificial neural networks trained with 1D and 2D molecular descriptors offer several orders of magnitude higher throughput and can be applied to screen truly massive collections. In this presentation, pros and cons of neural network based virtual screening techniques are discussed and examples of neural networks trained to predict activity and ADME properties are presented.

CINF 87:  SARtree. A new method for analyzing and visualizing the results from virtual and experimental screening of large complex chemical libraries
Donovan N. Chin, Anuj Patel, R. Aldrin Denny, and Juswinder Singh, Computational Drug Design, BiogenIdec, 14 Cambridge Center, Cambridge, MA 02142, donovan.chin@biogenidec.com

Abstract
This talk will describe a newly developed method at BiogenIdec for analyzing and visualizing the results from virtual screening of large complex chemical libraries. The method involves two unique steps. First, an algorithm recursively partitions chemical libraries into core substructures, attached common sub-cores and unique chemical fragments, and keeps track of them. Second, an interactive graphical "tree" permits the analyst to visualize the connections and relationships between the common and unique chemical groups from step one. Property information--e.g. virtual screening scoring, biological assay data, or both--can be overlaid on the graphical tree to allow rapid identification of chemical fragment hotspots for a given dataset. We will highlight the utility of SARtree on several chemical datasets including CDK2 kinase from virtual screening. Finally, we will demonstrate that SARtree is a powerful new way of quickly identifying chemical structure-variation and -property relationships hidden in chemical datasets. SARtree is therefore emerging as a useful tool for analyzing and understanding the output from virtual screening, and as a new way of tracking and visualizing (virtual) chemical libraries.

CINF 88:  Searching the impossible: Feature trees in fragment space
Marcus Gastreich, Sally Ann Hindle, and Christian Lemmen, BioSolveIT GmbH, An der Ziegelei 75, 53757 Sankt Augustin, Germany, Fax: +49 2241 2525 525, marcus.gastreich@biosolveit.de

Abstract
FTrees is known to be an effective program for similarity searching. Based on the feature tree descriptor, the similarity of two molecules is defined as the score for the best possible alignment of the respective compared trees. Thanks to the tree nature, this optimal alignment can be computed very efficiently.

The step beyond simple A to B similarity calculations is an on-the-fly assembly of molecule B from a fragment space such that virtual molecule B is most similar to A. Due to the combinatorial nature of the problem, the size of this fragment search space is roughly 10^18 compounds - which is impossible to search sequentially.

We report on the generation of fragment spaces, technology to efficiently search them, and example applications. Thanks to the availability of the underlying Feature Trees as a Python module, the entire process can be scripted and executed in parallel within an integrated Python environment.

CINF 89:  Fast and accurate coarse-grained estimate of small molecule binding free energies
Jun Shimada, Alexey V. Ishchenko, Kam Jim, David J. Lawson, Peter R. Lindblom, Guosheng Wu, and JP Wery, Computational Drug Discovery, Concurrent Pharmaceuticals, Inc, 502 West Office Center Drive, Fort Washington, PA 19034, Fax: 215-461-2006, jshimada@concurrentpharma.com, jwery@concurrentpharma.com

Abstract
A novel approach for the prediction of binding free energies will be presented. This approach is characterized by three critical features: (1) a coarse-grained physical model of the binding process, (2) trainability, and (3) a sophisticated machine learning algorithm that maximally utilizes the information from bioassays. When used against multiple pharmacologically relevant targets, this scoring function has proven to be accurate and generalizable outside of the training set. In virtual screening against aspartyl proteases, nuclear receptors and kinases, this approach was able to select inhibitors which, after synthesis, were shown to be active.

CINF 90:  Building predictive models from high-throughput screening data
Paul E. Blower, Kevin P. Cross, Glenn J. Myatt, and Chihae Yang, Leadscope, Inc, Columbus, OH 43215, pblower@leadscope.com

Abstract
Predictive models derived from high-throughput screening (HTS) data can be useful for prioritizing compounds for further testing. However, HTS data is typically of poor quality with many values out of range and wide variability among replicate test results. Structure-based clustering often reveals an irregular landscape, both in terms of the compound classes represented and the distribution of active compounds across structural classes. Large regions of the chemical space are devoid of activity. Indeed most compounds are neither active nor similar to active compounds and thus are of marginal value for modeling activity. Even among active classes, the within-class active/inactive ratio may still be very unbalanced, the range of response values too narrow, or the classes too small to derive accurate models. This presentation will survey problems of building predictive models from large, heterogeneous screening sets and describe methods for addressing them.

CINF 91:  Virtual high-throughput screening: How to boost the pharmacophore approach
Frédérique Barbosa, Molecular Modelling, Cerep, 128, rue Danton, 92500 Rueil Malmaison, France, Fax: 33 1 55 94 84 10, F.Barbosa@cerep.fr

Abstract
The virtual screening of large libraries is structure-based or ligand-based. Structure-based virtual screening (docking) is time consuming and requires a precise knowledge of the 3D structure of the target and of the various binding contributions. For ligand-based screening, the only determining step is the appropriate choice of descriptors and similarity metrics. A careful pharmacophoric description of the 3D conformers of the ligands is required to take into account the features responsible for binding. We use such an approach to retrieve new lead compounds from large and chemically diverse virtual libraries. We have built internally a virtual library of 10^8 chemically feasible compounds from >7000 building blocks (>1500 proprietary) using validated chemistries. We use a so-called “ghost database” mechanism that allows for the fast calculation of multiple conformers and pharmacophoric fingerprints for each individual structure in the database. Therefore extensive virtual screening with pharmacophores can be done within tractable CPU time. The retrieved compounds are further analyzed by predicting ADME-T properties with Cerep proprietary QSAR models based on BioPrint® (>2000 drug and drug-like compounds tested across >170 in vitro assays). This two-stage approach accelerates the identification of promising chemical structures for early drug discovery.

CINF 92:  Reducing CYP-2 liabilities using pharmacophore hypotheses derived from protein structures and inhibitors
Akbar Nayeem and Litai Zhang, Computer-Assisted Drug Design, Bristol-Myers Squibb, Pharmaceutical Research Institute, P.O. Box 5400, Princeton, NJ 08543-5400, Fax: 609-818-3545, akbar.nayeem@bms.com, litai.zhang@bms.com

Abstract
The CYP-2 family of cytochrome P450s ranks among the most important drug metabolizing CYP isoforms present in human liver, and numerous inhibitory drug interactions of high clinical significance involving CYP 2D6 and 2C9 substrates have been described. With the goal of reducing the liability of drug-drug interactions caused by possible inhibition of CYP 2C9, 2C19 and 2D6, 3D-pharmacophore models for each of these isoforms have been developed using their respective homology models, known substrates, and our in-house inhibitors from BMS. The pharmacophore hypotheses derived from these models are presented and are shown to be useful in understanding the active site of these isozymes. The in silico models thus derived are used to triage and prioritize chemical library synthesis and reduce the potential liability of drug-drug interactions caused by inhibition of the CYP-2 family.

CINF 93:  Interplay of docking, pharmacophores, and shape in virtual high-throughput screening
Erik Evensen, Hans E. Purkey, Kenneth E. Lind, and Erin K. Bradley, Computational Sciences, Sunesis Pharmaceuticals Inc, 341 Oyster Point Blvd., South San Francisco, CA 94080, Fax: 650-266-3501, ee@sunesis.com

Abstract
We have observed recently that post-filtering docking results using pharmacophore models leads to improved enrichments in virtual screening exercises over docking or pharmacophore screening alone. This counter-intuitive insight leads to further questions that point to potential areas for improvement in virtual high-throughput screening. For example, it has been proposed that the improvement over pure pharmacophore-based methods is because docking selects compounds that are shape complementary to the target. We will present inquiries into improving pharmacophore post-filtering and using shape filtering as a higher throughput surrogate for docking. We evaluate the interplay and impact of these methods by applying them to data sets obtained on multiple proteins from different families.

CINF 94:  Compound optimization tools: Designed by the scientists for the scientists
Uwe Geissler, LION bioscience AG, Waldhofer Str. 98, Heidelberg D-69123, Germany, uwe.geissler@lionbioscience.com, and Manish Sud, LION bioscience Inc

Abstract
During the compound optimization stage of a discovery cycle, bench scientists are interested not only in identifying the key structural features responsible for activity, selectivity, and favorable ADME properties but also in determining what structural changes need to be made to improve these characteristics. In order to address these questions, a variety of search and analysis methodologies, along with visualization tools for structural and numerical data, are routinely deployed at the desktops. Unless these tools are easy to use and address relevant questions, bench scientists are seldom interested in using them. In collaboration with an external customer, we have developed a compound optimization tools environment which provides easy-to-use search, analysis and visualization capabilities. Additionally, a compute services framework is available for deploying any internal or third party applications. Interactive visualization and analysis capabilities include: chemistry-centric spreadsheets, 2D/3D scatter plots, profile and multi-series plots, 2D/3D histograms, similarity searching, clustering, R-group deconvolution, molecular descriptor calculation, and others. We present an example of using these tools to address various questions which come up during compound optimization.

CINF 95:  HierS - hierarchical scaffold clustering
Steven J. Wilkens, Jeff J. Janes, and Andrew I. Su, Computational Discovery, Genomics Institute of the Novartis Research Foundation, 10675 John Jay Hopkins Drive, C115, San Diego, CA 92121, swilkens@gnf.org

Abstract
An exhaustive ring-based algorithm has been developed to provide an intuitive approach to compound clustering. The recursive algorithm rapidly identifies all ring-delimited substructures within a compound. Molecules are grouped by shared ring substructures (scaffolds) so that common scaffolds obtain higher membership and greater importance. Once all of the scaffolds are identified, hierarchical structural relationships are established. The complex network of hierarchical relationships is then utilized to navigate compounds in a structurally directed fashion. The utility of this approach is demonstrated by providing a readily interpretable model for chemical diversity in different compound sets. In addition, a web-based application has been developed which incorporates this algorithm in order to allow for the interactive analysis of the diverse sets of compounds that are produced from high-throughput screening. Biological data is coupled to scaffolds by the inclusion of activity histograms, which indicate how the compounds in each scaffold class performed in other screens.
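
The grouping and hierarchy steps can be illustrated with a toy Python sketch. It is not the authors' implementation: it assumes that ring systems have already been perceived and labeled by some other tool, that labels are unique within a molecule, and that a "scaffold" is any connected combination of ring systems. It also enumerates combinations by brute force rather than recursively. All names (is_connected, scaffolds, cluster, parents) and the three example molecules are hypothetical.

```python
from itertools import combinations

def is_connected(nodes, links):
    """True if the ring systems in `nodes` are joined into one piece by `links`."""
    start = next(iter(nodes))
    seen, stack = {start}, [start]
    while stack:
        cur = stack.pop()
        for link in links:
            if cur in link:
                for other in link - {cur}:
                    if other not in seen:
                        seen.add(other)
                        stack.append(other)
    return len(seen) == len(nodes)

def scaffolds(ring_systems, linkers):
    """Enumerate every connected combination of ring systems in one molecule.

    Each scaffold is (frozenset of ring labels, frozenset of links among them).
    """
    links = {frozenset(pair) for pair in linkers}
    found = set()
    for size in range(1, len(ring_systems) + 1):
        for combo in combinations(sorted(ring_systems), size):
            nodes = frozenset(combo)
            inner = frozenset(link for link in links if link <= nodes)
            if is_connected(nodes, inner):
                found.add((nodes, inner))
    return found

def cluster(molecules):
    """Map each scaffold to the set of molecule names that contain it."""
    members = {}
    for name, (rings, linkers) in molecules.items():
        for scaffold in scaffolds(rings, linkers):
            members.setdefault(scaffold, set()).add(name)
    return members

def parents(scaffold, members):
    """Scaffolds one ring system smaller that are contained in `scaffold`."""
    nodes, links = scaffold
    return {s for s in members
            if len(s[0]) == len(nodes) - 1 and s[0] < nodes and s[1] <= links}

# Invented molecules: (ring system labels, pairs of ring systems joined by a linker).
molecules = {
    "mol1": ({"benzene", "pyridine"}, [("benzene", "pyridine")]),
    "mol2": ({"benzene", "piperidine"}, [("benzene", "piperidine")]),
    "mol3": ({"benzene", "pyridine", "piperidine"},
             [("benzene", "pyridine"), ("pyridine", "piperidine")]),
}
members = cluster(molecules)
for (nodes, _), names in sorted(members.items(), key=lambda kv: -len(kv[1])):
    print(sorted(nodes), sorted(names))
```

In this toy data the single benzene scaffold collects all three molecules, which mirrors the statement that common scaffolds obtain higher membership; the parents function supplies the subset links from which a navigable hierarchy could be assembled.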

CINF 96:  Data handling in the NIST Chemistry WebBook
Peter J. Linstrom, Physical and Chemical Properties Division, NIST, Building 221, Room A111, 100 Bureau Drive, Stop 8380, Gaithersburg, MD 20899-0830, Fax: 301-896-4020

Abstract
The NIST Chemistry WebBook (http://webbook.nist.gov/) is a web site which provides a wide range of chemical and physical property data. The site, in operation since 1996, has evolved to meet the demands of new data types and a diverse and growing user base. The data handling systems used by the site have to address some common chemical informatics challenges: proper identification of chemical species, specification of property values and meta-data, and graphical representation of chemical structures and spectroscopic data. This talk will describe how the internal architecture of the site has evolved to meet these challenges. Topics to be discussed include data structures, unit conversions, and structure processing. Applications involving the new IUPAC-NIST chemical identifier will be discussed.

CINF 97:  TraX: An integrated system for drug discovery workflow scheduling and tracking
Daniel A Gschwend1, Rebecca J. Carazza1, and Sergio H. Rotstein2. (1) Research Informatics, ArQule Inc, 19 Presidential Way, Woburn, MA 01801, gschwend@arqule.com, (2) ArQule, Inc

Abstract
An efficient drug discovery process requires multiple disciplines to coordinate their efforts in a systematic and coherent manner. Often this goal is obstructed by the lack of inter-departmental communication and inter-departmental awareness of resource allocation. Even in a small company environment, much effort can be wasted through poorly coordinated actions and unexpected events. This presentation will describe TraX, an integrated resource planning and workflow tracking environment developed to support ArQule's drug discovery efforts. This system enables all of our researchers to be aware of all work going on within our drug discovery efforts and to anticipate the workload coming their way.

CINF 98:  MapMaker: An integrated compound library design tool
Daming Li and Sergio H. Rotstein, ArQule, Inc, 19 Presidential Way, Woburn, MA 01801, dli@arqule.com

Abstract
The design of a compound library requires a number of disparate steps, including reagent selection, product enumeration, the computation of properties of the enumerated products and the analysis of these properties in the context of the library as a whole. This process often involves a number of different software products across multiple operating systems, as well as the import, export and transfer of compound and property data between these systems. The complexity of the process often limits library design activities to domain experts. MapMaker is a library design tool that integrates the steps through a web-based user interface. In effect, MapMaker eliminates most of the technical complexity of the library design process, enabling synthetic chemists to carry out their own library design without requiring expert assistance.