Bryan L. Roth, National Institute of Mental Health Psychoactive Drug Screening Program and Department of Biochemistry, University of North Carolina at Chapel Hill, School of Medicine, Department of Pharmacology, 8032 Burnett-Womack, CB # 7365, Chapel Hill, NC 27599, firstname.lastname@example.org, Phone: 919-966-7535
The in vitro pharmacological profiling of drugs against a large panel of cloned receptors, an approach that has come to be known as 'receptorome screening', has unveiled novel molecular mechanisms responsible for the actions and side effects of certain drugs. For instance, receptorome screening has been employed to uncover novel molecular targets involved in the actions of antipsychotic medications and of the hallucinogenic mint extract salvinorin A. Receptorome screening has also implicated serotonin 5-hydroxytryptamine 2B (5-HT2B) receptors in the adverse cardiovascular effects of several medications, and subsequent clinical studies have corroborated this prediction (Roth, N. Engl. J. Med., 2007). Receptorome screening represents one of the most effective methods for identifying potentially serious drug-related side effects at the preclinical stage, thereby avoiding significant economic and human health consequences. It also represents a powerful approach for rationally repositioning existing medications.
Supported by NIMH PDSP and Grants from NIMH and NIDA
Josef Scheiber, Jeremy L. Jenkins, Andreas Bender, Steven Whitebread, Jacques Hamon, Laszlo Urban, Kamal Azzaoui, James H. Nettles, Meir Glick, and John W. Davies, Lead Finding Platform, Novartis Institutes for BioMedical Research, 250 Massachusetts Avenue, Cambridge, MA 02139, email@example.com, Phone: 617-871-3697
Adverse effects of drugs that are only identified after a compound enters the clinic seriously limit its therapeutic potential and can result in withdrawal from the market. Two well-known examples in recent years were rofecoxib (Vioxx®) and cerivastatin (Baycol®, Lipobay®), but there are others. Avoiding such adverse effects is therefore a key goal in the development of a drug. It would be desirable to have a computational tool that predicts possible problems even before a compound has been synthesized. Bender et al. have shown a proof-of-principle for predicting adverse events based on chemical structure. In the current study we present an advancement of this method. Approximately 200 marketed drugs were tested against 80 different targets in the Novartis Safety Profiling Panel and the IC50 values were determined. The well-documented adverse effects of these marketed drugs were stored in a database using standard MedDRA terminology. For every target, models were calculated and validated using both a Naïve Bayesian classifier and Linear Discriminant Analysis in conjunction with two chemical descriptors (Extended Connectivity Fingerprints and MDL Public Keys). We present results demonstrating correlations between chemical features and adverse effects on the one hand, and between targets and adverse effects on the other. The method can therefore be used to predict adverse events from chemical structure alone. Furthermore, novel links between targets and adverse effects can be unraveled; these are of interest in their own right, but can also be applied to select targets for in vitro compound profiling.
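The classifier side of such a model can be sketched with a Laplace-smoothed Bernoulli Naïve Bayes over binary fingerprint bits. The sketch below is purely illustrative: the function names, the four-bit toy fingerprints, and the two-class setup (adverse effect observed or not) are invented for this example and do not reflect the Novartis implementation.

```python
import math

def train_bernoulli_nb(fps, labels, n_bits):
    """Fit per-bit Bernoulli likelihoods with Laplace smoothing.
    fps: list of sets of 'on' fingerprint bit indices; labels: 0/1."""
    counts = {0: [1] * n_bits, 1: [1] * n_bits}  # Laplace pseudo-counts
    totals = {0: 2, 1: 2}                        # 2 pseudo-examples per class
    for fp, y in zip(fps, labels):
        totals[y] += 1
        for b in fp:
            counts[y][b] += 1
    prior = {y: totals[y] / (totals[0] + totals[1]) for y in (0, 1)}
    p_bit = {y: [c / totals[y] for c in counts[y]] for y in (0, 1)}
    return prior, p_bit

def predict(fp, model, n_bits):
    """Return the class with the higher log-posterior for a fingerprint."""
    prior, p_bit = model
    def log_post(y):
        lp = math.log(prior[y])
        for b in range(n_bits):
            p = p_bit[y][b]
            lp += math.log(p if b in fp else 1.0 - p)
        return lp
    return max((0, 1), key=log_post)
```

In practice one such model would be fitted per target or per MedDRA term, with thousands of fingerprint bits rather than four.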
Josep Prous Jr. and David Aragones, Medicinal Chemistry, Prous Institute for Biomedical Research, Provenza 388, 08025 Barcelona, Spain, Fax: 34-93-4581535, firstname.lastname@example.org, Phone: 34-93-4592220
Over the past decade, and despite major advances in new technologies, the pharmaceutical sector has seen the number of new drugs introduced to the market each year stay level or decline, while the cost of drug discovery and development has increased significantly. The safety of drugs used in clinical practice is under constant scrutiny, and the withdrawal of several compounds in recent years confirms the serious productivity challenges faced by modern biomedical research. BioEpisteme, a knowledge-based project, was initiated to overcome these productivity bottlenecks and to contribute to the faster discovery of new and safer drugs, as well as the finding of new uses for known molecules. In-house developed datamining algorithms have led to a model that characterizes more than 400 different molecular mechanisms of action simultaneously. The development of the project and its application in explaining new therapeutic applications for angiotensin AT1 antagonists will be presented.
Thomas Barnes, Genomic Pharmacology, Gene Logic, Inc, 38 Sidney St., Cambridge, MA 02139, TBarnes@genelogic.com, Phone: 617-649-2034
Across the pharma and biotechnology industry, reduced hurdles in lead identification are resulting in the screening of druggable targets with weaker disease hypotheses, which will increase the risk and thus incidence of programs that fail in the intended therapeutic area due to lack of efficacy. Nevertheless, these activities will result in a set of chemical tools with which to probe target function and thereby link the corresponding compounds to new therapeutic utility. What is needed is a sufficiently high-throughput methodology for making de novo links between specific compounds and disease.
We have integrated a set of technologies that provide the means of efficiently associating compounds with potential new therapeutic utility. This is in stark contrast to the unsystematic and serendipitous observations that are classically relied upon to reveal alternative or new drug indications. The promise of these technologies is to expeditiously reduce pipeline gaps within a pharmaceutical industry whose growth is threatened by reduced (and increasingly costly) new product flow.
Justin Lamb, Broad Institute of MIT and Harvard, Seven Cambridge Center, Cambridge, MA 02142, email@example.com, Phone: 617-252-1522
Genome-wide transcriptional analysis provides a comprehensive molecular representation of cellular activity, suggesting that mRNA expression profiling could serve as a practical universal functional bioassay. High-throughput high-density gene-expression profiling solutions raise the possibility of capturing the consequences of small-molecule and genetic perturbations at library and genome scale, respectively, and associating these disparate perturbagens with each other and external organic phenotypes to discover decisive functional connections between drugs, genes and diseases. The talk will describe our technology platform, analysis methods and interpretive tools, and illustrate how our solution can be used to identify valuable new activities of bioactive small molecules, with particular emphasis on existing pharmaceuticals.
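The core idea of connecting perturbagens through expression profiles can be illustrated with a toy rank-based connectivity score. This is a much-simplified stand-in for the enrichment statistic such a platform actually uses; the function name and the linear scaling are invented for this example.

```python
def connectivity_score(ranked_genes, up, down):
    """Toy connectivity score in [-1, 1]: positive when the query's
    up-genes sit near the top of the reference ranking (most induced
    first) and its down-genes near the bottom; negative when reversed."""
    n = len(ranked_genes)
    pos = {g: i for i, g in enumerate(ranked_genes)}
    def scaled_mean(genes):
        # mean list position mapped onto [+1 (top) .. -1 (bottom)]
        m = sum(pos[g] for g in genes) / len(genes)
        return 1.0 - 2.0 * m / (n - 1)
    return (scaled_mean(up) - scaled_mean(down)) / 2.0
```

A query signature scored against a library of reference profiles would then be sorted by this value to surface drugs that mimic (high score) or oppose (low score) the queried state.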
Jordi Mestres, Chemotargets SL, Parc de Recerca Biomèdica (483.04), Doctor Aiguader 88, 08003 Barcelona CAT, Spain, Fax: +34 93 2240875, firstname.lastname@example.org, Phone: +34 93 2240882, and Tudor I. Oprea, Division of Biocomputing, University of New Mexico School of Medicine, MSC11 6145, University of New Mexico, Albuquerque, NM 87131, email@example.com, Phone: 505 272 6950
In modern drug discovery, it is no longer acceptable to test compounds synthesized within a hit or lead optimisation program against one primary target and a couple of anti-targets. Efforts towards the construction of annotated chemical libraries are connecting hundreds of thousands of compounds to hundreds of protein targets and thus highlight the need for novel integrative tools for the in silico pharmacological profiling of compounds, with potential applications from side-effect alert systems to drug repurposing. GAUDI is a tool designed to extract knowledge from the complex interaction space between small molecules (e.g., chemical genomics) and protein targets (e.g., proteomics). In its first release, it provides an integrative vista for navigating across WOMBAT, an annotated chemical library covering a chemical space of 190,000 molecules and a target space of over 1450 proteins. The integration between chemical and biological spaces is achieved by simultaneously combining chem- and bio-informatics tools for the classification of small molecules and target proteins, respectively. The result is a new generation of integrative datamining tools to extract knowledge from data stored in annotated chemical libraries.
M. Olah et al., WOMBAT and WOMBAT-PK. In: Chemical Biology; S.L. Schreiber, T.M. Kapoor, G. Wess, Eds.; Wiley-VCH: Weinheim, 2007; pp 760–786.
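At its simplest, navigating an annotated chemical library means indexing the compound-target-activity triples in both directions. The sketch below is invented for illustration (the tuple layout, names, and 1 µM cutoff are assumptions, not GAUDI's implementation).

```python
from collections import defaultdict

def build_annotation_index(annotations):
    """annotations: iterable of (compound, target, activity_nM) tuples;
    returns lookup tables in both directions across the bipartite space."""
    by_compound, by_target = defaultdict(list), defaultdict(list)
    for cpd, tgt, act in annotations:
        by_compound[cpd].append((tgt, act))
        by_target[tgt].append((cpd, act))
    return by_compound, by_target

def target_profile(by_compound, cpd, cutoff_nM=1000.0):
    """Targets a compound is annotated against below an activity cutoff,
    i.e. a minimal in silico pharmacological profile."""
    return sorted(t for t, a in by_compound.get(cpd, []) if a <= cutoff_nM)
```

The reverse index (`by_target`) supports the complementary query: all annotated ligands of a protein, which is the starting point for side-effect alerts and repurposing hypotheses.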
MJ Keiser1, Bryan L. Roth2, BN Armbruster2, P Ernsberger2, John J Irwin1, and Brian K. Shoichet1. (1) Department of Pharmaceutical Chemistry, University of California, San Francisco, 1700 4th Street, San Francisco, CA 94143, firstname.lastname@example.org, email@example.com, Phone: 415-514-4253, (2) National Institute of Mental Health Psychoactive Drug Screening Program and Department of Pharmacology, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599
We present a technique that quantitatively groups and relates proteins based on the chemical similarity of their ligands. Starting with 65,000 ligands annotated into sets for hundreds of drug targets, we computed a similarity score between each set using ligand topology. The significance of the resulting similarity scores, normalized using a statistical model, was expressed as a minimum spanning tree to map the sets together. Although these maps are connected solely by chemical similarity, biologically sensible clusters nevertheless emerged. Links among unexpected targets also emerged, among them that methadone, emetine and loperamide (Imodium) may antagonize muscarinic M3, alpha2 adrenergic and neurokinin NK2 receptors, respectively. These predictions were subsequently confirmed experimentally. Relating receptors by ligand chemistry organizes biology to reveal unexpected relationships that may be assayed using the ligands themselves. It has not escaped our notice that this approach may be useful for drug repurposing.
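The set-to-set comparison can be sketched as follows: treat each ligand as a set of fingerprint bits and sum all pairwise Tanimoto similarities between two ligand sets that clear a cutoff. This is a minimal sketch; the 0.57 default cutoff is illustrative, and the statistical normalization of raw scores into significance values (the step the abstract mentions) is omitted here.

```python
def tanimoto(a, b):
    """Tanimoto coefficient between two fingerprints given as bit-index sets."""
    inter = len(a & b)
    return inter / (len(a) + len(b) - inter) if inter else 0.0

def raw_set_score(set_a, set_b, threshold=0.57):
    """Sum all pairwise similarities at or above the cutoff between two
    ligand sets; a high raw score suggests the two targets are related."""
    score = 0.0
    for fa in set_a:
        for fb in set_b:
            t = tanimoto(fa, fb)
            if t >= threshold:
                score += t
    return score
```

Computed over every pair of target-annotated ligand sets, such scores supply the edge weights from which a minimum spanning tree of targets can be built.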
Fredric S. Young, Vicus Therapeutics, LLC, 55 Madison Avenue, Suite 400, Morristown, NJ 07960, firstname.lastname@example.org, Phone: 973-919-0549
We have developed a hypothesis-driven drug re-profiling approach to create a pipeline of product candidates in pre-clinical and clinical development. We identify combination therapies of marketed drugs that target reaction blocks associated with the central disease-causing processes, and we identify those central processes through a pattern classifier of homeostasis and pathology. The pattern classifier is derived from the set of flux invariants associated with a self-organized control state of an absorbing-state phase transition with multiple fluxes and multiple compartments. In our systems biology approach, biological systems are modeled using an object-process methodology with a top-down control analysis based on a universal flux control module. The roles of specific genes, proteins and drug targets are defined as a function of their place in the hierarchical network of flux modules that carry out specific disease-causing processes.
Fiona Macdonald1, David R. Lide2, and Robert Morris1. (1) Taylor and Francis/CRC Press, 6000 Broken Sound Parkway NW, Boca Raton, FL 33411, Fax: 561-998-2559, Fiona.email@example.com, Phone: 561-998-2564, (2) CRC Press, Editor, Gaithersburg, MD 20878
For more than 90 years the CRC Handbook has been a fixture on laboratory shelves. Its transition from print classic to interactive reference tool will be discussed, and plans for the future will be unveiled.
Andrea Twiss-Brooks, John Crerar Library, University of Chicago, 5730 S. Ellis Ave, Chicago, IL 60637-1403, firstname.lastname@example.org, Phone: 773-702-8777
Faculty, researchers, and students who use academic libraries are familiar and comfortable with print journals and books. Recent changes in scientific journal publishing have been embraced by users, and electronic-only access to journals is more or less accepted by the academic community. Scholarly books and monographs are viewed somewhat differently by faculty and students, and e-books have not yet become entrenched in quite the same way as e-journals have. Some of the challenges facing acceptance of e-books in academic libraries will be examined. Among the issues addressed will be search and discovery, integration of e-books into the research process, and collection management.
Stephen A. Koch, Department of Chemistry, State University of New York at Stony Brook, Stony Brook, NY 11794-3400, Fax: 631-632-7960, Stephen.Koch@sunysb.edu, Phone: 631-632-7944
The 1997 discovery that hydrogenase enzymes have cyanide as native ligands for iron in their active sites led the author to begin research in the area of iron-cyanide chemistry. The fact that this area has been under active investigation for more than 300 years provided interesting challenges when it came to doing literature searches of previous work. The recent availability of backfiles of digitized/searchable journals and the even more recent availability of digitized/searchable chemistry books have greatly aided this effort. Most importantly, reading and understanding the early work in the area has had a direct effect on our current research direction. As an added bonus, the ability to integrate 18th and 19th century chemistry and chemists with my research results has made my lectures on my research work much more interesting and enjoyable. My approach to using digitized 18th and 19th century books and journals will be presented.
Caroline F Wain, Marketing and Sales, Royal Society of Chemistry Publishing, Thomas Graham House, Science Park, Milton Road, Cambridge, United Kingdom, email@example.com, Phone: +44 1223-420-0066
eBook platforms provide a service to both the scientist and the librarian. The publisher is challenged with delivering functionality to meet the requirements of both customer groups, considering their needs at each stage of the product development process.
Rustin Kimball1, Gary Ives2, and Kathy M Jackson1. (1) Reference Department, Evans Library, Texas A&M University, 5000 TAMU, College Station, TX 77843-5000, Fax: 979-458-0112, firstname.lastname@example.org, email@example.com, Phone: 979-862-1909, (2) Acquisitions Department, Evans Library, Texas A&M University, College Station, TX 77843-5000
While the availability of e-books generally is met with great enthusiasm from college students, providing library access to e-books from a variety of vendors (whose platforms often are very different) presents many challenges to libraries. The Texas A&M University Libraries provide access to large NetLibrary and Ebrary collections, both of which include many chemistry-related titles. In addition, we purchased the electronic reference books from Wiley and Elsevier a year ago. We also offer Knovel, CHEMnetBase, ENGnetBase, and Safari. Recently, our library placed an order for all of the Springer electronic books as well. Currently, our library offers over 60,000 electronic books. This presentation will discuss the different types of electronic books, and will cover the acquisitions and service issues for each type. We will compare the usage figures for electronic books in chemistry and related sciences to the circulation figures for print books in those cases in which we have both the electronic and print book. We will discuss the reaction of users to these collections, as well as the methods employed by science librarians to publicize their availability.
Michael S. Saporito1, Christopher A. Lipinski2, Alexander Ochman1, Dana Koemer1, Jan Batten1, and Andrew Reaume1. (1) Melior Discovery, Inc, 860 Springdale Drive, Exton, PA 19341, firstname.lastname@example.org, Phone: 610-280-0633, (2) Scientific Advisor, Melior Discovery, Waterford, CT 06385-4122
Drug repositioning is increasingly recognized as an effective strategy to uncover new therapeutics with reduced developmental risk. Melior Discovery has a unique repositioning approach involving a platform comprised of 35 in vivo models representing diverse therapeutic areas. The power of this platform is illustrated by our lead compound, MLR-1023. This compound, originally developed for ulcers, exhibits robust activity in a panel of clinically relevant models of type II diabetes. For example, when compared to metformin in acute studies, MLR-1023 produced an equivalent glucose lowering response at a significantly lower dose. In comparison to rosiglitazone in chronic studies, MLR-1023 exhibited equivalent efficacy without promoting weight gain. Of importance was the identification of a previously unknown molecular target for type II diabetes. This example of Melior Discovery's approach demonstrates the potential for capturing new indications from existing molecules, and the potential for expanding our understanding of the underlying biological basis of disease.
Christian Laggner1, Lyubomir G. Nashev2, Daniela Schuster1, Thierry Langer1, and Alex Odermatt2. (1) Department of Pharmaceutical Chemistry, Computer Aided Molecular Design Group, University of Innsbruck, Institute of Pharmacy, Innrain 52c, Innsbruck A-6020, Austria, Fax: +43-512-507-5269, Christian.Laggner@uibk.ac.at, Phone: +43-512-507-5268, (2) Institute of Molecular and Systemic Toxicology, Department of Pharmaceutical Sciences, University of Basel, Basel 4056, Switzerland
The accumulated exposure to naturally occurring compounds, drugs, consumer products, and industrial chemicals that disturb endocrine functions may cause serious health problems, such as sexual and behavioural disorders, asthmatic and allergic diseases, as well as certain forms of cancer. We present a chemical library of compounds with suspected endocrine disrupting effects that is suitable for different virtual screening approaches, thus facilitating the identification of potential targets of endocrine disruptors. Names and CAS numbers for over 143,000 substances related to effects on the endocrine system were taken from the publicly available Endocrine Disruptor Priority Setting Database and were used to retrieve the corresponding chemical structures from the PubChem Project, a rapidly growing collection of chemical information from a variety of sources. The combined entries were filtered for errors before constructing our final screening database. The wide applicability of this library underlines the power and usefulness of publicly available chemical information.
Artem Cherkasov, Division of Infectious Diseases, University of British Columbia, 2733 Heather Str, Vancouver, BC V5Z 3J5, Canada, Fax: 604-875-4013, email@example.com, Phone: 604-875-4588
Emergence of new infections is an increasing public health threat. The problem is that conventional antibiotic development is time-consuming, not very efficient, and expensive. Moreover, current legal regulations require years of rigorous study before a new antibiotic can enter the public sector. It is increasingly evident that such a methodology does not keep pace with emerging and re-emerging infections. As a partial but very rapid solution to this challenge, we propose to identify established therapeutics, with already-approved toxicity and bioavailability properties, that also exhibit sufficient activity against novel and re-emerging human pathogens. To assist such discoveries we have developed several QSAR approaches, such as quantitative models of 'antibiotic-likeness' and 'bacterial-metabolite-likeness', enabling accurate recognition of antimicrobial substances in large collections of chemical structures. The developed models were able to relate several drugs from the Merck database (with no antimicrobial annotation) to predicted antimicrobial action, which was later confirmed by other literature sources.
Theodora M. Steindl1, Daniela Schuster2, Johannes Kirchmair3, Remy Hoffmann4, Christian Laggner2, Gerhard Wolber3, and Thierry Langer2. (1) Computer-Aided Molecular Design Group, University of Innsbruck, Innrain 52c, Innsbruck A-6020, Austria, Fax: +43-512-507-5269, Theodora.Steindl@uibk.ac.at, Phone: +43-512-507-5264, (2) Department of Pharmaceutical Chemistry, Computer Aided Molecular Design Group, University of Innsbruck, Institute of Pharmacy, Innrain 52c, Innsbruck A-6020, Austria, Fax: +43-1-8174955-1371, firstname.lastname@example.org, Phone: +43-699-1507-5252, (3) Inte:Ligand GmbH, 2344 Maria Enzersdorf, Austria, (4) Accelrys, Orsay 91898, France
3D pharmacophore-based parallel screening is introduced as an in silico method to predict the potential biological activities of potential drug molecules. This study presents an application example employing a Pipeline Pilot-based screening platform and a collection of structure-based pharmacophore models built using the LigandScout software for automatic large-scale virtual activity profiling. An extensive set of HIV protease inhibitor pharmacophore models was used to screen different test sets consisting of active and inactive compounds. In addition, we investigated whether a parallel screening system can differentiate between similar molecules, or molecules acting on closely related proteins; we therefore incorporated a collection of other protease inhibitors, including aspartic protease inhibitors. The results of the screening experiments show a clear trend towards an enhanced signal-to-noise ratio (true positives/false positives and true negatives/false negatives).
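The signal-to-noise assessment referred to above amounts to standard confusion-matrix bookkeeping over the screened test sets. A minimal sketch, with invented identifiers, might look like:

```python
def confusion_counts(flagged, actives, inactives):
    """Confusion-matrix counts for a virtual screen: 'flagged' is the
    set of compound IDs retrieved by the pharmacophore models.
    Returns (tp, fp, tn, fn)."""
    tp = len(flagged & actives)
    fp = len(flagged & inactives)
    return tp, fp, len(inactives) - fp, len(actives) - tp

def enrichment_factor(tp, fp, n_actives, n_total):
    """Hit rate among retrieved compounds relative to random selection;
    values well above 1 indicate a useful signal-to-noise ratio."""
    selected = tp + fp
    return 0.0 if selected == 0 else (tp / selected) / (n_actives / n_total)
```

Reporting tp/fp and tn/fn ratios per model, as the abstract describes, falls directly out of these counts.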
A. W. Edith Chan, BioFocusDPI, Commonwealth House, 1 New Oxford Street, WC1A 1NU London, United Kingdom, Fax: +44 (0) 207 074 4700, email@example.com, Phone: +44 (0) 207 074 4642, and John P Overington, BioFocusDPI, London WC1A 1NU, United Kingdom
The concept of finding new uses for known drugs represents a significantly lower-risk commercial strategy compared to developing New Chemical Entities (NCEs). There are two general approaches to expanding clinical utility for a known drug: 1) predicting new indications for the compound through the known molecular target and pathway, and 2) predicting new targets (and then new indications) for a drug. Both approaches rely crucially on the integration of multiple information sources, but on fundamentally different methods for their implementation. In this presentation, we outline our approach of building a series of highly normalized pharmacology databases and then applying them to the problem of predicting the primary or alternate molecular targets of a series of known drugs; several of our approaches use these databases, along with a series of target-sequence and compound-structure similarity calculations, to predict likely alternate targets or bioactivities for a compound. Secondly, we outline the application of these databases to a series of clinical microarray datasets. Finally, some results from a large-scale prediction on a collection of 'historical' drug candidates will be shown.
J. Maki, Vicus Therapeutics, LLC, 55 Madison Avenue, Suite 400, Morristown, NJ 07960, firstname.lastname@example.org, Phone: 973-919-0549
The combination of drug reprofiling and clinical trial offshoring offers dramatic improvements in the cost, speed, and risk of clinical development of drug products. We will provide a case study of our FDA-sanctioned Phase 2 clinical trial, being conducted in the US and India, of our reprofiled drug product for cancer cachexia. Cancer cachexia is a catastrophic wasting disorder associated with advanced cancer for which there is no FDA-approved therapy. We will highlight the unique synergistic advantages of reprofiling and offshoring and the steps necessary to realize such advantages. In addition, we will review recent changes in the DCGI (the Indian FDA-regulatory equivalent), the Indian clinical research infrastructure, and the acceptance of Indian data by the FDA that are driving the explosive growth of clinical research in India.
Akinori Mochizuki, Sosei, 4F Ichiban-cho FS Bldg, 8 Ichiban-cho, Chiyoda-ku, Tokyo 102-0082, Japan, Fax: +81 (0)3 5210 3291, email@example.com, Phone: +81 (0)80 3469 1998
Although significant advances have been achieved in various toxicological predictions, the attrition rate of drug development up to Phase II appears to have remained at the same level over the past 20 years.
Sosei's approach to reducing the risk of failure in development, demonstrated by proof-of-concept (POC) studies in humans, is to utilise compounds that are already known to be well tolerated in humans. Sosei collects such compounds and incorporates them into a unique compound library. Together with technology-based biotechnology companies as alliance partners, Sosei applies various technologies to the library to unlock hidden pharmacology. Developing these once-halted compounds for new uses carries a predicted risk of failure lower than that of new chemical entities.
In addition to this unique strategy, we carry out various reprofiling approaches, including conventional medicinal chemistry methods and formulation development on existing drugs, in order to mitigate the risks in discovery and development while concurrently maximising business opportunities and revenues.
David Cavalla, Cambridge, England, firstname.lastname@example.org, Phone: +44-1223-858577
Therapeutic switching, the discovery and development of secondary uses for existing drugs, has three substantial advantages in terms of reduced risk, cost and time. Together with the opportunity created by new therapeutic use patents, this represents a highly efficient route to commercially protected new medicines. There are multiple classes of such programs, depending on whether the composition-of-matter patent supporting the original development is still in force, and whether the active ingredient was ever successfully developed for its original indication. However, the potential for off-label competition from generic products needs to be carefully considered in order to realise the full potential of this approach. Case histories will be presented, including examples in the fields of fibrosis and cachexia. These highlight (i) the value of new biology and (ii) the importance of differentiation among an ostensibly similar class of agents to identify improved non-genericisable therapeutics.
Sasha Issac Gurke, Product Development, Knovel Corporation, 13 Eaton Avenue, Norwich, NY 13815, Fax: 607-337-5090, email@example.com, Phone: 607-337-5600
To ensure cross-searchability and consistency in presentation of interactive tables, chemical names and property data are normalized as e-books are loaded into Knovel Library. The challenges and techniques used for normalization are discussed.
Beth Thomsett-Scott, Reference and Information Services, University of North Texas Libraries, P.O. Box 305190, Denton, TX 76226, Fax: 940-565-3695, firstname.lastname@example.org, Phone: 940-369-6437
Electronic journals were widely and rapidly accepted by most faculty and students in chemistry. This paper will examine the trends in usage between e-books and e-journals in chemistry at one university to see if chemistry students and faculty have adopted e-books as quickly as they adopted e-journals. In addition, usage statistics for e-books will be compared to those of their print counterparts. Results will be presented and conclusions discussed with thoughts for the future.
Michael Forster, STM Books, John Wiley & Sons, Inc, 111 River Street, Hoboken, NJ 07030, email@example.com, Phone: 201-748-7699
The benefits and advantages of the electronic medium are perhaps greater in the field of chemistry for information that is currently published in the form of print books than in any other discipline. However, the challenges faced by publishers and their customers in delivering these benefits to users are not trivial - and are posed by issues of technology, user behavior, and the marketplace, to name a few. This presentation will provide a brief look at the current market in eBooks, look at some specific issues that exist with respect to chemistry as content matter, and then examine the issues that affect publishers, authors, and the customers and users who make up this community of interest. Some possible future developments and trends for the short and medium term will also be identified, as well as their associated limiting factors.
Meghan Lafferty, Science & Engineering Library, University of Minnesota, 108 Walter Library, 117 Pleasant St SE, Minneapolis, MN 55455, Fax: 612-625-5583, firstname.lastname@example.org, Phone: 612-624-9399
An increasing number of classic reference works in chemistry long available in book form are now also available online. While undeniably more convenient to our users, some of these works take better advantage of the unique features of the electronic medium than others. I will examine how well a variety of chemistry-related reference works have been converted into online versions. Some of the works I will compare include the Kirk-Othmer Encyclopedia of Chemical Technology, the Merck Index, chemistry reference books in Knovel, and CHEMnetBASE and other netBASE products. I will address the following questions and make recommendations. What features offer an improvement over the print? What features send users to the print versions unless they have no other choice? Which works are the best examples of truly transformed books and why?
Martin Grigorov, BioInformatics, Nestlé Research Center, PO Box 44, Canton de Vaud, Lausanne 1000, Switzerland, Fax: +41 21 785 94 86, email@example.com, Phone: +41 21 785 89 39
There is emerging evidence that real-world datasets are statistically self-similar and thus fractal. In this work I investigate some global topological properties of representations of chemical libraries in spaces defined by molecular descriptors. New algorithms are developed and used in this work, such as dimension reduction of chemical data sets by singular value decomposition and the introduction of the correlation dimension as a natural dimension of a chemical space. It is shown that the representations of molecular data sets in chemical spaces possess self-similar properties characteristic of fractal objects. This important insight allows for a compact statistical description of the datasets as well as for the inference of the number of chemically similar structures existing in the vicinity of any member of such a fractal set.
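The correlation dimension can be estimated in the standard Grassberger-Procaccia fashion: count point pairs closer than a radius r and take the slope of log C(r) against log r. The sketch below is a generic textbook version, not the author's algorithm; the two radii must be chosen inside the scaling region (and small enough that C(r) is still nonzero).

```python
import math

def correlation_integral(points, r):
    """Fraction of distinct point pairs separated by less than r."""
    n = len(points)
    close = sum(1 for i in range(n) for j in range(i + 1, n)
                if math.dist(points[i], points[j]) < r)
    return 2.0 * close / (n * (n - 1))

def correlation_dimension(points, r1, r2):
    """Two-radius slope estimate d ~ d log C(r) / d log r; for a uniform
    d-dimensional cloud it approaches d, while a fractal subset of the
    descriptor space yields a non-integer value."""
    c1 = correlation_integral(points, r1)
    c2 = correlation_integral(points, r2)
    return (math.log(c2) - math.log(c1)) / (math.log(r2) - math.log(r1))
```

Applied to a descriptor matrix after SVD reduction, a correlation dimension well below the number of descriptors is the signature of self-similarity the abstract describes.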
Dora Schnur, Computer-Assisted Drug Design, Bristol-Myers Squibb, Pharmaceutical Research Institute, P.O. Box 5400, Princeton, NJ 08543-5400, Phone: 609-818-4004, and Cullen L. Cavallaro, Pharmacopeia, Inc
Diversity has historically played a critical role in the design of combinatorial libraries, screening sets and corporate collections used for lead discovery. Large library design in the 1990s ranged from arbitrary through property-based reagent selection to product-based approaches. Over time, however, there has been a downward trend in library size as information about the desired targets increased due to the genomics revolution and the increasing availability of target protein structures from crystallography and homology modeling. Concurrently, computing grids and CPU clusters have facilitated the development of structure-based tools that screen hundreds of thousands of molecules. Smaller “smarter” combinatorial and focused parallel libraries have replaced those unfocused large libraries in the twenty-first-century drug design paradigm. While diversity still plays a role in lead discovery, target-family and target-specific approaches dominate current efforts in library design. This talk will highlight these library design trends and explore the use of software developed by R. Pearlman for sparse matrix library design.
Brian B. Masek1, Roman Dorfman2, Karl M. Smith1, and Robert D. Clark3. (1) Tripos, Inc, 1699 S. Hanley Rd., St. Louis, MO 63144, Fax: (314)-951-3409, firstname.lastname@example.org, Phone: (314)-951-3409, (2) Informatics Research Center, Tripos, Inc, St. Louis, MO 63144, (3) Informatics Research Center, Tripos, St. Louis, MO 63144
Very rapid conformational sampling is critically important in many areas of computer-aided drug design. We have developed an alternative approach wherein a selected force field is used to minimize randomized conformations of a drug-like training set of molecular structures. Torsional profiles characteristic of each type of bond in the training set are then extracted from these conformations. We have extended this method to encompass the treatment of flexible rings and the inversion of pyramidal nitrogens. The conformations produced are biochemically relevant, as indicated by their ability to efficiently reproduce ligand conformations found in X-ray crystal structures.
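The torsional profiles described above are built from dihedral angles measured in minimized conformations. As a self-contained illustration (not the authors' code), the standard computation of a single torsion angle from four atomic positions is:

```python
import numpy as np

def torsion_angle(p0, p1, p2, p3):
    """Signed dihedral angle (degrees) about the p1-p2 bond."""
    b0 = p0 - p1
    b1 = p2 - p1
    b2 = p3 - p2
    b1 = b1 / np.linalg.norm(b1)
    # components of b0 and b2 perpendicular to the central bond
    v = b0 - np.dot(b0, b1) * b1
    w = b2 - np.dot(b2, b1) * b1
    return np.degrees(np.arctan2(np.dot(np.cross(b1, v), w), np.dot(v, w)))

# four atoms in an anti (trans) arrangement about the central bond
p = [np.array(a, dtype=float) for a in
     [(1, 1, 0), (0, 1, 0), (0, 0, 0), (-1, 0, 0)]]
print(torsion_angle(*p))  # 180.0
```

Collecting such angles over many force-field-minimized conformations, binned by bond type, yields the torsional profiles the abstract refers to.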
Uta Lessel, Department of Lead Discovery, Boehringer Ingelheim Pharma GmbH & Co. KG, Biberach an der Riss 88397, Germany, Fax: +49/7351/83-3062, Uta.Lessel@bc.boehringer-ingelheim.com, Phone: +49/7351/54-3062
This presentation shows how to apply DiverseSolutions and BCUT descriptors to ligand-based virtual screening. The results are compared with enrichments produced by other ligand-based virtual screening techniques, such as Daylight fingerprints and Feature Trees.
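Enrichment, the figure of merit used to compare such screening techniques, is conventionally the hit rate in the top-ranked fraction of the database divided by the hit rate expected from random selection. A small illustrative implementation (toy data, not from the study):

```python
def enrichment_factor(scores, labels, fraction=0.01):
    """EF at the top `fraction` of a score-ranked database:
    hit rate in the top slice divided by the overall hit rate."""
    ranked = sorted(zip(scores, labels), key=lambda t: t[0], reverse=True)
    n_top = max(1, round(len(ranked) * fraction))
    hits_top = sum(label for _, label in ranked[:n_top])
    return (hits_top / n_top) / (sum(labels) / len(ranked))

# toy screen: 10 compounds, 4 actives (label 1), actives ranked near the top
scores = [0.95, 0.90, 0.81, 0.77, 0.60, 0.55, 0.40, 0.30, 0.20, 0.10]
labels = [1, 1, 0, 1, 0, 0, 0, 0, 0, 1]
print(enrichment_factor(scores, labels, fraction=0.2))  # 2.5
```

An EF of 1.0 means the method does no better than random; the maximum attainable EF at a given fraction is bounded by 1/fraction.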
S. Stanley Young1, Atina D. Brooks1, William Welch1, Morteza G. Khaledi2, Douglas Hawkins1, Kirtesh Patil1, Gary W. Howell1, Raymond T. Ng1, Moody T. Chu1, and Jacquline M. Hughes-Oliver1. (1) NISS, PO Box 14006, Research Triangle Park, NC 27709, Fax: 919 685 9300, email@example.com, Phone: 919 685 9328, (2) Department of Chemistry, North Carolina State University, Raleigh, NC 27695
ChemModLab is a free, web-based toolbox for fitting and assessing quantitative structure-activity relationships (QSARs). Its elements include a cheminformatic front end to supply molecular descriptors, a set of statistical methods for fitting models, and methods for validating the resulting models. Input is an SD file for compounds and a text file for biological activity, or users can directly input their own descriptors (keeping compound structures confidential). Submitted data sets can be made public or kept private. Five types of descriptors are available and twelve different statistical methodologies are included, largely from the R platform. As promising new QSAR methodologies emerge from the statistical and data-mining communities, they will be incorporated into ChemModLab. The Web site also incorporates links to public data sets. The capabilities of ChemModLab are illustrated using a variety of data sets. Predictive quality varies greatly with the choice of descriptor and modeling method.
Jeffrey M. Skell, Genzyme, DMPK & Pharmaceutics, 153 Second Ave., Waltham, MA 02451, Fax: 508-661-8517, Jeffrey.Skell@genzyme.com, Phone: (781) 434-3601
Many current discovery programs incorporate pharmaceutics properties (e.g., solubility, permeability, and lipophilicity) into their hit identification and lead optimization testing cascades. However, the uses of these properties are often limited to a pass/fail criterion. Upon candidate nomination, a broader range of laboratory activities are initiated, including physical (salt/crystal) form selection and formulation development. The development scientist charged with advancing an optimized formulation of the nominated compound often has little or no control over the single most critical aspect of his charge: the molecular entity. Recently introduced screening techniques attempt to address this deficiency by incorporating standardized physical form and formulation tests into the lead optimization process, at the cost of significantly increasing compound requirements before candidate nomination. This presentation will explore the concepts implemented in software tools for modeling solution-phase properties, their ability to address solid-phase properties, and their impact on bridging the gap between discovery data and development decisions.
Andrew Rusinko III, Alcon Laboratories, Inc, 6201 S. Freeway, Ft. Worth, TX 76134, Fax: 817-302-3701, Phone: 817-551-8140, and Karl M. Smith, Tripos, Inc, St. Louis, MO 63144
The exploration and development of search systems based on the three-dimensional (3D) structure of a molecule, and not just its connection table, represented a major paradigm shift in cheminformatics and molecular modeling in the late 1980s and early 1990s. Techniques such as molecular surface area and volume calculations, 3D-pharmacophore and shape search, as well as docking studies required accurate small-molecule molecular geometries as a starting point. Since the primary source of large collections of structures at the time was corporate databases, a method was needed to automatically produce reasonable geometries of drug-like molecules quickly from corporate collections. The computer program CONCORD was developed to “rapidly generate high-quality approximate molecular coordinates.” This presentation traces the origins of CONCORD, the impact it had on early 3D search systems, and the current status of this classic program, some 20 years later.
Eugene L. Stewart, Peter J. Brown, James A. Bentley, and Timothy M. Willson, Computational, Analytical, and Structural Sciences, GlaxoSmithKline, Five Moore Drive, Research Triangle Park, NC 27709, Fax: 919-315-0430, Eugene.L.Stewart@gsk.com, Phone: 919-483-0152
We illustrate the use of DiverseSolutions (DVS) in its application to a problem of pharmaceutical interest: target class-directed compound selection and synthesis. We will present the use of DVS in the establishment of a chemistry space for the nuclear receptor (NR) target class and the application of this space in the selection of compounds for screening against orphans within this family of receptors. We will also present the results of a prospective validation study demonstrating the utility of these methods and the effectiveness of the chemistry space in this instance. Lastly, we will discuss techniques and a general workflow for applying the NR-directed chemistry space to the selection of monomers and compounds for synthesis that meet the appropriate target-class criteria.
James R. Damewood and Charles L. Lerman, CNS Chemistry, AstraZeneca, 1800 Concord Pike, Wilmington, DE 19850, Fax: 302-886-5792, Phone: 302-886-5792
A major activity in the design phase of drug discovery involves generating viable ideas of what to make next. NovoFLAP is a ligand-based computer-aided design (CAD) approach that generates new, medicinally relevant ideas starting from compounds known to be active at a biological target of interest. NovoFLAP combines the evolutionary de novo design capabilities of EA-Inventor with FLAP, a robust, ligand-based scoring algorithm. Specific examples of how NovoFLAP has been used to successfully design new and interesting ideas in drug discovery programs will be presented.
R. S. Pearlman1, Yubin Wu1, Karl M. Smith2, and Brian B. Masek2. (1) Laboratory for the Development of CADD Software, University of Texas, College of Pharmacy, Austin, TX 78712, firstname.lastname@example.org, Phone: 512-471-3383, (2) Tripos, Inc, St. Louis, MO 63144
Traditional cheminformatics technologies were designed to address the traditional needs of (1) identifying chemical compounds and (2) associating experimentally derived information with those compounds. This presentation will address the evolving needs of (3) identifying the various structures – 2D protomers, 2.5D proto-stereomers, and 3D proto-stereo-conformers – which individual chemical compounds can and do exhibit in various natural environments (e.g., crystal, solvent, membrane, receptor, etc.) and (4) associating computationally derived information with those structures and the corresponding compounds. In particular, we will address the typically unappreciated consequences which protonation and tautomerization equilibria can have upon both atom-centered and bond-centered chiralities of proto-invertible chiral centers. We also need (5) a robust method to associate any given structure with its corresponding, canonically identified compound. This presentation will discuss algorithms and software tools which address these needs. We will also suggest a “bio-activity-oriented” hierarchical approach for the management of both experimentally and computationally derived chemical information.
Barun Bhhatarai, Department of Chemistry, Clarkson University, 8 Clarkson Avenue, Potsdam, NY 13699-5810, email@example.com, Phone: 315-268-2357, and Rajni Garg, Department of Chemistry & Biochemistry, California State University San Marcos, San Marcos, CA 92096
A study of mutants associated with Indinavir and its related congeners was performed and the results analyzed using a cheminformatics approach. Building on our previous understanding of ‘different pocket sizes for different mutants', this study aimed to explore the effect of substituent binding on three major pockets of HIV protease, viz. P1', P2, and P3. The information obtained was used to design effective substituents that can be combined with other novel pharmacophore(s) to generate new leads. Different mutant variants such as K60C, V18C, NL4-3 (a molecularly cloned strain), 4X, and Q60C, as well as the wild type, were considered, and their binding patterns in relation to IC50 and CIC95 data were studied. A maximum of 36 data points for each mutant position, targeting a particular viral pocket, were retrieved from the literature, giving a total of 36×5 data points for each biological activity. Quantitative statistical relationships were developed using various descriptors and regression techniques. It is anticipated that the results of this study will aid the development of efficient chemical probes/leads by evolution of existing examples.
Raghava Chaitanya Kasara, Chemistry Department, Clarkson University, 8 Clarkson Avenue, Potsdam, NY 13699, Fax: 315-265-6610, firstname.lastname@example.org, Phone: 315-265-2357, and Rajni Garg, Department of Chemistry & Biochemistry, California State University San Marcos, San Marcos, CA 92096
The goal of this comparative study was to predict the pharmacodynamic profiles of the C2-symmetric HIV-1 protease inhibitors published by Kempf et al. To understand the binding patterns of the drug molecules at the receptor site, QSAR studies were performed using various statistical analyses. A large dataset was compiled and QSAR models of pharmacodynamic endpoints were developed. These models indicate that important physicochemical parameters play a vital role in the binding interaction at the receptor site (viral protease). They have the potential to be used as an in silico virtual screening tool for predicting the pharmacodynamic profiles of HIV-1 protease inhibitors.
Ramanathan Natarajan1, Subhash C. Basak1, Douglas M. Hawkins2, and Jessica Karaker3. (1) Center for Water and the Environment, Natural Resources Research Institute, University of Minnesota, 5013 Miller Trunk Highway, Duluth, MN 55811, Fax: 218-720-4328, email@example.com, Phone: 218-720-4342, (2) School of Statistics, University of Minnesota, Minneapolis, MN 55455, (3) Department of Mathematics, University of Wisconsin-Eau Claire, Eau Claire, WI 54702-4004
In QSAR modeling of the properties/bioactivities of chemicals using calculated molecular descriptors, we are faced with the usual problem of “few compounds and many descriptors.” Hence, variable-selection (descriptor-thinning) methods are used to select a proper subset of descriptors to develop QSAR models. It is vital to incorporate the descriptor selection, as well as any parameter selection, within the modeling procedure that is cross-validated for assessment of the model. When the cross-validation step does not include all such elements of the modeling procedure, the “naïve q2” thus estimated suffers from an upward bias. Application of proper cross-validation that includes descriptor thinning is necessary for developing QSAR models with good predictive ability. The importance of embedding descriptor selection as well as parameter selection inside the cross-validation step, resulting in calculation of the “true q2”, is highlighted by a comparison of true q2 with naïve q2 for several sets of compounds.
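The bias described above can be reproduced on synthetic data: when the response is pure noise, descriptor thinning done once on the full data yields an optimistically high naïve q2, while thinning repeated inside every leave-one-out fold yields the honest true q2. The following is a sketch with hypothetical dimensions, not the authors' code:

```python
import numpy as np

rng = np.random.default_rng(42)
n, p, k = 60, 500, 5              # "few compounds and many descriptors"
X = rng.normal(size=(n, p))       # random descriptors
y = rng.normal(size=n)            # pure noise: there is no real signal

def top_k_by_correlation(X, y, k):
    """Descriptor thinning: keep the k descriptors most correlated with y."""
    Xc, yc = X - X.mean(0), y - y.mean()
    r = (Xc * yc[:, None]).sum(0) / (np.linalg.norm(Xc, axis=0) * np.linalg.norm(yc))
    return np.argsort(-np.abs(r))[:k]

def q2_loo(X, y, select_inside):
    """Leave-one-out q2, with descriptor selection done once on the full
    data (naive) or redone inside every fold (true)."""
    cols = None if select_inside else top_k_by_correlation(X, y, k)
    press = 0.0
    for i in range(len(y)):
        train = np.delete(np.arange(len(y)), i)
        c = top_k_by_correlation(X[train], y[train], k) if select_inside else cols
        A = np.column_stack([X[train][:, c], np.ones(len(train))])
        coef, *_ = np.linalg.lstsq(A, y[train], rcond=None)
        press += (y[i] - np.concatenate([X[i, c], [1.0]]) @ coef) ** 2
    return 1.0 - press / np.sum((y - y.mean()) ** 2)

naive_q2 = q2_loo(X, y, select_inside=False)
true_q2 = q2_loo(X, y, select_inside=True)
print(f"naive q2 = {naive_q2:.2f}  true q2 = {true_q2:.2f}")
```

Because the descriptors carry no information about y, an honest validation should report q2 near (or below) zero; the naïve protocol reports an apparently predictive model anyway, because the held-out compound already influenced the descriptor selection.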
Dazhi Jiao, School of Informatics, Indiana University at Bloomington, Wells Library 043, Bloomington, IN 47408, firstname.lastname@example.org, Phone: 812-856-0089
A chatbot is a computer program designed to interact with users through intelligent conversation. AIML, the Artificial Intelligence Markup Language, is a technology commonly used in developing chatbots. In this poster, I propose chatting as an interface for scientists to retrieve chemical information and perform scientific computations. Chatbot technologies such as ALICE and AIML will be introduced. I will also discuss a prototype of an AIML-based chatbot that can be used to access information in PubChem and other chemical databases through web services.
Muhammed Yousufuddin1, Dimitris Dimitropoulos2, Zukang Feng1, Jeramia Ory1, Hyunmi Sun1, John Westbrook1, Kim Henrick2, and Helen Berman1. (1) Rutgers, The State University of New Jersey, Research Collaboratory for Structural Bioinformatics Protein Data Bank, Department of Chemistry, 610 Taylor Rd, Piscataway, NJ 08854, email@example.com, Phone: 732-445-0103, (2) EMBL Outstation-Hinxton, MSD-EBI, Cambridge, United Kingdom
The RCSB PDB, in collaboration with the MSD-EBI, has developed and released a new and expanded chemical component dictionary. This new dictionary, which now contains around 8000 small molecules, has been improved by removing redundant ligands such as SUL, correcting any valence errors, and providing IUPAC atom labeling for standard amino acids and nucleotides. In addition, the new dictionary contains many additional data items such as stereochemical assignments, idealized coordinates, and SMILES strings.
The contents of this new chemical component dictionary have been used to remediate the entire PDB archive, which currently contains over 42,000 entries. The detailed annotation of small molecules in the archive makes greater integration with cheminformatics databases and pharmaceutical applications possible.
The wwPDB is accessible from www.wwpdb.org. We acknowledge the support of our funding agencies: RCSB PDB (NSF, NIGMS, DOE, NLM, NCI, NCRR, NIBIB, NINDS) and MSD-EBI (Wellcome Trust, EU, CCP4, BBSRC, MRC and EMBL).
Paul N Mortenson1, Miles S. Congreve2, and Christopher W. Murray2. (1) Computational Chemistry Group, Astex Therapeutics, 436 Cambridge Science Park, Milton Road, Cambridge CB4 0QA, United Kingdom, Fax: +44 1223 226201, firstname.lastname@example.org, Phone: +44 1223 435014, (2) Astex Therapeutics, Cambridge CB4 0QA, United Kingdom
Medicinal chemistry programs frequently produce molecules that are potent but relatively insoluble in water. Such compounds present a range of problems, but most importantly they are unlikely to be suitable candidates for oral (or intravenous) administration. A common solution to these problems is the addition of a polar solubilising group to the molecule. We present here the results of a systematic analysis of solubilising groups present in marketed oral drugs. Proprietary software tools have been written and used to automatically extract these groups from a database of oral drugs, as well as from a larger database of advanced drug candidates. A design tool has also been created that allows chemists to create virtual libraries of solubilised molecules, starting from a template that they sketch. Appropriate solubilising groups are virtually attached to the template, and the library thus created can then be further profiled, for example by docking.
Rajni Garg, Department of Chemistry & Biochemistry, California State University San Marcos, 333 S. Twin Oaks Valley Rd., San Marcos, CA 92096, Phone: 760-750-8069, Srinivas Alla Reddy, Organic Division I, Indian Institute of Chemical Technology, Hyderabad 500 007, India, Xiaoyu Zhang, Department of Computer Science, California State University San Marcos, San Marcos, CA 92096, and Ahmad R Hadeagh, Computer Science, California State University San Marcos, San Marcos, CA 92096
Identification of mutational patterns, and the design of smart HIV drugs that remain active even after mutation occurs, is a challenging problem. Continuing our interest in this area, we have developed a database of HIV protease proteins (MUT-HIV). The MUT-HIV database contains information on both wild-type and mutated HIV proteases. All crystal structures of HIV proteases deposited in the PDB were extracted. Details of mutated amino acids, along with other properties such as crystallization conditions and bound inhibitors, are stored in the database. Several physical, chemical, and electronic properties of the bound inhibitors and binding pockets are calculated and organized in the database. The information obtained from the mutation patterns will be correlated with the inhibitors' descriptors. This database will be a valuable tool for predicting the types of mutations that can occur for newly designed inhibitors and for designing inhibitors in a multi-target approach.
Tim Dudgeon, Petr Hamernik, Gyorgy Priok, Szilard Dorant, and Ferenc Csizmadia, ChemAxon Kft, Márámaros köz 3/a, 1037 Budapest, Hungary, email@example.com, Phone: +44 1865 331167
InstantJChem is an extensible desktop application designed to bring sophisticated cheminformatics to chemists. Structure databases can be quickly created in embedded or enterprise databases (allowing collaboration between multiple users). Each database allows structural and non-structural data in multiple formats to be quickly imported/exported. Chemical business rules can be applied using Standardizer to allow structure canonicalization (nitro representation, salt removal...). Structure based calculations and predictions (logP, pKa, RuleOf5, bioavailability...) are available using the Chemical Terms language. Advanced structure searching techniques can be combined with queries on data fields and Chemical Terms filters and applied rapidly to large data sets. Results can be viewed in a tabular format or with custom designed forms. As such, InstantJChem provides a simple platform to perform complex structure based analysis and prediction, including HTS analysis, SAR analysis, library overlap analysis, compound acquisition and ADMET predictions. The core functionality of InstantJChem is freely available.
Gregory Fond, Kelaroo, Inc, 312 S. Cedros Ave., Suite 320, Solana Beach, CA 92075, firstname.lastname@example.org, Phone: 858-259-7561
As start-up companies accumulate scientific data, they must assess many competing options for managing drug discovery and development information. Many resources are available to assist start-up companies in selecting and accessing the best research informatics (RI) platforms and applications. This presentation introduces a method to assist small and medium-size companies in sorting through the many RI options available to them. It is based on Kelaroo's experiences with over 30 biotechnology and pharmaceutical companies to which Kelaroo has provided custom and commercial cheminformatics and bioinformatics products and professional services. The formalization of the method is achieved using basic elements of technical, financial and organizational analysis. This presentation also illustrates how, in the maturing RI industry, small and medium-size biopharmaceutical companies can mitigate the trade-offs between cost, flexibility and scalability. The findings are derived empirically from discussions with industry analysts, biopharmaceutical companies of various sizes, providers of RI platforms, and developers of custom and commercial software applications. The presentation includes case studies for illustration purposes.
George R. Thompson, Chemical Compliance Systems, Inc, 706 Route 15 South, Suite 207, Lake Hopatcong, NJ 07849, Fax: 973-663-2378, email@example.com, Phone: 973-663-2148
Chemical inventories are a valuable resource of information for numerous departments and applications throughout an organization, when properly constructed and effectively analyzed. However, an inventory system can be no “smarter” than the data it contains. At least five primary databases are required for the broadest benefits from a chemical inventory: (1) chemical/product container, (2) chemical cross-reference dictionary, (3) MSDSs, (4) chemical health/safety/ecological hazards, and (5) applicable regulatory List of Lists. Additional data and criteria will greatly enhance the utility of the chemical inventory—e.g., physical/chemical properties, process and product usage, “green” and biobased criteria, hazard ranking criteria, generic chemical classes, incompatible and/or alternative chemicals, etc.
C-CAS is a true cradle-to-grave container tracking system that can include all of the above databases (and more), identifies each container by a bar code, and tracks the precise location of that container in real time throughout its lifetime. Literally, hundreds of reports are available from C-CAS: by chemical/product/location, by manufacturer, at reorder thresholds, and for hazard classes by room, department, building, or the entire organization. C-CAS can also identify any of 650 regulations that affect a chemical, or product, and can calculate when reporting thresholds are exceeded. Additionally, C-CAS serves as the input module to our Chemical Hazard and Environmental Management System and our Chemical Homeland Security System, and can feed quantitative data into any pre-existing ISO-14001 EMS. In short, C-CAS is an invaluable tool for diverse users with seemingly unrelated responsibilities.
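C-CAS's internal design is not described in the abstract. Purely as an illustration of the kind of container-tracking data model such a system implies (all table and column names below are hypothetical, not taken from C-CAS), a minimal relational sketch could be:

```python
import sqlite3

# illustrative schema only: a chemical dictionary, bar-coded containers,
# and a cradle-to-grave log of location events per container
con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE chemical (
    cas_number TEXT PRIMARY KEY,
    name       TEXT NOT NULL
);
CREATE TABLE container (
    barcode    TEXT PRIMARY KEY,
    cas_number TEXT REFERENCES chemical(cas_number),
    quantity_g REAL
);
CREATE TABLE location_event (
    barcode    TEXT REFERENCES container(barcode),
    room       TEXT,
    moved_at   TEXT DEFAULT CURRENT_TIMESTAMP
);
""")
con.execute("INSERT INTO chemical VALUES ('67-64-1', 'Acetone')")
con.execute("INSERT INTO container VALUES ('BC0001', '67-64-1', 500.0)")
con.execute("INSERT INTO location_event (barcode, room) VALUES ('BC0001', 'B-214')")

# a container's current location is its most recent logged event
row = con.execute("""
    SELECT c.barcode, ch.name, e.room
    FROM container c
    JOIN chemical ch USING (cas_number)
    JOIN location_event e USING (barcode)
    ORDER BY e.moved_at DESC LIMIT 1
""").fetchone()
print(row)  # ('BC0001', 'Acetone', 'B-214')
```

Reports such as "hazard classes by room" or "quantities above a regulatory threshold" then reduce to joins and aggregates over tables like these, once hazard and regulatory-list databases are linked in.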
Michael P. Hudock, Center for Biophysics and Computational Biology, University of Illinois at Urbana-Champaign, 607 S. Mathews Avenue, Urbana, IL 61801, firstname.lastname@example.org, Phone: 217-333-4335
A substantial amount of early discovery chemistry occurs every day not only in large industrial laboratories but also in non-industrial settings, such as academic institutions, hospitals and government research centers. In many cases these non-industrial settings generate substantial amounts of data but have no formal informatics solution for managing and mining them. The most basic requirements of such systems generally do not differ. Using primarily open source software, we show that it is possible to build a client-server system that handles the basic requirement of uniting chemical structures with activity data, as well as more advanced features for data mining and modeling structure-activity relationships. Using a rapid development model and a standardized database architecture, feature requests can be accommodated on a short timescale. This system is routinely used in our group and has been able to detect otherwise unrecognized trends in data.
Robert D. Feinstein, James Moeder, Bret Daniel, Andrew Reum, and Gregory Fond, Kelaroo, Inc, 312 S. Cedros Ave., Suite 320, Solana Beach, CA 92075, email@example.com, Phone: 858-254-6727
Chemistry intensive organizations need to search, source and manage thousands of reagents, building-blocks and advanced intermediates. Increasingly, they must do this while minimizing IT infrastructure, avoiding disparate applications, rigid solutions and expensive licensing. We will describe Kelaroo's experience with systems addressing these operational and business needs.
This presentation will address the integration of workflows involving chemistry, purchasing, stockroom and EH&S departments. Enabling capabilities include simultaneous searching of in-house reagents and commercial catalogs; acquiring reagents from both inventory and vendors efficiently and economically; managing chemical inventory effectively (including receiving, dispensing, tracking, reconciling and EH&S reporting); and enforcing business policies to save companies substantial time and money.
The presentation will also discuss technical and business trends that are reshaping the Research Informatics landscape. This represents a paradigm shift towards integrating best-of-breed applications that are plug-and-play, web-based, full-featured, highly configurable and available as internal systems or hosted as a pay-as-you-go service.
Chandu Nair, Scope e-Knowledge, 515 Madison Avenue, 21st Floor, New York, NY 10022, firstname.lastname@example.org, Phone: 646-706-2575
As a remote knowledge services company, Scope fulfills diverse content and data requirements for various clients. Obtaining off-the-shelf, ready-to-use products catering to Scope's requirements is difficult and not very cost-effective.
Scope has therefore put in place a hybrid software team, comprising internal and external experts, to create proprietary systems.
In the knowledge space, Scope believes that achieving 100% automation is unrealistic; Scope therefore follows a philosophy of “assisted automation”. Applications are developed so that they are scalable, ensure better control, and enable constant improvement. Scope follows an approach of continuous development and quantum deployment: applications are continually tweaked, but deployment is staged, occurring once changes reach a critical mass.
Agile methodology is used in developing the software. The software team and the project operations team finalize requirements together. Consequently, the applications developed are user-friendly and meet user requirements more precisely.
Case studies will be discussed to illustrate these points.
Eric A. Jamois and Sai Subramaniam, Strand Life Sciences, 1902 Wright Place – Suite 200, Carlsbad, CA 92008, Fax: 760-918-5505, email@example.com, Phone: 760-918-5582
Although outsourcing to China, Russia and India has reached mainstream status, its success rides on execution. Interestingly, some companies have questioned the viability of overseas operations, primarily on grounds of operational efficiency. Initial outsourcing models were founded on allocating large numbers of junior resources to projects, with disappointing returns. More recently, India and other geographies have turned to senior resources recruited directly from their target markets. With direct insight into requirements and a project-level understanding of the challenges at hand, there has been a clear shift toward greater efficiency and higher-end deliverables.
We will describe several projects undertaken at Strand Life Sciences in terms of their challenges and solutions provided. We will discuss the implementation of a data analysis and visualization platform in pharmaceutical discovery. We will also describe how some specific components can be integrated in an existing environment to provide new capabilities for image analysis, SAR and other applications.
John Jegla, Symyx Technologies, 70 Wood Avenue, Iselin, NJ 08830, firstname.lastname@example.org, Phone: 802-242-9017, and Mitchell A. Miller, Symyx Technologies, Fairfax, VT 05454
Over the years, Symyx has built a number of applications to manage repositories of chemical materials and physical inventories. This work has been done for a variety of organizations in the chemical and pharmaceutical industries. Our experiences have revealed significant differences between organizations in the definitions of chemical entities and in the operations required to support inventory-related workflows, even within a given industry segment. This puts a premium on providing software solutions that are not just comprehensive and flexible, but also extensible at all levels via customer-accessible developer kit functionality. To support such efforts, we have dissected the ensemble of requirements and features into a set of application components:
- Data model: representations of the primary entities in an inventory system
- Application functions: searching, browsing, object life-cycle maintenance, user experience management
- Application features: client configuration
Understanding these components and designing them for easy reuse leads to greater efficiency in operation and more satisfied users in the long run. Here we present our experience in the hope that it will be instructive to others.
Scott C. Boito, North American Info Center, Rhodia, Inc, 350 George Patterson Blvd, Bristol, PA 19007, Fax: 215-781-6002, email@example.com, Phone: 215-781-6229
Rhodia moved its laboratory facilities in late 2005, and with the new location it was decided to implement a new chemical inventory system. ChemSW's CISPro system was selected because it allowed our MSDS collection to be managed electronically, with access to both the inventory records and the accompanying MSDSs. The implementation of the system and its continuing evolution will be discussed.
Angela Locknar, Engineering and Science Libraries, MIT, 14S-134, 77 Massachusetts Ave., Cambridge, MA 02139, Fax: 617-253-6365, firstname.lastname@example.org, Phone: 617-253-9320, and Donald R. Sadoway, Department of Materials Science and Engineering, Massachusetts Institute of Technology, 77 Massachusetts Avenue, Room 8-203, Cambridge, MA 02139-4307, Fax: 617-253-5418, email@example.com, Phone: 617-253-3487
Providing instruction in finding and using information ("library" skills) is common in first year English courses, but these skills are just as relevant for first year students in the sciences. Should faculty in the sciences be expected to teach these skills, or should they call upon their librarian colleagues? This presentation will describe the collaboration between an engineering and science librarian and a faculty member to deliver information skills to a large freshman level core chemistry course. An innovative pilot course, using the students themselves to help determine how to teach their peers, will be discussed. Scaling this project to reach the over 500 students enrolled in the core course, including the creation of online tutorials, will also be addressed.
Jeremy R Garritano, Mellon Library of Chemistry, Purdue University, 504 W. State St., West Lafayette, IN 47907, firstname.lastname@example.org, Phone: 765-496-7279
Graduate students at a large research university often have many information needs—from choosing a research advisor, to creating and pursuing their research agenda, to deciding on where to go after graduation. In addition, many of them have insufficient information seeking skills. The M.G. Mellon Library of Chemistry at Purdue University attempts to address many of these issues by the focused and proactive provision of resources and services to graduate students. Besides instructing graduate students on common chemical information resources, the staff of the Chemistry Library provides additional services, such as after-hours access and assistance with bibliographic management software, to enhance the educational and research experience. In addition, a biweekly series of seminars, called Ice Cream Seminars, is offered every year to help acclimatize new graduate students to these and other resources provided by the Purdue University Libraries. This talk will highlight the services provided by the Chemistry Library, from the day potential graduate students visit campus to the day they graduate and are off to pursue future endeavors (and sometimes, even after that).
Allan K. Hovland, Department of Chemistry, St. Mary's College of Maryland, 18952 E Fisher Road, St. Mary's City, MD 20686, Fax: 240-895-4996, email@example.com, Phone: 240-895-4354
St. Mary's College of Maryland is a small public liberal arts college. The introduction to chemical literature course was first offered about 15 years ago. The role of the course in the chemistry curriculum was heightened when a requirement for a year-long research experience was implemented a few years ago. A strong collaboration between the chemistry faculty and the library staff has developed. As is universally true, the issue of access to materials has been one of the greatest challenges. This has been met in part by participation in consortial arrangements. The role of information literacy in chemistry will expand in light of the implementation of a new core curriculum requiring information literacy components across the curriculum.
R. G. Landolt, Department of Chemistry, Texas Wesleyan University, 1201 Wesleyan Street, Fort Worth, TX 76105, Fax: 817-531-4275, firstname.lastname@example.org, Phone: 817-531-4890
This project addresses the following objectives: to provide FACULTY with fundamental sophistication in chemical informatics; to teach STUDENTS to determine whether information exists, how to retrieve it, and how to assess its quality; and to enable DECISION-MAKERS to see how institutional resources may be used efficiently. To date, students and faculty at 4-year institutions have been provided insights regarding access to and use of Chemical Abstracts and ACS Publications journals. Optimum progress has occurred by establishing consortia of institutions, with active involvement of faculty and institutional librarians. Efforts are underway to identify issues of concern regarding online access for 2-year programs, including Community College Chemistry and Chemical Technology.
N. Sukumar1, Curt M. Breneman2, Kristin P. Bennett3, Charles Bergeron3, Theresa Hepburn2, C. Matthew Sundling4, Shekhar Garde5, Rahul Godawat5, Ishita Manjrekar5, Margaret McLellan2, and Mike Krein2. (1) Department of Chemistry and Center for Biotechnology and Interdisciplinary Studies, Rensselaer Polytechnic Institute / RECCR Center, 110 8th St., Troy, NY 12180-3590, Fax: 518-276-4887, email@example.com, Phone: (518)276-4235, (2) Department of Chemistry / RECCR Center, Rensselaer Polytechnic Institute, Troy, NY 12180, (3) Department of Mathematics, Rensselaer Polytechnic Institute, Troy, NY 12180, (4) Department of Chemistry and Center for Biotechnology and Interdisciplinary Studies, Rensselaer Polytechnic Institute, Troy, NY 12180-3590, (5) Department of Chemical and Biological Engineering, Rensselaer Polytechnic Institute, Troy, NY 12180
With continuing advances in epigenetics, proteomics, interactomics, metabolomics and RNA interference, bioinformatic data is increasingly becoming 3-D (structure-based) rather than just linear (sequence-based). A unified approach to cheminformatics and bioinformatics can thus enable a rich cross-fertilization of computational methods developed independently in different disciplines. We have developed a gamut of new software tools (Dixel, Protein-Recon, QPEST) and descriptor families (sequence similarity kernels, hydration-based descriptors) at Rensselaer that present bioinformatics data in a format familiar to cheminformaticians and cheminformatics data in bioinformatics-like format. Some modeling applications such as prediction of binding affinities of T cell receptors to leukemia vaccine polypeptides, ranking of transcription factor binding sequences and identification of pyruvate kinase activators and inhibitors will be presented.
Christoph Steinbeck, Miguel Rojas, Tobias Helmus, Egon Willighagen, and Stefan Kuhn, Research Group for Molecular Informatics, Cologne University Bioinformatics Center (CUBIC), Zuelpicher Str. 47, D-50674 Cologne, Germany, firstname.lastname@example.org, Phone: 0049-221-470-7426
Identification and structure elucidation of unknown metabolite structures based on their spectroscopic properties form the basis for successful metabolome simulations. In a process known as dereplication, a scientist would record molecular fingerprint spectra and search spectral databases to check whether the compound at hand is already known. Only if this search is unsuccessful is it reasonable to reach for one of the more sophisticated ab initio tools for computer-assisted structure elucidation. Here we describe the use of free software, the easy access provided by the World Wide Web, and the collaborative potential of the Open-Source movement to build a completely transparent system for computer-assisted structure elucidation and identification. Methods for the prediction of mass and NMR spectra have been developed and used as part of a fitness function in our structure elucidation systems based on stochastic chemical space generators.
Jean-Loup Faulon, Computational Bioscience Dept, Sandia National Laboratories, P.O. Box 5800 - MS 1413, Albuquerque, NM 87185, Fax: 505-284-1323, email@example.com, Phone: 505-284-0770
Biological and chemical databases are increasingly populated with information linking protein sequences and chemical structures (KEGG, PubChem, DrugBank, MDDR). There is now sufficient information to apply machine learning techniques to predict interactions between chemicals and proteins on a genome-wide scale. Current machine learning techniques use as input either protein or chemical information. A novel Support Vector Machine method will be presented for predicting protein-chemical interaction using heterogeneous input consisting of both sequences and chemical structures. The method relies on fusing protein sequence data with chemical structure data by representing each with a common cheminformatics description. The approach will be demonstrated by predicting proteins that can catalyze reactions, even when the reactions have no known enzymatic catalysts, and predicting when a given drug can bind a target, also in the absence of prior binding information for that drug and target.
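The fusion idea can be sketched in a few lines. The descriptors below (protein 3-mer counts, SMILES character bigrams) and the perceptron used in place of the SVM are illustrative simplifications, not the authors' actual implementation:

```python
from collections import Counter

def protein_fp(seq, k=3):
    # k-mer counts as a crude sequence "fingerprint"
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))

def chem_fp(smiles, k=2):
    # character k-grams of a SMILES string as a crude structural fingerprint
    return Counter(smiles[i:i + k] for i in range(len(smiles) - k + 1))

def fuse(seq, smiles):
    # common cheminformatics-style description: tag features by origin
    # and merge both fingerprints into one sparse vector
    fp = {}
    for f, c in protein_fp(seq).items():
        fp["P:" + f] = c
    for f, c in chem_fp(smiles).items():
        fp["C:" + f] = c
    return fp

def perceptron(fps, labels, epochs=20):
    # linear stand-in for the SVM: learns weights over fused features
    w = {}
    for _ in range(epochs):
        for fp, y in zip(fps, labels):
            score = sum(w.get(f, 0.0) * c for f, c in fp.items())
            pred = 1 if score > 0 else -1
            if pred != y:
                for f, c in fp.items():
                    w[f] = w.get(f, 0.0) + y * c
    return w
```

Each (protein, chemical) pair becomes a single fused vector, so one classifier sees both sides of the interaction at once.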
Yang Shen1, Dmitri Beglov2, Ryan Brenke3, Dima Kozakov2, and Sandor Vajda2. (1) Department of Manufacturing Engineering, Boston University, Boston, MA 02215, firstname.lastname@example.org, (2) Department of Biomedical Engineering, Boston University, 44 Cummington St, Boston, MA 02215, Fax: 617-353-6766, email@example.com, Phone: 617-353-4757, (3) Program in Bioinformatics, Boston University, Boston, MA
Two enzymes are analogous if they have the same EC number (or their EC numbers differ only in the last digit) but are evolutionarily unrelated, i.e., they lack both sequence and structural similarity. Analogous enzyme pairs are relatively rare, but they occur in all major enzyme classes and are assumed to result from convergent evolution. Research on analogous enzymes is very limited: it consists of searches for non-homologous enzymes with the same EC number and studies of specific cases of convergent evolution. It is known that, at least in a number of cases, the spatial arrangement of the catalytic residues is conserved, but very little is known about the similarity of the binding sites that occur on different protein scaffolds. In this work we use a new method, developed to assess molecular similarity, for the structural superimposition of enzyme binding sites. The physicochemical properties of the cavity-flanking residues are represented by pseudocenters. Given two sets of such pseudocenters, our goal is to find the largest subsets of pseudocenters in the two clefts that are in direct correspondence with each other both geometrically and chemically. The proposed method performs an exhaustive evaluation of the correlation function in the discretized 6D space of mutual orientations of the two point sets using a very efficient algorithm based on Fast Fourier Transforms. The method is applied to a number of analogous enzyme pairs. Advantages over the more traditional structure-comparison method based on the maximum-clique algorithm are discussed.
Noel M. O'Boyle1, Gemma L. Holliday2, Daniel E. Almonacid3, and John B. O. Mitchell3. (1) Cambridge Crystallographic Data Centre, 12 Union rd, Cambridge CB2 1EZ, United Kingdom, Fax: 0044-1223-336033, firstname.lastname@example.org, Phone: 0044-1223-762531, (2) EMBL-EBI, Cambridge CB10 1SD, United Kingdom, (3) Department of Chemistry, Unilever Centre for Molecular Science Informatics, University of Cambridge, Cambridge CB2 1EW, United Kingdom
As more and more mechanistic data on enzymes becomes available, the ability to identify similar mechanisms in other enzymes is becoming more important. Such information may be used to identify mechanistically convergent or divergent enzymes, to study the link between structure and function, to perform literature searches, and to validate experimental results. However, existing methods for measuring enzyme similarity (evolutionary distance, structural similarity, classification by function) do not take chemical mechanism into account. We have developed the first method to give a quantitative measure of the similarity of reactions based upon their explicit mechanisms. The method combines classic cheminformatics techniques (Tanimoto coefficient, Euclidean distance of fingerprints) with the Needleman-Wunsch alignment algorithm used in bioinformatics. We present an analysis of the MACiE database of enzyme mechanisms using our measure of similarity, contrast functional and mechanistic classification schemes, and identify some examples of convergent evolution of chemical mechanism.
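The combination of cheminformatics similarity with sequence alignment can be sketched as follows, assuming each mechanism is represented as an ordered list of step fingerprints (the step and feature names below are invented for illustration):

```python
def tanimoto(a, b):
    # Tanimoto coefficient between two fingerprint bit-sets
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)

def align_mechanisms(m1, m2, gap=-0.4):
    # Needleman-Wunsch global alignment over sequences of mechanism steps;
    # the match score for a pair of steps is their Tanimoto coefficient
    n, m = len(m1), len(m2)
    F = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        F[i][0] = i * gap
    for j in range(1, m + 1):
        F[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            F[i][j] = max(F[i - 1][j - 1] + tanimoto(m1[i - 1], m2[j - 1]),
                          F[i - 1][j] + gap,
                          F[i][j - 1] + gap)
    return F[n][m]   # overall mechanism-similarity score
```

Identical mechanisms align step-for-step and score the number of steps; diverging mechanisms pay gap and mismatch penalties.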
Brian C. Meadows, Needle & Rosenberg, PC, 999 Peachtree Street, Suite 1000, Atlanta, GA 30309, Fax: 678-420-9301, email@example.com, Phone: 678-420-9300
The increasing investment and reliance on intellectual property for value creation has elevated the licensing of technology to the forefront of today's global economy. The licensing process can vary from seeking revenues for your own existing technology to seeking rights in the technology of others. This presentation will explore various perspectives and strategies for creating value through the licensing of chemical technology and intellectual property.
Tena Herlihy, Technology Licensing Office, MIT, Room NE25-230, Five Cambridge Center, Kendall Square, Cambridge, MA 02142, Fax: 617-258-6790, firstname.lastname@example.org, Phone: 617-253-6966
Licensing and technology transfer agreements between the academic and industrial sectors are increasing in number. It is important to understand the nature and obligations of academic institutions and thus their needs, policies, and limitations. This presentation will cover strategies for handling a number of issues that often arise when negotiating patent licenses with academic institutions. A brief history of the Bayh-Dole Act will be given as a background to how universities came to be involved in technology transfer, followed by a discussion of topics that are unique to the academic environment. For example, universities are often very limited in the representations and warranties they can give. Also, unlike commercial agreements, there will always be retained rights under an academic agreement, and the agreement is likely to include due diligence provisions to make sure the technology is developed. The presentation will conclude with a report on likely changes in academic agreements as a result of recent case law.
Craig M. Sorensen, Director, Strategic Research Alliances, Vertex Pharmaceuticals Incorporated, 130 Waverly Street, Cambridge, MA 02139, Fax: 617-444-6865, Craig_Sorensen@vrtx.com, Phone: 617-444-6523
The pharmaceutical research environment today is very different from what it was even five years ago. The demands for a robust, productive research pipeline are arguably greater now than they have ever been, and the challenge of finding new ways to meet this demand is increasing as well. It is widely accepted that strategic in- and out-licensing is one way to augment internal efforts to generate a robust pipeline; while this is not in itself a new concept, the strategies and practices that have been put in place around these licensing activities have become entrenched and may no longer be sufficient to meet the challenges of tomorrow. What we are rapidly discovering is that the best practices of yesterday and today may not necessarily be the best practices for tomorrow. As a result of the increasing globalization of research and the ever more complex interplay between pharma companies, CROs, and academia, new paradigms of interaction are rapidly emerging as a requirement for ensuring success for all parties. This presentation will describe some of the licensing concepts and paradigm shifts that Vertex has developed and successfully implemented.
Shuntai Wang, Pfizer, Inc, 50 Pequot Avenue, B2231, New London, CT 06320, email@example.com, Phone: 860-732-1941
At large pharmaceutical companies, information management as a cross-functional process leverages all sources of public information for the identification and assessment of licensing opportunities. These sources include news, scientific and medical conferences, pipeline databases, company public disclosures, scientific literature and patents. Information management complements direct human interactions for successful licensing and partnering. Even for those smaller companies with limited resources, effective information management can play an important role in successful product licensing.
Stephen R. Adams, Magister Ltd, Crown House, 231 Kings Road, Reading RG1 4LS, United Kingdom, Fax: +44 118 966 6620, firstname.lastname@example.org, Phone: +44 118 966 6520
This talk will examine some of the mechanisms for technology dissemination in Europe, with particular reference to the influence of industry interests and the European Union (EU). Methods for funding scientific research in Europe are markedly different from those of the US, as are the resulting methods of handling the IP rights arising from that research. Central government funding is still a major contributor, with relatively little coming from private endowments or alumni foundations. There is a mixed experience of creating spin-off commercial enterprises from university-based research; whilst some are world leaders, not all have been successful in assisting the technology transfer process. The handling of IP rights on inventions from academia varies between the countries of Europe, although there are some EU-wide regulations, particularly in relation to technology transfer. In 2004, there was a major reform in EU technology transfer block exemptions, similar to the US 'safe harbor' regulations.
Patrick Waller, Shareholder, Biotechnology and Chemical Groups, Wolf Greenfield, 600 Atlantic Avenue, Boston, MA 02210, email@example.com, Phone: 617-646-8223
This presentation will review the impact of recent court decisions on reach-through royalties, implied licenses, and rights to improvements relating to chemical and pharmaceutical products. The discussion will address specific decisions on licensing and patent issues and also explore intellectual property trends relating to pharmaceutical compounds, formulations, salts, and structural derivatives in the context of a license or agreement.
Subhash C. Basak and Brian D. Gute, Center for Water and the Environment, Natural Resources Research Institute, University of Minnesota, 5013 Miller Trunk Hwy, Duluth, MN 55811, Fax: 218-720-4328, firstname.lastname@example.org, Phone: 218-720-4230
In the post-genomic era, "omics" technologies are generating copious data related to the effects of biological and chemical agents on living systems. Proteomics methods such as two-dimensional gel electrophoresis (2-DE) provide data on 1,000 to 2,000 proteins. Novel methods are needed in order to extract meaningful information from these proteomics maps. Our research team has developed four classes of methods for characterizing proteomics maps using discrete mathematics and statistics: 1) association of graphs/matrices with proteomics maps, 2) information-theoretic biodescriptors, 3) spectrum-like representations of proteomics maps, and 4) statistical approaches to identify critical protein biomarkers. Each of the first three methods generates a single, compact, numerical biodescriptor or a set of numerical descriptors to characterize the map. The fourth approach identifies a set of critical proteins related to the bioactivity or toxicity being studied.
Jeremy L. Jenkins1, Andreas Bender1, and Dmitri Mikhailov2. (1) Lead Finding Platform, Novartis Institutes for BioMedical Research, 250 Massachusetts Avenue, Cambridge, MA 02139, email@example.com, Phone: 617-871-7155, (2) Lead Discovery Informatics, Novartis Institutes for Biomedical Research, Cambridge, MA 02139
Cheminformatics and bioinformatics databases are often maintained in silos with minimal effort to federate chemical and genomic data. One successful cross-over between disciplines we recently presented was the prediction of ligand targets by mining target-annotated chemical databases. However, one restriction of this approach is that only targets present in the original database could be predicted. In this work we further push those boundaries in target space. By annotating 1,300 target classes in the WOMBAT database with the InterPro domains found in the targets, we have created thousands of probabilistic models that associate chemical substructures with protein domains. The models can be applied to orphan compounds for in silico "domain fishing", enabling target predictions that extrapolate to proteins outside the training set. Examples of employing the approach to triaging cell-based high-throughput screens are provided, as well as their application in ranking the proteins pulled down in small-molecule affinity chromatography experiments.
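The Laplacian-corrected Naïve Bayes scoring commonly used for such substructure-to-annotation models can be sketched as follows; the data layout, feature names, and domain labels are hypothetical:

```python
from math import log

def train_domain_models(compounds):
    # compounds: list of (fingerprint_bits, set_of_domain_labels) pairs
    total = len(compounds)
    feat_count = {}   # feature -> occurrences across all compounds
    dom_feat = {}     # domain  -> feature -> occurrences in that domain's actives
    dom_total = {}    # domain  -> number of compounds annotated with it
    for bits, domains in compounds:
        for f in bits:
            feat_count[f] = feat_count.get(f, 0) + 1
        for d in domains:
            dom_total[d] = dom_total.get(d, 0) + 1
            tab = dom_feat.setdefault(d, {})
            for f in bits:
                tab[f] = tab.get(f, 0) + 1
    models = {}
    for d, tab in dom_feat.items():
        base = dom_total[d] / total   # prior P(domain)
        weights = {}
        for f, n_act in tab.items():
            # Laplacian-corrected log-odds weight for this feature
            weights[f] = log((n_act + 1.0) / ((feat_count[f] + 1.0 / base) * base))
        models[d] = weights
    return models

def score(models, bits):
    # rank domains for an orphan compound ("domain fishing")
    return sorted(((sum(w.get(f, 0.0) for f in bits), d)
                   for d, w in models.items()), reverse=True)
```

An orphan compound is scored against every domain model at once, and the top-ranked domains suggest candidate target families outside the training set.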
Andreas Bender, Jeremy L. Jenkins, and John W. Davies, Lead Finding Platform, Novartis Institutes for BioMedical Research, 250 Massachusetts Avenue, Cambridge, MA 02139, firstname.lastname@example.org, Phone: 617-871-3972
After recognizing that polypharmacology (i.e., activity against multiple targets) is an inherent property of most, if not all, small molecules that modulate biological functions, a subsequent question arises: which targets are more frequently associated with each other than others? The answer to this question puts biological activities in relation to each other[1,2], not via sequence-based similarities, but rather by means of commonalities among ligands showing the same activity in biological systems. We employ methods from evolutionary biology informatics to produce graphical representations of ligand-based bioactivity space and further apply the resulting phylochemical mappings to rationalize off-target effects. These representations are still based on conventional ligand-based fingerprints. On the other hand, biological readouts can be used directly to represent compounds by means of their impact on biological systems, and an analogous analysis can be performed, this time on experimental readouts. We present visualizations of chemical space based on ligand properties, as well as applications to the prediction of off-target (or, rather, secondary-target) effects.
[1] A. Bender, J. L. Jenkins, M. Glick, Z. Deng, J. H. Nettles and J. W. Davies. "Bayes Affinity Fingerprints" Improve Retrieval Rates in Virtual Screening and Define Orthogonal Bioactivity Space: When Are Multitarget Drugs a Feasible Concept? J. Chem. Inf. Model. 2006, 46, 2445-2456.
[2] M. J. Keiser, B. L. Roth, B. N. Armbruster, P. Ernsberger, J. J. Irwin and B. K. Shoichet. Relating Protein Pharmacology by Ligand Chemistry. Nature Biotech. 2007, 25, 197-206.
[3] A. Bender, P. A. Clemons, J. L. Jenkins, D. Mikhailov, D. W. Young, J. H. Nettles, M. Glick and J. W. Davies. Chemogenomics Data Analysis: Prediction of Targets and the Advent of "Biological Fingerprints". Comb. Chem. High Throughput Screen. 2007, in press.
Jerry Osagie Ebalunode1, Zheng Ouyang2, Jie Liang3, and Weifan Zheng1. (1) Department of Pharmaceutical Sciences, Biomanufacturing Research Institute and Technology Enterprise (BRITE), North Carolina Central University, 1801 Fayetteville Street, Durham, NC 27707, email@example.com, Phone: 919-530-7013, (2) Bioengineering Department, University of Illinois at Chicago, Chicago, IL 60612, (3) Bioengineering Department, Carolina Exploratory Center for Cheminformatics Research (CECCR), University of Illinois at Chicago, Chicago, IL 60612
In the past decade, high-throughput screening (HTS) and rapid parallel synthesis (RPS) have dramatically changed the drug discovery industry. In recent years, these same technologies have also formed the very basis for the NIH chemical genomics initiative. One of the main issues in both drug discovery and chemical genomics research is how to assess the diversity of compound collections so that the information obtained from HTS is maximal and meaningful. Traditional diversity measures only look at the self-dissimilarity of the compound collection and ignore information from the biological space. We have developed a new approach (BioMD) whereby binding-site shape information, derived from computational geometry analysis of the structural genome, is used to evaluate the fitness (i.e., biological relevance) of molecules in a collection. This strategy allows us to consider not only the self-dissimilarity but also the biological relevance of the individual compounds in the diversity assessment process. In this talk, I will present some preliminary data demonstrating the application of BioMD to several publicly available databases and to virtual libraries derived from Diversity Oriented Synthesis (DOS).
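A greedy selection that balances self-dissimilarity against a biological-relevance weight, in the spirit of (but much simpler than) the BioMD idea, might look like this; the fingerprints and relevance scores are hypothetical inputs:

```python
def select_diverse(fps, relevance, k, alpha=0.5):
    # fps: list of fingerprint bit-sets; relevance: biological-relevance
    # scores in [0, 1] (e.g. derived from binding-site shape matching);
    # greedy max-min pick, trading diversity against relevance via alpha
    def dist(a, b):
        return 1.0 - len(a & b) / len(a | b)   # 1 - Tanimoto

    picked = [max(range(len(fps)), key=lambda i: relevance[i])]
    while len(picked) < k:
        def gain(i):
            d = min(dist(fps[i], fps[j]) for j in picked)  # max-min diversity
            return alpha * d + (1 - alpha) * relevance[i]
        rest = [i for i in range(len(fps)) if i not in picked]
        picked.append(max(rest, key=gain))
    return picked
```

With alpha = 1 this reduces to plain max-min diversity selection; with alpha = 0 it simply ranks by biological relevance.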
Shaillay Kumar Dogra and Ramesh Hariharan, Cheminformatics, Strand Life Sciences Pvt. Ltd, No. 237, Sir C V Raman Avenue, Raj Mahal VIlas, Bangalore, India, Fax: +91-80-23618996, firstname.lastname@example.org, Phone: +91-80-23611349
Running a natural language processing engine over scientific literature can yield information on interactions between biological entities such as proteins and small molecules. Such an approach, when run on Medline abstracts in December 2005, yielded around 231,400 protein-small molecule and 110,850 small molecule-small molecule interactions. Clearly, there is a plethora of information available for analysis. However, such an approach is limited by the 'text'-driven nature of the search. Far more useful is to run a 'substructure' search using the query compound of interest against the small-molecule interactions database. The resulting hits can then be analyzed to check whether the query compound has potentially similar biological interactions. This gains significance in a drug discovery setting in which compounds are virtually designed and optimized for good ADME properties. An additional dimension to optimize could now be the avoidance of undesirable interactions with specific biological targets or with other small molecules.
Nikil Wale1, George Karypis1, and Ian A Watson2. (1) Department of Computer Science, University of Minnesota, Minneapolis, MN 47408, email@example.com, Phone: 612-626-9874, (2) Eli Lilly and Company, Indianapolis, IN 46285
Methods that can screen large databases to retrieve a structurally diverse set of compounds with desirable bioactivity properties are critical in the drug discovery and development process. In this presentation we will show a set of such methods, which are designed to find compounds that are structurally different from a given query compound while retaining its bioactivity properties (scaffold hops). These methods utilize various indirect ways of measuring the similarity between the query and a compound that take into account additional information beyond their structure-based similarities. Two sets of techniques are presented that capture these indirect similarities using approaches based on automatic relevance feedback and on analyzing the similarity network formed by the query and the database compounds. Experimental evaluation shows that many of these methods substantially outperform previously developed approaches both in terms of their ability to identify structurally diverse active compounds as well as active compounds in general.
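One of the indirect-similarity ideas, automatic relevance feedback, can be sketched as a toy re-ranking (not the authors' actual method): the top direct hits are folded back into the query profile, so compounds dissimilar to the query but similar to its neighbours can rise in the ranking.

```python
def tanimoto(a, b):
    # Tanimoto similarity between fingerprint bit-sets
    return len(a & b) / len(a | b)

def scaffold_hop_rank(query, db, k=2):
    # 1) direct ranking against the query fingerprint
    direct = sorted(db, key=lambda c: tanimoto(query, c), reverse=True)
    # 2) relevance feedback: add the top-k hits to the query profile
    profile = [query] + direct[:k]
    # 3) re-score each compound against the whole profile; a compound
    #    close to a neighbour (but not the query) can now surface
    return sorted(db, key=lambda c: max(tanimoto(c, p) for p in profile),
                  reverse=True)
```

In the assertions below, the compound sharing no bits with the query overtakes one that shares a bit with it, purely through its similarity to the query's nearest neighbour, which is the scaffold-hopping behaviour the abstract describes.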
Donald Walter, Customer Training, Thomson Scientific, 1725 Duke Street Suite 250, Alexandria, VA 22314, Fax: 703 519 5838, Don.Walter@Thomson.com, Phone: 703-706-4220
Bioinformatic information in nucleic acid and amino acid sequences can be the first step in devising chemotherapeutic treatments for a variety of ills. This talk will demonstrate several techniques, drawing on a variety of literature and patent sources, by which patented sequences can be linked to specific drugs and types of drugs.
Andrea Volkamer, Thomas Lengauer, and Andreas Kämper, Computational Biology and Applied Algorithmics, Max Planck Institute for Informatics, Stuhlsatzenhausweg 85, D-66123 Saarbrücken, Germany, Fax: +49 681 9325-399, Phone: +49 681 9325-303
The use of pharmacophore constraints is an established technique for improving docking results. Usually the pharmacophore has to be specified manually. In this work we present a fully automated technique that incorporates information on the buriedness of the binding pocket and structure-based pharmacophore features into the docking engine of FlexX. Key interaction points in the active site are calculated with a GRID-based energy function. Those points that pass several newly developed filters are merged into a small number of pharmacophore features. The automatically generated pharmacophores agree well with manually derived results. The performance of the method has been validated on several difficult docking tasks as well as on virtual screening scenarios using FlexX-Pharm, the pharmacophore module of FlexX. Docking results are improved in 95% of the test cases. In general, the enrichments in virtual screening runs are higher and the computation times are shorter than in the respective unconstrained screenings.
Thuan T. H. Huynh Buu1, Gerhard Wolber2, Thierry Langer2, Peter Lackner3, and Gerald Lirk1. (1) University of Applied Science Hagenberg, Hauptstraße 117, 4232 Hagenberg, Austria, firstname.lastname@example.org, Phone: +43-650-377-0479, (2) Inte:Ligand GmbH, 2344 Maria Enzersdorf, Austria, (3) Department of Molecular Biology, University of Salzburg, 5020 Salzburg, Austria
Virtual screening using 3D pharmacophores has evolved into an important and successful method for drug discovery over the last decades. We recently presented an efficient alignment method for superposing shared chemical features of pharmacophores and/or molecules in 3D space. Although efficient superpositioning techniques are of utmost importance to guarantee high throughput in virtual screening technologies, there is a need for automatically assessing the relevance and quality of a specific alignment when processing large data sets. Being aware of the problems of scoring functions in docking approaches, the presented ranking approach has a different scope, since the position in 3D space is already defined by the single alignment solution coming from the alignment algorithm. The presented scoring function is therefore designed not to select poses of one single molecule, but to select those molecules which fit a pharmacophore (or a shared-feature pharmacophore hypothesis) better than others. Geometric, steric, and energetic contributions have been used for implementation and parametrization, and the function was applied to a diverse set of H1 antagonists. We used a pseudo-structure-based approach employing a homology model, docked a data set of selected active H1 receptor ligands using GOLD, and compared this to a ligand-based approach using multiple conformations generated by OMEGA within the LigandScout framework.
Yevgeniy Podolyan, Department of Computer Science & Engineering, University of Minnesota, 4-192 EE/CS Building, 200 Union St SE, Minneapolis, MN 55455, email@example.com, Phone: 612-626-9873, and George Karypis, Department of Computer Science, University of Minnesota, Minneapolis, MN 55455
Virtual screening for bioactive molecules is becoming increasingly popular as microprocessor prices decline and speeds increase. This allows a large library of molecules that are potentially active against a specific target to be screened quickly and completely in silico. One approach is to search for molecules that are similar to known active ones using techniques such as 3-dimensional alignment or descriptor-based methods of various dimensionality. One such technique is based on pharmacophores, the functional or structural elements of a molecule that are believed to be responsible for its biological activity. Analog-based methods that use pharmacophores in virtual screening include those using 3- and 4-point pharmacophore binary fingerprints, feature vectors, maximum common substructure searching, etc., to find analogs. We will discuss the benefits and shortcomings of such methods as well as present results of methods based on new approaches to using 3D pharmacophores in virtual screening.
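A basic 3-point pharmacophore fingerprint, assuming feature types and 3D coordinates have already been assigned, can be sketched as follows (the feature names and distance bins are illustrative choices):

```python
from itertools import combinations
from math import dist  # Euclidean distance, Python 3.8+

def three_point_fp(features, bins=(2.0, 4.0, 8.0)):
    # features: list of (type, (x, y, z)) pharmacophore points,
    # e.g. ("donor", ...), ("acceptor", ...), ("aromatic", ...)
    def bin_of(d):
        for i, edge in enumerate(bins):
            if d <= edge:
                return i
        return len(bins)

    fp = set()
    for (t1, p1), (t2, p2), (t3, p3) in combinations(features, 3):
        # canonical key: sorted (type, type, binned-distance) triangle edges,
        # so the fingerprint is invariant to feature ordering and rigid motion
        edges = sorted([(min(t1, t2), max(t1, t2), bin_of(dist(p1, p2))),
                        (min(t2, t3), max(t2, t3), bin_of(dist(p2, p3))),
                        (min(t1, t3), max(t1, t3), bin_of(dist(p1, p3)))])
        fp.add(tuple(edges))
    return fp
```

Two conformers related by a rigid translation or rotation yield identical fingerprints, which is what makes such triplet keys usable as alignment-free screening descriptors.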
Meir Glick, Lead Finding Platform, Novartis Institutes for BioMedical Research, 250 Massachusetts Avenue, Cambridge, MA 02139, firstname.lastname@example.org, Phone: 617-871-7130
This study describes a novel semi-sequential technique for in silico enhancement of high-throughput screening (HTS) experiments now employed at Novartis. It is used in situations in which the size of the screen is limited by the readout (e.g., high-content screens) or by the amount of reagents or tools (proteins or cells) available. By performing computational chemical diversity selection on a per-plate basis (instead of a per-compound basis), a 25% subset of the 1,000,000-compound collection was optimized for the general initial HTS. Statistical models are then generated from target-specific primary results (percentage-inhibition data) to drive the cherry picking and testing from the entire collection. Using retrospective analysis of 11 HTS campaigns, we show that this method would have captured on average two thirds of the active compounds (IC50 < 10 uM) and three fourths of the active Murcko scaffolds while decreasing screening expenditure by nearly 75%. This result holds for a wide variety of targets, including G-protein-coupled receptors, chemokine receptors, kinases, metalloproteinases, pathway screens, and protein-protein interactions. Unlike time-consuming "classic" sequential approaches that require multiple iterations of cherry picking, testing, and building statistical models, here individual compounds are cherry picked just once, based directly on primary screening data. Strikingly, we demonstrate that models built from primary data are as robust as models built from IC50 data. This is true for all HTS campaigns analyzed, which represent a wide variety of target classes and assay types.
Kunal Roy1, J Thomas Leonard2, and Partha Pratim Roy2. (1) Pharmaceutical Technology, Jadavpur University, Raja S C Mullick Road, Jadavpur, Kolkata 700032, India, email@example.com, Phone: 91-9831594140, (2) Jadavpur University, Kolkata 700032, India
Quantitative structure-activity relationships (QSARs) represent predictive models derived from application of statistical tools correlating biological activity (including therapeutic and toxic) of chemicals (drugs/toxicants/environmental pollutants) with descriptors representative of molecular structure and/or property. The success of any QSAR model depends on the accuracy of the input data, selection of appropriate descriptors and statistical tools, and most importantly validation of the developed model. Validation is the process by which the reliability and relevance of a procedure are established for a specific purpose. Leave-one-out cross-validation generally leads to an overestimation of predictive capacity, and even with external validation, no one can be sure whether the selection of training and test sets was manipulated to maximize the predictive capacity of the model being published. In this paper, we present some representative examples of QSAR model validation in order to explore the importance of the method of selecting training-set compounds, the choice of training-set size, and the impact of variable selection on the quality of prediction.
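The leave-one-out q² statistic discussed here can be illustrated with a toy univariate model (ordinary least squares on one descriptor; the data in the usage example are invented):

```python
def fit(xs, ys):
    # ordinary least squares for y = a + b*x
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    b = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
         / sum((x - mx) ** 2 for x in xs))
    return my - b * mx, b

def loo_q2(xs, ys):
    # leave-one-out cross-validated q^2 = 1 - PRESS / SS
    press, mean_y = 0.0, sum(ys) / len(ys)
    for i in range(len(xs)):
        tx = xs[:i] + xs[i + 1:]   # training set without compound i
        ty = ys[:i] + ys[i + 1:]
        a, b = fit(tx, ty)
        press += (ys[i] - (a + b * xs[i])) ** 2   # predict the left-out point
        ss = sum((y - mean_y) ** 2 for y in ys)
    return 1.0 - press / ss
```

With more descriptors than compounds, this statistic can stay deceptively high, which is precisely why the abstract stresses external validation and careful training-set selection.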
Brian D. Gute1, Subhash C. Basak1, and Douglas M. Hawkins2. (1) Center for Water and the Environment, Natural Resources Research Institute, University of Minnesota, 5013 Miller Trunk Hwy, Duluth, MN 55811, Fax: 218-720-4328, firstname.lastname@example.org, Phone: 218-720-4284, (2) School of Statistics, University of Minnesota, Minneapolis, MN 55455
Quantitative molecular similarity analysis (QMSA) methods use a variety of calculated molecular descriptors and experimental properties in the creation of chemical structure spaces. These spaces are often used in selecting structural analogs and estimating a wide variety of properties: physicochemical, pharmacological, and toxicological. Traditionally, descriptor sets are selected arbitrarily, intuitively by an expert, or through a variety of data reduction techniques. 'Tailoring' is a new approach that selects indices that are strongly correlated with the property of interest. Studies have been carried out on a variety of chemical databases to examine the effectiveness of tailored vis-à-vis arbitrary similarity spaces in property estimation. The spaces considered here are all derived from the same set of topological indices, only the selection methods vary. Ridge regression and recursive partitioning will be discussed as useful approaches in descriptor selection.
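Tailoring in its simplest form, ranking descriptors by the absolute Pearson correlation with the property of interest, might be sketched as follows (the descriptor names and values are invented; the authors' actual selection uses ridge regression and recursive partitioning):

```python
from math import sqrt

def pearson(xs, ys):
    # Pearson correlation coefficient between two value lists
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

def tailor(descriptors, prop, k):
    # descriptors: name -> per-compound values; prop: property values;
    # keep the k descriptors most strongly correlated with the property
    ranked = sorted(descriptors,
                    key=lambda d: abs(pearson(descriptors[d], prop)),
                    reverse=True)
    return ranked[:k]
```

The retained descriptors then define the "tailored" similarity space in which structural analogs are sought, instead of an arbitrarily chosen descriptor set.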
Daming Li, Computational Design and Modeling, LITEC Systems Corporation, New York, NY 10006, email@example.com, Phone: 212-812-6320
Quantitative phosphoproteomics is a rapidly growing field that makes it possible to study the dynamics of protein phosphorylation and to better understand the regulatory networks of key cellular processes. In this paper we present a quantitative application that predicts the isoelectric points of peptides with and without methyl esterification. Numerical simulation of this model shows that methylated phosphopeptides and non-phosphopeptides can be grouped on the basis of the number of phosphate groups and basic residues in each peptide. The theoretical results are supported by experiments. We have developed a SOAP web-service component and an Excel add-in that can be easily integrated into Spotfire™ DecisionSite and SciTegic™ Pipeline Pilot.
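A peptide pI prediction of the kind described can be sketched with a Henderson-Hasselbalch net-charge model and bisection on pH. The pKa table is an illustrative textbook set, and the treatment of methyl esterification (blocking carboxylate and C-terminal charges while leaving phosphate groups ionizable) is an assumption of this sketch, not the authors' calibrated model.

```python
# illustrative pKa values (assumed; real tools use calibrated sets)
PKA_BASIC = {"K": 10.5, "R": 12.5, "H": 6.0, "nterm": 9.0}
PKA_ACIDIC = {"D": 3.9, "E": 4.1, "C": 8.3, "Y": 10.1, "cterm": 3.1}
PKA_PHOSPHO = 1.2  # first ionization of a phosphate group

def net_charge(seq, ph, n_phospho=0, methylated=False):
    # positive contributions: basic side chains plus the N-terminus
    pos = sum(1.0 / (1.0 + 10 ** (ph - PKA_BASIC[a])) for a in seq if a in PKA_BASIC)
    pos += 1.0 / (1.0 + 10 ** (ph - PKA_BASIC["nterm"]))
    # negative contributions; methyl esterification blocks the carboxylates
    neg = sum(1.0 / (1.0 + 10 ** (PKA_ACIDIC[a] - ph)) for a in seq if a in ("C", "Y"))
    if not methylated:
        neg += sum(1.0 / (1.0 + 10 ** (PKA_ACIDIC[a] - ph)) for a in seq if a in ("D", "E"))
        neg += 1.0 / (1.0 + 10 ** (PKA_ACIDIC["cterm"] - ph))
    # phosphate groups remain charged after esterification of the carboxylates
    neg += n_phospho / (1.0 + 10 ** (PKA_PHOSPHO - ph))
    return pos - neg

def isoelectric_point(seq, n_phospho=0, methylated=False):
    # bisection on pH: net charge decreases monotonically with pH
    lo, hi = 0.0, 14.0
    while hi - lo > 1e-4:
        mid = 0.5 * (lo + hi)
        if net_charge(seq, mid, n_phospho, methylated) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

Under this model each added phosphate lowers the pI while methylation raises it, which is the basis for the grouping the abstract describes. (A peptide with no remaining acidic group has no zero crossing; the sketch then returns the upper bound of the search interval.)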
Shaillay Kumar Dogra, Cheminformatics, Strand Life Sciences Pvt. Ltd, No. 237, Sir C. V. Raman Avenue, Raj Mahal Vilas, Bangalore, India, Fax: +91-80-23618996, firstname.lastname@example.org, Phone: +91-80-23611349
The Decision Tree (DT), as a classification algorithm, has certain advantages over methods such as Neural Networks or Support Vector Machines. Apart from producing interpretable models, DTs can inherently select the descriptors relevant to modeling the given property during tree building itself. However, in the context of cheminformatics data, characterized by a high-dimensional feature space and few samples available for training, DTs tend to suffer. Here, 'parameter tuning' and 'feature selection' become important. In this study, we present our findings on the influence of parameters such as the attribute selection measure, tree stopping criterion, and tree pruning method on the size and performance of the learned Decision Trees. Further, we introduce an initial feature-selection step, using wrappers, before invoking DT learning in order to cope with the high-dimensional data. Finally, we compare our results with those obtained from the 'Decision Forest', an ensemble of DTs.
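As a minimal illustration of wrapper-style feature selection, the sketch below scores each descriptor by training the learner itself on it (here a one-level decision stump standing in for a full tree) and greedily keeps the best-scoring descriptors. All names and data shapes are invented for the example.

```python
def stump_accuracy(col, labels):
    # train a depth-1 tree on one feature: try every threshold,
    # label each side by majority vote, keep the best accuracy
    best = 0.0
    for t in sorted(set(col)):
        correct = 0
        for side in ([l for v, l in zip(col, labels) if v <= t],
                     [l for v, l in zip(col, labels) if v > t]):
            if side:
                correct += max(side.count(c) for c in set(side))
        best = max(best, correct / len(labels))
    return best

def wrapper_select(X, y, k):
    # wrapper selection: rank features by the learner's own performance
    # on each one, then keep the k best (X is a list of descriptor rows)
    nfeat = len(X[0])
    ranked = sorted(range(nfeat),
                    key=lambda j: stump_accuracy([row[j] for row in X], y),
                    reverse=True)
    return ranked[:k]
```

The point of the wrapper approach, as opposed to a descriptor-agnostic filter, is that the evaluation criterion is the same model family that will ultimately be trained.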
Xia Ning, Department of Computer Science & Engineering, University of Minnesota, 464 DTC, 117 Pleasant Street, SE, Minneapolis, MN 55455, email@example.com, Phone: 612-624-5384, and George Karypis, Department of Computer Science, University of Minnesota, Minneapolis, MN 55455
In recent years there has been an increased interest in using structural descriptors in conjunction with advanced supervised learning algorithms (e.g. support vector machines and neural networks) for solving various problems arising in virtual screening. This research resulted in the development of highly effective activity and/or property prediction methods and has provided an objective and data-driven assessment of the characteristics that a descriptor set should have in order to achieve good performance. Unfortunately, this research has primarily focused on topological descriptors and to a large extent has ignored the various 3D descriptors.
In this talk we discuss our results in evaluating the various parameters of the design space for 3D descriptors and how they impact machine-learning-based virtual screening approaches. Specifically, our work focuses on questions such as: Which 3D elements of compound structures are most significant for bioactivity, and how can they be extracted efficiently? How can these significant 3D elements be quantitatively measured and represented in descriptors so as to optimally balance the trade-off between generality and specificity of structure representation? What is the best way to use 3D descriptors in kernel-based machine learning approaches in order to take full advantage of both the descriptors and the learning method? We address these questions by performing a comprehensive experimental evaluation using different 3D descriptors on a wide range of datasets.
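One simple way to "extract 3D elements" of the kind discussed above is a distance-binned atom-pair descriptor, where the bin width directly controls the generality/specificity trade-off (coarser bins are less conformation-specific). This is a generic sketch, not the authors' descriptor.

```python
import math
from itertools import combinations

def atom_pair_3d(atoms, bin_width=1.0, max_dist=10.0):
    # atoms: list of (element, (x, y, z)) tuples.
    # Counts element-pair occurrences per distance bin; a coarser
    # bin_width yields a more general (less specific) representation.
    desc = {}
    for (e1, p1), (e2, p2) in combinations(atoms, 2):
        d = math.dist(p1, p2)
        if d < max_dist:
            key = (tuple(sorted((e1, e2))), int(d // bin_width))
            desc[key] = desc.get(key, 0) + 1
    return desc
```

A dictionary descriptor of this form can feed a kernel directly (e.g. via an intersection over shared keys), which is the sort of descriptor/kernel pairing the evaluation in the talk is concerned with.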
Ramanathan Natarajan and Subhash C. Basak, Center for Water and the Environment, Natural Resources Research Institute, University of Minnesota, 5013 Miller Trunk Highway, Duluth, MN 55811, Fax: 218-720-4328, firstname.lastname@example.org, Phone: 218-720-4342
Quantitative treatment of chirality is essential, because a successful chirality measure would be able to guide asymmetric synthesis of new agrochemicals and pharmaceuticals. Though the Cahn-Ingold-Prelog rules are very successful in discriminating configurational isomers and assigning them absolute configurations, they fail to quantify molecular chirality. Even several of the commonly used topological indices, 3-D descriptors, and quantum-chemical energy descriptors cannot differentiate enantiomers or diastereomers. Some attempts to develop topological indices that differentiate stereoisomers and enantiomers have not been very successful, because they treat chirality as a discontinuous measure (+1 or -1) and hence are limited in their application to QSAR of diastereomers. We have developed a novel topological index describing molecular chirality. This new index treats chirality as a continuous measure, and hence we prefer to call it the Relative Chirality Index (RCI). The calculation of relative chirality indices and their application in QSAR modeling will be presented with appropriate examples.
Markus Sitzmann1, Igor V. Filippov2, Wolf-Dietrich Ihlenfeldt3, and Marc C. Nicklaus1. (1) Laboratory of Medicinal Chemistry, Center for Cancer Research, National Cancer Institute, National Institutes of Health, DHHS, Frederick, MD 21702, email@example.com, Phone: 301-846-5974, (2) Laboratory of Medicinal Chemistry, SAIC-Frederick, Inc., NCI-Frederick, Frederick, MD 21702, (3) Xemistry GmbH, D-35094 Lahntal, Germany
We give an overview of our recent work in the context of our Chemical Structure Lookup Service (CSLS). This service comprises (at the time of this writing) a collection of approximately 80 chemical structure databases from commercial and public sources, indexes approximately 40 million molecules representing approximately 27 million unique chemical structures, and continues to grow. We focus on our procedure for the normalization of chemical structures, a crucial step in the processing of chemical databases coming from different sources. It is needed for finding a canonical representation of a chemical which might otherwise be missed because of differing encodings of certain chemical features (e.g. different tautomers or resonance structures) or ill-defined parts of the structure (e.g. misdrawn functional groups, missing hydrogen atoms, missing charges, or incorrect valences). This structure normalization is performed for any incoming structure set to be registered in, or searched against, CSLS. We also discuss our structure-based hashcode identifiers, which are calculable for any small molecule. They are specifically designed to enable fine-tunable yet rapid compound identification even in very large datasets. They can be set to be sensitive to a variety of chemical features such as tautomerism, different resonance structures drawn for a charged species, and the presence or absence of certain fragments such as counterions. One such identifier, called FICuS, is one of the crucial mechanisms for identification and lookup of chemicals in CSLS – enabling CSLS to function essentially as an "address book" of small molecules. FICuS and the other identifiers are, however, not dependent on the infrastructure of this service. CSLS is freely available at http://cactus.nci.nih.gov/lookup. The service recognizes over 20 chemical structure representation formats as input data, including SD files, SMILES strings, InChI identifiers, and FICuS hashcodes.
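The idea of a fine-tunable, sensitivity-controlled hash identifier can be sketched as follows. The layer names and the hashing scheme are illustrative assumptions, not the actual FICuS algorithm: each chemical aspect is serialized into its own normalized text layer, and the identifier hashes only the layers it is asked to be sensitive to.

```python
import hashlib

def layered_id(skeleton, tautomer="", counterions="", layers=("skeleton",)):
    # Each structural aspect lives in its own normalized text layer;
    # the identifier is sensitive only to the layers requested, so
    # e.g. two tautomers collide unless the "tautomer" layer is included.
    parts = {"skeleton": skeleton, "tautomer": tautomer, "counterions": counterions}
    blob = "|".join(f"{k}={parts[k]}" for k in sorted(layers))
    return hashlib.sha256(blob.encode("utf-8")).hexdigest()[:16]
```

A skeleton-only identifier groups tautomers and salt forms under one "address", while widening the layer tuple makes the same function discriminate them, which is the tunability the abstract describes.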
Gregory M. Banik, Leo Collins, Marie Scandone, and Ty Abshear, Informatics Division, Bio-Rad Laboratories, TWO PENN CENTER PLAZA, SUITE 800, 1500 JFK Blvd., Philadelphia, PA 19102, firstname.lastname@example.org, Phone: 267-322-6931
At their inception, spectral reference databases were made available in the same fashion as other primary and secondary published resources: for sale to libraries or individuals on either a perpetual license or annual subscription basis. Recently, the open access initiative has led to the creation of pilot initiatives for open access spectral data such as NMRShiftDB.
The evolution of spectral resources will be discussed from the first reference spectra collection (the Sadtler reference spectra, now celebrating its 60th anniversary) to today's nascent open access initiatives. In addition to the traditional and open access models, a third model will be described for the open deposition of spectra by third-parties that retains a peer-reviewability feature to ensure quality and accountability in the creation of spectral reference collections. The use of data-driven software technologies to further ensure data quality will also be discussed.
Alex M. Clark, Research & Development, Chemical Computing Group, Inc, 1010 Sherbrooke St West, Suite 910, Montreal, QC H3A2R7, Canada, Fax: 514-874-9538, email@example.com, Phone: 514-393-1055
An implementation of protein-ligand interaction fingerprints will be described. Fingerprints are generated according to the presence of hydrogen bonds, ionic interactions, and displacement of solvent-accessible surface area between the ligand and surrounding residues. Elements of the accompanying user interface will be presented, which makes it straightforward to gain insights from the derived data. Several case studies will be described, including studies based solely on fingerprints derived from docking poses, improvement of structure-activity relationships by mixing crystal data with docking poses, and comparative examination of selective inhibitors of families of proteins.
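A minimal sketch of such a fingerprint, assuming a fixed residue list and the three interaction types named above; a real implementation derives the contacts from geometry, and the residue identifiers here are invented. Pose comparison is plain Tanimoto similarity on the resulting bit vector.

```python
INTERACTION_TYPES = ("hbond", "ionic", "surface")

def interaction_fp(contacts, residues):
    # contacts: set of (residue_id, interaction_type) observed for one pose;
    # one bit per residue per interaction type, in a fixed order
    return [1 if (res, kind) in contacts else 0
            for res in residues
            for kind in INTERACTION_TYPES]

def fp_similarity(fp_a, fp_b):
    # Tanimoto coefficient on the interaction bit vectors
    both = sum(a & b for a, b in zip(fp_a, fp_b))
    either = sum(a | b for a, b in zip(fp_a, fp_b))
    return both / either if either else 1.0
```

Because the bit layout is fixed by the residue list, fingerprints from crystal structures and from docking poses of the same protein are directly comparable, which is what makes the mixed case studies possible.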
Howard J Feldman, Research, Chemical Computing Group Inc, 1010 Sherbrooke St. W., Suite 910, Montreal, QC H3A2R7, Canada, Fax: 514-874-9538, firstname.lastname@example.org, Phone: 514-393-1055
Recently a collaboration between MSD-EBI, PDBj, and RCSB has made available the PDB Exchange Dictionary (http://pdbml.rcsb.org/schema/pdbx.xsd), adapted from the mmCIF dictionary. The data structures provided resolve many ambiguities of the aging PDB format, for example by introducing the concept of entities – unique molecules within the record. They are also better aligned with modern relational-database practice: each piece of information is stored only once. However, the problem remains that most popular software uses the PDB format for both input and output. We look at some of the hurdles involved in converting PDB files to PDBML format and present a new database system, Protein SILO (PSILO), which overcomes them. The benefits of correctly built PDBML files include more accurate interpretation of ligands and small molecules, more precise definitions of experimental conditions, and far more powerful search capabilities when the data are stored in a relational database.
Fabian Bendix1, Vlad Sladariu1, Thierry Langer2, and Gerhard Wolber3. (1) Computer Science Group, Inte:Ligand GmbH, Mariahilferstrasse 74B/11, 1070 Vienna, Austria, Fax: +43181749551371, email@example.com, Phone: +4369915075555, (2) Department of Pharmaceutical Chemistry, Computer Aided Molecular Design Group, University of Innsbruck, Institute of Pharmacy, Innsbruck A-6020, Austria, (3) Inte:Ligand GmbH, 2344 Maria Enzersdorf, Austria
While high-throughput virtual screening is used simply to narrow down a list of potential drug candidates from a large number of compounds, mining the results of cross-target screening can quickly become very complex. The analysis of complex activity patterns is an intensive, explorative task for human experts. Our goal in this work is to provide a powerful environment for analyzing activity profiles, defined as biological activity patterns each corresponding to a set of computationally predicted target affinities. Our framework visualizes and categorizes the results as quickly and directly perceivable activity maps. These maps can be used to identify the activity scope of one molecule or a set of molecules at a glance, and support application scenarios such as identifying unwanted biological effects or minimizing off-target effects. Our activity maps are enhanced for interactive use with linking-and-brushing techniques that directly connect molecule lists to target points on the map. The power of visualization and human exploration are combined to solve the crucial task of mining drug candidates and quickly identifying those with better simulated activity profiles.
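At its simplest, the "identify unwanted biological effects" scenario reduces to scanning each molecule's predicted activity profile for strong off-target affinities. The data layout, the pIC50-style scale, and the threshold below are assumptions for illustration only.

```python
def off_target_hits(profiles, primary_target, threshold=6.0):
    # profiles: {molecule: {target: predicted activity, e.g. pIC50}}
    # returns, per molecule, the targets other than the primary one whose
    # predicted affinity meets or exceeds the activity threshold
    return {mol: sorted(t for t, p in acts.items()
                        if t != primary_target and p >= threshold)
            for mol, acts in profiles.items()}
```

An interactive activity map essentially renders this table graphically, so that brushing a molecule highlights its flagged targets and vice versa.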
Hongyao Zhu, Computational Chemistry & Cheminformatics, Plexxikon, Inc, 91 Bolivar Dr, Berkeley, CA 94710, Fax: 510-647-4048, firstname.lastname@example.org, Phone: 510-647-4114
In ligand-protein co-crystal structures, it is often observed that the bound ligand is neither in the lowest-energy conformation nor in the preferred tautomerization state of the unbound ligand. This phenomenon is referred to as "conformer focusing" or "tautomer focusing". The assessment of the binding free-energy contribution of such conformer and tautomer focusing is therefore essential to reliably estimating binding affinity for ligand design. Taking into account the energy differences between tautomerization states and between conformational states in binding free-energy calculations represents one of the most challenging problems in computational chemistry. Several common structural motifs are presented to illustrate general considerations of tautomer and conformer focusing in structure-based molecular design.
Ike Shibley, email@example.com, Division of Science, Penn State Berks, Tulpehocken Rd., Reading, PA 19610
Many teachers do not include written assignments as part of a course grade because they do not feel qualified to grade the assignments. Writing can be a critical component of student learning in a variety of chemistry courses and grading these assignments can be accomplished by any chemistry teacher. Several styles of grading will be presented to demonstrate the effectiveness of grading written work in a science course without a large time commitment. Several types of writing assignments will be presented to explore the variety of ways that writing can be used to facilitate thinking. Because writing often helps students to think more clearly, an argument will also be made for including ungraded writing as well.
Bryan W. May, firstname.lastname@example.org, Department of Science, Central Carolina Technical College, 506 North Guignard Drive, Sumter, SC 29150
Students in a freshman level general chemistry class write a paper on a controversial science topic. Topics are tailored to meet specific interests of each student. The paper consists of at least three parts: both sides of the specific controversy and a conclusion section where the student describes their own conclusions or describes how a controversy was ultimately resolved. Particular emphasis is placed on research methodology and proper formatting.
Karen Anderson, Department of Chemistry, Madison Area Technical College, Madison, WI 53704
Requiring students to re-correct their first exam of the semester can be quite an eye-opener. With limited instructor comments on the exam, students make use of collaborative learning with peers, peer tutors, and other course resources in order to properly annotate a guided-discovery corrections exercise. As part of this reflection, a personal assessment of their performance is required, capturing highlights of their performance, areas that could be improved toward passing the next exam, and any major discovery revealed through writing up the exercise. The exam rewrite also gives an instructor a better understanding of why students make the mistakes they do, with some surprising and often humbling findings.
Donald J. Wink, email@example.com and Michael Dianovsky, firstname.lastname@example.org. Department of Chemistry (MC 111), University of Illinois at Chicago, 845 W. Taylor Street, Chicago, IL 60607
This paper describes the use of journals in a general education chemistry course for elementary education majors. The particular structure of the journal involves students' description of their understanding of a topic, its development, and its connection to an aspect of their lives. This becomes the basis of instructor feedback to correct student misunderstandings, validate their efforts to use metacognition, and shape their understanding of the meaning of content for themselves. The ways in which this aligns with principles of student identity development presented by Baxter Magolda and others will be discussed, with a particular understanding of how the writing changes to reflect an improved "voice" in the content domain.
Katie E. Amaral, email@example.com, Division of Science, Penn State Berks, Reading, PA 19610-6009
A work of non-fiction was used in a fundamental organic chemistry course to relate chemistry principles to everyday life. Three class days were set aside to discuss topics ranging from agriculture to preservatives to organic foods. A dialogue evolved from these discussions which, while not always germane to chemistry curriculum, increased student interest and engagement. Students then presented posters on self-chosen topics relating to the book and subsequent discussions, which encouraged them to explore appealing concepts in greater depth. The effect of this process on students' attitudes toward and perceptions of chemistry and learning will be discussed.
Anderson L. Marsh, firstname.lastname@example.org, Department of Chemistry, Lebanon Valley College, 101 N. College Ave., Annville, PA 17003
First-year seminar courses have become popular on college campuses, and generally provide freshmen with a detailed look at a specific topic. At Lebanon Valley College these courses also serve as an alternative to freshman English. This talk will be centered on one of those courses that specifically targeted science majors: nanotechnology. The overall structure of the course will be discussed, as well as the informal and formal writing assignments that were completed. During the semester students learned to read and think critically about nanotechnology using various texts. Class discussion over the semester focused on a variety of issues surrounding nanotechnology, ranging from the science and technology behind specific discoveries to ethical questions surrounding the application of those discoveries. The course allowed the students to explore certain aspects of scientific research that are not usually covered in freshman chemistry lecture or laboratory courses.
Wendy L. Keeney-Kennicutt, email@example.com, Department of Chemistry, Texas A&M University, P.O. Box 30012, College Station, TX 77842-3012, Adalet B. Gunersel, Department of Educational Psychology, Texas A&M University, Center for Teaching Excellence, 533 Blocker, College Station, TX 77843-4246, and Nancy Simpson, firstname.lastname@example.org, Center for Teaching Excellence, Texas A&M University, 533 Blocker, College Station, TX 77843-4246.
This mixed-methods study investigated student perceptions of an innovative educational tool and the instructor strategies that helped change initial student resistance into acceptance and engagement. The educational tool in this study is Calibrated Peer Review (CPR)TM, a web-based program that uses writing as a learning and assessment tool. Evaluations of CPRTM from 1515 students in a general chemistry course over seven semesters were analyzed. Results indicated that when the instructor actively promoted CPRTM's usefulness as a tool for learning by being explicit about her reasons for using CPRTM, making the assignments worth a significant part of student grades, and giving individual support outside of class, students reported a more positive experience. Analysis of student perceptions suggests that successful implementation of new tools requires attention to potential sources of student resistance at the outset as well as active listening and response to student concerns.
Lorena Tribe, email@example.com, Division of Science, Penn State Berks, Tulpehocken Rd., Reading, PA 19610
A great way of testing one's understanding of a topic is explaining it to someone else. This experience is captured in an assignment in which students write class notes and post them for the rest of the class. The process includes taking notes, preparing a legible document, and handing it over to a classmate for review. The reviewer then delivers the final copy to the course web site. This assignment enhances the learning experience in general chemistry courses: a deeper understanding of the material is attained by the students performing the assignment, students who were absent or unable to take notes have a source of information, and a sense of community is developed. Additional benefits are a written record of the material that has been addressed in class and a non-traditional component to grading.
Margaret E. Schott, firstname.lastname@example.org, Department of Natural Sciences, Dominican University, 7900 W. Division Street, River Forest, IL 60305
A variety of short writing assignments have been developed to help students gain conceptual understanding of chemically related topics and make connections with the world around us. Each of these assignments, crafted primarily for non-science majors, provides students with choices and the freedom to express themselves through orderly written formats. Having their thoughts in writing also helps students contribute to classroom discussions. Some examples of writing assignments include: (a) guided critique and evaluation of internet sites, (b) library database searching on a specific topic, aimed at making connections with the national news, and (c) a reading review form with guiding questions for engaging non-chemistry texts (the State of the World series, for instance). In the latter example, students are asked to create a bumper sticker, motto, or t-shirt slogan to sum up their learning.
Aeran Choi, email@example.com, Brian M. Hand1, and Thomas J. Greenbowe, firstname.lastname@example.org. (1) Department of Curriculum and Instruction, University of Iowa, N238 Lindquist Center, Iowa City, IA 52242, (2) Department of Chemistry, Iowa State University of Science & Technology, 3051 Gilman Hall, Ames, IA 50011-3111
This study analyzed students' written laboratory reports to identify the important components that promote quality of argument and conceptual understanding. The students were enrolled in a university general chemistry laboratory course for science and engineering majors. The lab reports were written using the template provided by the Science Writing Heuristic (SWH), an inquiry-based approach that uses team learning. A key component of the SWH approach is that students construct scientific arguments in their writings. A matrix for analyzing the quality of each scientific argument was developed and evaluated. Among the seven components of a proper argument structure, “providing evidence” and “constructing the claims/evidence relationship” were identified as two factors that contributed to quality of argument. The students' holistic argument scores were higher than their analytical total argument scores. Results indicate a strong link between student writings through the SWH approach and students' conceptual understanding of chemistry.
Tiffany R. Turner, email@example.com, Department of Chemistry and Biochemistry, Baylor University, One Bear Place, # 97348, Waco, TX 76798, Glenn B. Blalock, Glenn_Blalock@baylor.edu, Department of English, Baylor University, One Bear Place #97404, Waco, TX 76798-7404, and Carol Schuetz, Carol_Carson@baylor.edu, University Libraries-Reference and Library Instruction, Baylor University, One Bear Place #97146, Waco, TX 76798-7146.
As a result of the yearlong collaboration of a chemistry graduate student, a librarian, and a faculty member from English, a sequence of discipline-specific writing-to-learn and information literacy activities was integrated into the organic chemistry lab curriculum at Baylor University for two semesters. The purpose of this redesign was to engage students more fully in the process of learning how to think and to work in ways that will characterize their future professional identities. Although lab students typically write lab reports and maintain notebooks, this curricular redesign was intended to expand students' experiences with researching chemistry scholarship, with reading chemistry articles, and with writing discipline-specific genres other than lab reports. In addition, for each of the writing assignments, students were required to engage in peer review of classmates' work-in-progress. Assessment of these curricular revisions included surveys, focus groups, and primary trait analyses of student writing. This presentation will show how writing assignments and information literacy objects can be successfully integrated into the organic chemistry lab.
Pamela J. Higgins, Department of Chemistry, Dickinson College, PO Box 1773, Carlisle, PA 17013
In addition to introducing students to common biochemical techniques, the laboratory component of Dickinson College's biochemistry course also involves teaching students how to convey their experimental data in a more professional manner. Students struggle with the transition from “writing everything down” in the laboratory notebook to reporting data in the “concise yet complete” manner that is required by the scientific community. Using guided investigational writing exercises, students can determine what critical experimental information should be extracted from their notebook and subsequently how it should be included in a research article. Multiple examples of these exercises will be presented and discussed. This investigative model results in students with improved writing quality and an increased skill in dissecting the finer points of scientific communication in the primary literature.
Marvin Charton, firstname.lastname@example.org, Pratt Institute, 200 Willougby Avenue, Brooklyn, NY 11205
Philip S. Magee: A Life in QSAR
A brief review of the work of Phil Magee on various aspects of QSAR and of the founding of the Cheminformatics and QSAR Society.
Tim Clark, email@example.com, Friedrich-Alexander-Universitaet Erlangen-Nuernberg, Computer-Chemie-Centrum, Naegelsbachstrasse 25, 91052 Erlangen, Germany
In principle, local properties at or near the surface of a molecule should be adequate to define intermolecular interactions and therefore also most properties of interest in QSAR and QSPR. However, properties such as the local hardness, electronegativity and polarizability also allow us to model reactivity effectively. Furthermore, these local properties represent an interesting alternative to interaction energies with probes such as those used in CoMFA or GRID. This lecture will describe some of the more unusual applications of the surface-modeling approach using the local properties derived from semiempirical molecular orbital theory.
Yvonne C. Martin, firstname.lastname@example.org, 2230 Chestnut St., Waukegan, IL 60087 and Ki H. Kim, email@example.com, Hope Drug Discovery Research Laboratory, 260 Southgate Drive, Vernon Hills, IL 60061.
Es values are the classic linear free-energy parameters describing the steric effect of substituents. With the advent of 3D QSAR, and CoMFA in particular, we investigated whether Es is purely steric. We modeled the data on which Es is based, the acid-catalyzed hydrolysis of esters, and found that although the majority of the differences between compounds can be explained by steric fields, there is a statistically significant contribution of electrostatic fields. Our work settles the long-standing debate as to whether this hydrolysis is free of electronic effects, as Taft originally proposed.
Zsolt Zsoldos, Darryl Reid, Bashir S. Sadjad, and Aniko Simon. SimBioSys Inc, 135 Queen's Plate Dr, Suite 520, Toronto, ON M9W 6V1, Canada
The novel Interacting Surface Point Type (ISPT) descriptor used in eHiTS LASSO is independent of the underlying scaffold. Similarity is measured based on the surface properties of potential ligands, disregarding the 2D topology and the conformation of the ligands. This "fuzziness" makes the descriptor suitable for scaffold-hopping applications.
An automated non-linear learning model extracts the key binding patterns from the chemical interaction properties encoded in the ISPT descriptor value vectors of a set of compounds with known biological activity. The acquired knowledge is then applied to evaluate the surface interaction models of the screened chemical compounds. The advantage of the ISPT descriptor of eHiTS LASSO over 2D fingerprint-based descriptors is its independence from the underlying 2D structural motifs, allowing the recognition of structurally diverse ligands with similar interaction profiles. On the other hand, the ISPT descriptor does not depend on the 3D shape of the surface, so the descriptor values are independent of the conformation of the ligand – an advantage over other surface-based descriptors, which are biased by the specific input conformation.
The method has been validated on a variety of practical virtual screening tasks. Results will be presented demonstrating the ability of the software to retrieve the majority of actives from the screening assay at the top of the ranking list. Cluster analysis using traditional 2D structural descriptors is used to highlight the scaffold differences among the actives retrieved by eHiTS LASSO with high ISPT similarity scores. The scaffold-hopping ability of the descriptor makes LASSO a powerful tool for finding new leads without the same toxicity or potential intellectual-property issues as the query.
Donald W. Boerth, firstname.lastname@example.org, Todd C. Andrade1, and Erwin Eder2. (1) Department of Chemistry and Biochemistry, University of Massachusetts Dartmouth, North Dartmouth, MA 02747, (2) Institute of Toxicology, University of Würzburg, Versbacher Strasse 9, D-8700 Würzburg, Germany
Theoretical molecular modeling has been widely used in the design and discovery of new pharmaceuticals in medicine and, to a somewhat lesser extent, in the development of new pesticides for agriculture. Our research has sought to utilize computational modeling as a means of developing molecular descriptors for screening pesticides for genotoxic potential and induction of stress in crop plants, as well as for bioactivity. A combination of semi-empirical and ab initio theory coupled with density functional methods has been employed to characterize these interactions in several classes of pesticides. Electrostatic potentials have been used to establish putative sites for interactions of pesticide molecules with DNA bases and other biomolecules. These studies have been followed by modeling of the energetics of pesticide binding at these sites in the biological systems. A QSAR analysis is constructed by correlating the computed results with available experimental descriptors.
B Bersuker, email@example.com, Department of Chemistry & Biochemistry, Institute for Theoretical Chemistry, The University of Texas at Austin, 1 University Station A5300, Austin, TX 78712
It is shown that the electronic structure and geometry of a molecular system, obtained from conformational analysis and quantum-chemical calculations and presented in matrix form, serve as a unique and highly accurate descriptor of its interaction with the bioreceptor, in place of the arbitrary descriptors of conventional QSAR methods. By processing such electron-conformational (EC) matrices for a set of molecules with known activities, and comparing them with the EC matrices of inactive compounds by means of special programs, a common submatrix of activity is revealed: the pharmacophore (Pha). The latter is thus unique, with theoretically 100% qualitative (yes/no) prediction of activity. The second part of the EC method quantitatively accounts for the influence of Pha flexibility and out-of-Pha groups on the activity by means of regression analysis. Specific calculations are presented for antimitotic antitumor activity, glutamate receptor agonists, and antidiabetics.
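The "common submatrix of activity" step can be caricatured as follows, under the simplifying assumption that the EC matrices of the actives are already atom-aligned (the real method searches over atom correspondences): entries that agree within a tolerance across all active molecules form the pharmacophore candidate.

```python
def common_submatrix(matrices, tol=0.1):
    # matrices: equally sized 2-D lists of electron-conformational
    # parameters, one matrix per active molecule.
    # Returns {(i, j): mean value} for entries that agree within tol
    # across all actives -- the pharmacophore candidate entries.
    common = {}
    rows, cols = len(matrices[0]), len(matrices[0][0])
    for i in range(rows):
        for j in range(cols):
            vals = [m[i][j] for m in matrices]
            if max(vals) - min(vals) <= tol:
                common[(i, j)] = sum(vals) / len(vals)
    return common
```

In the full method these shared entries would additionally be required to be absent (or different) in the inactive compounds' matrices before being accepted as the Pha.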
David E. Patterson, firstname.lastname@example.org, Vistamont Consultancy, 571 Vistamont Ave., Berkeley, CA 94708
Given enough training compounds that bind in a consistent manner at a binding site, QSAR is a valuable tool for designing new molecules with suitable activity. Extending this case to a profile of desired target activities, while challenging, is well defined and may be approached as an intersection of independent QSAR models. The advent of "-omics" assays and systems-biology-based readouts presents new applications and associated questions about whether, and how, to apply QSAR usefully to design molecules with desired phenotypic patterns. The molecular target of the QSAR model is no longer known, and there may be multiple targets in multiple pathways interacting in nonmonotonic fashion. Does QSAR fill a role in this arena? What does one model? Preliminary results of prospective QSAR-based selection of compounds for scaffold hopping suggest that QSAR in systems biology holds promise, with substantial room for innovation.
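The intersection-of-models idea above can be sketched in a few lines. Everything here (the linear "models", their weights and thresholds, the compound descriptors and names) is invented for illustration, not taken from the talk:

```python
# A compound is retained only if every per-target QSAR model in the
# desired profile predicts the required outcome - the intersection of
# the independent single-target models.

def predict(model, descriptors):
    # each "model" here is just (weights, intercept, threshold)
    weights, intercept, threshold = model
    score = sum(w * d for w, d in zip(weights, descriptors)) + intercept
    return score >= threshold

profile = {
    "target_A": ((0.8, -0.2), 0.1, 0.5),   # want predicted activity here
    "target_B": ((-0.5, 0.9), 0.0, 0.5),   # and here
}

compounds = {
    "cmpd_1": (1.0, 1.2),   # hypothetical 2-descriptor vectors
    "cmpd_2": (0.1, 0.1),
}

hits = [name for name, desc in compounds.items()
        if all(predict(m, desc) for m in profile.values())]
print(hits)
```

Only compounds satisfying every model in the profile survive the intersection; a real profile would also include anti-targets whose models must predict inactivity.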
Robert D. Clark, email@example.com and Richard Cramer, firstname.lastname@example.org. Informatics Research Center, Tripos, 1699 S. Hanley Rd., St. Louis, MO 63144
Some 3D QSAR methods can be applied to conformational ensembles, but most require that each ligand of interest be put into a specific conformation, typically the "bioactive conformation." The well-established and widely used Comparative Molecular Field Analysis (CoMFA) and Comparative Molecular Similarity Indices Analysis (CoMSIA) methods go a step further and require that the specified conformation of each ligand be put into a common Cartesian frame of reference. Conformer generation and alignment can be done manually or automatically. Shared substructures, pharmacophores, docking modes, surface characteristics and topomeric heuristics can all be used to automate the process, with each yielding a model or models corresponding to a different null hypothesis. Here, we will examine the effectiveness of the various methodologies in creating robust and predictive models.
Curt M. Breneman, email@example.com, N. Sukumar, firstname.lastname@example.org, Mark J. Embrechts (3), Kristin P. Bennett, email@example.com, C. Matthew Sundling, firstname.lastname@example.org, Mike Krein (1), and Theresa Hepburn (1). (1) Department of Chemistry / RECCR Center, Rensselaer Polytechnic Institute, 110-8th Street, Center for Biotechnology and Interdisciplinary Studies, Troy, NY 12180, (2) Department of Chemistry and Center for Biotechnology, Rensselaer Polytechnic Institute, Cogswell Laboratory, 110 8th Street, Troy, NY 12180-3590, (3) Department of Decision Sciences & Engineering Systems, Rensselaer Polytechnic Institute, 110 8th Street, Troy, NY 12180, (4) Department of Mathematics, Rensselaer Polytechnic Institute, Amos Eaton Building, 110 8th St, Troy, NY 12180, (5) Department of Chemistry and Center for Biotechnology and Interdisciplinary Studies, Rensselaer Polytechnic Institute, 110 8th Street, Troy, NY 12180-3590
The evolution of "prospective" molecular property prediction methods that truly fulfill the promise of QSAR has been paced by the need for parallel development of information-rich molecular descriptors and modern multi-objective machine-learning schemes. By creating multiple models employing data fusion techniques and multiple endpoints, maximum benefit can be derived from the relationship between the chemical information encoded within modern molecular descriptors and several channels of available experimental data. Examples of data fusion QSAR will be discussed, including means for determining the domain applicability of the resulting models.
Douglas M. Hawkins, email@example.com, School of Statistics, University of Minnesota, 313 Ford Hall, 224 Church Street SE, Minneapolis, MN 55455 and Jessica J. Kraker, firstname.lastname@example.org, Department of Mathematics, University of Wisconsin, Eau Claire, Eau Claire, WI 54701.
Fitting a QSAR model involves two vital follow-up steps. Model checking involves ensuring that the data used for model development are compatible with the model fitted. This can fail to be true because of outlier data cases, excessively influential cases, mixtures of populations, and failures such as unmodeled curvature. Traditional diagnostics such as plots of observed or residual values versus predicted values are unreliable in the typical high-dimensional QSAR setting. Model validation involves assessing whether, and under what conditions, the model fitted to the data can be applied to future cases. Model validation was traditionally done using 'hold-out' samples kept out of the entire analysis and used only for assessing the model fitted to the 'learning' cases. We show that this approach is inefficient in the usual QSAR settings and should be replaced by newer computation-intensive cross-validation methods that make full use of all data for both learning and validation. A vital part of this process is avoiding the potential pitfalls that await if these methods are applied incorrectly; proper and improper implementations are illustrated in the context of a QSAR data set.
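As a minimal illustration of the computation-intensive cross-validation advocated above, the sketch below (hypothetical data, a simple univariate least-squares model) computes a leave-one-out cross-validated r-squared, usually written q2: every case serves in turn as the validation set, so all data contribute to both learning and validation:

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = a*x + b (univariate)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    a = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
         / sum((x - mx) ** 2 for x in xs))
    return a, my - a * mx

def loo_press(xs, ys):
    """Predictive residual sum of squares from leave-one-out CV."""
    press = 0.0
    for i in range(len(xs)):
        tx, ty = xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:]
        a, b = fit_line(tx, ty)            # refit without case i
        press += (ys[i] - (a * xs[i] + b)) ** 2
    return press

xs = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]       # hypothetical descriptor
ys = [0.1, 1.1, 1.9, 3.2, 3.9, 5.1]       # hypothetical activities
mean_y = sum(ys) / len(ys)
ss = sum((y - mean_y) ** 2 for y in ys)   # total sum of squares
q2 = 1.0 - loo_press(xs, ys) / ss         # cross-validated r-squared
print(round(q2, 3))                       # close to 1 for near-linear data
```

The same loop structure extends to k-fold or repeated random splits; the point is that no case is permanently sacrificed to a hold-out set.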
Tudor I. Oprea, email@example.com, Tharun K. Allu, Dan C. Fara, and Oleg Ursu. Division of Biocomputing, University of New Mexico School of Medicine, MSC11 6145, University of New Mexico, Albuquerque, NM 87131
The often-discussed qualities of good QSAR models increasingly include the ability to correctly predict external sets, in particular for objects outside the already-covered descriptor space. Since the proof of the pudding is in the eating, we address the question of how true QSAR-based classifier models really are. The predictivity and applicability of two classifiers, one for drug-likeness and one for aggregation, are discussed in the context of early chemical probe and lead discovery. Each of the two classifier models was extensively investigated using decision trees (CART, WEKA) and support vector machines (LIBSVM), starting from substructures (SMARTS) and "2D"-based properties. We used ensemble methods with randomized training-set data to build a committee of subsidiary classifiers (C4.5 decision tree MultiBoosting, SVM), then combined the individual outputs to create a single decision from the committee of models as a whole; thus, 100 models were in fact used for classification. Beyond cross-validation (an internal consistency check for model building) and small test-set external predictivity, we used the general accuracy rate as a "blind prediction" evaluation, in which significantly larger numbers of previously unknown molecules were externally predicted. Although these methods indicate that the classifiers have predictive capability, we found that forward predictions (predictions made before the actual experiments) do not always work as expected, in part because appropriate information is not always made available to the models before the experiments. However, the quality of the experiments can also be questioned when confidence in the QSAR models allows it.
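The committee scheme described above can be sketched in miniature. The synthetic 1-D data, the threshold "stump" classifiers and the committee size below are toy stand-ins for the C4.5/SVM members actually used:

```python
import random

random.seed(0)

# toy 1-D "descriptor" with a noisy class boundary near x = 5
data = [(x + random.gauss(0, 0.5), int(x >= 5))
        for x in range(10) for _ in range(5)]

def train_stump(sample):
    """Pick the threshold that best separates the two classes."""
    best_t, best_acc = 0.0, -1.0
    for t in [p[0] for p in sample]:
        acc = sum((x >= t) == bool(y) for x, y in sample) / len(sample)
        if acc > best_acc:
            best_t, best_acc = t, acc
    return best_t

# randomized (bootstrap) resamples of the training set -> a committee
committee = [train_stump(random.choices(data, k=len(data)))
             for _ in range(25)]

def vote(x):
    """Combine the individual outputs into a single committee decision."""
    yes = sum(x >= t for t in committee)
    return int(yes > len(committee) / 2)

print(vote(8.0), vote(1.0))
```

Each member sees a different randomized view of the data, and the majority vote is typically more stable than any single member.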
Andrew C. Good, firstname.lastname@example.org, Structural Biology and Modeling, Bristol-Myers Squibb, 5 Research Parkway, Wallingford, CT 06492, Andrew Tebben, email@example.com, Computer-Assisted Drug Design, Bristol-Myers Squibb, Pharmaceuticals Research Institute, P. O. Box 4000, Princeton, NJ 08543-4000, and Brian Claus, Computer-Assisted Drug Design, Bristol-Myers Squibb, P.O. Box 4000, Princeton, NJ 08543-4000.
A major historical limitation of many QSAR analyses has been a reliance on retrospective analysis, which has typically limited their extension to the design of novel compounds, a central requirement for many computational chemists. Molecular similarity calculations are in essence QSAR models driven by a molecular template, and have the advantage of easy extensibility to prospective analysis. With this in mind, the DOCK program has been extensively modified to permit its application in ligand-based de novo design. A Gaussian-based scoring function has been incorporated to permit shape, electrostatic-potential and weighted colored force-field similarity searching. In addition, Gaussian-based exclusion volumes and R-group linker constraints have been added to permit the inclusion of steric-constraint SAR and fragment screening. When combined with fragment databases culled from existing chemistry space, the resulting program, KIN, provides a highly flexible tool for de novo core/head-group replacement. The software's utility is highlighted with a number of search examples.
Zheng Yang, Zheng.P.Yang@gsk.com, Computational and Structural Chemistry, Molecular Discovery Research, GlaxoSmithKline Pharmaceuticals, 1250 South Collegeville Road, Collegeville, PA 19426
This presentation will discuss application of the GSK in-house pharmacophore fingerprint quantitative structure-activity relationship (pFPQSAR) method to 7TM drug design. pFP is a GSK in-house implementation of three- and four-point pharmacophore fingerprinting that incorporates proprietary GSK physicochemical featurization and utilizes imported molecular conformers. The pFPQSAR method uses the binary pharmacophore bits as 3D molecular descriptors, and applies a nonlinear iterative partial least squares algorithm to correlate these descriptors with compound biological activities to construct QSAR models. The pFPQSAR methodology has been validated on GSK in-house kinase datasets, and has been used to build predictive QSAR models to support prospective 7TM ligand-based drug design for lead optimization.
Eugene N. Muratov, firstname.lastname@example.org, Victor E. Kuz'min, email@example.com, and Anatoly G. Artemenko (2). (1) Computational Center for Molecular Structure and Interactions, Jackson State University, 1400 J.R. Lynch Street, Jackson, MS 39217, (2) Laboratory of Theoretical Chemistry, A.V. Bogatsky Physical-Chemical Institute NAS of Ukraine, Lustdorfskaya Doroga 86, Odessa, 65080, Ukraine
Hierarchical QSAR technology (HiT QSAR) has been developed for the solution of structure-activity/property tasks in general, and especially for optimizing the process of creating new, effective pharmaceutical agents. At each stage of this technology, the QSAR task is solved using information received from the previous stage (a system of continually improved solutions). The Simplex representation of molecular structure (SiRMS) is the basis of the technology. In SiRMS, any molecule can be represented as a system of different simplexes (tetratomic fragments of fixed composition, structure, chirality and symmetry). This representation unifies the description of the spatial structure of compounds while preserving complete stereochemical information, and allows determination of the molecular fragments that increase or decrease the investigated property. The advantage of HiT QSAR over several well-known QSAR approaches is demonstrated on sets of acetylcholinesterase and angiotensin-converting enzyme inhibitors, and its successful application has been confirmed by the solution of a variety of QSAR problems.
Anton J Hopfinger, firstname.lastname@example.org, College of Pharmacy, University of New Mexico, MSC 09 5360, Albuquerque, NM 87131-0001
The modeling of an ADME/Tox endpoint is highly dependent upon the complexity of the molecular mechanism involved. In cases where the molecular mechanism is complex, and/or pharmacological understanding is quite limited, an empirical informatics approach to developing predictive models is the preferred methodology. We have developed a set of universal descriptors, called 4D-fingerprints, which capture the three-dimensional size, shape, chemical composition, reactive state and molecular flexibility of a molecule for informatics-type ADME/Tox modeling. These descriptors have been applied to skin sensitization and eye irritation. For ADME/Tox endpoints where cellular membrane permeation and diffusion are involved, a pseudo structure-based design approach called membrane-interaction (MI-) QSAR analysis can be applied. Here, descriptors derived from the simulation of an organic molecule passing through a phospholipid membrane assembly are used together with intramolecular descriptors of the organic molecule to build MI-QSAR models. MI-QSAR simulation modeling has revealed that some organic compounds pass directly through the membrane and, presumably, into the interior of a cell, while other organic molecules use the membrane bilayer as a two-dimensional 'sea', hopping from cell membrane to cell membrane in order to cross tissue composed of the cells. We will discuss the MI-QSAR modeling of blood-brain barrier penetration by organic compounds.
Vladimir Potemkin, email@example.com, Department of Chemistry, Chelyabinsk State University, Br. Kashirinych 129, Chelyabinsk, 454021, Russia
Many modern methods for virtual discovery predict bioactivity at the stage of receptor-ligand interaction. The biological action of a drug, however, comprises more than one stage, even in in vitro experiments. A new method for virtual discovery is therefore proposed. The method creates a pseudo-atomic receptor model and simulates the movement of a drug molecule to the receptor through water and membrane. Interaction with competitive sites is taken into account, and the method also allows the metabolism of the drug to be predicted. The method has been used for a detailed elucidation of the stages of action of membranotropic dihydrofolate reductase inhibitors. It has been shown that the biological action of these drugs includes three critical stages: penetration through the membrane, diffusion, and interaction with the target. Some of the compounds act as pro-drugs whose metabolism yields the active molecule. Quantitative relationships have been obtained for each stage, and the importance of every stage is estimated for each molecule. The algorithm is now used for virtual screening across more than 20 kinds of biological activity.
Shaillay Kumar Dogra, firstname.lastname@example.org and Ramesh Hariharan. Cheminformatics, Strand Life Sciences Pvt. Ltd, No. 237, Sir C. V. Raman Avenue, Raj Mahal Vilas, Bangalore, India
Quantitative Structure-Activity Relationship (QSAR) modeling has acquired complex dimensions since its humble beginnings. At times modelers focus on fitting some model equation, compromising comprehensibility in the process. For interpretable models, a two-pronged approach can be followed. One prong is to use intuitive descriptors in QSAR modeling. The other, presented here, advocates preferring simpler models over more complex ones. Model complexity can be defined in terms of algorithm complexity, the number of descriptors used in the model, computation time required for training, model interpretability, etc.; naturally, better-performing models are still preferred over merely simpler ones. We present an approach in which the user follows simple flowcharts that guide the modeling process, moving from simple to complex algorithms as and when a suitable model cannot be fit to the data. Several experience-based guidelines and technical tips that facilitate QSAR modeling are also presented.
Bo O. J. Nordén, Bo.Norden@AstraZeneca.com (1), Igor Shamovsky (2), Balint Gabos (2), Magnus Munck af Rosenschöld (2), Matti Lepistö (2), Göran Carlström (2), Johan Evenäs (2), Djordje Musil (3), and Kristina Stenvall, Kristina.Stenvall@astrazeneca.com (2). (1) Department of Medicinal Chemistry, AstraZeneca R&D Lund, Scheelevagen, S-22187 Lund, Sweden, (2) Department of Medicinal Chemistry, AstraZeneca, S-221 87 Lund, Sweden, (3) Merck KGaA, Frankfurter Str. 250, D-64293 Darmstadt, Germany
Matrix metalloproteinases (MMPs) are a large family of zinc-containing enzymes that regulate the turnover of extracellular matrix proteins and the activity of a number of pro-inflammatory mediators. Abnormal enzymatic activity of macrophage metalloelastase (MMP-12), one member of the MMP superfamily, is implicated in the development of cigarette smoke-induced emphysema, a hallmark of chronic obstructive pulmonary disease (COPD). Inhibition of MMP-12 activity therefore represents an attractive therapeutic strategy for the treatment of COPD. Since the enzymatic sites of MMPs contain Zn(II), the vast majority of MMP inhibitors carry a Zn(II)-chelating group to specifically target those sites. Most known MMP inhibitors possess a hydroxamic acid moiety, a strong Zn(II)-binding group that leads to their high-affinity binding to the enzymatic sites of MMPs; correspondingly, such compounds generally exhibit potency across a wide range of MMPs. Hydroxy hydantoins, a novel class of MMP-12 inhibitors, consist of a weak Zn(II)-binding group (a hydantoin) and a lipophilic biphenyl P1' moiety, which is connected to the hydantoin via a carbinol linker. The binding mode of hydroxy hydantoins in MMPs is revealed by X-ray crystallography and solution-state NMR, and is consistent with those of other Zn(II)-binding MMP inhibitors. The selectivity of hydroxy hydantoins for MMP-12 against MMP-2, MMP-8 and MMP-9 is studied by different multivariate approaches and 3D-QSAR. Lipophilic substituents tend to increase the potency of hydroxy hydantoins at all MMPs, whereas direct interactions of para- and meta-substituents of the terminal ring of the P1' domain with Lys-241, a distinctive residue of MMP-12, drive the selectivity.
Terry R. Stouch, email@example.com, Computational Chemistry, Lexicon Pharmaceuticals, 350 Carter Road, Princeton, NJ 08540 and Balvinder S. Vig, firstname.lastname@example.org, Biopharmaceutics R&D, Bristol-Myers Squibb, One Squibb Drive, New Brunswick, NJ 08903.
A designed series of mono-, di-, tri- and tetrapeptides and a functional membrane-depolarization assay were used to assemble a consistent data set that allowed the definition of a true substrate pharmacophore and QSAR. Straightforward molecular modeling, together with the charge, size, hydrophobicity and flexibility of the constituent amino acids as well as total molecular charge and hydrophobicity, was found sufficient to explain the unexpected range in transport seen for dipeptides.
Essam Metwally, email@example.com, Ronald D Mathison, firstname.lastname@example.org, Joseph S Davison, email@example.com, and Robert D. Clark, firstname.lastname@example.org. (1) Informatics Research Center, Tripos, 1699 S. Hanley Rd., Saint Louis, MO 63144, (2) Department of Physiology and Biophysics, University of Calgary, 3330 Hospital Dr. NW, Calgary, AB T2N-4N1, Canada
Submandibular tripeptide FEG (Phe-Glu-Gly) and its analogues are potent anti-inflammatory peptides. 3D-QSAR comparative molecular field analysis (CoMFA) and comparative molecular similarity indices analysis (CoMSIA) were employed on the compound series after a GALAHAD model was constructed using a subset of the 10 most active compounds; the remaining compounds were flexibly aligned to this model. The models obtained were used to predict the activities of the 20-compound test set. Biological activity was used as a measure of the ability of a compound to bind to the "active site", with compounds that exacerbate the response also treated as good binders. The 3D-QSAR models obtained were in good statistical agreement with the experimental results despite having relatively low q2 values, and the contour plots generated for the models are in good qualitative agreement with experimentally observed changes in behaviour.
David E Reichert, email@example.com, Mallinckrodt Institute of Radiology, Washington University School of Medicine, 510 S. Kingshighway Blvd, Campus Box 8225, St Louis, MO 63110
Imaging modalities commonly used in modern radiology include gamma scintigraphy, positron emission tomography (PET) and magnetic resonance imaging (MRI). All three rely heavily on biologically stable metal coordination complexes for providing signal or contrast. As part of our research in the design of new or improved imaging agents, we have utilized both 2D-QSAR and 3D-QSAR techniques such as CoMFA and CoMSIA on various classes of coordination complexes. These studies have allowed the prediction of important physicochemical parameters such as logP, as well as predictions of pharmacokinetic behavior and receptor binding. A particularly successful application is the use of CoMFA or CoMSIA to serve as scoring functions for receptor docking studies of targeted metal complexes. The studies to be presented will focus primarily on copper complexes, but will also discuss technetium- and rhenium-containing complexes.
Vellarkad N. Viswanadhan, firstname.lastname@example.org, Yaxiong Sun (1), and Mark H. Norman (2). (1) Molecular Structure, Amgen, Inc, MS 29-M-B, Small Molecule Drug Discovery, 1 Amgen Center Drive, Thousand Oaks, CA 91360, (2) Small Molecule Drug Discovery, Amgen, Inc, 1 Amgen Center Drive, Thousand Oaks, CA 91320
Three-dimensional Quantitative Structure-Activity Relationship (3D-QSAR) models for human TRPV1 channel antagonists were developed based on Comparative Molecular Field Analysis (CoMFA) and Comparative Molecular Similarity Indices Analysis (CoMSIA), using a training set of 61 cinnamide TRPV1 antagonists. These models were tested on an independent test set of 47 antagonists not included in the training set. Alignment of the compounds followed a recently described procedure that weights both internal energy and atom-to-atom matching against a reference molecule in its minimum-energy conformation. The dependence of the results on partial charge assignment was explored using the Gasteiger-Huckel, Gasteiger-Marsili and AM1-BCC charge calculation methods; AM1-BCC charge assignments gave superior results overall for both CoMFA and CoMSIA models, and the two methods gave very similar results. For CoMFA, the best cross-validated correlations included steric and electrostatic fields (r2 = 0.96, q2 = 0.58, n = 61 for the training set and r2 = 0.50, n = 47 for the test set). For CoMSIA, the best cross-validated correlations included steric, electrostatic and hydrophobic fields (r2 = 0.95, q2 = 0.57, n = 61 for the training set and r2 = 0.48, n = 47 for the test set). Docking of these molecules in a homology model of the TM3/4 helical region of TRPV1 showed consistency between the homology model and the 3D-QSAR models. Additionally, the molecular alignment used in the 3D-QSAR models was consistent with the proposed binding modes of known activators of the TRPV1 channel, such as capsaicin and resiniferatoxin.
C. Eyermann, Joe.Eyermann@astrazeneca.com (1), P. Fleming, email@example.com, M. Gravestock, firstname.lastname@example.org, T. Jones, email@example.com, G. Kern, firstname.lastname@example.org, R. Ramsay, email@example.com, F. Reck, firstname.lastname@example.org, and F. Zhou, email@example.com. (1) Infection Discovery, Cancer and Infection Research Area, AstraZeneca, R&D Boston Inc, 35 Gatehouse Drive, Waltham, MA 02451, (2) University of St. Andrews, North Haugh, St. Andrews, Fife KY16 9ST, Scotland
Oxazolidinones are a new class of synthetic antibacterial agents that show good activity against Gram-positive bacteria. A concern with oxazolidinones as a drug class has been inhibition of MAO, especially Type A (MAO-A), due to structural similarity to MAO inhibitors such as toloxatone. Inhibition of MAO-A could potentially lead to severe hypertensive crises as a result of ingestion of tyramine-containing food together with an oxazolidinone drug. It is therefore desirable to develop novel oxazolidinone drugs that have improved activity against linezolid-resistant Gram-positive bacteria and show an improved safety profile with regard to MAO-A inhibition. Using the published crystal structure of MAO-B, a homology model of MAO-A has been built and used to interpret experimental studies on the orientation of oxazolidinones in the MAO-A active site. In addition, the homology model has been used to guide the design of new oxazolidinones that have reduced MAO-A inhibition while maintaining good antibacterial activity.
Thomas C. Sparks, firstname.lastname@example.org, Gary D. Crouse, email@example.com, James E. Dripps (1), Peter B. Anzeveno (1), Jacek Martynow (1), and James Gifford (1). (1) Discovery Research, Dow AgroSciences, 9330 Zionsville Rd., Bldg. 306/G1, Indianapolis, IN 46268, (2) Dow AgroSciences LLC, 9330 Zionsville Road, Indianapolis, IN 46268
Improving the efficacy and spectrum of spinosad, a novel fermentation-derived insecticide, has long been a goal. Because spinosad is a large, complex fermentation product, identifying specific modifications likely to result in improved activity was difficult, since most modifications decreased activity. Along the path to spinetoram (DE-175), a variety of approaches to spinosyn QSAR were examined, including multiple regression, CoMFA and others, all unsuccessful. However, application of artificial neural networks (ANN) to the spinosyn QSAR problem identified new synthetic directions for improved activity, which subsequent synthesis and testing confirmed. This information, coupled with other information on spinosyn structure-activity relationships, directly led to the discovery of spinetoram. Scheduled for launch in late 2007, spinetoram provides both improved efficacy and expanded spectrum while maintaining the exceptional environmental and toxicological profile already established for the spinosyn chemistry. Details of the ANN-based QSAR and the subsequent identification of spinetoram will be discussed.
Maria A. Grishina, firstname.lastname@example.org, Vladimir Potemkin, email@example.com, and Elena S. Pereyaslavskaya. Department of Chemistry, Chelyabinsk State University, Ul. Br. Kashirinych, 129, Chelyabinsk, 454021, Russia
Methods of pattern recognition play an important role in the analysis and prognosis of biological activity. Most existing pattern recognition methods do not consider the conformational state of biologically active molecules, their electronic structure, etc. In this work, a new paradigm for pattern recognition of biologically active compounds that addresses these shortcomings has therefore been suggested. The method is based on the combination of two 3D QSAR algorithms, BiS/MC and ConGO, which will be described in the presentation. Sets of anti-tumor, anti-inflammatory and other drugs have been considered within this approach. It has been shown that the suggested paradigm recognizes active drug molecules with a quality of not less than 0.90.
George D. Purvis III, gpurvis@CACheSoftware.com, Scigress Development, Biosciences Group, Fujitsu, 15244 NW Greenbrier Pkwy, Beaverton, OR 97006
Non-linear effects in structure-activity relationships are sometimes incorporated by transforming the descriptors with mathematical functions such as the square root or logarithm. Less commonly, products of two descriptors are used to account for cross-dependencies. We have found that simple division of descriptors by molecular weight creates "intrinsic descriptors" that are independent of the molecular weight of the chemical described. Intrinsic descriptors are frequently selected as the best descriptors in our automated QSARs. This talk will show examples of their appearance and suggest a physical explanation as to why they are preferred.
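The intrinsic-descriptor construction is simple enough to show directly. The molecules, molecular weights and property values below are invented for illustration and chosen so that the size-normalized values nearly coincide:

```python
# Dividing a size-dependent descriptor by molecular weight yields an
# "intrinsic" value comparable across molecules of very different size.

molecules = [
    # (name, molecular weight, size-dependent descriptor value - invented)
    ("small_amine",  59.1,  26.0),
    ("large_analog", 295.4, 130.0),
]

for name, mw, raw in molecules:
    intrinsic = raw / mw          # per-unit-mass version of the descriptor
    print(name, round(intrinsic, 3))
```

The raw descriptor differs five-fold between the two molecules, while the intrinsic version is nearly identical, which is the scale-independence the abstract attributes to such descriptors.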
Alexander Tropsha, firstname.lastname@example.org, Laboratory for Molecular Modeling, School of Pharmacy, University of North Carolina at Chapel Hill, CB # 7360, Beard Hall, School of Pharmacy, Chapel Hill, NC 27599-7360
We discuss a novel approach to generating fragment-based molecular descriptors. Molecules are represented as labeled undirected graphs: for organic molecules, the nodes are atoms labeled by their chemotypes and the edges are bonds; for protein structures, the nodes are residue Cα atoms linked by physical-proximity edges. Fast Frequent Subgraph Mining (FFSM) is used to find common chemical fragments (subgraphs) that occur in at least a subset of all molecules in a dataset. The collection of frequent subgraphs forms a dataset-specific descriptor set. We present examples of the application of these novel fragment descriptors in QSAR modeling of several chemical datasets of biologically active molecules. Concurrently, we discuss the application of the same methodology to identifying protein-family-specific residue patterns. This study presents an example of expanding QSAR-like approaches toward novel areas such as structural bioinformatics and highlights the breadth of QSAR modeling and the legacy of its pioneers such as Phil Magee.
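A hedged sketch of the frequent-fragment descriptor idea follows. For brevity, the "subgraphs" mined here are single labeled bonds, whereas FFSM mines arbitrary connected subgraphs; the toy molecules are invented:

```python
from collections import Counter

# Toy molecules as lists of labeled bonds (atom-label pairs).
molecules = {
    "m1": [("C", "C"), ("C", "O"), ("C", "N")],
    "m2": [("C", "C"), ("C", "O")],
    "m3": [("C", "C"), ("C", "Cl")],
}

def canon(edge):
    return tuple(sorted(edge))       # bond C-O is the same as O-C

min_support = 2 / 3                  # fragment must occur in >= 2 of 3 molecules
counts = Counter()
for edges in molecules.values():
    for frag in {canon(e) for e in edges}:   # count each fragment once per molecule
        counts[frag] += 1

frequent = sorted(f for f, c in counts.items()
                  if c / len(molecules) >= min_support)

# The frequent fragments define a dataset-specific binary descriptor vector.
descriptors = {name: [int(f in {canon(e) for e in edges}) for f in frequent]
               for name, edges in molecules.items()}
print(frequent)
print(descriptors["m3"])
```

Replacing single bonds with mined connected subgraphs gives the dataset-specific descriptor set described in the abstract; the support threshold plays the same role as FFSM's minimum-occurrence parameter.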
Jorge Galvez Sr., email@example.com, Department of Physical Chemistry, Faculty of Pharmacy, University of Valencia, Avenida V.A. Estelles s.n, 46100-Burjasot (Valencia), Spain
The efficiency of molecular topology in the design and selection of new chemical compounds, and particularly new drugs, has been solidly demonstrated in recent years. Literature sources account for many discoveries of new leads in different therapeutic areas, illustrating this efficiency beyond any reasonable doubt. Since the procedure is commonly based on the search for topological patterns of activity by similarity with known drugs, yet also yields non-obvious hits (dissimilar chemical structures and classes), the formalism has been criticised for appearing to act as a black box compared with conventional methods based on knowledge of the drug-receptor interaction. Procedures based upon such interaction are frequently referred to as rational drug design. The aim here is to show that although the two approaches are alternatives, they are also complementary; examples are shown in which the topological characterization of the isolated drugs clearly agrees with what is expected from current knowledge of the drug-biological target interaction, which in turn may also be depicted topologically in different ways. Such an approach would be better named targeted, rather than rational, drug design.
John H. Block, John.Block@oregonstate.edu, Department of Pharmaceutical Sciences, Oregon State University, College of Pharmacy, Corvallis, OR 97331 and Douglas Henry, BIOSAR Research, San Leandro, CA.
Using a small database of defined substrates in humans for cytochrome P450 mixed function oxidases, a series of descriptors were evaluated with respect to how well they correctly classified substrates. The descriptors ranged from structural keys to topological to stereochemical to electronic. A variety of classification schemes were examined in terms of their ability to point out which descriptors are important for predicting the cytochrome P450 specificity for a substrate. Results illustrate the relative effectiveness of the various kinds of descriptors and classification methods, as well as the value of using as well-defined a data set as possible.
Olga Obrezanova, firstname.lastname@example.org, Joelle MR. Gola, email@example.com, and Matthew D. Segall, firstname.lastname@example.org. (1) In Silico ADMET, BioFocus DPI, 127 Cambridge Science Park, Milton Road, CB4 0GD, Cambridge, United Kingdom, (2) ADMET, BioFocus DPI, 127 Cambridge Science Park, Milton Road, CB4 0GD, Cambridge, United Kingdom
We will discuss the application of the Gaussian Processes method to predictive QSAR modelling of Absorption, Distribution, Metabolism and Excretion (ADME) properties. The method has overcome many of the problems of existing QSAR modelling techniques and is sufficiently robust to enable automatic model generation - one of the demands of modern drug discovery. The method is suitable for modelling nonlinear relationships, does not require subjective determination of model parameters, such as variable importance or network architectures, is inherently resistant to overtraining, and has an ability to select important descriptors. The method is based on a Bayesian probabilistic approach. Originating in the machine learning field, it has not yet been widely used in QSAR or ADME modelling. We will show application of Gaussian Processes to modelling of several ADME properties, discuss how the method is used as part of an automatic process and compare it with other modelling techniques.
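As a sketch of the underlying machinery (not BioFocus DPI's implementation), the posterior mean of a Gaussian Process can be computed in closed form for a two-point training set; the RBF kernel, length scale and data below are illustrative:

```python
import math

def rbf(a, b, length=1.0):
    """Radial basis function (squared-exponential) kernel."""
    return math.exp(-((a - b) ** 2) / (2 * length ** 2))

xs = [0.0, 2.0]          # training descriptor values (hypothetical)
ys = [1.0, 3.0]          # measured property values (hypothetical)
noise = 1e-6             # diagonal jitter for numerical stability

# 2x2 kernel matrix K and its closed-form inverse
k11 = rbf(xs[0], xs[0]) + noise
k22 = rbf(xs[1], xs[1]) + noise
k12 = rbf(xs[0], xs[1])
det = k11 * k22 - k12 * k12
inv = [[k22 / det, -k12 / det], [-k12 / det, k11 / det]]

def predict(x):
    """GP posterior mean: k(x, X) @ K^-1 @ y."""
    kstar = [rbf(x, xs[0]), rbf(x, xs[1])]
    alpha = [sum(inv[i][j] * ys[j] for j in range(2)) for i in range(2)]
    return sum(kstar[i] * alpha[i] for i in range(2))

print(round(predict(0.0), 2), round(predict(1.0), 2))
```

With near-zero noise the prediction passes through the training points and interpolates smoothly between them; in a full GP the same linear algebra also yields a predictive variance, the basis of the method's built-in applicability estimate.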
Rajarshi Guha, email@example.com, School of Informatics, Indiana University, 1130 Eigenmann Hall, 1900 E 10th Street, Bloomington, IN 47406 and Stephan Schurer, firstname.lastname@example.org, Chemical Informatics, The Scripps Research Institute Florida, 5353 Parkside Drive, RF-A, Jupiter, FL 33458.
We present a study describing the use of a random forest ensemble to predict the cytotoxicity of MLSCN screening data. The models were built using a training set taken from MDL ToxNet using LeadScope, with 1052-bit BCI fingerprints as the feature set. The real-valued toxicity data were discretized using a cutoff, and the class assignments were used to build the ensemble, which exhibited 85% accuracy on the prediction set. We also investigated the use of the models for feature selection. With all 1052 bits, a Naive Bayes classifier exhibited 74% accuracy on the prediction set; using the random forest to select a set of important bits allowed us to achieve 73% accuracy using only 171 bits. We also analyzed the most important fragments and evaluated their frequency of occurrence. These fragments correspond well with previously known toxic fragments and could be used as a measure of domain applicability for the ensemble when applied to the MLSCN screening data.
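A sketch of the two-stage idea above - random-forest importances used to select fingerprint bits, then a Naive Bayes classifier trained on the reduced set - on synthetic 0/1 "fingerprints". The bit counts (1052 and 171) and accuracies reported in the abstract are not reproduced here.

```python
# Random-forest feature selection followed by Naive Bayes, on
# synthetic binary fingerprints in which only 5 bits carry signal.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.naive_bayes import BernoulliNB
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(2)
n_bits = 100
X = rng.integers(0, 2, size=(500, n_bits))
y = (X[:, :5].sum(axis=1) >= 3).astype(int)   # "toxic" iff 3+ of the first 5 bits set

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
rf = RandomForestClassifier(n_estimators=200, random_state=0).fit(X_tr, y_tr)

# Keep the 10 bits the forest ranks as most important.
top = np.argsort(rf.feature_importances_)[::-1][:10]
acc_full = BernoulliNB().fit(X_tr, y_tr).score(X_te, y_te)
acc_top = BernoulliNB().fit(X_tr[:, top], y_tr).score(X_te[:, top], y_te)
```

As in the abstract, the point is that a heavily reduced bit set can retain most of the Naive Bayes accuracy, and the selected bits themselves are interpretable.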
David T. Stanton, email@example.com, Modeling & Simulations - CADMol Group, Procter & Gamble, Miami Valley Laboratories, 11810 East Miami River Road, Cincinnati, OH 45252
It has been our general observation that topological descriptors play an important and often key role in many of the QSAR and QSPR models we have developed. This is found to be true even when a broad selection of descriptor types is evaluated. These descriptors not only provide the means to generate a good fit to the observed data used to train the models, but also provide the information needed to generate a clear physical interpretation of the underlying structure-activity or structure-property relationships. Examples will be presented using two properties with well-understood mechanisms: skin penetration and critical micelle concentration.
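To make "topological descriptor" concrete, a minimal pure-Python computation of one classic example, the Wiener index: the sum of shortest-path distances over all atom pairs of the hydrogen-suppressed molecular graph. The adjacency matrices here are written by hand for two C4 isomers; real work would derive the graph from structure with a cheminformatics toolkit.

```python
# Wiener index: sum of shortest-path (bond-count) distances over all
# atom pairs, computed with a small Floyd-Warshall.
def wiener_index(adjacency):
    n = len(adjacency)
    INF = float("inf")
    d = [[0 if i == j else (1 if adjacency[i][j] else INF) for j in range(n)]
         for i in range(n)]
    for k in range(n):                 # all-pairs shortest paths
        for i in range(n):
            for j in range(n):
                if d[i][k] + d[k][j] < d[i][j]:
                    d[i][j] = d[i][k] + d[k][j]
    return sum(d[i][j] for i in range(n) for j in range(i + 1, n))

# n-butane, C1-C2-C3-C4 (a path graph):
butane = [[0, 1, 0, 0],
          [1, 0, 1, 0],
          [0, 1, 0, 1],
          [0, 0, 1, 0]]
print(wiener_index(butane))   # -> 10
```

The branched isomer isobutane scores 9, illustrating why such indices track shape-dependent properties: branching shortens average through-bond distances.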
Subhash C. Basak, firstname.lastname@example.org, Denise Mills, email@example.com, Brian D. Gute, firstname.lastname@example.org, and Douglas M. Hawkins, email@example.com. (1) Center for Water and the Environment, Natural Resources Research Institute, University of Minnesota, 5013 Miller Trunk Hwy, Duluth, MN 55811, (2) School of Statistics, University of Minnesota, 313 Ford Hall, 224 Church Street SE, Minneapolis, MN 55455
Allergic contact dermatitis (ACD) is believed to be the most prevalent form of immunotoxicity found in humans, the adverse effect arising out of the interaction of immunoregulatory cytokines and discrete subpopulations of T lymphocytes. As such, ACD is a major impediment to the development of new cosmetics, personal hygiene products and topical medications. Calculated molecular descriptors, based solely on chemical structure, may be used to develop models for the prediction of ACD. Such models can be used to evaluate new chemicals, synthesized or hypothetical. In the current study, various statistical modeling approaches, including ridge linear discriminant analysis and tailored similarity, have been used to classify chemicals with respect to ACD. Results obtained from the different modeling methods will be discussed.
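A hedged sketch of one modelling approach named above, linear discriminant analysis with ridge-style shrinkage of the covariance estimate, which is useful when descriptors outnumber compounds. The descriptors and sensitizer/non-sensitizer labels below are synthetic, not ACD data.

```python
# Ridge (shrinkage) LDA sketch on synthetic "few compounds, many
# descriptors" data; labels stand in for ACD sensitization outcomes.
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(3)
n, p = 120, 40                     # compounds x calculated descriptors
X = rng.normal(size=(n, p))
y = (X[:, 0] - X[:, 1] > 0).astype(int)   # sensitizer / non-sensitizer

# shrinkage='auto' applies Ledoit-Wolf regularisation to the
# covariance matrix - the "ridge" in ridge LDA.
lda = LinearDiscriminantAnalysis(solver="lsqr", shrinkage="auto")
acc = cross_val_score(lda, X, y, cv=5).mean()
```

Without shrinkage, the pooled covariance estimate is unstable (or singular) in this regime; the regularised solver keeps the classifier usable on exactly the kind of descriptor-rich data sets the abstract describes.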
DG. Sprous, firstname.lastname@example.org and FR. Salemme, email@example.com. Chemistry, Redpoint Bio, 1 Graphics Drive, Ewing, NJ 08628
The molecular properties of Generally Recognized As Safe (GRAS) compounds (approved for food/flavoring use) are compared to those of marketed drugs and used to develop a QSAR model. It was observed that log(P) and log(S), which provide computed estimates of compound solubility in organic and aqueous solvents respectively, have significant overlap in the two populations. On the whole, GRAS compounds are more flexible, smaller, and composed of a more restricted set of elements than marketed drugs. A multivariable binary Quantitative Structure-Activity Relationship (QSAR) model can correctly identify 94% of the GRAS compounds and 92% of the pharmaceutical compounds. The performance of the model was such that training sets comprising as little as 10% of the whole dataset could predict the 90% reserved as a test set with an accuracy >90%. To summarize, the majority of the historical GRAS compounds are not “druglike” and are easily distinguished from pharmaceuticals.
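A sketch of the "train on 10%, predict the reserved 90%" protocol on invented data: two synthetic property axes loosely mimic the size and flexibility differences noted above. The populations, classifier, and threshold are illustrative assumptions, not the paper's model.

```python
# Two easily separated synthetic populations, classified after
# training on only 10% of the data (stratified split).
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(4)
# Columns: a size-like property and a flexibility-like property.
gras = rng.normal(loc=[200.0, 6.0], scale=[40.0, 1.5], size=(300, 2))   # label 0
drug = rng.normal(loc=[400.0, 3.0], scale=[60.0, 1.5], size=(300, 2))   # label 1
X = np.vstack([gras, drug])
y = np.array([0] * 300 + [1] * 300)

X_tr, X_te, y_tr, y_te = train_test_split(
    X, y, train_size=0.1, stratify=y, random_state=0)   # 10% train, 90% test
acc = LogisticRegression(max_iter=5000).fit(X_tr, y_tr).score(X_te, y_te)
```

When two populations are this well separated, a tiny training fraction suffices, which is the qualitative point behind the abstract's 10%/90% result.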
David J. Diller, firstname.lastname@example.org, Molecular Modeling, Pharmacopeia, Box 5350, Princeton, NJ 08543-5350
The 1D molecular representation (Dixon & Merz, JMC 2001) is a misnomer in that it contains more structural information than most commonly used 2D descriptors. Furthermore, it does not suffer from the conformational ambiguities of traditional 3D methods. Thus the 1D representation holds great promise to fill the void where 2D descriptors are not sufficiently rich and 3D methods are not practical, such as with large or noisy data sets. We describe techniques to generate 1D multiple alignments of molecules with similar biological activity. These alignments effectively isolate biologically critical regions, much as a multiple sequence alignment isolates conserved motifs within a protein family. Particular emphasis in this presentation will be on our efforts to develop a QSAR model to predict hERG inhibition. Finally, we will discuss the implications of this work for 3D methods.
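The multiple-alignment machinery sketched above rests on pairwise global alignment. Below is a generic Needleman-Wunsch score computation over symbolic sequences as a stand-in; the actual Dixon & Merz 1D representation and its scoring scheme are not reproduced here, and the match/mismatch/gap values are arbitrary.

```python
# Generic Needleman-Wunsch global alignment score via dynamic
# programming; sequences and scoring parameters are illustrative.
def global_align(a, b, match=2, mismatch=-1, gap=-2):
    n, m = len(a), len(b)
    # score[i][j]: best score aligning a[:i] with b[:j]
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap          # a aligned against leading gaps
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            s = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + s,   # (mis)match
                              score[i - 1][j] + gap,      # gap in b
                              score[i][j - 1] + gap)      # gap in a
    return score[n][m]
```

A multiple alignment is typically built by chaining such pairwise alignments (e.g. progressively along a guide tree), which is how conserved, "biologically critical" regions become columns of the alignment.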
Marvin Charton, email@example.com, Pratt Institute, 200 Willougby Avenue, Brooklyn, NY 11205
Modeling fluorophilicity: A hybrid method
There has been considerable interest in recent years in modeling fluorophilicity, ln PF, defined as the natural logarithm of the partition coefficient of a compound between perfluoromethylcyclohexane and toluene. Approaches used to model this property include the use of surface area parameters, the Abraham modification of the Taft-Kamlet equation for solvent effects, and a completely empirical model based on the choice of a mix of available parameters that best fit the data. Here we assume that fluorophilicity depends on the difference between the intermolecular forces acting between a compound and solvent 1 and those acting between the compound and solvent 2. Our model represents the intermolecular forces of interest by a count of the number of groups X of each type in the molecule and by the sum of the polarizabilities of each group Y that is not represented by a group-count term. Correlations of ln PF with this equation have been carried out. The regression equation obtained on exclusion of 8 outliers is:

ln PF = 0.309 (±0.00965) n(CF2) + 0.637 (±0.0369) n(CF3) − 0.201 (±0.0140) n(CH2) − 1.73 (±0.0969) (n Ph / Pn) − 1.03 (±0.224) n(OH) − 6.94 (±0.857) α − 0.734 (±0.163)

Statistics: 100R2, 95.83; A100R2, 95.64; F, 409.9; Sest, 0.555; S0, 0.211; Ndp, 114.
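The fitted equation above is a group-count regression. The sketch below reproduces only the fitting step, on synthetic data (group counts, polarizabilities, noise level, and coefficients are invented, with coefficients chosen to echo the paper's signs), assuming ordinary least squares as the fitting method.

```python
# Group-contribution regression of ln PF on group counts plus a
# summed polarizability term; all data here is synthetic.
import numpy as np

rng = np.random.default_rng(5)
n = 60
# Columns: n(CF2), n(CF3), n(CH2), alpha (summed polarizability), intercept
counts = rng.integers(0, 6, size=(n, 3)).astype(float)
alpha = rng.uniform(0.0, 2.0, size=(n, 1))
X = np.hstack([counts, alpha, np.ones((n, 1))])

true_beta = np.array([0.31, 0.64, -0.20, -6.9, -0.73])   # invented, paper-like signs
lnPF = X @ true_beta + rng.normal(scale=0.05, size=n)    # simulated measurements

beta, *_ = np.linalg.lstsq(X, lnPF, rcond=None)          # fitted coefficients
```

The coefficient on each group count is that group's per-occurrence contribution to ln PF, which is what makes this style of model directly interpretable.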
Artem Cherkasov, firstname.lastname@example.org, Division of Infectious Diseases, University of British Columbia, 2733 Heather Str, Vancouver, BC V5Z 3J5, Canada
In a series of recent studies we reported the development of 'inductive' QSAR descriptors, which are related to atomic electronegativity, covalent radii and intramolecular distances, and which are all derived from the original equations for steric and inductive substituent constants that we published 10 years ago. Since that time a variety of related QSAR parameters, including 'inductive' electronegativity, 'inductive' partial charge and 'inductive' analogues of chemical hardness and softness, have been introduced. To date, 50 global and local 'inductive' descriptors have been developed; a possible interpretation of their physical meaning has been suggested by considering a neutral molecule as an electrical capacitor formed by charged atomic spheres. Subsequent studies demonstrated the successful application of 'inductive' QSAR parameters in quantifying the antibacterial activity of organic compounds and peptides, in QSAR modelling of metabolic and drug-like substances, in comparative docking studies, and in the discovery of novel drug leads.
Robert J. Massie, email@example.com, Director, Chemical Abstracts Service, American Chemical Society, 2540 Olentangy River Road, Columbus, OH 43202-1505
CAS began in 1907 with Chemical Abstracts, produced as a largely volunteer effort for many years. In more recent times, CAS has continued to benefit from a professional staff imbued with the same commitment and dedication exhibited by those hard-working volunteers. CAS' mission to make the world's disclosed chemistry-related information accessible to scientists has entailed many challenges over the years. Two world wars, the post-war information explosion, recurring financial crises, and competitive threats are among the obstacles CAS has weathered. Innovations in indexing, computer-assisted publishing, the creation of the CAS Chemical Registry, creative product development and international cooperation have been crucial elements in CAS' survival and success. Twenty-first-century developments, such as new advertising-based business models on the Web and governmental participation in the information industry, are no less daunting than the challenges CAS has faced before. A personal view of CAS' history and organizational personality will be presented along with thoughts on its course for the future.
Catharina Maulbecker, firstname.lastname@example.org, Sales and Marketing, Chemical Abstracts Service, 2540 Olentangy River Road, Columbus, OH 43202
To mark the 100th anniversary of CAS, we will be exploring several research topics covered in the first issue of CA to see what scientists were investigating a century ago and how today's scientists might gain new insights from their discoveries. By exploring the old literature, one gains an appreciation of how the exacting, fundamental work performed by chemists in the early part of the 20th century established truths that inform today's research. The older abstracts contain detailed accounts of the experimental processes and their findings. Information, speculation, and interpretation gleaned from these details can provide new facts and perspectives for today's scientists. The more knowledge researchers possess, the better equipped they are to make new connections and to generate new ideas. Reexamining past discoveries in light of today's knowledge sparks innovative breakthroughs by accelerating the process of serendipity. This survey will provide several examples to illustrate the enduring relevance of the early literature of Chemical Abstracts and the electronic resources to which it gave rise.
Hideaki Chihara, email@example.com, Japan Association of International Chemical Information, Nakai Bldg. 6-25-4, Honkomagome, Bunkyo-ku, Tokyo, Japan
CAS has served the world's scientists and engineers for almost four generations. When they begin a new research project, the first thing they do is look for what is already known about the subject, using CAS data in a variety of forms. The importance of CAS' products and services is well recognized and appreciated by R&D scientists, who know that no other information source can provide as exhaustive a search as CAS. The value of the CAS databases stems from their comprehensive coverage in terms of subject area and time frame, and from a well-defined index structure and depth of indexing. These elements of value have not changed in a changing world. CAS survived its most serious existential crisis in the late 1960s and early 1970s, when a number of other secondary information journals disappeared, and it was fortunate for the world's scientists that CAS was able to continue. The importance of CAS today may be expressed in a single sentence: no scientist, and especially no chemical scientist, could imagine working without CAS services.
Bonnie Lawlor, firstname.lastname@example.org, 276 Upper Gulph Rd, Radnor, 19087
The scientific community has been information-centric for centuries. Initially relying upon oral and handwritten methods of knowledge distribution, scientists found that the advent of the printing press introduced new distribution channels in the form of books, almanacs and newsletters. Soon scientists were awash in information. Then, in 1655, with the publication of the first scholarly journal focused on abstracts of original research, scientists began to rely upon what are known as abstracting and indexing (A&I) services to manage information overload and ensure the broadest possible distribution of published research. Over the past century, Chemical Abstracts Service (CAS) has filled this critical role, expediting the flow of scholarly information in the chemical and related sciences. Since its inception in 1907, it has evolved with changes in technology and user expectations, and has taken a leadership role in the dissemination of scientific information. The author will discuss CAS' role and take a brief look at the challenges and opportunities that it now faces as young, born-digital researchers gradually come to dominate the scientific community.
Damon Ridley, email@example.com, School of Chemistry, University of Sydney, Building F11, University of Sydney, New South Wales, 2006, Australia
At times scientists have very specific problems to solve: we may want a particular spectrum, a specific synthetic method, or a key document. At other times we may not know precisely what we want, and indeed we are open to suggestions, particularly quite novel ones. Historically we achieved this through browsing in the library, or by attending lectures outside our immediate fields at conferences. These options are still available to us.
Browsing in the electronic library is possible, although here scientists meet several basic problems, including the sheer amount of information in the electronic library and the inability to browse many kinds of items (e.g., structures, reaction diagrams, information in tables). SciFinder offers solutions in two principal ways. The first is its ability to Explore topics, substances, properties, and reactions, both separately and in iterative combinations. The second is its many post-processing functions; to achieve this, SciFinder depends critically on 100 years of intellectually indexed data.
This presentation will give examples of how the functionality in SciFinder allows scientists to browse the electronic literature in a unique and creative way, thereby opening new research opportunities.