1 - Semantic envelopment of cheminformatics resources with
SADI
Leonid L Chepelev, Egon Willighagen, Michel Dumontier.
Department of Biology, School of Computer Science, and Institute of
Biochemistry, Carleton University, Ottawa, Ontario, Canada; Department
of Pharmaceutical Sciences, Uppsala University, Uppsala, Sweden
The distribution of computational resources as web services and their
execution as workflows has enabled facile computation and data
integration for bio- and cheminformatics. The Semantic Automated
Discovery and Integration (SADI) framework addresses many shortcomings
of similar frameworks, such as SSWAP and BioMoby, while allowing for
more efficient semantic envelopment of computational chemistry services,
resource discovery, and automated workflow organization. In this work,
we apply the CHEMINF ontology and Chemical Entity Semantic Specification
and demonstrate the usability of the SADI framework in solving common
cheminformatics problems starting from RDF-based chemical entity
representations. Our eventual goal is to convert all of the functions
and functionalities of the Chemistry Development Kit (CDK) into distinct
SADI services. This would enable the formulation of all cheminformatics
problems currently addressed by CDK as SPARQL queries, returning
meaningful RDF output that can then be easily integrated with existing
RDF-based knowledgebases or used for further processing.
![]()
2 - RESTful RDF web services for predictive toxicology
Dr. Nina Jeliazkova PhD. Ideaconsult Ltd., Sofia,
Bulgaria
The Open Source Predictive Toxicology Framework (http://www.opentox.org),
developed by partners of the EC FP7 OpenTox project, aims to provide
unified access to toxicity data and predictive models, as well as
validation procedures. This is achieved by i) an information model
based on a common OWL-DL ontology (http://www.opentox.org/api/1.1/opentox.owl);
ii) flexibility by linking with related ontologies; iii) availability of
data and algorithms via a standardized REST web services interface,
where every compound, data set or predictive method has a unique web
address, used to retrieve its RDF representation or initiate the
calculations. The OpenTox framework allows building user-friendly
applications for toxicological experts or model developers, or direct
access by an application programming interface for development,
integration and validation of new algorithms. The work presented
describes the experience of building RESTful web services, based on RDF
representation of resources, to incorporate diverse IT solutions into a
distributed and interoperable system.
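As a rough illustration of the one-address-per-resource idea described above, the following Python sketch builds (but does not send) an HTTP GET request that asks for the RDF/XML representation of a resource through content negotiation. The compound URI and the helper name `build_rdf_request` are hypothetical and are not taken from the OpenTox API. Only the Python standard library is used.

```python
from urllib.request import Request

# Media type commonly used for RDF/XML representations.
RDF_XML = "application/rdf+xml"


def build_rdf_request(resource_uri: str) -> Request:
    """Prepare a GET request asking the server for an RDF/XML view.

    The request is only constructed here; passing it to
    urllib.request.urlopen would actually contact the server.
    """
    return Request(resource_uri, headers={"Accept": RDF_XML}, method="GET")


# Hypothetical resource address, used purely for illustration.
request = build_rdf_request("http://example.org/compound/1")
print(request.get_method(), request.full_url, request.get_header("Accept"))
```

In a design of this kind the same address could presumably serve other representations as well, selected by changing the `Accept` header.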
![]()
3 - Linking the resource description framework to
cheminformatics and proteochemometrics
Dr. Egon L. Willighagen, Prof. Jarl E.S. Wikberg.
Department of Pharmaceutical Biosciences, Uppsala University, Uppsala,
Sweden
Background
Semantic web technologies are finding their way into the life sciences.
Ontologies and semantic markup have already been used for more than a
decade in molecular sciences, but have not found widespread use yet. The
semantic web technology Resource Description Framework (RDF) and related
methods appear to be sufficiently versatile to change that situation.
Results
The work presented here focuses on linking RDF approaches to existing
molecular chemometrics fields, including cheminformatics, QSAR modeling
and proteochemometrics. Applications are presented that link RDF
technologies to methods from statistics and cheminformatics, including
data aggregation, visualization, chemical identification, and property
prediction. They demonstrate how this can be done using various existing
RDF standards and cheminformatics libraries. For example, we show how IC50
and Ki values are modeled for a number of biological targets
using data from the ChEMBL database.
Conclusions
We have shown that existing RDF standards can suitably be integrated
into existing molecular chemometrics methods. Platforms that unite these
technologies, like Bioclipse, make this even simpler and more
transparent. Being able to create and share workflows that integrate
data aggregation and analysis (visual and statistical)
is beneficial to interoperability and reproducibility. The current work
shows that RDF approaches are sufficiently powerful to support molecular
chemometrics workflows.
![]()
4 - Chemical e-Science Information Cloud (ChemCloud): A
semantic web based eScience
infrastructure
Prof. Dr. Adrian Paschke PhD, Stephan Heineke. FIZ
Chemie, Berlin, Germany; Department of Mathematics and Computer Science,
FU Berlin, Berlin, Germany
Our Chemical e-Science Information Cloud (ChemCloud) - a Semantic
Web based eScience infrastructure - integrates and automates a multitude
of databases, tools and services in the domain of chemistry, pharmacy
and biochemistry available at the Fachinformationszentrum Chemie (FIZ
Chemie), at the Freie Universitaet Berlin (FUB), and on the public Web.
Based on the approach of the W3C Linked Open Data initiative and the W3C
Semantic Web technologies for ontologies and rules it semantically links
and integrates knowledge from our W3C HCLS knowledge base hosted at the
FUB, our multi-domain knowledge base DBpedia (Deutschland) implemented
at FUB, which is extracted from Wikipedia (De) providing a public
semantic resource for chemistry, and our well-established databases at
FIZ Chemie such as ChemInform for organic reaction data; InfoTherm, the
leading source for thermophysical data; Chemisches Zentralblatt, the
complete chemistry knowledge from 1830 to 1969; and ChemgaPedia, the
largest and most frequented e-learning platform for chemistry and
related sciences in the German language.

![]()
5 - Use of semantic web services to access small molecule
ligand database
Anay P Tamhankar, Aniket S Ausekar. Software
Solutions Group, Evolvus, Pune, Maharashtra, India
Resource Description Framework (RDF) and a set of associated
technologies such as OWL and SPARQL, which form the W3C's semantic web
technology stack, are renewing interest in semantic chemistry. Semantic
web services not only specify syntactic interoperability but also
specify and enforce the semantic constraints of messages being
transmitted and objects being accessed.
The Liceptor database is a small molecule ligand database consisting of
approximately 4 million compounds. The database schema consists of
fields such as molecular properties (2D structure, molecular weight,
molecular formula, etc.), molecular descriptors (H-donors, H-acceptors,
logP, logD, number of rotatable bonds, etc.) and pharmacological
properties (bioassays, receptors, enzymes, parameters, animal models,
therapeutic indications, etc.). Pharmaceutical and biotechnology
companies use this database to mine chemical space for internal
research, to prioritize QSAR and pharmacophore studies, for synthetic
chemistry endeavors and for advancing hit-to-lead patterns.
The database records are available in multiple formats (relational
database, XML, RDfile, etc.) and online through an interactive web
application (HTML format).
The soon-to-be-released version of the database includes access using
semantic web services. The ontology is expressed in OWL, and RDF defines
the overall framework. Typical consumers of the data using this access
mechanism are expected to be third-party tool vendors and data
aggregators.
Use of semantic web services allows the schema to evolve over time
without explicitly communicating each change or requiring all data
consumers to be changed.
![]()
6 - Usage metrics: Tools for evaluating science monograph
collections
Asst Univ Librarian Michelle M Foss, Dr. Vernon
Kisling, Ms. Stephanie Haas. Department of Marston Science Library,
University of Florida, Gainesville, FL, United States
As academic libraries are increasingly supported by a matrix of
database functions, the use of data mining and visualization techniques
offers significant potential for future collection development based on
quantifiable data. While data collection techniques are not standardized
and results may be skewed by granularity problems or faulty
algorithms, useful baseline data can be extracted and broad trends
identified. The purpose of the study is to provide an initial assessment
of data associated with the science monograph collection at the Marston
Science Library (MSL), University of Florida. The sciences fall within
the major Library of Congress Classification schedules of Q, S, and T,
excluding TN, TR, TT, and R. The overall strategy of this project is to
analyze audience-based circulation patterns, e-book usage, purchases,
and interlibrary loan statistics from the academic year July 1, 2008 to
June 30, 2009. Such analyses provide an evidence-based framework for
future collection decisions.
![]()
7 - Happily ever after or not: E-book collection usage
analysis and assessment at USC
Library
Norah Xiao. University of Southern California, United
States
With more and more e-book collections being launched by publishers,
the USC Science and Engineering Library began acquiring e-book
collections in late 2008, and one of the first and biggest acquired
collections is Springer e-books. Now, after two years, are users
satisfied with this e-book collection? Are they accessing and using it?
As with any other e-collection, how well have we, librarians and staff,
been coping with this collection in collection development (e.g. e-book
packages from other publishers), access services (e.g. interlibrary
loan, off-campus access, e-book technical issues), outreach (e.g. e-book
market strategies), and information literacy?
This presentation will give an overview of our assessment of this e-book
collection after two years. What have we learned from the usage data?
And by analyzing the data, how did we, and how can we, improve our
services to users? It is hoped that our experience can present a
proactive implementation plan for others considering comprehensive
digital migration of their content, with the goal of not only better
coping with the current economic environment, but of spurring
development, innovation, and efficiency in the long run.
![]()
8 - From Chemical Abstracts to SciFinder:
Transitioning to SciFinder and assessing customer usage
Susan Makar, Stacy Bruss. National Institute
of Standards and Technology, United States
The Research Library of the National Institute of Standards and
Technology (NIST) monitors SciFinder usage to ensure customers
have ready access to the database and to determine who uses it. Usage
statistics played a critical role in determining whether to increase the
number of seats and which heavy users should help pay for those
additional seats. While most NIST researchers were very excited to
acquire access to this product, many, who were well acquainted with
using the print version of Chemical Abstracts, needed to learn the
best techniques for searching and browsing the chemistry literature
using SciFinder. Transitioning from the printed Chemical
Abstracts to SciFinder posed significant challenges to one
research project. This presentation will describe how the NIST Research
Library used SciFinder usage statistics to make collection
development decisions and how library staff worked with NIST researchers
to successfully transition from the printed Chemical Abstracts to
SciFinder.
![]()
9 - Using Web of Knowledge to identify publishing and
citation patterns of campus researchers at the University of Arkansas
Lutishoor Salisbury, Jeremy S. Smith. University of
Arkansas, United States
This presentation will provide information on a project undertaken at
the University of Arkansas in Fayetteville to study publications by
campus researchers, with an emphasis on the STEM disciplines
(agricultural sciences, physical sciences, biological sciences,
engineering, mathematics, etc.) at the macro level for a three-year
period. The overall objectives of the study were (1) to provide an
overview of the productivity of faculty and researchers in the various
departments, which could be used in allocating resources for collection
development, and (2) to provide evidence-based data on periodical use to
assist with collection decisions and to identify collection strengths at
the university level. We used the Web of Knowledge database (Science
Citation Index, Social Science Citation Index and Arts and Humanities
Citation Index) to identify the periodical literature in which our
researchers published and the periodicals that they cite in their
publications, and performed several analyses, including determining the
extent to which our researchers are publishing in and citing periodicals
from the Elsevier, Wiley and IEEE journal packages. A methodology for
extracting citations from Web of Knowledge into an Excel spreadsheet
will also be presented. The strengths and weaknesses of the Web of
Knowledge for this study will also be highlighted.
![]()
10 - Don't forget the qualitative: Including focus groups in the
collection assessment process
Susan Shepherd, Teri M. Vogel. University of California San
Diego, United States
To complement our ongoing quantitative collection evaluations
based on cost and usage data, the UC San Diego Science & Engineering
Library conducted a series of focus groups with graduate students and faculty
in our core departments. Our objective was to learn more about how they
use the collection for research and teaching, so that we could make more
informed decisions about collection management, as well as how best to
deploy our staff resources for increased promotion, outreach and instruction.
Participants were asked about the resources they use, how they use
them, and what gaps they perceived. We also probed their familiarity
with the top licensed resources in their fields.
In this presentation we will discuss our focus group
methods, results and the next steps we have taken in this assessment, including
a follow-up survey to the same departments to obtain more quantitative
information about usage of the collection.
![]()
11 - Strategies for the identification and generation of
informative compound sets
Michael S Lajiness. Computer Aided Drug Discovery,
Eli Lilly & Company, Indianapolis, IN, United States
Mounting pressures in drug discovery research dictate more efficient
methods of picking the winners: molecules that actually have a chance to
be the drugs of the future. Clearly, these methods need to navigate a
highly multi-dimensional landscape. It is also clear that hard filters
should never be used and that a more continuous treatment or
prioritization has clear advantages. Further, structural diversity needs
to be considered in order for the best structural ideas to be found most
efficiently. In addition, history and external sources of information
also must be examined. This presentation will describe some of the
methods, techniques, and strategies that have been employed by the
author over the past 25 years working in cheminformatics that attempt to
identify compounds that are likely to provide the most useful
information so that one might discover solid leads more rapidly.
![]()
12 - Public-domain data resources at the European
Bioinformatics Institute and their use in drug discovery
Christoph Steinbeck. European Bioinformatics
Institute, EMBL Outstation - Hinxton, Hinxton, Cambridge, United Kingdom
Small molecules are of increasing interest for bioinformatics in areas
such as metabolomics and drug discovery. The recent release of large
open chemistry databases into the public domain calls for flexible, open
toolkits to process them. These databases and tools will, for the first
time, create opportunities for academia and third-world countries to
perform state-of-the-art open drug discovery and translational research
- endeavors so far a domain of the pharmaceutical industry. This talk
will describe a couple of relevant data resources at the European
Bioinformatics Institute and will also outline our research on and
development of toolkits such as the Chemistry Development Kit and CDK-Taverna
to support the exploitation of these data sources.
![]()
13 - Decision making in the face of complicated drug
discovery data using the Novartis system for virtual medicinal chemistry
(FOCUS)
Donovan Chin. Global Discovery Chemistry, Novartis
Institutes for BioMedical Research, Cambridge, MA, United States
This talk will describe some of the broad concepts that led to the
development of the Novartis software system for data analysis & virtual
medicinal chemistry (FOCUS). The system, which is routinely used
globally, is designed to present the scientist with an accessible
interface that permits iterative hypothesis testing of many possible
chemical candidates while accounting for undesirable ADMET properties.
Some of the key principles are to present the data in a way that
reflects stored knowledge and facilitates the decision about what
compound to make next. We will highlight some of these concepts in
applications spanning the range from target identification to drug
optimization.
![]()
14 - Integrating chemical and biological data: Insights from
10 years of VERDI
Susan Roberts, W. Patrick Walters, Ryan McLoughlin,
Philppe Gabriel, Jonathan Willis, Trevor Kramer. Vertex Pharmaceuticals,
Cambridge, MA, United States
VERDI is a software system, originally developed in 2000 at Vertex
Pharmaceuticals, for integrating chemical and biological data and
delivering this information to drug discovery teams. In addition to
traditional table views, VERDI incorporated a number of modules designed
to enable scientists to understand relationships between chemical
structure and biological data. Over the last 10 years, VERDI has been
the primary data access tool for hundreds of scientists at multiple
sites around the world. A retrospective evaluation of VERDI has provided
us with a number of 'lessons-learned', which come from a multitude of
revisions, improvements and new feature additions. Some of these
lessons, which are being used as the basis for development of the next
generation of data analysis and visualization tools at Vertex, will be
presented and discussed in detail.
![]()
15 - Collaborative database and computational models for tuberculosis
drug discovery decision making
Dr. Sean Ekins PhD, Dr Justin Bradford PhD, Krishna Dole,
Anna Spektor, Kellan Gregory, David Blondeau, Dr Moses Hohman PhD, Dr Barry A
Bunin. Collaborative Drug Discovery, Burlingame, CA, United States;
Collaborations in Chemistry, Jenkintown, PA, United States; Department of
Pharmaceutical Sciences, University of Maryland, Baltimore, MD, United States;
Department of Pharmacology, Robert Wood Johnson Medical School, University of
Medicine & Dentistry of New Jersey, Piscataway, NJ, United States
Drug discovery is being reshaped by large-scale collaborations that
connect individual researchers using collaborative computational approaches and
crowdsourcing. Future drug discovery decisions will ultimately still be made
based on massive multidimensional datasets. As an example, the search for
molecules with activity against Mycobacterium tuberculosis (Mtb) is
employing many approaches in collaborating national and international
laboratories. We have developed a database (CDD TB) to capture public and
private Mtb data while enabling data mining and collaborations with other
researchers. We have also used the public data along with several computational
approaches including Bayesian classification models for 220,463 molecules and
tested them with external molecules, enabling the discrimination of active or
inactive substructures from other datasets in CDD TB. The combination of the
database, dataset analysis, and computational models provides new insights into
molecular properties and features that are determinants of whole cell activity,
allowing prioritization and decision making around molecules.
![]()
16 - Data drive life sciences: The Pyramids meet the Tower of Babel
![]()
17 - Design principles for diversity-oriented synthesis: Facilitating
downstream discovery with upfront design
Lisa Marcaurelle. Chemical Biology Platform, Broad Institute,
Cambridge, MA, United States
To expand the diversity of our screening collection to
access a broad range of biological targets, we aspire to produce libraries of
small molecules that combine the structural complexity of natural products and
the efficiency of high-throughput processes. Moreover, we aim to synthesize the
complete matrix of stereoisomers for all library members. We reason that this
unique collection will enable the rapid development of
stereo-structure/activity relationships (SSAR) upon biological testing,
providing valuable information for the prioritization and optimization of hit
compounds. Although our library products may be distinct compared to
traditional compound collections, we are faced with fundamental questions
relevant to library design: How do you prioritize scaffolds for synthesis? How
do you select products with desirable physicochemical properties? In designing
DOS libraries we employ a number of cheminformatic methods to tackle such
issues and select compounds for synthesis/screening. An overview of our design
criteria and decision-making process will be presented.
![]()
18 - Overview: Data-intensive drug design
John H Van Drie. R&D, Van Drie Research, Andover, MA,
United States
How do we best make med chem decisions in the face of a lot of data?
This is an issue that confronts us at many stages of the drug discovery
process: screening, hit-to-lead, early lead optimization, and late-stage
lead optimization. In this session, speakers representing each of these
stages will describe how they have successfully tackled these issues,
emphasizing general principles over specific computational tools.
Our brains can conveniently handle only about 7 things at a time, and
most traditional med chem decision-making processes reflect that.
Already when the number of molecules being considered is in the range of
dozens, things get tricky; when that number is in the thousands to
hundreds of thousands, one must re-orient one's perspective.
![]()
19 - Data-driven development: How ACS Publications uses data to
enhance products and services, and respond to customer needs
Melissa Blaney, Sara Rouhi. ACS Publications, United States
As the scholarly publishing landscape continues to rapidly transform
in unprecedented ways, publishers and libraries have had to quickly pivot to
accommodate the changing preferences that users have for accessing, collecting,
and consuming digital information. ACS Publications has used a data-driven
approach to handle these changing customer and end-user needs. Everything from
our ACS Mobile iPhone application to our transition from print to online Web
products has been shaped by this approach. This presentation will address the
role of data in developing new products, enhancing our web presence, and responding
to user behavior on the ACS Web Editions Platform.
![]()
20 - Objective collections evaluation using statistics at the
MIT Libraries
Mathew Willmott, Erja Kajosalo. Engineering & Science
Libraries, Massachusetts Institute of Technology, United States
Recent budget pressures have forced many libraries to reevaluate their
collections and substantially cut back on their subscription spending.
The task of evaluating a large collection of subscription-based
materials, however, is a difficult one. Journals from different subject
areas are used differently, and journals from different publishers have
their usage measured differently. Evaluating each individual journal
subscription separately would be a monumental task bordering on
infeasibility. This paper will discuss the approach taken by the MIT
Engineering and Science Libraries in the spring of 2009 and 2010 to
evaluate their journal collections, specifically for Springer, Elsevier,
and Wiley-Blackwell, the three journal publishers with which these
libraries hold the most subscriptions. Discussion will include the
gathering and analysis of usage data, publication data, and citation
data, as well as the process by which these data were combined to create
an objective ranking for each journal. These objective rankings were not
final decisions; librarians with subject expertise then evaluated the
lower-ranked journals to determine if they were appropriate choices for
cancellation, often taking into consideration many additional factors.
However, these objective evaluations helped librarians to more
efficiently use their time by indicating which journals may be strong
candidates for cancellation, and they helped department liaisons to
defend final cancellation choices to a very data-driven faculty. The end
result was a more efficient cancellation process as well as a more
comprehensive understanding of the library's journal collections.
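The abstract does not say how the usage, publication, and citation data were combined. As one possible sketch, the Python example below rescales each metric to the range 0 to 1 across the journals and averages the three with equal weight, then lists the journals from lowest to highest score. The min-max scaling, the equal weights, the function name `rank_journals`, and the sample figures are all assumptions made for illustration, not the MIT Libraries' actual procedure.

```python
METRICS = ("usage", "publications", "citations")


def min_max(values):
    """Rescale a list of numbers to the range 0..1 (all zeros if constant)."""
    low, high = min(values), max(values)
    if high == low:
        return [0.0 for _ in values]
    return [(v - low) / (high - low) for v in values]


def rank_journals(journals):
    """Return (title, score) pairs sorted from lowest to highest score.

    Each journal is a dict holding a title and one number per metric.
    The score is the equal-weight mean of the rescaled metrics.
    """
    scaled = {m: min_max([j[m] for j in journals]) for m in METRICS}
    scored = [
        (j["title"], sum(scaled[m][i] for m in METRICS) / len(METRICS))
        for i, j in enumerate(journals)
    ]
    return sorted(scored, key=lambda pair: pair[1])


# Invented sample data, for illustration only.
sample = [
    {"title": "Journal A", "usage": 1000, "publications": 10, "citations": 50},
    {"title": "Journal B", "usage": 100, "publications": 1, "citations": 5},
    {"title": "Journal C", "usage": 550, "publications": 5, "citations": 20},
]
ranking = rank_journals(sample)
for title, score in ranking:
    print(f"{title}: {score:.3f}")
```

Journals at the top of such a list would only be candidates for review. As the abstract notes, subject librarians made the final decisions.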
![]()
21 - Getting the biggest bang for your buck: Methods and strategies
for managing journal collections
Grace Baysinger. Stanford University, United States
Chemistry journals have the highest average cost per title of all subject areas.
Library collection budgets have not kept pace with price increases and funds to
acquire new titles are scarce. Signing big deals for journals has limited
flexibility in adapting to changes. These factors have made acquiring journals
to support programmatic needs more of a challenge than ever before. This
presentation will cover methods, strategies, and tools that can be used to help
assess how resources are allocated when developing and managing journal
collections.
![]()
22 - Taking a collection down to its elements: Using various
assessment techniques to revitalize a library
Leah Solla. Cornell University, 283 Clark Hall, Ithaca, NY,
United States
What are the elements of a research literature collection in the physical
sciences? How are they being used and what roles are they playing in research
and teaching and learning? Who is using them: students, faculty, related
disciplines? These are the questions that drove the extensive analyses conducted
on the print and electronic literature collections in the Physical Sciences
Library at Cornell University in preparation for transitioning the service model
from a print-based facility to electronic collections and services. General
trends indicated the usage of the collection had been well over 90% electronic
for years and the acquisition of books and journals in print had been reduced to
minimal levels under budget pressures. But there were significant gaps in the
electronic holdings and there remained a small but very active core of the print
collection; both warranted further study to enable us to provide the best
possible access to these crucial materials in the new service model. The library
management system was mined for a variety of data points and complemented with
external data sources and user input to build the transition map for the
physical sciences literature collections.
![]()
23 - Predicting specific inhibition of cyclophilins A and B
using docking, growing, and free energy perturbation calculations
Somisetti V Sambasivarao, Orlando Acevedo. Department
of Chemistry and Biochemistry, Auburn University, Auburn, AL, United
States
Cyclophilins (Cyp) belong to the enzyme class of peptidyl-prolyl
isomerases which catalyze the cis-trans conversion of prolyl
bonds in peptides and proteins. Twenty human Cyp isoenzymes have been
reported and many are excellent targets for the inhibition of hepatitis
C virus replication and multiple inflammatory diseases and cancers.
Given the complete conservation of all active site residues between many
of the enzymes, i.e., CypA, CypB, CypC and CypD, a better understanding
of how to specifically inhibit individual targets could potentially
reduce reported side effects in current treatments. Docking and growing
programs have been used to construct protein-ligand complexes for a
variety of reported selective inhibitors, including acylurea and aryl
1-indanylketone derivatives. Free-energy perturbation/Monte Carlo
(FEP/MC) calculations have been utilized to quantitatively reproduce the
free energies of binding for the inhibitors in multiple Cyp active sites
in order to elucidate the origin of the specificity for the compounds.
![]()
24 - Using aggregative web services for drug discovery
Dr. Qian Zhu PhD, Dr. Michael S. Lajiness PhD, Dr.
David J. Wild PhD. School of Informatics and Computing, Indiana
University, Bloomington, IN, United States
Recent years have seen a huge increase in the amount of
publicly-available information pertinent to drug discovery, including
online databases of compound and bioassay information; scholarly
publications linking compounds with genes, targets and diseases; and
predictive models that can suggest new links between compounds, genes,
targets and diseases. However, there is a distinct lack of data mining
tools available to harness this information, and in particular to look
for information across multiple sources. At Indiana University we are
developing an aggregative web service framework to solve these kinds of
problems. It offers a new approach to data mining that crosses
information source types to look at the "big picture" and to identify
corroborating or conflicting information from models, assays, databases
and publications.
![]()
25 - Semantifying polymer science using ontologies
Dr. Edward O. Cannon PhD, Dr. Nico Adams, Prof. Peter
Murray-Rust. Department of Chemistry, Unilever Centre for Molecular
Science Informatics, University of Cambridge, Cambridge, Cambridgeshire,
United Kingdom
Ontologies are graph-based, formal representations of information in a
domain. Currently, there is a large interest in ontologies for biology
and medicine, though little effort has been concentrated in the field of
chemistry, let alone polymer science. We have developed a number of
ontologies for polymer science: properties, measurement techniques and
measurement conditions, using the Web Ontology Language. These
ontologies will help facilitate the standardization of data exchange
formats in polymer science by providing a common domain of knowledge.
The properties ontology contains over 150 properties and has been
integrated with the measurement techniques and conditions ontology, to
give information on how a property is measured and under what
conditions. The ontologies will be of use to polymer scientists wishing
to reach a consensus in this area of knowledge. The ontologies also have
the advantage that they can be integrated into software applications to
leverage this knowledge.
![]()
26 - Toxicity reference database (ToxRefDB) to develop
predictive toxicity models and prioritize compounds for future toxicity
testing
Hao Tang, Hao Zhu PhD, Liying Zhang, Alexander Sedykh
PhD, Ann Richard PhD, Ivan Rusyn MD, PhD, Prof. Alexander Tropsha PhD.
Division of Medicinal Chemistry and Natural Products, School of
Pharmacy, University of North Carolina at Chapel Hill, Chapel Hill, NC,
United States; Department of Biochemistry and Biophysics, School of
Medicine, University of North Carolina at Chapel Hill, Chapel Hill, NC,
United States; National Center for Computational Toxicology, Office of
Research & Development, U.S. Environmental Protection Agency, Chapel
Hill, NC, United States; Department of Environmental Sciences and
Engineering, School of Public Health, University of North Carolina at
Chapel Hill, Chapel Hill, NC, United States
EPA's ToxCast program aims to use in vitro assays to predict chemical
hazards and prioritize chemicals for toxicity testing. We employed the
predictive QSAR workflow to develop computational toxicity models for
ToxCast compounds with historical animal testing results available from
ToxRefDB. To ensure model stability and robustness, multiple classifiers
and 5-fold external cross-validation were applied. Results show that for
three of the 78 toxicity endpoints, including one chronic and two
reproductive endpoints, the Correct Classification Rate for external
validation datasets was above 0.6 for all types of QSAR models. Our
studies suggest that it is feasible to develop QSAR models for some
endpoints, which could be further augmented by in vitro assay measures.
The validated toxicity models were used for virtual screening of 50,000
chemicals compiled for the REACH program. The compounds predicted as
toxic could be regarded as candidates for future toxicity testing.
Abstract does not reflect EPA policy.
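The validation scheme described above can be illustrated with a small sketch. It assumes that the Correct Classification Rate is the mean of sensitivity and specificity for a binary toxic/non-toxic label, a common convention in the QSAR literature that the abstract itself does not spell out. The function names and the fold-assignment strategy are illustrative only and are not taken from the authors' workflow.

```python
import random


def correct_classification_rate(y_true, y_pred):
    """Assumed definition: CCR = 0.5 * (sensitivity + specificity).

    Labels are binary, with 1 meaning toxic and 0 meaning non-toxic.
    """
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    pos = sum(1 for t in y_true if t == 1)
    neg = len(y_true) - pos
    if pos == 0 or neg == 0:
        raise ValueError("both classes must be present")
    return 0.5 * (tp / pos + tn / neg)


def five_fold_indices(n, seed=0):
    """Yield (train, test) index lists for a 5-fold external split."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    # Every fifth shuffled index goes to the same fold.
    folds = [idx[i::5] for i in range(5)]
    for k in range(5):
        test = folds[k]
        train = [i for j, fold in enumerate(folds) if j != k for i in fold]
        yield train, test
```

Each compound is held out exactly once, so a model trained on the `train` indices of a fold is always scored on compounds it has never seen; an endpoint would pass the abstract's threshold if the CCR on the held-out folds stays above 0.6.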
![]()
27 - OrbDB: A database of molecular orbital interactions
Matthew A. Kayala, Chloe A. Azencott, Dr. Jonathan H.
Chen PhD, Prof. Pierre F. Baldi PhD. Department of Computer Science,
University of California - Irvine, Irvine, CA, United States
The ability to anticipate the course of a reaction is essential to the
practice of chemistry. This aptitude relies on the understanding of
elementary mechanistic steps, which can be described as the interaction
of filled and unfilled molecular orbitals. Here, we create a database of
mechanistic steps from previous work on a rule-based expert system (ReactionExplorer).
We derive 21,000 priority ordered favorable elementary steps for 7800
distinct reactants or intermediates. All other filled to unfilled
molecular orbital interactions yield 106 million unfavorable elementary
steps. To predict the course of reactions, one must
recover the relative priority of these elementary steps. Initial
cross-validated results for a neural network on several stratified
samples indicate we are able to retrieve this ordering with a precision
of 98.9%. The quality of our database makes it an invaluable resource
for the prediction of elementary reactions, and therefore of full
chemical processes.
![]()
28 - Novel approach to drug discovery integrating
chemogenomics and QSAR modeling: Applications to anti-Alzheimer's agents
Rima Hajjo, Dr. Simon Wang PhD, Prof. Bryan L. Roth
MD, PhD, Prof. Alexander Tropsha PhD. Department of Medicinal Chemistry
and Natural Products, University of North Carolina at Chapel Hill,
Chapel Hill, NC, United States; Department of Pharmacology, University
of North Carolina at Chapel Hill, Chapel Hill, NC, United States
Chemogenomics is an emerging interdisciplinary field relating the
receptorome-wide biological screening to functional or clinical effects
of chemicals. We have developed a novel chemogenomics approach combining
QSAR modeling, virtual screening (VS), and gene expression profiling for
drug discovery. Gene signatures for Alzheimer's disease (AD) were
used to query the Connectivity Map (cmap, http://www.broad.mit.edu/cmap/)
to identify potential anti-AD agents. Concurrently, QSAR models were
developed for the serotonin, dopamine, muscarinic and sigma receptor
families implicated in AD. The models were used for VS of the World
Drug Index database to identify putative ligands. Twelve common hits from
QSAR/VS and cmap studies were subjected to parallel binding assays
against a panel of GPCRs. All compounds were found to bind to at least
one receptor with binding affinities between 1.7 and 9000 nM. Thus, our
approach afforded novel experimentally confirmed GPCR ligands that may
be considered as putative treatments for AD.
![]()
29 - Cheminformatics improvements by combining
semantic web technologies, cheminformatical representations, and
chemometrics for statistical modeling and pattern recognition
Dr. Egon L. Willighagen. Department of
Pharmaceutical Biosciences, Uppsala University, Uppsala, Uppland,
Sweden
My research focuses on the methods needed for large-scale
molecular property prediction, using semantic web,
cheminformatics, and chemometrics methods. Originally starting
with a Dictionary on Organic Chemistry website, research was
started to find methods to accurately disseminate molecular
knowledge, resulting in participation in Open Source
cheminformatics projects, including Jmol, JChemPaint, and the
Chemical Markup Language project, and an oral presentation at
the "2000 Chemistry & Internet" conference. In that year, the
applicant founded together with the Jmol and JChemPaint project
leaders the Chemistry Development Kit (CDK), which is now a
highly cited Open Source cheminformatics toolkit. Between 2001
and 2006 the applicant continued research in the area of data
analysis with a PhD thesis on the "Representation of Molecules
and Molecular Systems in Data Analysis and Modeling" with Prof.
dr L.M.C. Buydens at the Analytical Chemistry Department at the
Radboud University Nijmegen. The thesis studies the interaction
of representation and the statistics and shows how tightly these
need to match. Topics of the thesis include: a critical analysis
of the use of proton and carbon NMR in QSAR; the use of Open
Source, Open Data, and Open standards in interoperability in
cheminformatics; the clustering of crystal structures using a
novel similarity measure; and, the use of new supervised
self-organizing maps in pattern recognition in crystallography.
Part of the research was performed in the group of dr P.
Murray-Rust at Cambridge University. Later research focused on
the use of semantic technologies to reduce error in the
aggregation and exchange of molecular data. Recent work applies
developed technologies to cheminformatics in general and QSAR
and metabolite identification in particular, with dr C.
Steinbeck at Cologne University in Germany, and with dr R. van
Ham at Wageningen University within the Netherlands Metabolomics
Center. The applicant recently joined the development team of
the award-winning cheminformatics-platform Bioclipse in Uppsala
with Prof. J. Wikberg in Sweden, to continue his research in
improving interoperability and reproducibility in
cheminformatics and pharmaceutical bioinformatics and
proteochemometrics in particular. This implies continued CDK
development, development of semantic methods in computational
chemistry, and making these technologies accessible to the
non-programming chemist by supporting the development of
cheminformatics in bench-chemist-oriented platforms such as
Bioclipse and Taverna.
![]()
30 - Prediction of consistent water networks in uncomplexed protein
binding sites based on knowledge-based potentials
Michael Betz, Gerd Neudert, Gerhard Klebe. Pharmaceutical
Chemistry, Philipps-University Marburg, Marburg, Germany
Within the active site of a protein, water fulfills a variety of different roles.
Solvation of hydrophilic parts stabilizes a distinct protein conformation,
whereas desolvation upon ligand binding may lead to a gain of entropy. In an
overwhelming number of cases, water molecules mediate interactions between
protein and the bound ligand. Therefore, a reliable prediction of water
molecules participating in ligand binding is essential for docking and scoring,
and is necessary to develop strategies in ligand design. We require some
reasonable estimates about the free energy contributions of water to binding.
Useful parameters for such estimations are the total number of displaceable
water molecules and the probabilities for their displacement upon ligand
binding. These parameters depend on specific interactions with the protein and
other water molecules, and thus the positions of individual water molecules.
The high flexibility of water networks makes it difficult to observe distinct
water molecules at well defined positions in structure determinations. Thus,
experimentally observed positions of water molecules have to be assessed
critically, bearing in mind that they represent an average picture of a highly
dynamic equilibrium ensemble. Moreover, there are many structures with
inconsistent and incomplete water networks.
To address these deficiencies we developed a tool that predicts possible
configurations of complete water networks in binding pockets in a consistent
way. It is based on the well established knowledge-based potentials implemented
into DrugScore, which also allow for a reasonable differentiation between
"conserved" and "displaceable" water molecules. The potentials used were derived
specifically for water positions as observed in small molecule crystal
structures in the CSD.
To account for the flexibility and high intercorrelation we apply a clique-based
approach, resulting in water networks maximizing the total DrugScore.
To incorporate as much known information as possible about a given target, we
also allow the inclusion of constraints defined by experimentally observed water
positions.
Our tool provides a useful starting point whenever a possible configuration of
water molecules needs to be estimated in an uncomplexed protein, and suggests
their spatial positions and their classification with respect to some kind of
affinity prediction.
In first tests we were able to get classifications and positional predictions
which are in good agreement with crystallographically observed water molecules
with remarkably small deviations.
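The clique-based selection mentioned above can be sketched as follows. This is only an illustration under stated assumptions, not the authors' implementation: candidate water sites carry a hypothetical favourability score (higher is better, standing in for the DrugScore contribution), two sites are treated as compatible when they are at least `min_sep` ångströms apart (a hypothetical cutoff), and the network is the clique of mutually compatible sites with the highest total score, found by exhaustive search.

```python
from math import dist


def best_water_network(sites, scores, min_sep=2.5):
    """Return (total_score, site_indices) of the best-scoring clique.

    sites  : list of (x, y, z) candidate water positions
    scores : hypothetical per-site favourability values, higher is better
    min_sep: hypothetical minimum distance for two waters to coexist
    """
    n = len(sites)
    compatible = [
        [i != j and dist(sites[i], sites[j]) >= min_sep for j in range(n)]
        for i in range(n)
    ]
    best = (0.0, [])

    def extend(clique, candidates, total):
        nonlocal best
        if total > best[0]:
            best = (total, list(clique))
        for k, v in enumerate(candidates):
            # Keep only later candidates compatible with every chosen site.
            rest = [w for w in candidates[k + 1:] if compatible[v][w]]
            extend(clique + [v], rest, total + scores[v])

    extend([], list(range(n)), 0.0)
    return best
```

Exhaustive enumeration is exponential in the worst case, so it is only practical for the small candidate sets of a single binding pocket; experimentally observed waters could be imposed as constraints by seeding `clique` with their indices.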
![]()
31 - Functional binders for non-specific binding: Evaluation
of virtual screening methods for the elucidation of novel transthyretin
amyloid inhibitors
Carlos J.V. Simões, Trishna Mukherjee, Prof. Richard
M. Jackson PhD, Prof. Rui M.M. Brito PhD. Department of Chemistry,
Center for Neuroscience and Cell Biology, University of Coimbra,
Coimbra, Portugal; Institute of Molecular and Cellular Biology,
University of Leeds, Leeds, West Yorkshire, United Kingdom
Inhibition of fibril formation by stabilization of the native form of
transthyretin (TTR) is a viable approach for the treatment of Familial
Amyloid Polyneuropathy that has been gaining momentum in the field of
amyloid research. Herein, we present a benchmark of five virtual
screening strategies to identify novel TTR stabilizers: (1) 2D
similarity searches with chemical hashed fingerprints, pharmacophore
fingerprints and UNITY fingerprints, (2) 3D-searches based on shape,
chemical and electrostatic similarity, (3) LigMatch, a ligand-based
method employing multiple templates, (4) 3D-pharmacophore searches, and
(5) docking to consensus X-ray crystal structures. By combining the
best-performing VS protocols, a small subset of molecules was selected
from a tailored library of 2.3 million compounds and identified as
representative of multiple series of potential leads. According to our
predictions, the retrieved molecules present better solubility, halogen
fraction and binding affinity for both TTR pockets than the stabilizers
discovered to date.
![]()
32 - Using the oreChem Experiments ontology: Planning and
enacting chemistry
Prof Jeremy G Frey, Mark I Borkum, Prof Carl Lagoze,
Dr. Simon J Coles. School of Chemistry, University of Southampton,
Southampton, Hants, United Kingdom; Department of Information Science,
Cornell University, Ithaca, NY, United States
This paper presents the oreChem Experiments Ontology, an extensible
model that describes the formulation and enactment of scientific methods
(referred to as “plans”), designed to enable new models of research and
facilitate the dissemination of scientific data on the Semantic Web.
Currently, a high level of domain-specific knowledge is required to
identify and resolve the implicit links that exist between digital
artefacts, constituting a significant barrier-to-entry for third parties
that wish to discover and reuse published data. The oreChem ontology
radically simplifies and clarifies the problem of representing an
experiment to facilitate the discovery and re-use of the data in the
correct context. We describe the main parts of the ontology and detail
the enhancements made to the Southampton eCrystals repository to enable
the publication of oreChem metadata.
![]()
33 - CHEMINF: Community-developed ontology of chemical
information and algorithms
Leonid L Chepelev, Janna Hastings, Egon Willighagen,
Nico Adams, Christoph Steinbeck, Peter Murray-Rust, Michel Dumontier.
Department of Biology, School of Computer Science, and Institute of
Biochemistry, Carleton University, Ottawa, Ontario, Canada;
Chemoinformatics and Metabolism Team, European Bioinformatics Institute,
Cambridge, United Kingdom; Department of Pharmaceutical Sciences,
Uppsala University, Uppsala, Sweden; Department of Chemistry, Unilever
Centre for Molecular Informatics, University of Cambridge, Cambridge,
United Kingdom
In order to truly convert RDF-encoded chemical information into
knowledge and break out of domain- and vendor-specific data silos,
reliable chemical ontologies are necessary. To date, no standard
ontology that addresses all chemical information representation and
service integration needs has emerged from previously proposed
ontologies, ironically threatening yet another “Tower of Babel” event in
cheminformatics. To avoid resultant substantial ontology mapping costs,
we hereby propose CHEMINF, a community-developed modular and unified
ontology for chemical graphs, qualities, descriptors, algorithms,
implementations, and data representations/formalisms. Further, CHEMINF
is aligned with ontologies developed within the OBO Foundry effort, such
as the Information Artifact Ontology. We present the application of
CHEMINF to efficiently integrate two RDF-based chemical knowledgebases
with different representation structures and aims, but common classes
and properties from CHEMINF. Finally, we discuss the steps taken to
ensure applicability of this ontology in the semantic envelopment of
computational chemistry resources, algorithms, and their output.
![]()
34 - Chemical entity semantic specification: Knowledge
representation for efficient semantic cheminformatics and facile data
integration
Leonid L Chepelev, Michel Dumontier. Department of
Biology, School of Computer Science, and Institute of Biochemistry,
Carleton University, Ottawa, Ontario, Canada
Though the nature of RDF implies the ability to interoperate and
integrate diverse knowledgebases, designing adequate and efficient
RDF-based representations of knowledge concerning chemical entities is
non-trivial. We hereby describe Chemical Entity Semantic Specification
(CHESS), which captures chemical descriptors, molecular connectivity,
functional composition, and geometric structure of chemical entities and
their components. CHESS also handles multiple data sources and multiple
conformers for molecules, as well as reactions and interactions. We
demonstrate the generation of a chemical knowledgebase from disparate
data sources, using which we conduct an analysis of the implications of
design choices taken in CHESS on the efficiency of solutions for some
classical cheminformatics problems, including molecular similarity
searching and subgraph detection. We do this through automated
conversion of SMILES-encoded query fragments into SPARQL queries and
DL-Safe rules. Finally, we discuss approaches to identification of
potential reaction participants and class members in chemical entity
knowledgebases represented with CHESS.
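To make the idea of converting SMILES-encoded query fragments into SPARQL concrete, here is a minimal sketch. It handles only unbranched chains of C, N, O and S atoms with optional `-`, `=` or `#` bond symbols, and it uses a made-up `ex:` vocabulary (`ex:hasAtom`, `ex:element`, `ex:bondOrder`, `ex:connects`). These predicates are assumptions for illustration and are not the actual CHESS or CHEMINF terms.

```python
BOND_ORDERS = {"-": 1, "=": 2, "#": 3}
ELEMENTS = {"C", "N", "O", "S"}


def chain_smiles_to_sparql(smiles):
    """Translate a linear SMILES fragment into a SPARQL subgraph query.

    Only unbranched, ring-free chains are supported in this sketch.
    """
    atoms, bonds = [], []
    pending = 1  # bond order to the next atom; single unless stated
    for ch in smiles:
        if ch in BOND_ORDERS:
            pending = BOND_ORDERS[ch]
        elif ch in ELEMENTS:
            if atoms:
                bonds.append((len(atoms) - 1, len(atoms), pending))
            atoms.append(ch)
            pending = 1
        else:
            raise ValueError(f"unsupported SMILES symbol: {ch!r}")
    if not atoms:
        raise ValueError("empty fragment")

    lines = [
        "PREFIX ex: <http://example.org/chem#>",  # hypothetical namespace
        "SELECT DISTINCT ?mol WHERE {",
    ]
    for i, el in enumerate(atoms):
        lines.append(f'  ?mol ex:hasAtom ?a{i} . ?a{i} ex:element "{el}" .')
    for k, (i, j, order) in enumerate(bonds):
        lines.append(f"  ?b{k} ex:bondOrder {order} ; ex:connects ?a{i} , ?a{j} .")
    # Distinct variables must bind to distinct atoms of the molecule.
    for i in range(len(atoms)):
        for j in range(i + 1, len(atoms)):
            lines.append(f"  FILTER(?a{i} != ?a{j})")
    lines.append("}")
    return "\n".join(lines)
```

For example, `chain_smiles_to_sparql("C=O")` yields a query matching any molecule that contains a carbon doubly bonded to an oxygen; a full converter would also need branches, rings, aromaticity and charges.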
![]()
35 - Semantic assistant for lipidomics researchers
Alexandre Kouznetsov, Rene Witte, Christopher J.O.
Baker. Department of Computer Science and Applied Statistics, University
of New Brunswick, Saint John, New Brunswick, Canada; Department of
Computer Science and Software Engineering, Concordia University,
Montreal, Canada
Lipid nomenclature has yet to become a robust research tool for
lipidomics or lipid research in general. This is in part because no
rigorous structure-based definitions have existed for membership of
specific lipid classes. Recent work on the OWL-DL Lipid Ontology with
defined axioms for class membership has provided new opportunities
to revisit the lipid nomenclature issue [1], [2]. Also necessary is a
framework for sharing these axioms with scientists during scientific
discourse and the drafting of publications. To achieve this we introduce
here a new paradigm for Lipidomics researchers in which a client side
application tags raw text about lipids with information, such as
canonical name or relevant functional groups, derived from the ontology
and is delivered using web services. Our approach includes the following
core components: (i) Semantic Assistant Framework [6]; (ii) Lipid
ontology [4]; (iii) Ontological NLP methodology; (iv) Ontology
Axiom-extractor for the GATE framework. The Semantic Assistant Framework
is a service-oriented architecture used to enhance existing end-user
clients, such as OpenOffice Writer, with online Lipidomics text analysis
capabilities provided as a set of web services. The Ontological NLP
methodology links Lipid named entities occurring in a document opened on
the client side with existing ontologies on the server side. The Ontology
Axiom-extractor annotates each named entity with canonical name, class
name and related class axioms providing annotation for documents on the
client side. The proposed system is scalable and extensible allowing
researchers to easily customize the information to be delivered as
annotations depending on the availability of chemical ontologies with
defined axioms linked to canonical names for chemical entities.
[1] Baker CJO, Low HS, Kanagasabai R, and Wenk MR, (2010), Lipid
Ontologies, 3rd Interdisciplinary Ontology Conference, Tokyo,
Japan, February 27-28, 2010
[2] Low HS, Baker CJO, Garcia A and Wenk M., (2009), OWL-DL Ontology for
Classification of Lipids, International Conference on Biomedical
Ontology, Buffalo, New York, July 24-26
[3] Witte R., Gitzinger T., (2008), A General Architecture for
Connecting NLP Frameworks and Desktop Clients Using Web Services, 13th
International Conference on Applications of Natural Language to
Information Systems
[4] Lipid Ontology available at http://bioportal.bioontology.org/ontologies/39503
![]()
36 - ChemicalTagger: A tool for semantic text-mining in chemistry
Dr Lezan Hawizy, Dave M Jessop, Professor Peter
Murray-Rust. The Unilever Centre for Molecular Science Informatics,
Department of Chemistry, University of Cambridge, Cambridge, United
Kingdom
The primary method for scientific communication is in the form of
published scientific articles and theses and the use of natural
language combined with domain-specific terminology. As such, they
contain unstructured data.
Given the unquestionable usefulness of data extraction from unstructured
literature, we aim to show how this can be achieved for the
discipline of chemistry. The highly formulaic style of writing most
chemists adopt makes their contributions well suited to
high-throughput Natural Language Processing (NLP) approaches. Using
chemical synthesis procedures as an exemplar, we present
ChemicalTagger. ChemicalTagger is a tool that combines chemical
entity recognisers such as OSCAR with tokenisers, part-of-speech
taggers and shallow parsing tools to produce a formal structure of
reactions.
This extracted data can then be expressed in RDF. This allows for the
generation of highly informative visualisations, such as visual
document summaries, and for structured querying. Further enrichment can
be provided by linking with domain-specific ontologies.
![]()
37 - From canonical numbering to the analysis of
enzyme-catalyzed reactions: 32 years of publishing in JCIM (JCICS)
Prof. Johann Gasteiger.
Computer-Chemie-Centrum, University of Erlangen-Nuremberg, Erlangen,
Germany; Molecular Networks GmbH, Erlangen, Germany
In 1972 we embarked on the development of a program for
computer-assisted synthesis design which eventually led to the present
system THERESA. Along the way many fundamental problems had to be solved
such as the unique representation of chemical structures published in
1977. This work laid the foundation for building the Beilstein database.
Methods had to be developed for the computer representation of chemical
reactions which formed the basis for constructing the ChemInform
reaction database. Recent work has concentrated on the analysis of
biochemical reactions, the prediction of metabolism and the risk
assessment of chemicals.
![]()
38 - Fifteen years of JCICS
Dr. George W Milne. NCI, NIH (Retd), Williamsburg,
VA, United States
During the period 1989-2004 when I was Editor of the Journal
of Chemical Information and Computer Sciences (JCICS), the predecessor
of the Journal of Chemical Information and Modeling (JCIM), many papers
appeared addressing contemporary problems in computational chemistry.
Some of these problems were completely settled and significant progress
was made with others. A third group, in spite of numerous publications,
defied attempts at resolution and remain to this day as challenges to
computational chemists.
As JCIM, aka JCICS, aka J. Chem. Doc., embarks upon its second
50 years, the progress recorded during the 1990s and the advances in
computer hardware and software are reviewed. With a longer perspective,
the impact of computers on chemistry is considered.
![]()
39 - Fifteen years in chemical informatics: Lessons from the past, ideas
for the future
Dimitris Agrafiotis PhD. Pharmaceutical Research
& Development, Johnson & Johnson, Spring House, Pennsylvania, United
States
A unique aspect of chemical informatics is that it has been
heavily influenced and shaped by the needs of the pharmaceutical
industry. As this industry undergoes a profound transformation,
so will the field itself. In this talk, we reflect on the
experiences of the past and explore the possibilities we see for
the future. These possibilities lie at the convergence of
chemistry, biology, and information technology, and will require
thinking and working across scientific and organizational
boundaries in a way that has never been previously possible.
![]()
40 - Applications of wavelets in virtual screening
Prof Val Gillet PhD, Mr Richard Martin, Dr Eleanor
Gardiner, Dr Stefan Senger. Department of Information Studies,
University of Sheffield, Sheffield, United Kingdom; Computational and
Structural Chemistry, GlaxoSmithKline, Stevenage, Hertfordshire, United
Kingdom
The interactions which a small molecule can make with a receptor can be
modelled using three-dimensional molecular fields, such as GRID fields.
However, the cumbersome nature of these fields makes their storage and
comparison computationally expensive. Wavelets are a family of
multiresolution signal analysis functions which have become widely used
in data compression. We have applied the non-standard wavelet transform
to generate low-resolution approximations (wavelet thumbnails) of finely
sampled GRID fields, without loss of information. We demonstrate various
applications of wavelet thumbnails including the development of an
alignment method to enable the comparison of the wavelet representations
of GRID fields in arbitrary orientation.
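As a rough illustration of what a wavelet thumbnail is, the sketch below computes the approximation sub-band of a Haar decomposition applied one level at a time along all three axes, which is how the non-standard transform proceeds. It assumes the Haar wavelet with averaging normalisation and a cubic grid whose side is a power of two; the abstract does not state which wavelet family or normalisation the authors used, and the detail coefficients needed to reconstruct the full field are omitted here.

```python
def haar_thumbnail(grid, levels=1):
    """Return a low-resolution approximation of a cubic 3D field.

    grid  : nested lists grid[x][y][z], side length a power of two
    levels: number of decomposition levels; each level halves the side
    """
    for _ in range(levels):
        n = len(grid) // 2
        if n == 0:
            raise ValueError("grid too small for the requested levels")
        # With the Haar wavelet, one level along x, y and z leaves an
        # approximation equal to the mean of each 2 x 2 x 2 block.
        grid = [
            [
                [
                    sum(
                        grid[2 * x + dx][2 * y + dy][2 * z + dz]
                        for dx in (0, 1)
                        for dy in (0, 1)
                        for dz in (0, 1)
                    ) / 8.0
                    for z in range(n)
                ]
                for y in range(n)
            ]
            for x in range(n)
        ]
    return grid
```

Each level reduces the number of stored values eightfold, which is what makes comparing thumbnails so much cheaper than comparing the finely sampled fields.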
![]()
41 - Privileged substructures revisited: Target
community-selective scaffolds
Jürgen Bajorath. Department of Life Science
Informatics, University of Bonn, Germany
Molecular scaffolds that preferentially bind to a given target family,
so-called “privileged” substructures, have long been of high interest in
drug discovery. Many privileged substructures have been proposed, in
particular, for G protein-coupled receptors and protein kinases.
However, the existence of truly privileged structural motifs has
remained controversial. Frequency-based analysis has shown that many
scaffolds thought to be target class-specific also occur in compounds
active against other types of targets. In order to explore scaffold
selectivity on a large scale, we have carried out a systematic survey of
publicly available compound data and defined target communities on the
basis of ligand-target networks. The analysis was based on compound
potency data and target pair potency-derived selectivity. More than 200
hierarchical scaffolds were identified, each represented by at least
five compounds, which exclusively bound to targets within one of ca. 20
target communities. By contrast, currently available compound data is
too sparsely distributed to assign target-specific scaffolds. Most
scaffolds that exclusively bind to a single target within a community
are only represented by one or two compounds in public domain databases.
However, characteristic selectivity patterns are found to evolve around
community-selective scaffolds that can be explored to guide the design
of target-selective compounds.
![]()
42 - Automated retrosynthetic analysis: An old flame rekindled
Prof Peter Johnson PhD, Anthony P Cook, James Law, Mahdi
Mirzazadeh, Dr Aniko Simon PhD. School of Chemistry, University of Leeds, Leeds,
United Kingdom; Simbiosys Inc, Toronto, Ontario, Canada
The last century saw truly innovative research aimed at the creation of systems
for computer aided organic synthesis design (CAOSD). However, such systems have
not achieved significant user acceptance, perhaps because they required manual
creation of reaction knowledge bases, a time consuming task which requires
considerable synthetic chemistry expertise. More recent systems like ARChem [1]
circumvent this problem by automated abstraction of transformation rules from
very large databases of specific examples of reactions. ARChem is still a work
in progress and specific problems which are being addressed include:
a) identification of precise structural characteristics of each reaction, often
requiring knowledge of reaction mechanism;
b) treatment of interfering functional groups;
c) minimising the combinatorial explosion inherent in automated multistep
retrosynthesis;
d) treatment of the results of extensive recent research into enantioselective
and stereoselective reactions.
[1] Law et al., J. Chem. Inf. Model., 2009, 49 (3), pp 593-602
![]()
43 - Dietary supplements: Free evidence-based resources for
the cautious consumer
MLS Brian Erb. McGoogan Library of Medicine,
University of Nebraska Medical Center, Omaha, NE, United States
Vitamin, mineral and dietary supplements are a 70 billion dollar
industry. With marginal FDA regulation, it can be difficult to evaluate
the health claims of a given product. How can the skeptical consumer
distinguish a promising nutritional supplement from a substance that
lacks the evidence to back its nutritional claims? This short
presentation will highlight some evidence-based Internet sources that
will help the consumer navigate the dietary supplement minefield. These
sources will not only help the consumer separate bogus claims from
research supported evidence, but also help the consumer make informed
nutritional decisions regarding which supplements might be a relevant
and useful part of their healthy diet and lifestyle. The resources to be
explored have been collected in a UNMC libguide at http://unmc.libguides.com/supplements
for ease of navigation and dissemination.
![]()
44 - What lessons learned can we generalize from evaluation
and usability of a health website designed for lower literacy consumers?
Mary J Moore PhD, Randolph G. Bias PhD. Department of
Health Informatics, University of Miami Miller School of Medicine,
Miami, FL, United States; Department of Information, University of Texas
at Austin, Austin, Texas, United States
Objectives: Researchers conducted multifaceted usability testing and
evaluation of a website designed for use by those with lower computer
literacy and lower health literacy. Methods included heuristic
evaluation by a usability engineer, remote usability testing and
face-to-face testing. Results: Standard usability testing methods
required modification, including interpreters, increased flexibility for
time on task, presence of a trusted intermediary, and accommodation for
family members who accompanied participants. Participants suggested
website redesign, including simplified language, engaging and relevant
graphics, culturally relevant examples, and clear navigation.
Conclusions: User-centered design was especially important for this
audience. Some lessons learned from this experience are echoed in
usability and evaluation of commercial sites designed for similar
audiences, and may be generalizable.
![]()
45 - National Library of Medicine resources for consumer
health information
Michelle Eberle. National Network of Libraries of
Medicine - New England, Shrewsbury, MA, United States
Come learn about free, high quality web resources for consumer health
information from the National Library of Medicine. We will cover
MedlinePlus, a resource for health information for the public. The
presenter will take you on a guided tour of http://medlineplus.gov and
other specialized web resources for consumer health information
including the Drug Information Portal, DailyMed and the Dietary
Supplements Labels Database. The program will wrap up with a brief introduction
to ClinicalTrials.gov. You will leave this program equipped with
expertise to find, critically appraise, and use online health
information more effectively.
![]()
46 - Better prescription for information: Dietary supplements
online
Gail Y. Hendler MLS. Hirsh Health Sciences Library,
Tufts University, Boston, MA, United States
Dietary supplements are becoming staples in the health regimens of a
growing number of consumers worldwide. According to the most recent
National Health and Nutrition Examination Survey, 52% of
adults in the United States reported taking a nutraceutical in the past
month. Consumers turn to these products believing they are safe and
effective because they are “all natural.” Supplementing knowledge about
the benefits and the potential risks associated with nutraceutical use
requires information resources that are authoritative, accurate and
readable to a large and general audience. This presentation will provide
recommendations for locating high-quality, freely available online
resources that today's consumers need to support decision-making.
Featured resources will include books, databases and websites that
discuss the pros and cons and provide the evidence for better use of
dietary supplements, herbs and functional foods.
![]()
47 - Overview of the linking open drug data task
Eric Prudhommeaux, Egon Willighagen, Susie Stephens.
W3C/MIT, Cambridge, MA, United States; Uppsala University, Uppsala,
Sweden; Johnson and Johnson, United States
There is much interesting information about drugs that is available on
the Web. Data sources range from medicinal chemistry results, to the
impacts of drugs on gene expression, through to the results of drugs
in clinical trials.
Linking Open Drug Data (LODD) is a task within the W3C's Health Care
Life Sciences Interest Group. LODD has surveyed publicly available
data sets about drugs, created Linked Data representations of the data
sets and interlinked them, and identified interesting
scientific and business questions that can be answered once the data
sets are connected. The task also actively explores best practices for
exposing data in a Linked Data representation.
The figure below shows part of the data sets that have been published
and interlinked by the task so far.

The LODDse data sets are represented in dark gray, while light gray
represents other Linked Data from the life sciences, and white
indicates data sets from different domains. Collectively, the LODD
data sets consist of over 8 million RDF triples, which are interlinked
by more than 370,000 RDF links. This presentation will introduce the
LODD task and show examples of recent.
![]()
48 - Control, monitoring, analysis and dissemination of
laboratory physical chemistry experiments using semantic web and broker
technologies
Prof Jeremy G Frey, Stephen Wilson. School of
Chemistry, University of Southampton, Southampton, Hants, United Kingdom
A suite of software was developed to control and monitor
experimental and environmental data and used for probing of the
air/water
interface using Second Harmonic Generation. A centralised message broker
enabled a common communication protocol between all objects in the
system; experimental
apparatus, data loggers, storage solutions and displays. The data and
context are captured and
represented in ways compatible with the Semantic Web. Experimental plans
and the enactment are
described using the oreChem experiments ontology; this provides the
means to
capture the metadata associated with the experimental process and the
resulting
data. Environmental data was stored in the Open Geospatial Consortium
Sensor
Observation Service (SOS). The SOS is part of the Sensor Web Enablement
architecture;
this describes a number of interoperable interfaces and metadata
encodings for
integrating sensors webs into the cloud. A mashup web interface was
produced to
link all these sources of information from a single point.
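As an illustration of the broker pattern described in this abstract (not the authors' software), the following minimal in-process sketch routes each published message to every subscriber of its topic. The class name `Broker`, the topic string and the message fields are assumptions made for this example.

```python
from collections import defaultdict


class Broker:
    """Toy in-process message broker: delivers each message to all topic subscribers."""

    def __init__(self):
        # Maps a topic name to the list of handler callables subscribed to it.
        self._subscribers = defaultdict(list)

    def subscribe(self, topic, handler):
        self._subscribers[topic].append(handler)

    def publish(self, topic, message):
        for handler in self._subscribers[topic]:
            handler(message)


broker = Broker()
stored = []     # stands in for a storage solution
displayed = []  # stands in for a display

broker.subscribe("lab/temperature", stored.append)
broker.subscribe("lab/temperature",
                 lambda m: displayed.append(f"{m['value']} {m['unit']}"))

# A data logger publishes one reading; both subscribers receive it.
broker.publish("lab/temperature", {"value": 21.5, "unit": "degC"})
```

The point of the design is that the data logger does not need to know which storage or display components exist; they only share the topic name.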
![]()
49 - Semantic analysis of chemical patents
Dave M Jessop, Dr Lezan Hawizy, Prof. Peter
Murray-Rust, Professor Robert C Glen. The Unilever Centre for Molecular
Science Informatics, Department of Chemistry, University of Cambridge,
Cambridge, United Kingdom
Chemical patents are a rich source of technical and scientific information.
They include metadata, such as bibliographic information, as well as
scientific data relating to reactions and synthesis experiments. However,
they are lengthy, largely unstructured and rich in technical terminology,
so analysing them takes a significant amount of human effort. This makes
them an ideal candidate for 'semantification'.
As a demonstration, an RDF triplestore of chemical patents is created. The
patents, provided by the European Patent Office, are in an XML format.
Document segmentation is used initially to extract the relevant
information, mainly bibliographic information and experimental paragraphs.
The experimental paragraphs are then processed using Natural Language
Processing tools to extract the various components of the chemical
reaction; roles, such as reactant, product or solvent, are then assigned.
This extracted information is then converted into RDF and stored in a
triplestore, where it can be queried and visualised and basic inferences
can be made. The ultimate goal of this semantic representation is to make
data available and re-usable by the scientific community.
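The last step of the pipeline, turning extracted reaction components and their roles into triples, might look roughly like the sketch below. The namespace `http://example.org/patent/`, the predicate names and the example reaction are all made up for illustration; the abstract does not specify the vocabulary actually used.

```python
# Hypothetical namespace; the real vocabulary is not given in the abstract.
EX = "http://example.org/patent/"


def reaction_to_triples(reaction_id, components):
    """Build (subject, predicate, object) triples for one extracted reaction.

    `components` is a list of (role, chemical name) pairs, e.g. the output
    of a role-assignment step.
    """
    subject = f"<{EX}reaction/{reaction_id}>"
    triples = []
    for role, name in components:
        predicate = f"<{EX}has{role.capitalize()}>"
        triples.append((subject, predicate, f'"{name}"'))
    return triples


# Made-up extraction result for a single experimental paragraph.
extracted = [
    ("reactant", "benzaldehyde"),
    ("solvent", "ethanol"),
    ("product", "benzyl alcohol"),
]
triples = reaction_to_triples("r1", extracted)

# Print in an N-Triples-like form, one statement per line.
for s, p, o in triples:
    print(f"{s} {p} {o} .")
```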
![]()
50 - Data mining and querying of integrated chemical and
biological information using Chem2Bio2RDF
Dr David J Wild, Bin Chen, Dr Ying Ding, Xiao Dong,
Huijun Wang, Dazhi Jiao, Dr Qian Zhu, Madhuvanti Sankaranarayanan.
School of Informatics and Computing, Indiana University, Bloomington,
IN, United States; School of Library and Information Science, Indiana
University, Bloomington, IN, United States
We have recently developed a freely-available resource called
Chem2Bio2RDF (http://chem2bio2rdf.org) that consists of chemical,
biological and chemogenomic datasets in a consistent RDF framework,
along with SPARQL querying tools that have been extended to allow
chemical structure and similarity searching. Chem2Bio2RDF allows
integrated querying that crosses chemical and biological information
including compounds, publications, drugs, genes, diseases, pathways and
side-effects. It has been used for a variety of applications including
investigation of compound polypharmacology, linking drug side-effects to
pathways, and identifying potential multi-target pathway inhibitors. In
the work reported here, we describe a new set of tools and methods that
we have developed for querying and data mining in Chem2Bio2RDF,
including: Linked Path Generation (a method for automatically
identifying paths between datasets and generating SPARQL queries from
these paths); an ontology for integrated chemical and biological
information; a Cytoscape plugin that allows dynamic querying and network
visualization of query results; and a facet-based browser for browsing
results.
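The idea behind Linked Path Generation, finding a path between datasets and emitting a SPARQL query from it, can be sketched as a breadth-first search over a graph of dataset links. This is a sketch under stated assumptions, not the Chem2Bio2RDF implementation: the `LINKS` schema, the `ex:` prefix and the predicate names are hypothetical.

```python
from collections import deque

# Hypothetical schema: (source class, predicate, target class).
LINKS = [
    ("Compound", "ex:bindsTo", "Protein"),
    ("Protein", "ex:encodedBy", "Gene"),
    ("Gene", "ex:associatedWith", "Disease"),
    ("Compound", "ex:causes", "SideEffect"),
]


def find_path(start, goal):
    """Breadth-first search for the shortest chain of links from start to goal."""
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        node, path = queue.popleft()
        if node == goal:
            return path
        for src, pred, dst in LINKS:
            if src == node and dst not in seen:
                seen.add(dst)
                queue.append((dst, path + [(src, pred, dst)]))
    return None  # no chain of links connects the two classes


def path_to_sparql(path):
    """Turn a non-empty link path into a SELECT query, one triple pattern per link."""
    if not path:
        raise ValueError("path must contain at least one link")
    patterns = [f"  ?{src.lower()} {pred} ?{dst.lower()} ." for src, pred, dst in path]
    first, last = path[0][0].lower(), path[-1][2].lower()
    return (
        "PREFIX ex: <http://example.org/>\n"
        f"SELECT ?{first} ?{last} WHERE {{\n" + "\n".join(patterns) + "\n}"
    )


path = find_path("Compound", "Disease")
query = path_to_sparql(path)
print(query)
```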
![]()
51 - Mining and visualizing chemical compound-specific
chemical-gene/disease/pathway/literature relationships
Dr. Qian Zhu, Prajakta Purohit, Jong Youl Choi,
Seung-Hee Bae, Dr. Judy Qiu, Prof. Ying Ding, Prof. David Wild. School
of Informatics and Computing, Indiana University, Bloomington, IN,
United States; School of Library & Information Science, Indiana
University, Bloomington, IN, United States; Department of Computer
Science, Indiana University, Bloomington, IN, United States
In common with most scientific disciplines, drug discovery has seen a huge
increase in recent years in the amount of publicly available and
proprietary information, owing to a variety of factors including
improvements in experimental technologies. The big challenge is how to use
all of this information together in an intelligent, integrative fashion.
We are developing an application to mine and visualize relationships
between chemicals and genes, diseases, pathways and literature. It aims to
help answer the question “What else should I know about this compound?”
from a medicinal chemistry perspective, based on the full picture of the
chemical. For the mining part, we have already developed an aggregating
web service, named WENDI, which calls multiple individual, or atomic, web
services covering a diversity of compound-related data sources, predictive
models and self-developed algorithms, and aggregates the results from
these services in XML. For visualization, we take two approaches. First,
we create an RDF reasoner that converts the XML from WENDI to RDF, finds
inferred relationships based on the RDF, ranks the evidence for
chemical-disease relationships, and presents all of the evidence in the
SWP faceted browser, which is based on Longwell
(http://simile.mit.edu/wiki/Longwell). This mixes the flexibility of the
RDF data model with the faceted browser to enable users to browse complex
RDF triples in a user-friendly and meaningful manner. Second, we place all
relationships from WENDI into a chemical space consisting of 60M PubChem
compounds, and then cluster and highlight chemical compounds with specific
attributes, such as gene, disease, pathway or literature associations,
using PubChemBrowse. PubChemBrowse is a customized visualization tool for
cheminformatics research that provides a novel 3D data point browser; it
displays complex properties of massive data on commodity clients and
supports fast interaction with an external property database via a
semantic web interface.
![]()
52 - What makes polyphenols good antioxidants? Alton Brown,
you should take notes...
Emilio Xavier Esposito PhD. The Chem21 Group, Inc,
Lake Forest, Illinois, United States
The dominant physical feature of antioxidants is the phenol group; polyphenols,
according to Alton Brown. The proposed antioxidant-tyrosinase mechanism,
based on a series of experimentally determined mushroom tyrosinase
structures, provides insight to the molecular interactions that drive
the reaction. While the enzyme structures illustrate the important
molecular interactions for tyrosinase inhibition, the enzyme structures
do not always facilitate the understanding of what makes a good
inhibitor or the mechanism of the reaction. Using an antioxidant (tyrosinase
inhibitor) dataset of 626 compounds (from the linear discriminant
analysis research of Martín et al. Euro J Med Chem 42 p1370-1381, 2007)
we constructed binary QSAR models to indicate the important antioxidant
molecular features. Exploring models constructed from molecular
descriptors based on fingerprints (MACCS keys), traditional molecular
descriptors (2D and 2½D), VolSurf-like molecular descriptors (3D) and
molecular dynamics (4D-Fingerprints), the relationship between
polyphenols' biologically relevant molecular features - as determined by
each set of descriptors - and their antioxidant abilities will be
discussed.
![]()
53 - Engineering and 3D protein-ligand interaction scaling of
2D fingerprints
Jürgen Bajorath. Department of Life Science
Informatics, University of Bonn, Bonn, Germany
Different concepts are introduced to further refine and advance
molecular descriptors for SAR analysis. Fingerprints have long been
among preferred descriptors for similarity searching and SAR studies.
Standard fingerprints typically have a constant bit string format and
are used as individual database search tools. However, by applying
“engineering” techniques such as “bit silencing”, fingerprint reduction,
and “recombination”, standard fingerprints can be tuned in a compound
class-directed manner and converted into size-reduced versions with
higher search performance. It is also possible to combine preferred bit
segments from fingerprints of distinct design and generate “hybrids”
that exceed the search performance of their parental fingerprints.
Furthermore, effective 2D fingerprint representations can be generated
from strongly interacting parts of ligands in complex crystal
structures. These “interacting fragment” fingerprints focus search
calculations on pharmacophore elements without the need to encode
interactions directly. Moreover, 3D protein-ligand interaction
information can implicitly be taken into account in 2D similarity
searching through fingerprint scaling techniques that emphasize
characteristic bit patterns.
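To make the notion of “bit silencing” concrete, the sketch below treats a fingerprint as a set of on-bit positions and shows how switching off a chosen bit changes a Tanimoto similarity value. It is a simplified reading for illustration only: the fingerprints and the choice of silenced bit are invented, and the abstract does not describe how bits are selected in practice.

```python
def tanimoto(bits_a, bits_b):
    """Tanimoto coefficient of two fingerprints given as sets of on-bit positions."""
    union = bits_a | bits_b
    if not union:
        return 0.0
    return len(bits_a & bits_b) / len(union)


def silence(bits, silenced):
    """Return the fingerprint with the silenced bit positions switched off."""
    return bits - silenced


# Invented example fingerprints and silenced bit.
query = {1, 2, 3, 7}
candidate = {2, 3, 4, 7}
silenced_bits = {7}

before = tanimoto(query, candidate)
after = tanimoto(silence(query, silenced_bits), silence(candidate, silenced_bits))
print(before, after)
```

Whether silencing a bit helps depends on the compound class: a bit is only worth silencing if search performance for that class improves without it.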
![]()
54 - In silico binary QSAR models based on
4D-fingerprints and MOE descriptors for prediction of hERG blockage
Prof. Y. Jane Tseng PhD. Graduate Institute of Biomedical
Electronics and Bioinformatics, National Taiwan University, Taipei, Taiwan,
Republic of China
Blockage of the human ether-a-go-go related gene (hERG) potassium ion channel
is a major factor related to cardiotoxicity. Hence, drug binding to this
channel has become an important biological endpoint in side-effect screening.
We have collected all available biologically active hERG compounds from the
hERG literature, for a total of 250 structurally diverse compounds. This data
set was used to construct a set of two-state hERG QSAR models. The descriptor
pool used to construct the models consisted of 4D-fingerprints generated from
the thermodynamic distribution of conformer states available to a molecule,
204 traditional 2D descriptors and 76 3D VolSurf-like descriptors computed
using the Molecular Operating Environment (MOE) software. One model is a
continuous partial least squares (PLS) QSAR hERG binding model. Another,
related model is an optimized binary QSAR model that classifies compounds as
active or inactive. This binary model achieves 91% accuracy over a large range
of molecular diversity spanning the training set. An external test set was
constructed from the condensed PubChem bioassay database containing 816
compounds and successfully used to validate the binary model. The binary QSAR
model permits a structural interpretation of possible sources for hERG
activity. In particular, the presence of a polar negative group at a distance
of 6 to 8 Å from a hydrogen bond donor in a compound is predicted to be a
quite structure-specific pharmacophore that increases hERG blockage. Since a
data set of high chemical diversity was used to construct the binary model,
it is applicable for performing general virtual hERG screening.
![]()
55 - Telling the good from the bad and the ugly: The
challenge of evaluating pharmacophore model performance
Robert D. Clark PhD. Simulations Plus, Inc.,
Lancaster, California, United States
Pharmacophore models are useful when they provide qualitative insight
into the interactions between ligands and their target macromolecules,
and therefore are more akin in many ways to molecular simulations than
to quantitative structure activity relationships (QSARs) based on the
partition of activity across a set of molecular descriptors. When the
performance of a pharmacophore model is assessed quantitatively, it is
usually in terms of its ability to recover known ligands or, less often,
in terms of how well it distinguishes ligands from non-ligands. This
status as a classification technique also sets it apart from more
numerical QSAR methods, in part because of fundamental differences in
what being "good" means. Carefully defining what "good" classification
is, however, can make creative combination with other techniques a
productive way to capture the value of their intrinsic complementarity.
![]()
56 - Creative application of ligand-based methods to solve
structure-based problems: Using QSAR approaches to learn from protein
crystal structures
Prof. Curt M Breneman, Dr. Sourav Das, Dr. Matt
Sundling, Mr. Mike Krein, Prof. Steven Cramer, Prof. Kristin P Bennett,
Dr. Charles Bergeron, Mr. Jed Zaretzki. Department of Chemistry and
Chemical Biology, Rensselaer Polytechnic Institute, Troy, NY, United
States; Department of Chemical and Biological Engineering, Rensselaer
Polytechnic Institute, Troy, NY, United States; Department of
Mathematical Sciences, Rensselaer Polytechnic Institute, Troy, NY,
United States
In practice, there is no inherent disconnect between the descriptor-based
cheminformatics methods commonly used for predicting small molecule
properties and those that can be used to understand and predict protein
behaviors. Examples of such connections include the development of
predictive models of protein/stationary phase binding in HIC and
ion-exchange chromatography, protein/ligand binding mode characterization
through PROLICSS analysis of crystal structures, and the use of PESD binding
site signatures for pose scoring and predicting off-target drug
interactions. In all of these cases, models were created using descriptors
based on protein electronic and structural features and modern machine
learning methods that include model validation tools and domain of
applicability assessment metrics.
![]()
57 - Computer-aided drug discovery
Prof. William L Jorgensen. Department of Chemistry,
Yale University, New Haven, CT, United States
Drug development is being pursued through computer-aided structure-based
design. For de novo lead generation, the BOMB program builds
combinatorial libraries in a protein binding site using a selected core
and substituents, and QikProp is applied to filter all designed
molecules to ensure that they have drug-like properties. Monte
Carlo/free-energy perturbation simulations are then executed to refine
the predictions for the best scoring leads including ca. 1000 explicit
water molecules and extensive sampling for the protein and ligand. FEP
calculations for optimization of substituents on an aromatic ring and
for choice of heterocycles are now common. Alternatively, docking with
Glide is performed on large databases of purchasable
compounds to provide leads, which are then optimized via the FEP-guided
route. Successful application has been achieved for HIV reverse
transcriptase, FGFR1 kinase, and macrophage migration inhibitory factor
(MIF); micromolar leads have been rapidly advanced to extraordinarily
potent inhibitors.
![]()
58 - Structure-based discovery and QSAR methods: A marriage
of convenience
Jose S Duca. Novartis, Cambridge, MA, United States
The art of building predictive models of the relationships between
structural descriptors and molecular properties has been historically
important to drug design. In recent years an extraordinary amount of
experimental data has become available from processes
designed to accelerate drug discovery in pharma; from high throughput
screening and automation applied to library design and synthesis to
chemogenomics and microarray analysis. QSAR methods are one of the many
tools to predict affinity-related, physicochemical, pharmacokinetic and
toxicological properties through analyzing and extracting information
from molecular databases and HTS campaigns.
This presentation will cover case studies in which QSAR and
Structure-Based Drug Design (SBDD) have worked in concert during the
discovery process of pre-clinical candidates. The importance of
incorporating time-dependent sampling to improve the quality of the nD-QSAR
models (n=3,4) will also be discussed and compared to simplified low
dimensional QSAR models. For those cases where structural information
is not readily available, an extension of these methodologies will be
discussed in relation to ligand-based approaches.
![]()
59 - Extending the QSAR Paradigm using molecular modeling and
simulation
Professor Anton J Hopfinger Ph.D. College of Pharmacy, MSC
09 5360, University of New Mexico, Albuquerque, NM, United States; Computational
Chemistry, The Chem21 Group, Inc., Lake Forest, IL, United States
QSAR analysis and molecular modeling/simulation methods are often
complementary, and when combined in a study yield results greater than the sum
of their parts. Modeling and simulation offer the ability to design custom,
information-rich trial descriptors for a QSAR analysis. In turn, QSAR analysis
is able to discern which of the custom descriptors most fully relate to the
behavior of an endpoint of interest. One useful set of custom QSAR descriptors
from modeling and simulation for describing ligand-receptor interactions is
the grid cell occupancy descriptors (GCODs) of 4D-QSAR analysis. These
descriptors characterize the relative spatial occupancy of all the atoms of a
molecule over the set of conformations available to the molecule when in a
particular environment. GCODs permit the construction of a 4D-QSAR equation for
virtual screening, as well as a spatial pharmacophore of the 4D-QSAR equation
for exploring mechanistic insight. Applications that can particularly benefit
from combining QSAR analysis and modeling/simulation tools are those in which a
model chemical system is needed to determine the sought-after property. One
such application is the transport of molecules through biological compartments,
an integral part of many ADMET properties. The reliable estimation of eye
irritation is greatly enhanced by simulating the transport of test solutes
through membrane bilayers, and using extracted properties from the simulation
trajectories as custom descriptors to build eye irritation QSAR models. These
key descriptors of the QSAR models, in turn, also permit the investigator to
probe and postulate detailed molecular mechanisms of action.
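A rough sketch of what a grid cell occupancy calculation involves is given below: atom coordinates from a set of conformations are binned into cubic grid cells, and each cell's count is divided by the number of conformations. This is a simplification for illustration, assuming unweighted conformations, a single atom class and a 1 Å grid; it is not the 4D-QSAR software, and the coordinates are invented.

```python
from collections import Counter
from math import floor


def grid_cell_occupancy(conformations, cell_size=1.0):
    """Return {grid cell: average number of atoms in that cell per conformation}.

    `conformations` is a list of conformations, each a list of (x, y, z)
    atom coordinates in a common alignment frame.
    """
    counts = Counter()
    for conformation in conformations:
        for x, y, z in conformation:
            cell = (floor(x / cell_size), floor(y / cell_size), floor(z / cell_size))
            counts[cell] += 1
    n = len(conformations)
    return {cell: count / n for cell, count in counts.items()}


# Two invented conformations of a two-atom molecule.
conformations = [
    [(0.2, 0.1, 0.0), (1.4, 0.1, 0.0)],
    [(0.3, 0.2, 0.1), (1.1, 1.2, 0.0)],
]
occupancy = grid_cell_occupancy(conformations)
print(occupancy)
```

Each cell's occupancy value can then serve as one trial descriptor in a regression or classification model.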
![]()
60 - Overview of activity landscapes and activity cliffs:
Prospects and problems
Prof Gerald M Maggiora. Department of Pharmacology &
Toxicology, University of Arizona College of Pharmacy, Tucson, AZ,
United States; BIO5 Institute, University of Arizona, Tucson, AZ, United
States; Translational Genomics Research Institute, Phoenix, AZ, United
States
Substantial growth in the size and diversity of compound collections and
the capability to subject them to an increasing variety of different
high-throughput assays manifests the need for a more systematic and
global view of structure-activity relationships. The concepts of
chemical space and molecular similarity, which are now well known to the
drug-research community, provide a suitable framework for developing
such a view. Augmenting a chemical space with activity data from various
assays generates a set of activity landscapes, one for each assay. The
topography of these landscapes contains important information on the
structure-activity relationships of compounds that inhabit the chemical
space. Activity cliffs, which arise when similar compounds possess
widely different activities, are a particularly informative feature of
activity landscapes with respect to SAR. The talk will present an
overview of activity landscapes and cliffs and will describe some of the
prospects and problems associated with these important concepts.
![]()
61 - Exploring and exploiting the potential of
structure-activity cliffs
Dr Gerald M Maggiora PhD, Michael S Lajiness.
Department of Pharmacology & Toxicology, University of Arizona College
of Pharmacy, Tucson, Arizona, United States; Scientific Informatics, Eli
Lilly & Co, Indianapolis, IN, United States
It's well known that small structural changes sometimes result in large
changes in activity. There have been some recent efforts to identify
such changes, but little has been done to define which structural changes
are most informative or even real. Also, the missing value problem often
obscures relevant patterns, if in fact they exist. This presentation will
present several ideas and
applications for exploring and exploiting Structure-Activity Cliffs. In
addition, various visualizations and approaches to communicate the
information contained in these "cliffs" will be shared. Examples will be
drawn from PubChem.
![]()
62 - What makes a good structure activity landscape? Network
metrics and structure representations as a way of exploring activity
landscapes
Dr. Rajarshi Guha. Department of Informatics, NIH
Chemical Genomics Center, Rockville, MD, United States
The representation of SAR data in the form of landscapes and the
identification of activity cliffs in such landscapes is well known. A
number of approaches have been described for identifying activity cliffs,
including several network-based methods such as the SALI approach (JCIM,
2008, 48, 646-658). While a network representation of an SAR landscape
moves away from the intuitive idea of rolling hills and steep gorges, it
allows us to apply a variety of quantitative analyses. In this talk I
will first examine some of the properties of SALI networks using various
measures of network structures and attempt to correlate these features
with features of the SAR data. While most examples are from relatively
small datasets I will highlight some examples from larger datasets from
high-throughput screens. While such data can be noisy and contain
artifacts I will examine whether the underlying network structure can
shed light on specific molecules that may be worth following up. The
second focus of the talk will look at the effect of structure
representations on the smoothness of the landscape and how one can
derive ideas from the SALI characterization to suggest good or bad
landscapes.
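For readers unfamiliar with SALI networks, the sketch below computes the structure-activity landscape index for each pair of molecules as the absolute activity difference divided by one minus the pairwise similarity, and keeps the pairs above a cutoff as edges. The activities, similarities and cutoff are invented, and the edge direction (lower activity to higher activity) is a convention chosen for this example.

```python
from itertools import combinations


def sali(act_i, act_j, similarity):
    """SALI = |A_i - A_j| / (1 - sim(i, j))."""
    difference = abs(act_i - act_j)
    if difference == 0:
        return 0.0
    if similarity >= 1.0:
        # Identical representations with different activities: infinitely steep.
        return float("inf")
    return difference / (1.0 - similarity)


def sali_edges(activities, similarities, cutoff):
    """Return (lower-activity node, higher-activity node, SALI) for pairs at or above cutoff."""
    edges = []
    for i, j in combinations(sorted(activities), 2):
        value = sali(activities[i], activities[j], similarities[frozenset((i, j))])
        if value >= cutoff:
            low, high = (i, j) if activities[i] <= activities[j] else (j, i)
            edges.append((low, high, value))
    return edges


# Invented data: activities (e.g. pIC50) and pairwise similarities.
activities = {"A": 5.0, "B": 8.0, "C": 5.5}
similarities = {
    frozenset(("A", "B")): 0.9,
    frozenset(("A", "C")): 0.5,
    frozenset(("B", "C")): 0.6,
}
edges = sali_edges(activities, similarities, cutoff=10.0)
print(edges)
```

Raising or lowering the cutoff changes how many edges survive, which is what makes network measures such as node degree usable as a quantitative view of the landscape.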
![]()
63 - Consensus model of activity landscapes and consensus
activity cliffs
Jose L Medina-Franco, Karina Martinez-Mayorga, Fabian
Lopez-Vallejo. Torrey Pines Institute for Molecular Studies, Port St
Lucie, FL, United States
Characterization of activity landscapes is a valuable tool in lead
optimization, virtual screening and computational modeling of active
compounds. As such, understanding the activity landscape and early
detection of activity cliffs [Maggiora, G. M. J. Chem. Inf. Model.
2006, 46, 1535] can be crucial to the success of
computational models. Similarly, characterizing the activity landscape
will be critical in future ligand-based virtual screening campaigns.
However, the chemical space and activity landscape are influenced by the
particular representation used and certain representations may lead to
apparent activity cliffs. A strategy to address this problem is
to consider multiple molecular representations in order to derive a
consensus model for the activity landscape and in particular identify
consensus activity cliffs [Medina-Franco, J. L. et al. J. Chem.
Inf. Model. 2009, 49, 477]. The current approach can
be extended to identify consensus selectivity cliffs.
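One simple way to read “consensus activity cliff” is a compound pair that looks like a cliff under every molecular representation considered, rather than under just one. The sketch below implements that reading with invented thresholds, representation names and data; the cited work may combine representations differently.

```python
def is_cliff(similarity, activity_difference, min_similarity=0.8, min_difference=2.0):
    """A pair is a cliff when the compounds are similar but their activities differ widely."""
    return similarity >= min_similarity and activity_difference >= min_difference


def consensus_cliffs(pairs, min_fraction=1.0):
    """Return pairs flagged as cliffs by at least `min_fraction` of the representations.

    `pairs` maps a compound pair to (activity difference, {representation: similarity}).
    """
    result = []
    for pair, (difference, sims) in pairs.items():
        votes = sum(is_cliff(s, difference) for s in sims.values())
        if votes / len(sims) >= min_fraction:
            result.append(pair)
    return result


# Invented data: the second pair is similar under only one representation.
pairs = {
    ("m1", "m2"): (2.5, {"fp_a": 0.90, "fp_b": 0.85, "fp_c": 0.88}),
    ("m1", "m3"): (3.0, {"fp_a": 0.92, "fp_b": 0.40, "fp_c": 0.35}),
}
strict = consensus_cliffs(pairs)
lenient = consensus_cliffs(pairs, min_fraction=0.3)
print(strict, lenient)
```

The pair `("m1", "m3")` is the kind of apparent cliff that depends on the representation: it is flagged by `fp_a` alone and drops out under the strict consensus.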
![]()
64 - R-Cliffs: Activity cliffs within a single analog series
Dimitris Agrafiotis PhD. Pharmaceutical Research &
Development, Johnson & Johnson, Spring House, Pennsylvania, United States
The concept of activity cliffs has gained popularity as a means to identify and
understand discontinuous SAR, i.e., regions of SAR where minor changes in
structure have unpredictably large effects on biological activity. To the best
of our knowledge, activity cliffs have been invariably evaluated using global
measures of molecular similarity that do not take into account the presence of
finer substructure among a series of related analogs. In this talk, we look at
activity cliffs within a congeneric series, by decomposing them into R-groups
and analyzing how activity is affected by changes in a single variation site.
The analysis is greatly enhanced by R-group-aware visualization tools such as
the SAR maps, which have been enhanced to specifically highlight such
discontinuities.
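The core of the single-site analysis can be sketched as follows: given compounds already decomposed into R-groups on a common scaffold, find the pairs that differ at exactly one variation site and whose activities differ by more than a threshold. The compound table, the site names and the threshold are hypothetical, and the R-group decomposition itself is assumed to have been done beforehand.

```python
from itertools import combinations


def single_site_cliffs(compounds, min_difference=2.0):
    """Return (compound a, compound b, site) for pairs differing at one site only.

    `compounds` maps a name to ({site: substituent}, activity); all compounds
    are assumed to share the same set of variation sites.
    """
    cliffs = []
    for (name_a, (r_a, act_a)), (name_b, (r_b, act_b)) in combinations(compounds.items(), 2):
        changed = [site for site in r_a if r_a[site] != r_b.get(site)]
        if len(changed) == 1 and abs(act_a - act_b) >= min_difference:
            cliffs.append((name_a, name_b, changed[0]))
    return cliffs


# Invented congeneric series with two variation sites.
compounds = {
    "c1": ({"R1": "H", "R2": "CH3"}, 5.0),
    "c2": ({"R1": "F", "R2": "CH3"}, 7.5),
    "c3": ({"R1": "F", "R2": "OCH3"}, 7.6),
}
cliffs = single_site_cliffs(compounds)
print(cliffs)
```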
![]()
65 - Chemical structure representation in the DuPont Chemical
Information Management Solutions database: Challenges posed by complex
materials in a diversified science company
Dr. Mark A Andrews, Dr. Edward S. Wilks. CR&D,
Information & Computing Technologies, DuPont, Wilmington, DE, United
States
This talk will describe the novel ways we have developed to represent
precisely the structures of the diverse chemical materials of interest
to DuPont. These range from simple organics and inorganics to polymers,
mixtures, formulations, multi-layer films, composites, and even devices
and incompletely defined substances. Part of the solution involves
evaluating trade-offs, which may be situation dependent, between details
captured in the structure vs. details captured at the sample history
level, e.g., ratios of components, polymer molecular weights and
microstructures, and the existence of “fairy dust” components. An
important aspect of the solution involves ensuring robust structure
standardization and duplicate checking for complex and ill-defined
substances. We believe that our needs and solutions have challenged and
inspired a number of chemical software vendors to provide significant
upgrades to the functionalities of their drawing packages and database
cartridges.
![]()
66 - From deposition to application: Technologies for storing
and exploiting crystal structure data
Dr Colin R Groom, Dr Jason Cole, Dr Simon Bowden, Dr
Tjelvar Olsson. Cambridge Crystallographic Data Centre, United Kingdom
In December 2009 The Cambridge Crystallographic Data Centre (CCDC)
archived the 500,000th small-molecule crystal structure to
the Cambridge Structural Database (CSD). The passing of this milestone
highlights the rate of growth of the CSD in recent years and the
continuing challenges this represents in terms of information storage
and exchange.
This talk will describe the development of a number of tools for the
processing, validation, and storage of crystal structure data. Recent
developments that will help this growing body of structural knowledge to
be exploited in a range of applications, and the provision of additional
services that can assist the scientific community, will also be
illustrated.
![]()
67 - Recent IUPAC recommendations for chemical structure
representation: An overview
Mr. Jonathan Brecher. CambridgeSoft Corporation, Cambridge,
MA, United States
Accurate and unambiguous depiction of chemical information is a key step in
communicating that information. Such depiction is equally important whether
the intended audience is a human chemist (as in a journal article or patent)
or a computer (as in a chemical registration system). Recent IUPAC
publications provide chemists a practical guide for producing chemical
structure diagrams that accurately convey the author's intended meaning. A
summary of those recommendations will be presented. As part of that summary,
common pitfalls in producing chemical structure diagrams will be discussed.
Solutions to those pitfalls will also be described, with an emphasis on
solutions that are simple, straightforward, and accessible to the majority of
practicing chemists.
![]()
68 - Orbital development kit
Dr. Egon L. Willighagen. Department of Pharmaceutical
Biosciences, Uppsala University, Uppsala, Sweden
Understanding properties of molecular structures requires a computer
representation, and quantum mechanical and chemical graph
representations have been used abundantly. Both have found their own
areas of application in chemistry, and their fields are best described
as theoretical chemistry and cheminformatics, respectively. The Orbital
Development Kit (ODK) positions itself in between these two
representations, though closest to chemical graph theory, and addresses
shortcomings of the latter. In particular, it replaces the coloring of the
nodes and edges in the chemical graph with explicit atom hybridization and
bond order, making the representation more precise in how it
represents geometrical features of the molecule. The ODK does so by
replacing the atom as a single node in the chemical graph by a central
atomic core surrounded by valence orbitals, possibly hybridized. Using
this approach, the definition of an atom type is reformulated as a core
element with a particular and well-defined set of identifiable orbitals
with an implied, though relative, geometrical orientation. Bonding is
now the connection of two orbitals, and a lone pair becomes a single
orbital, and is therefore directional too. This approach means that the
classical double bond in ethene is now represented by one sigma bond
between two sp2 orbitals of the two carbons, and one bond between their
two pz orbitals. This ODK representation also leaves room for
representations beyond the chemical graph, such as proposed by Dietz in
1995: more than two orbitals can be combined into a set to represent
delocalization. The presentation will cover the ODK data model,
serialization and deserialization into a Resource Description
Framework-based file format, and a bridge to the Chemistry Development
Kit, for visualization and molecular property calculation.
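The data model described above can be pictured with a few small classes: an atomic core that owns a list of orbitals, and a bond that connects two orbitals rather than two atoms. The class and field names below are assumptions for illustration and are not the ODK API; hydrogens are omitted to keep the ethene example short.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Orbital:
    kind: str  # hybridization label, e.g. "sp2" or "pz"


@dataclass
class AtomCore:
    element: str
    orbitals: List[Orbital] = field(default_factory=list)


@dataclass
class OrbitalBond:
    """A bond is the connection of two orbitals, not of two atoms."""
    first: Orbital
    second: Orbital


def carbon_sp2():
    """An sp2 carbon core: three sp2 orbitals plus one unhybridized pz orbital."""
    return AtomCore("C", [Orbital("sp2") for _ in range(3)] + [Orbital("pz")])


# The C=C double bond of ethene as two orbital-orbital connections.
c1, c2 = carbon_sp2(), carbon_sp2()
sigma = OrbitalBond(c1.orbitals[0], c2.orbitals[0])  # sp2-sp2
pi = OrbitalBond(c1.orbitals[3], c2.orbitals[3])     # pz-pz
double_bond = [sigma, pi]
```

In this picture a lone pair would simply be an orbital that takes part in no `OrbitalBond`, and delocalization could be modelled by a set of more than two orbitals.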
![]()
69 - Line notations as unique identifiers
Krisztina Boda PhD. OpenEye Scientific Software, Santa Fe,
New Mexico, United States
A wide variety of structure representation formats have been devised to encode
molecular information in order to register, store and manipulate molecules in
silico. One class of these formats, called line notations, is designed to
express molecules as compact, unambiguous strings that can be used as unique
identifiers for compound registration, eliminating the computationally more
expensive graph matching.
The presentation will provide an overview of popular line notations, such as
canonical SMILES, isomeric SMILES, and InChI, discussing their merits and
shortcomings with regard to their use as robust, lossless unique identifiers.
We will present results of testing a variety of line notations on a diverse set
of 10M compounds generated by combining organic and inorganic vendor databases.
We will also examine the information loss of various molecular normalization
procedures with regard to line notation generation.
![]()
70 - Analysis of activity landscapes, activity cliffs, and
selectivity cliffs
Jürgen Bajorath. Department of Life Science
Informatics, University of Bonn, Germany
The concept of activity landscapes (ALs) is of fundamental importance
for the exploration of structure-activity relationships (SARs). ALs are
best rationalized as biological activity hypersurfaces in chemical
space. When reduced to three dimensions, ALs display characteristic
topologies that determine the SAR behavior of compound sets. Prominent
features of ALs are activity cliffs that are formed by structurally
similar compounds having large potency differences, giving rise to SAR
discontinuity. ALs and activity cliffs can be analyzed in different ways
including similarity-potency diagrams, approximate three-dimensional
landscape representations, or molecular networks integrating compound
similarity and potency information. Annotated similarity-based compound
networks that incorporate results of numerical SAR analysis functions,
termed Network-like Similarity Graphs (NSGs) are designed to explore
relationships between global and local SAR features in compound data
sets of any source. For collections of analogs, substitution patterns
that introduce activity cliffs are identified in Combinatorial Analog
Graphs (CAGs), which also make it possible to study additive and
non-additive effects of compound modifications. Activity cliffs
identified in CAGs can frequently be rationalized on the basis of
complex crystal structures. When studying multi-target SARs using the
NSG framework, the concept of activity cliffs can be extended to
selectivity cliffs, i.e. similar compounds having significant
differences in target selectivity.
![]()
71 - Using Activity Cliff Information in structure-based
design approaches
Birte Seebeck, Markus Wagener, Prof. Dr. Matthias
Rarey. Center for Bioinformatics (ZBH), University of Hamburg,
Hamburg, Germany; Molecular Design and Informatics, MSD, Oss, The
Netherlands
Activity cliffs are often the pitfall of QSAR modeling techniques, but
at the same time they exhibit key features of a SAR. Based on the
principles of the structure-activity landscape index (SALI) [1], here we
present an approach to use the valuable information of activity cliffs
in a structure-based design scenario, analyzing key interactions between
protein-ligand complexes in activity cliff events. We visualize those
interaction “hot spots” directly in the active site of target proteins.
In addition, we use the activity cliff information to derive
target-specific scoring models and pharmacophoric hypotheses, which are
validated in enrichment experiments on independent external test sets.
The results show an improved enrichment in comparison to the standard
score for various protein targets.
1. Guha R. and Van Drie J.H., J. Chem. Inf. Model., 2008, 48, 646-658.
![]()
72 - Exploring activity cliffs using large scale semantic
analysis of PubChem
Dr David J Wild, Bin Chen, Qian Zhu. School of
Informatics and Computing, Indiana University, Bloomington, IN, United
States
Identification of Activity Cliffs, defined as the ratio of the
difference in activity of two compounds to their “distance” of
separation in a given chemical space [1], has been established as
important in the creation of robust quantitative structure-activity
relationship models. Previously, a method, SALI, for identifying and
visualizing these activity cliffs was developed at Indiana University,
and applied successfully to several established QSAR datasets [2]. In
the work reported here, we have extended this work in two ways. First,
we have used structure and activity data from the public PubChem BioAssay
dataset to evaluate the method on a much larger scale, and second, we
have integrated it with a project called Chem2Bio2RDF to look not just
for activity cliffs based on reported assay values, but also on
computationally established relationships between compounds and genes
and diseases. We thus propose an extended application of SALI which can
be used in a systems chemical biology and chemogenomic context.
[1] J. Chem. Inf. Model., 2006, 46 (4), p 1535
[2] J. Chem. Inf. Model., 2008, 48 (3), pp 646-658
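The ratio described above can be sketched in a few lines. This is a minimal illustration, assuming the usual SALI form |A_i - A_j| / (1 - sim(i, j)) with Tanimoto similarity on fingerprint bit sets; the compound names, fingerprints and activities are invented toy data, not PubChem values.

```python
from itertools import combinations


def tanimoto(a, b):
    """Tanimoto similarity between two fingerprints given as sets of on-bits."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def sali(activity_i, activity_j, similarity):
    """Ratio of the activity difference to the structural distance (1 - similarity)."""
    if similarity >= 1.0:
        # Identical fingerprints: the ratio is undefined, report it as infinite.
        return float("inf")
    return abs(activity_i - activity_j) / (1.0 - similarity)


def rank_pairs(compounds):
    """compounds maps a name to (fingerprint set, activity); returns pairs by descending SALI."""
    scored = []
    for (name_i, (fp_i, act_i)), (name_j, (fp_j, act_j)) in combinations(compounds.items(), 2):
        scored.append((sali(act_i, act_j, tanimoto(fp_i, fp_j)), name_i, name_j))
    return sorted(scored, reverse=True)


# Hypothetical toy data: (fingerprint on-bits, pIC50-like activity).
toy = {
    "cpd_a": ({1, 2, 3, 4}, 8.0),
    "cpd_b": ({1, 2, 3, 5}, 5.0),
    "cpd_c": ({7, 8, 9}, 5.5),
}
for value, first, second in rank_pairs(toy):
    print(f"{first} / {second}: SALI = {value:.2f}")
```

Pairs that are structurally close but differ strongly in activity rise to the top of the ranking, which is the behaviour the abstract relies on when it scales the search up to larger datasets.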
![]()
73 - Quantifying the usefulness of a model of a structure-activity
relationship: The SALI Curve Integral
John H Van Drie, Rajarshi Guha. R&D, Van Drie Research LLC,
Andover, MA, United States; Chemical Genomics Center, NIH, Bethesda, MD, United
States
In 2008, in two papers Guha and Van Drie introduced the notion of
structure-activity landscape index (SALI) curves as a way to assess a model and
a modeling protocol, applied to structure-activity relationships. The starting
point is to study a structure-activity relationship pairwise, based on the
notion of "activity cliffs"--pairs of molecules that are structurally similar
but have large differences in activity. The basic idea behind the “SALI Curve”
is to tally how many of these pairwise orderings a model is able to predict.
Empirically, testing these SALI curves against a variety of models, ranging over
structure-based and non-structure-based models, the utility of a model seems to
correspond to characteristics of these curves. In particular, the integral of
these curves, denoted as SCI and being a number ranging from -1.0 to 1.0,
approaches a value of 1.0 for two literature models, which are both known to be
prospectively useful.
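The tallying idea can be illustrated with a short sketch. It assumes a simplified construction: pairs are ranked by SALI, a growing fraction of the lowest-SALI pairs is discarded, and at each cut-off the score is (correct orderings minus incorrect orderings) divided by the number of remaining pairs; the integral is taken with the trapezoid rule. The function names and the exact thresholding and normalization are assumptions made for illustration and may differ from the published definition.

```python
def order_score(observed_diff, predicted_diff):
    """+1 if the model orders the pair correctly, -1 if reversed, 0 for a tie."""
    product = observed_diff * predicted_diff
    if product > 0:
        return 1
    if product < 0:
        return -1
    return 0


def sali_curve(pairs, n_points=11):
    """pairs: list of (sali, observed activity difference, predicted difference).

    Returns (x, S(x)) points, where x is the fraction of lowest-SALI pairs discarded.
    """
    if not pairs:
        raise ValueError("at least one compound pair is required")
    ranked = sorted(pairs, key=lambda p: p[0])
    curve = []
    for k in range(n_points):
        x = k / (n_points - 1)
        start = min(int(x * len(ranked)), len(ranked) - 1)  # always keep one pair
        kept = ranked[start:]
        score = sum(order_score(obs, pred) for _, obs, pred in kept) / len(kept)
        curve.append((x, score))
    return curve


def curve_integral(curve):
    """Trapezoid-rule integral of the curve over x in [0, 1]; lies in [-1, 1]."""
    return sum((x1 - x0) * (s0 + s1) / 2 for (x0, s0), (x1, s1) in zip(curve, curve[1:]))


# Hypothetical pairs: (SALI, observed difference, predicted difference).
toy_pairs = [(0.5, 1.0, 0.8), (2.0, -1.5, -0.2), (4.0, 2.0, -0.5), (9.0, 3.0, 2.5)]
print(curve_integral(sali_curve(toy_pairs)))
```

Under these assumptions a model that orders every pair correctly integrates to 1.0 and one that reverses every pair integrates to -1.0, matching the range quoted in the abstract.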
![]()
74 - Status of the InChI and InChIKey algorithms
Dr. Stephen Heller. CBRD, MS - 8320, NIST,
Gaithersburg, MD, United States
The Open Source chemical structure representation standard, the IUPAC
InChI/InChIKey project, has evolved considerably in the past two years.
The project is now being supported and widely used by virtually all major
publishers of chemical journals, databases, and structure drawing and
related software. This usage of the InChI/InChIKey in their products
enables them to link information between their products and other
(fee-free and fee-based) chemical information available on the World
Wide Web.
These organizations are now providing a stable and financially viable
structure for the project. This is enabling the world-wide chemistry
community to expand its use of the InChI, knowing that this freely
available Open Source algorithm will be widely accepted and used as a
mainstream standard. The mission of the Trust is quite simple and
limited; its sole purpose is to create and support, administratively and
financially, a scientifically robust and comprehensive InChI algorithm
and related standards and protocols.
This presentation will describe the current technical state of the InChI
and InChIKey algorithms.
![]()
75 - Self-contained sequence representation (SCSR): Bridging the gap
between bioinformatics and cheminformatics
Dr Keith T Taylor, Dr William L Chen, Brad D
Christie, Joe L Durant, David L Grier, Burt A Leland, Jim G Nourse.
Symyx Technologies Inc, San Ramon, CA, United States
In this paper we will discuss the benefits and disadvantages of the
current approaches for storing biological sequence information.
We have developed a hybrid representation that uses the compactness of
the sequence, together with the detail of chemical connectivity
information for modified regions. It represents standard residues with
substructure. All instances of the same residue are represented by a
single template. This hybrid approach is compact and scalable.
We have developed a converter that takes a UniProt format file, extracts
the sequence information, and derives the modifications, producing an
SCSR record. The SCSR is encoded as a molfile and registered into a Symyx
Direct database. Duplicate checking, exact matching (with and without
the modifications), molecular weight calculation, and substructure
searching are all available with these structures.
We are using this representation for peptides and oligonucleotides, and
we are now extending it to oligosaccharides. Non-natural residues can be
included in an SCSR.
![]()
76 - Representation of Markush structures: From molecules toward
patents
Szabolcs Csepregi, Nóra Máté, Róbert Wágner, Tamás Csizmazia,
Szilárd Dóránt, Erika Bíró, Tim Dudgeon, Ali Baharev, Ferenc Csizmadia. ChemAxon
Ltd., Budapest, Hungary
Cheminformatics systems usually focus primarily on handling
specific molecules and reactions. However, Markush structures are also
indispensable in various areas, like combinatorial library design or chemical
patent applications for the description of compound classes.
The presentation will discuss how an existing molecule
drawing tool (Marvin) and chemical database engine (JChem Base/Cartridge) are
extended to handle generic features (R-group definitions, atom and bond lists,
link nodes and larger repeating units, position and homology variation). Markush
structures can be drawn and visualized in the Marvin sketcher and viewer,
registered in JChem databases and their library space is searchable without the
enumeration of library members. Different enumeration methods allow the
analysis of Markush structures and their enumerated libraries. These methods
include full, partial and random enumerations as well as calculation of the
library size. Furthermore, unique visualization techniques will be demonstrated
on real-life examples that illustrate the relationship between Markush
structures and the chemical structures contained in their libraries (involving
substructures and enumerated structures).
Special attention will be given to file formats and how they
were extended to hold generic features.
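The distinction between computing the library size and enumerating library members can be sketched generically. The example below is not ChemAxon's API; it assumes a deliberately simple model in which a Markush structure is a string template with named R-group placeholders, each with a list of alternative fragments. The scaffold and fragments are toy SMILES-like strings chosen for illustration.

```python
import random
from itertools import product
from math import prod


def library_size(r_groups):
    """Number of members, computed without enumerating any of them."""
    return prod(len(options) for options in r_groups.values())


def enumerate_full(scaffold, r_groups):
    """Yield every member by substituting each combination of R-group options."""
    names = list(r_groups)
    for combo in product(*(r_groups[name] for name in names)):
        yield scaffold.format(**dict(zip(names, combo)))


def enumerate_random(scaffold, r_groups, n, seed=0):
    """Draw n members at random (with replacement) instead of enumerating all."""
    rng = random.Random(seed)
    return [
        scaffold.format(**{name: rng.choice(options) for name, options in r_groups.items()})
        for _ in range(n)
    ]


# Hypothetical Markush structure: a benzene ring with two variable positions.
scaffold = "{R1}c1ccccc1{R2}"
r_groups = {"R1": ["C", "CC", "O"], "R2": ["F", "Cl"]}

print(library_size(r_groups))
print(list(enumerate_full(scaffold, r_groups)))
print(enumerate_random(scaffold, r_groups, n=3))
```

Because the size is a product of option counts, it stays cheap to compute even when full enumeration would be impractical, which is why partial and random enumeration are useful alongside it.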
![]()
77 - CSRML: A new markup language definition for chemical
substructure representation
Dr. Christof H. Schwab, Dr. Bruno Bienfait, Dr.
Johann Gasteiger, Dr. Thomas Kleinoeder, Dr. Joerg Marucszyk, Dr. Oliver
Sacher, Dr. Aleksey Tarkhov, Dr. Lothar Terfloth, Dr. Chihae Yang.
Molecular Networks GmbH, Erlangen, Bavaria, Germany; Altamira LLC,
Columbus, Ohio, United States
Although chemical subgraphs, or substructures, are quite popular and have
long been used in chemoinformatics, the existing and well-established
standards still have some limitations. In general, these standards are
suited even for complex substructure queries; however, they show some
insufficiencies, e.g., for the inclusion of physicochemical properties or
the annotation of meta information. In addition, the existing standards
are not fully interconvertible and specify no validation techniques to
check the semantic correctness of a query definition.
This paper proposes an approach for the representation of chemical
subgraphs that aims to overcome the limitations of existing standards.
The approach presents a well-structured, XML-based standard
specification, the Chemical Subgraph Representation Markup Language
(CSRML), that supports a flexible annotation mechanism of meta
information and properties at each level of a substructure as well as
user-defined extensions. Furthermore, the specification foresees a
mandatory inclusion and use of test cases. In addition, it can be used as
an exchange format.
![]()
78 - Prediction of solvent physical properties using the
hierarchical clustering method
Dr. Todd M Martin, Dr. Douglas M Young. National Risk
Management Research Laboratory, Environmental Protection Agency,
Cincinnati, OH, United States
Recently a QSAR (Quantitative Structure Activity Relationship) method,
the hierarchical clustering method, was developed to estimate acute
toxicity values for large, diverse datasets. This methodology has now
been applied to estimate solvent physical properties including surface
tension and the normal boiling point. The hierarchical clustering method
divides a chemical dataset into a series of clusters containing similar
compounds (in terms of their 2D molecular descriptors). Multilinear
regression models are fit to each cluster. The toxicity or property is
estimated using the prediction value from several different cluster
models. The physical properties are estimated using 2D molecular
structure only (i.e., without the use of critical constants). The
hierarchical clustering methodology was able to achieve excellent
predictions for the external prediction sets. A freely available
software tool to estimate toxicity and physical properties has been
developed. The software tool is based on the open source Chemistry
Development Kit (written in Java).
![]()
79 - Scaffold diversity analysis using scaffold retrieval
curves and an entropy-based measure
Jose L Medina-Franco PhD, Karina Martinez-Mayorga,
Andreas Bender PhD, Thomas Scior PhD. Torrey Pines Institute for
Molecular Studies, Port St. Lucie, FL, United States; Leiden University,
Leiden, The Netherlands; Benemerita Universidad Autonoma de Puebla,
Puebla, Mexico
Scaffold diversity analysis of compound collections has several
applications in medicinal chemistry and drug discovery. Applications
include, but are not limited to, library design, compound acquisition
and assessment of structure-activity relationships. The scaffold
diversity is commonly measured based on frequency counts. Scaffold
retrieval curves are also employed. Further information can be obtained
by considering the specific distribution of the molecules in those
scaffolds. To this end, we present an entropy-based information metric
to assess the scaffold diversity of compound databases [Medina-Franco,
J. L. et al. QSAR Comb. Sci. 2009, 28, 1551]. The
entropy-based information metric takes into account the frequency
distribution of the different scaffolds and is a complementary measure
of scaffold diversity enabling a more comprehensive analysis.
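A frequency-aware diversity measure of this kind can be sketched with Shannon entropy. The example assumes the metric is the Shannon entropy of the scaffold frequency distribution, optionally scaled by its maximum (log2 of the number of distinct scaffolds) so that 1.0 means compounds are spread evenly over scaffolds; the cited paper should be consulted for the exact definition. Scaffold labels are hypothetical.

```python
from collections import Counter
from math import log2


def shannon_entropy(scaffolds):
    """Entropy (in bits) of the distribution of compounds over scaffolds."""
    counts = Counter(scaffolds)
    total = sum(counts.values())
    return -sum((c / total) * log2(c / total) for c in counts.values())


def scaled_shannon_entropy(scaffolds):
    """Entropy divided by its maximum, log2(number of distinct scaffolds)."""
    n_distinct = len(set(scaffolds))
    if n_distinct <= 1:
        return 0.0
    return shannon_entropy(scaffolds) / log2(n_distinct)


def retrieval_curve(scaffolds):
    """Cumulative fraction of compounds recovered, most populated scaffold first."""
    counts = sorted(Counter(scaffolds).values(), reverse=True)
    total = sum(counts)
    running, curve = 0, []
    for count in counts:
        running += count
        curve.append(running / total)
    return curve


# Two hypothetical collections with the same scaffold count (4) but different spread.
even = ["s1", "s2", "s3", "s4"] * 5
skewed = ["s1"] * 17 + ["s2", "s3", "s4"]
print(scaled_shannon_entropy(even), scaled_shannon_entropy(skewed))
print(retrieval_curve(skewed))
```

Both collections have identical scaffold counts, so a plain frequency count cannot separate them, while the entropy drops for the skewed set where one scaffold dominates.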
![]()
80 - Nonsubjective clustering scheme for multiconformer
databases
Dr. Austin B. Yongye, Dr. Andreas Bender, Dr. Karina
Martinez-Mayorga. Torrey Pines Institute for Molecular Studies, Port St
Lucie, FL, United States; Medicinal Chemistry Division and Pharma-IT
Platform, Leiden/Amsterdam Center for Drug Research, Leiden University,
Leiden, The Netherlands
Representing the 3D-structures of ligands in virtual screenings via
multi-conformer ensembles can be computationally intensive, especially
for compounds with a large number of rotatable bonds. While clustering
and RMSD filtering methods are employed in existing conformer
generators, the novelty of this work is the inclusion of a
non-subjective clustering scheme. This algorithm simultaneously
optimizes the number and the average spread of the clusters. Using this
method, 10 times fewer conformers per compound were obtained on average,
and these performed as well as OMEGA. Furthermore, we propose
thresholds for root-mean-square filtering depending on the number of
rotors in a compound: 0.8, 1.0 and 1.4 for structures with low (1-4),
medium (5-9) and high (10-15) numbers of rotatable bonds, respectively.
The protocol employed is general and can be applied to reduce the number
of conformers in multi-conformer compound collections and alleviate the
complexity of downstream data processing in virtual screening
experiments.
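The proposed rotor-dependent thresholds translate directly into a lookup, sketched below together with a simple greedy RMSD filter. The thresholds come from the abstract (which does not state units); the greedy filter and the toy distance function are assumptions for illustration and are not the authors' clustering scheme.

```python
def rmsd_threshold(n_rotors):
    """Threshold proposed in the abstract for a given number of rotatable bonds."""
    if 1 <= n_rotors <= 4:
        return 0.8
    if 5 <= n_rotors <= 9:
        return 1.0
    if 10 <= n_rotors <= 15:
        return 1.4
    raise ValueError("the abstract proposes thresholds only for 1-15 rotatable bonds")


def filter_conformers(conformers, rmsd, threshold):
    """Greedy filter: keep a conformer only if it is at least `threshold`
    away from every conformer already kept."""
    kept = []
    for candidate in conformers:
        if all(rmsd(candidate, other) >= threshold for other in kept):
            kept.append(candidate)
    return kept


# Toy stand-in: each "conformer" is a single number and the distance is the
# absolute difference. A real application would supply a heavy-atom RMSD function.
toy_conformers = [0.0, 0.5, 1.0, 2.5]
threshold = rmsd_threshold(7)  # medium flexibility
print(filter_conformers(toy_conformers, lambda a, b: abs(a - b), threshold))
```

Raising the threshold for more flexible compounds discards more near-duplicates, which is how the number of conformers carried into downstream processing is reduced.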
![]()
81 - Finding drug discovery "rules of thumb" with bump hunting
Mr. Tatsunori Hashimoto, Dr. Matthew Segall PhD. Department
of Statistics, Harvard University, Cambridge, MA, United States; Optibrium,
Cambridge, United Kingdom
Rules-of-thumb for evaluating potential drug molecules, such as Lipinski's Rule
of Five, are commonly used because they are easy to understand and translate
into practice. These rules have traditionally been constructed by observation or
by following simple statistical analysis. However, application of these
techniques to QSAR models or early screening data often ignores the underlying
statistical structure. Conversely, when machine learning algorithms are used to
classify 'drug-like' molecules, they often result in black-box classifiers that
cannot be modified to suit a particular target drug profile. We propose a novel
hybrid approach to constructing rules-of-thumb from existing data to match a
given target product profile for any therapeutic objective. These rules are
easily interpretable and can be rapidly modified to reflect expert opinions
before application.
![]()
82 - Machine learning in discovery research: Polypharmacology
predictions as a use case
Nikil Wale PhD, Kevin McConnell PhD, Eric M Gifford
PhD. Computational Sciences Center of Emphasis, Pfizer Inc, Groton, CT,
United States
In this talk I will lay out the increasing role of machine learning
technology in discovery research at Pfizer. Specifically, I will talk
about how algorithms and methods inspired by (Machine) Learning Theory
are playing an increasing role in in-silico predictive technologies in
pharmaceutical research. These methods will be put in the context of
other popular methods based on classical statistical approaches, and
overlap and contrast will be discussed. I will use poly-pharmacology
predictions as an important use case to demonstrate the power of large
scale machine learning methods for such applications. In particular,
prospective validation of these methods will be emphasized and
discussed.
![]()
83 - Interpretable correlation descriptors for quantitative
structure-activity relationships
Prof. Jonathan D. Hirst. School of Chemistry,
University of Nottingham, Nottingham, Nottinghamshire, United Kingdom
Highly predictive Topological Maximum Cross Correlation (TMACC)
descriptors for the derivation of quantitative structure-activity
relationships (QSARs) are presented, based on the widely used
autocorrelation method. They require neither the calculation of
three-dimensional conformations, nor an alignment of structures. Open
source software for generating the TMACC descriptors is freely available
from our website: http://comp.chem.nottingham.ac.uk/download/TMACC. We
illustrate the interpretability of the TMACC descriptors, through the
analysis of the QSARs of inhibitors of angiotensin converting enzyme
(ACE) and dihydrofolate reductase. In the case of the ACE inhibitors,
the TMACC interpretation shows features specific to C-domain inhibition,
which have not been explicitly identified in previous QSAR studies.
![]()
84 - Chemistry in your hand: Using mobile devices to access
public chemistry compound data
Dr Antony J Williams PhD, Valery Tkachenko.
ChemSpider, Royal Society of Chemistry, Wake Forest, North Carolina,
United States
Mobile devices allowing browsing of the internet to access chemistry
related data come in many forms: phones, music players and,
increasingly, as “tablets” and “pads”. With the permanently online
connectivity of these mobile devices, the browser now being the default
environment for much of our computer-based interactions, and the
increasing availability of rich datasets online, these offerings mesh
together to provide chemists with the capabilities
to query and search for chemistry in ways that were the stuff of science
fiction only a few years ago. Using the ChemSpider platform as a
foundation, and with the intention of continuing to enable the community
to access Chemistry, we have delivered mobile chemistry applications to
search across over 20 million compounds sourced from over 300 data
sources to retrieve data including properties, spectra and links to
patents and publications. This presentation will discuss Mobile
ChemSpider and the challenges of delivering such a tool.
![]()
85 - Feature analysis of ToxCast™ compounds
Patra Volarath, Stephen Little, Chihae Yang, Matt
Martin, David Reif, Ann Richard. National Center for Computational
Toxicology, U.S. Environmental Protection Agency, Research Triangle
Park, NC, United States; Center for Food Safety and Nutrition, U.S. Food
and Drug Administration, Bethesda, MD, United States
ToxCast™ was initiated by the US Environmental Protection
Agency (EPA) to prioritize environmental chemicals for toxicity testing.
Phase I generated data for 309 unique chemicals, mostly pesticide
actives, that span diverse chemical feature/property space, as
determined by quantum mechanical, feature-/QSAR-based, and ADME-based
descriptors. Results in over 450 high-throughput screening assays were
generated for the chemicals. Deriving associations across such a
structurally diverse and information-rich dataset is challenging.
Approaches to determine relationships between the bioassay data and
chemistry-/biology-informed structural features, and methods to
meaningfully represent this knowledge are being developed. We initially
focus on the Phase I data set. Successful approaches will be applied to
the much larger chemical libraries in ToxCast Phase II and Tox21
projects (the latter to screen approximately 10,000 chemicals). These
approaches will be used to develop data mining methods to inform
toxicity testing and risk assessment modelling. This abstract does
not reflect EPA or FDA policy.
![]()
86 - Extracting information from the IUPAC Green Book
Prof Jeremy G Frey, Mark I Borkum. School of
Chemistry, University of Southampton, Southampton, Hants, United Kingdom
The IUPAC manual of Symbols and Terminology for Physicochemical
Quantities and Units (the Green Book) was first published in 1969. One
of the fundamental principles of the IUPAC Green Book is the reuse of
existing symbols and terminology, in order to enable the accurate
exchange of information and data. Accordingly, there is a need for the
IUPAC Green Book to be repurposed as a machine-processable resource.
This paper reports an experiment where we define a syntax for the
subject index of the IUPAC Green Book in the Parsing Expression Grammar
(PEG) formalism. We repurpose the resulting Abstract Syntax Tree (AST)
as the primary data source for a Ruby on Rails application and Simple
Knowledge Organization System (SKOS) concept scheme. We demonstrate a
metric that gives prominence to the most significant terms and pages in
the subject index, and reflect upon the usefulness and relevance of the
information obtained.
![]()
87 - Biologics and biosimilars: One and the same?
Roger Schenck. Chemical Abstracts Service, Columbus,
OH, United States
Biopharmaceuticals (or biologics) and generic follow-on biosimilars
currently account for more than 10% of the revenue in the pharmaceutical
market. As patent protection for first generation biotherapeutics begins
to expire, follow-on biosimilars have begun to appear. This presentation
will provide insights on how the CAS databases handle biologics and
biosimilars, how these substances are treated differently in patents, and
how biosimilars are viewed by different patenting authorities. What the
CAS databases reveal about trends in biopharmaceutical research and
development will be discussed along with specific examples.
![]()
88 - Intelligent mining of drug information resources
Rashmi Jain, Anay Tamhankar, Aniket Ausekar, Yuthika
Dixit. Evolvus Group, Pune, India
A fundamental aspect of any research is to understand and keep track of
progress made by peer groups in terms of scientific discoveries.
Research Conferences form a definitive source of this information.
Annually, thousands of papers are presented in such conferences for any
given disease vertical from a Therapeutic, Biological, Pharmacological,
or Clinical perspective. At first glance, the problem of finding relevant
conference proceedings of interest and then organizing the information
into a format which is easily analyzed, stored and efficiently retrieved
seems to be difficult and chaotic, as there are no patterns by which a
process can be defined. Furthermore, conference presentations are highly
fragmented and non-standardized.
A hybrid approach, in which Machine Learning based text-extraction
software is coupled with assisted expert annotations by human editors,
comes to the rescue. An in-house Machine Learning software system is used
in the first stage, wherein the conference proceedings are classified
based on keywords, segmented and converted into a standardized format.
The software then uses a proprietary, heuristic-based learning
algorithm to extract relevant data from the segments. Since it is well
known that no automated approach can be 100% accurate, in this second
step the software is assisted by a team of expert human editors who
analyze the extracted and segmented data and perform necessary
corrections, if any. In the third step, the software pushes each segment
to a team of expert human editors who analyze the segment, extract
information relevant to the area of research, and store the information
in our internal databases.
![]()
89 - Cheminformatics semantic grid for neglected diseases
Paul J Kowalczyk PhD. Department of Computational
Chemistry, SCYNEXIS, Durham, NC, United States
We present a summary of our progress towards establishing a
cheminformatics semantic grid for neglected diseases. Our efforts are
based on using public data and open-source programs to generate both
descriptive and predictive models, which are themselves made publicly
available. There are three modes of model access: as web services, via
web portals, and as downloads. Models are saved in Predictive Model
Markup Language (PMML) format. Information stored for each model
includes the training set, test set, descriptors and model tuning
parameters. This information is provided so that researchers may
determine a model's domain, and its applicability to their data.
Examples will be presented for two data sets retrieved from PubChem:
enzyme inhibition of dihydroorotate dehydrogenase (AID:1175), and a
cytochrome panel assay with activity outcomes (AID:1851).
![]()
90 - Extraction and integration of chemical information from
documents
Dr Hugo O Villar, Dr. Juan Betancort, Dr Mark R
Hansen. Altoris, Inc., La Jolla, California, United States
Effective chemical research requires that all sources of information be
incorporated in the decision making. Here we introduce a tool that
saves time when building chemical databases from web information or
chemical literature, including patent information. We discuss some of
the challenges faced in automating the
identification and extraction of chemicals named in patents, and their
conversion into chemical databases that can be mined effectively. The
integration of external sources of data can be valuable for research
informatics. To that end we have integrated the conversion of IUPAC
names with chemical optical character recognition. We show examples
where such integration can provide useful competitive information.
![]()
91 - SAR and the role of active-site waters in blood
coagulating serine proteases: A thermodynamic analysis of ligand-protein
binding
Dr. Noeris K Salam, Dr. Woody Sherman, Dr. Robert
Abel. Schrodinger, Inc., San Diego, CA, United States; Schrodinger,
Inc., New York, New York, United States
The prevention of blood coagulation is important in treating
thromboembolic disorders. Several serine proteases involved in the
coagulation cascade are classified as pharmaceutically relevant and are
the focus of structure-based drug design campaigns. Here, we investigate
the serine proteases thrombin and factors VIIa, Xa, and XIa, using a
computational method called WaterMap that describes the thermodynamic
properties of the water solvating the active site. We show that the
displacement of key waters from specific subpockets (e.g. S1, S2, S3 and
S4) of the active site by the ligand is a dominant term governing
potency, providing insights into SAR cliffs observed in several compound
series. Furthermore, we describe how WaterMap scoring can be
supplemented with terms from an MM-GBSA calculation to improve the
overall predictive capabilities.
![]()
139 - Configurational entropy and mechanical stress in molecular
recognition
Prof. Michael K. Gilson M.D., Ph.D.. School of Pharmacy,
University of California, San Diego, La Jolla, CA, United States
I will present molecular dynamics simulations consistent with long-ranged
entropy effects throughout a protein upon binding a peptide. The results are
somewhat preliminary, given the challenge of generating converged simulation
results, but are qualitatively consistent with the long-ranged changes in
orientational order parameters due to binding, which have been observed in NMR
studies of binding.
These apparent long-ranged effects raise questions regarding the mechanisms by
which binding affects remote parts of the protein. I will explain why the
concept of mechanical stress may be useful in thinking about such long-ranged
consequences, and will describe our initial computational studies of stress at
the molecular level. The accompanying image
shows computed stress tensors as a guest molecule is pulled from its
cucurbituril host in a simulated single-molecule pulling experiment.
![]()
140 - Advancing anthrax toxin countermeasures using topomeric
searching and virtual screening methodologies
Prof. Elizabeth A Amin PhD, Dr. Ting-Lan Chiu PhD,
Dr. Derek J Hook PhD, Dr. Michael A Walters PhD, Prof. Barry C Finzel
PhD, Jonathan Solberg, Satish Patil, Dr. Todd W Geders PhD, Dr.
Subhashree Rangarajan PhD, Dr. Rawle Francis PhD, Xia Zhang. Department
of Medicinal Chemistry, University of Minnesota, Minneapolis, Minnesota,
United States; Institute for Therapeutics Discovery and Development,
University of Minnesota, United States; Department of Chemistry,
University of Minnesota, United States
One of the most dangerous bioterror agents is the rod-shaped,
spore-forming bacterium Bacillus anthracis, which is the
causative agent of anthrax. Concentrated anthrax spores have been
deployed as biological weapons in the United States and elsewhere,
resulting in high mortality rates among those exposed. The lethal factor
(LF) enzyme is secreted by the bacillus as part of the anthrax lethal
toxin, and is mainly responsible for anthrax-related cytotoxicity. As LF
can remain in the system long after antibiotics have eradicated the
bacilli, the preferred therapeutic modality would be the administration
of antibiotics together with an effective LF inhibitor. To date,
however, no LF inhibitor is available as a therapeutic or preventive
agent. Here we present an original high-throughput computational
protocol that successfully identified five promising novel LF inhibitor
scaffolds with low micromolar inhibition against that target,
demonstrating a 12.8% experimental hit rate. This protocol incorporated
topomeric shape-based searching techniques that were particularly
effective in identifying potential new leads. Three of the five new hits
exhibited experimental IC50 values less than 100
µM and may potentially serve as scaffolds
for lead optimization. Virtual screening simulations predicted that
these preliminary hits are likely to engage in critical ligand-receptor
interactions with nearby residues in at least two of the three (S1',
S1-S2, and S2') subsites in the LF binding area. Notably, it was found
that micromolar-level LF inhibition can be attained by compounds with
non-hydroxamate zinc-binding groups that exhibit monodentate zinc
chelation as long as key hydrophobic interactions with at least two LF
subsites are retained.
![]()
141 - Model-free drug-like filters
Dr Oleg Ursu, Dr Cristian G. Bologa, Prof. Tudor I. Oprea MD, PhD.
Department of Biochemistry and Molecular Biology, Division of Biocomputing,
University of New Mexico School of Medicine, Albuquerque, NM, United States
Extended connectivity descriptors computed by the Morgan algorithm have been
used for the classification of various molecular properties. The information
content encoded by such descriptors can be used to compute any 2D descriptors
[1]. As these atom environments are canonical, we extracted them as molecular
substructures (SMARTS) queries. Rooted in the information gain concept, already
applied to derive selection rules in decision trees [2], we aimed at a better
separation between classes of chemicals such as “drugs” and “non-drugs”. The
most discriminating atom environments (having the highest information gain) were
selected as model-free drug-like filters. These can be used to evaluate third
party chemical libraries to assess drug-likeness.
[1] JL Faulon, DP Visco, RS Pophale. J. Chem. Inf. Comput. Sci. 2003, 43:707-720
[2] JR Quinlan. Machine Learning 1986, 1:81-106
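The selection step can be sketched as a ranking of binary features by information gain, in the sense used for decision trees [2]. The example assumes each atom environment has already been reduced to a presence/absence flag per molecule; the feature names and the tiny data set are invented, and no SMARTS matching is performed here.

```python
from math import log2


def entropy(n_pos, n_neg):
    """Binary entropy (in bits) of a set with n_pos drugs and n_neg non-drugs."""
    total = n_pos + n_neg
    if total == 0:
        return 0.0
    h = 0.0
    for count in (n_pos, n_neg):
        if count:
            p = count / total
            h -= p * log2(p)
    return h


def information_gain(labels, has_feature):
    """Reduction in class entropy after splitting on presence of one feature."""
    def class_counts(rows):
        pos = sum(1 for label in rows if label)
        return pos, len(rows) - pos

    present = [label for label, flag in zip(labels, has_feature) if flag]
    absent = [label for label, flag in zip(labels, has_feature) if not flag]
    n = len(labels)
    weighted = sum(len(part) / n * entropy(*class_counts(part)) for part in (present, absent))
    return entropy(*class_counts(labels)) - weighted


def rank_features(labels, features):
    """Return (gain, name) for every feature, most discriminating first."""
    return sorted(((information_gain(labels, flags), name) for name, flags in features.items()),
                  reverse=True)


# Hypothetical data: 1 = drug, 0 = non-drug; one flag per molecule and environment.
labels = [1, 1, 1, 0, 0, 0]
features = {
    "env_amide": [True, True, True, False, False, False],
    "env_nitro": [False, False, True, True, True, False],
    "env_methyl": [True, False, True, False, True, False],
}
for gain, name in rank_features(labels, features):
    print(f"{name}: {gain:.3f}")
```

Environments at the top of the ranking separate the two classes best and are the ones that would be retained as filters under this scheme.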
![]()
142 - Chemocentric informatics: Enabling bioactive compound discovery
through structural hypothesis fusion
Prof. Alexander Tropsha. School of Pharmacy, University of
North Carolina at Chapel Hill, Chapel Hill, NC, United States
Historically, computational drug discovery studies have relied on limited
sources of data such as biological assays of compound libraries tested against
single targets with results published in print. Nowadays, the information
resources have broadened dramatically including large chemical genomics
databases (e.g., ChEMBL, PubChem, PDSP, ToxCast), digital libraries (e.g.,
PubMed), gene expression profiles (e.g., cmap), and others. I shall describe a
chemocentric informatics strategy integrating different information resources
and diverse computational methodologies towards discovering novel bioactive
compounds. I shall describe the use of digital libraries for establishing new
datasets to analyze the relationships between chemical structure and biological
activity; highlight the importance of chemical data curation; and illustrate how
computational models help spot and correct erroneous data. I will
describe a study combining Quantitative Structure Activity Relationship (QSAR)
modeling, virtual screening (VS), text mining, and gene expression profiling of
chemicals for identifying novel experimentally confirmed high-affinity GPCR
ligands as potential anti-Alzheimer drug candidates.
![]()
143 - Computers and drug discovery: From duds to $5B drugs
Prof. Robert C Glen PhD. Department of Chemistry,
University of Cambridge, Cambridge, Cambridgeshire, United Kingdom
Despite what you may think, given the investment in industrial scale
pharmacology and chemistry, drug discovery is still a cottage industry.
Small focussed groups of scientists combine diverse expertise from
pharmacology and biology to synthesis and design, wrestling with complex
and uncertain data. It is a poorly defined science, with undefined
outcomes, often guided by rule-of-thumb, intuition and sheer luck.
Bringing the logic of computation to the chaos of biology is very
difficult, but every so often we succeed beyond our wildest dreams.
Since this is the 50th anniversary of The Journal of Chemical
Information and Modeling, I would like to review some of our work on
novel algorithms and drug discovery, focussing on GPCRs, over the past
twenty years and in particular identify some things that worked, some
that didn't and also challenge some views of where modelling and
computation should be applied, and where it shouldn't (yet).
![]()
144 - Weighting and fusion methods for similarity-based
virtual screening
Prof. Peter Willett, Shereen Arif, Dr John Holliday,
Nurul Malim, Christoph Mueller. Information School, University of
Sheffield, Sheffield, South Yorkshire, United Kingdom
Recent work in Sheffield on similarity searching has focussed on the use
of data fusion and fragment weighting methods to search the MDDR, WOMBAT
and MUV databases. Data fusion involves the combination of multiple
similarity searches. The overlap between multiple searches is shown to
follow a Zipf-like, power law distribution, with very few molecules (or
active molecules) common to multiple searches; and a comparison of a
large number of different group-fusion algorithms shows that one based
on molecules' inverse rank positions is the most effective of those
tested. Information about the frequencies with which fragments occur in
molecules can be used in two ways to increase search effectiveness (when
compared with using just the presence or absence of fragments in
molecules): using functions of the frequencies of fragment occurrences
in individual molecules, and using inverse functions of the frequency of
fragment occurrences in the database as a whole.
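
As a rough illustration of group fusion by inverse rank position, the sketch below sums 1/rank for each molecule across several similarity searches and re-ranks on the fused score. The function name, the molecule identifiers, and the three example rankings are hypothetical; the abstract does not give the exact scoring function used in the Sheffield work, so this is one plausible reading rather than the authors' implementation.

```python
from collections import defaultdict

def reciprocal_rank_fusion(rankings):
    """Fuse ranked lists by summing each molecule's inverse rank (1/rank)."""
    scores = defaultdict(float)
    for ranking in rankings:
        for position, molecule in enumerate(ranking, start=1):
            scores[molecule] += 1.0 / position
    # Highest fused score first; ties are broken by name so output is deterministic.
    return sorted(scores, key=lambda m: (-scores[m], m))

# Hypothetical results of three similarity searches, each using a
# different reference structure (assumed data, for illustration only).
searches = [
    ["mol_A", "mol_B", "mol_C"],
    ["mol_B", "mol_A", "mol_D"],
    ["mol_B", "mol_C", "mol_A"],
]
fused = reciprocal_rank_fusion(searches)
```

Molecules that appear near the top of several lists accumulate the largest scores, while a molecule retrieved by only one search contributes a single small term.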
![]()
![]()
10 - Construction of topical faculty learning communities by the
Center for Workshops in the Chemical Sciences (CWCS) and the use of Drupal as a
development platform
Dr. Cianán B. Russell, Dr. David M. Collard. School of
Chemistry and Biochemistry, Georgia Institute of Technology, Atlanta, GA, United
States
A new national dissemination initiative of the Center for Workshops in the
Chemical Sciences (CWCS) is to develop topical faculty learning communities to
further spread the adoption of innovative content and to propagate the use of
good pedagogical practice in the teaching of undergraduate chemistry. CWCS has
provided 88 workshops in a variety of topical areas, hosting over 1400
participants who have then used the workshop materials in a number of ways to
improve undergraduate education. In this new initiative, we wish to engage
workshop participants as the foundation of online communities that provide
access to databases of curricular materials and pedagogies, together with the
shared expertise of the group through discussion boards, blogs, etc. The Drupal
platform was used to develop a flexible and adaptable interface. The process of
developing this interface, and challenges associated with prototyping,
assessing, and modifying our approach to the development will be discussed.
![]()
11 - Ebooks: A culture shift for academic libraries?
Assistant Professor Barbara A. Losoff. Science
Library, University of Colorado, Boulder, CO, United States
The decline of print materials in academic libraries is a result of
changing technology, cost, and plummeting use by patrons. This mobile
Google/YouTube/Facebook generation acquires its information online.
Images are as important as text. Librarians must ask themselves: in what
ways are these users transforming the very definition of a book, how can
libraries support this cultural shift to digital content, and does
anyone know what the book of the future will resemble?
![]()
12 - Engaging student discussion: The role of a Google Jockey
Prof. Laura E Pence, Emily R. Greene. Department of
Chemistry, University of Hartford, West Hartford, CT, United States
A challenge to the inclusion of real world applications in a course can be
the students' lack of mental images to provide context. PowerPoint images
are a solution in a structured lecture environment, but in a first year
seminar course with an emphasis on discussion, preselected illustrations
constrain the dialogue and reflect only the instructor's mental framework.
A powerful alternative solution employed a senior student in the role of
Google Jockey, whose purpose is to search for and display images from the
internet as illustration or counterpoint to an ongoing discussion. The
replacement of mental images with visual images enhanced student
engagement in the class and allowed the senior to make a vital, if silent,
contribution to the dialogue.
![]()
13 - Rip-Mix-Learn (RML): Using Google Docs to create
collaborative multimodal class notes
Dr. Lucille A Benedict, Dr. Harry E Pence. Department
of Chemistry, University of Southern Maine, Portland, ME, United States;
SUNY College at Oneonta, Oneonta, NY, United States
Computers and the internet have become ubiquitous among college students
and can be very powerful educational tools when properly incorporated
into the course curriculum. The Rip-Mix-Learn (RML) approach applies
students' knowledge of surfing the web with course content to create a
set of collaborative class notes that incorporate multimodal
representations of each concept to make the students more personally
invested in the topics. To create these class notes, first-semester
general chemistry course students were given a basic set of notes each
week in Google Docs focusing on the current course topics. The students'
task was to annotate these documents with pictures, videos, or other
representations found on the web and then write brief descriptions of
how these annotations related to the specific topics. This talk will
focus on the implementation, use, and advantages and drawbacks of using
this RML approach in a large lecture first-semester general chemistry
course.
![]()
14 - Smart phones, smart objects, and chemical education
Prof. Harry E. Pence PhD. Department of Chemistry and
Biochemistry, SUNY Oneonta, Oneonta, New York, United States
The mobile phone is already changing the way we communicate, but it is also
creating new ways to access information. Companies such as Google, Yelp, and Layar
are building a layer of digital information that can augment the photograph a
user takes with his/her smartphone. As 2D bar codes become more popular in this
country, these symbols can label an object with a URL which, in turn, can cue a
smartphone or personal computer to access a web site. This means that a piece of
paper can include the equivalent of a hyperlink that may lead to structural,
safety, or other information. What new opportunities open up for chemical
educators when smartphones offer not only portable access to a massive library
of information but also a quick and convenient way to work with smart objects
that are connected to the World Wide Web?
![]()
15 - How community crowdsourcing and social networking is helping
to build a quality online resource for chemists
Dr Antony J Williams PhD. ChemSpider, Royal Society
of Chemistry, Wake Forest, North Carolina, United States
With an intention to provide a free internet resource of chemistry
related data for the community, ChemSpider provides an online database
of chemical compounds, reaction syntheses and related data. Members of
the community can contribute to the database via the deposition of
chemical structures, synthesis procedures and analytical data. Data are
also aggregated from many other depositors, at present over 400 data
sources. The aggregation of data associated with over 25 million
chemical compounds does not come without data quality issues. By
engaging the community to curate the data, the quality continues to
improve on a daily basis. The presentation will provide an overview of
our ongoing efforts to expand and curate the database. Using a
combination of game-based and recognition systems, as well as our
dependence on societal giveaway by the community, ChemSpider continues
on its path to becoming a high quality resource and foundation for the
semantic web for chemistry.
![]()
16 - Chemistry of social media
Scott Jensen. American Chemistry Council, Arlington,
VA, United States
The rise of Web 2.0 or social media has created a new frontier in
communicating with a variety of audiences on issues directly related to
chemistry and how it impacts their lives. Blogs, Twitter, Facebook and
even YouTube have created new opportunities to disseminate information
in a very direct and targeted fashion. At the same time, social media
tools can allow for dialogue or a forum for debate.
This presentation will discuss how the American Chemistry Council's
Chlorine Chemistry Division has entered this new frontier and utilized
Web 2.0 tools to engage and educate a range of audiences from policy
makers to the general public regarding chlorine related issues.
![]()
43 - Communicating organic chemistry through the internet:
Global learning communities
Prof. Philip A Janowicz. Department of Chemistry and
Biochemistry, California State University - Fullerton, Fullerton, CA,
United States
The power of broadband internet has allowed for instant communication
across the world, and opportunities for distance education have been
greatly enhanced. In the spring of 2009, students from Peking University
in Beijing, China, joined in with students from the University of
Illinois at Urbana-Champaign in synchronous discussion sessions for
organic chemistry. In the fall of 2009, students from Lahore University
of Management Sciences in Lahore, Pakistan, joined the synchronous
discussions. Experiences during these semesters will be shared along
with an outlook for the future.
![]()
44 - Focusing CENtral Science: An overview of C&EN's redesigned blog
portal and its usefulness to educators
Editor, C&EN Online Rachel Pepling. Chemical & Engineering
News, Washington, DC, United States
In March 2010, Chemical & Engineering News magazine relaunched its blog,
C&ENtral Science (http://centralscience.org), as a portal to several
content-focused blogs meant for different audiences (and dropped the "&" along
the way). This overview will discuss why that decision was made and how the new
CENtral Science can be a valuable resource to chemical educators.
![]()
45 - Chemistry blogging: From literature to controversy to
community to...
Aaron D. Finke. Department of Chemistry, University
of Illinois, Urbana-Champaign, Urbana, IL, United States
This talk will focus on my experiences as the co-author of a popular
chemistry blog, Carbon-Based Curiosities. Initially, I started blogging
as a means to keep up with the literature by forcing myself to read and
summarize papers I enjoyed or found interesting. However, as the blog
progressed, the audiences increased, and my interests diverged, I found
myself using the chemistry blogosphere as a means to a different end,
one in which one's personal creative energies, even those that tended to
diverge far from chemistry, could be applied to ideas, problems, and
controversies in current chemical research. In this personal account, I
will draw from not only my own experiences in blogging, but also from
others across the chemical blogosphere, and show how this small
community has already made some big waves.
![]()
46 - Blogging: Ego trip, or sound science? Its role in
chemical education and research
Prof Henry S Rzepa D. Sc.. Chemistry, Imperial
College, London, United Kingdom
Blogs evolved as a personal statement by an individual, but in science
and chemistry have now emerged as a fascinating new way of reviewing the
correctness of previously reviewed traditional published science. I will
argue they can be much more. In chemical education, they enable the
chemist to communicate their accumulated expertise in an accessible
manner to both the educational and, as it happens, the research
communities, and indeed to present new and original science that might
otherwise be lost. The speaker has posted more than 50 blogs in a year
of activity, and a number of these have also been used to enhance taught
courses. Others have morphed into published peer-reviewed articles, in
traditional journals. The difference between a publication and a blog
will be discussed, as well as how a blog can be enhanced with semantic
attributes, harvested and aggregated and archived for the longer term.
![]()
47 - Teaching scientific communication in pharmaceutical
bioinformatics education
Dr. Egon Willighagen. Department of Pharmaceutical
Biosciences, Uppsala University, Uppsala, Uppland, Sweden
Communication is a central part of science. Traditionally, students are
educated to access scientific literature, but communication channels are
changing. The amount of literature has risen sharply, and not even
established researchers can keep up with the number of publications that
appear each week. At the same time, new technologies have changed
communication as we knew it, and with the introduction of the internet,
communicating with anyone around the world has become as easy as
communicating with people in the same department. Research has become so
specialized, however, that peers at the same university are not always
the best judges of one's work, and international communication
becomes more and more important.
In my education of students doing a 20 week research project in
Pharmaceutical Bioinformatics at Uppsala University, we made the use of
social websites a core part of their education. Within their projects,
the students (two at the moment) report the work they do via their blogs;
additionally, taking advantage of the programming side of their work,
the results of their experiments (source code) are submitted to a central
source code repository. This is quite similar to the use of wikis for
describing synthesis experiments in organic chemistry. Additionally,
reusable components and examples of how their work can be used are shared
via the social MyExperiment.org website, allowing others to download the
protocols the students have developed, comment on them, and rate them.
The students also take part in a journal club, where we discuss related
literature. The goal of these meetings is for the students to learn to
formulate an opinion on the paper, after which we discuss the theories
behind the paper in more detail. For each discussed paper, one or two
participants write up a dedicated blog post, which we mark up such that
social websites like Chemical blogspace and ResearchBlogging.org can
pick up the discussed literature. CiteULike.org is used to share the
list of discussed papers using a dedicated hashtag.
By making the literature reviews and their progress in the 20 week
project publicly available, the students engage in a scientific
discussion with peers. By having parts of their work publicly available
in their blog, it is easier for them to discuss issues on more targeted
mailing lists for databases and software libraries they use in their own
project. Using these social websites helps the students to put their
scientific work in perspective, and teaches them to discuss their
research with other scientists around the globe.
![]()
48 - Developments in chemistry resources on Wikipedia
Prof. Martin A Walker PhD. Department of Chemistry,
SUNY Potsdam, Potsdam, NY, United States
In recent years, Wikipedia has become a standard information source for
students and researchers alike, but its open nature tends to undermine
its reliability. This presentation will explain how to use this immense
resource effectively, and also describe efforts made by the Wikipedia
chemistry community to address users' concerns. A collaboration with
Chemical Abstracts Service has led to validation of Registry Numbers and
structures, while other collaborations with ChemSpider and RSC have also
brought improvements, yet much remains to be done. The presentation will
close with an overview of work that is planned or under way, indicating
the direction of likely future developments.
![]()
49 - ChemEd DL WikiHyperGlossary
Dr. Robert Belford, Dr. Daniel Berleant PhD, Michael
Bauer, Dr. John W. Moore PhD, Roger Hall. Department of Chemistry, UALR,
Little Rock, AR, United States; Department of Chemistry, University of
Wisconsin-Madison, Madison, WI, United States; Department of Information
Sciences, UALR, Little Rock, AR, United States; MidSouth BioInformatics
Center, UALR, Little Rock, AR, United States
We will present the new editing interface of the wikihyperglossary
generating program being developed for ChemEd DL. We will go over the
database design, present several databases, including a non-editable one
with IUPAC Gold book definitions, along with several editable ones. We
will then discuss our experiences in a general chemistry class where
students created definitions for terms in their class textbook using
textual and multimedia online resources.
![]()
50 - Chempedia Lab: Group meeting on a global scale
Ph. D Richard L Apodaca. Metamolecular, LLC, La
Jolla, CA, United States
Online database searches have become the information tool of choice for
answering tough experimental chemistry questions. But what if it were
possible to answer questions by simply asking the entire experimental
chemistry community directly? What would a system that made this
possible look like, and how might it work? Chempedia Lab (http://lab.chempedia.com)
represents our attempt to answer these questions through a fundamentally
new approach to online knowledge-gathering. This talk will discuss how
traditional databases have failed the experimental chemistry community,
and what Chempedia Lab might teach about the chemical information
systems of the future.
![]()
![]()
13 - Tautomerism in chemical information management systems
Wendy A. Warr M.A., D. Phil. Wendy Warr & Associates, Holmes
Chapel, Cheshire, United Kingdom
Tautomerism has an impact on many of the processes in a chemical information
management system including novelty checking during registration into chemical
structure databases; storage of structures; exact and substructure searching in
chemical structure databases; and depiction of structures retrieved by a search.
For this talk the approaches taken by a great many different software vendors
and database producers have been compared. Since it is important to take account
of the nature of the database and the process for which it is designed, and the
user requirements vary, it is dangerous to lay down the law about what is right
and wrong. The comparison is nevertheless of considerable interest.
![]()
14 - Tautomerism in large databases
Dr. Markus Sitzmann, Dr. Wolf-Dietrich Ihlenfeldt,
Dr. Marc C Nicklaus. Chemical Biology Laboratory, Center for Cancer
Research, National Cancer Institute, National Institutes of Health,
DHHS, NCI-Frederick, Frederick, MD, United States; Xemistry GmbH,
Königstein, Germany
We are reporting on a comprehensive tautomerism analysis of one of the
largest currently existing sets of real (i.e. not computer-generated)
compounds. We used the Chemical Structure DataBase (CSDB) of the NCI CADD
Group, an aggregated collection of over 150 small-molecule databases
totaling 103.5 million structure records. Tautomerism was found to be
possible for more than 2/3 of the unique structures in CSDB. A total of
680 million tautomers were calculated from the original structure
records. Tautomeric overlap within the same individual database (i.e. at
least one other entry was present that was really only a different
tautomeric representation of the same compound) was found at an average
rate of 0.3% of the original structure records, with values as high as
nearly 2% for some of the databases in CSDB. Tautomeric overlap across
all constituent databases in CSDB was found for nearly 10% of the records
in the collection.
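
The overlap rate described above can be sketched as a simple grouping exercise, assuming each record already carries a canonical tautomer key. Everything in the example is hypothetical: the record layout, the function name, the key strings, and the four sample records. Generating a real canonical tautomer key requires a cheminformatics toolkit and is not shown, and the abstract does not describe how CSDB computes it.

```python
from collections import defaultdict

def tautomeric_overlap_rate(records):
    """Fraction of records that share a canonical tautomer key with at
    least one record drawn with a different structure.

    records: list of (record_id, structure_key, canonical_tautomer_key).
    """
    structures_by_tautomer = defaultdict(set)
    for _, structure, tautomer in records:
        structures_by_tautomer[tautomer].add(structure)
    # A record overlaps when its tautomer class holds more than one structure.
    overlapping = sum(
        1 for _, _, tautomer in records
        if len(structures_by_tautomer[tautomer]) > 1
    )
    return overlapping / len(records)

# Assumed sample data: two drawings of the same compound
# (2-hydroxypyridine and 2-pyridone) plus two unrelated records.
# The "taut:..." keys are placeholders, not real canonical identifiers.
records = [
    ("rec1", "Oc1ccccn1", "taut:pyridone"),
    ("rec2", "O=C1C=CC=CN1", "taut:pyridone"),
    ("rec3", "CCO", "taut:ethanol"),
    ("rec4", "c1ccccc1", "taut:benzene"),
]
rate = tautomeric_overlap_rate(records)
```

Running the same function on the records of a single database or on the pooled records of all databases corresponds to the within-database and across-database figures, respectively.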
![]()
15 - Tautomerism in drug discovery
Bahaa El-Dien M. El-Gendy, Prof. Alan R. Katritzky
PhD, Dr. C. Dennis Hall PhD, Bogdan Draghici. Department of Chemistry,
University of Florida, Gainesville, Florida, United States; Department
of Chemistry, Benha University, Benha, Qalubia, Egypt
The influence of tautomerism on the precise structure of drugs and thus
of their potential to interact in biological systems is discussed from
thermodynamic and kinetic aspects. The types of tautomerism encountered
in the structure of drugs in current use are surveyed together with the
effect of pH, solvent polarity, and temperature.
![]()
16 - Quantitative forecasts of biological potency of molecules
that can tautomerize
Dr. Yvonne C Martin. Martin Consulting, Waukegan, IL,
United States
Whether one is using ligand-based 2D or 3D QSAR or structure-based
estimates of potency of molecules, tautomerism needs to be addressed.
This talk will highlight insights as to when one needs to consider
tautomerism and how it can be included in potency forecasts.
![]()
17 - New questions about tautomerism in cytosine: Quantum
chemical and matrix isolation spectroscopic studies
Prof. Geza Fogarasi, Mr Gabor Bazso, Prof Peter G
Szalay, Dr. Gyoergy Tarczay. Laboratory of Theoretical Chemistry,
Institute of Chemistry, Eotvos University, Budapest, Budapest, Hungary;
Laboratory of Molecular Spectroscopy, Institute of Chemistry, Eotvos
University, Budapest, Budapest, Hungary
In spite of numerous studies, there is much uncertainty about
tautomerism in nucleic acids and specifically cytosine. In the gas
phase, form 2 dominates, but ΔG may be about 1 kcal/mol for both 1 and
the “rare” form 3. Spectroscopic studies “see” them but in much smaller
abundance. The UV spectrum is normally assigned to 1. Dimerization may
also influence tautomerization.

Fig. 1. Selected isomers and a dimer of cytosine
We present infrared and UV spectroscopic measurements in Ar matrix and
discuss them by MP2 and CC quantum chemical calculations, including
electronic excitations. Contributions from isomers/tautomers and/or
dimers to the spectra are discussed.
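
To give a feel for what a free energy difference of about 1 kcal/mol implies, the sketch below evaluates the Boltzmann population ratio exp(-ΔG/RT) for a minor tautomer relative to the dominant one. The temperature of 298.15 K is an assumption made for illustration and is not taken from the abstract; the function name is likewise hypothetical.

```python
import math

GAS_CONSTANT_KCAL = 0.0019872  # R in kcal/(mol*K)

def population_ratio(delta_g_kcal, temperature_k):
    """Equilibrium population of a higher-energy form relative to the
    lower-energy form, given their free energy gap in kcal/mol."""
    return math.exp(-delta_g_kcal / (GAS_CONSTANT_KCAL * temperature_k))

# Assumed conditions: 1 kcal/mol gap at 298.15 K (not stated in the abstract).
ratio = population_ratio(1.0, 298.15)
```

Under these assumptions the minor form would be present at a little under one fifth of the population of the dominant form, so a much smaller observed abundance would be worth explaining.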
![]()
55 - FPGA implementation of cheminformatics and computational
chemistry algorithms and its cost/performance comparison with GPGPU,
cloud computing and SIMD implementations
Dr. Attila Berces PhD, Prof. Bela Feher PhD, Peter
Szanto, Imre Pechan, Laszlo Lajko, Zoltan Runyo, Peter Laczko, Janos
Lazanyi. Chemistry Logic Kft, Budapest, Hungary; Dept. of Measurement
and Information Systems, Budapest University of Technology and
Economics, Budapest, Hungary; evopro Kft, Budapest, Hungary
We have developed binary fingerprint based similarity searching,
topological torsional fingerprint based similarity searching, chemical
library to library comparison, sphere exclusion and Jarvis-Patrick
clustering, peptide mass spectrometry fingerprinting, BLAST prefiltering,
and short read mapping in color space on a Silicon Graphics RC100 FPGA
card. In addition, we implemented the AutoDock docking software on FPGA.
We reached 5- to 500-fold acceleration compared to CPU in these
implementations. In this presentation the audience will learn what
characteristics an algorithm should have to make it worthwhile to
implement it on FPGA. We shall also compare the cost/performance
characteristics to other alternatives such as cloud computing, GPGPU,
and single-instruction-multiple-data (SIMD) optimization.
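
A minimal sketch of binary fingerprint similarity searching follows, assuming the Tanimoto coefficient as the similarity measure (the abstract does not name the coefficient used). Fingerprints are modeled as Python integers whose set bits mark the features present; the library contents, names, and threshold are hypothetical. The core of the calculation is a bitwise AND, a bitwise OR, and two bit counts, which is the kind of regular, data-parallel work that tends to map well onto FPGAs, GPUs, and SIMD units.

```python
def tanimoto(fp_a: int, fp_b: int) -> float:
    """Tanimoto coefficient of two bit-vector fingerprints stored as ints."""
    common = bin(fp_a & fp_b).count("1")  # bits set in both fingerprints
    total = bin(fp_a | fp_b).count("1")   # bits set in either fingerprint
    return common / total if total else 0.0

def similarity_search(query, library, threshold):
    """Return (name, score) pairs at or above the threshold, best first."""
    scored = ((name, tanimoto(query, fp)) for name, fp in library.items())
    return sorted((hit for hit in scored if hit[1] >= threshold),
                  key=lambda hit: -hit[1])

# Assumed 8-bit fingerprints, for illustration only.
library = {"mol_1": 0b10110110, "mol_2": 0b10010010, "mol_3": 0b01001001}
query = 0b10110010
hits = similarity_search(query, library, threshold=0.5)
```

Real fingerprints are typically hundreds to thousands of bits long, but the arithmetic is unchanged.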
![]()
56 - Technologies for desktop HPC: Application developer's
perspective
Dr. Volodymyr Kindratenko PhD, Guochun Shi. National
Center for Supercomputing Applications, University of Illinois, Urbana,
IL, United States
In the last few years we have witnessed the emergence of a new computing
paradigm: computational accelerators. Most prominent examples of such
accelerators include FPGAs, Cell/B.E., and most recently GPUs. While
these technologies bring unprecedented computing capabilities to the
desktop users at a fraction of the cost of a traditional HPC system,
their use comes with substantial difficulties due to the need for
software reengineering. We survey the landscape of application
accelerators for desktop systems and discuss the challenges of
re-implementing computational chemistry applications on some of these
systems using the Hartree-Fock method and molecular dynamics codes as
examples.
![]()
57 - Faster, cheaper, and better science: Molecular modeling
on GPUs
John E. Stone. Beckman Institute, University of
Illinois at Urbana-Champaign, Urbana, IL, United States
Over the past ten years graphics processing units (GPUs) have evolved
from fixed-function single-purpose devices into highly programmable
massively parallel co-processors. State-of-the-art GPUs support
double-precision floating point arithmetic and achieve performance
levels approaching one trillion floating point arithmetic operations
per second. Modern GPUs enable software development in dialects of
familiar C, C++, and Fortran languages, and GPU acceleration extensions
exist for Python, Matlab, and other popular languages and computing
tools.
The high performance of GPUs has created opportunities for acceleration
of many computationally demanding molecular modeling algorithms that
contain significant parallelism.
We will describe how GPUs are currently employed to accelerate some of
the most computationally demanding tasks involved in molecular dynamics
simulation, visualization, and analysis in our NAMD and VMD software,
and give an overview of how GPUs are expected to evolve in the next few
years.
![]()
58 - Folding@home: Petaflops on the cheap today, exaflops
soon?
Prof. Vijay Pande. Department of Chemistry, Stanford
University, Stanford, CA, United States
Over the last 10 years, Folding@home has emerged as a very powerful
resource. Today, it has multi-petaflop performance, making it the most
powerful supercluster in the world. I will talk about how Folding@home
works, both in terms of infrastructure and algorithms, and how you can
easily reproduce these sorts of approaches in your own lab. I will also
very briefly touch on recent results from Folding@home to highlight what
petascale power can do to dramatically change the nature of what
simulations can inform us about systems of interest.
![]()
59 - Protein-ligand docking on the Cell/BE processor with
eHiTS Lightning
Zsolt Zsoldos PhD, Orr Ravitz PhD. SimBioSys Inc.,
Toronto, Ontario, Canada
The eHiTS flexible docking has proven to be among the most accurate pose
prediction tools (http://www.simbiosys.ca/ehits/ehits_validation.html)
providing one of the highest enrichment factors based on comparative
evaluation studies (http://www.simbiosys.ca/ehits/ehits_enrichment.html).
The accurate results of eHiTS have been achieved at the price of longer
CPU times in the past, but that has changed with the recent port of the
algorithm to the Cell/BE processor (http://www.bio-itworld.com/issues/2008/july-august/simbiosys.html).
The revolutionary hardware that powers RoadRunner (the world's current
fastest supercomputer) and is also available in the low cost SONY PS3
game console gives eHiTS a 30-50 fold speedup compared to a single core
Intel/AMD processor. The advantages of the Cell/BE platform over other
acceleration techniques (FPGA, GPGPU) will be described, along with the
challenges faced during the porting effort. A new proximity data
structure is introduced that is optimized for SIMD architectures. It
allows efficient evaluation of short range pairwise interactions with
optimum cache locality.
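
The abstract does not describe the new proximity data structure itself, so the sketch below shows only a generic, well-known alternative for the same problem: a uniform cell list, which bins atoms into cubes one cutoff wide so that each atom is compared only against atoms in its own and adjacent cells. It is not the SIMD-optimized structure used in eHiTS Lightning, and the function names, coordinates, and cutoff are hypothetical.

```python
import math
from collections import defaultdict
from itertools import product

def build_cell_list(points, cutoff):
    """Bin point indices into cubic cells with edge length equal to cutoff."""
    cells = defaultdict(list)
    for index, (x, y, z) in enumerate(points):
        key = (math.floor(x / cutoff), math.floor(y / cutoff), math.floor(z / cutoff))
        cells[key].append(index)
    return cells

def pairs_within_cutoff(points, cutoff):
    """Return sorted index pairs (i, j), i < j, closer than or equal to cutoff."""
    cells = build_cell_list(points, cutoff)
    found = set()
    for (cx, cy, cz), members in cells.items():
        # Only the 27 surrounding cells can hold a partner within the cutoff.
        for dx, dy, dz in product((-1, 0, 1), repeat=3):
            neighbours = cells.get((cx + dx, cy + dy, cz + dz), ())
            for i in members:
                for j in neighbours:
                    if i < j and math.dist(points[i], points[j]) <= cutoff:
                        found.add((i, j))
    return sorted(found)

# Assumed coordinates (arbitrary units), for illustration only.
points = [(0, 0, 0), (1, 0, 0), (5, 5, 5), (5.5, 5, 5), (9, 9, 9)]
close_pairs = pairs_within_cutoff(points, cutoff=2.0)
```

Because atoms in the same cell are stored together, the inner loops touch nearby memory, which is the general property that makes grid-based layouts attractive for cache-sensitive and vectorized hardware.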
![]()
60 - Fragment-based druggable hot spot identification in
proteins and protein-protein interactions using HPC
Dr. Gwo Yu Chuang, Dr. Ryan Brenke, David R Hall, Dr. Dmitri
Beglov, Dr. Dima Kozakov, Dr. Sandor Vajda. Department of
Biomedical Engineering, Boston University, Boston, MA, United States
Here we present a highly parallel FFT-based method FTMAP for performing
computational fragment mapping. Mapping methods place molecular probes
on the surface of proteins in order to identify the most favorable
binding positions. Since regions of the protein surface that are major
contributors to the binding free energy in drug-protein interactions
also bind a variety of small organic molecules, mapping can identify
such “hot-spots” and the number of probe molecules bound is a good
predictor of druggability. The highly parallel nature of our FFT-based
approach allows it to be fully scalable, running efficiently on
everything from desktop machines with CUDA enabled graphics adapters to
an IBM Blue Gene. The method has been applied to both canonical and
protein-protein interaction drug targets, successfully predicting
binding hot-spots and target druggability. Our public web server is
gaining popularity among academic users and generating significant
interest from industry.
![]()
61 - GPUs: What is all the fuss about?
Brian Cole, Bob Tolbert, Anthony Nicholls. OpenEye
Scientific Software, Santa Fe, NM, United States
High performance computing hardware is undergoing a revolution. The best
way to achieve increasing performance is through highly parallelized
architectures like the graphics processing unit. However, the GPU
requires a new assessment of algorithm design based on different memory
versus time tradeoffs. Good performance is no longer gained by simply
reducing the number of operations, but by organizing the interaction of
those operations with a complex hierarchy of memory with varying
latencies. Understanding the changing programming paradigm is critical
both to selecting which algorithms will benefit from the GPU and how to
achieve optimal performance. We will discuss design principles used when
porting ROCS to the GPU. We will compare performance of a GPU
implementation of ROCS to the highly-tuned production CPU
implementation. We will show that higher performance can be achieved on
the GPU at a significantly reduced cost compared to CPU clusters.
![]()
76 - Water in protein binding sites: Consequences for ligand
optimization
Dr. Julien Michel, Dr. Julian Tirado-Rives, James Luccarelli,
Prof. William L Jorgensen. Department of Chemistry, Yale University, New
Haven, CT, United States
An efficient molecular simulation methodology, JAWS, has been developed to
determine the positioning of water molecules in the binding site of a protein or
protein-ligand complex. Occupancies and absolute binding free energies of water
molecules are computed using a statistical thermodynamics approach. The
importance of determining proper water occupancies is illustrated in Monte
Carlo/free energy perturbation calculations for ligand series that feature
displacement of ordered water molecules in the binding sites of scytalone
dehydratase, p38-α MAP kinase, and EGFR kinase. The change in affinity for a
ligand modification is found to correlate with the ease of displacement of the
ordered water molecule. For accurate results, a complete thermodynamic analysis
is needed. It requires identification of the location of water molecules in the
protein-ligand interface and evaluation of the free energy changes associated
with their removal and with the introduction of the ligand modification. Direct
modification of the ligand in free-energy calculations is likely to trap the
ordered molecule and provide misleading guidance for lead optimization.
![]()
77 - Efficient method for computing the free energies of
active site waters: Application to drug discovery
Jinming Zou, Sia Meshkat, Zenon Konteatis, Anthony Klon,
Charles H. Reynolds. Ansaris, Blue Bell, Pennsylvania, United States
Grand canonical Monte Carlo and systematic free energy methods have been
reported previously that allow us to rapidly compute protein-fragment
interaction energies. The same methodologies can be employed to compute
free energies of binding for water. We have used this approach to
identify critical waters in a number of therapeutically interesting
protein active sites. Knowledge of the location and affinities of these
waters can be useful for designing ligands with improved potency.
![]()
78 - Using explicit solvent implicitly
Dr. Christopher J Fennell, Charles W. Kehoe, Prof.
Ken A. Dill. Department of Pharmaceutical Chemistry, University of
California, San Francisco, San Francisco, CA, United States; Graduate
Group in Bioinformatics, University of California, San Francisco, San
Francisco, CA, United States
Solvent plays a critical role in biomolecular simulations. It mediates
the transfer of small molecules, it bridges interactions between ligands
and binding sites, it stabilizes protein structures with external
hydrophilic groups and buried hydrophobic cores, among others. When
solvent is modeled explicitly in simulations, the microscopic
interactions can be handled rigorously, but obtaining converged
solvation energetics can be time-consuming. Here we describe a process,
called Semi-Explicit Assembly, where we precompute the solvation
response in simple systems and apply it in complex systems. We show that
it is possible to have a detailed/explicit-like treatment of solvation
at a computational cost similar to the fastest of implicit solvents.
![]()
79 - Role of solvent in protein-ligand binding
Robert Abel PhD, Noeris Salam PhD, Thijs Beuming PhD,
Woody Sherman PhD, Ramy Farid PhD. Schrodinger Inc., New York, NY,
United States
Calculation of protein-ligand binding affinities continues to be an
active area of research. Many techniques for computing
protein-ligand binding affinities have been introduced, ranging from
computationally very expensive approaches, such as free energy
perturbation (FEP) theory, to more approximate techniques, such as
empirically derived scoring functions, which, although computationally
efficient, lack a clear theoretical basis. There nonetheless remains a
pressing need for more robust approaches. The recently introduced WaterMap technology,
which calculates the locations and displacement free energies of
hydration sites in proteins, was developed to bridge the gap between the
accuracy of FEP and the computational efficiency of empirically derived
scoring functions. In the present work, we apply WaterMap to a number of
pharmaceutically relevant targets, and present a generalized approach
for accurate prediction of binding affinities that combines solvation
terms from WaterMap with other important thermodynamic terms.
![]()
80 - Compute the contribution of protein-pocket solvation to
ligand-binding affinity by explicit water simulations
Dr. Ming-Hong Hao, Dr Ingo Muegge. Department of
Medicinal Chemistry, Boehringer Ingelheim Pharmaceuticals, Inc,
Ridgefield, CT, United States
A significant fraction of ligand-binding free energy in proteins arises
from the replacement of water molecules by the ligand in the binding
site of proteins. Continuum solvation models based on surface areas do
not treat the short-range correlations of water molecules well in the
highly irregular and heterogeneous protein-binding pocket. We have
developed a computational procedure to simulate the density distribution
and free energy of water molecules in the ligand-binding pocket of
proteins using a molecular dynamics procedure (NAMD) with an explicit water
model (TIP3P). Our results are comparable with published work (e.g.
WaterMap software from Schrodinger Inc.) and show good agreement with
crystallized water molecules observed in the X-ray structures of
proteins. In our procedure, the distribution of water molecules in the
protein-binding pocket is presented as water density on a 3-dimensional
grid which we find to provide an intuitive way for visualizing the
hydrophobic or polar characteristics of a binding site. The contribution
of solvation to ligand-binding free energy is estimated as the
difference between the free energy of the pocket water displaced by the
ligand in the protein binding site and its free energy in bulk solvent. This contribution
is added to the direct ligand-protein interactions in scoring the
binding affinity of ligands. We investigated the effects of residue
mutations in the protein binding site on ligand-binding affinity, including
the tryptophan mutations (W79F, W92F, W108A and W120A) in the
high-affinity streptavidin-biotin complex and the drug-resistant mutants
of HIV protease in complex with the inhibitor U-89360E. In these
systems, X-ray crystallography showed no significant differences in the
given protein-ligand complex structures between the wild type and mutant
proteins. Intermolecular interactions between protein and ligand alone
do not fully account for the changes in ligand-binding affinity. The
free energy change of solvation in the binding site between wild type
and mutants provides a good explanation for the shift in ligand-binding
affinity. We also applied the procedure to study the structure-activity
relationship of congeneric series of ligands. Our results suggest that
binding-pocket solvation is an important factor in understanding the
binding affinity of ligands to proteins.
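The grid-based water density described above can be illustrated with a minimal sketch. It assumes water-oxygen coordinates (in angstroms) have already been extracted from trajectory snapshots; the function name `water_density_grid` and its arguments are hypothetical and are not taken from the authors' procedure.

```python
from collections import Counter
from math import floor


def water_density_grid(frames, origin, spacing):
    """Average water number density per voxel.

    frames  : list of snapshots, each a list of (x, y, z) water-oxygen positions
    origin  : (x, y, z) of the grid corner
    spacing : voxel edge length, same length unit as the coordinates
    Returns {voxel_index: density}, density in molecules per unit volume.
    """
    counts = Counter()
    for frame in frames:
        for x, y, z in frame:
            voxel = (
                floor((x - origin[0]) / spacing),
                floor((y - origin[1]) / spacing),
                floor((z - origin[2]) / spacing),
            )
            counts[voxel] += 1
    n_frames = len(frames)
    voxel_volume = spacing ** 3
    # Only visited voxels appear in the result; unvisited voxels have density 0.
    return {v: c / (n_frames * voxel_volume) for v, c in counts.items()}
```

Dividing each voxel value by the bulk number density of water (roughly 0.033 molecules per cubic angstrom) would give a relative occupancy, which is one possible way to highlight well-hydrated versus dry regions of a pocket.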
![]()
81 - All-atom explicit-solvent fragment-based drug discovery: SILCS
("Site Identification by Ligand Competitive Saturation") molecular dynamics
simulations applied to IL-2
Prof. Olgun Guvench M.D., Ph.D. Department of Pharmaceutical
Sciences, University of New England College of Pharmacy, Portland, ME, United
States
Two challenges in computer-aided drug discovery are incorporation of protein
flexibility and an accurate description of solvation effects. Fast in silico
screening methods typically employ rigid or near-rigid protein conformations and
continuum descriptions of solvation, while more physical and accurate
explicit-solvent all-atom molecular dynamics or Monte Carlo methods are very
computationally demanding. Site Identification by Ligand Competitive Saturation
(SILCS) is a recently-developed computationally-efficient fragment-based drug
discovery method that employs all-atom explicit-solvent molecular dynamics
simulations, essentially soaking the target in a 1 molar bath of hydrophobic
fragments to compute 3-D probability maps of hot-spots on the protein surface
that preferentially bind hydrophobic fragments or water molecules. Applied to
the apo crystal structure of IL-2, SILCS identifies two hydrophobic pockets not
present in the apo crystal, but later discovered to exist in complexes with
small molecule inhibitors and to bind hydrophobic moieties on these molecules.
![]()
101 - Predicting tautomer preference: Simple rules and unforeseen
complexities
Peter W. Kenny PhD, Peter J Taylor. AstraZeneca (retired),
Cheadle, United Kingdom
Tautomer ratio depends on phase, so for coherent analysis this must be chosen
first. We settle for water as the biological medium, and show inter alia that
the gas phase is still more removed from water than even the least polar of
organic solvents. We also point out that, while minor tautomers may bind to
receptors, this must entail an energetic penalty.
The 'basicity method' is the main source of quantitative data in water but
suffers from systematic errors through its inevitable reliance on model
compounds. Elimination of these using correction factors not only improves
accuracy but has demonstrated structural regularities that have gone unsuspected
till now. Their extrapolation leads to plausible predictions amenable to
experiment. The effects of benzofusion, and of intramolecular lone pair and
dipolar repulsion, exemplify these regularities and will be discussed.
Central to our approach is the realisation that tautomerism takes two forms,
'C-type' and 'N-type,' which depend on different electronic factors. The
apparent inconsistencies that result may have helped to inhibit the
comprehensive approach to tautomer ratio that is needed, and hopefully their
rationalisation will help in its renewal.
![]()
102 - Methods for robust and efficient tautomer enumeration,
tautomer searching and tautomer duplicate filtering
József Szegezdi, Zsolt Mohácsi, Tamás Csizmazia, Szilárd
Dóránt, Ákos Papp, György Pirok, Szabolcs Csepregi, Ferenc
Csizmadia. ChemAxon Ltd., Budapest, Hungary
Tautomerism is an important and difficult problem in cheminformatics,
and has gained much attention recently. [1] The presentation will focus
on ChemAxon's approaches and algorithms for handling tautomerism.
There are four main topics to cover:
1. The tautomerization calculator plugin [2] is the basis of most
methods. It can identify tautomerizable regions, enumerate all or
dominant tautomers and predict the distribution of dominant tautomers. Furthermore, it can
provide generic and canonical tautomers that are used by the methods
discussed. It first identifies possible proton donors and acceptors and
finds the tautomerization paths between them. Depending on the desired
operation, it then combines the paths into regions (generic tautomer),
combinatorially enumerates all possible tautomeric forms (all tautomers),
filters and ranks enumerated structures based on pKa and other criteria
(dominant tautomers) or canonicalizes using empirical rules (canonical
tautomer).
The tautomerization plugin is also used to improve results of other
calculations, such as macro pKa and logP.
2. Tautomer duplicate search uses generic tautomers combined with a hash
key. This method also allows fast filtering of tautomers in chemical
database tables. It will be shown how this method is able to handle
tautomeric migration of H isotopes and interactions with
stereochemistry.
3. Tautomer substructure search enumerates tautomers of the query, and
searches each of them separately. In case of query H constraints
(explicit H), the constraint is enforced on the tautomeric region to
retrieve only true tautomers.
4. Standardizer is a tool for performing custom and built-in
transformations on molecules. It is integrated with the JChem chemical
database system, so that database and query structures are automatically
transformed by the specified transformations [3]. It will be shown how
the canonical tautomer and custom transformations can be used to handle
tautomerism. Custom transformations also allow handling of ring-chain
tautomerism.
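The idea in topic 2, duplicate detection through a hash key computed on a tautomer-independent form, can be sketched as follows. This is not ChemAxon's implementation or API: the `to_generic` callable stands in for whatever routine produces a generic tautomer, and `TOY_GENERIC_FORMS` is a hand-written, hypothetical lookup used only to make the example runnable.

```python
import hashlib
from collections import defaultdict

# Hypothetical lookup standing in for a real generic-tautomer generator.
TOY_GENERIC_FORMS = {
    "Oc1ccccn1": "pyridin-2-ol/2-pyridone",      # 2-hydroxypyridine
    "O=c1cccc[nH]1": "pyridin-2-ol/2-pyridone",  # 2-pyridone
    "CCO": "CCO",                                # ethanol, no tautomers
}


def tautomer_key(structure, to_generic):
    """Hash the generic (tautomer-independent) form of a structure."""
    generic = to_generic(structure)
    return hashlib.sha1(generic.encode("utf-8")).hexdigest()


def group_tautomer_duplicates(structures, to_generic):
    """Return lists of structures that share the same tautomer hash key."""
    groups = defaultdict(list)
    for structure in structures:
        groups[tautomer_key(structure, to_generic)].append(structure)
    return [members for members in groups.values() if len(members) > 1]
```

Because the key is a fixed-length string, it could be stored in an indexed database column so that duplicate checks reduce to an equality lookup.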
References:
[1] Martin, Y.C.: Let's not forget tautomers J Comput Aided Mol Des
(2009) 23:693-704, DOI 10.1007/s10822-009-9303-2
[2] Szegezdi, J.; Csizmadia, F: Tautomer generation. pKa based dominance
conditions for generating dominant tautomers.
American Chemical Society meeting, Aug 19-23rd, 2007
http://www.chemaxon.com/conf/Tautomer_generation_A4.pdf
[3] Pirok, G. et al: Standardizer - Molecular Cosmetics for
Chemoinformatics.
Drug Discovery Technology, August 7-10th, 2006
http://www.chemaxon.com/conf/standardizer.pdf
![]()
103 - Tautomerization approach for drug-like molecules
Dr. John C. Shelley PhD, Arron P. Sullivan, David
Calkins, Dr. Jeremy R. Greenwood PhD. Schrodinger, Inc., Portland,
Oregon, United States; Schrodinger, Inc., New York, New York, United
States
We outline a pragmatic approach for generating the important protonation
states, including tautomers, for drug-like molecules in the context of
ligand and structure based virtual screening. The emphasis is on
generating those states that have significant populations (which we
define to be 0.01 mole fraction or more) in solution. These states also
encompass the vast majority of those intuited from the examination of
more than 2,500 protein-ligand complexes. The overall technology
combines the use of many pre-parameterized tautomeric equilibria with
Hammett and Taft calculation estimates of pKa values, which in turn can
also be used to generate variations in both protonation states and
tautomeric states. The overall approach permits the calculation of the
mole fractions for the states generated along with their relative free
energies. These free energy estimates have been shown to improve the
performance of subsequent studies such as docking with Glide.
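The link between relative free energies and mole fractions mentioned above is the standard Boltzmann relation. A minimal sketch follows, assuming free energies in kcal/mol at 298.15 K; the function names are hypothetical, and only the 0.01 cutoff comes from the abstract.

```python
import math

# Gas constant in kcal/(mol K) times an assumed temperature of 298.15 K.
RT_KCAL_PER_MOL = 0.0019872041 * 298.15


def mole_fractions(relative_free_energies, rt=RT_KCAL_PER_MOL):
    """Boltzmann-weighted mole fraction of each state.

    relative_free_energies: {state_name: free energy in kcal/mol}
    """
    weights = {
        name: math.exp(-g / rt) for name, g in relative_free_energies.items()
    }
    total = sum(weights.values())
    return {name: w / total for name, w in weights.items()}


def significant_states(relative_free_energies, cutoff=0.01, rt=RT_KCAL_PER_MOL):
    """Keep only states whose mole fraction is at least `cutoff`."""
    fractions = mole_fractions(relative_free_energies, rt)
    return {name: x for name, x in fractions.items() if x >= cutoff}
```

At room temperature a 0.01 mole fraction corresponds to a state lying roughly 2.7 kcal/mol above a single dominant state, so the cutoff can equally be read as an energy window.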
![]()
104 - Acid/base ionization vs. prototropic tautomerism
Dr. Robert Fraczkiewicz PhD, Dr. Marvin Waldman PhD,
Dr. Robert D. Clark PhD, Walter S. Woltosz MS, MAS, Dr. Michael B.
Bolger PhD. Life Sciences, Simulations Plus, Inc., Lancaster, CA, United
States
The most serious difficulty in computational predictive modeling of
tautomerism is the lack of a sufficiently comprehensive database of
tautomeric constants. [1] Published data on aqueous protonic ionization
is, on the other hand, quite abundant to build successful QSPR models.
Moreover, prototropic tautomerism is intimately tied to ionization in
more than one way. We present compelling examples of how these ties can
be explored to make both qualitative and quantitative predictions
regarding tautomers using a truly predictive model of ionization
constants. We show a very surprising case where the model refuted the
widely accepted tautomeric form of one of the most successful drugs on
the market today and how all of these predictions were confirmed beyond
any doubt, both experimentally and theoretically. We demonstrate how the
complex tautomerism of another very well known drug could be explained
and quantified from its predicted ionization patterns. A general
theoretical treatment of tautomer and ionization equilibria will be
presented as well.
1. Martin, Y. C. J. Comput. Aided Mol. Des. 2009, 23, 693-704.
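One textbook way in which tautomerism is tied to ionization can be written out explicitly. The relations below are a generic sketch, assuming two neutral tautomers T1 and T2 that share a single conjugate acid TH+; they are not the authors' general treatment, and the symbols are chosen here for illustration.

```latex
\[
K_{a,1} = \frac{[\mathrm{T_1}][\mathrm{H^+}]}{[\mathrm{TH^+}]},
\qquad
K_{a,2} = \frac{[\mathrm{T_2}][\mathrm{H^+}]}{[\mathrm{TH^+}]}
\]
\[
K_T = \frac{[\mathrm{T_2}]}{[\mathrm{T_1}]} = \frac{K_{a,2}}{K_{a,1}},
\qquad
\log K_T = \mathrm{p}K_{a,1} - \mathrm{p}K_{a,2}
\]
\[
K_a^{\mathrm{macro}}
  = \frac{\left([\mathrm{T_1}] + [\mathrm{T_2}]\right)[\mathrm{H^+}]}{[\mathrm{TH^+}]}
  = K_{a,1} + K_{a,2}
\]
```

Under these assumptions, a model that predicts the two microscopic constants also fixes the tautomer ratio, while an experiment that measures only the macroscopic constant does not.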
![]()
105 - Combinatorial-computational-chemoinformatics approach
to finding and analyzing low-energy tautomers
Dr. Maciej Haranczyk, Prof. Maciej Gutowski.
Computational Research Division, Lawrence Berkeley National Laboratory,
Berkeley, CA, United States; Chemistry-School of Engineering and
Physical Sciences, Heriot-Watt University, Edinburgh, United Kingdom
Enumeration of low-energy tautomers of neutral molecules in the
gas-phase or typical solvents can be performed by applying available
organic chemistry knowledge. However, in esoteric cases such as charged
molecules in uncommon, non-aqueous solvents there is simply not enough
available knowledge to make reliable predictions of low energy
tautomers. We have been developing an approach to address the latter
problem and we successfully applied it to discover the most stable
anionic tautomers of nucleic acid bases that might be involved in the
process of DNA damage by low-energy electrons. The approach involves
three steps: (i) combinatorial generation of a library of tautomers,
(ii) energy-based screening of the library using electronic structure
methods, and (iii) analysis of the information generated in step (ii).
In steps i-iii we employ combinatorial, computational and
chemoinformatics techniques, respectively. This presentation summarizes
our developments and most interesting methodological aspects of our
approach.
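Steps (i) and (ii) can be sketched in a few lines. This is a simplified illustration, not the authors' code: it assumes each candidate site holds at most one mobile proton, ignores valence constraints, and takes the energy from a caller-supplied function `energy_of` (hypothetical) instead of an electronic structure calculation.

```python
from itertools import combinations


def enumerate_tautomers(sites, n_protons):
    """Step (i): every way of placing n_protons on distinct candidate sites."""
    for occupied in combinations(sites, n_protons):
        yield frozenset(occupied)


def screen_by_energy(tautomers, energy_of, window):
    """Step (ii): keep tautomers within `window` of the lowest energy.

    Returns (tautomer, relative_energy) pairs sorted by increasing energy.
    """
    scored = sorted(((energy_of(t), t) for t in tautomers), key=lambda p: p[0])
    if not scored:
        return []
    e_min = scored[0][0]
    return [(t, e - e_min) for e, t in scored if e - e_min <= window]
```

In practice the screening step would typically be staged, with a cheap method pruning the library before a more expensive method ranks the survivors; the sketch leaves that choice to `energy_of`.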
![]()
106 - Comparison of pattern-based and algorithm-based
approaches to tautomer informatics
Ben Ellingson, Robert Tolbert, A. Geoffrey Skillman.
OpenEye Scientific Software, Inc, Santa Fe, NM, United States
Tautomers are an important consideration for cheminformatics and
molecular modeling. In cheminformatics, a unique tautomer is stored as
the singular registration key where it is vital that the unique key can
be generated from any tautomer as well as that all tautomers can be
generated from the unique key. The stored tautomer is often chosen for
aesthetics or computational ease, but chemical implications such as the
loss or gain of aromaticity or stereochemistry through tautomerization
must also be addressed. Molecular modelers are often concerned with
small ensembles of low energy tautomers. Unfortunately, determining the
low energy tautomers is a complex task, for which sub-kcal/mol accuracy
remains computationally intensive [1]. Thus, tautomer prediction for
large-scale modeling or cheminformatics remains the domain of
approximate methods. We will discuss two such approximate methods, pattern-based
tautomer recognition and atom-type tautomer recognition. The advantages
and disadvantages of these approaches will be examined.
1. Geballe, M. T.; Skillman, A. G.; Nicholls, A.; Guthrie, J. P.;
Taylor, P. J. The SAMPL2 Blind Prediction Challenge: Introduction and
Overview. Journal of Computer-Aided Molecular Design 2010,
24, XX.
![]()
120 - Community structure-activity resource: Collecting,
curating, and generating protein-ligand data to improve docking and
scoring
Dr. James B. Dunbar Jr, Prof. Heather A. Carlson.
Department of Medicinal Chemistry, University of Michigan, Ann Arbor,
Ann Arbor, MI, United States
The Community Structure-Activity Resource (CSAR) is a center at the
University of Michigan funded by the National Institute of General
Medical Sciences. The function of this center is to collect, curate, and
disseminate protein-ligand data sets of crystal structures, biological
binding affinities, and thermodynamic data to aid in the refinement of
docking and scoring methodologies. These data sets are to come from
in-house projects at the University of Michigan, other academic labs,
and most importantly from industrial, pharma sources. Part of our remit
is to augment the deposited data with synthesis, crystallography, and
assays to expand the range of properties, binding affinities, and other
relevant characteristics involved in docking and scoring. Here, we
present CSAR's capabilities and summarize our current in-house project
and potential future targets. We also outline the creation of a dataset
(based on the PDB, Binding MOAD, and PDBbind) used in our first
community-wide benchmark exercise.
![]()
121 - Results of CSAR's 2010 Benchmark Exercise
Dr. James B. Dunbar, Dr. Richard D. Smith, Prof. Heather
A. Carlson. Department of Medicinal Chemistry, University of
Michigan, Ann Arbor, Ann Arbor, MI, United States
The goal of CSAR's Benchmark Exercises is not to declare winners and
losers! Instead, we combine the results of all participants to provide a
wider assessment of the field. Here, we present an analysis of which
protein-ligand complexes score poorly across the majority of submissions
(“globally bad” complexes) and compare their properties to the set of
complexes that score well across the majority of methods (“globally
good”). It may be tempting to draw conclusions by simply examining the
characteristics of the globally bad set, but those characteristics must
be rarely observed in the globally good set to gain true insight.
Lastly, each participant was asked to submit a standard method and an
alternative approach. Several groups showed that the correlation to
experiment was the same for vdw/fit-based scores as for full scoring
functions that included electrostatics and hydrogen bonding. To help the
field overcome this limitation, CSAR will focus on creating datasets
that provide a range of hydrogen-bonding characteristics. The
overarching goal of our benchmark exercises is to provide insight into
what data is most needed to move our field ahead.
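The "globally bad" and "globally good" split described above can be illustrated with a small sketch. The criterion used here is an assumption made for illustration only: a complex counts as poorly scored by a method when its absolute prediction error exceeds a cutoff, and "majority" means more than half of the methods. CSAR's actual definition may differ, and the function name is hypothetical.

```python
def split_globally_bad_good(abs_errors_by_method, cutoff):
    """Classify complexes scored by every method.

    abs_errors_by_method: {method_name: {complex_id: absolute error}}
    Returns (globally_bad, globally_good) as two sets of complex ids.
    """
    if not abs_errors_by_method:
        return set(), set()
    n_methods = len(abs_errors_by_method)
    # Only complexes that every method scored can be compared fairly.
    shared = set.intersection(
        *(set(errors) for errors in abs_errors_by_method.values())
    )
    bad, good = set(), set()
    for complex_id in shared:
        n_poor = sum(
            1
            for errors in abs_errors_by_method.values()
            if errors[complex_id] > cutoff
        )
        if 2 * n_poor > n_methods:
            bad.add(complex_id)
        elif 2 * (n_methods - n_poor) > n_methods:
            good.add(complex_id)
        # An exact tie (even number of methods) is left unclassified.
    return bad, good
```

A property is then informative only if it is common in the first set and rare in the second, which is the comparison the abstract argues for.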
![]()
122 - Scoring performance of eHiTS on the CSAR dataset
Zsolt Zsoldos PhD, Orr Ravitz PhD. SimBioSys Inc.,
Toronto, Canada
Numerous studies have pointed out the inability of scoring functions
to perform uniformly well across all biological systems of interest.
Some studies suggest guidelines for choosing the best method for a
specific problem, others advocate consensus techniques.
An alternative solution is to tailor the scoring function for the system
of interest. eHiTS uses a novel scoring method consisting of statistical
knowledge focused on interacting surface points and physical terms
combined with an adaptive parameter scheme. During the automated tuning
of eHiTS-score, receptor targets are clustered according to the chemical
and shape similarity of the active site, and weight sets are optimized
for each family.
The performance of eHiTS on the CSAR dataset was evaluated using the
default parameters (pre-tuned on other data). In addition, the automatic
tuning utility was run on one subset of the CSAR data and tested on the
other. Results will be presented from both studies.
![]()
123 - Hydrophobic complementarity: A dominant term in
affinity and binding mode prediction
Dr. Leslie A. Kuhn, Matthew E. Tonero. Biochemistry &
Molecular Biology, Michigan State University, East Lansing, MI, United
States
Empirical scoring functions designed for high-throughput docking,
containing linear combinations of terms measuring protein-ligand
interactions, were tested for affinity prediction. Scoring functions
that best predicted affinity were dominated by hydrophobic or shape
complementarity terms. Similarly, a scoring function containing only
polar terms compensated for the absence of a hydrophobic term by heavily
weighting the polar term that correlated most with hydrophobic
complementarity. These results are consistent with Eisenberg &
McLachlan's observation that the solvation component of the change in
Gibbs free energy upon binding is proportional to the surface area and
degree of hydrophobicity of atoms buried in the interface. Scoring
functions that perform best at affinity prediction are not necessarily
optimal for binding mode prediction, though hydrophobic burial is
important in both. In other words, tuning scoring functions only to
predict the affinity of good ligands in the correct binding mode can
limit their applicability, suggesting a broader approach.
![]()
124 - Docking and scoring for 2010 CSAR benchmark using an improved
iterative knowledge-based scoring function with MDock
Sheng-You Huang, Xiaoqin Zou. Department of Physics,
Department of Biochemistry, Dalton Cardiovascular Research Center, Informatics
Institute, University of Missouri-Columbia, Columbia, MO, United States
Based on a physics-based iterative method (Huang & Zou, J. Comput. Chem.,
2006, 27, 1865-75; 1876-82), we have extracted a set of distance-dependent
all-atom potentials for protein-ligand interactions (ITScore2.0) using a large
training set of 1300 protein-ligand complexes. The iterative method circumvents
the long-standing reference state problem in traditional knowledge-based scoring
functions. ITScore2.0 has been tested with the 2010 CSAR dataset of 345 diverse
protein-ligand complexes, and achieved a correlation coefficient of 0.73 between
the calculated binding scores and experimental affinity data, compared to 0.58
for the van der Waals (VDW) scoring function and 0.32 for the force field (FF)
scoring function consisting of VDW and electrostatic terms. For rigid-ligand
docking, ITScore2.0 achieved a success rate of 86.7% in identifying native
binding modes, compared to 80.0% and 64.1% for FF and VDW. For flexible-ligand
docking, ITScore2.0 yielded a success rate of 79.7%, compared to 71.0% and 52.8%
for FF and VDW. The moderate performance of VDW suggests that VDW alone may
serve as a benchmark for evaluation of scoring functions. What we have learned
through participating in CSAR scoring will be shared.
![]()
145 - Lead Finder in the CSAR scoring challenge
Victor Stroylov MD, Dr Ghermes Chilov, Dr Oleg
Stroganov, Fedor Novikov, Val Kulkov MD, MBA. "Molecular Technologies",
Ltd, Moscow, Russian Federation; BioMolTech, Corp., Toronto, Ontario,
Canada
Lead Finder is a specialized software package for ligand docking,
binding energy evaluation and virtual screening. The standard approach
in estimation of binding affinities of protein-ligand complexes of the
CSAR test set was the use of Lead Finder v.1.1.14 scoring mode that
estimates free energy of protein-ligand binding for the fixed ligand
coordinates for each protein-ligand complex. No pre-optimization of
either protein or ligand structures was performed.
The improvements in the scoring protocol included corrections of
protein's and ligand's protonation states, positions of functional
hydrogen atoms (for proteins only), and local geometry of nitrogen atoms
(for ligands only). No other changes were made to Lead Finder's standard
scoring function.
The RMSD of estimated vs experimentally obtained protein-ligand binding
energies was found to be equal to 2.07 kcal/mol and 1.98 kcal/mol for
the standard and improved protocols, respectively.
![]()
146 - Benchmark of solvated interaction energy (SIE) scoring
function on the CSAR-2010 dataset
Traian Sulea, Qizhi Cui, Herve Hogues, Christopher R
Corbeil, Enrico O Purisima. Biotechnology Research Institute, National
Research Council Canada, Montreal, QC, Canada
Solvated interaction energy (SIE) is a first-principle function for
predicting absolute binding affinities from force-field non-bonded
terms, continuum solvation, and scaling for configurational entropy.
Standard SIE parametrization applied to the CSAR dataset with binding
interfaces refined by constrained minimization predicted absolute
affinities with 2.5 kcal/mol mean-unsigned-error, but with correlation
outperformed by buried surface or van der Waals interaction alone.
Re-training SIE on CSAR subsets led to increased solute dielectric and
reduced electrostatic interactions, stressing the weak signal carried by
calculated electrostatics in this heterogeneous dataset. Overestimated
complexes implicate highly negatively-charged ligands interacting via
metals. Underestimated outliers reveal alternate protonation states that
significantly improve SIE predictions. In an upgraded version of the
CSAR dataset with reassigned protonation states, 10% of ligands and 20%
of proteins are affected. Among other investigated aspects are the
sensitivity to polar-hydrogen orientation, incorporation of
MD-generated ensembles, different solvent models and entropy estimates,
and ligand strain.
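The benchmark abstracts in this session report mean unsigned error, RMSD, and correlation with experiment. A minimal sketch of these three standard metrics follows, assuming predicted and experimental affinities are paired lists in the same units; the function names are illustrative.

```python
import math


def _check_pairs(predicted, experimental):
    if len(predicted) != len(experimental) or not predicted:
        raise ValueError("need two non-empty sequences of equal length")


def mean_unsigned_error(predicted, experimental):
    """Average absolute deviation between prediction and experiment."""
    _check_pairs(predicted, experimental)
    return sum(abs(p - e) for p, e in zip(predicted, experimental)) / len(predicted)


def rmsd(predicted, experimental):
    """Root-mean-square deviation between prediction and experiment."""
    _check_pairs(predicted, experimental)
    squared = sum((p - e) ** 2 for p, e in zip(predicted, experimental))
    return math.sqrt(squared / len(predicted))


def pearson_r(predicted, experimental):
    """Pearson correlation coefficient; undefined if either series is constant."""
    _check_pairs(predicted, experimental)
    n = len(predicted)
    mean_p = sum(predicted) / n
    mean_e = sum(experimental) / n
    cov = sum((p - mean_p) * (e - mean_e) for p, e in zip(predicted, experimental))
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    var_e = sum((e - mean_e) ** 2 for e in experimental)
    if var_p == 0 or var_e == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / math.sqrt(var_p * var_e)
```

The error metrics and the correlation answer different questions: a function can rank complexes well (high correlation) while being systematically offset (large error), and the reverse can also happen on a dataset with a narrow affinity range.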
![]()
147 - Protonation states and scoring receptor-ligand poses:
It's always the details
Emilio Xavier Esposito PhD. exeResearch LLC, East
Lansing, Michigan, United States
The protonation state of the receptor - ligand complex has a large
influence over the correct approximation of the binding interactions.
Using the CSAR dataset, various methods of assigning the complex's
protonation state are used to explore the abilities of several scoring
functions with respect to protonation state. In conjunction with the
complex's protonation state, the 'standard' protocols employed to
prepare a receptor for a docking simulation, along with the post-dock
refinement of poses, are explored.
![]()
148 - Role of active-site solvent in protein-ligand binding affinity
calculations
Dr. Ye Che, Dr. Veerabahu Shanmugasundaram. Groton Structural
Biology, Antibacterials Chemistry/Discovery Technologies, Pfizer
PharmaTherapeutics Research & Development, Groton, CT, United States
Accurate methods for computing binding affinities of a small molecule to a
protein are needed to speed the discovery and optimization of new medicines. An
assessment of six scoring functions commonly applied at Pfizer using the CSAR
(Community Structure-Activity Resource) set of protein-ligand complexes will be
presented. A current weakness amongst these various scoring functions is the
treatment of active-site water molecules. Here, we quantitatively estimate the
thermodynamic properties of active-site water molecules and capture the effects
of solvent displacement from the protein active site. Water inclusion shows
promise in improving current scoring functions and we propose that this could be
used more extensively in virtual screening and lead optimization applications.
![]()
149 - Flexible docking using a stochastic rotamer library of
ligands
Dr. Feng Ding, Dr. Shuangye Yin, Prof. Nikolay V.
Dokholyan. Biochemistry and Biophysics, University of North Carolina
at Chapel Hill, Chapel Hill, NC, United States
Uncovering structures of molecular complexes via computational docking
is at the heart of many structural modeling efforts and virtual drug
screening. Modeling both receptor and ligand flexibility is important in
order to capture receptor conformational changes induced by ligand
binding, but is a major challenge in computational drug discovery. Many
flexible docking approaches model the ligand and receptor flexibility
either separately or in a loosely-coupled manner, which captures the
conformational changes inefficiently. Here, we propose a truly flexible
docking approach, MedusaDock, which models both ligand and receptor
flexibility simultaneously using sets of discrete rotamers. We developed
an algorithm which allows for the building of the ligand rotamer library
“on the fly” during docking simulations. MedusaDock benchmarks
demonstrate a rapid sampling efficiency and high prediction accuracy in
both self-docking (to the co-crystallized state) and cross-docking (to a
state co-crystallized with a different ligand), the latter of which
mimics the virtual screening procedure in computational drug discovery.
We also perform a virtual-screening test for a flexible protein target,
cyclin-dependent kinase 2. We find a significant improvement in virtual
screening enrichment when compared to rigid-receptor methods. The high
predictive power of MedusaDock comes from several innovations, including
the generation of a stochastic rotamer library of ligands, the efficient
docking protocol, and the novel ligand pose-ranking method. We expect a
broad adoption of these methodologies and the application of MedusaDock
in ligand-receptor interaction predictions and drug discovery.
![]()
150 - Cheminformatics meets molecular mechanics: A combined
application of knowledge based pose scoring and physical force
field-based hit scoring functions improves the accuracy of virtual
screening
Jui-Hua Hsieh, Shuangye Yin, Xiang S. Wang, Shubin Liu,
Nikolay V. Dokholyan, Alexander Tropsha. University of North
Carolina at Chapel Hill, United States
Many scoring functions fail to discriminate between true binders and
non-binders (binding decoys), leading to a large number of false
positive hits in virtual screening (VS) studies. We have developed a
novel binary QSAR-like approach that discriminates geometrical pose
decoys from native-like poses for each ligand. We have applied it for
filtering (presumed) decoy poses from a library of docked ligand
conformations followed by scoring the remaining poses with the
MedusaScore physical force field-based scoring. We have demonstrated
that this pre-filtering affords a significant improvement of hit rates
in virtual screening studies for 5 of the 6 benchmark sets from the
Database of Useful Decoys (DUD). Moreover, the top 10 hits in these 5
sets were found to include chemically diverse ligands while yielding
high true positive rates (60-100%). We will discuss the methodology as
well as the results of applying this approach to CSAR datasets.
![]()
151 - Application of free energy methods to water molecules
in protein binding sites
Prof. Jonathan W. Essex D.Phil., Dr Caterina
Barillari PhD, Mr Michael Bodnarchuk, Dr Russell Viner PhD. School of
Chemistry, University of Southampton, Southampton, Hampshire, United
Kingdom; Jealott’s Hill International Research Centre, Syngenta,
Bracknell, United Kingdom
Water molecules play a crucial role in mediating the interaction between
a ligand and a macromolecular receptor. An understanding of the nature
and role of each water molecule in the active site of a protein could
improve the efficiency of rational drug design approaches. In this presentation, a
range of different simulation methods, including double decoupling with
replica exchange thermodynamic integration, Grand-Canonical Monte Carlo,
and JAWS, are used to calculate the absolute binding free energies of a
number of water molecules in protein-ligand complexes. The relative
merits of each of these methods are discussed. In addition, the
development of a number of descriptor-based QSAR models for calculating
water binding free energies is described, with a view to reducing the
need for expensive free energy simulations.
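Thermodynamic integration, one of the methods named above, estimates a free energy change as the integral over the coupling parameter lambda of the ensemble average of dU/dlambda. The sketch below shows only the final numerical quadrature, assuming the per-window averages have already been obtained from simulation; it uses a plain trapezoidal rule, and the function name is hypothetical.

```python
def thermodynamic_integration(lambdas, mean_du_dlambda):
    """Trapezoidal estimate of the integral of <dU/dlambda> over lambda.

    lambdas         : increasing coupling-parameter values, e.g. from 0 to 1
    mean_du_dlambda : ensemble average of dU/dlambda at each lambda value
    """
    if len(lambdas) != len(mean_du_dlambda) or len(lambdas) < 2:
        raise ValueError("need at least two matching lambda windows")
    total = 0.0
    for i in range(len(lambdas) - 1):
        width = lambdas[i + 1] - lambdas[i]
        if width <= 0:
            raise ValueError("lambda values must be strictly increasing")
        total += 0.5 * width * (mean_du_dlambda[i] + mean_du_dlambda[i + 1])
    return total
```

For an absolute water binding free energy, the decoupling result from the bound state would still need to be combined with the corresponding bulk-water leg and a standard-state correction; those steps are outside this sketch.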
![]()
152 - Which waters are important and how do we
identify them?
Dr Simon Bowden, Dr Jason C Cole, Dr Oliver
Korb, Dr Tjelvar Olsson, Dr John Liebeschuetz, Dr Colin Groom.
Cambridge Crystallographic Data Centre, Cambridge, United Kingdom
The important role waters play in ligand binding both in terms of
thermodynamics and selectivity is well known but identifying which
waters are important for the success of a docking experiment is still
difficult. Given that consideration of waters involved in primary and
secondary mediated protein-ligand contacts has been shown to improve
success rates in both native docking and virtual screening,
experimenters need tools to help them decide which waters are important
and which are not even real.
In this talk we will describe tools which may be of use to identify
important waters and to highlight dubious waters. Conserved water
structures can also be identified which may have an important influence
on ligand binding. The effect of this information when applied to
molecular docking will be demonstrated.
![]()
153 - Free energies and entropies of water molecules at
protein-ligand interfaces
Prof. Steve W Rick PhD, Mr. Hongtao Yu. Chemistry,
University of New Orleans, New Orleans, LA, United States
Water molecules are commonly found at the protein-ligand interface. The
thermodynamics of these water molecules plays an important role in
ligand affinity. In particular, the entropic cost of localizing a water
molecule at the binding site can be significant. From the database of
crystal structures, it is evident that the local environments of water
molecules at the protein-ligand interface can vary considerably. We use
molecular dynamics simulations and thermodynamic integration to calculate
the free energy, enthalpy, and entropy changes associated with
localizing a water molecule at a wide variety of sites at protein-ligand
interfaces. Results analyzing how the free energies, enthalpies, and
entropies depend on the details of the local environment, including the
number of hydrogen bonds and the cavity size, will be presented.
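As an illustration of the thermodynamic integration step mentioned above, the following is a minimal sketch, not the authors' protocol. It assumes that ensemble averages of dU/dlambda are already available at a set of lambda values (the function and variable names are hypothetical) and integrates them with the trapezoidal rule. The entropic term then follows from the standard relation dG = dH - T*dS.

```python
def ti_free_energy(lambdas, dudl_means):
    """Trapezoidal estimate of the integral of <dU/dlambda> over lambda.

    lambdas    -- increasing coupling-parameter values, e.g. from 0 to 1
    dudl_means -- ensemble average of dU/dlambda at each lambda (assumed given)
    """
    if len(lambdas) != len(dudl_means) or len(lambdas) < 2:
        raise ValueError("need at least two matching lambda/average pairs")
    total = 0.0
    for i in range(1, len(lambdas)):
        width = lambdas[i] - lambdas[i - 1]
        total += 0.5 * width * (dudl_means[i] + dudl_means[i - 1])
    return total


def entropy_term(delta_g, delta_h):
    """Return -T*dS, obtained by rearranging dG = dH - T*dS."""
    return delta_g - delta_h
```

In practice the averages would come from separate simulations at each lambda window, and the spacing of the windows controls the quadrature error.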
![]()
154 - Role of water molecules in docking studies of
Cytochromes P450
Dr. Chris Oostenbrink. Institute of Molecular
Modeling and Simulation, BOKU University, Vienna, Austria; Chemistry and
Pharmaceutical Sciences, VU University, Amsterdam, The Netherlands
Active-site water molecules form an important component in biological
systems, facilitating promiscuous binding or an increase in specificity
and affinity. Taking water molecules into account in computational
approaches to drug design or site-of-metabolism prediction is far from
straightforward. The effect of including water molecules in molecular
docking simulations of metabolic Cytochrome P450 enzymes is
investigated, focusing on pose prediction, virtual screening and free
energy estimates. The structure and dynamics of water molecules that are
present in the active site simultaneously with selected ligands are
described. The transferability of hydration sites between different
ligands is investigated. The role of water molecules appears to be very
dependent on the protein conformation and the substrate, further
enhancing the versatility of these metabolic enzymes.
![]()
155 - Modeling explicit waters in docking and scoring
Dr. Niu Huang. National Institute of Biological Sciences,
Beijing, Beijing, China
Water molecules play an important role in protein-ligand recognition. However,
incorporating explicit waters during docking is challenging in both the sampling
and scoring aspects. We explored a method to switch ordered water molecules “on”
(retained) and “off” (displaced) during docking screens. This method assumes
additivity and scales linearly with the number of waters sampled despite the
exponential growth in configurations. We tested this approach for ligand
enrichment in screens of a large compound database against 24 DUD targets,
exploring up to 8 waters in 256 configurations. Compared to calculations where
the water positions were not sampled, enrichment factors increase substantially
for 12 of the targets and are largely unaffected for most others. However, in
our previous study, the positions of the water molecules were obtained from the
x-ray structures, and all waters were treated as equally displaceable without
the consideration of the differential energy of water binding. Our recent work
in improving the treatment of waters during docking and scoring will be
presented.
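The linear scaling claimed above follows directly from the additivity assumption, and can be sketched as below. This is an illustrative example only, not the authors' docking code: the per-water scores are hypothetical, and lower scores are assumed to be better. If each water contributes independently, the best of the 2^N on/off configurations is found by choosing the better state for each water separately, which a brute-force enumeration confirms.

```python
from itertools import product


def best_additive(on_scores, off_scores):
    """Pick the better state per water independently: O(N) work."""
    states = []
    total = 0.0
    for on, off in zip(on_scores, off_scores):
        if on <= off:          # water retained ("on")
            states.append(True)
            total += on
        else:                  # water displaced ("off")
            states.append(False)
            total += off
    return total, tuple(states)


def best_bruteforce(on_scores, off_scores):
    """Enumerate all 2^N on/off configurations: O(2^N) work."""
    best = None
    for states in product((True, False), repeat=len(on_scores)):
        total = sum(on if s else off
                    for s, on, off in zip(states, on_scores, off_scores))
        if best is None or total < best[0]:
            best = (total, states)
    return best
```

For 8 waters the brute-force search visits 256 configurations, while the additive shortcut inspects only 8 pairs of scores.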
![]()
156 - Desolvation/resolvation: A revolving door that controls
the rates of association/dissociation of protein-ligand complexes?
Analysis of PCSK9-EGF-A binding kinetics using WaterMap
Dr. Robert A. Pearlstein Ph.D., Dr. Qi-Ying Hu Ph.D.,
Dr. Jing Zhou Ph.D., Dr. David Yowe Ph.D., Dr. Julian Levell Ph.D.,
Bethany Dale, Virendar Kaushik, Dr. Doug Daniels Ph.D., Susan Hanrahan,
Dr. Woody Sherman Ph.D., Dr. Robert Abel Ph.D.. Novartis Institutes for
BioMedical Research, Cambridge, MA, United States; Schrodinger, Inc.,
New York, NY, United States
We hypothesize that desolvation and resolvation processes can constitute
rate-determining steps for protein-ligand association and dissociation,
respectively. We tested this hypothesis using proprotein convertase
subtilisin-kexin type 9 (PCSK9) bound to the epidermal growth
factor-like repeat A (EGF-A) of low density lipoprotein cholesterol
receptor (LDL-R). We analyzed and compared predicted desolvation
properties of wild-type vs. gain-of-function mutant Asp374Tyr PCSK9
using WaterMap, a new method for calculating preferred locations and
thermodynamic properties of water solvating proteins (“hydration
sites”). We propose that the fast k_on and entropically driven
thermodynamics observed for PCSK9-EGF-A binding are due to functional
replacement of water occupying stable PCSK9 hydration sites (exchange of
water for polar EGF-A groups). We further propose that the relatively fast k_off
observed for EGF-A unbinding results from limited displacement of
unstable water. The slower k_off observed for EGF-A and LDL-R
unbinding from Asp374Tyr PCSK9 may be due to destabilizing effects of
this mutation on PCSK9 hydration sites.
![]()
157 - Biophysics-based library design: Discovery of “non-acid”
inhibitors of S1 DHFR
Veerabahu Shanmugasundaram, Kris Borzilleri, Jeanne Chang,
Boris Chrunyk, Mark E Flanagan, Seungil Han, Melissa Harris, Brian Lacey,
Richard Miller, Parag Sahasrabudhe, Ron Sarver, Holly Soutter, Jane Withka.
Groton Structural Biology, Antibacterials Chemistry/Discovery Technologies,
Pfizer PharmaTherapeutics Research & Development, Groton, CT, United States;
AntiBacterials Chemistry, Pfizer PharmaTherapeutics Research & Development,
Groton, CT, United States; AntiBacterials Research Unit, Pfizer
PharmaTherapeutics Research & Development, Groton, CT, United States
Methicillin-resistant Staphylococcus aureus (MRSA), the causative agent of many
serious nosocomial and community acquired infections, and other gram-positive
organisms can show resistance to trimethoprim (TMP) through mutation of the
chromosomal gene or acquisition of an alternative DHFR termed "S1 DHFR". To
develop new therapies for health threats such as MRSA, it is important to
understand the molecular basis of TMP resistance and use that knowledge to
design and develop novel inhibitors that are effective against S1 DHFR. This
presentation will highlight and illustrate an effort using a multi-pronged
biophysics-based strategy that utilizes NMR, thermodynamic, kinetic, structural,
computational and medicinal chemistry information in developing an understanding
of the mechanism of resistance in S1 DHFR as well as using this prospectively in
drug discovery. Specifically this presentation will illustrate computational
studies using WaterMap (WM) that developed an understanding of a key element of
the mechanism of resistance that was supported by a variety of biophysical
experiments and use of these WM calculations in a prospective fashion in library
design.
![]()
170 - Computational evaluation of tautomers and zwitterions
of D-amino acid oxidase (DAAO) inhibitors
Scot Mente. Neuroscience Chemistry, Pfizer Global
Research and Development, Groton, CT, United States
Quantum mechanical calculations and molecular docking were used to
design novel inhibitors of D-amino acid oxidase (DAAO). Using available
x-ray structural information and simple tautomer enumeration tools,
reasonable docked poses of a set of small ligands have been obtained.
Use of these tools has helped lead to the optimization of the novel
non-acidic 3-hydroxyquinolin-2(1H)-one series (I), as well as the
identification of structurally similar 3-hydroxyquinoline (II) and
benzotriazole (III). Despite their small sizes, all three of these
molecular scaffolds are capable of adopting multiple tautomer or
zwitterionic states. The ability to accurately predict these states with
quantum mechanical methods will be discussed.

![]()
171 - Defining states of ionization and tautomerization of thiamin
diphosphate at individual reaction intermediates on enzymes: Enzymes that use a
rare tautomeric form
Prof. Frank Jordan PhD, Dr. Natalia S. Nemeria PhD, Mr. Anand
Balakrishnan, Mr. Siakumar Paramasivam, Prof. Tatyana Polenova PhD. Chemistry,
Rutgers University, Newark, NJ, United States; Chemistry and Biochemistry,
University of Delaware, Newark, DE, United States
The author and coworkers demonstrated on several thiamin diphosphate (ThDP)
enzymes that the 1',4'-iminopyrimidine tautomer of ThDP participates at several
reaction steps. Hence, ThDP has a dual function: as an electrophilic covalent
catalyst (a function long accepted) and as an acid-base catalyst facilitating the
ionization of the weak carbon acid to generate the C2 ylide.
It is proposed that ThDP exists in these forms on enzymes: the N1'-protonated
4-aminopyrimidinium (APH+) in protolytic equilibrium with its three conjugate
bases, the canonical 4-aminopyrimidine (AP), its 1',4'-iminopyrimidine (IP)
tautomeric form, and the C2 carbanion or ylide (Yl). The first three forms have
been observed on multiple enzymes in the absence of substrate. In the presence
of substrate and analogs, the IP form has been seen on several enzymes along
with the APH+ state. Circular dichroism and solid-state NMR methods are being
used for the first time to characterize different species. Supported by
NIH-GM-050380 and 5P20RR017716.
![]()
172 - Do tautomers matter in calculating molecular
similarity?
Dr. Steven W Muchmore PhD, Isabella Haight,
Dr. Scott Brown. Cheminformatics, Abbott Laboratories, Abbott Park,
IL, United States
Compounds that have multiple tautomeric forms, which typically account
for about 25% of pharmaceutical company corporate collections, present a
challenge in cheminformatic analysis. While widely recognized, their
manipulations are often ignored in database registration, substructure
searching and similarity searching due to incremental increases in
computation time and data management. However, clustering and diversity
selection, which are based on similarity calculations, could yield erratic
results if they include or exclude molecules that happen to be encoded as
different tautomers. We enumerated tautomers for a data set of more than
66,000 compound pairs with associated activity against protein targets used
in the assessment of similarity programs (Muchmore et al. J. Chem. Inf.
Model. 2008, 48, 941). The similarity value for the highest scoring tautomer pair was
compared to the original data to determine if its similarity score
increased. These tautomer similarity values were also applied to
single representation results to determine if tautomer enumeration would
yield a better estimate of the probability that two compounds will be
equipotent.
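The comparison described above, taking the highest-scoring tautomer pair for two compounds, can be sketched as follows. This is a minimal illustration under stated assumptions: fingerprints are represented as plain Python sets of hypothetical feature identifiers, and Tanimoto similarity stands in for whichever similarity programs were actually assessed.

```python
def tanimoto(a, b):
    """Tanimoto coefficient between two feature sets."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def max_tautomer_similarity(tautomers_a, tautomers_b):
    """Highest similarity over all pairs of enumerated tautomers.

    tautomers_a, tautomers_b -- lists of feature sets, one per tautomer
    (the enumeration itself is assumed to be done elsewhere).
    """
    return max(tanimoto(x, y) for x in tautomers_a for y in tautomers_b)
```

Comparing this maximum with the similarity of the originally registered forms shows how much a single stored representation can understate the similarity of two compounds.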
![]()
173 - Automated prediction of tautomeric states in
protein-ligand complexes
Sascha Urbaczek, Stefan Bietz, Prof. Dr. Mathias
Rarey. Center for Bioinformatics, University of Hamburg, Hamburg,
Hamburg, Germany
Hydrogen bonding plays a major role in the stabilization of
protein-ligand complexes. Unfortunately, the positions of hydrogen atoms
are not resolved in most structures present in the PDB. This makes it
particularly hard to predict adequate tautomeric and protonation states
for the atoms and groups involved in the binding. To overcome this
difficulty many approaches have been developed to predict the correct
protonation of either the ligand or the protein separately using a
variety of different methodologies. We present a new method that
predicts the tautomeric and protonation states as well as the resulting
hydrogen atom positions of both the protein and the ligand
simultaneously. The optimization of these states is based on an
empirical scoring scheme used also in docking methods. Assuming an
optimal hydrogen bonding network, the obtained results indicate that the
most stable tautomeric forms in solution do not always correspond to
those found in binding modes.
![]()
174 - Predicting relative binding affinities in the CSAR
Scoring Challenge
Prof. Matthew P Jacobson, Dr. Chakrapani Kalyanaraman.
Department of Pharmaceutical Chemistry, UCSF, San Francisco, CA, United
States
We have been interested in evaluating whether all-atom force fields
combined with implicit solvent models can also be used as a docking
scoring function. Our prior experience has suggested that such energy
functions can be used for, at best, predicting relative binding
affinities to a particular binding site, with the best results being
achieved for chemically related compounds, such as congeneric series
generated in lead optimization. Thus, although predicting absolute
binding affinities is a noble challenge, we have not attempted to do so
in the CSAR exercise. Instead, with the assistance of the organizers, we
focused on series of compounds bound to the same target. The results
using the protein-ligand structures as provided showed essentially no
ability to rank order compounds by binding affinity. However, complete
energy minimization, and in some cases correcting protonation states,
significantly improved the results, to the point where there was some
ability to distinguish more potent from less potent compounds, as we
have also shown in other work on congeneric series. I will also discuss
our attempts to characterize and correct some of the many limitations of
this simple scoring scheme.
![]()
175 - Surflex: Docking and scoring on CSAR
Prof. Ajay N Jain PhD. Bioengineering and Therapeutic
Sciences, UCSF, San Francisco, CA, United States
One of the most challenging aspects of structure-based drug design is
binding affinity prediction, since it embeds both the pose determination
problem as well as requiring accuracy in estimation of energetic
contributions where differences on the order of 1 kcal are large enough
to matter. Even in the artificial case where a bound ligand/target
structure is known, this remains a challenging problem. We present
results for the Surflex family of methods for making predictions on the
CSAR 2010 benchmark data set. Results will include straight
docking-based pose prediction and scoring, tuned scoring approaches
through scoring function optimization and protein structure
optimization, and ligand-based approaches.
![]()
176 - What we can learn from very large panel docking screens
Kong T Nguyen, John J Irwin, Brian K Shoichet,
Michael M Mysinger. Department of Pharmaceutical Chemistry, University
of California San Francisco, San Francisco, CA, United States
Whereas molecular docking is the most practical way to leverage
structure for ligand discovery, the method retains important weaknesses.
Among the more confounding problems is that docking can work well on one
target yet fail completely on the next, and predicting in advance which
will succeed or fail is challenging. To investigate the strengths and
weaknesses of docking we have assembled a very large panel of
experimental information with which to test it. We have used our
automated docking program, DOCK Blaster [1], to study the
performance of DOCK 3.5.54 against many protein targets for which
experimental control information is available [2]. We have
focused on two of the seven stated goals of the 2010 CSAR Workshop: to
provide a baseline assessment of current scoring functions and to
document which targets are most difficult. This approach has enabled us
to comprehensively test the effect of changes in sampling, scoring and
library composition.
References
1. Irwin, J.J. et al. Automated docking screens: a feasibility study.
J Med Chem 52, 5712-20 (2009).
2. Overington, J. ChEMBL. An interview with John Overington, team
leader, chemogenomics at the European Bioinformatics Institute
Outstation of the European Molecular Biology Laboratory (EMBL-EBI).
Interview by Wendy A. Warr. J Comput Aided Mol Des 23,
195-8 (2009).
![]()
177 - Docking and scoring of fragments
Dr Marcel L Verdonk PhD. Astex Therapeutics Ltd, Cambridge,
United Kingdom
Through the application of fragment-based drug discovery, Astex have produced
>1,400 in-house X-ray crystal structures of fragments and >2,500 structures of
lead-like compounds against a range of drug targets. From this wealth of
structural data, we have constructed two test sets, each containing ~100
complexes, representing 10 drug targets. In the first test set the ligands are
fragments, whereas in the second test set the ligands are lead-like compounds.
By applying docking and virtual screening on these sets, we will discuss whether
fragments are harder to dock and score than larger compounds, and present our
latest experiences on docking and scoring fragments. In addition, we will show
how structural data on fragments obtained early on in drug discovery projects
can be used to improve docking and scoring during the hit-to-lead phases.
Finally, we will show examples of the application of docking and scoring of
fragments on actual drug discovery programs.
![]()
217 - Molecular dynamics studies of water-protein interactions
Gerhard Hummer, Jayendran C. Rasaiah, Hao Yin, Guogang Feng.
Laboratory of Chemical Physics, National Institutes of Health, Bethesda, MD,
United States; Department of Chemistry, University of Maine, Orono, ME, United
States
We use molecular dynamics simulations to study the interaction of water with
proteins. With the help of a semi-grand canonical formalism, we determine the
structure, dynamics, and thermodynamics of water in the protein interior and at
buried sites. We find that water filling of weakly polar protein cavities from
the solvent is governed by a subtle balance between the loss in bulk hydrogen
bond interactions, the gain in strong hydrogen-bond interactions between
confined water molecules, weakly attractive interactions between water and the
cavity, and the entropic gain from filling a void space. The simulation results
will be compared to X-ray crystallography and NMR experiments. The effects of
interfacial and cavity water on protein function and ligand binding will be
discussed.
![]()
218 - Addressing limitations with the MM-GB/SA scoring
procedure using the WaterMap method and free-energy perturbation
calculations
Dr. Cristiano R. W. Guimaraes. CVMD Chemistry,
PharmaTherapeutics Research and Development, Pfizer, Inc., Groton,
Connecticut, United States
The MM-GB/SA scoring technique has become an important computational
approach in lead optimization. Despite showing good accuracy, much work
is necessary before the method can be applied to rank multiple chemical
series. Here, we investigate the poor estimation of protein desolvation
provided by GB/SA and the large dynamic range in the MM-GB/SA scoring
compared to that of the experimental data. In the former, replacing the
GB/SA protein desolvation by the WaterMap free energy liberation of
binding-site waters provides the best results. However, the improvement
is modest over results obtained with the MM-GB/SA and WaterMap methods
individually, apparently due to the high correlation between the free
energy liberation and protein-ligand van der Waals interactions. As for
the large dynamic range, comparisons between MM-GB/SA and FEP
calculations indicate that it has its origin in the lack of dynamical
screening of protein-ligand electrostatic interactions and the
incomplete description of enthalpy-entropy
compensation effects.
![]()
219 - Prediction of potency of protease inhibitors by GBSA
simulations with polarizable quantum mechanics-based ligand charges and
a hybrid water model
Dr. Debananda Das, Dr. Hiroaki Mitsuya, Dr. Yasuhiro
Koh, Yasushi Tojo, Dr. Arun Ghosh. HIV and AIDS Malignancy Branch,
National Cancer Institute, Bethesda, MD, United States; Departments of
Hematology and Infectious Diseases, Kumamoto University Graduate School
of Medical and Pharmaceutical Sciences, Kumamoto, Japan; Departments of
Chemistry and Medicinal Chemistry, Purdue University, West Lafayette,
Indiana, United States
Reliable and robust prediction of binding affinity for drug molecules
continues to be a daunting challenge. We have simulated the binding
interactions and free energy of binding of several protease inhibitors
(PIs) with wild-type and various mutant proteases by performing GBSA
simulations, in which each PI's partial charge was determined by quantum
mechanics and the partial charge accounts for the polarization induced
by the protease environment. We employed a hybrid solvation model that
retains selected explicit water molecules in the protein with surface
generalized Born implicit solvent. We examined the correlation of the
free energy with antiviral potency of PIs. The free energy showed a
strong correlation with experimentally determined anti-HIV-1 potency.
The present data suggest that the presence of selected explicit water in
protein, and protein polarization-induced quantum charges for the
inhibitor, compared to lack of explicit water and a static force
field-based charge model, can serve as an improved lead optimization
tool, and warrants further exploration.
![]()
220 - Continuum theory and the analysis of active sites
Dr. Anthony Nicholls PhD, Dr. Mike Word. Department
of Research and Development, OpenEye Scientific Software, Inc, Santa Fe,
NM, United States
Continuum theory for electrostatic free energies at the molecular level
was never supposed to work: water is discrete and the very idea of
treating its properties as a mean field was considered inappropriate.
Yet Poisson-Boltzmann (PB) theory continues to perform as well as, if
not better than, explicit water treatments in the estimation of small
molecule solvation or macromolecular biophysics. However, it is still
assumed PB will fail to correctly describe the physics of the active
sites of proteins. As this remains a focus for predictive drug
discovery, is this assumption correct? And if it is, can we improve
continuum theory by going beyond the mean field limit, i.e. producing a
'virial' expansion of PB? This talk will cover our attempts to date and
the physical insight gained.
![]()
221 - Prediction of consistent water networks in uncomplexed
protein binding sites based on knowledge-based potentials
Michael Betz, Gerd Neudert, Professor Gerhard Klebe
PhD. Institute of Pharmaceutical Chemistry, Philipps-University Marburg,
Marburg, Germany
Within the active site of a protein, water fulfills a variety of
different roles. Solvation of hydrophilic parts stabilizes a distinct
protein conformation, whereas desolvation upon ligand binding may lead
to a gain of entropy. In an overwhelming number of cases, water
molecules mediate interactions between protein and the bound ligand.
Therefore, a reliable prediction of water molecules participating in
ligand binding is essential for docking and scoring, and is necessary to
develop strategies in ligand design. We require some reasonable
estimates about the free energy contributions of water to binding.
Useful parameters for such estimations are the total number of
displaceable water molecules and the probabilities for their
displacement upon ligand binding. These parameters depend on specific
interactions with the protein and other water molecules, and thus the
positions of individual water molecules.
The high flexibility of water networks makes it difficult to observe
distinct water molecules at well defined positions in structure
determinations. Thus, experimentally observed positions of water
molecules have to be assessed critically, bearing in mind that they
represent an average picture of a highly dynamic equilibrium ensemble.
Moreover, there are many structures with inconsistent and incomplete
water networks.
To address these deficiencies we developed a tool that predicts possible
configurations of complete water networks in binding pockets in a
consistent way. It is based on the well established knowledge-based
potentials implemented into DrugScore, which also allow for a reasonable
differentiation between "conserved" and "displaceable" water molecules.
The potentials used were derived specifically for water positions as
observed in small molecule crystal structures in the CSD.
To account for the flexibility and high intercorrelation we apply a
clique-based approach, resulting in water networks maximizing the total
DrugScore.
To incorporate as much known information as possible about a given
target, we also allow constraints defined by experimentally
observed water positions to be included.
Our tool provides a useful starting point whenever a possible
configuration of water molecules needs to be estimated in an uncomplexed
protein, and suggests their spatial positions and their classification
with respect to some kind of affinity prediction.
In first tests we were able to get classifications and positional
predictions which are in good agreement with crystallographically
observed water molecules with remarkably small deviations.
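The clique-based selection of a water network described above can be illustrated with a small sketch. It is not the authors' implementation: the candidate positions, the per-site scores (treated here as favorability values to be maximized), and the minimum separation of 2.5 are all hypothetical. Two candidate sites are taken to be compatible if they are at least the minimum distance apart, and the search enumerates cliques of mutually compatible sites to find the one with the highest total score.

```python
import math


def compatible(p, q, min_dist=2.5):
    """Two candidate water sites can coexist if they do not overlap."""
    return math.dist(p, q) >= min_dist


def best_water_network(sites, scores, min_dist=2.5):
    """Return (total score, site indices) of the best-scoring clique.

    Exhaustive enumeration; suitable only for small candidate sets.
    """
    n = len(sites)
    adj = [{j for j in range(n)
            if j != i and compatible(sites[i], sites[j], min_dist)}
           for i in range(n)]
    best = (0.0, ())

    def extend(clique, total, candidates):
        nonlocal best
        if total > best[0]:
            best = (total, tuple(clique))
        for v in sorted(candidates):
            # Keep only later-indexed sites compatible with every member,
            # so each clique is visited exactly once.
            rest = {u for u in candidates & adj[v] if u > v}
            extend(clique + [v], total + scores[v], rest)

    extend([], 0.0, set(range(n)))
    return best
```

Experimentally observed water positions could be imposed as constraints by seeding the initial clique with those sites, although how the actual tool handles this is not stated in the abstract.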
![]()
222 - Explicit-water modeling of a model protein-ligand
binding site predicts the non-classical hydrophobic effect
Demetri T. Moustakas PhD, Phil W Snyder PhD, Woody
Sherman PhD, Prof. George M Whitesides. Department of Infection,
Computational Sciences, AstraZeneca R&D Boston, Waltham, MA, United
States; Department of Chemistry and Chemical Biology, Harvard
University, Cambridge, MA, United States; Schrödinger, Inc., New York,
NY, United States
This work reports a study of the thermodynamics of hydrophobic
interactions between human carbonic anhydrase II and a series of
structurally analogous heteroaromatic sulfonamides. Isothermal titration
calorimetry (ITC) established that increasing the non-polar surface area
of the ligands resulted in a large enthalpy-dominated increase in
binding affinity, the so-called non-classical hydrophobic effect.
Subsequent X-ray crystallography studies reveal no significant changes
in protein-ligand interactions as a function of increasing the ligand
non-polar surface area, suggesting that solute-solvent interactions are
responsible for the observed thermodynamic effects. Modeling studies
using explicit solvent models suggest that the larger ligands alter both
the structure and thermodynamic characteristics of water molecules in
the binding site, which contributes significantly to the observed
non-classical hydrophobic effect.
![]()
223 - New coarse-grained model for water: The importance of
electrostatic interactions
Zhe Wu, Prof. Qiang Cui, Prof. Arun Yethiraj. Department of
Chemistry, UW Madison, Madison, WI, United States
A new coarse-grained (CG) model for water is developed based on the properties
of clusters of four water molecules in atomistic simulations. CG units interact
via a soft non-electrostatic interaction. Electrostatic interactions are
incorporated via three charged sites with the charges and model topology chosen
to reproduce the dipole moment and quadrupole moment tensor of 4-water clusters.
The parameters in the model are optimized to reproduce experimental data for the
compressibility, density, and permittivity of bulk water, and the surface
tension and interface potential for the air-water interface. This big multipole
water (BMW) model represents a qualitative improvement over existing CG water
models, e.g., it reproduces the dipole potential at the membrane-water interface
when compared to experiment, with modest additional computational cost.
![]()
359 - Introduction to cross pharma high performance computing forum
John C Morris MBA, Dr Zheng Yang. Massachusetts Research
Business Technology, Pfizer, Cambridge, MA, United States; Department of
Computational and Structural Chemistry, GlaxoSmithKline Pharmaceuticals,
Collegeville, PA, United States
High Performance Computing (HPC) within the pharmaceutical industry is a growing
and critical component of research due to the large scale analytical demands
driven by modern research methods and advancements in computational chemistry
and bioinformatics methods to model biological systems. HPC has become a
necessary capability to facilitate the analysis of the terabytes of scientific
data being generated from technologies such as Next Generation Sequencing,
modeling complex drug-target interaction, and statistical analysis. To support
the industrialization of scientific research, integrated and coordinated HPC
information technology tools, methods, and capabilities are needed. The Cross
Pharma HPC forum is a group of scientists, engineers, and key stakeholders
within the pharmaceutical industry working together to promote best practices,
coordinate activities, optimize methods, and leverage experience in the
non-competitive areas within HPC. In this talk, the history, current status, and
future directions of HPC in the pharmaceutical industry will be discussed.
![]()
360 - Applications and use of cloud computing in the
pharmaceutical industry
Dr. Michael D Miller PhD, David M Powers, Gregory
Stiegler, Dr Jeremy Martin M PhD. Research Business Technology, Pfizer,
Groton, CT, United States; Research and Development IT, Eli Lilly,
Indianapolis, Indiana, United States; Scientific Computing,
Bristol-Myers Squibb, Princeton, New Jersey, United States; System
Support Department, Information Technology, GlaxoSmithKline R&D Ltd,
Harlow, Essex, United Kingdom
Technological advances across the sciences have enabled basic drug
research with an unprecedented amount of data. As a result, the
application of computational methods is becoming an increasingly
important approach in drug discovery and development. The need for
increased computing capacity has reached the point where today it can
become rate limiting. As a result, pharmaceutical companies have begun
exploring the use of cloud computing to address these needs. We will
present on some of the challenges pharmaceutical companies have faced in
using cloud resources and the different approaches that have been taken
to address them.
![]()
361 - Current trends of high performance computing in Pharma
Dr. Stephen Litster, Dr. Jeremy Martin. NITAS
Scientific Computing, Novartis Institutes of BioMedical Research,
Cambridge, MA, United States; Department of System Support, Information
Technology, GlaxoSmithKline R&D Ltd, Harlow, Essex, United Kingdom
The world of high performance computing (HPC) has evolved quickly, as
exemplified by recent developments in hardware (e.g. Intel Nehalem
multi-core CPUs with integrated memory controller), software (e.g. NAMD,
a highly scalable molecular dynamics program), computing services (e.g.
cloud computing), and storage (TB+ scale file systems). Given these
recent developments and much lower cost of entry into HPC, Pharma based
Scientific Computing groups are beginning to apply traditional HPC
techniques to “non-traditional” (e.g. High Content Screening) and
emerging areas of research (e.g. Next Generation Sequencing).
We present here a number of case studies highlighting the current trends
of HPC in the pharmaceutical industry and its impact on scientific
workflows.
![]()
362 - Challenges of HPC and collaboration opportunities in
Pharma
Robert Stansfield PhD, MBA, Michael D Miller PhD.
R&D Information Solutions, sanofi-aventis U.S., Bridgewater, NJ, United
States; Research Business Technologies, Pfizer, Groton, CT, United
States
High Performance Computing (HPC) in Pharmaceutical R&D is well
established in computational chemistry and computational biology for
drug discovery, but is increasingly seeing broader application across
research and development. In addition, internal capacity is being
supplemented by external “cloud computing”. In consequence, the issues
around providing HPC services to in-house scientists in an optimal way
for the entire company become more visible and critical. From a
technical perspective, HPC requires a holistic view across compute,
network, and storage capabilities. From an organizational perspective,
effective governance - roles, responsibilities, prioritization and
decision making across multiple different groups, operations, and
support to end-user scientists - makes all the difference. For these
reasons at least, HPC deserves a place in strategic planning. These
issues will be explored, as well as the opportunities afforded by
pre-competitive collaboration in the Cross-Pharma HPC Forum for
identifying best practices.
![]()
376 - Approaches to the treatment of multidrug resistant gram
negative infections
Dr. Mark C Noe PhD, Dr. Steven J Brickner PhD, Dr.
Thomas Gootz PhD, Michael Huband, Dr. Mark E Flanagan PhD, Dr. John
Mueller PhD. Department of Antibacterials Research, Pfizer Global
Research and Development, Groton, CT, United States
Each year, over 4.3 million people worldwide contract hospital-based
bacterial infections, approximately half of which are caused by Gram
negative organisms. The widespread emergence of genes that confer
multidrug resistance in these pathogens threatens to undermine the
clinical utility of several antibiotic classes, including the
fluoroquinolones, cephalosporins, carbapenems and aminoglycosides.
Particularly concerning are the extended spectrum beta lactamases,
including carbapenemases, which are advancing at an alarming rate and
compromise the effectiveness of the most widely used classes to treat
Gram negative infections. This talk will review the medical need for new
antibacterial agents, some of the challenges associated with discovering
new antibiotics, examples of potentially enabling technologies and
recent advances in our understanding of privileged targets for
antibacterial therapy. An example of one antibacterial drug discovery
program will be presented.
![]()
377 - Physicochemical property space of antibiotics
Heinz E Moser PhD. Department of Chemistry, Achaogen, South
San Francisco, California, United States
While there have been enormous discovery efforts during the past decades to
identify novel classes of antibacterials with clinical utility against
Gram-negative pathogens, no first-in-class compounds have been successfully
developed for use in humans for roughly half a century, and none is currently in
clinical evaluation. Predictably, this lack of success has been met by an
increasing prevalence of Gram-negative pathogens causing serious infections in
hospitals and critical care settings. Outbreaks caused by multi-drug
resistant (MDR) or pan-resistant organisms such as K. pneumoniae have
been reported recently and leave physicians with few to no treatment options.
This presentation focuses on the physico-chemical property space of
antibacterial drugs and how an understanding of this property space can assist
in the discovery and lead optimization of antibiotics, in particular that of
antibacterial drugs active against Gram-negative bacteria. Specific examples
will be presented and discussed in detail.
![]()
378 - Physicochemical properties correlated with
Gram-negative antibacterial activity of compounds in the Pfizer
corporate library
Jeremy T Starr PhD, Rishi Gupta PhD, Veerabahu
Shanmugasundaram PhD. Department of Antibacterials and Discovery
Technologies, Pfizer Pharmatherapeutics Research and Development,
Groton, CT, United States
Correlation of computed physicochemical properties of Pfizer proprietary
compounds with their respective E. coli or P. aeruginosa
MICs has led to the identification of a physicochemical fingerprint
associated with higher probability of whole cell activity with a
cytosolic target and presumed passive cell penetration. A computational
tool has been designed to calculate a desirability quotient based on
these parameters which demonstrates positive differentiation of higher
scoring compound classes.
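The abstract does not state which properties, ranges, or functional form the desirability quotient uses. As a purely illustrative sketch, one common way to build such a score is a Derringer-style desirability: each property is scored between 0 and 1 against a preferred window, and the scores are combined as a geometric mean. The property names, windows, and `SOFTNESS` value below are hypothetical placeholders, not values from this work.

```python
from math import prod

# Hypothetical property windows (low, high) of the "desirable" range.
# Placeholder numbers for illustration only; not taken from the abstract.
WINDOWS = {
    "mol_weight": (250.0, 450.0),
    "clogd": (-2.0, 2.0),
    "polar_surface_area": (80.0, 160.0),
}
SOFTNESS = 0.25  # fraction of the window width over which a score decays to 0


def property_score(value, low, high, softness=SOFTNESS):
    """Return 1.0 inside [low, high], decaying linearly to 0.0 outside it."""
    if low <= value <= high:
        return 1.0
    margin = softness * (high - low)
    distance = (low - value) if value < low else (value - high)
    return max(0.0, 1.0 - distance / margin)


def desirability(properties, windows=WINDOWS):
    """Geometric mean of the per-property scores (0.0 = poor, 1.0 = ideal)."""
    scores = [
        property_score(properties[name], low, high)
        for name, (low, high) in windows.items()
    ]
    return prod(scores) ** (1.0 / len(scores))


if __name__ == "__main__":
    compound = {"mol_weight": 475.0, "clogd": 0.5, "polar_surface_area": 120.0}
    print(f"desirability = {desirability(compound):.3f}")
```

The geometric mean is chosen here so that a single property far outside its window drives the overall score to zero, which is one plausible reading of a "fingerprint" that must be matched as a whole.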
![]()
379 - Combining lessons from computational design of gram
positive antibacterials with datamining to aid the design of novel gram
negative antibacterials
Charles J. Eyermann. Infection Discovery,
AstraZeneca, Waltham, MA, United States
Our approach to address the emergence of resistant bacterial strains has
been to identify new chemotypes with a novel mode of action. A
significant effort has been made to develop novel inhibitors against
gram positive strains like Methicillin-resistant Staphylococcus
aureus (MRSA). These efforts have provided a number of key lessons
related to target isozyme specificity and drug safety margins. Work to
identify novel MurI inhibitors of H. pylori has also provided
insights into the physicochemical properties that impact gram negative
antibacterial activity. Combining the lessons learned from the above
research efforts with datamining of existing gram negative agents
provides a framework to aid in the optimization of novel leads for gram
negative antibacterials.
![]()
380 - Targeting gram-negative pathogens: Drug design to
improve antibiotics permeation?
Eric Hajjar, Amit Kumar, Paolo Ruggerone, Matteo
Ceccarelli PhD. Department of Physics, Universita degli Studi di
Cagliari and Sardinian, Monserrato, Italy
Gram-negative bacteria are protected by an outer membrane and, to
function, antibiotics have to diffuse passively through outer membrane
channels, known as porins, such as OmpF in E. coli (Pages, J. M. et al.
Nat. Rev. Microbiol. 2008, 6, 893). Bacterial strains can modulate their
susceptibility to antibiotics by under-expressing porins or mutating
their structures, becoming resistant, in the worst case, to
different antibiotic families. These multidrug resistant bacteria are
now ubiquitous in both hospitals and the larger community and the
resurrection of tuberculosis provides one ominous example highlighting
the risk associated with evolved drug resistance (Cars, O. et al. Brit.
Med. J. 2008, 337, 726). Moreover, many pharmaceutical companies
abandoned this field and no truly novel active antibacterial compounds
are currently in clinical trials. A major current dilemma for the
pharmaceutical industry is whether to develop drugs for new targets or
promote those drugs presently on the market (Weiss, D. et al. Nat. Rev.
Drug. Discov. 2009, 8, 533.), identifying bottlenecks of existing
antibiotics to suggest chemical modifications. Following such a
strategy, we revealed the complete permeation pathways of β-lactam and
fluoroquinolone antibiotics through porins using metadynamics
simulations and found that experimental results remarkably confirmed the
computational predictions. Further, the simulations showed their
potential to overcome experimental limitations and provide
microscopic details on the permeation process (Hajjar, E. et al. Biophys.
J. 2010, 98, 569; Mahendran K. et al. J. Phys. Chem. B, IN PRESS).
Here we follow the paradigm for selecting antibiotics with better
permeation properties using computer simulations only. Taking advantage
of the atomic level of detail that the simulations provide we find that
the diffusion of ampicillin through OmpF is governed by a subtle balance
of interactions with partners in the porin channel: we draw, for the
first time, the complete inventory of the rate-limiting interactions and
map them on both the porin and antibiotic structures. Our methodology,
which can be conveniently employed to study other porins/antibiotics,
allows identification of the functional groups that govern optimal
translocation. Such findings will directly benefit rational antibiotic
design, by defining, for example, some appropriate pharmacophores within
high throughput screening strategies.
![]()
381 - Structure-based lead optimization of novel bacterial
type II topoisomerase inhibitors
Dr Neil D Pearson, Dr Zheng Yang, Dr Benjamin D Bax,
Michael N Gwynn. Department of Antibacterial Chemistry, Infectious
Diseases Center of Excellence in Drug Discovery, GlaxoSmithKline
Pharmaceuticals, Collegeville, Pennsylvania, United States; Department
of Computational and Structural Chemistry, GlaxoSmithKline
Pharmaceuticals, Collegeville, Pennsylvania, United States; Department
of Antibacterial Microbiology, Infectious Diseases Center of Excellence
in Drug Discovery, GlaxoSmithKline Pharmaceuticals, Collegeville,
Pennsylvania, United States; Department of Computational and Structural
Chemistry, GlaxoSmithKline Pharmaceuticals, Stevenage, Hertfordshire,
United Kingdom
The emergence of multi drug resistant Gram negative pathogens is a major
concern given the paucity of new therapies in clinical development. GSK
has discovered a novel series of inhibitors of both DNA gyrase and
topoisomerase IV (NBTIs) with a unique mechanism and no target based
cross resistance to established classes of antibacterials including the
fluoroquinolones. Optimisation of the Gram positive selective early
leads led to new series which afforded good activity versus Gram
negative pathogens. GSK subsequently solved the first X-ray structure of
an NBTI in complex with S. aureus DNA gyrase and DNA,
providing unprecedented knowledge for lead optimization and the design
of novel inhibitors. This talk will discuss how the structural
information enabled the medicinal chemistry team to design new subunits
as well as illustrating when optimization of interactions with the
binding site has been well served by traditional medicinal chemistry.
![]()
382 - Fragment-based development of tetrazole inhibitors
against class A beta-lactamase
Yu Chen PhD. Department of Molecular Medicine,
University of South Florida, Tampa, FL, United States
The production of beta-lactamases is the predominant cause of resistance
to beta-lactam antibiotics, such as penicillins, in Gram-negative
bacteria. Whereas high-throughput screening has appeared insufficient
for the development of new beta-lactamase inhibitors, fragment-based
methods provide an effective approach to sampling novel chemical space
in antibiotic discovery. We have previously used fragment-based
molecular docking to identify mM range tetrazole inhibitors against
CTX-M Class A beta-lactamase and to
subsequently evolve their affinities to ~10 micromolar. New compounds
have now been synthesized using the micromolar-affinity tetrazole
scaffold, based on some similarities between this scaffold and
beta-lactam antibiotics or on X-ray crystal structures of the
inhibitor-bound complexes. Other fragment compounds have also been
tested to probe regions of the active site not sampled by existing
inhibitors. Combining the fragment-based approach with molecular
docking, X-ray crystallography and chemical synthesis, we hope to
eventually develop these tetrazole compounds into nM inhibitors.
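To put the affinity progression described above (mM to ~10 micromolar, with nM as the goal) on an energy scale, the standard relation ΔG° = RT ln(K_d) can be applied. The sketch below is only a worked illustration: it assumes the inhibition constants can be treated as dissociation constants relative to a 1 M standard state, and the three example affinities are round numbers chosen to match the ranges named in the abstract, not measured values.

```python
import math

GAS_CONSTANT = 1.987204e-3  # kcal/(mol*K)


def binding_free_energy(kd_molar, temperature=298.15):
    """Standard binding free energy (kcal/mol) for a dissociation constant in M."""
    if kd_molar <= 0:
        raise ValueError("kd_molar must be positive")
    return GAS_CONSTANT * temperature * math.log(kd_molar)


if __name__ == "__main__":
    # Illustrative round-number affinities only.
    for label, kd in [("1 mM", 1e-3), ("10 uM", 1e-5), ("1 nM", 1e-9)]:
        print(f"{label:>6}: {binding_free_energy(kd):6.2f} kcal/mol")
```

At room temperature each tenfold gain in affinity corresponds to roughly 1.4 kcal/mol, so the step from 10 micromolar to nM requires about as much additional binding energy as the fragment-to-lead optimization already achieved.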
![]()
419 - Utilizing organic syntheses and microbial iron assimilation
processes for the development of new antibiotics
Prof. Marvin J. Miller. Chemistry and Biochemistry,
University of Notre Dame, Notre Dame, IN, United States
Pathogenic microbes have rapidly developed resistance to all known
antibiotics. To keep ahead in the “microbial war,” extensive
interdisciplinary effort is needed. Resistance develops primarily
due to overuse of antibiotics, which can result in alteration of microbial
permeability, alteration of drug target binding sites, induction of
enzymes that destroy antibiotics (e.g., beta-lactamases) and
even efflux of antibiotics. A combination of chemical syntheses,
microbiological and biochemical studies will demonstrate that the known
critical dependence of iron assimilation by microbes for growth and
virulence can be exploited for the development of new approaches to
antibiotic therapy. Iron recognition and active transport rely on the
biosynthesis and use of microbe-selective iron chelating compounds
called siderophores.
Our studies demonstrate that siderophores and analogs can be used for
-Iron transport-mediated drug delivery (“Trojan Horse”).
-Induction of iron limitation (Development of new agents to block
microbial iron assimilation).
-Converting microbe-induced chemistry of iron into a process that is
lethal to microbes.
![]()
420 - Utilization of bacterial iron transport systems for
drug delivery
Dr. Ute Moellmann, Dr. Lothar Heinisch. Department of
Molecular and Applied Microbiology, Leibniz Institute for Natural Product
Research and Infection Biology - Hans Knoell Institute, Jena, Germany
The outer membrane permeability barrier is an important resistance factor of
bacterial pathogens. In combination with other factors like drug inactivating
enzymes, target alteration and efflux, it can increase resistance dramatically.
A strategy to overcome this membrane mediated resistance is the exploitation of
bacterial transport systems. The most promising systems are those for iron
transport. They are vital for
virulence and survival of bacteria in the infected host, where iron depletion is
a defense mechanism against invading pathogens. We synthesized biomimetic
siderophores as shuttle vectors for active transport of antibiotics through the
bacterial membrane. Structure activity relationship studies resulted in
ampicillin siderophore conjugates highly active against Pseudomonas
aeruginosa and other Gram-negative pathogens, which play a crucial role in
destructive lung infections in cystic fibrosis patients and in severe nosocomial
infections. The mechanism of action and the in vitro and in vivo efficacy
were demonstrated.
![]()
421 - Activity of BAL30072, a novel siderophore sulfactam
Prof. Malcolm G P Page PhD. Basilea Pharmaceutica
International Ltd, Basel, Switzerland
BAL30072 is a monocyclic β-lactam antibiotic
belonging to the sulfactams. BAL30072 showed potent activity against
multidrug-resistant (MDR) Pseudomonas aeruginosa and Acinetobacter
spp., including many carbapenem-resistant strains. BAL30072 was bactericidal
against both Acinetobacter spp. and P. aeruginosa, even against
strains that produced metallo-β-lactamases that conferred resistance to all
other β-lactams tested, including aztreonam. It was
also active against many species of MDR Enterobacteriaceae, including isolates
that had a class A carbapenemase or a metallo-β-lactamase.
Unlike other monocyclic β-lactams, BAL30072 was found
to trigger spheroplasting and lysis of E. coli, rather than the formation
of extensive filaments. The basis for this unusual property is its inhibition of
the bifunctional penicillin-binding proteins PBP 1a and PBP 1b in addition to
its high affinity for PBP 3, which is the target of monobactams such as
aztreonam.
![]()
422 - Targeting bacterial multidrug efflux pumps
Olga Lomovskaya PhD, Scott Hecker PhD. Mpex
Pharmaceuticals, San Diego, California, United States
Powerful techniques of modern drug discovery such as comparative
genomics, ultra-high-throughput screening, structure-guided drug design
and combinatorial chemistry have been used to identify novel targets and
optimize novel, preferentially broad-spectrum antibiotics to combat
antibiotic resistance. However, despite the fact that these employed
targets are broadly conserved in bacteria, no drug candidate advanced
using these methods has demonstrated relevant activity against most
gram-negative bacteria. Thus, the outlook for new antibiotics appears
unchanged from the present situation, in which, of all approved classes of
antibiotics, representatives of only three classes (fluoroquinolones, β-lactams and
aminoglycosides) have clinical utility for the treatment of
gram-negative bacteria such as Pseudomonas aeruginosa.
Multidrug resistance (MDR) efflux pumps play a prominent and proven role
in gram-negative intrinsic resistance. Moreover, these pumps also play a
significant role in acquired clinical resistance. Together, these
considerations make efflux pumps attractive targets for inhibition in
that the resultant efflux pump inhibitor (EPI)/antibiotic combination
drug should exhibit increased potency, enhanced spectrum of activity and
reduced propensity for acquired resistance. To date, at least one class
of broad-spectrum EPI has been extensively characterized. While
these efforts indicated a significant potential for developing small
molecule inhibitors against efflux pumps, they did not result in a
clinically useful compound. Stemming from the continued clinical
pressure for novel approaches to combat drug resistant bacterial
infections, second-generation programs have been initiated based on a
number of recent developments in the field, including structural
elucidation of all three individual components of MDR efflux pumps and
ligand-based insights into the mechanism-of-action of drug transporters.
Building upon previous efforts, these new approaches show early promise
to significantly improve the clinical usefulness of currently available
and future antibiotics against otherwise recalcitrant gram-negative
infections.
![]()
423 - Interaction of β-peptides with membranes
Jagannath Mondal, Dr. Xiao Zhu, Prof Qiang Cui, Prof
Arun Yethiraj. Department of Chemistry, UW Madison, Madison, WI, United
States
A new class of anti-microbial agents named β-peptides
has recently been reported that shows interesting sequence dependent
activity and selectivity. In this work we investigate the interaction of
these molecules with a model membrane in an effort to obtain physical
insight into the mechanism of anti-microbial activity. We investigate
the effect of sequence on the adsorption of these β-peptides
to a membrane using computer simulations with both implicit and explicit
solvent and membrane. Two classes of molecules are investigated:
10-residue oligomers of 14-helical sequences, and four sequences of
random co-polymeric β-peptides. The oligomers
of interest are two isomers, globally amphiphilic (GA) and non-GA, of
two 10-residue 14-helical sequences. The penetration of the molecules
into the membrane and the orientation of the molecules at the interface
depend strongly on the sequence. We attribute this to the propensity of
the β-phenylalanine (βF)
residues for membrane penetration. The membrane adsorption studies are
consistent with potential of mean force calculations using the same
model. Results are similar when the membrane and solvent are treated in
an implicit or explicit fashion. For the four sequences of
random co-polymeric β-peptides, the extent of
free-energy stabilization correlates with their efficiency in
segregating the hydrophobic and cationic residues. The simulations are in
qualitative accord with experiments on the minimum inhibitory
concentration, and suggest simple strategies for the design of
candidates for anti-microbial beta-peptides.
![]()
424 - Molecular modeling of beta-lactamase inhibitors
Sookhee Nicole Ha, T. Blizzard, H. Chen, S. Kim, J. Wu, K.
Young, Y. Park, A. Ogawa, S. Raghoobar, R. Painter, N. Hairston, S. Lee, A.
Misura, T. Felcetto, P. Fitzgerald, N. Sharma, Jun Lu, E. Hickey, J. Hermes, M.
Hammond. Merck & Co., Inc, Whitehouse Station, New Jersey, United States
Resistance against new antibiotics usually appears within a few years of their
marketing. Expression of beta-lactamase is the most common mechanism of
resistance to the beta-lactam antibiotics in Gram-negative bacteria. To delay
the emergence of drug resistance as long as possible, we have developed a
beta-lactamase inhibitor for combination therapy. We report our efforts on the
optimization of bridged monobactam analogs.
![]()
425 - Assembly and function of large Gram-negative bacterial
machines studied by molecular simulation integrated with experimental
data
Prof. Matteo Dal Peraro. Institute of Bioengineering,
Swiss Federal Institute of Technology, EPFL Lausanne, Lausanne,
Switzerland
Gram-negative bacteria have evolved several means to attack their hosts
and defend themselves from external attacks. Here, we use molecular
simulations closely integrated with new experimental data to
dissect the structural and dynamic features of the assembly mechanism of
three large bacterial machines.
(i) We propose a four-helix model of the E. coli PhoQ two-component system
transmembrane domain, which is consistent with new experimental
cross-linking data, and can explain the bacterial response to divalent
cations and antimicrobial peptides. (ii) We study, with the aid of
site-directed mutagenesis, the role of the pore-forming loop and the
C-terminal pro-peptide for the heptamerization of the pore-forming toxin
aerolysin from A. hydrophila. Finally, (iii) we model the needle
formation and regulation for the type III secretion system from
Y. enterocolitica (injectisome) based on fresh genetic and
mutagenesis results.
A full understanding of the structural assembly of these bacterial
machines can, on the one hand, help unveil their fundamental
biological function and, on the other, permit the development of rational
strategies to specifically interfere with them for therapeutic
intervention.
![]()
426 - Design of potent, broad-spectrum AccC inhibitors
Li Xiao PhD, Cliff Cheng, Gerald W Shipps, Aileen
Soriano, Peter Orth, Todd Black. Merck Research Laboratory, Kenilworth,
New Jersey, United States; Merck Research Laboratory, Cambridge,
Massachusetts, United States
The biotin carboxylase (AccC) is part of the multi-component bacterial
acetyl coenzyme-A carboxylase (ACCase) and is essential for pathogen
survival. We identified and validated AccC as an antibacterial drug
target for our in-house AS/MS screen. An initial hit,
2-(2-chlorobenzylamino)-1-(cyclohexylmethyl)-1H-benzo[d]imidazole-5-carboxamide
(1), was identified, and x-ray crystallography and computer
modeling were utilized in its optimization. In this presentation we
report our biology, chemistry and structure based drug design efforts in
discovering a novel series of AccC inhibitors, exemplified by (R)-2-(2-chlorobenzylamino)-1-(2,3-dihydro-1H-inden-1-yl)-1H-imidazo[4,5-b]pyridine-5-carboxamide
(2). These inhibitors are potent and selective for bacterial AccC
with good cell-based activity against a sensitized strain of E. coli
(HS294 E. coli).
![]()
433 - Exploring protein conformational changes with
accelerated molecular dynamics in NAMD
Dr. Yi Wang, Prof. J. Andrew McCammon. Chemistry and
Biochemistry, Howard Hughes Medical Institute, University of California,
San Diego, La Jolla, CA, United States
Accelerated molecular dynamics (aMD) enhances conformational space
sampling by reducing energy barriers separating different states of a
system. Here we present the implementation of aMD in the highly
efficient parallel molecular dynamics program NAMD and offer exemplary
applications performed on systems up to 60,000 atoms. Our results
indicate that while providing significantly enhanced sampling, aMD
simulations have only a small overhead in comparison to classical MD
simulations. A 10-ns aMD simulation performed on the bacterial enzyme
RmlC successfully revealed its transition from apo- to holo- state,
which is not observed in a 50-ns classical MD simulation. We demonstrate
that aMD can be applied efficiently to explore the conformational
changes of complex biomolecules, especially when little is known about
their alternative structures and transition reaction coordinates.
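The abstract does not spell out the functional form of the bias. As a minimal sketch, the boost potential commonly associated with the original aMD formulation raises the energy only where the true potential V lies below a threshold E, using ΔV = (E − V)² / (α + E − V); barriers are thereby lowered relative to the filled-in wells. The threshold and α values in the example are hypothetical, and this plain-Python function is not NAMD code.

```python
def boost_potential(v, e_threshold, alpha):
    """aMD boost dV for potential energy v (same energy units throughout).

    dV = (E - V)^2 / (alpha + E - V) when V < E, and 0 otherwise.
    """
    if alpha <= 0:
        raise ValueError("alpha must be positive")
    if v >= e_threshold:
        return 0.0
    gap = e_threshold - v
    return gap * gap / (alpha + gap)


def modified_potential(v, e_threshold, alpha):
    """Potential actually sampled in aMD: V* = V + dV."""
    return v + boost_potential(v, e_threshold, alpha)


if __name__ == "__main__":
    # Hypothetical values: a well at 0, a barrier top at 8, threshold E = 10.
    e_threshold, alpha = 10.0, 10.0
    well = modified_potential(0.0, e_threshold, alpha)
    barrier = modified_potential(8.0, e_threshold, alpha)
    print(f"original barrier height: {8.0 - 0.0:.2f}")
    print(f"boosted barrier height:  {barrier - well:.2f}")
```

Smaller values of α flatten the modified surface more aggressively, while larger values keep it closer to the original landscape; recovering unbiased averages additionally requires reweighting each frame by exp(ΔV / kT).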
![]()
434 - Pseudo-chair conformation of carboxyphosphate
Venkata S Pakkala, Steven M Firestine, Jeffrey D
Evanseck. Department of Chemistry and Biochemistry, Duquesne University,
Pittsburgh, Pennsylvania, United States; Eugene Applebaum College of
Pharmacy and Health Sciences, Wayne State University, Detroit, Michigan,
United States
For over 40 years, carboxyphosphate has been postulated as a key
intermediate in several carboxylase enzymes. Unfortunately, this
compound is extremely unstable (t1/2 of 70 ms), thus precluding direct
experimental studies. Therefore, we have utilized high level ab initio
(MP2 and CCSD(T)), DFT (B3LYP, BB1K, M05-2X, M06-2X and MPW1K) and
ONIOM(DFT:AMBER) methods to investigate the structure and energetics of
carboxyphosphate in vacuum, in a PCM continuum solvation model and in
the active site of N5-CAIR synthetase, an enzyme shown to proceed via
the formation of carboxyphosphate. We report here, for the first time,
that carboxyphosphate adopts a “pseudo-chair” conformation, and
calculations reveal that this conformation is the most
stable in vacuum, solvent and the active site. This study has
implications in the development of the carboxyphosphate analogs as
potential inhibitors, in understanding the instability of the compound,
and in elucidating the mechanisms of enzymes utilizing this compound.
![]()
435 - Analysis of vibrational spectra of polypeptides in
terms of localized vibrations
Dr. Christoph R Jacob, Prof. Markus Reiher. Center
for Functional Nanostructures, Karlsruhe Institute of Technology (KIT),
Karlsruhe, Germany; Laboratorium für Physikalische Chemie, ETH Zurich,
Zurich, Switzerland
While nowadays efficient quantum chemical methods allow for the
calculation of vibrational spectra of large (bio-)molecules, such
calculations also provide a large amount of data. In particular for the
vibrational spectra of polypeptides, a large number of close-lying
normal modes contribute to each of the experimentally observed bands,
which hampers the analysis of the calculated spectra considerably.
Here, we discuss how vibrational spectra obtained from quantum chemical
calculations can be analyzed by transforming the calculated normal modes
contributing to a certain band in the vibrational spectrum to a set of
localized modes [1]. We demonstrate that these localized modes are more
appropriate for the analysis of calculated vibrational spectra of
polypeptides and proteins than the delocalized normal modes.
We apply this methodology to investigate the influence of the secondary
structure on infrared and Raman spectra of polypeptides [2]. As a model
system, a polypeptide consisting of twenty (S)-alanine residues in the
conformation of an α-helix and of a 3₁₀-helix
is considered. In particular, we show how the use of localized modes
facilitates the analysis of the positions and of the total intensities
of the bands in the vibrational spectra, and how the couplings between
localized modes determine the observed band shapes. Finally, this
approach is applied to analyze the Raman optical activity (ROA) spectra
of these helical polypeptides, which provides a detailed picture of the
generation of ROA bands in proteins [3].
[1] Ch. R. Jacob and M. Reiher, J. Chem. Phys. 130 (2009),
084106.
[2] Ch. R. Jacob, S. Luber, M. Reiher, J. Phys. Chem. B 113
(2009), 6558.
[3] Ch. R. Jacob, S. Luber, M. Reiher, Chem. Eur. J. 15 (2009),
13491.
![]()
436 - Conformational coupling between LOV and kinase domains
in phototropins: A computational perspective
Dr. Marco Stenta PhD, Prof. Matteo Dal Peraro PhD.
Department of Bioengineering, Ecole Polytechnique Fédérale de Lausanne
(EPFL), Lausanne, Vaud, Switzerland
Phototropins constitute an important class of plant photoreceptors playing key
roles in many physiological responses to light, including phototropism,
chloroplast movement and stomata opening. Phototropins feature, along with a
serine-threonine kinase domain, two LOV (light-, oxygen- or voltage-regulated)
domains, each binding an FMN (flavin mononucleotide). Blue light affects the
kinase domain by triggering, in the LOV domain, the formation of a covalent
intermediate between the FMN cofactor and a nearby cysteine residue. Although
X-ray structures have provided solid ground for mechanistic hypotheses, the
molecular details of the inter-domain communication process are still unknown.
By using accurate QM/MM (quantum mechanics/molecular mechanics) calculations we
investigated the formation/breaking of the FMN/Cys covalent intermediate. We
investigated the coupling between the LOV and kinase domains by means of long MD
(molecular dynamics) simulations and detailed PES (potential energy surface)
explorations (MM level).
Zoltowski, B. D.; Vaccaro, B.; Crane, B. R. Nat Chem Biol 2009, 5, 827-834.
![]()
437 - Conformational sampling of macrocycles through accelerated
molecular dynamics simulation
S. Roy Kimura Ph.D. Department of Computer Assisted Drug
Design, Bristol Myers Squibb, Wallingford, CT, United States
Macrocyclization is a strategy used in medicinal chemistry to lock a molecule in
its bioactive conformation. The resulting decrease in conformational flexibility
often leads to higher potencies due to the reduced entropy loss upon binding,
and sometimes improved physical chemical properties such as bioavailability.
Conformational searches of macrocycles are usually performed by temporary ring
opening and Monte Carlo (MC) sampling to overcome the energy barriers between
low energy states. However, widely available MC algorithms can only be used in
conjunction with simplified continuum solvents such as dielectrics or
Generalized Born-related models. In this study, we assess the use of molecular
dynamics simulation in explicit solvent with periodic high-temperature pulsing
as a method to overcome the characteristic energy barriers of macrocycles. The
pros and cons of this methodology versus MC sampling are discussed.
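The abstract gives no details of the pulsing protocol. As a hedged sketch of what "periodic high-temperature pulsing" could look like, the function below returns the thermostat target temperature for a given MD step under a simple square-wave schedule. The base temperature, pulse temperature, period, and pulse length are all hypothetical defaults, not parameters from this study, and the function is independent of any particular MD engine.

```python
def pulsed_temperature(step, period, pulse_length, t_base=300.0, t_pulse=1000.0):
    """Target temperature (K) at an MD step for a square-wave heating schedule.

    Each cycle of `period` steps starts with `pulse_length` steps at `t_pulse`
    and spends the remaining steps at `t_base`.
    """
    if step < 0:
        raise ValueError("step must be non-negative")
    if not 0 < pulse_length <= period:
        raise ValueError("pulse_length must satisfy 0 < pulse_length <= period")
    return t_pulse if step % period < pulse_length else t_base


if __name__ == "__main__":
    # Hypothetical schedule: 10 hot steps at the start of every 100-step cycle.
    schedule = [pulsed_temperature(step, period=100, pulse_length=10) for step in range(300)]
    hot_steps = sum(1 for t in schedule if t > 300.0)
    print(f"{hot_steps} of {len(schedule)} steps are at the pulse temperature")
```

In a scheme of this kind, conformations would typically be harvested only from the low-temperature segments, after the system has had time to relax following each pulse.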
![]()
![]()
44 - Best practices in scientific computer modeling
Dr. Masha V Petrova. Department of Research, MVP
Modeling Solutions, LLC, Springfield, IL, United States
Computer modeling can help research organizations save a lot of money
and time, if the modeling program is implemented correctly. Are you sure
that your research group is making the most out of computer modeling?
Attend this session to learn:
How companies and research groups tend to shoot themselves in the foot
when setting up a computer modeling project;
What measures you can take to make sure that you don't spend a lot of
time going down the wrong path or purchasing the wrong software;
The best way to take a scientific/engineering problem and translate it
into computer modeling terms.
![]()
45 - New wave of computational tools for lead selection
in the biomedical industry
Dr. Aurora D. Costache PhD, Prof. Doyle D. Knight
PhD, Prof. Joachim Kohn. New Jersey Center for Biomaterials, Rutgers -
The State University of New Jersey, Piscataway, NJ, United States;
Mechanical and Aerospace Engineering, Rutgers - The State University of
New Jersey, 98 Brett Rd, Piscataway, NJ, United States
The high cost and intensive labor of developing new polymeric
biomaterials for tissue engineering, drug delivery and other medical
applications highlight the need for a change in the discovery process.
As large corporations continuously look to cut costs, individual
contractors or small businesses that can provide them with lead
materials for given biomedical applications are expected to thrive. With
this business niche in mind, the New Jersey Center for Biomaterials
(NJCBM) created “Biomaterials Store™”, a computational tool
specifically designed for development of new biomaterial leads. This
integrated database and datamining tool allows the user to create/use
large databases of virtual polymer libraries and to apply modeling tools
to predict relevant polymer properties and biological responses to
biomaterials. Based on the requirements for a specific application, the
most promising candidates are selected for synthesis and complete
experimental evaluation, thus accelerating the discovery process and
cutting costs at the same time.
![]()
46 - Computational modeling of soft condensed matter and
biomaterials
Dr. Jayeeta Ghosh. New Jersey Center for
Biomaterials, Rutgers, Piscataway, NJ, United States
Computational modeling helps in understanding chemistry across length and
time scales, from the quantum level to process dynamics.
This presentation will discuss the application of atomistic and
mesoscale modeling to soft condensed matter as well as a combinatorial
computational approach to biomaterials invention. The main objective is
to show the relevance and importance of detailed molecular modeling
versus approximate surrogate modeling.
Molecular modeling of soft condensed matter including glasses, polymers
and lipids will be discussed in the context of industrial application
and drug delivery.
A quantitative structure property relation (QSPR) modeling approach for
identifying suitable biomaterials, starting from a large combinatorial
library of polymers for tissue engineering and biomedical applications,
can help reduce the experimental cost and time and advance business.
![]()
47 - First-principles computational approach for the
characterization and design of novel organic electronic materials
Roel S Sanchez-Carrera PhD, Prof. Alan Aspuru-Guzik.
Department of Chemistry and Chemical Biology, Harvard University,
Cambridge, MA, United States
Organic electronics have recently emerged as a technology that will
revolutionize the way in which we visualize information, generate energy
from renewable resources, and communicate with people around the world.
Thus, various international academic laboratories and major chemical
companies are actively involved in the fine-tuning and development of
the molecular materials used in the field of organic electronic devices.
To highlight the potential of current computational methodologies, in
this study, on the basis of quantum chemistry calculations and molecular
dynamics simulations, we investigate the microscopic charge transport
parameters of one of the most outstanding candidates, the
dinaphtho-thieno-thiophene organic semiconductor. The good agreement
found in this work between observed and computed properties stresses
the importance of using computational chemistry techniques to identify
suitable molecular materials for the emerging field of organic
electronics.
![]()
48 - Recent advances in structure-based drug design
Woody Sherman. Schrodinger, New York, NY, United
States
Structure-based drug design is an important part of the drug discovery
process, and recent methodological advancements, as well as increased
computing resources, have resulted in a growing number of success
stories. In this presentation, we highlight some of the most promising
methods and applications, including the accurate assessment of water
free energies, incorporation of protein flexibility into docking
algorithms, and structure-based modeling of GPCRs. In addition, we
describe the most significant limitations in the existing methods and
provide a development roadmap to overcome these limitations.
![]()
49 - Computer simulation of ligand binding to a flexible protein
target
Dr Philip W Payne. Consulting, InterBiotics LLC, Sunnyvale,
CA, United States
A research-based biotechnology or pharmaceutical business must focus capital and
labor on the experiments that will most rapidly discover or refine intended
products. Computer simulations are useful adjuncts to an experimental program
when they provide structural insights that suggest how a protein or ligand
structure should be modified to improve a measured outcome - enzymatic rate,
receptor activation, or ligand affinity. A successful modeling program means
that fewer proteins need to be mutated or fewer ligands synthesized during a
product development campaign.
Important biological functions often entail large displacements of protein main
chains or loops, and industrially useful modeling of protein structure needs to
assess such motion and its impact on protein-ligand affinity or ligand-directed
signaling. Unfortunately, there is little commercial software that can
cost-effectively predict important protein motions. Therefore, we have developed
a strategy (Inverse Docking) for analyzing main chain movements that conform a
G-Protein Coupled Receptor (Dopamine D2S) to a nanomolar D2 antagonist,
spiperone.
![]()
50 - FAST Predictions of protein stability and flexibility
Prof. Dennis R. Livesay, Dr. Hui Wang, Prof. Donald
J. Jacobs. Department of Bioinformatics and Genomics, University of
North Carolina at Charlotte, Charlotte, NC, United States; Department of
Physics and Optical Science, University of North Carolina at Charlotte,
Charlotte, NC, United States
Accurate descriptions of stability and flexibility are necessary for a
complete understanding of protein structure and function. As such, we
have developed “FAST” to provide a Flexibility And Stability Test on
proteins in aqueous solutions. Herein,
all intramolecular interactions are assigned enthalpy and entropy
values. Total enthalpy is the sum of all components, whereas efficient
graph-rigidity algorithms account for entropy nonadditivity. FAST
has been designed from the ground up to account for dependence on
temperature, pressure, pH, salt concentration, etc. As such, free energy
landscapes as a function of multiple thermodynamic variables can be
quickly calculated. FAST also calculates a wide variety of
mechanical properties related to structural rigidity and flexibility
with virtually no increase in computational expense. This talk will
summarize our general approach and recent improvements in speed and
accuracy. This work has been supported by grants from the
NIH (R01-GM073082) and the Charlotte Research Institute.
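The split between additive enthalpy and nonadditive entropy described in this abstract can be illustrated with a minimal sketch. It is not the FAST implementation: the `Interaction` record, the numeric values, and the precomputed `independent` flag (standing in for the outcome of a graph-rigidity calculation, which is not performed here) are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Interaction:
    """One intramolecular interaction (hypothetical record, not a FAST type)."""
    name: str
    enthalpy: float     # kcal/mol, illustrative value
    entropy: float      # kcal/(mol*K), illustrative value
    independent: bool   # assumed result of a rigidity analysis done elsewhere


def total_enthalpy(interactions: List[Interaction]) -> float:
    # Enthalpy is treated as additive: every interaction contributes.
    return sum(i.enthalpy for i in interactions)


def total_entropy(interactions: List[Interaction]) -> float:
    # Entropy is treated as nonadditive: redundant interactions are skipped,
    # so only those flagged as independent contribute.
    return sum(i.entropy for i in interactions if i.independent)


def free_energy(interactions: List[Interaction], temperature: float) -> float:
    # G = H - T * S, evaluated at one temperature (in kelvin).
    return total_enthalpy(interactions) - temperature * total_entropy(interactions)


# Illustrative inputs only; these numbers are not taken from any measurement.
EXAMPLE = [
    Interaction("hbond_1", enthalpy=-2.0, entropy=-0.004, independent=True),
    Interaction("hbond_2", enthalpy=-1.5, entropy=-0.003, independent=False),
    Interaction("packing_1", enthalpy=-0.5, entropy=-0.001, independent=True),
]

if __name__ == "__main__":
    for temperature in (280.0, 300.0, 320.0):
        print(temperature, free_energy(EXAMPLE, temperature))
```

In this toy setting, scanning the temperature gives a one-dimensional slice of a free energy landscape; the other thermodynamic variables named in the abstract (pressure, pH, salt concentration) are not modeled.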
![]()
51 - Patentability of computer simulations and models
Noah Malgeri. Law Office of Noah V. Malgeri,
Uxbridge, Massachusetts, United States
In recent years, several companies and individuals, including major
industry leaders, have filed patent applications for computer models,
particularly in the areas of control systems, project management
simulations, and the modeling of pathologies. This presentation will
address the subject of scientific software patentability.
![]()