Titles link to slides when available. Please note: Presentations given at CINF symposia have been posted to the CINF website with express permission granted by the authors who retain the original copyright. These presentations are for information purposes only and cannot be further disseminated without the author's prior written permission.
CINF
1
Fate of
chemistry branch libraries: Onward toward 2015
Jeremy R Garritano, Mellon Library of Chemistry, Purdue University,
504 W. State St., West Lafayette, IN 47907, jgarrita@purdue.edu
The
pressures of technology, multidisciplinary research, and shrinking budgets
have caused many librarians to rethink the roles of chemistry branch
libraries in recent decades. Some of these libraries have reinvented
themselves, while others have been consolidated into general science and
technology libraries. The author will report on the results of a 2005 survey
of Association of Research Libraries (ARL) institutions and the status of
their chemistry-related library resources and facilities. The survey will
look at the past, present and future of their chemical information
resources, paying particular attention to those that have been or will be
combined with other facilities. The reasons for consolidation will be
discussed, as well as what other disciplines are included within the
combined collection, and other issues regarding administration and outreach.
CINF
2
The Harvard
chemistry library: Ghosts aboard the starship
Marcia L. Chapin, Chemistry & Chemical Biology, Harvard
University, 12 Oxford St., Cambridge, MA 02138, Fax: 617-495-0788, chapin@chemistry.harvard.edu
The
Harvard Chemistry Library has played a quiet but profound role in chemical
education and research at Harvard. Since 1927, the Library, located in the
heart of the Chemistry & Chemical Biology Department complex, has served
as a focal point for chemical information resources, chemical contemplation,
and a host of Harvard chemistry community gatherings. The spirit of many an
illustrious faculty member is to be felt there. The reading room embodies
what students have come to expect from Harvard, a sense of history and
elegance. With the advent of digital access to chemical information, the
space occupied by the Library is beginning to be scrutinized very closely.
Is it reasonable to harvest the current Library space for laboratories and
create a small “starship” information center, a new paradigm where most
everything would be online? A perfect storm of university politics, space
competition, and financial constraints has come to bear on these decisions.
How important is the historic space to 21st century teaching and research?
CINF
3
Adaptation of a
chemistry library: The University of Chicago experience
Andrea Twiss-Brooks, John Crerar Library, University of Chicago, 5730
S. Ellis Ave, Chicago, IL 60637-1403, atbrooks@uchicago.edu
“You
cannot step twice into the same river.” Heraclitus (c. 540 - c. 480 BC)
During
its eight decades of existence, the Chemistry Library at the University of
Chicago has undergone a lot of change. This change has been driven by many
factors, including advances in library technology, construction of new
library and research buildings, space planning challenges, migration to
electronic information resources, and more. In summer 2005, the Chemistry
Library was completely closed and the collections merged into the holdings
of the main science library. This presentation will explore some of the
driving forces behind the closure of the Chemistry Library, the changing
role of the Chemistry Librarian, and chemical information reference and
instructional services in the context of a centralized science library
environment. Effects of the closure on staff, resource reallocation, the
process of moving the collections, and service marketing in the new
environment will also be addressed.
CINF
4
Metamorphosis of
the chemistry library: What will emerge?
William W Armstrong, LSU Libraries, Chemistry Library, Louisiana
State University, Baton Rouge, LA 70803, Fax: 225-578-2760, notwwa@lsu.edu
Forces
ranging from institutional financial pressures and space constraints to
rapid technological advances are acting on the chemistry library causing a
metamorphosis. Technological advances have revolutionized the way scientists
communicate with one another and the way this information is disseminated.
Has the library's role in the flow of information changed in response to
these new developments? Have the needs of patrons changed as a result? What
shape or role will the library have in the future? What should its role be?
The author will provide an overview of some of the changes occurring or
likely to occur, while highlighting any positive or negative aspects these
changes might entail. We must balance the ideal against the
realities that act as constraints, or parameters, within which these changes
will take place. Change will occur. Will we merely react, or will we direct
this change?
CINF
5
Changing
mission, strengthened focus: A new use for the Current Periodicals Room at
the University of California, Santa Cruz
Catherine B. Soehner, Christy Hightower, and Wei Wei, Science &
Engineering Library, University of California, Santa Cruz, 1156 High Street,
Santa Cruz, CA 95064, Fax: 831-459-2797, soehner@ucsc.edu
The
Science & Engineering Library at UC Santa Cruz was built in 1991 and
included a beautiful room dedicated to a print collection of current
periodicals. During the past two years we have systematically canceled all
print journals for which there was an electronic counterpart, thus
diminishing the number of journals in the Current Periodicals Room. During a
strategic planning effort, the Library determined that it should be
identified as the 'Information Center' of the campus and be the 'destination
of choice' for students, faculty, staff, and members of our greater
community even in this digital age. As a first step toward realizing this
goal, the library staff began a lecture series entitled Synergy:
Explorations in Science and Society, held in the Current Periodicals Room.
This new lecture series highlights research, teaching and grants in science
and engineering at UCSC and brings these efforts to the attention of the
UCSC and greater Santa Cruz community. The response to this lecture series
has been overwhelmingly positive with record attendance. This venture marks
the beginning of a successful move toward integrating the library further
into the mission of the University and further increases the library's
connection with its faculty.
CINF
6
Planning a
combined engineering, computer sciences, and physics library at Stanford
University
Grace A. Baysinger, Swain Library of Chemistry & Chemical
Engineering, Stanford University, Organic Chemistry Building, 364 Lomita
Drive, Stanford, CA 94305-5080, Fax: 650-725-2274, graceb@stanford.edu
A
new library for the Engineering, Computer Sciences, and Physics communities
at Stanford University is slated to open in 2012. It will be a
state-of-the-art facility that will be designed as “stackless” or
without book stacks. Planning efforts include reviewing trends, assessing
issues, and developing future visions for the facility, including its
collections, services, and staffing. User needs are being assessed via
surveys and interviews. Technical, financial, and legal opportunities and
challenges are also being evaluated. This presentation will provide an
overview of the vision and planning efforts going into this new library.
CINF
7
Knowledge
management at Cytec Industries: Building the library of the future
David A. Breiner1, Joseph J. Kozakiewicz1,
Jeanne L. Courter1, Leonard Davis1, Raymond S.
Farinato1, Steven Greenhouse1, John H. Hillhouse2,
Nimal Jayasuriya1, James A. Jubinsky1, Dana B. Moore1,
J. Wilfredo Perez1, and Gary Walters1. (1) Cytec
Industries, 1937 West Main Street, Stamford, CT 06904, david.breiner@cytec.com,
(2) Phosphine Technical Center, Cytec Industries
Since
2003, the Cytec Technical Information Center (TIC) in Stamford, Connecticut,
has undergone a radical transformation. From moving its physical location to
hiring a new staff to launching a virtual library, the Cytec TIC has become
a center of excellence for learning, idea exchange, and innovation. As its
mission, the TIC partners with Cytec R&D to leverage appropriate
technology in order to search, archive, and disseminate internal and
external information in a cost-effective, user-friendly manner. To achieve
its mission, the Cytec TIC has designed and implemented a simple web portal
for instant “one-stop” global access to technical information. Primary
resources for external information include ACS, MicroPatent, Knovel,
Elsevier ScienceDirect, Teltech, and SRI Consulting, while a web-based
document management system is utilized for retrieving important internal
information. In addition, the Cytec TIC has become a hub for
cross-functional R&D activity by hosting scientific discussion forums
and weekly poster sessions. This presentation will highlight experiences
encountered during a Knowledge Management initiative including identifying
system requirements, process design, implementation issues, cultural
challenges, and lessons learned.
CINF
8 Virtually
virtual: The postmodern pharmaceutical library
Mary Laskow, Lou Ann Di Nallo, and Mary Talmadge-Grebenar,
Information & Knowledge Integration, Bristol-Myers Squibb, Rt. 206 &
Province Line Rd., PO Box 4000 J12-01, Princeton, NJ 08543, Fax:
609-252-6280, mary.laskow@bms.com
The
Research Libraries at Bristol-Myers Squibb serve a wide and varied audience,
with chemists making up one of the main user groups. Historically we have
given them primary focus from a collection and service viewpoint. As early
but sometimes reluctant adopters of ejournals and other electronic
resources, BMS chemists have, over time, become comfortable in the virtual
world. Increasing demands on our physical library spaces from other parts of
the organization have fortunately led to the opportunity to rethink our use
of space in a thoughtful fashion. Some of the areas we are addressing
include: increasing opportunities for collaboration, aligning chemical
information professionals with clients that they serve, and reducing our
collection footprint. The physical library will remain for at least the near
future, either until key chemistry reference resources become available
electronically or until pricing models evolve to make them more affordable.
CINF
9
Copyright basics
Eric S. Slater, Publications Division, Copyright Office, American
Chemical Society, 1155 Sixteenth Street, NW, Washington, DC 20036, Fax:
202-776-8112, e_slater@acs.org
This
session will feature a general discussion of basic United States Copyright
Law, including, but not limited to, such topics as subject matter of
copyright, exclusive rights of copyright, and duration of copyright.
Additionally, there will be a detailed discussion of ACS Publications
Division Copyright Policy and how United States Copyright Law ties in to ACS
Policy. In this regard, the speaker will cover why ACS requires transfer of
copyright from authors and discuss why this approach is beneficial to all
parties involved. Other related topics will include a detailed explanation
of the ACS Copyright Status Form and, specifically, of the rights that ACS
grants back to authors/employers of authors. Finally, the session will
conclude with a primer on the permissions process, and why it is important
to be aware of copyright when using material posted on the Internet.
CINF
10
Teaching
copyright to chemistry students
S. Scott Zimmerman, Department of Chemistry and Biochemistry, Brigham
Young University, C205 BNSN, Provo, UT 84602-5700, Fax: 801-422-0153,
scott.zimmerman@byu.edu
As
chemistry professors and students, we might ask the following questions
about copyright: Who owns the copyright to students' research reports and
laboratory notebooks? Can instructors make copies of a JACS paper and
distribute them to the students in their classes? What published materials
can instructors legally include in their course packets? Can graduate
students publish papers in scientific journals and then publish the same
papers in their theses or dissertations? In this presentation, I will try to
answer these and other questions about copyright. I will also outline
suggested topics and list online resources that instructors can use in
teaching copyright to chemistry students.
CINF
11 Solution
provider perspective: A brief case study in serving the customer and their
end-users
Robert Weiner, Senior Vice President, Copyright Clearance Center, 222
Rosewood Drive, Danvers, MA 01923, Fax: 978-750-0347, bweiner@copyright.com
The
demand for digital content is greater than ever, forcing both information
content users and rights holders to search for new ways to engender
compliance with U.S. copyright law. Rights holders want to maintain control
over how their intellectual property is used and at what cost, while
information consumers want to reproduce and disseminate material without
putting their institutions at risk of infringement litigation. Fortunately,
there are solutions.
CINF
12
Intellectual
property agreements
Gianna Arnold, Epstein Becker and Green, 1227 25th Street, NW, Suite
700, Washington, DC 20037-1175, Fax: 202-296-2882, garnold@ebglaw.com
Intellectual
property assets are critical for technology companies and often account for
a large percentage of such companies' capital. Accordingly, appropriate
protection and leveraging of such assets can greatly enhance value and can
be crucial to success. Patents, trademarks, copyrights, trade secrets and
contracts are used to protect and leverage intellectual property assets.
This presentation will focus upon the use of contracts – both in-house
agreements and strategic alliances. Whether such contracts are used to
protect intellectual property rights, improve in-house capability or garner
revenue, the goal is to enhance the strength and value of the corporate
entity. Items discussed will include types of contracts, the licensing
process, and drafting considerations.
CINF
13
Publish and
your patent rights may perish
Alan M. Ehrlich, Weiss, Moy & Harris, P.C., 1101 Fourteenth St.,
N.W., Suite 500, Washington, DC 20005, Fax: 202-216-0083, aehrlich@weissmoyharris.com
Patents
are awarded for inventions of articles, methods and compositions that are
useful, novel, and not obvious to one ordinarily skilled in the art. A
patent's value stems from the fact that a patent owner initially has the
exclusive right to exclude others from making, using, selling or importing
the invention, and the owner can sell that exclusive right in whole or in
part. The novelty is lost if the invention has been published prior to
filing of a patent application. Thus, there is a potential conflict between
researchers' interests in publishing and their employers' desires to
maintain that exclusivity. This paper will outline those disclosures that
destroy patentability and ways to balance the interests of publication and
commercialization.
CINF
14
Harvesting the
scientific information in patent documents: What non-patent specialists
should know
William M. Mercier and Jan Williams, Chemical Abstracts Service,
Columbus, OH 43210, Fax: 703-435-0827, wmercier@cas.org
CAS
databases offer millions of patent references from more than 50 active
patent-issuing authorities around the world. These patents can be viewed not
only as documents of legal significance, but also rich sources of scientific
information; in fact, over 60 percent of the new small molecules CAS adds
each year to the CAS REGISTRY(SM) are from patent documents rather than
journal literature. The scientific information contained in these patent
records makes a broader scope of data available for research and data
analysis. Those patent records that meet CAS selection criteria (those
covering chemistry, biochemistry and chemical engineering) are analyzed and
fully indexed by CAS scientists in less than 27 days from the date of issue.
Complementing this patent information, CAS references a wealth of journal
literature dating back to 1907. This information can assist in making
business-critical decisions, directing a research project, or assessing prior art
for patentability.
CINF
15 Text search
anomalies and how to cope with the "tough" searches in PubMed for
your just-in-time knowledge needs
Soaring Bear, MeSH, NLM/NIH, 8600 Rockville Pike B2E17, Bethesda, MD
20894, Fax: 301-402-2002, soaringbear@nih.gov
As
much as one fifth of MEDLINE's Medical Subject Headings (MeSH) indexing vocabulary
(http://www.nlm.nih.gov/mesh/MBrowser.html) is modified each year to keep up
with additions and changes in science. Recent changes in MeSH will be
presented along with three easy steps you can follow to help you keep up
with and use the changes for better and faster search results.
Changes
in MeSH usually improve search results but can sometimes confuse searchers
and automated informatics tools. For instance, why does a search on the word
‘sweetening’ fail to deliver 100 thousand citations on ‘sweetening
agents’? Why does a search on benzo[a]pyrene give a syntax error? Why does a
search on ‘plants’ fail to find 20 thousand citations about ‘plant
extracts’? Why does a search on ‘anti-inflammatory’ fail to get 60
thousand citations about ‘antiinflammatories’? At MeSH we are doing the best we
can to help provide good search results, but the multiplicity of word
meanings and the budget limit what any categorization scheme can do. You've
got to do the rest. Here's how.
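As a toy illustration of why such literal searches miss related terms, the following Python sketch compares whole-word matching with a crude normalization. It is a sketch under stated assumptions, not PubMed's or MeSH's actual query processing: the two-entry term list, the function names, and the stemming rule are all hypothetical.

```python
import re

# Hypothetical mini-index of indexing terms; not real MeSH data.
TERMS = ["plant extracts", "antiinflammatories"]


def tokens(text):
    """Lowercase the text and split it on anything that is not a letter or digit."""
    return [t for t in re.split(r"[^a-z0-9]+", text.lower()) if t]


def exact_match(query, terms):
    """Return the terms that contain every query token as a whole word."""
    wanted = tokens(query)
    return [t for t in terms if all(w in tokens(t) for w in wanted)]


def squash(text):
    """Drop everything except letters and digits, so hyphens and spaces vanish."""
    return re.sub(r"[^a-z0-9]+", "", text.lower())


def crude_stem(word):
    """Strip one common ending; an illustrative rule, not a real stemmer."""
    for suffix in ("ies", "y", "s"):
        if word.endswith(suffix) and len(word) > len(suffix) + 2:
            return word[: -len(suffix)]
    return word


def loose_match(query, terms):
    """Return the terms whose squashed form contains the stemmed, squashed query."""
    stem = crude_stem(squash(query))
    return [t for t in terms if stem in squash(t)]
```

In this toy setting, `exact_match("plants", TERMS)` finds nothing because "plants" and "plant" are different whole words, while `loose_match("plants", TERMS)` recovers "plant extracts". The same holds for "anti-inflammatory" and "antiinflammatories".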
CINF
16 Text and data
mining: Together at last!
Anthony J. Trippe, Science IP/Chemical Abstracts Service, 2540
Olentangy River Rd., Columbus, OH 43210, atrippe@cas.org
Many
techniques and tools have long been available to information professionals
for statistical analysis of fielded (structured) data. Lately, there has
been an increased focus on the analysis of textual (unstructured) data.
Traditionally, these forms of analysis have been conducted separately. In
general, it was not possible for the value and strengths of these approaches
to be combined. New software now allows the application of rigorous data
mining tools, e.g., data grouping and clean-up, to the creation of bar
charts and 2-D matrix charts from fielded data. It also allows the use of
text mining elements, including data harmonization, for the creation of
concept clusters and maps from unstructured data. Output from both is linked
and dynamically interactive. A brief discussion of the software's
capabilities will be followed by a case study on how the marriage of text
and data mining supports strategic business research by providing rapid,
insightful analyses.
CINF
17 Knowing when to
say "When..."
Farhad Soltanshahi, Michael S. Brusati, and Robert D. Clark, Tripos,
Inc, 1699 South Hanley Road, St. Louis, MO 63144
Sampling
large data sets efficiently is a computational challenge but it can also be
a philosophical one. Keeping structural diversity within the selected subset
high is important, but so is maintaining representativeness of the data set
as a whole. As the fraction of the data set selected increases, enhancing
diversity becomes increasingly expensive in computational terms, but of
progressively less value in practical terms. So when does it make sense to
stop worrying about diversity and shift over to straight random sampling?
Optimizable k-dissimilarity (OptiSim) is a stochastic selection method that
is uniquely positioned for addressing this question, in part because it
returns an ordered selection set in which the earlier selections are, on
average, measurably more distinctive and more representative than are later
ones.
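The trade-off between diversity and random sampling can be made concrete with a simplified k-dissimilarity-style selection in Python. This is a sketch under stated assumptions, not the authors' OptiSim implementation: the function name, the Euclidean distance on coordinate tuples, the `radius` exclusion rule, and the choice to return unchosen subsample members to the candidate pool are all illustrative. With `k=1` each pick is effectively random; with `k` as large as the data set each pick maximizes the minimum distance to what is already selected.

```python
import math
import random


def optisim_like(points, n_select, k, radius=0.0, seed=0):
    """Return an ordered selection; each pick is the best of a random subsample of size k."""
    rng = random.Random(seed)
    pool = list(points)
    if not pool:
        return []
    rng.shuffle(pool)
    selected = [pool.pop()]  # the first selection is random

    def min_dist(candidate):
        # Distance from a candidate to its nearest already-selected point.
        return min(math.dist(candidate, s) for s in selected)

    while len(selected) < n_select and pool:
        subsample = []
        while pool and len(subsample) < k:
            candidate = pool.pop()
            # Candidates closer than `radius` to a selected point are discarded.
            if min_dist(candidate) >= radius:
                subsample.append(candidate)
        if not subsample:
            break
        # Keep the subsample member that is most dissimilar to the selection so far.
        best = max(range(len(subsample)), key=lambda i: min_dist(subsample[i]))
        selected.append(subsample.pop(best))
        # Simplification: unchosen members go back into the pool for later rounds.
        pool.extend(subsample)
        rng.shuffle(pool)
    return selected
```

Because the result is ordered, a caller can truncate it at any length and inspect how the minimum distance of each new pick falls off as more items are selected.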
CINF
18 Maximizing
chemical knowledge: New approaches in spectral data mining and search via
the successful consolidation of multi-technique spectral data
Gregory M. Banik1, Deborah Kernan2, Kevin
Scully3, and Marie Scandone3. (1) Bio-Rad
Laboratories, Informatics Division, 3316 Spring Garden Street, Philadelphia,
PA 19104, gregory_banik@bio-rad.com, (2) Bio-Rad Laboratories, Informatics
Division, (3) Informatics Division, Bio-Rad Laboratories, Inc
It
has become standard practice in multiple applications, such as compound
verification or unknown sample identification, for scientists to run a
sample and, using spectral search software, compare it to commercial and/or
proprietary reference databases of spectra. The software mines the reference
data and calculates a score or hit quality index (HQI) to describe the
correlation or “closeness” of the match between the spectrum being
examined and the spectra of known compounds in reference databases.
This
paper describes a new approach to spectral searching which gives scientists
who analyze samples using multiple spectral techniques the ability to
simultaneously combine all spectral information available to yield a single
search result. In a series of case studies, we will demonstrate how this
approach enables the optimization of chemical similarity and maximizes
chemical knowledge in order to identify several unknown samples.
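The idea of folding several spectral techniques into one search result can be sketched in Python as follows. The abstract does not give the scoring formula, so everything here is an assumption made for illustration: the HQI is taken to be a cosine similarity scaled to 0-100, the combined score is a plain mean over the techniques that the unknown and the reference share, and the two-compound library with four-point "spectra" is invented toy data.

```python
import math


def hqi(query, reference):
    """Cosine similarity of two equal-length intensity vectors, scaled to 0-100."""
    if len(query) != len(reference):
        raise ValueError("spectra must be sampled on the same axis")
    dot = sum(q * r for q, r in zip(query, reference))
    norm = math.sqrt(sum(q * q for q in query)) * math.sqrt(sum(r * r for r in reference))
    return 100.0 * dot / norm if norm else 0.0


def combined_search(unknown, library):
    """Rank library compounds by mean HQI over the techniques both sides share."""
    ranked = []
    for name, spectra in library.items():
        shared = [technique for technique in unknown if technique in spectra]
        if not shared:
            continue
        score = sum(hqi(unknown[t], spectra[t]) for t in shared) / len(shared)
        ranked.append((score, name))
    return sorted(ranked, reverse=True)


# Toy data (hypothetical): the IR traces are identical, only Raman differs.
library = {
    "compound A": {"IR": [0, 1, 0, 1], "Raman": [1, 0, 0, 0]},
    "compound B": {"IR": [0, 1, 0, 1], "Raman": [0, 0, 1, 0]},
}
unknown = {"IR": [0, 1, 0, 1], "Raman": [1, 0, 0, 0]}
ranked = combined_search(unknown, library)
```

In this toy case an IR-only search cannot separate the two candidates, whereas the combined ranking places compound A first, which is the kind of benefit a multi-technique search is meant to deliver.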
CINF
19 Hierarchical
k-means clustering using principal components to solve the unsupervised
multi-class classification problem
James F. Rathman1, Syed B. Mohiddin1, and
Chihae Yang2. (1) Department of Chemical and Biomolecular
Engineering, The Ohio State University, Koffolt Laboratories, 140 West 19th
Avenue, Columbus, OH 43210-1110, Fax: 614-292-3769, rathman.1@osu.edu, (2)
Leadscope, Inc
Current
clustering techniques can be grouped as either supervised or unsupervised.
In a supervised method, each observation in the training dataset is
pre-assigned to a class based on prior knowledge, while an unsupervised
method uses no prior knowledge of the class distinction. Numerous supervised
techniques have been demonstrated to work well for binary classification and
a few of these are reasonably good at making supervised multi-class
predictions. However, techniques for unsupervised binary and multi-class
predictions have not been fully developed. In this work, we present an
analysis technique based on hierarchical K-means using differentially
weighted principal component analysis to address unsupervised classification
for both binary and multi-class problems. We demonstrate the methodology on
both biological (NCI 60 cancer cell lines dataset and acute leukemia
dataset) as well as chemical datasets with the objectives of predicting
class membership and identifying non-redundant features most responsible for
differentiating the observed classes.
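A divisive form of hierarchical k-means can be sketched in a few lines of Python. This is a minimal sketch, not the authors' method: it assumes the observations have already been projected onto (optionally weighted) principal components, since the abstract does not specify the weighting scheme, and the deterministic starting centroids, the rule of always splitting the cluster with the largest scatter, and the fixed target number of clusters are illustrative choices.

```python
import math


def centroid(points):
    """Coordinate-wise mean of a non-empty list of points."""
    dim = len(points[0])
    return tuple(sum(p[d] for p in points) / len(points) for d in range(dim))


def sse(points):
    """Within-cluster sum of squared distances to the centroid."""
    c = centroid(points)
    return sum(math.dist(p, c) ** 2 for p in points)


def two_means(points, iterations=20):
    """Split points into two groups with plain k-means (k = 2)."""
    # Deterministic start: the first point and the point farthest from it.
    a = points[0]
    b = max(points, key=lambda p: math.dist(p, a))
    left, right = list(points), []
    for _ in range(iterations):
        left = [p for p in points if math.dist(p, a) <= math.dist(p, b)]
        right = [p for p in points if math.dist(p, a) > math.dist(p, b)]
        if not left or not right:
            break
        a, b = centroid(left), centroid(right)
    return left, right


def hierarchical_kmeans(points, n_clusters):
    """Repeatedly bisect the cluster with the largest scatter."""
    clusters = [list(points)]
    while len(clusters) < n_clusters:
        worst = max(clusters, key=sse)
        left, right = two_means(worst)
        if not left or not right:
            break  # nothing left to separate
        clusters.remove(worst)
        clusters.extend([left, right])
    return clusters
```

Recording which cluster each split came from would yield the tree structure; the sketch keeps only the leaf clusters for brevity.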
CINF
20 Dynamic
equation of state evaluation with ThermoData Engine
Chris D. Muzny1, Eric W. Lemmon1, Robert D.
Chirico2, Vladimir V. Diky2, Qian Dong1,
and Michael Frenkel2. (1) Physical and Chemical Properties
Division, National Institute of Standards and Technology, 325 Broadway,
Boulder, CO 80305-3328, Fax: 303-497-5044, chris.muzny@nist.gov, (2)
Thermodynamics Research Center (TRC), National Institute of Standards and
Technology (NIST)
ThermoData
Engine (TDE) is a software tool recently released by the Thermodynamics
Research Center at the National Institute of Standards and Technology that
for the first time implements the concept of dynamic data evaluation for
thermodynamic property data. In this talk we will present an extension of
TDE that implements the dynamic data evaluation concept for pure fluid
equations of state. We will detail the performance of TDE in comparison to
established equations of state based on individual static data evaluations.
The specific equations of state we compare against are those presented in
NIST REFPROP, a software tool that delivers recent, state-of-the-art
equations of state for over 80 fluids. Full implementation of the dynamic
data evaluation concept requires continuous acquisition and storage of new
data. Toward this end we will also present an extension of TDE that allows
for on-demand TDE local database updates from a central server.
CINF
21 Leveraging open
access chemical information with Text Influenced Molecular Indexing
Richard D. Hull, Axontologic, Inc, 12565 Research Parkway, Suite 300,
Orlando, FL 32826
Research
and development of new text mining algorithms for drug discovery have been
hampered by the restricted availability of large, open access chemical
databases. Recent efforts to make more chemical information available to
researchers are opening promising new avenues of research. Text Influenced
Molecular Indexing (TIMI) is a process that discovers correlations between
structural components of chemical structures and the textual contexts that
these structures are described within, namely, the scientific literature,
internal research reports, and chemical patents. TIMI can identify
recognized and novel latent relationships between compounds, proteins,
genes, diseases and other domain concepts that are expressed across very
large textual corpora. A linchpin of this technique is the ability to
recognize chemical names within these texts and access their corresponding
chemical structures. We describe our work with TIMI as an example of what
can be done when large numbers of chemical structures are made available for
text mining purposes.
CINF
22 PubChem
Stephen H. Bryant, Computational Biology Branch, National Center for
Biotechnology Information, National Institutes of Health, Bldg. 38A, Rm.
5S504, Bethesda, MD 20894, Fax: 301-480-9241, bryant@ncbi.nlm.nih.gov
PubChem
is a new online information resource from NCBI. The system provides open
access to information on the biological properties of chemical substances.
Following the sequence-deposition model used by GenBank, PubChem's
content is derived from user depositions of chemical structure and bioassay
data, including data from NIH's Molecular Libraries Roadmap initiative. The
PubChem retrieval system supports searches based on chemical names and
chemical structure, as well as searches based on bioassay descriptions and
activity values. It furthermore provides links to depositor sites, for
further information on each substance, as well as links to other NIH
resources such as the PubMed biomedical literature database and Entrez's
protein 3D structure database.
CINF
23 The ZINC
database as a new research tool for ligand discovery
John Irwin and Brian Shoichet, Department of Pharmaceutical
Chemistry, University of California, San Francisco, 1700 4th St, San
Francisco, CA 94143,
jji at cgl.ucsf.edu
(email address altered at author's request)
ZINC
is a free database of commercially available compounds for virtual
screening, available on the web at http://zinc.docking.org. ZINC represents
small molecules as biologically relevant models suitable for virtual
screening and other related applications. To make the database useful we
have focused on addressing commercial availability, "drug
likeness", stereochemical and regiochemical ambiguity of many supplier
catalogs, physical properties, protonation, charge and tautomeric equilibria.
The database may be searched and subsets created using on-line tools. Parts
of ZINC have been downloaded by thousands of institutions worldwide in
academia, government, and industry. ZINC continues to evolve: a dozen new
compound suppliers and millions of new compounds have been added over the
past year via quarterly releases. Numerous errors have been corrected thanks
to alert and helpful users. This presentation will discuss some applications
of ZINC as well as some of the ways we are trying to make ZINC better. ZINC
relies extensively upon vendor catalogs, commercial software and GPLed
software which are acknowledged on our website. The delicate balance of
providing a freely available service based partly on commercial software
will be discussed.
CINF
24 MOLTABLE: An
open access initiative on molecular informatics
M Karthikeyan, Information Division (Digital Information Resource
Centre), National Chemical Laboratory, Dr. Homi Bhabha Road, Pune 411008,
India, Fax: +91-20-5893973, karthi@ems.ncl.res.in, and S Krishnan,
Information Division, National Chemical Laboratory
MolTable
is an open access initiative[1] to collect, compute and distribute data
to the academic and research community. Through this portal one can query a large
number of molecules by similarity, computed molecular properties, etc., and
download the results in .csv format[2]. Since molecular
descriptors are extensively used in QSAR, QSPR and QSTR studies, it was
proposed to compute descriptors such as topological, electronic and property
data for all the molecules[3-4]. These data, in combination with activity,
property or toxicity data, can be used for building predictive models with
the aid of statistical tools (PLS, PCR, kNN, SVM, ANN, etc.). Some of the
molecules are linked with Dspace@NCL, an open access initiative[5,6].
Molecular data can be downloaded in standard SMILES format. Visualization
of the molecules is achieved with the help of ChemAxon's MarvinViewer. Details
will be presented.
1. http://moltable.ncl.res.in/index.htm
2. http://moltable.ncl.res.in/nrm/sample.txt
3. http://moltable.ncl.res.in/nrm/moltable.jsp
4. http://moltable.ncl.res.in/nrm/molprop.jsp
5. http://dspace.ncl.res.in/
6. http://moltable.ncl.res.in/public/thesis_1130.jsp
CINF
25 Open access
chemical-information and computer-aided drug design resources
Marc C Nicklaus, Laboratory of Medicinal Chemistry, CCR, NCI, NIH,
Bldg. 376, Boyles Street, Frederick, MD 21702, Fax: 301-846-6033, mn1@helix.nih.gov,
Markus Sitzmann, Laboratory of Medicinal Chemistry, CCR, National Cancer
Institute/Frederick, NIH, DHHS, Igor V. Filippov, Laboratory of Medicinal
Chemistry, National Cancer Institute, and Wolf-Dietrich Ihlenfeldt, Xemistry
GmbH
We
present an update on the tools and resources used in the drug design and in
silico screening work of the CADD Group at LMC, CCR, NCI. Many of these
chemoinformatics resources are implemented in the form of web services, and
open access is granted to the public for most of them at http://cactus.nci.nih.gov.
Web-based search interfaces are presented for databases with millions of
compounds using a search engine operating in distributed mode across a Linux
cluster. Many of these databases are being made publicly available,
including multi-million collections of commercial screening samples, as well
as data sets from various U.S. Government agencies. Also presented are new
automated tools for generating such web services, as well as tools and
services utilizing new calculable CACTVS hash code-based identifiers useful
for rapid compound identification and database overlap analyses.
CINF
26 Automatic
aggregation of open chemical data
Nick E Day1, Peter Murray-Rust2, Henry S. Rzepa3,
Simon M. Tyrrell4, and Yong Zhang4. (1) Department of
Chemistry, Unilever Centre for Molecular Sciences Informatics, Lensfield
Road, CB2 1EW Cambridge, United Kingdom, Fax: +44-1223-763076, ned24@cam.ac.uk,
(2) Unilever Centre for Molecular Informatics, University of Cambridge, (3)
Department of Chemistry, Imperial College of Science, Technology and
Medicine, (4) Unilever Centre for Molecular Science Informatics, University
of Cambridge
Most
experimental chemical data (e.g. crystal structures (80%), spectra (99%),
comp chem (>99%)) is never published in machine-understandable form and
is effectively lost. However where authors deposit it alongside publication,
either in repositories or as supplemental data to journal articles or
theses, we show that it can be extracted and preserved.
The
components of our process have been automated and are:
- a workflow to manage the process
- conversion of legacy structural formula (MOL, ChemDraw, SMI, etc.) to InChI (the IUPAC chemical identifier)
- conversion of crystallography (CIF), spectra (JCAMP) and computational chemistry (MOPAC, GAMESS, etc.) to CML
- archival in an Open XML-aware repository
- publication of metadata through the Open academic repository system (e.g. DSpace, eprints), disseminated using RSS and RDF.
The
primary data object is the chemical compound, indexed by InChI and its
properties (with standard CML/RDF metadata). Robots can search collections
for compounds and properties and compile indexes of different degrees of
comprehensiveness or specialisation. We have shown that these are well
indexed by conventional search engines (Google(TM), MSN(TM)) thus removing
the need for specialised chemical software on the Chemical Semantic Web. The
search results are highly customisable and, as they are Open, can be used
directly for further scientific research or re-dissemination.
All
software in this system ("WorldWideMolecularMatrix", WWMM) is
available as Open Source.
![]()
CINF
27 Predictive
models for genotoxicity based on discriminating structural features and
reassembled medicinal chemistry building blocks
Constantine Kreatsoulas1, Chihae Yang2, Glenn
J. Myatt2, and James F. Rathman3. (1) BMS, Princeton,
NJ 08543, constantine_kreatsoulas@merck.com, (2) Leadscope, Inc, (3)
Department of Chemical and Biomolecular Engineering, The Ohio State
University
A
chemical structure-based strategy is used to develop two classes of
predictive models of genetic toxicity as determined by the SOS Chromotest
assay. The SOS assay has high concordance with the standard Ames assay and
has been used successfully for numerous diverse compound classes. In one
approach, the MultiCASE algorithm was used to automate the extraction of
substructures for the prediction of genotoxicity. This model was then
applied to data sets for which SOS data is available.
In
addition to modeling the global results, models for chemically similar
subsets were also developed. For each specific dataset and endpoint,
predictive scaffolds were then constructed using structural features from a
library of 27,000 medicinal chemistry building blocks. Scaffolds were built
separately for the global dataset and each subset. Results are compared for
models built using partial logistic regression for both binomial and
multinomial ordinal toxicity endpoints.
![]()
CINF
28 Building and
using an in-house platform for data mining and analysis integrating open
source and proprietary software: I. Designing and constructing the framework
Erik Evensen, Hans E. Purkey, Ken Lind, and Erin K. Bradley,
Computational Sciences, Sunesis Pharmaceuticals Inc, 341 Oyster Point Blvd.,
South San Francisco, CA 94080, Fax: 650-266-3501, ee@sunesis.com
A
common problem faced by computational chemists is integrating and
transferring data among numerous and disparate systems. This process often
involves managing and translating multiple flat files, a process that does
not scale well to complex workflows with large data sets. We have
constructed a database-backed platform utilizing open source software,
primarily MySQL and Python, that enables building complicated data
management and analysis processes incorporating data generated by both open
and closed source software. In addition, we have developed internal
protocols based on open standards such as XML-RPC to make available
computational results both within and outside of our platform. By using
well-known, open standards, we are able to leverage widely available
knowledge and experience. We will present lessons learned and wisdom gained
during the development of this platform.
![]()
CINF
29 ABCD: From data
to insight
Dimitris K. Agrafiotis, Johnson & Johnson Pharmaceutical Research
& Development, L.L.C, 665 Stockton Drive, Exton, PA 19341, Fax:
610-458-8249
Johnson
and Johnson has recently unveiled ABCD (http://www.bioitworld.com/archive/061704/discovery.html),
an informatics platform that bridges multiple continents, data systems and
cultures using modern information technology, and provides researchers with
an environment that allows them to make better decisions. The system
consists of three major components: 1) a data warehouse, which combines data
from multiple chemical and pharmacological transactional databases,
organized using dimensional modelling principles to support supreme query
performance; 2) a state-of-the-art application suite, which facilitates data
upload, retrieval, mining and reporting, and 3) a workspace, which
facilitates collaboration by allowing users to share queries, templates,
results and reports across project teams, campuses, and other organizational
units. A central goal of ABCD is to provide users with the means to
retrieve, view and analyze multifactorial SAR data. Key to the success of
this effort is the ability to combine fast substructure and similarity
searching with conventional relational queries, and deliver the results in
an expedient and visually compelling format. In this presentation, we give
an overview of ABCD, and focus on a few core components that represent the
system's "chemical intelligence", including the chemical
cartridge, sketcher, molecular spreadsheet and interactive data mining
components.
![]()
CINF
30 Double focusing
by molecular bioactivity and drug likeness
Anwar Rayan, David Marcus, Ohad Givaty, Dinorah Barasch, and Amiram
Goldblum, Medicinal Chemistry and Natural Products, Hebrew University of
Jerusalem, School of Pharmacy, Jerusalem 91120, Israel, Fax: 972-2-675-8925,
anvarr@md.huji.ac.il, amiram@vms.huji.ac.il
We
have developed an Iterative Stochastic Elimination (ISE) algorithm to
construct sets of best results for highly complex combinatorial problems (1-4).
The ISE was used to construct sets of molecular descriptor ranges that serve
as filters for distinguishing between drugs and non-drugs. Other methods
suggest filters that produce a binary result, acceptance or rejection of a
molecule as a drug candidate. We employ large sets of best filters to assign
a Drug Like Index (DLI) to any molecule, which corresponds to its chance to
belong to a database of drugs. A similar approach is applied to databases of
biological activity, for which a Molecular Bioactivity Index (MBI) is
produced for any specific activity. We find many molecules with a high DLI
value in large databases of non-drugs, and propose to examine them for their
bioactivity. These molecules are then assigned values of MBI for a specific
bioactivity. This double focusing approach with DLI and MBI is proposed as a
process for discovering molecules with specific biological activities in
large databases of known or of virtual molecules.
References:
(1) Glick, M.; Rayan, A.; Goldblum, A. Proceedings of the National Academy of Sciences of the United States of America 2002, 99, 703-708.
(2) Glick, M.; Goldblum, A. Proteins-Structure Function and Genetics 2000, 38, 273-287.
(3) Rayan, A.; Noy, E.; Chema, D.; Levitzki, A.; Goldblum, A. Current Medicinal Chemistry 2004, 11, 675-692.
(4) Rayan, A.; Senderowitz, H.; Goldblum, A. Journal of Molecular Graphics and Modelling 2004, 22, 319-333.
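The abstract above does not say how the index is computed from the set of filters. As an illustration only, the sketch below assumes that each filter is a set of descriptor ranges and that the index is the fraction of filters a molecule passes, which gives a graded score rather than a single accept/reject answer. The descriptor names, the ranges and the scoring rule are all hypothetical and are not the ISE-derived values.

```python
from typing import Dict, List, Tuple

# One filter = an allowed (low, high) range per descriptor.
# All names and ranges below are hypothetical illustrations.
Filter = Dict[str, Tuple[float, float]]


def passes(molecule: Dict[str, float], flt: Filter) -> bool:
    """True if every descriptor named in the filter lies within its range."""
    return all(low <= molecule[name] <= high for name, (low, high) in flt.items())


def filter_index(molecule: Dict[str, float], filters: List[Filter]) -> float:
    """Fraction of filters passed (assumed scoring rule, between 0 and 1)."""
    if not filters:
        raise ValueError("at least one filter is required")
    return sum(passes(molecule, f) for f in filters) / len(filters)


FILTERS: List[Filter] = [
    {"mol_weight": (150.0, 500.0), "logp": (-1.0, 5.0)},
    {"mol_weight": (200.0, 450.0), "h_donors": (0.0, 5.0)},
    {"logp": (0.0, 4.0), "h_acceptors": (1.0, 10.0)},
    {"mol_weight": (100.0, 400.0), "h_acceptors": (0.0, 8.0)},
]

# A made-up molecule described only by precomputed descriptor values.
candidate = {"mol_weight": 420.0, "logp": 3.2, "h_donors": 2.0, "h_acceptors": 6.0}
score = filter_index(candidate, FILTERS)
print(f"index = {score:.2f}")
```

Here the candidate passes three of the four filters, so a single binary filter could have rejected it outright while the graded index still ranks it fairly high.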
![]()
CINF
31 Chemical
datamining approach to scaffold based QSAR studies of NCI anti-tumor dataset
M Karthikeyan, Information Division (Digital information Resource
Centre), National Chemical Laboratory, Dr. Homi Bhabha Road, Pune 411008,
India, Fax: +91-20-5893973, karthi@ems.ncl.res.in, Letha Sebastian, Dept of
Bioinformatics, Amman College, and Alexander Tropsha, Laboratory for
Molecular Modeling, School of Pharmacy, University of North Carolina
The National Cancer Institute (NCI) has been screening compounds in vitro to
determine their inhibition of cell growth in the NCI 60 human cancer cell
lines for the purpose of anticancer drug discovery. The chemical structures
and their activity data were processed to remove duplicate molecules and
erroneous structures. In this process, about 32,000 molecules with their
reported biological activity data (NLOGGI50, NLOGTGI, NLOGLC50) for 60 human
tumor cell lines were organized in an Oracle database table. Each molecule
and its biological activity data were linked to the corresponding molecular
descriptors through a common identifier for querying the database. The
descriptors required for QSAR/QSPR analysis were calculated for all of these
molecules: topological, electronic, quantum mechanical, 2D and 3D
descriptors; predicted properties such as molar refractivity, solubility and
the logP(o/w) partition coefficient; and drug-likeness information related
to Lipinski's rule of 5, including the number of rotatable bonds, the number
of hydrogen bond acceptors, the number of hydrogen bond donors, total polar
surface area, etc. Scaffold and functional group analysis was conducted on
the NCI data set to identify the number of common scaffolds [Fig-1]. Selected
sets of scaffolds were used for QSAR studies using MOE descriptors and the
built-in PLS, PCR and other statistical methods. The data-mining methods and
computational results are presented.
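As a small illustration of the rule-of-5 check mentioned above, the sketch below counts violations of the four classic criteria (molecular weight over 500, logP over 5, more than 5 hydrogen bond donors, more than 10 hydrogen bond acceptors). It assumes the descriptor values have already been calculated by some other tool; the `Descriptors` class and the two example molecules are hypothetical and are not taken from the NCI data set.

```python
from dataclasses import dataclass


@dataclass
class Descriptors:
    """Precomputed descriptor values for one molecule (hypothetical container)."""
    mol_weight: float   # molecular weight in daltons
    logp: float         # octanol/water partition coefficient, log scale
    h_donors: int       # number of hydrogen bond donors
    h_acceptors: int    # number of hydrogen bond acceptors


def rule_of_five_violations(d: Descriptors) -> int:
    """Count how many of the four rule-of-5 criteria the molecule violates."""
    return sum([
        d.mol_weight > 500,
        d.logp > 5,
        d.h_donors > 5,
        d.h_acceptors > 10,
    ])


# Made-up example values, for illustration only.
small = Descriptors(mol_weight=180.2, logp=1.2, h_donors=1, h_acceptors=4)
large = Descriptors(mol_weight=720.9, logp=6.3, h_donors=2, h_acceptors=12)

for name, d in (("small", small), ("large", large)):
    print(name, rule_of_five_violations(d))
```

Stored next to each compound record, a violation count like this can be queried together with the activity data when selecting subsets for modeling.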
![]()
CINF
32 The use of
Random Forests for modeling in vitro ADMET endpoints
Jason D Hughes, Molecular Informatics, Pfizer, 620 Memorial Dr,
Cambridge, MA 02139
A
framework for molecular property/activity prediction consisting of a Random
Forest model coupled with a custom set of descriptors has been found to be
very effective across a variety of endpoints, including kinetic solubility,
membrane permeability, metabolic stability, and dofetilide binding. Random
Forests are bagged decision tree ensembles that are trained and applied
normally but for one exception: only a small, randomly selected subset of
descriptors is considered when selecting the best split at each node during
tree construction. The descriptors used here are all simple molecular
substructure or feature counts encoded as Daylight SMARTS queries. Some
mathematical properties of these RF-based models have been explored,
including the impact of descriptor and training set selection schemes,
nearest neighbor effects, etc. Additionally, examples will be given to
demonstrate that the effectiveness of this modeling paradigm compares
favorably to a selection of alternatives.
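The one exception described above, choosing each split from a random subset of descriptors, can be sketched for a single tree node as follows. This is a minimal illustration and not the implementation used in the work: the Gini impurity criterion, the midpoint thresholds, the function names and the toy substructure-count data are all assumptions.

```python
import random
from typing import List, Optional, Sequence, Tuple


def gini(labels: Sequence[int]) -> float:
    """Gini impurity of a set of 0/1 class labels."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2.0 * p * (1.0 - p)


def best_split(
    rows: List[List[float]],
    labels: List[int],
    n_candidates: int,
    rng: random.Random,
) -> Optional[Tuple[int, float, float]]:
    """Best (descriptor index, threshold, weighted impurity) for one node.

    Only `n_candidates` randomly chosen descriptors are examined, which is
    what distinguishes this from ordinary decision-tree split selection.
    Returns None if no examined descriptor has two distinct values.
    """
    candidates = rng.sample(range(len(rows[0])), n_candidates)
    best = None
    for j in candidates:
        values = sorted({row[j] for row in rows})
        # Candidate thresholds: midpoints between consecutive distinct values.
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2.0
            left = [y for row, y in zip(rows, labels) if row[j] <= t]
            right = [y for row, y in zip(rows, labels) if row[j] > t]
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
            if best is None or score < best[2]:
                best = (j, t, score)
    return best


# Toy data: each row holds three substructure counts for one molecule.
rows = [[0, 2, 1], [1, 1, 0], [0, 3, 1], [3, 1, 1], [4, 2, 0], [3, 0, 1]]
labels = [0, 0, 0, 1, 1, 1]

print(best_split(rows, labels, n_candidates=3, rng=random.Random(0)))
print(best_split(rows, labels, n_candidates=1, rng=random.Random(1)))
```

With all three descriptors as candidates the first descriptor separates the classes cleanly; with a single random candidate the node may have to settle for a weaker split, which is the source of diversity among the trees in the ensemble.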
![]()
CINF
33 Web services as
integrators of public chemistry databases
Gary Wiggins, School of Informatics, Indiana University, 901 E. Tenth
Street, Bloomington, IN 47408-3912, Fax: 812-856-4764, wiggins@indiana.edu
PubChem
and other chemistry databases on the Web will provide a wealth of chemical
and biological information. We are embarking on a series of projects that
will utilize computer simulation and visualization environments to create an
integrated chemical informatics cyberinfrastructure built on modern
distributed service architectures. The projects will use the emerging
high-capacity computer networks, powerful data repositories, and computers
that comprise the Grid, thus ensuring scalability, computational efficiency,
and interoperability among heterogeneous components. A description of the
overall architecture of the projects and the planned links to the databases
will be presented.
![]()
CINF
34 Chemical and
biological data from DTP/NCI
Daniel W Zaharevitz, Information Technology Branch, Developmental
Therapeutics Program, National Cancer Institute, EPN, Room 8010, 6130
Executive Blvd, Bethesda, MD 20892, Fax: 301-480-4808, zaharevitz@dtpax2.ncifcrf.gov
The
Developmental Therapeutics Program (DTP) at the National Cancer Institute
has been acquiring compounds for testing since 1955. This effort has
resulted in the accumulation of a wealth of chemical and biological
information. DTP has made this information useful to the research community
by making the data publicly available and by developing tools that search
and analyze the data. Over 250,000 chemical structures and over 10 million
biological data points are available. Biological data includes measurement
of growth inhibition in sixty human tumor cell lines, growth inhibition in
yeast strains with defined mutations, protection from HIV in cell culture,
anti-tumor activity in numerous mouse tumor models in vivo, and several
other assays. Searches can be done by NSC number, CAS registry number,
chemical name, or chemical substructure. Development of a data architecture
for organizing this data will be discussed as well as plans for future
additions to the data.
![]()
CINF
35 Public
information databases for virtual screening
John Irwin and Brian Shoichet, Department of Pharmaceutical
Chemistry, University of California, San Francisco, 1700 4th St, San
Francisco, CA 94143, jji@cgl.ucsf.edu
Investigators
wishing to apply computational methods such as virtual screening to discover
novel ligands for proteins require a database of molecules suitable for
docking. To shorten the hypothesis-testing cycle, these compounds should be
commercially available and broadly "drug-like". To address this
need, which has been a barrier to entry to this field, we developed the ZINC
database of purchasable compounds for virtual screening, a collection
currently of 3.3M compounds available from over 20 vendors. Notwithstanding
our original goal of serving the virtual screening community, ZINC has
attracted the attention of cheminformaticists more generally as a source of
publicly available chemical structures for research. By the time of this
meeting, a large part of ZINC should have been loaded into PubChem, the new
database of chemical structures and screening data from NCBI that is tightly
linked into the chemical and biological literature. This link from PubChem
to ZINC complements the existing links from ZINC into PubChem, and to
compound vendor websites. We hope this growth of a web of publicly available
chemical information, linking the literature to 3D structures, properties,
and chemical suppliers, will be a boon to investigators, particularly those
who have hitherto not had access to this information. ZINC is on the web at
http://zinc.docking.org.
![]()
CINF
36
NIST
Computational chemistry comparison and benchmark database
Russell D. Johnson III, Computational Chemistry Group, National
Institute of Standards and Technology, 100 Bureau Drive Stop 8380,
Gaithersburg, MD 20899, Fax: 301-869-4020, russell.johnson@nist.gov
The
NIST Computational Chemistry Comparison and Benchmark Database (CCCBDB) is a
website and database which allows users to compare ideal-gas thermochemical
properties determined by experiment or by quantum chemical calculations. The
database contains experimental data for more than 640 small molecules, and
over 100 000 calculations. Types of data include enthalpies of formation,
entropies, geometries, vibrational frequencies, and dipole moments. The
primary goal of the CCCBDB is to allow comparisons of thermochemistry and
related properties (entropies, geometries, vibrational frequencies). The
CCCBDB illustrates the question “How good is that calculation?” by
providing many examples. This talk will describe the data present in the
CCCBDB, the tools available through the website for comparisons, and the
future plans of the CCCBDB. The CCCBDB is accessible at http://srdata.nist.gov/cccbdb.
![]()
CINF
37 Chemical
information databases for environmental fate and exposure assessments
Suzanne Bogaczyk1, Philip H. Howard2, William
M. Meylan2, Amy Hueber2, and Jay Tunkel2.
(1) Syracuse Research Corporation, 1215 South Clark Street, Suite 405,
Arlington, VA 22202, Fax: 703-418-1044, sbogaczyk@syrres.com, (2)
Environmental Science Center, Syracuse Research Corporation
Accurate
and dependable sources of chemical information are of great importance in
the assessment of chemicals for environmental purposes. Syracuse Research
Corporation (SRC) produces and maintains several databases of this type,
including the Environmental Fate Database (EFDB) and the physical properties
database (PHYSPROP). The EFDB, which is continually updated and maintained
at SRC, was developed in conjunction with the EPA to allow rapid access to
available environmental fate and physical/chemical properties data on
chemical substances. PHYSPROP contains a recommended single value for water
solubility, octanol water partition coefficient, melting and boiling point,
vapor pressure, Henry's Law constant, and hydroxyl radical rate constant for
over 25,000 chemicals. SRC also developed ChemS3, a web-based
search engine which allows sub-structure searches to be combined with
queries of text and numeric data. The compilation of these databases and
their versatility in searching for environmental fate and exposure
information on chemical substances will be discussed.
![]()
CINF
38 3-D Database
search queries for colchicine binding site inhibitors
Ann Hermone, Tam Luong Nyguyen, James Burnett, Connor McGrath, Ernest
Hamel, Daniel W Zaharevitz, and Rick Gussio, Information Technology Branch,
Developmental Therapeutics Program, PO Box B, FVC 310, Frederick, MD 21702,
Fax: 301-846-6106, hermone@dtpax2.ncifcrf.gov
Microtubules,
which are linear arrays of alternating alpha and beta tubulin, are critical
for cellular proliferation and are therefore a target of cancer
chemotherapy. Colchicine was the first compound found to bind at the
interface of alpha and beta tubulin and to destabilize microtubules. Over
the years, a large number of structurally diverse small molecules have been
shown to bind at the colchicine site of tubulin and inhibit tubulin
polymerization. In other work by our group, docking studies involving the
recently-determined X-ray structure of the alpha,beta tubulin/colchicine
complex were used to construct binding models for a set of structurally
diverse colchicine site inhibitors, which subsequently formed the basis for
a common pharmacophore. This study expands on that work by developing
internally consistent Catalyst search queries that can discriminate between
colchicine site inhibitors and their inactive congeners.
![]()
CINF
39 Algorithms and
cancer drugs: In silico design of S100B ligands to block p53 binding
John L. Whitlow, Department of Chemistry, East Carolina University,
300 Science and Technology Building, Greenville, NC 27858, Fax:
206-424-1645, john@johnwhitlow.com
Cancer
is the leading cause of death for persons under the age of 85. Elevated
levels of S100B are associated with cancer. This research focused on
interactions between S100B and the tumor suppressor protein, p53. S100B
disrupts p53's protective function by inhibiting p53's C-terminal regulatory
domain phosphorylation. This study designed compounds to block the effects
of S100B on p53. Compounds that enhance p53's cellular function may provide
potent anticancer therapies.
Accelrys's
Cerius2 software was used for de novo drug design. The three dimensional
structure of S100B was analyzed to resolve its main interaction sites.
Fragment molecules were screened against targets of interaction in the S100B
active site. Top fragment molecules were used as scaffolds to design
complete ligand molecules. Additionally, public and private molecular
libraries were run through docking algorithms to locate existing molecules
with high affinities for the S100B active site. ADME and toxicity properties
were also investigated.
![]()
CINF
40 Framework for
integrating transcriptomic and proteomic profiles in Escherichia coli
Kunal Aggarwal, Leila H. Choe, and Kelvin H. Lee, School of Chemical
and Biomolecular Engineering, Cornell University, 120 Olin Hall, Ithaca, NY
14853, Fax: 607-255-9166, ka62@cornell.edu
We
have developed a model experimental system to study the relationship between
mRNA and protein expression profiles in genetically perturbed E. coli.
Experimental data at the genomic, transcriptomic and proteomic levels from
these cells are integrated on a common platform to understand the effects of
the introduced genetic and environmental perturbations in the cells at the
molecular level. The cells are perturbed to overexpress fragments of rhsA
in the presence of IPTG and are observed to have a reduced growth rate. Gene
expression and protein abundance data from these cells suggest a perturbed
translation machinery and a nonlinear correlation between the mRNA and
protein levels in rhsA-overexpressing E. coli cells. The gene
expression data are integrated with the connectivity information between
genes and their transcription factors using network component analysis to
gain information on altered levels of transcription factor activity and to
identify parameters that may cause the observed nonlinearity between the
mRNA and protein levels.
![]()
CINF
41 3-D-QSAR CoMFA
and CoMSIA studies of novel alkoxylated and hydroxylated chalcones as
potential anti-malarial agents
Devendra S Puntambekar and Mange Ram Yadav, Department of
Pharmaceutical Chemistry, The M.S University of Baroda, Pharmacy department,
Faculty of Technology & Engineering, Kalabhavan, P.O Box - 51, Vadodara,
Gujrat, India, Baroda 390 001, India, Fax: +91-0265-2423898/2418927,
devendra_res@yahoo.co.uk
Comparative molecular field analysis (CoMFA) and comparative molecular
similarity indices analysis (CoMSIA) were performed on a series of novel
alkoxylated and hydroxylated chalcones as antimalarial agents. The ligands
were superimposed on the template structure by atom- and shape-based RMS
fit methods. Removing outliers from the initial set of 69 compounds
improved the predictivity of the models. A statistically significant model
was established from 52 compounds and validated with a test set of 14
compounds. The atom- and shape-based alignment yielded the best predictive
CoMFA model (r2cv = 0.674, r2ncv = 0.957, r2pred = 0.670, F value = 83.040,
r2bs = 0.992 with six components), while the CoMSIA model yielded r2cv =
0.610, r2ncv = 0.913, r2pred = 0.726, F value = 50.115 and r2bs = 0.947 with
seven components. The contour maps obtained from the 3D-QSAR studies were
appraised for the activity trends of the molecules. The results indicate
that steric, electrostatic, hydrophobic and hydrogen bond donor substituents
play a significant role in the antimalarial activity of these compounds.
![]()
CINF
42 Automatic
molecular library generation of processed bioenzymes by proteolysis methods
for bioremediation processes
Vito Librando, Danilo Gullotto, and Zelica Minniti, Department of
Chemistry, University of Catania, via Andrea Doria 6, Catania, Catania,
Italy, vlibrando@unict.it, envch3@unict.it
The goal of the present work is the implementation of informatics
procedures able to interface with application software environments.
Procedures were developed for computer processing in molecular modeling and
allow the generation of molecular libraries, including data on the sequence
and structural configurations of bio-enzymes. Each library contains
molecular structures that differ by several amino acid deletions within
specified regions of the molecule. It is thus possible to obtain a
collection of molecular fragments derived from the ancestral protein. The
protein side chains obtained by this strategy were compatible with the
enzymatic proteolysis methods used in conventional laboratory protocols,
which helped to decrease the time required to apply experimental
procedures. The methodology was able to identify many physicochemical
properties of the source molecule, guiding the selection procedure toward
the residues most suitable as candidates for proteolysis. The program takes
into consideration a set of indices and parameters related both to amino
acid sequence properties (hydrophobicity) and to the occurrence of amino
acid types within secondary structures (helices, sheets and loops). The
criteria used to choose residues suitable for proteolysis were based on the
capability to recognize many features in a protein sequence. The advantage
of such a strategy is that it allows proteins to maintain their structural
and energetic features without conformational changes in the secondary
structure, consequently avoiding a probable loss of protein activity.
Finally, this method allows the generation of a wide set of optimised
fragmented structures that are suitable to be tested and applied in
subsequent computational molecular modeling environments.
Acknowledgements
The authors are grateful to MIUR for financial support.
![]()
CINF
43 Library
generation and lead selection for optimal laboratory procedure of
environmental biocatalysts
Vito Librando, Danilo Gullotto, and Zelica Minniti, Department of
Chemistry, University of Catania, Viale Andrea Doria, 6, Catania 95127,
Italy, Fax: +39-095-580138, vlibrando@unict.it
Among the contaminated sites of Sicily, particularly the Siracuse Bay, little
attention has been given to pollution and remediation. The petroleum products
that remain as long-term contaminants include polycyclic aromatic
hydrocarbons (PAHs), a family of ubiquitous pollutants with similar
biological activity, high toxicity, and mutagenic and carcinogenic power.
This paper describes preliminary results of an in situ treatment strategy
using engineered enzymes extracted from selected bacteria for low-cost
bioremediation of petroleum products that are poorly degraded by naturally
occurring bacteria. The effects of sequence modification can be predicted
using particular algorithms, and it is possible to design and test numerous
different active molecules derived from the original ones. Multiple virtual
deletions in the amino acid sequence were obtained by working on the
original PDB file, and the new sequences were annealed using force fields in
molecular dynamics simulations that took real environmental parameters into
account. The structures were analyzed to find those with the best
configuration of active site and selective channels for the substrate;
multiple docking simulations were then performed for all the different
substrates, giving information about the amount of interaction between the
enzymes and substrates of environmental interest. A complete scan of the
protein surface was carried out using naphthalene as a probe to find
possible new inactive binding sites that could hold the substrate far from
the active site.
![]()
CINF
44 Modeling vs.
X-ray crystallography: The basal activity of constitutive androstane
receptor (CAR)
Björn Windshügel, Institute for Pharmaceutical Chemistry,
Martin-Luther-University Halle-Wittenberg, Wolfgang-Langenbeck-Str. 4, Halle
(Saale) 06120, Germany, bjoern.windshuegel@pharmazie.uni-halle.de
Abstract
text not available.
![]()
CINF
45 Mok: A
domain-specific language for molecular information processing
Ivan Tubert-Brohman and William L. Jorgensen, Department of
Chemistry, Yale University, 225 Prospect St., New Haven, CT 06520, Fax:
203-432-6299, Ivan.Tubert-Brohman@yale.edu
Mok
is a domain-specific language for molecular information processing, based on
the same execution paradigm as the AWK programming language. It is derived
from Perl and includes specialized functions and command-line options for
molecular file input and output, substructure matching, bond perception from
3D coordinates, and an object model for accessing and modifying various
properties of the atoms and bonds in a structure. It is freely available on
CPAN under the same license as Perl itself.
![]()
CINF
46 WinDock: An
integrated structure-based drug discovery environment using graphical user
interface
Zengjian Hu1, Donnell Bowen2, Shaomeng Wang3,
and William M. Southerland1. (1) Department of Biochemistry and
Molecular Biology, Howard University College of Medicine and the Howard
University Drug Discovery Unit, 520 West Street, Northwest, Room 324,
Washington, DC 20059, zhu@howard.edu, (2) Department of Pharmacology, Howard
University College of Medicine, (3) Comprehensive Cancer Center and
Department of Internal Medicine, The University of Michigan
In
recent years, virtual database screening using high-throughput molecular
docking (HTD) has emerged as a very important tool and method for finding
new leads in the drug discovery process. Most HTD efforts utilize expensive
workstations and hard-to-master Unix-like operating systems. With the advent
of powerful and inexpensive personal computers (PCs), it is now possible to
perform HTD investigations on Windows-based PCs. To make HTD more accessible
to a broad community, we present WinDock, an integrated structure-based
drug discovery environment for Windows-based PCs. WinDock integrates
searchable 3D small-molecule databases, homology modeling tools,
ligand-protein docking programs, and consensus scoring functions into a
cohesive system that provides a general tool for a wide range of
computer-aided drug discovery applications, including protein homology
modeling, lead identification, and lead optimization. WinDock is coded in
C++ and is distributed free of charge to all users.
![]()
CINF
47 Turbo
similarity searching
Jérôme Hert1, Peter Willett1, David J. Wilton1,
Kamal Azzaoui2, Edgar Jacoby2, and Ansgar
Schuffenhauer2. (1) Department of Information Studies, University
of Sheffield, Western Bank, Sheffield S10 2TN, United Kingdom, j.hert@sheffield.ac.uk,
(2) Discovery Technologies, Novartis Institute for Biomedical Research
Previous
work has shown that fusing the outputs of similarity searches carried out
using different isoactive reference compounds produces a more effective
ranking than one based on just a single reference compound. Turbo similarity
searching applies this strategy using a reference molecule and its nearest
neighbours. The similar property principle implies that these neighbour
compounds are likely to have a similar bioactivity profile; accordingly it
may be worth including them in a fusion procedure. The effectiveness of this
method is investigated by means of simulated virtual screening experiments
using the MDL Drug Data Report Database. Extensive searches are carried out
for eleven diverse activity classes and consistently demonstrate the
superiority of turbo similarity searching over conventional similarity
search. This method hence represents a simple way of enhancing
similarity-based virtual screening methods.
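The strategy described above can be sketched in a few lines. The abstract does not specify the fingerprints, the similarity coefficient or the fusion rule, so this sketch assumes set-based fingerprints, the Tanimoto coefficient and fusion by taking the maximum score over the reference and its nearest neighbours; the molecule names and feature sets are made up.

```python
from typing import Dict, FrozenSet, List

Fingerprint = FrozenSet[int]  # set of "on" feature identifiers (assumed form)


def tanimoto(a: Fingerprint, b: Fingerprint) -> float:
    """Tanimoto coefficient between two set-based fingerprints."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def rank(reference: Fingerprint, database: Dict[str, Fingerprint]) -> List[str]:
    """Conventional search: database IDs sorted by similarity to one reference."""
    return sorted(
        database,
        key=lambda name: tanimoto(reference, database[name]),
        reverse=True,
    )


def turbo_rank(
    reference: Fingerprint, database: Dict[str, Fingerprint], k: int
) -> List[str]:
    """Search with the reference plus its k nearest neighbours, then fuse.

    Fusion rule (an assumption): each database molecule keeps its highest
    similarity to any of the k + 1 reference structures.
    """
    neighbours = rank(reference, database)[:k]
    references = [reference] + [database[name] for name in neighbours]
    fused = {
        name: max(tanimoto(ref, fp) for ref in references)
        for name, fp in database.items()
    }
    return sorted(database, key=fused.get, reverse=True)


reference = frozenset({1, 2, 3, 4})
database = {
    "mol_a": frozenset({1, 2, 3, 5, 6}),
    "mol_b": frozenset({1, 4, 7, 8}),
    "mol_c": frozenset({3, 5, 6, 9}),
    "mol_d": frozenset({7, 8, 9, 10}),
}

print(rank(reference, database))
print(turbo_rank(reference, database, k=1))
```

In this toy set `mol_c` shares little with the reference directly but is close to the reference's nearest neighbour `mol_a`, so the fused ranking moves it above `mol_b`.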
![]()
CINF
48
On-line
submission and peer review systems
William G Town, Kilmorie Consulting, 24A Elsinore Road, London SE23
2SL, United Kingdom, bill.town@kilmorie.com
In
the last ten years, electronic publishing of the results of scientific
research has developed from being a novelty to being accepted as the normal
method of publishing. Systems for online submission of articles, for peer
review and for transmission of approved articles into the production
workflow systems which manage both print and electronic publishing are now
commonplace. This paper will review the technologies which have made this
transition possible and the impact of these systems on authors' and peer
reviewers' experience of publishing and on the timeliness of peer review and
publishing. The impact of preprint servers will also be discussed.
![]()
CINF
49
Path to
document recommendation services: Technologies that enabled the development
of on-line information systems
Gerry Grenier, Publishing Technologies, IEEE, Inc, 445 Hoes Lane,
Piscataway, NJ 08855, g.grenier@ieee.org
Online
information services are a 40-year-old phenomenon. The evolution of these
services has been on-going, with perhaps the most significant period of
change occurring over the past 10 years. The development of the internet
infrastructure, the rise of the world wide web and the http protocol spurred
an explosion of online information services that has evaporated temporal and
spatial barriers to information. Lost in the excitement of the development
of the internet are the developments of the previous 30 years that have
contributed significantly to the search and discovery capabilities of online
information services. Among those developments are full text search,
relevancy ranking, and markup languages. This paper will offer a look at the
development of these three technologies, and then review the state of the
art of the nascent service of document recommendation – a service that is
built upon the three aforementioned technologies.
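
As a rough illustration of relevancy ranking, one of the three technologies named above, the following minimal sketch scores documents against a query with a simple TF-IDF weighting. It is not taken from the talk: the function names, the weighting formula, and the sample documents are assumptions chosen only for illustration.

```python
import math
from collections import Counter

def tokenize(text):
    """Lowercase and split on whitespace (a deliberately naive tokenizer)."""
    return text.lower().split()

def rank(query, documents):
    """Return document indices ordered from most to least relevant.

    Each document is scored by summing, over the query terms it contains,
    term frequency times inverse document frequency (TF-IDF).
    """
    tokenized = [tokenize(d) for d in documents]
    n_docs = len(tokenized)
    # Document frequency: in how many documents does each term appear?
    df = Counter()
    for tokens in tokenized:
        df.update(set(tokens))
    scores = []
    for idx, tokens in enumerate(tokenized):
        tf = Counter(tokens)
        score = 0.0
        for term in tokenize(query):
            if term in tf:
                score += (tf[term] / len(tokens)) * math.log(n_docs / df[term])
        scores.append((score, idx))
    # Highest score first; ties keep the original document order.
    return [idx for score, idx in sorted(scores, key=lambda p: (-p[0], p[1]))]

# Hypothetical sample documents, invented for this sketch.
documents = [
    "full text search of chemistry journals",
    "markup languages for journals",
    "relevancy ranking in full text search",
]
ranking = rank("relevancy ranking", documents)
```

Production systems typically add stemming, field weighting, and length normalization; this sketch only shows the core idea that rarer query terms contribute more to a document's score.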
![]()
CINF
50
Clustering and
meta-search as enabling technologies for rapid creation of vertical web
portals
Raul E. Valdes-Perez, Vivisimo Inc, 2435 Beechwood Blvd, Pittsburgh,
PA 15217, Fax: 412-422-2495, valdes@vivisimo.com
Specialized
web portals, or vortals, provide a comprehensive gateway for information on
a scientific specialty. The vortal fad of the late 90's stalled because of
costs, both human and technological, and the lack of a web business model. The
situation has now changed: the new technologies of search, meta-search, and
clustering enable rapid deployment of vortals that index one's own or public
information, meta-search partner search engines, and cluster the combined
information into categories. This opens up radical new possibilities for
publishers and for industry to deploy internal vortals for their scientists
and engineers.
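
The pipeline described above (meta-search several engines, merge the hits, then cluster the combined results into categories) can be sketched in a few lines. This is only an illustrative toy under stated assumptions, not Vivisimo's method: the two engine functions, their hard-coded hits, the example URLs, and the shared-title-term grouping rule are all hypothetical.

```python
from collections import defaultdict

# Hypothetical stand-ins for partner search engines. Each returns a list of
# (url, title) hits; a real engine would be queried over the network.
def engine_a(query):
    return [
        ("http://example.org/1", "Ionic liquids synthesis review"),
        ("http://example.org/2", "Ionic liquids thermodynamic properties"),
    ]

def engine_b(query):
    return [
        ("http://example.org/2", "Ionic liquids thermodynamic properties"),
        ("http://example.org/3", "Green synthesis in ionic liquids"),
    ]

STOPWORDS = {"in", "of", "the", "and", "a"}

def meta_search(query, engines):
    """Query every engine and merge the hits, dropping duplicate URLs."""
    merged = {}
    for engine in engines:
        for url, title in engine(query):
            merged.setdefault(url, title)
    return merged

def cluster_by_term(results, query):
    """Group result URLs under title terms shared by at least two results."""
    query_terms = set(query.lower().split())
    clusters = defaultdict(list)
    for url, title in results.items():
        for term in set(title.lower().split()):
            if term not in STOPWORDS and term not in query_terms:
                clusters[term].append(url)
    return {t: sorted(urls) for t, urls in clusters.items() if len(urls) >= 2}

results = meta_search("ionic liquids", [engine_a, engine_b])
categories = cluster_by_term(results, "ionic liquids")
```

Grouping on a single shared title term is the simplest possible clustering rule; it is used here only to make the merge-then-categorize flow concrete.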
![]()
CINF
51
Why your
library doesn’t do what you want it to
Stuart L. Weibel, Office of Research, OCLC Online Computer Library
Center, 6565 Frantz Road, Dublin, OH 43017, Fax: 614 764 2344, weibel@oclc.org
The
success and appeal of Lego blocks is more than esthetics; it is rooted in
engineering principles that comprise the foundation of the industrial age,
and are essential for systems engineering. The Lego metaphor gives us a
sound conceptual model for the design of information systems including the
principles of standardized interfaces, modular design, and extensibility.
There
is a darker aspect of information systems that can be modeled with Lego as
well, however, and this metaphor sheds some light on the difficulties we
have in the design and use of electronic publishing systems and digital
library technology in general. The Week-After-Christmas metaphor evokes a
box of dozens or hundreds of unique parts that, while interoperable in some
sense, can be recombined in a staggering array of configurations… not all
of which make sense. The rapid changes in a broadly distributed information
environment make it impossible to anticipate these changes and difficult
simply to accommodate them in a coherent way. The result is constant change
and a requirement for adaptation that is a new feature of the education and
research process.
The
challenge for designers and users is to recognize the useful bits that work
together, and configure them into sustainable, cost-effective systems that
meet the functional requirements of libraries and their constituents.
![]()
CINF
52
CAS Registry:
An evolving resource for science
Roger J. Schenck, Chemical Abstracts Service, 2540 Olentangy River
Road, Columbus, OH 43202, Fax: 614-461-7140
From
the inception of CA in 1907 and the publication of the first CA Substance
Index in 1920, the identification and storage of substance information from
the publicly-disclosed literature has always been a major focus at CAS.
Starting from a manually-curated 3X5 card file, the CAS Registry has evolved
into a computer-based collection approaching 80 million records for organic
and inorganic molecules, proteins, and sequences. The CAS Registry began as
a tool to serve the needs of CAS Production Operations but soon became and
remains today an essential adjunct to the work of researchers in academia,
industry, and government agencies around the world. This talk will
concentrate on the history of the CAS Registry, focusing on the changes in
computer technology that have enabled the evolution and huge growth of this
world resource.
![]()
CINF
53
Why are we
still reading "papers" in a digital world? Can papers become
digital, too?
David P Martinsen, ACS Publications, American Chemical Society, 1155
16th Street NW, Washington, DC 20036, Fax: 202-872-4389, d_martinsen@acs.org
The
last ten years have witnessed a revolution in the way scientists receive
information, with a remarkable impact on discovery and delivery. It is much
easier than ever before to find articles of interest and to obtain those
articles without ever leaving the lab or office. However, the method by
which scientists read articles has been evolutionary at best. Most of us
still print out a copy of the article to read and make notes. This talk will
examine some of the challenges to making reading a more digital experience,
to better realize the promises of the enhanced, digital editions of
articles.
![]()
CINF
54
Electronic data
standards for spectroscopy and analytical procedures
Antony N. Davies, Waters Informatics, Europaallee 27-29, 50226
Frechen, Germany, Fax: +49-2234-9207-99
With
the ever increasing availability of information in electronic form such as
in peer-reviewed journal articles, electronic patent submissions or
pharmaceutical submissions to regulatory bodies comes the equally increasing
pressure on standards bodies such as IUPAC and ASTM International to
ensure that the associated data can be made available in open standard
formats. This talk will review the state of the art and identify
good practice that scientists around the globe should adopt.
![]()
CINF
55
Science
online: Bridging scientific disciplines
Monica M. Bradford, Science, AAAS, 1200 New York Avenue, NW,
Washington, DC 20005, Fax: 202-289-7562, Mbradfor@aaas.org
Science,
the premier, inter-disciplinary journal of the world, has been helping
scientists communicate their peer-reviewed results to the scientific
community for over 125 years. For the last 10 of those years, Science has
embraced electronic publishing as a means of enhancing scientific
communication and helping scientists make their results more accessible.
Moving to an entirely electronic workflow for the journal has led to
decreased processing times, increased submissions, and a more international
review process. Forward and backward reference linking, multi-media
enhancements, online supplemental material, and a suite of tools have made
the online version of the journal a richer resource. Creation of two online
knowledge environments has allowed the staff to experiment with creating
online resources that expand beyond the traditional journal. Over the next
five years, the challenge will be to match technology to researchers'
behaviors to ensure that the communication vehicles match work styles and
information needs.
![]()
CINF
56
Publishing
innovation at the Royal Society of Chemistry
Richard Kidd and Robert Parker, Royal Society of Chemistry, Thomas
Graham House, Science Park, Milton Road, Cambridge CB4 0WF, United Kingdom,
Fax: +44 1223 420247, kiddr@rsc.org
The
RSC implemented an XML production route for its journals in 2000, and from
there has worked to use the XML to improve our publications. XSL-FO is used
to produce our database products and this had promise for journal article
make-up. We will demonstrate our innovative data checker, developed with the
Unilever Centre at Cambridge University, which has great potential for the
extraction of chemistry from previously published work. We will show the
ways we are developing our online articles to increase the chemical science
content in our journal articles.
![]()
CINF
57
Meeting the
communications needs of physicists: AIP’s electronic publishing
experiences
Tim Ingoldsby, Director of Business Development, American Institute
of Physics, 2 Huntington Quadrangle, Suite 1NO1, Melville, NY 11747-4502,
Fax: 516-576-2327, tingoldsby@aip.org
Physicists
invented the World Wide Web to satisfy their need for rapid communication of
research results. AIP responded to this need through early experiments with
Web publishing and the creation of an online hosting service, Scitation, to
meet the needs of physical science and engineering publishers. The evolution
of AIP's online services will be discussed, with special emphasis on the
development of linking, one of the primary “value adds” of electronic
publishing. Reports about two ongoing projects, the STIX Fonts and Essential
Information Objects, will be used to speculate about things to come in the
world of electronic scholarly communication.
![]()
CINF
58
Electronic
publishing and disruptive technologies
Karen Hunter, Elsevier, 360 Park Avenue South, New York, NY 10010,
k.hunter@elsevier.com
Paul
Saffo of the Institute for the Future has said we are in a period of
"unprecedented uncertainty" as to technology and its effects on
our business. This paper explores the effects of disruptive technologies and
"unprecedented uncertainty" about technology on the process of
experimentation, product creation and investment in STM electronic
publishing. Specific experiences of Elsevier over the past 25 years are used
as illustrations of technology trade-offs and technology-related strategic
issues, as well as a look at some of the current technology concerns.
![]()
CINF
59
Genesis of ACS
electronic journals
Lorrin R. Garson, Publications Division, American Chemical Society
(retired), 1155 Sixteenth Street, N.W, Washington, DC 20036-4892,
garson9929@yahoo.com
The
availability of ACS journals on the World Wide Web is the consequence of
over 25 years of effort. In anticipation that “someday” journals would be
delivered electronically, a concerted effort was made in journal production,
starting in 1975, to create journal data in consistent structures that would
enable the creation of digital products. Over the years various experimental
efforts were undertaken in collaboration with universities, publishers and
organizations to bring us to today's journals on the Web.
![]()
CINF
60
InChI: Open
access/open source and the IUPAC International Chemical Identifier
Stephen R. Heller, Stephen E. Stein, and Dmitrii V. Tchekhovskoi,
Physical and Chemical Properties Division, NIST, Gaithersburg, MD
20899-8380, srheller@nist.gov
With
the acceptance and use of the Internet throughout the world-wide scientific
community, the ability for chemists and their colleagues in related fields
to communicate more readily and at less expense has finally arrived. Open
Access and Open Source are public domain, freely available projects which
allow for the free exchange of information and are having and will continue
to have a positive and profound effect on chemists worldwide.
IUPAC
has long been involved in the development of systematic procedures for
naming chemical substances on the basis of their structure. The resulting
rules of nomenclature, while covering a large fraction of compounds, were
designed for text-based media. IUPAC has now developed an open source,
public domain means of representing chemical substances in a format more
suitable for digital processing and the Internet, involving the computer
processing of chemical structural information (connection tables). This has
led to the development of the IUPAC International Chemical Identifier, InChI.
Details of the status and acceptance of InChI and related freely available Open
Access information tools, such as chemistry journals (e.g., Beilstein
Journal of Organic Chemistry) will be discussed in this presentation.
![]()
CINF
61
Promoting data
standards and open public access to structure-searchable toxicity databases:
DSSTox and coordinated public efforts
Ann M. Richard, US EPA, MD B143-06, Research Triangle Park, NC 27711,
Fax: 919-685-3263, richard.ann@epa.gov, and Maritja Wolf, Lockheed Martin,
Contractor to the US-EPA
The
Distributed Structure-Searchable Toxicity (DSSTox) database project seeks
to: promote use of chemical structure standards and file formats for
chemical toxicity databases; coordinate with other public efforts to
encourage chemical structure annotation, data standardization, and open
public access to toxicity databases; and enhance the ability of scientists
and regulators within and outside EPA to integrate, explore, and utilize
chemical toxicity data from a structural perspective to improve capabilities
in predictive toxicology. DSSTox is coordinating with the public LIST ToxML
and International Life Sciences Institute on toxicity data standardization
efforts in Developmental Toxicity and other areas of toxicology. Additional
collaborations are ongoing with the LHASA VITIC structure-activity database
effort, the National Cancer Institute's chemical databases and
structure-browser, NCBI's PubChem, the IUPAC/NIST InChI chemical information
code project, and the NIEHS Chemical Effects in Biological Systems (CEBS)
toxicogenomics knowledge-base, among others.
![]()
CINF
62
The US EPA
contribution to the OECD work on the validation, for regulatory purposes, of
(quantitative) structure activity relationships: (Q)SARs
Maurice Zeeman, Kelly Mayo, and Oscar Hernandez, Office of Pollution
Prevention and Toxics, U.S. Environmental Protection Agency, 1200
Pennsylvania Ave., NW (7403), Washington, DC 20460, zeeman.maurice@epa.gov
OECD's
work on the validation, for regulatory purposes, of (Quantitative) Structure
Activity Relationship (aka (Q)SAR) models will be described.
![]()
CINF
63
OECD residue
chemistry guideline harmonization project
Amy Rispin, Rick Loranger, and Steve Funk, Office of Pesticide
Programs (OPP), U.S. Environmental Protection Agency, 1200 Pennsylvania
Ave., NW, Washington, DC 20460, rispin.amy@epa.gov
A
US-led Expert Group on Pesticide Residue Chemistry, established in Oct.
2003, is developing two guidance documents and five Test Guidelines (and
templates for reporting test study summary data).
![]()
CINF
64
Performance
standards for quality assurance of validated alternative test methods
Amy Rispin, Office of Pesticide Programs (OPP), U.S. Environmental
Protection Agency, 1200 Pennsylvania Ave., NW, Washington, DC 20460,
rispin.amy@epa.gov, Kailash Gupta, U.S. Consumer Product Safety Commission (CPSC),
and William Stokes, Division of Intramural Research, National Institute of
Environmental Health Sciences (NIEHS)
Alternative
test methods for toxicological tests are being sought to replace animal
testing when possible. Such methods must be validated in order to make their
use a feasible alternative.
![]()
CINF
65
Facilitating
electronic submission of chemical information: OECD Harmonized Templates
(and XML schema), the U.S. High Production Volume Information System (HPVIS),
and the European Union's IUCLID database [panel]
Randall Brinkhuis1, Leslie MacDougall2, Jay
Ellenberger3, Brion Cook4, and Todd Holderman4.
(1) Office of Pollution Prevention and Toxics, U.S. Environmental Protection
Agency, Washington, DC 20460, brinkhuis.randall@epa.gov, (2) OPPT, Risk
Assessment Div, U.S. Environmental Protection Agency, (3) Office of
Pesticide Programs, U.S. Environmental Protection Agency, (4) OPPT,
Information Management Division, U.S. Environmental Protection Agency
This
session will include a description of the development of harmonized
templates by OECD for submission of individual study report summaries or
study evaluation reports. Separate templates are being developed for over
seventy different toxicology, ecotoxicology, and physical-chemical property
types. In addition, XML schema are being developed for each template.
Electronic
submission of chemical summaries and data will also be discussed,
particularly with respect to high-production-volume (HPV) chemicals. The
European Union's IUCLID database will also be described.
![]()
CINF
66
ELNs: What are
they, and what do they need to do?
Keith T. Taylor1, David Hughes1, and Phil
McHale2. (1) Marketing, MDL Information Systems, Inc, 14600
Catalina Street, San Leandro, CA 94577, k.taylor@mdl.com, (2) Corporate
Communications and Scientific Affairs, MDL Information Systems, Inc
The
paper notebook is well understood and the R&D workflow has been derived
around its capabilities. The notebook is used to support patent claims, and
to prove compliance with regulations, for example the FDA's GLP. An ELN must
satisfy these basic needs but it has the potential to do more. The basic
functionality of the ELN and its potential to drive the evolution of an
electronic-R&D environment will be discussed.
![]()
CINF
67
Electronic
laboratory notebooks in the advanced undergraduate laboratories
Todd E. Woerner, Department of Chemistry, Duke University, Box 90346,
Durham, NC 27708, Fax: 919-660-1605, todd.woerner@duke.edu
As
part of an ongoing effort to use computer technology in undergraduate
courses we introduced an electronic laboratory notebook (ELN) to the
physical chemistry laboratory. Blackboard® software was selected as the
host for the ELN because it is familiar to students and meets the necessary
requirements of security and accessibility. Our eventual goal is to use the
ELN throughout the advanced undergraduate laboratories thus completing a
long-term initiative to upgrade the laboratories with networked computer
systems. With the inclusion of the ELN all aspects of laboratory work, from
literature review and experimental design, through instrument control, data
collection and record keeping, to analysis and report writing, are managed
electronically.
![]()
CINF
68
Using
electronic laboratory notebooks in an academic environment
Mahesh H Merchant1, Paresh C Sanghani2, and
Sonal P Sanghani2. (1) School of Informatics, Indiana University,
719 Indiana Avenue, Walker Plaza, WK319, Indianapolis, IN 46202, mmerchan@iupui.edu,
(2) Department of Biochemistry and Molecular Biology, Indiana University
The
notion of a paperless laboratory has been around for a long time; however,
the adoption of electronic laboratory notebooks has lagged behind.
Innovations in technology and advances in the software implementation of
electronic notebooks are making this a reality. The School of Informatics at
Indiana University has launched a graduate program in Laboratory
Informatics. The curriculum of this program includes the use of a commercial
electronic laboratory notebook in conjunction with other scientific data
management tools. The experiences of the first group of students using this
commercial electronic laboratory notebook will be discussed. An upcoming
joint project between the School of Informatics and the Masters Program in
Biotechnology using Electronic Laboratory Notebooks on Tablet PCs will be
presented.
![]()
CINF
69
Expanding the
available public chemical information using ELN's
Scott E. Schaus, Department of Chemistry, Boston University, Life
Science and Engineering Building, 24 Cummington Street, Boston, MA 02215,
Fax: 617-353-6466, seschaus@bu.edu
The
Center for Chemical Methodology and Library Development at Boston University
(CMLD-BU, http://cmld.bu.edu/) is a new center funded by the National
Institute of General Medical Sciences (NIGMS) focused on the discovery of
new methodologies to produce novel chemical libraries of unprecedented
complexity for biological screening. The goal of the CMLD-BU is to explore
and expand the diversity of small-molecule libraries by creating general,
useful protocols for stereocontrolled synthesis. A major objective of the
CMLD-BU is to provide information and chemistry protocols to the public on
parallel and chemical library synthesis. The Center has created the
Synthesis Protocols Database of electronic laboratory notebook procedures to
accomplish this goal.
![]()
CINF
70
From
collaboration tool to semantic e-record: The evolving role of the Electronic
Laboratory Notebook (ELN)
James D. Myers1, Charles E. Arp2, Tara Talbott1,
and Michael Peterson1. (1) Mathematics and Computational Science,
Battelle / Pacific Northwest National Laboratory, Battelle Blvd. MS K1-87,
Richland, WA 99352, Fax: 208-474-4616, jim.myers@computer.org, (2) Records
Management Office, Battelle
The
open-source Electronic Laboratory Notebook (eln.sourceforge.net) has been in
use as a collaboration and productivity tool at Pacific Northwest National
Laboratory (PNNL) for nearly a decade. During that time, the ELN has evolved
significantly and has incorporated records-related functionality.
Independent of this technical evolution, interest at PNNL in fielding a
general electronic notebook as a business record has been growing. Over the
past year, these trends have resulted in an active dialog between ELN
developers and records managers and discussion of pilot deployment. A number
of factors, ranging from the incorporation of web service and semantic
technologies within the ELN and records systems at PNNL to successes in
migrating other forms of records to electronic form, have contributed to the
current momentum. This talk will review the drivers of the current activity,
highlight enabling factors, and present the technical and organizational
progress being made.
![]()
CINF
71
Global ELN
deployments: Experience from the front lines
Chris J. Ruggles, Professional Services Dept, CambridgeSoft Corp, 100
CambridgePark Drive, Cambridge, MA 02140, Fax: 617-588-9190, cruggles@cambridgesoft.com
In
this paper, we will discuss the challenges that arise when managing global
deployments of ELN systems. The migration from a personal, paper paradigm
to a public, electronic model poses complex challenges. We will describe how
end user sensitivity to scientific security issues, IT Manager concerns for
consistency and maintainability, and executive desire for increased
productivity can be bridged by providing a common forum in which these
issues can be resolved. The ELN deployment process provides a unique
opportunity for groups often in conflict to discuss, debate, and resolve
these issues. Coordinating requirements for patent attorneys, medicinal
chemists, process engineers, biologists, as well as executive management
brings a perspective to each group that is too often lost in the day to day
focus of individual work. CambridgeSoft will present the results of several
global ELN deployments where functional working groups have provided
invaluable assistance not only to the success of the deployment, but to the
greater organization as a whole.
![]()
CINF
72
Electronic Lab
Notebooks: How they can impact productivity in the laboratory and how to
justify a purchase
Richard M. Stember, Scientific Division, EKM Corporation, 25255 Cabot
Rd Ste 103, Laguna Hills, CA 92653, Fax: 949-455-1523, stemberr@ekmco.com
Understanding
how an electronic lab notebook can benefit laboratory operations is critical
to its justification, purchase, and proper implementation. This presentation
will detail the ELN functions that can have the greatest benefit to an
organization and review several real-world cases and independent productivity
studies. The benefits of collaborative tools, data mining and searching,
corporate and regulatory compliance protocols, and interfacing to related
applications will be discussed.
An
introduction to developing a Return on Investment (ROI) justification based upon these
functions and projected productivity enhancements will be presented.
• Which features are needed to improve productivity
• What is needed for Research, R&D, QC, and Services laboratories
• How integration with existing systems impacts productivity
• What may be important outside of the laboratory
• How to develop ROI justification based upon projected productivity gains
![]()
CINF
73
Green chemistry
and the environmental community: Building bridges with ICE -- information,
communication, education
Frederick Stoss, Science and Engineering Library, University at
Buffalo - SUNY, Buffalo, NY 14260, fstoss@buffalo.edu
“Green
Chemistry” is a concept approaching its adolescence in terms of the number
of years it has existed. Much of its growth and development has gone
virtually unnoticed by the environmental community. The demand for consumer
goods, however, still requires connections to chemistry and chemical
processes. This demand for goods has placed a wide variety of
well-documented burdens on the environment and the producers of those
consumer goods. Combating the environmental impact of "chemicals"
has proven to be a costly venture in terms of time, energy, expertise, and
money. Green Chemistry emerged as a means to shift the focus of
environmental impact to prevention of adverse impacts. This is accomplished
through the design of chemical processes more compatible with positive
environmental outcomes, such as decreasing the amounts of harmful chemicals
used in processes, reduction in on-site storage of chemicals, and
implementation of strategies to achieve zero-discharge of pollutants and
emissions. The concept of Green Chemistry has been closely aligned to that
of sustainability and to a lesser degree the tenets of the Precautionary
Principle. The "three-legged stool" model of
sustainability—Ecology, Economics, Equity—has a logical analog for
introducing the concepts of Green Chemistry for stakeholders: Government
Agencies, Business and Industry, Environmental Concern. This presentation
will examine another model to explore the potential for increased public
awareness and outreach: Green Chemistry ICE—Information, Communication,
Education—and cite the example of the Center for Environmental Information
(Rochester, NY) as the type of organization that can provide such services,
programs, and publications. The roles of components of the American Library
Association (e.g., Task Force on the Environment, Social Responsibilities
Round Table, and Science and Technology Section of the Association of
College and Research Libraries) and the Special Libraries Association (e.g.,
Environment and Resource Management Division, Chemistry Division,
Engineering Division, Science and Technology Division) will also be
discussed. The case for development of educational services and information
products, information resources sharing networks, and effective
cross-disciplinary communications to better inform the communities of
environmental concern and responsibility will be made.
![]()
CINF
74
IUPAC Ionic
liquids database, ILThermo
Qian Dong1, Chris D. Muzny1, Robert D. Chirico1,
Vladimir V. Diky1, Joseph W. Magee1, Jason Widegren1,
Kenneth N. Marsh2, and Michael Frenkel1. (1) Physical
and Chemical Properties Division, National Institute of Standards and
Technology, 325 Broadway, Boulder, CO 80305-3328, qian.dong@nist.gov, (2)
Department of Chemical and Process Engineering, University of Canterbury
Recent
reviews on ionic liquids, one of the emerging topics in chemistry during the
past five years, have called for comprehensive investigations of chemical
and physical properties in order to understand the nature, functionality,
and potential uses of ionic liquids. In 2003, an IUPAC task group was formed
to address the need for international scientific cooperation. The goal was
to create a web-based comprehensive database for storage and retrieval of
metadata and numerical data for ionic liquids, including their syntheses,
structures, purity, and properties. The data project consists of two major
parts: (1) data capture and storage employing a NIST/TRC data archive system
known as SOURCE; (2) data search and retrieval building on a J2EE/ORACLE web
platform. It aims to provide users worldwide with up-to-date information on
published experimental investigations of ionic liquids, including
numerical values of chemical and physical properties, measurement methods,
sample purity, uncertainty of property values, as well as many other
significant measurement details. The database can be searched by means of
the ions constituting the ionic liquids, the ionic liquids themselves, their
properties and references. The first edition is scheduled for public release
via the internet by the end of September 2005.
![]()
CINF
75
New tool to
improve access to green chemistry and engineering resources
Jennifer L. Young, Green Chemistry Institute, American Chemical
Society, 1155 Sixteenth St., NW, Washington, DC 20036, Fax: 202-872-6206,
j_young3@acs.org, and Paul T. Anastas, Green Chemistry Institute, American
Chemical Society
The
resources for green chemistry today are in many forms, but there is no
unifying framework that assembles these widely scattered resources. As a
result, chemists and engineers, who may not be experts in green chemistry
and green engineering, struggle to access the information and implement
pollution prevention in their work. The Green Chemistry Institute is
currently working on a tool to aid chemists and engineers in identifying
opportunities for green chemistry and engineering and locating the most
relevant green chemistry resources. With this tool, the user is guided
through an opportunity assessment protocol in relation to their chemical or
process design and then directed to the resources most relevant to their
application. The tool provides access to a wide variety of resources
including software, databases, web sites, examples, journal articles, and
keywords. Progress on this project and related information dissemination
tools will be presented.
![]()
CINF
76
Green chemistry
and environmental sustainability: A middle school module
Michael Rottas, Pfizer Global Research & Development - Groton/New
London Laboratories, Eastern Point Road, Groton, CT 06340, michael.h.rottas@pfizer.com
Pfizer
Global Research & Development, through a grant from the Pfizer Education
Initiative, and in collaboration with The Keystone Center (of Keystone,
Colorado), is developing an interdisciplinary middle school module focused
on environmental sustainability and green chemistry. Middle school students
learn about the real world challenges of product development while balancing
the economic, social and environmental bottom lines. Green chemistry is
introduced as a means to succeed in business while ensuring natural
resources are available for future generations. The presentation will
provide an overview of the ten-day module and the drivers for its
development.
![]()
CINF
77
CBIAC and
homeland security information
James M. King, Chemical and Biological Defense Information Analysis
Center, PO Box 196, Gunpowder, MD 21010, Fax: 410-676-9703, kingj@battelle.org
Chemical
and Biological Defense Information Analysis Center (CBIAC) is a full service
DoD Information Analysis Center. It is DoD's centralized source for Chemical
and Biological Defense (CBD) information/technology. CBIAC supports DoD,
Federal agencies, contractors, state and local governments, and first
responders. CBIAC encompasses all aspects of CBD and homeland security (HLS):
Manufacturing Processes for NBC Defense Systems, Chemical and Physical
Properties of CBD Materials, Identification, Combat Effectiveness, Counter
Proliferation, Counter Terrorism, Decontamination, Defense Conversion and
Dual-Use Technology, Demilitarization, Domestic Preparedness, Environmental
Fate and Effects, Force Protection, Individual and Collective Protection,
International Technology Proliferation and Arms Control, Medical Effects and
Treatment, NBC Survivability, Smoke and Obscurants, Toxic Industrial
Chemicals and Materials, Toxicology, Treaty Verification and Compliance, and
Warning and Identification. This presentation focuses on CBIAC's HLS
activities. For more information, see http://www.cbiac.apgea.army.mil/.
![]()
CINF
78
Preparing for
chemical terrorism response at the Centers for Disease Control and
Prevention
David L. Ashley, National Center for Environmental Health, Centers
for Disease Control and Prevention, Mailstop F-47, 4770 Buford Highway,
Atlanta, GA 30341, Fax: 770-488-0181, dla1@cdc.gov
In
the event of domestic terrorism, identification of the agent used and the
people affected is critical to an effective response. CDC has been given the
federal responsibility for analyzing clinical specimens from people with
suspected exposure to chemicals used during a domestic terrorist event, a
chemical accident, or an unknown chemical exposure. Specimens may be
collected and evaluated for the purpose of identifying the agent used,
confirming a preliminary field agent identification, assessing whether
individual subjects were exposed, determining the extent of exposure of
individual subjects, relating internal dose levels to medical symptoms,
assessing individuals for assignment to registries, assessing the
geographical and temporal extent of exposure and/or differentiating exposed
subjects from the worried well. Biomonitoring can provide important
information to reduce confusion during a terrorist event and aid the
public health response. This presentation will discuss
the current status of efforts within CDC's Division of Laboratory Sciences (DLS) to respond to a chemical terrorism
event.
![]()
CINF
79
Information
sharing for science and security: The path forward
Gigi K. Gronvall, Center for Biosecurity of UPMC, 621 East Pratt
Street, Suite 210, Baltimore, MD 21202, Fax: 443-573-3305, ggronvall@upmc-biosecurity.org
Since
9/11/2001, a tremendous amount of funding has been devoted to homeland
security goals. The people of the US, represented by their government,
clearly are counting on scientists to reduce the threat and consequences of
another attack. However, are enough scientists truly engaged to work in this
field? At the same time as more funds are becoming available, new laws
regarding the handling of biological agents have come into effect, and new
norms for scientific publication have been promoted by journal editors.
Fundamental public policy questions have arisen, questions that have deep
implications for the scientific community. How can the need for
research transparency be reconciled with the desire not to lower the
barriers to weapons development? How can we bridge the gulf between the
scientific and national security communities, so that the best science is
brought to bear on homeland security issues? How do US domestic actions
square with the global setting in which research takes place and information is
exchanged? This presentation will focus on the new roles that scientists,
their professional organizations, research institutions, and the government
are taking on in order to increase national and international security, the
sometimes uncomfortable merging of cultures, and paths forward that make it
easier and more appealing for scientists to work in the interests of homeland
security.
![]()
CINF
80
ArQiologist: An
integrated decision support tool for lead optimization
Atipat Rojnuckarin, Research Informatics, ArQule Inc, Presidential
Way, Woburn, MA 01730, arojnuckarin@arqule.com
This
talk describes ArQiologist, a Web-based tool that integrates chemical,
analytical, biological, and computational data to facilitate decision
support for lead optimization at ArQule. It features an easy-to-use
graphical query builder that allows queries to be saved, reused, and shared
by researchers. In addition to being an integrated portal for ArQule's
discovery data, ArQiologist offers customizable treatment of the
oft-neglected nuances unique to hierarchical compound-centric discovery
data.
![]()
CINF
81
Integrating R
with the CDK for QSAR modeling
Rajarshi Guha and Peter C. Jurs, Department of Chemistry,
Pennsylvania State University, 104 Chemistry Building, University Park,
State College, PA 16802, rxg218@psu.edu
In
this work we describe the development of a framework combining
cheminformatics and statistical functionality to provide a freely available
platform for the purposes of QSAR modeling. The Chemistry Development
Kit (CDK) is an open-source project that provides a comprehensive framework
for cheminformatics projects. Features include the ability to read multiple
formats, calculate molecular descriptors and perform substructure searches.
The R software package is an open-source project that provides a variety of
statistical and data mining functionality. A framework was developed to
allow the use of the statistical functionality provided by R from within the
CDK. We also describe a publicly available service to create QSAR models as
well as obtain predictions from reference models developed using this
framework. The service also implements the validity measure developed by
Guha et al. (J. Chem. Inf. Model., 2005, 45, 65-73) to provide a measure of
confidence in the predictions.
![]()
CINF
82
Chemogenomic
assessment of SAR data from learned journals
Richard Cox, Ah Wing E. Chan, Bissan Al-Lazikani, David Michalovich,
and John P. Overington, Inpharmatica Ltd, 60 Charlotte St, London W1T 2NU,
United Kingdom, Fax: +44 (0)20 7074 4700, r.cox@inpharmatica.co.uk
Structure-activity
relationships (SAR) between chemical structures and bioactivities
are fundamental to understanding drug-target interactions, especially when
considered in a gene family, where one needs to understand and predict
relative potency and selectivity. Although this kind of information has been
available in learned journals for many years, there is no available system
to retrieve this kind of SAR. Historically this has involved intensive
scientific effort, and it is difficult to assess coverage and accuracy in
any individual case. To address this need, we have developed StARLITe, a
database that comprises around 300,000 bioactive, synthetically tractable
compounds abstracted from primary medicinal chemistry journals. The
associated bioactivity data include over 1.3 million activity data points
and in excess of 4000 unique molecular targets, enabling both rapid
extraction of arbitrary SAR tables and sequence-based entry to the data, a
unique departure from the traditional compound-centric view of medicinal
chemistry data.
![]()
CINF
83
Enterprise
knowledge management and the industrial revolution in scientific
experimentation
Kenneth Eric Milgram, Laboratory Operations Director, Metabolon, 7820
Kingsbrook CT, Wake Forest, NC 27587, Fax: 919-882-8822, eric.milgram@gmail.com
The
term “high-throughput” is widely used throughout the
pharmaceutical industry. Most scientists probably have a mental
image of a high-throughput lab, where many activities are automated
with the aid of robotic instrumentation. However, defining
specific criteria to designate methods as high-throughput is
challenging. For example, in order to be considered high-throughput,
is there a minimum threshold for samples produced or analyzed per unit time? Must
a high-throughput method be capable of processing samples in
parallel, such as with a MUX interface or 96-channel
pipette? Is a high-throughput technique inherently of lower quality
than a conventional technique?
In
many organizations the term high-throughput has acquired negative
connotations, particularly with regard to confidence in the quality of a
product or service or the ability to infer knowledge from an experiment.
Organizations have an unfortunate tendency to dichotomize products and
services depending upon whether they were derived from conventional
or high-throughput processes. This presentation has two
objectives. The first is to gain a better understanding of the
relationship between high-throughput and non-high-throughput
processes. The second is to understand the implications of this
relationship for enterprise data and knowledge management and to offer some
concrete examples based on this author's time in industry as a member of
both a large pharmaceutical organization and a vendor of knowledge
management solutions for scientists. The insights from this
presentation have implications ranging from experimental design and
implementation to selection of LIMS and electronic laboratory notebooks (ELN).
![]()
CINF
84
Integration of
chemoinformatics and fragment-based lead discovery
Kashif Hoda, Structural GenomiX, San Diego, CA 92121, kashif_hoda@stromix.com
Structural GenomiX (SGX) has developed FAST™ (Fragments of Active Structures), a proprietary technology for lead discovery. In this fragment-based technology, large-scale crystallographic and biochemical screening, library design, and synthesis are conducted on a daily basis. It is critical to ensure a seamless data tracking and query system across the entire process, from fragment screening and selection through virtual library generation, library design, and library synthesis, in support of lead discovery efforts. The concept and implementation of the system will be discussed.