The ACS Committee on
Professional Training's Library Survey: Is there a future for modern chemical
information as a central component of education in
chemistry?
Jeanne E. Pemberton, Department of Chemistry,
University of Arizona, 1306 E. University Blvd, Tucson, AZ 85721, Fax:
520-621-8248, pembertn@u.arizona.edu
Abstract
In the fall of
2000, the ACS Committee on Professional Training undertook a survey of all
approved chemistry programs to ascertain the current situation with respect to
library and chemical information resources and their accessibility to students.
The results indicate a wide range of expenditures for chemical information
between institutions of different size, and suggest a growing problem of
affordability of modern chemical information resources, especially at
institutions that offer only bachelor's and master's degrees in chemistry. The
current state of chemical information resources at ACS-approved institutions
based on the results of this survey will be presented, and the serious issues
and concerns that these results raise will be discussed.
Evolving doors of access to ACS Web
Editions
Dean J. Smith, Sales & Marketing, American
Chemical Society, 1155 16th Street NW, Washington, DC 20036, Fax: 202-872-6005,
d_smith@acs.org
Abstract
Since the inception
of the Web Editions in 1998, ACS Publications has continually revised its
pricing models to address the needs of all customers. As a result, the cost per
article of ACS publications has steadily decreased over the years. Subscription
prices for ACS journals have traditionally been as much as 40% less than the
competition while providing the highest quality of chemical research. The Option
B pricing model for Web Editions is a consortia-based approach allowing the
widest range of access across the largest number of institutions from small to
large. In the first two years of the model's existence, the ACS did not charge a
consortia entry fee for institutions without ACS journals. The ACS has since
taken an innovative approach and experimented with a number of entry fees for
schools without any purchasing history. ACS Publications completed a pilot
study in 1999 with UCAIR institutions to measure usage levels at 33 small
colleges. The results of this study and an in-depth analysis conducted by an
outside consultant on market penetration to small colleges have presented options
for consideration to provide low-cost alternatives.
Serving academia: adapting
to the needs of scientific students and faculty
Craig
Stephens, Manager, North American Sales and Customer Support, Chemical
Abstracts Service, 2540 Olentangy River Road, Columbus, OH 43202-1505,
cstephens@cas.org
Abstract
A major component
of the CAS mission is to meet the needs of the academic community. CAS has
introduced many new products and programs in recent years to meet these needs
with features, terms, and conditions adapted for the requirements of campus-wide
access to scientific information for institutions large and small. The
widespread and rapid acceptance of SciFinder Scholar by universities around the
world exemplifies the features and requirements that constitute successful
research for the academic community at B.S., M.S. and Ph.D. granting
institutions. How CAS assessed and met such needs by drawing upon the input of
the user community and our own experience will also be discussed.
Affordable tools for teaching undergraduates at small
institutions and community colleges
Patricia Kirkwood,
Science Librarian, Pacific Lutheran University, Mortvedt Library, Tacoma, WA
98447, Fax: 253-535-7315, kirkwope@plu.edu
Abstract
Community colleges
and small undergraduate institutions that routinely graduate fewer than 20
chemistry majors a year are at a disadvantage when it comes to teaching chemical
information literacy. The major resources are simply too expensive and are not
available. At $10,000 plus per shared seat for Scholar, more than $5,000 for
Chemical Abstracts Student Edition, and Web of Science priced even higher,
smaller institutions cannot justify the expenditure either by the amount of
research funding they receive or by the number of students served. The
librarian (or a faculty member, if a librarian is unavailable) must teach
information skills using the products that are available for little or no cost.
Information products like the NIST Web Book, CRC Handbook of Chemistry and
Physics, General Science Abstracts, Basic BIOSIS, JStor, online encyclopedias
and full-text databases available through various vendors are the resources to
consider. In this presentation, I will review resources that are affordable and
propose a plan for teaching the basic chemical information literacy skills
required by the CPT with the available tools rather than instruction that
focuses on specific information tools and platforms. A well designed program
that uses affordable tools can teach lifelong information literacy skills and
prepare students well for the work world or graduate school.
How much is enough? CPT Guidelines and chemical
information access in research universities
David
Flaxbart, Chemistry Library, University of Texas at Austin, Welch Hall
2.132, Austin, TX 78713, flaxbart@uts.cc.utexas.edu
Abstract
While research
libraries have not had much difficulty in the past meeting CPT library
guidelines, the changing formats of chemical information and the soaring costs
associated with them have brought an array of new challenges to even the largest
and best-funded libraries. The CPT Library Survey of 2000 revealed a number of
disturbing trends in the adequacy and affordability of access to required
information, affecting all types of institutions. This presentation will outline
some of the issues faced at a large ARL library, and examine some of the
possible solutions.
Chemical information and chemical informatics literacy
at a research university
Gary D. Wiggins, Chemistry
Library, Indiana University, 800 E. Kirkwood Avenue, Chemistry Building Room
C003, Bloomington, IN 47405-7102, Fax: 812-855-6611, wiggins@indiana.edu
Abstract
The Department of
Chemistry at Indiana University offers four one-hour chemical
information/informatics courses on the undergraduate level and two three-hour
courses on the graduate level. Most of the courses have been taught via
teleconferencing across two campuses during the past two years, with some
lectures delivered from England in one graduate course. A mix of free and
commercial software and databases is used in the courses. Methodology, software,
and cost figures will be presented.
Green chemistry: Sustaining a high technology
civilization
Terrence J. Collins, Department of
Chemistry, Carnegie Mellon University, 4400 Fifth Ave., Mellon Institute,
Pittsburgh, PA 15213-2683, Fax: 412-268-1061, tc1u@andrew.cmu.edu
Abstract
Because we do not
live in a sustainable civilization, sustainability has become the most important
single idea for universities for the next century. Chemists are principal
custodians of the technological challenges of sustainability. As quickly as
possible, we must learn how to develop the research and educational programs
that will be essential for steering our communal thinking and our technology
base toward sustainable directions. In chemical research, three areas stand out
as being vital—the invention of more efficient technologies for converting solar
to electrical or chemical energy, the replacement of polluting chemical
technologies with economical non-polluting substitutes, and the development of
renewable feed-stocks for the chemical industry. These areas will be briefly
sketched with an emphasis given to pollution reduction. Chemistry-oriented books
and materials that are available, are becoming available, or that one hopes will
become available to deal with the sustainability dilemma will be discussed.
Biological engineering: from blue roses to space
suits
Cory Craig, Physical Sciences and Engineering
Library, University of California, Davis, One Shields Avenue, Davis, CA 95616,
Fax: 530/752-4719, cjcraig@ucdavis.edu
Abstract
Biological
engineering is the application of engineering principles to biological and
medical problems. Biological engineering research has played a key role in the
development of innovations as diverse as artificial limbs, space suits, and the
production of synthetic vaccines. Evidence of the breadth and cross-disciplinary
nature of this discipline is found in the many different fields of biological
engineering, including biomedical engineering, biochemical engineering,
environmental engineering and agricultural engineering. This talk will outline
the development and history of biological engineering, provide an overview of
the types of research conducted in different areas of biological engineering,
and identify major breakthroughs and current challenges in the field. This
overview will help information professionals provide better reference assistance
to library patrons in this exciting and evolving area of research.
Nano nonet: Nine things chemistry librarians need to
know about nanoscience
F. Bartow Culp, Mellon Library
of Chemistry, Purdue University, West Lafayette, IN 47907, bculp@purdue.edu
Abstract
The hot fields of
nanoscience and nanotechnology deal with structures and materials in the 1-100
nm dimensional scale. While the concept has been discussed for over thirty
years, only recently have experimental breakthroughs made reality out of theory.
The purpose of this session is to give an overview of these fields, and to
emphasize points of particular interest to chemistry librarians, including
terminology and information resources.
Chemoinformatics, cheminformatics, chemical
informatics: What is it?
Gary D. Wiggins, and Wendie
Shreve, Chemistry Library, Indiana University, 800 E. Kirkwood Avenue, Chemistry
Building Room C003, Bloomington, IN 47405-7102, Fax: 812-855-6611,
wiggins@indiana.edu
Abstract
The terms
"chemoinformatics," "chemiinformatics," "cheminformatics," and "chemical
informatics" are all used to describe a broad array of computer techniques and
applications to solve chemistry problems. We will look at the areas that
comprise chemical informatics by examining the topics in existing textbooks and
other secondary sources. The identified topics will be mapped to the graduate
courses in the chemical informatics program at Indiana University.
Combinatorial materials
research: opportunities, challenges and successes
Laurel A.
Harmon, Striatus, 8703 Webster Hills Rd., Dexter, MI 48130, Fax:
734-661-0409, lharmon@striatus.com
Abstract
The introduction of
combinatorial methods into materials discovery introduces new challenges for
both laboratory methods and informatics. This talk highlights applications
in which combinatorial methods are being successfully applied to materials,
issues that arise in combinatorial materials research, and informatics
strategies that are being developed. Despite many similarities, there are
fundamental differences between high throughput approaches to drug and material
discovery. Materials research imposes new requirements for data management, data
storage and data analysis. New strategies for experiment planning are required
to effectively navigate the high-dimensional experimental spaces. Different
modes of combinatorial experimentation (mapping, screening, and
optimization) are outlined, with examples drawn from current high throughput combinatorial
materials research.
Recent Developments in OpenURL (SFX) Linking at the
University of Chicago
Andrea B. Twiss-Brooks, John
Crerar Library, University of Chicago, 5730 S. Ellis, Chicago, IL 60637, Fax:
773-702-7429, atbrooks@midway.uchicago.edu
Abstract
SFX (from ExLibris)
is a linking technology based on the OpenURL protocol for creating customized
links among diverse information products. The University of Chicago Library
implementation of SFX to provide better management of electronic resources and
improved service to the scholarly community is described.
The Library defined its electronic collection, and constructed rules that guide SFX in creating context-sensitive links. These customized, context-sensitive links use web-transportable packages of metadata to connect users to resources and services. Links to resources are dynamically generated to provide information about all available online copies. In addition, SFX services have been configured to include searches of rich print collections and additional information about journals.
Current and future developments described include an OpenURL generator/DOI resolver tool, a dynamically generated comprehensive online journal A to Z list, and additional SFX services such as automated interlibrary loan request generation.
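As a rough illustration of the "web-transportable packages of metadata" described above, the following minimal Python sketch assembles an OpenURL-style link from citation fields. It is not the University of Chicago implementation: the resolver address and the citation values are hypothetical placeholders, and the field names (genre, issn, volume, spage, date) are assumed to follow the key/value style of the original OpenURL draft.

```python
from urllib.parse import urlencode


def build_openurl(resolver_base, citation):
    """Return a link made of the resolver base URL plus citation metadata."""
    # Drop empty fields so the resolver only receives metadata we actually have.
    fields = {key: value for key, value in citation.items() if value}
    return resolver_base + "?" + urlencode(fields)


# Hypothetical resolver address and placeholder citation values.
link = build_openurl(
    "https://resolver.example.edu/sfx",
    {"genre": "article", "issn": "0000-0000", "volume": "1",
     "spage": "10", "date": "2002"},
)
print(link)
```

Because the link carries only metadata and not a fixed target address, the resolver can decide at request time which copies and services to offer for that citation.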
Preserving data: The role
of databases in future scientific discovery
John Rumble
Jr., Standard Reference Data, NIST, 100 Bureau Drive MS 2310, Gaithersburg,
MD 20899-2310, Fax: 301-926-0416, john.rumble@nist.gov
Abstract
A wide variety of
methods have been used to save and preserve scientific data for thousands of
years. The physical nature of these means and the inherent difficulties of
sharing the physical media with others who need the data have been major
barriers in advancing research and scientific discovery. The information
revolution has changed this in significant ways: ease of availability, breadth
of distribution, size and completeness of data sets, and documentation. As a
consequence, scientific discovery itself is changing now and may change
even more dramatically in the future. In this talk I will review some historical
aspects of data preservation and the use of data in discovery. I will also
provide some speculations on how preserving data digitally might revolutionize
scientific discovery.
Knowledge discovery in a
database of biochemical pathways
Johann
Gasteiger (1), Martin Reitz (1), and Oliver
Sacher (2). (1) Computer-Chemie-Centrum and Institute of Organic
Chemistry, University of Erlangen-Nuremberg, Naegelsbachstr. 25, Erlangen 91052,
Germany, Fax: +49-9131-85 26566, Gasteiger@chemie.uni-erlangen.de, (2) Molecular
Networks GmbH
Abstract
A database of
biochemical reactions has been built on the basis of the Poster Biochemical
Pathways originally produced by Boehringer Mannheim (now Roche Diagnostics).
Each structure is represented by a connection table including stereochemical
information. Reactions are coded by the bonds broken and made in the course of
the reaction. It will be shown how models of the transition states of
biochemical reactions can be developed and be compared with inhibitors of
enzymes. Furthermore, the information allows the definition of similarity of
reactions which is compared with standard enzyme classification.
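To make the idea of coding reactions by bonds broken and made more concrete, here is a minimal Python sketch. It is only an illustration under stated assumptions, not the authors' database schema or similarity measure: bonds are reduced to atom-type pairs, and similarity is taken to be the Jaccard overlap of the bond changes. All function names are hypothetical.

```python
def bond(atom_a, atom_b, order=1):
    """Represent a bond as an order-independent key: (sorted atom types, bond order)."""
    return (tuple(sorted((atom_a, atom_b))), order)


def reaction_code(broken, made):
    """Code a reaction by the set of bonds broken and the set of bonds made."""
    return (frozenset(broken), frozenset(made))


def reaction_similarity(code_1, code_2):
    """Jaccard overlap of bond changes; 1.0 means an identical change pattern."""
    changes_1 = {("broken", b) for b in code_1[0]} | {("made", b) for b in code_1[1]}
    changes_2 = {("broken", b) for b in code_2[0]} | {("made", b) for b in code_2[1]}
    union = changes_1 | changes_2
    if not union:
        return 1.0
    return len(changes_1 & changes_2) / len(union)


# Ester hydrolysis: break C-O and O-H (water), make C-O and O-H.
ester_hydrolysis = reaction_code(
    broken=[bond("C", "O"), bond("O", "H")],
    made=[bond("C", "O"), bond("O", "H")],
)
# Amide hydrolysis: break C-N and O-H (water), make C-O and N-H.
amide_hydrolysis = reaction_code(
    broken=[bond("C", "N"), bond("O", "H")],
    made=[bond("C", "O"), bond("N", "H")],
)
print(reaction_similarity(ester_hydrolysis, amide_hydrolysis))
```

A similarity defined this way depends only on the bond changes, so it can be compared against an independent grouping such as enzyme classification.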
Dynamic Data Evaluation:
Algorithm development and analysis for thermodynamic properties of pure organic
compounds
Vladimir V. Diky (1), Robert D.
Chirico (1), Xinjian Yan (1), Randolph C. Wilhoit (2),
and Michael Frenkel (1). (1) Thermodynamics Research Center (TRC),
National Institute of Standards and Technology (NIST), Mailstop 838.00, 325
Broadway, Boulder, CO 80305, diky@boulder.nist.gov, (2) Texas Experimental
Engineering Station, Texas A&M University System
Abstract
Traditional
critical data evaluation is an extremely time- and resource-consuming process,
which includes extensive manpower applied to data collection, mining, analysis,
fitting, etc. Furthermore, it must be performed far in advance of need, which
has led to a significant part of the existing recommended data having never been
used. The concept of “dynamic” data evaluation was developed by TRC at NIST, and
requires large electronic databases (such as the TRC Source data system) capable
of storing essentially all of the ‘raw/observed’ experimental data known to date
with descriptions of relevant metadata and defined uncertainties. In combination
with expert-system software this system allows production of recommended
property values (with uncertainties) dynamically or ‘to order.’ Aspects of the
implementation of Dynamic Data Evaluation will be discussed including
thermodynamic consistency between related properties, selection of fitting
equations, use of estimated properties, and uncertainty propagation. The output
of the software being designed includes complete sets of thermodynamic property
data with reliable uncertainties for any (including hypothetical) organic
compound.
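One simple way to picture producing a recommended value "to order" from stored raw measurements and their uncertainties is an inverse-variance weighted mean. The Python sketch below uses that standard statistical technique purely as an illustration; it is an assumption, not the TRC expert-system algorithm, and the function name and numbers are invented placeholders.

```python
import math


def recommend_value(measurements):
    """Combine (value, standard_uncertainty) pairs into one value with uncertainty.

    Uses an inverse-variance weighted mean, so more precise measurements
    contribute more to the recommended value.
    """
    weights = [1.0 / (uncertainty * uncertainty) for _, uncertainty in measurements]
    total_weight = sum(weights)
    value = sum(w * v for w, (v, _) in zip(weights, measurements)) / total_weight
    combined_uncertainty = math.sqrt(1.0 / total_weight)
    return value, combined_uncertainty


# Placeholder numbers, not real property data.
value, uncertainty = recommend_value([(10.0, 1.0), (12.0, 1.0)])
print(value, uncertainty)
```

Because the combination is computed at request time, newly stored measurements change the recommended value without a separate re-evaluation step.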
A self-organizing
algorithm for extracting the intrinsic dimensionality of large high-dimensional
data
Dimitris Agrafiotis, and Huafeng Xu, Research
Informatics, 3-Dimensional Pharmaceuticals, Inc, 665 Stockton Drive, Exton, PA
19341, Fax: 610-458-8249, agrafiotis@3dp.com
Abstract
We present
stochastic proximity embedding (SPE), a novel self-organizing algorithm for
producing meaningful underlying dimensions from proximity data. SPE attempts to
generate low-dimensional Euclidean embeddings that best preserve the
similarities between a set of related observations. The embedding is carried out
using an iterative pairwise refinement strategy that attempts to preserve local
geometry while maintaining a minimum separation between distant objects. Unlike
previous approaches, our method can reveal the underlying geometry of the data
without intensive nearest neighbor or shortest-path computations, and can
reproduce the true geodesic distances of the data points in the low-dimensional
embedding without requiring that these distances be estimated from the data
sample. More importantly, the method scales linearly with the number of points,
and can be applied to very large data sets that are intractable by conventional
embedding procedures. The advantages of the algorithm are illustrated using
examples from the molecular diversity and conformational analysis literature.
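The iterative pairwise refinement described above can be sketched in a few lines of Python. This is a minimal reading of the abstract, not the authors' code: the linear learning-rate schedule, the parameter names, and the default values are all assumptions chosen for illustration.

```python
import math
import random


def spe_embed(dist, n_points, dim=2, cutoff=float("inf"), n_steps=20000,
              lam_start=1.0, lam_end=0.01, seed=0, eps=1e-9):
    """Sketch of stochastic proximity embedding (SPE).

    dist(i, j) returns the target proximity between points i and j.
    Returns a list of n_points coordinate lists, each of length dim.
    """
    rng = random.Random(seed)
    coords = [[rng.random() for _ in range(dim)] for _ in range(n_points)]
    for step in range(n_steps):
        # Assumed schedule: learning rate decays linearly from lam_start to lam_end.
        lam = lam_start - (lam_start - lam_end) * step / max(n_steps - 1, 1)
        i, j = rng.sample(range(n_points), 2)  # one random pair per step
        target = dist(i, j)
        delta = [a - b for a, b in zip(coords[i], coords[j])]
        current = math.sqrt(sum(d * d for d in delta))
        # Local pairs are always refined; distant pairs are only pushed apart
        # when they sit closer in the embedding than their target proximity.
        if target <= cutoff or current < target:
            scale = lam * 0.5 * (target - current) / (current + eps)
            for k in range(dim):
                coords[i][k] += scale * delta[k]
                coords[j][k] -= scale * delta[k]
    return coords


# Three mutually equidistant points should settle into a unit triangle.
triangle = spe_embed(lambda i, j: 1.0, n_points=3, dim=2, n_steps=5000, seed=1)
print(triangle)
```

Each step touches a single pair and never builds a full distance matrix or neighbor graph, which is what allows the cost to grow linearly with the number of points.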
From gene to lead: An
architecture for cooperative drug discovery
Stephen A.
Baum, Accelrys Inc, 9685 Scranton Road, San Diego, CA 92121, Fax:
858-799-5100, sbaum@accelrys.com, and Shikha Varma, 9685 Scranton Rd, Accelrys
Inc
Abstract
The challenges and
rewards of multi-disciplinary cooperation in drug discovery efforts are well
appreciated. While the rewards of collaborative research are enormous, the
problem of knowledge management of data generated by chemists, biologists, and
modelers remains a challenge. Even in today’s networked computing environments,
incompatible file formats, disparate data sources, and lack of methods and
application integration amongst other factors stall information transfer and use
between scientists of differing disciplines. Applying data generated by
colleagues working in differing scientific disciplines with divergent tools and
methods can provide fresh perspectives in drug discovery. Even when results are
successfully transferred to others, the inability to understand how, when,
and why data were generated often calls into question the utility and credibility of such
information in subsequent studies. Discovery Studio applications are addressing
many of the challenges associated with multi-disciplinary collaboration by
providing an underlying multi-tiered system architecture that facilitates
information and file transfer, data management, data mining and methods
integration across various scientific disciplines. This discussion will describe
the Discovery Studio architecture as well as automated electronic data capture,
reporting, and promotion of data to a shared project workspace for cooperative
drug discovery across various scientific disciplines. A virtual pharmaceutical
drug discovery scenario presented here illustrates how computational methods can
be applied within one environment - from gene to lead.
Chemical handbooks: Glorious past, questionable
future
F. Bartow Culp, School of Library Science,
Purdue University, Mellon Library of Chemistry, West Lafayette, IN 47907
Abstract
Chemistry handbooks
are almost as old as modern chemistry itself. For nearly two centuries,
compilations of chemical information such as Gmelin, Beilstein and
Landolt-Boernstein have organized the diffuse primary literature to make facts
easily available to the chemist. The increasing size and complexity of the
chemical literature in the 20th century signaled the demise of the comprehensive
nature of such efforts, but the notion and reality of the handbook persists
today. While their current formats and even definitions have changed over time,
modern handbooks share some core characteristics: They are selective in scope,
labor intensive to prepare, and costly to purchase. In order to appeal to the
new consumers of chemical information, some publishers have converted their
handbooks into electronically searchable products, while others have held to
primarily print versions. It is reasonable to question whether, in the coming
age of disembodied journals and deconstructed texts, there will even be a place
for the classically organized handbook. The purpose of this talk will be to
review briefly the history of chemistry handbooks, to look critically at their
present incarnations, and to propose some means of their survival in the brave
new world of the Internet, e-books, and metadata.
CRC Handbook of Chemistry and Physics: From paper to
web
Fiona Macdonald, CRC Press, 23 Blades Court, Deodar
Road, London SW15 2N, United Kingdom, Fax: +44 20 8871 3443,
fmacdonald@crcpress.com, and David R. Lide, Editor
Abstract
In print for nearly
90 years, the CRC Handbook of Chemistry and Physics has become an institution in
many laboratories worldwide. Today the demands are for instant desktop access,
the most current information, plus sophisticated search and display facilities.
This talk will focus on the challenges encountered in getting the 'Rubber Bible'
on the web, the latest version of which is now available at http://www.hbcpnetbase.com/.
The next 100 years: The
evolution of The Merck Index toward a fully electronic
publication
Irwin Schreiman (1), Barbara
Solomon (1), Jonathan Brecher (1), and Ann
Smith (2). (1) Informatics, CambridgeSoft Corporation, 100 CambridgePark
Drive, Cambridge, MA 02140, ischreiman@cambridgesoft.com,
jbrecher@cambridgesoft.com, (2) Merck & Co. Inc
Abstract
Among printed reference works, The Merck Index stands out for its integrity, detail and longevity. We faced many challenges when converting this treasure trove of information to electronic form. Throughout the conversion process, we focused on the importance of data quality. The well-structured data that was used for production of the print version eased the creation of the electronic version. We have found that, when undertaking a project such as this, it is also important to understand the objectives. What data needs to be searchable -- and therefore included in a database -- versus what can be presented in a static file format (such as a "pdf" file)? Questions such as "Which fields must be searchable independently?" and "How should numerical searching work? (ranges, significant figures, etc.)" must be addressed. Finally, presentation versus accessibility must be gauged to ensure a successful end-user experience.
Science of Synthesis/Houben-Weyl: Conversion of a Major Reference Work in Organic
Synthetic Chemistry (Print) into an Interactive, Highly Accessible Electronic
Product
Dr. M. Fiona Shortt de Hernandez, Georg Thieme
Verlag, Ruedigerstrasse 14, D-70469 Stuttgart, Germany, Fax: 0049-711-8931777,
fiona.shortt@thieme.de
Abstract
Houben-Weyl
(http://www.houben-weyl.com/) is an
indispensable treatise for every synthetic chemist serving the scientific
community with a critical selection of synthetic methods. The project was
established in 1909 and contains over 140 volumes covering all aspects of
organic synthetic chemistry. The fifth edition, which carries on the tradition
of Houben-Weyl but includes new features such as safety information,
scope, and comparison of methods, was launched in 2000 under the name
Science of Synthesis (http://www.science-of-synthesis.com/).
This series is edited by D. Bellus (Basel, Switzerland), E. N. Jacobsen
(Cambridge, USA), S. V. Ley (Cambridge, UK), R. Noyori (Nagoya, Japan), M.
Regitz (Kaiserslautern, Germany), P. J. Reider (Thousand Oaks, USA), E. Schaumann
(Clausthal-Zellerfeld, Germany), I. Shinkai (Tsukuba, Japan), E. J. Thomas
(Manchester, UK), and B. M. Trost (Stanford, USA). Science of Synthesis
will be published in a total of 48 volumes and will contain ca. 150,000
reactions. Science of Synthesis has been designed using new workflows and
production techniques as well as XML technology so that this work is not just
available in book format but in an electronic format as well. The electronic
product which was developed in collaboration with InfoChem and an international
advisory board is available as an Intranet or Internet solution offering
powerful text, substructure, and reaction searching. The Houben-Weyl
archive is now available in a digital format as well so that it is now possible
to use an intuitive electronic guide (designed by Thieme Publishers) to access
over 100 years of invaluable information!
The knovelized e-Reference
Robert
R. Brand, knovel Corporation, 33 Main Street, Newtown, CT 06470,
rbrand@knovel.com
Abstract
While classical
reference books and handbooks decline with the disappearance of library shelf space, they
are being supplanted by their digital representations delivered to networks
worldwide.
This paper will demonstrate in concrete terms the vision of reference books, handbooks, and their paperless cousin, the database-only version. The new representation, the e-Reference, should be interactive and deep searchable (IDS) and accessible on the Internet.
Essential to the new emerging model is the role of aggregation with similar data, maximizing portability of customized data sets, mobility of data elements across classical handbooks and keyword searching via a common interface. New data element interactivity will be presented in detail along with near future elements.
Building a virtual reference collection in
chemistry
Patricia Kirkwood, Science Librarian, Pacific
Lutheran University, Mortvedt Library, Tacoma, WA 98447, Fax: 253-535-7315,
kirkwope@plu.edu
Abstract
So your library
offers reference on demand through the web. Great! Now you can get what you need
without leaving the lab. But when you chat with the librarian you find that the
table you need to work with is only available in the library and it's way too big
to fax. So now you still have to find time to go to the library. Is this a
common complaint? Why develop a virtual reference service if you don't have a
virtual reference collection to support the service? Electronic reference
resources such as handbooks, encyclopedias, and dictionaries are becoming more
available and much more usable. However, so far, the tools don't get used very
much. What can the library do to create the basic electronic reference
collection in the sciences? After the collection is chosen, how do users find
out about it and figure out how to use it? Of course, there are always more
questions than answers as this work is done. How should the librarians and the
publishers/vendors work together to make sure the licenses and the technology
work together to provide a valuable and usable resource for the chemist?
The next step at major reference
works
Claudia Pick, Peter Loew, Josef Eiblmaier, and
Hans Kraut, InfoChem GmbH, Landsberger Straße 408, D-81241, München, Germany,
Fax: +49 89 5 80 38 39, Claudia.Pick@infochem.de
Abstract
Digitalization
is now standard, and most of the primary chemistry literature is
already available online. The next step is the digitalization of major reference
works, the secondary literature that combines the information of the primary
literature with the expertise of highly trained scientists creating validated
review articles. But the digitalization of major reference works is offered in
different qualities and scales. A new dimension in digitalization of major
reference works is the addition of structure and reaction searching
and, moreover, the provision of one system that allows global searching in several
major reference works at a time. InfoChem GmbH cooperated with publishing houses
like John Wiley & Sons, Springer-Verlag, and Thieme-Verlag in the design and
development of Internet and Intranet versions of electronic major reference
works. The software used in these web versions (e-EROS from John Wiley, Science
of Synthesis from Thieme, CAC from Springer) has been developed exclusively by
InfoChem allowing - among other things - the retrieval of structures, reactions
and text in several major reference works at the same time.
Capitalizing on the value
in in-vitro hepatotoxicity data
Philippa R.N. Wolohan, and
Robert D. Clark, Research, Tripos, Inc, 1699 South Hanley Road, St. Louis, MO
63144, Fax: 314-647-9241, pwolohan@tripos.com
Abstract
Pre-clinical
decisions in the hit or lead generation phase of drug development are
routinely based on in vitro cellular screening studies. In order to make
salient predictions of toxicological properties in man it is critical to fully
understand the in vivo relevance of models based on such cellular assays.
Assessing the predictive value of such models is no simple task and becomes an
even more pertinent issue when we make the leap into in silico modeling of such
systems. We will present an evaluation of hepatotoxicity data in a human cell
line, discuss the biological and statistical concepts associated with
interpreting these data, and describe our strategy for factoring these considerations into
the design of our in silico models. Understanding the natural limitations of
training models on in vitro data allows us to better determine the appropriate
level of confidence to have in predictions made from such models in a rational
drug discovery setting.
Computational models for
predicting chemical toxicity
Julie E. Penzotti, and Gregory
A. Landrum, Rational Discovery LLC, 555 Bryant St. #467, Palo Alto, CA 94301,
Penzotti@RationalDiscovery.com
Abstract
Despite major
advances in the field of toxicology, safety assessment remains a costly
challenge in chemical development. Computational approaches to identify
compounds that are hazardous to human health or the environment are of great
interest. These approaches can be used early in the development process to
select compounds likely to have fewer toxicity liabilities and to prioritize
toxicity studies in risk assessment. Because multiple (possibly unknown)
mechanisms can lead to the same toxicity endpoint, algorithms for toxicity
prediction must be capable of handling multiple patterns of activity and modes
of action. We have developed a unique ensemble approach for building
computational models to screen large numbers of chemical structures for
toxicological properties. A major strength of our method is its ability to
provide a confidence level for each prediction that can be used to identify
compounds which require further testing. Our approach and its application to
modeling toxicological endpoints will be presented.
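The general idea of an ensemble that reports a confidence level with each prediction can be illustrated with a simple voting scheme in Python. This is a generic sketch, not the authors' method: the member models, the descriptor names, and the 0.8 threshold are hypothetical, and vote agreement is only one possible definition of confidence.

```python
def ensemble_predict(models, features, min_confidence=0.8):
    """Majority vote over member models, with vote agreement as confidence.

    Returns (predicted_toxic, confidence, needs_testing).
    """
    votes = [bool(model(features)) for model in models]
    toxic_fraction = sum(votes) / len(votes)
    predicted_toxic = toxic_fraction >= 0.5
    confidence = toxic_fraction if predicted_toxic else 1.0 - toxic_fraction
    # Low agreement flags the compound for further experimental testing.
    return predicted_toxic, confidence, confidence < min_confidence


# Hypothetical member models, each keyed to a different made-up descriptor
# so the ensemble can respond to more than one pattern of activity.
models = [
    lambda f: f["logp"] > 3.0,
    lambda f: f["mol_weight"] > 300.0,
    lambda f: f["n_halogens"] >= 2,
]
compound = {"logp": 4.0, "mol_weight": 350.0, "n_halogens": 0}
predicted_toxic, confidence, needs_testing = ensemble_predict(models, compound)
print(predicted_toxic, confidence, needs_testing)
```

Here two of three members vote "toxic", so the compound is predicted toxic but with agreement below the threshold, which marks it as a candidate for further testing.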
Facing database mining
challenges in ecotoxicity
Jacques R. Chretien (1),
Marco Pintore (1), Nadège Piclin (1), Frederic Ros (2),
and Emilio Benfenati (3). (1) BioChemics Consulting, Centre
d'Innovation, 16, rue Leonard de Vinci, Orleans cedex 2 45074, France, Fax: + 33
2 38 41 72 21, jacques.chretien@univ-orleans.fr, (2) University of Orleans, CBI
/ Chemometrics & BioInformatics, (3) Department of Environmental Health
Sciences, Istituto di Ricerche "Mario Negri"
Abstract
New DBM tools,
based on Genetic Algorithms and Fuzzy Logic, were developed and applied to large
data sets of toxic chemicals, in order to establish general Structure-Activity
Relationships (SAR). Several salient examples will be shown, underlining
possibilities and limitations of the proposed procedures. These examples deal
with three biological models: (i) a series of 235 pesticides studied on rats or
(ii) on trout and (iii) a series of 568 chemicals studied on fathead minnow.
Correct-prediction rates of 75% on test sets support the particular value of
these DBM tools in the area of ecotoxicity, given the high variability affecting
the experimental procedures. The importance of a powerful strategy in Molecular
Experimental Design (MED) based on supervised self organizing maps (sup-SOM)
will be underlined as a way to handle chemical diversity and to assess the real predictive power of
any predictive model relative to large chemical databases. (We acknowledge
financial support from The European Commission: project IMAGETOX,
HPRN-1999-00015).
In silico methodologies
for predictive evaluation of toxicity based on integration of
databases
Chihae Yang, LeadScope, Inc, Columbus, OH 43212,
cyang@leadscope.com, and Ann Richard, National Health & Environmental
Effects Research Lab, U.S. EPA
Abstract
The ability to
accurately “predict” toxicity with in silico methods is increasingly emphasized
as industry moves toward efficient up-front screening to reduce late stage
attrition. However, current methods for structure-based toxicity estimation are
not yet satisfactorily predictive. Reasons include the intrinsically complex
nature of chemically induced toxicity and the lack of data from which the
“models” or “predictions” are derived. Although toxicity information is publicly
available, most of these databases are not optimized for building
structure-toxicity relationships. The relationship between quality of data and
prediction model accuracy intensifies the need for improved access to quality
toxicity information. This paper describes collaboration between an
EPA-sponsored public initiative, DSSTox (Distributed Structure Searchable
Toxicity) database network, and a private sector effort, LIST (LeadScope In
Silico Tox) focus group. Both are working towards improved data access and the
integration of disparate data formats from various data sources. DSSTox is
promoting SDF format for toxicity databases inclusive of chemical structures,
whereas LIST is developing controlled vocabularies and mapping the data fields
of SDF and XML schema. Improving prediction capability by integration of data to
enhance chemical space, a shared goal of the DSSTox and LIST initiatives, will
be discussed. This abstract does not reflect EPA policy nor does mention of
trade names indicate EPA endorsement.
Mining molecular fragments
with MoFa - Finding relevant substructures in sets of
molecules
Michael R. Berthold1, Heiko
Hofer1, and Christian Borgelt2. (1) Research, Tripos, Inc,
1699 South Hanley Road, St. Louis, MO 63144, Fax: 314 647 9241,
berthold@tripos.com, (2) University of Magdeburg
Abstract
We present an
algorithm to find fragments in a set of molecules that help to discriminate
between different classes of, for instance, activity in a drug discovery
context. Instead of carrying out a brute-force search, our method generates
fragments by embedding them in all appropriate molecules in parallel and prunes
the search tree based on a local order of the atoms and bonds, which results in
substantially faster search by eliminating the need for frequent,
computationally expensive reembeddings and by suppressing redundant search. We
prove the usefulness of our algorithm by demonstrating the discovery of
activity-related groups of chemical compounds in the National Cancer Institute's
HIV-screening dataset.
Compressed Chemical Markup
Language (CCML) for Compact Storage and Inventory
Applications
M Karthikeyan, Deepak Uzagare, and S Krishnan,
Information Division, National Chemical Laboratory, Dr. Homi Bhabha Road, Pune
411008, India, Fax: +91-20-5893973, karthi@ems.ncl.res.in
Abstract
The CML representation
is well documented; however, its size compared with other existing file
formats makes it prohibitive for many applications. A suitable tool that
stores CML in compressed form, without loss of information or freedom of
use, would encourage the user community to apply CCML in their applications.
At NCL we have developed a methodology for encoding chemical structures as
compressed CML generated by popular chemical structure drawing programs such as
JME. The CCML format consists of SMILES and/or equivalent data, along with
coordinate information for the atoms, for generating chemical structures in
plain text format. Each structure generated by JME in standalone mode, or generated by
virtual means, can be stored in this format for efficient retrieval; it
requires about one tenth or less of the size of the actual CML file, since the SMILES
describes the interconnectivity of the molecule. The CCML format is compatible
with automated inventory applications.
New chemical information
interchange standards based on CML: a submission for the Object Management
Group
Mitchell A Miller1, Scott S.
Markel1, Juan C. Esteva2, and Wendy L. Sharp3.
(1) LION bioscience, 955 Ridge Hill Lane, Midvale, UT 84047,
mitchell.miller@lionbioscience.com, (2) Department of Computer Information
Systems, Intelligent Solutions / Eastern Michigan University, (3) Intelligent
Solutions
Abstract
A new standard for
chemical information interchange is presented. This standard is based on
Chemical Markup Language (CML) but has been extended to support multiplexed
structures - multiple isomers, tautomers and conformers for a given compound -
as well as chemical searching across a variety of database types.
Novel Applications of XML
in Chemistry
Peter Murray-Rust, Unilever Centre for
Molecular Informatics, Cambridge University, UK, Lensfield Road, CB2 1EW
Cambridge, United Kingdom, pm286@cam.ac.uk, and Henry S. Rzepa, Chemistry,
Imperial College
Abstract
Following from our
concept of the datument (an integration of data+document in XML) we present here
a review of a wide range of chemical concepts in active information objects. We
shall demonstrate how these can be used by machines as well as read by humans.
Current use of chemical information in human hands is both expensive and highly
error-prone and robot chemists will act as "information prosthetics" to carry
out routine or high-throughput e-chemistry. The semantics of XML chemistry can
be engineered so that actions and interpretations can be determined from
reference dictionaries. When datuments replace conventional articles they can be
used for many tasks such as extraction of data, control of instruments or
running calculations. This forms the infrastructure of a semantical chemical
GRID, where knowledge is available without ontological impedance. These chemical
XML resources have been layered on the emerging global technologies of
peer-to-peer communications and Web Services. As these develop towards the
Berners-Lee web-of-trust, the security and authentication protocols are
fundamentally integrated into modern chemical e-communication.
The family of XML
languages in Chemistry
Henry S. Rzepa, Chemistry, Imperial
College, London, United Kingdom, rzepa@ic.ac.uk, and Peter Murray-Rust, Unilever
Centre for Molecular Informatics, Cambridge University, UK
Abstract
XML supports
chemistry through a family of interoperating XML languages which support a wide
range of core concepts. The design is modular and extensible. Chemical Markup
Language (CML) describes molecules and crystal structures, including their
complete electronic description and flexible representation in connection
tables. Physical data are represented in Scientific Technical Medical Markup
Language (STMML) which supports a wide range of numeric data types and
scientific units. Intensive properties of substances are described through a
PropertyType library in the SELFML system. Reactions, along with mechanisms,
stoichiometry and associated physical quantities are managed by CMLReact.
Computational Chemistry Markup Language (CCML) supports all aspects of the
input, control and analysis of chemical computation. These are represented in
XML Schemas which can validate the structure, vocabulary, datatypes and values
within chemical documents. With the addition of XSLT stylesheets complex
chemical concepts can be encapsulated as machine-enforceable rules leading to a
major increase in the quality and reusability of chemical information.
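As a small illustration of how a machine can read a CML connection table, the sketch below parses a hand-written molecule with Python's standard XML library. The element and attribute names (atomArray, atom, elementType, bondArray, bond, atomRefs2, order) and the namespace URI follow common CML usage but should be checked against the CML schema; the molecule itself (methanol heavy atoms only) is an assumed example, not taken from the abstract.

```python
import xml.etree.ElementTree as ET

# Assumed example document: methanol, hydrogens omitted for brevity.
CML = """<molecule xmlns="http://www.xml-cml.org/schema" id="methanol">
  <atomArray>
    <atom id="a1" elementType="C"/>
    <atom id="a2" elementType="O"/>
  </atomArray>
  <bondArray>
    <bond atomRefs2="a1 a2" order="1"/>
  </bondArray>
</molecule>"""

NS = {"cml": "http://www.xml-cml.org/schema"}


def connection_table(cml_text):
    """Return ({atom id: element}, [(atom id, atom id, bond order), ...])."""
    root = ET.fromstring(cml_text)
    atoms = {a.get("id"): a.get("elementType")
             for a in root.iterfind(".//cml:atom", NS)}
    bonds = [tuple(b.get("atomRefs2").split()) + (b.get("order"),)
             for b in root.iterfind(".//cml:bond", NS)]
    return atoms, bonds


atoms, bonds = connection_table(CML)
print(atoms)
print(bonds)
```

This parser only reads the document; validation against an XML Schema, as described in the abstract, would need a schema-aware tool.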
Open meeting: Committees
on publications and on Chemical Abstracts Service
Robert J.
Massie, Director, Chemical Abstracts Service, American Chemical Society, 2540
Olentangy River Road, Columbus, OH 43202-1505, Fax: (614) 447-3713,
rmassie@cas.org, and Robert D. Bovenschulte, American Chemical Society
Publications Division, 1155 16th Street NW, Washington, DC 20036, Fax: (202)
872-6060, rbovenschulte@acs.org
Web-based tools for
cheminformatics and drug design
Marc C
Nicklaus1, Wolf-Dietrich Ihlenfeldt2, Johannes H.
Voigt3, and Frank Oellien2. (1) Laboratory of Medicinal
Chemistry, National Cancer Institute, National Institutes of Health, Building
376, Boyles Street, Frederick, MD 21702, Fax: 301-846-6033, mn1@helix.nih.gov,
(2) Computer Chemistry Center, Institute of Organic Chemistry, University of
Erlangen-Nuremberg, (3) Laboratory of Medicinal Chemistry, National Cancer
Institute - Frederick Cancer Research and Development Center, National
Institutes of Health
Abstract
We present a
collection of web-based services at
http://cactus.nci.nih.gov, useful for drug design and cheminformatics in
general. Among others, we present the Enhanced NCI Database Browser (http://cactus.nci.nih.gov/ncidb2)
for searching in 250,000 compounds and a large number of calculated properties,
including hundreds of predicted biological activities; the GIF Creator for
Chemical Structures (http://cactus.nci.nih.gov/services/gifcreator/),
a tool to generate GIF and PNG images of chemical structures from 2D or 3D input
files in many different formats, and with numerous rendering options; the Online
SMILES Translator (http://cactus.nci.nih.gov/services/translate/),
a service that converts SMILES strings into Unique SMILES, and converts between
SMILES, SDF, PDB, MOL and other formats, including, if applicable,
multi-structure files; the Online Pseudorotation Tool (http://cactus.nci.nih.gov/Pseurot/),
which calculates pseudorotation parameters as used in the fields of
nucleoside/nucleotide chemistry, and correctly recognizes and processes DNA and
RNA, both single and double strand, nucleoside analogs with non-standard sugars,
nucleosides/nucleotides complexed with proteins and other tough cases; the
Self-Organized Map (SOM) of Compounds Tested in the NCI anti-HIV Screen (http://cactus.nci.nih.gov/services/som_qsar/),
a self-organizing map (SOM) of 42,000 AIDS-screened compounds clustered by
structure similarity, onto which the user can map compounds from the 42k AIDS
set, predefined datasets, or even one's own compounds, and which allows the user
to run searches in the whole NCI Open Database, starting from seed compounds in
the SOM; and the NCI Screening Data 3D Miner (http://cactus.nci.nih.gov/services/3DMiner/),
a service employing VRML for visualization and data mining in the NCI's 60 cell
line anti-tumor screening data.
CINF 36:
Marked Photoconductivity Enhancement of Poly(2,5-dialkoxy-p-phenylene
vinylene)-Perylene Derivative Composite Films upon
Annealing
Wei Feng Sr.1, Haifeng Yu2,
Yaobang Li1, Akihiko Fujii3, and Katsumi
Yoshino3. (1) Department of Chemical Engineering, Institute of
Polymer Science and Engineering, Department of Chemical Engineering, Tsinghua
University, Beijing 100084, China, Fax: 86-10-62770304, weifeng@tsinghua.edu.cn,
(2) Department of Chemical Engineering and School of Materials Science and
Engineering, Tsinghua University, (3) Department of Electronic Engineering,
Department of Electronic Engineering, Graduate School of Engineering, Osaka
University
Abstract
The interplay
between phase separation in composite films comprising
poly(2,5-dialkoxy-p-phenylene vinylene) (ROPPV) and a perylene derivative (PV),
which show photoinduced charge transfer, and photovoltaic performance has been
investigated. The changes in morphology and molecular reorientation occurring in
the composite films upon annealing were investigated using SEM. Upon annealing, PV
microcrystals 8-10 microns in size, lying parallel to the substrate surface,
are obtained. Annealing improved the photovoltaic performance of
ITO/CP-PV/Al Schottky-type solar cells, which can be attributed to the formation
of an electron-conducting PV crystal network. Preliminary studies indicate that
the morphological structure of the CP-PV composite film has an important influence
on its photovoltaic properties.
Quantitative
structure-activity relationship study of histone deacetylase
inhibitors
Aihua Xie1, Chenzhong Liao1, Boyu
Li1, Zhibin Li1, Zhiqiang Ning1, Weiming
Hu1, Xianping Lu1, Jiaju Zhou2, and Leming
Shi1. (1) Chipscreen Biosciences, Ltd, Research Institute of
Tsinghua University, Suite C301, Shenzhen 518057, Guangdong, China, Fax:
+86-755-26957291, aihxie@chipscreen.com, lmshi@chipscreen.com, (2) Institute of
Process Engineering, Chinese Academy of Sciences
Abstract
Histone
deacetylases play a critical role in gene transcription and have become a novel
target for the discovery of drugs against cancer and other diseases. During the
past several years there have been extensive efforts in the identification and
optimization of histone deacetylase inhibitors (HDACIs) as novel anticancer
drugs. We have identified, collected, and verified the structural and biological
activity data for more than 100 compounds and performed an extensive QSAR study
on this comprehensive data set by using various QSAR and classification methods.
The predictive QSAR model reached an R2 of 0.80 and leave-one-out
cross-validated R2 of 0.75. The overall rate of correct prediction of the
classification model is around 95%. The computational models have been used in
our internal projects on the design and optimization of HDACIs. The advantages
and limitations of the models will be discussed.
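The two statistics quoted above, the fitted R2 and the leave-one-out cross-validated R2, can be illustrated with a minimal sketch. It assumes a one-descriptor least-squares line and made-up activity data; the authors' descriptors, methods and data set are not given in the abstract, so everything named here is hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit y = slope * x + intercept."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def r_squared(ys, preds):
    """1 - (residual sum of squares) / (total sum of squares)."""
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - p) ** 2 for y, p in zip(ys, preds))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1 - ss_res / ss_tot


def loo_q2(xs, ys):
    """Leave-one-out R2: each point is predicted by a model fitted without it."""
    preds = []
    for i in range(len(xs)):
        slope, intercept = fit_line(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        preds.append(slope * xs[i] + intercept)
    return r_squared(ys, preds)


# Made-up descriptor values and activities, for illustration only.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [1.1, 1.9, 3.2, 3.8, 5.1]
slope, intercept = fit_line(xs, ys)
r2 = r_squared(ys, [slope * x + intercept for x in xs])
q2 = loo_q2(xs, ys)
print(r2, q2)
```

For a least-squares model the cross-validated value cannot exceed the fitted one, which is consistent with the gap between the two figures reported in the abstract.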
Ultrafast optical Kerr
effect of poly{thiophene-2,5diyl}[(2-carboxy-4'-N,Ndimethylamino)
-azobenzylidene]} (PTCMAABE)
Wei Feng Sr.1,
Wen-hui Yi2, Haifeng Yu3, and Hong-cai Wu2. (1)
Department of Chemical Engineering, Institute of Polymer Science and
Engineering, Department of Chemical Engineering, Tsinghua University, Beijing
100084, China, Fax: 86-10-62770304, weifeng@tsinghua.edu.cn, (2) School of
Electronics and Information Eng, School of Electronics and Information Eng.,
Xi'an Jiaotong University, (3) Department of Chemical Engineering and School of
Materials Science and Engineering, Tsinghua University
Abstract
Poly{Thiophene-2,5diyl}[(2-Carboxy-4'
-N,Ndimethylamino)-Azobenzylidene]} (PTCMAABE) was synthesized. The
time-resolved Optical Kerr effect (OKE) was investigated with femtosecond laser
pulses at 790 nm. Only an ultrafast component of the OKE of PTCMAABE is observed, and
its dephasing time is 92.7 fs, which is attributed to π-electron-cloud distortion
occurring upon non-resonant excitation. The second-order
hyperpolarizability γ and third-order nonlinear optical susceptibility
χ(3) of PTCMAABE were also determined by transient OKE. The results
show that PTCMAABE exhibits large off-resonant nonlinearities: values as large as
γ = 1.66 × 10^-32 esu for each structural unit and
χ(3) = 4.24 × 10^-10 esu for the material have been obtained.
Use of Barnard and
Daylight fingerprints in ligand-based virtual screening
S. Kuen
Yeap1, Mike Snarey1, and Cesare Federico2.
(1) Molecular Informatics, Structure and Design, Pfizer Global R&D, Ramsgate
Road (ipc 636), Kent CT13 9NJ, Sandwich, United Kingdom, Fax: 44 1302 658463,
yeap_sk@sandwich.pfizer.com, (2) Department of Chemistry, UMIST
Abstract
Large
pharmaceutical companies possess millions of proprietary compounds that are
regularly screened for activity against targets of interest. By pre-selecting
compounds by computational means, virtual screening aims to maximise the chance
of finding a hit early in the screening programme.
In ligand-based virtual screening, 2-D fingerprints are often used to screen for ligands similar to a lead or leads. Barnard fingerprints compute the presence or absence of predefined structural features. Daylight fingerprints are hashed to fit all fragments available in the dataset into a bit-string. A comparative analysis of these fingerprints has yet to be reported.
Barnard and Daylight fingerprints of several lengths were computed for an MDDR dataset comprising 3000 known actives belonging to six activity classes and another 7000 compounds selected at random. The compounds were ranked against selected leads using the Daylight Nearest Neighbour algorithm. Analysis of the top-ranking lists shows that: (i) the relative performance of the fingerprints is: Daylight 2K folded < Daylight 2K unfolded ~ Barnard 1K < Barnard 4K; (ii) a lead with structural features that are common within an activity class retrieves most hits; (iii) Barnard fingerprints tend to identify more chemotypes, whilst Daylight fingerprints identify more similar hits.
These observations support the use of as many structurally diverse leads as are available, and the consensus application of these fingerprint methods for ligand-based virtual screening. The order of fingerprint efficacy was also found to apply to the clustering of drug datasets. Barnard 4K outperformed the shorter fingerprints as judged by medicinal chemists’ intuition.
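The ranking step described above can be sketched with fingerprints held as sets of "on" bit positions. The sketch assumes the Tanimoto coefficient as the similarity measure and modulo folding to shorten a fingerprint; the compound names and bit positions are invented, and this is not the Daylight Nearest Neighbour implementation itself.

```python
def tanimoto(a, b):
    """Tanimoto coefficient of two fingerprints given as sets of on-bits."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def fold(bits, length):
    """Fold a fingerprint to `length` bits; colliding bits are OR-ed together."""
    return {bit % length for bit in bits}


def rank_by_similarity(lead, library):
    """Return library names ordered from most to least similar to the lead."""
    return sorted(library, key=lambda name: tanimoto(lead, library[name]),
                  reverse=True)


# Invented fingerprints, for illustration only.
lead = {1, 2, 3, 10}
library = {
    "cmpd_A": {1, 2, 3, 11},   # shares three bits with the lead
    "cmpd_B": {1, 20, 21},     # shares one bit
    "cmpd_C": {30, 31},        # shares none
}
print(rank_by_similarity(lead, library))
print(fold({1, 5, 9}, 4))
```

Folding makes distinct fragments collide on the same bit, which is one plausible reason a folded fingerprint could rank below its unfolded counterpart.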
Patent fundamentals for
non-experts
Edlyn S. Simmons, SourceOne-Business
Information Services, Procter & Gamble Co, 5299 Spring Grove Ave.,
Cincinnati, OH 45217, Fax: 513-627-6854, simmons.es@pg.com
Abstract
A patent is both a
legal document and a technical publication, subject to national laws and
precedent, international conventions and traditional forms of expression as well
as the conventional language of scientific and technical writing. When searching
patent databases and evaluating search results, the legal aspects of patents
should always be considered. This presentation provides an overview of the
fundamental elements of patent law, and patent documentation. The role of
priority under the Paris Convention, the timeline for patent filing, examination
and grant, the interpretation of patent claims, and the nature of the patent
monopoly are discussed.
Patent information
resources for non-experts
Andrew H. Berks, Merck & Co,
126 E. Lincoln Ave RY60-35, Rahway, NJ 07065-0900, Fax: 732-594-5832
Abstract
A brief overview of
information sources for locating patent information will be presented, including
important resources on services such as Questel, STN, and Dialog, and low or no
cost web-based resources. Chemical structure databases covering patents will be
discussed.
Patent term, expiries and
extensions
Stephen R. Adams, Magister Ltd, Crown House, 231
Kings Road, Reading RG1 4LS, United Kingdom, Fax: +44 118 929 9516,
stevea@magister.co.uk
Abstract
The basic period of
patent monopoly enjoyed by the patent holder is fixed by national law, but it is
only recently that worldwide standards have begun to develop, notably as a
result of the WTO TRIPS Agreement. The actual period obtained, however, is
subject to additional variables such as national requirements for annuity
payments, transitional legislation and term extensions. These factors will be
discussed, with special reference to legal status information sources.
Patentability,
infringement or validity: What kind of search?
Barbara A.
Hurwitz, Barbara Hurwitz, consultant, 36 Waverly Street, Portland, ME 04103,
Fax: 207-228-6418
Abstract
The most common
type of patent search is a patentability search, that is, a search for the prior
art. Freedom to Operate searches (FTO) are basically infringement searches where
in-force patent claims are searched to confirm that a new product or process
will not cause an infringement problem. Validity searches are performed when the
client wishes to invalidate a patent that is currently in force. The scope of a
search is determined by which of these searches is needed. Scope refers both to
the time period to be covered and the kinds of databases to be searched.
CINF 44:
Demystifying the Patent Search Process
Randall K.
Ward, Harold B. Lee Library, Brigham Young University, 2320 HBLL, Provo, UT
84602, Fax: 801-422-0466, randy_ward@byu.edu, and Barbara J. Ikeler, Novartis
Pharmaceuticals Corp
Abstract
To many, patent
searching may have an aura of sophistication that is intimidating and can make
one less than confident in attempting such searches. The authors will try to
give confidence to the non-expert regarding some of the basics of patent
searching, though it is supposed that patent searching experts could find many
caveats and exceptions to the information presented herein. Of course it is
crucial to use judgment as to when to consult an expert, but the intent here is
to share concepts that may (in many instances) provide adequate search results.
Presented will be some databases (with examples and characteristics) necessary
in basic patent searching, such as Derwent World Patents Index, Chemical
Abstracts, and INPADOC. Lightly touched will be tools and systems useful in
patent searching, such as STN, DIALOG, and Micropatent.
Patents Sell, But Who's
Searching - the rise of the non-expert in the patent searching
arena
Katharine Hancox, Product Development Group,
Chemistry Division, Thomson Derwent, 14 Great Queen Street, Holborn, London,
United Kingdom, katharine.hancox@derwent.co.uk, and Gez Cross, Product
Development Group, Chemistry Division, Thomson Derwent
Abstract
Patent searching -
a mystifying art practised by seasoned Information Professionals equipped with
an array of complex search strategies and command languages, prohibited to the
non-expert searcher for reasons of complexity and cost. As organisations
continue to realise the value of patent information across a wider range of
business functions, patent searching has evolved to encompass the end-user as
well as the specialised searcher.
Critical to meeting the needs of these ever-growing user communities is enabling powerful, simple methods to search, navigate and analyse patent data without detracting from the value of the content. In this paper we will use case studies on a particular platform, the Derwent Innovations Index, to show how web-based services are evolving to meet these needs, delivering flexible searching, alerting and personalisation with links to citing and cited patents, full-text patent sources and related scientific literature.
History of the DARC
system
Jacques-Emile Dubois, University Denis Diderot,
ITODYS, Paris 75005, France, dubois@paris7.jussieu.fr
Abstract
Chemistry
communication patterns changed dramatically in the 60s and 70s. Digital
computing required novel languages and codes. Employing topology to describe and
handle molecules was the basic paradigm of the DARC (Description, Acquisition,
Retrieval, and Correlation) System. New original and coherent concepts for
identification, retrieval and correlation of structures were developed for
technological and academic needs. DARC implementation profited greatly by
cooperation with nongovernmental institutions, e.g. ACS, CAS, IUPAC, as well as
private industries, to adapt, hone and harmonize its original tools and
strategies. Mike O’Hara contributed with enthusiasm and competence, both to the
classic DARC products, the generic DARC and the DARC/MARKUSH for patents. Human,
pedagogical, societal and political facets of the DARC story illuminate this
history of the past 50 years.
Substance handling at
Chemical Abstracts Service
W. Fisanick, Research and New
Product Development, Chemical Abstracts Service, 2540 Olentangy River Road, P. O.
Box 3012, Columbus, OH 43210, Fax: 614-447-3813, wfisanick@cas.org
Abstract
Since the advent of
the Chemical Registry in 1965, Chemical Abstracts Service (CAS) has developed
and used a variety of approaches and techniques relative to the handling of
chemical substances. These approaches and techniques involve the representation,
registration, and search and retrieval of chemical substances. Representation
aspects include the use of 2D and 3D structures along with nomenclature and
molecular property surrogates. Registration aspects include the use of special
structuring conventions. Search and retrieval aspects include the use of exact,
substructure and generic structure search capabilities. A key strategy is the
development of a continuum in the representation and access among specific and
generic substances. This paper will review the key approaches and techniques used
by CAS for substance handling.
Creating the MARPAT File:
Practical and philosophical issues in patent analysis and
database-building
David E. Connolly, Dept. 56 - Synthetic
and Polymer Chemistry, Chemical Abstracts Service, Columbus, OH 43210,
dconnolly@cas.org
Abstract
Most users of patent information see only the output
of searches from the CAS MARPAT file. This presentation will explore what goes
into creating the CAS MARPAT file and the challenges that database builders face
in taking complex Markush structures from the literature and turning them into
useful and meaningful information for our customers. These include: selection of
Markush structures from patents, interpreting confusing or contradictory patent
language and translating it into MARPAT coding that complements the CAS Registry
File. The presentation will include data and observations on trends in the
Markush literature.
Back for the future 2:
Cool codes, marvellous Markush and hot interfaces
Gez
Cross, Product Development Group, Chemistry Division, Thomson Derwent, 14
Great Queen Street, Holborn, London, United Kingdom, Fax: +44 207 344 2911,
gez.cross@derwent.co.uk, and Katharine Hancox, Product Development Group,
Chemistry Division, Thomson Derwent
Abstract
Throughout his
career Mike O’Hara was involved with training in and development of innovative
search systems for chemical and patent data, notably the CAS online and Markush
DARC systems providing structure searching of the CA Registry and the Derwent
and INPI-owned MMS file. He continually sought improvements to the systems
he was involved with to enable better retrieval and relevance for his clients.
In the 2002 Skolnik Award symposium, this author spoke about potential improvements in the Derwent patent files to enable older code-based data to be searched more easily by new and existing users, particularly in conjunction with structure searches. This paper will seek to honor Mike’s memory by reporting on progress in implementing these improved search capabilities including recent and forthcoming improvements to the searching of Markush structures from patents.
Tips and tricks for
searching MMS
Sandy Burcham, Service Is Our Business, Inc,
111 Lincoln Terrace, Norristown, PA 19403-3317, Fax: 610-630-0863,
cass123@earthlink.net
Abstract
This paper will
cover strategies developed to get the most from the MMS search system and
possibly some little known facts about content. All of these tips and tricks can
be attributed to Mike O'Hara - some learned from him directly and others learned
answering customers' questions when covering his phone.
Marketing chemical
information in a research organization
David S. Saari,
Library Information Center, Schering-Plough Research Institute, 2015 Galloping
Hill Road, Kenilworth, NJ 07033, Fax: 908-740-7015, david.saari@spcorp.com
Abstract
Michael O'Hara
constantly was involved in marketing activities. To honor Mike's contribution to
the chemical information community, this paper offers suggestions on how to
market information resources and services in a research organization. The
objectives of marketing are to transfer knowledge, initiate actions, establish
habits, and influence opinions. Marketing communications should include both the
features and benefits of the resource or service. Information professionals must
create opportunities to build relationships with current and potential clients.
Protys™ - A fulltext English index of new Japanese
patents
Alan Engel, Paterra, Inc, 526 N Spring Mill Road,
Villanova, PA 19085-1928, Fax: 610-527-2041, aengel@paterra.com
Abstract
Protys™ is a new Internet database that provides
fulltext English indexing of new Japanese patents. This SDI-targeted database
covers four weeks of Japanese Kokai with a one-week lag from publication. The
profile-centered user interface allows users to maintain multiple search
profiles and easily track profile runs versus database updates. Proximity
operators allow colocation searches by sentence, paragraph, claim(s), and
section (experimental, description of drawings, etc). Bibliographic search
fields include the full set of JPO-applied classification schemes including
F-terms and the 'facet' extension to the IPC and FI systems. ('Facets' allow
searching, for example, by pharmacological class.) Display options include a
term-in-context display that shows the basic information (bibliography, abstract
and front page image) plus only those portions of the document that match the
fulltext search criteria.
Post-Processing of Merged
Markush Service Results
Joseph M Terlizzi, Questel-Orbit,
8000 Westpark Drive, McLean, VA 22102, jterlizzi@questel.orbit.com
Abstract
The Merged Markush
Service (MMS) was designed for searching generic and specific chemical
structures indexed in the Derwent World Patents Index (DWPI) and INPI’s PHARM
database on Questel-Orbit. Compound number results from a search appear in two
different vendor formats and can only be searched in these bibliographic
databases separately. The Questel-Orbit MEM feature can be used to combine
results and display these records. Various vendor software packages, such as
Questel-Orbit’s Imagination and STN Express can be used for MMS searching and
extraction of these compound numbers to PHARM and DWPI. This presentation will
compare these vendor software packages, along with Questel-Orbit’s QWEB
interface, for transfer of MMS data to Questel-Orbit’s bibliographic databases.
MEMing techniques for combining results, displaying of images, ease of use, and
post-transfer to Bizint Smart Charts will all be explored and evaluated.
A life preserver for the
data flood
Gregory A. Landrum, Erik Evensen, Julie E.
Penzotti, and Santosh Putta, Rational Discovery LLC, 555 Bryant St. #467, Palo
Alto, CA 94301, Landrum@RationalDiscovery.com
Abstract
Recent years have
seen great advances in high-throughput screening; HTS systems capable of
handling hundreds of thousands (or even millions) of compounds are now routinely
used in drug discovery. Flexible new tools are needed to allow chemists to wade
through the flood of HTS data without drowning in it. Beyond providing an
interface to screening data, these data triage tools will facilitate the
development of new insights via efficient mining of results.
We have developed a system, built upon an established computational drug discovery platform, which enables: data mining using tools such as similarity searching and hierarchic clustering; constructing pharmacophore- and/or shape-based models; and applying a variety of machine-learning methods for building predictive models. The system, accessible via GUI and scripting interfaces, is usable by both bench and computational chemists. Here we present an overview of the system and its application to mining the NCI AIDS dataset.
AIMS: Array information
management system
David S. Hartsough, Informatics and
Modeling, ArQule, Inc, 19 Presidential Way, Woburn, MA 01801,
dhartsough@arqule.com
Abstract
High throughput
parallel synthesis places demands on chemical tracking and registration systems
that are not present in a single compound synthesis environment. This
presentation will describe ArQule’s integrated Array Information Management
System (AIMS) that manages workflows and processes associated with parallel
synthesis. Specific features to be presented will include tools for array
layout, tracking of reagent and product locations, product culling and
reformatting tools based upon analytical characterization and process monitoring
reports that allow tracking of project timelines and workflow. Powerful analysis
and query capabilities that allow managers and scientists to track and evaluate
their work will also be presented.
DirectedDiversity®
informatics - a status report
Victor S. Lobanov, and
Dimitris K. Agrafiotis, 3-Dimensional Pharmaceuticals, Inc, 665 Stockton Dr.,
Suite 104, Exton, PA 19341, Fax: 610-458-8249, victor.lobanov@3dp.com
Abstract
Having good-quality
compounds in a high-throughput screening deck can significantly increase the
odds of finding good leads and minimize the chance of project failures due to
ADMET liabilities. When combinatorial chemistry is the primary source of novel
chemical entities, compound selection becomes a daunting task due to the sheer
number of possibilities. We have developed efficient algorithms and software
systems to automate the analysis of combinatorial libraries and the selection of
compounds for a variety of purposes (e.g. diverse libraries for screening,
focused libraries for lead optimization, etc). The system is optimized for
maximum performance on a desktop computer, and allows complex library analysis
and planning experiments to be carried out in nearly interactive time frames.
ASPECT: A LIMS system for
characterization of combinatorial libraries
Brian Deneau,
Informatics and Modeling, ArQule Inc, 19 Presidential Way, Woburn, MA 01801,
bdeneau@arqule.com
Abstract
Informatics is a
powerful tool for supporting the synthesis and characterization of combinatorial
libraries for drug discovery. The interface between Cheminformatics and LIMS is
a particularly important area that can provide great benefit to both synthetic
and analytical chemists. The ASPECT system is a LIMS system that supports
characterization of ArQule's combinatorial libraries. ASPECT is fully integrated
with ArQule's AIMS system that supports the production of lead generation and
lead optimization libraries. This presentation will cover the tracking and
analysis capabilities that ASPECT offers for both the synthetic and analytical
chemists. The capabilities of the ASPECT system for streamlining library
characterization and analysis will also be described.
GeminiChemistry:
automating rapid analog synthesis
John Brohan,
Automation Consulting, Traders Micro, 317 Barberry Place, Dollard des Ormeaux,
Montreal, QC H9G 1V3, Canada, jbrohan@videotron.ca, and Rejean
Fortin, Medicinal Chemistry, Merck Frosst & Co Canada
Abstract
GeminiChemistry automates the liquid
handling steps of library synthesis.
Many of the constraints encountered in chemistry are handled in an embedded manner by GeminiChemistry.
High-throughput
chromatographic method selection and structure
verification
Michael McBrien, and Eduard Kolovanov,
Advanced Chemistry Development, 600-90 Adelaide W, Toronto, ON M5H 3V9, Canada,
michael@acdlabs.com
Abstract
The advent of LCMS structure verification for high-throughput and walk-up laboratories has led to the development of so-called “generic” chromatographic methods. These methods are designed to be applicable to given groups of samples such that reasonable chromatographic performance is observed without the need to consider individual samples. The problem with this approach is that no one method can apply to all circumstances. The result can be inadequate sample retention, carryover to subsequent samples, and/or instrument downtime. In addition, MS data provide only molecular weight information, and a correct mass is often wrongly taken as confirmation of the correct compound. Advanced Chemistry Development has developed algorithms that predict retention times for new compounds under generic conditions. The approach uses training set selection based on structure similarity searches combined with physicochemical parameters in order to predict retention times. These predictions are used as the basis for selecting between generic methods for each sample in a set. Retention times can subsequently be used as a structure verification filter to supplement other structure verification tools such as mass spectrometry, in addition to their use for high-throughput selection between generic methods.
REACTOR: Software system
for reagent selection, analysis and inventory management
Daniel
A Gschwend, Research Informatics, ArQule Inc, 19 Presidential Way, Woburn,
MA 01801, gschwend@arqule.com
Abstract
Elegant algorithms for library
design have been published that incorporate a wide variety of factors to be
considered in reagent selection. However, all of this work will be for naught if
the original set of reagents to be considered in the virtual library is not
amenable to automated high throughput synthesis. This presentation will describe
the REACTOR reagent selection and inventory management system developed at
ArQule to address this problem. Features of REACTOR will be presented that
address chemical suitability of reagents, historic information regarding reagent
utility, historic information of vendor reliability and current inventory
status. Powerful analysis and query capabilities that enable chemists to use
these and other reagent properties in their reagent selection and design will be
described.
Chemoinformatics tools for
combinatorial chemistry
M Karthikeyan, S Krishnan, and
Deepak Uzagare, Information Division, National Chemical Laboratory, Dr. Homi
Bhabha Road, Pune 411008, India, Fax: +91-20-5893973, karthi@ems.ncl.res.in
Abstract
Chemoinformatics plays a major
role in the drug discovery process by eliminating poor choices early and
helping to focus on good candidates. We present the development of chemoinformatics tools to
streamline combinatorial chemistry research. These include automation technology for
encoding and decoding chemical structures using
commercial barcodes for inventory and search applications, and an in-house
structure-based electronic laboratory notebook (D-LAN) for chemistry and allied
fields that preserves organizational knowledge and assists
intellectual property activities. A module that interfaces with combinatorial
chemistry guides the collection and storage of data in a proper format,
along with structural information and predicted properties, to ease
inventory management and improve reproducibility in research. A very large, virtually generated
molecular collection with predicted physicochemical properties of
organizational interest, and its interface with combi-chem research, is also explored.
Combinatorial informatic
systems at the NIST Combinatorial Methods Center
Cher H.
Davis1, Wenhua Zhang1, Alamgir Karim1, Eric
J. Amis2, and Michael J. Fasolka1. (1) Polymers Division,
National Institute of Standards and Technology, 100 Bureau Dr, MS 8542,
Gaithersburg, MD 20899, Fax: 301-975-4924, (2) Polymer Division, National
Institute of Standards and Technology
Abstract
Combinatorial methods involve
automated sample-array preparation, computer-driven characterization and
analysis, and overwhelming amounts of data. In order to coordinate this
automation and accommodate this data load, an informatics project has been
established at the NIST Combinatorial Methods Center (NCMC). The core of this
informatics effort is a scientific database system. This database will provide a
central and secure environment for data storage that is specifically geared to
accept materials research data from a variety of sources, is structured for
scientific aims, and allows for selective, intelligent retrieval of data through
scientist-mediated and automated routes. By interfacing the database with
instrumentation and data analysis tools throughout the NCMC laboratories, it
will help implement a longer range design-of-experiments (DOE) plan. With these
connections established, the system will enhance the design/refinement of
complex experiments; help streamline, document and organize research activities;
enable the seamless cross-correlation of data-sets produced across materials
disciplines, and facilitate new experiments based upon such comparisons. As data
handling and analysis routines will be automated, time-consuming data
maintenance chores will be eliminated. As the database content grows, it will
increasingly serve as a library useful for new experimental design and for providing
feedback for experimental refinement, dramatically reducing time spent on trial
and error. The NCMC database system is being built upon open-source code, and
will be supported by web-based interface software. In this presentation, we will
discuss the details and logistics of our growing project, including protocols we
are developing to standardize experiments, data, and procedures, to make them
more easily accommodated by the database and more comparable to each other.
Data management system for
catalyst discovery via combinatorial techniques
George
Fitzgerald, Jorg Hill, Georg Lowenhauser, Joe Tucker, and Michael J. Doyle,
Accelrys, 9685 Scranton Rd., San Diego, CA 92121, Fax: 858 458 0136,
gxf@accelrys.com
Abstract
Long established in the
pharmaceutical industry, combinatorial techniques are rapidly becoming de
rigueur in the development of new materials. However, owing to the variety of
elements, wide range of synthesis techniques, and general lack of detailed
structural information, data management can be far more challenging than for
organic molecules. We have initiated a project using high throughput techniques
to develop new catalytic materials for the reduction of NOx in automotive
exhaust, with the goal of meeting the "Tier II" emissions standards that will be
effective in 2007.
While synthesis, screening, data analysis, and modeling all play roles in the catalyst discovery process, we will focus on analysis and modeling. In particular, we will discuss the ability of the data management system to: (i) incorporate any user-defined processing operation in synthesis; (ii) store, retrieve and analyze data in a way meaningful to the experimental end-user; and (iii) support non-collocated teams.
Data Storage and
Evaluation Tools for High Throughput Experimentation Applied to Heterogeneous
Catalysis
Wolfgang Strehlau, hte Aktiengesellschaft,
Kurpfalzring 104, Heidelberg 69123, Germany, Fax: +49 (0) 6221 7497 134,
Wolfgang.Strehlau@hte-company.de
Abstract
D. Demuth, K.-E. Finger, O. Gerlach, J. Klein, A. Sundermann, U. Vietze and W. Strehlau
hte Aktiengesellschaft, Kurpfalzring 104, 69123 Heidelberg, Germany
High Throughput Experimentation is the rapid completion of two or more experimental stages in a concerted and integrated fashion. It typically comprises four interconnected stages: “Design”, “Make”, “Test” and “Model”. This cycle applies equally to the discovery and development of drugs, heterogeneous catalysts, or other materials. The data relating to and produced by all of these operations are housed in the MatInformatics system. The “Design” step leverages various computational tools, such as factorial design and other design of experiment (“DOE”) protocols, the evaluated results of past rounds of experiments, information already available from other sources, and the insights and intuition of the project team. DOE tools support the choice of which experimental points to sample in a complex parameter space. Full coverage of the parameter space defined by just the compositional dimensions of a multi-element inorganic system would require an infinite number of experiments. Thus the practitioner must decide (i) how many experiments to perform and (ii) at what increments each variable is sampled. The DOE tools aid the design process, and their value can increase as understanding of the parameter space accumulates in successive iterations through the HTE cycle. The catalyst testing profiles defined in the Design stage are typically applied in a parallel reactor system (“Test” stage). A variety of techniques are currently under discussion for data evaluation and mathematical modeling, as well as for search strategies that reduce the number of data sampling points. The successful use of each of these mathematical approaches depends on the specific problem to which the algorithm is applied. It is difficult to predict which data evaluation algorithm will meet the research goals most economically and how the interaction between DOE tools and the evaluation algorithm can be implemented most efficiently.
The presentation illustrates some of the design and evaluation tools mentioned above by means of practical examples derived from recent research programs.
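The trade-off between the number of experiments and the sampling increment can be made concrete with a small full-factorial sketch in Python. This is only an illustration of the general idea, not the MatInformatics tooling; the variable names (`Pt_wt_pct`, `Pd_wt_pct`, `calcination_C`) and their ranges are hypothetical.

```python
from itertools import product
from typing import Dict, List, Sequence


def grid_levels(low: float, high: float, steps: int) -> List[float]:
    """Evenly spaced levels from low to high, inclusive."""
    if steps < 2:
        raise ValueError("steps must be at least 2")
    width = (high - low) / (steps - 1)
    return [round(low + i * width, 10) for i in range(steps)]


def full_factorial(levels: Dict[str, Sequence[float]]) -> List[Dict[str, float]]:
    """Every combination of the given levels, one dict per experiment."""
    names = list(levels)
    combos = product(*(levels[name] for name in names))
    return [dict(zip(names, combo)) for combo in combos]


# Hypothetical variables and ranges, chosen only for illustration.
COARSE = full_factorial({
    "Pt_wt_pct": grid_levels(0.0, 1.0, 3),
    "Pd_wt_pct": grid_levels(0.0, 1.0, 3),
    "calcination_C": [400, 500],
})
FINE = full_factorial({
    "Pt_wt_pct": grid_levels(0.0, 1.0, 5),
    "Pd_wt_pct": grid_levels(0.0, 1.0, 5),
    "calcination_C": [400, 500],
})
```

Halving the increment on two compositional variables raises the run count from 3 x 3 x 2 to 5 x 5 x 2, which is why a finer grid over many elements quickly becomes impractical.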
Rational design, an
alternative to the combinatorial explosion
François
Gilardoni, Alasdair Graham, Ben McKay, and Brown Brown, Avantium
Technologies B.V, Zekeringstraat 29, 1014 BV, Amsterdam, Netherlands, Fax: +31
(0)20 586 8085, Francois.Gilardoni@avantium.nl
Abstract
Automated and parallel methods
are rapidly growing in chemical process research and development, initiated by
the extensive implementation of combinatorial techniques in medicinal chemistry.
Combinatorial chemistry libraries are generated by systematic permutation of the
structural parameters of constituent building blocks. The diversity of these
libraries cannot be exploited even with very high-throughput experimental
platforms. Avantium has developed an alternative, called “Rational Design”
(Figure 1), that maximizes diversity in the least number of experiments to
create a performance-based model. This cost-effective approach for high
throughput experimentation combines clustering, molecular modelling, statistical
design of experiment and multivariate statistics. A model correlates properties,
or descriptors, of a catalyst or a formulation component and process conditions,
to its end performance, without requiring a complete mechanism. Avantium is
actively involved in implementing further developments to render this technique
faster, more cost-effective and integrated into an HTE platform, with tools and
techniques to design “Rational Libraries”.
Increasing the Efficiency
of High-Throughput Experimentation by Use of Experimental Design and Data
Analysis Techniques
Arne L. Ohrenberg, and Andreas
Schuppert, Bayer Technology Services, Bayer AG, Leverkusen D-51368, Germany,
Fax: +49-214-3064801, arne.ohrenberg.ao@bayertechnology.com
Abstract
Experimental design is a
powerful method to improve the efficiency of HTE to discover new materials,
drugs or catalysts. The parameter space of screening experiments is usually
high-dimensional and the variables are possibly discrete. The response surface
of the screened systems can be very rugged, characterized by smooth planes as
well as steep and narrow ascents of abundant suboptima. These conditions make
the exclusive use of the classical statistical experimental design and data
analysis inappropriate. Evolutionary strategies, neural networks and data mining
may be an efficient alternative. Using various examples, we show the practical
benefit of design strategies which combine different techniques. The selection
of the methods depends on the nature of the respective HTE-problem. An optimal
design strategy makes HTE more efficient, and reduces research costs and time to
market. Furthermore, the early application of a design strategy enables reliable
statements about the feasibility of the research project.
Iterative experiment
design
Steven G. Schlosser1, Alan J.
Vayda1, Erik J. Erlandson1, Maureen Bricker2,
Ralph Gillespie2, and J. W. Adriaan Sachtler2. (1)
NovoDynamics, Inc, 123 N. Ashley, Ann Arbor, MI 48104, Fax: 734-205-9101,
steve@novodynamics.com, (2) UOP LLC
Abstract
Materials discovery is an
iterative process. Experiments are designed, materials are synthesized and
tested, analyses are performed, and new experiments are designed. The
combinatorial approach and specialized high-throughput equipment speed up the
process but does not solve the fundamental problem of how to drive the discovery
process. Traditional statistical experiment design approaches are better suited
to single experiments. Traditional analysis approaches are not easily linked to
experiment design tools. A new approach is described in this paper which
utilizes highly integrated experiment design and predictive modeling tools. The
experiment design tool features an optimal coverage algorithm for placement of
experimental points within complex multidimensional regions of interest while
taking into account previously tested points. The predictive modeling tools
operate on data with arbitrary point placement and do not require regular
placement along the various axes. These integrated tools have been applied to
the discovery of heterogeneous catalysts.
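The abstract does not describe the coverage algorithm itself. As one way to picture point placement that accounts for previously tested points, the Python sketch below uses a simple greedy maximin rule: each new point is the candidate farthest from everything already tested or chosen. The rule, the function names, and the example coordinates are assumptions for illustration, not the authors' method.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def _nearest_distance(p: Point, occupied: Sequence[Point]) -> float:
    """Distance from p to its closest occupied point (inf if none exist)."""
    return min((math.dist(p, q) for q in occupied), default=math.inf)


def greedy_coverage(
    candidates: Sequence[Point],
    tested: Sequence[Point],
    n_new: int,
) -> List[Point]:
    """Pick n_new candidates, each maximizing distance to the occupied set."""
    occupied = list(tested)
    remaining = list(candidates)
    chosen: List[Point] = []
    for _ in range(min(n_new, len(remaining))):
        best = max(remaining, key=lambda p: _nearest_distance(p, occupied))
        chosen.append(best)
        occupied.append(best)
        remaining.remove(best)
    return chosen


# Hypothetical two-variable region with one previously tested point.
CANDIDATES = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0), (0.5, 0.5)]
TESTED = [(0.0, 0.0)]
NEW_POINTS = greedy_coverage(CANDIDATES, TESTED, n_new=2)
```

Because candidates may lie anywhere, this kind of rule does not require regular placement along the axes, which matches the arbitrary point placement described above.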
Machine-learning models
for high-throughput materials discovery
Gregory A. Landrum,
and Julie E. Penzotti, Rational Discovery LLC, 555 Bryant St. #467, Palo Alto,
CA 94301, Landrum@RationalDiscovery.com
Abstract
In order for any model building
methodology to be useful in high-throughput materials discovery, it is essential
that it be both flexible enough to handle the complexity of the problems at hand
and fast enough to not create a bottleneck in the discovery process.
Machine-learning techniques satisfy both of these criteria.
We have developed an ensemble approach to model building which provides both high accuracy and confidence estimates for each prediction. The flexibility and efficiency of our approach have been validated on a number of materials, catalysis, and life-science problems.
Here we present an interpretable machine-learning model for the prediction of ferromagnetism in binary transition metal alloys, and the results of applying our ensemble approach to the prediction of Tc values in superconductors and Tg values in polymers. We will also discuss the selection of descriptor sets which enable high computational throughput for these problems.
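How an ensemble can attach a confidence estimate to each prediction may be sketched as follows. The abstract does not say how confidence is computed; using the fraction of ensemble members that agree with the majority vote is an assumption, as are the three toy threshold models and their descriptor values.

```python
from collections import Counter
from typing import Callable, Hashable, Sequence, Tuple

Model = Callable[[Sequence[float]], Hashable]


def ensemble_predict(
    models: Sequence[Model],
    descriptors: Sequence[float],
) -> Tuple[Hashable, float]:
    """Majority-vote prediction plus the fraction of models that agree."""
    if not models:
        raise ValueError("at least one model is required")
    votes = Counter(model(descriptors) for model in models)
    label, count = votes.most_common(1)[0]
    return label, count / len(models)


# Hypothetical ensemble members: simple threshold rules on two descriptors.
MODELS = [
    lambda x: x[0] > 0.5,
    lambda x: x[1] > 0.2,
    lambda x: x[0] + x[1] > 1.0,
]

SPLIT_VOTE = ensemble_predict(MODELS, (0.6, 0.3))
UNANIMOUS = ensemble_predict(MODELS, (0.9, 0.9))
```

A low agreement fraction flags predictions that deserve experimental follow-up before being trusted.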
Computer-aided discovery
of compounds with combined mechanism of pharmacological action in large chemical
databases
Alexey A. Lagunin1, Oleg A.
Gomazkov1, Dmitrii A. Filimonov2, Nina I.
Solovyeva2, and Vladimir V. Poroikov2. (1) V.N. Orekhovich
Institute of Biomedical Chemistry of Rus. Acad. Med. Sci, Pogodinskaya Str., 10,
Moscow 119121, Russia, Fax: (7-095) 245-0857, alex@ibmh.msk.su, (2) Institute of
Biomedical Chemistry of Russian Academy of Medical Science
Abstract
The prediction of
biological activity spectra for substances has been studied as a tool for the
search for compounds with dual mechanisms of action in large chemical and
combinatorial databases. Biological activity spectra of substances, including
pharmacological effects, mechanisms of action, mutagenicity, carcinogenicity,
teratogenicity and embryotoxicity, are predicted by the computer program PASS (http://www.ibmh.msk.su/PASS) on the basis
of their structural formulae. Relationships between pharmacological effects and
molecular mechanisms of action are identified with the computer program
PharmaExpert. The data on mechanism-effect relationships and the prediction
results for biological activity spectra allow the user to quickly select compounds with
a possible combined mechanism of action causing a specific pharmacological effect.
The search for potential antihypertensive compounds with dual molecular
mechanisms of action in databases of commercially available compounds (AsInEx
and ChemBridge, about 200,000 compounds in total) is presented as an example of
high-throughput computer-aided drug discovery. Four substances, potential
inhibitors of angiotensin converting enzyme (ACE) and neutral endopeptidase
(NEP), were selected. Experimental testing of these compounds confirmed that
they are inhibitors of ACE and NEP with IC50 values in the range 10^-5 to 10^-9 M.
Citation Linking -- How
important is it?
Suzanne Fedunok, Coles Science Center, New
York University Bobst Library, 70 Washington Square South, New York, NY 10012,
Fax: 212-995-4283, suzanne.fedunok@nyu.edu
Abstract
Conventional wisdom is that the
more citation linking available to a reader of a scientific paper, the better.
Publishers' representatives are quoted as saying "the publisher with the most links
wins." This paper reports on a study of a small set of chemistry journals to
determine the proportion of cited references hyperlinked and their importance to
the understanding of the paper.
Global submission and
validation of experimental thermodynamic data using Guided Data Capture (GDC)
software: Benefits to authors, journals, and data users
Robert
D. Chirico1, Vladimir V. Diky1, Randolph C.
Wilhoit2, and Michael Frenkel1. (1) Thermodynamics
Research Center (TRC), National Institute of Standards and Technology (NIST),
Mailstop 838.00, 325 Broadway, Boulder, CO 80305, Fax: 303-497-5044,
chirico@boulder.nist.gov, (2) Texas Engineering Experiment Station, Texas
A&M University System
Abstract
Guided Data Capture software
(GDC) has been developed by TRC at NIST for mass-scale abstraction from the
literature of experimental thermophysical and thermochemical property data. As
of September 2002, the Editorial Board of the Journal of Chemical and
Engineering Data established a new policy for submission and dissemination of
experimental data with use of the GDC software at its core. Following the
peer-review process, authors are requested to download and use the GDC software
to capture the experimental property data accepted for publication. The output
file from the GDC software is submitted directly to TRC. After additional
consistency tests, the files are converted into an XML-based format (ThermoML)
with software developed at TRC. Upon publication of the manuscript, ThermoML
files are posted on the TRC Web site for unrestricted public access. Discussions
are in progress for implementation of this process with other major journals in
the field. Key features of the GDC software will be discussed together with
benefits derived to authors, journals, and data users.
Concept of metadata in
scientific publications and the way from data to
information
Horst Bögel, Department of Chemistry,
Martin-Luther-University Halle, Kurt-Mothes-Str. 2, Halle 06120, Germany, Fax:
49-345-5527664, boegel@chemie.uni-halle.de
Abstract
The amount of scientific data
is increasing rapidly in many areas, e.g. in chemistry and the biosciences. Through new
computer and communications technologies we have online access to worldwide
data collections, databases and online journals in full text, and we can search
the World Wide Web. Most of these resources can be accessed from our desk in
the office, without the delay of ordering documents, so we are able to solve
problems in much shorter periods of time. Sometimes, however, a search returns a very large number of
hits with quite similar data, and it is not easy to find the information we were
searching for. A few questions should be raised: 1. Do these documents have a
convenient structure for efficient and successful searching? 2. Do they contain the
original data in a representation suited to efficient re-use? 3. Do they support
automatic data transfer into databases and archives? 4. Do they support
multipurpose interoperability of data? There are several models for
associating resources and metadata. In text documents on the Web, descriptive
information is most commonly embedded in the documents by using the META tags of
the Hypertext Markup Language (HTML). These metadata can be created by the
author or by the publisher. Creating and managing these metadata is often
labour-intensive, and sophisticated semi-automated procedures are being developed.
Usually the data and the metadata are combined and transported over the Web to
the user. The browser at the client side of the network displays the data in a
given layout. Publishers may want to use metadata in order to make the contents
of their restricted resources and services visible to searchers. Extensible Markup
Language (XML) makes it possible to separate content from layout, with document structure
described in a Document Type Definition (DTD) and presentation in Cascading Style Sheets
(CSS). XML has a modular approach; an application is built from components.
Chemical Markup Language (CML) and others (e.g. MathML) are useful for the
representation of specific contents and provide a more universal
infrastructure for publishing. At present XML is increasingly widely accepted
as an information infrastructure.
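To make the role of embedded HTML META tags concrete, the short Python sketch below pulls name/content pairs out of a page using only the standard library. The sample page, its field values, and the helper names are hypothetical and serve only to show how such metadata could be harvested automatically.

```python
from html.parser import HTMLParser
from typing import Dict


class MetaExtractor(HTMLParser):
    """Collects <meta name="..." content="..."> pairs from an HTML document."""

    def __init__(self) -> None:
        super().__init__()
        self.metadata: Dict[str, str] = {}

    def handle_starttag(self, tag, attrs):
        # HTMLParser reports tag and attribute names in lower case.
        if tag != "meta":
            return
        fields = dict(attrs)
        name, content = fields.get("name"), fields.get("content")
        if name and content is not None:
            self.metadata[name.lower()] = content


def extract_meta(html: str) -> Dict[str, str]:
    """Return the META name/content pairs found in the given HTML text."""
    parser = MetaExtractor()
    parser.feed(html)
    return parser.metadata


# Hypothetical page, invented for illustration only.
SAMPLE_PAGE = """
<html><head>
<title>Example article</title>
<meta name="author" content="A. Author">
<meta name="keywords" content="chemistry, metadata" />
</head><body><p>Text of the article.</p></body></html>
"""
METADATA = extract_meta(SAMPLE_PAGE)
```

A harvester of this kind sees only what the author or publisher chose to embed, which is why the effort of creating the metadata matters so much.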
Now that everything can be
published, should we really publish everything?
Anthony W.
Czarnik, Sensors for Medicine and Science, Inc, 12321 Middlebrook Road STE
210, Germantown, MD 20874, awczarnik@s4ms.com
Abstract
If the term "publication"
literally means, 'making information public,' then the digital age heralds a
time when anyone can publish anything at any time. When everything CAN be
published, what SHOULD be published? The arbiters of these decisions, editors,
have a more important role today than yesterday. They must decide not 'What is
available to read' but rather 'What is important to read.' There's a big
difference, and while the ultimate power of editors is now lessened, the
responsibility upon editors is increased. Editors who 'profess' what their
professions believe, as codified by professional organizations, will have
influence proportional to that of the brand of the organization. Publication in
the Digital World will necessarily become a more pluralistic process.
Chemistry journals: How I
want to read in 2012
Steven M. Bachrach, Department of
Chemistry, Trinity University, 715 Stadium Drive, San Antonio, TX 78212, Fax:
210-999-7569, sbachrach@trinity.edu
Abstract
Electronic media offer an
opportunity for radically restructuring the way chemists communicate. As of
2003, the majority of STM publishers have only scratched the surface of its
potential. In this talk, I will present a vision of the future of the chemistry
journal, highlighting how technological innovations will dramatically enhance
the information content of the chemistry article, enabling scientists to more
effectively communicate and more efficiently assimilate information.
Some stumbling blocks on
the road to publishing chemistry on the web
David P
Martinsen1, Lorrin R Garson1, and Joseph E.
Yurvati2. (1) ACS Publications, American Chemical Society, 1155 16th
Street NW, Washington, DC 20036, d_martinsen@acs.org, (2) Journal Publishing
Operations, American Chemical Society
Abstract
The publication of chemistry on
the web allows a number of features to be included which are impossible to
render in the print version. However, using these web-enhanced objects is not
without difficulty. This paper will examine some of the problems encountered in
receiving electronic documents, chemical structure files, animations, and VRML
and the impact of these on both the review process and the publication process.
Implications of the publication of these new types of objects on the long-term
archive of the manuscripts will also be addressed.
Nanoworld in Chemical
Abstracts
Felix S Sirovski, Laboratory of Fine Organic
Synthesis, Zelinsky Institute of Organic Chemistry, 47 Leninsky pr, 119991
Moscow, Russia, Fax: 7-095-135-5328, sirovski@gol.ru, Nadezhda Krukovskaya,
Information Department, Zelinsky Institute of Organic Chemistry, and Valentina
Efremenkova, Methodological Department, VINITI
Abstract
The intensive exploration of
the nanoworld began at the end of the last century. The works of Kroto and Iijima
gave rise to “Sturm und Drang” in the field of carbon nanomaterials. That is
illustrated by the graph below. The communication is devoted to peculiarities of
the indexing of works on nanotechnologies in CA.
Building an Internet
Chemistry Business
Scott G. Hutton, ChemNavigator, Inc,
6126 Nancy Ridge Drive, San Diego, CA 92121, Fax: 858-625-2377,
shutton@chemnavigator.com
Abstract
ChemNavigator is a growing
company providing chemistry and cheminformatics services facilitated greatly by
the Internet. Founded in 1999 as a pure e-commerce chemistry company,
ChemNavigator has learned a great deal about what works and what doesn’t work
with chemistry businesses on the Internet. This presentation will cover the key
business and technical issues which ChemNavigator has learned play critical
roles in the success of a chemistry business on the Web. Both ChemNavigator’s
perspective and those of its key clients will be covered.
"Tools of Research" course
for chemistry graduate students
Patricia Muisener, and
Katherine M. Whitley, Department of Chemistry, University of South
Florida, 4202 E. Fowler Avenue, Tampa, FL 33620, Fax: 813-974-1733,
muisener@chuma1.cas.usf.edu, kwhitley@lib.usf.edu
Abstract
Chemistry graduate students at
the University of South Florida can prepare for many aspects of their research
careers in this new required course. Co-instructors are the Chemistry
Department’s Assistant Chair for Graduate Concerns and the Chemistry Librarian.
The first semester integrates training in information retrieval using specialized
databases and web sites, in protecting intellectual property, in locating
funding sources, in writing grant proposals and journal articles, in reviewing
their colleagues’ work, in oral presentation, and in discussion for journal
clubs. Guest speakers included a former NSF program officer, a sponsored
research specialist, a patents specialist, and an officer from the Institutional
Review Board. The second semester exposes the students to major instrumentation
resources they will need for their research. Each week, expert guest lecturers
will discuss and demonstrate an instrumental technique. A familiar comment from
many professors is “I wish I’d had a course like that when I was a graduate
student!”
Classification of mass
spectra using a fuzzy logic inference engine
Jill R. Scott,
Timothy R. McJunkin, and Paul L. Tremblay, Department of Chemistry, Idaho
National Engineering and Environmental Laboratory, 2525 N. Fremont Ave., MS
2208, Idaho Falls, ID 83415, Fax: 208-526-8541, scotjr@inel.gov
Abstract
Previously, we automated our
imaging internal laser desorption Fourier transform mass spectrometer. Mass
spectral data are acquired at a rate of approximately 7200 files/hour. Manual
analysis of these files would take several weeks for a trained operator;
therefore, we developed an inference engine to automate the data analysis. The
inference engine software is a fuzzy logic expert system designed to simulate
the analysis that a human operator would perform. The cues that a human operator
uses to classify mass spectra have been encapsulated into the fuzzy rule base.
The inference engine can analyze 7200 files in approximately 20 minutes and
prepare the output in a format for any commercial graphics program. A second
inference engine is used to help refine the rule base for mass spectral
assignment by gathering statistics on ions that are not currently part of the rule base
but may be candidates for making the rule base more robust.
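A single fuzzy rule of the kind such a rule base might contain can be sketched in Python as below. The abstract does not give the actual rules or membership functions, so the triangular mass-match function, the intensity ramp (5% to 20% relative intensity), the tolerance, and the example m/z values are all assumptions made for illustration.

```python
def triangular(x: float, left: float, peak: float, right: float) -> float:
    """Triangular membership: 0 outside (left, right), 1 at the peak."""
    if x <= left or x >= right:
        return 0.0
    if x <= peak:
        return (x - left) / (peak - left)
    return (right - x) / (right - peak)


def mass_match(observed_mz: float, expected_mz: float, tolerance: float) -> float:
    """Degree to which an observed peak matches the expected m/z."""
    return triangular(
        observed_mz, expected_mz - tolerance, expected_mz, expected_mz + tolerance
    )


def intensity_high(rel_intensity: float) -> float:
    """Assumed ramp: 0 at or below 5% relative intensity, 1 at or above 20%."""
    return min(1.0, max(0.0, (rel_intensity - 0.05) / 0.15))


def assignment_degree(
    observed_mz: float,
    rel_intensity: float,
    expected_mz: float,
    tolerance: float = 0.02,
) -> float:
    """Rule: IF mass matches AND intensity is high THEN assign the ion.

    Fuzzy AND is taken as the minimum of the two membership degrees.
    """
    return min(
        mass_match(observed_mz, expected_mz, tolerance),
        intensity_high(rel_intensity),
    )


# Hypothetical peaks, invented for illustration only.
PARTIAL = assignment_degree(observed_mz=100.01, rel_intensity=0.20, expected_mz=100.00)
REJECTED = assignment_degree(observed_mz=100.05, rel_intensity=0.20, expected_mz=100.00)
```

Graded degrees of this sort let borderline peaks be ranked and reviewed instead of being accepted or discarded outright.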