Program With Abstracts
Combinatorial Chemical Information (Cosponsored with COMP) 1
Cosponsored with Division of Computers In Chemistry
Convention Center, Room 220
R. Snyder, Organizer, Presiding
8:30-Introductory Remarks
8:40-1. Chemistry first - Accord solutions to non-trivial enumeration problems.
Julian Hayward and Keith A Harrington. Synopsys Scientific Systems Ltd., 5 North Hill
Road, Leeds, LS6 2EN, United Kingdom. Email: julian.hayward@synopsys.co.uk
Widespread adoption of combinatorial chemistry methodology in the Pharmaceutical industry has created the need for chemists to generate computer representations of large numbers of compounds for analysis, tracking and registration. Software for this so-called 'compound enumeration', based on the input of starting materials and scaffolds, is widely used by scientists within the industry today to provide adequate solutions to straightforward chemical problems.
However, organic chemistry is not that simple and enumeration problems typically occur for examples which involve symmetry, stereochemistry and multiple attachment points - even the humble Diels-Alder reaction has presented a major obstacle for enumeration software.
This presentation will focus on how recent
advances in chemical representation within the Accord Chemistry Engine have led
to simple solutions to complex enumeration problems.
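The symmetry issue mentioned in this abstract can be illustrated with a toy sketch. It is not the Accord Chemistry Engine's method: the function name `enumerate_library`, the `symmetric_sites` parameter, and the bracketed label notation are assumptions made for illustration, and real software would work on chemical structures rather than strings.

```python
from itertools import product


def enumerate_library(scaffold, r_groups, symmetric_sites=()):
    """Enumerate scaffold/substituent combinations as plain label strings.

    scaffold        -- template such as "core({R1})({R2})" (toy notation, not SMILES)
    r_groups        -- dict mapping each site name to its list of substituents
    symmetric_sites -- site names treated as equivalent by symmetry (assumption)
    """
    sites = sorted(r_groups)
    seen, products = set(), []
    for combo in product(*(r_groups[s] for s in sites)):
        assignment = dict(zip(sites, combo))
        # Canonical key: substituents on equivalent sites are order-independent.
        sym = tuple(sorted(assignment[s] for s in symmetric_sites))
        rest = tuple(assignment[s] for s in sites if s not in symmetric_sites)
        key = (sym, rest)
        if key in seen:
            continue  # same compound reached through a symmetry-related assignment
        seen.add(key)
        products.append(scaffold.format(**assignment))
    return products
```

With two substituents on each of two sites, naive enumeration yields four products, while declaring the sites equivalent removes the duplicate and yields three.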
9:20-2. Where are the GaPs? A rational approach to monomer selection.
Andrew R. Leach1, Darren V.S. Green2, Michael M. Hann1,
Andrew Good1, and Duncan B. Judd3. (1) Computational
Chemistry Group, Glaxo Wellcome Research & Development, Gunnels Wood Road,
Stevenage, SG6 3RT, United Kingdom, (2) Lead Discovery Unit, Glaxo Wellcome
Research & Development, (3) Lead Design Unit, Glaxo Wellcome Research &
Development. Email: arl22958@ggr.co.uk
We will outline a computational method for the classification and selection of monomers for combinatorial libraries. The molecules are described in terms of the pharmacophoric groups they contain and where those pharmacophoric groups can be located in 3D space. The approach involves a detailed conformational analysis of each molecule. This conformational analysis is done within a common coordinate frame thus enabling molecules to be compared.
To date there have been two major
applications of the approach. First, it has been used to help decide which
monomers to purchase. The use of a partitioned space is key to this particular
application as it facilitates the identification of regions of space which are
under-represented by existing compounds. The method has also been used in a
number of medicinal chemistry projects to select "diverse" sets of monomers.
Both applications will be described.
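A minimal sketch of how a partitioned space can expose under-represented regions is given below. It assumes each molecule has already been reduced to a point in a common coordinate frame. The conformational analysis and pharmacophore assignment described in the abstract are not reproduced, and the names `cell_of` and `gap_filling_candidates`, the bin width, and the occupancy threshold are all hypothetical.

```python
from collections import Counter


def cell_of(point, bin_width):
    """Map a coordinate tuple to the index of the grid cell that contains it."""
    return tuple(int(x // bin_width) for x in point)


def gap_filling_candidates(existing, candidates, bin_width=1.0, max_count=0):
    """Pick candidates that fall into cells holding at most max_count compounds.

    existing   -- list of coordinate tuples for compounds already owned
    candidates -- dict mapping a candidate name to its coordinate tuple
    """
    occupancy = Counter(cell_of(p, bin_width) for p in existing)
    picks = []
    for name, point in candidates.items():
        cell = cell_of(point, bin_width)
        if occupancy[cell] <= max_count:
            picks.append(name)
            occupancy[cell] += 1  # the cell is now (better) represented
    return picks
```

Because occupancy is updated as candidates are picked, two candidates landing in the same empty cell are not both selected when `max_count` is 0.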
10:00-3. Reagent Selector: a decision support approach to reagent library design.
Douglas R. Henry, Al J. Gushurst, Maurizio Bronzetti, David Pirkle,
Richard Coad, Rik Winter, Alex Nguyen, Chris Nguyen, Thomas E. Moock, Ali G.
Özkabak, and Jay A. Turner. MDL Information Systems, 14600 Catalina Street, San
Leandro, CA 94577. Email: dough@mdli.com
This talk describes a unique, decision
support approach to designing reagent libraries. The program Reagent Selector
uses data objects familiar to chemists: structures, lists, and tables. It is
based on a relational chemical datamart, which can dynamically manage large
numbers of structures, supplier data, and property information from multiple
sources. The talk shows the application of simple, effective decision support
tools such as grid and table displays, sorting, filtering, and clustering as
applied to the design and selection of reagent libraries. We also describe the
extension of Reagent Selector to interface with property calculators,
alternative data analysis methods, and external programs.
10:40-4. Virtual Optimization of Chemical Libraries using Genetic Algorithm.
Alfonso Pozzan1, Andrew Leach2, Aldo Feriani1, and Mike Hann2. (1) Medicinal Chemistry
Computational Chemistry, GlaxoWellcome S.p.A., v. A. Fleming 2, 37135 Verona,
Italy, (2) Computational Chemistry, GlaxoWellcome,
Gunnels Wood Road, Stevenage Herts SG5 2NY, UK. Email: ap16390@glaxowellcome.co.uk
One of the essential points in
combinatorial library design concerns the selection of the monomers to be used
as building blocks for the combinatorial synthesis of the final molecules.
Currently, public databases like the ACD contain many thousands of molecules
suitable as monomers for reaction under combinatorial chemistry conditions.
Considering that the number of available monomers is increasing and that
combinatorial chemistry technology is giving access to more and more chemical
reactions, one of the major tasks for library design is to select the best set
of monomers out of a large number of potential reactants. For this reason we
have developed in house a program called VOLGA (Virtual Optimization of chemical
Libraries using Genetic Algorithm) which allowed us to optimize the design of a
wide class of chemical libraries by choosing among different fitness functions.
When VOLGA was planned, particular attention was paid to obtaining a program
that could use any fitness function defined by the user. Fitness functions that
have been successfully used to date include: 3D pharmacophore fitting, 2D
similarity/dissimilarity measures, drug like profiles and QSAR derived models.
The program allows optimization of libraries ranging from a few tens up to 10000
molecules. Optimization can be run by starting from potentially huge virtual
libraries ranging from a few thousand to several million molecules (i.e. all
those that could be generated by combinatorial explosion of all the reactants
considered in the design model). The aim of this paper is to critically analyze
the different methods and scoring functions that have been used along with
details on how classical GA theory was adapted in order to optimize
combinatorial libraries. Advantages and drawbacks of this method are
discussed.
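The idea of a genetic algorithm that accepts any user-defined fitness function can be sketched as follows. This is not VOLGA itself: the function name `optimize_library`, the truncation selection, the union-based crossover, and all default parameter values are assumptions chosen to keep the example short.

```python
import random


def optimize_library(pool, size, fitness, generations=50, pop_size=30,
                     mutation_rate=0.2, seed=0):
    """Search for a subset of `size` monomers from `pool` with high fitness.

    fitness -- any user-supplied callable that scores a list of monomers
    """
    rng = random.Random(seed)
    population = [rng.sample(pool, size) for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        parents = population[: pop_size // 2]  # truncation selection, keeps the elite
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            # Crossover: draw the child from the union of both parents,
            # so it never contains a duplicate monomer.
            genes = list(dict.fromkeys(a + b))
            child = rng.sample(genes, size)
            if rng.random() < mutation_rate:
                # Mutation: swap one member for a monomer not yet in the child.
                outside = [m for m in pool if m not in child]
                if outside:
                    child[rng.randrange(size)] = rng.choice(outside)
            children.append(child)
        population = parents + children
    return max(population, key=fitness)
```

For example, `optimize_library(list(range(20)), 5, fitness=sum)` searches for five integers with a large sum. A pharmacophore-fit or dissimilarity score could be passed as `fitness` in the same way. The result is a good subset found by stochastic search, not a guaranteed optimum.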
11:20-5. Use of Markush Structure Analysis Techniques for Rapid Processing of Large Combinatorial Libraries.
John M. Barnard1, Geoff M. Downs1, and Robert D. Brown2. (1) Barnard Chemical
Information Ltd, 46 Uppergate Road, Stannington, Sheffield, S6 6BX, United
Kingdom, (2) MSI Molecular Simulations Inc., 9685 Scranton Road, San Diego, CA
92121-3752. Email: barnard@bci1.demon.co.uk
A Markush structure is an extremely compact way of representing a large virtual combinatorial library, in which common parts of the individual product molecules are shown only once. Using extended versions of algorithms originally developed for storage and retrieval systems for Markush structures from chemical patents, we have written software to generate structural fingerprints for the molecules in a library, by direct analysis of a Markush representation. This can speed up the analysis process by orders of magnitude, as compared with approaches based on enumeration of the individual molecules, and the program can be linked to routines for fast clustering of library members, and calculation of numerical diversity measures.
The principles behind the algorithms used
will be described, and results obtained using the software for analysis of
libraries will be presented. Issues concerning the optimisation of Markush
representations for this type of analysis will be discussed (especially where
"non-regular" libraries and variable scaffolds are involved), and current work
on building such representations from input based on sequences of reactions and
precursor molecules described. Opportunities for use of these techniques for
rapid generation of additional descriptor types will also be mentioned.
Combinatorial Chemical Information (Cosponsored with COMP) 2
Cosponsored with Division of Computers In Chemistry
Convention Center, Room 220
R. Snyder, Organizer, Presiding
1:30-Introductory Remarks
1:40-6. Penalty-biased diversity. Design of diverse, drug-like libraries.
Moises Hassan and Marvin Waldman. Molecular Simulations Inc., 9685 Scranton Road, San
Diego, CA 92121. Email: moises@msi.com
Diverse libraries in which molecules are
restrained to exhibit properties similar to those of known drugs are expected to
yield a higher percentage of active compounds in lead discovery programs, and those
actives should prove more suitable as viable drug candidates. Diverse, drug-like libraries
are designed by optimizing R-group fragments to simultaneously maximize the
molecular diversity and minimize a penalty function based on the specified
properties of the products. Two types of penalties are implemented. The first
uses property ranges, penalizing molecules when their calculated descriptors are
outside desired ranges. The second is based on a property distribution (profile)
of the library, penalizing a library when the profile for a given property
differs from the desired one. Several applications of this approach to library
design are presented, including biasing libraries to satisfy Lipinski-like
rules, focusing libraries to exhibit properties found in molecules with a
specific biological activity, designing libraries that exhibit a desired
property profile, such as a uniform molecular weight distribution to facilitate
identification by mass spectroscopy, and combinations of these
approaches.
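The first penalty type, based on property ranges, can be sketched as below. The abstract does not give the functional form, so the linear out-of-range penalty, the `weight` parameter, and the names `range_penalty` and `library_score` are assumptions. In practice, properties on different scales would need normalizing before their penalties are summed.

```python
def range_penalty(properties, ranges):
    """Sum, over all properties, of how far each value lies outside its range.

    properties -- dict of calculated descriptors for one molecule
    ranges     -- dict mapping a property name to its (low, high) desired range
    """
    penalty = 0.0
    for name, (low, high) in ranges.items():
        value = properties[name]
        if value < low:
            penalty += low - value
        elif value > high:
            penalty += value - high
    return penalty


def library_score(library, diversity, ranges, weight=1.0):
    """Diversity of the library minus the weighted penalty of its members.

    diversity -- any callable returning a diversity value for the library
    """
    total_penalty = sum(range_penalty(m, ranges) for m in library)
    return diversity(library) - weight * total_penalty
```

An optimizer would then vary the R-group selection to maximize `library_score`, trading diversity against violations of, for example, Lipinski-like limits.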
2:20-7. Lead-Hopping and Library-Hopping by Topomer Shape Similarity Searching of Vast Virtual Libraries.
Katherine Andrews-Cramer and Richard D. Cramer. Tripos Inc., 1699 South Hanley Road,
St. Louis, MO 63144. Email: kcramer@tripos.com
Using the ChemSpace™ technology, seven libraries containing 3.8 x 10^12 virtual molecules were searched, using query structures that were chosen from each of 34 articles published in the Journal of Medicinal Chemistry in 1998, in order to represent a diverse set of lead structures active toward different known targets.
The results of the searches will be
considered from several perspectives: 1) How often are similar structures
identified? 2) Are hits which are both novel and intuitively convincing
obtained? 3) How shape similar are the hits to the query structure, when an
alternative shape assessment tool is used? 4) What can be said about the
potential biological activity of the hits, based on those found in the
literature? 5) From the hitlists, can libraries be designed for lead follow-up
synthesis which are amenable to high-throughput synthesis and combinatorial
chemistry?
3:00-8. Creating maximal diversity in a HTS screening library: a statistical approach.
Jan T. Pedersen, Anne Marie Munk Joergensen, and Peter Faester Nielsen. Acadia
Pharmaceuticals, Fabriksparken 58, Glostrup, DK-2600, Denmark. Email: jan@acadia-pharm.com
The Acadia in-house HTS screening library currently contains ~120,000 compounds. We have attempted to build a library with maximal diversity and increased information content for receptor screening and profiling. The library contains both a diverse set of compounds from the "known" chemical space together with a large set of common drugs.
The diversity measures that we use are based solely on structure (2D) and physical chemical properties. A fast graph-theoretical comparison algorithm is used to evaluate structural similarities and structural properties. Distributions of these properties and the correlation between different distributions are used to compare and evaluate compound collections that are potentially included in the screening library. We have used a conditional probability formalism, where a "random library" is the common reference state for comparison of libraries and evaluation of their diversity.
We have evaluated this library in a large
number of GPCR screenings and analyzed the data using phylogenetic clustering of
the screening hits. The phylogenetic clustering uses the formalism of
phylogenetic comparison from sequence analysis. The basis of the phylogenetic
tree is in this case not sequence similarity but similarity of the chemical
graphs. This appears to be a simple and efficient way to evaluate large numbers
of screening hits and an efficient way to identify unique HTS hits. The basis of
the phylogenetic clustering will be outlined and demonstrated on a recent
dataset. It will also be demonstrated how this evaluation method can be used to
automatically classify clusters of HTS hit structures according to known
drugs.
3:40-9. A simple method to simultaneously increase diversity and favorably enrich the content of chemical libraries.
Ryan T. Koehler, Steve L. Dixon, and Hugo O. Villar. Computational Chemistry Laboratory,
Telik Inc., 750 Gateway Blvd, South San Francisco, CA 94080. Email: koehler@telik.com
To streamline pharmaceutical discovery,
chemical libraries employed for routine screening should be both diverse and
enriched with "drug-like" compounds. We describe a simple new algorithm for
simultaneously addressing both objectives, providing a means of strategic
compound selection to expand screening libraries. The algorithm exploits
differences in descriptor distributions associated with different chemical
libraries to identify those additional compounds that are most different from
compounds currently comprising a screening library and most similar to compounds
comprising a library to be emulated. Tests with publicly available compound
databases (ACD, CMC, NCI) demonstrate method behavior and effectiveness. Results
of spiking experiments, in which "drug-like" CMC compounds are spiked into sets
of ACD compounds then ranked for selection, are presented. The algorithm
performs substantially better than random. Our algorithm is general in
principle, operating with any set of descriptors, similarity measure, and
specification of reference libraries.
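One way to exploit differences in descriptor distributions is a binned frequency ratio, sketched below for a single descriptor. The abstract does not specify the authors' algorithm, so this is only an assumed stand-in: the log-ratio score, the pseudocount smoothing, and the names `bin_of` and `rank_candidates` are hypothetical.

```python
import math
from collections import Counter


def bin_of(value, width):
    """Index of the histogram bin that contains a descriptor value."""
    return int(value // width)


def rank_candidates(candidates, current, reference, width=1.0, pseudo=1.0):
    """Rank candidates: common in the reference library, rare in the current one.

    candidates -- dict mapping a compound name to its descriptor value
    current    -- descriptor values of the existing screening library
    reference  -- descriptor values of the library to be emulated
    """
    cur = Counter(bin_of(v, width) for v in current)
    ref = Counter(bin_of(v, width) for v in reference)
    bins = set(cur) | set(ref) | {bin_of(v, width) for v in candidates.values()}
    n_bins = len(bins)

    def freq(counter, total, b):
        # Smoothed relative frequency, so empty bins never give a zero.
        return (counter[b] + pseudo) / (total + pseudo * n_bins)

    def score(value):
        b = bin_of(value, width)
        return math.log(freq(ref, len(reference), b) / freq(cur, len(current), b))

    return sorted(candidates, key=lambda name: score(candidates[name]), reverse=True)
```

A candidate in a region crowded with reference compounds but empty in the current library ranks first. One in a region the current library already covers ranks last.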
4:20-10. Integrated Informatics for Library Design and Analysis.
Tim Mitchell, Cambridge Combinatorial Ltd., The Merrifield Centre, Rosemary Lane,
Cambridge CB1 3LQ United Kingdom. Email: tim.mitchell@cam-com.com
The Atlas Informatics system developed by
Cambridge Combinatorial is a set of integrated tools to support library design,
control of automation for synthesis, analysis and purification, registration and
reporting. Most of these processes are designed to be performed by chemists at
their desktops. In the cases where specialised skills are required, data and
information exchange is designed to be seamless. The library design process
involves template and precursor selection, virtual library enumeration,
registration and profiling. Precursor selection is critically dependent on the
amount of information available about the target biological receptor. Diversity
assessment can be used in both precursor and product selection, but it is
usually far more productive to profile the library in terms of descriptors of
physico-chemical properties (e.g. LogP, hydrogen bonding, potential toxicity,
solubility). If the library is being designed around a pharmacophore, then the
pharmacophore content of the virtual library also needs to be confirmed. Most
importantly, the computational design of a library has to be compatible with the
practical considerations of synthesis and analysis - fully enumerated libraries,
96-well format, etc. The Atlas Informatics system provides tools for the
profiling of a virtual library by a wide range of descriptors. Profiling of the
virtual library products allows for the rapid identification of desirable and
undesirable monomers and the rapid optimisation of focussed library
design.
Integration of Primary and Secondary Literature on the WWW
Convention Center, Room 220
C. Huber, Organizer, Presiding
8:25-Introductory Remarks
8:30-11. The new chemical information environment.
Harry F Boyle, Product Marketing, ACS Chemical Abstracts Division, P.O. Box 3012, Columbus, OH 43210
and Susan A Barclay, New Product Development, ACS Publications Division, 1155
16th Street, NW, Washington, DC 20036. Email: hboyle@cas.org
The evolution of indexes, computer
services, and web technology has made it progressively easier for scientists to
browse published information broadly or pinpoint specific items of interest.
Traditional print publishers and patent offices are offering electronic versions
of their documents and traditional information providers are acting as document
aggregators. A working alliance of the ACS Publications and CAS divisions, other
scientific journal publishers, patent offices, and the STN partner organizations
is now opening a path to the next level of literature exploration and
acquisition. A new environment is taking shape, largely unconstrained by the
traditional boundaries that separate one publisher from another, primary sources
from secondary ones, and in-house holdings from external sources. This paper
will discuss the emergent chemical information environment of the future.
9:00-12. Linking between content providers: the ISI experience.
Chris Leonard, New Product Development, Institute for Scientific Information, 3501
Market Street, Philadelphia, PA 19104. Email: cleonard@isinet.com
A researcher in the digital environment expects that relevant information regardless of its location or structure will be brought to the desktop. In response to this expectation, content providers have created partnerships that facilitate links between various forms of information from different organizations. Still in the initial phases, these alliances are the basis of integrated content in digital libraries.
ISI Links is one of these partnership
initiatives. With the Web of Science® forming the basis for effective retrieval
and navigation, ISI is working with publishers to build links to a variety of
content including full text, chemical structures and patent information.
Consequently, ISI has faced some of the significant issues that all content
providers face in building the linked environment. What elements affect the
success of linking content? Which partnerships will add value to proprietary
content? What impact do format considerations, data transfer processes, and
administrative issues have on linking?
9:30-13. Creating and maintaining dynamic links between database citations and their corresponding fulltext files.
Margery Tibbetts, California Digital Library, 1111 Franklin Street, Oakland, CA 94607-5200.
Email: margery.tibbetts@ucop.edu
This presentation will discuss how links
are being created and maintained between articles in the California Digital
Library (CDL) hosted databases and their corresponding full text files. The CDL
maintains some full text files of its own but the focus of this presentation
will be on linking to full text files maintained at the publisher's site. Issues
to be covered in detail include how the CDL linking system is designed, the
various linking algorithms used (SICI, DOI, etc.), access issues, and some of
the problems we have encountered while developing the system. The early
experience of the CDL with article images and the general architecture of the
CDL system will be covered briefly.
10:00-14. LitLink: dynamic linking of the primary and secondary literature.
Steven Young, MDL Information Systems, Inc., 14600 Catalina Street, San Leandro, CA
94577. Email: stevey@mdli.com
Dynamic linking of the primary and
secondary literature offers many advantages over static linking. Static linking
normally requires that all the necessary linking information be stored and
maintained in a centralized database. The databases used for static linking are
typically limited to selected sources of the primary and secondary literature.
Dynamic linking offers the advantage of interlinking any primary or secondary
literature source. LitLink, an example of a dynamic linking system, takes a
citation as input, automatically generates a query from it, and submits that query to the
appropriate literature sources. Approaches and applications utilizing LitLink as
an electronic article broker will be presented.
10:30-15. Integrating primary and secondary literature - patents versus journals.
Breda F. Corish, Product Development, Derwent Information, Holbrook House, 14 Great
Queen Street, London, WC2B 5DF, United Kingdom and Jeff Clovis, New &
Corporate Products, ISI (Institute for Scientific Information), 3501 Market
Street, Philadelphia, PA 19104. Email: jclovis@isinet.com
Derwent Information and ISI specialise in
the creation of value-added secondary databases focusing on patents and
journals, respectively. Both companies see the same demand for access via
standard Web browser technology to their value-added secondary databases with
seamless linking to the corresponding primary level data. In meeting this need,
a common problem lies in the fact that not all primary data sources are
available in a suitable electronic format. From a commercial perspective,
providing access to primary patent documents is relatively simple as this
material is already in the public domain. For journals, this is complicated by
the need to have separate business agreements with each of the primary
publishers who hold journal copyright. These topics will be explored with
reference to: ISI's "Web of Science" (WOS); links from WOS to journal full text;
links from WOS to "Derwent Innovations Index" (DII); development plans for
linking DII to patent fulltext.
Email: bcorish@derwent.co.uk
11:00-16. Authors' e-mail address and URL be added to Chemical Abstracts.
Shu-Kun Lin, lin@mdpi.org, http://mdpi.org/lin/, MDPI, Molecular Diversity Preservation International, Sangergasse 25, Basel CH-4054 Switzerland. Email: lin@mdpi.org
It is suggested that CAS add authors'
e-mail addresses, if available in the original publications, to the Chemical
Abstracts entries. Authors' URLs or website addresses can also be included. These
may be treated as an important part of a full address. MDPI's journals Molecules
(http://mdpi.org/molecules) and Entropy
(http://mdpi.org/entropy) publish
authors' e-mail addresses, URLs, telephone and fax numbers, in addition to their
full surface mail addresses. An e-mail address is normally concise and particularly
useful, and should be included in abstracts. Including e-mail addresses would make it
much more convenient for readers to request reprints, contact the authors,
or open discussions. An old e-mail address may remain usable even after the
author moves to a new place. E-mail is very fast, and it is the least expensive way of
communication. Here, I have successfully put my e-mail lin@mdpi.org and URL http://www.mdpi.org/lin/ in the author's
address of this abstract and hope the moderators do not delete them. Some other
arguments and a summary of the discussions on the CHMINF-L mailing list
(CHMINF-L@LISTSERV.INDIANA.EDU, http://listserv.indiana.edu/archives/chminf-l.html)
during February 1999 will be presented.
Combinatorial Chemical Information (Cosponsored with COMP) 3
Cosponsored with Division of Computers In Chemistry
Convention Center, Room 217
T. Wright, Presiding
R. Snyder, Organizer
8:30-Introductory Remarks
8:40-17. Novel methods for assessing and comparing the diversities of chemical libraries.
Robert S. Pearlman1, Xiao C. Wang2, Ying
Su2, and Michael Green2. (1) Laboratory for Molecular
Graphics and Theoretical Modeling, University of Texas, College of Pharmacy,
Austin, TX 78712, (2) Trega Biosciences, Inc., 9880 Campus Point Drive, San
Diego, CA 92121. Email: pearlman@vax.phr.utexas.edu
Standard methods for assessing diversity
and comparing libraries are based on nearest-neighbor statistics computed using
Tanimoto "distances" between molecular fingerprints. However, such
distance-based methods yield relatively crude information. We will present
several cell-based methods which offer substantial advantages. We will introduce
the concepts of "library fingerprints" and "library vectors." We will indicate
how the distributions of compounds in two libraries can be compared using the
well-known Carbo and Hodgkin indices computed from library vectors and we will
also indicate how library fingerprints can be compared using the Tanimoto index
and novel binary forms of the Carbo and Hodgkin indices. Finally, we will
introduce the concept of "fraction overlapped" as an ideal and intuitive
approach for library comparisons.
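The three indices named in this abstract have standard definitions, sketched below. The example assumes that a "library vector" holds the number of compounds in each cell and that a "library fingerprint" records which cells are occupied. That is one reading of the abstract, not a confirmed definition. The novel binary forms of the Carbo and Hodgkin indices and the "fraction overlapped" measure are not defined in the abstract and are not attempted here.

```python
import math


def carbo_index(a, b):
    """Carbo index: cosine-like overlap of two equal-length count vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / math.sqrt(sum(x * x for x in a) * sum(y * y for y in b))


def hodgkin_index(a, b):
    """Hodgkin index: like Carbo, but also sensitive to vector magnitude."""
    dot = sum(x * y for x, y in zip(a, b))
    return 2 * dot / (sum(x * x for x in a) + sum(y * y for y in b))


def tanimoto_index(cells_a, cells_b):
    """Tanimoto index of two fingerprints given as sets of occupied cells."""
    a, b = set(cells_a), set(cells_b)
    return len(a & b) / len(a | b)


# Both vector indices are undefined for all-zero vectors (division by zero),
# and the Tanimoto index is undefined when both fingerprints are empty.
```

For vectors `[2, 0, 1]` and `[1, 1, 0]`, the Carbo index is 2/sqrt(10) and the Hodgkin index is 4/7. The corresponding fingerprints `{0, 2}` and `{0, 1}` share one of three occupied cells, giving a Tanimoto index of 1/3.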
9:20-18. Combinatorial library design and diversity analysis.
Xiao Chuan Wang1, Ying Su1, and Mike Green2. (1) Computational Chemistry,
Trega Biosciences Inc., 9880 Campus Point Dr., San Diego, CA 92121, (2)
Chemistry, Trega Biosciences Inc. Email: xwang@trega.com
Combinatorial chemistry is speeding up the
process of drug discovery. How can we design a drug-like combinatorial library
with good diversity? How should we compare two libraries to avoid redundancy in
library production and to increase the potential of finding active compounds
from Trega libraries? In order to answer these questions we have designed and
developed a strategy called Quasi-Virtual-Library-Cherry-Picking (QVLCP) to
assist Trega chemists in library production and to ensure the intra-library
diversity. We have collaborated with Prof. Robert Pearlman and applied the new
measure, percentage of overlapped cells (POC) within All Drug Space in our
inter-library diversity analysis to compare how different one library is from
others and to design new libraries. Finally, we have developed the Trega Diverse
Bundle strategy to generate subsets of each library that are representative of
the full diversity present in the library.
10:00-19. Integrated structural, synthetic, and analytical combichem informatics.
David Chapman, Afferent Systems, Inc., 1550 Bryant Street, Suite 760, San Francisco,
CA 94103. Email: chapman@afferent.com
Combinatorial synthesis produces a deluge
of data, including compound structures, synthetic protocols, sample information
such as vessel locations and synthetic history, and analytical information
(spectra and chromatograms). I will describe a chemistry knowledge base system
that integrates, organizes, and makes sense of these divergent data types, and
interfaces with both synthetic and analytical instruments.
10:40-20. SLIMS - A web-based solution for sample, structure and spectral management.
Antony John Williams, Val Kulkov, and Alexey Karezin. Advanced Chemistry Development,
133 Richmond Street West, Suite 605, Toronto, ON M5H 2L3, Canada. Email: tony@acdlabs.com
Laboratory Information Management Systems (LIMS)
are essential to allow corporate-wide access to analytical information. A number
of efforts have been made over the years to implement flexible LIMS but, in
general, these have failed to address the flexibility of interface and features
required in analytical and R&D environments that require access to molecular
structures and graphics intensive spectral displays. We have developed a
web-based system for managing sample, spectral and associated molecular
structure information, SLIMS. This user-friendly system links a unique sample
identifier to sample information, a chemical structure, associated spectra and
final reports of analysis. This full-featured sample manager allows desktop
access to sample information as well as access to a structure database for
accessing historical reference data. The system has been configured to allow
full integration with desktop helper applications including standard desktop
structure drawing packages and spectral display packages. We will report on our
continued advances in this area.
11:20-21. Combinatorial Chemistry: integration with the research environment.
Maurizio Bronzetti, MDL Information Systems, 14600 Catalina Street,
San Leandro, CA 94577. Email: mauriziob@mdli.com
The adoption of high speed technologies in
Genomics, Chemistry and Biology has pushed research organizations to explore new
ways of capturing data, organizing results and samples, avoiding duplication of
effort and emphasizing data rationalization. Personal and team productivity,
together with economics and patent regulation, are often the criteria that drive
these changes. Combinatorial Chemistry especially has challenged data management
by causing proliferation of chemistry, samples and analytical data: chemistry
(classical and combinatorial) should be captured correctly and consistently if
data are to be searched and mined later. Moreover, the relationship between
reactants, samples, products, batches, side products and protocols, should be
carried along with the synthesis experiment through purification and analysis.
This presentation will introduce a new scalable system designed to manage
compound libraries and classical synthetic experiments in the context of
multiple project teams.
Alternative Careers in Chemistry (Cosponsored with YCC)
Convention Center, Room 220
Cosponsored with Younger Chemists Committee
A. Twiss-Brooks, Organizer, Presiding
1:00-Introductory Remarks
1:05-22. Employment and marketability: ACS Career Services and you.
Jean A. Parr, Department of Career Services, American Chemical Society, 1155 16th
Street, NW, Washington, DC 20036. Email: j_parr@acs.org
No one can accurately predict tomorrow's economy, but recent data about careers in chemistry tell us that: the market will remain tight; chemists will make more frequent job changes; chemists will apply their knowledge and skills to a wider range of professions and industries.
This presentation will discuss these trends
and offer recommendations for staying marketable. ACS career services, designed
to help members address these issues, will be outlined with special emphasis on
the new online job service available to members.
1:35-23. Strategic partnering for knowledge management.
Suzanne P. Cristina, UTC Information Network - Hamilton Standard,
United Technologies, One Hamilton Road, Windsor Locks, CT 06096. Email: cristsp@hsd.utc.com
Chemists and chemical engineers will realize
significant time and cost savings partnering with information professionals
(librarians) to facilitate knowledge management within their organizations.
Information professionals can leverage expertise in information organization and
use their understanding of database content and structure to identify the
information needs of their organizations and make specific recommendations for
internal and external databases to be shared over the organization's Intranet.
As an integrated partner on project/research teams, the information professional
is able to contribute proactively, not reactively, anticipating and analyzing
information specific to a research project. The roles of the Research Analyst,
Information Manager and Knowledge Analyst will be described in detail as well as
the strategic significance of the MLS (Master of Library Science) combined with
a technical/chemistry degree. Several Intranet projects will also be discussed.
2:05-24. The study of applied organic chemistry in graduate school and at a remote university.
Forrest S. Schultz, Chemistry Department,
University of Wisconsin-Stout, Jarvis Hall, SW303D, Menomonie, WI 54751. Email:
schultzf@uwstout.edu
This presentation explores an alternative
pathway for the graduate study of organic chemistry. In particular, the study of
the interface between organic chemistry, materials chemistry, and chemical
engineering will be presented. Career opportunities and possibilities will be
presented. The presentation will also discuss the importance of online
information when different fields of study are brought together. The need
for online information on the part of an applied chemist at a remote university will be
explored.
2:35-25. Managing dynamic chemical information environments in industry.
Keith P. Schreiber, Business Information Services, Procter &
Gamble, Ivorydale Technical Center, 5299 Spring Grove Ave., Cincinnati, OH
45217. Email: schreiber.kp@pg.com
Combinatorial chemistry, proliferation of
chemical publications, accelerated product development cycles - industrial
R&D is increasingly dependent on effective use of information. Chemical
information professionals draw upon expertise in chemistry and in information
tools and environments to maximize this effective information use. The result:
personal involvement across a vast array of projects, an opportunity to work
with a tremendous variety of individuals, and participation in one of the most
dynamic fields around at a pivotal moment of the information age.
3:05-26.
Look! Up in the sky! It's a chemist! It's a librarian! It's both!
F. Bartow Culp, Mellon Library of Chemistry, Purdue University,
West Lafayette, IN 47907-1538. Email: mailto:bculp@purdue.edu
In the Internet age, isn't the concept of a
librarian outmoded? If easy and almost unlimited information access is available
to anyone at the click of a mouse button, why should a chemist consider
librarianship as a career? There are lots of reasons, including excellent job
prospects, a high degree of career satisfaction, plus the chance to be a central
player in the current redefinition of how science is done. The fundamental
skills of a librarian have always been the ability to organize knowledge and
make it available to others. And for most of the history of the science, those
same skills were an integral part of a chemist's profession. In this age of
high-entropy information, the felicitous combination of abilities that
chemist/librarians bring to their jobs does not simply have the power to
organize and access chemical information; it can also enhance the value of that
information and improve the entire communication process itself. We will present
examples of how chemist/librarians are integral participants in the advancement
of both of their professions.
3:35-27. From
laboratory to law office: a career as a patent attorney.
Anita Varmas, Foley, Hoag & Eliot LLP, One Post Office Square, Boston, MA 02109.
Email: mailto:PXC@FHE.COM
The field of patent law provides
opportunities to chemists seeking a career outside of the laboratory that allows
them to utilize and apply their scientific knowledge. Opportunities exist to
practice in the Patent Office, in companies and in law firms. Many of the skills
that scientists use in their scientific endeavors translate well to the practice
of law, including organizational and analytical skills and problem-solving
ability. This presentation will address the various types of opportunities
available in this exciting field, and some tips for determining whether it may
be right for you.
4:05-Division Business Meeting
4:20-Intermission
4:30-Open Meeting: Committees on Publications and on Chemical Abstracts Service
Combinatorial Chemical Information (Cosponsored with COMP) 4
Cosponsored with Division of Computers In Chemistry
Convention Center, Room 217
R. Delmendo, Presiding
R. Snyder, Organizer
1:30-Introductory Remarks
1:40-28. Automated
laboratories for high-density microplate screening: Merging novel and
traditional technologies.
Franz E.
Leichtfried, Robocon GmbH,
Davidgasse 85 - 89, Vienna A-1100 Austria. Email: mailto:f_leichtfried@robocon.co.at
Over the last three years, a trend toward assay miniaturization, driven by the desire for faster identification of drug leads at lower cost per test, has gained considerable momentum. High-throughput screening in 384-well microplates, which was unheard of just a few years ago, has now become routine in automated screening laboratories generating up to 100,000 data points per day. Such laboratories use novel readers and other workstations, which have adapted standard technology to the 384-well plate format.
As screeners move to even higher microplate well densities and novel "screening chip" technologies, nanoliter pipetting devices, imaging readers and other novel devices will have to be merged with the instruments traditionally used for automated microplate processing. In some cases, traditional and novel methods can both be applied to reach the same goal.
In this paper traditional and new
technology building blocks are investigated as to how they can be fitted into
automated laboratories reaching data outputs of a quarter million data points
per 24 hours and beyond.
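The throughput figures quoted above can be turned into a rough plate budget. The following is a minimal sketch, assuming one data point per well, no control or blank wells, and continuous 24-hour operation; the function names are illustrative and do not come from the paper.

```python
import math


def plates_per_day(data_points_per_day: int, wells_per_plate: int = 384) -> int:
    """Plates needed per day, assuming every well yields exactly one data point."""
    return math.ceil(data_points_per_day / wells_per_plate)


def seconds_per_plate(data_points_per_day: int, wells_per_plate: int = 384) -> float:
    """Average time budget per plate over a 24-hour day of continuous operation."""
    return 24 * 60 * 60 / plates_per_day(data_points_per_day, wells_per_plate)


if __name__ == "__main__":
    # Throughput targets taken from the abstract; 1536 wells is an assumed
    # example of a "higher well density" format.
    for points, wells in [(100_000, 384), (250_000, 384), (250_000, 1536)]:
        print(points, wells, plates_per_day(points, wells),
              round(seconds_per_plate(points, wells), 1))
```

Under these assumptions, a quarter million data points per day in 384-well plates leaves a little over two minutes per plate, which is one way to see why higher well densities become attractive.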
2:20-29. Application of
Version Spaces to the Analysis of High-Volume Structure-Activity
Relationships.
George S.
Cowan, Jr. and C. John Blankley,
Sr. Parke-Davis Pharmaceutical Research, 2800 Plymouth Road, Ann Arbor, MI
48105. Email: mailto:george.cowan@wl.com
In the new world of high volume screening
and combinatorial chemistry, new methods are needed to rapidly analyze the
results of such assays for SAR information. We describe the application of
version spaces, a machine learning methodology described originally by Mitchell,
et al. (1978, 1997) to this problem. This method organizes all possible
"concepts" that agree with a given set of data. In this case, data is taken to
be biological activity and concepts are taken to be sets of structural fragments
associated with active compounds. Concept models are expressed as
fingerprint-like bit strings with three possible values for each bit: 1=required
present, 0=required absent, and #="don't care". Training is done on both active
and inactive compounds and unknowns can then be compared to the various concept
models to see if they are examples of active or inactive molecules. The
fragments responsible for the classification are also identified.
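The three-valued concept strings described above can be illustrated with a short sketch. It assumes fingerprints are strings of `0` and `1` of equal length, and it only computes the most specific concept covering a set of actives; a full version-space method also maintains the most general boundary, which is not shown. All function names are hypothetical.

```python
def matches(concept: str, fingerprint: str) -> bool:
    """True if a 0/1 fingerprint satisfies a concept over the alphabet 1, 0, #."""
    if len(concept) != len(fingerprint):
        raise ValueError("concept and fingerprint must have the same length")
    # '#' means "don't care"; any other position must agree exactly.
    return all(c == "#" or c == f for c, f in zip(concept, fingerprint))


def generalize(concept: str, active: str) -> str:
    """Minimally relax a concept so that it also covers one more active compound."""
    return "".join(c if c == f else "#" for c, f in zip(concept, active))


def most_specific_concept(actives):
    """Fold all active fingerprints into the most specific covering concept."""
    iterator = iter(actives)
    concept = next(iterator)
    for fingerprint in iterator:
        concept = generalize(concept, fingerprint)
    return concept


def consistent(concept: str, actives, inactives) -> bool:
    """A concept agrees with the data if it covers every active and no inactive."""
    return (all(matches(concept, a) for a in actives)
            and not any(matches(concept, i) for i in inactives))


if __name__ == "__main__":
    actives = ["1101", "1001", "1011"]
    concept = most_specific_concept(actives)
    print(concept, consistent(concept, actives, ["0101"]))
```

Positions that remain `1` or `0` in the learned concept correspond to fragments that are required present or required absent, which is how the responsible fragments can be read off.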
3:00-30. High
throughput screening software tools for analytical spectroscopy.
Antony John
Williams, Advanced Chemistry
Development, 133 Richmond Street West, Suite 605, Toronto, ON M5H 2L3 Canada.
Email: mailto:tony@acdlabs.com
High-throughput mass spectrometry and
tubeless NMR have become the techniques of choice for the
analysis of combinatorial libraries. Coupling automation with flow NMR and MS
technology now allows spectra to be acquired from a combinatorial plate in only
a few hours. This routine acquisition of large amounts of data can indeed
increase the throughput of such analyses, but it can also produce
an inordinate amount of data with no convenient way to track the
information. Since the chemist can often offer suggestions for the
structures expected for each vial on the plate it would be appropriate to
attempt to relate the experimental spectra to those predicted for the structure.
The development of software to allow the databasing of MS and NMR spectral
curves associated with molecular structures, and the application of NMR
prediction algorithms to allow comparison of experimental and predicted spectra
will be discussed.
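One simple way to relate an experimental spectrum to a predicted one is to count how many predicted peak positions have an experimental peak nearby. The sketch below is only an illustration of that idea, not the comparison method used in the software described; the tolerance value and the function name are assumptions.

```python
def match_score(predicted, experimental, tolerance=0.05):
    """Fraction of predicted peak positions with an experimental peak within tolerance."""
    if not predicted:
        return 0.0
    hits = sum(
        1 for p in predicted
        if any(abs(p - e) <= tolerance for e in experimental)
    )
    return hits / len(predicted)


if __name__ == "__main__":
    # Hypothetical chemical shifts in ppm for one vial on a plate.
    predicted_shifts = [7.26, 3.70, 1.25]
    observed_shifts = [7.25, 3.71, 2.10]
    print(match_score(predicted_shifts, observed_shifts))
```

A low score for a vial would flag it for manual inspection rather than prove that the expected structure is absent.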
3:40-31. Managing
Combinatorial Data in Excel.
Harold
Helson and Michael Swartz.
CambridgeSoft Corporation, 100 Cambridge Park Drive, Cambridge, MA 02140. Email: mailto:mswartz@camsoft.com
Most combinatorial chemists manage their
experiments with spreadsheet programs such as Microsoft Excel. In most cases,
these chemists must manage their chemistry tasks such as the enumeration of
product molecules outside Excel and then import this data into Excel.
CambridgeSoft has developed solutions that provide for combichem data management
directly inside Excel. This makes it possible for users to integrate
combichem-specific chemistry intelligence directly inside the spreadsheet experiment
managers they have already developed. This presentation will focus on a sample
application that shows users how they can manage their combichem data directly
inside an Excel application which in turn integrates with ChemDraw, the drawing
program preferred by most chemists.
4:20-32. Intelligent
data visualization for large sets of chemical structures and property
data.
Glenn J.
Myatt, Paul E. Blower, Jr., Kevin
P. Cross, and Wayne P. Johnson. Research and Development, Columbus Molecular
Software, Inc., Business Technology Center, 1275 Kinnear Rd., Columbus, OH
43212. Email: mailto:gmyatt@columbus.rr.com
Combinatorial chemistry and high-throughput
screening have dramatically increased the number of compounds that are made
and tested for biological activity, and the speed at which this is done. We will present a chemically
intelligent data visualization computer program designed to process and
intelligently categorize the large volumes of chemical and biological screening
data being generated. The program organizes sets of structures using a taxonomy
of approximately 20,000 familiar drug-like features, such as heterocycles and
topological pharmacophores. The sets of structures are graphically presented,
for example, using histogram bars that represent the number of structures in the
set. Sets containing a statistically high number of active structures are
highlighted, suggesting that the common structural feature is highly correlated with
biological activity. This presentation will introduce the program and
demonstrate its application to drug discovery, combinatorial library design and
diversity analysis.
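The abstract does not say which statistic decides that a set contains a "statistically high" number of actives. One common choice, used here purely as an assumption, is the upper tail of the hypergeometric distribution: the probability of seeing at least the observed number of actives if the set had been drawn at random from the whole collection.

```python
from math import comb


def enrichment_p_value(total: int, total_active: int,
                       set_size: int, set_active: int) -> float:
    """P(X >= set_active) when set_size compounds are drawn at random
    from `total` compounds, of which `total_active` are active."""
    upper = min(set_size, total_active)
    denominator = comb(total, set_size)
    numerator = sum(
        comb(total_active, k) * comb(total - total_active, set_size - k)
        for k in range(set_active, upper + 1)
    )
    return numerator / denominator


if __name__ == "__main__":
    # Hypothetical numbers: 10,000 screened, 200 active overall,
    # and a structural class of 50 compounds containing 8 actives.
    print(enrichment_p_value(10_000, 200, 50, 8))
```

A small p-value would justify highlighting the set; the threshold, and any correction for testing many sets at once, are left open here.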
![]()
Sci-Mix Poster Session
Convention Center, La Nouvelle Ballroom B/C
A. H. Berks, Organizer, Presiding
7:00-9:00
33.- A water-quality
information system for the Lower Mississippi River.
Boumediene Belkhouche1, James E.
Bollinger2, and William J. George2. (1) Computer Sciences
Department, Tulane University, New Orleans, LA 70118, (2) Division of
Toxicology/Pharmacology Department, Tulane University. Email: mailto:bb@mailhost.tcs.tulane.edu
A major issue in monitoring and managing ecosystems is the
lack of an integrated model. Consequently, we developed a water quality
information system for the Lower Mississippi River that provides a uniform
conceptual model of the ecosystem, integrates large amounts of heterogeneous
data collected by various sources, and facilitates the analysis and
interpretation of existing ambient water-quality data. We conceptualize a river
as an object-oriented model consisting of classes and relationships among
them. The automated analysis process supports exploratory questions about the
availability of data and their geographic distribution, the concentration levels
and distribution of parameters, river hydrology, and the relationships among the
individual variables. In addition to these design features, a strict quality
control protocol has been implemented to document the flow of data beginning at
the point at which data are obtained from their source, through a comprehensive
validation process, until their upload into the database system.
34.- Generation of VRML
for use in 3D chemical structure display on the Internet.
Min He and Jiaju Zhou. Laboratory of Computer Chemistry, Institute of Chemical
Metallurgy, Chinese Academy of Sciences, Beijing, 100080, China. Email: mailto:mhe@ns.icm.ac.cn
The Internet has been growing at an
exponential clip, and chemistry benefits from the development of the Internet.
On the one hand, HTML plays an important role in the rapid expansion of web
technology on both the Internet and the Intranet. On the other hand, HTML is
limited to a two-dimensional (2D) world. In this work, a program, VRMLMaker, has
been developed for three-dimensional (3D) chemical structure display on the
Internet in our Chinese Drug Database Searching System (CDDBSS). VRMLMaker can
convert a molecular MOL2 or ML2 format file to VRML format files in four
different styles, including wireframe, capped sticks, ball-and-stick, and a CPK
space-filling model. Because a VRML file is plain (standard ASCII) text,
unlike graphic files such as GIF or JPEG, the 3D chemical
structure file generated by VRMLMaker can be transferred in a compressed format and
uncompressed automatically by a viewer. This reduces the time and cost of
transmission over the Internet. The images generated by VRMLMaker are "live", in
that they can be magnified and rotated. The program is well suited to 3D
display of structures from chemical databases on the Internet. The VRML molecular
models generated by VRMLMaker are used in our CDDBSS.
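To make the idea of a plain-text 3D scene concrete, the sketch below writes one sphere per atom using VRML 2.0 (VRML97) node syntax and then gzip-compresses the text. It is not VRMLMaker: the abstract does not state which VRML version the program emits, bonds and MOL2 parsing are omitted, and the colours, radius, and function names are assumptions made for illustration.

```python
import gzip

# Assumed element colours as (red, green, blue); unknown elements fall back to magenta.
ELEMENT_COLORS = {
    "C": (0.3, 0.3, 0.3),
    "H": (1.0, 1.0, 1.0),
    "O": (1.0, 0.0, 0.0),
    "N": (0.0, 0.0, 1.0),
}


def atoms_to_vrml(atoms, radius=0.4):
    """Build a VRML 2.0 scene with one Sphere per (element, x, y, z) atom."""
    lines = ["#VRML V2.0 utf8"]
    for element, x, y, z in atoms:
        r, g, b = ELEMENT_COLORS.get(element, (1.0, 0.0, 1.0))
        lines.append(
            f"Transform {{ translation {x:.3f} {y:.3f} {z:.3f} children [ "
            f"Shape {{ appearance Appearance {{ material Material {{ "
            f"diffuseColor {r} {g} {b} }} }} "
            f"geometry Sphere {{ radius {radius} }} }} ] }}"
        )
    return "\n".join(lines) + "\n"


def compress_vrml(text: str) -> bytes:
    """Gzip the scene text, the form in which VRML worlds are commonly served."""
    return gzip.compress(text.encode("ascii"))


if __name__ == "__main__":
    water = [("O", 0.0, 0.0, 0.0), ("H", 0.757, 0.586, 0.0), ("H", -0.757, 0.586, 0.0)]
    scene = atoms_to_vrml(water)
    print(scene)
    print(len(scene), len(compress_vrml(scene)))
```

Because the scene is text with highly repetitive node definitions, it tends to compress well, which is the transmission advantage the abstract points to.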
35.- Recent
advancements in the development of SENECA, a computer program for Computer
Assisted Structure Elucidation based on a stochastic algorithm
Christoph Steinbeck, Computational Chemistry Group,
Max-Planck-Institute of Chemical Ecology, Tatzendpromenade 1a, Jena 07745
Germany. Email: mailto:steinbeck@ice.mpg.de
Recent advancements in the development of
SENECA, a new program package for Computer Assisted Structure Elucidation
(CASE), currently being developed in our group, are outlined. Seneca is an
object-oriented, platform-independent approach using the programming language
Java. It features a client program for input or import of spectral data and for
setup of the structure elucidation process, as well as a structure elucidation
server that is distributed over a network of multiple machines of commodity
type. Results are presented that demonstrate the promising performance of the
stochastic algorithm implemented in SENECA. This algorithm optimizes a
multi-parametric target function towards maximum similarity between the real and
the back-calculated set of spectra.
36.- Implementation of
Chinese drug database searching system.
Min He and Jiaju Zhou. Laboratory of Computer
Chemistry, Institute of Chemical Metallurgy, Chinese Academy of Sciences,
Beijing, 100080, China. Email: mailto:mhe@ns.icm.ac.cn
Chinese drugs have played an important role
in treating disease and protecting health in China since ancient times. Over
the past thousand years, their use has generated a great deal of
information, which is scattered across many categories of literature
and books. In the absence of scientific study of Chinese drugs, their mode of
action is not clear. To support scientific study of Chinese
drugs, we have developed a Chinese drug database searching system (CDDBSS). The
system runs on Windows NT, and the database management system (DBMS) is
Microsoft SQL Server 6.5. The Chinese drug database
consists of four parts: (1) the main information needed for Chinese drug
mechanistic studies, such as physical and chemical properties, pharmacology,
clinical application data, etc.; (2) chemical components; (3) molecular
structures; (4) bio-activity data. All of the information can be searched in
a mode specified by the user. 3D chemical structures are transferred as
chemical VRML files.
37.- Modular chemical
descriptor language (MCDL) and unique structure representation.
Michael N. Burnett, A. C. Buchanan, III, and Andrei A. Gakh. Chemical
and Analytical Sciences Division, Oak Ridge National Laboratory, 1 Bethel Valley
Road, P.O. Box 2008, Oak Ridge, TN 37831-6197. Email: mailto:mnb@ornl.gov
Several approaches exist for representing
molecular structures with linear descriptors, such as the IUPAC and ACS
nomenclature systems and the more computer-oriented Daylight SMILES system. All
of these require relatively complex rules to create unique descriptors. A new
simplified modular system has been developed for representing molecular
structures uniquely. Molecules are described by their structural fragments (1st
module) and the connectivity of these fragments (2nd module), and, if needed, a
module providing the stereochemistry is included. For example, the unique
descriptor of (R)-2-bromopentane, CH3CHBrCH2CH2CH3, is
CBrH;2CHH;2CHHH[2,4;3;5]/SA:1,Br,H,4,2/. The simplicity of the approach arises
from its use of simple ASCII ordering (dictionary order in English) to
prioritize structural features in place of complicated rules on the relative
priorities of functional groups. Additional information about the molecule, such
as atom coordinates and physical properties, can be included in the descriptor
as a set of supplemental non-unique modules. [This research was sponsored by the
U.S. Department of Energy Initiatives for Proliferation Prevention (IPP)
program.]
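The role of plain ASCII ordering can be shown on the first (fragment) module of the 2-bromopentane example. The sketch below assumes each fragment is already written as an atom string such as `CHHH`, sorts the fragments with ordinary string comparison, and collapses repeats with a leading count. It reproduces only the `CBrH;2CHH;2CHHH` part of the descriptor; the connectivity and stereochemistry modules, and any further MCDL rules not given in the abstract, are not attempted. The function name is hypothetical.

```python
from itertools import groupby


def fragment_module(fragments):
    """Order fragment strings by ASCII value and run-length encode repeats."""
    # Python compares str by code point, which equals ASCII order for ASCII text.
    ordered = sorted(fragments)
    parts = []
    for fragment, group in groupby(ordered):
        count = len(list(group))
        parts.append(fragment if count == 1 else f"{count}{fragment}")
    return ";".join(parts)


if __name__ == "__main__":
    # CH3-CHBr-CH2-CH2-CH3, one string per carbon-centred fragment.
    print(fragment_module(["CHHH", "CBrH", "CHH", "CHH", "CHHH"]))
```

No chemical priority table is consulted: `CBrH` sorts first simply because `B` precedes `H` in ASCII, and `CHH` precedes `CHHH` because a prefix sorts before the longer string.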
38.- Bioinformatics in
the CAS databases
Leo W.
Collins, Eva M. Hedrick, and Anish
Mohindru. Chemical Abstracts Service, 2540 Olentangy River Road, Columbus, OH
43202. Email: mailto:lcollins@cas.org
Bioinformatics is generally regarded
as the information of genomics research. Demand for bioinformatics has increased
dramatically in recent years due to the advancement of the Human Genome Project,
and other projects having the expressed objective of determining gene sequences.
Since 1907, CAS has abstracted and indexed the scientific literature, including
references and literature from genomic and other biologic sources. More than 37%
of the abstracts in the CAS Chemical Abstracts are from biochemical sources. In
addition, more than 18% of the CAS Registry File contains biosequences collected
from the journal literature, patents, and the Human Genome Project. This vast
collection of biosequences and related biological information makes the CAS
databases a valuable source of information for biotechnology research and
process development. This presentation will illustrate with examples the
comprehensive content of the biosequences, patents, and related information in
the CAS databases.
39.- ChemIDplus:
an experimental public chemical information and structure search system
Perlita M. Liwanag, Vera W. Hudson, and George F. Hazard, Jr. Division
of Specialized Information Services, National Library of Medicine, Bldg. 38A,
Rm. 3N-315A, 8600 Rockville Pike, Bethesda, MD 20894. Email: mailto:perlita_liwanag@nlm.nih.gov
ChemIDplus is a web-based
search system that provides access to structure and nomenclature authority files
used for the identification of chemical substances cited in National Library of
Medicine (NLM) databases. ChemIDplus also provides structure searching
and direct links to many biomedical resources at NLM and on the Internet for
chemicals of interest. The database contains over 349,000 chemical records, of
which some 56,000 include chemical structures. ChemIDplus is searchable
by Name/Synonym, CAS Registry Number, Molecular Formula, Classification Code,
Locator Code, and Structure. The Locator Codes are hyperlinked at the substance
level to biomedical databases at NLM and on the Internet and to the NLM
Superlist compilation of chemical substances of interest to federal and state
regulatory agencies. Ease of navigation from the system's Locator Display Page
to the other web sites and vice versa is a characteristic feature of
ChemIDplus. In addition to data queries, the system provides three types
of structure queries: Substructure Search, Similarity Search, and Exact
Structure Search. ChemIDplus facilitates structure searching by providing
two options that eliminate the need to draw structure queries, so that
novice users can take advantage of the structure searching capability of
ChemIDplus. One option, "Use Structure for Query", pastes the retrieved
structure from a previous query into the Structure Input Box, while the other
option, "Use Structure for Similarity", starts an immediate search for similar
structures. The database is maintained using ISIS(tm)/Host and Oracle® and uses
Chemscape Server(tm) to integrate the retrieval of structural data with related
textual information.
55.-Patents in Combinatorial Chemistry. See subsequent listing.
76.-Electronic laboratory notebook systems for R&D and testing labs: Status of creation and acceptance in industry. See subsequent listing.
75.-Collaborative electronic notebook systems: A technical knowledge management paradigm beyond LIMS, Groupware, and the Web. See subsequent listing.
73.-A Web-Based Engineering Chemistry Database. See subsequent listing.
81.-Information services on the intranet: where we are and where we want to go. See subsequent listing.
![]()
The Changing Chemical Information Scene: Keeping and Nurturing the Baby as
the Bathwater Rushes By
Skolnik Award Symposium, Session 1
Convention Center, Room 220
S. Kaback, Organizer, Presiding
8:45-Introductory Remarks
8:50-40. Award
Address. A 40-Year Countdown to the Millennium
Stuart M. Kaback, Information Research & Analysis Group,
Research Services Division, Exxon Research & Engineering Co., Clinton
Township, Route 22 East, Annandale, NJ 08801. Email: mailto:smkabac@erenj.com
Forty years have passed since this
chemist elected to attempt to become an information chemist, thus pursuing a
career path he had not heard of during his undergraduate and graduate education.
Much has changed during that period. Punched cards, microfilm and microfiche,
coordinated term indexes and more have come and gone. The US Patent and
Trademark Office has issued three million patents, matching its total output in
all the years that came before. Online database searching replaced prior
reliance on printed indexes and classified card files, and now seeks to redefine
itself to stand up against the juggernaut of the Internet. In the face of all
that change the traditional abstracting and indexing function is still with us,
though not without considerable reshaping. The author surveys this landscape of
change and suggests that if we are wise, we will nurture this intellectual
activity far into the future.
9:30-41.
Chemical Registries -- in the fourth Decade of Service
Robert E. Buntrock, Buntrock Associates, Inc., 670 N. Eagle St.,
Naperville, IL 60563-3024. Email: mailto:buntrock2@earthlink.net
Methods for precise yet usable description
of virtually any topic are essential, especially for identification of chemical
compounds and materials. Systematic nomenclature and precise chemical
structures, if known, are the ultimate in description of chemical compounds.
However, there has always been a pervasive need for brief, yet precise methods
to "register" chemical compounds, for use as a "hook" to both index and retrieve
additional information. By far the most predominant chemical registration system
is the CAS Registry System, begun in 1965. The history and use of this system
will be described and its importance to a number of disciplines, not just
chemistry, will be discussed.
10:00-42.
Markush Structure Searching Over the Years
Edlyn S. Simmons, Patent
Department, Hoechst Marion Roussel, Inc., 2210 E. Galbraith Rd., Cincinnati, OH
45215-6300. Email: mailto:edlyn.simmons@hmrag.com
The indexing and retrieval of Markush
structures has always been among the most problematic aspects of patent
information and the most expensive. Indexing advanced from the simple
classification systems of the 1950s to proprietary fragmentation systems, which
were followed in the 1980s by topological systems. The cost of access to the
latest indexing systems has varied widely over the years. In spite of
improvements in indexing and less restrictive access conditions, comprehensive
Markush structure searches remain the sole province of well financed
organizations.
10:30-43. A history of
cross-file and multi-file searching of online patent databases
Nancy E. Lambert, Business Products and Services, Chevron, P. O. Box
1627, Bldg. 50-1214a, Richmond, CA 94802. Email: mailto:nela@chevron.com
Patent searchers have long known that they
must search, not just one database, but all relevant databases if they need to
ensure as complete a search as possible. The challenge has always been to
eliminate duplicate references found in the various databases, and to combine as
much as possible the different indexing systems available on the different
databases. The ideal situation, as envisioned by Stuart Kaback in 1982, is
"super records" that will combine all the indexing from all patent databases. We
haven't reached this yet, but we've made progress. This talk will trace the
history of multi-file and cross-file patent searching and discuss how online
search capabilities have evolved to permit some ingenious combinations of
different databases.
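The "super record" idea, one entry per patent carrying the indexing from every database, can be sketched as a merge keyed on a normalized patent number. This is a deliberately simplified illustration: real cross-file searching has to cope with differing number formats and patent families, which are ignored here, and the database names and patent numbers in the example are made up.

```python
from collections import defaultdict


def merge_records(records):
    """Combine (database, patent_number, index_terms) hits into one entry per patent."""
    merged = defaultdict(lambda: {"sources": set(), "terms": set()})
    for database, number, terms in records:
        # Naive normalization: drop spaces and ignore case.
        key = number.replace(" ", "").upper()
        merged[key]["sources"].add(database)
        merged[key]["terms"].update(terms)
    return dict(merged)


if __name__ == "__main__":
    hits = [
        ("DB_A", "US 5123456", ["catalyst"]),
        ("DB_B", "us5123456", ["zeolite", "catalyst"]),
        ("DB_A", "EP 0123456", ["polymer"]),
    ]
    for number, record in merge_records(hits).items():
        print(number, sorted(record["sources"]), sorted(record["terms"]))
```

Duplicates collapse into a single key, while the union of index terms preserves what each database contributed.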
11:00-44. End-user searching - the roads
we've travelled and where we're headed now
Patricia L. Dedert,
Corporate Research/Information Research & Analysis Unit, Exxon Research and
Engineering Co., Route 22 East, Clinton Township, Annandale, NJ 08801. Email: mailto:pldeder@erenj.com
Seekers of chemical information were early
beneficiaries of the online searching revolution, but bench chemists usually
found that they had to relinquish control of the search process to professional
searchers. Since the early days of online searching, many attempts have been
made to re-empower chemists in the task of chemical information retrieval. The
training methods and empowerment tools have evolved significantly over time, as
have the attitudes of both chemical information professionals and their
clientele. This paper will examine the history of end-user searching as
practiced at Exxon Research & Engineering, a company interested in many
types of chemical and technical information. The lessons learned over twenty
years of end-user programs and experiments will be explored, and I will attempt
to define the current needs and wishes of our population of end-user
searchers.
![]()
The Changing Chemical Information Scene: Keeping and Nurturing the Baby as
the Bathwater Rushes By
Skolnik Award Symposium, Session 2
Convention Center, Room 220
S. M. Kaback, Organizer, Presiding
2:00-Introductory Remarks
2:05-45. Exxon's
Database for Organizing and Analyzing Patent Records
Sandra S. Unger, Information Research and Analysis Unit, Exxon Research and Engineering
Company, Route 22 East, Annandale, NJ 08801. Email: mailto:ssunger@ix.netcom.com
This presentation gives an overview of
Exxon's proprietary database system for electronically displaying and analyzing
patent data, organized by its technical content. Using this system, several
databases have been constructed, each focusing on one broad technical topic and
containing the Derwent abstracts, the corresponding US and EP claims,
and technical reviews. A customized hierarchy of subject categories may be
populated by means of technical searches of the commercial databases or by
intellectually reviewing each abstract and/or the corresponding full document.
This methodology provides a custom database of thousands of categorized and
evaluated patent records that can be used by scientists and legal staff at their
desktop. Sophisticated reports based on the proprietary categorization, combined
with commercially available data, provide unique capabilities for patent mapping
across technologies and companies. These features are specifically claimed in a
granted US patent.
2:35-46. The CAS
database: Growing with the chemical sciences and electronic information
technology
Matthew J.
Toussant and David W. Weisgerber.
Chemical Abstracts Service, P.O. Box 3012, Columbus, OH 43210. Email: mailto:mtoussant@cas.org
In striving to be a leader in meeting the
chemical science information needs of scientists worldwide, Chemical Abstracts
Service (CAS) has provided a family of diversified information services that
have grown in size and utility along with the chemical sciences and electronic
information technology. This paper will survey the growth in the chemical
literature and recent enhancements in CAS database content and describe how CAS
has responded to the need for more efficient and timely database creation and
delivery. CAS production system approaches will be described. Key among these is
the use of electronic workflow technology and electronic input from primary
publishers and patent offices which have enabled CAS to create its databases
with much greater timeliness, comprehensiveness, and increased quality. And on
the delivery side, linking of the CAS databases to the full-text of the primary
sources via the Internet has culminated in a much more timely and highly linked
environment of chemical information resources.
3:05-47. Re-inventing
the Derwent Abstract
Tim
Miller, R&D, Derwent
Information Limited, 14 Great Queen Street, London WC2B 5DF United Kingdom.
Email: mailto:tmiller@derwent.co.uk
The Derwent patents abstract has been
developed and refined, with considerable input from our customers, over many
years. Why did Derwent decide to change it? Patent documents have changed over
the years and the use to which patent information is put has changed and
continues to change, especially as new methods for disseminating information
become available. This paper will describe the thinking behind the new abstract
format, the benefits which Derwent is trying to obtain, for its customers and
for its internal processing requirements, and the lessons learned during its
implementation.
3:35-Intermission
3:45-48. Teaching
computers to index
Darlene K.
Slaughter and Harry M. Allcock.
IFI/Plenum Data Corporation, 3202 Kirkwood Highway, Wilmington, DE 19808. Email:
mailto:claims@ifiplenum.com
Computers have become indispensable tools
for the indexing of chemical patents by IFI. Rather than replacing human
indexers, however, they improve efficiency by generating descriptors that can be
accepted, rejected or modified by the indexers. By automatically performing
routine indexing tasks, the computer gives the information chemists more time to
analyze and interpret the new technology described in the patents. IFI is
combining the strengths of machine indexing with the power of human
comprehension to increase productivity while maintaining quality.
4:15-49. An economic
analysis of trends in the chemical information sector
Robert J. Massie, Chemical Abstracts Service, P.O. Box 3012,
Columbus, OH 43210. Email: mailto:rmassie@cas.org
The Chemical Information sector has undergone not only a technical, but also a structural evolution in the past forty years, as commercial forces and realities have increasingly influenced its course. This evolution has been marked by the shift to consolidation and funding through world capital markets, and the increasing dominance of corporate interests where entrepreneurial, family-business, not-for-profit and government entities once ruled. The impact of the Internet has accelerated this evolution.
This presentation discusses the major developments in the Chemical Information sector from a business and economic standpoint, noting among other trends:
- the emergence of for-profit acquisition strategies aimed at vertical integration and market dominance;
- the increasing importance of investment and scale economies in technical infrastructures for data collection, storage and manipulation;
- the role of government entities, especially patent offices, in providing taxpayer-subsidized free information;
- the stresses on and evolution of the academic sector;
- the dramatic potential of electronic journals and other primary information available online, and potentially interlinked in the Web environment.
An updated version of this talk was
given at the International Chemical Information Conference in Annecy, France.
![]()
Recent Developments in Markush and Patent Searching
Convention Center, Room 220
A. Trippe, Organizer, Presiding
8:25-Introductory Remarks
8:30-50. Searching
Markush Structures in the MARPAT Database
G. Kenneth Ostrum,
Marketing, Chemical Abstracts Service, 2540 Olentangy River Road, Columbus, OH
43210. Email: mailto:gostrum@cas.org
This presentation will describe the
techniques and benefits of searching for Markush structures in MARPAT, a CAS
database that complements the CAplus and Registry files on STN. The emphasis
will be on evaluating answers in MARPAT and discussing their value as an
enhancement to the chemical literature and patent information available in
CAplus and Registry.
9:00-51. MMS,
the Markush structure file for the chemical patents community
Philippe Borne, DDI, Institut National
de la Propriété Industrielle (French Patent and TradeMark Office), 26bis rue de
Saint-Pétersbourg, PARIS cedex 08, 75800, France; Michael P. O'Hara,
Millennium Information Services (INPI North American Representative), 215 12th
Street, SE, Washington, DC 20003-1427. Email: mailto:mohara@millenniuminfo.com
INPI (the French National Institute of
Industrial Property) and Derwent Information Ltd have merged their
Markush structure databases to create a new structural database, MMS (Merged
Markush Service), which became available in June 1998. MMS covers all chemical
patents from January 1987 to the present, with the coverage for pharmaceutical
patents going back to January 1984. MMS currently contains over 700,000
structure records, which represent a total of approximately 250 million single
prophetic structures. The file is being indexed both forwards and backwards in
time. This paper will concentrate on the current status of MMS and on the
development plans. Special emphasis will be placed on the backfile indexing.
9:30-52. Markush
patents at the start of the 21st Century - doing it the Derwent way
G Cross, P Sayer, and T J Miller. Derwent Information, 14
Great Queen Street, London, WC2B 5DF, United Kingdom. Email: mailto:gcross@derwent.co.uk
Markush structures have been included in patents for many years, owing their name to a US patent applicant. More recently, Combinatorial libraries have been patented, needing similar handling techniques. Derwent has provided indexing and searching services for Markush patents since the 1960s, through abstracts, punch cards, manual and fragmentation codes. Since 1987, they have also been searchable as structures on the Markush DARC system.
Traditional searchers have learned the complex systems that enable them to retrieve this vital information. However, in this Internet era, the demand is for user-friendly systems providing high-quality information very rapidly. There is also a need for such systems to be available outside the traditional online hosts, enabling companies to manage their own combinatorial collections.
In this paper, we will look at how
Derwent is approaching the task of making Markush and combinatorial patent
information more accessible to traditional and newer users.
10:00-53.
Color coding system for simplifying IFI chemical fragmentation code
searching
Anthony J.
Trippe, Procter & Gamble Co.,
8700 Mason-Montgomery Rd., Mason, OH 45040. Email: mailto:trippe.aj@pg.com
The IFI comprehensive file is one of the few electronic sources that allow for a form of chemical structure searching back to the 1950's. This chemical structure searching takes the form of a chemical fragmentation system which allows for the searching of generic or prophetic chemical substances within granted US patents.
While powerful, the system is perhaps
underutilized since chemical fragmentation coding systems are difficult to use
and learn. This presentation will focus on a method for creating IFI
fragmentation code queries that takes advantage of the IFIREF file and a color
coding scheme, which makes these strategies easier to generate and keep track of.
10:30-54. Finding
Markush structures using IFI fragmentation
Darlene K. Slaughter and
Harry M. Allcock. IFI/Plenum Data Corporation, 3202 Kirkwood Highway,
Wilmington, DE 19808. Email: mailto:claims@ifiplenum.com
IFI's fragmentation coding system is
applied to all claimed Markush structures in U.S. patents, and provides a fast
and comprehensive method of retrieving all structures (including prophetic
substances) with specified characteristics. Searchers using IFI's system can
retrieve references to patents issued as early as 1950. Both the CLAIMS Uniterm
and CLAIMS Comprehensive databases offer access to fragmentation coding, but IFI
subscribers to the Comprehensive database benefit from greater precision in
retrieval. For searchers who do not use IFI fragmentation codes frequently, the
recently enhanced CLAIMS PC Reference software simplifies the process of
building Markush search strategies.
11:00-55.
Patents in Combinatorial Chemistry
Andrew H. Berks, Merck &
Co., 126 E. Lincoln Ave, Rahway, NJ 07065-0900. Email: mailto:andrew_berks@merck.com
Patenting activity in combinatorial
chemistry will be discussed, including bibliometric parameters such as leading
companies and growth in patenting activity. Also discussed will be patents
claiming various technologies used in combinatorial chemistry, such as lead
generation, synthetic methodologies, and claims to libraries. Online search
strategies for locating and monitoring combinatorial chemistry technology will
be presented.
Convention Center, Room 220
G. Grethe, Organizer, Presiding
1:25-Introductory Remarks
1:30-56. Reaction
information for the practicing synthetic chemist: data, problems and
solutions
Guenter
Grethe, Product Development /
Scientific Applications, MDL Information Systems, Inc., 14600 Catalina Street,
San Leandro, CA 94577-7409. Email: mailto:guenter@mdli.com
Synthetic chemists in today's competitive
research environment require fast and easy access to information ranging from
new methodologies for the synthesis of new compounds or compound libraries in
solution- or solid-phase to the availability of starting materials or new
reagents. Fortunately, the amount of information available electronically
in-house or online from large databases, combined with data from smaller specialty
databases, has increased dramatically. On the other hand, this information
is becoming increasingly difficult for the end-user chemist to manage. Providing
effective post-search management of search results and a user-friendly
environment is mandatory to entice infrequent users to utilize the
wealth of available data effectively. Using examples, we will discuss some of the problems
and their solutions, including reaction classification, clustering of data and
linkage to the primary literature.
2:00-57. Tracking
reaction pathways in the published chemical literature
Alexander J. Lawson, Director of R&D, Beilstein Information
Systems, Theodor-Heuss-Allee 108, Frankfurt a/M D-60486 Germany. Email: mailto:alawson@beilstein.com
Synthesis is arguably the highest art in organic chemistry. Historically, the many efforts of computational chemists to vie with human ingenuity in this area by providing "expert systems" and "artificial intelligence" to aid in synthesis planning have often met with only lukewarm response from the researcher active at the bench. Paradoxically, it has been the relatively "dumb" systems based on large collections of single-step reaction reports taken from the primary literature (i.e. reaction databases) which have enjoyed more favor with the working chemist. The largest and most widely used of these is the Beilstein File under CrossFire, which currently operates principally on a "Point & Click" basis.
This talk will give an overview of the
progress in now extending this paradigm to reaction pathways, thus cutting
across the boundaries of individual publications while still retaining the
natural simplicity of the navigation method.
2:30-58. Insight,
access and content - schemes for making the most of reaction-based chemical
information
Julian
Hayward and Keith A Harrington.
Synopsys Scientific Systems Ltd., 5 North Hill Road, Leeds, LS6 2EN, United
Kingdom. Email: mailto:julian.hayward@synopsys.co.uk
Over the past 20 years or so, corporate databases have focused on the storage of individual compounds, along with their molecular properties and biological test data. However, the ability of a reaction to convey so much more information to an organic chemist (selectivity, reagents, conditions etc.), as well as requirements related to combinatorial library synthesis, has led to a fundamental reappraisal of the way in which corporate data is registered and stored.
With reference to a number of commercial
reaction databases produced by Synopsys, along with some new reaction retrieval
tools in Accord, the author will discuss features which increase the value of
reaction databases to the chemist. The rationale behind the concepts and design
of new reaction databases will also be highlighted, focusing in particular on
retrieval mechanisms for Metabolism data and on the content of the new Synthons
and 'Failed Reactions' databases.
3:00-59. Integrated
Protocol Management in Combinatorial Synthesis
J. Christopher Phelan, Director, Product Management, Afferent Systems,
Inc., 1550 Bryant Street, Suite 760, San Francisco, CA 94103. Email: mailto:phelan@afferent.com
The increasing use of combinatorial
chemistry and parallel synthesis has put a new burden on chemists. Suddenly so
much data can be generated in one experiment that the bench chemist's job
involves learning and using many different kinds of information handling
software. We will present reaction management software that represents synthetic
methods in a versatile instrument-independent way, enabling its use for manual
parallel chemistry in glassware as well as high-throughput automated synthesis.
We will also discuss our combinatorial enumeration module based on the
chemically intuitive "virtual chemistry" paradigm, on-line access to analytical
data (e.g. MS, LC, and LC/MS), and new complex search capabilities for
exploitation of these data structures by the synthetic chemist. All of these
functionalities are combined in an integrated package with a user interface that
is intuitive to the bench chemist, minimizing the tasks of software training and
data handling and putting the chemist back in the laboratory.
3:30-60. Finding the
winning reactions in reaction databases
Robert L. Swann, Director of
Research, Information Systems, and New Product Development, Chemical Abstracts
Service, 2540 Olentangy River Road, Columbus, OH 43202-1505. Email: mailto:rswann@cas.org
With the increasing availability of
electronic information, chemists can seek reaction information and access
relevant literature articles more rapidly and efficiently than ever before. As
the size of reaction databases grows, reaction database providers must work to
ensure that these end-user chemists are able to obtain germane answers to their
reaction questions. This talk will discuss some of the approaches being taken to
deliver precise reaction information to chemists.
4:00-61.
The distribution of synthetic
techniques in the chemistry literature. M. Clark
[Paper withdrawn]
4:30-62. Synthetic
information in patents - an underused resource
D G Penn, P Sayer, G Cross, and T J Miller. Derwent Information, 14 Gt Queen
Street, London, WC2B 5DF, United Kingdom. Email: mailto:dpenn@derwent.co.uk
The recent growth in the number of Reaction Databases has meant synthetic chemists have more electronic information resources at their disposal than ever before. Coverage from Chemical Journals is comprehensive; however, the Patent Literature is less well covered. Patents are an important source of synthetic information. Because a patent can be granted only if its technical details are fully disclosed, in enough detail to replicate the invention, patent specifications can contain considerably more information than a corresponding journal article.
We shall compare and contrast these
two sources of synthetic data and give examples of retrieval strategies. We
shall discuss the systems used at present for indexing reactions on Derwent World Patents
Index and the enhancements currently being developed.
Numeric Chemical Information
Convention Center, Room 216
S. Heller, Organizer, Presiding
1:30-Introductory Remarks
1:35-63.
Uniformity and the New Protein Data Bank
Gary L. Gilliland1, Phoebe
Fagan1, John Westbrook2, Helen Berman2,
Phil Bourne3, and Peter Arzberger3. (1) National Institute
of Standards and Technology, 100 Bureau Drive, Gaithersburg, MD 20899, (2)
Rutgers University, Rutgers, NJ, (3) UC - San Diego, San Diego, CA. Email: mailto:phoebe.fagan@nist.gov
The Protein Data Bank (PDB) is an international repository for macromolecular structure data, generated experimentally by X-ray crystallographic and NMR methods, or from theoretical modeling. On October 1, 1998, the Research Collaboratory for Structural Bioinformatics (RCSB) became responsible for the management of the PDB. The RCSB has three member institutions: the Biotechnology Division of the National Institute of Standards and Technology (NIST); the Department of Chemistry at Rutgers, the State University of New Jersey; and the San Diego Supercomputer Center at the University of California. The new resource is committed to providing efficient deposition and processing of data, versatile query and reporting capabilities, and reprocessing of legacy data to create a uniform archive.
A one-year transition period was allotted for the transfer of the PDB to the RCSB. The new systems are in place (http://www.rcsb.org). Since January 27, 1999, the RCSB has processed all new depositions. The new query system, Searchlite, has been released to the public. A clean-up process for the legacy data has begun.
The RCSB is using an mmCIF-based dictionary and tools to transform the flat file format of the PDB structure files into a relational database that will provide controlled access to all the data in the PDB files. Collaborations are underway to clarify nomenclature and clearly define fields such as the classification, name and source of the molecule to enable more reliable searches. The uniformity (clean-up) process will be addressed in some detail. The original PDB files and format as well as mmCIF-based files will be preserved. The RCSB will ensure that the community has extensive input into the PDB archival activities.
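As a rough illustration of what moving from a flat file to a relational form involves, the sketch below slices the fixed columns of two legacy-format ATOM records and loads them into an in-memory SQLite table. It is only a minimal sketch: the table layout, the function names and the sample coordinates are assumptions made for illustration, and it does not represent the RCSB's mmCIF dictionary or tools.

```python
import sqlite3

# Two ATOM records in the legacy fixed-column PDB layout (illustrative values).
SAMPLE_LINES = [
    "ATOM      1  N   ALA A   1      11.104   6.134  -6.504  1.00  0.00           N",
    "ATOM      2  CA  ALA A   1      11.639   6.071  -5.147  1.00  0.00           C",
]


def parse_atom_record(line):
    """Slice one ATOM/HETATM record by its fixed column positions."""
    return {
        "serial": int(line[6:11]),          # columns 7-11
        "atom_name": line[12:16].strip(),   # columns 13-16
        "res_name": line[17:20].strip(),    # columns 18-20
        "chain_id": line[21],               # column 22
        "res_seq": int(line[22:26]),        # columns 23-26
        "x": float(line[30:38]),            # columns 31-38
        "y": float(line[38:46]),            # columns 39-46
        "z": float(line[46:54]),            # columns 47-54
    }


def load_atoms(lines, conn):
    """Create a hypothetical 'atom' table and insert every coordinate record."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS atom ("
        "serial INTEGER PRIMARY KEY, atom_name TEXT, res_name TEXT, "
        "chain_id TEXT, res_seq INTEGER, x REAL, y REAL, z REAL)"
    )
    rows = [parse_atom_record(l) for l in lines if l.startswith(("ATOM", "HETATM"))]
    conn.executemany(
        "INSERT INTO atom VALUES "
        "(:serial, :atom_name, :res_name, :chain_id, :res_seq, :x, :y, :z)",
        rows,
    )
    return len(rows)


conn = sqlite3.connect(":memory:")
n_loaded = load_atoms(SAMPLE_LINES, conn)
# Once the data are relational, a field-level query replaces text scanning.
ca_atoms = conn.execute(
    "SELECT res_name, chain_id, x FROM atom WHERE atom_name = 'CA'"
).fetchall()
```

Queries such as the last one are the kind of controlled, field-level access that a flat file cannot offer directly.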
The RCSB vision is to enable new science by providing accurate, consistent, well annotated structural data delivered in a timely manner to a wide audience. The first step in this process is to provide a channel of open communication.
The PDB is funded by the National Science
Foundation, the National Institutes of Health (NIGMS & NLM), and the
Department of Energy.
2:05-64. Evaluation of
the NIST/EPA/NIH Mass Spectral Library
S E Stein1, P Ausloos1, C L Clifton1,
J K Klassen1, S G Lias1, A I Mikaya1, O
D Sparkman1, D V Tchekhovskoi1, V Zaikin2, and
Damo Zhu3. (1) Physical and Chemical Properties Division, NIST, 100
Bureau Dr Stop 8380, Gaithersburg, MD 20899-8380, (2) Topchiev Institute of
Petrochemical Synthesis, Moscow, Russia, (3) Dalian Institute of Chemical
Physics, Chinese Academy of Sciences, Dalian, China. Email: mailto:jane.klassen@nist.gov
The NIST/EPA/NIH Mass Spectral
Library contains mass spectral information on over 100,000 compounds and is used
for fingerprint mass spectral matching. The confidence in correctly identifying
a compound by matching its spectrum with a reference library spectrum depends
directly on the quality of the library. Since it has become clear that automated
quality control algorithms are not reliable, a spectrum by spectrum evaluation
of the NIST/EPA/NIH Mass Spectral Library has been undertaken. The archive has
been exhaustively examined by individuals well trained in mass spectrometry.
Because of unavoidable uncertainties in judging the quality of a spectrum, an
important requirement has been the agreement on both the analysis and the remedy
for each spectrum by at least two individuals. An exact record of any
modifications to the spectra has been maintained. Several factors pertaining to
the evaluation of the data will be discussed along with examples of difficult
evaluations.
2:35-65. Web-Based
Access to Structure Based Prediction and Databases for Spectroscopy and Physical
Properties
Anthony J.
Williams and Valery Kulkov.
Advanced Chemistry Development, 133 Richmond Street West, Suite 605, Toronto, ON
M5H 2L5, Canada. Email: tony@acdlabs.com
The Interactive Laboratory, ACD/ILab,
offers a universal Web-based gateway to various chemical information resources,
property prediction programs and chemical databases. ACD/ILab utilizes
Java-based structure drawing and spectral display applets to provide structure
submissions for prediction purposes and display of predicted spectra. Currently,
the database searches and property predictions available at ILab
include HNMR spectrum prediction and searching of 82,000 assigned chemical
structures, C13 NMR spectrum prediction and searching of 67,000 assigned
chemical structures, pKa prediction and pKa database search (over 9000
structures), and LogP prediction and LogP database search (over 3500 structures).
Other structure-based databases are also available and will be discussed.
3:05-66. Generating
numeric chemical information from chemical structures during chemical
registration
Christopher S.
McKenna and Phil McHale. Product
Marketing, MDL Information Systems, Inc., 14 Walsh Drive, Parsippany, NJ 07054.
Email: mailto:chrism@mdli.com
Chemists and biologists are
increasingly looking for more descriptive properties that can be used in
decision-making and analysis. This trend is due in large part to the increasing
numbers of compounds and screening data that need to be analyzed for lead
finding, optimization, and candidate selection. In this session we will discuss
and demonstrate a new chemical scripting language from MDL for producing numeric
information from chemical structures. That numeric information can be registered
along with chemical structures through chemical registration processes, and then
analyzed in spreadsheets or interactive charting tools to aid in
decision-making.
3:35-67. Numerical Data
In the Beilstein File under CrossFire
Gabriele Ilchmann, Alexander
J. Lawson, and Huyen Nguyen. Beilstein Information Systems, Theodor-Heuss-Allee
108, Frankfurt, D-60486, Germany. Email: mailto:gilchmann@beilstein.com
As well as being one of the world's
major abstracting and indexing services to the chemical primary literature, the
Beilstein File is also the world's largest collection of experimentally measured
property data on organic chemicals. Many of these data are numerical, such as
melting point, boiling points/pressures, refractive indices, optical rotation,
thermodynamic values etc. and this aspect of Beilstein has been widely used by
chemists for many generations: the Beilstein Handbook has always been highly
valued as a source of characterising data in its own right. A less well-known
aspect of the Beilstein File under CrossFire is the use of numerical data in
another context: as search filters, for instance in the restriction of reaction
conditions to a particular temperature range. The release of the EcoPharm
database under CrossFire now greatly increases the numerical data content of the
Beilstein File in the key areas of ecological and pharmacological data (see
Figure), and this is accompanied by the ability to use new numerical filters
(such as physiological activity) to arrive quickly at highly specific and
relevant data. This talk will discuss the variety of numerical data in Beilstein
under CrossFire, including the EcoPharm database, and will illustrate
techniques for range-searching on numerical values in this
important new data collection.
4:05-68. Dimensionality
and classification considerations in pattern recognition: demonstration of a
novel efficient procedure
Norman J.
Santora, Chemical Forecasting And
Searching Technology, 1323 Partridge Road, Roslyn, PA 19001-2807. Email: njsmbs@msn.com
Pattern recognition consists of two
distinct steps: (a) preprocessing, wherein a data matrix is operated upon in
order to reduce its dimensionality; and (b) classification, wherein the data
elements are placed into discriminate property classes. An application of the
procedure developed will illustrate the classification of therapeutic agents
using novel effective preprocessing and classification procedures on a data
matrix comprised of organic structural information.
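The two steps named in this abstract can be illustrated with a deliberately simple stand-in. The sketch below is not the author's procedure, which the abstract does not describe. It swaps in two textbook techniques: variance-based column selection for preprocessing and a nearest-centroid rule for classification. The data matrix of structural fragment counts and the class labels are hypothetical.

```python
from statistics import mean, pvariance


def select_high_variance_columns(matrix, keep):
    """Preprocessing: keep the `keep` columns with the largest variance."""
    n_cols = len(matrix[0])
    variances = [pvariance([row[j] for row in matrix]) for j in range(n_cols)]
    ranked = sorted(range(n_cols), key=lambda j: variances[j], reverse=True)
    return sorted(ranked[:keep])


def project(row, columns):
    """Reduce one row to the selected columns."""
    return [row[j] for j in columns]


def fit_centroids(rows, labels):
    """Classification, part 1: compute one mean vector per class."""
    by_class = {}
    for row, label in zip(rows, labels):
        by_class.setdefault(label, []).append(row)
    return {
        label: [mean(col) for col in zip(*members)]
        for label, members in by_class.items()
    }


def classify(row, centroids):
    """Classification, part 2: assign the class with the nearest centroid."""
    def squared_distance(a, b):
        return sum((p - q) ** 2 for p, q in zip(a, b))
    return min(centroids, key=lambda label: squared_distance(row, centroids[label]))


# Hypothetical fragment-count matrix: columns 0 and 3 carry no information.
X = [
    [1, 0, 3, 1],
    [1, 0, 4, 1],
    [1, 2, 0, 1],
    [1, 3, 0, 1],
]
y = ["class_a", "class_a", "class_b", "class_b"]

columns = select_high_variance_columns(X, keep=2)
centroids = fit_centroids([project(r, columns) for r in X], y)
prediction = classify(project([1, 0, 5, 1], columns), centroids)
```

Here the preprocessing step discards the two constant columns, so the classifier works in two dimensions instead of four.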
Web-Based Deployment of Info Management Tools
Convention Center, Room 220
O. Guner, Organizer, Presiding
8:30-Introductory Remarks
8:35-69.
Cheminformatics and the Internet
Osman F. Güner, Omer Casher,
Ajay V. Shah, and Chris Hempill. Molecular Simulations Inc., 9685 Scranton Rd.,
San Diego, CA 92121. Email: mailto:osman@msi.com
The Internet has fundamentally changed the way chemical information is accessed and utilized. A typical corporate intranet has become a busy infrastructure for broader access to not only corporate databases, but also to various computational and analysis tools. Various in-house efforts in developing Web-based tools for the non-computational chemists are complemented by emerging commercial software systems that utilize the corporate intranets in an attempt to integrate the in-house informatics tools with analysis tools. In this presentation, we describe an environment that utilizes this change in paradigm and provides computational productivity tools to medicinal chemists.
For a recent review in this area, see Güner, O. F., and Casher, O., "Role of the Internet in Chemoinformatics: Recent Developments," Current Opinion in Drug Discovery & Development 1999 2(3).
9:05-70.
Web-based technology for cheminformatics
Joe R McDaniel,
Cheminformatics, Oxford Molecular Group, Inc., 11350 McCormick Road, Executive
Plaza III - 1100, Hunt Valley, MD 21031. Email: jmcdaniel@oxmol.com
A discussion and presentation of Web-based approaches for cheminformatics using client-side technologies. Server-side technologies will be discussed as they relate to the primary topic and will include an overview of Oracle, JSP, and other tools.
Client tools discussed will include ActiveX, PlugIn, and Java controls for display and editing of structure diagrams developed by the author as well as an overview of other tools available.
A discussion of techniques for compression
of structure data for HTML will show an easily implemented compression scheme
based on Splay Trees.
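The abstract does not give the details of the compression scheme, so the following is only a sketch of the general idea behind splay-tree prefix coding: each symbol is a leaf of a binary code tree, its code is the path from the root, and after every symbol the leaf is moved toward the root so that frequent symbols get shorter codes. The class name, the semi-splay variant and the sample string are assumptions for illustration, not the author's implementation.

```python
class SplayPrefixCoder:
    """Adaptive prefix coder whose code tree is reshaped by semi-splaying."""

    def __init__(self, n_symbols=256):
        self.n = n_symbols
        size = 2 * n_symbols
        self.up = [0] * size
        self.left = [0] * size
        self.right = [0] * size
        # Start from a balanced tree: internal nodes 1..n-1, leaves n..2n-1.
        for i in range(1, n_symbols):
            self.left[i], self.right[i] = 2 * i, 2 * i + 1
            self.up[2 * i] = self.up[2 * i + 1] = i

    def _semi_splay(self, node):
        """Swap the node with its uncle, then repeat from the grandparent."""
        while node != 1:
            parent = self.up[node]
            if parent == 1:
                break
            grand = self.up[parent]
            if self.left[grand] == parent:
                uncle = self.right[grand]
                self.right[grand] = node
            else:
                uncle = self.left[grand]
                self.left[grand] = node
            if self.left[parent] == node:
                self.left[parent] = uncle
            else:
                self.right[parent] = uncle
            self.up[node] = grand
            self.up[uncle] = parent
            node = grand

    def encode(self, data):
        """Return the list of bits (0/1) encoding a bytes object."""
        bits = []
        for symbol in data:
            leaf = self.n + symbol
            path = []
            node = leaf
            while node != 1:
                parent = self.up[node]
                path.append(1 if self.right[parent] == node else 0)
                node = parent
            bits.extend(reversed(path))  # emit the path root-to-leaf
            self._semi_splay(leaf)
        return bits

    def decode(self, bits):
        """Rebuild the bytes by replaying the same tree updates."""
        out = bytearray()
        node = 1
        for bit in bits:
            node = self.right[node] if bit else self.left[node]
            if node >= self.n:  # reached a leaf
                out.append(node - self.n)
                self._semi_splay(node)
                node = 1
        return bytes(out)


# Illustrative input: a repeated SMILES-like string standing in for structure data.
text = b"CC(=O)Oc1ccccc1C(=O)O " * 8
bits = SplayPrefixCoder().encode(text)
restored = SplayPrefixCoder().decode(bits)
```

Because encoder and decoder apply identical tree updates after each symbol, no code table needs to be transmitted with the data.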
9:25-71.
The intranet at the interface between computational and synthetic
chemistry
Herman van
Vlijmen, Philip C. Huang, Matthias
Nolte, and Juswinder Singh. Biogen, Inc., 14 Cambridge Center, Cambridge, MA
02142. Email: mailto:herman_vanvlijmen@biogen.com
Computational chemists generate large
amounts of data that need to be presented clearly to synthetic chemists. Making
all the information available on everyone's desktop allows fast sharing of
data and enables people to look at the results at any time. We have created a
web-based system that allows computational chemists to automatically publish on
the intranet a variety of modeling results, including Dock, Ludi, Leapfrog,
Catalyst, and HQSAR data. The free web plug-in Chime is used to display 3D
structures; GIF files are used for 2D structures. We have also created a
"chemical workbench", which allows bench chemists to run simple minimizations,
Dock, Ludi, and the calculation of molecular properties. It has been our
experience that by giving chemists interactive access to modeling results and
calculation tools we have increased the impact of computational chemistry and
its integration into the drug discovery process.
9:55-72.
Searching NMR databases and predicting NMR spectra over the Web
Valeri Kulkov and Antony Williams. Advanced Chemistry Development, Inc., 133 Richmond
Street West, Suite 605, Toronto, ON M5H 2L3, Canada. Email: mailto:val@acdlabs.com
Searching and sharing spectral information in a networked environment has traditionally been a challenging task. The lack of cross-platform, embeddable software tools for visualization and manipulation of complex objects such as spectra and chemical structures still hinders effective information interchange.
Using ACD/ILab (http://www.acdlabs.com/ilab/), a Web-based gateway to chemical information resources, as an example, we will describe
our approach to handling spectral information on the Web. H/C/F/PNMR databases
on the ILab are searchable by chemical shifts, structure, substructure, formulae
and molecular weight. Simulated H/C/F/PNMR spectra for a known or unknown
structure can be obtained by accessing a corresponding server-based prediction
engine. We developed a set of Java applets for manipulating and visualizing
spectra and chemical structures and for viewing the assignment of peaks to the
corresponding atoms in a molecule. We will present an XML-based approach to sharing
spectral and structural information on the Web.
10:25-73.
A Web-Based Engineering Chemistry Database
Xue-Liang Fang, Wei
Zhang, Hao Wen, and Zhi-Hong Xu. Institute of Chemical Metallurgy, Laboratory
of Computer Chemistry, Chinese Academy of Sciences, P.O. Box 353, Zhongguancun,
Beijing, 100080, China. Email: mailto:xlfang@lcc.icm.ac.cn
Many databases and computer programs have been developed over the last 20 years to meet the data requirements of chemical process development. However, some databases and programs work only on an individual computer, which makes them difficult to use for people who want to find data online from another computer. The Internet makes it possible to retrieve data from a database, run calculations, and draw figures on client computers.
A methodology study has been performed in
order to provide access to databases and calculations via the Internet. The data
retrieved from a database or calculated by a program can be converted into
HTML documents using HTML extension files (*.htx) as templates and sent
back to the web browser. As an example, a database and a program package for
thermodynamic and equilibrium properties data retrieval and calculations in
engineering chemistry have been developed on the web server (http://mole.icm.ac.cn).
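The template step described above can be sketched as follows. This is only an illustration under stated assumptions: the placeholder syntax (`<%field%>`, with `<%begindetail%>` and `<%enddetail%>` around the section repeated per result row) is modeled loosely on the .htx convention, the field names are hypothetical, and the numeric values are placeholders rather than real property data.

```python
import html
import re

# Hypothetical template in an .htx-like style; the detail section repeats per row.
TEMPLATE = """<html><body>
<h1>Results for <%compound%></h1>
<table>
<%begindetail%><tr><td><%temperature_k%></td><td><%heat_capacity%></td></tr>
<%enddetail%></table>
</body></html>"""

FIELD = re.compile(r"<%(\w+)%>")


def fill(fragment, values):
    """Replace each <%name%> placeholder with its HTML-escaped value."""
    return FIELD.sub(
        lambda match: html.escape(str(values.get(match.group(1), ""))), fragment
    )


def render(template, header, rows):
    """Expand the detail section once per result row and fill the rest."""
    head, rest = template.split("<%begindetail%>", 1)
    detail, tail = rest.split("<%enddetail%>", 1)
    body = "".join(fill(detail, row) for row in rows)
    return fill(head, header) + body + fill(tail, header)


# Placeholder rows standing in for a database query or calculation result.
rows = [
    {"temperature_k": 300, "heat_capacity": 1.0},
    {"temperature_k": 310, "heat_capacity": 1.1},
]
page = render(TEMPLATE, {"compound": "A & B mixture"}, rows)
```

Escaping each value before substitution keeps characters such as `&` or `<` in the retrieved data from breaking the generated page.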
10:55-74. The
versatility of a web-based spent nuclear fuel database
Luis R. Canas, Savannah River Site, Westinghouse Savannah River Company, Aiken, SC
29808. Email: mailto:luis.canas@srs.gov
The Spent Fuel Storage Division of
the Westinghouse Savannah River Company, principal operations contractor for the
Department of Energy's Savannah River Site chemical-nuclear complex, has
developed a prototype Spent Nuclear Fuel Database (SNFD) with a web-front
interface for smart retrieval of a wide variety of technical data from any
personal computer on the corporate intranet. The SNFD resides in a Microsoft
Access file in a dedicated Windows 95 platform. The web interface is managed by
the WebSite commercial web server in concert with a custom Visual Basic script
and a library of HTML and graphics files. The server transmits static (directly
from HTML files) or dynamic (custom HTML composed by the VB script with embedded
data from queries on the Access file) pages to a remote user's web browser in
response to requests for particular information.
General Papers
Convention Center, Room 220
A. Berks, Organizer, Presiding
12:55-Introductory Remarks
1:00-75. Collaborative
electronic notebook systems: A technical knowledge management paradigm beyond
LIMS, Groupware, and the Web
R. Lysakowski, The
Collaborative Electronic Notebook Systems Association, Woburn, MA 01801. Email:
mailto:rich@censa.org
Collaborative Electronic Notebook
Systems (CENS) are sophisticated systems for technical knowledge management that
integrate electronic recordkeeping and records management systems, LIMS,
groupware, document management, the Web, databases, instrument systems, and many
desktop and server applications that scientists and engineers routinely use.
They are also the first major technical software applications to take advantage
of handheld, wireless computer hardware. The Collaborative Electronic Notebook
Systems Association (CENSA) is now leading the paradigm shift from paper-based
to fully electronic recordkeeping systems. This paper will provide: 1) an
overview of CENS technologies and systems; 2) a discussion of the legal,
regulatory, technical, and business imperatives that must be addressed to
implement successful systems in regulated industries where patents are
generated; 3) an overview of CENSA and its projects with industrial companies
and regulatory agencies worldwide to drive creation and acceptance of electronic
recordkeeping systems worldwide.
1:30-76. Electronic
laboratory notebook systems for R&D and testing labs: Status of creation and
acceptance in industry
R.
Lysakowski, The Collaborative
Electronic Notebook Systems Association, Woburn, MA 01801. Email: mailto:rich@censa.org
Collaborative Electronic Notebook
Systems (CENS) will eventually replace traditional paper-and-pen-based
recordkeeping systems with fully electronic, legally-defensible, multimedia,
multiuser systems that offer many advantages over paper. How soon will they be
on the market? What's being done now to make them available to scientists and
engineers? How will these recordkeeping systems integrate with existing data
management systems, such as LIMS, instrument data management, combinatorial
chemistry, and high-throughput screening applications? What about the various
wireless, handheld notebooks, PDAs, and other portable devices -- how will their
ephemeral datasets be transported and secured in an emergent recordkeeping
infrastructure? The Collaborative Electronic Notebook Systems Association
(CENSA) is an international professional and trade association formed in late
1996 to answer these questions and many more. This presentation will cover the
mission, objectives and progress of CENSA in its research and product
development programs for industry and government.
2:00-77. A
water-quality information system for the Lower Mississippi River
Boumediene Belkhouche1, James E. Bollinger2, and
William J. George2. (1) Computer Sciences Department, Tulane
University, New Orleans, LA 70118, (2) Division of Toxicology/Pharmacology
Department, Tulane University, 1430 Tulane Ave., New Orleans, LA 70112. Email:
mailto:bb@mailhost.tcs.tulane.edu
A major issue in monitoring and
managing ecosystems is the lack of an integrated model. Consequently, we
developed a water quality information system for the Lower Mississippi River
that provides a uniform conceptual model of the ecosystem, integrates large
amounts of heterogeneous data collected by various sources, and facilitates the
analysis and interpretation of existing ambient water-quality data. We
conceptualize a river as an object-oriented model consisting of classes and
relationships among them. The automated analysis process supports exploratory
questions about the availability of data and their geographic distribution, the
concentration levels and distribution of parameters, river hydrology, and the
relationships among the individual variables. In addition to these design
features, a strict quality control protocol has been implemented to document the
flow of data beginning at the point at which data are obtained from their
source, through a comprehensive validation process, until their upload into the
database system.
2:30-78.
MolBank: preservation and publication of chemical reaction data
Shu-Kun Lin, Molecular Diversity Preservation International, Sangergasse 25, Basel
CH-4054 Switzerland. Email: mailto:lin@mdpi.org
Molecules
(http://mdpi.org/molecules/, ISSN 1420-3049) publishes in the section of
MolBank
(http://mdpi.org/molbank) very short notes of experimental data records for
individual molecules. Any scattered, unassembled experimental data for
individual compounds which is conventionally not publishable is particularly
welcomed, to be published as one-paper one-page for one structure and given
special page numbers (M1, M2, etc.). They have been published in HTML format,
with at least a formula of the target molecule. MDL MOL file is also included
for every MolBank short notes. All papers submitted for consideration and
publication in this column of "MolBank" have been refereed and the accepted
papers edited (English corrected and format unified). The related chemical
samples are in most cases available and the availability information is also
published. All papers published in the MolBank section have been indexed and
abstracted by several leading indexing and abstracting services, including
Chemical Abstracts; CAPLUS; Science Citation Index Expanded; SciSearch, Research
Alert; Chemistry Citation Index; Current Contents/Physical, Chemical & Earth
Sciences. This is the first online publication of experimental chemistry. I will
report on our experience with, and the planned improvements to, the MolBank
section and the journal Molecules.
Email: mailto:lin@mdpi.org
3:00-79. Handling
stereoisomerism and adding alternative, CAS-based, ring system nomenclature into
organic compound names generated algorithmically directly from a connection
table: AutoNom(TM) approach
Janusz L. Wisniewski,
Beilstein Information, Theodor-Heuss-Allee 108, D-60486 Frankfurt/Main Germany
Email: mailto:jwisniewski@beilstein.com
The design and practical implementation
of algorithms and routines for generating stereochemical descriptors for
organic stereoisomers, directly from their connection tables, are discussed.
Techniques and methods for unambiguous and efficient calculation of the spatial
distribution of atoms with reference to a double bond (E,Z) and with reference
to chiral centers (R,S) are described and demonstrated within organic
nomenclature generated automatically by Beilstein's newly upgraded (Version
4.0) AutoNom system. The inclusion of IUPAC-sanctioned CAS ring system
nomenclature in the AutoNom naming procedure, as an alternative (or addition) to
the "native" Beilstein ring system nomenclature, is discussed, evaluated, and
illustrated by various names generated for sample organic compounds.
Email: mailto:jwisniewski@beilstein.com
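To make the idea of deriving an (E,Z) descriptor from a connection table concrete, here is a minimal sketch. It is not the AutoNom algorithm, which the abstract does not detail. It assumes a connection table with 2D coordinates, ranks substituents by the atomic number of the directly attached atom only (a deliberate simplification of the CIP rules, so ties are reported as undetermined), and checks whether the top-ranked substituents lie on the same side of the double-bond axis. All function and variable names are hypothetical.

```python
# Atomic numbers for the few elements used in this illustration.
ATOMIC_NUMBER = {"H": 1, "C": 6, "N": 7, "O": 8, "F": 9, "Cl": 17, "Br": 35}


def cross(u, v):
    """z-component of the 2D cross product; its sign tells the side of u that v lies on."""
    return u[0] * v[1] - u[1] * v[0]


def top_substituent(atom, other, atoms, bonds):
    """Highest-ranked neighbor of `atom`, ignoring double-bond partner `other`.

    Ranking uses only the atomic number of the attached atom (simplified CIP).
    Returns None when there is no substituent or when the top two tie.
    """
    nbrs = [n for n in bonds[atom] if n != other]
    ranked = sorted(nbrs, key=lambda n: ATOMIC_NUMBER[atoms[n][0]], reverse=True)
    if not ranked:
        return None
    if len(ranked) > 1 and atoms[ranked[0]][0] == atoms[ranked[1]][0]:
        return None  # tie: full CIP rules would explore further spheres
    return ranked[0]


def ez_descriptor(a, b, atoms, bonds):
    """Return "E", "Z", or None for the double bond between atoms a and b."""
    pa = top_substituent(a, b, atoms, bonds)
    pb = top_substituent(b, a, atoms, bonds)
    if pa is None or pb is None:
        return None
    (xa, ya), (xb, yb) = atoms[a][1], atoms[b][1]
    axis = (xb - xa, yb - ya)
    side_a = cross(axis, (atoms[pa][1][0] - xa, atoms[pa][1][1] - ya))
    side_b = cross(axis, (atoms[pb][1][0] - xb, atoms[pb][1][1] - yb))
    if side_a == 0 or side_b == 0:
        return None  # substituent drawn collinear with the bond
    return "Z" if side_a * side_b > 0 else "E"


def dichloroethene(cis):
    """Connection table for 1,2-dichloroethene: (element, (x, y)) list plus adjacency."""
    y = 0.87 if cis else -0.87
    atoms = [
        ("C", (0.0, 0.0)), ("C", (1.0, 0.0)),
        ("Cl", (-0.5, 0.87)), ("H", (-0.5, -0.87)),
        ("Cl", (1.5, y)), ("H", (1.5, -y)),
    ]
    bonds = {0: [1, 2, 3], 1: [0, 4, 5], 2: [0], 3: [0], 4: [1], 5: [1]}
    return atoms, bonds
```

For example, `ez_descriptor(0, 1, *dichloroethene(cis=True))` classifies the cis isomer, where both chlorine atoms sit on the same side of the C=C axis. A production naming system would need the full CIP hierarchy and handling of 3D or wedge-bond input, which this sketch omits.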
3:30-80. Data
Management for High Performance Computing Users
Kerstin Kleese, High Performance Computing Initiative Centre, CLRC - Daresbury
Laboratory, Keckwick Lane, Warrington, WA4 4AD, United Kingdom and Lois
Steenman-Clark, Reading University. Email: mailto:k.kleese@dl.ac.uk
The demand for data storage has exploded in the last few years. Whereas ten years ago we still measured storage space in Mbytes, today's well-established national facilities offer greatly increased disk and tape storage capacities. Even so, the existing storage space is already filling up rapidly, and this trend is expected to intensify. It raises many questions: What are the reasons behind this development? Is it really necessary to keep all this data? For how long is the stored data valuable to us? Who are the main producers? Are we making the best possible use of this data? This paper will concentrate on the data management issues facing users of national High Performance Computing facilities and address some of the strategic questions posed.
4:00-81.
Information services on the
intranet: where we are and where we want to go
Kerryn A. Brandt and Joanne L. Witiak. Information Services
Dept., Rohm and Haas Company, P.O. Box 718, Bristol, PA 19007 Email: rahkab@rohmhaas.com, Email: mailto:rsrjlw@rohmhaas.com
Searching the web has become an additional aspect of many chemical information searches. However, web and intranet technology itself can be exploited by information professionals to deliver search results more effectively. The web can also be a valuable marketing tool. For example, by publishing profiles of key competitors on our intranet, we showcased the value we add by collecting, organizing and summarizing information. This information was rapidly and simultaneously available to our global customer base.
We will discuss examples of how we have used our intranet to interact with remote customers, integrate external and internal information, provide enhanced context, navigation, and management of online search results, offer customized views of the same data to different clientele, close the gap between secondary sources and primary information, and generate continually updated searches personalized to customer needs. We will explain where we would like to go with these approaches in the future and raise some issues that challenge our progress.