|
CINF 1
Beth Thomsett-Scott, Reference and Information Services, University of North Texas Libraries, P.O. Box 305190, Denton, TX 76226 The term “social software” has gained popularity in the last several years, although the idea of providing computer-based social interactions can be traced back to the 1940s. Today, the term refers to any software that provides for or promotes “social” interaction through the Internet. Group interactions through an electronic medium date back to the beginnings of e-mail, bulletin board services, and chatrooms. Today there are many more options for online group interaction. Some of these social software tools, including RSS, IM, and wikis, are being used for educational and research purposes. This talk will provide a brief history of social software and offer an overview of the major tools used in education and research. Information provided on each tool will include how it is being used and, where possible, some assessment of its functionality. |
|
CINF 2
Teri M. Vogel, Science & Engineering Library, University of California, San Diego, 9500 Gilman Drive, #0175E, La Jolla, CA 92093 Though it is still on the “bleeding edge” for most web users, RSS has become one of the Web 2.0 technologies that are changing how information is delivered, shared, and received. This presentation will: 1) explore the role that RSS is playing in an increasingly complex web landscape of blogs, wikis, search engines, tagging folksonomies, and more recently in the delivery of audio/visual content; 2) examine how libraries, publishers, A&I database producers, and other information providers are using RSS to deliver content to students and researchers in chemistry and other sciences, and whether those end-users are taking advantage of these opportunities; and 3) speculate on the effects that RSS-based technologies will have on library and research services, particularly in response to our users' evolving information needs and the new and changing devices they will be using to manage their information. |
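For readers unfamiliar with the format behind this abstract: an RSS 2.0 feed is an XML document in which a `channel` element holds a series of `item` elements, each typically carrying a `title` and a `link`. The following is a minimal sketch of how a client might read such a feed. The sample feed, its URLs, and the function name `read_items` are invented for illustration and do not come from the presentation.

```python
import xml.etree.ElementTree as ET

# Hypothetical table-of-contents feed; titles and URLs are placeholders.
SAMPLE_FEED = """<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>Example Journal: Latest Articles</title>
    <link>http://example.org/journal</link>
    <description>Hypothetical table-of-contents feed</description>
    <item>
      <title>First example article</title>
      <link>http://example.org/journal/1</link>
    </item>
    <item>
      <title>Second example article</title>
      <link>http://example.org/journal/2</link>
    </item>
  </channel>
</rss>
"""


def read_items(feed_xml):
    """Return a list of (title, link) pairs, one per <item> in the feed."""
    root = ET.fromstring(feed_xml)
    items = []
    # In RSS 2.0 the items sit directly under the single <channel> element.
    for item in root.findall("./channel/item"):
        title = item.findtext("title", default="")
        link = item.findtext("link", default="")
        items.append((title, link))
    return items


if __name__ == "__main__":
    for title, link in read_items(SAMPLE_FEED):
        print(title, link)
```

A feed reader repeats this step on a schedule and shows only the items it has not seen before, which is what lets publishers and libraries push new content to subscribers.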
CINF 3
|
|
CINF 4
Barbara A Greenman, Science Library, University of Colorado at Boulder, 184 UCB, Norlin Library, Boulder, CO 80309-0184 Although not widely embraced in the U.S., open access has become the mode of publishing for many academic authors worldwide, thereby providing free online access to their scientific research. Traditional venues for scholarly communication are undergoing fundamental change driven by two forces in particular: online publishing and blogging. These forces are transforming not only the academic publishing structure but also the configuration and the format of the research article itself. This presentation explores how the new culture of open access, coupled with the increase in blogging by students, faculty, and the general public, is impacting scholarly research. |
|
CINF 5
Jeremy R Garritano, Mellon Library of Chemistry, Purdue University, 504 W. State St., West Lafayette, IN 47907 and David B. Eisert, Teaching and Learning Technologies, Purdue University, 504 W. State St., West Lafayette, IN 47907. Considered one of the larger and broader coursecasting programs currently in the United States, Purdue University's BoilerCast system offers many challenges and opportunities for faculty, staff, and students on and off campus. Easy online access to audio course lectures, integrated with RSS feeds, allows students to review lectures before exams, supplements in-class talks, and even lets faculty critique their own lectures. However, a podcasting or coursecasting service is not without its tribulations. For those exploring the possibilities of coursecasting, this paper will discuss the ongoing costs and benefits of a large-scale coursecasting system, lessons learned, and future directions. Reactions from both faculty and students will also be presented, focusing on the chemistry courses involved. The implementation of an audio tour of the Undergraduate Library to be used with circulating Apple iPods will also be discussed. |
|
CINF 6
Randy Reichardt, Science & Technology Library, University of Alberta, 1-26 Cameron, Edmonton, AB T6G 2J8, Canada Weblogs, or blogs, are websites that are regularly and frequently updated with new entries, links, documents, multimedia, graphics, and pictures. Blogs first appeared as online diaries or journals in the late 1990s; other applications evolved as the usability and robustness of blog software improved, making it easier for anyone to create and use a blog. In science and technology, discipline- and subject-specific blogs began to appear, attracting the interest of students, academics, and practitioners working in areas such as chemistry and engineering. In 2003, the engineering librarian introduced blogs into chemical, material and mechanical design engineering classes. Students working in groups of four on capstone design projects in these subjects were given the option of using blogs as a project management tool. Instructions on blog creation and utility were written and distributed to interested student groups, who worked with the engineering librarian to create, upload and maintain a blog for each project. Certain subject-specific databases such as Compendex (Engineering Index) now provide blogging and RSS functionality, features which were available to the students as well. The use and application of blogs will be discussed. Included will be a brief review of examples of blogs in chemistry as well as the author's professional blog, covering issues of interest to science and technology librarians. Use of blogs in chemical engineering design classes will be covered, as well as newer applications, such as the “Blog This” and RSS features now available in the engineering database, Compendex. |
|
CINF 7
Martin A. Walker, Department of Chemistry, SUNY Potsdam, 44 Pierrepont Ave, Potsdam, NY 13676 Wikipedia is an open, collaborative encyclopedia based on the World Wide Web with articles written by enthusiastic volunteers. It is currently the 38th most popular website on the internet (and growing), with over 1% of all web users accessing the site on any given day. A Google search on many topics often gives a Wikipedia article (perhaps from a mirror site) as the main reference source. But is it reliable? Is it destroying our students' interest in using "authentic" peer-reviewed chemical information? Or is it a revolution, delivering a high level of information to the masses? This presentation will give an insider's description of the software and the community that is Wikipedia, and describe the associated strengths and weaknesses. It will show how chemistry-related pages are organized, and also provide some insights into likely future developments in Wikipedia. |
|
CINF 8
Jonathan L. Coffman, Drug Substance Development, Wyeth BioPharma, One Burtt Road, Andover, MA 01810 The Biochemical Technology Division of ACS has successfully programmed a series of web seminars. Our goal was to expand the number of people able to attend BIOT symposia, to provide an educational experience for students in the biochemical sciences, and to provide a forum for new topics to be discussed. Initial programming featured the top presentations from our most recent annual meeting. These web symposia were well received, and indicate the important role web symposia will play in the future of BIOT. Each web symposium typically had three speakers, each speaking for 20 minutes, allowing for 10 minutes of questions. Our audience has averaged 250 people, which was larger than the largest audience at our BIOT annual meeting. On-line surveys showed that 100% of respondents would return for another web conference. Industrial audience members paid fees to join, allowing academic members to attend with a full scholarship. The fee charged to each industrial site could have funded up to five scholarships. This fee structure will allow nearly unlimited programming in the future. We will discuss how BIOT set up web symposia: choosing a technology provider, choosing programming material, extending the programming to include new topics, and setting pricing. Since the cost of running web seminars is low, we anticipate that many for-profit symposia organizations will begin offering web seminars soon. ACS, its divisions, and its publications must use web seminars as a key strategy in supporting the chemical profession, advancing the chemical sciences, and communicating the value of chemistry and chemical engineering to the public. |
|
CINF 9 F. Bartow Culp, Mellon Library of Chemistry, Purdue University, 504 West State Street, West Lafayette, IN 47907-2058 In the Internet age, isn't the concept of a librarian outmoded? If easy and almost unlimited information access is available to anyone at the click of a mouse button, why should a chemist consider academic librarianship as a career? There are many reasons, including excellent job prospects, a high degree of career satisfaction, plus the chance to be a central player in the current redefinition of how science is being done. In this age of high-entropy information, the unique combination of abilities that we chemist/librarians bring to our jobs gives us not only the power to organize and access chemical information; it can also enhance the value of that information and improve the entire communication process itself. We will present examples of how chemist/librarians are integral participants in the advancement of both of their professions. |
|
CINF 10 Mary Talmadge-Grebenar, Information & Knowledge Integration, Bristol-Myers Squibb, Rt. 206 & Province Line Rd., PO Box 4000 J12-01, Princeton, NJ 08543 When every reagent bottle has the words carcinogen, mutagen, or teratogen on the label, it makes you rethink your career options. Moving from the medicinal chemistry lab to the world of chemical information was an easy choice. Chemistry and Library Science have many things in common and the skills gained in the laboratory can be translated to the information profession. This talk will cover why I chose this new career and the path I have followed since making that decision. |
|
CINF 11 Patrick Joseph O'Malley, School of Chemistry, The University of Manchester, North Campus, Sackville Street, Manchester, M60 1QD, United Kingdom To meet the demands of training people in cheminformatics skills we developed a postgraduate masters degree course in Cheminformatics at The University of Manchester. This was established to fill a perceived need for scientists equipped with the necessary skills in chemical information. Traditional chemistry undergraduate courses did not teach such skills and the course provides an opportunity for fresh undergraduates to learn these skills as well as providing an opportunity for more experienced personnel to retrain in these new skills. This talk will outline our experience in designing such a course and address the problems we encountered and the solutions we found. Career destinations of graduates will be examined, and current trends in the training of, and need for, chemical information specialists in the UK will be presented. |
|
CINF 12 Frederick W Stoss, Science and Engineering Library, University at Buffalo, 228-B Capen Hall, Buffalo, NY 14260-1672 The post-Genomic Era began with the completion of the Human Genome in 2003. This achievement was made under the auspices of the Human Genome Project (HGP), a 13-year project coordinated by the U.S. Department of Energy and the National Institutes of Health. The goals of the HGP included identifying the 20,000 to 25,000 genes encoded in human DNA, determining the sequences of the chemical base pairs (~3 billion) making up human DNA, storing this genomic data in specialized databases, and developing and enhancing the databases and other tools for accessing and analyzing this data. In more recent years we have witnessed the emergence and ongoing evolution of intertwining disciplines in the biological, life, chemical, and computational sciences forming a New Biology of genomics, bioinformatics, proteomics, chemical biology, systems biology and other subdisciplines spinning off the branches of molecular and structural biology and genetics. Keeping abreast of the science and technology behind the New Biology is a daunting task.
Simultaneously keeping abreast of the new and ever-changing developments in the data and information storage and delivery systems for the New Biology is an example of “information synergism,” raising the question, “How can science librarians and information specialists provide reference services and library instruction for the new and rapidly emerging fields of research and inquiry of the New Biology?” This presentation will discuss a variety of continuing education initiatives, including the full suite of continuing education services and tutorials for librarians, information specialists, and researchers available from the National Center for Biotechnology Information (a program within the National Library of Medicine), library school bioinformatics, current awareness services, special journal issues, selected reference and book titles, and science education journals and periodicals. Recruiting science students into library, information, and data careers will be briefly discussed. |
|
CINF 13 Pamela J. Scott, Legal Division, Pfizer, Inc, Eastern Point Road, MS 8260-1611, Groton, CT 06340 Patents and patent information provide many career opportunities in today's marketplace. Career opportunities include government positions as patent agents. The academic and private sectors afford careers in patent law, education, and patent research. Finally, careers as independent consultants, which touch all market sectors, will be discussed. |
|
CINF 14 M. Scott Furness, Office of Generic Drugs, Food and Drug Administration, 7500 Standish Place, Rockville, MD 20855 Working as a chemist at the FDA is probably one of the least understood career paths available to chemists today. This talk will provide a general overview of the FDA drug approval process with an emphasis on the chemist's role in the scientific reviewing divisions as well as in the evaluation of current Good Manufacturing Practices (cGMPs). As Regulatory Review Scientists, chemists evaluate the chemical sections of drug applications. This evaluation includes an assessment of the adequacy of the methods, facilities, and controls used for the manufacture of drugs. As Consumer Safety Officers (commonly referred to as investigators within FDA), chemists audit, review, and evaluate the manufacturing processes of products that the FDA regulates by inspecting manufacturing facilities within the United States and abroad. They also work as members of multi-disciplinary teams to assure efficient enforcement of the Food, Drug, and Cosmetic (FD&C) Act. |
|
CINF 15 Nancy McGuire, Public Affairs, Office of Naval Research, 875 N. Randolph St., Arlington, VA 22203 Your list of journal publications is as long as your arm, but you've always wondered if you could write science stories for a magazine or newspaper. What skills do you already have? What will you need to learn? Will you write as a sideline or make it a full-time career? A lab scientist turned full-time science communicator tells of her mid-life career transition and shares a little of what she learned along the way, with a brief drive-by tour of other career options available to scientists who work with words. |
|
CINF 16 J. Phillip Bowen, Center for Drug Design, Department of Chemistry and Biochemistry, University of North Carolina at Greensboro, 400 New Science Building, PO Box 26170, Greensboro, NC 27402-6170 Computational chemistry is accepted today as a specialized field of chemical or biochemical research. Computational chemistry methods are used in both industrial and academic environments worldwide to gain detailed insights into chemical and biochemical problems at the molecular level. These methods are used in many different areas of chemistry, ranging from polymer research to pharmaceutical design. This presentation will discuss the many different career pathways available in computational chemistry, and the background and communication skills necessary to be successful. |
|
CINF 17 George M. Whitesides, Department of Chemistry and Chemical Biology, Harvard University, 12 Oxford Street, Cambridge, MA 02138 This talk will discuss one (probably of many) style(s) useful in writing a scientific paper: that is, the one we use in our research group. It has the advantage that it integrates doing research, writing about research, and managing research; it has the disadvantage that it is very labor intensive. |
|
CINF 18 Willis B. Wheeler, Associate Editor, Journal of Agricultural and Food Chemistry, 4938 Hampden Lane, Box 298, Bethesda, MD 20814 and Heijia L Wheeler, Journal of Agricultural and Food Chemistry.
The purpose of the review process is to evaluate a paper's scientific merit, originality, clarity of presentation and importance to the field. The goals of the review are to ensure that the journal publishes only noteworthy papers and to provide authors with guidance so they can improve their manuscripts. Reviewers are selected in a number of ways. Some journals request that authors suggest potential reviewers. Journals have databases of scientists that can be searched by areas of scientific interest. In addition, editors know potential reviewers from their own experience in the research field. In terms of ethical considerations, scientists are obligated to review papers. A scientist who does not feel qualified to review should make that known to the editor and destroy the manuscript. Reviewers should be objective judges of the paper and should be sensitive to potential conflicts of interest. Papers sent to reviewers must be treated as confidential. Reviewers should explain and support their evaluations. Reviewers are expected to evaluate the quality of the work, its appropriateness for the journal, the technical quality, the clarity of presentation, and any ethical issues. Reviews should give specific and substantive evaluation of the strengths and weaknesses of the manuscript. |
|
CINF 19 Leonard V. Interrante, Department of Chemistry and Chemical Biology, Rensselaer Polytechnic Institute, Troy, NY 12180 As Editor of a large ACS journal (Chemistry of Materials) and a long-time author/reviewer of scientific papers, the speaker will give his view on scientific publishing in 2006 and beyond. From this perspective, he will attempt to answer some of the questions raised in the outline for this symposium, such as "what do editors want (and not want) in an article?", "how are reviewers selected?", etc. In addition, the enormous growth in submitted papers that Chem. Mater. has experienced in recent years has brought with it a number of problems, including reviewer overload and an increased frequency of violations in the "Ethical Guidelines" established by the editors of the ACS journals [see L.V. Interrante and E. Reichmanis, C&EN, Vol 83(6), p. 4 (2005)]. A major portion of this talk will be devoted to a discussion of these problems, and what we, and other journal editors, are doing to confront them. |
|
CINF 20 Joseph E. Yurvati1, Terri K. Lewandowski2, and Anne C. O'Melia2. (1) Journal Publishing Operations, American Chemical Society, 2540 Olentangy River Rd, Columbus, OH 43210, (2) Journal Production and Manufacturing, American Chemical Society, 2540 Olentangy River Road, Columbus, OH 43054 This paper examines the steps an author manuscript undergoes after it has been accepted, en route to becoming a published manuscript both on the Web and in print. While technological changes have introduced considerable automation into this phase of the journal publishing process, the basic purpose remains: transform a scientist's research findings into a medium that ensures long-term accessibility to the interested scientific community. This examination will focus on the critical activities of a publishing operation keyed to ensuring fast and efficient high-quality journal products. |
|
CINF 21 Evelyn Jabri and Sarah Tegen. ACS Chemical Biology, American Chemical Society, 1155 16th St NW, Washington, DC 20036 The Web has revolutionized the way we retrieve information and use scientific journals. Formerly, we had shelves of journals with tables of contents to help us select papers, and indices helped us find information later. Today, electronic TOCs and RSS feeds are pushed to the reader; we use online search engines and databases to collect information on topics; we carry information with us on our PDAs and iPods. A sometimes overwhelming amount of content can come our way, often getting lost in the mess on our desktops. So, how can we effectively organize and manage all of it? And what can publishers do to help? Publishers are learning tricks from places like Amazon, Google, Yahoo, and Apple. This talk will detail some of the innovations ACS Chemical Biology is using to help you organize and digest the scientific literature. |
|
CINF 22 Jacqueline A. Erickson, Sr. Analytical Scientist, GlaxoSmithKline, 1500 Littleton Rd., Parsippany, NJ 07054 No abstract available |
|
CINF 23 Song Yu, Columbia University Libraries, Columbia University, 454 Chandler, 3010 Broadway, New York, NY 10027 How do academic librarians seek mentorship in their organizations, especially those who specialize in a field of science, like chemistry? Some libraries and professional organizations have formal or informal mentoring systems. However, one has to be creative and take initiative to learn and develop in a way that suits his/her own situation. |
|
CINF 24 Leah Solla, Physical Sciences Library, Cornell University, 293 Clark Hall, Ithaca, NY 14850 Professional associations such as the American Chemical Society provide a wide variety of important career benefits and opportunities. In the Division of Chemical Information members can avail themselves of state-of-the-art technical programs, resources for day-to-day work, and professional networking and educational opportunities. The best way to make the most of these opportunities is to get involved and work directly with other members. Active membership is encouraged and further enhanced through mentoring within the division organizational structure. The CINF Education Committee incorporates several approaches to mentoring into committee procedures to encourage participation of both experienced and new members and build on the knowledge and enthusiasm of all involved. |
|
CINF 25 Valerie A. Vaillancourt and Donna Kaye Wilson. Legal Division, Pfizer, Inc, Kalamazoo, MI 49007 There are many factors which may lead to a change in career path, and for many scientists a new career in information science is a very attractive alternative. But how does one effectively acquire the skills necessary for this job? External vendors offer in-depth training courses, but this is just not enough. The leadership team of GLIST (Global Legal Information Science Team) at Pfizer believes that mentoring is one way to provide support to our members entering a new job. This presentation will discuss the role of a mentoring program in the patent information science field during job transition, including the advantages and challenges of such a program. Both the mentor and trainee will share their perspectives and highlight their thoughts regarding the characteristics that contributed to their successful mentoring partnership. |
|
CINF 26 Johann Gasteiger1, Thomas Seidel1, Krisztina Boda1, Achim Herwig2, and Oliver Sacher2. (1) Computer-Chemie-Centrum, University of Erlangen-Nuremberg, Erlangen, 91052, Germany, (2) Molecular Networks GmbH, Erlangen, 91052, Germany De novo design systems usually generate large numbers of novel structures. It is therefore crucial to develop methods that allow one to select those structures that are easily synthesizable. Various criteria can be invoked to estimate the structural complexity of a compound and its synthetic proximity to available starting materials. Furthermore, data mining in reaction databases can point out strategic bonds where a molecule should be cut to obtain simpler fragments, such that the cuts simultaneously correspond to reactions with a broad scope and high yields. |
|
CINF 27 John L. Whitlow, Department of Electrical and Computer Engineering, NC State University, 2300 Avent Ferry Road, O2, Raleigh, NC 27606 and Yumin Li, Department of Chemistry, East Carolina University, 300 Science and Technology Building, Greenville, NC 27858. Cancer is the leading cause of death for persons under the age of 85. Elevated levels of S100B are associated with cancer. This research focused on interactions between S100B and the tumor suppressor protein, p53. S100B disrupts p53's protective function by inhibiting p53's C-terminal regulatory domain phosphorylation. This study designed compounds to block the effects of S100B on p53. Compounds that enhance p53's cellular function may provide potent anticancer therapies. Accelrys's Cerius2 software was used for de novo drug design. The three dimensional structure of S100B was analyzed to resolve its main interaction sites. Fragment molecules were screened against targets of interaction in the S100B active site. Top fragment molecules were used as scaffolds to design complete ligand molecules. Additionally, public and private molecular libraries were run through docking algorithms to locate existing molecules with high affinities for the S100B active site. ADME and toxicity properties were also investigated.
|
|
CINF 28 N. Sukumar1, Curt M Breneman1, Steven M. Cramer2, James A. Moore3, Kristin P. Bennett4, Mark J. Embrechts5, Min Li1, Jia Liu2, and Long Han6. (1) Department of Chemistry and Center for Biotechnology and Interdisciplinary Studies, Rensselaer Polytechnic Institute, 110 8th St., Troy, NY 12180-3590, (2) Department of Chemical and Biological Engineering, Rensselaer Polytechnic Institute, Troy, NY 12180, (3) Department of Chemistry, Rensselaer Polytechnic Institute, 110 8th St., Troy, NY 12180, (4) Department of Mathematics, Rensselaer Polytechnic Institute, Amos Eaton Building, 110 8th St, Troy, NY 12180, (5) Decision Sciences and Engineering Systems, Rensselaer Polytechnic Institute, 110 8th St, Troy, NY 12180, (6) Decision Science and Engineering Systems, RPI, 110 8th St, Troy, NY 12180 Low-molecular-weight displacers employed in ion-exchange displacement chromatography have shown great potential for the purification of proteins from complex mixtures. One of their advantages is the ability to carry out selective displacement chromatography, in which target proteins can be eluted separately. Identifying efficient displacers, however, is a major challenge for protein displacement chromatography, as it depends not only on the protein mixtures, but also on the chemistry of the stationary phase and the conditions of the mobile phase. The choice of displacers is still mostly driven by trial and error and is largely dependent on the domain knowledge of an expert. In this work we investigate an efficient procedure to quickly predict novel selective displacers: a small set of known selective displacers is used to train machine learning models (SVM and decision trees) that are then used to identify novel selective displacers from available commercial chemical catalogs and to progressively enrich the models. |
|
CINF 29 A. Peter Johnson, Krisztina Boda, Shane Weaver, Aniko Valko, and Vilmos Valko. School of Chemistry, University of Leeds, Leeds, LS2 9JT, United Kingdom An efficient de novo design system can generate large numbers of hypothetical structures which have been tailored to bind to a specific receptor. An advantage of the de novo process is that many of these structures will have novel structural motifs. However, a possible disadvantage is that some of the structures might be relatively difficult to synthesise. A number of different solutions to the synthetic accessibility problem have been developed for use with the SPROUT system for de novo design: a) CAESA - a separate system for assessment of synthetic feasibility; b) SynSPROUT - a de novo system which incorporates synthetic feasibility into the de novo construction process; c) complexity analysis, which matches the designed structures against substitution patterns of known drug-like molecules; d) SPROUT LeadOpt, which optimises structures to improve binding affinity by application of known chemistry using available starting materials. The relative merits of these different approaches will be discussed. |
|
CINF 30 Jörg Degen and Matthias Rarey. Center for Bioinformatics (ZBH), University of Hamburg, Bundesstrasse 43, 20146 Hamburg, Germany We present a new molecular design program, called FlexNovo, for structure-based searching in large fragment spaces following a sequential growth strategy. The fragment spaces used consist of several thousands of chemical fragments and a corresponding set of rules, which primarily specifies how the fragments can be connected with each other. FlexNovo is based on the FlexX molecular docking software and therefore uses the same chemical models, scoring functions, docking algorithms and pharmacophore models. In addition, several placement geometry, chemical property (drug-likeness) and diversity filter criteria are directly integrated in the build-up process. FlexNovo has been used to design potential inhibitors for four targets of pharmaceutical interest (DHFR, CDK2, COX-2 and Estrogen receptor). The compounds obtained show that FlexNovo is able to generate a diverse set of reasonable molecules with drug-like properties. By comparing these to known inhibitors, similarities with respect to their structures and binding modes are frequently observed. |
|
CINF 31 Robert D. Chirico1, Michael Frenkel1, Vladimir V. Diky1, Qian Dong1, Kenneth N Marsh2, John H. Dymond3, William A. Wakeham4, Stephen E. Stein5, Erich Koenigsberger6, and Anthony R. H. Goodwin7. (1) Physical and Chemical Properties Division, National Institute of Standards and Technology, 325 Broadway, Boulder, CO 80305-3328, (2) Department of Chemical and Process Engineering, University of Canterbury, Private Bag 4800, Christchurch, New Zealand, (3) Chemistry Department, University of Glasgow, Glasgow, G12 8QQ, United Kingdom, (4) School of Engineering Sciences, University of Southampton, Southampton, SO17 1BJ, United Kingdom, (5) Physical and Chemical Properties Division, NIST, Gaithersburg, MD 20899, (6) Division of Science and Engineering, School of Mathematical and Physical Sciences, Murdoch University, Murdoch, WA 6150, Australia, (7) Schlumberger Technology Corporation, 125 Industrial Blvd., Sugar Land, TX 77478 ThermoML is an XML-based emerging IUPAC standard for storage and exchange of experimental, predicted, and critically-evaluated thermophysical and thermochemical property data. The basic principles, scope, and description of the structural elements of ThermoML will be discussed. ThermoML covers essentially all thermodynamic and transport property data for pure compounds, mixtures, and chemical reactions. Representations of uncertainties in ThermoML conform to the Guide to the Expression of Uncertainty in Measurement (GUM). Representation of fitted equations with ThermoML will also be described. The role of ThermoML in global data communication processes will be discussed with emphasis on a collaborative project with major journals (the Journal of Chemical and Engineering Data, The Journal of Chemical Thermodynamics, Fluid Phase Equilibria, Thermochimica Acta, and the International Journal of Thermophysics) for distribution of property data with benefit to authors, journal publishers, and data users. 
The project model described is readily applicable to other disciplines and data types. |
|
CINF 32 Vladimir Diky, Thermodynamics Research Center (TRC), National Institute of Standards and Technology (NIST), Mailstop 838.01, 325 Broadway, Boulder, CO 80305 As ThermoML is becoming a standard for thermophysical and thermochemical data exchange, the need for publicly available tools for creating and interpreting ThermoML files is becoming obvious. Guided Data Capture (GDC) is the natural choice as a generating tool. GDC was developed as a data entry tool for public use by a wide range of users. GDC does not require any specialized database knowledge, is compact, and is compatible with common PC systems. The program provides a sequence of screens guiding a user through the entire data entry process. The form design is based on major thermodynamic principles, which assures a complete and unambiguous definition of each system and property and allows preliminary validation of the information. GDC maintains internal data formats for essentially all ThermoML data, so the only feature needed to make it a ThermoML-generating tool was data export in an XML format. XML export has been implemented at the text level because writing XML output is easy at a low level when the data structures already exist in the program, and this solution eliminated the dependence on any XML parsing tool and the need to include it in the redistribution kit. The ThermoML-generating version of GDC has been used successfully at the Thermodynamics Research Center (NIST) and is planned to be freely available. |
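The text-level XML export mentioned in the CINF 32 abstract can be illustrated with a minimal sketch. It assumes the data already sit in simple in-memory records and writes the markup as plain strings, with no XML library beyond the standard escaping helpers. The element and attribute names (`DataSet`, `Record`, `Property`, `Value`, `compound`) and the function `export_records` are hypothetical placeholders chosen for illustration; they are not taken from the ThermoML schema or from GDC.

```python
from xml.sax.saxutils import escape, quoteattr


def export_records(records):
    """Serialize a list of dict records to an XML string, line by line.

    Element names are hypothetical and are NOT ThermoML element names.
    """
    lines = ['<?xml version="1.0" encoding="UTF-8"?>', "<DataSet>"]
    for rec in records:
        # quoteattr() adds the surrounding quotes and escapes &, <, > and quotes.
        lines.append(f"  <Record compound={quoteattr(rec['compound'])}>")
        # escape() protects character data inside an element.
        lines.append(f"    <Property>{escape(rec['property'])}</Property>")
        lines.append(f"    <Value>{rec['value']!r}</Value>")
        lines.append("  </Record>")
    lines.append("</DataSet>")
    return "\n".join(lines)


example = export_records([
    {"compound": "water", "property": "density", "value": 997.05},
    {"compound": "H2O & NaCl", "property": "viscosity", "value": 0.001},
])
```

Writing markup as text keeps the exporter free of parser dependencies, which is the trade-off the abstract describes; the cost is that escaping and well-formedness have to be handled by hand, as the two helper calls do here.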
|
CINF 33 Kenneth N Marsh, Department of Chemical and Process Engineering, University of Canterbury, Private Bag 4800, Christchurch, New Zealand The progress of science and technology is based primarily on measured numeric data with the scientific journals publishing the values as numbers within the text and as tables and/or figures. Those journals are now published electronically but researchers and others needing that data have to retype the data to their required format for further analysis. Till recently only a few systems have been devised to capture numerical data at its source (the author), e.g. genome, protein and crystallographic data. Why? This was because there was no financial gain to the publisher, there was a lack of a standard format, and there was no central distribution site. A successful thermophysical property data gathering system requires: an agreed upon standard format, a mechanism and an incentive for the author to submit the data, and a body to accept, verify and distribute the data freely to the thermodynamic community. ThermoML with TRC/NIST backing provides a solution for thermophysical property data. The role of the Journal of Chemical and Engineering Data in the development of the ThermoML standard will be outlined. |
|
CINF 34 Michiel S. Thijssen, Chemistry & Earth and Environmental Sciences, Elsevier B.V, Radarweg 29, Amsterdam, NL-1043 NX, Netherlands A linking agreement between the Thermodynamics Research Center at NIST and Elsevier since early 2004 provides readers of The Journal of Chemical Thermodynamics (JCT) with a ThermoML link next to articles on ScienceDirect.com. This link connects to the respective ThermoML database record at TRC, where thermophysical and -chemical data related to the article are stored for direct and free downloading into laboratory environments. A reciprocal link connects the ThermoML record to the journal content. From 2005 onwards, authors of Fluid Phase Equilibria and Thermochimica Acta also deposit their data. The NIST-TRC & Elsevier collaboration will be discussed in detail. Statistics show its popularity, as presently 75% of JCT authors submit data. The data is enhanced in 10-20% of the cases by the data-capture process, and subsequently updated before publication in the journal. The collaboration contributes positively to our joint efforts to serve the authors, readers and users of thermal data. |
|
CINF 35 Chris D. Muzny, Physical and Chemical Properties Division, National Institute of Standards and Technology, 325 Broadway, Boulder, CO 80305-3328 ThermoData Engine (TDE) is a recently released database and software product produced by the Thermodynamics Research Center at the National Institute of Standards and Technology in Boulder, Colorado. TDE is a dynamic data evaluation tool for thermodynamic properties that relies on SOURCE, a comprehensive experimental archival data system that includes rigorous quality evaluation. TDE is useful for any application that requires thermodynamic property information, but it is especially well suited to chemical engineering applications and process simulations. Because of the need to communicate results of data evaluations to other chemical engineering software applications, TDE implements ThermoML as a standardized data communication method. The use of ThermoML for both output and input of data in TDE will be described and examples of the usefulness of this method will be given. |
|
CINF 36 Michael Frenklach1, Andrew Packard1, Zoran M. Djurisic1, David M. Golden2, Craig T. Bowman2, William H. Green Jr.3, Gregory J. McRae3, Thomas C. Allison4, Gregory J. Rosasco5, and Michael J. Pilling6. (1) Department of Mechanical Engineering, University of California at Berkeley, Berkeley, CA 94720-1740, (2) Department of Mechanical Engineering, Stanford University, Stanford, CA 94305, (3) Department of Chemical Engineering, Massachusetts Institute of Technology, 77 Massachusetts Ave, Bldg. 66, Room 270, Cambridge, MA 02139, (4) Computational Chemistry Group, National Institute of Standards and Technology, 100 Bureau Drive, Stop 8381, Gaithersburg, MD 20899-8381, (5) Physical and Chemical Properties Division, National Institute of Standards and Technology, 100 Bureau Drive, Mail Stop 8380, Physics Building (221) Rm. A107, Gaithersburg, MD 20899-8380, (6) School of Chemistry, University of Leeds, Woodhouse Lane, LS2 9JT Leeds, United Kingdom Process Informatics is a data-centric approach to developing predictive models for complex chemical reaction systems (http://primekinetics.org). It deals with all aspects of integration of pertinent data of complex systems (industrial processes and natural phenomena) whose complexity originates from chemical reaction networks. The primary goal of process informatics is information gathering, validation, and transformation into a useable form. The latter includes development of predictive (numerical/computer) models with quantified degrees of reliability. The Process Informatics infrastructure has two principal components: a Data Depository and a collection of Tools. The Depository is designed to represent the most currently complete set of knowledge available in a given field. 
The Tools built so far are of two general kinds: those enabling the collection, transfer, organization, display, curation, and mining of the data, and those enabling processing and analysis of the data along with assembly of the data into models. The handling of thermodynamics will utilize ThermoML. |
|
CINF 37 Wei Chen1, Chia en Chang2, and Michael K. Gilson1. (1) Center for Advanced Research in Biotechnology, University of Maryland Biotechnology Institute, 9600 Gudelsky Drive, Rockville, MD 20850, (2) Department of Chemistry, University of Maryland, College Park, MD 20742 The present study uses an accurate and theoretically well-founded method of computing binding affinities as the basis for the design of novel receptors targeting the biologically important peptide RGD. This method is found to yield excellent agreement with experimental affinities for a synthetic RGD receptor; and four new receptors constructed in silico by a fragment-based approach are analyzed here. One of the new receptors is predicted to bind as tightly as the existing receptor, despite its lower molecular weight. One is found to provide affinity in the same range as expected for proteins for ligands with the size of RGD. Further analysis of these systems yields insights into the maximization of affinity in the face of losses in configurational entropy and solvation. The present study indicates that more efficient and tighter-binding receptors for RGD can be made, and represents a significant step toward the broader goal of targeted receptor design. |
|
CINF 38 Richard D. Cramer1, Farhad Soltanshahi2, Robert Jilek3, and Brian Campbell3. (1) Chief Scientific Officer, Tripos, Inc, 1699 South Hanley Road, St. Louis, MO 63144, (2) Research, Tripos, Inc, 1699 South Hanley Road, St. Louis, MO 63144, (3) Tripos Inc, 1699 South Hanley Road, St. Louis, MO 63144 The limited novelty and feature content of commercially offered reactants and the very rapid similarity/QSAR-based ligand searching capability provided by topomers have prompted the creation of "allchem", a database of 10E7 mutually reactive synthons synthesizable in a few simple steps from commercially available reagents. Its contents are proving especially useful as novel scaffolds within lead discovery libraries and as candidate side chains within lead optimization programs. Virtual library construction via topomer-based searching of such databases seems very attractive to working medicinal chemists. |
|
CINF 39 Jonathan M Goodman and Ingrid M Socorro. Unilever Centre for Molecular Science Informatics, Cambridge University, Department of Chemistry, Lensfield Road, Cambridge, CB2 1EW, United Kingdom Good drug candidates must be accessible through reasonable synthetic routes, and must not be too susceptible to degradation reactions that would alter or remove their biological activity. The ROBIA (Reaction Outcome By Informatics Analysis) program analyses organic transformations using detailed conformation analysis and molecular modeling approaches in order to generate and to evaluate likely reaction pathways. This can be used both to assess the likely stability of candidate structures and also to examine synthetic pathways towards these molecules. |
|
CINF 40 Qian Dong, Physical and Chemical Properties Division, National Institute of Standards and Technology, 325 Broadway, Boulder, CO 80305-3328 The IUPAC Ionic Liquids Database, ILThermo, was released to the public via the Internet in December 2005 to meet the urgent need for critical data in academia and industry. ILThermo was constructed on the basis of NIST SOURCE, an extensive repository system of over 100 thermodynamic, thermochemical, and transport properties for pure compounds and mixtures extracted from the world's scientific literature. ILThermo is a prototype for generating special-retrieval-purpose databases from SOURCE for different applications. First, ionic liquids data are captured and stored daily through the ThermoML-based data capture mechanism (GDC) for SOURCE; second, this ionic liquids subset is extracted, reorganized, and populated into ILThermo periodically; and third, an updated ILThermo is exported from an internal server and imported to a NIST external server. ILThermo presents information via a high-density screen, which enables users to easily retrieve comprehensive ionic liquids data by navigating through a series of tables on one web page. |
|
CINF 41 Martin Schmidt, Software development, FIZ Chemie Berlin, Franklinstr. 11, Berlin, 10587, Germany
The database Infotherm, currently available at www.chemistry.de/infotherm/, comprises more than 170,000 tables of PVT-properties, phase equilibria, transport and surface properties, caloric properties, acoustic and optical properties of 26,000 mixtures and 7,000 pure compounds taken from journals, data collections, manuals and measurement reports some of which exclusive to Infotherm. This database contains search functions in order to combine about 150 properties, conditions and types of equilibria with definable value ranges, substance names, formulas and CAS registry numbers by Boolean operators. Infotherm was relaunched in November 2005 with a download option in the ThermoML-format, an XML-based IUPAC standard for experimental thermodynamic property data storage and exchange. Infotherm is a native XML-database, which is excellently adapted for the representation of a ThermoML-scheme and fast shared access by multiple users. The internal concept of the database is introduced and the import/export options will be illustrated. An insight into those parts of ThermoML, which are mainly used for the Infotherm application will also be provided and the quality of the data will be discussed. Finally, the roadmap for the further developments will be presented. |
|
CINF 42 Marco Satyro, Virtual Materials Group, Inc, 657 Hawkside Mews NW, Calgary, AB T3G 3S1, Canada Process simulators are used by engineers and scientists to solve the material and energy balance equations that represent equipment found in processing plants. The most fundamental step in creating the quality thermodynamic models used to solve these balance equations is the proper characterization of pure component and mixture data. Therefore, a standard communication interface between physical property data providers and physical property consumers such as simulators is a significant step towards rational use of resources, minimizing translation errors and maximizing the speed at which new data can be entered into process simulators. In this presentation we will show how ThermoML is used to facilitate the work process when integrated with the VMGSim process simulator and the VMGThermo physical property calculation kernel. |
|
CINF 43 Andrew I. Johns, Oil, Gas & Chemicals Group, TUV NEL Ltd, Scottish Enterprise Technology Park, East Kilbride, Glasgow, G75 0QU, United Kingdom and Alan C. Scott, Oil, Gas & Chemicals Group, TUV NEL Ltd, Scottish Enterprise Technology Park, East Kilbride, Glasgow, G75 0QU, United Kingdom.
This paper deals with the use of the ThermoML standard by the Physical Property Data Service software suite as a tool for the import and export of thermophysical property data. An outline of the approach taken to implement the standard will be given together with some examples of its use. |
|
CINF 44 Huijun Wang and David Wild. School of Informatics, Indiana University, 1105 N. Union St., #112, Bloomington, IN 47408 Alzheimer's disease is a progressive, irreversible brain disorder with no known cause or cure. More than 4.5 million Americans are believed to have Alzheimer's disease and by 2050, the number could increase to 13.2 million. Brain imaging based on functional MRI (fMRI) is one of the powerful tools for characterizing age-related changes in functional anatomy. Completing such explorations may yield insights into the origins of age-associated cognitive change and perhaps even provide functional–anatomic markers that predict cognitive decline associated with Alzheimer's disease. Our integrated Alzheimer's Disease information system is designed to permit data mining across a wide variety of chemical, biological, genomic and other databases using the IO-informatics Sentient package, which is designed to create applications by “pointing to” related but distributed data and securely and efficiently integrating relevant meta-data and in some cases image subsets into an object-oriented analysis and query environment. The system has been developed in conjunction with several other institutions, and is of particular use in identifying biomarkers that cross traditional discipline boundaries. We outline several ways the system can be used to enhance Alzheimer's disease research, and discuss the implications of the system for future development of chemical and bioinformatics systems. |
|
CINF 45 Xiao Dong and David Wild. School of Informatics, Indiana University, Bloomington, IN 47408 We are developing a system of managing and mining chemoinformatics tools and data that uses web services and intelligent agents. Using this system, scientists are able to make high level requests to intelligent agents, which then use other agents and web services to carry out the request, employing a variety of computational tools and databases. In this poster we describe how use-cases can be implemented as workflows of web services wrapped around chemoinformatics tools and databases, thus enabling previously complex queries and requests to be carried out simply. The potential impact of systems such as this on the use of early stage drug discovery information will also be addressed. |
|
CINF 46 Noel M. O'Boyle1, Gemma L. Holliday2, Daniel E. Almonacid1, Peter Murray-Rust1, John B. O. Mitchell1, and Janet M Thornton2. (1) Department of Chemistry, Unilever Centre for Molecular Science Informatics, University of Cambridge, Lensfield Road, Cambridge, CB2 1EW, United Kingdom, (2) EMBL-EBI, Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SD, United Kingdom There is a clear need to develop informative enzyme classification schemes complementary to the EC system, which uses a hierarchical classification to describe enzymes by their overall reactions. For example, in the EC system all beta-lactamases are classified as 3.5.2.6. However, although the overall reactions are the same, the four different types of beta-lactamase use quite different mechanisms. Conversely, enzymes with very similar mechanisms may be widely separated in the EC system, as exemplified by the eukaryotic and prokaryotic phosphoinositide-specific phospholipases C. We have developed MACiE (Mechanism, Annotation and Classification in Enzymes), a representative database of enzyme reaction mechanisms. Each reaction step is fully described, both graphically and using annotation. MACiE will aid the development of a new enzyme classification system, based upon reaction mechanisms. Here we present progress in the development of a key component, a method to measure similarity between enzyme reaction mechanisms. |
|
CINF 47 Barun Bhhatarai and Rajni Garg. Department of Chemistry, Clarkson University, 8 Clarkson Avenue, Potsdam, NY 13699-5812 QSAR is an important tool for ‘chemical information retrieval'. It helps in structure modification of ligand to yield a potent inhibitor. However, the successful outcome of future drug-therapy is determined by the drug-combination-therapy and retained susceptibility to mutant variants. Different ligands have different affinity to wild-type and mutant protein. It requires a clear understanding of the mutation pattern to explain them quantitatively. We use QSAR as a cheminformatics tool to understand the difference between the wild-type and mutant variety of HIV-protease. The comparison between the important parameters observed in QSAR models helps in finding ligand-receptor binding pattern and provides information about different types of receptor. QSAR models based on structural modification of Indinavir molecule analyzing different mutant variants such as K60C, V18C, NL4-3, 4X and Q60C were developed. Quantitative assessment of the similarities and difference between the wild-type and mutant receptor pocket in conformation and in affinity will be presented. |
|
CINF 48 Lorant Bodis1, Alfred Ross2, and Ernö Pretsch1. (1) Department of Chemistry and Applied Biosciences, ETH Zurich, ETH Hönggerberg, HCI E 312, Zurich, CH-8093, Switzerland, (2) Pharmaceuticals Division, F. Hoffmann-La Roche Ltd, Grenzacherstr, Basel, CH-4070, Switzerland Most available vector comparison methods such as the correlation coefficient and Tanimoto coefficient are only able to find point-wise similarity. Similarity criteria for spectra comparison should include information about the neighborhood of the corresponding items in order to identify shifted signals as well. So far, only few such methods have been described. A recent method is based on a locally weighted cross-correlation function being normalized with geometric mean of the individual autocorrelation functions. A much better performance has been achieved with a novel similarity criterion. The two vectors to be compared are divided into i bins (i = 1, N) and for each division the integrals in each bin are calculated. Similarity indices are derived from the comparison of the corresponding integrals. The mean of the normalized similarity indices serves as the similarity criterion. The presented similarity criteria are characterized with contingency tables and histograms obtained from tests made on simple artificial 1H NMR spectra having different degrees of similarity. Furthermore, they are applied for comparing measured and estimated spectra of a complex real-life database. Although, so far, it has only been tested with one-dimensional 1H NMR spectra, due to the generality of the approach, the application of the novel procedure with spectra of two or more dimensions including image analysis is straightforward. |
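The contrast drawn in the CINF 48 abstract between point-wise measures and the bin-integral criterion can be made concrete with a small sketch. The abstract does not give the formula for the per-division similarity index, so the normalized absolute-difference index used below is an assumption, as are the function names, the equal-width binning, and the requirement that intensities be non-negative. The Tanimoto coefficient shown is the standard continuous form a·b / (a·a + b·b − a·b).

```python
def tanimoto(a, b):
    """Continuous Tanimoto coefficient: a point-wise measure."""
    ab = sum(x * y for x, y in zip(a, b))
    denom = sum(x * x for x in a) + sum(y * y for y in b) - ab
    return 1.0 if denom == 0 else ab / denom


def bin_integrals(values, n_bins):
    """Sum the values in each of n_bins contiguous, nearly equal-width bins."""
    size = len(values)
    edges = [(k * size) // n_bins for k in range(n_bins + 1)]
    return [sum(values[edges[k]:edges[k + 1]]) for k in range(n_bins)]


def binned_similarity(a, b, max_bins):
    """Mean of per-division similarity indices for 1..max_bins bins.

    The index 1 - sum|Ia - Ib| / (sum Ia + sum Ib) is an assumed form,
    valid for non-negative intensities; it is not taken from the abstract.
    """
    if len(a) != len(b):
        raise ValueError("spectra must have the same number of points")
    indices = []
    for n_bins in range(1, max_bins + 1):
        ia = bin_integrals(a, n_bins)
        ib = bin_integrals(b, n_bins)
        total = sum(ia) + sum(ib)
        diff = sum(abs(x - y) for x, y in zip(ia, ib))
        indices.append(1.0 if total == 0 else 1.0 - diff / total)
    return sum(indices) / len(indices)


# One peak, the same peak shifted by one point, and shifted by five points.
reference = [0, 0, 1, 0, 0, 0, 0, 0]
near_shift = [0, 0, 0, 1, 0, 0, 0, 0]
far_shift = [0, 0, 0, 0, 0, 0, 0, 1]

near_score = binned_similarity(reference, near_shift, 8)
far_score = binned_similarity(reference, far_shift, 8)
```

In this toy case the point-wise Tanimoto coefficient is zero for both shifted spectra, while the binned score is higher for the small shift than for the large one, which is the neighborhood sensitivity the abstract argues for.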
|
CINF 49 David C. Thompson, Iain J. McFadyen, Natasja Brooijmans, and Diane Joseph-McCarthy. Department of Structural Biology & Computational Chemistry, Wyeth Research, Chemical & Screening Sciences, 200 Cambridge Park Drive, Cambridge, MA 02140 In this present work our pharmacophore-based molecular docking approach, PhDock, is further validated against two well-known test sets: the CCDC/Astex set and the published Vertex set. Each element within our virtual screening protocol will be critically assessed as we examine potential correlations between the generation of site points through MCSS2SPTS and the position of true “hot spots” within the receptor, relationships between pharmacophores of the best scoring hits, and the importance of re-scoring hits with a physically realistic scoring function. The concepts discussed and tested here are of importance to the development of accurate and efficient approaches to structure-based drug design and are generally applicable to any docking scheme. |
|
CINF 50 Nobuaki Koga1, Masahiko Hada2, Kenro Hashimoto2, Haruo Hosoya3, Toshio Matsushita4, Hidenori Matsuzawa5, Umpei Nagashima6, Shinkoh Nanbu7, Keiko Takano3, and Shinichi Yamabe8. (1) School of Informatics and Sciences, Nagoya University, Furo-cho,Chikusa-ku, Nagoya, Japan, (2) Department of Chemistry, Tokyo Metropolitan University, 1-1 Minami-Ohsawa, Hachioji, Tokyo, Japan, (3) Department of Chemistry, Ochanomizu University, 1-1-1 Otsuka, Bunkyo-ku, Tokyo, Japan, (4) Department of Chemistry, Osaka City University, 3-3-138 Sugimoto, Sumiyoshi-ku, Osaka, Japan, (5) Department of Chemistry, Chiba Institute of Technology, 2-17-1 Tsudanuma, Narashino, Chiba, Japan, (6) Research Institute for Computational Sciences, National Institute of Advanced Industrial Science and Technology, and CREST-JST, 1-1-1 Umezono, Tsukuba, Ibaraki, Japan, (7) Computing and Communications Center, Kyusyu University, Hakozaki 6-10-1, Higashi-ku, Fukuoka, Japan, (8) Department of Chemistry, Nara University of Education, Takabatake, Nara, Japan
Quantum Chemistry Literature Data Base (QCLDB) is a database of papers published after 1978 that treat only ab initio calculations of atomic and molecular electronic structure. The papers are collected from about thirty core journals, surveyed, and given tags revealing their content and essence by a group of young Japanese quantum chemists. Theoretical works that report no computational results are also collected when they are judged to have significant relevance to ab initio calculations, while no semi-empirical calculations are included. QCLDB is finally edited and copyrighted by the Quantum Chemistry Data Base Group (QCDBG). We announce the opening on April 1, 2004 of the new web version, QCLDB II (http://qcldb2.ims.ac.jp/), which offers registered users free use of the updated database including all the previous data. The new QCLDB II will support your research activities more efficiently than before. |
|
CINF 51 Sivanesan Dakshanamurthy, Oncology, Lombardi Cancer Center, Georgetown University, Reservoir Road, E401, NRB, Washington, DC 20057 Natural killer (NK) cells constitute an important part of the innate immune system. Human killer-cell immunoglobulin-like receptors (KIR) are expressed on the surface of NK cells and modulate NK cell mediated cytotoxicity of tumor cells. These receptors deliver activating or inhibitory signals that depend, in part, on binding to HLA ligands. There are many different polymorphic KIR2DL receptors, and they exhibit several unique features. Previous studies indicated that HLA recognition by KIR depends on charge complementarity between them. Usually, KIR provides acidic residues and HLA contributes basic residues to the interface in addition to the hydrogen bond interactions. In the present work, several different mutations of the KIR2DL and HLA interface residues were introduced, and the stability and energetics of the resulting KIR2DL/HLA complexes were studied by molecular mechanics and dynamics simulations. It has been found that the salt-bridge interactions between complementary residues are important for recognition between the KIR2DL receptor and HLA. |
|
CINF 52 Zhong Li and Kayvan Najarian. Department of Computer Science, The University of North Carolina at Charlotte, 9201 University Blvd., Charlotte, NC 28223 Molecular similarity calculations are important for drug design. This paper presents a novel molecular similarity calculation method based on a spanning tree matching algorithm and the physical chemistry parameters of atoms and bonds. The similarities among 15 FDA-approved anti-HIV drugs were calculated, and clusters were formed according to these similarities. |
|
CINF 53 Peter Murray-Rust, Department of Chemistry, Unilever Centre for Molecular Science Informatics, University of Cambridge, Lensfield Road, CB2 1EW Cambridge, United Kingdom, Henry S. Rzepa, Department of Chemistry, Imperial College of Science, Technology and Medicine, Exhibition Road, South Kensington, London SW7 2AY, United Kingdom, Joe A Townsend, Department of Chemistry, Unilever Centre for Molecular Science Informatics, University of Cambridge, Lensfield Road, Cambridge, CB2 1EW, United Kingdom, and Dan Wilson, Mineralogisches Institut, Johann Wolfgang Goethe-Universität, Senckenberganlage 30, Frankfurt am Main, Germany. High-throughput computation of the structures and properties of molecules and materials is now supported by a generic infrastructure based on Chemical Markup Language (CML). By converting the input to and output from a code (such as CASTEP, GAMESS, DL-POLY, SIESTA, etc.) it is possible to chain together several operations which can process jobs automatically. This is supported by flexible dictionaries (XML) and ontologies (RDF) to represent computational processes, physical properties, strategies, parameters and algorithms. This can support coarse-grained parallelism, data mining and analysis. XMLisation is achieved either through the addition of CML libraries to the code or through transduction of legacy data (stylesheets and parsers). An important benefit is the increased detection of program errors and control of input and output quality. |
|
CINF 54 Maren Fiege, Waters GmbH, Europaallee 27-29, 50226 Frechen, Germany Analytical instruments today are producing data in a multitude of different formats. This makes the interchange of data between systems difficult. To deal with this problem, standard formats like ANDI and JCAMP have been created in the past. Based on the experience gained with these, ASTM has started an effort to create a highly flexible yet validateable standard format based on XML that can accommodate any kind of analytical data. This presentation will give an introduction into the concepts behind AnIML, and will show how AnIML can be customized to suit special needs without breaking the standard. |
|
CINF 55 Peter T. Corbett, Unilever centre for Molecular Sciences Informatics, Department of Chemistry, Lensfield Road, Cambridge, United Kingdom, Peter Murray-Rust, Department of Chemistry, Unilever Centre for Molecular Science Informatics, University of Cambridge, Lensfield Road, CB2 1EW Cambridge, United Kingdom, Nick E Day, Department of Chemistry, Unilever Centre for Molecular Sciences Informatics, Lensfield Road, CB2 1EW Cambridge, United Kingdom, Joe A Townsend, Department of Chemistry, Unilever Centre for Molecular Science Informatics, University of Cambridge, Lensfield Road, Cambridge, CB2 1EW, United Kingdom, and Henry S. Rzepa, Department of Chemistry, Imperial College of Science, Technology and Medicine, Exhibition Road, South Kensington, London SW7 2AY, United Kingdom. Much of the semantics in a chemistry article are now supported by Chemical Markup Language (CML) describable by an XML Schema (XSD). CML can support molecules, structures, reactions and reaction schemes, spectra (including annotations) and physicochemical data. These are supported by dictionaries and lexicons (also in XML) that provide linguistic and semantic support for the markup. Manuscript components can be created either with a range of authoring tools or through linguistic processing of conventional text. The semantics in such papers can now be processed by machine leading to high-throughput information extraction. A major feature is that chemical documents will be quicker to author and have a higher quality of embedded data and structure through machine validation. |
|
CINF 56 Alexander Roth1, Ronny Jopp1, Peter J. Linstrom2, and Gary W. Kramer1. (1) Biochemical Science Division, NIST, 100 Bureau Drive, Bldg. 227; Rm. A-157, Gaithersburg, MD 20899-8312, (2) Physical and Chemical Properties Division, NIST, Building 221, Room A357, 100 Bureau Drive, Stop 8380, Gaithersburg, MD 20899-0830
AnIML (Analytical Information Markup Language) is being created by ASTM Subcommittee E13.15 to describe chromatography and spectroscopy data and metadata based on XML (eXtensible Markup Language) and its associated technologies. Once in AnIML format, analytical data can be interchanged over the web, converted to other formats, validated, or visualized in multiple formats using existing XML-based tools. AnIML is built around a core schema that defines ways for describing almost any data. Technique Definition files are used to constrain the myriad data description mechanisms available for a given analytical technique to only those commonly accepted, to delineate the metadata items ordinarily associated with such domain data, and to permit content extension by vendors and users without changing the core schema. This presentation will describe the naming and design rules (NDRs) and other techniques being employed to ensure that AnIML is as interoperable as possible with other markup languages. |
|
CINF 57 Ronny Jopp1, Alexander Roth1, Peter J. Linstrom2, and Gary W. Kramer1. (1) Biochemical Science Division, NIST, 100 Bureau Drive, Building 227; Rm. A-159, Gaithersburg, MD 20899-8312, (2) Physical and Chemical Properties Division, NIST, Building 221, Room A357, 100 Bureau Drive, Stop 8380, Gaithersburg, MD 20899-0830
Units Markup Language (UnitsML) is being developed to encode scientific units of measure using XML (eXtensible Markup Language). The development and deployment of a markup language specifically for units will allow for the unambiguous storage, exchange, and processing of numeric data, thus facilitating collaboration and the sharing of information, especially over the Internet. Incorporating UnitsML into other markup languages prevents duplication of effort and improves interoperability. ASTM Subcommittee E13.15 is creating AnIML (Analytical Information Markup Language) to describe chromatography and spectroscopy data and metadata based on XML and its associated technologies. AnIML facilitates access to analytical data by building in descriptions of the data and metadata with delimited tags. UnitsML is being employed to handle the markup of the units information in AnIML. This presentation will describe how UnitsML is being used and how it is being incorporated into AnIML. |
|
CINF 58 Michael Burke, Agilent Technologies, 6612 Owens Drive, Pleasanton, CA 94588 Abstract text not available. |
|
CINF 59 Gregory A. Landrum, Julie E. Penzotti, and Santosh Putta. Rational Discovery LLC, 555 Bryant St. #467, Palo Alto, CA 94301 In order to develop robust machine-learning or statistical models for predicting biological activity, descriptors that capture the essence of the protein-ligand interaction are required. In the absence of structural information from X-ray or NMR experiments, deriving informative descriptors can be difficult. We have developed feature-map vectors (FMVs) to address this challenge. FMVs are problem-specific – derived from the conformational models of a few actives – and highly interpretable. By using shape-based alignments and scoring with chemical features, FMVs combine information about a molecule's shape and the pharmacophores it can match. We will present the details of the algorithm and the results of validation studies that establish the utility and interpretability of FMVs. After describing the performance of models built to predict biological activity for several biological targets (CDK2, thrombin, DHFR, and ACE), we will examine what can be learned about the protein-ligand interactions from the descriptors themselves. |
|
CINF 60 Mark D. Mackey, Cresset BioMolecular Discovery, Spirella Building, Bridge Rd, SG 6 4ET, Letchworth, United Kingdom Proteins recognise ligands through their surface properties (or fields), not their particular arrangement of atoms and bonds. Describing molecules in terms of molecular fields leads to powerful new techniques for ligand- and structure-based drug design. In particular, we detail a powerful field-based virtual screening method with real-world successes. We also present a new technique of field-based molecular alignments and its success in determining the bound conformation of active molecules purely from ligand data in the absence of any protein information. Case studies will be reported. |
|
CINF 61 Rajarshi Guha1, Debojyoti Dutta2, Peter C. Jurs1, and Ting Chen2. (1) Department of Chemistry, Pennsylvania State University, 104 Chemistry Building, University Park, State College, PA 16802, (2) Department of Computational Biology, University of Southern California, 1050 Childs Way, Los Angeles, CA 90089 We introduce a data mining framework built on top of an approximate nearest neighbor algorithm termed Locality Sensitive Hashing (LSH). The core LSH algorithm hashes molecular descriptors so that points close to each other in descriptor space are also close to each other in the hashed space, resulting in sublinear search times. We validate the accuracy and performance of our framework on three real datasets of sizes ranging from 4,337 molecules to 249,071 molecules. Our results indicate that the identification of nearest neighbors using the LSH algorithm is two orders of magnitude faster than the ordinary kNN method and is over 94% accurate. We also use this framework to determine extremely rapidly whether a compound is located in a sparse region of chemical space. The algorithm is quite accurate compared to results obtained using PCA-based heuristics. |
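The hashing idea described in this abstract can be illustrated with a minimal sketch. The abstract does not say which hash family or parameters the authors use, so everything below is an assumption for illustration: a random-projection hash for Euclidean distance, and the hypothetical names `EuclideanLSH`, `n_tables`, `n_projections`, and `bucket_width`. Descriptor vectors that lie close together tend to fall into the same bucket, so a query ranks only the candidates in its own buckets instead of scanning the whole dataset.

```python
import math
import random
from collections import defaultdict


class EuclideanLSH:
    """Illustrative approximate nearest-neighbor index (not the authors' code)."""

    def __init__(self, dim, n_tables=8, n_projections=4, bucket_width=1.0, seed=0):
        rng = random.Random(seed)
        self.bucket_width = bucket_width
        self.points = []
        self.tables = []
        for _ in range(n_tables):
            # Each table gets its own random directions and random offsets.
            projections = [
                ([rng.gauss(0.0, 1.0) for _ in range(dim)],
                 rng.uniform(0.0, bucket_width))
                for _ in range(n_projections)
            ]
            self.tables.append((projections, defaultdict(list)))

    def _key(self, projections, x):
        # Project onto each direction and quantize into bucket-width slots.
        return tuple(
            math.floor((sum(a * v for a, v in zip(direction, x)) + offset)
                       / self.bucket_width)
            for direction, offset in projections
        )

    def add(self, x):
        """Store a descriptor vector and return its index."""
        idx = len(self.points)
        self.points.append(tuple(x))
        for projections, buckets in self.tables:
            buckets[self._key(projections, x)].append(idx)
        return idx

    def query(self, x, k=1):
        """Return up to k indices of stored points near x (approximate)."""
        candidates = set()
        for projections, buckets in self.tables:
            candidates.update(buckets.get(self._key(projections, x), ()))
        # Exact distances are computed only for the colliding candidates.
        return sorted(candidates, key=lambda i: math.dist(self.points[i], x))[:k]
```

With an index built by calling `add` on each descriptor vector, `query(x, k)` returns approximate neighbors. A query that collects very few candidates is one possible signal that the compound lies in a sparse region of descriptor space, though the abstract does not state that this is how the authors detect sparsity.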
|
CINF 62 Valerie J. Gillet1, Simon Cottrell1, and Robin Taylor2. (1) Information Studies, University of Sheffield, Regent Court, 211 Portobello Street, Sheffield, S1 4DP, United Kingdom, (2) Cambridge Crystallographic Data Centre, 12, Union Road, Cambridge CB2 1EZ, United Kingdom A pharmacophore is defined as the set of chemical features and the spatial relationships between the features that together form a necessary requirement for biological activity. The two major issues in pharmacophore identification are the correct representation of the chemical features, so that bioequivalent features are mapped together, and the appropriate sampling of conformation space, so that the bioactive conformation of each compound is found. Often, there are several plausible hypotheses that could explain the same set of ligands, and in such cases it is important that the chemist is presented with alternatives that can be tested with different synthetic compounds. We have applied a multiobjective genetic algorithm to the pharmacophore elucidation problem to generate a range of chemically diverse solutions that represent equally plausible hypotheses. The hypotheses are evaluated over a number of objectives, which are considered independently according to the principles of Pareto dominance. Recent developments of the method that allow the identification of pharmacophore features common to some, but not all, of the ligands will be described. |
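Pareto dominance, as invoked in this abstract, can be sketched in a few lines. The abstract does not name the objectives, so the score tuples below are hypothetical, and the sketch assumes every objective is to be minimized. One hypothesis dominates another when it is no worse on every objective and strictly better on at least one. The non-dominated set is the range of equally plausible alternatives that would be presented to the chemist.

```python
def dominates(a, b):
    """True if score tuple a Pareto-dominates b (all objectives minimized)."""
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def pareto_front(scores):
    """Return the score tuples that no other tuple dominates, in input order."""
    return [s for s in scores if not any(dominates(o, s) for o in scores)]


# Hypothetical two-objective scores for five candidate hypotheses.
candidate_scores = [(1, 5), (2, 2), (5, 1), (3, 3), (4, 4)]
front = pareto_front(candidate_scores)
```

Here `(3, 3)` and `(4, 4)` are both dominated by `(2, 2)`, while the remaining three trade one objective against the other and so all survive. A multiobjective genetic algorithm would apply this comparison during selection rather than collapsing the objectives into a single weighted score.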
|
CINF 63 Gerhard Wolber and Alois A. Dornhofer. Inte:Ligand GmbH, Mariahilferstrasse 74B/11, 1070 Wien, Austria Aligning and overlaying two or more bio-active molecules is one of the most important tasks in computational drug discovery and cheminformatics. Molecule characteristics from the viewpoint of a macromolecular target, represented as a 3D pharmacophore, are of special interest when considering macromolecule-ligand interactions. We present a novel approach for aligning rigid three-dimensional molecules according to their chemical-functional and steric pharmacophoric features. Optimal chemical feature pairs are identified using distance and density characteristics and obtained by correlating pharmacophoric geometries. The presented approach proves to be faster than existing combinatorial alignments and creates more reasonable alignments than earlier methods. Correlations between two similar pharmacophore features can be identified even if they show different constraints. Examples will be provided to demonstrate the feasibility and speed of this method. Fig. 1. Three CDK2 inhibitors from the PDB (1ke5, 1ke6, 1ke7) in their bio-active conformations, all aligned with their 3D pharmacophores describing the ligand-macromolecule interaction. Graphics were created with LigandScout 1.0, available from http://www.inteligand.com
|
|
CINF 64 |