From: "Secretary, ACS Division of Chemical Health and Safety" <secretary**At_Symbol_Here**DCHAS.ORG>
Subject: [DCHAS-L] InChI project opportunity
Date: Wed, 27 May 2015 11:41:43 -0400
Reply-To: DCHAS-L <DCHAS-L**At_Symbol_Here**MED.CORNELL.EDU>
Message-ID: D9B66F1A-A373-4F34-92C0-4B5B72C3D0B9**At_Symbol_Here**

On behalf of DCHAS, I have been involved in discussions with the ACS Division of Chemical Information about ways to improve electronic access to chemical safety information. One of the needs identified in this process is an extension of the InChI system (see ) to accommodate mixtures of different chemical structures. We believe that this idea is particularly important for users of chemical safety information to organize the voluminous data that needs to be used for this purpose. People who are interested in more technical information about what InChI is and how it is used may want to review their FAQ page at

To this end, we are developing a proposal for IUPAC to sponsor an international committee to address this need. One of the requirements of the granting process is that representatives from at least 5 nations be identified. So, while we are interested in hearing from any DCHAS members who are interested in this opportunity, we are particularly interested in hearing from non-US citizens who might want to be involved in such discussions. Much of the work will be done electronically, but IUPAC sponsorship would allow the group to cover travel expenses to one or two in-person meetings of the group. I've appended a summary paragraph from our proposal below.

If you are interested in this opportunity, please let me know and I can provide more details about the idea and level of commitment required.

Let me know if you have any questions about this.

- Ralph

Project Description

This project proposes to encode states of matter and mixture composition (e.g., solutions and impurities) within the IUPAC International Chemical Identifier (InChI). This innovation will extend the utility of the InChI as a general purpose identifier of chemicals beyond indexing of chemical structure representation. The ability to describe all chemical components and interactions in a given chemical system is desirable for a variety of purposes. Specifically, canonicalization of states could support data management pertaining to reaction planning, property calculation, anticipation of potential hazards, and process optimization. Such chemical information systems are already in development or planning stages, including electronic laboratory notebooks, chemical hazard and risk management procedures, and various data analysis applications. Enabling the InChI algorithm to be implemented at the process development stage will improve data linking and interoperability and establish the best standard of practice up front.

In the context of developing safer and greener chemical management, additional information about concentration, purity, density and other issues related to mixtures and state becomes critical for planning and documenting chemical processes. For common compounds with definitive core chemical structures, there may be multiple forms manufactured and several variations of mixtures stocked. Information systems that index by chemical structure will pool data across forms without recourse for disambiguation. Other than the catalog numbers of chemical manufacturers, no known identifier system exists that addresses these issues, and these identifiers suffer the same challenges as other internal record schemas.
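
As a rough illustration of the pooling problem described above, the following Python sketch indexes three stock forms of ethanol first by structure alone and then by a composition-aware key. The two InChI strings are the standard identifiers for ethanol and water; the stock records, the mass fractions, and the `mixture_key` function are hypothetical, made up only for this example, and do not represent any proposed InChI syntax.

```python
from collections import defaultdict

# Standard InChI strings for the two components.
ETHANOL = "InChI=1S/C2H6O/c1-2-3/h3H,2H2,1H3"
WATER = "InChI=1S/H2O/h1H2"

# Hypothetical stock records: (description, {component InChI: mass fraction}).
stock = [
    ("ethanol, absolute", {ETHANOL: 1.0}),
    ("ethanol, 95% in water", {ETHANOL: 0.95, WATER: 0.05}),
    ("ethanol, 70% in water", {ETHANOL: 0.70, WATER: 0.30}),
]


def structure_key(composition):
    """Index by the structure of the principal component only."""
    return max(composition, key=composition.get)


def mixture_key(composition):
    """Hypothetical composition-aware key: sorted (InChI, fraction) pairs.

    This is an assumption for illustration, not an InChI layer.
    """
    return tuple(sorted(composition.items()))


def build_index(records, key_fn):
    """Group record descriptions under the key produced by key_fn."""
    grouped = defaultdict(list)
    for name, composition in records:
        grouped[key_fn(composition)].append(name)
    return dict(grouped)


by_structure = build_index(stock, structure_key)
by_mixture = build_index(stock, mixture_key)

# Structure-only indexing pools every form under the ethanol InChI,
# while the composition-aware key keeps the three forms distinct.
print(len(by_structure), len(by_mixture))
```

With a structure-only key, any safety data attached to the ethanol entry cannot be traced back to a specific concentration; the composition-aware key keeps that distinction at the cost of needing an agreed way to express fractions and units.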

The InChI system is particularly well designed to circumvent such limitations through dynamic application of clearly defined property layers. Prior InChI extension projects for representing salts and coordination compounds have set the stage for further consideration of states and component interactions. This project will scope the minimum additional information required to usefully specify form and address issues of property expression (e.g., units).

What most stakeholders consider useful chemical identifiers focus on chemical structure representation. This is not the case with most current systems in place. The largest data sets currently used are tied to functions of registration and record management and are not generally chemically meaningful outside of the original system. For example, the CAS Registry Number is among the most familiar and has found its way into almost all known chemical safety information sources. However, these uses are neither scalable, as the CAS system is proprietary, nor transferable, as there is no cross-connection with other chemical identifiers to link the myriad of potential information streams relevant to these chemical processes.

Ralph Stuart
Division of Chemical Health and Safety
American Chemical Society
