The Cambridge eScience Centre


Molecular Informatics

Molecular Standards for the Grid

Exploiting modern methods of information management to discover new molecular information.

Call for early adopters...

JPG poster | PDF poster

Conference paper: A semantic GRID for molecular science (Word 97 document)

The Problem

  • Chemical safety is now a key concern in Europe and elsewhere.

  • Of the more than 30,000 compounds in regular industrial use, very few have had a complete safety analysis.

  • Safety is determined by properties of the compound and its environment.

  • Chemical properties are difficult and expensive to measure and may involve animal testing.

  • Less than 0.1% of published chemical properties are Openly available in electronic form.

  • Conventional publication produces fragmented, unvalidated, unfriendly data.

Our Solution

Journal-eating Robot

  • Create an Open architecture to compute molecular properties.

  • Our robots read normal journal articles and extract the molecular data, which is turned into XML, aggregated, and validated in CML (Chemical Markup Language).

  • Our Grid-aware high-throughput computation of properties builds on W3C and OGSA standards, using tools such as Condor-G, Xindice and Tomcat.

  • Create a Grid-based "black-box" for non-specialists to compute molecular properties on demand.
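
As a rough illustration of the "turned into XML, aggregated" step above, the following minimal Python sketch wraps one extracted property value in CML-style markup and collects several records under one root element. It is not the project's actual robot code. The function names, the `prop:` and `units:` dictionary prefixes, and the sample value are hypothetical, chosen only for illustration. The element and attribute names (`molecule`, `property`, `scalar`, `dictRef`, `dataType`, `units`) follow common CML usage.

```python
import xml.etree.ElementTree as ET

# Namespace URI commonly used for CML documents.
CML_NS = "http://www.xml-cml.org/schema"
ET.register_namespace("cml", CML_NS)


def _tag(name):
    """Return the namespace-qualified tag for a CML element name."""
    return f"{{{CML_NS}}}{name}"


def property_to_cml(molecule_id, title, dict_ref, value, units):
    """Wrap one extracted property value in a CML-style <molecule> element.

    dict_ref and units are references into hypothetical dictionaries;
    the prefixes used by callers here are assumptions, not real ones.
    """
    molecule = ET.Element(_tag("molecule"), {"id": molecule_id, "title": title})
    prop = ET.SubElement(molecule, _tag("property"), {"dictRef": dict_ref})
    scalar = ET.SubElement(
        prop, _tag("scalar"), {"dataType": "xsd:double", "units": units}
    )
    scalar.text = str(value)
    return molecule


def aggregate(molecules):
    """Collect individual <molecule> elements under a single <cml> root."""
    root = ET.Element(_tag("cml"))
    root.extend(molecules)
    return root


if __name__ == "__main__":
    # Illustrative value only; not taken from any journal article.
    record = property_to_cml(
        "m1", "example compound", "prop:meltingPoint", 353.4, "units:kelvin"
    )
    print(ET.tostring(aggregate([record]), encoding="unicode"))
```

Keeping each property as a `scalar` with an explicit `dictRef` and `units` is what makes later validation possible, because every value carries a pointer to the definition it claims to satisfy.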


  • Can generate globally unique identifiers for molecules.

  • Validates molecular properties against community-wide metadata.

  • Contains a global ontology and standard for molecules.

  • Represents data and metadata in a generic architecture.

  • Quality of resulting data is enforced by use of standards such as XML.

  • Supports authoring by humans, instruments and computational chemistry programs.

  • Our application is now being made available to selected collaborators.
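
The idea of validating molecular properties against community-wide metadata, listed above, can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the project's validator. The dictionary contents, the `prop:` and `units:` identifiers, and the numeric bounds are all hypothetical. A real community dictionary would be maintained externally and would carry far richer definitions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DictionaryEntry:
    """Hypothetical metadata for one property: expected units and a sane range."""
    units: str
    minimum: float
    maximum: float


# Hypothetical stand-in for a community-maintained property dictionary.
DICTIONARY = {
    "prop:meltingPoint": DictionaryEntry("units:kelvin", 0.0, 5000.0),
}


def validate_property(dict_ref, value, units, dictionary=DICTIONARY):
    """Return a list of problems found; an empty list means the value passed."""
    entry = dictionary.get(dict_ref)
    if entry is None:
        return [f"unknown property {dict_ref!r}"]
    problems = []
    if units != entry.units:
        problems.append(f"expected units {entry.units!r}, got {units!r}")
    if not (entry.minimum <= value <= entry.maximum):
        problems.append(
            f"value {value} outside range [{entry.minimum}, {entry.maximum}]"
        )
    return problems


if __name__ == "__main__":
    print(validate_property("prop:meltingPoint", 353.4, "units:kelvin"))
    print(validate_property("prop:meltingPoint", -5.0, "units:celsius"))
```

Returning a list of problems rather than raising on the first failure lets an aggregator report every issue with a record at once, which suits bulk checking of data extracted from many articles.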


Comments to the webmaster. Contact: Tel (+44/0) 1223 764282, Fax (+44/0) 1223 765900