The Cambridge eScience Centre


CeSC Technical Forum

Creating Markup Languages in XML

Prof. Peter Murray-Rust (Unilever Centre for Molecular Informatics, Department of Chemistry)

2pm, Thursday October 3rd
Centre for Mathematical Sciences, Wilberforce Road.


XML allows any discipline, however small, to create an infrastructure for exchanging information within the discipline and outside it. XML itself is easy to learn, but the implementation of any markup language requires careful design and often considerable investment in tools. Software - at least proof-of-concept - should always parallel the language design.

XML is about getting groups within a discipline to agree on what their information infrastructure is and what they want to do. In some cases this is already clear, most frequently in regulated processes, supply chains, and existing best practice. Sometimes the XML can be designed from communal database schemas or form-based information interchange. In many cases, however, especially in new and expanding multidisciplinary sciences, there is little existing practice.

The fundamental requirements include:

  • community agreement (and resource) to make it work

  • development of communal ontologies (dictionaries)

  • creation of namespaces and XML Schemas

  • tools for creating and editing XML

  • converters from current (legacy) formats

  • domain-specific processing and rendering software (DOM and SAX)

  • database interfaces and stylesheets (XSLT)

In some cases (e.g. where the data are textual or simple numeric objects) these can be met with generic software, but often they must be created afresh or adapted from existing tools.
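As an illustration of the generic-software case, the sketch below reads simple numeric data from a small namespaced document in two ways: DOM style (build the whole tree, then query it) and SAX style (react to parse events as they stream past). It uses only the Python standard library. The namespace URI, the element names and the sample values are hypothetical, invented for this example, and do not belong to any of the languages discussed in the talk.

```python
import io
import xml.dom.minidom
import xml.sax
from xml.sax.handler import feature_namespaces

# Hypothetical namespace and vocabulary, invented for illustration only.
NS = "http://example.org/ns/sample"

DOC = """<?xml version="1.0"?>
<s:experiment xmlns:s="http://example.org/ns/sample">
  <s:measurement units="K">298.15</s:measurement>
  <s:measurement units="K">310.0</s:measurement>
</s:experiment>"""


def values_with_dom(text):
    """DOM style: parse the whole document into a tree, then query it."""
    dom = xml.dom.minidom.parseString(text)
    nodes = dom.getElementsByTagNameNS(NS, "measurement")
    return [float(node.firstChild.data) for node in nodes]


class MeasurementHandler(xml.sax.ContentHandler):
    """SAX style: collect values from parse events, keeping no tree."""

    def __init__(self):
        super().__init__()
        self.values = []
        self._buffer = None  # holds text only while inside a measurement

    def startElementNS(self, name, qname, attrs):
        # With namespace processing on, name is a (uri, localname) pair.
        if name == (NS, "measurement"):
            self._buffer = []

    def characters(self, content):
        # Text may arrive in several chunks, so accumulate it.
        if self._buffer is not None:
            self._buffer.append(content)

    def endElementNS(self, name, qname):
        if name == (NS, "measurement"):
            self.values.append(float("".join(self._buffer)))
            self._buffer = None


def values_with_sax(text):
    handler = MeasurementHandler()
    parser = xml.sax.make_parser()
    parser.setFeature(feature_namespaces, True)
    parser.setContentHandler(handler)
    parser.parse(io.BytesIO(text.encode("utf-8")))
    return handler.values


if __name__ == "__main__":
    print(values_with_dom(DOC))
    print(values_with_sax(DOC))
```

Both functions return the same list of numbers. The DOM version is shorter and convenient for small documents, while the SAX version avoids holding the whole tree in memory, which matters for large data sets.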

XML often works best when it is modularised and solutions can be borrowed from existing domains. Thus MathML, SVG (Scalable Vector Graphics) and Chemical Markup Language (CML) have been designed to work independently of the scientific application, and toolkits can use all of them. With Henry Rzepa (Imperial) we have developed STMML - a simple generic infrastructure for Scientific/Technical/Medical applications. It is often better to solve a small part of the problem well than to attempt too much - the experience will be valuable and success will encourage future developments.
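A minimal sketch of this modular approach, assuming a hypothetical host vocabulary: a single document embeds a MathML fragment and an SVG fragment, and a toolkit uses the namespace of each element to decide which module it belongs to. The MathML and SVG namespace URIs are the standard W3C ones; the host namespace, the "report" element and the helper names are invented for illustration.

```python
import xml.etree.ElementTree as ET

MATHML = "http://www.w3.org/1998/Math/MathML"
SVG = "http://www.w3.org/2000/svg"
HOST = "http://example.org/ns/report"  # hypothetical host vocabulary


def build_report():
    """Build a host document that borrows MathML and SVG elements."""
    root = ET.Element(f"{{{HOST}}}report")
    math = ET.SubElement(root, f"{{{MATHML}}}math")
    identifier = ET.SubElement(math, f"{{{MATHML}}}mi")
    identifier.text = "x"
    svg = ET.SubElement(root, f"{{{SVG}}}svg")
    ET.SubElement(svg, f"{{{SVG}}}circle", {"r": "5"})
    return root


def namespace_of(element):
    """Return the namespace URI from an ElementTree '{uri}local' tag."""
    tag = element.tag
    return tag[1:].split("}", 1)[0] if tag.startswith("{") else ""


def count_by_namespace(root):
    """Count elements per namespace, as a stand-in for dispatching
    each element to the software module that understands it."""
    counts = {}
    for element in root.iter():
        uri = namespace_of(element)
        counts[uri] = counts.get(uri, 0) + 1
    return counts


if __name__ == "__main__":
    for uri, count in count_by_namespace(build_report()).items():
        print(uri, count)
```

In a real toolkit the counting step would be replaced by handing each subtree to a dedicated renderer or processor; the point is only that the namespace identifies the module, so the host language does not need to redefine mathematics or graphics.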

Demonstrations of tools will be given using some of the languages above.

How to get here


Comments to the webmaster. Contact: Tel (+44/0) 1223 764282, Fax (+44/0) 1223 765900