There is a growing demand
for digital storage and archiving systems for analytical
instrument data. Although data archiving products are currently
on the market, they are incomplete solutions for the long-term
archiving of data from analytical instruments. In general,
these systems offer a centralized method of indexing, storing and
retrieving the original binary data files generated by the many
proprietary instrument control software packages used throughout
the laboratory. When a particular piece of data is retrieved from
such a system, it is viewed using that same proprietary software
that generated it.
There are two principal problems with this approach:
1. Laboratories typically have many more people who need to access
the data than they have computers running the proprietary software
required to view it. In some cases, those people may be in a
different lab, building or country from the computers running the
proprietary data station software. It is impractical to deliver copies of
proprietary software applications to every person who might need to
access the data.
2. Instruments and data systems often have shorter lifetimes than
the required retention periods for the data they generate.
It is very likely that when a critical piece of data is needed at
some point in the distant future (to demonstrate compliance to a
regulator or for a legal defense of a company's intellectual
property), the data station software, hardware or even operating
system will be obsolete or no longer obtainable.
Users must be able to access, view and potentially even reprocess
the data in the archive throughout the entire record retention
lifetime and beyond. Thus, to remain accessible for an indefinite
length of time, the data must be "normalized" in
such a way that it can be easily understood outside the realm of
any individual software system.
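As a rough illustration of what "normalized" could mean in practice, the
following Python sketch writes a small set of measurements as self-describing
XML text and reads it back using only a general-purpose XML parser. The element
and attribute names (experiment, trace, point, xUnits, yUnits) and the sample
values are hypothetical and chosen only for this example; they are not taken
from the proposed data model.

```python
import xml.etree.ElementTree as ET


def normalize_trace(technique, x_units, y_units, points):
    """Encode (x, y) measurements as self-describing XML text.

    All element and attribute names here are hypothetical placeholders.
    """
    root = ET.Element("experiment", {"technique": technique})
    trace = ET.SubElement(root, "trace", {"xUnits": x_units, "yUnits": y_units})
    for x, y in points:
        # repr() of a float round-trips exactly through float()
        ET.SubElement(trace, "point", {"x": repr(x), "y": repr(y)})
    return ET.tostring(root, encoding="unicode")


def read_trace(xml_text):
    """Recover the measurements with nothing but a generic XML parser."""
    root = ET.fromstring(xml_text)
    trace = root.find("trace")
    points = [(float(p.get("x")), float(p.get("y"))) for p in trace.findall("point")]
    return root.get("technique"), trace.get("xUnits"), trace.get("yUnits"), points


if __name__ == "__main__":
    # Invented sample values, for illustration only
    archived = normalize_trace("UV-Vis", "nm", "absorbance",
                               [(200.0, 0.12), (201.5, 0.15)])
    print(archived)
    print(read_trace(archived))
```

Because the archived text carries its own labels and units, any tool that can
parse XML can recover the values without the software that originally produced
them.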
This understanding led us to propose a new XML-based data model.
GAML.ORG will focus on
how XML solves a number of issues related to the access and storage of
analytical instrument data and will describe the features of the
proposed data model in detail.