What’s New in Lab Data and Information Management?
Posted: 02/14/2011
Burkhard Schaefer, head architect at AnIML, BSSN Software, joins John Trigg, director and founder of phaseFour-informatics, to discuss the AnIML project and advances in laboratory data and information management.
B Schaefer: We are seeing great traction in the vendor community. We have active participation from instrument vendors in the ASTM subcommittee that is putting AnIML together, and we have had great support from vendors in putting together AnIML conversion tools and so on. We are now actually able to bring data into the AnIML format; there has been great interest in that, and some systems are already running on the tools that have emerged from this work. The big question, to generalise, is what it takes for a vendor to actually be interested in implementing a standard data format. There are pros and cons, and I will start with the cons. An instrument becomes more easily replaceable once it has an open interface, so a piece of equipment can be swapped out for a competitor’s instrument. But that is only one side. On the other side there is pressure from the market, which is where AnIML really was very lucky, as I mentioned before. Then there is another interesting component: implementing a standard data format can actually be a cost-saving measure for a vendor if done correctly.
B Schaefer: It’s a standard that is being developed under the umbrella of ASTM with participation from a number of different stakeholders: the pharmaceutical industry, the environmental side, some government agencies, instrument vendors and academia. We think the right crowd of people is sitting around the table to give this enough traction. We have seen enormous interest in it, and have also had a couple of deployments already. This is a very interesting initiative that I sort of stumbled into in about 2003, while working at the National Institute of Standards and Technology. I’ve been involved with the project since then and have served as the architect for the standard.
Over the course of eight years, we’ve put together this format, which is based on XML. Eight years is a lot of time, but the difficult part was really making it applicable to as many use cases as possible. Most of the time was spent not on the actual implementation and design but on validating the data model against all those different measurement technologies and the potential use cases where this would fit in. About a year ago we were happy to declare victory, so to speak. We have been able to finalise the data model, and now it’s going through the process at ASTM with regard to documentation and balloting in order to make it an official standard.
J Trigg: What I’d like to do is invite you to introduce yourself, explain what AnIML is, and give a brief overview of your role.
J Trigg: There are a couple of questions that immediately spring to mind. One is that the standards issue has been in the background for a number of years and there have been previous initiatives to address data interchange, particularly around analytical instrumentation. What difference does AnIML bring relative to those previous initiatives, and is the scope the same? It sounded to me, from what you were saying, as though there was a slightly different scope.
B Schaefer: That is a very interesting question. There have been a few standards that actually have succeeded in this space, most notably the JCAMP-DX initiative from IUPAC, which was significantly driven by Bob McDonald, and the ANDI project, which started with AIA and was then moved to ASTM. But other initiatives have failed, and up to now we do not have a deep penetration of standards in this space, and that’s a pity.
In contrast to some of the earlier efforts, AnIML is first of all not trying to reinvent the wheel, so we are leveraging the efforts that have proven successful. We are working very closely with people from IUPAC to leverage what was learnt from the JCAMP effort. AnIML is actually being developed by the same committee that has jurisdiction over the AIA-ANDI standards within ASTM. So we’re not trying to do something completely new; we’re trying to expand on what’s already there and put it on a modern technology level, which is XML. Back when those efforts were current, XML just wasn’t around yet. That’s the one side. We’re sort of standing on the shoulders of giants now.
The other thing, as you’ve pointed out, is that AnIML has a slightly different scope from previous data standards. AnIML has been very deliberately designed to handle data from different analytical techniques. There are standards out there, for example in the mass spec community, which are great for mass spec data interchange, and there are others that dominate their own fields. The problem is that in today’s experiments we look at the same sample with many different analytical techniques, and it becomes very difficult to integrate data from different techniques within the same data system because, potentially, everything would be in a different format.
So AnIML has been designed to let us reuse the same tools and the same data format to work with data no matter what the source is. So here we have a difference in scope. There is also a difference in the granularity of information recorded. Where some other initiatives focus mainly on delivering results, either the interpreted results or the raw data, AnIML takes a few other concepts and mixes them in. For example, there is always very detailed information about samples, about the analytical methods, about the instruments and software used, an audit trail, digital signatures, and of course the measurement results and the underlying raw data. The goal is to capture everything needed to fully reproduce an experiment. Each of those experiments is then reflected in what AnIML calls an experiment step. The user can put as many of these experiment steps as are needed to describe a laboratory workflow, not just a single experiment, into an AnIML document, and in that way tie together the information that came from the different data sources within the lab. Obviously, once you have the data in such a common format, it becomes a lot easier to distribute it to other systems that may have an interest in the data.
Here, laboratory information management systems (LIMS) and electronic lab notebooks (ELNs) come to mind as consumers of information like this.
So we have a slightly different scope. It’s a bit broader, but that is both a feature and a burden, because it makes the implementation a bit more complex.
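As a rough illustration of the structure described above, the following Python sketch builds a small AnIML-style XML document in which one sample is measured by two techniques, each recorded as its own experiment step, and then reads both steps back with the same function. The element and attribute names are simplified assumptions loosely modelled on the concepts named in the interview (samples, experiment steps, results, audit trail). They have not been validated against the AnIML schema, and the sample, the values and the helper functions (`add_step`, `build_document`, `summarise`) are hypothetical.

```python
import xml.etree.ElementTree as ET


def add_step(parent, step_id, technique, sample_id, series):
    """Append one experiment step; `series` is a list of (name, unit, values)."""
    step = ET.SubElement(parent, "ExperimentStep", experimentStepID=step_id)
    ET.SubElement(step, "Technique", name=technique)
    ET.SubElement(step, "SampleReference", sampleID=sample_id)
    result = ET.SubElement(step, "Result")
    series_set = ET.SubElement(result, "SeriesSet")
    for name, unit, values in series:
        # Simplified: values are stored as space-separated text.
        node = ET.SubElement(series_set, "Series", name=name, unit=unit)
        node.text = " ".join(str(v) for v in values)
    return step


def build_document():
    """Build an illustrative document: one sample, two techniques."""
    root = ET.Element("AnIML")

    # The sample is described once and referenced from each step.
    samples = ET.SubElement(root, "SampleSet")
    ET.SubElement(samples, "Sample", sampleID="S1", name="Caffeine standard")

    steps = ET.SubElement(root, "ExperimentStepSet")
    add_step(steps, "step-uvvis", "UV/Vis", "S1", [
        ("Wavelength", "nm", [260.0, 273.0, 280.0]),
        ("Absorbance", "AU", [0.41, 0.58, 0.47]),
    ])
    add_step(steps, "step-lc", "Chromatography", "S1", [
        ("Time", "min", [0.5, 1.0, 1.5]),
        ("Signal", "mAU", [2.0, 35.0, 4.0]),
    ])

    # Illustrative audit trail entry (attribute names are assumptions).
    audit = ET.SubElement(root, "AuditTrailEntrySet")
    ET.SubElement(audit, "AuditTrailEntry", action="created", author="analyst1")
    return root


def summarise(root):
    """Return (technique, sample, series name, max value) for every series.

    The same code handles every step, whatever technique produced it.
    """
    rows = []
    for step in root.iter("ExperimentStep"):
        technique = step.find("Technique").get("name")
        sample = step.find("SampleReference").get("sampleID")
        for node in step.iter("Series"):
            values = [float(v) for v in node.text.split()]
            rows.append((technique, sample, node.get("name"), max(values)))
    return rows


if __name__ == "__main__":
    document = build_document()
    print(ET.tostring(document, encoding="unicode"))
    for row in summarise(document):
        print(row)
```

The point of the sketch is only that a consumer such as a LIMS or ELN needs one reader for both steps, because the technique is a label on the step rather than a separate file format.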
J Trigg: That’s really interesting and it’s immediately brought to mind another initiative which is the Pistoia Alliance. I appreciate that that’s coming from a different viewpoint than perhaps where AnIML is coming from, but do you have any comment on that at all?
B Schaefer: On the Pistoia Alliance, a disclaimer first: I am not involved with this initiative, so I’m just following it from the outside. I think they are addressing some very important issues. Pistoia takes a much, much broader view of this domain than AnIML does. AnIML is really just focusing on the data side, and Pistoia is putting a big framework all around that. I think the two efforts are really complementary, where Pistoia handles the overall approach and AnIML can be one of the building blocks. It would be interesting to see where the two initiatives could go if they were used together. In terms of deliverables, the two initiatives are at different levels of progress, so that would have to be synced up to have something that can actually be implemented. I think both of these initiatives are very important for this field.
J Trigg: The other issue with standards over the years has really been gaining sufficient momentum, and it seems to me there are two drivers there. One is the vendor community and their willingness to take on standards for their data output. The other is the community side, the end user side, and whether there is enough momentum there to actually drive vendors towards this. What sort of level of interest are you getting from vendors and from end user communities for AnIML?
B Schaefer: I think every standardisation effort has a couple of war stories to share, because it’s really a chicken-and-egg problem. End users are interested in moving away from proprietary data formats so that the data is more accessible, but this is only really possible if there’s adequate support, either from the vendor community directly or through implementation efforts that the end users would have to go through themselves. The goal is to have native support for standardised data formats directly from the implementers and instruments. We had two very lucky incidents in the AnIML community, where two of the top ten pharmaceutical companies had projects that they needed to move forward on, and one of them actually put a requirement for AnIML support into the RFP for a long-term archiving system. That caught the attention of a couple of vendors who really wanted to be in on that contract, which led to a snapshot of AnIML being implemented for this project a couple of years ago. We were then in the lucky position of having big names on our list of references, which is really important and has resulted in AnIML being a name that people are very familiar with.