Quality by design: Is your data management FAIR?
Ian Harrow, project lead for the Pistoia Alliance's FAIR implementation project, discusses the legacy of ineffective data management and how we can better enact FAIR principles
The legacy of ineffective data management
If there was one word to describe the biggest challenge life sciences companies face when it comes to data management, it would be legacy.
In the early 90s, big pharma had to get comfortable with a growing wave of mergers and acquisitions. The inevitable consequence of this M&A activity was the convergence of internal legacy data with other companies’ legacy data.
With little of this data well managed, or stored in a way that was easy to access and recognize, vital knowledge and findings were locked away in disparate silos.
As Ian Harrow, project lead at the Pistoia Alliance for the FAIR implementation project, told us, “there has always been a need and a struggle to manage data well”.
Dr Martin Romacker, Principal Scientist at Roche, explained that today, we have “data assets that are siloed, stored in varying formats, hard to retrieve and share and are not interoperable”. With the industry undergoing significant digital transformation, set to generate enormous data volumes and looking into the adoption of new technology, ineffective data management cannot continue.
Although the FAIR (Findable, Accessible, Interoperable and Reusable) data principles have been known since 2016, there has been a recent drive to adopt these concepts further and ensure that we implement quality by design for data management.
Many companies understand that data can be a valuable corporate asset and that the current ‘data deluge’ is putting their scientists under continuous pressure. To create a more collaborative and successful research environment, ready to make much needed breakthroughs, data must be better managed.
RELATED: Watch our recent webinar on data integrity to understand what the FDA are looking for in your data processes
Putting the FAIR principles into practice
As Ian Harrow explained, “what has been happening in the last 3 years is a move beyond the FAIR guidelines as we get to the hard part which is putting them into practice”. Although simple in principle, the FAIR guidelines present a number of stumbling blocks when put into practice.
Ensuring data is findable requires an underlying structure of data identifiers and oversight of where all data is located. Accessibility then demands a clear set of controls, so that only appropriate people can access certain data while other data is open where necessary. Ian pointed out that “FAIR does not mean open data. It means data that is accessible and administered in a managed way with standards for accessibility and protocols”.
Interoperability requires that data can be moved to all the places it is needed, one of the toughest aspects of FAIR to meet. This, coupled with reusability, presents perhaps the biggest challenge for many companies. If all the other aspects are in place, reusability becomes far easier; without the necessary structure, many companies have to start from scratch. Given the legacy of siloed operations and data management, a new approach that embeds the FAIR principles into the data collection process is needed.
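As an illustration, the requirements above can be sketched as a check on a minimal metadata record, one field per FAIR principle. This is a hypothetical sketch: the field names, identifier, and URL are invented for the example and are not taken from any published metadata standard.

```python
# A minimal, hypothetical sketch of checking a FAIR-style metadata record.
# Field names are illustrative, not taken from any published standard.

REQUIRED_FIELDS = {
    "identifier",     # Findable: a globally unique, persistent identifier
    "access_url",     # Accessible: where and how the data can be retrieved
    "access_policy",  # Accessible: who may retrieve it (FAIR does not mean open)
    "format",         # Interoperable: a recognized, shared data format
    "license",        # Reusable: terms under which the data may be reused
}

def missing_fields(record: dict) -> set:
    """Return the required metadata fields absent from a record."""
    return REQUIRED_FIELDS - record.keys()

# A hypothetical assay dataset described with the fields above.
assay = {
    "identifier": "doi:10.9999/example-assay-001",       # invented DOI
    "access_url": "https://data.example.org/assay/001",  # invented URL
    "access_policy": "restricted-to-consortium",
    "format": "text/csv",
    "license": "CC-BY-4.0",
}

print(missing_fields(assay))                 # set() -> record passes the check
print(missing_fields({"identifier": "x"}))   # lists what still needs fixing
```

A check like this is the kind of structure that makes reusability "far easier" once the other principles are in place: a dataset that fails it is exactly the data that must be retro-fixed before it can be found, accessed, or shared.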
RELATED: Download our report to discover everything you need to know about how to build a smart lab
Quality by design
For all life sciences companies, data management is a continuous journey. Ian believes that during this process “it is very reasonable to ask what is ‘FAIR enough’”. He does not believe that, from the outset, the FAIR guidelines require all practices to be perfectly FAIR. Rather, your approach to the principles should be driven by your intended application: what you are looking to do with the data to make the best use of it.
Instead of striving for perfect data management, Ian and the Pistoia Alliance recommend a more measured approach. Treating the process as a journey makes the change far more manageable.
Ian shared the example of ‘bring your own data workshops’, where people can bring their data and assess what they currently have against the FAIR metrics. Through this, Ian believes they can “look for the low hanging fruit, those easy changes to make the data more FAIR, and also identify the more long-term changes to create an action plan”.
In his opinion, “what you don’t do is try to boil the ocean. You can’t fix everything at once”. By looking for those first easy steps, you can start to ask an important question: what is our current approach to data management?
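The workshop output described above amounts to a triage exercise: list the FAIR gaps found in each dataset, then tackle the easiest fixes first. A toy sketch of that prioritization, with invented dataset names, gaps, and effort estimates (a real assessment would use published FAIR maturity metrics):

```python
# Toy sketch of triaging datasets after a 'bring your own data' workshop.
# Dataset names, gaps, and effort scores are invented for illustration.

datasets = [
    # (name, FAIR gaps found, estimated effort to fix: 1=easy, 3=hard)
    ("assay_results.csv", ["no persistent identifier"], 1),
    ("legacy_lims_export", ["proprietary format", "no license"], 3),
    ("compound_registry", ["missing access policy"], 1),
]

# "Low hanging fruit" first: sort by effort so the easy changes top the plan,
# while the hard ones stay visible as the long-term part of the action plan.
action_plan = sorted(datasets, key=lambda d: d[2])

for name, gaps, effort in action_plan:
    print(f"{name}: fix {', '.join(gaps)} (effort {effort})")
```

Sorting rather than filtering is the point: nothing is dropped from the plan, the ocean simply is not boiled first.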
A difficult aspect of this process is often aligning all of the relevant stakeholders and convincing them of the lasting value of stronger data management practices. By implementing a short-term plan, you can work your way through the process and then use your results as a case study to showcase the impact and benefit of better data management.
This will lead to benefits in the long run. As Ian explained, building the right data management structure is similar to building a house: it is far better to employ a good architect and planner and build on good foundations than to have to rebuild again and again. He believes that “the future will be much more about better design”. Currently, we are dealing with data that is not FAIR and has to be verified before it can be used. In the future, with the correct processes, we can ensure the continual quality of data and eliminate the time lost to retrofitting.
From FAIR to new findings
As Dr Romacker of Roche pointed out, “companies need to act with a sense of urgency to implement FAIR if we are going to realize the value of analytical methods, such as deep learning, based on high quality data”.
As we move towards the Lab of the Future and more life sciences companies invest in AI, big data and machine learning, the quality of data will be essential.
Ian shared that one of the most important considerations to keep in mind as you pursue a FAIR strategy is that your data can be used by machines. As he explained, “humans will be needed as experts to manage the process, but if you’re going to work with big data, machine learning and AI, then what you need is the ability to scale what you’re doing”. He continued that “it’s all about machines taking care of workflows and being able to work with the datasets you have”. In his opinion, that is where good data management through the FAIR guidelines scores strongly.
In this period of transition, Ian believes that companies are open to a more collaborative approach to data management. While in the past, each company would have taken their own approach to fix this issue, he now foresees far more competitive co-operation to bring forward a new era of quality by design.
As companies seek to collaborate more, he believes that projects such as the FAIR toolkit will become more significant in encouraging this approach and in sharing best practices and use cases with a broad audience.
RELATED: Our Pharma IQ experts share insight on the use of artificial intelligence in Pharma