How far is the pharmaceutical industry from the optimal data lifecycle?

The pharmaceutical industry is looking at its vast reserves of data for actionable insights to inform drug discovery.

However, many hurdles remain before drug discovery reaches the optimal data lifecycle.

Dr. Gerhard Noelken, Business Development Europe, Pistoia Alliance, examines the road ahead for pharmaceuticals.

How far is the pharma sector from the optimal data lifecycle?

Gerhard: “Good data lifecycle management needs a consistent data architecture combined with semantic technology that supports the different ontologies used in different pharmaceutical domains.

“While many areas like clinical research and safety have invested heavily in excellent data management, aligning those domains with the data coming from areas like early research, commercial or manufacturing can still be difficult.

“In research, creative excellence in coming up with new ideas is not always easily compatible with the use of a consistent vocabulary or data standards. Semantic technology and intelligent data capture systems that automatically record experimental and contextual metadata can be very helpful.

“Good data standards can be a great relief for a whole scientific domain if they are introduced efficiently and broadly accepted. Not only will scientists benefit from more efficient data sharing, but now data analysis tools and artificial intelligence machines can digest data from very different sources much more efficiently without the need for huge data cleansing activities.”


If you were given the same opportunity to remove one large obstacle from the path to lifecycle nirvana, what would that hurdle be, Gerhard?

Gerhard: “The need to constantly improve the understanding of how important data governance is.

“We typically see two different types of data: data generated around new potential drug candidates and data generated for the better understanding of basic research principles. Unfortunately, data in those two areas are often treated very differently.

“In the first case, the key driver is the success of the project: getting the drug to the patient as efficiently and safely as possible. This sometimes means that you have to drop non-efficient compounds early and cannot invest in documenting all the information for those compounds to the same level of detail.

“When you look into these datasets ten years later to evaluate a different target mechanism, “negative” data can be as valuable as, if not more valuable than, the data of your original lead compound.

“Good data governance standards require applying the same level of attention to all the various data types. In the long term, this can create much higher value for the “Corporate knowledge repository” that pharma companies have access to today.”