2x productivity on 25PB of data

Key Highlights




  • Reduced R&D costs through clear data access and centralization

  • Streamlined clinical trial processes and phases while reducing cost

  • Modernized clinical trial data for multiple use cases: safety networks, real-world data insights, and genomic analytics linked to past clinical trial data

  • Connected and centralized data for all clinical use and for analysis supporting FDA submissions

  • Real-time alerts to physicians on patient safety, such as severe effects across patients

For life science organizations, health issues are at the forefront of daily work. Treating patient illness, from acute to chronic to life-threatening conditions and everything in between, drives innovation. Data becomes the critical lifeline to finding therapies and cures for many diseases affecting patients on a global scale.

To manage the abundance of data from labs, doctors' notes, prescriptions, clinical trials, MRIs, and surgeries, life science organizations analyze it with machine learning models to find cures or ways to manage diseases.


Data, new and old, resided in silos across the company, which led to inefficient analytics and machine learning. Many companies managing patient health, like this large pharmaceutical company, need to constantly maintain a diverse drug portfolio to remain competitive in the marketplace.

Real-time streaming capabilities were non-existent, which presented a challenge. The previous platform was neither scalable nor flexible enough to store all the necessary data. The company had no predictive modeling and no central way to address the problem.

Lines of business within the company, like Research & Development (R&D), needed productivity improvements to accelerate new drug discovery and to let their scientists explore complete datasets much faster, making clinical trial innovation more efficient. The executive team mandated a corporate initiative to move to the cloud quickly. The company needed to act fast to manage its data across the organization, with comprehensive security to meet an onerous regulatory environment, no vendor lock-in, and the flexibility to choose its cloud vendors.


The pharmaceutical company adopted the Cloudera Data Platform (CDP) for the public cloud (AWS) as an R&D data convergence hub and clinical trial research platform. All research and development data on this platform (pre-clinical data, third-party medical records, and generic evidence data) will feed advanced analytics for new drug discovery and development.

Implementation of Cloudera DataFlow (CDF) also enables real-time streaming, using NiFi, from wearables used in the clinical trial process, including patient-facing applications. In addition to CDF, the customer is leveraging Cloudera's other integrated data services: data engineering (CDE), data warehouse (CDW), and machine learning (CML), all with consistent security and governance of highly sensitive patient data. CDP and AWS will be the foundation for a new transactional data warehouse.

Data analysis helped the company better understand its customers' needs, whether for a specific drug therapy, for surgery to repair or replace parts of the body, or for long-term strategies to fight life-threatening diseases. Product compound analytics were also important, as they enabled product managers and product teams to analyze the performance of patient devices. This information was critical for optimization and for diagnosing issues and problems.

The management of clinical trial data with the Cloudera Data Platform was pivotal to this pharmaceutical company. With vast amounts of data coming in through each phase of the trials, there was a strong need to analyze it efficiently and gain insights at each step. With a multi-tenant application, the company can also realize savings by reducing development and deployment costs.


The Cloudera Data Platform on AWS, together with CDF, enabled a central data store for fast, holistic analytics and machine learning for the clinical trial research platform and data hub. This helped create a better drug pipeline, increase efficiency, and lower the cost of managing clinical trials. In addition, doubling R&D productivity improved cycle time with faster and more effective trials.

The company wants to understand and cure diseases by developing therapies: a cure rather than a vaccine. The Cloudera solution diversifies its drug portfolio and accelerates FDA approvals to bring new drugs to market. “This will be based on thousands of data sources with more than 25 petabytes of unstructured data plus 20 million unique entities. The company will be able to identify and analyze over 13 trillion relationships,” said the company's Head of AI.

For the pharmaceutical company, the results were clear and helped it define a portfolio of use cases in safety networks, real-world data insights, and genomic analytics linked to past clinical trial data. What can take other companies over a year to build with on-premises technologies, the company accomplished in a few months with CDP in the public cloud on AWS and CDF.
