
Research Institute Analyzes Genome Data to Advance Precision Medicine for Predicting and Preventing Disease

PALO ALTO, Calif., June 06, 2017 - Cloudera, Inc. (NYSE: CLDR), the leading provider of the modern platform for machine learning and advanced analytics built on the latest open source technologies, announced that Inova Translational Medicine Institute (ITMI), a leading global medical research institute, has deployed Cloudera Enterprise to securely analyze massive collections of clinical and genomic data at unprecedented speed and scale for faster innovation in translational medicine research.

As part of the Inova Center for Personalized Health (ICPH), ITMI’s team of leading scientists, researchers, analysts and collaborators uses machine learning algorithms on terabytes of clinical and genomic information to identify the genetic links to diseases. They draw discoveries from the data and, in collaboration with the treating physician, develop personalized treatment plans for patients. This approach, known as precision medicine, has the power to help patients live longer, healthier lives.

Genetics plays a role in the majority of leading causes of death in the United States, including heart disease, cancer and diabetes. The Institute collects clinical data from thousands of Inova patients born in more than 110 countries. Just one person’s unique DNA contains six billion bits of information. Mapping individuals' DNA codes into genome sequences helps scientists determine the causes of diseases and discover transformative treatments. As part of this process, ITMI is also assembling what is expected to be one of the world’s largest whole genome sequence databases connected to patient information in a healthcare system.


“The challenge for ITMI researchers and scientists was to analyze our highly complex, massive collection of raw data faster and more efficiently and translate insights into practical patient care. We’re now able to get answers in minutes and seconds and can find correlations that we couldn’t see before,” said Aaron Black, chief data officer of ITMI. “Our researchers used to spend 80 percent of their time on data wrangling and only a sliver of time on the analytics. We’re in the process of reversing that. We can now accelerate the pace of genomic discovery and dramatically change the way we interact with our research teams. We believe that will improve our ability to provide the right treatments to the right patients and ultimately, improve outcomes. What Cloudera has done is made this imminently possible.”

The Cloudera platform enabled ITMI to streamline its genomic data analysis for discovery. This analysis allows a bioinformatics scientist to study genomic correlations among people with conditions like arthritis, autoimmune diseases or cancer. In the past, given the massive size of whole genomes, this process could take ITMI about two months. Using Cloudera, ITMI can complete end-to-end data analysis in one week. In the future, ITMI expects to complete these analyses in just hours.

Working with Cloudera, ITMI built a world-class bioinformatics infrastructure for the Institute's rapidly growing collection of genomes paired with clinical records. The infrastructure was designed to store and process this convergence of biological data, at speed and scale, well into the future.

While one genome equals more than three billion DNA base pairs, ITMI currently tracks approximately 9,000 whole sequenced genomes, scaling to 15,000 in the future. Cloudera’s modern analytic database powered by Apache Impala brings high-performance SQL analytics to big data. With the flexibility, scale and speed Cloudera provides, ITMI’s team will apply multi-user concurrency and high-performance analysis to genomic data gathered from mothers, fathers and infants enrolled in various family-based studies. For example, ITMI has been able to leverage its clinical and genomic analysis expertise to help discover previously undiagnosed congenital anomalies in infants. This is a time-consuming and iterative process, but with tools like Cloudera, ITMI anticipates accelerating these discoveries to help these families.

“Inova’s unique and leading edge big data architecture matches the diversity in their patient community and their breadth of innovation. Cloudera is proud to work with these pioneers in clinical genetics at scale, who are advancing genomic research and personalized healthcare,” said Shawn Dolley, industry leader, health and life science at Cloudera. “ITMI is advancing the way researchers and clinicians can consume and manage genomic and molecular data. Combining clinical and genetic data and layering in machine learning is how we will transform the decisions we make in patient care, disease prevention and precision public health.”

About Cloudera

Cloudera delivers the modern platform for machine learning and advanced analytics built on the latest open source technologies. The world’s leading organizations trust Cloudera to help solve their most challenging business problems with Cloudera Enterprise, the fastest, easiest and most secure data platform available for the modern world. Our customers efficiently capture, store, process and analyze vast amounts of data, empowering them to use advanced analytics to drive business decisions quickly, flexibly and at lower cost than has been possible before. To ensure our customers are successful, we offer comprehensive support, training and professional services. Learn more at


Cloudera, Hue and associated marks are trademarks or registered trademarks of Cloudera Inc. All other company and product names may be trademarks of their respective owners.

This press release contains forward-looking statements including, among other things, statements regarding the expected performance and benefits of Cloudera’s offerings. The words "believe," "may," "will," "plan," "expect," and similar expressions are intended to identify forward-looking statements. These forward-looking statements are subject to risks, uncertainties, and assumptions. If the risks materialize or assumptions prove incorrect, actual results could differ materially from the results implied by these forward-looking statements. Risks include, but are not limited to, risks described in our filings with the Securities and Exchange Commission (SEC), including our Form S-1 Registration Statement, and our future reports that we may file with the SEC from time to time, which could cause actual results to vary from expectations. Cloudera assumes no obligation to, and does not currently intend to, update any such forward-looking statements after the date of this release.

Press Contact

+1 (650) 644-3900
