Data lakehouse
Healthcare
Spain
Madrid Salud is the public entity responsible for managing the healthcare system in the Madrid Metropolitan Area. It oversees the provision of healthcare services in all centers within the region and implements institutional programs for disease prevention, rehabilitation, and promotion of healthy habits for the more than six million inhabitants of the area.
Through the GENESIS initiative, the DG Information Systems and Digital Health is working on the digital transformation strategy for the Madrid healthcare service, with data as the central focus.
The main objective of the General Directorate is to enhance the patient experience by investing in data management technologies that improve accessibility to the system, while also considering the needs of the 35,000 healthcare professionals involved. The goal is to provide a self-service-based care model and, operationally, to streamline both regional processes and the range of services available to citizens.
This initiative is part of a Digital Health plan that aims to transform healthcare in Madrid based on four pillars: accessibility, telemedicine, artificial intelligence (AI), and healthcare data management.
Data-driven strategy to improve patient experience
In 2019, Madrid Salud started to work with Cloudera technology to create a unified data space that would facilitate accessibility to information and foster research in the healthcare field through proper data governance.
The process started with a highly fragmented architecture, with legacy applications that interoperated with each other. Madrid Salud first implemented Cloudera Data Warehouse and has since migrated it to the latest versions, which enables a unified repository built on a modern data lakehouse architecture powered by Cloudera, with self-service capabilities.
Healthcare organizations are inherently complex, and Madrid Salud is no exception. It manages 35 hospitals, more than 200 health centers, and 35,000 professionals. The organization has transitioned from an outdated, complex architecture to a modern, unified one where data is at the heart of digital transformation.
Collaboration with Cloudera and CGM Clinical España has allowed Madrid Salud to redefine its healthcare model, and it now manages one of the largest data lakes in the European healthcare sector. The institution holds comprehensive patient health data and is now in a position to unlock the full potential of that data.
A Data Governance Office ensures data quality, metadata management, and security while also handling data ownership in a sector where data is highly complex and variable, requiring robust standardization. The organization has developed a multi-layered model for ingesting information from diverse data sources using a hybrid on-premises and cloud platform. This setup ensures data persistence, virtualization, and application capabilities, enabling self-service consumption.
"Thanks to the technology provided by Cloudera, we have laid the foundations for a total and comprehensive transformation of healthcare in Madrid, aiming to improve the quality of life for all its users," says Nuria Ruiz Hombrebueno, Director-General of Digital Health at Madrid Salud.
Harnessing data to improve healthcare
In the project's initial phase, integrating data from over 400 applications into a single repository was a significant challenge. Big Data technologies were used to normalize and standardize these sources and to support the following stages:
Preparation: Tailoring data to meet specific analysis requirements, reducing initial processing costs.
Support: Providing data scientists with a Data Lab environment for interactive model development.
Algorithm Development: Enabling the creation of algorithms in multiple languages according to the needs and preferences of data scientists using a common framework.
Execution: Training and deploying algorithms, then publishing results for consumption.
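As a rough illustration of what normalizing and standardizing heterogeneous sources can involve, the following Python sketch maps records from two source applications onto one common schema. It is not Madrid Salud's actual pipeline: the source names, field names, date formats, and the Encounter schema are all hypothetical and chosen only for the example.

```python
from dataclasses import dataclass
from datetime import date, datetime


@dataclass(frozen=True)
class Encounter:
    """Hypothetical common schema for a patient encounter."""
    patient_id: str
    center_code: str
    encounter_date: date


# Hypothetical mappings: source field name -> common field name.
FIELD_MAPS = {
    "hospital_app": {
        "pid": "patient_id",
        "centre": "center_code",
        "fecha": "encounter_date",
    },
    "primary_care_app": {
        "PatientID": "patient_id",
        "Center": "center_code",
        "VisitDate": "encounter_date",
    },
}

# Hypothetical date format used by each source application.
DATE_FORMATS = {
    "hospital_app": "%d/%m/%Y",
    "primary_care_app": "%Y-%m-%d",
}


def normalize(source: str, record: dict) -> Encounter:
    """Rename source fields and standardize their values."""
    mapping = FIELD_MAPS[source]
    # Keep only the fields we know how to map, under their common names.
    renamed = {mapping[k]: v for k, v in record.items() if k in mapping}
    return Encounter(
        patient_id=str(renamed["patient_id"]).strip().upper(),
        center_code=str(renamed["center_code"]).strip().upper(),
        encounter_date=datetime.strptime(
            renamed["encounter_date"], DATE_FORMATS[source]
        ).date(),
    )


if __name__ == "__main__":
    a = normalize(
        "hospital_app",
        {"pid": " ab123 ", "centre": "h01", "fecha": "05/03/2021"},
    )
    b = normalize(
        "primary_care_app",
        {"PatientID": "AB123", "Center": "H01", "VisitDate": "2021-03-05"},
    )
    # Both source records describe the same encounter once normalized.
    print(a == b)
```

Keeping the per-source mappings in plain dictionaries means that onboarding another application only requires adding a new entry, which matters when hundreds of sources feed a single repository.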
The healthcare sector generates a massive volume of data due to extensive user interactions. Following this successful integration, Madrid Salud created the largest healthcare data repository in Spain, holding over 1.6 billion objects. Professionals now have a comprehensive view of patients and their conditions, supported by an integrated medical imaging platform. This project has also laid the technological groundwork for telemedicine services.
Moving toward self-service and advanced analytics
In the next phase, Madrid Salud is developing a self-service model for professionals, leveraging descriptive and prescriptive analytics. The transition to the latest data lakehouse technology will enable the organization to maximize the value of its data.
For instance, genomic data, combined with historical and demographic data, will play a critical role in early diagnosis. The diversity and complexity of healthcare data necessitate cutting-edge technology for effective data management.
Co-creation model to promote medical research
As part of its transformation, Madrid Salud established the Digital Health Innovation Center in 2022, housed in the administrative building of Zendal Hospital. This center operates under a co-creation model, providing a governed platform where the healthcare sector can develop clinical trials using anonymized and controlled data. It also opens the door for collaboration with the pharmaceutical and technology industries.
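One common building block for preparing controlled research datasets is to drop direct identifiers and replace the patient identifier with a keyed hash. The Python sketch below illustrates that idea only; it is an assumption for illustration, not a description of how the Digital Health Innovation Center processes data. The field names and the key are hypothetical, and keyed hashing on its own is pseudonymization rather than full anonymization.

```python
import hashlib
import hmac

# Hypothetical set of fields treated as direct identifiers.
DIRECT_IDENTIFIERS = {"name", "national_id", "address"}


def pseudonymize(record: dict, secret_key: bytes) -> dict:
    """Return a copy of the record without direct identifiers.

    The patient identifier is replaced by an HMAC-SHA256 token, so the
    same patient always maps to the same token for a given key, while
    the original identifier cannot be derived without that key.
    """
    token = hmac.new(
        secret_key,
        record["patient_id"].encode("utf-8"),
        hashlib.sha256,
    ).hexdigest()
    cleaned = {
        key: value
        for key, value in record.items()
        if key not in DIRECT_IDENTIFIERS and key != "patient_id"
    }
    cleaned["patient_token"] = token
    return cleaned


if __name__ == "__main__":
    key = b"example-key-for-illustration-only"  # hypothetical key
    row = {
        "patient_id": "AB123",
        "name": "Jane Doe",
        "national_id": "00000000X",
        "diagnosis_code": "E11",
    }
    print(pseudonymize(row, key))
```

Because the token is stable for a given key, records belonging to the same patient can still be linked within one research project, while using a different key per project prevents linking across projects.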
To support this platform, a governance layer was implemented to manage, control, and authorize all projects, and to prepare data catalogs and quality standards for research. The organization is now developing a series of technological and clinical trials to refine data models at both the source and destination levels.
This initiative will result in a federated and open data suite, adding value to Madrid’s healthcare ecosystem and fostering collaborations with national and international organizations. The ultimate goal is to strengthen data-driven healthcare models that treat data as a vital asset.