Open Data Lakehouse
Financial Services
Spain
Stabilizing an Economy on a Foundation of Data
The Banco de España (BdE) is a public institution that serves as Spain’s national central bank. It is also responsible for supervising the Spanish banking system and other financial intermediaries operating in Spain, all within the European institutional framework. Its mission is to ensure price and financial stability in support of economic growth. In addition, it contributes to the design of other economic policies through its analysis. The organization manages vast amounts of information, which underpins the reports and studies it publishes, so effective use of data is a key priority.
A Critical System Strained by Silos and Obsolescence
One of Banco de España’s key services is the Central Credit Register (CIR, from its Spanish name, Central de Información de Riesgos), a public service that maintains a database of virtually all loans, credits, guarantees, and general credit exposures that financial institutions have with their clients. The CIR exchanges data with 360 institutions, covering more than €3 trillion in credit and other exposures. The system is continuously updated and issues over 450 million reports annually.
The CIR was also the first unit at the organization to use a computer. Various technological solutions were implemented in the following years, including a mainframe-based operational system, relational databases, and analytical tools. Over time, data volumes grew significantly, and the existing technologies could no longer meet BdE’s needs. Data was siloed, which complicated access and interoperability and produced multiple sources of truth that undermined data quality. Obsolescence, inflexible applications, and high maintenance costs compounded the problem.
From Silos to Synergy: Unifying Data with a Modern Lakehouse
To address these limitations, and as part of its 2020–2024 Strategic Plan, Banco de España redesigned its information management processes, focusing on evolving the CIR under the BigCIR project. The initiative defined several key objectives, notably reducing time-to-market and unifying data governance, both essential for managing the data lifecycle (lineage, auditing, security, etc.).
BigCIR is built on a data lakehouse powered by Cloudera technology. It offers greater reporting agility, eliminates data silos, and adapts to users’ new analytical demands. The project also allows the bank to keep pace with the exponential growth in microdata analysis and lays the groundwork for developing artificial intelligence capabilities.
Measuring Success in Terabytes and Trust
BigCIR went live in June 2023 and currently manages up to 850 TB of data and 77 billion records, with more than 200 employees running around 3,300 queries per week. A disruptive and successful project, it gives users full access to CIR microdata through an open platform that integrates analytical tools for effective data exploitation.
Another area of strong focus has been data quality. Thanks to Cloudera’s advanced automation capabilities, the organization ensures that all published data includes the appropriate labels to indicate its level of accuracy. These quality indicators tell users whether the data is final or provisional, so they can interpret it correctly.
After standardizing and automating its data processing, Banco de España is ready to meet today’s information demands. It has also implemented a technical data architecture that accelerates the development and deployment of use cases on BigCIR. Looking ahead, the organization is working to improve system usability, moving toward a sandbox model with reusable components, and progressing toward a hybrid model with optional cloud connectivity.