
Impact

Improved data processing speed across millions of documents and datasets, added more complex analytics and algorithms, and increased accuracy and efficiency.

Reduced data management and infrastructure costs for several hundred terabytes of data and broadened access with dedicated data spaces for various groups.

Enhanced data governance and security management for ongoing compliance with national and European Union regulations.

Data Architecture

Industry

Public Sector

Country

Spain

Spain's Tax Agency Leverages Cloudera for Data Innovation

The Agencia Tributaria (AEAT) is the Spanish public entity in charge of applying the country's tax system correctly for over 48 million citizens. The agency manages and collects state taxes, including customs duties, prevents and detects fraud, and enforces sanctions for non-compliance with tax laws. It collaborates across Spain's regions, works within European Union requirements, and provides many assistance services to taxpayers. Like tax agencies around the world, it faces growing data volumes, changing regulations and new opportunities for AI.

Managing Massive Data Growth and Advanced Analytics Needs

The agency had to handle a growing volume of data from various sources, a challenge made larger by Spain's population of more than 48 million. Managing large data volumes was not enough; the agency also needed advanced analytics and AI capabilities. It had to run agile queries and develop more data engineering and machine learning algorithms in an environment with billions of records.

The agency needed a solution that integrated easily with existing systems while offering scalability, high availability, and guaranteed information governance and security. It also needed a platform that would simplify data administration tasks, add new analytics capabilities, and streamline tax and customs administration.

Implementing Data Lakehouse for Advanced Analytics and Compliance

The organization strategically chose Cloudera as a partner to create its Big Data platform. Recognizing the importance of a data lakehouse architecture to adapt to management and continuous growth needs, the Agencia Tributaria deployed the services on-premises within its proprietary infrastructure of high-availability servers. 

On this infrastructure, the Tax Agency works on several levels. First, Cloudera allows the agency to create isolated, controlled and performance-optimized data spaces through data partitioning and replication. In addition, it can index information from millions of documents that it was previously unable to fully capitalize on, because all of this content is now searchable by term.

The agency also relies on Cloudera to run complex algorithms that require the processing capabilities of distributed systems. Using Hive and Impala for queries and Spark for parallel processing, it performs operations on data tables with billions of records, including massive and complex cross-referencing of datasets, pattern searches, and more.

Finally, governance and regulatory compliance are essential for the organization, which must comply with Spain's National Security Scheme (Esquema Nacional de Seguridad). Using Cloudera Shared Data Experience, the agency grants data access only to the appropriate users, which lets each business area govern its own teams.

Enhanced Data Management and Preparing for Future Growth

Now, Cloudera is deployed in four different clusters to meet the needs of the organization's different environments. In total, more than forty dedicated processing nodes are operational. The platform hosts several hundred terabytes of information.

The Tax Agency relies on Cloudera for data management and as a foundation for advanced data analysis in a constantly growing environment. The platform helps the organization keep improving its analytical capabilities and ensure regulatory compliance.

The organization is now able to index information from millions of documents, run complex algorithms, and create isolated and controlled data spaces. 

In the future, the agency anticipates a further increase in data, driven by new sources and ongoing growth. As a result, its data and analytics needs will continue to expand, and it plans to leverage tools such as Cloudera's to fulfill the state-level functions it performs.
