Overview
Data lake flexibility & data warehouse performance in a single platform.
An open data lakehouse helps organizations run fast analytics on all their data, structured and unstructured, at massive scale. It eliminates data silos and lets data teams collaborate on the same data with the tools of their choice, on any public or private cloud.
This modern data architecture delivers data reliability with ease of data management. Run BI, AI, ML, and streaming analytics on the same data without ever moving or locking your data.
Cloudera delivers the world's only open data lakehouse providing the following benefits:
Open architecture
Cloudera’s data lakehouse, powered by Apache Iceberg, is 100% open: open source, open standards based, and widely adopted by the community. It can store multiple data formats and enables multiple engines to work on the same data.
Ease of adoption
By integrating Iceberg directly into the Shared Data Experience (SDX), Cloudera offers the easiest path to deploying a lakehouse. Additional capabilities such as schema evolution and hidden partitioning simplify data management for large data sets.
Multi-cloud
Build a data lakehouse anywhere, on any public cloud or in your own data center. Build once and run anywhere without any headaches. Cloudera offers the same data services with full portability on all clouds.
Secure and governed
The Iceberg tables in Cloudera integrate within SDX, allowing for unified security, fine-grained policies, governance, lineage, and metadata management across multiple clouds, so you can focus on analyzing your data while we take care of the rest.
Cloudera's Open Data Lakehouse is now available on private cloud.
Use AI Via an End-to-End Data Lakehouse to Increase Data Lifecycle Efficiency
Key Components
Supercharge your data with an open lakehouse
Multifunction analytics
Cloudera provides the full range of data services to run AI, ML, BI, streaming analytics, and data engineering on your data lakehouse. From ingestion and streaming to processing, persistence, orchestration, discovery, and access, powerful and scalable data services deliver key analytic functions. You can also bring your own choice of tools.
Open Table Format, Apache Iceberg
Apache Iceberg is the key building block of the open lakehouse. It is a high-performance open table format for large analytic tables that brings the reliability of SQL tables to big data, while making it possible for multiple compute engines to work on the same tables concurrently. It offers rich capabilities such as time travel, snapshot isolation, schema evolution, and hidden partitioning.
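To make the ideas of snapshots and time travel more concrete, here is a minimal Python sketch of a table that records an immutable snapshot on every commit. It is a toy model only, not Iceberg's implementation or API: the names ToyTable and Snapshot are hypothetical, and a real Iceberg table tracks metadata and data files rather than rows held in memory.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

Row = Tuple[int, str]


@dataclass(frozen=True)
class Snapshot:
    """Immutable view of the table as of one commit (hypothetical type)."""
    snapshot_id: int
    rows: Tuple[Row, ...]


@dataclass
class ToyTable:
    """Toy table that keeps every committed snapshot (hypothetical type)."""
    snapshots: List[Snapshot] = field(default_factory=list)

    def current(self) -> Snapshot:
        # An empty table is represented by snapshot 0 with no rows.
        return self.snapshots[-1] if self.snapshots else Snapshot(0, ())

    def append(self, new_rows: List[Row]) -> int:
        # A commit never mutates an old snapshot; it adds a new one.
        base = self.current()
        snap = Snapshot(base.snapshot_id + 1, base.rows + tuple(new_rows))
        self.snapshots.append(snap)
        return snap.snapshot_id

    def scan(self, snapshot_id: Optional[int] = None) -> List[Row]:
        # With no id, read the latest snapshot; with an id, "time travel".
        if snapshot_id is None:
            return list(self.current().rows)
        for snap in self.snapshots:
            if snap.snapshot_id == snapshot_id:
                return list(snap.rows)
        raise KeyError(f"unknown snapshot: {snapshot_id}")


table = ToyTable()
first_id = table.append([(1, "a")])
pinned = table.current()            # a reader pins the snapshot it started on
second_id = table.append([(2, "b")])

latest_rows = table.scan()          # sees both commits
old_rows = table.scan(first_id)     # sees the table as of the first commit
```

Because each snapshot is immutable, the reader holding `pinned` keeps a consistent view while a writer commits new data, which is the intuition behind snapshot isolation in this simplified model.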
Shared Data Experience (SDX)
SDX is a fundamental part of Cloudera that delivers unified security and governance technologies built on metadata. By providing full data management across data and analytics on any infrastructure, SDX reduces risk and operational costs. IT can deploy fully secured and governed data lakehouses faster, giving more users access to more data, without compromise.
Robust Data Catalog
Find, curate, and tag data anywhere across all infrastructures and generate relevant insight with Cloudera Data Catalog:
- Understand, document, and monitor data and its use
- Observe regulations and standards for relevant data
- Implement organizational and technical data protection measures
- Collaborate and share data responsibly with full insight
Customer
The data lakehouse helps global retailer NEW YORKER anticipate customer needs for a better in-store experience.
Resources
Discover more insights on managing data anywhere
GigaOm Radar for Data Lakes & Lakehouses
Cloudera named a 2024 market leader for data lakehouses.