Overview
Why Apache Iceberg?
Apache Iceberg is an open table format purpose-built for large-scale analytics. It delivers the reliability and simplicity of SQL tables, providing data warehouse-like capabilities directly on data lake storage.
Apache Iceberg is not a storage system, it’s not a database, and it’s not a compute engine. It is a metadata management layer that sits on top of your data files, which can be stored wherever you want. Iceberg makes data accessible to multiple compute engines concurrently while guaranteeing data reliability and consistency.
Reasons for adopting Iceberg.
Openness
Iceberg is fully open, vendor-agnostic, and engine-agnostic. It has the broadest community support from both vendors and non-vendors, which accelerates unbiased innovation.
Modern data warehouse functionality
Iceberg features, such as transactional consistency, hidden partitioning, schema evolution, and time travel, ease data operations.
Petabyte-scale analytics
Iceberg was built from the ground up for analytics at this scale. By maintaining its own metadata layer, it eliminates the bottlenecks of previous table formats.
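As a rough illustration of how a metadata layer can support features such as transactional commits and time travel, the toy Python sketch below models a table as a list of immutable snapshots, each recording which data files belong to the table at that point. This is not the Iceberg API or its actual metadata format; `ToyTable`, `Snapshot`, `commit_append`, and `scan` are hypothetical names chosen for this example, and the file paths are made up.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Snapshot:
    """An immutable record of which data files made up the table at one point."""
    snapshot_id: int
    timestamp_ms: int
    data_files: Tuple[str, ...]


@dataclass
class ToyTable:
    """Toy stand-in for table metadata (hypothetical, not the Iceberg API)."""
    snapshots: List[Snapshot] = field(default_factory=list)
    current_id: Optional[int] = None

    def current(self) -> Optional[Snapshot]:
        """Return the snapshot the table currently points at, if any."""
        for snap in self.snapshots:
            if snap.snapshot_id == self.current_id:
                return snap
        return None

    def commit_append(self, new_files: List[str], timestamp_ms: int) -> Snapshot:
        """Add files by writing a new snapshot, then moving the current pointer.

        Earlier snapshots are never modified, so readers of an older
        snapshot keep seeing a consistent set of files.
        """
        base = self.current()
        existing = base.data_files if base is not None else ()
        snap = Snapshot(
            snapshot_id=len(self.snapshots) + 1,
            timestamp_ms=timestamp_ms,
            data_files=existing + tuple(new_files),
        )
        self.snapshots.append(snap)
        self.current_id = snap.snapshot_id
        return snap

    def scan(self, as_of_ms: Optional[int] = None) -> Tuple[str, ...]:
        """List the data files to read, optionally as of an earlier time."""
        if as_of_ms is None:
            snap = self.current()
        else:
            # Time travel: pick the latest snapshot committed at or before as_of_ms.
            candidates = [s for s in self.snapshots if s.timestamp_ms <= as_of_ms]
            snap = max(candidates, key=lambda s: s.timestamp_ms) if candidates else None
        # In this toy model, a time before the first commit yields no files.
        return snap.data_files if snap is not None else ()


table = ToyTable()
table.commit_append(["data/a.parquet"], timestamp_ms=1_000)
table.commit_append(["data/b.parquet"], timestamp_ms=2_000)

current_files = table.scan()                 # files in the latest snapshot
files_at_1500 = table.scan(as_of_ms=1_500)   # files as of an earlier point in time
```

In this sketch a query engine only needs the snapshot list to decide which files to read, which is the general idea behind keeping table state in metadata rather than discovering it from the storage layout.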
Apache Iceberg on Cloudera.
We integrate Iceberg as a first-class citizen, right into our Data Lakehouse.
Run high-performance analytics, data engineering, data science, and AI, while bringing the right engine for the right job to your data in place, eliminating data movement and data copies.
Abstract storage from compute. Get unified access to structured, semi-structured, and unstructured data in the data lakehouse. Use built-in AI chatbots to explore and leverage all of your data.
Why run Apache Iceberg on Cloudera?
The only hybrid open data lakehouse powered by Iceberg
Deploy anywhere, on any cloud or in your data center, wherever your data resides
Multi-engine support
Get the broadest set of pre-integrated data services and capabilities for ingestion, processing, analytics, and AI to support your entire data lifecycle
Lower TCO by up to 75%
A common standard for data with unified security and governance eliminates ETL, data silos, and data copies, reducing TCO by up to 75%
Benefits of Cloudera's open data lakehouse, powered by Apache Iceberg
Democratize data: Empower everyone to access data-driven insights with natural language
Accelerate analytics and AI: Deploy Generative AI applications and dashboards on your data
Keep data open & interoperable: Own your data and leverage your choice of tools
Unlock the full potential value of your data
Introducing Apache Iceberg: The Case for an Open Data Lakehouse Powered by Cloudera
Customers
Apache Iceberg guarantees full ownership of your data
Get engaged
Blogs
Empower Your Cyber Defenders with Real-Time Analytics (Author: Carolyn Duby, Field CTO)
Cloudera Lakehouse Optimizer Makes it Easier Than Ever to Deliver High-Performance Iceberg Tables
Databricks Follows Cloudera by Adopting Iceberg, While Snowflake Mulls Open Source Approach
Documentation
Getting started with Apache Iceberg