Apache HBase, a key component of CDH, is a distributed, scalable data store that runs on top of HDFS. HBase is modeled after Google’s Bigtable and stores data in massive tables (billions of rows by millions of columns) for fast, random access.
Primary Use Cases for HBase
Serving data to many users or applications
Traditional relational databases are not inherently distributed. As the number of users reading and writing data grows, the storage, memory, and CPU requirements can quickly grow beyond what a single machine can serve. HBase is distributed by design: the system is architected to use the storage, memory, and CPU resources of any number of servers (or nodes) in a “cluster”, so the database scales horizontally as load and performance demands increase.
Providing fast, random read/write access to users and applications
HDFS is a "write once read many" (WORM) file system that’s tuned for batch operations. The emphasis is on high throughput rather than low latency. HBase augments HDFS by providing record-based storage that allows users and applications to perform fast, random reads and writes to data. Changes are first held in memory and later flushed to HDFS for persistence. This enables the Hadoop system to serve random reads and writes to users and applications across big tables in real time.
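The write path described above, where changes are held in memory and later pushed down to write-once storage, can be illustrated with a minimal sketch. This is a toy model, not HBase code: the class `ToyStore`, its `flush_threshold` parameter, and the in-memory "segments" that stand in for files on HDFS are all assumptions made for illustration. It also leaves out the write-ahead log that HBase uses to protect changes that have not yet been flushed.

```python
import bisect


class ToyStore:
    """Toy model: buffer writes in memory, flush them to immutable sorted segments."""

    def __init__(self, flush_threshold=3):
        self.flush_threshold = flush_threshold  # hypothetical tuning knob
        self.memstore = {}   # recent writes, held in memory
        self.segments = []   # flushed (keys, values) snapshots, oldest first

    def put(self, row_key, value):
        # A write only touches the in-memory buffer, so it is fast.
        self.memstore[row_key] = value
        if len(self.memstore) >= self.flush_threshold:
            self.flush()

    def flush(self):
        # Write the buffer out once, sorted by row key, and never modify it again.
        # The pair of tuples stands in for a write-once file on HDFS.
        if not self.memstore:
            return
        items = sorted(self.memstore.items())
        keys = tuple(k for k, _ in items)
        values = tuple(v for _, v in items)
        self.segments.append((keys, values))
        self.memstore = {}

    def get(self, row_key):
        # Newest data wins: check memory first, then segments from newest to oldest.
        if row_key in self.memstore:
            return self.memstore[row_key]
        for keys, values in reversed(self.segments):
            i = bisect.bisect_left(keys, row_key)  # binary search in the sorted keys
            if i < len(keys) and keys[i] == row_key:
                return values[i]
        return None
```

In this sketch an update to an existing row never rewrites a flushed segment. The new value simply shadows the old one on reads, which is how random writes can sit on top of a write-once file system.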
Key Features of Apache HBase
- Scale-out architecture – add servers to increase capacity
- Full consistency – guard against node failures or simultaneous writes to the same record
- High availability – multiple master nodes ensure continuous access to data
- Automatic sharding – transparently and efficiently scale out your data across machines in the cluster
- Active-active replication – stream data across locations for disaster recovery and data protection
- Security – table and column family-level access control, with authentication via Kerberos
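To make the automatic sharding item above more concrete, here is a minimal sketch of range-based sharding by row key. It is a toy model under stated assumptions, not HBase's implementation: the class `ToyTable` and the `max_rows_per_region` threshold are hypothetical, and the sketch keeps every shard in one process rather than assigning shards to servers in a cluster. HBase calls such row-key ranges regions.

```python
import bisect


class ToyTable:
    """Toy model: rows are partitioned into contiguous row-key ranges that split as they grow."""

    def __init__(self, max_rows_per_region=4):
        self.max_rows = max_rows_per_region  # hypothetical split threshold
        self.start_keys = [""]   # sorted start key of each region; "" covers the lowest keys
        self.regions = [{}]      # rows held by each region, parallel to start_keys

    def _region_index(self, row_key):
        # The owning region is the last one whose start key is <= row_key.
        return bisect.bisect_right(self.start_keys, row_key) - 1

    def put(self, row_key, value):
        i = self._region_index(row_key)
        self.regions[i][row_key] = value
        if len(self.regions[i]) > self.max_rows:
            self._split(i)

    def get(self, row_key):
        return self.regions[self._region_index(row_key)].get(row_key)

    def _split(self, i):
        # Split at the median row key so each half holds about the same number of rows.
        keys = sorted(self.regions[i])
        mid = keys[len(keys) // 2]
        rows = self.regions[i]
        self.regions[i] = {k: v for k, v in rows.items() if k < mid}
        self.start_keys.insert(i + 1, mid)
        self.regions.insert(i + 1, {k: v for k, v in rows.items() if k >= mid})
```

Callers only ever use `put` and `get`. The routing and splitting happen behind those calls, which is the sense in which the sharding is transparent to the application.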
Get Support for HBase with Cloudera Enterprise
Cloudera Enterprise is the best way to leverage the power of Apache HBase in production environments. When you deploy HBase in an enterprise data hub with Cloudera Enterprise Flex Edition or Data Hub Edition, you can rely on our market-leading technical support for HBase and actively influence the development of the project.