CDH

100% Open Source Distribution including Apache Hadoop

CDH is the world’s most complete, tested, and popular distribution of Apache Hadoop and related projects. CDH is 100% Apache-licensed open source and is the only Hadoop solution to offer unified batch processing, interactive SQL, interactive search, and role-based access controls. More enterprises have downloaded CDH than all other such distributions combined.

Key Capabilities

Just as a Linux distribution gives you more than Linux, CDH delivers the core elements of Hadoop – scalable storage and distributed computing – along with additional components such as a user interface, plus necessary enterprise capabilities such as security and integration with a broad range of hardware and software solutions.

All the integration work is done for you, and the entire solution is thoroughly tested and fully documented. By taking the guesswork out of building out your Hadoop deployment, CDH gives you a streamlined path to success in solving real business problems.

What's Inside?

CDH includes the core elements of Apache Hadoop plus several additional key open source projects that, when coupled with customer support, management, and governance through a Cloudera Enterprise subscription, can deliver an enterprise data hub.

Cloudera's Hadoop

Online NoSQL – HBase

HBase is a distributed key-value store that helps you build real-time applications on massive tables (billions of rows, millions of columns) with fast, random access.
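To make the "massive tables with fast, random access" idea concrete, here is a minimal in-memory sketch of a sparse table keyed by row and column. It is an illustration only and does not use the HBase API: the `SparseTable` class, its method names, and the sample row keys and column names are all hypothetical. It assumes rows are kept sorted by key so that a range of rows can be scanned in order, and it stores only the cells that actually hold a value.

```python
from bisect import bisect_left, insort


class SparseTable:
    """Toy in-memory stand-in for a wide, sparse key-value table (not the HBase API)."""

    def __init__(self):
        self._rows = {}         # row key -> {column name -> value}
        self._sorted_keys = []  # row keys kept sorted to allow range scans

    def put(self, row_key, column, value):
        # Only cells that are written take up space; absent columns cost nothing.
        if row_key not in self._rows:
            self._rows[row_key] = {}
            insort(self._sorted_keys, row_key)
        self._rows[row_key][column] = value

    def get(self, row_key, column, default=None):
        # Random access to a single cell by (row key, column).
        return self._rows.get(row_key, {}).get(column, default)

    def scan(self, start_key, stop_key):
        """Yield (row_key, columns) for start_key <= row_key < stop_key, in key order."""
        lo = bisect_left(self._sorted_keys, start_key)
        hi = bisect_left(self._sorted_keys, stop_key)
        for key in self._sorted_keys[lo:hi]:
            yield key, dict(self._rows[key])


# Hypothetical sample data: two rows with different sets of columns.
table = SparseTable()
table.put("user#0001", "profile:name", "Ada")
table.put("user#0001", "profile:city", "London")
table.put("user#0002", "profile:name", "Grace")

name = table.get("user#0001", "profile:name")
missing = table.get("user#0002", "profile:city")
scanned = [key for key, _ in table.scan("user#0001", "user#0002")]
```

A real deployment distributes these rows across many servers; this sketch only shows the access pattern of looking up individual cells and scanning a sorted key range.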

Search – Cloudera Search

Cloudera Search lets your users query and browse data in Hadoop just as they would search Google or their favorite e-commerce site.

Analytic SQL – Impala

Impala is the industry’s leading massively parallel processing (MPP) SQL engine built for Hadoop.



In-Memory Machine Learning and Stream Processing – Apache Spark

Spark delivers fast, in-memory analytics and real-time stream processing for Hadoop.

 

CDH is:

  • Flexible - Store any type of data and process it with an array of different computation frameworks, including batch processing, interactive SQL, free-text search, machine learning, and statistical computation.
  • Integrated - Get up and running quickly on a complete, packaged Hadoop platform.
  • Secure - Process and control sensitive data and facilitate multi-tenancy.
  • Scalable & Extensible - Enable a broad range of applications and scale them with your business.
  • Highly Available - Run mission-critical workloads with confidence.
  • Open - Benefit from rapid innovation without proprietary vendor lock-in.
  • Compatible - Extend and leverage existing IT investments.

Get Started with Cloudera Express

Supercharge CDH and get up and running even more quickly with Cloudera Express, which includes both CDH and Cloudera Manager to help you with automated deployment, configuration, and cluster management.

This powerful management automation tool offers the fastest and easiest way to get your Hadoop cluster up and running and start exploring your first use cases. Once your cluster is operational and you’re ready to take the next step, check out Cloudera Enterprise, our subscription offering that includes Cloudera’s unique proactive and predictive support and powers most of the world’s production Hadoop clusters.

Kite SDK for Developers

Kite is an open source set of libraries, tools, examples, and documentation engineered to simplify the most common tasks when building applications on top of Hadoop.

Get Announcements About New CDH Releases

To receive CDH release announcements, bookmark or subscribe to the Release Announcements forum.