CDH 4.2.1

Cloudera’s 100% Open Source Hadoop Platform

CDH is Cloudera's open source software distribution and consists of Apache Hadoop and additional key open source projects to ensure you get the most out of Hadoop and your data.

It is the only Hadoop solution to offer unified querying options (including batch processing, interactive SQL, text search, and machine learning) and necessary enterprise security features (such as role-based access controls).

Please note: CDH requires manual installation from the command line.
For a faster, automated installation, download Cloudera Manager.

What's New in CDH4.2.1

This is a maintenance release that fixes a number of issues, including the following:


Apache HDFS

  • HDFS-4618 - Default transaction interval for checkpoints is too low
  • HADOOP-9125 - LdapGroupsMapping threw CommunicationException after some idle time.
  • HADOOP-9150 - Avoid unnecessary DNS resolution attempts for logical URIs
  • HDFS-4595 - When short circuit read fails, DFSClient does not fall back to regular read
  • HDFS-4538 - Allow use of legacy blockreader
  • HDFS-4569 - Small image transfer related cleanups
  • HDFS-3224 - Bug in check for DN re-registration with different storage ID
  • HDFS-3990 - NN's health report has severe performance problems
  • HADOOP-8469 - Make NetworkTopology class pluggable
  • HDFS-4246 - The exclude node list should be more forgiving, for each output stream
  • HDFS-4521 - Invalid network topologies should not be cached
  • HDFS-4304 - Make FSEditLogOp.MAX_OP_SIZE configurable
  • HDFS-3277 - Fail over to loading a different FSImage if the first one we try to load is corrupt.
  • HADOOP-5442 - The job history display needs to be paged
  • HADOOP-7108 - JobConf link broken in job history page
  • HDFS-4596 - Shutting down namenode during checkpointing can lead to md5sum error
  • HDFS-4591 - HA clients can fail to fail over while Standby NN is performing long checkpoint
  • HADOOP-8159 - Mismatch of topology script mappings can trigger HADOOP-8159 if hostname/IP mapping mismatches for the same DN

Apache MapReduce

  • MAPREDUCE-5008 - Merger progress miscounts with respect to EOF_MARKER
  • MAPREDUCE-2217 - (Fixed case in which this bug leaves orphan processes hanging)
  • MAPREDUCE-4843 - When using DefaultTaskController, JobLocalizer not thread safe
  • HADOOP-5442 - The job history display needs to be paged
  • MAPREDUCE-4888 - NLineInputFormat drops data in 1.1 and beyond

Apache HBase

  • HBASE-8199 - Eliminate exception for ExportSnapshot against the null table snapshot
  • HBASE-8099 - ReplicationZookeeper.copyQueuesFromRSUsingMulti should not return any queues if it failed to execute.
  • HBASE-7991 - HFileReaderV1 caching the same parent META block could cause server abort when splitting
  • HBASE-8211 - Support for NN HA for 0.94
  • HBASE-7369 - HConnectionManager should remove aborted connections
  • HBASE-8288 - HBaseFileSystem: Refactoring and correct semantics for createPath methods

Apache Hive

  • HIVE-4141 - InspectorFactories contains static HashMaps which can cause infinite loop
  • HIVE-4075 - TypeInfoFactory is not thread safe and is accessed by multiple threads
  • HIVE-2264 - Hive Driver calls System.exit() which causes HiveServer to exit
  • HIVE-4122 - Queries fail if timestamp data not in expected format
  • HIVE-3528 - Avro SerDe doesn't handle serializing Nullable types that require access to a schema
  • HIVE-4119 - ANALYZE TABLE ... COMPUTE STATISTICS FOR COLUMNS fails with NPE if the table is empty
  • HIVE-4001 - Add o.a.h.h.serde.Constants for backward compatibility

Hue

  • Support for nested groups for LDAP import in Hue
  • HUE-1082 - LDAP Sync Users fails in the latest code branch
  • HUE-1080 - LDAP Sync Users fails with AD in Hue 2.2
  • HUE-1064 - Hue doesn't work when hadoop.rpc.protection is set to anything other than authentication

CDH 4.x Requirements and Supported Versions

Supported Operating Systems

CDH4 provides packages for Red-Hat-compatible, SLES, Ubuntu, and Debian systems as described below.

Operating System                                  Version                                     Packages

Red Hat compatible
Red Hat Enterprise Linux (RHEL)                   5.7                                         64-bit
                                                  6.2                                         64-bit, 32-bit
                                                  6.4                                         64-bit
CentOS                                            5.7                                         64-bit
                                                  6.2                                         64-bit, 32-bit
                                                  6.4                                         64-bit
Oracle Linux with Unbreakable Enterprise Kernel   5.6                                         64-bit
                                                  6.4                                         64-bit

SLES
SUSE Linux Enterprise Server (SLES)               11 with Service Pack 1 or later             64-bit

Ubuntu/Debian
Ubuntu                                            Lucid (10.04) - Long-Term Support (LTS)     64-bit
                                                  Precise (12.04) - Long-Term Support (LTS)   64-bit
Debian                                            Squeeze (6.0.3)                             64-bit

  Note:
  • For production environments, 64-bit packages are recommended. Except as noted above, CDH4 provides only 64-bit packages.
  • Cloudera has received reports that our RPMs work well on Fedora, but we have not tested this.
  • If you are using an operating system that is not supported by Cloudera's packages, you can also download source tarballs from Downloads.
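
For scripted pre-installation checks, the support matrix above can be transcribed into a small lookup. The following is only a sketch: the names SUPPORTED_PACKAGES and has_cdh4_packages are hypothetical and not part of any Cloudera tool, version labels are matched as exact strings, and the "Service Pack 1 or later" condition for SLES is stored as a plain label rather than evaluated as a range.

```python
# Support matrix transcribed by hand from the table above.
# Hypothetical helper for illustration; not an official Cloudera API.
SUPPORTED_PACKAGES = {
    "RHEL": {"5.7": {"64-bit"}, "6.2": {"64-bit", "32-bit"}, "6.4": {"64-bit"}},
    "CentOS": {"5.7": {"64-bit"}, "6.2": {"64-bit", "32-bit"}, "6.4": {"64-bit"}},
    "Oracle Linux (UEK)": {"5.6": {"64-bit"}, "6.4": {"64-bit"}},
    # "11 SP1 or later" is a label only; service pack levels are not compared.
    "SLES": {"11 SP1 or later": {"64-bit"}},
    "Ubuntu": {"10.04": {"64-bit"}, "12.04": {"64-bit"}},
    "Debian": {"6.0.3": {"64-bit"}},
}


def has_cdh4_packages(os_name, version, arch="64-bit"):
    """Return True if the table lists CDH4 packages for this OS, version, and architecture."""
    return arch in SUPPORTED_PACKAGES.get(os_name, {}).get(version, set())


if __name__ == "__main__":
    for query in [("RHEL", "6.2", "32-bit"), ("RHEL", "6.4", "32-bit"), ("Fedora", "18", "64-bit")]:
        print(query, has_cdh4_packages(*query))
```

A system that the lookup does not list is not necessarily unusable: as the note above says, source tarballs are available for operating systems that Cloudera's packages do not cover.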

Supported Databases

Supported JDK versions

Supported Internet Protocol