Developer & Admin Resources

Index of Open-Source Projects Inside, or Related to, CDH

  • Apache Hadoop Core
    The foundation for Hadoop: MapReduce, HDFS, and Common
  • Apache Avro
    Data serialization system
  • Apache Crunch (not in CDH)
    Java library for writing, testing, and running pipelines of MapReduce jobs
  • DataFu (CDH4.1 and later)
    Library of Pig UDFs for data mining and statistics
  • Apache Flume
    Service for collecting, aggregating, and moving log and event data
  • Apache HBase
    Scalable record and table store with real-time read/write access
  • Apache Hive
    SQL-like query language (HiveQL) and metadata repository