Resource Library

Cloudera offers a variety of materials on big data consolidation, storage and processing. The library includes high-level overviews as well as detailed information on Apache Hadoop and the surrounding ecosystem.

  1. HBase Read High Availability Using Timeline-Consistent Region Replicas
    • Monday, May 05 2014
    • Category: HBaseCon, Video, Presentation
    HBase has ACID semantics within a row, which makes it a perfect candidate for many real-time serving workloads. However, single-homing a region to one server means the region is unavailable for some period after a server crash. Although the mean time to recovery has improved a lot recently, for some use cases it is still preferable to serve possibly stale reads while the region is recovering. In this talk, you will get an overview of our design and implementation of region replicas in HBase, which provide timeline-consistent reads even when the primary region is unavailable or busy.
  2. From MongoDB to HBase in Six Easy Months
    • Monday, May 05 2014
    • Category: Presentation Slides, HBaseCon
    Pushing well past MongoDB's limits (2 TB of data every week) is an interesting exercise in operational frustration. It also severely hampers design flexibility for new use cases. This talk covers the architectural journey from MongoDB/Redis to HBase at Optimizely -- including the gains made in performance, design flexibility, and speed of implementation. It also covers the operational setup needed to monitor and maintain the system, as well as lessons learned from the migration process itself.
  3. Real-time HBase: Lessons from the Cloud
    • Monday, May 05 2014
    • Category: Presentation Slides, HBaseCon
    Running HBase in real time in the cloud provides an interesting and ever-changing set of challenges -- instance types are not ideal, neighbors can degrade your performance, and instances can randomly die in unanticipated ways. This talk will cover what HubSpot has learned about running in production on Amazon EC2, how to handle DR and redundancy, and the tooling the team has found to be the most helpful.
  4. Tales from the Cloudera Field
    • Monday, May 05 2014
    • Category: Presentation Slides, HBaseCon
    Having supported HBase 0.90.x, 0.92, 0.94, and 0.96 installations on clusters ranging from tens to hundreds of nodes, Cloudera has seen it all. Having automated the upgrade paths between the different Apache releases, we have developed a smooth path that can help the community with upcoming upgrades. In addition to automation best practices, in this talk you'll also learn proactive configuration tweaks and operational best practices to keep your HBase cluster always up and running. We'll also walk through how to contain an application bug let loose in production, how to minimize the impact of faulty hardware on HBase, and how inefficient schema design directly correlates with HBase performance.
  5. Bulk Loading in the Wild: Ingesting the World's Energy Data
    • Monday, May 05 2014
    • Category: CDH, Presentation, Video
    HBase is designed to store your big data and provide low-latency random access to that data. One of its most compelling features is Bulk Loading, which enables the generation of HFiles that can then be passed to the RegionServers. Opower's energy insights platform uses it to ingest the hundreds of millions of meter reads it receives daily from its partner utility companies. This presentation will walk you through the HBase Bulk Loading process and Opower's adoption of it as an important piece of its HBase ecosystem.
  6. HBase Read High Availability Using Timeline-Consistent Region Replicas
    • Monday, May 05 2014
    • Category: HBaseCon, Presentation Slides
    HBase has ACID semantics within a row, which makes it a perfect candidate for many real-time serving workloads. However, single-homing a region to one server means the region is unavailable for some period after a server crash. Although the mean time to recovery has improved a lot recently, for some use cases it is still preferable to serve possibly stale reads while the region is recovering. In this talk, you will get an overview of our design and implementation of region replicas in HBase, which provide timeline-consistent reads even when the primary region is unavailable or busy.
  7. New Security Features in Apache HBase 0.98: An Operator's Guide
    • Monday, May 05 2014
    • Category: HBaseCon, Presentation Slides
    HBase 0.98 introduces several new security features: visibility labels, cell ACLs, transparent encryption, and coprocessor framework changes. This talk will cover the new capabilities available in HBase 0.98+, the threat models and use cases they cover, how these features stack up against other data stores in the Apache big data ecosystem, and how operators and security architects can take advantage of them.
  8. Govern This! Data Discovery and the application of data governance with new stack technologies
    • Thursday, May 01 2014
    • Category: Business process optimization, Analytics & Business Intelligence, Presentation Slides, Presentation
    Tableau joins us to share and demo how to apply governance to the discovery layer in an enterprise data hub while still meeting the speed and agility requirements of the business user. We also provide a Cloudera Navigator demo along with the Tableau demo.
  9. Govern This! Data Discovery and the application of data governance with new stack technologies
    • Thursday, May 01 2014
    • Category: Analytics & Business Intelligence, Business process optimization, Video, Recorded Webinars, CDH, Cloudera Enterprise
    Tableau joins us to share and demo how to apply governance to the discovery layer in an enterprise data hub while still meeting the speed and agility requirements of the business user. We also provide a Cloudera Navigator demo along with the Tableau demo.
  10. Tableau and Cloudera Demo
    • Thursday, May 01 2014
    • Category: Data hub, Business process optimization, Software Vendor (ISV), Analytics & Business Intelligence, Video, Product Demos
    This is an excerpt from the Tableau and Cloudera webinar on May 1st. Tableau joins us to share and demo how to apply governance to the discovery layer in an enterprise data hub while still meeting the speed and agility requirements of the business user.