This is the documentation for Cloudera Manager 5.1.0.
Documentation for other versions is available at Cloudera Documentation.

Monitoring Data Storage

The Service Monitor and Host Monitor roles in the Cloudera Management Service store time series data, health data, Impala query metadata, and YARN application metadata. This section describes the process for migrating monitoring data and how to configure disk and memory properties to accommodate the requirements of these roles.

Monitoring Data Migration During Cloudera Manager Upgrade

The Cloudera Manager upgrade process automatically migrates data from existing databases to the local datastore. The upgrade process occurs only once for Host Monitor and Service Monitor, though it can be spread across multiple runs of Host Monitor and Service Monitor if they are restarted before it completes. Resource usage (CPU, memory, and disk) by Host Monitor and Service Monitor will be higher than normal during the process.

You can monitor the progress of migrating data from a Cloudera Manager 4 database to the Cloudera Manager 5 datastore in the Host Monitor and Service Monitor logs. Log statements starting with LDBTimeSeriesDataMigrationTool identify the upgrade process. The key statements are Starting DB migration, written when migration begins, and Migration progress: {} total, {} migrated, {} errors, written as progress is reported. Progress is reported in partition counts, so it starts at something like 3 total, 0 migrated, 0 errors and ends at 3 total, 3 migrated, 0 errors.

After migration completes, the migrated data is summarized in statements such as Running the LDBTimeSeriesRollupManager at {}, forMigratedData={}, which include table names. At this point, the Host Monitor and Service Monitor no longer use the external database, and the database configuration (connection information, username, password, and so on) can be removed.
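For example, you can filter the migration statements out of a monitor log with grep. The sketch below runs against a sample excerpt, since log file locations vary by deployment; point the grep at your actual Service Monitor or Host Monitor log file.

```shell
# Filter migration progress statements from a monitor log.
# The sample file below stands in for a real log; substitute the path
# to your Service Monitor or Host Monitor log file.
SAMPLE_LOG=$(mktemp)
cat > "$SAMPLE_LOG" <<'EOF'
INFO LDBTimeSeriesDataMigrationTool: Starting DB migration
INFO LDBTimeSeriesDataMigrationTool: Migration progress: 3 total, 0 migrated, 0 errors
INFO SomeOtherComponent: unrelated log line
INFO LDBTimeSeriesDataMigrationTool: Migration progress: 3 total, 3 migrated, 0 errors
EOF
grep LDBTimeSeriesDataMigrationTool "$SAMPLE_LOG"
```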

Service Monitor Storage Configuration

The Service Monitor stores time series data and health data, Impala query metadata, and YARN application metadata.

By default, the data is stored in /var/lib/cloudera-service-monitor/ on the Service Monitor host. This can be changed by modifying the Service Monitor Storage Directory configuration (firehose.storage.base.directory). To change this configuration on an active system, see Moving Monitoring Data on an Active Cluster.

You can also control how much disk space to reserve for the different classes of data the Service Monitor stores by changing the following configuration options:
  • Time-series metrics and health data: Time-Series Storage (firehose_time_series_storage_bytes - 10 GB default)
  • Impala query metadata: Impala Storage (firehose_impala_storage_bytes - 1 GB default)
  • YARN application metadata: YARN Storage (firehose_yarn_storage_bytes - 1 GB default)
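These settings take raw byte counts rather than human-readable sizes. A quick shell sketch for computing the values (the 10 GB and 1 GB figures are the defaults listed above):

```shell
# Compute byte values for the Service Monitor storage settings.
# These match the documented defaults: 10 GB for time-series data,
# 1 GB each for Impala and YARN metadata.
TIME_SERIES_BYTES=$((10 * 1024 * 1024 * 1024))
IMPALA_BYTES=$((1 * 1024 * 1024 * 1024))
YARN_BYTES=$((1 * 1024 * 1024 * 1024))
echo "firehose_time_series_storage_bytes=$TIME_SERIES_BYTES"
echo "firehose_impala_storage_bytes=$IMPALA_BYTES"
echo "firehose_yarn_storage_bytes=$YARN_BYTES"
```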

See Data Granularity and Time-Series Metric Data for an explanation of how metric data is stored within Cloudera Manager and for the impact the storage limits have on data retention.

The default values are fairly small, so you should examine disk usage after several days of activity to determine how much space is needed. Do this by visiting the Disk Usage tab on the Service Monitor page. This page shows the current disk space consumed and its rate of growth, both broken down by the type of data stored. For example, it allows you to compare the space consumed by raw metric data versus daily summaries of that data.
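The Disk Usage tab is the authoritative view, but you can also get a rough per-category breakdown from the shell with du. This is a sketch, not a supported tool; the path shown is the default storage directory, so substitute your configured directory if you have moved it.

```shell
# Rough per-subdirectory disk usage of the Service Monitor storage
# directory. Default path shown; override SMON_DIR for a custom location.
SMON_DIR=${SMON_DIR:-/var/lib/cloudera-service-monitor}
if [ -d "$SMON_DIR" ]; then
  du -sh "$SMON_DIR"/* | sort -h
else
  echo "no Service Monitor storage directory at $SMON_DIR"
fi
```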

Host Monitor Storage Configuration

The Host Monitor stores time series data and health data.

By default, the data is stored in /var/lib/cloudera-host-monitor/ on the Host Monitor's host. This can be changed by modifying the Host Monitor Storage Directory configuration (firehose.storage.base.directory). To change this configuration on an active system, see Moving Monitoring Data on an Active Cluster.

You can control how much disk space to reserve for Host Monitor data by changing the following configuration option:
  • Time-series metrics and health data: Time Series Storage (firehose_time_series_storage_bytes - 10 GB default)

See the next section for an explanation of how metric data is stored within Cloudera Manager and for the impact these limits have on data retention.

The default value is fairly small, so you should examine disk usage after several days of activity to determine how much space is needed. You can do this by visiting the Disk Usage tab on the Host Monitor page. This page shows the current disk space consumed and its rate of growth, both broken down by the type of data stored. For example, it allows you to compare the space consumed by raw metric data versus daily summaries of that data.

Data Granularity and Time-Series Metric Data

The Service Monitor and Host Monitor store time-series metric data in a variety of ways. When the data is first received, it is written as-is to the metric store. Over time, the raw data is summarized and stored at various granularities. For example, after ten minutes a single ten-minute summary point is written containing the average of the metric over that period as well as the minimum, the maximum, the standard deviation, and a variety of other statistics. This process is repeated to produce hourly, six-hourly, daily, and weekly summaries. This summarization applies only to metric data; Impala query monitoring and YARN application monitoring have no equivalent system. For those systems, when the storage limit is reached, the oldest stored records are deleted.

The Service Monitor and Host Monitor internally manage the amount of their overall storage space to dedicate to each data granularity level. When the limit for a particular level is reached, the oldest data points at that level are deleted. Note that metric data for that time period remains available at the lower granularity levels. That is, when an hourly point for a particular time is deleted to free up space, a daily point still exists covering that hour. Since each of these data granularities consumes significantly less storage than the previous summary level, lower granularity levels can be retained for longer periods of time. In particular, given a reasonable amount of storage, weekly points can normally be retained indefinitely.

Some features, notably the detailed display of health results, depend on the presence of raw data. Health history is maintained in the event store and is governed by that store's retention policies.

Moving Monitoring Data on an Active Cluster

There are two ways to change where monitoring data is stored on a cluster: basic and advanced.

Basic: Changing the Configured Directory

  1. Stop the Service or Host Monitor.
  2. If you want to keep your old monitoring data, copy the current directory to the new directory.
  3. Update the Storage Directory configuration option (firehose.storage.base.directory) on the corresponding role's configuration page.
  4. Start the Service or Host Monitor.
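Step 2 is an ordinary recursive copy. A sketch with stand-in directories (in practice the source would be /var/lib/cloudera-service-monitor or /var/lib/cloudera-host-monitor, and the stop/start steps happen in Cloudera Manager, not the shell):

```shell
# Demonstrate step 2 with stand-in directories; substitute your real
# old and new storage paths.
WORK=$(mktemp -d)
OLD_DIR="$WORK/old_monitor_storage"
NEW_DIR="$WORK/new_monitor_storage"
mkdir -p "$OLD_DIR/ts"
echo data > "$OLD_DIR/ts/partition0"
# cp -a preserves ownership, permissions, and timestamps, so the
# monitor role can read the copied data unchanged.
cp -a "$OLD_DIR" "$NEW_DIR"
ls -R "$NEW_DIR"
```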

Advanced: High Performance

For the best performance, and especially for a large cluster, we recommend putting the Host and Service Monitor storage directories on their own dedicated spindles. In most cases that will provide sufficient performance, but if you need additional performance you can divide your data even further. Though this cannot be configured directly with Cloudera Manager, it can be done using symbolic links.

For example, if all your Service Monitor data is located in /data/1/service_monitor and you want to separate your Impala data from your time-series data, you could do the following:

  1. Stop the Service Monitor.
  2. Move the original Impala data in /data/1/service_monitor/impala to the new directory, for example /data/2/impala_data.
  3. Create a symbolic link from /data/1/service_monitor/impala to /data/2/impala_data with the following command:
    ln -s /data/2/impala_data /data/1/service_monitor/impala
  4. Start the Service Monitor.
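The whole move-and-link sequence can be sketched as follows, using stand-in directories so it is safe to try anywhere (use your real /data/1 and /data/2 paths in production, with the Service Monitor stopped):

```shell
# Demonstrate the move-and-symlink steps with stand-in directories.
WORK=$(mktemp -d)
mkdir -p "$WORK/data/1/service_monitor/impala" "$WORK/data/2"
echo query-metadata > "$WORK/data/1/service_monitor/impala/part0"
# Step 2: move the Impala data to the new disk.
mv "$WORK/data/1/service_monitor/impala" "$WORK/data/2/impala_data"
# Step 3: link the old path to the new location.
ln -s "$WORK/data/2/impala_data" "$WORK/data/1/service_monitor/impala"
# The Service Monitor still reads its data through the original path.
cat "$WORK/data/1/service_monitor/impala/part0"
```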

Host Monitor and Service Monitor Memory Configuration

There are two memory-related configuration options: Java heap size and non-Java memory size. The memory required or recommended for both of these configuration options depends on the size of the cluster. In addition to the memory configured, the Host and Service Monitor will also take advantage of the Linux page cache. Having memory free for use as page cache on the Service and Host Monitor hosts will improve performance.
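To see how much memory remains for the page cache on a monitor host, you can read /proc/meminfo. This is Linux-specific, and the MemAvailable field assumes kernel 3.14 or later:

```shell
# Report total, available, and cached memory on this host in MB.
# Linux-only: parses /proc/meminfo.
awk '/^MemTotal:|^MemAvailable:|^Cached:/ {
  printf "%-13s %8.1f MB\n", $1, $2 / 1024
}' /proc/meminfo
```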

Table 1. Small Clusters: No More Than 10 Hosts

                     Required   Recommended
  Java Heap Size     256 MB     512 MB
  Non-Java Memory    768 MB     1.5 GB

Table 2. Medium Clusters: Between 11 and 100 Hosts

                     Required   Recommended
  Java Heap Size     1 GB       2 GB
  Non-Java Memory    2 GB       4 GB

Table 3. Large Clusters: More Than 100 Hosts

                     Required   Recommended
  Java Heap Size     2 GB       4 GB
  Non-Java Memory    6 GB       12 GB