This is the documentation for Cloudera Manager 4.8.4.
Documentation for other versions is available at Cloudera Documentation.

tsquery Language

The tsquery language is the language used to specify statements for retrieving time-series data from the Cloudera Manager time-series data store. A tsquery statement has the following structure:
SELECT [metric expression] WHERE [predicate]

The statement select * is invalid. For any other query, a maximum of 250 time series will be returned. This value can be configured in the Other Cloudera Manager Settings.

Each tsquery statement returns one or more time series. Example 1 returns one time series for each DataNode.

Examples

  1. Retrieve time series for all metrics for all DataNodes.
    select * where roleType=DATANODE
  2. Retrieve the total_cpu_user metric time series for all DataNodes.
    select total_cpu_user where roleType=DATANODE
  3. Retrieve the jvm_heap_used_mb metric time series divided by 1024 and the jvm_heap_committed metric time series divided by 1024 for all roles running on the host named "my host".
    select jvm_heap_used_mb/1024, jvm_heap_committed_mb/1024 where category=ROLE and hostname="my host"
  4. Retrieve the jvm_total_threads and jvm_blocked_threads metrics time series for all entities that have these two metrics.
    select jvm_total_threads,jvm_blocked_threads 

tsquery Syntax

Multiple tsquery statements can be concatenated with semi-colons. Example 3 can be written as:

select jvm_heap_used_mb/1024 where category=ROLE and hostname=myhost; select jvm_heap_commited_mb/1024 where category=ROLE and hostname=myhost

tsquery tokens are case insensitive. For example, Select, select, and SeLeCt are all equivalent to SELECT.

tsquery attribute names and most attribute values are also case insensitive. For example, displayName and serviceType are two attributes whose values are case sensitive.

A metric expression can be replaced with an * (asterisk) as shown in example 1. In that case, all metrics that are applicable for selected entities, such as DATANODE in example 1, are returned.

The predicate can be omitted as shown in example 4. In such cases, time series for all entities for which the metrics are appropriate are returned. For this query you would see the jvm_new_threads metric for NameNodes, DataNodes, TaskTrackers, and so on.

Metric Expressions

A metric expression is a comma-delimited list of one or more metric expression statements. A metric expression statement is the name of a metric collected by Cloudera Manager, a metric expression function, or a scalar value joined by one or more metric expression operators. For example:

jvm_heap_used_mb, cpu_user, 5

See the FAQ to learn how to discover metrics collected by Cloudera Manager and use cases for using scalar values in metric expressions.

For a list of all the metrics supported by Cloudera Manager, see Cloudera Manager Metrics

Metric Expression Operators

Metric expressions support the following binary operators:

  • + (addition)
  • - (subtraction)
  • * (multiplication)
  • / (division)

The following are examples of legal metric expression statements:

total_cpu_user + total_cpu_system
1000 * jvm_gc_time_ms / jvm_gc_count

Metric Expression Functions

Metric expressions support the following functions.
  • dt(metric) - Derivative with negative values. The change of the underlying metric expression, per second. For example:
    dt(jvm_gc_count)
  • dt0(metric)- Derivative where negative values are skipped (useful for dealing with counter resets). The change of the underlying metric expression, per second. For example:
    dt0(jvm_gc_time_ms) / 10
  • getHostFact(string factName, double defaultValue) - Retrieves a fact about a host. For example:
    select dt(total_cpu_user) / getHostFact(numCores, 2) where category=HOST

    This example divides the results of dt(total_cpu_user) by the current number of cores for each host. If the number of cores cannot be determined, the default "2" will be used.

    getHostFact currently supports one fact, numCores.

    The chart below is an example of a query using getHostFact. The query computes the percentage of total user and system CPU usage each role is using on the host. It first computes the "CPU seconds per second" for the number of cores used by taking the derivative of the total user and system CPU times. It normalizes the result to the number of cores on the host by using the getHostFact function and multiplies the result by 100 to get the percentage.

Example tsquery Statements with Compound Metric Expressions

  1. Retrieve all time series for all metrics for DataNodes or TaskTrackers.
    select * where roleType=DATANODE or roleType=TASKTRACKER
  2. Retrieve all time series for all metrics for DataNodes or TaskTrackers that are running on host named "myhost".
    select * where (roleType=DATANODE or roleType=TASKTRACKER) and hostname=myhost
  3. Retrieve the total_cpu_user metric time series for all hosts with names that match the regular expression "host[0-3]+.*"
    select total_cpu_user where category=role and hostname rlike "host[0-3]+.*" 

Predicates

A predicate is a time series attribute operator value pair. You use the AND and OR operators to compose compound predicates.

Operators and Values

The predicate operators are = and rlike.

The following constraints hold on values:
  • The rlike operator accepts only quoted values. For example:
    hostname rlike "host[0-3]+.*"
  • value can be any regular expression as specified in regular expression constructs in the Java Pattern class documentation.

Time Series Attributes

Attribute Description

active

Indicates whether the entities to be retrieved must be active. A nonactive entity is an entity that has been removed or deleted from the cluster. The default is to retrieve only active entities (that is, active=true). To access time series for deleted or removed entities specify active=false in the query. For example:
SELECT fd_open WHERE roleType=DATANODE and active=false

category

The category, which controls the type of the entities returned by the query. Some metrics are collected for more than one type of entity. For example, total_cpu_user is collected for entities of type HOST and ROLE. To retrieve the data for all hosts in your deployment use:

select total_cpu_user where category=HOST

The ROLE category applies to all role types (see roleType attribute). The SERVICE category applies to all service types (see serviceType attribute). For example, to retrieve the committed heap for all roles on host1 use:

select jvm_committed_heap_mb where category=ROLE and hostname="host1"

One of: ROLE, DIRECTORY, HOST, FILESYSTEM, SERVICE, NETWORK_INTERFACE, DISK, CLUSTER, FLUME_SOURCE, FLUME_CHANNEL, FLUME_SINK.

clusterId

The cluster ID. To specify a cluster by its name, use the predicate: category=CLUSTER and displayName="display name". For example:
where category = cluster and displayName = "Cluster 1 - CDH4"

componentName

The Flume component name. For example, channel1, sink1.

device

The disk device name. For example, sda.

displayName

The display name of the entity. Use quoted strings to specify localized names or names that include spaces.

hostId

The canonical identifier for a host in Cloudera Manager. It must be unique and may not change over time. Often the hostname is used as the hostId.

hostname

The host name.

iface

The network interface name. For example, eth0.

mountpoint

The mount point name, for example, /var, /mnt/homes.

partition

The partition name. For example, sda1.

roleType

The role type. One of: NAMENODE, DATANODE, SECONDARYNAMENODE, JOURNALNODE, MASTER, REGIONSERVER, JOBTRACKER, TASKTRACKER, ACTIVITYMONITOR, SERVICEMONITOR, HOSTMONITOR, EVENTSERVER, ALERTPUBLISHER, REPORTSMANAGER, SERVER, AGENT, IMPALAD, STATESTORE.

serviceType

The service type. One of: HDFS, HBASE, MAPREDUCE, MGMT, ZOOKEEPER, FLUME, IMPALA.

rackId

The Rack ID. For example, /default.

serviceName

The service ID. To specify a service by its name use the predicate: category=SERVICE and displayName="display name".

FAQ

How do I compare all disk IO for all the DataNodes that belong to a specific HDFS service?
select bytes_read, bytes_written where roleType=DATANODE and serviceName=hdfs1

Replace hdfs1 with the appropriate service name. You can then facet by metric and compare all DataNode byte_reads and byte_writes metrics at once. See Grouping (Faceting) Time Series for more details about faceting.

When would I use a derivative function?

Some metrics represent a counter, for example, bytes_read. For such metrics it is sometimes useful to see the rate of change instead of the absolute counter value. Use dt or dt0 derivative functions.

When should I use the dt0 function?

Some metrics, like bytes_read represent a counter that always grows. For such metrics a negative rate means that the counter has been reset (for example, process restarted, host restarted, and so on). Use dt0 for these metrics.

How do I display a threshold on a chart?

Assume that you want to retrieve the latencies for all disks on your hosts, compare them, and show a threshold on the chart to easily detect outliers. Use the following to retrieve the metrics and the threshold:

select service_time, await_time, await_read_time, await_write_time, 50 where category=disk

You can then facet the results to be all in one chart. The scalar threshold "50" will also be rendered on the chart. See Grouping (Faceting) Time Series for more details about faceting.

I get the warning "The query hit the maximum results limit". How do I work around the limit?

There is a limit on the number of results that can be returned by a query. When a query results in more time-series streams than the limit a warning for "partial results" is issued. To circumvent the problem try to reduce the number of metrics you are trying to retrieve. You can also use the rlike operator to limit the query to a subset of entities. For example, instead of

select service_time, await_time, await_read_time, await_write_time, 50 where category=DISK

you can use

select service_time, await_time, await_read_time, await_write_time, 50 where category=disk and hostname rlike "host1[0-9]?.cloudera.com"

The latter query will retrieve the disks for only ten hosts.

How do I discover which metrics are available for which entities?

One way to discover which metrics are collected by Cloudera Manager is to click the List of Metrics link on the Charts page to the right of the Search button. Another way is to retrieve all metrics for the type of entity you are interested in:

select * where roleType=DATANODE and hostname=host1