Temporal Cloud available metrics reference

Most Temporal Cloud metrics are suffixed with _count. This indicates that they behave largely like a Prometheus counter. You'll want to use a function like rate or increase to calculate a per-second rate of increase, or an extrapolated total increase over a time period.

rate(temporal_cloud_v0_frontend_service_request_count[5m])
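
When a total over a window is more useful than a per-second rate, increase can be applied to the same counter. The following is a minimal sketch; the 1h window is an arbitrary choice for illustration, and the result is extrapolated, so it may not be a whole number.

# the approximate number of requests received over the last hour
increase(temporal_cloud_v0_frontend_service_request_count[1h])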

temporal_cloud_v0_service_latency has _bucket, _count, and _sum metrics because it is a Prometheus histogram. You can use the _count and _sum metrics to calculate an average latency over a time period, or use the _bucket metric to calculate an approximate histogram quantile.

# the average latency observation over the last 5 minutes
rate(temporal_cloud_v0_service_latency_sum[5m]) / rate(temporal_cloud_v0_service_latency_count[5m])

# the approximate 99th percentile latency over the last 5 minutes, broken down by operation
histogram_quantile(0.99, sum(rate(temporal_cloud_v0_service_latency_bucket[5m])) by (le, operation))

Metric labels

Metrics for all Namespaces in your account are available from the metrics endpoint. Use the following labels to filter metrics:

Label: Explanation
le: Less than or equal to (le) is used in histograms to categorize observations into buckets based on their value being less than or equal to a predefined upper limit.
operation: This includes operations such as:
  • SignalWorkflowExecution
  • StartBatchOperation
  • StartWorkflowExecution
  • TaskQueueMgr
  • TerminateWorkflowExecution
  • UpdateNamespace
  • UpdateSchedule
This list is non-exhaustive.
resource_exhausted_cause: Cause for resource exhaustion.
task_type: Activity or Workflow.
temporal_account: Temporal Account.
temporal_namespace: Temporal Namespace.
temporal_service_type: Frontend, Matching, History, or Worker.
is_background: This label on temporal_cloud_v0_total_action_count indicates when actions are produced by a Temporal background job, for example, an hourly Workflow Export.
namespace_mode: This label on temporal_cloud_v0_total_action_count indicates whether actions are produced by an active or a passive Namespace. For a regular Namespace, namespace_mode will always be “active”.

The following is an example of how you can filter metrics using labels:

temporal_cloud_v0_poll_success_count{__rollup__="true", operation="TaskQueueMgr", task_type="Activity", temporal_account="12345", temporal_namespace="your_namespace.12345", temporal_service_type="matching"}
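
Label matchers can also be combined with aggregation. As a rough sketch, the query below uses a regular-expression matcher on operation and groups the result by temporal_namespace; the two operation names come from the label table above, and the 5m window is an arbitrary choice.

# per-second rate of start and signal requests, broken down by Namespace
sum by (temporal_namespace) (rate(temporal_cloud_v0_frontend_service_request_count{operation=~"StartWorkflowExecution|SignalWorkflowExecution"}[5m]))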

Available metrics

What metrics are emitted from Temporal Cloud?

The following metrics are emitted for your various Namespaces.

temporal_cloud_v0_frontend_service_error_count

A count of gRPC errors returned, aggregated by operation.

temporal_cloud_v0_frontend_service_request_count

A count of gRPC requests received, aggregated by operation.
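
Dividing the error counter by the request counter gives an approximate error ratio per operation. This is a sketch rather than a recommended alert: it assumes both metrics expose the operation label with matching values, and the 5m window is arbitrary. Operations with no error samples in the window produce no result for that operation.

# the approximate fraction of requests that returned an error, broken down by operation
sum by (operation) (rate(temporal_cloud_v0_frontend_service_error_count[5m])) / sum by (operation) (rate(temporal_cloud_v0_frontend_service_request_count[5m]))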

temporal_cloud_v0_poll_success_count

Tasks that are successfully matched to a poller.

temporal_cloud_v0_poll_success_sync_count

Tasks that are successfully sync matched to a poller.
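
A ratio of the two poll success counters gives the share of matched tasks that were sync matched. The sketch below assumes both metrics carry the temporal_namespace and task_type labels shown in the filter example earlier on this page; the 5m window is arbitrary.

# the approximate sync match ratio over the last 5 minutes, by Namespace and task type
sum by (temporal_namespace, task_type) (rate(temporal_cloud_v0_poll_success_sync_count[5m])) / sum by (temporal_namespace, task_type) (rate(temporal_cloud_v0_poll_success_count[5m]))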

temporal_cloud_v0_poll_timeout_count

Polls that timed out because no tasks were available for the poller.

temporal_cloud_v0_replication_lag_bucket

A histogram of replication lag during a specific time interval for a multi-region Namespace.

temporal_cloud_v0_replication_lag_count

The count of replication lag observations during a specific time interval for a multi-region Namespace.

temporal_cloud_v0_replication_lag_sum

The sum of replication lag during a specific time interval for a multi-region Namespace.
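
The three replication lag metrics can be queried in the same way as the service latency histogram. A minimal sketch, assuming the histogram carries the temporal_namespace label; the quantile and the 5m window are illustrative choices.

# the average replication lag observation over the last 5 minutes
rate(temporal_cloud_v0_replication_lag_sum[5m]) / rate(temporal_cloud_v0_replication_lag_count[5m])

# the approximate 99th percentile replication lag over the last 5 minutes, broken down by Namespace
histogram_quantile(0.99, sum(rate(temporal_cloud_v0_replication_lag_bucket[5m])) by (le, temporal_namespace))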

temporal_cloud_v0_resource_exhausted_error_count

gRPC requests received that were rate-limited by Temporal Cloud, aggregated by cause.
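
Because this metric is aggregated by cause, grouping on resource_exhausted_cause shows which limit is being hit. The sketch below assumes the temporal_namespace label is also present on this metric; the 5m window is arbitrary.

# per-second rate of rate-limited requests, broken down by Namespace and cause
sum by (temporal_namespace, resource_exhausted_cause) (rate(temporal_cloud_v0_resource_exhausted_error_count[5m]))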

temporal_cloud_v0_schedule_action_success_count

Successful execution of a Scheduled Workflow.

temporal_cloud_v0_schedule_buffer_overruns_count

Incremented when the average schedule run length is greater than the average schedule interval while a buffer_all overlap policy is configured.

temporal_cloud_v0_schedule_missed_catchup_window_count

Skipped Scheduled executions when Workflows were delayed longer than the catchup window.

temporal_cloud_v0_schedule_rate_limited_count

Workflows that were delayed due to exceeding a rate limit.

temporal_cloud_v0_service_latency_bucket

Latency for SignalWithStartWorkflowExecution, SignalWorkflowExecution, and StartWorkflowExecution operations.

temporal_cloud_v0_service_latency_count

Count of latency observations for SignalWithStartWorkflowExecution, SignalWorkflowExecution, and StartWorkflowExecution operations.

temporal_cloud_v0_service_latency_sum

Sum of latency observation time for SignalWithStartWorkflowExecution, SignalWorkflowExecution, and StartWorkflowExecution operations.

temporal_cloud_v0_state_transition_count

Count of state transitions for each Namespace.

temporal_cloud_v0_total_action_count

Approximate count of Temporal Cloud Actions.
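
The is_background and namespace_mode labels described in the label table can be used to break this count down. The sketch below groups by those labels rather than filtering on them, since the exact label values are not listed on this page; the 5m window is arbitrary.

# approximate Actions per second, broken down by Namespace, background status, and Namespace mode
sum by (temporal_namespace, is_background, namespace_mode) (rate(temporal_cloud_v0_total_action_count[5m]))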

temporal_cloud_v0_workflow_cancel_count

Workflows canceled before completing execution.

temporal_cloud_v0_workflow_continued_as_new_count

Workflow Executions that were Continued-As-New from a past execution.

temporal_cloud_v0_workflow_failed_count

Workflows that failed before completion.

temporal_cloud_v0_workflow_success_count

Workflows that successfully completed.

temporal_cloud_v0_workflow_terminate_count

Workflows terminated before completing execution.

temporal_cloud_v0_workflow_timeout_count

Workflows that timed out before completing execution.
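
The Workflow outcome counters can be combined into ratios. The following is a rough sketch that compares failures against the sum of failures and successes only; it assumes each counter carries the temporal_namespace label, and it returns no result for a Namespace where either series has no samples in the window. The 5m window is arbitrary.

# the approximate fraction of failed Workflows among failed and successful ones, by Namespace
sum by (temporal_namespace) (rate(temporal_cloud_v0_workflow_failed_count[5m])) / (sum by (temporal_namespace) (rate(temporal_cloud_v0_workflow_failed_count[5m])) + sum by (temporal_namespace) (rate(temporal_cloud_v0_workflow_success_count[5m])))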