Aiven for OpenSearch® metrics available via Prometheus

Monitor and optimize your Aiven for OpenSearch service with metrics available via Prometheus. These metrics help track cluster health, replication status, and overall performance.

Prerequisites

Access Prometheus metrics

  1. Open your service's Overview page in the Aiven Console.
  2. In the Connection information section, click the Prometheus tab.
  3. Copy the Service URI.
  4. Paste the Service URI into your browser's address bar.
  5. When prompted, enter your Prometheus credentials.
  6. Click Login.
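
The same endpoint can also be read from a script instead of a browser. The following is a minimal sketch, assuming that the copied Service URI is the full URL to scrape and that the endpoint accepts HTTP Basic authentication with your Prometheus credentials. The function names and the placeholder values are illustrative and not part of any Aiven tooling. Depending on your setup, you may also need to supply a CA certificate for TLS verification.

```python
import base64
import urllib.request


def build_metrics_request(service_uri: str, username: str, password: str) -> urllib.request.Request:
    """Build a GET request that carries an HTTP Basic Authorization header."""
    token = base64.b64encode(f"{username}:{password}".encode("utf-8")).decode("ascii")
    return urllib.request.Request(service_uri, headers={"Authorization": f"Basic {token}"})


def fetch_metrics(service_uri: str, username: str, password: str, timeout: float = 10.0) -> str:
    """Return the raw metrics text served by the endpoint."""
    request = build_metrics_request(service_uri, username, password)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8")


# Example call (placeholders, replace with the values from the Aiven Console):
# text = fetch_metrics("SERVICE_URI", "PROMETHEUS_USER", "PROMETHEUS_PASSWORD")
```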

Host metrics

Host metrics provide insights into system-level performance, including CPU, memory, disk, and network usage.

CPU utilization

CPU utilization metrics offer insights into CPU usage. These metrics include time spent on different processes, system load, and overall uptime.

| Metric | Description |
| --- | --- |
| cpu_usage_guest | CPU time spent running a virtual CPU for guest operating systems |
| cpu_usage_guest_nice | CPU time running low-priority virtual CPUs for guest operating systems; interrupted by higher-priority tasks and measured in hundredths of a second |
| cpu_usage_idle | Time the CPU spends doing nothing |
| cpu_usage_iowait | Time waiting for I/O to complete |
| cpu_usage_irq | Time servicing interrupts |
| cpu_usage_nice | Time running user-niced processes |
| cpu_usage_softirq | Time servicing softirqs |
| cpu_usage_steal | Time spent in other operating systems when running in a virtualized environment |
| cpu_usage_system | Time spent running system processes |
| cpu_usage_user | Time spent running user processes |
| system_load1 | System load average for the last minute |
| system_load15 | System load average for the last 15 minutes |
| system_load5 | System load average for the last 5 minutes |
| system_n_cpus | Number of CPU cores available |
| system_n_users | Number of users logged in |
| system_uptime | Time for which the system has been up and running |
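
A load average is easier to interpret when divided by the number of cores. The sketch below parses metrics text in the Prometheus text exposition format and computes system_load1 / system_n_cpus. The sample text, the host label, and the function names are assumptions made for illustration; the labels your service actually emits may differ, and this simple version only reads the first sample of each metric.

```python
def parse_samples(text: str) -> dict[str, list[tuple[str, float]]]:
    """Group samples by metric name; each sample is (raw_label_text, value)."""
    samples: dict[str, list[tuple[str, float]]] = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        if "{" in line:
            name, rest = line.split("{", 1)
            labels, _, tail = rest.rpartition("}")
        else:
            name, _, tail = line.partition(" ")
            labels = ""
        value = float(tail.split()[0])  # ignore an optional trailing timestamp
        samples.setdefault(name.strip(), []).append((labels, value))
    return samples


def load_per_cpu(samples: dict[str, list[tuple[str, float]]]) -> float:
    """Return the 1-minute load average divided by the core count."""
    load1 = samples["system_load1"][0][1]
    n_cpus = samples["system_n_cpus"][0][1]
    return load1 / n_cpus


# Illustrative input only; not real output from a service.
SAMPLE = """\
# TYPE system_load1 gauge
system_load1{host="example-node-1"} 1.5
system_n_cpus{host="example-node-1"} 4
"""

print(load_per_cpu(parse_samples(SAMPLE)))
```

A value that stays near or above 1.0 per core suggests that the node has more runnable work than CPU capacity.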

Disk space utilization

Disk space utilization metrics provide a snapshot of disk usage. These metrics include information about free and used disk space, as well as inode usage and total disk capacity.

| Metric | Description |
| --- | --- |
| disk_free | Amount of free disk space |
| disk_inodes_free | Number of free inodes |
| disk_inodes_total | Total number of inodes |
| disk_inodes_used | Number of used inodes |
| disk_total | Total disk space |
| disk_used | Amount of used disk space |
| disk_used_percent | Percentage of disk space used |
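
The list includes a ready-made percentage for disk space but not for inodes. An inode usage percentage can be derived from disk_inodes_used and disk_inodes_total, as in this small sketch. The function name is hypothetical, and returning 0.0 when no inode total is reported is a choice made for this example.

```python
def inode_used_percent(inodes_used: float, inodes_total: float) -> float:
    """Return inode usage as a percentage of the total inode count."""
    if inodes_total <= 0:
        return 0.0  # no inode total reported; treat usage as zero in this sketch
    return inodes_used / inodes_total * 100.0


print(inode_used_percent(250_000, 1_000_000))
```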

Disk input and output

Metrics such as diskio_io_time and diskio_iops_in_progress provide insights into disk I/O operations. These metrics cover read/write operations, the duration of these operations, and the number of bytes read/written.

| Metric | Description |
| --- | --- |
| diskio_io_time | Total time spent on I/O operations |
| diskio_iops_in_progress | Number of I/O operations currently in progress |
| diskio_merged_reads | Number of read operations that were merged |
| diskio_merged_writes | Number of write operations that were merged |
| diskio_read_bytes | Total bytes read from disk |
| diskio_read_time | Total time spent on read operations |
| diskio_reads | Total number of read operations |
| diskio_weighted_io_time | Weighted time spent on I/O operations, considering their duration and intensity |
| diskio_write_bytes | Total bytes written to disk |
| diskio_write_time | Total time spent on write operations |
| diskio_writes | Total number of write operations |
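
Totals such as diskio_reads and diskio_write_bytes are most useful as rates. The sketch below assumes that these metrics are cumulative counters that only increase until a restart, and derives a per-second rate from two scrapes. The Scrape class and the function name are illustrative. In Prometheus itself, the rate() query function serves the same purpose.

```python
from dataclasses import dataclass


@dataclass
class Scrape:
    timestamp: float           # scrape time in seconds
    values: dict[str, float]   # metric name -> counter value


def per_second_rate(earlier: Scrape, later: Scrape, metric: str) -> float:
    """Return the average per-second increase of a counter between two scrapes."""
    elapsed = later.timestamp - earlier.timestamp
    if elapsed <= 0:
        raise ValueError("the later scrape must have a larger timestamp")
    delta = later.values[metric] - earlier.values[metric]
    if delta < 0:
        # The counter went backwards, which this sketch treats as a reset to zero.
        delta = later.values[metric]
    return delta / elapsed


first = Scrape(timestamp=0.0, values={"diskio_reads": 1000.0})
second = Scrape(timestamp=60.0, values={"diskio_reads": 1600.0})
print(per_second_rate(first, second, "diskio_reads"))  # read operations per second
```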

Generic memory

The following metrics, including mem_active and mem_available, provide insights into your system's memory usage.

| Metric | Description |
| --- | --- |
| mem_active | Amount of actively used memory |
| mem_available | Amount of available memory |
| mem_available_percent | Percentage of available memory |
| mem_buffered | Amount of memory used for buffering I/O |
| mem_cached | Amount of memory used for caching |
| mem_commit_limit | Maximum amount of memory that can be committed |
| mem_committed_as | Total amount of committed memory |
| mem_dirty | Amount of memory waiting to be written to disk |
| mem_free | Amount of free memory |
| mem_high_free | Amount of free memory in the high memory zone |
| mem_high_total | Total amount of memory in the high memory zone |
| mem_huge_pages_free | Number of free huge pages |
| mem_huge_page_size | Size of huge pages |
| mem_huge_pages_total | Total number of huge pages |
| mem_inactive | Amount of inactive memory |
| mem_low_free | Amount of free memory in the low memory zone |
| mem_low_total | Total amount of memory in the low memory zone |
| mem_mapped | Amount of memory mapped into the process's address space |
| mem_page_tables | Amount of memory used by page tables |
| mem_shared | Amount of memory shared between processes |
| mem_slab | Amount of memory used by the kernel for data structure caches |
| mem_swap_cached | Amount of swap memory cached |
| mem_swap_free | Amount of free swap memory |
| mem_swap_total | Total amount of swap memory |
| mem_total | Total amount of memory |
| mem_used | Amount of used memory |
| mem_used_percent | Percentage of used memory |
| mem_vmalloc_chunk | Largest contiguous block of vmalloc memory available |
| mem_vmalloc_total | Total amount of vmalloc memory |
| mem_vmalloc_used | Amount of used vmalloc memory |
| mem_wired | Amount of wired memory |
| mem_write_back | Amount of memory being written back to disk |
| mem_write_back_tmp | Amount of temporary memory being written back to disk |

Network

The following metrics, including net_bytes_recv and net_packets_sent, provide insights into your system's network operations.

| Metric | Description |
| --- | --- |
| net_bytes_recv | Total bytes received on the network interfaces |
| net_bytes_sent | Total bytes sent on the network interfaces |
| net_drop_in | Incoming packets dropped |
| net_drop_out | Outgoing packets dropped |
| net_err_in | Incoming packets with errors |
| net_err_out | Outgoing packets with errors |
| net_icmp_inaddrmaskreps | Number of ICMP address mask replies received |
| net_icmp_inaddrmasks | Number of ICMP address mask requests received |
| net_icmp_incsumerrors | Number of ICMP checksum errors |
| net_icmp_indestunreachs | Number of ICMP destination unreachable messages received |
| net_icmp_inechoreps | Number of ICMP echo replies received |
| net_icmp_inechos | Number of ICMP echo requests received |
| net_icmp_inerrors | Number of ICMP messages received with errors |
| net_icmp_inmsgs | Total number of ICMP messages received |
| net_icmp_inparmprobs | Number of ICMP parameter problem messages received |
| net_icmp_inredirects | Number of ICMP redirect messages received |
| net_icmp_insrcquenchs | Number of ICMP source quench messages received |
| net_icmp_intimeexcds | Number of ICMP time exceeded messages received |
| net_icmp_intimestampreps | Number of ICMP timestamp reply messages received |
| net_icmp_intimestamps | Number of ICMP timestamp request messages received |
| net_icmpmsg_intype3 | Number of ICMP type 3 (destination unreachable) messages received |
| net_icmpmsg_intype8 | Number of ICMP type 8 (echo request) messages received |
| net_icmpmsg_outtype0 | Number of ICMP type 0 (echo reply) messages sent |
| net_icmpmsg_outtype3 | Number of ICMP type 3 (destination unreachable) messages sent |
| net_icmp_outaddrmaskreps | Number of ICMP address mask reply messages sent |
| net_icmp_outaddrmasks | Number of ICMP address mask request messages sent |
| net_icmp_outdestunreachs | Number of ICMP destination unreachable messages sent |
| net_icmp_outechoreps | Number of ICMP echo reply messages sent |
| net_icmp_outechos | Number of ICMP echo request messages sent |
| net_icmp_outerrors | Number of ICMP messages sent with errors |
| net_icmp_outmsgs | Total number of ICMP messages sent |
| net_icmp_outparmprobs | Number of ICMP parameter problem messages sent |
| net_icmp_outredirects | Number of ICMP redirect messages sent |
| net_icmp_outsrcquenchs | Number of ICMP source quench messages sent |
| net_icmp_outtimeexcds | Number of ICMP time exceeded messages sent |
| net_icmp_outtimestampreps | Number of ICMP timestamp reply messages sent |
| net_icmp_outtimestamps | Number of ICMP timestamp request messages sent |
| net_icmp_outratelimitglobal | Number of globally rate-limited ICMP messages sent |
| net_icmp_outratelimithost | Number of ICMP messages rate-limited per host |
| net_ip_defaultttl | Default time-to-live for IP packets |
| net_ip_forwarding | Indicates if IP forwarding is enabled |
| net_ip_forwdatagrams | Number of forwarded IP datagrams |
| net_ip_fragcreates | Number of IP fragments created |
| net_ip_fragfails | Number of failed IP fragmentations |
| net_ip_fragoks | Number of successful IP fragmentations |
| net_ip_inaddrerrors | Number of incoming IP packets with address errors |
| net_ip_indelivers | Number of incoming IP packets delivered to higher layers |
| net_ip_indiscards | Number of incoming IP packets discarded |
| net_ip_inhdrerrors | Number of incoming IP packets with header errors |
| net_ip_inreceives | Total number of incoming IP packets received |
| net_ip_inunknownprotos | Number of incoming IP packets with unknown protocols |
| net_ip_outdiscards | Number of outgoing IP packets discarded |
| net_ip_outnoroutes | Number of outgoing IP packets with no route available |
| net_ip_outrequests | Total number of outgoing IP packets requested to be sent |
| net_ip_outtransmits | Number of IP packets transmitted successfully |
| net_ip_reasmfails | Number of failed IP reassembly attempts |
| net_ip_reasmoks | Number of successful IP reassembly attempts |
| net_ip_reasmreqds | Number of IP fragments received needing reassembly |
| net_ip_reasmtimeout | Number of IP reassembly timeouts |
| net_packets_recv | Total number of packets received on the network interfaces |
| net_packets_sent | Total number of packets sent on the network interfaces |
| netstat_tcp_close | Number of TCP connections in the CLOSE state |
| netstat_tcp_close_wait | Number of TCP connections in the CLOSE_WAIT state |
| netstat_tcp_closing | Number of TCP connections in the CLOSING state |
| netstat_tcp_established | Number of TCP connections in the ESTABLISHED state |
| netstat_tcp_fin_wait1 | Number of TCP connections in the FIN_WAIT_1 state |
| netstat_tcp_fin_wait2 | Number of TCP connections in the FIN_WAIT_2 state |
| netstat_tcp_last_ack | Number of TCP connections in the LAST_ACK state |
| netstat_tcp_listen | Number of TCP connections in the LISTEN state |
| netstat_tcp_none | Number of TCP connections in the NONE state |
| netstat_tcp_syn_recv | Number of TCP connections in the SYN_RECV state |
| netstat_tcp_syn_sent | Number of TCP connections in the SYN_SENT state |
| netstat_tcp_time_wait | Number of TCP connections in the TIME_WAIT state |
| netstat_udp_socket | Number of UDP sockets |
| net_tcp_activeopens | Number of active TCP open connections |
| net_tcp_attemptfails | Number of failed TCP connection attempts |
| net_tcp_currestab | Number of currently established TCP connections |
| net_tcp_estabresets | Number of established TCP connections reset |
| net_tcp_incsumerrors | Number of TCP checksum errors in incoming packets |
| net_tcp_inerrs | Number of incoming TCP packets with errors |
| net_tcp_insegs | Number of TCP segments received |
| net_tcp_maxconn | Maximum number of TCP connections supported |
| net_tcp_outrsts | Number of TCP reset packets sent |
| net_tcp_outsegs | Number of TCP segments sent |
| net_tcp_passiveopens | Number of passive TCP open connections |
| net_tcp_retranssegs | Number of TCP segments retransmitted |
| net_tcp_rtoalgorithm | TCP retransmission timeout algorithm |
| net_tcp_rtomax | Maximum TCP retransmission timeout |
| net_tcp_rtomin | Minimum TCP retransmission timeout |
| net_udp_ignoredmulti | Number of UDP multicast packets ignored |
| net_udp_incsumerrors | Number of UDP checksum errors in incoming packets |
| net_udp_indatagrams | Number of UDP datagrams received |
| net_udp_inerrors | Number of incoming UDP packets with errors |
| net_udp_memerrors | Number of UDP packets dropped due to memory errors |
| net_udplite_ignoredmulti | Number of UDP-Lite multicast packets ignored |
| net_udplite_incsumerrors | Number of UDP-Lite checksum errors in incoming packets |
| net_udplite_indatagrams | Number of UDP-Lite datagrams received |
| net_udplite_inerrors | Number of incoming UDP-Lite packets with errors |
| net_udplite_memerrors | Number of UDP-Lite packets dropped due to memory errors |

Kernel

The metrics listed below, such as kernel_boot_time and kernel_context_switches, provide insights into the operations of your system's kernel.

| Metric | Description |
| --- | --- |
| kernel_boot_time | Time at which the system was last booted |
| kernel_context_switches | Number of context switches that have occurred in the kernel |
| kernel_entropy_avail | Amount of available entropy in the kernel's entropy pool |
| kernel_interrupts | Number of interrupts that have occurred |
| kernel_processes_forked | Number of processes that have been forked |

Process

Metrics such as processes_running and processes_zombies provide insights into the management of the system's processes.

| Metric | Description |
| --- | --- |
| processes_blocked | Number of processes that are blocked |
| processes_dead | Number of processes that have terminated |
| processes_idle | Number of processes that are idle |
| processes_paging | Number of processes that are paging |
| processes_running | Number of processes currently running |
| processes_sleeping | Number of processes that are sleeping |
| processes_stopped | Number of processes that are stopped |
| processes_total | Total number of processes |
| processes_total_threads | Total number of threads across all processes |
| processes_unknown | Number of processes in an unknown state |
| processes_zombies | Number of zombie processes (terminated but not reaped by parent process) |

Swap usage

Metrics such as swap_free and swap_used provide insights into the usage of the system's swap memory.

| Metric | Description |
| --- | --- |
| swap_free | Amount of free swap memory |
| swap_in | Amount of data swapped in from disk |
| swap_out | Amount of data swapped out to disk |
| swap_total | Total amount of swap memory |
| swap_used | Amount of used swap memory |
| swap_used_percent | Percentage of swap memory used |

OpenSearch-specific metrics

These metrics provide insights into the performance and health of your Aiven for OpenSearch service.

Node statistics

Track node metrics such as CPU and memory usage, disk I/O, and JVM statistics. For more information, see the node stats API.

Cluster statistics

Track cluster-level metrics, including the number of indices, shard distribution, and memory usage. For more information, see the cluster stats API.

Cluster health

Monitor cluster health with granular metrics available at the index level. Use local with level=index to view index-specific health status. For more information, see the cluster health API.

Cross-cluster replication (CCR) stats (limited availability)

  • Leader stats: Monitor replication metrics from the leader cluster, including replication lag and synchronization status. For more information, see the leader cluster stats API.
  • Follower stats: Monitor follower cluster metrics, including replication delays and error rates, to maintain data consistency. For more information, see the follower cluster stats API.