<scrape_config>
<http_config>
<tls_config>
<oauth2>
<azure_sd_config>
<consul_sd_config>
<digitalocean_sd_config>
<docker_sd_config>
<dockerswarm_sd_config>
<dns_sd_config>
<ec2_sd_config>
<openstack_sd_config>
<ovhcloud_sd_config>
<puppetdb_sd_config>
<file_sd_config>
<gce_sd_config>
<hetzner_sd_config>
<http_sd_config>
<ionos_sd_config>
<kubernetes_sd_config>
<kuma_sd_config>
<lightsail_sd_config>
<linode_sd_config>
<marathon_sd_config>
<nerve_sd_config>
<nomad_sd_config>
<serverset_sd_config>
<triton_sd_config>
<eureka_sd_config>
<scaleway_sd_config>
<uyuni_sd_config>
<vultr_sd_config>
<static_config>
<relabel_config>
<metric_relabel_configs>
<alert_relabel_configs>
<alertmanager_config>
<remote_write>
<remote_read>
<tsdb>
<exemplars>
<tracing_config>
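Prometheus is configured via command-line flags and a configuration file. The command-line flags configure immutable system parameters (such as storage locations and the amount of data to keep on disk and in memory), while the configuration file defines everything related to scraping jobs and their instances, as well as which rule files to load.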
To view all available command-line flags, run ./prometheus -h.
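Prometheus can reload its configuration at runtime. If the new configuration is not well-formed, the changes will not be applied. A configuration reload is triggered by sending a SIGHUP signal to the Prometheus process or by sending an HTTP POST request to the /-/reload endpoint (when the --web.enable-lifecycle flag is enabled). This also reloads any configured rule files.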
To specify which configuration file to load, use the --config.file flag.
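The file is written in YAML format, defined by the scheme described below. Brackets indicate that a parameter is optional. For non-list parameters, the value is set to the specified default.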
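Generic placeholders are defined as follows:
<boolean>: a boolean that can take the values true or false
<duration>: a duration matching the regular expression ((([0-9]+)y)?(([0-9]+)w)?(([0-9]+)d)?(([0-9]+)h)?(([0-9]+)m)?(([0-9]+)s)?(([0-9]+)ms)?|0), e.g. 1d, 1h30m, 5m, 10s
<filename>: a valid path in the current working directory
<float>: a floating-point number
<host>: a valid string consisting of a hostname or IP followed by an optional port number
<int>: an integer value
<labelname>: a string matching the regular expression [a-zA-Z_][a-zA-Z0-9_]*. Any other unsupported character in the source label should be converted to an underscore. For example, the label app.kubernetes.io/name should be written as app_kubernetes_io_name.
<labelvalue>: a string of Unicode characters
<path>: a valid URL path
<scheme>: a string that can take the values http or https
<secret>: a regular string that is a secret, such as a password
<string>: a regular string
<size>: a size in bytes, e.g. 512MB. A unit is required. Supported units: B, KB, MB, GB, TB, PB, EB.
<tmpl_string>: a string which is template-expanded before usage
The other placeholders are specified separately.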
A valid example file can be found here.
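The global configuration specifies parameters that are valid in all other configuration contexts. They also serve as defaults for other configuration sections.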
global:
# How frequently to scrape targets by default.
[ scrape_interval: <duration> | default = 1m ]
# How long until a scrape request times out.
# It cannot be greater than the scrape interval.
[ scrape_timeout: <duration> | default = 10s ]
# The protocols to negotiate during a scrape with the client.
# Supported values (case sensitive): PrometheusProto, OpenMetricsText0.0.1,
# OpenMetricsText1.0.0, PrometheusText0.0.4.
# The default value changes to [ PrometheusProto, OpenMetricsText1.0.0, OpenMetricsText0.0.1, PrometheusText0.0.4 ]
# when native_histogram feature flag is set.
[ scrape_protocols: [<string>, ...] | default = [ OpenMetricsText1.0.0, OpenMetricsText0.0.1, PrometheusText0.0.4 ] ]
# How frequently to evaluate rules.
[ evaluation_interval: <duration> | default = 1m ]
# Offset the rule evaluation timestamp of this particular group by the
# specified duration into the past to ensure the underlying metrics have
# been received. Metric availability delays are more likely to occur when
# Prometheus is running as a remote write target, but can also occur when
# there are anomalies with scraping.
[ rule_query_offset: <duration> | default = 0s ]
# The labels to add to any time series or alerts when communicating with
# external systems (federation, remote storage, Alertmanager).
# Environment variable references `${var}` or `$var` are replaced according
# to the values of the current environment variables.
# References to undefined variables are replaced by the empty string.
# The `$` character can be escaped by using `$$`.
external_labels:
[ <labelname>: <labelvalue> ... ]
# File to which PromQL queries are logged.
# Reloading the configuration will reopen the file.
[ query_log_file: <string> ]
# File to which scrape failures are logged.
# Reloading the configuration will reopen the file.
[ scrape_failure_log_file: <string> ]
# An uncompressed response body larger than this many bytes will cause the
# scrape to fail. 0 means no limit. Example: 100MB.
# This is an experimental feature, this behaviour could
# change or be removed in the future.
[ body_size_limit: <size> | default = 0 ]
# Per-scrape limit on the number of scraped samples that will be accepted.
# If more than this number of samples are present after metric relabeling
# the entire scrape will be treated as failed. 0 means no limit.
[ sample_limit: <int> | default = 0 ]
# Limit on the number of labels that will be accepted per sample. If more
# than this number of labels are present on any sample post metric-relabeling,
# the entire scrape will be treated as failed. 0 means no limit.
[ label_limit: <int> | default = 0 ]
# Limit on the length (in bytes) of each individual label name. If any label
# name in a scrape is longer than this number post metric-relabeling, the
# entire scrape will be treated as failed. Note that label names are UTF-8
# encoded, and characters can take up to 4 bytes. 0 means no limit.
[ label_name_length_limit: <int> | default = 0 ]
# Limit on the length (in bytes) of each individual label value. If any label
# value in a scrape is longer than this number post metric-relabeling, the
# entire scrape will be treated as failed. Note that label values are UTF-8
# encoded, and characters can take up to 4 bytes. 0 means no limit.
[ label_value_length_limit: <int> | default = 0 ]
# Limit per scrape config on number of unique targets that will be
# accepted. If more than this number of targets are present after target
# relabeling, Prometheus will mark the targets as failed without scraping them.
# 0 means no limit. This is an experimental feature, this behaviour could
# change in the future.
[ target_limit: <int> | default = 0 ]
# Limit per scrape config on the number of targets dropped by relabeling
# that will be kept in memory. 0 means no limit.
[ keep_dropped_targets: <int> | default = 0 ]
# Specifies the validation scheme for metric and label names. Either blank or
# "utf8" for full UTF-8 support, or "legacy" for letters, numbers, colons,
# and underscores.
[ metric_name_validation_scheme: <string> | default = "utf8" ]
# Specifies whether to convert all scraped classic histograms into native
# histograms with custom buckets.
[ convert_classic_histograms_to_nhcb: <boolean> | default = false ]
# Specifies whether to scrape a classic histogram, even if it is also exposed as a native
# histogram (has no effect without --enable-feature=native-histograms).
[ always_scrape_classic_histograms: <boolean> | default = false ]
runtime:
# Configure the Go garbage collector GOGC parameter
# See: https://tip.golang.org/doc/gc-guide#GOGC
# Lowering this number increases CPU usage.
[ gogc: <int> | default = 75 ]
# Rule files specifies a list of globs. Rules and alerts are read from
# all matching files.
rule_files:
[ - <filepath_glob> ... ]
# Scrape config files specifies a list of globs. Scrape configs are read from
# all matching files and appended to the list of scrape configs.
scrape_config_files:
[ - <filepath_glob> ... ]
# A list of scrape configurations.
scrape_configs:
[ - <scrape_config> ... ]
# Alerting specifies settings related to the Alertmanager.
alerting:
  alert_relabel_configs:
    [ - <relabel_config> ... ]
  alertmanagers:
    [ - <alertmanager_config> ... ]
# Settings related to the remote write feature.
remote_write:
[ - <remote_write> ... ]
# Settings related to the OTLP receiver feature.
# See https://prometheus.ac.cn/docs/guides/opentelemetry/ for best practices.
otlp:
[ promote_resource_attributes: [<string>, ...] | default = [ ] ]
# Configures translation of OTLP metrics when received through the OTLP metrics
# endpoint. Available values:
# - "UnderscoreEscapingWithSuffixes" refers to commonly agreed normalization used
# by OpenTelemetry in https://github.com/open-telemetry/opentelemetry-collector-contrib/tree/main/pkg/translator/prometheus
# - "NoUTF8EscapingWithSuffixes" is a mode that relies on UTF-8 support in Prometheus.
# It preserves all special characters like dots, but still adds required metric name suffixes
# for units and _total, as UnderscoreEscapingWithSuffixes does.
# - (EXPERIMENTAL) "NoTranslation" is a mode that relies on UTF-8 support in Prometheus.
# It preserves all special characters like dots and won't append special suffixes for metric
# unit and type.
#
# WARNING: The "NoTranslation" setting has significant known risks and limitations (see https://prometheus.ac.cn/docs/practices/naming/
# for details):
# * Impaired UX when using PromQL in plain YAML (e.g. alerts, rules, dashboard, autoscaling configuration).
# * Series collisions which in the best case may result in OOO errors, in the worst case a silently malformed
# time series. For instance, you may end up in situation of ingesting `foo.bar` series with unit
# `seconds` and a separate series `foo.bar` with unit `milliseconds`.
[ translation_strategy: <string> | default = "UnderscoreEscapingWithSuffixes" ]
# Enables adding "service.name", "service.namespace" and "service.instance.id"
# resource attributes to the "target_info" metric, on top of converting
# them into the "instance" and "job" labels.
[ keep_identifying_resource_attributes: <boolean> | default = false ]
# Configures optional translation of OTLP explicit bucket histograms into native histograms with custom buckets.
[ convert_histograms_to_nhcb: <boolean> | default = false ]
# Settings related to the remote read feature.
remote_read:
[ - <remote_read> ... ]
# Storage related settings that are runtime reloadable.
storage:
[ tsdb: <tsdb> ]
[ exemplars: <exemplars> ]
# Configures exporting traces.
tracing:
[ <tracing_config> ]
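As a rough illustration of how the top-level sections above fit together, the following is a minimal sketch of a configuration file. The label values, the rules path, and the target address are assumptions chosen for the example and are not taken from this reference.

```yaml
# Minimal illustrative prometheus.yml (all concrete values are assumptions).
global:
  scrape_interval: 30s        # default interval for every job
  scrape_timeout: 10s         # must not exceed scrape_interval
  evaluation_interval: 30s    # how often rules are evaluated
  external_labels:
    cluster: demo             # hypothetical label attached for external systems
    region: eu-west-1         # hypothetical label

rule_files:
  - "rules/*.yml"             # hypothetical glob; rules are read from all matches

scrape_configs:
  - job_name: prometheus
    static_configs:
      - targets: ["localhost:9090"]
```

Any setting left out of the global section falls back to the defaults listed in the schema above.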
<scrape_config>
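A scrape_config section specifies a set of targets and parameters describing how to scrape them. In the general case, one scrape configuration specifies a single job. In advanced configurations, this may change.
Targets may be statically configured via the static_configs parameter or dynamically discovered using one of the supported service-discovery mechanisms.
Additionally, relabel_configs allow advanced modifications to any target and its labels before scraping.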
# The job name assigned to scraped metrics by default.
job_name: <job_name>
# How frequently to scrape targets from this job.
[ scrape_interval: <duration> | default = <global_config.scrape_interval> ]
# Per-scrape timeout when scraping this job.
# It cannot be greater than the scrape interval.
[ scrape_timeout: <duration> | default = <global_config.scrape_timeout> ]
# The protocols to negotiate during a scrape with the client.
# Supported values (case sensitive): PrometheusProto, OpenMetricsText0.0.1,
# OpenMetricsText1.0.0, PrometheusText0.0.4, PrometheusText1.0.0.
[ scrape_protocols: [<string>, ...] | default = <global_config.scrape_protocols> ]
# Fallback protocol to use if a scrape returns blank, unparseable, or otherwise
# invalid Content-Type.
# Supported values (case sensitive): PrometheusProto, OpenMetricsText0.0.1,
# OpenMetricsText1.0.0, PrometheusText0.0.4, PrometheusText1.0.0.
[ fallback_scrape_protocol: <string> ]
# Whether to scrape a classic histogram, even if it is also exposed as a native
# histogram (has no effect without --enable-feature=native-histograms).
[ always_scrape_classic_histograms: <boolean> | default = <global.always_scrape_classic_histograms> ]
# The HTTP resource path on which to fetch metrics from targets.
[ metrics_path: <path> | default = /metrics ]
# honor_labels controls how Prometheus handles conflicts between labels that are
# already present in scraped data and labels that Prometheus would attach
# server-side ("job" and "instance" labels, manually configured target
# labels, and labels generated by service discovery implementations).
#
# If honor_labels is set to "true", label conflicts are resolved by keeping label
# values from the scraped data and ignoring the conflicting server-side labels.
#
# If honor_labels is set to "false", label conflicts are resolved by renaming
# conflicting labels in the scraped data to "exported_<original-label>" (for
# example "exported_instance", "exported_job") and then attaching server-side
# labels.
#
# Setting honor_labels to "true" is useful for use cases such as federation and
# scraping the Pushgateway, where all labels specified in the target should be
# preserved.
#
# Note that any globally configured "external_labels" are unaffected by this
# setting. In communication with external systems, they are always applied only
# when a time series does not have a given label yet and are ignored otherwise.
[ honor_labels: <boolean> | default = false ]
# honor_timestamps controls whether Prometheus respects the timestamps present
# in scraped data.
#
# If honor_timestamps is set to "true", the timestamps of the metrics exposed
# by the target will be used.
#
# If honor_timestamps is set to "false", the timestamps of the metrics exposed
# by the target will be ignored.
[ honor_timestamps: <boolean> | default = true ]
# track_timestamps_staleness controls whether Prometheus tracks staleness of
# metrics that have explicit timestamps present in scraped data.
#
# If track_timestamps_staleness is set to "true", a staleness marker will be
# inserted in the TSDB when a metric is no longer present or the target
# is down.
[ track_timestamps_staleness: <boolean> | default = false ]
# Configures the protocol scheme used for requests.
[ scheme: <scheme> | default = http ]
# Optional HTTP URL parameters.
params:
[ <string>: [<string>, ...] ]
# If enable_compression is set to "false", Prometheus will request uncompressed
# response from the scraped target.
[ enable_compression: <boolean> | default = true ]
# File to which scrape failures are logged.
# Reloading the configuration will reopen the file.
[ scrape_failure_log_file: <string> ]
# HTTP client settings, including authentication methods (such as basic auth and
# authorization), proxy configurations, TLS options, custom HTTP headers, etc.
[ <http_config> ]
# List of Azure service discovery configurations.
azure_sd_configs:
[ - <azure_sd_config> ... ]
# List of Consul service discovery configurations.
consul_sd_configs:
[ - <consul_sd_config> ... ]
# List of DigitalOcean service discovery configurations.
digitalocean_sd_configs:
[ - <digitalocean_sd_config> ... ]
# List of Docker service discovery configurations.
docker_sd_configs:
[ - <docker_sd_config> ... ]
# List of Docker Swarm service discovery configurations.
dockerswarm_sd_configs:
[ - <dockerswarm_sd_config> ... ]
# List of DNS service discovery configurations.
dns_sd_configs:
[ - <dns_sd_config> ... ]
# List of EC2 service discovery configurations.
ec2_sd_configs:
[ - <ec2_sd_config> ... ]
# List of Eureka service discovery configurations.
eureka_sd_configs:
[ - <eureka_sd_config> ... ]
# List of file service discovery configurations.
file_sd_configs:
[ - <file_sd_config> ... ]
# List of GCE service discovery configurations.
gce_sd_configs:
[ - <gce_sd_config> ... ]
# List of Hetzner service discovery configurations.
hetzner_sd_configs:
[ - <hetzner_sd_config> ... ]
# List of HTTP service discovery configurations.
http_sd_configs:
[ - <http_sd_config> ... ]
# List of IONOS service discovery configurations.
ionos_sd_configs:
[ - <ionos_sd_config> ... ]
# List of Kubernetes service discovery configurations.
kubernetes_sd_configs:
[ - <kubernetes_sd_config> ... ]
# List of Kuma service discovery configurations.
kuma_sd_configs:
[ - <kuma_sd_config> ... ]
# List of Lightsail service discovery configurations.
lightsail_sd_configs:
[ - <lightsail_sd_config> ... ]
# List of Linode service discovery configurations.
linode_sd_configs:
[ - <linode_sd_config> ... ]
# List of Marathon service discovery configurations.
marathon_sd_configs:
[ - <marathon_sd_config> ... ]
# List of AirBnB's Nerve service discovery configurations.
nerve_sd_configs:
[ - <nerve_sd_config> ... ]
# List of Nomad service discovery configurations.
nomad_sd_configs:
[ - <nomad_sd_config> ... ]
# List of OpenStack service discovery configurations.
openstack_sd_configs:
[ - <openstack_sd_config> ... ]
# List of OVHcloud service discovery configurations.
ovhcloud_sd_configs:
[ - <ovhcloud_sd_config> ... ]
# List of PuppetDB service discovery configurations.
puppetdb_sd_configs:
[ - <puppetdb_sd_config> ... ]
# List of Scaleway service discovery configurations.
scaleway_sd_configs:
[ - <scaleway_sd_config> ... ]
# List of Zookeeper Serverset service discovery configurations.
serverset_sd_configs:
[ - <serverset_sd_config> ... ]
# List of Triton service discovery configurations.
triton_sd_configs:
[ - <triton_sd_config> ... ]
# List of Uyuni service discovery configurations.
uyuni_sd_configs:
[ - <uyuni_sd_config> ... ]
# List of labeled statically configured targets for this job.
static_configs:
[ - <static_config> ... ]
# List of target relabel configurations.
relabel_configs:
[ - <relabel_config> ... ]
# List of metric relabel configurations.
metric_relabel_configs:
[ - <relabel_config> ... ]
# An uncompressed response body larger than this many bytes will cause the
# scrape to fail. 0 means no limit. Example: 100MB.
# This is an experimental feature, this behaviour could
# change or be removed in the future.
[ body_size_limit: <size> | default = 0 ]
# Per-scrape limit on the number of scraped samples that will be accepted.
# If more than this number of samples are present after metric relabeling
# the entire scrape will be treated as failed. 0 means no limit.
[ sample_limit: <int> | default = 0 ]
# Limit on the number of labels that will be accepted per sample. If more
# than this number of labels are present on any sample post metric-relabeling,
# the entire scrape will be treated as failed. 0 means no limit.
[ label_limit: <int> | default = 0 ]
# Limit on the length (in bytes) of each individual label name. If any label
# name in a scrape is longer than this number post metric-relabeling, the
# entire scrape will be treated as failed. Note that label names are UTF-8
# encoded, and characters can take up to 4 bytes. 0 means no limit.
[ label_name_length_limit: <int> | default = 0 ]
# Limit on the length (in bytes) of each individual label value. If any label
# value in a scrape is longer than this number post metric-relabeling, the
# entire scrape will be treated as failed. Note that label values are UTF-8
# encoded, and characters can take up to 4 bytes. 0 means no limit.
[ label_value_length_limit: <int> | default = 0 ]
# Limit per scrape config on number of unique targets that will be
# accepted. If more than this number of targets are present after target
# relabeling, Prometheus will mark the targets as failed without scraping them.
# 0 means no limit. This is an experimental feature, this behaviour could
# change in the future.
[ target_limit: <int> | default = 0 ]
# Limit per scrape config on the number of targets dropped by relabeling
# that will be kept in memory. 0 means no limit.
[ keep_dropped_targets: <int> | default = 0 ]
# Specifies the validation scheme for metric and label names. Either blank or
# "utf8" for full UTF-8 support, or "legacy" for letters, numbers, colons, and
# underscores.
[ metric_name_validation_scheme: <string> | default = "utf8" ]
# Specifies the character escaping scheme that will be requested when scraping
# for metric and label names that do not conform to the legacy Prometheus
# character set. Available options are:
# * `allow-utf-8`: Full UTF-8 support, no escaping needed.
# * `underscores`: Escape all legacy-invalid characters to underscores.
# * `dots`: Escapes dots to `_dot_`, underscores to `__`, and all other
# legacy-invalid characters to underscores.
# * `values`: Prepend the name with `U__` and replace all invalid
# characters with their unicode value, surrounded by underscores. Single
# underscores are replaced with double underscores.
# e.g. "U__my_2e_dotted_2e_name".
# If this value is left blank, Prometheus will default to `allow-utf-8` if the
# validation scheme for the current scrape config is set to utf8, or
# `underscores` if the validation scheme is set to `legacy`.
[ metric_name_escaping_scheme: <string> ]
# Limit on total number of positive and negative buckets allowed in a single
# native histogram. The resolution of a histogram with more buckets will be
# reduced until the number of buckets is within the limit. If the limit cannot
# be reached, the scrape will fail.
# 0 means no limit.
[ native_histogram_bucket_limit: <int> | default = 0 ]
# Lower limit for the growth factor of one bucket to the next in each native
# histogram. The resolution of a histogram with a lower growth factor will be
# reduced as much as possible until it is within the limit.
# To set an upper limit for the schema (equivalent to "scale" in OTel's
# exponential histograms), use the following factor limits:
#
# +----------------------------+----------------------------+
# | growth factor              | resulting schema AKA scale |
# +----------------------------+----------------------------+
# | 65536                      | -4                         |
# | 256                        | -3                         |
# | 16                         | -2                         |
# | 4                          | -1                         |
# | 2                          | 0                          |
# | 1.4                        | 1                          |
# | 1.1                        | 2                          |
# | 1.09                       | 3                          |
# | 1.04                       | 4                          |
# | 1.02                       | 5                          |
# | 1.01                       | 6                          |
# | 1.005                      | 7                          |
# | 1.002                      | 8                          |
# +----------------------------+----------------------------+
#
# 0 results in the smallest supported factor (which is currently ~1.0027 or
# schema 8, but might change in the future).
[ native_histogram_min_bucket_factor: <float> | default = 0 ]
# Specifies whether to convert classic histograms into native histograms with
# custom buckets (has no effect without --enable-feature=native-histograms).
[ convert_classic_histograms_to_nhcb: <boolean> | default = <global.convert_classic_histograms_to_nhcb> ]
Where <job_name> must be unique across all scrape configurations.
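The sketch below shows how a few of the scrape_config options described above might be combined. It is an illustration only: the job names, host names, label values, and the sample limit are assumptions, and the second job assumes the upstream server exposes the federation endpoint at /federate.

```yaml
scrape_configs:
  # A statically configured job; job_name must be unique across all jobs.
  - job_name: node
    scrape_interval: 15s       # overrides global.scrape_interval for this job
    scrape_timeout: 10s        # must not exceed scrape_interval
    metrics_path: /metrics
    scheme: http
    sample_limit: 50000        # the scrape fails if more samples remain after relabeling
    static_configs:
      - targets:
          - "node1.example.internal:9100"   # hypothetical hosts
          - "node2.example.internal:9100"
        labels:
          env: staging                      # hypothetical target label

  # A federation job; honor_labels keeps the labels exposed by the target
  # instead of renaming them to exported_<label>.
  - job_name: federate
    honor_labels: true
    metrics_path: /federate
    params:
      "match[]":
        - '{job="node"}'
    static_configs:
      - targets: ["upstream-prometheus.example.internal:9090"]
```

Setting honor_labels to true in the second job follows the federation use case mentioned in the option's description; for ordinary targets the default of false is usually the safer choice.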
<http_config>
A http_config allows configuring HTTP requests.
# Sets the `Authorization` header on every request with the
# configured username and password.
# username and username_file are mutually exclusive.
# password and password_file are mutually exclusive.
basic_auth:
[ username: <string> ]
[ username_file: <string> ]
[ password: <secret> ]
[ password_file: <string> ]
# Sets the `Authorization` header on every request with
# the configured credentials.
authorization:
# Sets the authentication type of the request.
[ type: <string> | default = Bearer ]
# Sets the credentials of the request. It is mutually exclusive with
# `credentials_file`.
[ credentials: <secret> ]
# Sets the credentials of the request with the credentials read from the
# configured file. It is mutually exclusive with `credentials`.
[ credentials_file: <filename> ]
# Optional OAuth 2.0 configuration.
# Cannot be used at the same time as basic_auth or authorization.
oauth2:
[ <oauth2> ]
# Configure whether requests follow HTTP 3xx redirects.
[ follow_redirects: <boolean> | default = true ]
# Whether to enable HTTP2.
[ enable_http2: <boolean> | default = true ]
# Configures the request's TLS settings.
tls_config:
[ <tls_config> ]
# Optional proxy URL.
[ proxy_url: <string> ]
# Comma-separated string that can contain IPs, CIDR notation, domain names
# that should be excluded from proxying. IP and domain names can
# contain port numbers.
[ no_proxy: <string> ]
# Use proxy URL indicated by environment variables (HTTP_PROXY, http_proxy, HTTPS_PROXY, https_proxy, NO_PROXY, and no_proxy)
[ proxy_from_environment: <boolean> | default = false ]
# Specifies headers to send to proxies during CONNECT requests.
[ proxy_connect_header:
[ <string>: [<secret>, ...] ] ]
# Custom HTTP headers to be sent along with each request.
# Headers that are set by Prometheus itself can't be overwritten.
http_headers:
  # Header name.
  [ <string>:
    # Header values.
    [ values: [<string>, ...] ]
    # Header values. Hidden in configuration page.
    [ secrets: [<secret>, ...] ]
    # Files to read header values from.
    [ files: [<string>, ...] ] ]
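Because the http_config options are embedded directly in the sections that use them, they appear alongside the other keys of, for example, a scrape config. The following sketch assumes a hypothetical HTTPS target, token file path, proxy address, and header name; only one of basic_auth, authorization, or oauth2 is set, since the schema above marks them as mutually exclusive.

```yaml
scrape_configs:
  - job_name: secured-app                 # hypothetical job
    scheme: https
    authorization:
      type: Bearer
      # Hypothetical path; the file content is sent as the credentials.
      credentials_file: /etc/prometheus/secrets/app-token
    follow_redirects: false               # do not follow HTTP 3xx responses
    proxy_url: http://proxy.example.internal:3128
    http_headers:
      X-Scrape-Source:                    # hypothetical custom header name
        values: ["prometheus"]
    static_configs:
      - targets: ["app.example.internal:8443"]
```

Reading the token from a file rather than inlining it keeps the secret out of the main configuration file.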
<tls_config>
A tls_config allows configuring TLS connections.
# CA certificate to validate API server certificate with. At most one of ca and ca_file is allowed.
[ ca: <string> ]
[ ca_file: <filename> ]
# Certificate and key for client cert authentication to the server.
# At most one of cert and cert_file is allowed.
# At most one of key and key_file is allowed.
[ cert: <string> ]
[ cert_file: <filename> ]
[ key: <secret> ]
[ key_file: <filename> ]
# ServerName extension to indicate the name of the server.
# https://tools.ietf.org/html/rfc4366#section-3.1
[ server_name: <string> ]
# Disable validation of the server certificate.
[ insecure_skip_verify: <boolean> ]
# Minimum acceptable TLS version. Accepted values: TLS10 (TLS 1.0), TLS11 (TLS
# 1.1), TLS12 (TLS 1.2), TLS13 (TLS 1.3).
# If unset, Prometheus will use Go default minimum version, which is TLS 1.2.
# See MinVersion in https://pkg.go.dev/crypto/tls#Config.
[ min_version: <string> ]
# Maximum acceptable TLS version. Accepted values: TLS10 (TLS 1.0), TLS11 (TLS
# 1.1), TLS12 (TLS 1.2), TLS13 (TLS 1.3).
# If unset, Prometheus will use Go default maximum version, which is TLS 1.3.
# See MaxVersion in https://pkg.go.dev/crypto/tls#Config.
[ max_version: <string> ]
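A possible use of tls_config is mutual TLS against a target whose certificate name differs from the address being scraped. The sketch below is illustrative; the file paths, server name, and IP address are assumptions.

```yaml
scrape_configs:
  - job_name: mtls-app                    # hypothetical job
    scheme: https
    tls_config:
      ca_file: /etc/prometheus/tls/ca.crt        # CA used to verify the server certificate
      cert_file: /etc/prometheus/tls/client.crt  # client certificate presented to the server
      key_file: /etc/prometheus/tls/client.key   # private key for the client certificate
      server_name: app.example.internal          # name expected in the server certificate
      min_version: TLS12
    static_configs:
      - targets: ["10.0.0.12:8443"]
```

Setting server_name is useful here because the target is addressed by IP while the certificate is presumably issued for a host name.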
<oauth2>
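OAuth 2.0 authentication using the client credentials or password grant type. Prometheus fetches an access token from the specified endpoint with the given client access and secret keys.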
client_id: <string>
[ client_secret: <secret> ]
# Read the client secret from a file.
# It is mutually exclusive with `client_secret`.
[ client_secret_file: <filename> ]
# Scopes for the token request.
scopes:
[ - <string> ... ]
# The URL to fetch the token from.
token_url: <string>
# Optional parameters to append to the token URL.
# To set 'password' grant type, add it to params:
# endpoint_params:
#   grant_type: 'password'
#   username: 'user@example.com'
#   password: 'strongpassword'
endpoint_params:
[ <string>: <string> ... ]
# Configures the token request's TLS settings.
tls_config:
[ <tls_config> ]
# Optional proxy URL.
[ proxy_url: <string> ]
# Comma-separated string that can contain IPs, CIDR notation, domain names
# that should be excluded from proxying. IP and domain names can
# contain port numbers.
[ no_proxy: <string> ]
# Use proxy URL indicated by environment variables (HTTP_PROXY, http_proxy, HTTPS_PROXY, https_proxy, NO_PROXY, and no_proxy)
[ proxy_from_environment: <boolean> | default = false ]
# Specifies headers to send to proxies during CONNECT requests.
[ proxy_connect_header:
[ <string>: [<secret>, ...] ] ]
# Custom HTTP headers to be sent along with each request.
# Headers that are set by Prometheus itself can't be overwritten.
http_headers:
  # Header name.
  [ <string>:
    # Header values.
    [ values: [<string>, ...] ]
    # Header values. Hidden in configuration page.
    [ secrets: [<secret>, ...] ]
    # Files to read header values from.
    [ files: [<string>, ...] ] ]
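As an illustration of the oauth2 block with the client credentials grant, the sketch below nests it inside a scrape config. The client ID, secret file path, token URL, scope, and the audience parameter are all assumptions; which endpoint_params are needed, if any, depends on the identity provider.

```yaml
scrape_configs:
  - job_name: oauth-app                   # hypothetical job
    scheme: https
    oauth2:
      client_id: prometheus-scraper       # hypothetical client
      # Hypothetical path; mutually exclusive with client_secret.
      client_secret_file: /etc/prometheus/secrets/oauth-client-secret
      token_url: https://auth.example.internal/oauth2/token
      scopes:
        - metrics.read                    # hypothetical scope
      endpoint_params:
        audience: https://app.example.internal   # provider-specific, assumed here
    static_configs:
      - targets: ["app.example.internal:8443"]
```

Since oauth2 cannot be combined with basic_auth or authorization, neither of those keys appears in this job.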
<azure_sd_config>
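Azure SD configurations allow retrieving scrape targets from Azure VMs.
The discovery requires at least the following permissions:
Microsoft.Compute/virtualMachines/read: required for VM discovery
Microsoft.Network/networkInterfaces/read: required for VM discovery
Microsoft.Compute/virtualMachineScaleSets/virtualMachines/read: required for scale set (VMSS) discovery
Microsoft.Compute/virtualMachineScaleSets/virtualMachines/networkInterfaces/read: required for scale set (VMSS) discovery
The following meta labels are available on targets during relabeling:
__meta_azure_machine_id: the machine ID
__meta_azure_machine_location: the location the machine runs in
__meta_azure_machine_name: the machine name
__meta_azure_machine_computer_name: the machine computer name
__meta_azure_machine_os_type: the machine operating system
__meta_azure_machine_private_ip: the machine's private IP
__meta_azure_machine_public_ip: the machine's public IP, if it exists
__meta_azure_machine_resource_group: the machine's resource group
__meta_azure_machine_tag_<tagname>: each tag value of the machine
__meta_azure_machine_scale_set: the name of the scale set the VM is part of (this value is only set when scale sets are used)
__meta_azure_machine_size: the machine size
__meta_azure_subscription_id: the subscription ID
__meta_azure_tenant_id: the tenant ID
See below for the configuration options for Azure discovery: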
# The information to access the Azure API.
# The Azure environment.
[ environment: <string> | default = AzurePublicCloud ]
# The authentication method, either OAuth, ManagedIdentity or SDK.
# See https://docs.microsoft.com/en-us/azure/active-directory/managed-identities-azure-resources/overview
# SDK authentication method uses environment variables by default.
# See https://learn.microsoft.com/en-us/azure/developer/go/azure-sdk-authentication
[ authentication_method: <string> | default = OAuth ]
# The subscription ID. Always required.
subscription_id: <string>
# Optional tenant ID. Only required with authentication_method OAuth.
[ tenant_id: <string> ]
# Optional client ID. Only required with authentication_method OAuth.
[ client_id: <string> ]
# Optional client secret. Only required with authentication_method OAuth.
[ client_secret: <secret> ]
# Optional resource group name. Limits discovery to this resource group.
[ resource_group: <string> ]
# Refresh interval to re-read the instance list.
[ refresh_interval: <duration> | default = 300s ]
# The port to scrape metrics from. If using the public IP address, this must
# instead be specified in the relabeling rule.
[ port: <int> | default = 80 ]
# HTTP client settings, including authentication methods (such as basic auth and
# authorization), proxy configurations, TLS options, custom HTTP headers, etc.
[ <http_config> ]
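A sketch of Azure discovery combined with relabeling might look as follows. The subscription ID is a placeholder, and the resource group, port, and the "environment" VM tag are assumptions for the example.

```yaml
scrape_configs:
  - job_name: azure-vms                   # hypothetical job
    azure_sd_configs:
      - subscription_id: "00000000-0000-0000-0000-000000000000"  # placeholder
        authentication_method: ManagedIdentity  # no tenant/client credentials needed
        resource_group: monitoring-demo         # hypothetical; limits discovery
        refresh_interval: 5m
        port: 9100                              # assumed exporter port on the private IP
    relabel_configs:
      # Use the VM name as the instance label.
      - source_labels: [__meta_azure_machine_name]
        target_label: instance
      # Copy a hypothetical "environment" VM tag into a target label.
      - source_labels: [__meta_azure_machine_tag_environment]
        target_label: environment
```

With the default OAuth authentication method, tenant_id, client_id, and client_secret would have to be set as well.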
<consul_sd_config>
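Consul SD configurations allow retrieving scrape targets from Consul's Catalog API.
The following meta labels are available on targets during relabeling:
__meta_consul_address: the address of the target
__meta_consul_dc: the datacenter name for the target
__meta_consul_health: the health status of the service
__meta_consul_partition: the admin partition name where the service is registered
__meta_consul_metadata_<key>: each node metadata key value of the target
__meta_consul_node: the node name defined for the target
__meta_consul_service_address: the service address of the target
__meta_consul_service_id: the service ID of the target
__meta_consul_service_metadata_<key>: each service metadata key value of the target
__meta_consul_service_port: the service port of the target
__meta_consul_service: the name of the service the target belongs to
__meta_consul_tagged_address_<key>: each node tagged address key value of the target
__meta_consul_tags: the list of tags of the target joined by the tag separator
# The information to access the Consul API. It is to be defined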
# as the Consul documentation requires.
[ server: <host> | default = "localhost:8500" ]
# Prefix for URIs for when consul is behind an API gateway (reverse proxy).
[ path_prefix: <string> ]
[ token: <secret> ]
[ datacenter: <string> ]
# Namespaces are only supported in Consul Enterprise.
[ namespace: <string> ]
# Admin Partitions are only supported in Consul Enterprise.
[ partition: <string> ]
[ scheme: <string> | default = "http" ]
# The username and password fields are deprecated in favor of the basic_auth configuration.
[ username: <string> ]
[ password: <secret> ]
# A list of services for which targets are retrieved. If omitted, all services
# are scraped.
services:
[ - <string> ]
# A Consul Filter expression used to filter the catalog results
# See https://www.consul.io/api-docs/catalog#list-services to know more
# about the filter expressions that can be used.
[ filter: <string> ]
# The `tags` and `node_meta` fields are deprecated in Consul in favor of `filter`.
# An optional list of tags used to filter nodes for a given service. Services must contain all tags in the list.
tags:
[ - <string> ]
# Node metadata key/value pairs to filter nodes for a given service. As of Consul 1.14, consider `filter` instead.
[ node_meta:
[ <string>: <string> ... ] ]
# The string by which Consul tags are joined into the tag label.
[ tag_separator: <string> | default = , ]
# Allow stale Consul results (see https://www.consul.io/api/features/consistency.html). Will reduce load on Consul.
[ allow_stale: <boolean> | default = true ]
# The time after which the provided names are refreshed.
# On large setup it might be a good idea to increase this value because the catalog will change all the time.
[ refresh_interval: <duration> | default = 30s ]
# HTTP client settings, including authentication methods (such as basic auth and
# authorization), proxy configurations, TLS options, custom HTTP headers, etc.
[ <http_config> ]
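Note that the IP address and port used to scrape the targets are assembled as <__meta_consul_address>:<__meta_consul_service_port>. However, in some Consul setups, the relevant address is in __meta_consul_service_address. In those cases, you can use the relabel feature to replace the special __address__ label.
The relabeling phase is the preferred and more powerful way to filter services or nodes based on arbitrary labels. For users with thousands of services, it can be more efficient to use the Consul API directly, which has basic support for filtering nodes (currently by node metadata and a single tag).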
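The address replacement described above could be sketched as follows. The job name and the service name in the keep rule are assumptions; the relabel rule only fires when __meta_consul_service_address is non-empty, so targets without a service address should keep the default node address.

```yaml
scrape_configs:
  - job_name: consul-services             # hypothetical job
    consul_sd_configs:
      - server: "localhost:8500"
    relabel_configs:
      # Keep only one hypothetical service; filtering via relabeling.
      - source_labels: [__meta_consul_service]
        regex: 'web'
        action: keep
      # Source label values are joined with ";" by default. The regex requires a
      # non-empty service address, otherwise __address__ is left unchanged.
      - source_labels: [__meta_consul_service_address, __meta_consul_service_port]
        regex: '(.+);(\d+)'
        target_label: __address__
        replacement: '$1:$2'
```

For very large catalogs, restricting discovery with the services or filter options may be preferable to dropping targets during relabeling, in line with the note above.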
<digitalocean_sd_config>
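DigitalOcean SD configurations allow retrieving scrape targets from DigitalOcean's Droplets API. This service discovery uses the public IPv4 address by default, but that can be changed with relabeling, as demonstrated in the Prometheus digitalocean-sd configuration file.
The following meta labels are available on targets during relabeling:
__meta_digitalocean_droplet_id: the ID of the droplet
__meta_digitalocean_droplet_name: the name of the droplet
__meta_digitalocean_image: the slug of the droplet's image
__meta_digitalocean_image_name: the display name of the droplet's image
__meta_digitalocean_private_ipv4: the private IPv4 of the droplet
__meta_digitalocean_public_ipv4: the public IPv4 of the droplet
__meta_digitalocean_public_ipv6: the public IPv6 of the droplet
__meta_digitalocean_region: the region of the droplet
__meta_digitalocean_size: the size of the droplet
__meta_digitalocean_status: the status of the droplet
__meta_digitalocean_features: the comma-separated list of features of the droplet
__meta_digitalocean_tags: the comma-separated list of tags of the droplet
__meta_digitalocean_vpc: the ID of the droplet's VPC
# The port to scrape metrics from.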
[ port: <int> | default = 80 ]
# The time after which the droplets are refreshed.
[ refresh_interval: <duration> | default = 60s ]
# HTTP client settings, including authentication methods (such as basic auth and
# authorization), proxy configurations, TLS options, custom HTTP headers, etc.
[ <http_config> ]
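The switch from the public to the private IPv4 address mentioned above could look roughly like the following. This is a sketch under assumptions: the job name, the token placeholder, and port 9100 are illustrative and not taken from the source.

```yaml
scrape_configs:
  - job_name: digitalocean-droplets    # hypothetical job name
    digitalocean_sd_configs:
      - authorization:                 # part of the shared <http_config> settings
          credentials: "<api token>"   # placeholder, supply your own token
        port: 9100                     # assumed exporter port
    relabel_configs:
      # Scrape over the private network when a private IPv4 exists.
      - source_labels: [__meta_digitalocean_private_ipv4]
        regex: "(.+)"
        target_label: __address__
        replacement: "$1:9100"
```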
<docker_sd_config>
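Docker SD configurations allow retrieving scrape targets from Docker Engine hosts.
This SD discovers "containers" and creates a target for each network IP and port the container is configured to expose.
Available meta labels:
__meta_docker_container_id: the ID of the container
__meta_docker_container_name: the name of the container
__meta_docker_container_network_mode: the network mode of the container
__meta_docker_container_label_<labelname>: each label of the container, with any unsupported characters converted to an underscore
__meta_docker_network_id: the ID of the network
__meta_docker_network_name: the name of the network
__meta_docker_network_ingress: whether the network is ingress
__meta_docker_network_internal: whether the network is internal
__meta_docker_network_label_<labelname>: each label of the network, with any unsupported characters converted to an underscore
__meta_docker_network_scope: the scope of the network
__meta_docker_network_ip: the IP of the container in this network
__meta_docker_port_private: the port on the container
__meta_docker_port_public: the external port if a port mapping exists
__meta_docker_port_public_ip: the public IP if a port mapping exists
See below for the configuration options for Docker discovery: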
# Address of the Docker daemon.
host: <string>
# The port to scrape metrics from, when `role` is nodes, and for discovered
# tasks and services that don't have published ports.
[ port: <int> | default = 80 ]
# The host to use if the container is in host networking mode.
[ host_networking_host: <string> | default = "localhost" ]
# Sort all non-nil networks in ascending order based on network name and
# get the first network if the container has multiple networks defined,
# thus avoiding collecting duplicate targets.
[ match_first_network: <boolean> | default = true ]
# Optional filters to limit the discovery process to a subset of available
# resources.
# The available filters are listed in the upstream documentation:
# https://docs.docker.net.cn/engine/api/v1.40/#operation/ContainerList
[ filters:
  [ - name: <string>
      values: <string>, [...] ]
# The time after which the containers are refreshed.
[ refresh_interval: <duration> | default = 60s ]
# HTTP client settings, including authentication methods (such as basic auth and
# authorization), proxy configurations, TLS options, custom HTTP headers, etc.
[ <http_config> ]
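The relabeling phase is the preferred and more powerful way to filter containers. For users with thousands of containers it can be more efficient to use the Docker API directly, which has basic support for filtering containers (using filters).
See this example Prometheus configuration file for a detailed example of configuring Prometheus for Docker Engine.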
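To illustrate how API-side filters and relabeling can be combined, here is a small sketch. The socket path, the job name, and the container label prometheus-job are assumptions made for this example; the label filter name comes from the Docker ContainerList API.

```yaml
scrape_configs:
  - job_name: docker-containers              # hypothetical job name
    docker_sd_configs:
      - host: unix:///var/run/docker.sock    # assumed local Docker daemon socket
        refresh_interval: 30s
        # Ask the Docker API to return only containers that carry the label.
        filters:
          - name: label
            values: ["prometheus-job"]
    relabel_configs:
      # The "-" in the container label is converted to "_" in the meta label.
      - source_labels: [__meta_docker_container_label_prometheus_job]
        regex: ".+"
        action: keep
      - source_labels: [__meta_docker_container_label_prometheus_job]
        target_label: job
```

The filter reduces the amount of data fetched from the daemon, while the keep rule makes the selection explicit even if the filter is later removed.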
<dockerswarm_sd_config>
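Docker Swarm SD configurations allow retrieving scrape targets from Docker Swarm engine.
One of the following roles can be configured to discover targets:
services
The services role discovers all Swarm services and exposes their ports as targets. For each published port of a service, a single target is generated. If a service has no published ports, a target per service is created using the port parameter defined in the SD configuration.
Available meta labels:
__meta_dockerswarm_service_id: the ID of the service
__meta_dockerswarm_service_name: the name of the service
__meta_dockerswarm_service_mode: the mode of the service
__meta_dockerswarm_service_endpoint_port_name: the name of the endpoint port, if available
__meta_dockerswarm_service_endpoint_port_publish_mode: the publish mode of the endpoint port
__meta_dockerswarm_service_label_<labelname>: each label of the service, with any unsupported characters converted to an underscore
__meta_dockerswarm_service_task_container_hostname: the container hostname of the target, if available
__meta_dockerswarm_service_task_container_image: the container image of the target
__meta_dockerswarm_service_updating_status: the status of the service, if available
__meta_dockerswarm_network_id: the ID of the network
__meta_dockerswarm_network_name: the name of the network
__meta_dockerswarm_network_ingress: whether the network is ingress
__meta_dockerswarm_network_internal: whether the network is internal
__meta_dockerswarm_network_label_<labelname>: each label of the network, with any unsupported characters converted to an underscore
__meta_dockerswarm_network_scope: the scope of the network
tasks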
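The tasks role discovers all Swarm tasks and exposes their ports as targets. For each published port of a task, a single target is generated. If a task has no published ports, a target per task is created using the port parameter defined in the SD configuration.
Available meta labels:
__meta_dockerswarm_container_label_<labelname>: each label of the container, with any unsupported characters converted to an underscore
__meta_dockerswarm_task_id: the ID of the task
__meta_dockerswarm_task_container_id: the container ID of the task
__meta_dockerswarm_task_desired_state: the desired state of the task
__meta_dockerswarm_task_slot: the slot of the task
__meta_dockerswarm_task_state: the state of the task
__meta_dockerswarm_task_port_publish_mode: the publish mode of the task port
__meta_dockerswarm_service_id: the ID of the service
__meta_dockerswarm_service_name: the name of the service
__meta_dockerswarm_service_mode: the mode of the service
__meta_dockerswarm_service_label_<labelname>: each label of the service, with any unsupported characters converted to an underscore
__meta_dockerswarm_network_id: the ID of the network
__meta_dockerswarm_network_name: the name of the network
__meta_dockerswarm_network_ingress: whether the network is ingress
__meta_dockerswarm_network_internal: whether the network is internal
__meta_dockerswarm_network_label_<labelname>: each label of the network, with any unsupported characters converted to an underscore
__meta_dockerswarm_network_label: each label of the network, with any unsupported characters converted to an underscore
__meta_dockerswarm_network_scope: the scope of the network
__meta_dockerswarm_node_id: the ID of the node
__meta_dockerswarm_node_hostname: the hostname of the node
__meta_dockerswarm_node_address: the address of the node
__meta_dockerswarm_node_availability: the availability of the node
__meta_dockerswarm_node_label_<labelname>: each label of the node, with any unsupported characters converted to an underscore
__meta_dockerswarm_node_platform_architecture: the architecture of the node
__meta_dockerswarm_node_platform_os: the operating system of the node
__meta_dockerswarm_node_role: the role of the node
__meta_dockerswarm_node_status: the status of the node
The __meta_dockerswarm_network_* meta labels are not populated for ports that are published with mode=host.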
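nodes
The nodes role is used to discover Swarm nodes.
Available meta labels:
__meta_dockerswarm_node_address: the address of the node
__meta_dockerswarm_node_availability: the availability of the node
__meta_dockerswarm_node_engine_version: the version of the node engine
__meta_dockerswarm_node_hostname: the hostname of the node
__meta_dockerswarm_node_id: the ID of the node
__meta_dockerswarm_node_label_<labelname>: each label of the node, with any unsupported characters converted to an underscore
__meta_dockerswarm_node_manager_address: the address of the manager component of the node
__meta_dockerswarm_node_manager_leader: the leadership status of the manager component of the node (true or false)
__meta_dockerswarm_node_manager_reachability: the reachability of the manager component of the node
__meta_dockerswarm_node_platform_architecture: the architecture of the node
__meta_dockerswarm_node_platform_os: the operating system of the node
__meta_dockerswarm_node_role: the role of the node
__meta_dockerswarm_node_status: the status of the node
See below for the configuration options for Docker Swarm discovery: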
# Address of the Docker daemon.
host: <string>
# Role of the targets to retrieve. Must be `services`, `tasks`, or `nodes`.
role: <string>
# The port to scrape metrics from, when `role` is nodes, and for discovered
# tasks and services that don't have published ports.
[ port: <int> | default = 80 ]
# Optional filters to limit the discovery process to a subset of available
# resources.
# The available filters are listed in the upstream documentation:
# Services: https://docs.docker.net.cn/engine/api/v1.40/#operation/ServiceList
# Tasks: https://docs.docker.net.cn/engine/api/v1.40/#operation/TaskList
# Nodes: https://docs.docker.net.cn/engine/api/v1.40/#operation/NodeList
[ filters:
  [ - name: <string>
      values: <string>, [...] ]
# The time after which the service discovery data is refreshed.
[ refresh_interval: <duration> | default = 60s ]
# HTTP client settings, including authentication methods (such as basic auth and
# authorization), proxy configurations, TLS options, custom HTTP headers, etc.
[ <http_config> ]
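The relabeling phase is the preferred and more powerful way to filter tasks, services or nodes. For users with thousands of tasks it can be more efficient to use the Swarm API directly, which has basic support for filtering nodes (using filters).
See this example Prometheus configuration file for a detailed example of configuring Prometheus for Docker Swarm.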
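A minimal sketch of the tasks role, assuming a local Docker socket and a hypothetical job name, keeps only tasks whose desired state is running and copies the service name into a regular label:

```yaml
scrape_configs:
  - job_name: swarm-tasks                    # hypothetical job name
    dockerswarm_sd_configs:
      - host: unix:///var/run/docker.sock    # assumed manager node socket
        role: tasks
        port: 9100                           # used only for tasks without published ports
    relabel_configs:
      # Drop tasks that Swarm no longer intends to run.
      - source_labels: [__meta_dockerswarm_task_desired_state]
        regex: running
        action: keep
      - source_labels: [__meta_dockerswarm_service_name]
        target_label: service
```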
<dns_sd_config>
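A DNS-based service discovery configuration allows specifying a set of DNS domain names which are periodically queried to discover a list of targets. The DNS servers to be contacted are read from /etc/resolv.conf.
This service discovery method only supports basic DNS A, AAAA, MX, NS and SRV record queries, but not the advanced DNS-SD approach specified in RFC6763.
The following meta labels are available on targets during relabeling:
__meta_dns_name: the record name that produced the discovered target
__meta_dns_srv_record_target: the target field of the SRV record
__meta_dns_srv_record_port: the port field of the SRV record
__meta_dns_mx_record_target: the target field of the MX record
__meta_dns_ns_record_target: the target field of the NS record
# A list of DNS domain names to be queried.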
names:
  [ - <string> ]
# The type of DNS query to perform. One of SRV, A, AAAA, MX or NS.
[ type: <string> | default = 'SRV' ]
# The port number used if the query type is not SRV.
[ port: <int>]
# The time after which the provided names are refreshed.
[ refresh_interval: <duration> | default = 30s ]
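The difference between SRV and non-SRV queries can be sketched as follows. The domain names, job name, and port are placeholders invented for this example: SRV records carry their own port, while A records need the port option.

```yaml
scrape_configs:
  - job_name: dns-discovered                 # hypothetical job name
    dns_sd_configs:
      # SRV is the default type; host and port both come from the record.
      - names:
          - "_prometheus._tcp.example.com"
      # A records carry no port, so one must be configured.
      - names:
          - "nodes.example.com"
        type: A
        port: 9100
        refresh_interval: 1m
```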
<ec2_sd_config>
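EC2 SD configurations allow retrieving scrape targets from AWS EC2 instances. The private IP address is used by default, but may be changed to the public IP address with relabeling.
The IAM credentials used must have the ec2:DescribeInstances permission to discover scrape targets, and may optionally have the ec2:DescribeAvailabilityZones permission if you want the availability zone ID available as a label (see below).
The following meta labels are available on targets during relabeling:
__meta_ec2_ami: the EC2 Amazon Machine Image
__meta_ec2_architecture: the architecture of the instance
__meta_ec2_availability_zone: the availability zone in which the instance is running
__meta_ec2_availability_zone_id: the availability zone ID in which the instance is running (requires ec2:DescribeAvailabilityZones)
__meta_ec2_instance_id: the EC2 instance ID
__meta_ec2_instance_lifecycle: the lifecycle of the EC2 instance, set only for "spot" or "scheduled" instances, absent otherwise
__meta_ec2_instance_state: the state of the EC2 instance
__meta_ec2_instance_type: the type of the EC2 instance
__meta_ec2_ipv6_addresses: comma-separated list of IPv6 addresses assigned to the instance's network interfaces, if present
__meta_ec2_owner_id: the ID of the AWS account that owns the EC2 instance
__meta_ec2_platform: the operating system platform, set to "windows" on Windows servers, absent otherwise
__meta_ec2_primary_ipv6_addresses: comma-separated list of the primary IPv6 addresses of the instance, if present. The list is ordered by the position of each corresponding network interface in the attachment order.
__meta_ec2_primary_subnet_id: the subnet ID of the primary network interface, if available
__meta_ec2_private_dns_name: the private DNS name of the instance, if available
__meta_ec2_private_ip: the private IP address of the instance, if present
__meta_ec2_public_dns_name: the public DNS name of the instance, if available
__meta_ec2_public_ip: the public IP address of the instance, if available
__meta_ec2_region: the region of the instance
__meta_ec2_subnet_id: comma-separated list of subnet IDs in which the instance is running, if available
__meta_ec2_tag_<tagkey>: each tag value of the instance
__meta_ec2_vpc_id: the ID of the VPC in which the instance is running, if available
See below for the configuration options for EC2 discovery: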
# The information to access the EC2 API.
# The AWS region. If blank, the region from the instance metadata is used.
[ region: <string> ]
# Custom endpoint to be used.
[ endpoint: <string> ]
# The AWS API keys. If blank, the environment variables `AWS_ACCESS_KEY_ID`
# and `AWS_SECRET_ACCESS_KEY` are used.
[ access_key: <string> ]
[ secret_key: <secret> ]
# Named AWS profile used to connect to the API.
[ profile: <string> ]
# AWS Role ARN, an alternative to using AWS API keys.
[ role_arn: <string> ]
# Refresh interval to re-read the instance list.
[ refresh_interval: <duration> | default = 60s ]
# The port to scrape metrics from. If using the public IP address, this must
# instead be specified in the relabeling rule.
[ port: <int> | default = 80 ]
# Filters can be used optionally to filter the instance list by other criteria.
# Available filter criteria can be found here:
# https://docs.aws.amazon.com/AWSEC2/latest/APIReference/API_DescribeInstances.html
# Filter API documentation: https://docs.aws.amazon.com/AWSEC2/latest/APIReference/API_Filter.html
filters:
  [ - name: <string>
      values: <string>, [...] ]
# HTTP client settings, including authentication methods (such as basic auth and
# authorization), proxy configurations, TLS options, custom HTTP headers, etc.
[ <http_config> ]
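The relabeling phase is the preferred and more powerful way to filter targets based on arbitrary labels. For users with thousands of instances it can be more efficient to use the EC2 API directly, which has support for filtering instances.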
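Both approaches can be combined. The sketch below assumes a region, a tag named Environment, and port 9100, none of which come from the source: the API filter narrows the instance list, and relabeling switches the target to the public IP, which requires setting the port in the rule.

```yaml
scrape_configs:
  - job_name: ec2-nodes                      # hypothetical job name
    ec2_sd_configs:
      - region: eu-west-1                    # assumed region
        port: 9100                           # applies to the default private IP address
        filters:
          - name: "tag:Environment"          # hypothetical instance tag
            values: ["production"]
    relabel_configs:
      # When using the public IP, the port must be given in the replacement.
      - source_labels: [__meta_ec2_public_ip]
        regex: "(.+)"
        target_label: __address__
        replacement: "$1:9100"
      - source_labels: [__meta_ec2_availability_zone]
        target_label: zone
```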
<openstack_sd_config>
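OpenStack SD configurations allow retrieving scrape targets from OpenStack Nova instances.
One of the following <openstack_role> types can be configured to discover targets:
hypervisor
The hypervisor role discovers one target per Nova hypervisor node. The target address defaults to the host_ip attribute of the hypervisor.
The following meta labels are available on targets during relabeling:
__meta_openstack_hypervisor_host_ip: the hypervisor node's IP address.
__meta_openstack_hypervisor_hostname: the hypervisor node's name.
__meta_openstack_hypervisor_id: the hypervisor node's ID.
__meta_openstack_hypervisor_state: the hypervisor node's state.
__meta_openstack_hypervisor_status: the hypervisor node's status.
__meta_openstack_hypervisor_type: the hypervisor node's type.
instance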
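The instance role discovers one target per network interface of each Nova instance. The target address defaults to the private IP address of the network interface.
The following meta labels are available on targets during relabeling:
__meta_openstack_address_pool: the pool of the private IP.
__meta_openstack_instance_flavor: the flavor name of the OpenStack instance, or the flavor ID if the flavor name is not available.
__meta_openstack_instance_id: the OpenStack instance ID.
__meta_openstack_instance_image: the ID of the image the OpenStack instance is using.
__meta_openstack_instance_name: the OpenStack instance name.
__meta_openstack_instance_status: the status of the OpenStack instance.
__meta_openstack_private_ip: the private IP of the OpenStack instance.
__meta_openstack_project_id: the project (tenant) owning this instance.
__meta_openstack_public_ip: the public IP of the OpenStack instance.
__meta_openstack_tag_<key>: each metadata item of the instance, with any unsupported characters converted to an underscore.
__meta_openstack_user_id: the user account owning the tenant.
loadbalancer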
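The loadbalancer role discovers one target per Octavia load balancer with a PROMETHEUS listener. The target address defaults to the VIP address of the load balancer.
The following meta labels are available on targets during relabeling:
__meta_openstack_loadbalancer_availability_zone: the availability zone of the OpenStack load balancer.
__meta_openstack_loadbalancer_floating_ip: the floating IP of the OpenStack load balancer.
__meta_openstack_loadbalancer_id: the OpenStack load balancer ID.
__meta_openstack_loadbalancer_name: the OpenStack load balancer name.
__meta_openstack_loadbalancer_provider: the Octavia provider of the OpenStack load balancer.
__meta_openstack_loadbalancer_operating_status: the operating status of the OpenStack load balancer.
__meta_openstack_loadbalancer_provisioning_status: the provisioning status of the OpenStack load balancer.
__meta_openstack_loadbalancer_tags: comma-separated list of the OpenStack load balancer's tags.
__meta_openstack_loadbalancer_vip: the VIP of the OpenStack load balancer.
__meta_openstack_project_id: the project (tenant) owning this load balancer.
See below for the configuration options for OpenStack discovery: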
# The information to access the OpenStack API.
# The OpenStack role of entities that should be discovered.
role: <openstack_role>
# The OpenStack Region.
region: <string>
# identity_endpoint specifies the HTTP endpoint that is required to work with
# the Identity API of the appropriate version. While it's ultimately needed by
# all of the identity services, it will often be populated by a provider-level
# function.
[ identity_endpoint: <string> ]
# username is required if using Identity V2 API. Consult with your provider's
# control panel to discover your account's username. In Identity V3, either
# userid or a combination of username and domain_id or domain_name are needed.
[ username: <string> ]
[ userid: <string> ]
# password for the Identity V2 and V3 APIs. Consult with your provider's
# control panel to discover your account's preferred method of authentication.
[ password: <secret> ]
# At most one of domain_id and domain_name must be provided if using username
# with Identity V3. Otherwise, either are optional.
[ domain_name: <string> ]
[ domain_id: <string> ]
# The project_id and project_name fields are optional for the Identity V2 API.
# Some providers allow you to specify a project_name instead of the project_id.
# Some require both. Your provider's authentication policies will determine
# how these fields influence authentication.
[ project_name: <string> ]
[ project_id: <string> ]
# The application_credential_id or application_credential_name fields are
# required if using an application credential to authenticate. Some providers
# allow you to create an application credential to authenticate rather than a
# password.
[ application_credential_name: <string> ]
[ application_credential_id: <string> ]
# The application_credential_secret field is required if using an application
# credential to authenticate.
[ application_credential_secret: <secret> ]
# Whether the service discovery should list all instances for all projects.
# It is only relevant for the 'instance' role and usually requires admin permissions.
[ all_tenants: <boolean> | default = false ]
# Refresh interval to re-read the instance list.
[ refresh_interval: <duration> | default = 60s ]
# The port to scrape metrics from. If using the public IP address, this must
# instead be specified in the relabeling rule.
[ port: <int> | default = 80 ]
# The availability of the endpoint to connect to. Must be one of public, admin or internal.
[ availability: <string> | default = "public" ]
# TLS configuration.
tls_config:
  [ <tls_config> ]
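For orientation, an instance-role configuration using Identity V3 with a username and domain name might look like the sketch below. Every value (endpoint URL, region, user, domain, project, port) is a placeholder assumed for illustration; your provider's authentication policies determine which fields are actually required.

```yaml
scrape_configs:
  - job_name: openstack-instances                            # hypothetical job name
    openstack_sd_configs:
      - role: instance
        region: RegionOne                                    # assumed region name
        identity_endpoint: "https://identity.example.com/v3" # placeholder URL
        username: prometheus                                 # placeholder user
        password: "<password>"                               # placeholder secret
        domain_name: Default                                 # needed with username on Identity V3
        project_name: monitoring                             # placeholder project
        port: 9100
    relabel_configs:
      # Keep only instances that report an ACTIVE status.
      - source_labels: [__meta_openstack_instance_status]
        regex: ACTIVE
        action: keep
      - source_labels: [__meta_openstack_instance_name]
        target_label: instance_name
```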
<ovhcloud_sd_config>
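OVHcloud SD configurations allow retrieving scrape targets from OVHcloud's dedicated servers and VPS using their API. Prometheus periodically checks the REST endpoint and creates a target for every discovered server. The discovery tries the public IPv4 address as the default address and, if there is none, falls back to the IPv6 address. This may be changed with relabeling. For OVHcloud's public cloud instances you can use the openstack_sd_config.
__meta_ovhcloud_vps_cluster: the cluster of the server
__meta_ovhcloud_vps_datacenter: the datacenter of the server
__meta_ovhcloud_vps_disk: the disk of the server
__meta_ovhcloud_vps_display_name: the display name of the server
__meta_ovhcloud_vps_ipv4: the IPv4 of the server
__meta_ovhcloud_vps_ipv6: the IPv6 of the server
__meta_ovhcloud_vps_keymap: the KVM keyboard layout of the server
__meta_ovhcloud_vps_maximum_additional_ip: the maximum additional IPs of the server
__meta_ovhcloud_vps_memory_limit: the memory limit of the server
__meta_ovhcloud_vps_memory: the memory of the server
__meta_ovhcloud_vps_monitoring_ip_blocks: the monitoring IP blocks of the server
__meta_ovhcloud_vps_name: the name of the server
__meta_ovhcloud_vps_netboot_mode: the netboot mode of the server
__meta_ovhcloud_vps_offer_type: the offer type of the server
__meta_ovhcloud_vps_offer: the offer of the server
__meta_ovhcloud_vps_state: the state of the server
__meta_ovhcloud_vps_vcore: the number of virtual cores of the server
__meta_ovhcloud_vps_version: the version of the server
__meta_ovhcloud_vps_zone: the zone of the server
__meta_ovhcloud_dedicated_server_commercial_range: the commercial range of the server
__meta_ovhcloud_dedicated_server_datacenter: the datacenter of the server
__meta_ovhcloud_dedicated_server_ipv4: the IPv4 of the server
__meta_ovhcloud_dedicated_server_ipv6: the IPv6 of the server
__meta_ovhcloud_dedicated_server_link_speed: the link speed of the server
__meta_ovhcloud_dedicated_server_name: the name of the server
__meta_ovhcloud_dedicated_server_no_intervention: whether datacenter intervention is disabled for the server
__meta_ovhcloud_dedicated_server_os: the operating system of the server
__meta_ovhcloud_dedicated_server_rack: the rack of the server
__meta_ovhcloud_dedicated_server_reverse: the reverse DNS name of the server
__meta_ovhcloud_dedicated_server_server_id: the ID of the server
__meta_ovhcloud_dedicated_server_state: the state of the server
__meta_ovhcloud_dedicated_server_support_level: the support level of the server
See below for the configuration options for OVHcloud discovery: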
# Access key to use. https://api.ovh.com
application_key: <string>
application_secret: <secret>
consumer_key: <secret>
# Service of the targets to retrieve. Must be `vps` or `dedicated_server`.
service: <string>
# API endpoint. https://github.com/ovh/go-ovh#supported-apis
[ endpoint: <string> | default = "ovh-eu" ]
# Refresh interval to re-read the resources list.
[ refresh_interval: <duration> | default = 60s ]
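Because this SD configuration has no port option, a scrape port would typically be attached through relabeling. The following is a sketch under assumptions: the job name, the credential placeholders, and port 9100 are illustrative only.

```yaml
scrape_configs:
  - job_name: ovhcloud-vps                             # hypothetical job name
    ovhcloud_sd_configs:
      - service: vps
        endpoint: ovh-eu
        application_key: "<application key>"           # placeholder
        application_secret: "<application secret>"     # placeholder
        consumer_key: "<consumer key>"                 # placeholder
        refresh_interval: 5m
    relabel_configs:
      # Build the scrape address from the discovered IPv4 plus an assumed port.
      - source_labels: [__meta_ovhcloud_vps_ipv4]
        regex: "(.+)"
        target_label: __address__
        replacement: "$1:9100"
      - source_labels: [__meta_ovhcloud_vps_name]
        target_label: instance
```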
<puppetdb_sd_config>
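PuppetDB SD configurations allow retrieving scrape targets from PuppetDB resources.
This SD discovers resources and creates a target for each resource returned by the API.
The resource address is the certname of the resource and can be changed during relabeling.
The following meta labels are available on targets during relabeling:
__meta_puppetdb_query: the Puppet Query Language (PQL) query
__meta_puppetdb_certname: the name of the node associated with the resource
__meta_puppetdb_resource: a SHA-1 hash of the resource's type, title, and parameters, for identification
__meta_puppetdb_type: the resource type
__meta_puppetdb_title: the resource title
__meta_puppetdb_exported: whether the resource is exported ("true" or "false")
__meta_puppetdb_tags: comma-separated list of resource tags
__meta_puppetdb_file: the manifest file in which the resource was declared
__meta_puppetdb_environment: the environment of the node associated with the resource
__meta_puppetdb_parameter_<parametername>: the parameters of the resource
See below for the configuration options for PuppetDB discovery: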
# The URL of the PuppetDB root query endpoint.
url: <string>
# Puppet Query Language (PQL) query. Only resources are supported.
# https://puppet.com/docs/puppetdb/latest/api/query/v4/pql.html
query: <string>
# Whether to include the parameters as meta labels.
# Due to the differences between parameter types and Prometheus labels,
# some parameters might not be rendered. The format of the parameters might
# also change in future releases.
#
# Note: Enabling this exposes parameters in the Prometheus UI and API. Make sure
# that you don't have secrets exposed as parameters if you enable this.
[ include_parameters: <boolean> | default = false ]
# Refresh interval to re-read the resources list.
[ refresh_interval: <duration> | default = 60s ]
# The port to scrape metrics from.
[ port: <int> | default = 80 ]
# HTTP client settings, including authentication methods (such as basic auth and
# authorization), proxy configurations, TLS options, custom HTTP headers, etc.
[ <http_config> ]
See this example Prometheus configuration file for a detailed example of configuring Prometheus with PuppetDB.
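As a rough sketch, the fragment below discovers every node on which a given package resource is declared and scrapes it by certname. The PuppetDB URL, the package title node_exporter, and the port are assumptions made for this example.

```yaml
scrape_configs:
  - job_name: puppetdb-node-exporter                   # hypothetical job name
    puppetdb_sd_configs:
      - url: "https://puppetdb.example.com"            # placeholder PuppetDB URL
        # PQL query; only resources are supported by this SD.
        query: 'resources { type = "Package" and title = "node_exporter" }'
        port: 9100
        include_parameters: false                      # avoid exposing parameters as labels
    relabel_configs:
      - source_labels: [__meta_puppetdb_environment]
        target_label: environment
```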
<file_sd_config>
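File-based service discovery provides a more generic way to configure static targets and serves as an interface to plug in custom service discovery mechanisms.
It reads a set of files containing a list of zero or more <static_config>s. Changes to all defined files are detected via disk watches and applied immediately.
While those individual files are watched for changes, the parent directory is also watched implicitly. This is to handle atomic renaming efficiently and to detect new files that match the configured globs. This may cause issues if the parent directory contains a large number of other files, as each of these files will be watched too, even though the events related to them are not relevant.
Files may be provided in YAML or JSON format. Only changes resulting in well-formed target groups are applied.
Files must contain a list of static configs, using these formats:
JSON
[
  {
    "targets": [ "<host>", ... ],
    "labels": {
      "<labelname>": "<labelvalue>", ...
    }
  },
  ...
]
YAML
- targets:
  [ - '<host>' ]
  labels:
    [ <labelname>: <labelvalue> ... ]
As a fallback, the file contents are also re-read periodically at the specified refresh interval.
Each target has a meta label __meta_filepath during the relabeling phase. Its value is set to the file path from which the target was extracted.
There is a list of integrations with this discovery mechanism.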
# Patterns for files from which target groups are extracted.
files:
  [ - <filename_pattern> ... ]
# Refresh interval to re-read the files.
[ refresh_interval: <duration> | default = 5m ]
Where <filename_pattern> may be a path ending in .json, .yml or .yaml. The last path segment may contain a single * that matches any character sequence, e.g. my/path/tg_*.json.
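Putting the pieces together, a file-based setup consists of a scrape configuration that names a glob and one or more target files that match it. Both YAML documents below are a sketch: the job name, file paths, hosts, and labels are hypothetical.

```yaml
# prometheus.yml fragment
scrape_configs:
  - job_name: file-discovered          # hypothetical job name
    file_sd_configs:
      - files:
          - "targets/tg_*.yml"         # single * in the last path segment
        refresh_interval: 1m           # periodic fallback re-read
---
# targets/tg_web.yml, a hypothetical file matched by the glob above
- targets:
    - "web-1.example.com:9100"
    - "web-2.example.com:9100"
  labels:
    env: production
```

Writing the target file to a temporary name and renaming it into place is a common way to avoid Prometheus reading a half-written file, which is the atomic rename case the directory watch is designed to handle.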
<gce_sd_config>
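GCE SD configurations allow retrieving scrape targets from GCP GCE instances. The private IP address is used by default, but may be changed to the public IP address with relabeling.
The following meta labels are available on targets during relabeling:
__meta_gce_instance_id: the numeric ID of the instance
__meta_gce_instance_name: the name of the instance
__meta_gce_label_<labelname>: each GCE label of the instance, with any unsupported characters converted to an underscore
__meta_gce_machine_type: full or partial URL of the machine type of the instance
__meta_gce_metadata_<name>: each metadata item of the instance
__meta_gce_network: the network URL of the instance
__meta_gce_private_ip: the private IP address of the instance
__meta_gce_interface_ipv4_<name>: IPv4 address of each named interface
__meta_gce_project: the GCP project in which the instance is running
__meta_gce_public_ip: the public IP address of the instance, if present
__meta_gce_subnetwork: the subnetwork URL of the instance
__meta_gce_tags: comma-separated list of instance tags
__meta_gce_zone: the GCE zone URL in which the instance is running
See below for the configuration options for GCE discovery: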
# The information to access the GCE API.
# The GCP Project
project: <string>
# The zone of the scrape targets. If you need multiple zones use multiple
# gce_sd_configs.
zone: <string>
# Filter can be used optionally to filter the instance list by other criteria
# Syntax of this filter string is described here in the filter query parameter section:
# https://cloud.google.com/compute/docs/reference/latest/instances/list
[ filter: <string> ]
# Refresh interval to re-read the instance list
[ refresh_interval: <duration> | default = 60s ]
# The port to scrape metrics from. If using the public IP address, this must
# instead be specified in the relabeling rule.
[ port: <int> | default = 80 ]
# The tag separator is used to separate the tags on concatenation
[ tag_separator: <string> | default = , ]
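Credentials are discovered by the Google Cloud SDK default client by looking in the following places, preferring the first location found:
1. a JSON file specified by the GOOGLE_APPLICATION_CREDENTIALS environment variable
2. a JSON file in the well-known path $HOME/.config/gcloud/application_default_credentials.json
If Prometheus is running within GCE, the service account associated with the instance it is running on should have at least read-only permissions to the compute resources. If running outside of GCE, make sure to create an appropriate service account and place the credential file in one of the expected locations.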
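A minimal GCE sketch, assuming a project ID, two zones, and port 9100 that are purely illustrative, shows the one-entry-per-zone pattern and the relabeling needed to scrape the public IP instead of the private one:

```yaml
scrape_configs:
  - job_name: gce-nodes                      # hypothetical job name
    gce_sd_configs:
      # One entry per zone; project and zones are placeholders.
      - project: my-project
        zone: europe-west1-b
        port: 9100
      - project: my-project
        zone: europe-west1-c
        port: 9100
    relabel_configs:
      # When using the public IP, the port must be given in the replacement.
      - source_labels: [__meta_gce_public_ip]
        regex: "(.+)"
        target_label: __address__
        replacement: "$1:9100"
      - source_labels: [__meta_gce_instance_name]
        target_label: instance
```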
<hetzner_sd_config>
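Hetzner SD configurations allow retrieving scrape targets from the Hetzner Cloud API and Robot API. This service discovery uses the public IPv4 address by default, but that can be changed with relabeling, as demonstrated in the Prometheus hetzner-sd configuration file.
The following meta labels are available on all targets during relabeling:
__meta_hetzner_server_id: the ID of the server
__meta_hetzner_server_name: the name of the server
__meta_hetzner_server_status: the status of the server
__meta_hetzner_public_ipv4: the public IPv4 address of the server
__meta_hetzner_public_ipv6_network: the public IPv6 network (/64) of the server
__meta_hetzner_datacenter: the datacenter of the server
The labels below are only available for targets with role set to hcloud:
__meta_hetzner_hcloud_image_name: the image name of the server
__meta_hetzner_hcloud_image_description: the description of the server image
__meta_hetzner_hcloud_image_os_flavor: the OS flavor of the server image
__meta_hetzner_hcloud_image_os_version: the OS version of the server image
__meta_hetzner_hcloud_datacenter_location: the location of the server
__meta_hetzner_hcloud_datacenter_location_network_zone: the network zone of the server
__meta_hetzner_hcloud_server_type: the type of the server
__meta_hetzner_hcloud_cpu_cores: the CPU core count of the server
__meta_hetzner_hcloud_cpu_type: the CPU type of the server (shared or dedicated)
__meta_hetzner_hcloud_memory_size_gb: the amount of memory of the server (in GB)
__meta_hetzner_hcloud_disk_size_gb: the disk size of the server (in GB)
__meta_hetzner_hcloud_private_ipv4_<networkname>: the private IPv4 address of the server within a given network
__meta_hetzner_hcloud_label_<labelname>: each label of the server, with any unsupported characters converted to an underscore
__meta_hetzner_hcloud_labelpresent_<labelname>: true for each label of the server, with any unsupported characters converted to an underscore
The labels below are only available for targets with role set to robot:
__meta_hetzner_robot_product: the product of the server
__meta_hetzner_robot_cancelled: the server cancellation status
# The Hetzner role of entities that should be discovered.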
# One of robot or hcloud.
role: <string>
# The port to scrape metrics from.
[ port: <int> | default = 80 ]
# The time after which the servers are refreshed.
[ refresh_interval: <duration> | default = 60s ]
# Label selector used to filter the servers when fetching them from the API. See https://docs.hetzner.cloud/#label-selector for more details.
# Only used when role is hcloud.
[ label_selector: <string> ]
# HTTP client settings, including authentication methods (such as basic auth and
# authorization), proxy configurations, TLS options, custom HTTP headers, etc.
[ <http_config> ]
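The following sketch uses the hcloud role with a label selector and switches the scrape address to a private network IP. The token placeholder, the selector env=prod, the network name internal, and port 9100 are all assumptions introduced for illustration.

```yaml
scrape_configs:
  - job_name: hetzner-cloud                      # hypothetical job name
    hetzner_sd_configs:
      - role: hcloud
        authorization:                           # part of the shared <http_config> settings
          credentials: "<hcloud api token>"      # placeholder
        label_selector: "env=prod"               # hypothetical server label
        port: 9100
    relabel_configs:
      # "internal" is a hypothetical private network name.
      - source_labels: [__meta_hetzner_hcloud_private_ipv4_internal]
        regex: "(.+)"
        target_label: __address__
        replacement: "$1:9100"
      - source_labels: [__meta_hetzner_datacenter]
        target_label: datacenter
```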
<http_sd_config>
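HTTP-based service discovery provides a more generic way to configure static targets and serves as an interface to plug in custom service discovery mechanisms.
It fetches targets from an HTTP endpoint containing a list of zero or more <static_config>s. The target must reply with an HTTP 200 response. The HTTP header Content-Type must be application/json, and the body must be valid JSON.
Example response body:
[
  {
    "targets": [ "<host>", ... ],
    "labels": {
      "<labelname>": "<labelvalue>", ...
    }
  },
  ...
]
The endpoint is queried periodically at the specified refresh interval. The prometheus_sd_http_failures_total counter metric tracks the number of refresh failures.
Each target has a meta label __meta_url during the relabeling phase. Its value is set to the URL from which the target was extracted.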
# URL from which the targets are fetched.
url: <string>
# Refresh interval to re-query the endpoint.
[ refresh_interval: <duration> | default = 60s ]
# HTTP client settings, including authentication methods (such as basic auth and
# authorization), proxy configurations, TLS options, custom HTTP headers, etc.
[ <http_config> ]
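A scrape configuration that consumes such an endpoint could be as small as the sketch below. The URL, job name, and the discovered_from label are hypothetical; the relabel rule merely records which endpoint produced each target.

```yaml
scrape_configs:
  - job_name: http-discovered                        # hypothetical job name
    http_sd_configs:
      - url: "http://sd.example.com/targets"         # placeholder endpoint returning the JSON list
        refresh_interval: 30s
    relabel_configs:
      # Preserve the source URL as a regular label for debugging.
      - source_labels: [__meta_url]
        target_label: discovered_from
```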
<ionos_sd_config>
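IONOS SD configurations allow retrieving scrape targets from the IONOS Cloud API. This service discovery uses the first NIC's IP address by default, but that can be changed with relabeling. The following meta labels are available on all targets during relabeling:
__meta_ionos_server_availability_zone: the availability zone of the server
__meta_ionos_server_boot_cdrom_id: the ID of the CD-ROM the server is booted from
__meta_ionos_server_boot_image_id: the ID of the boot image or snapshot the server is booted from
__meta_ionos_server_boot_volume_id: the ID of the boot volume
__meta_ionos_server_cpu_family: the CPU family of the server
__meta_ionos_server_id: the ID of the server
__meta_ionos_server_ip: comma-separated list of all IPs assigned to the server
__meta_ionos_server_lifecycle: the lifecycle state of the server resource
__meta_ionos_server_name: the name of the server
__meta_ionos_server_nic_ip_<nic_name>: comma-separated list of IPs, grouped by the name of each NIC attached to the server
__meta_ionos_server_servers_id: the ID of the servers the server belongs to
__meta_ionos_server_state: the execution state of the server
__meta_ionos_server_type: the type of the server
# The unique ID of the data center.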
datacenter_id: <string>
# The port to scrape metrics from.
[ port: <int> | default = 80 ]
# The time after which the servers are refreshed.
[ refresh_interval: <duration> | default = 60s ]
# HTTP client settings, including authentication methods (such as basic auth and
# authorization), proxy configurations, TLS options, custom HTTP headers, etc.
[ <http_config> ]
<kubernetes_sd_config>
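Kubernetes SD configurations allow retrieving scrape targets from Kubernetes' REST API and always staying synchronized with the cluster state.
One of the following role types can be configured to discover targets:
node
The node role discovers one target per cluster node, with the address defaulting to the Kubelet's HTTP port. The target address defaults to the first existing address of the Kubernetes node object in the address type order of NodeInternalIP, NodeExternalIP, NodeLegacyHostIP, and NodeHostName.
Available meta labels:
__meta_kubernetes_node_name: The name of the node object.
__meta_kubernetes_node_provider_id: The cloud provider's name for the node object.
__meta_kubernetes_node_label_<labelname>: Each label from the node object, with any unsupported characters converted to an underscore.
__meta_kubernetes_node_labelpresent_<labelname>: true for each label from the node object, with any unsupported characters converted to an underscore.
__meta_kubernetes_node_annotation_<annotationname>: Each annotation from the node object.
__meta_kubernetes_node_annotationpresent_<annotationname>: true for each annotation from the node object.
__meta_kubernetes_node_address_<address_type>: The first address for each node address type, if it exists.
In addition, the instance label for the node will be set to the node name as retrieved from the API server.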
service
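The service role discovers a target for each service port of each service. This is generally useful for blackbox monitoring of a service. The address will be set to the Kubernetes DNS name of the service and the respective service port.
Available meta labels:
__meta_kubernetes_namespace: The namespace of the service object.
__meta_kubernetes_service_annotation_<annotationname>: Each annotation from the service object.
__meta_kubernetes_service_annotationpresent_<annotationname>: "true" for each annotation of the service object.
__meta_kubernetes_service_cluster_ip: The cluster IP address of the service. (Does not apply to services of type ExternalName)
__meta_kubernetes_service_loadbalancer_ip: The IP address of the load balancer. (Applies to services of type LoadBalancer)
__meta_kubernetes_service_external_name: The DNS name of the service. (Applies to services of type ExternalName)
__meta_kubernetes_service_label_<labelname>: Each label from the service object, with any unsupported characters converted to an underscore.
__meta_kubernetes_service_labelpresent_<labelname>: true for each label of the service object, with any unsupported characters converted to an underscore.
__meta_kubernetes_service_name: The name of the service object.
__meta_kubernetes_service_port_name: Name of the service port for the target.
__meta_kubernetes_service_port_number: Number of the service port for the target.
__meta_kubernetes_service_port_protocol: Protocol of the service port for the target.
__meta_kubernetes_service_type: The type of the service.
pod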
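The pod role discovers all pods and exposes their containers as targets. For each declared port of a container, a single target is generated. If a container has no specified ports, a port-free target per container is created so that a port can be added manually via relabeling.
Available meta labels:
__meta_kubernetes_namespace: The namespace of the pod object.
__meta_kubernetes_pod_name: The name of the pod object.
__meta_kubernetes_pod_ip: The pod IP of the pod object.
__meta_kubernetes_pod_label_<labelname>: Each label from the pod object, with any unsupported characters converted to an underscore.
__meta_kubernetes_pod_labelpresent_<labelname>: true for each label from the pod object, with any unsupported characters converted to an underscore.
__meta_kubernetes_pod_annotation_<annotationname>: Each annotation from the pod object.
__meta_kubernetes_pod_annotationpresent_<annotationname>: true for each annotation from the pod object.
__meta_kubernetes_pod_container_init: true if the container is an InitContainer.
__meta_kubernetes_pod_container_name: Name of the container the target address points to.
__meta_kubernetes_pod_container_id: ID of the container the target address points to. The ID is in the form <type>://<container_id>.
__meta_kubernetes_pod_container_image: The image the container is using.
__meta_kubernetes_pod_container_port_name: Name of the container port.
__meta_kubernetes_pod_container_port_number: Number of the container port.
__meta_kubernetes_pod_container_port_protocol: Protocol of the container port.
__meta_kubernetes_pod_ready: Set to true or false for the pod's ready state.
__meta_kubernetes_pod_phase: Set to Pending, Running, Succeeded, Failed or Unknown in the lifecycle.
__meta_kubernetes_pod_node_name: The name of the node the pod is scheduled onto.
__meta_kubernetes_pod_host_ip: The current host IP of the pod object.
__meta_kubernetes_pod_uid: The UID of the pod object.
__meta_kubernetes_pod_controller_kind: Object kind of the pod controller.
__meta_kubernetes_pod_controller_name: Name of the pod controller.
endpoints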
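The endpoints role discovers targets from the listed endpoints of a service. For each endpoint address, one target is discovered per port. If the endpoint is backed by a pod, all additional container ports of the pod that are not bound to an endpoint port are discovered as targets as well.
Available meta labels:
__meta_kubernetes_namespace: The namespace of the endpoints object.
__meta_kubernetes_endpoints_name: The name of the endpoints object.
__meta_kubernetes_endpoints_label_<labelname>: Each label from the endpoints object, with any unsupported characters converted to an underscore.
__meta_kubernetes_endpoints_labelpresent_<labelname>: true for each label from the endpoints object, with any unsupported characters converted to an underscore.
__meta_kubernetes_endpoints_annotation_<annotationname>: Each annotation from the endpoints object.
__meta_kubernetes_endpoints_annotationpresent_<annotationname>: true for each annotation from the endpoints object.
__meta_kubernetes_endpoint_hostname: Hostname of the endpoint.
__meta_kubernetes_endpoint_node_name: Name of the node hosting the endpoint.
__meta_kubernetes_endpoint_ready: Set to true or false for the endpoint's ready state.
__meta_kubernetes_endpoint_port_name: Name of the endpoint port.
__meta_kubernetes_endpoint_port_protocol: Protocol of the endpoint port.
__meta_kubernetes_endpoint_address_target_kind: Kind of the endpoint address target.
__meta_kubernetes_endpoint_address_target_name: Name of the endpoint address target.
If the endpoints belong to a service, all labels of the role: service discovery are attached.
For all targets backed by a pod, all labels of the role: pod discovery are attached.
endpointslice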
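The endpointslice role discovers targets from existing endpointslices. For each endpoint address referenced in the endpointslice object, one target is discovered. If the endpoint is backed by a pod, all additional container ports of the pod that are not bound to an endpoint port are discovered as targets as well.
This role requires the discovery.k8s.io/v1 API version (available since Kubernetes v1.21).
Available meta labels:
__meta_kubernetes_namespace: The namespace of the endpoints object.
__meta_kubernetes_endpointslice_name: The name of the endpointslice object.
__meta_kubernetes_endpointslice_label_<labelname>: Each label from the endpointslice object, with any unsupported characters converted to an underscore.
__meta_kubernetes_endpointslice_labelpresent_<labelname>: true for each label from the endpointslice object, with any unsupported characters converted to an underscore.
__meta_kubernetes_endpointslice_annotation_<annotationname>: Each annotation from the endpointslice object.
__meta_kubernetes_endpointslice_annotationpresent_<annotationname>: true for each annotation from the endpointslice object.
__meta_kubernetes_endpointslice_address_target_kind: Kind of the referenced object.
__meta_kubernetes_endpointslice_address_target_name: Name of the referenced object.
__meta_kubernetes_endpointslice_address_type: The IP protocol family of the address of the target.
__meta_kubernetes_endpointslice_endpoint_conditions_ready: Set to true or false for the referenced endpoint's ready state.
__meta_kubernetes_endpointslice_endpoint_conditions_serving: Set to true or false for the referenced endpoint's serving state.
__meta_kubernetes_endpointslice_endpoint_conditions_terminating: Set to true or false for the referenced endpoint's terminating state.
__meta_kubernetes_endpointslice_endpoint_topology_kubernetes_io_hostname: Name of the node hosting the referenced endpoint.
__meta_kubernetes_endpointslice_endpoint_topology_present_kubernetes_io_hostname: Flag that shows whether the referenced object has a kubernetes.io/hostname annotation.
__meta_kubernetes_endpointslice_endpoint_hostname: Hostname of the referenced endpoint.
__meta_kubernetes_endpointslice_endpoint_node_name: Name of the node hosting the referenced endpoint.
__meta_kubernetes_endpointslice_endpoint_zone: Zone the referenced endpoint exists in.
__meta_kubernetes_endpointslice_port: Port of the referenced endpoint.
__meta_kubernetes_endpointslice_port_name: Named port of the referenced endpoint.
__meta_kubernetes_endpointslice_port_protocol: Protocol of the referenced endpoint.
If the endpoints belong to a service, all labels of the role: service discovery are attached.
For all targets backed by a pod, all labels of the role: pod discovery are attached.
ingress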
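The ingress role discovers a target for each path of each ingress. This is generally useful for blackbox monitoring of an ingress. The address will be set to the host specified in the ingress spec.
This role requires the networking.k8s.io/v1 API version (available since Kubernetes v1.19).
Available meta labels:
__meta_kubernetes_namespace: The namespace of the ingress object.
__meta_kubernetes_ingress_name: The name of the ingress object.
__meta_kubernetes_ingress_label_<labelname>: Each label from the ingress object, with any unsupported characters converted to an underscore.
__meta_kubernetes_ingress_labelpresent_<labelname>: true for each label from the ingress object, with any unsupported characters converted to an underscore.
__meta_kubernetes_ingress_annotation_<annotationname>: Each annotation from the ingress object.
__meta_kubernetes_ingress_annotationpresent_<annotationname>: true for each annotation from the ingress object.
__meta_kubernetes_ingress_class_name: Class name from the ingress spec, if present.
__meta_kubernetes_ingress_scheme: Protocol scheme of the ingress, https if TLS config is set. Defaults to http.
__meta_kubernetes_ingress_path: Path from the ingress spec. Defaults to /.
See below for the configuration options for Kubernetes discovery: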
# The information to access the Kubernetes API.
# The API server addresses. If left empty, Prometheus is assumed to run inside
# of the cluster and will discover API servers automatically and use the pod's
# CA certificate and bearer token file at /var/run/secrets/kubernetes.io/serviceaccount/.
[ api_server: <host> ]
# The Kubernetes role of entities that should be discovered.
# One of endpoints, endpointslice, service, pod, node, or ingress.
role: <string>
# Optional path to a kubeconfig file.
# Note that api_server and kubeconfig_file are mutually exclusive.
[ kubeconfig_file: <filename> ]
# Optional namespace discovery. If omitted, all namespaces are used.
namespaces:
  own_namespace: <boolean>
  names:
    [ - <string> ]
# Optional label and field selectors to limit the discovery process to a subset of available resources.
# See https://kubernetes.ac.cn/docs/concepts/overview/working-with-objects/field-selectors/
# and https://kubernetes.ac.cn/docs/concepts/overview/working-with-objects/labels/ to learn more about the possible
# filters that can be used. The endpoints role supports pod, service and endpoints selectors.
# The pod role supports node selectors when configured with `attach_metadata: {node: true}`.
# Other roles only support selectors matching the role itself (e.g. node role can only contain node selectors).
# Note: When making decision about using field/label selector make sure that this
# is the best approach - it will prevent Prometheus from reusing single list/watch
# for all scrape configs. This might result in a bigger load on the Kubernetes API,
# because per each selector combination there will be additional LIST/WATCH. On the other hand,
# if you just want to monitor small subset of pods in large cluster it's recommended to use selectors.
# Decision, if selectors should be used or not depends on the particular situation.
[ selectors:
  [ - role: <string>
    [ label: <string> ]
    [ field: <string> ] ]]
# Optional metadata to attach to discovered targets. If omitted, no additional metadata is attached.
attach_metadata:
  # Attaches node metadata to discovered targets. Valid for roles: pod, endpoints, endpointslice.
  # When set to true, Prometheus must have permissions to get Nodes.
  [ node: <boolean> | default = false ]
# HTTP client settings, including authentication methods (such as basic auth and
# authorization), proxy configurations, TLS options, custom HTTP headers, etc.
[ <http_config> ]
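See this example Prometheus configuration file for a detailed example of configuring Prometheus for Kubernetes.
You may wish to check out the third-party Prometheus Operator, which automates the Prometheus setup on top of Kubernetes.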
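As a rough illustration of the ingress role described above, the following sketch probes every ingress path through a blackbox exporter. The job name, the `monitoring` namespace, the `http_2xx` module and the exporter address `blackbox-exporter.example.com:9115` are assumptions made for this example; they are not defined anywhere in this document.

```yaml
scrape_configs:
  - job_name: kubernetes-ingresses        # hypothetical job name
    metrics_path: /probe
    params:
      module: [http_2xx]                  # assumed blackbox exporter module
    kubernetes_sd_configs:
      - role: ingress
        namespaces:
          names:
            - monitoring                  # hypothetical namespace
    relabel_configs:
      # Build the probe URL from scheme, host (__address__) and path.
      - source_labels: [__meta_kubernetes_ingress_scheme, __address__, __meta_kubernetes_ingress_path]
        regex: (.+);(.+);(.+)
        replacement: ${1}://${2}${3}
        target_label: __param_target
      # Send the scrape to the blackbox exporter instead of the ingress host.
      - target_label: __address__
        replacement: blackbox-exporter.example.com:9115   # assumed address
      # Keep the probed URL visible as the instance label.
      - source_labels: [__param_target]
        target_label: instance
```

The three source labels are joined with the default `;` separator, which is why the regex splits on semicolons.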
<kuma_sd_config>
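Kuma SD configurations allow retrieving scrape targets from the Kuma control plane.
This SD discovers "monitoring assignments" based on Kuma Dataplane Proxies, via the MADS v1 (Monitoring Assignment Discovery Service) xDS API, and will create a target for each proxy inside a Prometheus-enabled mesh.
The following meta labels are available for each target:
__meta_kuma_mesh: the name of the proxy's Mesh
__meta_kuma_dataplane: the name of the proxy
__meta_kuma_service: the name of the proxy's associated Service
__meta_kuma_label_<tagname>: each tag of the proxy
See below for the configuration options for Kuma MonitoringAssignment discovery: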
# Address of the Kuma Control Plane's MADS xDS server.
server: <string>
# Client id is used by Kuma Control Plane to compute Monitoring Assignment for specific Prometheus backend.
# This is useful when migrating between multiple Prometheus backends, or having separate backend for each Mesh.
# When not specified, system hostname/fqdn will be used if available, if not `prometheus` will be used.
[ client_id: <string> ]
# The time to wait between polling update requests.
[ refresh_interval: <duration> | default = 30s ]
# The time after which the monitoring assignments are refreshed.
[ fetch_timeout: <duration> | default = 2m ]
# HTTP client settings, including authentication methods (such as basic auth and
# authorization), proxy configurations, TLS options, custom HTTP headers, etc.
[ <http_config> ]
The relabeling phase is the preferred and more powerful way to filter proxies and user-defined tags.
<lightsail_sd_config>
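Lightsail SD configurations allow retrieving scrape targets from AWS Lightsail instances. The private IP address is used by default, but may be changed to the public IP address with relabeling.
The following meta labels are available on targets during relabeling:
__meta_lightsail_availability_zone: the availability zone in which the instance is running
__meta_lightsail_blueprint_id: the Lightsail blueprint ID
__meta_lightsail_bundle_id: the Lightsail bundle ID
__meta_lightsail_instance_name: the name of the Lightsail instance
__meta_lightsail_instance_state: the state of the Lightsail instance
__meta_lightsail_instance_support_code: the support code of the Lightsail instance
__meta_lightsail_ipv6_addresses: comma-separated list of IPv6 addresses assigned to the instance's network interfaces, if present
__meta_lightsail_private_ip: the private IP address of the instance
__meta_lightsail_public_ip: the public IP address of the instance, if available
__meta_lightsail_region: the region of the instance
__meta_lightsail_tag_<tagkey>: each tag value of the instance
See below for the configuration options for Lightsail discovery: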
# The information to access the Lightsail API.
# The AWS region. If blank, the region from the instance metadata is used.
[ region: <string> ]
# Custom endpoint to be used.
[ endpoint: <string> ]
# The AWS API keys. If blank, the environment variables `AWS_ACCESS_KEY_ID`
# and `AWS_SECRET_ACCESS_KEY` are used.
[ access_key: <string> ]
[ secret_key: <secret> ]
# Named AWS profile used to connect to the API.
[ profile: <string> ]
# AWS Role ARN, an alternative to using AWS API keys.
[ role_arn: <string> ]
# Refresh interval to re-read the instance list.
[ refresh_interval: <duration> | default = 60s ]
# The port to scrape metrics from. If using the public IP address, this must
# instead be specified in the relabeling rule.
[ port: <int> | default = 80 ]
# HTTP client settings, including authentication methods (such as basic auth and
# authorization), proxy configurations, TLS options, custom HTTP headers, etc.
[ <http_config> ]
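To make the public-IP case concrete, here is a minimal sketch of a Lightsail job that scrapes the public address instead of the default private one. The region, the job name and port 9100 are assumptions for illustration. Because the public address is set through relabeling, the port has to be written into the relabeling rule rather than taken from `port`.

```yaml
scrape_configs:
  - job_name: lightsail                   # hypothetical job name
    lightsail_sd_configs:
      - region: us-east-1                 # assumed region
    relabel_configs:
      # Skip instances that have no public IP address.
      - source_labels: [__meta_lightsail_public_ip]
        regex: .+
        action: keep
      # Scrape the public IP; the port must be given here.
      - source_labels: [__meta_lightsail_public_ip]
        target_label: __address__
        replacement: ${1}:9100            # assumed exporter port
```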
<linode_sd_config>
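Linode SD configurations allow retrieving scrape targets from Linode's Linode APIv4. This service discovery uses the public IPv4 address by default, but that can be changed with relabeling, as demonstrated in the Prometheus linode-sd configuration file.
The Linode APIv4 token must be created with the following scopes: linodes:read_only, ips:read_only, and events:read_only.
The following meta labels are available on targets during relabeling:
__meta_linode_instance_id: the ID of the Linode instance
__meta_linode_instance_label: the label of the Linode instance
__meta_linode_image: the slug of the Linode instance's image
__meta_linode_private_ipv4: the private IPv4 of the Linode instance
__meta_linode_public_ipv4: the public IPv4 of the Linode instance
__meta_linode_public_ipv6: the public IPv6 of the Linode instance
__meta_linode_private_ipv4_rdns: the reverse DNS for the first private IPv4 of the Linode instance
__meta_linode_public_ipv4_rdns: the reverse DNS for the first public IPv4 of the Linode instance
__meta_linode_public_ipv6_rdns: the reverse DNS for the first public IPv6 of the Linode instance
__meta_linode_region: the region of the Linode instance
__meta_linode_type: the type of the Linode instance
__meta_linode_status: the status of the Linode instance
__meta_linode_tags: a list of tags of the Linode instance joined by the tag separator
__meta_linode_group: the display group the Linode instance is a member of
__meta_linode_gpus: the number of GPUs of the Linode instance
__meta_linode_hypervisor: the virtualization software powering the Linode instance
__meta_linode_backups: the backup service status of the Linode instance
__meta_linode_specs_disk_bytes: the amount of storage space the Linode instance has access to
__meta_linode_specs_memory_bytes: the amount of RAM the Linode instance has access to
__meta_linode_specs_vcpus: the number of VCPUs this Linode has access to
__meta_linode_specs_transfer_bytes: the amount of network transfer the Linode instance is allotted each month
__meta_linode_extra_ips: a list of all extra IPv4 addresses assigned to the Linode instance joined by the tag separator
__meta_linode_ipv6_ranges: a list of IPv6 ranges with mask assigned to the Linode instance joined by the tag separator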
# Optional region to filter on.
[ region: <string> ]
# The port to scrape metrics from.
[ port: <int> | default = 80 ]
# The string by which Linode Instance tags are joined into the tag label.
[ tag_separator: <string> | default = , ]
# The time after which the linode instances are refreshed.
[ refresh_interval: <duration> | default = 60s ]
# HTTP client settings, including authentication methods (such as basic auth and
# authorization), proxy configurations, TLS options, custom HTTP headers, etc.
[ <http_config> ]
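The sketch below shows one way the options and meta labels above might be combined: it authenticates with an API token, keeps only instances carrying a `monitoring` tag, and scrapes the private IPv4 address. The token file path, region, tag name and port 9100 are assumptions for this example.

```yaml
scrape_configs:
  - job_name: linode                      # hypothetical job name
    linode_sd_configs:
      - authorization:
          credentials_file: /etc/prometheus/linode-token   # hypothetical path
        region: us-east                   # assumed region filter
        tag_separator: ","
    relabel_configs:
      # Keep instances tagged "monitoring"; the pattern tolerates the tag
      # appearing anywhere in the separator-joined list.
      - source_labels: [__meta_linode_tags]
        regex: (.*,)?monitoring(,.*)?
        action: keep
      # Scrape the private IPv4 instead of the default public IPv4.
      - source_labels: [__meta_linode_private_ipv4]
        target_label: __address__
        replacement: ${1}:9100            # assumed exporter port
```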
<marathon_sd_config>
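Marathon SD configurations allow retrieving scrape targets using the Marathon REST API. Prometheus will periodically check the REST endpoint for currently running tasks and create a target group for every app that has at least one healthy task.
The following meta labels are available on targets during relabeling:
__meta_marathon_app: the name of the app (with slashes replaced by dashes)
__meta_marathon_image: the name of the Docker image used (if available)
__meta_marathon_task: the ID of the Mesos task
__meta_marathon_app_label_<labelname>: any Marathon labels attached to the app, with any unsupported characters converted to an underscore
__meta_marathon_port_definition_label_<labelname>: the port definition labels, with any unsupported characters converted to an underscore
__meta_marathon_port_mapping_label_<labelname>: the port mapping labels, with any unsupported characters converted to an underscore
__meta_marathon_port_index: the port index number (e.g. 1 for PORT1)
See below for the configuration options for Marathon discovery: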
# List of URLs to be used to contact Marathon servers.
# You need to provide at least one server URL.
servers:
- <string>
# Polling interval
[ refresh_interval: <duration> | default = 30s ]
# Optional authentication information for token-based authentication
# https://docs.mesosphere.com/1.11/security/ent/iam-api/#passing-an-authentication-token
# It is mutually exclusive with `auth_token_file` and other authentication mechanisms.
[ auth_token: <secret> ]
# Optional authentication information for token-based authentication
# https://docs.mesosphere.com/1.11/security/ent/iam-api/#passing-an-authentication-token
# It is mutually exclusive with `auth_token` and other authentication mechanisms.
[ auth_token_file: <filename> ]
# HTTP client settings, including authentication methods (such as basic auth and
# authorization), proxy configurations, TLS options, custom HTTP headers, etc.
[ <http_config> ]
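By default every app listed in Marathon will be scraped by Prometheus. If not all of your services provide Prometheus metrics, you can use a Marathon label and Prometheus relabeling to control which instances will actually be scraped. See the Prometheus marathon-sd configuration file for a practical example of how to set up your Marathon app and your Prometheus configuration.
By default, all apps will show up as a single job in Prometheus (the one specified in the configuration file), which can also be changed using relabeling.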
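A minimal sketch of the two relabeling uses mentioned above: filtering apps by a Marathon label and splitting apps into separate jobs. The server URL and the Marathon label name `prometheus` are hypothetical.

```yaml
scrape_configs:
  - job_name: marathon                    # job used when no relabeling applies
    marathon_sd_configs:
      - servers:
          - http://marathon.example.com:8080   # hypothetical server URL
        refresh_interval: 30s
    relabel_configs:
      # Scrape only apps that carry the Marathon label prometheus=true.
      - source_labels: [__meta_marathon_app_label_prometheus]
        regex: "true"
        action: keep
      # Report each app under its own job name.
      - source_labels: [__meta_marathon_app]
        target_label: job
```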
<nerve_sd_config>
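Nerve SD configurations allow retrieving scrape targets from AirBnB's Nerve, which are stored in Zookeeper.
The following meta labels are available on targets during relabeling:
__meta_nerve_path: the full path to the endpoint node in Zookeeper
__meta_nerve_endpoint_host: the host of the endpoint
__meta_nerve_endpoint_port: the port of the endpoint
__meta_nerve_endpoint_name: the name of the endpoint
# The Zookeeper servers.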
servers:
- <host>
# Paths can point to a single service, or the root of a tree of services.
paths:
- <string>
[ timeout: <duration> | default = 10s ]
<nomad_sd_config>
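Nomad SD configurations allow retrieving scrape targets from Nomad's Service API.
The following meta labels are available on targets during relabeling:
__meta_nomad_address: the service address of the target
__meta_nomad_dc: the datacenter name of the target
__meta_nomad_namespace: the namespace of the target
__meta_nomad_node_id: the node name defined for the target
__meta_nomad_service: the name of the service the target belongs to
__meta_nomad_service_address: the service address of the target
__meta_nomad_service_id: the service ID of the target
__meta_nomad_service_port: the service port of the target
__meta_nomad_tags: the list of tags of the target joined by the tag separator
# The information to access the Nomad API. It is to be defined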
# as the Nomad documentation requires.
[ allow_stale: <boolean> | default = true ]
[ namespace: <string> | default = default ]
[ refresh_interval: <duration> | default = 60s ]
[ region: <string> | default = global ]
# The URL to connect to the API.
[ server: <string> ]
[ tag_separator: <string> | default = ,]
# HTTP client settings, including authentication methods (such as basic auth and
# authorization), proxy configurations, TLS options, custom HTTP headers, etc.
[ <http_config> ]
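For orientation, a Nomad job might be wired up roughly as follows. The server URL, the `metrics` tag and the resulting `service` label are assumptions for this sketch, not values prescribed by Prometheus or Nomad.

```yaml
scrape_configs:
  - job_name: nomad-services              # hypothetical job name
    nomad_sd_configs:
      - server: http://nomad.example.com:4646   # hypothetical API URL
        namespace: default
        region: global
        tag_separator: ","
    relabel_configs:
      # Keep only services tagged "metrics", wherever the tag appears in the list.
      - source_labels: [__meta_nomad_tags]
        regex: (.*,)?metrics(,.*)?
        action: keep
      # Preserve the Nomad service name as a regular label.
      - source_labels: [__meta_nomad_service]
        target_label: service
```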
<serverset_sd_config>
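Serverset SD configurations allow retrieving scrape targets from Serversets, which are stored in Zookeeper. Serversets are commonly used by Finagle and Aurora.
The following meta labels are available on targets during relabeling:
__meta_serverset_path: the full path to the serverset member node in Zookeeper
__meta_serverset_endpoint_host: the host of the default endpoint
__meta_serverset_endpoint_port: the port of the default endpoint
__meta_serverset_endpoint_host_<endpoint>: the host of the given endpoint
__meta_serverset_endpoint_port_<endpoint>: the port of the given endpoint
__meta_serverset_shard: the shard number of the member
__meta_serverset_status: the status of the member
# The Zookeeper servers.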
servers:
- <host>
# Paths can point to a single serverset, or the root of a tree of serversets.
paths:
- <string>
[ timeout: <duration> | default = 10s ]
Serverset data must be in the JSON format; the Thrift format is not currently supported.
<triton_sd_config>
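Triton SD configurations allow retrieving scrape targets from Container Monitor discovery endpoints.
One of the following <triton_role> types can be configured to discover targets:
container
The container role discovers one target per "virtual machine" owned by the account. These are SmartOS zones or lx/KVM/bhyve branded zones.
The following meta labels are available on targets during relabeling:
__meta_triton_groups: the list of groups belonging to the target joined by a comma separator
__meta_triton_machine_alias: the alias of the target container
__meta_triton_machine_brand: the brand of the target container
__meta_triton_machine_id: the UUID of the target container
__meta_triton_machine_image: the target container's image type
__meta_triton_server_id: the UUID of the server the target container is running on
cn
The cn role discovers one target per compute node (also known as "server" or "global zone") making up the Triton infrastructure. The account must be a Triton operator and is currently required to own at least one container.
The following meta labels are available on targets during relabeling:
__meta_triton_machine_alias: the hostname of the target (requires triton-cmon 1.7.0 or newer)
__meta_triton_machine_id: the UUID of the target
See below for the configuration options for Triton discovery: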
# The information to access the Triton discovery API.
# The account to use for discovering new targets.
account: <string>
# The type of targets to discover, can be set to:
# * "container" to discover virtual machines (SmartOS zones, lx/KVM/bhyve branded zones) running on Triton
# * "cn" to discover compute nodes (servers/global zones) making up the Triton infrastructure
[ role : <string> | default = "container" ]
# The DNS suffix which should be applied to target.
dns_suffix: <string>
# The Triton discovery endpoint (e.g. 'cmon.us-east-3b.triton.zone'). This is
# often the same value as dns_suffix.
endpoint: <string>
# A list of groups for which targets are retrieved, only supported when `role` == `container`.
# If omitted all containers owned by the requesting account are scraped.
groups:
[ - <string> ... ]
# The port to use for discovery and metric scraping.
[ port: <int> | default = 9163 ]
# The interval which should be used for refreshing targets.
[ refresh_interval: <duration> | default = 60s ]
# The Triton discovery API version.
[ version: <int> | default = 1 ]
# TLS configuration.
tls_config:
[ <tls_config> ]
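As an illustration only, a job using the container role could look like the sketch below. The account name, group name and certificate paths are hypothetical; the endpoint value reuses the example given in the option comments above.

```yaml
scrape_configs:
  - job_name: triton-containers           # hypothetical job name
    scheme: https
    tls_config:                           # used when scraping the targets
      cert_file: /etc/prometheus/triton.crt   # hypothetical path
      key_file: /etc/prometheus/triton.key    # hypothetical path
    triton_sd_configs:
      - account: example-account          # hypothetical account
        role: container
        dns_suffix: cmon.us-east-3b.triton.zone
        endpoint: cmon.us-east-3b.triton.zone
        groups:
          - web                           # hypothetical group; container role only
        tls_config:                       # used when calling the discovery API
          cert_file: /etc/prometheus/triton.crt
          key_file: /etc/prometheus/triton.key
    relabel_configs:
      # Use the container alias as the instance label.
      - source_labels: [__meta_triton_machine_alias]
        target_label: instance
```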
<eureka_sd_config>
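Eureka SD configurations allow retrieving scrape targets using the Eureka REST API. Prometheus will periodically check the REST endpoint and create a target for every app instance.
The following meta labels are available on targets during relabeling:
__meta_eureka_app_name: the name of the app
__meta_eureka_app_instance_id: the ID of the app instance
__meta_eureka_app_instance_hostname: the hostname of the instance
__meta_eureka_app_instance_homepage_url: the homepage URL of the app instance
__meta_eureka_app_instance_statuspage_url: the status page URL of the app instance
__meta_eureka_app_instance_healthcheck_url: the health check URL of the app instance
__meta_eureka_app_instance_ip_addr: the IP address of the app instance
__meta_eureka_app_instance_vip_address: the VIP address of the app instance
__meta_eureka_app_instance_secure_vip_address: the secure VIP address of the app instance
__meta_eureka_app_instance_status: the status of the app instance
__meta_eureka_app_instance_port: the port of the app instance
__meta_eureka_app_instance_port_enabled: whether the port of the app instance is enabled
__meta_eureka_app_instance_secure_port: the secure port of the app instance
__meta_eureka_app_instance_secure_port_enabled: whether the secure port of the app instance is enabled
__meta_eureka_app_instance_country_id: the country ID of the app instance
__meta_eureka_app_instance_metadata_<metadataname>: app instance metadata
__meta_eureka_app_instance_datacenterinfo_name: the datacenter name of the app instance
__meta_eureka_app_instance_datacenterinfo_<metadataname>: the datacenter metadata
See below for the configuration options for Eureka discovery: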
# The URL to connect to the Eureka server.
server: <string>
# Refresh interval to re-read the app instance list.
[ refresh_interval: <duration> | default = 30s ]
# HTTP client settings, including authentication methods (such as basic auth and
# authorization), proxy configurations, TLS options, custom HTTP headers, etc.
[ <http_config> ]
See the Prometheus eureka-sd configuration file for a practical example of how to set up your Eureka app and your Prometheus configuration.
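A short sketch of how instance metadata could steer scraping. It assumes the Eureka apps publish metadata entries named `prometheus.scrape` and `prometheus.path` (so the label names end in `prometheus_scrape` and `prometheus_path`); both the metadata keys and the server URL are assumptions for this example.

```yaml
scrape_configs:
  - job_name: eureka                      # hypothetical job name
    eureka_sd_configs:
      - server: http://eureka.example.com:8761/eureka   # hypothetical URL
        refresh_interval: 30s
    relabel_configs:
      # Scrape only instances whose metadata opts in.
      - source_labels: [__meta_eureka_app_instance_metadata_prometheus_scrape]
        regex: "true"
        action: keep
      # Let an instance override the metrics path through its metadata.
      - source_labels: [__meta_eureka_app_instance_metadata_prometheus_path]
        regex: (.+)
        target_label: __metrics_path__
```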
<scaleway_sd_config>
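Scaleway SD configurations allow retrieving scrape targets from Scaleway instances and baremetal services.
The following meta labels are available on targets during relabeling:
Instance role
__meta_scaleway_instance_boot_type: the boot type of the server
__meta_scaleway_instance_hostname: the hostname of the server
__meta_scaleway_instance_id: the ID of the server
__meta_scaleway_instance_image_arch: the architecture of the server image
__meta_scaleway_instance_image_id: the ID of the server image
__meta_scaleway_instance_image_name: the name of the server image
__meta_scaleway_instance_location_cluster_id: the cluster ID of the server location
__meta_scaleway_instance_location_hypervisor_id: the hypervisor ID of the server location
__meta_scaleway_instance_location_node_id: the node ID of the server location
__meta_scaleway_instance_name: the name of the server
__meta_scaleway_instance_organization_id: the organization of the server
__meta_scaleway_instance_private_ipv4: the private IPv4 address of the server
__meta_scaleway_instance_project_id: the project ID of the server
__meta_scaleway_instance_public_ipv4: the public IPv4 address of the server
__meta_scaleway_instance_public_ipv6: the public IPv6 address of the server
__meta_scaleway_instance_public_ipv4_addresses: the list of public IPv4 addresses of the server
__meta_scaleway_instance_public_ipv6_addresses: the list of public IPv6 addresses of the server
__meta_scaleway_instance_region: the region of the server
__meta_scaleway_instance_security_group_id: the ID of the server's security group
__meta_scaleway_instance_security_group_name: the name of the server's security group
__meta_scaleway_instance_status: the status of the server
__meta_scaleway_instance_tags: the list of tags of the server joined by the tag separator
__meta_scaleway_instance_type: the commercial type of the server
__meta_scaleway_instance_zone: the zone of the server (e.g. fr-par-1; see the complete list here)
This role uses the first address found in the following order: private IPv4, public IPv4, public IPv6. This can be changed with relabeling, as demonstrated in the Prometheus scaleway-sd configuration file. Should an instance have no address before relabeling, it will not be added to the target list and you will not be able to relabel it.
Baremetal role
__meta_scaleway_baremetal_id: the ID of the server
__meta_scaleway_baremetal_public_ipv4: the public IPv4 address of the server
__meta_scaleway_baremetal_public_ipv6: the public IPv6 address of the server
__meta_scaleway_baremetal_name: the name of the server
__meta_scaleway_baremetal_os_name: the name of the server's operating system
__meta_scaleway_baremetal_os_version: the version of the server's operating system
__meta_scaleway_baremetal_project_id: the project ID of the server
__meta_scaleway_baremetal_status: the status of the server
__meta_scaleway_baremetal_tags: the list of tags of the server joined by the tag separator
__meta_scaleway_baremetal_type: the commercial type of the server
__meta_scaleway_baremetal_zone: the zone of the server (e.g. fr-par-1; see the complete list here)
This role uses the public IPv4 address by default. This can be changed with relabeling, as demonstrated in the Prometheus scaleway-sd configuration file.
See below for the configuration options for Scaleway discovery: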
# Access key to use. https://console.scaleway.com/project/credentials
access_key: <string>
# Secret key to use when listing targets. https://console.scaleway.com/project/credentials
# It is mutually exclusive with `secret_key_file`.
[ secret_key: <secret> ]
# Sets the secret key with the credentials read from the configured file.
# It is mutually exclusive with `secret_key`.
[ secret_key_file: <filename> ]
# Project ID of the targets.
project_id: <string>
# Role of the targets to retrieve. Must be `instance` or `baremetal`.
role: <string>
# The port to scrape metrics from.
[ port: <int> | default = 80 ]
# API URL to use when doing the server listing requests.
[ api_url: <string> | default = "https://api.scaleway.com" ]
# Zone is the availability zone of your targets (e.g. fr-par-1).
[ zone: <string> | default = fr-par-1 ]
# NameFilter specify a name filter (works as a LIKE) to apply on the server listing request.
[ name_filter: <string> ]
# TagsFilter specify a tag filter (a server needs to have all defined tags to be listed) to apply on the server listing request.
tags_filter:
[ - <string> ]
# Refresh interval to re-read the targets list.
[ refresh_interval: <duration> | default = 60s ]
# HTTP client settings, including authentication methods (such as basic auth and
# authorization), proxy configurations, TLS options, custom HTTP headers, etc.
[ <http_config> ]
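The following sketch illustrates the instance role with the address order overridden so that the public IPv4 address is scraped when one exists. The project ID, access key, secret file path, tag and port are placeholders chosen for this example.

```yaml
scrape_configs:
  - job_name: scaleway-instances          # hypothetical job name
    scaleway_sd_configs:
      - role: instance
        project_id: 11111111-1111-1111-1111-111111111111   # placeholder
        access_key: SCWXXXXXXXXXXXXXXXXX                    # placeholder
        secret_key_file: /etc/prometheus/scaleway-secret   # hypothetical path
        zone: fr-par-1
        tags_filter:
          - monitoring                    # hypothetical tag
    relabel_configs:
      # Prefer the public IPv4 address; instances without one keep the
      # address chosen by the default order.
      - source_labels: [__meta_scaleway_instance_public_ipv4]
        regex: (.+)
        target_label: __address__
        replacement: ${1}:9100            # assumed exporter port
```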
<uyuni_sd_config>
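Uyuni SD configurations allow retrieving scrape targets from managed systems via the Uyuni API.
The following meta labels are available on targets during relabeling:
__meta_uyuni_endpoint_name: the name of the application endpoint
__meta_uyuni_exporter: the exporter exposing metrics for the target
__meta_uyuni_groups: the system groups of the target
__meta_uyuni_metrics_path: the metrics path for the target
__meta_uyuni_minion_hostname: the hostname of the Uyuni client
__meta_uyuni_primary_fqdn: the primary FQDN of the Uyuni client
__meta_uyuni_proxy_module: the module name if an Exporter Exporter proxy is configured for the target
__meta_uyuni_scheme: the protocol scheme used for requests
__meta_uyuni_system_id: the system ID of the client
See below for the configuration options for Uyuni discovery: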
# The URL to connect to the Uyuni server.
server: <string>
# Credentials are used to authenticate the requests to Uyuni API.
username: <string>
password: <secret>
# The entitlement string to filter eligible systems.
[ entitlement: <string> | default = monitoring_entitled ]
# The string by which Uyuni group names are joined into the groups label.
[ separator: <string> | default = , ]
# Refresh interval to re-read the managed targets list.
[ refresh_interval: <duration> | default = 60s ]
# HTTP client settings, including authentication methods (such as basic auth and
# authorization), proxy configurations, TLS options, custom HTTP headers, etc.
[ <http_config> ]
See the Prometheus uyuni-sd configuration file for a practical example of how to set up the Uyuni Prometheus configuration.
<vultr_sd_config>
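Vultr SD configurations allow retrieving scrape targets from Vultr.
This service discovery uses the main IPv4 address by default, which can be changed with relabeling, as demonstrated in the Prometheus vultr-sd configuration file.
The following meta labels are available on targets during relabeling:
__meta_vultr_instance_id: A unique ID for the Vultr instance.
__meta_vultr_instance_label: The user-supplied label for this instance.
__meta_vultr_instance_os: The operating system name.
__meta_vultr_instance_os_id: The operating system ID used by this instance.
__meta_vultr_instance_region: The ID of the region where the instance is located.
__meta_vultr_instance_plan: A unique ID for the plan.
__meta_vultr_instance_main_ip: The main IPv4 address.
__meta_vultr_instance_internal_ip: The private IP address.
__meta_vultr_instance_main_ipv6: The main IPv6 address.
__meta_vultr_instance_features: List of features that are available to the instance.
__meta_vultr_instance_tags: List of tags associated with the instance.
__meta_vultr_instance_hostname: The hostname of this instance.
__meta_vultr_instance_server_status: The server health status.
__meta_vultr_instance_vcpu_count: Number of vCPUs.
__meta_vultr_instance_ram_mb: The amount of RAM in MB.
__meta_vultr_instance_disk_gb: The size of the disk in GB.
__meta_vultr_instance_allowed_bandwidth_gb: Monthly bandwidth quota in GB.
# The port to scrape metrics from.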
[ port: <int> | default = 80 ]
# The time after which the instances are refreshed.
[ refresh_interval: <duration> | default = 60s ]
# HTTP client settings, including authentication methods (such as basic auth and
# authorization), proxy configurations, TLS options, custom HTTP headers, etc.
[ <http_config> ]
<static_config>
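A static_config allows specifying a list of targets and a common label set for them. It is the canonical way to specify static targets in a scrape configuration.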
# The targets specified by the static config.
targets:
  [ - '<host>' ]
# Labels assigned to all metrics scraped from the targets.
labels:
  [ <labelname>: <labelvalue> ... ]
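For example, a scrape job with two statically configured targets sharing a label set might be written as follows. The host names and label values are made up for illustration.

```yaml
scrape_configs:
  - job_name: node                        # hypothetical job name
    static_configs:
      - targets:
          - 'host1.example.com:9100'      # hypothetical targets
          - 'host2.example.com:9100'
        labels:
          env: production                 # attached to every scraped series
          team: infra
```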
<relabel_config>
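Relabeling is a powerful tool to dynamically rewrite the label set of a target before it gets scraped. Multiple relabeling steps can be configured per scrape configuration. They are applied to the label set of each target in order of their appearance in the configuration file.
Initially, aside from the configured per-target labels, a target's job label is set to the job_name value of the respective scrape configuration. The __address__ label is set to the <host>:<port> address of the target. After relabeling, the instance label is set to the value of __address__ by default if it was not set during relabeling.
The __scheme__ and __metrics_path__ labels are set to the scheme and metrics path of the target respectively, as specified in scrape_config.
The __param_<name> label is set to the value of the first passed URL parameter called <name>, as defined in scrape_config.
The __scrape_interval__ and __scrape_timeout__ labels are set to the target's scrape interval and timeout, as specified in scrape_config.
Additional labels prefixed with __meta_ may be available during the relabeling phase. They are set by the service discovery mechanism that provided the target and vary between mechanisms.
Labels starting with __ will be removed from the label set after target relabeling is completed.
If a relabeling step needs to store a label value only temporarily (as the input to a subsequent relabeling step), use the __tmp label name prefix. This prefix is guaranteed to never be used by Prometheus itself.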
# The source_labels tells the rule what labels to fetch from the series. Any
# labels which do not exist get a blank value (""). Their content is concatenated
# using the configured separator and matched against the configured regular expression
# for the replace, keep, and drop actions.
[ source_labels: '[' <labelname> [, ...] ']' ]
# Separator placed between concatenated source label values.
[ separator: <string> | default = ; ]
# Label to which the resulting value is written in a replace action.
# It is mandatory for replace actions. Regex capture groups are available.
[ target_label: <labelname> ]
# Regular expression against which the extracted value is matched.
[ regex: <regex> | default = (.*) ]
# Modulus to take of the hash of the source label values.
[ modulus: <int> ]
# Replacement value against which a regex replace is performed if the
# regular expression matches. Regex capture groups are available.
[ replacement: <string> | default = $1 ]
# Action to perform based on regex matching.
[ action: <relabel_action> | default = replace ]
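<regex> is any valid RE2 regular expression. It is required for the replace, keep, drop, labelmap, labeldrop and labelkeep actions. The regex is anchored on both ends. To un-anchor the regex, use .*<regex>.*.
<relabel_action> determines the relabeling action to take:
replace: Match regex against the concatenated source_labels. Then, set target_label to replacement, with match group references (${1}, ${2}, ...) in replacement substituted by their value. If regex does not match, no replacement takes place.
lowercase: Maps the concatenated source_labels to their lower case.
uppercase: Maps the concatenated source_labels to their upper case.
keep: Drop targets for which regex does not match the concatenated source_labels.
drop: Drop targets for which regex matches the concatenated source_labels.
keepequal: Drop targets for which the concatenated source_labels do not match target_label.
dropequal: Drop targets for which the concatenated source_labels do match target_label.
hashmod: Set target_label to the modulus of a hash of the concatenated source_labels.
labelmap: Match regex against all source label names, not just those specified in source_labels. Then copy the values of the matching labels to label names given by replacement, with match group references (${1}, ${2}, ...) in replacement substituted by their value.
labeldrop: Match regex against all label names. Any label that matches will be removed from the set of labels.
labelkeep: Match regex against all label names. Any label that does not match will be removed from the set of labels.
Care must be taken with labeldrop and labelkeep to ensure that metrics are still uniquely labeled once the labels are removed.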
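The actions above can be sketched in a single relabeling chain. This is an illustrative fragment rather than a recommended setup: the Kubernetes meta labels, the port 9100, the shard count of 4 and the `pod_template_hash` label are assumptions chosen to show each action.

```yaml
relabel_configs:
  # replace (default action): copy a meta label into a permanent label.
  - source_labels: [__meta_kubernetes_namespace]
    target_label: namespace
  # replace with a capture group: force port 9100 on the target address.
  - source_labels: [__address__]
    regex: '([^:]+)(?::\d+)?'
    replacement: ${1}:9100
    target_label: __address__
  # hashmod: assign each target to one of 4 shards, stored in a temporary
  # label that uses the reserved __tmp prefix.
  - source_labels: [__address__]
    modulus: 4
    target_label: __tmp_shard
    action: hashmod
  # keep: this server only scrapes shard 0.
  - source_labels: [__tmp_shard]
    regex: "0"
    action: keep
  # labelmap: expose every pod label under its own name.
  - regex: __meta_kubernetes_pod_label_(.+)
    action: labelmap
  # labeldrop: remove one of the labels that labelmap just created.
  - regex: pod_template_hash
    action: labeldrop
```

Because the regex is anchored on both ends, `"0"` in the keep step matches only the exact value 0 and not, say, 10.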
<metric_relabel_configs>
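Metric relabeling is applied to samples as the last step before ingestion. It has the same configuration format and actions as target relabeling. Metric relabeling does not apply to automatically generated time series such as up.
One use for this is to exclude time series that are too expensive to ingest.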
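A minimal sketch of that use case, assuming a hypothetical application whose histogram buckets are too costly to store. The target address and metric name are made up.

```yaml
scrape_configs:
  - job_name: app                         # hypothetical job name
    static_configs:
      - targets: ['app.example.com:8080'] # hypothetical target
    metric_relabel_configs:
      # Drop every series of one expensive histogram before ingestion.
      - source_labels: [__name__]
        regex: http_request_duration_seconds_bucket
        action: drop
```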
<alert_relabel_configs>
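Alert relabeling is applied to alerts before they are sent to the Alertmanager. It has the same configuration format and actions as target relabeling. Alert relabeling is applied after external labels.
One use for this is ensuring that an HA pair of Prometheus servers with different external labels sends identical alerts.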
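The HA case can be sketched as below, assuming the two servers differ only in an external label named `replica` (the label names and values are assumptions). Since alert relabeling runs after external labels are attached, dropping `replica` makes the alerts from both servers identical.

```yaml
global:
  external_labels:
    cluster: eu-west                      # hypothetical, same on both servers
    replica: a                            # "b" on the second server
alerting:
  alert_relabel_configs:
    # Remove the label that distinguishes the two HA servers.
    - regex: replica
      action: labeldrop
```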
<alertmanager_config>
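An alertmanager_config section specifies the Alertmanager instances the Prometheus server sends alerts to. It also provides parameters to configure how to communicate with these Alertmanagers.
Alertmanagers may be statically configured via the static_configs parameter or dynamically discovered using one of the supported service-discovery mechanisms.
Additionally, relabel_configs allow selecting Alertmanagers from discovered entities and provide advanced modifications to the used API path, which is exposed through the __alerts_path__ label.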
# Per-target Alertmanager timeout when pushing alerts.
[ timeout: <duration> | default = 10s ]
# The api version of Alertmanager.
[ api_version: <string> | default = v2 ]
# Prefix for the HTTP path alerts are pushed to.
[ path_prefix: <path> | default = / ]
# Configures the protocol scheme used for requests.
[ scheme: <scheme> | default = http ]
# Optionally configures AWS's Signature Verification 4 signing process to sign requests.
# Cannot be set at the same time as basic_auth, authorization, oauth2, azuread or google_iam.
# To use the default credentials from the AWS SDK, use `sigv4: {}`.
sigv4:
  # The AWS region. If blank, the region from the default credentials chain
  # is used.
  [ region: <string> ]
  # The AWS API keys. If blank, the environment variables `AWS_ACCESS_KEY_ID`
  # and `AWS_SECRET_ACCESS_KEY` are used.
  [ access_key: <string> ]
  [ secret_key: <secret> ]
  # Named AWS profile used to authenticate.
  [ profile: <string> ]
  # AWS Role ARN, an alternative to using AWS API keys.
  [ role_arn: <string> ]
# HTTP client settings, including authentication methods (such as basic auth and
# authorization), proxy configurations, TLS options, custom HTTP headers, etc.
[ <http_config> ]
# List of Azure service discovery configurations.
azure_sd_configs:
[ - <azure_sd_config> ... ]
# List of Consul service discovery configurations.
consul_sd_configs:
[ - <consul_sd_config> ... ]
# List of DNS service discovery configurations.
dns_sd_configs:
[ - <dns_sd_config> ... ]
# List of EC2 service discovery configurations.
ec2_sd_configs:
[ - <ec2_sd_config> ... ]
# List of Eureka service discovery configurations.
eureka_sd_configs:
[ - <eureka_sd_config> ... ]
# List of file service discovery configurations.
file_sd_configs:
[ - <file_sd_config> ... ]
# List of DigitalOcean service discovery configurations.
digitalocean_sd_configs:
[ - <digitalocean_sd_config> ... ]
# List of Docker service discovery configurations.
docker_sd_configs:
[ - <docker_sd_config> ... ]
# List of Docker Swarm service discovery configurations.
dockerswarm_sd_configs:
[ - <dockerswarm_sd_config> ... ]
# List of GCE service discovery configurations.
gce_sd_configs:
[ - <gce_sd_config> ... ]
# List of Hetzner service discovery configurations.
hetzner_sd_configs:
[ - <hetzner_sd_config> ... ]
# List of HTTP service discovery configurations.
http_sd_configs:
[ - <http_sd_config> ... ]
# List of IONOS service discovery configurations.
ionos_sd_configs:
[ - <ionos_sd_config> ... ]
# List of Kubernetes service discovery configurations.
kubernetes_sd_configs:
[ - <kubernetes_sd_config> ... ]
# List of Lightsail service discovery configurations.
lightsail_sd_configs:
[ - <lightsail_sd_config> ... ]
# List of Linode service discovery configurations.
linode_sd_configs:
[ - <linode_sd_config> ... ]
# List of Marathon service discovery configurations.
marathon_sd_configs:
[ - <marathon_sd_config> ... ]
# List of AirBnB's Nerve service discovery configurations.
nerve_sd_configs:
[ - <nerve_sd_config> ... ]
# List of Nomad service discovery configurations.
nomad_sd_configs:
[ - <nomad_sd_config> ... ]
# List of OpenStack service discovery configurations.
openstack_sd_configs:
[ - <openstack_sd_config> ... ]
# List of OVHcloud service discovery configurations.
ovhcloud_sd_configs:
[ - <ovhcloud_sd_config> ... ]
# List of PuppetDB service discovery configurations.
puppetdb_sd_configs:
[ - <puppetdb_sd_config> ... ]
# List of Scaleway service discovery configurations.
scaleway_sd_configs:
[ - <scaleway_sd_config> ... ]
# List of Zookeeper Serverset service discovery configurations.
serverset_sd_configs:
[ - <serverset_sd_config> ... ]
# List of Triton service discovery configurations.
triton_sd_configs:
[ - <triton_sd_config> ... ]
# List of Uyuni service discovery configurations.
uyuni_sd_configs:
[ - <uyuni_sd_config> ... ]
# List of Vultr service discovery configurations.
vultr_sd_configs:
[ - <vultr_sd_config> ... ]
# List of labeled statically configured Alertmanagers.
static_configs:
[ - <static_config> ... ]
# List of Alertmanager relabel configurations.
relabel_configs:
[ - <relabel_config> ... ]
# List of alert relabel configurations.
alert_relabel_configs:
[ - <relabel_config> ... ]
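Put together, a statically configured pair of Alertmanagers could be declared roughly like this. The host names and the path prefix are hypothetical.

```yaml
alerting:
  alertmanagers:
    - scheme: https
      path_prefix: /alertmanager          # hypothetical prefix
      timeout: 10s
      api_version: v2
      static_configs:
        - targets:
            - alertmanager-1.example.com:9093   # hypothetical hosts
            - alertmanager-2.example.com:9093
```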
<remote_write>
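write_relabel_configs is relabeling applied to samples before sending them to the remote endpoint. Write relabeling is applied after external labels. This could be used to limit which samples are sent.
There is a small demo of how to use this functionality.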
# The URL of the endpoint to send samples to.
url: <string>
# protobuf message to use when writing to the remote write endpoint.
#
# * The `prometheus.WriteRequest` represents the message introduced in Remote Write 1.0, which
# will be deprecated eventually.
# * The `io.prometheus.write.v2.Request` was introduced in Remote Write 2.0 and replaces the former,
# by improving efficiency and sending metadata, created timestamp and native histograms by default.
#
# Before changing this value, consult with your remote storage provider (or test) what message it supports.
# Read more on https://prometheus.ac.cn/docs/specs/remote_write_spec_2_0/#io-prometheus-write-v2-request
[ protobuf_message: <prometheus.WriteRequest | io.prometheus.write.v2.Request> | default = prometheus.WriteRequest ]
# Timeout for requests to the remote write endpoint.
[ remote_timeout: <duration> | default = 30s ]
# Custom HTTP headers to be sent along with each remote write request.
# Be aware that headers that are set by Prometheus itself can't be overwritten.
headers:
[ <string>: <string> ... ]
# List of remote write relabel configurations.
write_relabel_configs:
[ - <relabel_config> ... ]
# Name of the remote write config, which if specified must be unique among remote write configs.
# The name will be used in metrics and logging in place of a generated value to help users distinguish between
# remote write configs.
[ name: <string> ]
# Enables sending of exemplars over remote write. Note that exemplar storage itself must be enabled for exemplars to be scraped in the first place.
[ send_exemplars: <boolean> | default = false ]
# Enables sending of native histograms, also known as sparse histograms, over remote write.
# For the `io.prometheus.write.v2.Request` message, this option is noop (always true).
[ send_native_histograms: <boolean> | default = false ]
# When enabled, remote-write will resolve the URL host name via DNS, choose one of the IP addresses at random, and connect to it.
# When disabled, remote-write relies on Go's standard behavior, which is to try to connect to each address in turn.
# The connection timeout applies to the whole operation, i.e. in the latter case it is spread over all attempt.
# This is an experimental feature, and its behavior might still change, or even get removed.
[ round_robin_dns: <boolean> | default = false ]
# Optionally configures AWS's Signature Verification 4 signing process to
# sign requests. Cannot be set at the same time as basic_auth, authorization, oauth2, or azuread.
# To use the default credentials from the AWS SDK, use `sigv4: {}`.
sigv4:
  # The AWS region. If blank, the region from the default credentials chain
  # is used.
  [ region: <string> ]
  # The AWS API keys. If blank, the environment variables `AWS_ACCESS_KEY_ID`
  # and `AWS_SECRET_ACCESS_KEY` are used.
  [ access_key: <string> ]
  [ secret_key: <secret> ]
  # Named AWS profile used to authenticate.
  [ profile: <string> ]
  # AWS Role ARN, an alternative to using AWS API keys.
  [ role_arn: <string> ]
# Optional AzureAD configuration.
# Cannot be used at the same time as basic_auth, authorization, oauth2, sigv4 or google_iam.
azuread:
  # The Azure Cloud. Options are 'AzurePublic', 'AzureChina', or 'AzureGovernment'.
  [ cloud: <string> | default = AzurePublic ]
  # Azure User-assigned Managed identity.
  [ managed_identity:
      [ client_id: <string> ] ]
  # Azure OAuth.
  [ oauth:
      [ client_id: <string> ]
      [ client_secret: <string> ]
      [ tenant_id: <string> ] ]
  # Azure SDK auth.
  # See https://learn.microsoft.com/en-us/azure/developer/go/azure-sdk-authentication
  [ sdk:
      [ tenant_id: <string> ] ]
# WARNING: Remote write is NOT SUPPORTED by Google Cloud. This configuration is reserved for future use.
# Optional Google Cloud Monitoring configuration.
# Cannot be used at the same time as basic_auth, authorization, oauth2, sigv4 or azuread.
# To use the default credentials from the Google Cloud SDK, use `google_iam: {}`.
google_iam:
# Service account key with monitoring write permissions.
credentials_file: <file_name>
# Configures the queue used to write to remote storage.
queue_config:
  # Number of samples to buffer per shard before we block reading of more
  # samples from the WAL. It is recommended to have enough capacity in each
  # shard to buffer several requests to keep throughput up while processing
  # occasional slow remote requests.
  [ capacity: <int> | default = 10000 ]
  # Maximum number of shards, i.e. amount of concurrency.
  [ max_shards: <int> | default = 50 ]
  # Minimum number of shards, i.e. amount of concurrency.
  [ min_shards: <int> | default = 1 ]
  # Maximum number of samples per send.
  [ max_samples_per_send: <int> | default = 2000 ]
  # Maximum time a sample will wait for a send. The sample might wait less
  # if the buffer is full. Further time might pass due to potential retries.
  [ batch_send_deadline: <duration> | default = 5s ]
  # Initial retry delay. Gets doubled for every retry.
  [ min_backoff: <duration> | default = 30ms ]
  # Maximum retry delay.
  [ max_backoff: <duration> | default = 5s ]
  # Retry upon receiving a 429 status code from the remote-write storage.
  # This is experimental and might change in the future.
  [ retry_on_http_429: <boolean> | default = false ]
  # If set, any sample that is older than sample_age_limit
  # will not be sent to the remote storage. The default value is 0s,
  # which means that all samples are sent.
  [ sample_age_limit: <duration> | default = 0s ]
# Configures the sending of series metadata to remote storage
# if the `prometheus.WriteRequest` message was chosen. When
# `io.prometheus.write.v2.Request` is used, metadata is always sent.
#
# Metadata configuration is subject to change at any point
# or be removed in future releases.
metadata_config:
# Whether metric metadata is sent to remote storage or not.
[ send: <boolean> | default = true ]
# How frequently metric metadata is sent to remote storage.
[ send_interval: <duration> | default = 1m ]
# Maximum number of samples per send.
[ max_samples_per_send: <int> | default = 500 ]
# HTTP client settings, including authentication methods (such as basic auth and
# authorization), proxy configurations, TLS options, custom HTTP headers, etc.
# enable_http2 defaults to false for remote-write.
[ <http_config> ]
There is a list of integrations with this feature.
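As an illustration of how the authentication and queue settings above fit together, here is a minimal sketch of a remote write entry. The endpoint URL and the client ID are placeholders invented for this example, and the tuning values are illustrative rather than recommendations.

```yaml
remote_write:
  # Hypothetical endpoint; replace with the write URL of your remote storage.
  - url: "https://remote-storage.example.com/api/v1/write"
    # Only one auth mechanism may be set; azuread excludes basic_auth,
    # authorization, oauth2, sigv4 and google_iam.
    azuread:
      cloud: AzurePublic
      managed_identity:
        client_id: "00000000-0000-0000-0000-000000000000"  # placeholder ID
    queue_config:
      capacity: 10000
      min_shards: 1
      max_shards: 50
      max_samples_per_send: 2000
      batch_send_deadline: 5s
      min_backoff: 30ms
      max_backoff: 5s
      # Experimental: retry when the remote end answers with HTTP 429.
      retry_on_http_429: true
      # Drop samples older than two hours instead of sending them.
      sample_age_limit: 2h
    metadata_config:
      send: true
      send_interval: 1m
```

With these backoff values, and reading the doubling rule literally, successive retry delays would be 30ms, 60ms, 120ms, 240ms, 480ms, 960ms, 1.92s and 3.84s, after which every further retry waits the 5s maximum.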
<remote_read>
# The URL of the endpoint to query from.
url: <string>
# Name of the remote read config, which if specified must be unique among remote read configs.
# The name will be used in metrics and logging in place of a generated value to help users distinguish between
# remote read configs.
[ name: <string> ]
# An optional list of equality matchers which have to be
# present in a selector to query the remote read endpoint.
required_matchers:
[ <labelname>: <labelvalue> ... ]
# Timeout for requests to the remote read endpoint.
[ remote_timeout: <duration> | default = 1m ]
# Custom HTTP headers to be sent along with each remote read request.
# Be aware that headers that are set by Prometheus itself can't be overwritten.
headers:
[ <string>: <string> ... ]
# Whether reads should be made for queries for time ranges that
# the local storage should have complete data for.
[ read_recent: <boolean> | default = false ]
# Whether to use the external labels as selectors for the remote read endpoint.
[ filter_external_labels: <boolean> | default = true ]
# HTTP client settings, including authentication methods (such as basic auth and
# authorization), proxy configurations, TLS options, custom HTTP headers, etc.
[ <http_config> ]
There is a list of integrations with this feature.
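A minimal sketch of a remote read entry using the fields above. The URL, the name, and the `env: production` matcher are assumptions made up for this example.

```yaml
remote_read:
  # Hypothetical endpoint; replace with the read URL of your remote storage.
  - url: "https://remote-storage.example.com/api/v1/read"
    # Hypothetical name; it must be unique among remote read configs.
    name: long_term_store
    # Only selectors that contain env="production" are sent to this endpoint.
    required_matchers:
      env: production
    remote_timeout: 1m
    # Skip the remote endpoint for time ranges that local storage
    # should already cover completely.
    read_recent: false
    filter_external_labels: true
```

Leaving `read_recent` at `false` keeps queries over recent data local, which is usually the cheaper path when the remote store only adds long-term history.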
<tsdb>
tsdb lets you configure the runtime-reloadable configuration settings of the TSDB.
# Configures how old an out-of-order/out-of-bounds sample can be w.r.t. the TSDB max time.
# An out-of-order/out-of-bounds sample is ingested into the TSDB as long as the timestamp
# of the sample is >= TSDB.MaxTime-out_of_order_time_window.
#
# When out_of_order_time_window is >0, the errors out-of-order and out-of-bounds are
# combined into a single error called 'too-old'; a sample is either (a) ingestible
# into the TSDB, i.e. it is an in-order sample or an out-of-order/out-of-bounds sample
# that is within the out-of-order window, or (b) too-old, i.e. not in-order
# and before the out-of-order window.
#
# When out_of_order_time_window is greater than 0, it also affects the experimental agent. It allows
# the agent's WAL to accept out-of-order samples that fall within the specified time window relative
# to the timestamp of the last appended sample for the same series.
[ out_of_order_time_window: <duration> | default = 0s ]
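To make the window rule concrete, here is a small sketch that assumes these settings sit under the top-level `storage` key of the configuration file. The 30m value is only an example.

```yaml
storage:
  tsdb:
    # Accept samples up to 30 minutes older than the TSDB max time.
    out_of_order_time_window: 30m
```

Under this setting, if the TSDB max time is 12:00:00, a sample stamped 11:45:00 is still ingested because it is not older than 11:30:00 (max time minus the window), while a sample stamped 11:20:00 is rejected as too-old.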
<exemplars>
Note that exemplar storage is still considered experimental and must be enabled via --enable-feature=exemplar-storage.
# Configures the maximum size of the circular buffer used to store exemplars for all series. Resizable during runtime.
[ max_exemplars: <int> | default = 100000 ]
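A minimal sketch of an exemplar storage setting, assuming it is placed under the top-level `storage` key and that Prometheus was started with `--enable-feature=exemplar-storage`. The buffer size is an arbitrary example value.

```yaml
storage:
  exemplars:
    # Halve the default circular buffer of 100000 exemplars.
    max_exemplars: 50000
```

Because the buffer is shared by all series and is circular, a smaller value trades exemplar history for lower memory use.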
<tracing_config>
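tracing_config configures exporting traces from Prometheus to a tracing backend via the OTLP protocol. Tracing is currently an experimental feature and could change in the future.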
# Client used to export the traces. Options are 'http' or 'grpc'.
[ client_type: <string> | default = grpc ]
# Endpoint to send the traces to. Should be provided in format <host>:<port>.
[ endpoint: <string> ]
# Sets the probability a given trace will be sampled. Must be a float from 0 through 1.
[ sampling_fraction: <float> | default = 0 ]
# If disabled, the client will use a secure connection.
[ insecure: <boolean> | default = false ]
# Key-value pairs to be used as headers associated with gRPC or HTTP requests.
headers:
[ <string>: <string> ... ]
# Compression key for supported compression types. Supported compression: gzip.
[ compression: <string> ]
# Maximum time the exporter will wait for each batch export.
[ timeout: <duration> | default = 10s ]
# TLS configuration.
tls_config:
[ <tls_config> ]
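The tracing options above could be combined as in the following sketch, which assumes they sit under the top-level `tracing` key. The collector host name is hypothetical; port 4317 is the conventional OTLP gRPC port.

```yaml
tracing:
  # Hypothetical collector address in <host>:<port> form.
  endpoint: "otel-collector.example.internal:4317"
  client_type: grpc
  # Sample roughly one trace in ten; the default of 0 samples none.
  sampling_fraction: 0.1
  # Plaintext connection, for example to a collector on a trusted network.
  # Leave this false and set tls_config to use a secure connection.
  insecure: true
  compression: gzip
  timeout: 10s
```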