
Monitoring Dovecot with Prometheus

 ·  🎃 kr0m

In this guide we will see how to monitor a Dovecot server with Prometheus. If the base Prometheus installation is not yet in place, we first follow the basic Prometheus and Grafana installation guide.

Dovecot supports Prometheus natively; we only need to tell it which metrics to collect and expose them. The example file ships with some predefined metrics, but if we need something more specific we can always define our own filters based on the events we are interested in.

We copy the example configuration file:

cp /usr/local/etc/dovecot/example-config/conf.d/10-metrics.conf /usr/local/etc/dovecot/conf.d/

We edit it, uncommenting the metrics we are interested in and binding the exporter to a port:

vi /usr/local/etc/dovecot/conf.d/10-metrics.conf

##
## Statistics and metrics
##

# Dovecot supports gathering statistics from events.
# Currently there are no statistics logged by default, and therefore they must
# be explicitly added using the metric configuration blocks.
#
# Unlike old stats, the new statistics do not require any plugins loaded.
#
# See https://doc.dovecot.org/configuration_manual/stats/ for more details.

##
## Example metrics
##

metric auth_success {
  filter = event=auth_request_finished AND success=yes
}

metric auth_failures {
  filter = event=auth_request_finished AND NOT success=yes
}

metric imap_command {
  filter = event=imap_command_finished
  group_by = cmd_name tagged_reply_state
}

metric smtp_command {
  filter = event=smtp_server_command_finished
  group_by = cmd_name status_code duration:exponential:1:5:10
}

metric mail_delivery {
  filter = event=mail_delivery_finished
  group_by = duration:exponential:1:5:10
}

##
## Prometheus
##

# To allow access to statistics with Prometheus, enable http listener
# on stats process. Stats will be available on /metrics path.
#
# See https://doc.dovecot.org/configuration_manual/stats/openmetrics/ for more
# details.

service stats {
  inet_listener http {
    port = 9900
  }
}

##
## Event exporting
##

# You can also export individual events.
#
# See https://doc.dovecot.org/configuration_manual/event_export/ for more
# details.

#event_exporter log {
#  format = json
#  format_args = time-rfc3339
#  transport = log
#}
#
#metric imap_commands {
#  exporter = log
#  filter = event=imap_command_finished
#}

We restart the service:

service dovecot restart

We check that the statistics are available:

doveadm -f table stats dump

metric_name                               field    count sum    min    max    avg       median stddev   %95                  
auth_success                              duration 10    86976  7594   13466  8697.60   7718   1845.96  13466                
auth_failures                             duration 0     0      0      0      0.00      0      0.00     0                    
imap_command                              duration 40    348361 14     339435 8709.02   62     52959.89 793                  
imap_command_CAPABILITY                   duration 9     445    14     63     49.44     58     18.26    63                   
imap_command_CAPABILITY_OK                duration 9     445    14     63     49.44     58     18.26    63                   
imap_command_STATUS                       duration 15    4539   41     732    302.60    70     294.58   732                  
imap_command_STATUS_OK                    duration 15    4539   41     732    302.60    70     294.58   732                  
imap_command_LOGOUT                       duration 10    388    25     83     38.80     27     19.79    83                   
imap_command_LOGOUT_OK                    duration 10    388    25     83     38.80     27     19.79    83                   
imap_command_APPEND                       duration 1     339435 339435 339435 339435.00 339435 0.00     339435               
imap_command_APPEND_OK                    duration 1     339435 339435 339435 339435.00 339435 0.00     339435               
imap_command_LSUB                         duration 1     129    129    129    129.00    129    0.00     129                  
imap_command_LSUB_OK                      duration 1     129    129    129    129.00    129    0.00     129                  
imap_command_LIST                         duration 1     793    793    793    793.00    793    0.00     793                  
imap_command_LIST_OK                      duration 1     793    793    793    793.00    793    0.00     793                  
imap_command_NAMESPACE                    duration 1     18     18     18     18.00     18     0.00     18                   
imap_command_NAMESPACE_OK                 duration 1     18     18     18     18.00     18     0.00     18                   
imap_command_SELECT                       duration 1     2011   2011   2011   2011.00   2011   0.00     2011                 
imap_command_SELECT_OK                    duration 1     2011   2011   2011   2011.00   2011   0.00     2011                 
imap_command_UID_FETCH                    duration 1     603    603    603    603.00    603    0.00     603                  
imap_command_UID_FETCH_OK                 duration 1     603    603    603    603.00    603    0.00     603                  
smtp_command                              duration 5     6092   28     5441   1218.40   87     2117.61  5441                 
smtp_command_LHLO                         duration 1     87     87     87     87.00     87     0.00     87                   
smtp_command_LHLO_250                     duration 1     87     87     87     87.00     87     0.00     87                   
smtp_command_LHLO_250_duration_11_100     duration 1     87     87     87     87.00     87     0.00     87                   
smtp_command_MAIL                         duration 1     59     59     59     59.00     59     0.00     59                   
smtp_command_MAIL_250                     duration 1     59     59     59     59.00     59     0.00     59                   
smtp_command_MAIL_250_duration_11_100     duration 1     59     59     59     59.00     59     0.00     59                   
smtp_command_RCPT                         duration 1     477    477    477    477.00    477    0.00     477                  
smtp_command_RCPT_250                     duration 1     477    477    477    477.00    477    0.00     477                  
smtp_command_RCPT_250_duration_101_1000   duration 1     477    477    477    477.00    477    0.00     477                  
smtp_command_DATA                         duration 1     5441   5441   5441   5441.00   5441   0.00     5441                 
smtp_command_DATA_250                     duration 1     5441   5441   5441   5441.00   5441   0.00     5441                 
smtp_command_DATA_250_duration_1001_10000 duration 1     5441   5441   5441   5441.00   5441   0.00     5441                 
smtp_command_QUIT                         duration 1     28     28     28     28.00     28     0.00     28                   
smtp_command_QUIT_221                     duration 1     28     28     28     28.00     28     0.00     28                   
smtp_command_QUIT_221_duration_11_100     duration 1     28     28     28     28.00     28     0.00     28                   
mail_delivery                             duration 1     3941   3941   3941   3941.00   3941   0.00     3941                 
mail_delivery_duration_1001_10000         duration 1     3941   3941   3941   3941.00   3941   0.00     3941   
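
The row names ending in duration_11_100 or duration_1001_10000 come from the group_by = duration:exponential:1:5:10 setting. The following Python sketch shows one way to reason about those buckets. It assumes the three numbers are the minimum magnitude, the maximum magnitude and the base, and that the dump reports durations in microseconds (the LHLO sum of 87 above lines up with 0.000087 seconds in the /metrics output). The function names are made up for illustration; this is not Dovecot's own code.

```python
def exponential_bounds(min_magnitude, max_magnitude, base):
    """Upper bucket bounds, assumed to be in microseconds."""
    return [base ** exp for exp in range(min_magnitude, max_magnitude + 1)]


def bucket_for(duration_usecs, bounds):
    """Return the (low, high) range a duration falls in.

    high is None when the duration exceeds the last bound (overflow bucket).
    """
    low = 0
    for high in bounds:
        if duration_usecs <= high:
            return (low, high)
        low = high + 1
    return (low, None)


bounds = exponential_bounds(1, 5, 10)
print(bounds)                    # [10, 100, 1000, 10000, 100000]
print(bucket_for(87, bounds))    # (11, 100)     -> smtp_command_LHLO_250_duration_11_100
print(bucket_for(5441, bounds))  # (1001, 10000) -> smtp_command_DATA_250_duration_1001_10000
```

Dividing each bound by 1,000,000 gives the le values (0.00001 to 0.1 seconds) that the histogram exposes over HTTP.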

From the Prometheus server, we manually check that the metrics are reachable:

curl http://hellstorm:9900/metrics

# HELP process_start_time_seconds Timestamp of service start
# TYPE process_start_time_seconds gauge
process_start_time_seconds 1672768567
# HELP dovecot_build Dovecot build information
# TYPE dovecot_build info
dovecot_build_info{version="2.3.19.1",revision="9b53102964"} 1
# HELP dovecot_auth_success Total number of all events of this kind
# TYPE dovecot_auth_success counter
dovecot_auth_success_total 13
# HELP dovecot_auth_success_duration_seconds Total duration of all events of this kind
# TYPE dovecot_auth_success_duration_seconds counter
dovecot_auth_success_duration_seconds_total 0.118286
# HELP dovecot_auth_failures Total number of all events of this kind
# TYPE dovecot_auth_failures counter
dovecot_auth_failures_total 0
# HELP dovecot_auth_failures_duration_seconds Total duration of all events of this kind
# TYPE dovecot_auth_failures_duration_seconds counter
dovecot_auth_failures_duration_seconds_total 0.000000
# HELP dovecot_imap_command Total number of all events of this kind
# TYPE dovecot_imap_command counter
dovecot_imap_command_total{cmd_name="CAPABILITY"} 12
dovecot_imap_command_total{cmd_name="CAPABILITY",tagged_reply_state="OK"} 12
dovecot_imap_command_total{cmd_name="STATUS"} 17
dovecot_imap_command_total{cmd_name="STATUS",tagged_reply_state="OK"} 17
dovecot_imap_command_total{cmd_name="LOGOUT"} 13
dovecot_imap_command_total{cmd_name="LOGOUT",tagged_reply_state="OK"} 13
dovecot_imap_command_total{cmd_name="APPEND"} 1
dovecot_imap_command_total{cmd_name="APPEND",tagged_reply_state="OK"} 1
dovecot_imap_command_total{cmd_name="LSUB"} 2
dovecot_imap_command_total{cmd_name="LSUB",tagged_reply_state="OK"} 2
dovecot_imap_command_total{cmd_name="LIST"} 2
dovecot_imap_command_total{cmd_name="LIST",tagged_reply_state="OK"} 2
dovecot_imap_command_total{cmd_name="NAMESPACE"} 2
dovecot_imap_command_total{cmd_name="NAMESPACE",tagged_reply_state="OK"} 2
dovecot_imap_command_total{cmd_name="SELECT"} 3
dovecot_imap_command_total{cmd_name="SELECT",tagged_reply_state="OK"} 3
dovecot_imap_command_total{cmd_name="UID FETCH"} 3
dovecot_imap_command_total{cmd_name="UID FETCH",tagged_reply_state="OK"} 3
dovecot_imap_command_total{cmd_name="UID SORT"} 1
dovecot_imap_command_total{cmd_name="UID SORT",tagged_reply_state="OK"} 1
dovecot_imap_command_count 56
# HELP dovecot_imap_command_duration_seconds Total duration of all events of this kind
# TYPE dovecot_imap_command_duration_seconds counter
dovecot_imap_command_duration_seconds_total{cmd_name="CAPABILITY"} 0.000574
dovecot_imap_command_duration_seconds_total{cmd_name="CAPABILITY",tagged_reply_state="OK"} 0.000574
dovecot_imap_command_duration_seconds_total{cmd_name="STATUS"} 0.005222
dovecot_imap_command_duration_seconds_total{cmd_name="STATUS",tagged_reply_state="OK"} 0.005222
dovecot_imap_command_duration_seconds_total{cmd_name="LOGOUT"} 0.001152
dovecot_imap_command_duration_seconds_total{cmd_name="LOGOUT",tagged_reply_state="OK"} 0.001152
dovecot_imap_command_duration_seconds_total{cmd_name="APPEND"} 0.339435
dovecot_imap_command_duration_seconds_total{cmd_name="APPEND",tagged_reply_state="OK"} 0.339435
dovecot_imap_command_duration_seconds_total{cmd_name="LSUB"} 0.000427
dovecot_imap_command_duration_seconds_total{cmd_name="LSUB",tagged_reply_state="OK"} 0.000427
dovecot_imap_command_duration_seconds_total{cmd_name="LIST"} 0.002619
dovecot_imap_command_duration_seconds_total{cmd_name="LIST",tagged_reply_state="OK"} 0.002619
dovecot_imap_command_duration_seconds_total{cmd_name="NAMESPACE"} 0.000064
dovecot_imap_command_duration_seconds_total{cmd_name="NAMESPACE",tagged_reply_state="OK"} 0.000064
dovecot_imap_command_duration_seconds_total{cmd_name="SELECT"} 0.004772
dovecot_imap_command_duration_seconds_total{cmd_name="SELECT",tagged_reply_state="OK"} 0.004772
dovecot_imap_command_duration_seconds_total{cmd_name="UID FETCH"} 0.001486
dovecot_imap_command_duration_seconds_total{cmd_name="UID FETCH",tagged_reply_state="OK"} 0.001486
dovecot_imap_command_duration_seconds_total{cmd_name="UID SORT"} 0.000311
dovecot_imap_command_duration_seconds_total{cmd_name="UID SORT",tagged_reply_state="OK"} 0.000311
dovecot_imap_command_duration_seconds_sum 0.356062
# HELP dovecot_smtp_command Histogram
# TYPE dovecot_smtp_command histogram
dovecot_smtp_command_bucket{cmd_name="LHLO",status_code="250",le="0.000010"} 0
dovecot_smtp_command_bucket{cmd_name="LHLO",status_code="250",le="0.000100"} 1
dovecot_smtp_command_bucket{cmd_name="LHLO",status_code="250",le="0.001000"} 1
dovecot_smtp_command_bucket{cmd_name="LHLO",status_code="250",le="0.010000"} 1
dovecot_smtp_command_bucket{cmd_name="LHLO",status_code="250",le="0.100000"} 1
dovecot_smtp_command_bucket{cmd_name="LHLO",status_code="250",le="+Inf"} 1
dovecot_smtp_command_sum{cmd_name="LHLO",status_code="250"} 0.000087
dovecot_smtp_command_count{cmd_name="LHLO",status_code="250"} 1
dovecot_smtp_command_bucket{cmd_name="MAIL",status_code="250",le="0.000010"} 0
dovecot_smtp_command_bucket{cmd_name="MAIL",status_code="250",le="0.000100"} 1
dovecot_smtp_command_bucket{cmd_name="MAIL",status_code="250",le="0.001000"} 1
dovecot_smtp_command_bucket{cmd_name="MAIL",status_code="250",le="0.010000"} 1
dovecot_smtp_command_bucket{cmd_name="MAIL",status_code="250",le="0.100000"} 1
dovecot_smtp_command_bucket{cmd_name="MAIL",status_code="250",le="+Inf"} 1
dovecot_smtp_command_sum{cmd_name="MAIL",status_code="250"} 0.000059
dovecot_smtp_command_count{cmd_name="MAIL",status_code="250"} 1
dovecot_smtp_command_bucket{cmd_name="RCPT",status_code="250",le="0.000010"} 0
dovecot_smtp_command_bucket{cmd_name="RCPT",status_code="250",le="0.000100"} 0
dovecot_smtp_command_bucket{cmd_name="RCPT",status_code="250",le="0.001000"} 1
dovecot_smtp_command_bucket{cmd_name="RCPT",status_code="250",le="0.010000"} 1
dovecot_smtp_command_bucket{cmd_name="RCPT",status_code="250",le="0.100000"} 1
dovecot_smtp_command_bucket{cmd_name="RCPT",status_code="250",le="+Inf"} 1
dovecot_smtp_command_sum{cmd_name="RCPT",status_code="250"} 0.000477
dovecot_smtp_command_count{cmd_name="RCPT",status_code="250"} 1
dovecot_smtp_command_bucket{cmd_name="DATA",status_code="250",le="0.000010"} 0
dovecot_smtp_command_bucket{cmd_name="DATA",status_code="250",le="0.000100"} 0
dovecot_smtp_command_bucket{cmd_name="DATA",status_code="250",le="0.001000"} 0
dovecot_smtp_command_bucket{cmd_name="DATA",status_code="250",le="0.010000"} 1
dovecot_smtp_command_bucket{cmd_name="DATA",status_code="250",le="0.100000"} 1
dovecot_smtp_command_bucket{cmd_name="DATA",status_code="250",le="+Inf"} 1
dovecot_smtp_command_sum{cmd_name="DATA",status_code="250"} 0.005441
dovecot_smtp_command_count{cmd_name="DATA",status_code="250"} 1
dovecot_smtp_command_bucket{cmd_name="QUIT",status_code="221",le="0.000010"} 0
dovecot_smtp_command_bucket{cmd_name="QUIT",status_code="221",le="0.000100"} 1
dovecot_smtp_command_bucket{cmd_name="QUIT",status_code="221",le="0.001000"} 1
dovecot_smtp_command_bucket{cmd_name="QUIT",status_code="221",le="0.010000"} 1
dovecot_smtp_command_bucket{cmd_name="QUIT",status_code="221",le="0.100000"} 1
dovecot_smtp_command_bucket{cmd_name="QUIT",status_code="221",le="+Inf"} 1
dovecot_smtp_command_sum{cmd_name="QUIT",status_code="221"} 0.000028
dovecot_smtp_command_count{cmd_name="QUIT",status_code="221"} 1
# HELP dovecot_mail_delivery Histogram
# TYPE dovecot_mail_delivery histogram
dovecot_mail_delivery_bucket{le="0.000010"} 0
dovecot_mail_delivery_bucket{le="0.000100"} 0
dovecot_mail_delivery_bucket{le="0.001000"} 0
dovecot_mail_delivery_bucket{le="0.010000"} 1
dovecot_mail_delivery_bucket{le="0.100000"} 1
dovecot_mail_delivery_bucket{le="+Inf"} 1
dovecot_mail_delivery_sum 0.003941
dovecot_mail_delivery_count 1
# EOF
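
Note that each group_by metric is exported at every grouping level: dovecot_imap_command_total appears once with only cmd_name and again with cmd_name plus tagged_reply_state. Summing every series would therefore count each command twice. The Python sketch below parses a shortened copy of the output above and sums only one level. The parser is a minimal illustration that handles the simple lines shown here, not the full OpenMetrics grammar, and the helper names are hypothetical.

```python
import re

# Matches lines such as: name{label="value",other="x"} 12
SAMPLE_RE = re.compile(r'^([A-Za-z_:][A-Za-z0-9_:]*)(?:\{(.*)\})?\s+(\S+)$')
LABEL_RE = re.compile(r'([A-Za-z_][A-Za-z0-9_]*)="((?:[^"\\]|\\.)*)"')


def parse_samples(text):
    """Return a list of (name, labels, value) tuples, skipping comment lines."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        match = SAMPLE_RE.match(line)
        if match is None:
            continue
        name, raw_labels, value = match.groups()
        labels = dict(LABEL_RE.findall(raw_labels or ""))
        samples.append((name, labels, float(value)))
    return samples


def total_by_label_set(samples, name, label_names):
    """Sum the samples of `name` whose label names are exactly `label_names`."""
    wanted = set(label_names)
    return sum(v for n, labels, v in samples if n == name and set(labels) == wanted)


# Shortened excerpt of the /metrics output shown above.
EXAMPLE = """\
# TYPE dovecot_imap_command counter
dovecot_imap_command_total{cmd_name="CAPABILITY"} 12
dovecot_imap_command_total{cmd_name="CAPABILITY",tagged_reply_state="OK"} 12
dovecot_imap_command_total{cmd_name="UID FETCH"} 3
dovecot_imap_command_total{cmd_name="UID FETCH",tagged_reply_state="OK"} 3
dovecot_imap_command_count 56
# EOF
"""

samples = parse_samples(EXAMPLE)
per_command = total_by_label_set(samples, "dovecot_imap_command_total", ["cmd_name"])
print(per_command)  # 15.0 for this excerpt
```

In the full output, the cmd_name-only series add up to 56, which matches dovecot_imap_command_count.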

We register a new scrape job:

vi /usr/local/etc/prometheus.yml

  - job_name: 'dovecot_exporter'
    scrape_interval: 30s
    static_configs:
      - targets:
        - hellstorm:9900
        labels:
          scrape_interval: 30s

We restart the service:

service prometheus restart

NOTE: If we change the scrape interval, it is important to also change it in the scrape_interval label, since the queries behind the Grafana graphs use it.

Keep in mind that this exporter does NOT require authentication, so we must either restrict access with a firewall, or bind the exporter to localhost and install a web server such as Nginx that asks for credentials and acts as an intermediary. If the server is a jail in bridge mode there is no loopback interface, so we must bind the stats to a different port, bind Nginx to 9900, and use the firewall to restrict access to the stats port so that it is only reachable from the local IP. In that setup the credentials must also be defined in the scrape job.
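
Once a proxy that asks for credentials sits in front of the exporter, the manual check also needs to send them. The following Python sketch shows how such a request could be built, assuming the proxy uses HTTP Basic authentication. The user name and password are placeholders, and the helper names are invented for this example.

```python
import base64
import urllib.request


def build_metrics_request(url, username, password):
    """Build a GET request carrying an HTTP Basic Authorization header."""
    token = base64.b64encode(f"{username}:{password}".encode("utf-8")).decode("ascii")
    request = urllib.request.Request(url)
    request.add_header("Authorization", f"Basic {token}")
    return request


def fetch_metrics(url, username, password, timeout=5):
    """Fetch the metrics text through the authenticating proxy."""
    request = build_metrics_request(url, username, password)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8")


# Placeholder credentials; replace them with the ones configured in the proxy.
request = build_metrics_request("http://hellstorm:9900/metrics", "prometheus", "change-me")
print(request.get_header("Authorization"))
```

Calling fetch_metrics with the same arguments from the Prometheus server should return the same text as the curl check above. Basic authentication sends the password merely base64-encoded, so it is worth pairing it with TLS or a trusted network.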

We load the following dashboard into Grafana
