In this guide, we will see how to monitor sent/received emails and the queue status on a SendMail mail server using Prometheus. If Prometheus is not installed yet, we should first follow the basic installation guide for Prometheus and Grafana.
The first step will be to enable the SendMail stats:
cd /etc/mail
We generate the configuration file:
We edit it:
define(`STATUS_FILE',`/var/log/sendmail.stats')dnl
We compile the configuration and apply it:
cp HOSTNAME.cf sendmail.cf
We restart the service:
Now, with the following command, we can obtain the statistics:
Statistics from Tue Jan 3 18:22:55 2023
M msgsfr bytes_from msgsto bytes_to msgsrej msgsdis msgsqur Mailer
=====================================================================
T 0 0K 0 0K 0 0 0
C 0 0 0
If we send an email from the SendMail server, we will see the following output:
Statistics from Tue Jan 3 18:23:34 2023
M msgsfr bytes_from msgsto bytes_to msgsrej msgsdis msgsqur Mailer
3 1 1K 0 0K 0 0 0 local
5 0 0K 1 2K 0 0 0 esmtp
=====================================================================
T 1 1K 1 2K 0 0 0
C 1 1 0
We can see that the email goes from local (msgsfr) to esmtp (msgsto).
If we receive an email on the SendMail server, we will see the following output:
Statistics from Tue Jan 3 18:23:34 2023
M msgsfr bytes_from msgsto bytes_to msgsrej msgsdis msgsqur Mailer
3 1 1K 1 5K 0 0 0 local
5 1 4K 1 2K 0 0 0 esmtp
=====================================================================
T 2 5K 2 7K 0 0 0
C 2 2 0
We can see that the email goes from esmtp (msgsfr) to local (msgsto).
The last two rows are the totals (T), which in my opinion are not worth showing since the per-mailer figures are already there, and the counters for TCP connections (C), which I do not find useful either:
- connectionMessagesFrom: Number of messages sent over TCP connections.
- connectionMessagesTo: Number of messages received over TCP connections.
- connectionMessagesRejected: Number of messages that arrived over TCP connections and were rejected.
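As a rough illustration of how the per-mailer rows above can be read programmatically, the following sketch splits the mailstats output into named fields. The function name `parse_mailstats` and the `FIELDS` tuple are assumptions made for this example; the sample text is the output shown earlier in the article.

```python
from typing import Dict

# Column names in the order the mailstats header lists them.
FIELDS = ("msgsfr", "bytes_from", "msgsto", "bytes_to",
          "msgsrej", "msgsdis", "msgsqur")


def parse_mailstats(text: str) -> Dict[str, Dict[str, int]]:
    """Return {mailer_name: {field: value}} for every per-mailer row."""
    stats: Dict[str, Dict[str, int]] = {}
    for line in text.splitlines():
        parts = line.split()
        # A per-mailer row has: mailer index, seven counters, mailer name.
        # The header, separator, T and C rows do not fit this shape.
        if len(parts) != 9 or not parts[0].isdigit():
            continue
        try:
            # Drop the "K" suffix used by the human-readable output.
            values = [int(p.rstrip("K")) for p in parts[1:8]]
        except ValueError:
            continue
        stats[parts[8]] = dict(zip(FIELDS, values))
    return stats


SAMPLE = """\
Statistics from Tue Jan 3 18:23:34 2023
 M msgsfr bytes_from msgsto bytes_to msgsrej msgsdis msgsqur Mailer
 3 1 1K 1 5K 0 0 0 local
 5 1 4K 1 2K 0 0 0 esmtp
=====================================================================
 T 2 5K 2 7K 0 0 0
 C 2 2 0
"""

stats = parse_mailstats(SAMPLE)
print(stats["local"]["msgsto"], stats["esmtp"]["bytes_to"])
```

The T and C rows are skipped on purpose, since only the per-mailer rows are of interest here.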
Therefore, to monitor sent/received emails and incoming/outgoing traffic, we must obtain the following fields:
- Sending:
Increase of esmtp(msgsto)
Increase of esmtp(bytes_to)
- Reception:
Increase of local(msgsto)
Increase of local(bytes_to)
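Since mailstats reports cumulative values, what matters is how much each field grows between two readings. The sketch below shows that idea in a simplified form; the function name `increase` and the sample values are hypothetical, and it is not a reimplementation of PromQL's increase(), which also extrapolates over the query window.

```python
def increase(previous: int, current: int) -> int:
    """Growth between two readings of a cumulative value.

    If the value went backwards we assume it was reset (for example,
    because the statistics file was recreated) and count from zero.
    """
    if current < previous:
        return current
    return current - previous


# Hypothetical successive readings of esmtp(msgsto).
samples = [0, 1, 1, 4, 2]
deltas = [increase(a, b) for a, b in zip(samples, samples[1:])]
print(deltas)  # [1, 0, 3, 2]
```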
Another interesting metric is the SMTP server queue size. Catching a queued email on an idle system is difficult, so we will add a firewall rule that lets emails in but blocks them from leaving. This way, the email will be queued:
ipfw add 00011 deny tcp from any to any 25
We check the queue:
/var/spool/mqueue (1 request)
-----Q-ID----- --Size-- -----Q-Time----- ------------Sender/Recipient-----------
19DAkLMh052596 575 Tue Jan 03 18:46 <kr0m@alfaexploit.com>
(Deferred: Permission denied)
<jjivarspoquet@gmail.com>
Total requests: 1
We delete the firewall rule and see that there are no more queued emails:
ipfw delete 00011
/var/spool/mqueue is empty
Total requests: 0
Sendmail processes emails in different queues depending on their origin. The default queues are:
- mqueue queue: The email was introduced into the system by a process running as root. These are usually incoming emails, and we can find them in /var/spool/mqueue/.
- clientmqueue queue: The email was introduced into the system by any user other than root. These are usually emails sent by regular system users, and we can find them in /var/spool/clientmqueue (path defined in /etc/mail/submit.cf).
NOTE: I have not been able to queue emails in the clientmqueue queue in any way. I do not know in what case this scenario will occur.
In addition, each of the queues may have emails in different states:
- lost: These are emails that, after several attempts, could not be delivered for some reason.
- quarantined: These are emails that, for some reason, have been quarantined. This can happen due to a rule defined at the SendMail level or by a milter such as SpamAssassin, for example.
The mailq command lets us view the different queues and states depending on the arguments we pass to it:
- Without arguments: Shows the mqueue queue.
- -Ac: Shows the clientmqueue queue.
- -qL: Shows the lost emails.
- -qQ: Shows the quarantined emails.
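The mapping between queues and mailq arguments listed above, together with the "Total requests" line of the output, can be sketched as follows. The names `QUEUE_ARGS`, `mailq_command` and `queued_total` are assumptions made for this example; the sample text is the mailq output shown earlier.

```python
import re
from typing import List, Optional

# mailq arguments for each queue/state, as listed in the article.
QUEUE_ARGS = {
    "mqueue": [],
    "clientmqueue": ["-Ac"],
    "lost": ["-qL"],
    "quarantined": ["-qQ"],
}

TOTAL_RE = re.compile(r"Total requests:\s*(\d+)")


def mailq_command(queue: str) -> List[str]:
    """Build the command line that shows the given queue/state."""
    return ["mailq", *QUEUE_ARGS[queue]]


def queued_total(mailq_output: str) -> Optional[int]:
    """Extract the number after 'Total requests:', or None if absent."""
    match = TOTAL_RE.search(mailq_output)
    return int(match.group(1)) if match else None


SAMPLE = """\
/var/spool/mqueue (1 request)
-----Q-ID----- --Size-- -----Q-Time----- ------------Sender/Recipient-----------
19DAkLMh052596 575 Tue Jan 03 18:46 <kr0m@alfaexploit.com>
 (Deferred: Permission denied)
 <jjivarspoquet@gmail.com>
 Total requests: 1
"""

print(mailq_command("lost"), queued_total(SAMPLE))
```

Returning None when the line is missing makes it possible to tell "no data" apart from an empty queue.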
Now that we know how to manually query the data, let’s proceed with the programming of our exporter. The first thing is to decide what type of metrics we are going to serve. In my case, they will all be Gauges:
- Sent/received emails
- Sent/received bytes
- Emails queued by queue/state
We install the necessary libraries:
We program our exporter:
#!/usr/local/bin/python

from flask import Response, Flask, request
from flask_httpauth import HTTPBasicAuth
from werkzeug.security import generate_password_hash, check_password_hash
from waitress import serve
import prometheus_client
from prometheus_client.core import CollectorRegistry
from prometheus_client import Gauge
import time
import threading
import subprocess
import re
import datetime

# https://github.com/prometheus/client_python#gauge
smtp_incoming_emails = Gauge('smtp_incoming_emails', 'Incoming emails via SMTP: mailstats')
smtp_incoming_data = Gauge('smtp_incoming_data', 'Incoming Kbytes via SMTP: mailstats')
smtp_incoming_rejected_emails = Gauge('smtp_incoming_rejected_emails', 'Rejected incoming emails via SMTP: mailstats')
smtp_incoming_discarded_emails = Gauge('smtp_incoming_discarded_emails', 'Discarded incoming emails via SMTP: mailstats')
smtp_incoming_quarantined_emails = Gauge('smtp_incoming_quarantined_emails', 'Quarantined incoming emails via SMTP: mailstats')

smtp_outcoming_emails = Gauge('smtp_outcoming_emails', 'outcoming emails via SMTP: mailstats')
smtp_outcoming_data = Gauge('smtp_outcoming_data', 'outcoming Kbytes via SMTP: mailstats')
smtp_outcoming_rejected_emails = Gauge('smtp_outcoming_rejected_emails', 'Rejected outcoming emails via SMTP: mailstats')
smtp_outcoming_discarded_emails = Gauge('smtp_outcoming_discarded_emails', 'Discarded outcoming emails via SMTP: mailstats')
smtp_outcoming_quarantined_emails = Gauge('smtp_outcoming_quarantined_emails', 'Quarantined outcoming emails via SMTP: mailstats')

smtp_queued_emails_mqueue = Gauge('smtp_queued_emails_mqueue', 'Queued emails: qmail')
smtp_queued_emails_clientmqueue = Gauge('smtp_queued_emails_clientmqueue', 'Queued emails: qmail -Ac')
smtp_queued_emails_lost = Gauge('smtp_queued_emails_lost', 'Queued emails: qmail -qL')
smtp_queued_emails_quarantined = Gauge('smtp_queued_emails_quarantined', 'Queued emails: qmail -qQ')

app = Flask(__name__)
auth = HTTPBasicAuth()

users = {
    "sendmail_exporter_user": generate_password_hash("PASSWORD"),
}

@auth.verify_password
def verify_password(username, password):
    if username in users and check_password_hash(users.get(username), password):
        return username

def get_sendmail_stats():
    # Background loop: refresh every gauge from mailstats and mailq.
    print('++ mainThread started')
    print('------------------')

    while True:
        now = datetime.datetime.now()
        print('%s' % now)
        print('')

        # MAILSTATS:
        process = subprocess.run(["mailstats", "-P"], capture_output=True, encoding="utf-8")
        #print(process.stdout)
        for line in process.stdout.splitlines():
            # Skip mailer index, msgsfr and bytes_from; capture msgsto, bytes_to, msgsrej, msgsdis, msgsqur
            search_pattern = False
            search_pattern = re.match(r"\s*\d*\s*\d*\s*\d*\s*(\d*)\s*(\d*)\s*(\d*)\s*(\d*)\s*(\d*)\s*local", line)
            if search_pattern:
                print('> smtp_incoming_emails: %i' % int(search_pattern.group(1)))
                smtp_incoming_emails.set(search_pattern.group(1))
                print('> smtp_incoming_data: %s' % search_pattern.group(2))
                smtp_incoming_data.set(search_pattern.group(2))
                print('> smtp_incoming_rejected_emails: %i' % int(search_pattern.group(3)))
                smtp_incoming_rejected_emails.set(search_pattern.group(3))
                print('> smtp_incoming_discarded_emails: %i' % int(search_pattern.group(4)))
                smtp_incoming_discarded_emails.set(search_pattern.group(4))
                print('> smtp_incoming_quarantined_emails: %i' % int(search_pattern.group(5)))
                smtp_incoming_quarantined_emails.set(search_pattern.group(5))
                print('')
                continue

            search_pattern = False
            search_pattern = re.match(r"\s*\d*\s*\d*\s*\d*\s*(\d*)\s*(\d*)\s*(\d*)\s*(\d*)\s*(\d*)\s*esmtp", line)
            if search_pattern:
                print('> smtp_outcoming_emails: %i' % int(search_pattern.group(1)))
                smtp_outcoming_emails.set(search_pattern.group(1))
                print('> smtp_outcoming_data: %s' % search_pattern.group(2))
                smtp_outcoming_data.set(search_pattern.group(2))
                print('> smtp_outcoming_rejected_emails: %s' % search_pattern.group(3))
                smtp_outcoming_rejected_emails.set(search_pattern.group(3))
                print('> smtp_outcoming_discarded_emails: %s' % search_pattern.group(4))
                smtp_outcoming_discarded_emails.set(search_pattern.group(4))
                print('> smtp_outcoming_quarantined_emails: %s' % search_pattern.group(5))
                smtp_outcoming_quarantined_emails.set(search_pattern.group(5))
                print('')
                continue

        # MAILQ:
        process = subprocess.run(["mailq"], capture_output=True, encoding="utf-8")
        for line in process.stdout.splitlines():
            search_pattern = False
            search_pattern = re.match(r".*Total requests:\s*(\d*)", line)
            if search_pattern:
                print('> smtp_queued_emails_mqueue: %s' % search_pattern.group(1))
                smtp_queued_emails_mqueue.set(search_pattern.group(1))
                break

        process = subprocess.run(["mailq", "-Ac"], capture_output=True, encoding="utf-8")
        for line in process.stdout.splitlines():
            search_pattern = False
            search_pattern = re.match(r".*Total requests:\s*(\d*)", line)
            if search_pattern:
                print('> smtp_queued_emails_clientmqueue: %s' % search_pattern.group(1))
                smtp_queued_emails_clientmqueue.set(search_pattern.group(1))
                break

        process = subprocess.run(["mailq", "-qL"], capture_output=True, encoding="utf-8")
        for line in process.stdout.splitlines():
            search_pattern = False
            search_pattern = re.match(r".*Total requests:\s*(\d*)", line)
            if search_pattern:
                print('> smtp_queued_emails_lost: %s' % search_pattern.group(1))
                smtp_queued_emails_lost.set(search_pattern.group(1))
                break

        process = subprocess.run(["mailq", "-qQ"], capture_output=True, encoding="utf-8")
        for line in process.stdout.splitlines():
            search_pattern = False
            search_pattern = re.match(r".*Total requests:\s*(\d*)", line)
            if search_pattern:
                print('> smtp_queued_emails_quarantined: %s' % search_pattern.group(1))
                smtp_queued_emails_quarantined.set(search_pattern.group(1))
                break

        # Metric refresh rate
        time.sleep(30)
        print('------------------')

@app.route("/metrics")
@auth.login_required
def serve_metrics():
    # Serve the current value of every registered metric.
    now = datetime.datetime.now()
    res = []
    src_ip = request.remote_addr
    print('')
    print('< Serving metrics: %s - %s' % (src_ip,now))
    print('')
    res.append(prometheus_client.generate_latest())
    return Response(res, mimetype="text/plain")

if __name__ == "__main__":
    mainThread = threading.Thread(target=get_sendmail_stats)
    mainThread.start()
    serve(app, host="0.0.0.0", port=2525)
NOTE: Remember that if we install the exporter on the parent host of a jail server, we must bind it only to the host's own IP; otherwise it will listen on the IPs of all the jails.
We give it execution permissions:
We run it manually to make sure it doesn’t fail:
We manually query the metrics:
# HELP python_gc_objects_collected_total Objects collected during gc
# TYPE python_gc_objects_collected_total counter
python_gc_objects_collected_total{generation="0"} 98.0
python_gc_objects_collected_total{generation="1"} 287.0
python_gc_objects_collected_total{generation="2"} 0.0
# HELP python_gc_objects_uncollectable_total Uncollectable object found during GC
# TYPE python_gc_objects_uncollectable_total counter
python_gc_objects_uncollectable_total{generation="0"} 0.0
python_gc_objects_uncollectable_total{generation="1"} 0.0
python_gc_objects_uncollectable_total{generation="2"} 0.0
# HELP python_gc_collections_total Number of times this generation was collected
# TYPE python_gc_collections_total counter
python_gc_collections_total{generation="0"} 77.0
python_gc_collections_total{generation="1"} 6.0
python_gc_collections_total{generation="2"} 0.0
# HELP python_info Python platform information
# TYPE python_info gauge
python_info{implementation="CPython",major="3",minor="9",patchlevel="16",version="3.9.16"} 1.0
# HELP smtp_incoming_emails Incoming emails via SMTP: mailstats
# TYPE smtp_incoming_emails gauge
smtp_incoming_emails 1.0
# HELP smtp_incoming_data Incoming Kbytes via SMTP: mailstats
# TYPE smtp_incoming_data gauge
smtp_incoming_data 5.0
# HELP smtp_incoming_rejected_emails Rejected incoming emails via SMTP: mailstats
# TYPE smtp_incoming_rejected_emails gauge
smtp_incoming_rejected_emails 0.0
# HELP smtp_incoming_discarded_emails Discarded incoming emails via SMTP: mailstats
# TYPE smtp_incoming_discarded_emails gauge
smtp_incoming_discarded_emails 0.0
# HELP smtp_incoming_quarantined_emails Quarantined incoming emails via SMTP: mailstats
# TYPE smtp_incoming_quarantined_emails gauge
smtp_incoming_quarantined_emails 0.0
# HELP smtp_outcoming_emails outcoming emails via SMTP: mailstats
# TYPE smtp_outcoming_emails gauge
smtp_outcoming_emails 1.0
# HELP smtp_outcoming_data outcoming Kbytes via SMTP: mailstats
# TYPE smtp_outcoming_data gauge
smtp_outcoming_data 2.0
# HELP smtp_outcoming_rejected_emails Rejected outcoming emails via SMTP: mailstats
# TYPE smtp_outcoming_rejected_emails gauge
smtp_outcoming_rejected_emails 0.0
# HELP smtp_outcoming_discarded_emails Discarded outcoming emails via SMTP: mailstats
# TYPE smtp_outcoming_discarded_emails gauge
smtp_outcoming_discarded_emails 0.0
# HELP smtp_outcoming_quarantined_emails Quarantined outcoming emails via SMTP: mailstats
# TYPE smtp_outcoming_quarantined_emails gauge
smtp_outcoming_quarantined_emails 0.0
# HELP smtp_queued_emails_mqueue Queued emails: qmail
# TYPE smtp_queued_emails_mqueue gauge
smtp_queued_emails_mqueue 0.0
# HELP smtp_queued_emails_clientmqueue Queued emails: qmail -Ac
# TYPE smtp_queued_emails_clientmqueue gauge
smtp_queued_emails_clientmqueue 0.0
# HELP smtp_queued_emails_lost Queued emails: qmail -qL
# TYPE smtp_queued_emails_lost gauge
smtp_queued_emails_lost 0.0
# HELP smtp_queued_emails_quarantined Queued emails: qmail -qQ
# TYPE smtp_queued_emails_quarantined gauge
smtp_queued_emails_quarantined 0.0
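The manual query can also be scripted. The following sketch, using only the Python standard library, builds an authenticated request and extracts the smtp_ metrics from a response like the one above. The helper names `build_metrics_request` and `smtp_metrics` are assumptions made for this example, and the URL assumes the exporter is reachable as hellstorm:2525, the target used later in the scrape configuration.

```python
import base64
import urllib.request
from typing import Dict


def build_metrics_request(url: str, username: str, password: str) -> urllib.request.Request:
    """Build a GET request carrying an HTTP Basic Auth header."""
    token = base64.b64encode(f"{username}:{password}".encode("utf-8")).decode("ascii")
    return urllib.request.Request(url, headers={"Authorization": f"Basic {token}"})


def smtp_metrics(exposition: str) -> Dict[str, float]:
    """Keep only the unlabelled smtp_* samples of a metrics response."""
    metrics: Dict[str, float] = {}
    for line in exposition.splitlines():
        # Skip HELP/TYPE comments and metrics from other sources.
        if not line.startswith("smtp_"):
            continue
        parts = line.split()
        if len(parts) != 2:
            continue
        metrics[parts[0]] = float(parts[1])
    return metrics


request = build_metrics_request(
    "http://hellstorm:2525/metrics", "sendmail_exporter_user", "PASSWORD"
)

# Uncomment on a host that can reach the exporter:
# with urllib.request.urlopen(request, timeout=5) as response:
#     print(smtp_metrics(response.read().decode("utf-8")))
```

The network call is left commented out so the snippet can be checked without a running exporter.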
We copy the exporter to a more convenient path:
We daemonize the exporter:
#! /bin/sh
#
# $FreeBSD$
#

# PROVIDE: sendmail_exporter
# REQUIRE: DAEMON
# KEYWORD: shutdown

. /etc/rc.subr

name="sendmail_exporter"
rcvar="${name}_enable"
extra_commands="status"

start_cmd="${name}_start"
stop_cmd="${name}_stop"
status_cmd="${name}_status"

sendmail_exporter_start(){
    echo "Starting service: ${name}"
    /usr/sbin/daemon -S -p /var/run/${name}.pid -T ${name} /usr/local/sbin/sendmail_exporter
}

sendmail_exporter_stop(){
    if [ -f /var/run/${name}.pid ]; then
        echo "Stopping service: ${name}"
        kill $(cat /var/run/${name}.pid)
        sleep 4
    else
        echo "It appears ${name} is not running."
    fi
}

sendmail_exporter_status(){
    if [ -f /var/run/${name}.pid ]; then
        echo "${name} running with PID: $(cat /var/run/${name}.pid)"
    else
        echo "It appears ${name} is not running."
    fi
}

load_rc_config ${name}
run_rc_command "$1"
We assign the necessary permissions to our RC script:
We enable the service:
We start the exporter:
We query the metrics again to make sure it’s still working:
In order to configure the scrapes by name, I need to register the servers to be monitored in the /etc/hosts file of the Prometheus server since I don’t have a DNS server on my local network:
192.168.69.2 mightymax
192.168.69.4 garrus
192.168.69.16 baudbeauty
192.168.69.17 hellstorm
192.168.69.18 paradox
192.168.69.19 atlas
192.168.69.20 metacortex
We register the scrape in Prometheus:
  - job_name: 'sendmail_exporter'
    scrape_interval: 30s
    static_configs:
      - targets:
        - hellstorm:2525
        labels:
          scrape_interval: 30s
    basic_auth:
      username: sendmail_exporter_user
      password: PASSWORD
NOTE: If we change the scrape interval, it is important to also update the scrape_interval tag in the Grafana charts, as it is used in the chart queries.
We restart the service:
We import the following dashboard into Grafana:
Where we will see the following charts: