Config File
sznuper uses a single YAML config file. By default it looks for:
- ~/.config/sznuper/config.yml (as user)
- /etc/sznuper/config.yml (as root)
Override with --config <path>. Environment variables are supported anywhere in the file using ${VAR_NAME} syntax.
The config has four top-level sections: options, globals, services, and alerts.
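As a rough orientation, the four sections sit side by side at the top level of the file. The skeleton below is a sketch only: the values are placeholders borrowed from the examples on this page, MY_HOSTNAME is a hypothetical environment variable used to illustrate ${VAR_NAME} substitution, and the empty alerts list is just a placeholder.

```yaml
# Paths for healthcheck storage, caching, and logs (all optional).
options:
  logs_dir: /var/log/sznuper

# Arbitrary values available to templates as {{globals.<key>}}.
globals:
  hostname: ${MY_HOSTNAME}  # hypothetical variable, substituted via ${VAR_NAME}

# Named notification services (Shoutrrr URLs).
services:
  telegram:
    url: telegram://${TELEGRAM_TOKEN}@telegram
    params:
      chats: ${TELEGRAM_CHAT_ID}

# What to check, when to check it, and who to notify.
alerts: []  # placeholder; alert definitions go here
```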
options
Paths for healthcheck storage, caching, and logs.

    options:
      healthchecks_dir: /etc/sznuper/healthchecks
      cache_dir: /var/cache/sznuper
      logs_dir: /var/log/sznuper

All fields are optional. Defaults depend on whether sznuper runs as root or as a regular user.
| Field | Root default | User default |
|---|---|---|
| healthchecks_dir | /etc/sznuper/healthchecks | ~/.config/sznuper/healthchecks |
| cache_dir | /var/cache/sznuper | ~/.cache/sznuper |
| logs_dir | /var/log/sznuper | ~/.local/state/sznuper/logs |
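Because every field is optional, a config can override a single path and leave the others at their defaults. A minimal sketch, assuming a root install where only the logs are moved; the /srv/logs/sznuper path is a made-up example:

```yaml
options:
  # Only logs_dir is overridden; healthchecks_dir and cache_dir
  # fall back to the root defaults listed in the table above.
  logs_dir: /srv/logs/sznuper
```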
globals
Arbitrary key-value pairs accessible in notification templates. Useful for shared values like hostname.

    globals:
      hostname: my-server
      environment: production

services
Notification services using Shoutrrr URLs. Each service has a name, a URL, and optional default params.
The service name is arbitrary - it’s just a label you use to reference it later in alerts. You can have multiple services of the same type with different names:

    services:
      telegram-ops:
        url: telegram://${TELEGRAM_TOKEN}@telegram
        params:
          chats: ${OPS_CHAT_ID}
      telegram-alerts:
        url: telegram://${TELEGRAM_TOKEN}@telegram
        params:
          chats: ${ALERTS_CHAT_ID}
      discord:
        url: discord://${DISCORD_TOKEN}@${DISCORD_WEBHOOK_ID}

Then reference them by name in your alerts:

    alerts:
      - name: disk
        ...
        notify:
          - telegram-ops
          - discord

alerts
A list of alerts. Each alert defines what to check, when to check it, and which services to notify. Example:

    alerts:
      - name: disk
        healthcheck: https://github.com/sznuper/healthchecks/releases/download/v0.4.0/disk_usage
        sha256: abc123...
        triggers:
          - interval: "5m"
        timeout: "30s"
        args:
          mount: /
          threshold_warn_percent: 80
          threshold_crit_percent: 95
        template: "Disk usage on {{globals.hostname}}: {{event.usage_percent}}%"
        cooldown: "1h"
        notify:
          - telegram

Alert fields
| Field | Required | Description |
|---|---|---|
| name | yes | Unique name for this alert |
| healthcheck | yes | URI of the healthcheck (file://, https://, or builtin://) |
| sha256 | no | SHA-256 hash for remote healthchecks, or false to skip verification |
| triggers | no | List of triggers (see below) |
| timeout | no | Max execution time (e.g. "30s") |
| args | no | Key-value arguments passed as HEALTHCHECK_ARG_* env vars |
| side_effects | no | Shell commands to run after event processing |
| template | yes | Go template for the notification message (see below) |
| cooldown | no | Suppress repeated notifications (e.g. "5m", "1h") |
| notify | yes | List of services to notify |
| events | no | Per-event-type configuration (see below) |
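To illustrate the table, here is a sketch of an alert that sets only the required fields plus a trigger and one argument, and opts out of hash verification with sha256: false. The healthcheck URL is the one used elsewhere on this page, the alert name is invented, and the telegram service is assumed to be defined under services.

```yaml
alerts:
  - name: disk-unverified   # hypothetical name; must be unique across alerts
    healthcheck: https://github.com/sznuper/healthchecks/releases/download/v0.4.0/disk_usage
    sha256: false           # skip verification of the remote healthcheck
    triggers:
      - interval: "5m"
    args:
      mount: /              # passed to the healthcheck as a HEALTHCHECK_ARG_* env var
    template: "{{alert.name}}: {{event.type}}"
    notify:
      - telegram
```

With verification skipped, the downloaded healthcheck is not checked against a known hash, so pinning a sha256 is likely the safer choice for remote healthchecks.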
Triggers
A list of triggers. Each alert can have multiple triggers, and they all run independently. Example:

    triggers:
      - interval: "5m"
      - cron: "0 9 * * 1"
      - cron: "0 18 * * *"

This alert would run every 5 minutes, every Monday at 9am, and every day at 6pm.
Available trigger types:
| Type | Description |
|---|---|
| interval | Run on a fixed interval (e.g. "5m", "30s") |
| cron | Cron expression, 5 or 6 fields (e.g. "0 9 * * *") |
| watch | Run when a file changes (e.g. /var/log/app.log) |
| pipe | Continuous shell command whose stdout is fed to the healthcheck (e.g. "tail -F /var/log/app.log") |
| lifecycle | Special trigger that fires on daemon start and stop. Only works with the builtin://lifecycle healthcheck. |
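The table can be read as a menu of keys for the triggers list. The sketch below shows the file-based and stream-based types next to an interval, purely to illustrate syntax. The watch and pipe values are taken from the examples in the table, and the key: value shape for watch is an assumption that mirrors interval and cron. lifecycle is left out because its value syntax is not shown here.

```yaml
triggers:
  - interval: "30s"                    # run on a fixed interval
  - watch: /var/log/app.log            # assumed shape: run when this file changes
  - pipe: "tail -F /var/log/app.log"   # stdout of the command is fed to the healthcheck
```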
Templates
Templates use Go’s text/template syntax with Sprig functions. Four scopes are available:
| Scope | Description |
|---|---|
| event | Fields from the healthcheck output (e.g. {{event.type}}, {{event.usage_percent}}) |
| globals | Values from the globals config section (e.g. {{globals.hostname}}) |
| alert | Alert metadata (e.g. {{alert.name}}) |
| args | Arguments from the alert’s args field (e.g. {{args.mount}}) |
Example:

    template: |-
      [{{event.type | upper}}] {{globals.hostname}}: Disk {{args.mount}} at {{event.usage_percent}}%
      ({{event.available}} remaining)

Notify targets
A list of services to notify. In the simplest form, just the service name:

    notify:
      - telegram
      - discord

You can also override params per notification. The params are merged on top of the service’s base params: any key you set here wins over the service default. Params are passed as query parameters in the Shoutrrr URL.

    notify:
      - telegram
      - telegram:
          params:
            chats: ${ANOTHER_CHAT_ID}  # override the default chat
            notification: "false"      # send silently
      - discord:
          params:
            username: sznuper-bot
            avatar_url: https://example.com/avatar.png

This sends to the default telegram chat, to a second telegram chat silently, and to discord with a custom bot name and avatar.
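To make the merge rule concrete, the sketch below pairs a service definition with a per-notification override and lists, in comments, the params that would result. The parsemode entry is an illustrative service-level default added for this example, and the final query string approximates how merged params might be appended; it is not output captured from sznuper.

```yaml
services:
  telegram:
    url: telegram://${TELEGRAM_TOKEN}@telegram
    params:
      chats: ${TELEGRAM_CHAT_ID}
      parsemode: HTML                   # illustrative service-level default

alerts:
  - name: disk
    # other alert fields omitted
    notify:
      - telegram:
          params:
            chats: ${ANOTHER_CHAT_ID}   # wins over the service default
            notification: "false"       # added on top of the base params

# Effective params for this notification:
#   chats=${ANOTHER_CHAT_ID}, parsemode=HTML, notification=false
# Roughly: telegram://<token>@telegram?chats=<id>&parsemode=HTML&notification=false
```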
Events
Each alert has template, notify, and optionally cooldown that apply to all event types by default. The events section lets you override these per event type.
| Field | Description |
|---|---|
| healthy | List of event types considered healthy. When sznuper sees a healthy event after unhealthy ones, it resets cooldowns. |
| on_unmatched | What to do with event types not listed in override: "notify" (default) or "drop". |
| override | Per-event-type overrides for template, cooldown, and notify. |
For example, say you have a disk usage alert with a default cooldown of "1h" and a simple template. But for critical_usage events you want a more urgent message, a shorter cooldown, and to also notify discord:

    alerts:
      - name: disk
        healthcheck: ...
        triggers:
          - interval: "5m"
        template: |-
          [{{event.type | upper}}] {{globals.hostname}}: Disk at {{event.usage_percent}}%
        cooldown: "1h"
        notify:
          - telegram
        events:
          healthy:
            - ok
          on_unmatched: notify
          override:
            critical_usage:
              template: |-
                CRITICAL: {{globals.hostname}} disk at {{event.usage_percent}}%!
                Only {{event.available}} remaining on {{args.mount}}
              cooldown: "5m"
              notify:
                - telegram
                - discord

Here, ok and high_usage events use the alert-level defaults (1h cooldown, telegram only). But critical_usage gets its own template, a 5m cooldown, and notifies both telegram and discord.
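For contrast, here is a sketch of the events block for the same alert with on_unmatched set to drop. Going by the field descriptions above, only event types listed under override would then produce notifications, so ok and high_usage events would no longer notify. The fragment is illustrative and has not been run against sznuper.

```yaml
events:
  healthy:
    - ok
  on_unmatched: drop   # event types without an override entry are not notified
  override:
    critical_usage:
      cooldown: "5m"   # template and notify fall back to the alert-level values
```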
Example config

    options:
      healthchecks_dir: /etc/sznuper/healthchecks
      cache_dir: /var/cache/sznuper
      logs_dir: /var/log/sznuper

    globals:
      hostname: my-server

    services:
      telegram:
        url: telegram://${TELEGRAM_TOKEN}@telegram
        params:
          chats: ${TELEGRAM_CHAT_ID}

    alerts:
      - name: disk
        healthcheck: https://github.com/sznuper/healthchecks/releases/download/v0.4.0/disk_usage
        sha256: abc123...
        triggers:
          - interval: "10m"
        args:
          mount: /
          threshold_warn_percent: 80
          threshold_crit_percent: 95
        template: |-
          Disk usage on {{globals.hostname}}
          Mount: {{event.mount}}
          Usage: {{event.usage_percent}}%
          Available: {{event.available}}
        cooldown: "1h"
        notify:
          - telegram
        events:
          healthy:
            - ok

      - name: ssh
        healthcheck: https://github.com/sznuper/healthchecks/releases/download/v0.4.0/ssh_journal
        sha256: abc123...
        triggers:
          - pipe: "journalctl -fu sshd --output=json"
        template: |-
          SSH {{event.type}} on {{globals.hostname}}
          User: {{event.user}}
          Host: {{event.host}}
        notify:
          - telegram
        events:
          override:
            login:
              cooldown: "0"
            failure:
              cooldown: "5m"