This project is in no way affiliated with Velero.

Velero Sentinel was born out of the need to be notified about failed and partially failed backups made with Velero.

There are more elaborate alternatives out there. If you have Prometheus Operator running, you can achieve the same with a simple rule:

    apiVersion: monitoring.coreos.com/v1
    kind: PrometheusRule
    metadata:
      name: velero
    spec:
      groups:
      - name: velero-failures
        rules:
        # Warn when more than 25% of a schedule's backup attempts partially failed.
        - alert: VeleroBackupPartialFailures
          annotations:
            message: Velero backup {{ $labels.schedule }} has {{ $value | humanizePercentage }} partially failed backups.
          expr: |-
            velero_backup_partial_failure_total{schedule!=""} / velero_backup_attempt_total{schedule!=""} > 0.25
          for: 15m
          labels:
            severity: warning
        # Warn when more than 25% of a schedule's backup attempts failed outright.
        - alert: VeleroBackupFailures
          annotations:
            message: Velero backup {{ $labels.schedule }} has {{ $value | humanizePercentage }} failed backups.
          expr: |-
            velero_backup_failure_total{schedule!=""} / velero_backup_attempt_total{schedule!=""} > 0.25
          for: 15m
          labels:
            severity: warning

However, this requires more moving parts. The goal of Velero Sentinel is to provide the simplest possible solution.

Velero Sentinel is still a work in progress. Although it only needs read access to your cluster, proceed with caution!
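
To illustrate what "read access" could look like in practice, here is a minimal RBAC sketch. It is an assumption, not the project's shipped manifest: the Role name `velero-sentinel-read` is hypothetical, and it assumes Velero runs in its default `velero` namespace and that reading `Backup` objects is sufficient. Check the actual installation manifests before relying on it.

```yaml
# Hypothetical read-only Role; the name and namespace are assumptions.
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: velero-sentinel-read
  namespace: velero            # assumed: Velero's default namespace
rules:
- apiGroups: ["velero.io"]     # API group of Velero's custom resources
  resources: ["backups"]       # assumed: only Backup objects are inspected
  verbs: ["get", "list", "watch"]   # read-only verbs, no create/update/delete
```

A Role like this would still need a RoleBinding to whichever service account the tool runs under.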

Install Velero Sentinel

diy