This project is in no way affiliated with Velero.
Velero Sentinel was born out of the need to be notified about failed and partially failed backups made with Velero.
There are more elaborate alternatives out there. If you already run the Prometheus Operator, you can achieve the same with a simple rule:

    apiVersion: monitoring.coreos.com/v1
    kind: PrometheusRule
    metadata:
      name: velero
    spec:
      groups:
      - name: velero-failures
        rules:
        - alert: VeleroBackupPartialFailures
          annotations:
            message: Velero backup {{ $labels.schedule }} has {{ $value | humanizePercentage }} partially failed backups.
          expr: |-
            velero_backup_partial_failure_total{schedule!=""} / velero_backup_attempt_total{schedule!=""} > 0.25
          for: 15m
          labels:
            severity: warning
        - alert: VeleroBackupFailures
          annotations:
            message: Velero backup {{ $labels.schedule }} has {{ $value | humanizePercentage }} failed backups.
          expr: |-
            velero_backup_failure_total{schedule!=""} / velero_backup_attempt_total{schedule!=""} > 0.25
          for: 15m
          labels:
            severity: warning

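
To make the 0.25 threshold concrete, here is a minimal Python sketch of the arithmetic behind both expressions. It is only an illustration: the function name `exceeds_threshold` is hypothetical, the `_total` metrics are assumed to be cumulative counts per schedule, and the handling of zero attempts is a simplification rather than exact PromQL semantics.

```python
def exceeds_threshold(failed: float, attempts: float, threshold: float = 0.25) -> bool:
    """Mirror `failed_total / attempt_total > threshold` for a single schedule."""
    if attempts == 0:
        # No attempts recorded yet: treat as "no alert" (a simplification in this sketch).
        return False
    # Strictly greater than, as in the rule: exactly 25% does not match.
    return failed / attempts > threshold


# Hypothetical counter values for one schedule.
print(exceeds_threshold(failed=1, attempts=4))  # ratio 0.25, not above the threshold
print(exceeds_threshold(failed=2, attempts=4))  # ratio 0.50, above the threshold
```

In the rule itself, Prometheus evaluates this ratio per `schedule` label, and `for: 15m` requires the condition to hold for 15 minutes before the alert fires.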
However, this requires more moving parts. The goal of Velero Sentinel is to provide the simplest possible solution.
Velero Sentinel is still a work in progress. Although it only needs read access to your cluster, proceed with caution!