Alerting vs Reporting vs Dashboarding
A recurring topic in observability (at least for the author) is the distinction between what should be an alert and what should be a report. Somewhere between these two extremes is what should be a dashboard. Some information will sit on a spectrum between these categories, but in most cases bucketing information into alerts, reports and dashboards provides enough clarity to decide where to draw the line.
What is an alert
An alert should meet three criteria:
- It is causing an immediate effect on a service level
- It is specifically able to be acted on
- Lack of immediate action will worsen the impact on the service level
Anything that does not meet all three of these criteria does not merit an alert.
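The three criteria above can be read as a simple all-or-nothing check. The following is a minimal sketch, not a definitive implementation: the `Signal` type, its field names and the `merits_alert` function are hypothetical, introduced here only to illustrate the rule.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Signal:
    """A hypothetical observed condition; the field names are illustrative only."""
    affects_service_level_now: bool   # criterion 1: immediate effect on a service level
    has_specific_action: bool         # criterion 2: someone can act on it specifically
    worsens_without_action: bool      # criterion 3: inaction worsens the impact


def merits_alert(signal: Signal) -> bool:
    """Return True only when all three alert criteria hold."""
    return (
        signal.affects_service_level_now
        and signal.has_specific_action
        and signal.worsens_without_action
    )


# A condition that is actionable and affecting a service level,
# but will not get worse if left alone, fails the third criterion.
stable_degradation = Signal(True, True, False)
outage_in_progress = Signal(True, True, True)
```

Under these assumptions, `merits_alert(outage_in_progress)` is true while `merits_alert(stable_degradation)` is false; the latter would be a candidate for a dashboard or a report instead.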
It is causing an immediate effect on a service level
This is the first hurdle any alert should clear. If a condition is not affecting a service level commitment or metric, there is no need to alert on it. Critical services have service levels; therefore, if no service level is being breached, nothing critical is being affected.
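As a rough illustration of this first hurdle, the sketch below compares a measured value against a service level target. The service names, the target values and the choice of request success ratio as the metric are all assumptions made for the example; a service with no defined service level never passes the check, mirroring the reasoning above.

```python
# Hypothetical targets: minimum fraction of successful requests per service.
SERVICE_LEVEL_TARGETS = {
    "checkout": 0.999,
    "search": 0.99,
}


def affects_service_level(service: str, success_ratio: float) -> bool:
    """Return True when the measured success ratio is below the service's target.

    A service without a defined service level returns False: by the
    argument above, it is not critical enough to alert on.
    """
    target = SERVICE_LEVEL_TARGETS.get(service)
    if target is None:
        return False
    return success_ratio < target
```

With these example targets, a success ratio of 0.995 breaches the service level for `checkout` but not for `search`, and a service absent from the table never qualifies regardless of its measured value.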
It is specifically able to be acted on
Lack of immediate action will worsen the impact on the service level
What is a report
What is a dashboard