Prometheus and Grafana Integration with HeyOnCall

In this step-by-step guide I cover how to integrate HeyOnCall with a Prometheus Alertmanager instance. The integration takes only a few minutes, after which you can use Prometheus's extensive feature set in your website monitoring and on-call rotations.

Author
Humberto Evans

Prometheus is a popular open-source monitoring solution that lets you aggregate metrics exposed by your infrastructure and applications. If you are willing to put in the work to configure and maintain it, it can be a great alternative to expensive Application Performance Monitoring solutions like Datadog and Better Uptime. At HeyOnCall we like to keep our operations as lean as possible, so we are big fans of Prometheus and Grafana for monitoring. If you are new to Prometheus, I suggest you follow our guide to a minimal Prometheus setup.

HeyOnCall integrates natively with Prometheus Alertmanager through Alertmanager's webhook receiver. Once it is set up, you can route Prometheus alerts to a HeyOnCall trigger, and Prometheus will even resolve the triggered alert automatically once the alerting condition drops back below its threshold.

Set up HeyOnCall

The first step is to set up the HeyOnCall trigger that will receive a webhook from your Prometheus Alertmanager instance. Create a new trigger and choose Prometheus from the mechanism dropdown.

[Screenshot: Prometheus HeyOnCall trigger]

We also need to create an API key so that Prometheus can authenticate with HeyOnCall. Click API Keys on your organization's home page and create a new API key for Prometheus.

[Screenshot: Prometheus HeyOnCall API key created]

Note the API key and the Trigger ID. We will use both of these in the Prometheus configuration coming up.

Create a new receiver in Prometheus Alertmanager

Now head over to where you keep your Alertmanager configuration. On Kubernetes this is likely a ConfigMap definition; if you run Alertmanager directly, it is a file on that instance. Somewhere in the configuration file you should have a receivers: key holding the list of receivers that alerts can be routed to. Add a new receiver that looks like this:

receivers:
  - name: heyoncall
    webhook_configs:
      - url: "https://api.heyoncall.com/triggers/YOUR-TRIGGER-ID-HERE/prometheus"
        http_config:
          follow_redirects: false
          authorization:
            credentials: "YOUR-HEYONCALL-API-KEY-HERE"

This block follows Alertmanager's webhook_config specification; see the Alertmanager documentation for further customization. If your configuration lives in version control, we recommend using credentials_file instead of credentials so that the HeyOnCall API key is not committed alongside it.
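As a minimal sketch of the credentials_file approach, the same receiver could read the key from a file on disk. The path below is an assumption for illustration only; use whatever location your deployment mounts secrets at.

```yaml
receivers:
  - name: heyoncall
    webhook_configs:
      - url: "https://api.heyoncall.com/triggers/YOUR-TRIGGER-ID-HERE/prometheus"
        http_config:
          follow_redirects: false
          authorization:
            # Hypothetical path: point this at a file that contains only
            # the HeyOnCall API key and is readable by Alertmanager.
            credentials_file: /etc/alertmanager/secrets/heyoncall-api-key
```

On Kubernetes, such a file would typically come from a Secret mounted as a volume into the Alertmanager pod, which keeps the key out of the ConfigMap.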

Route an alert to your new receiver

Now we need to set up a route: entry that sends alerts to the receiver we just created. Alertmanager uses a "tree" of routes in which the top-level route is the root node. Every alert enters at the root and travels down the tree, and it is delivered to the receiver of the first child route whose matchers it satisfies (or to further siblings as well if that route sets continue: true). Alerts that match no child route are handled by the root route's receiver. Like many configuration options in Prometheus, this is often overkill for small to medium-sized teams. In the example below we set the root route to our heyoncall receiver, so that every alert is sent to the HeyOnCall trigger.

route:
  receiver: heyoncall
  group_wait: 10s
  repeat_interval: 30m
  routes: []

If you have multiple HeyOnCall receivers, you can add child routes, one per HeyOnCall trigger, and match on a label of the alert to route it to the right team.
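A rough sketch of such a multi-team setup might look like the following. The receiver names heyoncall-backend and heyoncall-database, the team label, and its values are all hypothetical; they assume that your alerting rules attach a team label and that you have defined one receiver per HeyOnCall trigger in the receivers: list.

```yaml
route:
  # Fallback: alerts that match no child route go to the default trigger.
  receiver: heyoncall
  group_wait: 10s
  repeat_interval: 30m
  routes:
    # Hypothetical label and values: assumes alerts carry a `team` label.
    - receiver: heyoncall-backend
      matchers:
        - 'team="backend"'
    - receiver: heyoncall-database
      matchers:
        - 'team="database"'
```

Each child receiver would be defined like the heyoncall receiver shown earlier, with its own trigger ID in the URL.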

Done

That's it! With this simple configuration, alerts are routed to whoever is currently on call for the Service you set up in HeyOnCall. If you have a more complicated use case that our current integration does not cover, don't hesitate to reach out to us at hey@heyoncall.com. We will be happy to help.

Remember that monitoring through metrics is only part of a comprehensive uptime and monitoring plan. In addition to gathering metrics from your infrastructure you should also be monitoring uptime from an external source.

Happy Monitoring!