---
date: 2017-03-14
title: Consul, Prometheus and Puppet
category: devops
---

Recently I've been playing around with Prometheus. For now I think it is the best open-source solution for monitoring (in the same way that chlamydia is probably the best STD). Previously I was a fan of Sensu, but honestly Sensu has too many moving parts that can go wrong, which meant they inevitably did.

So, why do I like Prometheus? It stays pretty close to the UNIX philosophy of doing one thing and doing it well: at its core it is just a time-series database. Alerting, for example, is a separate module, and graphing is pretty much left to Grafana. Initially I was not taken by it, for one simple reason:

- All its configuration is central.

Unlike with Sensu, a node cannot announce itself to the Prometheus server and then be automatically monitored. In this day and age, that sucks. However, while browsing the docs I discovered that it supports [service discovery](https://prometheus.io/blog/2015/06/01/advanced-service-discovery/).

So the process:

- Use Puppet to configure Prometheus
- Individual nodes announce to Consul what services they have
- Prometheus collects its endpoints from Consul

This looks something like this:

*(Diagram: the Puppet server, the Prometheus/Consul server, and the monitored nodes; each node registers in Consul and Prometheus pulls its target list from there.)*

Here is a network of 5 machines:

- Prometheus (also the Consul server)
- Puppet
- 3 that will be monitored

This is a very simple Consul cluster. Normally one would have at least 3 masters (ideally more) spread across different datacentres. It works for this demo though.

Right, let's jump straight into the Puppet code. I am using the classic 'Roles and Profiles' pattern. You can find my control repo [here](https://gogs.chriscowley.me.uk/Puppet/controlrepo). A few Puppet modules are necessary, so your `Puppetfile` will contain:

```
forge 'http://forge.puppetlabs.com'

mod 'KyleAnderson/consul', '2.1.0'
mod 'puppet/archive', '1.3.0'
mod 'puppetlabs/stdlib', '4.15.0'
mod 'puppetlabs/firewall', '1.8.2'

mod 'prometheus',
  :git => 'https://github.com/voxpupuli/puppet-prometheus.git'
```

To begin with, let's install [Node Exporter](https://prometheus.io/download/#node_exporter) everywhere. This will collect basic system stats and make them available to Prometheus.

In `common.yaml`:

```
---
prometheus::node_exporter::version: '0.13.0'
```

and in your `profile::base`:

```
class profile::base {
  include ::prometheus::node_exporter

  firewall { '102 node exporter':
    dport  => 9100,
    proto  => tcp,
    action => accept,
  }
}
```

Consul needs to be on every node, and each node needs to announce to it that the node exporter is there, so in your base profile:

```
class profile::base {
  include ::consul

  firewall { '103 Consul':
    dport  => [8400, 8500],
    proto  => tcp,
    action => accept,
  }
}
```

And in `common.yaml`:

```
---
consul::version: 0.7.4
consul::config_hash:
  data_dir: '/opt/consul'
  datacenter: 'homelab'
  log_level: 'INFO'
  node_name: "%{::hostname}"
  retry_join:
    - 192.168.1.89
consul::services:
  node_exporter:
    address: "%{::fqdn}"
    checks:
      - http: http://localhost:9100
        interval: 10s
    port: 9100
    tags:
      - monitoring
```

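Under the hood, that `consul::services` entry becomes an agent service definition. For a node called `node1.example.com` (hostname illustrative), the rendered JSON would look roughly like this sketch:

```
{
  "service": {
    "name": "node_exporter",
    "address": "node1.example.com",
    "port": 9100,
    "tags": ["monitoring"],
    "checks": [
      {
        "http": "http://localhost:9100",
        "interval": "10s"
      }
    ]
  }
}
```

The HTTP check means Consul will only advertise instances whose node exporter actually responds.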
Obviously, modify `retry_join` to suit your infrastructure. If you are doing the right thing and have a cluster, just expand the array with the other servers.

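For example, a hypothetical three-server cluster (addresses illustrative) would carry:

```
consul::config_hash:
  retry_join:
    - 192.168.1.89
    - 192.168.1.90
    - 192.168.1.91
```
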
For the Consul master, create a profile that contains:

```
class profile::consulmaster {
  firewall { '102 consul inbound':
    dport  => [8300, 8301, 8302, 8600],
    proto  => tcp,
    action => accept,
  }
}
```

You need the following in Hiera applied to that node (or nodes):

```
---
consul::version: 0.7.4
consul::config_hash:
  bootstrap_expect: 1
  data_dir: '/opt/consul'
  datacenter: 'homelab'
  log_level: 'INFO'
  server: true
  node_name: "%{::hostname}"
```

Change `bootstrap_expect` to match what you need.

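For the three-server cluster sketched earlier, every server would carry the same value, since Consul waits until that many servers have joined before bootstrapping the cluster:

```
consul::config_hash:
  bootstrap_expect: 3
```
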
To configure the Prometheus server itself, create `profile::prometheus`:

```
class profile::prometheus {
  firewall { '100 Prometheus inbound':
    dport  => [9090, 9093],
    proto  => tcp,
    action => accept,
  }

  class { 'prometheus':
    scrape_configs => [
      {
        'job_name'          => 'consul',
        'consul_sd_configs' => [
          {
            'server'   => 'localhost:8500',
            'services' => [
              'node_exporter',
            ],
          },
        ],
      },
    ],
  }
}
```

This will create a scrape config that queries Consul for all services named 'node_exporter'.

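For reference, the resulting `prometheus.yml` should contain a scrape config roughly like this (a sketch of what the module renders, not verbatim output):

```
scrape_configs:
  - job_name: 'consul'
    consul_sd_configs:
      - server: 'localhost:8500'
        services:
          - 'node_exporter'
```

Prometheus will then scrape every instance of the `node_exporter` service that Consul returns, so new nodes show up as targets without touching the Prometheus config.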
Finally, the Hiera for your Prometheus node will look like:

```
---
classes:
  - profile::prometheus

prometheus::version: '1.5.0'
```

That is it!

As an aside, the basic ideas here are based on Gareth Rushgrove's excellent presentation about having two different speeds of configuration management. Puppet is the slow and stable speed and, in parallel, Consul gives another path that is much more reactive.

{% youtube XfSrc_sAm2c %}