---
date: 2017-03-14
title: Consul, Prometheus and Puppet
category: devops
---

Recently I've been playing around with Prometheus. For now I think it is the best open-source solution for monitoring (in the same way that chlamydia is probably the best STD). Previously I was a fan of Sensu, but honestly Sensu has too many moving parts that can go wrong, which meant they inevitably did.

So, why do I like Prometheus? It stays pretty close to the UNIX philosophy of doing one thing and doing it well: at its core it is just a time-series database. Alerting, for example, is a separate module, and graphing is pretty much left to Grafana. Initially I was not taken by it, for one simple reason:

- All its configuration is central.

Unlike with Sensu, a node cannot announce itself to the Prometheus server and then be automatically monitored. In this day and age, that sucks. However, while browsing the docs I discovered that it supports [service discovery](https://prometheus.io/blog/2015/06/01/advanced-service-discovery/).

So the process:

- Use Puppet to configure Prometheus
- Individual nodes announce to Consul what services they have
- Prometheus collects its endpoints from Consul

This looks something like this:

*(Diagram: the Puppet server, the Prometheus/Consul server, and the monitored nodes; each node registers in Consul and Prometheus pulls its target list from there.)*

Here is a network of 5 machines:

- Prometheus (also the Consul server)
- Puppet
- 3 that will be monitored

This is a very simple Consul cluster. Normally one would have at least 3 masters (ideally more) spread across different datacentres. It works for this demo though.

Right, let's jump straight into the Puppet code. I am using the classic 'Roles and Profiles' pattern. You can find my control repo [here](https://gogs.chriscowley.me.uk/Puppet/controlrepo). A few Puppet modules are necessary, so your `Puppetfile` will contain:

```
forge 'http://forge.puppetlabs.com'

mod 'KyleAnderson/consul', '2.1.0'
mod 'puppet/archive', '1.3.0'
mod 'puppetlabs/stdlib', '4.15.0'
mod 'puppetlabs/firewall', '1.8.2'

mod 'prometheus',
  :git => 'https://github.com/voxpupuli/puppet-prometheus.git'
```

To begin with, let's install [Node Exporter](https://prometheus.io/download/#node_exporter) everywhere. This will collect basic system stats and make them available to Prometheus.

In `common.yaml`:

```
---
prometheus::node_exporter::version: '0.13.0'
```

and in your `profile::base`:

```
class profile::base {
  include ::prometheus::node_exporter

  firewall { '102 node exporter':
    dport  => 9100,
    proto  => tcp,
    action => accept,
  }
}
```

Consul needs to be on every node, and each node needs to announce to it that the node exporter is there, so in your base profile:

```
class profile::base {
  include ::consul

  firewall { '103 Consul':
    dport  => [8400, 8500],
    proto  => tcp,
    action => accept,
  }
}
```

And in `common.yaml`:

```
---
consul::version: 0.7.4
consul::config_hash:
  data_dir: '/opt/consul'
  datacenter: 'homelab'
  log_level: 'INFO'
  node_name: "%{::hostname}"
  retry_join:
    - 192.168.1.89
consul::services:
  node_exporter:
    address: "%{::fqdn}"
    checks:
      - http: http://localhost:9100
        interval: 10s
    port: 9100
    tags:
      - monitoring
```

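Under the hood, that `consul::services` entry becomes an agent service definition. For a node called `node1.example.com` (hostname illustrative), the rendered JSON would look roughly like this sketch:

```
{
  "service": {
    "name": "node_exporter",
    "address": "node1.example.com",
    "port": 9100,
    "tags": ["monitoring"],
    "checks": [
      {
        "http": "http://localhost:9100",
        "interval": "10s"
      }
    ]
  }
}
```

The HTTP check means Consul will only advertise instances whose node exporter actually responds.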
Obviously, modify `retry_join` to suit your infrastructure. If you are doing the right thing and have a cluster, just expand the array with the other servers.

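For example, a hypothetical three-server cluster (addresses illustrative) would carry:

```
consul::config_hash:
  retry_join:
    - 192.168.1.89
    - 192.168.1.90
    - 192.168.1.91
```
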
For the Consul master, create a profile that contains:

```
class profile::consulmaster {
  firewall { '102 consul inbound':
    dport  => [8300, 8301, 8302, 8600],
    proto  => tcp,
    action => accept,
  }
}
```

You need the following in Hiera applied to that node (or nodes):

```
---
consul::version: 0.7.4
consul::config_hash:
  bootstrap_expect: 1
  data_dir: '/opt/consul'
  datacenter: 'homelab'
  log_level: 'INFO'
  server: true
  node_name: "%{::hostname}"
```

Change `bootstrap_expect` to match what you need.

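For the three-server cluster sketched earlier, every server would carry the same value, since Consul waits until that many servers have joined before bootstrapping the cluster:

```
consul::config_hash:
  bootstrap_expect: 3
```
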
To configure the Prometheus server itself, create `profile::prometheus`:

```
class profile::prometheus {
  firewall { '100 Prometheus inbound':
    dport  => [9090, 9093],
    proto  => tcp,
    action => accept,
  }

  class { 'prometheus':
    scrape_configs => [
      {
        'job_name'          => 'consul',
        'consul_sd_configs' => [
          {
            'server'   => 'localhost:8500',
            'services' => [
              'node_exporter',
            ],
          },
        ],
      },
    ],
  }
}
```

This will create a scrape config that queries Consul for all services named 'node_exporter'.

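For reference, the resulting `prometheus.yml` should contain a scrape config roughly like this (a sketch of what the module renders, not verbatim output):

```
scrape_configs:
  - job_name: 'consul'
    consul_sd_configs:
      - server: 'localhost:8500'
        services:
          - 'node_exporter'
```

Prometheus will then scrape every instance of the `node_exporter` service that Consul returns, so new nodes show up as targets without touching the Prometheus config.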
Finally, the Hiera for your Prometheus node will look like:

```
---
classes:
  - profile::prometheus

prometheus::version: '1.5.0'
```

That is it!

As an aside, the basic ideas here are based on Gareth Rushgrove's excellent presentation about having two different speeds of configuration management. Puppet is the slow and stable speed and, in parallel, Consul gives another path that is much more reactive.

{% youtube XfSrc_sAm2c %}