---
date: 2017-03-14
title: Consul Prometheus and Puppet
category: devops
---

Recently I've been playing around with Prometheus. For now I think it is the best open source solution for monitoring (in the same way that chlamydia is probably the best STD). Previously I was a fan of Sensu, but honestly there are just too many moving parts to go wrong with Sensu, which meant they inevitably did.

So, why do I like Prometheus? It stays pretty close to the UNIX philosophy of doing one thing and doing it well: at heart it is just a time-series database. Alerting is a separate module, for example, and graphing is pretty much left to Grafana.

Initially I was not taken by it for one simple reason:

- All its configuration is central.

Unlike with Sensu, a node cannot announce itself to the Prometheus server and then be automatically monitored. In this day and age, that sucks.

However, while browsing the docs I discovered that it supports [service discovery](https://prometheus.io/blog/2015/06/01/advanced-service-discovery/).

So the process:

- Use Puppet to configure Prometheus
- Individual nodes announce to Consul what services they have
- Prometheus collects its endpoints from Consul

In practice this looks something like the following network of 5 machines:

- Prometheus (also the Consul server)
- Puppet
- 3 that will be monitored

This is a very simple Consul cluster. Normally one would have at least 3 masters (ideally more) spread across different datacentres. It works for this demo though.

Right, let's jump straight into the Puppet code. I am using the classic 'Roles and Profiles' pattern. You can find my control repo [here](https://gogs.chriscowley.me.uk/Puppet/controlrepo).
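
If the pattern is new to you, a role is just a class that pulls a set of profiles together for one kind of machine. Here is a minimal sketch for the Prometheus/Consul server, using the profiles built in the rest of this post. The `role::monitoring` name is made up for illustration and is not necessarily what you will find in the control repo:

```
# Hypothetical role for the node that runs both Prometheus and the Consul server.
# The class name is an assumption; the profiles are the ones defined in this post.
class role::monitoring {
  include ::profile::base          # node exporter + Consul agent
  include ::profile::consulmaster  # firewall rules for the Consul server ports
  include ::profile::prometheus    # Prometheus itself
}
```

A role like this could then be assigned to the node instead of listing the individual profiles.
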

There are a few Puppet modules necessary, so your `Puppetfile` will contain:

```
forge 'http://forge.puppetlabs.com'

mod 'KyleAnderson/consul', '2.1.0'
mod 'puppet/archive', '1.3.0'
mod 'puppetlabs/stdlib', '4.15.0'
mod 'puppetlabs/firewall', '1.8.2'

mod 'prometheus',
  :git => 'https://github.com/voxpupuli/puppet-prometheus.git'
```

To begin with, let's install [Node Exporter](https://prometheus.io/download/#node_exporter) everywhere. This will collect basic system stats and make them available to Prometheus. In `common.yaml`:

```
---
prometheus::node_exporter::version: '0.13.0'
```

and in your `profile::base`:

```
class profile::base {
  include ::prometheus::node_exporter

  # Allow Prometheus to scrape the node exporter
  firewall { '102 node exporter':
    dport  => 9100,
    proto  => tcp,
    action => accept,
  }
}
```

Consul needs to be everywhere and you need to announce to it that the node exporter is there, so add the following to your base profile as well:

```
class profile::base {
  include ::consul

  # Consul RPC (8400) and HTTP API (8500)
  firewall { '103 Consul':
    dport  => [8400, 8500],
    proto  => tcp,
    action => accept,
  }
}
```

And in `common.yaml`:

```
---
consul::version: 0.7.4
consul::config_hash:
  data_dir: '/opt/consul'
  datacenter: 'homelab'
  log_level: 'INFO'
  node_name: "%{::hostname}"
  retry_join:
    - 192.168.1.89
consul::services:
  node_exporter:
    address: "%{::fqdn}"
    checks:
      - http: http://localhost:9100
        interval: 10s
    port: 9100
    tags:
      - monitoring
```

Obviously modify the `retry_join` to suit your infrastructure. If you are doing the right thing and have a cluster, just add more entries to the array.

For the Consul master create a profile that contains:

```
class profile::consulmaster {
  # Server RPC (8300), Serf LAN (8301), Serf WAN (8302) and DNS (8600)
  firewall { '102 consul inbound':
    dport  => [8300, 8301, 8302, 8600],
    proto  => tcp,
    action => accept,
  }
}
```

You need the following in Hiera applied to that node (or nodes):

```
---
consul::version: 0.7.4
consul::config_hash:
  bootstrap_expect: 1
  data_dir: '/opt/consul'
  datacenter: 'homelab'
  log_level: 'INFO'
  server: true
  node_name: "%{::hostname}"
```

Change `bootstrap_expect` to match what you need.
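
For example, a three-server cluster might look roughly like the following in the Hiera for each server node. Treat this as a sketch rather than something I have run here: the `.90` and `.91` addresses are invented for illustration (only `192.168.1.89` exists in my setup), and I am assuming the servers need their own `retry_join` list so that they can find each other.

```
---
consul::version: 0.7.4
consul::config_hash:
  # Wait until 3 servers have joined before electing a leader
  bootstrap_expect: 3
  data_dir: '/opt/consul'
  datacenter: 'homelab'
  log_level: 'INFO'
  server: true
  node_name: "%{::hostname}"
  retry_join:
    - 192.168.1.89
    - 192.168.1.90   # hypothetical second server
    - 192.168.1.91   # hypothetical third server
```

The same three addresses would then go into the `retry_join` array in `common.yaml`, so that agents can still join if one server is down.
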

To configure the Prometheus server itself create `profile::prometheus`:

```
class profile::prometheus {
  # Prometheus (9090) and Alertmanager (9093)
  firewall { '100 Prometheus inbound':
    dport  => [9090, 9093],
    proto  => tcp,
    action => accept,
  }

  class { 'prometheus':
    scrape_configs => [
      {
        'job_name'          => 'consul',
        'consul_sd_configs' => [
          {
            'server'   => 'localhost:8500',
            'services' => [
              'node_exporter',
            ],
          },
        ],
      },
    ],
  }
}
```

This will create a scrape config that queries Consul for all services named 'node_exporter'.

Finally, the Hiera for your Prometheus node will look like:

```
---
classes:
  - profile::prometheus
prometheus::version: '1.5.0'
```

That is it!

As an aside, the basic ideas here are based on Gareth Rushgrove's excellent presentation about having 2 different speeds of configuration management. Basically, Puppet is the slow and stable speed, while Consul runs in parallel and gives another path that is much more reactive.

{% youtube XfSrc_sAm2c %}