Puppet

Puppet is a configuration management system written in Ruby with C components. It uses its own domain-specific language to model resources.

Concepts

Puppet is designed to operate in a client-server configuration, with a Master and a number of Nodes. Nodes communicate with the Master over TCP port 8140.

  • Resources declare the desired states of packages, files, services, etc.
    • Resource Types define the structure of a resource.
    • Resource Definitions are instances of these types.
  • Providers are Ruby modules which perform the heavy lifting behind the scenes.
  • Facts provide data about Nodes used for targeting.
  • A Catalog is the compiled set of resources, and the dependencies between them, to be applied to a single Node.

Configuration Runs

An application of these resources to a node is called a "configuration run". At a high level, these involve:

  • A Node connecting to the Master
  • The Node sending facts about itself to the Master
  • The Master classifying the Node according to its facts
  • The Master sending a catalog, describing the desired state of the Node and dependency data
  • The Node applying the catalog and returning data to the Master

Components

  • Facter provides data collection for targeting.
  • Hiera offers storage for Node-specific data.
  • Vanagon packages up the different components.

Installation

For Linux systems, Puppet is shipped in multiple packages:

  • puppet contains just the agent.
  • puppet-server contains both the agent and server.

Install the appropriate release configuration for the system package manager, update the local package cache and install the appropriate packages.

On Windows, configuration is stored under C:\ProgramData\PuppetLabs\puppet\etc.

A note of caution

The Puppetlabs packages run on WEBrick by default, which wasn't really intended for production use. 😱

Configuration

Puppet configuration is stored under a single directory:

  • /etc/puppet:
    • puppet.conf:
      • [master]
        • environmentpath
        • basemodulepath
      • [agent]
        • runinterval sets the number of seconds between catalog runs.
        • server sets the Master to connect to.
      • [main] sets defaults for both sections where no specific value is set.
    • manifests/:
      • site.pp contains the site manifest, limiting you to one environment for all Nodes.
    • environments/*/:
      • environment.conf describes the environment
        • modulepath
        • environment_timeout
      • manifests/
        • nodes.pp
      • modules/

Authentication

Master-Agent communication takes place over a TLS session, authenticated using certificates which are automatically generated at startup. This means that DNS names are important: ensure all appropriate names are known to the server before it first starts, otherwise you'll have a lot of manual key exchange to redo.

Each Node's key will need to be signed by the Master key before the Node can complete the authentication process, ensuring that rogue systems can't retrieve configuration without prior approval. Key management is done with the puppet cert command. List the keys awaiting signing:

sudo puppet cert list

Sign the key for the Node named foo:

sudo puppet cert sign foo

Usage

Matchers select the Nodes to which a piece of configuration will be applied. Resources are named by type and a unique identifier, then their parameters are specified as a set of key-value pairs.

node matcher {
  resourceType { identifier:
    key => value,
  }
}

For example:

node 'node' {
  package { 'apache2':
    ensure => 'installed',
  }
}
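A matcher need not be a single literal name. As a rough sketch (the hostnames and the postgresql package are made up for illustration), a node block can match by exact name, by regular expression, or act as a fallback:

```puppet
# Exact match on the Node's name (hypothetical hostname).
node 'web01.example.com' {
  package { 'apache2':
    ensure => 'installed',
  }
}

# Regular expression: matches db1.example.com, db2.example.com, and so on.
node /^db\d+\.example\.com$/ {
  package { 'postgresql':
    ensure => 'installed',
  }
}

# Fallback for any Node not matched above.
node default {
}
```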

Variables can reduce duplication:

$tools = ['git', 'vim']

package { $tools:
  ensure => 'installed',
}

Resources can be referenced using ResourceType['id'] (or ResourceType[$id]).
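A minimal sketch of a reference in use, assuming a Debian-style apache2 package and service. Note that the type name is capitalised in the reference, while the declaration itself is lower case:

```puppet
package { 'apache2':
  ensure => 'installed',
}

# The service refers to the package declared above by type and identifier.
service { 'apache2':
  ensure  => 'running',
  require => Package['apache2'],
}
```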

Selectors are like switch statements, facilitating conditional assignment based on some other value.

$somevalue = $osfamily ? {
  'debian' => 'deb',
  'redhat' => 'el',
  default  => 'generic',
}

Classes allow reuse of common resource declarations. First, declare the class:

class base {
  # resources here
}

Then use it:

node default {
  class { 'base': }
}

This clunkier syntax can be replaced with one of the include, require, contain and hiera_include functions -- see below.
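For instance, the node block above could be written with include instead. This is only a sketch; the vim package stands in for whatever the base class would really manage:

```puppet
class base {
  package { 'vim':
    ensure => 'installed',
  }
}

node default {
  # Unlike class { 'base': }, include may be repeated for the same
  # class without causing a duplicate declaration error.
  include base
}
```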

Variable and module scopes

To prevent accidental overwriting of values, Puppet encloses variables and modules in scopes. The root scope is referred to as the top scope, and node and class blocks are also considered separate logical scopes.

To use classes declared in other scopes, anchor the class reference to the root scope by preceding its name with ::, e.g. ::apache2.
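The same :: prefix works for variables. A small sketch of how scopes shadow one another; the class and variable names here are hypothetical:

```puppet
# Top scope variable.
$greeting = 'top scope'

class example {
  # Local to the class; shadows the top scope variable of the same name.
  $greeting = 'class scope'

  notify { 'local':
    message => $greeting,     # resolves to 'class scope'
  }

  notify { 'top':
    message => $::greeting,   # resolves to 'top scope'
  }
}

include example
```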

Ordering and dependencies

Resources have meta-parameters:

  • subscribe makes a resource depend on another and refreshes it (e.g. restarts a service) whenever changes are applied to that dependency.

Where dependencies can't be reliably determined, we can specify them explicitly, e.g.:

Example['first'] -> ResourceType['second']
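A fuller sketch of explicit ordering, assuming a hypothetical apache2 module that ships the configuration file. Besides ->, Puppet has a ~> arrow which additionally refreshes the right-hand resource, giving the same effect as setting subscribe on the service:

```puppet
package { 'apache2':
  ensure => 'installed',
}

file { '/etc/apache2/apache2.conf':
  ensure => 'file',
  source => 'puppet:///modules/apache2/apache2.conf',  # hypothetical module file
}

service { 'apache2':
  ensure => 'running',
}

# Install the package, then manage the file, then refresh the service
# whenever the file changes.
Package['apache2'] -> File['/etc/apache2/apache2.conf'] ~> Service['apache2']
```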

Common functions

  • hiera($key) retrieves the named key from its configuration store.
  • inline_template($text) allows rendering ERB style templates contained in a string, useful for setting file contents. Variables within the parent scope are accessible by default. For larger templates, use separate *.erb templates.
  • template("module/template.erb") behaves like inline_template(), but allows rendering ERB templates in external files.
  • include name is analogous to class { 'name': }.
  • hiera_include($key) includes the classes named in the specified Hiera key.
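The two template functions might be used as follows. This is a sketch only: the dns module and its template are hypothetical, and fqdn is assumed to be available as a fact. Inside ERB, Puppet variables are read as instance variables:

```puppet
# Short content: render an ERB string inline.
file { '/etc/motd':
  ensure  => 'file',
  content => inline_template("Managed by Puppet on <%= @fqdn %>\n"),
}

# Longer content: render templates/resolv.conf.erb from a hypothetical
# module named "dns".
file { '/etc/resolv.conf':
  ensure  => 'file',
  content => template('dns/resolv.conf.erb'),
}
```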

Common resources

  • File ensures files exist (present) or removes them (absent). On change, it'll make a copy of the file in the Filebucket, allowing later retrieval with the puppet filebucket command.
  • Package installs or removes packages.
  • Service manages system services, ensuring they're running or stopped and altering their enabled state (enable).
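A brief sketch of all three types together; the paths, package and service names are illustrative only:

```puppet
# Ensure a file exists with the given content.
file { '/etc/motd':
  ensure  => 'present',
  content => "Managed by Puppet\n",
}

# Ensure a file does not exist (hypothetical path).
file { '/tmp/obsolete.conf':
  ensure => 'absent',
}

# Remove a package.
package { 'telnet':
  ensure => 'absent',
}

# Keep a service running now and enabled at boot.
service { 'ssh':
  ensure => 'running',
  enable => true,
}
```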

Applying configuration

Trigger a single configuration run on a Node with puppet agent --onetime.

Modules

Modules are reusable configuration, stored on the Puppet master. Community-contributed modules can be downloaded from The Puppet Forge.

Community modules can be installed using the CLI:

puppet module install --modulepath /etc/puppet/environments/production/modules author-module

Scaffold an empty module using the CLI:

puppet module generate author-module --environment env

This will generate a subset of the following structure:

  • README.md
  • Gemfile
  • facts.d/ contains external facts, allowing fact values to be supplied by external tools which emit them as key-value pairs (x=y).
  • files/ contains static files to be pushed to Nodes.
  • lib/ contains custom facts.
  • manifests/ is home to the Puppet configuration:
    • init.pp will be automatically loaded.
  • metadata.json
  • Rakefile
  • spec/ contains RSpec tests:
    • classes/:
      • init_spec.rb
    • spec_helper.rb
  • templates/ contains dynamic templates.
  • tests/:
    • init.pp
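To show how these directories fit together, here is a sketch of manifests/init.pp for a hypothetical module named motd. The class shares the module's name, which is what allows it to be loaded automatically; the file names under templates/ and files/ are assumptions:

```puppet
# manifests/init.pp
class motd {
  # Rendered from templates/motd.erb within this module.
  file { '/etc/motd':
    ensure  => 'file',
    content => template('motd/motd.erb'),
  }

  # Copied verbatim from files/issue within this module.
  file { '/etc/issue':
    ensure => 'file',
    source => 'puppet:///modules/motd/issue',
  }
}
```

A Node would then pick this up with include motd.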

Hiera

Hiera eases scaling Puppet configurations by extracting Node-specific data into separate storage (configured using data sources it calls backends). By default it uses file storage in the /var/lib/hiera directory, but this can be changed by modifying its configuration, stored in /etc/puppet/hiera.yaml.
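From a manifest, Hiera data is read with the lookup functions. A minimal sketch, assuming hypothetical keys named apache_port and classes exist in the data:

```puppet
# Look up a single value, falling back to 80 if no data source defines it.
$apache_port = hiera('apache_port', 80)

node default {
  # Include every class listed under the 'classes' key.
  hiera_include('classes')
}
```

This keeps the manifest generic, with the per-Node differences living in the data.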

Roles and profiles

This commonly used pattern separates a role (a description of what a server does) from its constituent profiles (components, e.g. a runtime or application dependency). A role is a class in the root of a module, and profiles are represented as subclasses of these classes. Roles can include profiles.
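One widespread way of laying this out, which may differ in detail from the arrangement described above, is a pair of modules named role and profile. All class, package and host names below are hypothetical:

```puppet
# profile/manifests/base.pp
class profile::base {
  package { ['git', 'vim']:
    ensure => 'installed',
  }
}

# profile/manifests/apache.pp
class profile::apache {
  package { 'apache2':
    ensure => 'installed',
  }

  service { 'apache2':
    ensure  => 'running',
    require => Package['apache2'],
  }
}

# role/manifests/webserver.pp
class role::webserver {
  include profile::base
  include profile::apache
}

# site.pp: each Node is assigned exactly one role.
node 'web01.example.com' {
  include role::webserver
}
```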
