How to set up a puppet infrastructure

What is puppet and why would I need it?

Puppet is an open source configuration management tool. It lets you manage a large number of servers without writing custom setup and maintenance scripts for each server or group of servers.
It’s very powerful, and ready-made modules exist for nearly every task you are likely to face.

What’s this about

In this post I’ll cover how to set up a “puppet master” (control server), manage different environments on it (for teams and/or test/production) with r10k, write a simple module, connect a host to the master and configure that host automatically.
The master will run Puppet 5, so we can fully utilize the power of hiera, which allows us to separate data (configs) from code (modules).

Pre-requisites

  • I’m using a VM with Debian 9 and 4GB RAM. Once you start connecting many servers, though, you’ll need much more RAM.
  • With puppet it is important that the clocks on all hosts are in sync, because certificate validation depends on it, so set up ntp everywhere (apt-get install ntp should take care of everything).
  • Make sure the master can resolve its own name (/etc/hosts: 127.0.1.1 puppetmaster.your.domain puppetmaster).
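
The last point can be checked with a few lines of shell. This is only a sketch: hosts_has_name is a helper name I made up, and it only inspects a hosts file, it does not query DNS.

```shell
# Hypothetical helper (not part of puppet): succeeds if the given hosts file
# lists the given name as a host name or alias.
hosts_has_name() {
    awk -v name="$2" '
        { sub(/#.*/, "") }                       # drop comments
        { for (i = 2; i <= NF; i++)              # field 1 is the address
              if ($i == name) found = 1 }
        END { exit(found ? 0 : 1) }
    ' "$1"
}

# Usage on the master; the name is the example name from the list above.
if hosts_has_name /etc/hosts puppetmaster.your.domain; then
    echo "master name is listed in /etc/hosts"
else
    echo "add the master to /etc/hosts (or make sure DNS resolves it)"
fi
```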

Installation

echo "deb http://apt.puppetlabs.com stretch puppet" >> /etc/apt/sources.list.d/puppet.list
wget apt.puppetlabs.com/pubkey.gpg
apt-key add pubkey.gpg
apt-get update
apt-get install puppet-master-passenger

We’ll run the puppetmaster under Apache with Passenger here, which is more than sufficient even for big environments.
puppetserver is the future, though, as it scales better in very large environments.

Setup

root@puppetmastertest:~# ls /etc/puppet/
auth.conf  code  hiera.yaml  puppet.conf

You’ll find the above structure in /etc/puppet.
hiera.yaml describes the hierarchy puppet searches when it looks up host (or host group) information. Delete it for now; we’ll create one per environment later.
auth.conf controls which catalog and ssl information your hosts are allowed to access (leave it as it is, the defaults are sufficient for now).
puppet.conf contains your log and ssl paths, a facter path and more (all settings are described at https://puppet.com/docs/puppet/5.3/config_file_main.html). Important for now are the log and ssl locations and the path for facter.
(Enter the example from the next box into your puppet.conf.)
The directory code contains modules and hieradata. Normally everything is stored right there, but we’ll use several environments, so we’ll first have to create the right structure.

[main]
logdir=/var/log/puppet
vardir=/var/lib/puppet
ssldir=/var/lib/puppet/ssl
rundir=/var/run/puppet
factpath=$vardir/lib/facter

[master]
vardir = /var/lib/puppet
cadir  = /var/lib/puppet/ssl/ca
ssl_client_header = SSL_CLIENT_S_DN
ssl_client_verify_header = SSL_CLIENT_VERIFY
certname = puppetmaster
dns_alt_names = puppetmaster, puppetmaster.your.domain, puppetserver.your.domain

The [main] section is pretty self-explanatory: logs are stored in logdir, and most other “stuff” puppet needs to function is stored in vardir or directories below it.
SSL certificates (as well as requests) from the clients are stored in ssldir (we’ll come to that later), the reports for the puppet runs are stored under $vardir/state, and so on. factpath is where puppet looks for custom facts.
The main section can be the same for master and clients, but it does not have to be.

The [master] section applies only to the master itself.
Important here are certname and dns_alt_names, because the puppetmaster brings its own CA (you can also use your own, but we won’t cover that here) and issues its own certificate with these names. Set dns_alt_names in particular to every name the clients may use to reach the master.

Connecting a client

echo "deb http://apt.puppetlabs.com stretch puppet" >> /etc/apt/sources.list.d/puppet.list
wget apt.puppetlabs.com/pubkey.gpg
apt-key add pubkey.gpg
apt-get update
apt-get install puppet

Let’s connect our first client and leave the directories for now. We’ll cover them once this works.

[main]
server = puppetmaster.your.domain
certname = puppettest1.your.domain
environment = production

Delete the [master] section on the client.
server is the FQDN of our puppetmaster.
certname is the name we want in the client’s ssl certificate. The master exposes it as the trusted fact trusted.certname, which will be important later (e.g. in hiera.yaml).
environment is production for now, because as long as we haven’t configured the master otherwise, this is the only environment available. We’ll configure at least two later, though: a testing and a production environment.

root@puppettest1:~# puppet agent -vot
Info: Caching certificate for ca
Info: csr_attributes file loading from /etc/puppet/csr_attributes.yaml
Info: Creating a new SSL certificate request for puppettest1.your.domain
Info: Certificate Request fingerprint (SHA256): E5:71:47:CE:94:9B:A9:DC:DC:37:B7:92:89:BA:DD:78:75:D5:CC:72:06:A5:32:AF:83:8D:B0:5A:E9:81:3F:88
Info: Caching certificate for ca
Exiting; no certificate found and waitforcert is disabled

Use the above command to connect the client to the puppetmaster. Puppet automatically creates a certificate signing request (CSR).
You can view and sign it on the master like this:

puppet cert list
  "puppettest1.your.domain" (SHA256) E5:71:47:CE:94:9B:A9:DC:DC:37:B7:92:89:BA:DD:78:75:D5:CC:72:06:A5:32:AF:83:8D:B0:5A:E9:81:3F:88

root@puppetmaster:~# puppet cert sign puppettest1.your.domain
Signing Certificate Request for:
  "puppettest1.your.domain" (SHA256) E5:71:47:CE:94:9B:A9:DC:DC:37:B7:92:89:BA:DD:78:75:D5:CC:72:06:A5:32:AF:83:8D:B0:5A:E9:81:3F:88
Notice: Signed certificate request for puppettest1.your.domain
Notice: Removing file Puppet::SSL::CertificateRequest puppettest1.your.domain at '/var/lib/puppet/ssl/ca/requests/puppettest1.your.domain.pem'

If something goes wrong with signing (or you need to migrate a node), run “puppet cert clean $certname” on the master and delete the certificate files in $ssldir on the client. Then repeat the steps above.
Tip: In large environments you will want to autosign certificates (policy-based or simply everything, that is up to you). Read here how to do this: https://puppet.com/docs/puppet/5.3/ssl_autosign.html
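
To give an idea of what policy-based autosigning looks like: the master runs an executable of your choice with the certname as its first argument and the CSR on stdin, and signs the request only if the executable exits with 0. The following is a minimal sketch of such a policy, not a recommendation. The function name autosign_policy and the ALLOWED_DOMAIN value are my own assumptions, and since the client chooses its certname itself, a pure name check like this is weak protection.

```shell
# Sketch of an autosign policy: approve only certnames below one domain.
# ALLOWED_DOMAIN is an assumption for this example.
ALLOWED_DOMAIN="your.domain"

autosign_policy() {
    case "$1" in
        *."$ALLOWED_DOMAIN") return 0 ;;   # exit status 0: sign the request
        *)                   return 1 ;;   # anything else: leave it unsigned
    esac
}

# In a real policy script the last line would be: autosign_policy "$1"
# Here we only try two names by hand.
autosign_policy puppettest1.your.domain && echo "puppettest1.your.domain: sign"
autosign_policy stranger.example.org    || echo "stranger.example.org: do not sign"
```

To use something like this, save it as an executable script and point the autosign setting in the [master] section of puppet.conf at it.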

You should have a connected node by now. Let’s move on and give it something to do.

Configuring the master

I’d recommend keeping all configuration of the environments in a git repository and leaving the management up to r10k.
However, you can also just take the examples and put them into the respective directories:

  • /etc/puppet/code/environments/test
  • /etc/puppet/code/environments/production

Whatever you do, you should have at least two environments: one for testing and one for production.

In your repository (or in the production and test directories) create the following:

vi environment.conf
modulepath          = site-modules:modules:$basemodulepath
config_version      = 'scripts/config_version.sh $environmentpath $environment'

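modulepath lists the directories in which puppet looks for modules. config_version names a script whose output becomes the configuration version that the agent prints on every run; the script is not part of puppet, so it has to exist in your repository under scripts/.
You can look up the values of the variables used here with: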

puppet config print basemodulepath
/etc/puppet/code/modules:/usr/share/puppet/modules

puppet config print environment
production

puppet config print environmentpath
/etc/puppet/code/environments
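
environment.conf above points to scripts/config_version.sh, which this post does not show. Here is a minimal sketch of what such a script could contain, assuming the environment directory is a git checkout (as it is when r10k deploys it). The output format host-environment-commit is my own choice, modelled on the configuration version that shows up in the agent output further down.

```shell
# scripts/config_version.sh (sketch): print a version string for one environment.
# Arguments as passed by environment.conf: $1 = environmentpath, $2 = environment.
config_version() {
    envdir="$1/$2"
    if command -v git >/dev/null 2>&1 &&
       sha=$(git -C "$envdir" rev-parse --short HEAD 2>/dev/null); then
        # Deployed from git: host name, environment and short commit id.
        printf '%s-%s-%s\n' "$(uname -n)" "$2" "$sha"
    else
        # Not a git checkout (or no git installed): fall back to the Unix time.
        date +%s
    fi
}

# In the real script the last line would be: config_version "$1" "$2"
config_version /etc/puppet/code/environments production
```

Remember to make the script executable and to commit it together with environment.conf.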

Create a Puppetfile in your repository. Either use a production and a test branch, or (for now) just create identical files in the production and test directories:

forge "http://forge.puppetlabs.com"
moduledir 'modules'

# Get a specific release from GitHub
#mod 'puppet-gitlab',
#   :git    => 'https://github.com/voxpupuli/puppet-gitlab' ,
#   :ref    => '3.0.2'

#mod 'helloworld', :local => true
#mod 'base', :local => true
#mod 'docker', :local => true
#mod 'kubernetes', :local => true
#mod "puppetlabs/inifile"
#mod "puppetlabs/stdlib"
#mod "puppetlabs/apt"

The above is just an example, but a good one to walk you through the format.

  • forge is the URL from which forge-based modules are fetched. I use the official Puppet Labs forge.
  • moduledir tells r10k where to install these modules (relative to the directory that contains the Puppetfile).
  • mod declares a module together with all of its attributes.
    Take the first block: the name of the module is ‘puppet-gitlab’, but I use a git source other than the standard one (:git => ‘http://$modulegit’), and I want a specific version (:ref => ‘3.0.2’).
    In everyday work you’ll most probably want to pin specific versions in the production environment after you’ve tested them, and use “latest” (which is what gets fetched if you don’t provide a version) in test environments.
    Try it out: with hiera properly configured (later), you’ll get a fully configured and operational gitlab server in no time.
    Be aware that r10k wipes everything in the module directory that is not referenced in the Puppetfile (so don’t forget to commit and push your changes before running it). That is why there is one more important parameter:
  • :local => true
    This tells r10k that it should neither try to download this module nor wipe it, because (most likely) it’s one of your self-written modules (I’ll cover helloworld, base and docker later, so that you get the idea of how to write basic modules and how to combine them into sets).
    It is best to create a git repo for those modules and clone/pull them separately later if needed.
  • I commented out nearly everything for now, but you may already uncomment the last two modules, “puppetlabs/stdlib” and “puppetlabs/apt”, as they are pretty useful and I use them in every setup.
    Check it out: https://forge.puppet.com/puppetlabs/stdlib. inifile is useful when using the gitlab module.
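
Because r10k removes whatever the Puppetfile does not mention, it can help to list the modules that are actually active (not commented out) before a deploy. A small sketch; puppetfile_modules is a hypothetical helper of mine, and it only understands simple one-line mod statements like the ones above.

```shell
# Hypothetical helper: print the name of every uncommented `mod` line.
puppetfile_modules() {
    sed -n "s/^[[:space:]]*mod[[:space:]]\{1,\}[\"']\([^\"']*\)[\"'].*/\1/p" "$1"
}

# Demo with a throwaway Puppetfile (same format as the example above).
demo=$(mktemp)
cat > "$demo" <<'EOF'
forge "http://forge.puppetlabs.com"
#mod 'helloworld', :local => true
mod 'base', :local => true
mod "puppetlabs/stdlib"
EOF
puppetfile_modules "$demo"   # prints: base, then puppetlabs/stdlib
rm -f "$demo"
```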

Create the hiera.yaml in your repo:

vi hiera.yaml

---
version: 5
defaults:
  data_hash: yaml_data
  datadir: hieradata
hierarchy:
  - name: nodes
    path: nodes/%{trusted.certname}.yaml
  - name: Common
    path: common.yaml

This tells hiera to read its data from yaml files (data_hash: yaml_data). json would work as well, but I haven’t tried that at all.
datadir is the directory in which hiera looks for configuration data (this is where our cluster, node and resource configuration goes).
The hierarchy is the important part, because it tells hiera how and where to look for data.
It’s best to let hiera search from the most specific to the least specific definition.
In this example we create a simple hierarchy (under hieradata, which is a folder in our repo):

mkdir -p hieradata/nodes
touch hieradata/nodes/yourserver.your.domain.yaml

This way, when the agent connects, the master looks under hieradata/nodes (as configured) for a file named after the client’s certname, here nodes/yourserver.your.domain.yaml. For a plain lookup hiera stops at the most specific level that defines the requested key and uses the value from there; with a merge strategy it also collects the values from the less specific levels.
Next, create a file named common.yaml under hieradata.

touch hieradata/common.yaml


This file will contain the configuration for everything that has no more specific match.
As you can see, you may extend and refine this structure according to your needs, for example by naming your servers/clusters FunctionDepartmentLocationVlan.your.domain and adding hierarchy levels for the different combinations this naming scheme implies at your company.
You may also specify different merge strategies for hiera, like stopping at the first match or merging the most and least specific values.
You get the idea. Let’s stick with our example for now, but if you are interested, read here: https://puppet.com/docs/puppet/4.10/hiera_merging.html
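
To make the search order tangible, here is a small sketch that prints the data files hiera would consult for a given node with the two-level hierarchy from above, most specific first. hiera_files is a hypothetical helper; it only checks which files exist and does not parse any yaml.

```shell
# Hypothetical helper: list the existing data files for one certname,
# in the order of the hierarchy in hiera.yaml (node file first, then common).
hiera_files() {
    datadir=$1
    certname=$2
    for rel in "nodes/$certname.yaml" "common.yaml"; do
        if [ -f "$datadir/$rel" ]; then
            printf '%s\n' "$rel"
        fi
    done
}

# Usage from the root of the control repo:
hiera_files hieradata yourserver.your.domain
```

A node without its own file only gets common.yaml, which is exactly the fallback behaviour we want.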

vi r10k.yaml

cachedir: '/var/cache/r10k'
sources:
  yourdepartment:
    basedir: '/etc/puppet/code/environments'
    remote: 'https://gitlab.your.domain/youruser/puppetcontrolrepo.git'
    prefix: true

Create the above file in your git repo (if you’ve decided not to use git, which is not recommended, skip this).
It tells r10k where to fetch the environments from (I named the source yourdepartment; you may configure as many sources as you like, for example to allow different teams to build their own puppet environments or just to separate things from each other).
prefix is the important option here, because it tells r10k to prefix each environment, which is named after its branch, with the name of the source.
So if you have a master and a test branch for “yourdepartment” in your gitlab repo, they will be checked out separately as yourdepartment_master and yourdepartment_test.
This allows you to set the environment of individual nodes (either in puppet.conf with environment = yourdepartment_master or on the command line with --environment yourdepartment_test), so that you can use them for testing or production, just as you need.
I recommend you keep this file in your git repo under version control. But you need to copy it to /etc/r10k.yaml on your master (if you like, create another repo for that).
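
The naming rule behind prefix fits into a few lines. The sketch below is only an illustration of that rule (r10k_env_name is a made-up helper, not an r10k command), and it ignores any clean-up r10k applies to branch names that are not valid environment names.

```shell
# Hypothetical helper: the environment directory name r10k derives
# from a source name, a branch name and the prefix setting.
r10k_env_name() {
    source_name=$1
    branch=$2
    prefix=$3          # "true" or "false", as in r10k.yaml
    if [ "$prefix" = "true" ]; then
        printf '%s_%s\n' "$source_name" "$branch"
    else
        printf '%s\n' "$branch"
    fi
}

r10k_env_name yourdepartment master true   # prints: yourdepartment_master
r10k_env_name yourdepartment test true     # prints: yourdepartment_test
```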

mkdir modules
mkdir manifests

Our initial structure is complete. Commit and push your code to git (no git? ignore this) and run “git checkout -b test” to create a test branch while you’re at it (push that branch as well).
Then install r10k on your master.

gem install r10k

Now clone your repo once (so that r10k.yaml exists on the master; you may also just copy the file there):

git clone https://gitlab.your.domain/youruser/puppetcontrolrepo /etc/puppet/code/environments/yourdepartment_master

Your master branch is checked out. From now on r10k will take care of updating the environments, installing modules and so on (after you’ve pushed to git, of course, and run the appropriate commands).
When running the master for several teams, I would advise to either run r10k regularly via cron or on demand, triggered by git or by some ssh command.
For now, let’s look at the important commands, starting with checking the syntax of your Puppetfile.

/etc/puppet/code/environments/yourdepartment_master# r10k puppetfile check
Syntax OK

Then let’s see which sources and environments r10k would deploy (a kind of dry run):

/etc/puppet/code/environments/yourdepartment_master# r10k deploy display -v
---
:sources:
- :name: :yourdepartment
  :basedir: "/etc/puppet/code/environments"
  :prefix: true
  :remote: https://gitlab.your.domain/youruser/puppetcontrolrepo.git
  :environments: []

If everything seems right, let’s deploy our environment:

r10k deploy environment -v -p

root@puppetmastertest:/etc/puppet/code/environments# r10k deploy environment -v -p
WARN     -> The r10k configuration file at /etc/r10k.yaml is deprecated.
WARN     -> Please move your r10k configuration to /etc/puppetlabs/r10k/r10k.yaml.
INFO     -> Using Puppetfile '/etc/puppet/code/environments/test_master/Puppetfile'
INFO     -> Using Puppetfile '/etc/puppet/code/environments/test_test/Puppetfile'
INFO     -> Deploying environment /etc/puppet/code/environments/test_master
INFO     -> Environment test_master is now at 7d56685293dae8a68673b7ba1009e8ceea51571a
INFO     -> Deploying Puppetfile content /etc/puppet/code/environments/test_master/modules/helloworld
[...]

You can ignore the warnings. They only say that r10k expects its configuration under /etc/puppetlabs/r10k/, the layout of the “official” puppet packages from Puppet Labs, while the packages from the debian repository keep everything under /etc/puppet.

You now have r10k management for your puppetmaster! Let’s go on and write some basic modules.

Writing basic modules

Either develop your modules in the modules folder you created in your puppet git repository or use a different repository (or several); that depends on your setup and company structure.
Create a site.pp file in your manifests folder.
In the days of old this file contained the node definitions. Here it only tells puppet to look up the classes key in hiera, merge everything it finds across the hierarchy and include the resulting classes in the catalog (this is the merge strategy mentioned further above).
More on the topic: https://puppet.com/docs/puppet/5.3/hiera_automatic.html

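# Debug aid: show which environment the catalog was compiled from.
notify { "Using ${environment}":
  message => "Processing catalog from the ${environment} environment.",
}

# Look up 'classes' in hiera, merge all levels without duplicates and include the result.
lookup('classes', Array[String], 'unique').include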

The interesting part is the lookup statement.
The notify statement is just for debug purposes. It always shows the environment used on the client.
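
What the 'unique' merge does with the classes arrays can be imitated in shell: put the lists from all hierarchy levels one after another, most specific first, and drop the duplicates. This is only an illustration of the idea (unique_merge is a made-up helper working on plain lines, not on yaml).

```shell
# Hypothetical helper: keep the first occurrence of every non-empty line,
# preserving the input order.
unique_merge() {
    awk 'NF && !seen[$0]++'
}

# Node level lists gitlab and base, common level lists base:
printf '%s\n' gitlab base base | unique_merge   # prints: gitlab, then base
```

So a class that is listed both for a node and in the common data is still included only once.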
Let’s create a base module which we can use on all of our hosts:

mkdir -p modules/base/manifests
mkdir -p modules/base/files/bashrc
mkdir -p modules/base/manifests/openssh

Our base module will contain an openssh class and a bashrc class (you can download much better modules from puppetlabs; this is just to get the idea).

vi modules/base/manifests/openssh.pp

class base::openssh {
  class { 'base::openssh::install': }
}

vi modules/base/manifests/openssh/install.pp

class base::openssh::install {
   package { "openssh-client":
      ensure => present,
   }
   package { "openssh-server":
      ensure => present,
   }
}

In our base module we create an openssh.pp file which declares the base::openssh::install class. We could have defined everything in that file itself, but later, if you add to this rather rudimentary openssh module or create your own, you’ll want to design it as modular as possible, so you could add e.g. a configure.pp and so on. As written, this makes sure that the ssh client and server packages are installed on our node; ensure => present does not upgrade them. Use ensure => latest if you always want the newest version (in prod you may rather want to pin a specific version).

root@puppettest1:~# puppet resource package openssh-client
package { 'openssh-client':
  ensure => '1:6.7p1-5+deb8u8',
}

On a host where a package is already installed you can let puppet tell you about its attributes and then write your code accordingly.

root@puppettest1:~# puppet resource user root
user { 'root':
  ensure           => 'present',
  comment          => 'root',
  gid              => '0',
  home             => '/root',
  password         => '$........onbZvbm0',
  password_max_age => '99999',
  password_min_age => '0',
  shell            => '/bin/bash',
  uid              => '0',
}

The same works for other resource types, such as users. All types are documented at https://puppet.com/docs/puppet/5.3/type.html (or use puppet describe $resourcename).

vi modules/base/manifests/bashrc.pp

class base::bashrc {
  file { '/root/.bashrc':
    ensure => 'present',
    group  => '0',
    mode   => '0644',
    owner  => '0',
    source => 'puppet:///modules/base/bashrc/bashrc',
  }
}

vi modules/base/files/bashrc/bashrc
#just copy a bashrc that you like and add:
#additional paths
PATH=$PATH:/opt/puppetlabs/puppet/bin

This is the second class in our base module; it creates a bashrc on our nodes. Notice the source parameter. We could write the content inline (the parameter for that is “content”), but we store the file separately and tell puppet where to find it.
The source URL translates to “modules/base/files/bashrc/bashrc”; puppet leaves out “files” in the URL.
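
The translation from source URL to file can be written down as a tiny function. It is a sketch for orientation only: puppet_url_to_path is a made-up helper, and it assumes the module sits in the modules directory of the environment.

```shell
# Hypothetical helper: map puppet:///modules/<module>/<file> to the path
# of the file inside the environment.
puppet_url_to_path() {
    url=$1
    rest=${url#puppet:///modules/}       # e.g. base/bashrc/bashrc
    if [ "$rest" = "$url" ]; then
        echo "not a puppet:///modules/ URL: $url" >&2
        return 1
    fi
    module=${rest%%/*}                   # first component: the module name
    file=${rest#*/}                      # everything after the module name
    printf 'modules/%s/files/%s\n' "$module" "$file"
}

puppet_url_to_path puppet:///modules/base/bashrc/bashrc
# prints: modules/base/files/bashrc/bashrc
```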

From now on puppet will be in your PATH on all of your hosts.
Let’s put it together:

vi modules/base/manifests/init.pp

class base {
  class { 'base::openssh': }
  class { 'base::bashrc': }
}

Every module needs an init.pp! Here we define a class “base” that consists of our two new classes. Even when you develop a bigger module, I’d recommend that you put only the basics into init.pp and keep everything as modular as possible.
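
The file names we used so far follow a fixed rule: puppet finds a class by turning its name into a path below the module’s manifests directory, and the class named after the module itself lives in init.pp. A sketch of that rule (class_to_manifest is a made-up helper, and it assumes the module sits in the modules directory):

```shell
# Hypothetical helper: the manifest file in which puppet expects a class.
class_to_manifest() {
    class=$1
    module=${class%%::*}                 # text before the first "::"
    if [ "$module" = "$class" ]; then
        # No "::" in the name: the module's main class.
        printf 'modules/%s/manifests/init.pp\n' "$module"
    else
        rest=${class#*::}                # e.g. openssh::install
        printf 'modules/%s/manifests/%s.pp\n' "$module" \
            "$(printf '%s' "$rest" | sed 's|::|/|g')"
    fi
}

class_to_manifest base                     # modules/base/manifests/init.pp
class_to_manifest base::openssh::install   # modules/base/manifests/openssh/install.pp
```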
Now let’s define our nodes.

Defining a node

vi hieradata/common.yaml

---
classes:
   - base

Let’s put our base class into the common.yaml file that we touched earlier (don’t forget to commit/push). We want ssh and bashrc on all of our hosts, so it fits best in our least specific definition.

puppet agent -vv --test --environment yourdepartment_test

Info: Using configured environment 'yourdepartment_test'
Info: Retrieving pluginfacts
Info: Retrieving plugin
Info: Loading facts
Info: Caching catalog for yourhost.your.domain
Info: Applying configuration version 'puppetserver-yourdepartment_test-caab813140e'
Notice: Processing catalog from the yourdepartment_test environment.
Notice: /Stage[main]/Base/Notify[Using Base Class!]/message: defined 'message' as 'Using Base Class!'
Notice: /Stage[main]/Base::Openssh/Notify[Using Openssh Test class!]/message: defined 'message' as 
Notice: /Stage[main]/Base::Bashrc/Notify[Using bashrc class!]/message: defined 'message' as 'Using bashrc class!'
Info: Stage[main]: Unscheduling all events on Stage[main]
Notice: Applied catalog in 0.62 seconds

Your node now has a new bashrc and the ssh packages provisioned.
For a more specific node (hieradata/nodes/yourserver.your.domain.yaml) you could, for example, use the gitlab class from the puppet-gitlab module referenced in the Puppetfile:

---
classes:
    - gitlab

gitlab::external_url: 'http://gitlabmaster.your.domain'
gitlab::gitlab_rails:
  time_zone: 'Europe/Berlin'
  gitlab_email_enabled: false
  gitlab_default_theme: 4
  gitlab_email_display_name: 'Gitlab'
gitlab::sidekiq:
  shutdown_timeout: 5

From now on, make all your configuration changes in your git test branch, let r10k pull them onto the master automatically, try them on your test nodes, merge into master if appropriate and run the agent. That way you always know what is configured where and what state your nodes are in!


Tips:

  • Create cronjobs on the master for r10k and on the clients for puppet runs.
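
Such cronjobs could look like the entries printed by the sketch below (in /etc/cron.d format, with a user field). The intervals and the function names are my own assumptions; adjust them to your environment. Note that a gem-installed r10k may not be in cron’s default PATH, so you may have to use the full path.

```shell
# Print example cron entries; redirect the output into a file under /etc/cron.d.
master_cron() {
    # Redeploy all environments and their Puppetfile modules every 15 minutes.
    printf '%s\n' '*/15 * * * * root r10k deploy environment -p'
}

agent_cron() {
    # One agent run every 30 minutes, without staying resident as a daemon.
    printf '%s\n' '0,30 * * * * root puppet agent --onetime --no-daemonize'
}

master_cron   # on the master
agent_cron    # on every client
```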

I learned a lot from these books, which I read ages ago (they should be free on the web by now, but are also greatly outdated):

“Pro Puppet”, James Turnbull and Jeffrey McCune, Apress
“Learning Puppet 4”, Jo Rhett, O’Reilly
