A collection of task oriented solutions in Puppet

 

You want to remove a set of files

Challenge

You want to remove old, large or unused files and directories.

Solution

class tidy_cache {

  # Always remove the given file
  tidy { '/tmp/install-artifact': }

  # remove all files from /tmp/dropbox
  tidy { 'clean_dropbox':
    path    => '/tmp/dropbox',
    recurse => 1,
  }

  # remove all files, in /tmp/image_cache,
  # that are older than 1 week
  tidy { 'prune_old_caches':
    path    => '/tmp/image_cache',
    age     => '1w',
    recurse => 1,
  }

}

Explanation

While we've seen a number of recipes that add files and directories, we've yet to see one that helps with removing them. In this example we'll go beyond removing a single file and instead remove any files that match our selection criteria. First, we'll remove all files in a given directory:

class tidy_cache {

  # remove all files from /tmp/dropbox
  tidy { 'clean_dropbox':
    path    => '/tmp/dropbox',
    recurse => 1,
  }

}
Example output:
 Notice: /Stage[main]/Tidy_cache/Tidy[clean_dropbox]: Tidying 2 files
 Notice: /Stage[main]/Tidy_cache/File[/tmp/dropbox/world]/ensure: removed
 Notice: /Stage[main]/Tidy_cache/File[/tmp/dropbox/hello]/ensure: removed

This removes all files directly inside /tmp/dropbox, but not those in any subdirectories, because recurse => 1 descends only one level. To remove an entire directory tree of files, set recurse to true so that tidy descends without a depth limit:

tidy { 'completely_clean_dropbox':
  path    => '/tmp/dropbox',
  recurse => true,
  # rmdirs => true, # this also removes empty directories
}

That example is the heavy-handed sledgehammer: it removes everything without a trace. If we want to be more selective, we can use one or more criteria to select files for deletion. We can use the age attribute to remove all files in our cache directory that are more than a week old:

class tidy_cache {
  # remove all files, in /tmp/image_cache,
  # that are older than 1 week

  tidy { 'prune_old_caches':
    path    => '/tmp/image_cache',
    age     => '1w',
    recurse => 1,
  }
}
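As a variation on the age example, the sketch below uses a shorter threshold and names the timestamp that age is compared against. The resource title and the two-day value are illustrative assumptions rather than part of the recipe; type is a tidy attribute that, as far as I know, accepts atime, mtime or ctime.

```puppet
# A sketch only: the title and threshold are illustrative.
# Intended as an alternative to 'prune_old_caches', not an addition to it,
# since both would manage the same path.
tidy { 'prune_stale_caches':
  path    => '/tmp/image_cache',
  age     => '2d',     # a number followed by a unit letter, here days
  type    => 'mtime',  # compare age against the last modification time
  recurse => 1,
}
```

Setting type explicitly means the result does not depend on whichever timestamp your Puppet version uses by default.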

You can use the first letter of a number of different time periods in the age selector; the official tidy age attribute documentation lists which are supported. To further refine our selection, for example if we were low on disk space, we can add a second attribute to our tidy resource that also removes the larger cached files:

tidy { 'prune_old_caches':
  path    => '/tmp/image_cache',
  recurse => 1,

  age     => '1w',
  size    => '1g',
}

At this point it's worth noting that a file will be removed if it matches any of the selection attributes; it does not need to match them all to be deleted. If we have a 2GB cache file it will be removed when Puppet runs, no matter how old it is. In effect, the age attribute only decides the fate of files smaller than the given size, as anything larger will already be selected via size.
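To make the "any criterion is enough" behaviour concrete, here is a small sketch. The three file names, sizes and ages in the comments are hypothetical and exist only to show which criterion selects each file.

```puppet
# Sketch with hypothetical files; either criterion alone is enough.
tidy { 'prune_old_caches':
  path    => '/tmp/image_cache',
  recurse => 1,
  age     => '1w',
  size    => '1g',
}

# /tmp/image_cache/huge.img    2GB,  1 day old   -> removed (selected by size)
# /tmp/image_cache/stale.png   10MB, 2 weeks old -> removed (selected by age)
# /tmp/image_cache/fresh.png   10MB, 1 day old   -> kept (matches neither)
```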

Our current examples apply to all the files within our directory tree. To be even more cautious, we can restrict which files may be removed by specifying a list of file name patterns in the matches attribute. If a file matches one of these patterns, the other criteria, such as age, are checked and may cause deletion. If the name doesn't match, the file will never be deleted.

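tidy { 'prune_old_caches':
  path    => '/tmp/image_cache',
  recurse => 1,

  # illustrative patterns; adjust them to your own cache file names
  matches => ['*.jpg', '*.png'],

  age     => '1w',
  size    => '1g',
}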

One common misconception when using matches concerns what the patterns are compared against. Expressions in matches are only checked against the 'basename' of the file, without the directory path. You can determine the basename of a file with a snippet of Ruby code.

$ ruby -e 'puts File.basename "/etc/passwd"'
passwd
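The same point can be sketched in Puppet terms. The resource title, the patterns and the thumbs subdirectory below are hypothetical; the only behaviour relied on is that matches sees the basename and nothing else.

```puppet
# Sketch only: title, patterns and directory names are illustrative.
tidy { 'prune_thumbnails':
  path    => '/tmp/image_cache',
  recurse => true,
  age     => '1w',
  # Compared with the basename only, so /tmp/image_cache/thumbs/a.png
  # is tested as 'a.png' and is a candidate for removal.
  matches => ['*.png'],
  # A pattern such as 'thumbs/*.png' would never match, because a
  # basename contains no '/' characters.
}
```

If you need to limit tidying to one subdirectory, point path at that subdirectory rather than trying to encode it in the pattern.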

The tidy type is very useful, but it can also cause great destruction. It's strongly recommended to invest time in testing these resources, especially with a --noop run, which shows which files, and how many, would be deleted.
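Besides passing --noop on the command line, Puppet's noop metaparameter can be set on a single resource, which is one cautious way to trial a new tidy rule. This is a minimal sketch that assumes the metaparameter is carried through to the file removals that tidy generates, so verify the behaviour on a scratch directory first.

```puppet
# Sketch only: trial a tidy rule without deleting anything.
tidy { 'prune_old_caches':
  path    => '/tmp/image_cache',
  recurse => 1,
  age     => '1w',
  noop    => true,  # report what would be tidied, but make no changes
}
```

Once the reported files look right, remove the noop line to let the resource take effect.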