Last night as I was about to head to sleep, Sensu started emailing me about disk space warnings on one of the backend servers. That’s strange, I thought. I had set up logrotate with appropriate limits to ensure the log file size is reasonable and rotation happens on a daily basis.
Curious, I ssh’d into the server to investigate. As expected, df -h showed disk usage above 70% (the threshold at which Sensu sends a notification), and the log files had grown well beyond their expected size. So why hadn’t logrotate rotated the files? I ran logrotate by hand to see what was happening:
$ sudo /etc/cron.daily/logrotate
error: backend:7 bad size '536870912.0'
error: found error in "log/production.log", skipping
Huh, OK, so now we know why logrotate skipped that log. But a decimal point in the config is enough for it to flag a bad size? I checked the documentation, and it doesn’t say so explicitly:
size size Log files are rotated when they grow bigger then size bytes. If size is followed by M, the size if assumed to be in megabytes. If the k is used, the size is in kilobytes. So size 100, size 100k, and size 100M are all valid.
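Read strictly, that excerpt only allows whole numbers with an optional k or M suffix. As a rough sketch, a check like the one below would have caught the bad value before it was deployed. The pattern is my assumption based on the excerpt above, not logrotate's actual parser, and valid_logrotate_size? is a hypothetical helper name.

```ruby
# Assumption: approximates the forms named in the man page excerpt
# (digits, optionally followed by k or M). Not logrotate's real parser.
VALID_SIZE = /\A\d+[kM]?\z/

# Hypothetical helper: true if the value looks like a documented size.
def valid_logrotate_size?(value)
  VALID_SIZE.match?(value.to_s)
end

%w[100 100k 100M 536870912.0].each do |size|
  verdict = valid_logrotate_size?(size) ? 'ok' : 'bad size'
  puts "#{size}: #{verdict}"
end
```

The three documented examples pass, while the value from the error message does not, because the decimal point is not a digit or a recognised suffix.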
Removing the decimal point let logrotate run cleanly and rotate the files. So that’s a TIL for me. In case you’re wondering how the decimal got there in the first place: Chef is the CM tool used to deploy all changes, and the size is defined in a recipe like so:
size (0.5 * (1024 * 1024 * 1024))
maxsize (0.5 * (1024 * 1024 * 1024))
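The first number is templated based on the service being deployed, and for the backend it is configured as 0.5. Chef recipes are Ruby, and in Ruby multiplying a Float by an Integer yields a Float, so the expression evaluates to 536870912.0 and is written into the config as ‘536870912.0’. Hence the error.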
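One way to keep the fractional multiplier and still hand logrotate a whole number is to convert the result to an integer before it reaches the config. This is a minimal sketch rather than the exact change made to the recipe; logrotate_size_bytes and BYTES_PER_GB are hypothetical names, not part of Chef.

```ruby
# Hypothetical helper (not part of Chef): turn a possibly fractional
# number of gigabytes into a whole number of bytes for logrotate.
BYTES_PER_GB = 1024 * 1024 * 1024

def logrotate_size_bytes(gigabytes)
  # Float * Integer is a Float in Ruby; to_i truncates it to an Integer,
  # so the rendered value carries no decimal point.
  (gigabytes * BYTES_PER_GB).to_i
end

raw_size   = 0.5 * BYTES_PER_GB        # Float, as in the original recipe
fixed_size = logrotate_size_bytes(0.5) # Integer

puts raw_size.to_s   # 536870912.0 (the value logrotate rejected)
puts fixed_size.to_s # 536870912
```

In the recipe, the same idea would amount to appending .to_i to the size and maxsize expressions.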