How to Configure Log File Rotation on NetScaler

The newsyslog utility included with the NetScaler firmware archives log files, if necessary, and rotates the system logs so that the current log is empty when rotation occurs. The system crontab runs this utility every hour, and it reads a configuration file that specifies which files to rotate and under what conditions. The archived files can also be compressed if required.

The existing configuration is located in /etc/newsyslog.conf. However, because this file resides in the memory filesystem, the administrator must save the modifications to /nsconfig/newsyslog.conf so the configuration survives restarting the NetScaler.
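
For example, a minimal way to persist the configuration from the NetScaler shell (a sketch; it assumes /nsconfig/newsyslog.conf does not exist yet, and any further edits are then made to the copy under /nsconfig):

root@ns# cp /etc/newsyslog.conf /nsconfig/newsyslog.conf
root@ns# vi /nsconfig/newsyslog.conf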

The entries contained in this file have the following format:

logfilename [owner:group] mode count size when flags [/pid_file] [sig_num]

Note: Fields within square brackets are optional and can be omitted.

Each line in the file identifies a log file to be handled by the newsyslog utility, along with the conditions under which rotation should occur.

For example, the following is the ns.log entry taken from the newsyslog.conf file. In this entry, the size field indicates that ns.log is rotated once it reaches 100 kilobytes, and the count field indicates that 25 archived copies of ns.log are kept. A size of 100K and a count of 25 are the default values.

Note that the when field is configured with an asterisk ( * ), meaning that the ns.log file is not rotated based on time. Every hour, a crontab job runs the newsyslog utility, which checks whether the size of ns.log is greater than or equal to the size configured in this file. In this example, if ns.log is 100K or larger, the file is rotated.

root@ns# cat /etc/newsyslog.conf
# Netscaler newsyslog.conf
# This file is present in the memory filesystem by default, and any changes
# to this file will be lost following a reboot. If changes to this file
# require persistence between reboots, copy this file to the /nsconfig
# directory and make the required changes to that file.
#
# logfilename [owner:group] mode count size when flags [/pid_file] [sig_num]
/var/log/cron 600 3 100 * Z
/var/log/amd.log 644 7 100 * Z
/var/log/auth.log 600 7 100 * Z
/var/log/ns.log 600 25 100 * Z

The size field can be changed to modify the size threshold at which ns.log is rotated, or the when field can be changed to rotate ns.log at a specific time.

The daily, weekly, and monthly specifications are given as [Dhh], [Ww[Dhh]], and [Mdd[Dhh]], respectively. The optional time-of-day field defaults to midnight. The ranges and meanings for these fields are:

hh hours, range 0 … 23

w day of week, range 0 … 6, 0 = Sunday

dd day of month, range 1 … 31, or the letter L or l to specify the last day of the month.
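
For example, the following modified entries are a minimal sketch for /nsconfig/newsyslog.conf, starting from the default entries shown earlier: the first rotates ns.log every night at midnight regardless of size, and the second raises the rotation threshold for auth.log to 200K.

/var/log/ns.log 600 25 * $D0 Z
/var/log/auth.log 600 7 200 * Z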

Examples

Here are some examples with explanations for the logs that are rotated by default:

/var/log/auth.log 600 7 100 * Z

The authentication log is rotated when the file reaches 100K, the last 7 rotated copies of auth.log are kept and compressed with gzip (Z flag), and the resulting archives are assigned the permissions -rw-------.

/var/log/all.log 600 7 * @T00 Z

The catch-all log is rotated at midnight every night (@T00), the last 7 copies are kept and compressed with gzip, and the resulting archives are assigned the permissions -rw-------.

/var/log/weekly.log 640 5 * $W6D0 Z

The weekly log is rotated at midnight every Saturday ($W6D0), the last 5 copies are kept and compressed with gzip, and the resulting archives are assigned the permissions -rw-r-----.

Common Rotation Patterns

  • D0: rotate every night at midnight

  • D23: rotate every day at 23:00

  • W0D23: rotate every week on Sunday at 23:00

  • W5: rotate every week on Friday at midnight

  • MLD6: rotate on the last day of every month at 6:00

  • M5: rotate on the 5th day of every month at midnight

If an interval and a time specification are both given, then both conditions must be met. That is, the file must be as old as or older than the specified interval and the current time must match the time specification.

The size threshold can be controlled, but there is no hard limit on how large the file can grow before the newsyslog utility runs again in the next hourly slot.

Debugging

To debug the behavior of the newsyslog utility, run it with the verbose (-v) flag.

root@dj_ns# newsyslog -v
/var/log/cron <3Z>: size (Kb): 31 [100] --> skipping
/var/log/amd.log <7Z>: does not exist, skipped.
/var/log/auth.log <7Z>: size (Kb): 2 [100] --> skipping
/var/log/kerberos.log <7Z>: does not exist, skipped.
/var/log/lpd-errs <7Z>: size (Kb): 0 [100] --> skipping
/var/log/maillog <7Z>: --> will trim at Tue Mar 24 00:00:00 2009
/var/log/sendmail.st <10>: age (hr): 0 [168] --> skipping
/var/log/messages <5Z>: size (Kb): 7 [100] --> skipping
/var/log/all.log <7Z>: --> will trim at Tue Mar 24 00:00:00 2009
/var/log/slip.log <3Z>: size (Kb): 0 [100] --> skipping
/var/log/ppp.log <3Z>: does not exist, skipped.
/var/log/security <10Z>: size (Kb): 0 [100] --> skipping
/var/log/wtmp <3>: --> will trim at Wed Apr 1 04:00:00 2009
/var/log/daily.log <7Z>: does not exist, skipped.
/var/log/weekly.log <5Z>: does not exist, skipped.
/var/log/monthly.log <12Z>: does not exist, skipped.
/var/log/console.log <5Z>: does not exist, skipped.
/var/log/ns.log <5Z>: size (Kb): 18 [100] --> skipping
/var/log/nsvpn.log <5Z>: size (Kb): 0 [100] --> skipping
/var/log/httperror.log <5Z>: size (Kb): 1 [100] --> skipping
/var/log/httpaccess.log <5Z>: size (Kb): 1 [100] --> skipping
root@dj_ns#

Related:

Installing VAAI 1.2.3 on ESX 6.0 could cause the ESX server to become unresponsive

Article Number: 504703 Article Version: 4 Article Type: Break Fix



Isilon VAAI,Isilon OneFS,VMware ESX Server

After installing the VAAI 1.2.3 plugin on an ESX 6.0 server, the log file /var/log/isi-nas-vib.log can grow without bound, eventually filling up the /var partition, which can cause the ESX server to become unresponsive. The log file continues to be recreated after it is removed.

The VAAI 1.2.3 plugin was built with debugging enabled by default.

WORKAROUND

Uninstall the VAAI 1.2.3 plugin until a fix is provided.

OR

Set up a cron job on the ESX 6.0 server to periodically remove the log file:

1. SSH into the ESX server and log in as the root user.

2. Edit /etc/rc.local.d/local.sh and add the following lines toward the end, before "exit 0" (a sketch of the resulting end of the file appears after this list):

/bin/echo '*/15 * * * * /bin/rm /var/log/isi-nas-vib.log' >> /var/spool/cron/crontabs/root

/bin/kill -HUP $(cat /var/run/crond.pid)

/usr/lib/vmware/busybox/bin/busybox crond

3. The purpose of the previous step is to make the workaround persist across ESX server reboots. At this point, however, the workaround has not yet taken effect on the running system.

4. Manually run the three commands listed above to implement the workaround in the current ESX server session.

5. Monitor /var/log/syslog.log and make sure that an entry such as the following appears every 15 minutes:

2017-10-03T15:15:01Z crond[35429]: crond: USER root pid 38236 cmd /bin/rm /var/log/isi-nas-vib.log
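
For reference, the end of /etc/rc.local.d/local.sh would then look similar to the following sketch (the three added lines are exactly those from step 2, placed immediately before the final exit 0):

/bin/echo '*/15 * * * * /bin/rm /var/log/isi-nas-vib.log' >> /var/spool/cron/crontabs/root
/bin/kill -HUP $(cat /var/run/crond.pid)
/usr/lib/vmware/busybox/bin/busybox crond
exit 0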

Related:

Thin Clone Refresh using Snapshot

Hello Everyone:

Last week, we created a Consistency Group and a Thin Clone set, and also scheduled a daily snapshot at 4:30 AM.

The thin clone is refreshed from snapshots on a daily basis and presented to an AIX server.

The customer was able to recognize the thin clone volumes in the OS and start up the Oracle DB.

Now, we have tested the refresh option in the GUI for the thin clone snapshot set, and everything worked very well.

The only issue is that we don't see any option to schedule the thin clone refresh using the snapshot set. Is this possible?



Can somebody tell me if there is a way to schedule the refresh via cron using the CLI, or by any other method?

Thank you in advance for your help

Related:

This is regarding the Unity Refresh

Hello Everyone:

Last week, we created a Consistency Group and a Thin Clone set for it, and also scheduled a daily snapshot at 4:30 AM. The snapshots are refreshed into the thin clone on a daily basis and presented to an AIX server, and the customer was able to recognize the thin clone volumes in the OS and start up the Oracle DB.

Now, we have tested the refresh option in the GUI for the snapshot set, and everything worked very well.

The only issue is that we don't see any option to schedule the refresh of the snapshot set. Is this possible?

Can somebody tell me if there is a way to schedule the refresh via cron using the CLI, or by any other method?

Thank you in advance for your help

Related:

Isilon OneFS 8.0.0.2: Intermittent node unresponsiveness or nodes splitting from the InfiniBand network due to isi_phone_home

Article Number: 494612 Article Version: 6 Article Type: Break Fix



Isilon,Isilon OneFS,Isilon OneFS 7.2.1.0,Isilon OneFS 7.2.1.1,Isilon OneFS 7.2.1.2,Isilon OneFS 7.2.1.3,Isilon OneFS 7.2.1.4,Isilon OneFS 7.2.1.5,Isilon OneFS 8.0.0.0,Isilon OneFS 8.0.0.1,Isilon OneFS 8.0.0.2,Isilon OneFS 8.0.0.3

Node offline events are generated by the cluster, and after checking the uptime of the nodes in the cluster, it is found that the nodes were split from the group.

You may notice a lot of intermittent group changes.

For example:

2017-01-05T01:36:37-06:00 <0.5> ISILON-7(id7) /boot/kernel.amd64/kernel: [gmp_rtxn.c:1787](pid 72308="kt: gmp-merge")(tid=101118) gmp merge took 33s

2017-01-07T01:55:11-06:00 <0.4> ISILON-7(id7) /boot/kernel.amd64/kernel: [gmp_info.c:1734](pid 56721="kt: rtxn_split")(tid=100805) group change: <3,23351>

[up: 19 nodes, down: 1 node] (node 13 changed to down)

2017-01-07T01:55:11-06:00 <0.4> ISILON-7(id7) /boot/kernel.amd64/kernel: [gmp_info.c:1735](pid 56721="kt: rtxn_split")(tid=100805) new group: <3,23351>:

{ 1:0-14,18-32, 2:0-14,18-31,36, 3-4:0-14,18-32, 5:0-7,9-10,12-14,18-32,38-39, 6-7:0-14,18-32, 8:0-13,17-31,35, 9:0-10,12-14,18-32,36, 10-12,14-18:0-14,18-32, 19:0-3,5-14,18-32,36, 20:0-14,18-32, down: 13 }

2017-01-07T01:55:11-06:00 <0.5> ISILON-7(id7) /boot/kernel.amd64/kernel: [gmp_rtxn.c:2020](pid 56721="kt: rtxn_split")(tid=100805) gmp split took 20s

2017-01-07T01:56:20-06:00 <0.5> ISILON-7(id7) /boot/kernel.amd64/kernel: [gmp_rtxn.c:1598](pid 56826="kt: gmp-merge")(tid=100578) gmp txn <1,1301> error 85 from rtxn_exclusive_merge_lock_get after 44001ms

2017-01-07T01:56:48-06:00 <0.4> ISILON-7(id7) /boot/kernel.amd64/kernel: [gmp_info.c:1734](pid 56980="kt: rtxn_split")(tid=103079) group change: <9,23352>

[up: 18 nodes, down: 2 nodes] (node 20 changed to down)

2017-01-07T01:56:48-06:00 <0.4> ISILON-7(id7) /boot/kernel.amd64/kernel: [gmp_info.c:1735](pid 56980="kt: rtxn_split")(tid=103079) new group: <9,23352>:

{ 1:0-14,18-32, 2:0-14,18-31,36, 3-4:0-14,18-32, 5:0-7,9-10,12-14,18-32,38-39, 6-7:0-14,18-32, 8:0-13,17-31,35, 9:0-10,12-14,18-32,36, 10-12,14-18:0-14,18-32, 19:0-3,5-14,18-32,36, down: 13, 20 }

2017-01-07T01:56:48-06:00 <0.5> ISILON-7(id7) /boot/kernel.amd64/kernel: [gmp_rtxn.c:2020](pid 56980="kt: rtxn_split")(tid=103079) gmp split took 25s

2017-01-07T02:00:42-06:00 <0.4> ISILON-7(id7) /boot/kernel.amd64/kernel: [gmp_info.c:1734](pid 56826="kt: gmp-merge")(tid=100578) group change: <13,23353>

[up: 19 nodes, down: 1 node] (node 13 changed to up)

2017-01-07T02:00:42-06:00 <0.4> ISILON-7(id7) /boot/kernel.amd64/kernel: [gmp_info.c:1735](pid 56826="kt: gmp-merge")(tid=100578) new group: <13,23353>:

{ 1:0-14,18-32, 2:0-14,18-31,36, 3-4:0-14,18-32, 5:0-7,9-10,12-14,18-32,38-39, 6-7:0-14,18-32, 8:0-13,17-31,35, 9:0-10,12-14,18-32,36, 10-18:0-14,18-32, 19:0-3,5-14,18-32,36, down: 20 }

2017-01-07T02:00:42-06:00 <0.5> ISILON-7(id7) /boot/kernel.amd64/kernel: [gmp_rtxn.c:1787](pid 56826="kt: gmp-merge")(tid=100578) gmp merge took 232s

2017-01-07T02:01:22-06:00 <0.4> ISILON-7(id7) /boot/kernel.amd64/kernel: [gmp_info.c:1734](pid 57751="kt: gmp-merge")(tid=102358) group change: <1,23354>

[up: 20 nodes] (node 20 changed to up)

On the nodes that were split from the group, you may notice that the device worker threads are not progressing and the following error is seen:

For example:

2017-01-05T01:35:02-06:00 <0.5> ISILON-13(id13) /boot/kernel.amd64/kernel: [rbm_core.c:811](pid 43="swi4: clock sio")(tid=100036) ping timeout no dwt progress

2017-01-07T01:54:58-06:00 <0.5> ISILON-13(id13) /boot/kernel.amd64/kernel: [rbm_core.c:811](pid 43="swi4: clock sio")(tid=100036) ping timeout no dwt progress

2015-06-08T09:00:25-05:00 <0.5> ISILON-3(id3) /boot/kernel.amd64/kernel: [rbm_core.c:811](pid 42="swi4: clock sio")(tid=100035) ping timeout no dwt progress

2015-06-08T09:00:27-05:00 <0.5> ISILON-3(id3) /boot/kernel.amd64/kernel: [rbm_core.c:811](pid 42="swi4: clock sio")(tid=100035) ping timeout no dwt progress

2016-12-24T16:20:07-06:00 <0.5> ISILON-9(id9) /boot/kernel.amd64/kernel: [rbm_core.c:811](pid 43="swi4: clock sio")(tid=100036) ping timeout no dwt progress

Another behavior you may see is failed attempts to create kernel threads because too many kernel threads already exist. This produces messages like the following:

/boot/kernel.amd64/kernel: kt_new: backoff 1 sec

/boot/kernel.amd64/kernel: kt_new: backoff 1 sec

/boot/kernel.amd64/kernel: [ktp.c:148](pid 5460="kt: j_syncer")(tid=100378) kt_new: backoff 1 sec

/boot/kernel.amd64/kernel: [ktp.c:148](pid 5460="kt: j_syncer")(tid=100378) kt_new: backoff 1 sec

/boot/kernel.amd64/kernel: [ktp.c:148](pid 79932="kt: dxt3")(tid=107754) [ktp.c:148](pid 80002="kt: dxt0")(tid=107871) kt_new: backoff 1 sec

/boot/kernel.amd64/kernel: [ktp.c:148](pid 79812="kt: dxt2")(tid=107720) kt_new: backoff 1 sec

/boot/kernel.amd64/kernel: kt_new: backoff 1 sec

/boot/kernel.amd64/kernel: [ktp.c:148](pid 80005="kt: dxt1")(tid=107802) kt_new: backoff 1 sec


On the nodes that were split, you will see running processes that were created by the cron execution of the isi_phone_home script, which is run by cron at regular intervals to collect information.

For example:

root 228 0.0 0.0 26908 3348 ?? I 11:25AM 0:00.00 cron: running job (cron)

root 230 0.0 0.0 7144 1388 ?? Is 11:25AM 0:00.01 /bin/sh -c isi_ropc -s python /usr/local/isi_phone_home/isi_phone_home --script-file cluster_stats.py

root 232 0.0 0.0 100532 13628 ?? I 11:25AM 0:00.11 python /usr/bin/isi_ropc -s python /usr/local/isi_phone_home/isi_phone_home --script-file cluster_stats.py

root 234 0.0 0.0 7144 1388 ?? I 11:25AM 0:00.01 sh -c python /usr/local/isi_phone_home/isi_phone_home --script-file cluster_stats.py

root 235 0.0 0.0 127608 17076 ?? I 11:25AM 0:00.16 python /usr/local/isi_phone_home/isi_phone_home --script-file cluster_stats.py

The phone_home processes pile up because they are trying to get an exclusive lock on a file that is already locked by another phone_home instance. When phone home runs, it tries to acquire a lock on /ifs/.ifsvar/run/isi_phone_home.lock and then populates the isi_phone_home.pid file with its process ID and node number.

ls -l /ifs/.ifsvar/run



-rw------- 1 root wheel 0 Jul 7 2016 phone_home.lock

-rw-r--r-- 1 root wheel 8 Feb 4 11:45 phone_home.pid

By default the lock is acquired with the blocking flag, so if a new instance of phone home attempts to get the lock on that file while it is already held by the first instance, the second instance blocks and remains stuck until the first instance ends.

A permanent fix is already addressed in Halfpipe (8.0.1.0).

A permanent fix for the 7.2 branch is targeted for 7.2.1.6.

A permanent fix for 8.0.0.x is targeted for 8.0.0.5.

To work around this issue, disable the script as shown here and also comment out the crontab entries on the cluster. This only impacts the automatic sending of periodic phone home logs via this feature; it does not impact dial home or any other cluster functions.

To disable the phone home script:

# isi_phone_home -d

or

# isi_phone_home --disable

This places a .disable file in the run directory mentioned earlier, and the script checks for this file before performing most of its operations. However, to completely disable the script, comment out the following crontab entries in /etc/mcp/templates/crontab on any node in the cluster (the changes will be propagated to /etc/crontab on each node automatically):

# — Start section of phone home

#min hour mday month wday who command

*/5 * * * * root isi_ropc -s python /usr/local/isi_phone_home/isi_phone_home --script-file cluster_stats.py

*/31 * * * * root isi_ropc -s python /usr/local/isi_phone_home/isi_phone_home --script-file cluster_full_disk.py

47 11 * * * root isi_ropc -s python /usr/local/isi_phone_home/isi_phone_home --script-file nfs_zones.py

13 00 * * * root isi_ropc -s python /usr/local/isi_phone_home/isi_phone_home --list-file telemetry.list

0 * * * * root isi_ropc -s python /usr/local/isi_phone_home/isi_phone_home --list-file everyhour.list

3 1 */7 * * root isi_ropc -s python /usr/local/isi_phone_home/isi_phone_home --delete-data

17 1 */7 * * root isi_ropc -s python /usr/local/isi_phone_home/isi_phone_home --create-package

4 2 */7 * * root isi_ropc -s python /usr/local/isi_phone_home/utils/send_data.py

# — End section of phone home
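
Once the template has been edited, a simple check (not part of the original article) confirms on each node that the change has propagated and that no active phone home jobs remain; it should return no output:

# grep isi_phone_home /etc/crontab | grep -v '^#'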

Note: If there are already nodes in a locked state, it may be necessary to delete the .lock and .pid files and kill the processes locking these files. If this happens, please contact Technical Support for assistance.

Note: In some versions of OneFS 7.2, these cron entries may be in /etc/mcp/override/crontab.smbtime. If this is the case, comment out the entries there.

Related:

7023283: How to configure rollover scripts to manage logs and audits

This document (7023283) is provided subject to the disclaimer at the end of this document.

Environment

Privileged Account Manager 3.5

Situation

How to configure custom perl scripts to handle rollover operations for audit database files, audit videos and host log files.
Understanding basic storage maintenance requirements so the server does not run out of disk space.

Resolution

Rollover and archival procedures are handled through custom Perl scripts in PAM. This accommodates a wide range of security models, retention policies, and archival procedures. There are options to configure how often rollover occurs, typically by time (in hours) or by size (in MB), but the actual procedure that is executed is the custom Perl script that is provided as the rollover script.
The documentation provides several example rollover scripts that can be used. They can be extended to handle any additional requirements or procedures the organization needs, such as moving the rolled-over files to an archive location. Alternatively, with these basic rollover procedures in place, custom bash scripts driven by crontab could handle these files in various ways (see the sketch after this paragraph). If any are developed, please do share them with the community in our forums or as a cool tool!
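
For example, a minimal sketch of such an approach, assuming rolled-over files accumulate in a hypothetical /opt/netiq/npum/logs/rollover directory and should be moved to /archive after seven days (both paths are illustrative, not PAM defaults):

# /etc/cron.d/pam-archive (illustrative)
# Move rolled-over files older than 7 days to the archive location every night at 01:30
30 1 * * * root /usr/bin/find /opt/netiq/npum/logs/rollover -type f -mtime +7 -exec /bin/mv {} /archive/ \; 2>&1 | logger -t pam-archive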
Example rollover scripts can be found below:

Disclaimer

This Support Knowledgebase provides a valuable tool for NetIQ/Novell/SUSE customers and parties interested in our products and solutions to acquire information, ideas and learn from one another. Materials are provided for informational, personal or non-commercial use within your organization and are presented “AS IS” WITHOUT WARRANTY OF ANY KIND.

Related:

Re: Is it possible to rerun a policy action?

Hmm - good point.

Yes, you can define a workflow which will only run a clone action. And it will work.

However, my personal favorite is scripted cloning

– easy to configure

– easy to change

– much more flexibility

– you can use many more mminfo parameters to select your save sets precisely

– you can specify a storage node (load balancing)

– you can add other criteria, such as a specific retention date

– and it is valid for all NW versions 😉

Then schedule such a script via cron or the task scheduler. Done.

We have used this method for years.
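
For illustration only, here is a minimal sketch of such a script. It is not the poster's actual script: the mminfo query, the clone pool name, and the schedule are assumptions you must adapt, and the exact query and option syntax can vary between NetWorker versions.

#!/bin/sh
# Sketch: clone the save sets written in the last 24 hours to an assumed clone pool.
# Adjust the mminfo query (pool, client, level, retention criteria) and the pool
# name to your environment; consult the mminfo/nsrclone man pages for your version.
SSIDS=`mminfo -q "savetime>=24 hours ago" -r ssid | grep -Eo '[0-9]+' | sort -u`
[ -n "$SSIDS" ] && nsrclone -b "Default Clone" -S $SSIDS

Scheduled from cron (for example, 0 2 * * * /path/to/clone_script.sh), this reproduces the workflow described above.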

Related:

Avamar Reporting Using Grafana

I’ve always found Avamar reporting to be a bit challenging. Sure you have DPA and BRM (which is now dead I think), but what about just some simple report or stats to share with your team?

For a long time, I used simple bash scripts that would dump mccli data and email me reports via a cron job. I was hoping the SNMP sub-agent would expose more information, but it didn't display nearly as much as I hoped. Recently I went back to the drawing board to see if we could leverage Grafana to display some of this raw data for us. Instead of re-inventing the wheel and trying to pull data via the REST API, why not just query the DB directly from Grafana?

[Screenshot: adding a PostgreSQL data source in Grafana]

By adding a PostgreSQL data source to Grafana, you have the full ability to run any queries you like. For example, as you can see below, we can show job status for the last 24 hours.
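
As a rough illustration of the kind of query behind such a panel, the following could be tested from the Avamar utility node before wiring it into Grafana. The port, database, user, view, and column names are assumptions about a typical MCS PostgreSQL setup rather than confirmed schema, so adjust them to whatever your system actually exposes:

psql -p 5555 -U viewuser -d mcdb -c "SELECT status_code, count(*) FROM v_activities_2 WHERE completed_ts > now() - interval '24 hours' GROUP BY status_code;"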

[Screenshot: Grafana graph panel showing job status for the last 24 hours]

As queries get more complex, you can use a different panel to show more detailed information. The Grafana table does a good job showing reports in a traditional style.

[Screenshot: Grafana table panel showing a traditional report view]

Someone asked me the other day if I'm able to display information from multiple Avamar systems. You definitely can, as long as you add another data source. One thing I really want to try is to see if I can also graph Data Domain metrics alongside the Avamar data. Let's face it, no one likes to find out their /ddvar partition is full. It would be nice to see it all on one screen.

Related:

7022672: How to set up the sadc cron job on SLES12

Starting the sysstat.service systemd unit sets up a symlink named sysstat in the /etc/cron.d directory which points to the /etc/sysstat/sysstat.cron file. This link is removed when the service is stopped.
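
For example, to have the symlink created now and again at every boot (standard systemd commands; the ls simply verifies the link):

# systemctl enable sysstat.service
# systemctl start sysstat.service
# ls -l /etc/cron.d/sysstat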

By default, system activity data are collected every ten minutes and written to /var/log/sa/sa## data files, where ## corresponds to the day of the month. These data files can be read with the “sar” program. This is the sa1 component of sadc.

A daily report is generated every six hours; the job parses the sa## data files and writes the activity information to text files. These text reports are written to /var/log/sa/sar##. This is the sa2 component of sadc.
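
For example, to read CPU utilization from the binary data file for the 15th day of the month (the file name is only an illustration; use whichever sa## file exists on your system):

# sar -u -f /var/log/sa/sa15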

The frequency of either of these cron jobs can be changed in the /etc/sysstat/sysstat.cron file.

You can read more about these features in their respective man pages.

Related:

7022654: DeepSea node runs out of memory and / or root filesystem disk space

This document (7022654) is provided subject to the disclaimer at the end of this document.

Environment

SUSE Enterprise Storage 4

SUSE Enterprise Storage 5

Situation

The DeepSea salt master node runs out of memory and/or reports that the root file system is full. Messages similar to the following flood the "/var/log/salt/master" log file:

2018-01-15 15:20:51,512 [salt.utils.process][ERROR ][1149] An un-handled exception from the multiprocessing process 'Maintenance-18' was caught:

Traceback (most recent call last):

File "/usr/lib/python2.7/site-packages/salt/utils/process.py", line 647, in _run

return self._original_run()

File "/usr/lib/python2.7/site-packages/salt/master.py", line 240, in run

salt.daemons.masterapi.clean_old_jobs(self.opts)

File "/usr/lib/python2.7/site-packages/salt/daemons/masterapi.py", line 193, in clean_old_jobs

mminion.returners[fstr]()

File "/usr/lib/python2.7/site-packages/salt/returners/local_cache.py", line 430, in clean_old_jobs

shutil.rmtree(t_path)

File "/usr/lib64/python2.7/shutil.py", line 247, in rmtree

rmtree(fullname, ignore_errors, onerror)

File "/usr/lib64/python2.7/shutil.py", line 252, in rmtree

onerror(os.remove, fullname, sys.exc_info())

File "/usr/lib64/python2.7/shutil.py", line 250, in rmtree

os.remove(fullname)

OSError: [Errno 13] Permission denied: '/var/cache/salt/master/jobs/6d/431cfbd8dccac414c625deb28178aed3c7625fdc5765849dda1897e039c114/jid'

Resolution

This will eventually be addressed with an update to DeepSea. Currently, to work around the issue, create a cron job that runs once per day (or multiple times per day if deemed necessary) and automatically removes all files older than one day from the salt master's job cache, using something like the examples below:

/usr/bin/find /var/cache/salt/master/jobs/ -type f -mtime +1 -delete 2>&1 | logger -t salt-jobs

OR

/usr/bin/find /var/cache/salt/master/jobs/ -type f -mtime +1 | xargs /usr/bin/rm 2>&1 | logger -t salt-jobs

The above will find all files older than one day and remove them, logging any errors to "/var/log/messages" with a "salt-jobs" entry.
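
For example, a cron entry implementing this could look like the following sketch; the schedule and the /etc/cron.d file name are arbitrary choices, while the find command is the one shown above:

# /etc/cron.d/salt-jobs-cleanup (illustrative)
30 2 * * * root /usr/bin/find /var/cache/salt/master/jobs/ -type f -mtime +1 -delete 2>&1 | logger -t salt-jobs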

Cause

Runners in Jinja are executed as root, resulting in runner jobs having root file permissions. The salt-master service, however, runs as salt:salt and is thus unable to manage these files in its job cache.

Status

Reported to Engineering

Additional Information

An additional temporary solution may be to periodically change the ownership of all files in the salt master job cache recursively to salt:salt, using for example:

chown -R salt:salt /var/cache/salt/master/jobs/*

Disclaimer

This Support Knowledgebase provides a valuable tool for NetIQ/Novell/SUSE customers and parties interested in our products and solutions to acquire information, ideas and learn from one another. Materials are provided for informational, personal or non-commercial use within your organization and are presented “AS IS” WITHOUT WARRANTY OF ANY KIND.

Related: