Re: Ask the Expert: The What, Why and How of Isilon Job Engine

What are the main reasons for all jobs, including FlexProtect, being paused at the same time?

FlexProtect pauses all other jobs unless the job engine has been tweaked.

If the FlexProtect job is also paused, then something is wrong with the job engine: isi_job_d may not be running, one of the nodes may be in read-only mode or down, or the cluster may be unable to connect to one of the nodes via the backend (IB). At this stage I would ask you to log a support case and have support work on it. I could write up troubleshooting steps, but I don’t know the user’s experience level, so it is best to let support fix it.
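For anyone collecting notes before opening the support case, the possible causes above can be organized as a simple checklist. The sketch below is only an illustration in Python: it does not talk to the cluster or run any OneFS command, and the class, field, and function names are all made up for this post. You fill in the observations by hand and it lists which of the causes above apply.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ClusterObservation:
    """Facts noted by hand before calling support (hypothetical field names)."""
    job_daemon_running: bool  # is isi_job_d running?
    read_only_nodes: List[int] = field(default_factory=list)
    down_nodes: List[int] = field(default_factory=list)
    backend_unreachable_nodes: List[int] = field(default_factory=list)


def likely_causes(obs: ClusterObservation) -> List[str]:
    """Map the observations onto the causes listed in the answer above."""
    causes = []
    if not obs.job_daemon_running:
        causes.append("isi_job_d is not running")
    if obs.read_only_nodes:
        causes.append(f"node(s) in read-only mode: {obs.read_only_nodes}")
    if obs.down_nodes:
        causes.append(f"node(s) down: {obs.down_nodes}")
    if obs.backend_unreachable_nodes:
        causes.append(
            f"node(s) unreachable via backend (IB): {obs.backend_unreachable_nodes}"
        )
    return causes


if __name__ == "__main__":
    # Example: the daemon is up, but node 3 has gone read-only.
    obs = ClusterObservation(job_daemon_running=True, read_only_nodes=[3])
    for cause in likely_causes(obs):
        print(cause)
```

An empty result only means none of the causes listed here were observed. It is not a diagnosis, and the support case is still the right next step.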

And if you launch one more job, will it be paused automatically, FlexProtect included?

The Isilon job engine is written to give top priority to data integrity, so when a drive or a node is in Smartfail status, OneFS runs FlexProtect to reprotect the data. You could pause the FlexProtect job and run another job by taking the job engine out of “Degraded” mode, but at this stage I would again ask you to check with support, because you need to know the protection level on the cluster, what is in smartfail status, and the reason for pausing FlexProtect.

Why did the system pause all jobs?

See the answer to the first question.

What must I do to get out of that state (all jobs paused)?

See the answer to the first question.

How can you run a job in degraded mode?

You need to take the job engine out of degraded mode. Again, I can’t share these commands on a public forum, because changes made to the job engine without proper knowledge could cause other issues. Please log a support case; if there is a valid reason, support will make those changes.

Hope this helps!

