Datacap 9.0.1 – Rulerunner aborts in a multi-RR environment; recommended workflow setup?

We have a Datacap 9.0.1 setup with multiple Rulerunners (RRs): 4 RRs (one per region) with around 20 cores each, running around 28 threads per RR because of the high volume. Two RRs are on VMs and two are on bare metal. Each region has 3 jobs, and for each job there is one task for ingestion, one for export, and the rest for OCR (everything between ingestion and export).

At various times, maybe once or twice a week, we get a lot of tasks aborting for one region (a different one each time), usually within a 15-30 minute window and usually in PageID and Batch Profiler. PageID aborts at different points (PDFFreConversion, SplitMultiPageTiff, C2BW requesting abort, according to the RRS log), and Batch Profiler usually aborts because the RecPage DLL is not found (even though it is present on the system). Rerunning or reingesting the emails usually works once the abort window is over.
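To help line the abort windows up against filesystem or VM host metrics, here is a rough Python sketch of how the aborts could be grouped into per-region windows. It assumes the abort events have already been pulled out of the RRS logs into (timestamp, region, task) tuples; that extraction step, the 10-minute gap threshold, and the region names in the example are all placeholders of mine, not anything Datacap provides.

```python
from datetime import datetime, timedelta

def group_abort_windows(events, max_gap=timedelta(minutes=10)):
    """Cluster abort events into windows, one cluster per burst per region.

    events: iterable of (timestamp, region, task) tuples (assumed format).
    A new window starts when the gap since the region's previous abort
    exceeds max_gap.
    """
    windows = []
    open_by_region = {}  # region -> most recent window for that region
    for ts, region, task in sorted(events):
        win = open_by_region.get(region)
        if win is None or ts - win["end"] > max_gap:
            win = {"region": region, "start": ts, "end": ts, "tasks": {}}
            open_by_region[region] = win
            windows.append(win)
        win["end"] = ts
        win["tasks"][task] = win["tasks"].get(task, 0) + 1
    return windows

# Hypothetical sample data, only to show the shape of the input.
sample = [
    (datetime(2024, 1, 8, 9, 0), "RegionA", "PageID"),
    (datetime(2024, 1, 8, 9, 5), "RegionA", "Batch Profiler"),
    (datetime(2024, 1, 8, 9, 12), "RegionA", "PageID"),
    (datetime(2024, 1, 11, 14, 0), "RegionB", "PageID"),
]
for w in group_abort_windows(sample):
    print(w["region"], w["start"], w["end"], w["tasks"])
```

With the start and end of each window in hand, it should be easier to check whether the aborts coincide with backups, antivirus scans, or I/O spikes on the share the batches are written to.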

I know this has been asked before, and the theory was that it might be caused by heavy load on the filesystem from all the files being written. One suggestion was to turn off reflush, which we did. Another was to increase logging, but wouldn't that also increase the load on the filesystem, even with reflush off? If anyone has seen this, does reducing logging help?

Another suggestion was to reduce the number of threads. Given our high volume, would it help to dedicate some threads to PageID only and others to Batch Profiler only, instead of having every thread handle all tasks except ingestion and export?
