Server HDD activity high, crippling

I do not need a solution (just sharing information)

Our small organization’s single Server 2012 file server has been running SEPM since 2014.

I’ve been here since 2017.

Since I've been here, the server has been sluggish.

It’s a VM with 3TB disk space on 2 volumes and 6GB dedicated RAM.

The only other VMs are a very small linux VM and a small Win7 VM used only for remote login.

I’ve played around with stripping out unnecessary apps and even moved the paging file, with only marginal success.

Today it was particularly lethargic, and I noticed in Resource Monitor that almost all of the disk activity (which was quite high) was related to SEPM.
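For anyone who wants to confirm that kind of observation outside Resource Monitor, a small script can rank processes by cumulative disk I/O. This is only a sketch using the third-party psutil package; which process names show up will depend on the installation.

# Rank running processes by cumulative disk I/O (requires the psutil package).
# Run elevated so that service processes are visible; sketch only.
import psutil

totals = []
for proc in psutil.process_iter(["name"]):
    try:
        io = proc.io_counters()  # read/write byte counters since the process started
        totals.append((io.read_bytes + io.write_bytes, proc.info["name"], proc.pid))
    except (psutil.AccessDenied, psutil.NoSuchProcess):
        continue  # skip processes we are not allowed to inspect

for nbytes, name, pid in sorted(totals, reverse=True)[:10]:
    print(f"{nbytes / 1024**2:10.1f} MB  {name} (pid {pid})")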

A Google search turned up an issue where someone had “Live Update Administrator” and SEPM both running on the same server, and that was causing excessive disk activity.

That post was from 2010. Our services are not named the same.

I was wondering if “Live Update” is the same as “Live Update Administrator,” as we have both “Live Update” and “Symantec Endpoint Manager” services running, as well as a few others that begin with “Symantec…”

I wanted to stop the “Live Update” service to see what happened but I can’t. Looks like I’ll have to uninstall it.

It shows up as a separate installed app in “Add and Remove…”

I want to confirm with someone here before I do that, though.

Thanks-

KK


Related:

Avamar Client for Windows: Avamar backup fails with “avtar Error : Out of memory for cache file” on Windows clients

Article Number: 524280 Article Version: 3 Article Type: Break Fix



Avamar Plug-in for Oracle, Avamar Client for Windows, Avamar Client for Windows 7.2.101-31



In this scenario we have the same issue presented in KB 495969; however, the solution there does not apply due to an environment issue on the Windows client.

  • KB 495969 – Avamar backup fails with “Not Enough Space” and “Out of Memory for cache file”

The issue can affect any plug-in, as in this case, with the error presented in the following manner:

  • For FS backups:
avtar Info <8650>: Opening hash cache file 'C:\Program Files\avs\var\p_cache.dat'
avtar Error <18866>: Out of memory for cache file 'C:\Program Files\avs\var\p_cache.dat' size 805306912
avtar FATAL <5351>: MAIN: Unhandled internal exception Unix exception Not enough space
  • For VSS backups:
avtar Info <8650>: Opening hash cache file 'C:\Program Files\avs\var\p_cache.dat'
avtar Error <18866>: Out of memory for cache file 'C:\Program Files\avs\var\p_cache.dat' size 1610613280
avtar FATAL <5351>: MAIN: Unhandled internal exception Unix exception Not enough space
  • For Oracle backup:
avtar Info <8650>: Opening hash cache file 'C:\Program Files\avs\var\clientlogs\oracle-prefix-1_cache.dat'
avtar Error <18866>: Out of memory for cache file 'C:\Program Files\avs\var\clientlogs\oracle-prefix-1_cache.dat' size 100663840
avtar FATAL <5351>: MAIN: Unhandled internal exception Unix exception Not enough space

or this variant:

avtar Info <8650>: Opening hash cache file 'C:\Program Files\avs\var\clientlogs\oracle-prefix-1_cache.dat'
avtar Error <18864>: Out of restricted memory for cache file 'C:\Program Files\avs\var\clientlogs\oracle-prefix-1_cache.dat' size 100663840
avtar FATAL <5351>: MAIN: Unhandled internal exception Unix exception Not enough space
avoracle Error <7934>: Snapup of <oracle-db> aborted due to rman terminated abnormally - check the logs
  • With the RMAN log reporting this:
RMAN-00571: ===========================================================
RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
RMAN-00571: ===========================================================
RMAN-03002: failure of backup plus archivelog command at 06/14/2018 22:17:40
RMAN-03009: failure of backup command on c0 channel at 06/14/2018 22:17:15
ORA-04030: out of process memory when trying to allocate 1049112 bytes (KSFQ heap,KSFQ Buffers)
Recovery Manager complete.

Initially it was thought that the cache file could not grow in size due to an incorrect “hashcachemax” value.

The client had plenty of free RAM (48 GB total), so we increased the flag’s value from -16 (3 GB file size max) to -8 (6 GB file size max).
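As a point of reference, the relationship between those numbers can be checked with a quick calculation (assuming, as the figures above imply, that a negative hashcachemax value caps the cache file at 1/|n| of physical RAM, while a positive value is an absolute cap in MB):

def hash_cache_cap_bytes(hashcachemax: int, ram_bytes: int) -> int:
    """Cache-file size cap implied by a hashcachemax value.

    Assumption drawn from the figures above: negative values mean 1/|n| of RAM,
    positive values are an absolute cap in megabytes.
    """
    if hashcachemax < 0:
        return ram_bytes // abs(hashcachemax)
    return hashcachemax * 1024 * 1024

ram = 48 * 1024**3                      # 48 GB of RAM, as on this client
print(hash_cache_cap_bytes(-16, ram))   # 3221225472  -> 3 GB cap
print(hash_cache_cap_bytes(-8, ram))    # 6442450944  -> 6 GB cap
print(hash_cache_cap_bytes(100, ram))   # 104857600   -> 100 MB cap (the workaround below)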

But the issue persisted, and disk space was not a problem either; there were plenty of gigabytes free.

Further investigation with a test binary from the engineering team showed that the Windows OS was not releasing a large enough block of contiguous unused memory to load the entire hash cache file into memory for the backup operation.

A test binary that allocates the memory in smaller pieces was also tried, to see whether we could reach the point where the OS would allow the full p_cache.dat file to be loaded into memory, but that did not help either; the operating system was still refusing to load the file into memory for some reason.

The root cause is hidden somewhere in the OS; however, in this case we did not engage Microsoft for further investigation on their side.

Instead we found a way to work around the issue by making the cache file smaller; see the details in the resolution section below.

To work around this issue we set the hash cache file to a smaller size so that the OS would have no trouble allocating it in memory.

In this case it was noticed that the OS was also having problems allocating smaller sizes, even 200+ MB, so we decided to resize p_cache.dat to just 100 MB using the following flag:

--hashcachemax=100

This way the hash cache file would never grow beyond 100 MB and would overwrite the old entries.

After adding that flag it is required to recycle the cache file by renaming or deleting p_cache.dat (renaming is the preferred option).

After the first backup, which is expected to take longer than usual (to rebuild the cache file), the issue should be resolved.
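A minimal sketch of the recycle step on a Windows client, assuming the default cache location shown in the logs above; the exact path, and where the --hashcachemax flag is configured for your plug-in, should be confirmed against the Avamar client documentation:

# Rename the existing hash cache so the next backup rebuilds it from scratch.
# Path is taken from the log messages above; adjust it for your installation.
from datetime import datetime
from pathlib import Path

cache = Path(r"C:\Program Files\avs\var\p_cache.dat")
if cache.exists():
    stamped = cache.with_name(cache.name + "." + datetime.now().strftime("%Y%m%d%H%M%S"))
    cache.rename(stamped)   # renaming is preferred over deleting
    print(f"Renamed {cache} -> {stamped}")
else:
    print("No existing p_cache.dat found; nothing to recycle.")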

  • The demand-paging cache is not recommended in this scenario since the backups are directed to GSAN storage, so the monolithic paging cache was used.
  • Demand-paging was designed to benefit backups being sent to Data Domain storage.

Related:

7021211: Memory, I/O and DefaultTasksMax related considerations for SLES for SAP servers with huge memory

Some general guidelines about using pagecache_limit and optimizing some of the I/O related settings:

If, on the server in question, you are *not* simultaneously mixing a heavy file I/O workload with a memory-intensive application workload, then this setting (pagecache_limit) will probably cause more harm than good. However, in most SAP environments there are both heavy I/O and memory-intensive workloads.

Ideally, vm.pagecache_limit_mb should be zero until such time that pagecache is seen to exhaust memory. If it does exhaust memory, then trial-and-error tuning must be used to find values that work for the specific server/workload in question.

As regards the type of settings that have both a fixed value and a ‘ratio’ setting option, keep in mind that ratio settings will be more and more inaccurate as the amount of memory in the server grows. Therefore, specific ‘byte’ settings should be used as opposed to ‘ratio’ type settings. The ‘ratio’ settings can allow too much accumulation of dirty memory, which has been proven to lead to processing stalls during heavy fsync or sync write loads. Setting dirty_bytes to a reasonable value (which depends on the storage performance) leads to much less unexpected behavior.

Setting, say, a 4 GB pagecache limit on a 142 GB machine is asking for trouble, especially when you consider that this would be much smaller than the default dirty ratio limit (which is by default 40% of available pages).

If the pagecache_limit is used, it should always be set to a value well above the ‘dirty’ limit, be it a fixed value or a percentage.

The thing is that there are no universally ‘correct’ values for these settings. You are always balancing throughput against sync latency. If we had code in the kernel so that it would auto-tune automatically based on the amount of RAM in the server, it would be very prone to regressions because it depends on server-specific loading. So, necessarily, it falls to the server admins to come up with the best values for these settings (via trial and error).

*If* we know for a fact that the server does encounter issues with pagecache_limit set to 0 (not active), then choose a pagecache_limit that is suitable in relation to how much memory is in the server.

Let’s assume that you have a server with 1 TB of RAM; these are *suggested* values which could be used as a starting point:

pagecache_limit_mb = 20972                # 20 GB; different values could be tried, from say 20 GB to 64 GB
pagecache_limit_ignore_dirty = 1          # see the section below on this variable to decide what it should be set to
vm.dirty_ratio = 0
vm.dirty_bytes = 629145600                # could be reduced or increased based on actual hardware performance,
                                          # but keep vm.dirty_background_bytes at approximately 50% of this setting
vm.dirty_background_ratio = 0
vm.dirty_background_bytes = 314572800     # set this value to approximately 50% of vm.dirty_bytes


NOTE: If it is decided to try setting pagecache_limit to 0 (not active), then it’s still a good idea to test different values for dirty_bytes and dirty_background_bytes in an I/O-intensive environment to arrive at the best performance.
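As a hedged illustration, these starting values could be applied at runtime by writing to the corresponding /proc/sys/vm entries; pagecache_limit_mb and pagecache_limit_ignore_dirty exist only on SUSE kernels that carry this patch, and the settings should be made persistent through /etc/sysctl.conf (or a sysctl.d drop-in) in the usual way:

# Sketch: apply the suggested vm.* starting values on a running SLES system.
# Must run as root; the tunables and values are the examples from the text above.
from pathlib import Path

suggested = {
    "pagecache_limit_mb": 20972,            # ~20 GB; SUSE-specific tunable
    "pagecache_limit_ignore_dirty": 1,      # SUSE-specific tunable
    "dirty_ratio": 0,
    "dirty_bytes": 629145600,
    "dirty_background_ratio": 0,
    "dirty_background_bytes": 314572800,    # ~50% of dirty_bytes
}

for name, value in suggested.items():
    node = Path("/proc/sys/vm") / name
    if node.exists():
        node.write_text(f"{value}\n")
        print(f"vm.{name} = {value}")
    else:
        print(f"vm.{name}: not available on this kernel, skipped")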

———————————————————————

How pagecache_limit works:

—————————————-

The heart of this patch is a function called shrink_page_cache(). It is called from balance_pgdat (which is the worker for kswapd) if the pagecache is above the limit. The function is also called in __alloc_pages_slowpath.

shrink_page_cache() calculates the number of pages the cache is over its limit. It reduces this number by a factor (so you have to call it several times to get down to the target), then shrinks the pagecache (using the kernel LRUs).

shrink_page_cache does several passes:

– Just reclaiming from inactive pagecache memory. This is fast, but it might not find enough free pages; if that happens, the second pass will happen.

– In the second pass, pages from the active list will also be considered.

– The third pass will only happen if pagecache_limit_ignore_dirty is not 1. In that case, the third pass is a repetition of the second pass, but this time we allow pages to be written out.

In all passes, only unmapped pages will be considered.
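A schematic sketch of that pass logic, written out in Python purely as a paraphrase of the description above (it is not the actual kernel code, and the per-call reduction factor is illustrative):

# Schematic paraphrase of the shrink_page_cache() behaviour described above.
# "reclaim" stands in for the kernel's LRU shrinking; this is not kernel code.
def shrink_page_cache(pages_over_limit, ignore_dirty, reclaim):
    # Only part of the overage is targeted per invocation, so the function has
    # to be called several times to get all the way down to the limit.
    target = pages_over_limit // 2

    freed = reclaim(target, lists=("inactive",), unmapped_only=True)
    if freed >= target:
        return freed                      # pass 1: inactive pagecache only

    freed += reclaim(target - freed, lists=("inactive", "active"),
                     unmapped_only=True)  # pass 2: active list considered too
    if freed >= target or ignore_dirty == 1:
        return freed

    # pass 3: repeat pass 2, but this time allow dirty pages to be written out
    freed += reclaim(target - freed, lists=("inactive", "active"),
                     unmapped_only=True, allow_writeout=True)
    return freed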


How it changes memory management:

——————————————————

If the pagecache_limit_mb is set to zero (default), nothing changes.

If set to a positive value, there will be three different operating modes:

(1) If we still have plenty of free pages, the pagecache limit will NOT be enforced. Memory management decisions are taken as normal.

(2) However, as soon as someone consumes those free pages, we’ll start freeing pagecache. As those pages are returned to the free page pool, freeing a few pages from pagecache will return us to state (1); if, however, someone consumes these free pages quickly, we’ll continue freeing up pages from the pagecache until we reach pagecache_limit_mb.

(3) Once we are at or below the low watermark, pagecache_limit_mb, the pages in the page cache will be governed by normal paging memory management decisions; if it starts growing above the limit (corrected by the free pages), we’ll free some up again.

This feature is useful for machines that have large workloads, carefully sized to eat most of the memory. Depending on the application’s page access pattern, the kernel may too easily swap the application memory out in favor of pagecache. This can happen even for low values of swappiness. With this feature, the admin can tell the kernel that only a certain amount of pagecache is really considered useful and that it should otherwise favor the applications’ memory.


pagecache_limit_ignore_dirty:

——————————————

The default for this setting is 1; this means that we don’t consider dirty memory to be part of the limited pagecache, as we cannot easily free up dirty memory (we’d need to do writes for this). By setting this to 0, we actually consider dirty (unmapped) memory to be freeable and do a third pass in shrink_page_cache() where we schedule the pages for write-out. Values larger than 1 are also possible and result in a fraction of the dirty pages being considered non-freeable.

From SAP on the subject:

If there are a lot of local writes and it is OK to throttle them by limiting the writeback caching, we recommend that you set the value to 0. If writing mainly happens to NFS filesystems, the default of 1 should be left untouched. A value of 2 would be a middle ground, not limiting local writeback caching as much, but potentially resulting in some paging.

Related:

7021829: Java Memory Management in Reflection X Advantage

Memory Use Overview

Attachmate Java applications rely on the Java Virtual Machine (JVM) to manage memory. The JVM allocates a pool of memory at startup and applications running in that JVM share that memory. Each Attachmate application that uses Java sets a maximum value for the size of the JVM memory pool. (This is also called the Java heap size.) The JVM can use memory, as required, up to this specified limit. As memory use approaches this limit, the JVM will recycle unused memory so that it remains available to applications running in the JVM.

When viewing memory used by a Java application, note the following:

  • All applications running in a JVM draw memory from the JVM memory pool, not from the Windows operating system memory. The memory used by all applications running in the JVM will never exceed the specified heap size for that JVM.
  • The JVM will recycle available memory back to the applications running in that JVM as needed, but this doesn’t happen until the memory use approaches the heap limit. Because of this, applications running in a JVM may not actually require all the memory reported in the Windows Task Manager at any given time.

In the Windows Task Manager, some Java applications create processes that use an application-specific image name. (For example Reflection X Manager and Reflection X Manager for Domains create processes listed under rxmgr.exe and rxmgrdomains.exe). Other applications (including the Reflection X Service) create processes that use java.exe as the image name.

If you also run other applications that consume a lot of memory, you may see performance issues. As Windows runs out of the random access memory (RAM) needed to run your applications, it uses something called virtual memory to compensate. When the available RAM runs low, Windows moves data from RAM to disk storage in a file called the paging file. Moving data to and from the paging file frees up RAM so your computer can complete its work, but reading data from the disk is slower than reading data directly from memory.

Memory Troubleshooting

On most systems, you will not need to modify your settings using any of the following troubleshooting techniques. The Reflection X Advantage default JVM settings provide sufficient memory to address the memory demands created by most X clients and servers. The default maximum heap size also ensures sufficient memory will be available on most systems for your Windows applications that don’t use this JVM. However, in some cases, contention for memory may prevent Reflection X Advantage applications from launching, or affect the performance of other applications on your system.

If your system’s memory is limited or fragmented, you may see one of the following indications that memory limits are preventing Reflection X Advantage applications from running.

  • In version 5.0 or later, you see a message saying “Application failed to start due to insufficient memory.”
  • In version 4.2 or earlier, the Reflection X Advantage application fails to launch, and the Windows Event log shows a message like the following:

“The description for Event ID (1025) in Source (RXAdvantage) cannot be found. The local computer may not have the necessary registry information or message DLL files to display messages from a remote computer. You may be able to use the /AUXSOURCE= flag to retrieve this description; see Help and Support for details. The following information is part of the event: The event log file is corrupt.”

If memory limits prevent Reflection X Advantage from running, or you have determined that contention for memory is affecting performance of other applications, consider the following approaches.

Install Reflection X Advantage using the 64-bit Installer

If you have installed Reflection X Advantage using the 32-bit installer on a 64-bit system, and Reflection X Advantage applications fail to launch, reinstall using the 64-bit installer. The memory limit on 64-bit processes is much larger than the limit on 32-bit processes.

Add RAM

The more RAM your computer has, the faster your programs will generally run. If a lack of RAM is slowing your computer, you may have success using the following adjustments. However, if none of these solutions result in better performance, adding additional RAM is the best solution.

Reduce the Maximum Memory Available to the JVM

Reducing the maximum memory limit for Reflection X Advantage (also called the Java heap size) can resolve the problem of Reflection X Applications failing to launch and can also increase the amount of memory available to other applications.

Note: This solution may affect performance in Reflection X Advantage. If memory is available Reflection X Advantage sessions cache images, the results of certain calculations, and other data. These actions can improve drawing performance and compression over low-bandwidth networks. If changing the Java heap size results in slow performance or X client failures, the value is probably set too low.

To change the maximum memory available to the Reflection X Advantage JVM:

  1. Close X Manager if it is running.
  2. Navigate to the Reflection X Advantage installation folder. (The default location for most installs is C:\Program Files\Attachmate\Reflection.)
  3. Locate rxmgr.alp (if you use standalone X Manager) or rxmgrdomains.alp (if you use X Manager for Domains) and remove the read-only attribute from this file. (Right-click, select Properties, clear the “Read-only” checkmark, then click OK.)
  4. Open the *.alp file in a text editor, such as Notepad.
  5. In the line that begins “alp.java.command.options=” locate the -Xmx setting. (The default value is -Xmx900m in version 5.0 and -Xmx1024m in version 4.2 and earlier.) Change this to a smaller value (for example: -Xmx700m).
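As a hedged sketch, the edit in step 5 could also be scripted; the file path is the default location mentioned above, the options line may contain other settings besides -Xmx, and 700m is just the example value:

# Sketch: lower the -Xmx value in rxmgr.alp (use rxmgrdomains.alp for X Manager for Domains).
# Back the file up first; the path and target value here are only examples.
import re
from pathlib import Path

alp = Path(r"C:\Program Files\Attachmate\Reflection\rxmgr.alp")
text = alp.read_text()

# Replace whatever -Xmx value is currently on the options line (e.g. -Xmx900m).
new_text, count = re.subn(r"-Xmx\d+[mMgG]", "-Xmx700m", text, count=1)
if count:
    alp.write_text(new_text)
    print("Updated the -Xmx setting to 700m")
else:
    print("No -Xmx setting found; nothing changed")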

Note: The *.alp files are replaced by newer, default files when you upgrade your Reflection X Advantage product. If memory problems reappear after an upgrade, see Technical Note 2530, “Installing Newer Versions of Reflection X Advantage If You Have Edited Rxmgr.alp.”

Increase Virtual Memory

You can increase the amount of virtual memory available by increasing the minimum and maximum sizes of the paging file.

This change can result in slower performance in Reflection X Advantage and other applications because of excessive disk paging. This solution may work for you if there are idle applications that can be paged out at times.

To modify virtual memory in Windows:

  1. Open the System Properties dialog box. (In Windows 7: Start menu > right-click Computer > Properties > Advanced system settings.)
  2. On the Advanced tab, under Performance, click Settings.
  3. Click the Advanced tab, and then, under Virtual memory, click Change.
  4. Click Custom size and set values for Initial size (MB) and/or Maximum size (MB).

Increases in size usually don’t require a restart, but if you decrease the size, you will need to restart Windows.

Add Additional Reflection X Advantage Nodes

If you are running Reflection X Advantage in domain mode and are experiencing memory contention on the domain node, you can experiment with using multiple nodes to help alleviate memory problems. Reflection X Advantage supports creating multiple nodes on the same system. This change effectively increases the allowed memory available to run sessions. You can also add nodes on remote systems. This increases both the number of CPU cores available and amount of available memory.

To add nodes to a domain, use the rxsconfig command line utility. For details, see “Set Up Domain Nodes” in the Reflection X Advantage product Help.

The number of sessions that a node can support depends on what kind of domain services are configured for the session (such as whether the session has a headless server running on the node) and how active the clients of the session are. You can monitor the load on your nodes using the Administrative Console. As an initial approximation, a session without a headless server can be expected to require about 1MB, and with a headless server about 25MB of heap space. For a 1GB heap (the default), this can be anything from about 30-1000 sessions. With a large number of sessions, resources such as cores and network bandwidth become a bigger concern.
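Those rough per-session figures translate into back-of-the-envelope capacity estimates like the following (illustrative arithmetic only, using the approximations quoted above):

def max_sessions(heap_mb=1024, headless_fraction=0.5):
    """Estimate sessions per node from the rough per-session heap figures above."""
    per_session_mb = 25 * headless_fraction + 1 * (1 - headless_fraction)
    return int(heap_mb / per_session_mb)

print(max_sessions(headless_fraction=1.0))  # ~40: every session runs a headless server
print(max_sessions(headless_fraction=0.0))  # ~1024: no headless servers at all
print(max_sessions(headless_fraction=0.5))  # ~78: an even mix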

Related:

Datacap webservice does not process the vscan task

We are running Datacap 9.1.1 on Windows 2012.
We have a Datacap application based on the standard template that we need to call via web service.
I tried to simulate the process using Postman (similar to Fiddler) to generate the REST calls. Our input files are in a folder as they would be if the process was run via the Datacap Studio. The process seems to perform the vscanmulti task but the result is just a vscanmulti.xml page file in the batch folder. The next task PageId obviously gets stuck as pending as no file was scanned. I tried using GrabBatch and Release but they just generate pagefiles in the Batch folder with the task names but no processing is performed. No task logs are produced either.
How do I get the app to process vscan then followed by the remaining tasks? I understand that I need to use the GetPageFile and SetPageFile but that is only after the scan is performed successfully.

The following are the steps:

1. /Session/Logon – successful
2. /Queue/CreateBatch – apparently successful. Request includes the pagefile definition.
3. /Queue/ReleaseBatch/AppDev/135/finished – successful
4. /Queue/GrabBatch/AppDev/135 – successful
5. /Queue/ReleaseBatch/AppDev/135/finished – successful
6. etc.
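For reference, the call sequence above scripted end to end might look roughly like the sketch below. The base URL, HTTP verbs, headers, and payloads are placeholders and assumptions rather than the documented wTM API; they need to be matched against the Datacap Web Services documentation for your release.

# Outline of the wTM web service call sequence listed above (placeholders only).
# The endpoint paths come from the steps above; verbs and payloads are assumptions.
import requests

BASE = "http://datacapserver/ServicewTM.svc"   # assumed base URL; adjust to your install
LOGON_XML = "<...>"          # placeholder: logon credentials XML for your application
CREATE_BATCH_XML = "<...>"   # placeholder: batch definition XML including the page file

session = requests.Session()

# 1. /Session/Logon
session.post(f"{BASE}/Session/Logon", data=LOGON_XML, headers={"Content-Type": "text/xml"})

# 2. /Queue/CreateBatch - request includes the page file definition
resp = session.post(f"{BASE}/Queue/CreateBatch", data=CREATE_BATCH_XML,
                    headers={"Content-Type": "text/xml"})
batch_id = "135"             # in practice, parse the batch/queue id from the response

# 3. Release the batch so the next task in the workflow can pick it up
session.put(f"{BASE}/Queue/ReleaseBatch/AppDev/{batch_id}/finished")

# 4./5. Grab and release again for each subsequent task
session.put(f"{BASE}/Queue/GrabBatch/AppDev/{batch_id}")
session.put(f"{BASE}/Queue/ReleaseBatch/AppDev/{batch_id}/finished")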

I have reviewed both the Knowledge Center and the IBM Datacap V9.0 Installing and Using Datacap Web Services.pdf as well as the Datacap-Web-Services-Projects sample from DeveloperWorks.
Thanks in advance.

Related:

How to separate the second table for FindTableValueRegEx

I’d like to extract data from multiple tables in a page by FindTableValueRegEx, which was introduced by Datacap 9.1.1.

FindTableValueRegEx action’s help message said:

> Multiple Tables:
> If the layout file contains multiple tables, only the first table will be searched with this action. To search tables other than the first table, the Locate action library action CreateVirtualPage has the ability to separate out sections of a page into unique pages. Depending on the contents of the expected documents, this action can be used to break out tables into separate pages allowing FindTableValueRegEx to be used to search that new page containing only the table of interest.

So, I tried to use CreateVirtualPage action for separating the second table.
I think the steps can be below.

1. Recognize()
2. CreateCcoFromLayout()
3. (Locate action to set the beginning of the second table ?)
4. SetVirtualPageStartPosition()
5. (Locate action to set the end of the second table ?)
6. SetVirtualPageEndPosition()
7. CreateVirtualPage()

However, I couldn’t find a way to set locations of the beginning and the end of the second table.
I appreciate any advice.

Related:

7021290: Increased process “memory usage” after upgrading to SLES 12 SP2

When code from a shared library is used, the code that gets executed will create virtual-to-physical mappings of the pages it “touches”.

A page on x86 is 4 kilobytes (4096 bytes). If only one small function in a library is used on SP0 [which does not have “do_fault_around”], that is how much RSS that library will be charged.

On SP2, that same function call will map 64 kilobytes (65536 bytes), and this will be reflected in the Resident Set Size (RSS) and Proportional Set Size (PSS).

The values can be found in /proc/<pid>/smaps.
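For example, the per-mapping Rss and Pss figures can be pulled out of smaps and summed with a small sketch like this (it needs permission to read the target process’s /proc entry):

# Sum Rss and Pss (in kB) across all mappings of a process from /proc/<pid>/smaps.
import sys

def rss_pss_kb(pid):
    totals = {"Rss": 0, "Pss": 0}
    with open(f"/proc/{pid}/smaps") as smaps:
        for line in smaps:
            key = line.split(":", 1)[0]
            if key in totals:
                totals[key] += int(line.split()[1])   # field format: "<key>: <value> kB"
    return totals

pid = sys.argv[1] if len(sys.argv) > 1 else "self"
print(rss_pss_kb(pid))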

Note though, that the physical pages are already present in the “page cache”, so there is no difference in the actual memory usage here.

The file /sys/kernel/debug/fault_around_bytes does not exist on SLES 12 GA, but on SP2 it exists with a default value of 65536.

In the end the customer was able to restore the memory use characteristics/stats of SP0 on SP2 by

echo 4096 > /sys/kernel/debug/fault_around_bytes
