Understanding Workspace Environment Management (WEM) System Optimization

The WEM System Optimization feature is a group of settings designed to dramatically lower resource usage on a VDA on which the WEM Agent is installed.

These are machine-based settings that will apply to all user sessions.




Managing Servers with different Hardware Configurations

Sets of VMs may have been configured with different hardware configurations. For instance some machine may have 4 CPU cores and 8GB RAM, while others have 2 CPU Cores and 4GB RAM. The determination could be made such that each server set requires a different set of WEM System Optimization settings. Because machines can only be part of one WEM ConfigSet, administrators must consider whether they need to create multiple ConfigSets to accommodate different optimization profiles.


WEM System Optimization Settings

User-added image


Fast Logoff

A purely visual option that will end the HDX connection to a remote session, giving the impression that the session has immediately closed. However, the session itself continues to progress through the session logoff phases on the VDA.


CPU Management

CPU Priority:

You can statically define the priority for a process. Every instance of, for example, Notepad that is launched on the VDA will be launched with a priority of the desired CPU priority. The choices are:

  • Idle
  • Below Normal
  • Normal
  • Above Normal
  • High
  • Realtime *

* https://stackoverflow.com/questions/1663993/what-is-the-realtime-process-priority-setting-for

CPU Affinity:

You can statically define how many CPU cores a process will use. Every instance of Notepad that is launched on the VDA will use the number of cores defined.

Process Clamping:

Process clamping allows you to prevent a process from using more CPU percentage than the specified value. A process in the Process Clamping list can use CPU up to the configured percentage, but will not go higher. The setting limits the CPU percentage no matter which CPU cores the process uses.

Note: The clamping percentage is global, not per core (that is, 10% on a quad-core CPU is 10%, not 10% of one core).


Generally, Process Clamping is not a recommended solution for keeping the CPU usage of a troublesome process artificially low. It’s a brute force approach and computationally expensive. The better solution is to use a combination of CPU spikes protection and to assign static Limit CPU / Core Usage, CPU priorities, CPU affinities values to such processes.

CPU Management Settings:

CPU Spikes Protection:

CPU Spikes Protection is not the same as Process Clamping. Process Clamping will prevent a process from exceeding a set CPU percentage usage value. Spikes Protection manages the process when it exceeds the CPU Usage Limit (%) value.

CPU Spikes Protection is not designed to reduce overall CPU usage. CPU Spikes Protection is designed to reduce the impact on user experience by processes that consume an excessive percentage of CPU Usage.

If a process exceeds the CPU Usage Limit (%) value, for over a set period of time (defined by the Limit Sample Time (s) value), the process will be relegated to Low Priority for a set period of time, defined by the Idle Priority Time (s) value. The CPU usage Limit (%) value is global across all logical processors.

The total number of logical processors is determined by the number of CPUs, the number of cores in the CPU, and whether HyperThreading is enabled. The easiest method of determining the total number of logical cores in a machine is by using Windows Task Manager (2 logical processors shown in the image):

User-added image

To better understand CPU Spikes Protection, let’s follow a practical scenario:

Users commonly work with a web app that uses Internet Explorer. An administrator has noticed that iexplore.exe processes on the VDAs consume a lot of CPU time and overall responsiveness in user sessions is suffering. There are many other user processes running and percentage CPU usage is running in the 90 percent range.

To improve responsiveness, the administrator sets the CPU Usage Limit value to 50% and a Idle Priority Time of 180 seconds. For any given user session, when a single iexplore.exe process instance reaches 50% CPU usage, it’s CPU priority is immediately lowered to Low for 180 seconds. During this time iexplore.exe will consequently get less CPU time due to its low position in the CPU queue and thereby reduce its impact on overall session responsiveness. Other user processes that haven’t also reached 50% have a higher CPU priority and so continue to consume CPU time and although the overall percentage CPU usage continues to show above 90%, the session responsiveness for that user is greatly improved.

In this scenario, the machine has 4 logical processors. If the processes’ CPU usage is spread equally across all logical processors, each will show 12.5% usage for that process instance.

If there are two iexplore.exe process instances in a session, their respective percentage CPU usage values are not added to trigger Spikes Protection. Spikes Protection settings apply on each individual process instance.​

User-centric CPU Optimization (process tracking on the WEM Agent):

As stated previously, all WEM System Optimization settings are machine-based and settings configured for a particular ConfigSet will apply to all users launching sessions from the VDA.

The WEM Agent records the history of every process on the machine that has triggered Spikes Protection. It records the number of times that the process has triggered Spikes Protection, and it records the user for which the trigger occurred.

So if a process triggers the CPU Spikes Protection in User A’s session, the event is recorded for User A only. If User B starts the same process, then WEM Process Optimization behavior is determined only by process triggers in User B’s session. On each VDA the Spike Protection triggers for each user (by user SID) are stored in the local database on the VDA and refreshing the cache does not interfere with this stored history.

Limit CPU / Core Usage:

When a process has exceeded the CPU Usage Limit value (i.e. Spikes Protection for the process has been triggered), in addition to setting the CPU priority to Low, WEM can also limit the amount of CPU cores that the process uses if a CPU / Core Usage Limit value is set. The limit is in effect for the duration of the Idle Priority Time.

Enable Intelligent CPU Optimization:

When Enable Intelligent CPU Optimization is enabled, all processes that the user launches in their session will start at a CPU Priority of High. This makes sense as the user has purposefully launched the process, so we want the process to be reactive.

If a process triggers Spikes Protection, it will be relegated to Low priority for 180 seconds (if default setting is used). But, if it triggers Spikes Protection a certain number of times, the process will run at the next lowest CPU Priority the next time it’s launched.

So it was launching at High priority initially; once the process exceed a certain number of triggers, it will launch at Above Normal priority the next time. If the process continues to trigger Spikes Protection, it will launch at the next lowest priority until eventually it will launch at the lowest CPU priority.

The behavior of Enable Intelligent CPU Optimization is overridden if a static CPU Priority value has been set for a process. If Enable Intelligent CPU Optimization is enabled and a process’s CPU Priority value has been set to Below Normal, then the process will launch at Below Normal CPU priority instead of the default High priority.

If Enable Intelligent CPU Optimization is enabled and a process’s CPU Priority value has been statically set to High, then the process will launch at High. If the process triggers Spikes Protection, it will be relegated to Low priority for 180 seconds (if default setting is used), but then return to High priority afterwards.

Note: The Enable CPU Spikes Protection box must be ticked for Enable Intelligent CPU Optimization to work.


Memory Management

Working Set Optimization:

WEM determines how much RAM a running process is currently using and also determines the least amount of RAM the process requires, without losing stability. The difference between the two values is considered by WEM to be excess RAM. The process’s RAM usage is calculated over time, the duration of which is configured using the Idle Sample Time (min) WEM setting. The default value is 120 minutes.

Let’s look at a typical scenario when WEM Memory Management has been enabled:

A user opens Internet Explorer, navigates to YouTube, and plays some videos. Internet Explorer will use as much RAM as it needs. In the background, and over the sampling period, WEM determines the amount of RAM Internet Explorer has used and also determines the least amount of RAM required, without losing stability.

Then the user is finished with Internet Explorer and minimizes it to the Task Bar. When the process percentage CPU usage drops to the value set by the Idle State Limit (percentage) value (default is 1%), WEM then forces the process to release the excess RAM (as previously calculated). The RAM is released by writing it to the pagefile.

When the user restores Internet Explorer from the Task Bar, it will initially run in its optimized state but can still go on to consume additional RAM as needed.

When considering how this affects multiple processes over multiple user sessions, the result is that all of that RAM freed up is available for other processes and will increase user density by supporting a greater amount of users on the same server.

Idle State Limit (percent):

The value set here is the percentage of CPU usage under which a process is considered to be idle. The default is 1% CPU usage. Remember that when a process is considered to be idle, WEM forces it to shed its excess RAM. So be careful not to set this value too high; otherwise a process being actively used may be mistaken as an idle process, resulting in its memory being released. It is not advised to set this value higher than 5%.


I/O Management

These settings allow you to optimize the I/O priority of specific processes, so that processes which are contending for disk and network I/O access do not cause performance bottlenecks. For example, you can use I/O Management settings to throttle back a disk-bandwidth-hungry application.

The process priority you set here establishes the “base priority” for all of the threads in the process. The actual, or “current,” priority of a thread may be higher (but is never lower than the base). In general, Windows give access to threads of higher priority before threads of lower priority.

I/O Priority Settings:

Enable Process I/O Priority

When selected, this option enables manual setting of process I/O priority. Process I/O priorities you set take effect when the agent receives the new settings and the process is next restarted.

Add Process I/O Priority

Process Name: The process executable name without the extension. For example, for Windows Explorer (explorer.exe) type “explorer”.

I/O Priority: The “base” priority of all threads in the process. The higher the I/O priority of a process, the sooner its threads get I/O access. Choose from High, Normal, Low, Very Low.

Enable Intelligent I/O Optimization:

This adopts exactly the same principles as Enable Intelligent CPU Optimization, but for I/O instead of CPU.

Note: The Enable CPU Spikes Protection box must be ticked for Enable Intelligent I/O Optimization to work.


Exclude specified processes:

By default, WEM CPU Management excludes all of the most common Citrix and Windows core service processes. This is because they make the environment run and they need to make their own decisions about how much CPU time & priority they need. WEM administrators can however, add processes they want to exclude from Spikes Protection to the list. Typically, antivirus processes would be excluded. In this case, in order to stop antivirus scanning taking over disk I/O in the session, administrators would also set a static I/O Priority of Low for antivirus processes.


Notes:

  1. When configuring, the entered process name is a match to the process name’s entry in Windows Task Manager.
  2. Process names are not case-sensitive.
  3. You don’t enter “.exe” after the process name. So for instance, enter “notepad” rather than “notepad.exe”.

Related:

Server HDD activity high, crippling

I do not need a solution (just sharing information)

Our small organization’s single Server 2012 file server has been running SEPM since 2014.

I’ve been here since 2017.

Since I;ve been here, the server has been sluggish.

It’s a VM with 3TB disk space on 2 volumes and 6GB dedicated RAM.

The only other VMs are a very small linux VM and a small Win7 VM used only for remote login.

I’ve played around with stripping out unnecessary apps and even moved the paging file with only marginal sucess.

Today it was particularly lethargig and I noticed in resource monitor, that almost all of the disk activity (which was quite high) was rrelated to SEPM.

A google search turned up an issue where someone had “Live Update Administrator” and SEPM both running on the same server, and that was causing excessive disk activity.

That post was from 2010. Our services are not named the same.

I was wondering if “Live Update” is the same as “Live Update Manager” as we have bothe “Live Update” and “Symantec Endpoint Manager” services runningas well as a few others that begin with “Symantec…”

I wanted to stop the “Live Update” service to see what happened but I can’t. Looks like I’ll have to uninstall it.

It shows up as a seperate installed ap in “add and remove…”

I want to make sure from someone here, before i do that, though.

Thanks-

KK

0

Related:

Avamar Client for Windows: Avamar backup fails with “avtar Error : Out of memory for cache file” on Windows clients

Article Number: 524280 Article Version: 3 Article Type: Break Fix



Avamar Plug-in for Oracle,Avamar Client for Windows,Avamar Client for Windows 7.2.101-31



In this scenario we have the same issue presented in the KB 495969 however the solution does not apply due to an environment issue on a Windows client.

  • KB 495969 – Avamar backup fails with “Not Enough Space” and “Out of Memory for cache file”

The issue could affect any plugin like in this case with the error presented in the following manner:

  • For FS backups:
avtar Info <8650>: Opening hash cache file 'C:Program Filesavsvarp_cache.dat'avtar Error <18866>: Out of memory for cache file 'C:Program Filesavsvarp_cache.dat' size 805306912avtar FATAL <5351>: MAIN: Unhandled internal exception Unix exception Not enough space
  • For VSS backups:
avtar Info <8650>: Opening hash cache file 'C:Program Filesavsvarp_cache.dat'avtar Error <18866>: Out of memory for cache file 'C:Program Filesavsvarp_cache.dat' size 1610613280avtar FATAL <5351>: MAIN: Unhandled internal exception Unix exception Not enough space
  • For Oracle backup:
avtar Info <8650>: Opening hash cache file 'C:Program Filesavsvarclientlogsoracle-prefix-1_cache.dat'avtar Error <18866>: Out of memory for cache file 'C:Program Filesavsvarclientlogsoracle-prefix-1_cache.dat' size 100663840avtar FATAL <5351>: MAIN: Unhandled internal exception Unix exception Not enough spaceor this variant:avtar Info <8650>: Opening hash cache file 'C:Program Filesavsvarclientlogsoracle-prefix-1_cache.dat'avtar Error <18864>: Out of restricted memory for cache file 'C:Program Filesavsvarclientlogsoracle-prefix-1_cache.dat' size 100663840avtar FATAL <5351>: MAIN: Unhandled internal exception Unix exception Not enough space avoracle Error <7934>: Snapup of <oracle-db> aborted due to rman terminated abnormally - check the logs
  • With the RMAN log reporting this:
RMAN-00571: ===========================================================RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============RMAN-00571: ===========================================================RMAN-03002: failure of backup plus archivelog command at 06/14/2018 22:17:40RMAN-03009: failure of backup command on c0 channel at 06/14/2018 22:17:15ORA-04030: out of process memory when trying to allocate 1049112 bytes (KSFQ heap,KSFQ Buffers)Recovery Manager complete. 

Initially it was though the cache file could not grow in size due to incorrect “hashcachemax” value.

The client had plenty of free RAM (48GB total RAM) so we increase the flag’s value from -16 (3GB file size max) to -8 (6GB file size max)

But the issue persisted and the disk space was also not an issue, there was plenty of GBs of free space

Further investigations with a test binary from the engineering team lead to the fact that the MS OS was not releasing enough unused and contiguous memory required to allocate/load into the memory the entire hash cache file for the backup operation.

It was tried a test binary that would allocate the memory in smaller pieces to see if we could reach the point where the OS would allow the full file p_cache.dat to be loaded into memory but that also did not help, the Operative system was still not allowing to load the file into memory for some reason.

The root cause is hided somewhere in the OS however in this case we did not engage the MS team for further investigations on their side.

Instead we found a way to work around the issue setting the cache file to be smaller, see details in the resolution section below.

In order to work around this issue we set the hash cache file to be of a smaller size so that the OS would not have issues in allocating it into memory.

In this case it was noticed that the OS was also having problems in allocating smaller sizes like 200+ MB so we decided to re-size the p_cache.dat to be just 100MB with the use of the following flag:

–hashcachemax=100

This way the hash cache file would never grow beyond 100MB and would overwrite the old entries.

After adding that flag it is requited to recycle the cache file by renaming or deleting the p_cache.dat (renaming is the preferred option)

After the first backup which would take longer than usual as expected (to rebuild the cache file) the issue should be resolved.

  • The Demand-paging cache is not recommended in this scenario since the backup are directed to GSAN storage so the Monolithic paging cache was used.
  • Demand-paging was designed to gain benefit for backup being sent to DataDomain storage.

Related:

Understanding Write Cache in Provisioning Services Server

This article provides information about write cache usage in a Citrix Provisioning, formerly Provisioning Services (PVS), Server.

Write Cache in Provisioning Services Server

In PVS, the term “write cache” is used to describe all the cache modes. The write cache includes data written by the target device. If data is written to the PVS server vDisk in a caching mode, the data is not written back to the base vDisk. Instead, it is written to a write cache file in one of the following locations:

When the vDisk mode is private/maintenance mode, all data is written back to the vDisk file on the PVS Server. When the target device is booted in standard mode or shared mode, the write cache information is checked to determine the cache location. When a target device boots to a vDisk in standard mode/shared mode, regardless of the cache type, the data written to the Write Cache is deleted on boot so that when a target is rebooted or starts up it has a clean cache and contains nothing from the previous sessions.

If the PVS target is using Cache on Device RAM with overflow on hard disk or Cache on device hard disk, the PVS target software either does not find an appropriate hard disk partition or it is not formatted using NTFS. As a result, it will fail over to Cache on the server. The PVS target software will, by default, redirect the system page file to the same disk as the write cache so that the pagefile.sys is allocating space on the cache drive unless it is manually set up to be redirected on a separate volume.

For RAM cache without a local disk, you should consider setting the system page file to zero because all writes, including system page file writes, will go to the RAM cache unless redirected manually. PVS does not redirect the page file in the case of RAM cache.



Cache on device Hard Disk

Requirements

  • Local HD in every device using the vDisk.
  • The local HD must contain a basic volume pre-formatted with a Windows NTFS file system with at least 512MB of free space.

The cache on local HD is stored in a file called .vdiskcache on a secondary local hard drive. It gets created as an invisible file in the root folder of the secondary local HD. The cache file size grows, as needed, but never gets larger than the original vDisk, and frequently not larger than the free space on the original vDisk. It is slower than RAM cache or RAM Cache with overflow to local hard disk, but faster than server cache and works in an HA environment. Citrix recommends that you do not use this cache type because of incompatibilities with Microsoft Windows ASLR which could cause intermittent crashes and stability issues. This cache is being replaced by RAM Cache with overflow to the hard drive.

Cache in device RAM

Requirement

  • An appropriate amount of physical memory on the machine.

The cache is stored in client RAM. The maximum size of the cache is fixed by a setting in the vDisk properties screen. RAM cache is faster than other cache types and works in an HA environment. The RAM is allocated at boot and never changes. The RAM allocated can’t be used by the OS. If the workload has exhausted the RAM cache size, the system may become unusable and even crash. It is important to pre-calculate workload requirements and set the appropriate RAM size. Cache in device RAM does not require a local hard drive.

Cache on device RAM with overflow on Hard Disk

Requirement

  • Provisioning Service 7.1 hotfix 2 or later.
  • Local HD in every target device using the vDisk.
  • The local HD must contain Basic Volume pre-formatted with a Windows NTFS file system with at least 512 MB of free space. By default, Citrix sets this to 6 GB but recommends 10 GB or larger depending on workload.
  • The default RAM is 64 MB RAM, Citrix recommends at least 256 MB of RAM for a Desktop OS and 1 GB for Server OS if RAM cache is being used.
  • If you decide not to use RAM cache you may set it to 0 and only the local hard disk will be used to cache.

Cache on device RAM with overflow on hard disk represents the newest of the write cache types. Citrix recommends using this cache type for PVS, it combines the best of RAM with the stability of hard disk cache. The cache uses non-paged pool memory for the best performance. When RAM utilization has reached its threshold, the oldest of the RAM cache data will be written to the local hard drive. The local hard disk cache uses a file it creates called vdiskdif.vhdx.

Things to note about this cache type:

  • This write cache type is only available for Windows 7/2008 R2 and later.
  • This cache type addresses interoperability issues with Microsoft Windows ASLR.



Cache on Server

Requirements

  • Enough space allocated to where the server cache will be stored.
Server cache is stored in a file on the server, or on a share, SAN, or other location. The file size grows, as needed, but never gets larger than the original vDisk, and frequently not larger than the free space on the original vDisk. It is slower than RAM cache because all reads/writes have to go to the server and be read from a file. The cache gets deleted when the device reboots, that is, on every boot, the device reverts to the base image. Changes remain only during a single boot session. Server cache works in an HA environment if all server cache locations to resolve to the same physical storage location. This cache type is not recommended for a production environment.

Additional Resources

Selecting the Write Cache Destination for Standard vDisk Images

Turbo Charging your IOPS with the new PVS Cache in RAM with Disk Overflow Feature

Related:

7004093: How to get a Windows memory dump

If the “Complete memory dump” option is not available:

If the “Complete memory dump” option is removed from the choice list in the later Windows versions, it is because Windows knows that a Complete memory dump isn’t possible. e.g. The amount of physical RAM is more than 2GB, or the page file size isn’t set to the size of physical memory or greater.

The “How to generate a kernel or a complete memory dump file in Windows Server 2008” KB article (http://support.microsoft.com/kb/969028) presents a good deal of information on what’s new and different regarding obtaining a crash dump on Vista/2008, and also covers the “how to manually force a dump” topic too. Although the document describes the possibility of enabling the “Complete” memory dump option even though the machine has over 4GB of memory, due to the issue described of dumps over 4GB potentially being corrupt and the general non-necessity of actually making and uploading a dump of that size, Novell recommends using the “truncatememory or removememory switches in the BCDEdit.exe” approach described in the document.

i.e. From an elevated command prompt (i.e. “Run as administrator”), execute this command:

BCDEDIT.EXE /set {current} truncatememory 0x80000000

to have Windows ignore all the memory above 2GB after the next reboot. Now (after reboot) the “Complete” memory dump option should become available, and the Complete dump generated won’t be larger than 2GB.

To return the machine to its original memory configuration, execute this command:

BCDEDIT.EXE /deletevalue {current} truncatememory

Windows 7 Specific

When attempting to collect a memory dump in connection with a Windows 7 kernel-mode crash, the MEMORY.DMP file may be unexpectedly missing. This may be due to the following Windows 7-specific default behavior:

If there are less than 25GB of disk space free and the machine is not joined to a domain, by default Windows will delete a generated MEMORY.DMP file rather than keeping it. (After Windows reboots and reports the crash to Microsoft via the online crash analysis / Windows Error Reporting.)

If there are more than 25GB, or the machine is joined to a domain (read “corporate environment”), or you’re actually on a Windows Server 2008 R2 (not Windows 7 Ultimate / Professional / Home), the MEMORY.DMP will be retained by default, as it always has in previous versions of Windows.

The Windows 7 default policy can be explicitly overridden by setting the following registry value:

[HKEY_LOCAL_MACHINESYSTEMCurrentControlSetControlCrashControl]

“AlwaysKeepMemoryDump”=dword:00000001


Formerly known as TID# 10084257

Related:

7021211: Memory, I/O and DefaultTasksMax related considerations for SLES for SAP servers with huge memory

Somegeneral guidelines about using pagecache_limit and optimizing some ofthe I/O related settings:-

If on the server in question,you are *not* simultaneously mixing a heavy file I/O workload whilerunning a memory intensive application workload, then this setting(pagecache_limit) will probably cause more harm than good. However,in most SAP environments, there is both high I/O and memory intensiveworkloads.

Ideally, vm.pagecache_limit_mb should be zerountil such time that pagecache is seen to exhaust memory. If it doesexhaust memory then trial-and-error-tuning must be used to findvalues that work for the specific server/workload in question.

Asregards the type of settings that have both a fixed value and a’ratio’ setting option, keep in mind that ratio settings will be moreand more inaccurate as the amount of memory in the server grows.Therefore, specific ‘byte’ settings should be used as opposed to’ratio’ type settings. The ‘ratio’ settings can allow too muchaccumulation of dirty memory which has been proven to lead toprocessing stalls during heavy fsync or sync write loads. Settingdirty_bytes to a reasonable value (which depends on the storageperformance) leads to much less unexpected behavior.

Setting,say, a 4gb pagecache limit on a 142G machine, is asking for trouble,especially when you consider that this would be much smaller than adefault dirty ratio limit (which is by default 40% of availablepages).

If the pagecache_limit is used, it should alwaysbe set to a value well above the ‘dirty’ limit, be it a fixed valueor a percentage.

The thing is that there is no universal’correct’ values for these settings. You are always balancingthroughput with sync latency. If we had code in the kernel so that itwould auto-tune automatically based on the amount of RAM in theserver, it would be very prone to regressions because it depends onserver-specific loading. So, necessarily, it falls to the serveradmins to come up with the best values for these settings (viatrial-and-error).

*If* we know for a fact that the serverdoes encounter issues with pagecache_limit set to 0 (not active),then choose a pagecache_limit that is suitable in relation to howmuch memory is in the server.

Lets assume that you have aserver with 1TB of RAM, these are *suggested* values which could beused as a starting point:-

pagecache_limit_mb = 20972 # 20gb – Different values could be tried from say 20gb <>64gb

pagecache_limit_ignore_dirty = 1 # see the below section on this variable to decide what it should be set toovm.dirty_ratio =0

vm.dirty_bytes = 629145600 # This could be reduced orincreased based on actual hardware performance but

keep thevm.dirty_background_bytes to approximately 50% of thissetting

vm.dirty_background_ratio = 0

vm.dirty_background_bytes= 314572800 # Set this value to approximately 50% of vm.dirty_bytes


NOTE: If it isdecided to try setting pagecache_limit to 0 (not active) then it’sstill a good idea to test different values for dirty_bytes anddirty_background_bytes in an I/O intensive environment to arrive atthe bestperformance.

———————————————————————

Howpagecache_limit works:

—————————————-

Theheart of this patch is a function called shrink_page_cache(). It iscalled from balance_pgdat (which is the worker for kswapd) if thepagecache is above the limit. The function is also called in__alloc_pages_slowpath.

shrink_page_cache() calculates thenumber of pages the cache is over its limit. It reduces this numberby a factor (so you have to call it several times to get down to thetarget) then shrinks the pagecache (using the KernelLRUs).

shrink_page_cache does several passes:

– Just reclaiming from inactive pagecache memory. This is fast– but it might not find enough free pages; if that happens, thesecond pass will happen.

– In the second pass,pages from active list will also be considered.

– The third pass will only happen if pagecacahe_limig_ignore-dirty isnot 1. In that case, the third pass is a repetition of the secondpass, but this time we allow pages to be written out.

Inall passes, only unmapped pages will be considered.


Howit changes memorymanagement:

——————————————————

Ifthe pagecache_limit_mb is set to zero (default), nothing changes.

Ifset to a positive value, there will be three different operatingmodes:

(1) If we still have plenty of free pages, the pagecachelimit will NOT be enforced. Memory management decisions are taken asnormally.

(2) However, as soon someone consumes those freepages, we’ll start freeing pagecache — as those are returned to thefree page pool, freeing a few pages from pagecache will return us tostate (1) — if however someone consumes these free pages quickly,we’ll continue

freeing up pages from the pagecache until wereach pagecache_limit_mb.

(3) Once we are at or below the lowwatermark, pagecache_limit_mb, the pages in the page cache will begoverned by normal paging memory management decisions; if it startsgrowing above the limit (corrected by the free pages), we’ll freesome up again.

This feature is useful for machines thathave large workloads, carefully sized to eat most of the memory.Depending on the applications page access pattern, the kernel may tooeasily swap the application memory out in favor of pagecache. Thiscan happen even for low values of swappiness. With this feature, theadmin can tell the kernel that only a certain amount of pagecache isreally considered useful and that it otherwise should favor theapplicationsmemory.


pagecache_limit_ignore_dirty:

——————————————

Thedefault for this setting is 1; this means that we don’t considerdirty memory to be part of the limited pagecache, as we can not easily free up dirty memory (we’d need to do writes for this). Bysetting this to 0, we actually consider dirty (unampped) memoryto be freeable and do a third pass in shrink_page_cache() where weschedule the pages for write-out. Values larger than 1 are alsopossible and result in a fraction of the dirty pages to be considerednon-freeable.

From SAP on the subject:

If there are alot of local writes and it is OK to throttle them by limiting thewriteback caching, we recommended that you set the value to 0. Ifwriting mainly happens to NFS filesystems, the default 1 should beleft untouched. A value of 2 would be a middle ground, not limitinglocal write back caching as much, but potentially resulting in somepaging.

Related:

7021829: Java Memory Management in Reflection X Advantage

Memory Use Overview

Attachmate Java applications rely on the Java Virtual Machine (JVM) to manage memory. The JVM allocates a pool of memory at startup and applications running in that JVM share that memory. Each Attachmate application that uses Java sets a maximum value for the size of the JVM memory pool. (This is also called the Java heap size.) The JVM can use memory, as required, up to this specified limit. As memory use approaches this limit, the JVM will recycle unused memory so that it remains available to applications running in the JVM.

When viewing memory used by a Java application, note the following:

  • All applications running in a JVM draw memory from the JVM memory pool, not from the Windows operating system memory. The memory used by all applications running in the JVM will never exceed the specified heap size for that JVM.
  • The JVM will recycle available memory back to the applications running in that JVM as needed, but this doesn’t happen until the memory use approaches the heap limit. Because of this, applications running in a JVM may not actually require all the memory reported in the Windows Task Manager at any given time.

In the Windows Task Manager, some Java applications create processes that use an application-specific image name. (For example Reflection X Manager and Reflection X Manager for Domains create processes listed under rxmgr.exe and rxmgrdomains.exe). Other applications (including the Reflection X Service) create processes that use java.exe as the image name.

If you also run other applications that consume a lot of memory, you may see performance issues. As Windows runs out of the random access memory (RAM) needed to run your applications, it uses something called virtual memory to compensate. When the available RAM runs low, Windows moves data from RAM to disk storage in a file called the paging file. Moving data to and from the paging file frees up RAM so your computer can complete its work, but reading data from the disk is slower than reading data directly from memory.

Memory Troubleshooting

On most systems, you will not need to modify your settings using any of the following troubleshooting techniques. The Reflection X Advantage default JVM settings provide sufficient memory to address the memory demands created by most X clients and servers. The default maximum heap size also ensures sufficient memory will be available on most systems for your Windows applications that don’t use this JVM. However, in some cases, contention for memory may prevent Reflection X Advantage applications from launching, or affect the performance of other applications on your system.

If your system’s memory is limited or fragmented, you may see one of the following indications that memory limits are preventing Reflection X Advantage applications from running.

  • In version 5.0 or later, you see a message saying “Application failed to start due to insufficient memory.”
  • In version 4.2 or earlier, the Reflection X Advantage application fails to launch, and the Windows Event log shows a message like the following:

“The description for Event ID (1025) in Source (RXAdvantage) cannot be found. The local computer may not have the necessary registry information or message DLL files to display messages from a remote computer. You may be able to use the /AUXSOURCE= flag to retrieve this description; see Help and Support for details. The following information is part of the event: The event log file is corrupt.”

If memory limits prevent Reflection X Advantage from running, or you have determined that contention for memory is affecting performance of other applications, consider the following approaches.

Install Reflection X Advantage using the 64-bit Installer

If you have installed Reflection X Advantage using the 32-bit installer on a 64-bit system, and Reflection X Advantage applications fail to launch, reinstall using the 64-bit installer. The memory limit on 64-bit processes is much larger than the limit on a 32-bit processes.

Add RAM

The more RAM your computer has, the faster your programs will generally run. If a lack of RAM is slowing your computer, you may have success using the following adjustments. However, if none of these solutions result in better performance, adding additional RAM is the best solution.

Reduce the Maximum Memory Available to the JVM

Reducing the maximum memory limit for Reflection X Advantage (also called the Java heap size) can resolve the problem of Reflection X Applications failing to launch and can also increase the amount of memory available to other applications.

Note: This solution may affect performance in Reflection X Advantage. If memory is available Reflection X Advantage sessions cache images, the results of certain calculations, and other data. These actions can improve drawing performance and compression over low-bandwidth networks. If changing the Java heap size results in slow performance or X client failures, the value is probably set too low.

To change the maximum memory available to the Reflection X Advantage JVM:

  1. Close X Manager if it is running.
  2. Navigate to the Reflection X Advantage installation folder. (The default location for most installs is C:Program FilesAttachmateReflection).
  3. Locate rxmgr.alp (if you use standalone X Manager) or rxmgrdomains.alp (if you use X Manager for Domains) and remove the read-only attribute from this file. (Right-click, select Properties, clear the “Read-only” checkmark, then click OK.)
  4. Open the *.alp file in a text editor, such as Notepad.
  5. In the line that begins “alp.java.command.options=” locate the -Xmx setting. (The default value is -Xmx900m in version 5.0 and -Xmx1024m in version 4.2 and earlier.) Change this to a smaller value (for example: -Xmx700m).

Note: The *.alp files are replaced by newer, default files when you upgrade your Reflection X Advantage product. If memory problems reappear after an upgrade, see Technical Note 2530, “Installing Newer Versions of Reflection X Advantage If You Have Edited Rxmgr.alp.”

Increase Virtual Memory

You can increase the amount of virtual memory available by increasing the minimum and maximum sizes of the paging file.

This change can result in slower performance in Reflection X Advantage and other applications because of excessive disk paging. This solution may work for you if there are idle applications that can be paged out at times.

To modify virtual memory in Windows:

  1. Open the System Properties dialog box. (In Windows 7: Start menu > right-click Computer > Properties > Advanced system settings.)
  2. On the Advanced tab, under Performance, click Settings.
  3. Click the Advanced tab, and then, under Virtual memory, click Change.
  4. Click Custom size and set values for Initial size (MB) and/or Maximum size (MB).

Increases in size usually don’t require a restart, but if you decrease the size, you will need to restart Windows.

Add Additional Reflection X Advantage Nodes

If you are running Reflection X Advantage in domain mode and are experiencing memory contention on the domain node, you can experiment with using multiple nodes to help alleviate memory problems. Reflection X Advantage supports creating multiple nodes on the same system. This change effectively increases the allowed memory available to run sessions. You can also add nodes on remote systems. This increases both the number of CPU cores available and amount of available memory.

To add nodes to a domain, use the rxsconfig command line utility. For details, see “Set Up Domain Nodes” in the Reflection X Advantage product Help.

The number of sessions that a node can support depends on what kind of domain services are configured for the session (such as whether the session has a headless server running on the node) and how active the clients of the session are. You can monitor the load on your nodes using the Administrative Console. As an initial approximation, a session without a headless server can be expected to require about 1MB, and with a headless server about 25MB of heap space. For a 1GB heap (the default), this can be anything from about 30-1000 sessions. With a large number of sessions, resources such as cores and network bandwidth become a bigger concern.

Related:

Installing Community Edition on Azure VM

Hi,
I have had good success running the QRadar Community Edition on a standard Azure CentOS VM, so I thought I would post the brief mods required that allow the installation to run here, in case anyone else finds them useful – use them at your own risk.

I am not going to explain how to create an Azure VM, hopefully you will already be up to speed on that, the VM specifics I used are –

PublisherName: OpenLogic
Offer: CentOS
Skus: 7.3
Version: Latest
Size: Standard_F2s (this is 2 cpus, 4GB RAM, premium storage)
VMOSDiskSize: 80GB

Once created the VM needs a few changes to make the QRadar install run smoothly, as follows.

Extend the /dev/sda2 partition to use the full available space

sudo fdisk /dev/sda

The device presents a logical sector size that is smaller than
the physical sector size. Aligning to a physical sector (or optimal
I/O) size boundary is recommended, or performance may be impacted.
Welcome to fdisk (util-linux 2.23.2).

Changes will remain in memory only, until you decide to write them.
Be careful before using the write command.

Command (m for help): d
Partition number (1,2, default 2): 2
Partition 2 is deleted

Command (m for help): n
Partition type:
p primary (1 primary, 0 extended, 3 free)
e extended
Select (default p): p
Partition number (2-4, default 2): 2
First sector (1026048-167772159, default 1026048):
Using default value 1026048
Last sector, +sectors or +size{K,M,G} (1026048-167772159, default 167772159):
Using default value 167772159
Partition 2 of type Linux and of size 79.5 GiB is set

Command (m for help): w
The partition table has been altered!

Calling ioctl() to re-read partition table.

WARNING: Re-reading the partition table failed with error 16: Device or resource busy.
The kernel still uses the old table. The new table will be used at
the next reboot or after you run partprobe(8) or kpartx(8)
Syncing disks.

Reboot to pick up the new partition table.

sudo reboot

Grow the root filesystem:

sudo xfs_growfs /

Create 8GB of swap space:

sudo dd if=/dev/zero of=/swapfile bs=1024 count=8388608

sudo chmod 0600 /swapfile

sudo mkswap /swapfile

sudo swapon /swapfile

Add the following line to /etc/fstab to mount the swap on reboot:

/swapfile swap swap defaults 0 0

Update everything and install screen:

sudo yum -y update

sudo yum install screen

Disable SELINUX, and reboot to clear it:

sudo sed -i -e ‘s/^SELINUX=.*$/SELINUX=disabled/g’ /etc/selinux/config
sudo reboot

Copy the Community Edition to a temporary directory, mount it and run the setup as per the IBM instructions (You get the standard appliance install screens, it tells you that you have insufficient memory, but continues to install an appliance type “300”.)

Eventually you get a working Qradar CE system! Don’t forget this doesn’t have all the DSMs so you may need to get rpms from fix central for additional log source support.

Regards.

Related:

Datacap webservice does not process the vscan task

We are running Datacap 9.1.1 on Windows 2012.
We have a Datacap application based on the standard template that we need to call via web service.
I tried to simulate the process using Postman (similar to Fiddler) to generate the REST calls. Our input files are in a folder as they would be if the process was run via the Datacap Studio. The process seems to perform the vscanmulti task but the result is just a vscanmulti.xml page file in the batch folder. The next task PageId obviously gets stuck as pending as no file was scanned. I tried using GrabBatch and Release but they just generate pagefiles in the Batch folder with the task names but no processing is performed. No task logs are produced either.
How do I get the app to process vscan then followed by the remaining tasks? I understand that I need to use the GetPageFile and SetPageFile but that is only after the scan is performed successfully.

The following are the steps:

1. /Session/Logon – successful
2. /Queue/CreateBatch – apparently successful. Request includes the pagefile definition.
3. /Queue/ReleaseBatch/AppDev/135/finished – successful
4. /Queue/GrabBatch/AppDev/135 – successful
5. /Queue/ReleaseBatch/AppDev/135/finished – successful
6. etc.

I have reviewed both the Knowledge Center and the IBM Datacap V9.0 Installing and Using Datacap Web Services.pdf as well as the Datacap-Web-Services-Projects sample from DeveloperWorks.
Thanks in advance.

Related: