Citrix ADC Integrated Caching Counters

This article contains information about the newnslog Integrated Caching counters and a brief description of the counters.

Using the Counters

Log on to the ADC using an SSH client, change to SHELL, navigate to the /var/nslog directory, and then use the ‘nsconmsg’ command to see comprehensive statistics using the different counters available. For the detailed procedure refer to Citrix Blog – NetScaler ‘Counters’ Grab-Bag!.
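For example, a quick way to pull a couple of the cache counters from the newnslog file is shown below. This is only a sketch: -d current and the -g name filter are standard nsconmsg options, but adjust the counter patterns and file name to your environment.

cd /var/nslog
# show current values for counters whose names match the -g pattern
nsconmsg -K newnslog -d current -g cac_tot_req
nsconmsg -K newnslog -d current -g cache_percent_hits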

The newnslog Integrated Caching counters

The following table lists the different newnslog Integrated Caching counters and a brief description of the counter.

newnslog Counter

Description

cac_tot_req

Total cache hits plus the total cache misses.

cac_cur_pcb_hit

This number should be close to the number of hits being served currently.

cac_tot_non304_hit

Total number of full (non-304) responses served from the cache. A 304 status code indicates that a response has not been modified since the last time it was served.

cac_tot_304_hit

Object not modified responses served from the cache.

(Status code 304 served instead of the full response.)

cache_tot_hits

Responses served from the integrated cache. These responses match a policy with a CACHE action.

cache_percent_304_hits

304 responses as a percentage of all responses that the NetScaler appliance served.

cache_percent_hits

Cache hits as percentage of the total number of requests.

cache_recent_percent_304_hits

Recently recorded ratio of 304 hits to all hits, expressed as a percentage.

cache_recent_percent_hit

Recently recorded cache hit ratio expressed as percentage.

cac_cur_pcb_miss

Responses fetched from the origin and served from the cache. Should approximate storable misses. Does not include non-storable misses.

cache_tot_misses

Intercepted HTTP requests requiring fetches from origin server.

cache_tot_storable_misses

Cache misses for which the fetched response is stored in the cache before serving it to the client. Storable misses conform to a built-in or user-defined caching policy that contains a CACHE action.

cache_tot_non_storable_misses

Cache misses for which the fetched response is not stored in the cache. These responses match policies with a NOCACHE action or are affected by Poll Every Time.

cache_tot_revalidation_misses

Responses that an intervening cache revalidated with the integrated cache before serving, as determined by a Cache-Control: Max-Age header configurable in the integrated cache.

cache_tot_full_to_conditional_request

Number of user-agent requests for a cached Poll Every Time (PET) response that were sent to the origin server as conditional requests.

cache_percent_storable_miss

Responses that were fetched from the origin, stored in the cache, and then served to the client, as a percentage of all cache misses.

cache_recent_percent_storable_miss

Recently recorded ratio of storable misses to all misses expressed as percentage.

cache_percent_successful_reval

Percentage of times stored content was successfully revalidated by a 304 (Object Not Modified) response rather than by a full response.

cache_recent_percent_successful_reval

Recently recorded percentage of times stored content was successfully revalidated by a 304 response rather than by a full response.

cache_tot_successful_revalidation

Total number of times stored content was successfully revalidated by a 304 Not Modified response from the origin.

cache_percent_byte_hit

Bytes served from the cache divided by total bytes served to the client. If compression is On in the NetScaler, this ratio might not reflect the bytes served by the compression module. If the compression is Off, this ratio is the same as cachePercentOriginBandwidthSaved.

cache_recent_percent_byte_hit

Recently recorded cache byte hit ratio, expressed as a percentage. Byte hit ratio is defined here as the number of bytes served from the cache divided by the total number of bytes served to the client, which is the standard definition. If compression (CMP) is turned on in the NetScaler appliance, this ratio is less meaningful: it might under- or overestimate the origin-to-cache bandwidth saving, depending on whether the bytes served by CMP in the NetScaler are more or less than the compressed bytes served from the cache. If CMP is turned off in the NetScaler appliance, this ratio is the same as cacheRecentPercentOriginBandwidthSaved.

cactor_max32_res_so_far

Size, in bytes, of largest response sent to client from the cache or the origin server.

cache_tot_resp_bytes

Total number of HTTP response bytes served by the NetScaler appliance from both the origin and the cache.

cache_bytes_served

Total number of bytes served from the Integrated Cache.

cache_comp_bytes_served

Number of compressed bytes served from the cache.

cac_tot_parameterized_inval_req

Requests matching a policy with an invalidation (INVAL) action and a content group that uses an invalidation selector or parameters.

cac_tot_non_parameterized_inval_req

Requests that match an invalidation policy where the invalGroups parameter is configured and expires one or more content groups.

cac_tot_inval_nostore_miss

Requests that match an invalidation policy and result in expiration of specific cached responses or entire content groups.

cache_percent_origin_bandwidth_saved

Percentage of origin bandwidth saved, expressed as number of bytes served from the integrated cache divided by all bytes served. The assumption is that all compression is done in the NetScaler appliance.

cache_recent_percent_origin_bandwidth_saved

Bytes served from cache divided by total bytes served to client. This ratio can be greater than 1 because of the assumption that all compression has been done in the NetScaler appliance.

cactor_tot_expire_at_last_byte

Instances of content expiring immediately after receiving the last body byte due to the Expire at Last Byte setting for the content group.

cac_tot_enable_flashcache

Number of requests to a content group with flash cache enabled that were cache misses. Flash cache distributes the response to all the clients in a queue.

cac_tot_delayed_logging

Number of requests to a content group with flash cache enabled that were cache hits. The flash cache setting queues requests that arrive simultaneously and distributes the response to all the clients in the queue.

cac_tot_parameterized_non304_hit

Parameterized requests resulting in a full response (not status code 304: Object Not Modified) served from the cache.

cactor_tot_parameterized_req

Total number of requests where the content group has hit and invalidation parameters or selectors.

cac_tot_parameterized_304_hit

Parameterized requests resulting in an object not modified (status code 304) response.

cache_tot_parameterized_hits

Parameterized requests resulting in either a 304 or non-304 hit.

cache_percent_parameterized_304_hits

Percentage of parameterized 304 hits relative to all parameterized hits.

cache_recent_percent_parameterized_hits

Recently recorded ratio of parameterized 304 hits to all parameterized hits, expressed as a percentage.

cactor_tot_pet_with_nostore_reval

Requests that triggered a search of a content group that has Poll Every Time (PET) enabled (always consult the origin server before serving cached data).

cache_tot_pet_hits

Number of times a cache hit was found during a search of a content group that has Poll Every Time enabled.

cache_percent_pet_hits

Percentage of cache hits in content groups that have Poll Every Time enabled, relative to all searches of content groups with Poll Every Time enabled.

cache_max_mem

Largest amount of memory the NetScaler appliance can dedicate to caching, up to 50% of available memory. A 0 value disables caching, but the caching module continues to run.

cache64_max_mem

Largest amount of memory the NetScaler appliance can dedicate to caching, up to 50% of the available memory. A 0 value disables caching, but the caching module continues to run.

cache_max_mem_active

Currently active value of maximum memory.

cache_utilized_mem

Amount of memory the integrated cache is currently using.

cactor_cur_hash

Responses currently in integrated cache. Includes responses fully downloaded, in the process of being downloaded, and expired or flushed but not yet removed.

cache_cur_marker_cell

Marker objects created when a response exceeds the maximum or minimum size for entries in its content group or has not yet received the minimum number of hits required for items in its content group.

cactor_err_no_buf

Total number of times the cache failed to allocate memory to store responses.
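As a rough illustration of how the percentage counters above relate to the raw totals (the values here are made up, and the appliance's own rounding may differ), the hit ratio can be recomputed from the totals in the shell:

# hypothetical counter values read from nsconmsg output
cache_tot_hits=182000
cac_tot_req=200000
awk -v h="$cache_tot_hits" -v r="$cac_tot_req" 'BEGIN { printf "cache_percent_hits ~= %.1f%%\n", (r ? 100*h/r : 0) }'
# prints: cache_percent_hits ~= 91.0%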

Related:

http.method & ICAP


Hello,

I used to configure for ICAP RESPMOD …

<Cache>

policy.BC_malware_scanner policy.ICAP_Content_Scan_Security

Is there any internal/performance reason using instead …

<Cache>

http.method=GET policy.BC_malware_scanner policy.ICAP_Content_Scan_Security

… to prevent sending CONNECT, POST, and all other non-relevant methods

Best Regards,

Vincent


Related:

OneFS Read Caching

There have been a number of recent questions from the field around how caching is performed in OneFS. So it seemed like an ideal time to review this topic.

Caching occurs in OneFS at multiple levels, and for a variety of types of data. For this discussion we’ll concentrate on the caching of file system structures in main memory and on SSD.

Isilon’s caching infrastructure design is based on aggregating each individual node’s cache into one cluster wide, globally accessible pool of memory. This is done by using an efficient messaging system, which allows all the nodes’ memory caches to be available to each and every node in the cluster.

For remote memory access, OneFS utilizes Ethernet or Infiniband (IB) for the cluster’s private, backend network. While not as quick as local memory, remote memory access is still very fast due to the low latency of 40Gb Ethernet or QDR Infiniband. In particular, Infiniband utilizes the Sockets Direct Protocol (SDP), which provides an efficient, socket-like interface between nodes and, by using a switched star topology, ensures that remote memory addresses are only ever one hop away.

OneFS uses up to three levels of read cache, plus an NVRAM-backed write cache, or write coalescer. The first two types of read cache, level 1 (L1) and level 2 (L2), are memory (RAM) based, and analogous to the cache used in CPUs. These two cache layers are present in all Isilon storage nodes. An optional third tier of read cache, called SmartFlash, or Level 3 cache (L3), is also configurable on nodes that contain solid state drives (SSDs). L3 cache is an eviction cache that is populated by L2 cache blocks as they are aged out from memory.

caching_1.png

The OneFS caching subsystem is coherent across the cluster. This means that if the same content exists in the private caches of multiple nodes, this cached data is consistent across all instances. For example, consider the following scenario:

1. Node 2 and Node 4 each have a copy of data located at an address in shared cache.

2. Node 4, in response to a write request, invalidates node 2’s copy.

3. Node 4 then updates the value.

4. Node 2 must re-read the data from shared cache to get the updated value.

OneFS utilizes the MESI Protocol to maintain cache coherency, implementing an “invalidate-on-write” policy to ensure that all data is consistent across the entire shared cache. The various states that in-cache data can take are:

  • M – Modified: The data exists only in local cache, and has been changed from the value in shared cache. Modified data is referred to as ‘dirty’.
  • E – Exclusive: The data exists only in local cache, but matches what is in shared cache. This data is referred to as ‘clean’.
  • S – Shared: The data in local cache may also be in other local caches in the cluster.
  • I – Invalid: A lock (exclusive or shared) has been lost on the data.



caching_2.png

L1 cache, or front-end cache, is memory that is nearest to the protocol layers (e.g. NFS, SMB, etc) used by clients, or initiators, connected to that node. The main task of L1 is to prefetch data from remote nodes. Data is pre-fetched per file, and this is optimized in order to reduce the latency associated with the nodes’ IB back-end network. Since the IB interconnect latency is relatively small, the size of L1 cache, and the typical amount of data stored per request, is less than L2 cache.

L1 is also known as remote cache because it contains data retrieved from other nodes in the cluster. It is coherent across the cluster, but is used only by the node on which it resides, and is not accessible by other nodes. Data in L1 cache on storage nodes is aggressively discarded after it is used. L1 cache uses file-based addressing, in which data is accessed via an offset into a file object. The L1 cache refers to memory on the same node as the initiator. It is only accessible to the local node, and typically the cache is not the master copy of the data. This is analogous to the L1 cache on a CPU core, which may be invalidated as other cores write to main memory. L1 cache coherency is managed via a MESI-like protocol using distributed locks, as described above.

Note: L1 cache is utilized differently in the Isilon Accelerator nodes, which don’t contain any disk drives. Instead, the entire read cache is L1 cache, since all the data is fetched from other storage nodes. Also, cache aging is based on a least recently used (LRU) eviction policy, as opposed to the drop-behind algorithm typically used in a storage node’s L1 cache. Because an accelerator’s L1 cache is large, and the data in it is much more likely to be requested again, data blocks are not immediately removed from cache upon use. However, metadata and update-heavy workloads don’t benefit as much, and an accelerator’s cache is only beneficial to clients directly connected to the node.



caching_2-1.png

L2, or back-end cache, refers to local memory on the node on which a particular block of data is stored. L2 reduces the latency of a read operation by not requiring a seek directly from the disk drives. As such, the amount of data prefetched into L2 cache for use by remote nodes is much greater than that in L1 cache.

L2 is also known as local cache because it contains data retrieved from disk drives located on that node and then made available for requests from remote nodes. Data in L2 cache is evicted according to a Least Recently Used (LRU) algorithm. Data in L2 cache is addressed by the local node using an offset into a disk drive which is local to that node. Since the node knows where the data requested by the remote nodes is located on disk, this is a very fast way of retrieving data destined for remote nodes. A remote node accesses L2 cache by doing a lookup of the block address for a particular file object. As described above, there is no MESI invalidation necessary here and the cache is updated automatically during writes and kept coherent by the transaction system and NVRAM.

L3 cache is a subsystem which caches evicted L2 blocks on a node. Unlike L1 and L2, not all nodes or clusters have an L3 cache, since it requires solid state drives (SSDs) to be present and exclusively reserved and configured for caching use. L3 serves as a large, cost-effective way of extending a node’s read cache from gigabytes to terabytes. This allows clients to retain a larger working set of data in cache, before being forced to retrieve data from higher latency spinning disk. The L3 cache is populated with “interesting” L2 blocks dropped from memory by L2’s least recently used cache eviction algorithm. Unlike RAM based caches, since L3 is based on persistent flash storage, once the cache is populated, or warmed, it’s highly durable and persists across node reboots, etc. L3 uses a custom log-based filesystem with an index of cached blocks. The SSDs provide very good random read access characteristics, such that a hit in L3 cache is not that much slower than a hit in L2.

To utilize multiple SSDs for cache effectively and automatically, L3 uses a consistent hashing approach to associate an L2 block address with one L3 SSD. In the event of an L3 drive failure, a portion of the cache will obviously disappear, but the remaining cache entries on other drives will still be valid. Before a new L3 drive may be added to the hash, some cache entries must be invalidated.

OneFS also uses a dedicated inode cache in which recently requested inodes are kept. The inode cache frequently has a large impact on performance, because clients often cache data, and many network I/O activities are primarily requests for file attributes and metadata, which can be quickly returned from the cached inode.

OneFS provides tools to accurately assess the performance of the various levels of cache at a point in time. These cache statistics can be viewed from the OneFS CLI using the isi_cache_stats command. Statistics for L1, L2 and L3 cache are displayed for both data and metadata. For more detailed and formatted output, run the command with the verbose option, ‘isi_cache_stats -v’.
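For example (a minimal sketch; output formatting varies by OneFS release):

# summary of L1/L2/L3 cache statistics
isi_cache_stats
# verbose view with separate data and metadata breakdowns
isi_cache_stats -v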

caching_8.png

It’s worth noting that for L3 cache, the prefetch statistics will always read zero, since it’s a pure eviction cache and does not utilize data or metadata prefetch.



caching_9.png

Due to balanced data distribution, automatic rebalancing, and distributed processing, OneFS is able to leverage additional CPUs, network ports, and memory as the system grows. This also allows the caching subsystem (and, in turn, throughput and IOPS) to scale linearly with the cluster size.

Related:

7023312: Filr L1 Terminal Fault / Foreshadow vulnerabilities (CVE-2018-3615,CVE-2018-3620,CVE-2018-3646)

This document (7023312) is provided subject to the disclaimer at the end of this document.

Environment

Micro Focus Filr 3.x

Situation

Modern Intel CPUs feature “hyper threads”, where multiple threads of execution can happen on the same core, sharing various resources, including the Level 1 (L1) Data Cache.
Researchers have found that during speculative execution, pagetable address lookups do not honor pagetable present and other reserved bits, so that speculative execution could read memory content of other processes or other VMs if this memory content is present in the shared L1 Datacache of the same core.
The issue is called “Level 1 Terminal Fault”, or short “L1TF”.
Three variants of the issue are tracked:
  • OS level: CVE-2018-3620
  • VMM level: CVE-2018-3646
  • SGX enclave level: CVE-2018-3615
Since the Filr, Search, and Database servers are provided as an appliance running on SLES-11, the kernel updates provided by SUSE are required to mitigate these vulnerabilities.

Resolution

A fix for this issue is available in Filr 3.0 – Security Update 5 available via the Micro Focus Patch Finder.
If you’re running Filr 1.2 or older, please upgrade to the Filr 3.0 Security Update 5.

Additional Information

For more details, please consult https://www.suse.com/support/kb/doc/?id=7023077

Disclaimer

This Support Knowledgebase provides a valuable tool for NetIQ/Novell/SUSE customers and parties interested in our products and solutions to acquire information, ideas and learn from one another. Materials are provided for informational, personal or non-commercial use within your organization and are presented “AS IS” WITHOUT WARRANTY OF ANY KIND.

Related:


Understanding Write Cache in Provisioning Services Server

This article provides information about write cache usage in a Citrix Provisioning, formerly Provisioning Services (PVS), Server.

Write Cache in Provisioning Services Server

In PVS, the term “write cache” is used to describe all the cache modes. The write cache includes data written by the target device. If data is written to the PVS server vDisk in a caching mode, the data is not written back to the base vDisk. Instead, it is written to a write cache file in one of the following locations:

When the vDisk mode is private/maintenance mode, all data is written back to the vDisk file on the PVS Server. When the target device is booted in standard mode or shared mode, the write cache information is checked to determine the cache location. When a target device boots to a vDisk in standard/shared mode, regardless of the cache type, the data written to the write cache is deleted on boot, so each time a target restarts it has a clean cache that contains nothing from previous sessions.

If the PVS target is configured for Cache on device RAM with overflow on hard disk or Cache on device hard disk, and the PVS target software either does not find an appropriate hard disk partition or the partition is not formatted with NTFS, the target will fail over to Cache on server. By default, the PVS target software also redirects the system page file to the same disk as the write cache, so pagefile.sys allocates space on the cache drive unless it is manually redirected to a separate volume.

For RAM cache without a local disk, you should consider setting the system page file to zero because all writes, including system page file writes, will go to the RAM cache unless redirected manually. PVS does not redirect the page file in the case of RAM cache.



Cache on device Hard Disk

Requirements

  • Local HD in every device using the vDisk.
  • The local HD must contain a basic volume pre-formatted with a Windows NTFS file system with at least 512MB of free space.

The cache on local HD is stored in a file called .vdiskcache on a secondary local hard drive. It gets created as an invisible file in the root folder of the secondary local HD. The cache file size grows, as needed, but never gets larger than the original vDisk, and frequently not larger than the free space on the original vDisk. It is slower than RAM cache or RAM Cache with overflow to local hard disk, but faster than server cache and works in an HA environment. Citrix recommends that you do not use this cache type because of incompatibilities with Microsoft Windows ASLR which could cause intermittent crashes and stability issues. This cache is being replaced by RAM Cache with overflow to the hard drive.

Cache in device RAM

Requirement

  • An appropriate amount of physical memory on the machine.

The cache is stored in client RAM. The maximum size of the cache is fixed by a setting in the vDisk properties screen. RAM cache is faster than other cache types and works in an HA environment. The RAM is allocated at boot and never changes. The RAM allocated can’t be used by the OS. If the workload has exhausted the RAM cache size, the system may become unusable and even crash. It is important to pre-calculate workload requirements and set the appropriate RAM size. Cache in device RAM does not require a local hard drive.

Cache on device RAM with overflow on Hard Disk

Requirement

  • Provisioning Service 7.1 hotfix 2 or later.
  • Local HD in every target device using the vDisk.
  • The local HD must contain Basic Volume pre-formatted with a Windows NTFS file system with at least 512 MB of free space. By default, Citrix sets this to 6 GB but recommends 10 GB or larger depending on workload.
  • The default RAM is 64 MB RAM, Citrix recommends at least 256 MB of RAM for a Desktop OS and 1 GB for Server OS if RAM cache is being used.
  • If you decide not to use RAM cache you may set it to 0 and only the local hard disk will be used to cache.

Cache on device RAM with overflow on hard disk represents the newest of the write cache types. Citrix recommends this cache type for PVS because it combines the performance of RAM with the stability of the hard disk cache. The cache uses non-paged pool memory for the best performance. When RAM utilization reaches its threshold, the oldest RAM cache data is written to the local hard drive. The local hard disk cache uses a file it creates called vdiskdif.vhdx.

Things to note about this cache type:

  • This write cache type is only available for Windows 7/2008 R2 and later.
  • This cache type addresses interoperability issues with Microsoft Windows ASLR.



Cache on Server

Requirements

  • Enough space allocated to where the server cache will be stored.
Server cache is stored in a file on the server, or on a share, SAN, or other location. The file size grows, as needed, but never gets larger than the original vDisk, and frequently not larger than the free space on the original vDisk. It is slower than RAM cache because all reads/writes have to go to the server and be read from a file. The cache gets deleted when the device reboots, that is, on every boot, the device reverts to the base image. Changes remain only during a single boot session. Server cache works in an HA environment if all server cache locations resolve to the same physical storage location. This cache type is not recommended for a production environment.

Additional Resources

Selecting the Write Cache Destination for Standard vDisk Images

Turbo Charging your IOPS with the new PVS Cache in RAM with Disk Overflow Feature

Related:

7023078: Security Vulnerability: “L1 Terminal Fault” (L1TF) ??? Hypervisor Information (CVE-2018-3620, CVE-2018-3646, XSA-273).

Full mitigation for this issue requires a combination of hardware and software changes. Depending on the guest type, software changes may be required at both the Hypervisor and guest level.

Updated Intel microcode (provided through your hardware / BIOS vendor or by SUSE) introduces a new feature called “flush_l1d”. Hypervisors and bare-metal kernels use this feature to flush the L1 data cache during operations which may be susceptible to data leakage (e.g. when switching between VMs in Hypervisor environments).

Software mitigations exist for the Linux Kernel and for Hypervisors. These mitigations include support for new CPU features, passing these features to guests, and support for enabling/disabling/tuning the mitigations. Recommended mitigations vary depending on the environment.

For the Linux kernel (on both bare metal and virtual machines) L1TF mitigation is controlled through the “l1tf” kernel boot parameter. For complete information on this parameter, see TID 7023077.

KVM

For KVM host environments, mitigation can be achieved through L1D cache flushes, and/or disabling Extended Page Tables (EPT) and Simultaneous MultiThreading (SMT).

The L1D cache flush behavior is controlled through the “kvm-intel.vmentry_l1d_flush” kernel command line option:

kvm-intel.vmentry_l1d_flush=always

The L1D cache is flushed on every VMENTER.

kvm-intel.vmentry_l1d_flush=cond

The L1D cache is flushed on VMENTER only when there can be a leak of host memory between VMEXIT and VMENTER. This could still leak some host data, like address space layout.

kvm-intel.vmentry_l1d_flush=never

Disables the L1D cache flush mitigation.

The default setting here is “cond”.

The l1tf “full” setting overrides the settings of this configuration variable.
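On kernels carrying the L1TF patches, the active flush mode and the overall mitigation status can be checked through sysfs. A minimal sketch, assuming the kvm_intel module is loaded:

# current KVM L1D flush mode (always / cond / never)
cat /sys/module/kvm_intel/parameters/vmentry_l1d_flush
# overall L1TF mitigation status reported by the kernel
cat /sys/devices/system/cpu/vulnerabilities/l1tf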


L1TF can be used to bypass Extended Page Tables (EPT). To mitigate this risk, it is possible to disable EPT and use shadow pages instead. This mitigation is available through the “kvm-intel.enable_ept” option:
kvm-intel.enable_ept=0

The Extended Page tables support is switched off.
As shadow pages are much less performant than EPT, SUSE recommends leaving EPT enabled, and use L1D cache flush and SMT tuning for full mitigation.


To eliminate the risk of untrusted processes or guests exploiting this vulnerability on a sibling hyper-thread, Simultaneous MultiThreading (SMT) can be disabled completely.

SMT can be controlled through kernel boot command line parameters, or on-the-fly through sysfs:

On the kernel boot command line:

nosmt

SMT is disabled, but can be later reenabled in the system.

nosmt=force

SMT is disabled, and can not be reenabled in the system.

If this option is not passed, SMT is enabled. Any SMT options used with the “l1tf” kernel parameter override this “nosmt” option.


SMT can also be controlled through sysfs:

/sys/devices/system/cpu/smt/control

This file allows reading the current control state and disabling or (re)enabling SMT.

Possible states are:

on

SMT is supported and enabled.

off

SMT is supported, but disabled. Only primary SMT threads can be onlined.

forceoff

SMT is supported, but disabled. Further control is not possible.

notsupported

SMT is not supported.

Potential values that can be written into this file:

on

off

forceoff

/sys/devices/system/cpu/smt/active

This file contains the state of SMT: whether it is enabled and active, where active means that multiple threads run on one core.
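As a sketch of the runtime control described above (requires root; use forceoff only if SMT should stay disabled until reboot):

# disable SMT on the fly
echo off > /sys/devices/system/cpu/smt/control
# confirm the control state and whether any sibling threads are still active
cat /sys/devices/system/cpu/smt/control
cat /sys/devices/system/cpu/smt/active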

Xen

For Xen hypervisor environments, mitigation is enabled by default and varies based on guest type. Manual adjustment of the “smt=” parameter is recommended, but the remaining parameters are best left at default values. A description of all relevant parameters is provided in the event any changes are necessary.

PV guests achieve mitigation at the Xen Hypervisor level. If a PV guest attempts to write an L1TF-vulnerable PTE, the hypervisor will force shadow mode and prevent the vulnerability. PV guests which fail to switch to shadow mode (e.g. due to a memory shortage at the hypervisor level) are intentionally crashed.

pv-l1tf=[ <bool>, dom0=<bool>, domu=<bool> ]

By default, pv-l1tf is enabled for DomU environments and, for stability and performance reasons, disabled for Dom0.

HVM guests achieve mitigation through a combination of L1D flushes, and disabling SMT.

spec-ctrl=l1d-flush=<bool>

This parameter determines whether or not the Xen hypervisor performs L1D flushes on VMEntry. Regardless of this setting, this feature is virtualized and passed to HVM guests for in-guest mitigation.

smt=<bool>
This parameter can be used to enable/disable SMT from the hypervisor. Xen environments hosting any untrusted HVM guests, or guests not under the full control of the host admin, should either disable SMT (through BIOS or smt=<bool> means), or ensure HVM guests use shadow mode (hap=0) in order to fully mitigate L1TF. It is also possible to reduce the risk of L1TF through the use of CPU pinning, custom CPU pools and/or soft-offlining of some hyper-threads.
These approaches are beyond the scope of this TID, but are documented in the standard Xen documentation.
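As an illustrative (not prescriptive) example, a Xen host serving untrusted HVM guests might add the following to its hypervisor command line; on SUSE systems this typically means appending to GRUB_CMDLINE_XEN_DEFAULT in /etc/default/grub and regenerating the GRUB configuration:

# disable SMT at the hypervisor level; pv-l1tf and spec-ctrl=l1d-flush are left at their defaults per the guidance above
smt=false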

WARNING – The combination of Meltdown mitigation (KPTI) and shadow mode on hardware which supports PCID can result in a severe performance degradation.

NOTE – Efforts are ongoing to implement scheduling improvements that allow hyper-thread siblings to be restricted to threads from a single guest. This will reduce the exposure of L1TF, and the requirement to disable SMT in many environments.

Related:

7023077: Security Vulnerability: “L1 Terminal Fault” (L1TF) aka CVE-2018-3615, CVE-2018-3620 & CVE-2018-3646.

Modern Intel CPUs feature “hyper threads”, where multiple threads of execution can happen on the same core, sharing various resources, including the Level 1 (L1) Data Cache.

Researchers have found that during speculative execution, pagetable address lookups do not honor pagetable present and other reserved bits, so that speculative execution could read memory content of other processes or other VMs if this memory content is present in the shared L1 Datacache of the same core.

The issue is called “Level 1 Terminal Fault”, or short “L1TF”.

At this time this issue is known to only affect Intel CPUs.
Intel CPUs that are not affected are:
– Older models, where the CPU family is < 6

– A range of ATOM processors

(Cedarview, Cloverview, Lincroft, Penwell, Pineview, Silvermont, Airmont, Merrifield)

– The Core Duo Yonah variants (2006 – 2008)

– The XEON PHI family

– Processors which have the ARCH_CAP_RDCL_NO bit set in the IA32_ARCH_CAPABILITIES MSR.

If the bit is set, the CPU is also not affected by the Meltdown vulnerability.

(Note: These CPUs should become available end of 2018)
CPUs from ARM and AMD are not affected by this problem.

For other CPU vendors, affectedness is currently unknown.

Three variants of the issue are tracked:

– OS level: CVE-2018-3620

– VMM level: CVE-2018-3646

– SGX enclave level: CVE-2018-3615

SUSE’s mitigations cover the OS and VMM levels.

Attackers could use this issue to get access to other memory in physical RAM on the machine.
Untrusted malicious VMs are able to read memory from other VMs, the Host system or SGX enclaves.

Note that this requires the memory being loaded into L1 datacache from another process / VM, which is hard to control for an active attacker.

Related:

VNX M and R, ViPR SRM, Watch4Net VNX SolutionPack: Storage Processor Write Cache report shows no data

Article Number: 479818 Article Version: 4 Article Type: Break Fix



VNX Family Monitoring & Reporting,Watch4net,ViPR SRM

The Storage Processor Write Cache Utilization report shows no data for some arrays.

This happens on arrays running FLARE version 33 and later because the Storage Processor cache architecture changed in FLARE 33, so several of the cache-related metrics are no longer available as they are for versions 30-32.

In particular, the following metrics used to calculate this report are not available:

  • Flush Ratio (%)
  • High Water Flush On
  • Idle Flush On
  • Low Water Flush Off
  • Write Cache Flushes/s

The product is functioning as designed.

Related:

NetScaler Integrated Caching Counters

This article contains information about the newnslog Integrated Caching counters, their SNMP counterpart, and a brief description of the counters.

Using the Counters

Log on to the NetScaler using an SSH client, change to SHELL, navigate to the /var/nslog directory, and then use the ‘nsconmsg’ command to see comprehensive statistics using the different counters available. For the detailed procedure refer to Citrix Blog – NetScaler ‘Counters’ Grab-Bag!.
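Besides nsconmsg, the same statistics can be polled over SNMP using the object names listed in the table below. A minimal sketch from a management host, assuming the NS-ROOT-MIB file shipped with the appliance has been copied into the local MIB path, SNMP is enabled on the NSIP, and these cache objects are scalars (hence the .0 suffix); the community string and address are placeholders:

snmpget -v 2c -c public -m +NS-ROOT-MIB <NetScaler-NSIP> cacheTotHits.0
snmpget -v 2c -c public -m +NS-ROOT-MIB <NetScaler-NSIP> cacheTotMisses.0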

The newnslog Integrated Caching counters

The following table lists the different newnslog Integrated Caching counters, a brief description of the counter, and the matching SNMP object name.

newnslog Counter

SNMP OID

Description

cac_tot_req

cacheTotRequests

Total cache hits plus the total cache misses.

cac_cur_pcb_hit

cacheCurHits

This number should be close to the number of hits being served currently.

cac_tot_non304_hit

cacheTotNon304Hits

Total number of full (non-304) responses served from the cache. A 304 status code indicates that a response has not been modified since the last time it was served.

cac_tot_304_hit

cacheTot304Hits

Object not modified responses served from the cache.

(Status code 304 served instead of the full response.)

cache_tot_hits

cacheTotHits

Responses served from the integrated cache. These responses match a policy with a CACHE action.

cache_percent_304_hits

cachePercent304Hits

304 responses as a percentage of all responses that the NetScaler appliance served.

cache_percent_hits

cachePercentHit

Cache hits as percentage of the total number of requests.

cache_recent_percent_304_hits

cacheRecentPercent304Hits

Recently recorded ratio of 304 hits to all hits, expressed as a percentage.

cache_recent_percent_hit

cacheRecentPercentHit

Recently recorded cache hit ratio expressed as percentage.

cac_cur_pcb_miss

cacheCurMisses

Responses fetched from the origin and served from the cache. Should approximate storable misses. Does not include non-storable misses.

cache_tot_misses

cacheTotMisses

Intercepted HTTP requests requiring fetches from origin server.

cache_tot_storable_misses

cacheTotStoreAbleMisses

Cache misses for which the fetched response is stored in the cache before serving it to the client. Storable misses conform to a built-in or user-defined caching policy that contains a CACHE action.

cache_tot_non_storable_misses

cacheTotNonStoreAbleMisses

Cache misses for which the fetched response is not stored in the cache. These responses match policies with a NOCACHE action or are affected by Poll Every Time.

cache_tot_revalidation_misses

cacheTotRevalidationMiss

Responses that an intervening cache revalidated with the integrated cache before serving, as determined by a Cache-Control: Max-Age header configurable in the integrated cache.

cache_tot_full_to_conditional_request

cacheTotFullToConditionalRequest

Number of user-agent requests for a cached Poll Every Time (PET) response that were sent to the origin server as conditional requests.

cache_percent_storable_miss

cachePercentStoreAbleMiss

Responses that were fetched from the origin, stored in the cache, and then served to the client, as a percentage of all cache misses.

cache_recent_percent_storable_miss

cacheRecentPercentStoreAbleMiss

Recently recorded ratio of storable misses to all misses expressed as percentage.

cache_percent_successful_reval

cachePercentSuccessfulRevalidation

Percentage of times stored content was successfully revalidated by a 304 (Object Not Modified) response rather than by a full response.

cache_recent_percent_successful_reval

cacheRecentPercentSuccessfulRevalidation

Recently recorded percentage of times stored content was successfully revalidated by a 304 response rather than by a full response.

cache_tot_successful_revalidation

cacheTotSuccessfulRevalidation

Total number of times stored content was successfully revalidated by a 304 Not Modified response from the origin.

cache_percent_byte_hit

cachePercentByteHit

Bytes served from the cache divided by total bytes served to the client. If compression is On in the NetScaler, this ratio might not reflect the bytes served by the compression module. If the compression is Off, this ratio is the same as cachePercentOriginBandwidthSaved.

cache_recent_percent_byte_hit

cacheRecentPercentByteHit

Recently recorded cache byte hit ratio, expressed as a percentage. Byte hit ratio is defined here as the number of bytes served from the cache divided by the total number of bytes served to the client, which is the standard definition. If compression (CMP) is turned on in the NetScaler appliance, this ratio is less meaningful: it might under- or overestimate the origin-to-cache bandwidth saving, depending on whether the bytes served by CMP in the NetScaler are more or less than the compressed bytes served from the cache. If CMP is turned off in the NetScaler appliance, this ratio is the same as cacheRecentPercentOriginBandwidthSaved.

cactor_max32_res_so_far

cacheLargestResponseReceived

Size, in bytes, of largest response sent to client from the cache or the origin server.

cache_tot_resp_bytes

cacheTotResponseBytes

Total number of HTTP response bytes served by the NetScaler appliance from both the origin and the cache.

cache_bytes_served

cacheBytesServed

Total number of bytes served from the Integrated Cache.

cache_comp_bytes_served

cacheCompressedBytesServed

Number of compressed bytes served from the cache.

cac_tot_parameterized_inval_req

cacheTotParameterizedInvalidationRequests

Requests matching a policy with an invalidation (INVAL) action and a content group that uses an invalidation selector or parameters.

cac_tot_non_parameterized_inval_req

cacheTotNonParameterizedInvalidationRequests

Requests that match an invalidation policy where the invalGroups parameter is configured and expires one or more content groups.

cac_tot_inval_nostore_miss

cacheTotInvalidationRequests

Requests that match an invalidation policy and result in expiration of specific cached responses or entire content groups.

cache_percent_origin_bandwidth_saved

cachePercentOriginBandwidthSaved

Percentage of origin bandwidth saved, expressed as number of bytes served from the integrated cache divided by all bytes served. The assumption is that all compression is done in the NetScaler appliance.

cache_recent_percent_origin_bandwidth_saved

cacheRecentPercentOriginBandwidthSaved

Bytes served from cache divided by total bytes served to client. This ratio can be greater than 1 because of the assumption that all compression has been done in the NetScaler appliance.

cactor_tot_expire_at_last_byte

cacheTotExpireAtLastByte

Instances of content expiring immediately after receiving the last body byte due to the Expire at Last Byte setting for the content group.

cac_tot_enable_flashcache

cacheTotFlashcacheMisses

Number of requests to a content group with flash cache enabled that were cache misses. Flash cache distributes the response to all the clients in a queue.

cac_tot_delayed_logging

cacheTotFlashcacheHits

Number of requests to a content group with flash cache enabled that were cache hits. The flash cache setting queues requests that arrive simultaneously and distributes the response to all the clients in the queue.

cac_tot_parameterized_non304_hit

cacheTotParameterizedNon304Hits

Parameterized requests resulting in a full response (not status code 304: Object Not Modified) served from the cache.

cactor_tot_parameterized_req

cacheTotParameterizedRequests

Total number of requests where the content group has hit and invalidation parameters or selectors.

cac_tot_parameterized_304_hit

cacheTotParameterized304Hits

Parameterized requests resulting in an object not modified (status code 304) response.

cache_tot_parameterized_hits

cacheTotParameterizedHits

Parameterized requests resulting in either a 304 or non-304 hit.

cache_percent_parameterized_304_hits

cachePercentParameterized304Hits

Percentage of parameterized 304 hits relative to all parameterized hits.

cache_recent_percent_parameterized_hits

cacheRecentPercentParameterizedHits

Recently recorded ratio of parameterized 304 hits to all parameterized hits, expressed as a percentage.

cactor_tot_pet_with_nostore_reval

cacheTotPetRequests

Requests that triggered a search of a content group that has Poll Every Time (PET) enabled (always consult the origin server before serving cached data).

cache_tot_pet_hits

cacheTotPetHits

Number of times a cache hit was found during a search of a content group that has Poll Every Time enabled.

cache_percent_pet_hits

cachePercentPetHits

Percentage of cache hits in content groups that have Poll Every Time enabled, relative to all searches of content groups with Poll Every Time enabled.

cache_max_mem

cacheMaxMemoryKB

Largest amount of memory the NetScaler appliance can dedicate to caching, up to 50% of available memory. A 0 value disables caching, but the caching module continues to run.

cache64_max_mem

cache64MaxMemoryKB

Largest amount of memory the NetScaler appliance can dedicate to caching, up to 50% of the available memory. A 0 value disables caching, but the caching module continues to run.

cache_max_mem_active

cacheMaxMemoryActiveKB

Currently active value of maximum memory.

cache_utilized_mem

cacheUtilizedMemoryKB

Amount of memory the integrated cache is currently using.

cactor_cur_hash

cacheNumCached

Responses currently in integrated cache. Includes responses fully downloaded, in the process of being downloaded, and expired or flushed but not yet removed.

cache_cur_marker_cell

cacheNumMarker

Marker objects created when a response exceeds the maximum or minimum size for entries in its content group or has not yet received the minimum number of hits required for items in its content group.

cactor_err_no_buf

cacheErrMemAlloc

Total number of times the cache failed to allocate memory to store responses.

Related: