How To Troubleshoot And Fix The Situation When The ADM HA Is Not Working

One possible error condition is when the GUI, under System -> Deployment, reports the following symptoms:

Heartbeats are not received from the secondary

Data synchronization has failed on secondary


Apart from the information displayed in the GUI on the primary node, the following may also be observed:

– the secondary node is not running

– the secondary node is running, but mas_hb_monit process is not running
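The second observation can be checked from a process listing taken on the secondary node. The following is only a minimal sketch: it assumes you have captured the text output of a `ps` listing yourself, and the sample listing (including the `/mps/` path) is hypothetical.

```python
def process_running(ps_output: str, name: str) -> bool:
    """Return True if any line of a `ps` listing mentions the process name."""
    for line in ps_output.splitlines():
        fields = line.split()
        # Ignore the grep command itself when the listing came from `ps | grep`.
        if "grep" in fields:
            continue
        if any(name in field for field in fields):
            return True
    return False


# Hypothetical listing, for illustration only.
sample = """\
  PID TT  STAT    TIME COMMAND
 1201  -  Ss   0:01.20 /mps/mas_hb_monit
 1544  -  S    0:00.03 grep mas_hb_monit
"""

only_grep = """\
  PID TT  STAT    TIME COMMAND
 1544  -  S    0:00.03 grep mas_hb_monit
"""

print(process_running(sample, "mas_hb_monit"))
print(process_running(only_grep, "mas_hb_monit"))
```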


Time not getting sync on XenServer with NTP

High NTP offset and jitter while delay is low. This can be seen with “ntpq -p”.

Offset is the time difference between the local server and the remote server.

Jitter is the difference between the last and current offset measurements, so a high value means the offset is changing significantly from one measurement to the next.

Delay is the time it takes to communicate with the remote server. A low delay means that the issue is not related to network delays.

These measurements indicate that NTP cannot discipline the clock, because the clock drifts faster than NTP can correct it.
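The pattern described above (high offset and jitter, low delay) can be spotted in saved `ntpq -p` output with a short script. This is a sketch only: the thresholds (100 ms offset, 50 ms jitter, 10 ms delay) and the sample peer lines are assumptions chosen for illustration, not values from the article.

```python
def parse_ntpq_peers(text):
    """Parse the peer table printed by `ntpq -p` into a list of dicts."""
    peers = []
    for line in text.splitlines():
        if not line.strip():
            continue
        # The first column holds the tally code (*, +, -, space, ...).
        tally, fields = line[0], line[1:].split()
        if len(fields) != 10:
            continue  # separator row or unexpected layout
        try:
            delay, offset, jitter = (float(v) for v in fields[7:10])
        except ValueError:
            continue  # header row
        peers.append({"tally": tally, "remote": fields[0],
                      "delay": delay, "offset": offset, "jitter": jitter})
    return peers


def undisciplined_peers(peers, max_offset=100.0, max_jitter=50.0, max_delay=10.0):
    """Peers with high offset and jitter although the network delay is low.

    All three limits are in milliseconds and are illustrative assumptions.
    """
    return [p["remote"] for p in peers
            if abs(p["offset"]) > max_offset
            and p["jitter"] > max_jitter
            and p["delay"] < max_delay]


# Hypothetical output, for illustration only.
sample = """\
     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
*ntp1.example.com 192.0.2.10     2 u   33   64  377    0.512  812.340 240.118
+ntp2.example.com 192.0.2.11     2 u   12   64  377    0.634  798.901 231.554
+ntp3.example.com 192.0.2.12     2 u   40   64  377    0.498    0.215   0.087
"""

print(undisciplined_peers(parse_ntpq_peers(sample)))
```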


Event ID 505 Error message in Event Viewer: The Citrix Config Sync Service failed an import. Description: Unknown error occurred.

Work with Technical Support and check the CDF traces collected on the Delivery Controllers while restarting the Citrix Config Synchronization Service and the Citrix High Availability Service.

Look for an error like the following:

Error,”HA::DeleteDatabase caught: System.Data.SqlClient.SqlException (0x80131904): User does not have permission to alter database ‘HaImportDatabaseName’, the database does not exist, or the database is not in a state that allows access checks.

ALTER DATABASE statement failed

RemoveImportDb (UnknownError)”,””


7002659: How to progress stuck obituaries


3 – Report sync status needs to be error-free

Below is an example of a Report Synchronization Status output (on Linux, use ndsrepair -E).

Collecting replica synchronization status

Start: Wednesday, October 15, 2008 13:35:11 Local Time

Retrieve replica status

Partition: .[Root].

Replica on server: .doublevision.servers.novell

Replica: .doublevision.servers.novell 10-15-2008 13:34:21

Replica on server: .sled-vh1.servers.novell

Replica: .sled-vh1.servers.novell 10-15-2008 13:34:22

Replica on server: .linx-vh1.servers.novell

Replica: .linx-vh1.servers.novell 10-15-2008 13:34:21

All servers synchronized up to time: 10-15-2008 13:34:21

Finish: Wednesday, October 15, 2008 13:35:11 Local Time

Total errors: 0

For the partition that is being checked, the total number of errors must be 0.

If errors are listed it means that the synchronization process cannot finish, which means that no obituary processing can take place.

Obituary processing can only start if the synchronization process has successfully finished without errors.

(In a dstrace, “all processed = Yes” is visible for the partition if synchronization for that partition is successful.)
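The error-free requirement in step 3 can be checked against a saved report. The sketch below assumes the report text has already been captured to a string; it only reads the “Total errors” line shown in the example above.

```python
import re


def total_errors(report: str) -> int:
    """Return the number from the 'Total errors' line of a sync status report."""
    match = re.search(r"Total errors:\s*(\d+)", report)
    if match is None:
        raise ValueError("no 'Total errors' line found in report")
    return int(match.group(1))


def obituary_processing_possible(report: str) -> bool:
    """Obituary processing can only start when synchronization had no errors."""
    return total_errors(report) == 0


sample = """\
Partition: .[Root].
Replica on server: .doublevision.servers.novell
Replica: .doublevision.servers.novell 10-15-2008 13:34:21
All servers synchronized up to time: 10-15-2008 13:34:21
Finish: Wednesday, October 15, 2008 13:35:11 Local Time
Total errors: 0
"""

print(obituary_processing_possible(sample))
```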


4 – All servers in the replica ring must show sync’ed up within one hour from current time

The start time can be seen at the beginning of the log. Compare the start time to the time indicated for each replica.

If a server is not listed with errors but shows a time that is well over one hour old, or even days old, it may need a restart of eDirectory. (On Linux, as root, type rcndsd restart. On Netware, unload ds and load ds. On Windows, stop ds.dlm and start it again. On Solaris, type /etc/init.d/ndsd stop and then /etc/init.d/ndsd start.)
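The comparison in step 4 (start time of the report versus the time listed for each replica) can be sketched as follows. The report layout is taken from the example in step 3; the stale timestamp for sled-vh1 in the sample is made up to show what a flagged server looks like.

```python
import re
from datetime import datetime, timedelta

MONTHS = {name: number for number, name in enumerate(
    ["January", "February", "March", "April", "May", "June", "July",
     "August", "September", "October", "November", "December"], start=1)}


def stale_replicas(report, max_age=timedelta(hours=1)):
    """Return (server, age) for replicas synced more than max_age before Start."""
    m = re.search(
        r"Start:\s+\w+,\s+(\w+)\s+(\d+),\s+(\d{4})\s+(\d+):(\d+):(\d+)", report)
    if m is None:
        raise ValueError("no 'Start:' line found in report")
    month, day, year, hh, mm, ss = m.groups()
    start = datetime(int(year), MONTHS[month], int(day),
                     int(hh), int(mm), int(ss))

    stale = []
    pattern = r"Replica:\s+(\S+)\s+(\d{2}-\d{2}-\d{4} \d{2}:\d{2}:\d{2})"
    for server, stamp in re.findall(pattern, report):
        synced = datetime.strptime(stamp, "%m-%d-%Y %H:%M:%S")
        if start - synced > max_age:
            stale.append((server, start - synced))
    return stale


# Layout from step 3; the sled-vh1 timestamp is altered for illustration.
sample = """\
Start: Wednesday, October 15, 2008 13:35:11 Local Time
Replica on server: .doublevision.servers.novell
Replica: .doublevision.servers.novell 10-15-2008 13:34:21
Replica on server: .sled-vh1.servers.novell
Replica: .sled-vh1.servers.novell 10-14-2008 09:02:10
Replica on server: .linx-vh1.servers.novell
Replica: .linx-vh1.servers.novell 10-15-2008 13:34:21
"""

for server, age in stale_replicas(sample):
    print(server, "is behind by", age)
```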


5 – All servers in the Tree must be reachable, up and running

Any Server in the Tree could potentially need to be contacted for the obituary process.

The reason is that when a client logs in to a server and requests information about an object for which the server holds no replica, the server looks up (treewalks) the information on a server that does hold the replica and creates an external reference object in its own database.

The external reference is basically an empty object that points to the server that has the real object, so next time the information is requested the external reference object holds a pointer to the server that needs to be contacted for the information and no treewalking will be needed.

The external reference object will also cause a backlink attribute to be created on the object itself on the replica to keep track of servers that know about the object.

When the object is moved or deleted the backlink attribute is used to make sure servers that do not have a replica will also know what to do with the external reference object. This is done by the obituary process.

6 – Gather the external reference log using dsrepair/ndsrepair

Netware: Load dsrepair -a -> Advanced options menu -> Check external references

The dsrepair.log will be located in sys:system

Windows: From ndscons, load dsrepair.dlm with “-a” in the startup parameter line -> Repair -> Check External References

The dsrepair.log will be located in C:\Novell\NDS\DIBFiles

Linux: as root, type: ndsrepair -C -Ad -A

The default location for ndsrepair.log is /var/opt/novell/eDirectory/log/

(if the default is not used, the n4u.server.vardir variable shows the location when you type: ndsconfig get)


7 – Checking what partition(s) should be looked at

Below is a piece of an external reference log.

The line that starts with “Found obituary” indicates which object has the obituary; in this case it is CN=upuser.OU=test.O=novell.T=NOVELLWS

Looking at the path should reveal what partition this object would belong to.

For example if ou=test is a partition CN=upuser would belong to that partition.

(1) Found obituary for: EID: 0000a798, DN: CN=upuser.OU=test.O=novell.T=NOVELLWS

Value CTS : 10-21-2008 11:56:38 R = 0001 E = 0001

Value MTS = 10-21-2008 11:56:38 R = 0001 E = 0001, Type = 0001 DEAD,

Flags = 0000
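Working out which partition an object belongs to, as described in step 7, amounts to finding the deepest partition root that is a suffix of the object's DN. The sketch below assumes you supply the list of partition roots yourself (the list used here is hypothetical) and does not handle escaped dots inside names.

```python
from typing import List, Optional


def owning_partition(dn: str, partition_roots: List[str]) -> Optional[str]:
    """Return the deepest partition root that contains the given DN."""
    dn_parts = [part.lower() for part in dn.split(".")]
    best = None
    best_depth = 0
    for root in partition_roots:
        root_parts = [part.lower() for part in root.split(".")]
        depth = len(root_parts)
        # The partition root must match the trailing components of the DN.
        if depth <= len(dn_parts) and dn_parts[-depth:] == root_parts:
            if depth > best_depth:
                best, best_depth = root, depth
    return best


# Hypothetical partition roots for the tree in the example log.
roots = ["T=NOVELLWS", "OU=test.O=novell.T=NOVELLWS"]
dn = "CN=upuser.OU=test.O=novell.T=NOVELLWS"

print(owning_partition(dn, roots))
```

If ou=test were not a partition, the same call with only the tree root in the list would return the tree root instead.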

8 – Checking backlink obituaries for problems

Below is an example of an external reference log.

A backlink obituary can be identified by the following: “Type = 0006 BACKLINK”

If backlink obituaries are the reason the obituaries are not progressing, it is likely that the log shows the same pattern as in our example below.

The Flags are the steps through which the process needs to go (0000, 0001, 0002 and 0004).

Check the backlink obituaries that belong to the same object (tip: look for the same EID number) and find one that is a step behind (lower Flags value) compared to the other backlink obituaries. It is possible that the server it refers to cannot be contacted or is not correctly backlinked.

Find the server that belongs to that backlink obituary in the log. It is listed just below the obituary.

In this example we see that there is one backlink obituary that is still at flags = 0000 while the other backlink obituary is already at flags = 0001.

We can see in the example that the backlink obituary that is not going forward points to server CN=doublevision.OU=servers.O=novell.T=NOVELLWS

Possible causes are :

1) The server is physically no longer in use (fix: remove its NCP Server object; this will clean up the backlinks that point to that server)

2) The server is experiencing a problem and may need a restart of eDirectory or even the server itself

3) The backlink is no longer valid on the server that has the external reference object (eg. may be pointing to wrong server)

In this case a “-xk3” repair would be required on the server that holds the external reference object in order for it to verify and correct any wrong backlinks it may have.

Netware: Load dsrepair -XK3 ->Advanced options menu ->Repair Local DS database -> F10 to start the repair

When done on the console type: set dstrace=*b to start the backlink process (give it some time to finish)

Linux: as root type: ndsrepair -R -Ad -XK3

When done type: ndstrace

A screen appears and in this you can type “set dstrace=*b”

To exit type “exit” (give it some time to finish)

Windows: From ndscons load the dsrepair.dlm with “-xk3” in the startup parameter line -> Repair -> Local Database Repair… click repair.

When done from ndscons highlight the ds.dlm and click on configure -> Triggers -> backlinker

(give it some time to finish)

Example:

Repair utility for Novell eDirectory 8.8 – 8.8 SP2 v20213.08

DS Version 20216.62 Tree name: NOVELLWS

Server name: .linx-vh1.servers.novell

Size of /var/opt/novell/eDirectory/log/ndsrepair.log = 34420 bytes.

Preparing Log File “/var/opt/novell/eDirectory/log/ndsrepair.log”

Please Wait…

External Reference Check

External Reference Check

Start: Tuesday, October 21, 2008 11:58:38 Local Time

(1) Found obituary for: EID: 0000a798, DN: CN=upuser.OU=test.O=novell.T=NOVELLWS

Value CTS : 10-21-2008 11:56:38 R = 0001 E = 0001

Value MTS = 10-21-2008 11:56:38 R = 0001 E = 0001, Type = 0001 DEAD,

Flags = 0000


(2) Found obituary for: EID: 0000a798, DN: CN=upuser.OU=test.O=novell.T=NOVELLWS

Value CTS : 10-21-2008 11:56:38 R = 0001 E = 0002

Value MTS = 10-21-2008 11:57:57 R = 0001 E = 0003, Type = 0006 BACKLINK,

Flags = 0001

NOTIFIED

Backlink: Type = 00000001 DEAD, RemoteID = ffffffff,

ServerID = 00008043, CN=sled-vh1.OU=servers.O=novell.T=NOVELLWS

(3) Found obituary for: EID: 0000a798, DN: CN=upuser.OU=test.O=novell.T=NOVELLWS

Value CTS : 10-21-2008 11:56:38 R = 0001 E = 0003

Value MTS = 10-21-2008 11:57:57 R = 0001 E = 0004, Type = 0006 BACKLINK,

Flags = 0000

Backlink: Type = 00000001 DEAD, RemoteID = ffffffff,

ServerID = 0000807d, CN=doublevision.OU=servers.O=novell.T=NOVELLWS

(4) Found obituary for: EID: 0000a798, DN: CN=upuser.OU=test.O=novell.T=NOVELLWS

Value CTS : 10-21-2008 11:56:38 R = 0001 E = 0004

Value MTS = 10-21-2008 11:57:57 R = 0002 E = 0001, Type = 000c USED_BY,

Flags = 0002

OK_TO_PURGE

Used by: Resource type = 00000000, Event type = 00000003, Resource ID = 00008026, T=NOVELLWS

Checked 0 external references

Found: 4 total obituaries in this DIB,

2 Unprocessed obits, 0 Purgeable obits,

1 OK_To_Purge obits, 1 Notified obits

Total errors: 0

NDSRepair process completed.
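The manual check from step 8 (group backlink obituaries by EID and look for one whose Flags value is behind the others) can be sketched in a few lines. This assumes the log layout shown in the example above; the parsing patterns are written against that layout only and are not an official log format definition.

```python
import re
from collections import defaultdict


def lagging_backlinks(log: str):
    """Return (eid, server, flags) for backlink obituaries that are behind
    the furthest-progressed backlink obituary of the same object."""
    obits = []
    current = None
    for raw in log.splitlines():
        line = raw.strip()
        m = re.search(r"Found obituary for: EID: (\w+), DN: (.+)", line)
        if m:
            current = {"eid": m.group(1), "type": None,
                       "flags": None, "server": None}
            obits.append(current)
            continue
        if current is None:
            continue
        m = re.match(r"Value MTS .*Type = \w{4} (\w+)", line)
        if m:
            current["type"] = m.group(1)
            continue
        m = re.match(r"Flags = ([0-9a-fA-F]{4})", line)
        if m:
            current["flags"] = m.group(1)
            continue
        m = re.match(r"ServerID = \w+, (.+)", line)
        if m:
            current["server"] = m.group(1)

    by_eid = defaultdict(list)
    for obit in obits:
        if obit["type"] == "BACKLINK" and obit["flags"] is not None:
            by_eid[obit["eid"]].append(obit)

    lagging = []
    for eid, group in by_eid.items():
        furthest = max(int(o["flags"], 16) for o in group)
        for o in group:
            if int(o["flags"], 16) < furthest:
                lagging.append((eid, o["server"], o["flags"]))
    return lagging


sample = """\
(1) Found obituary for: EID: 0000a798, DN: CN=upuser.OU=test.O=novell.T=NOVELLWS
Value CTS : 10-21-2008 11:56:38 R = 0001 E = 0001
Value MTS = 10-21-2008 11:56:38 R = 0001 E = 0001, Type = 0001 DEAD,
Flags = 0000
(2) Found obituary for: EID: 0000a798, DN: CN=upuser.OU=test.O=novell.T=NOVELLWS
Value CTS : 10-21-2008 11:56:38 R = 0001 E = 0002
Value MTS = 10-21-2008 11:57:57 R = 0001 E = 0003, Type = 0006 BACKLINK,
Flags = 0001
NOTIFIED
Backlink: Type = 00000001 DEAD, RemoteID = ffffffff,
ServerID = 00008043, CN=sled-vh1.OU=servers.O=novell.T=NOVELLWS
(3) Found obituary for: EID: 0000a798, DN: CN=upuser.OU=test.O=novell.T=NOVELLWS
Value CTS : 10-21-2008 11:56:38 R = 0001 E = 0003
Value MTS = 10-21-2008 11:57:57 R = 0001 E = 0004, Type = 0006 BACKLINK,
Flags = 0000
Backlink: Type = 00000001 DEAD, RemoteID = ffffffff,
ServerID = 0000807d, CN=doublevision.OU=servers.O=novell.T=NOVELLWS
"""

for eid, server, flags in lagging_backlinks(sample):
    print(f"EID {eid}: backlink to {server} is stuck at flags = {flags}")
```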

9 – Inhibit_move obituaries and how to get them progressed

First the explanation:

When any object is moved from one location to another in the database (for example from ou=accounting.o=novell to ou=users.o=novell), the “old” location of the object gets a MOVED obituary and the “new” location receives an INHIBIT_MOVE obituary.

The obituary process takes place just as it would when deleting an object. However, if the new location is in a different partition, it may need to contact another server to negotiate the process.

(The server holding the master replica for a partition needs to do this, and if two partitions are involved, two different servers may be needed to progress the obits.)

In this process we sometimes see that the MOVED obituary is processed just fine along with its backlink obituaries, but the INHIBIT_MOVE obituary does not progress and remains at flags = 0000.

We call this an “orphaned INHIBIT_MOVE” obituary.

Before considering any fix, we need to verify that this is truly the case. Check all servers that hold master replicas for any MOVED obituary, to make sure we do not break the system when we try to fix this.

Once we have verified and are satisfied that no MOVED obituary exists anywhere for the object that has the INHIBIT_MOVE obituary we can proceed with the fix.

The fix is: TID 3908200

P.S. If the object holds not only an INHIBIT_MOVE but also a DEAD obituary, you will need to contact Novell Technical Support.


10 – Master server is clean but obituaries are still seen on servers that hold a read/write replica

If this document is followed and no more obituaries are seen when checking the server holding the master replica it may still be possible that one or more of the servers holding a read/write replica still show obituaries for the partition that is worked on.

To get these progressed you will need to timestamp these obituaries in order to get them sent to the master for the partition for progressing.

Preferably, do this on the server holding the read/write replica that shows the most obituaries.

You can do this by running a -OT repair:

Netware: Load dsrepair -OT ->Advanced options menu ->Repair Local DS database -> F10 to start the repair

Linux: as root type: ndsrepair -R -Ad -OT

Windows: From ndscons load the dsrepair.dlm with “-OT” in the startup parameter line -> Repair -> Local Database Repair… click repair.

Repeat step 10 until all replicas are clean and no longer show the obituaries.


VNX: Control Station (CS) Memory Utilization is High

Article Number: 484082 Article Version: 2 Article Type: Break Fix



VNX VG10,VNX VG2,VNX VG50,VNX VG8,VNX1 Series,VNX2 Series,VNX5200,VNX5300,VNX5400,VNX5500,VNX5600,VNX5700,VNX5800,VNX7500,VNX7600,VNX8000

* Running “top” and “free -m” commands on the Control Station shows the used memory is 90%+

$ top

top – 11:27:10 up 5 days, 15:57, 1 user, load average: 0.10, 0.21, 0.28

Mem: 2071700k total, 1935556k used, 136144k free, 493100k buffers

Swap: 2096440k total, 108k used, 2096332k free, 820184k cached


$ free -m

total used free shared buffers cached

Mem: 2023 1890 132 0 481 800

Swap: 2047 0 2047

* The “top” command shows that no single process is consuming the memory, and the load average is low (less than 1).

* Used “Swap” is almost 0.

* Control Station performance is normal, even though memory usage is 90%+.

* Monitoring systems keep alerting the customer about abnormal memory usage.

* Rebooting the Control Station reduces the used memory, but after a while it starts increasing again.

This is normal and works as designed.

Control Station OE is based on Linux. Linux always tries to use RAM to speed up disk operations by using available memory for buffers (file system metadata) and cache (pages with the actual contents of files or block devices). This helps the system run faster, because disk information is already in memory, which saves I/O operations.

If space is needed by programs or applications like Oracle, then Linux will free up the buffers and cache to yield memory for the applications. If your system runs for a while you will usually see a small number under the field “free” on the first line.

This is a normal situation and needs no action as long as:

1- No process consumes the memory (check using the top command).

2- Load average is low, less than 1 (check using the top command).

3- Used “Swap” is almost 0 (check using the top command and free -m).

4- Control Station performance is normal.

5- Buffers and cache account for most of the used memory (check using the top command and free -m):

$ top

top – 11:27:10 up 5 days, 15:57, 1 user, load average: 0.10, 0.21, 0.28

Mem: 2071700k total, 1935556k used, 136144k free, 493100k buffers

Swap: 2096440k total, 108k used, 2096332k free, 820184k cached


$ free -m

total used free shared buffers cached

Mem: 2023 1890 132 0 481 800

-/+ buffers/cache: 607 1415

Swap: 2047 0 2047
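The arithmetic behind condition 5 can be made explicit: subtracting buffers and cache from “used” gives the memory actually held by processes. The sketch below parses the “Mem:” row of the older `free -m` layout shown above (with separate buffers and cached columns); newer versions of free print different columns, so treat the column positions as an assumption.

```python
def memory_breakdown(free_output: str) -> dict:
    """Split 'used' memory into application use and reclaimable buffers/cache."""
    for line in free_output.splitlines():
        if line.strip().startswith("Mem:"):
            total, used, free, shared, buffers, cached = (
                int(value) for value in line.split()[1:7])
            reclaimable = buffers + cached
            app_used = used - reclaimable
            return {
                "total_mb": total,
                "reported_used_pct": round(100 * used / total, 1),
                "app_used_mb": app_used,
                "app_used_pct": round(100 * app_used / total, 1),
                "reclaimable_mb": reclaimable,
            }
    raise ValueError("no 'Mem:' row found")


sample = """\
total used free shared buffers cached
Mem: 2023 1890 132 0 481 800
-/+ buffers/cache: 607 1415
Swap: 2047 0 2047
"""

for key, value in memory_breakdown(sample).items():
    print(key, value)
```

The computed application figure can differ by a few MB from the “-/+ buffers/cache” row, since free derives that row from kilobyte values before rounding. Either way, the bulk of the reported usage is reclaimable buffers and cache.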


7015361: Quick Start: Advanced tuning parameters for eDirectory 8.8 SP8 and eDirectory 9 explained

This document (7015361) is provided subject to the disclaimer at the end of this document.

Environment

NetIQ eDirectory 8.8 SP8 for All Platforms

NetIQ eDirectory 9.0 for All Platforms

Situation

The challenging environments in which eDirectory can be found today have changed over time. In the past, eDirectory was designed with a bias toward the client. More priority was given to clients and applications than to its own background processes.
Today, however, environments have changed:

– eDirectory databases are holding an ever greater number of objects and attaining larger sizes.

– More servers are seen in the replica rings and they tend to be distributed geographically.
– More dynamic environments: an ever increasing number of object modifications, deletions, renames and moves. Many of these changes are programmatic in nature (e.g., IDM).

– Authentications are now measured in logins per second rather than per day.
In a dynamic environment these factors could result in a delay in the processing of object changes in change cache before they are sent to other servers. This results in ever greater overhead on a server while the processes responsible for processing these changes compete for time. eDirectory 8.8 SP8 includes new background process optimizations that can drastically reduce the time to skulk and process object changes, by multiple orders of magnitude. The degree to which performance increases are actually seen in production will vary between environments.
The good news is that many of these new features are enabled out of the box and require nothing additional from the Administrator. Many customers have seen an overall 20% improvement doing nothing other than upgrading their servers to 8.8 SP8. The other optimization features in 8.8 SP8 must be enabled and tuned by the Administrator, but it is these optimizations that hold the greatest promise for significant performance improvements. This TID describes each tuning parameter and provides an example with some starting points.

Every site is different in terms of tuning eDirectory. One’s own internal testing is the only way to find the tuning sweet spots for any given organization. For more detail on the features discussed here please refer to the eDirectory Admin Guide found on the NetIQ Documentation site: https://www.netiq.com/documentation/edir88/.

Resolution


Features enabled out of the box

Obituary Process Optimization

  • Prior to 8.8 SP8 eDirectory had two redundant methods of processing obituaries: backlinks and DRLs (Distributed Reference Links). The DRL method has been removed eliminating these unnecessary cycles.
  • Additionally, if all servers in a ring are on 8.8 SP8, flag 2 is no longer used in processing delete, rename and move obituaries. This reduces the cycles required to process an obituary by 50-75%. Some tests have shown an improvement of up to 230% in the total time to process obituaries.
  • Another side benefit is there are far fewer “UsedBy” and “Obit_UsedBy” obituary attributes that need to be processed.

Obituary Process Scheduling Optimization

  • Prior to 8.8 SP8, the obituary process would not run on a partition if the server was currently outbounding changes for that partition to other servers. This would cause obituaries for that partition to be delayed. In a busy environment with hundreds of changes per second this could lead to change cache buildup. Now the obituary process can run in parallel with outbound synchronization, thereby reducing obituary processing delays.

Priority Queue

  • Synchronization, especially by server mode, could result in only a few servers in a given replica ring actively skulking to each other at any given time. Previously, eDirectory did not store the order of servers to schedule for outbound synchronization as a priority queue. Now the scheduling of synchronization is a priority queue that implements a FIFO ordering, so that servers just contacted go to the end of the queue. Servers with the oldest scheduled time go to the front of the list. Additionally, servers that returned a “-698 – Server Busy” when contacted are now rescheduled to the front of the queue.
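The scheduling behaviour described above can be illustrated with a small priority queue. This is a toy model, not eDirectory's implementation: the class, its method names, and the numeric timestamps are all invented for the example. It shows the three rules from the text: oldest scheduled time first, just-contacted servers to the back, and busy (-698) servers back to the front.

```python
import heapq
import itertools


class SyncScheduler:
    """Toy model of a priority queue for outbound synchronization."""

    def __init__(self):
        self._heap = []
        self._order = itertools.count()  # FIFO tie-breaker for equal times

    def schedule(self, server, when):
        heapq.heappush(self._heap, (when, next(self._order), server))

    def next_server(self):
        """Pop the server with the oldest scheduled time."""
        _, _, server = heapq.heappop(self._heap)
        return server

    def reschedule_after_contact(self, server, now, interval):
        """A server that was just contacted goes to the end of the queue."""
        self.schedule(server, now + interval)

    def reschedule_busy(self, server, now):
        """A server that answered -698 (busy) goes back to the front."""
        front = self._heap[0][0] if self._heap else now
        self.schedule(server, min(front, now) - 1)


scheduler = SyncScheduler()
for name, when in [("A", 10), ("B", 20), ("C", 30)]:
    scheduler.schedule(name, when)

order = []
server = scheduler.next_server()          # oldest scheduled time first
order.append(server)
scheduler.reschedule_after_contact(server, now=100, interval=60)

server = scheduler.next_server()
order.append(server)
scheduler.reschedule_busy(server, now=101)  # pretend it returned -698

while scheduler._heap:
    order.append(scheduler.next_server())

print(order)
```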


Features not enabled by default

Asynchronous Outbound Synchronization

  • In the past, one thread was responsible for iterating through all entries in the change cache to see if there was a new value that needed to be sent to other servers. This same thread was also responsible for putting the values in a packet, sending the packet over the wire and waiting on an acknowledgement from the receiving server before proceeding. Performing these tasks sequentially was costly in terms of the time required to get the changes out to other servers. In eDirectory 8.8 SP8 the work has been split between two threads. One examines the change cache, prepares the outgoing packets, then fills a queue with the packets. The second thread picks up the packets from the queue and sends them to the remote server one at a time. This has reduced the time to get changes out to other servers by up to 50% in some cases.
  • The default setting for Asynchronous Outbound Synchronization is disabled. In eDirectory 8.8 SP8 iMonitor provides an interface to enable this feature. If this feature is enabled, the server will be much more aggressive in sending changes out to the remote servers. This can put pressure on the receiving servers, resulting in higher utilization. Therefore, another setting has been included with this feature, the Async Dispatcher Thread Delay. This setting allows the administrator to control, in milliseconds, the frequency with which changes are sent to the remote servers. The Async Dispatcher Thread Delay setting, by default, has no delay (0 milliseconds). The allowed range for this setting is 0 – 999 milliseconds. If this value is very small, meaning the thread fires more often, higher CPU and I/O utilization may be seen on the receiving servers due to the higher amount of inbound traffic. Therefore, this setting should be monitored and fine-tuned for the specific environment.
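The two-thread split described above is a classic producer/consumer arrangement. The sketch below models it with Python's standard `queue` and `threading` modules; it is an illustration of the pattern only, not eDirectory code. The function name, the `packet_size` parameter and the `send` callback are hypothetical, while the 0 to 999 ms range for the dispatcher delay comes from the text.

```python
import queue
import threading
import time


def run_outbound_sync(changes, send, dispatcher_delay_ms=0, packet_size=2):
    """One thread packs changes into packets, a second thread sends them."""
    if not 0 <= dispatcher_delay_ms <= 999:
        raise ValueError("dispatcher delay must be between 0 and 999 ms")

    packets = queue.Queue()
    done = object()  # sentinel that tells the dispatcher to stop

    def packer():
        # Examines the pending changes and fills the queue with packets.
        batch = []
        for change in changes:
            batch.append(change)
            if len(batch) == packet_size:
                packets.put(batch)
                batch = []
        if batch:
            packets.put(batch)
        packets.put(done)

    def dispatcher():
        # Picks packets off the queue and sends them one at a time.
        while True:
            packet = packets.get()
            if packet is done:
                break
            send(packet)
            if dispatcher_delay_ms:
                time.sleep(dispatcher_delay_ms / 1000)

    threads = [threading.Thread(target=packer),
               threading.Thread(target=dispatcher)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()


sent = []
run_outbound_sync(["c1", "c2", "c3", "c4", "c5"], sent.append,
                  dispatcher_delay_ms=1)
print(sent)
```

Raising the delay slows the dispatcher without blocking the packer, which is the trade-off the Async Dispatcher Thread Delay exposes: less pressure on receiving servers in exchange for changes leaving more slowly.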

NOTE: In eDirectory 9, Asynchronous Outbound Synchronization is turned on by default. It uses the Background Process Delay with a Maximum CPU Utilization of 80%.

Example of use
Environment: all servers are upgraded to 8.8 SP8.
Goal: Synchronize changes out to other servers more quickly. It is recommended for most customers that this setting be turned on for all servers. By enabling this feature eDirectory will more aggressively process entries in change cache as well as sending changes to other servers in the replica ring. In most environments a setting of 0 for the Async Dispatcher Thread Delay is recommended. However, this could put stress on the receiving servers. They should be monitored for CPU utilization and this delay can be adjusted if needed. The next section, Background Process Delay Settings, is also used to inject delays if unacceptably high utilization is seen on the outbounding server.
Steps:
– Open a browser and connect to the server’s iMonitor interface: https://x.x.x.x:8030/nds
– Navigate to Agent Configuration – Background Process Settings – Asynchronous Outbound Synchronization Settings
– Click on the radio button to enable the feature, accept the default of 0 milliseconds for the delay then click on Submit.

Background Process Delay Settings
  • Background process scalability improvements were of primary importance during the development of eDirectory 8.8 SP8. As mentioned earlier, the emphasis in the past was to make background processing a lower priority than that of applications and clients. During 8.8 SP8’s development an analysis of code and extensive testing revealed that many background processes were spending a significant percentage of time sleeping. Dramatic gains in performance can be had in forcing these processes to fire more frequently. These changes need to be carefully considered, especially on older servers, as an increased frequency can result in an increase of the utilization of the hardware.
  • Currently the following three processes have a hard-coded delay of 100 ms:
    • Change Cache Processing Delay
    • ObitProc Delay
    • Purger Delay
  • Better performance can be had by lowering the delay times so the processes spend less time sleeping. This can be done via a Hard Limit or CPU Based Policy. Lowering these values can improve the following:
    • obituary processing speed
    • purging of objects in change cache
    • faster analysis of objects in change cache resulting in increased synchronization speed.
  • Policies
    • Hard Limit Policy – default
      • This allows the administrator to define a fixed value, in milliseconds, for how long each process will be delayed. However, this approach means the administrator must manually find the right settings.
    • CPU Based Dynamic Policy
      • This allows the system to step the delay up or down based on current overall CPU usage. The default, when enabled, is an 80% CPU limit with a 100ms sleep time. The server checks this utilization every 100 milliseconds. Should the load on the server become higher than the configured maximum, the system will begin to increase the sleep time in 5 millisecond increments until utilization falls below the maximum.
      • A test was performed using the CPU policy with a max CPU of 40% and a max Delay of 0. This resulted in an 80% improvement when processing a bulk import of one million objects.
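The CPU-based policy can be pictured as a small feedback loop. In the sketch below, the 80% limit, the 100 ms maximum and the 5 ms step-up come from the description above; the function name is hypothetical, and stepping back down by the same 5 ms when the CPU is under the limit is an assumption, since the text does not state the step-down size.

```python
def next_delay_ms(current_ms, cpu_percent, max_cpu=80, max_delay_ms=100,
                  step_ms=5):
    """Return the next background-process delay for one sampling interval."""
    if cpu_percent > max_cpu:
        # Server is too busy: sleep longer, up to the configured maximum.
        return min(current_ms + step_ms, max_delay_ms)
    # Server has headroom: sleep less (step-down size is an assumption).
    return max(current_ms - step_ms, 0)


delay = 0
history = []
for cpu in [90, 95, 85, 70, 60]:  # made-up CPU samples, one per interval
    delay = next_delay_ms(delay, cpu)
    history.append(delay)

print(history)
```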
Example of use
Environment: all servers are upgraded to 8.8 SP8.
Goal: Speed up replication, the processing of obituaries and reduce the number of objects in change cache waiting to be processed and purged.
Steps: For most environments it is recommended to use the CPU-based Background Process Delay Setting rather than the Hard Limit. This allows the server itself to find the sweet spot rather than the administrator having to manually try multiple values in order to find the right balance.
– Open a browser and connect to the server’s iMonitor interface: https://x.x.x.x:8030/nds
– Navigate to Agent Configuration – Background Process Settings – Asynchronous Outbound Synchronization Settings
– In the Background Process Delay Settings section, click the radio button to select the CPU-based method, then accept the defaults of 80% for the Maximum CPU Utilization and 80ms for the Maximum Delay Limit.
– Click on Submit.

Configuring the Mode and Threads used in Outbound Synchronization
  • eDirectory, by default, uses a dynamic (system computed) mode to determine both the mode of synchronization as well as the number of threads to be used in that mode. Performance can be enhanced by both manually selecting the mode and setting the maximum number of threads to be used. This can increase the number of other servers or partitions that a server can synchronize to simultaneously beyond what would have been auto calculated.
  • Setting the Mode of Synchronization
    • Modes
      • By Partition. This is the most conservative as it uses one thread per partition to update multiple servers. By default, if this mode is selected the number of threads will equal the number of partitions on the server up to a maximum of 8 threads. If the System Computed Synchronization Threads is disabled and the Maximum Manual Synchronization Threads is set to 12 the maximum number of threads used can go up to 12. Changes are then synchronized simultaneously with other replica holding servers.
        • If there are more unique partitions than there are unique servers in that server’s replica rings this mode is the most efficient if there are dynamic changes to objects in multiple partitions.
        • For example, if there are 12 partitions but only three servers holding replicas and all 12 partitions contain object changes, then 12 threads can be scheduled to send these changes out for all 12 partitions at once.
      • By Server. This mode is chosen if the number of unique servers in the server’s replica rings is greater than the number of partitions on that server.
        • Maximum number of threads used = [(number of unique servers +1) /2]
        • This mode might be chosen in the following example: There are two partitions in the tree and 9 servers holding replicas. Therefore the calculation used would be (9 unique servers +1) /2 = 5 threads. If partition mode were chosen, the server would only schedule two threads to outbound to the other servers.
      • Dynamic mode. This is the default. The system calculates the mode and threads available for outbound synchronization.
        • Dynamic is enabled by default. This allows the system to choose which mode and the number of threads to use on startup.
  • Setting the number of Synchronization Threads
    • Below are the new thread options:
      • Option A – System Computed Synchronization Threads. This is enabled by default. Disabling this setting prevents the system from calculating how many threads to use. Once disabled, the administrator can set the number of outbound threads using Option C below.
      • Option B – Maximum System Computed Synchronization Threads. This number represents the maximum number of threads that can be used when the server is in dynamic mode. By default the maximum threads dynamic mode can use is 8 threads. This new option is only used with Option A enabled as well and allows the administrator to manually configure up to a maximum of 12 threads.
      • Option C – Maximum Manual Setting Synchronization threads This number is the maximum number of threads that can be used in either mode. This option overrides Option A and allows the administrator to set the number of outbound threads up to the maximum value. By manually selecting the mode and configuring the number of threads for outbound sync, administrators gain control over the number of outbound threads potentially used in outbound sync. Therefore, using the example of 9 unique servers and 2 partitions in dynamic mode the server would choose server mode and calculate the number of threads to use in outbound synchronization to only 5. However, by setting the following in iMonitor we can increase the number of threads so that all 9 servers can potentially receive synchronization changes at once.
        • Set the Synchronization Method to Server
        • Disable the System Computed Synchronization Threads
        • Set the Maximum Manual Setting Synchronization Threads to 9
          • NOTE: The Max. System Computed Synchronization Threads field and the Max. Manual Setting Synchronization Threads field are mutually exclusive.
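The mode selection and thread arithmetic described above can be written out as a short function. This is a sketch of the rules as stated in the text, not eDirectory's code: the function name is hypothetical, rounding the server-mode formula up is inferred from the later example that gives 4 threads for 6 servers, and applying the same thread cap to both modes is an assumption.

```python
import math


def outbound_sync_plan(unique_servers, partitions, max_threads=8):
    """Return (mode, threads) the way dynamic mode is described as choosing."""
    if unique_servers > partitions:
        mode = "server"
        # [(number of unique servers + 1) / 2], rounded up (inferred).
        threads = math.ceil((unique_servers + 1) / 2)
    else:
        mode = "partition"
        threads = partitions  # one thread per partition
    return mode, min(threads, max_threads)


print(outbound_sync_plan(unique_servers=9, partitions=2))
print(outbound_sync_plan(unique_servers=6, partitions=2))
print(outbound_sync_plan(unique_servers=6, partitions=12))
print(outbound_sync_plan(unique_servers=6, partitions=12, max_threads=12))
```

The first two calls show why manual tuning can help: with 9 servers the computed value is 5 threads and with 6 servers it is 4, so setting the manual maximum to the number of servers lets every server be reached at once.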

Example 1
Environment: 6 servers – 12 partitions. All servers hold all partitions. This is a dynamic environment where frequent object changes are occurring in all partitions.
Goal: Configure the most efficient synchronization mode and determine the optimal number of threads used for outbounding object changes. Ideally, this will result in object changes being sent to as many servers as possible simultaneously.
  • Method: Partition Mode has been selected. We have only 6 servers but 12 busy partitions. The maximum number of threads that can be used for outbound synchronization is 16. If partition mode is selected and the number of threads is set to 12, this server can simultaneously outbound changes in all 12 partitions to remote servers.
  • System Computed Synchronization Threads: leave at enabled
  • Maximum System Computed Synchronization Threads: 12
Steps:
– Open a browser and connect to the server’s iMonitor interface: https://x.x.x.x:8030/nds
– Navigate to the Agent Configuration – Agent Synchronization page
– For the Synchronization Method click on By Partition.
– Continue to use System Computed Synchronization Threads
– Set the Maximum System Computed Synchronization Threads to 12
– Click on Submit.
Example 2
Environment: 6 servers and 2 partitions. All servers hold both partitions and are upgraded to 8.8 SP8. Modifications occur across all 6 servers.
Goal: In this case there are more servers than partitions. Therefore the threads will be set to server mode with 6 threads selected. In this configuration the server will assign each of the threads to a particular server so that all six threads are outbounding changes from the 2 partitions.
  • Method: Server Mode has been selected.
  • System Computed Synchronization Threads: disabled. In server mode, if this option were not disabled, the server would have calculated 4 threads to be used [(number of unique servers +1) /2].
  • Maximum Manual Synchronization Threads: 6
Steps:
– Open a browser and connect to the server’s iMonitor interface: https://x.x.x.x:8030/nds
– Navigate to the Agent Configuration – Agent Synchronization page

– For the Synchronization Method click on By Server.

– Disable the System Computed Synchronization Threads
– Set the Maximum Manual Setting Synchronization Threads to 6.
– Click on Submit.
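
The reasoning behind Examples 1 and 2 can be sketched in a few lines of Python. This is only an illustration of the rule of thumb used in this TID, not eDirectory code: the function names are hypothetical, the tie case (equal servers and partitions) defaulting to server mode is an assumption, and rounding the system computed value up is an assumption made so that 6 servers yields the 4 threads quoted above.

```python
import math

MAX_OUTBOUND_THREADS = 16  # upper limit for outbound sync threads stated in this TID


def system_computed_server_threads(unique_servers: int) -> int:
    """(number of unique servers + 1) / 2; rounding up is an assumption."""
    return math.ceil((unique_servers + 1) / 2)


def suggest_sync_settings(servers: int, busy_partitions: int) -> dict:
    """Pick a synchronization method the way Examples 1 and 2 do."""
    if busy_partitions > servers:
        # More busy partitions than servers: one thread per partition.
        return {
            "method": "By Partition",
            "system_computed": True,
            "max_threads": min(busy_partitions, MAX_OUTBOUND_THREADS),
        }
    # Otherwise one thread per server, set manually (tie handling is assumed).
    return {
        "method": "By Server",
        "system_computed": False,
        "max_threads": min(servers, MAX_OUTBOUND_THREADS),
    }


print(suggest_sync_settings(servers=6, busy_partitions=12))  # Example 1
print(suggest_sync_settings(servers=6, busy_partitions=2))   # Example 2
print(system_computed_server_threads(6))
```

For Example 2 the sketch shows why the system computed option is disabled: the computed value would be lower than the 6 threads needed to give every server its own thread.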

Policy Based Replication
eDirectory, by default, uses a mesh topology wherein each server must be able to contact all others in the tree. In some environments, for example where modifications only occur on one server, the Administrator can enable this feature and create a policy that defines which servers a single server can outbound to. This is a very advanced setting and should be used with caution. The Administrator must have deep knowledge about the environment and connected systems. This feature will not be covered in this TID but more information can be found in the eDirectory 8.8 SP8 Admin Guide, section 4.1.4.

Configuring the Purger Interval
  • The purger process is responsible for removing entries from the change cache once all servers have seen that object’s changes. The default is to run for 15 minutes followed by a 30 minute delay. In earlier versions of eDirectory the purger would wait on the skulker to complete. In sites that have continuous amounts of changes it could be hours before it fired.
  • In eDirectory 8.8 SP8 the purger is allowed to run alongside the skulker. Further, the delay setting is now configurable. This can be set lower so that the purger process runs more frequently and objects are removed from change cache more quickly once they are seen by all replica bearing servers. 10 minutes is the recommended minimum so that it does not run excessively.
Environment: This customer’s environment is very dynamic. Many programmatic modifications are occurring in the tree due to IDM drivers and a large number of user logins. These changes are quickly getting out to other servers due to the previous tuning changes described above. However, change cache is still growing excessively due to the number of changes between purger cycles.
Goal: To better maintain the backlog of entries in change cache whose changes have been seen by all servers. To give the purger process more opportunity to remove these “have seen” entries from change cache, the purger will be set to run more often, every 10 minutes. If this proves too low (i.e., it is running all the time), it will be adjusted up to 20 minutes. Running more frequently should result in fewer objects in change cache when it does run.
Steps:
– Open a browser and connect to the server’s iMonitor interface: https://x.x.x.x:8030/nds
– Navigate to the Agent Configuration – Background Process Settings page – Background Process Interval section – Purger Interval
– Enter the number of minutes the Purger Process will wait before running. The 30 minute default works for most environments but for this use case it will be set to 10. (Again, as this process kicks off other processes as well, it is not recommended to go below 10 minutes for this setting.)
– Click on Submit.
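
To get a feel for what lowering the interval means, the following Python sketch estimates how many purger cycles fit into a day. It assumes, purely for illustration, that every run uses the full 15 minutes mentioned above, so the result is a lower bound; the function name is hypothetical, and treating the recommended 10 minute minimum as a hard error is a choice of this sketch.

```python
RECOMMENDED_MIN_DELAY = 10  # minutes; minimum recommended in this TID


def purger_cycles_per_day(delay_minutes: int, run_minutes: int = 15) -> float:
    """Estimate purger cycles per day, assuming each run lasts run_minutes."""
    if delay_minutes < RECOMMENDED_MIN_DELAY:
        raise ValueError("delay below the recommended 10 minute minimum")
    return (24 * 60) / (run_minutes + delay_minutes)


for delay in (30, 20, 10):
    print(f"delay {delay} min -> about {purger_cycles_per_day(delay):.1f} cycles/day")
```

Under that assumption the 30 minute default allows roughly 32 cycles a day, while a 10 minute delay allows roughly 57, which is why fewer entries accumulate between runs.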


Login Update Disable Interval

  • By default several attributes are updated on a user object each time it is used to login. Among them are:
    • Login Time
    • Last Login Time
    • Network Address
    • Revision
    • modifiersName
  • These attribute changes must be synchronized to all other servers holding a replica for that user object. In a busy environment this can add dramatically to the amount of data required to be synchronized between servers. Previously these changes could only be disabled or enabled. Some customers use these values to track logins so disabling them altogether was not an option for them.
  • In eDirectory 8.8 SP8 a new setting has been implemented that allows an administrator to set a time interval between logins wherein these changes are not recorded on the object. The default is 0, meaning this option has not been enabled. By entering a value other than 0 it is enabled. When enabling this setting the typical interval set is 3600 seconds (1 hour). For example, when a user logs in for the first time at 8:00 AM, eDirectory updates these login attributes and the interval starts. If the same user logs in again before 9:00 AM, eDirectory will not update these attributes’ values.
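
The 8:00 AM example above can be expressed as a small Python sketch. It is not eDirectory code: the function name is hypothetical, and the behavior at exactly the interval boundary (a login at precisely 9:00 AM) is assumed to update the attributes.

```python
from datetime import datetime, timedelta
from typing import Optional


def should_update_login_attributes(
    last_update: Optional[datetime],
    login_time: datetime,
    disable_interval_seconds: int,
) -> bool:
    """Return True if this login should rewrite the login attributes."""
    if disable_interval_seconds == 0 or last_update is None:
        # 0 means the feature is off; a first login always updates.
        return True
    elapsed = login_time - last_update
    return elapsed >= timedelta(seconds=disable_interval_seconds)


first = datetime(2024, 1, 1, 8, 0)
print(should_update_login_attributes(None, first, 3600))                              # first login
print(should_update_login_attributes(first, datetime(2024, 1, 1, 8, 30), 3600))       # inside the hour
print(should_update_login_attributes(first, datetime(2024, 1, 1, 9, 5), 3600))        # after the hour
```

Only the first and third logins would generate attribute changes that need to be synchronized; the 8:30 login is skipped.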
Environment: This environment is a heavy user of LDAP based applications. The LDAP connections made by these applications are numerous but short-lived. Login Update attributes are enabled both for tracking IP addresses from which users logged in as well as their last login time. As these users are getting updated multiple times per minute or hour, change cache is growing exponentially and synchronization is near non-stop.
Goal: Reduce the number of entries in change cache and minimize the number of changes that need to be skulked out to other servers. Since the workstation IP addresses and last login times are tracked it has been decided to enable the Login Update Disable Interval functionality. As users are not logging in multiple times per hour from different workstations it has been decided to only update the login attributes once per hour. This will still allow the relevant information to be tracked but also reduce unwarranted synchronization traffic and change cache bloat.
Steps:
– Open a browser and connect to the server’s iMonitor interface: https://x.x.x.x:8030/nds
– Navigate to the Agent Configuration – Login Settings page – Login Update Disable Interval field
– Enter the number of seconds between logins during which additional logins will not update a user’s login attributes.
– Click on Submit.

How to Configure Subscription Synchronization on StoreFront

Note: Before configuring subscription synchronization in StoreFront, verify the following:

  • Server’s host name is resolvable

  • A Telnet connection to port 808 on each server succeeds
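
Where Telnet is not installed, the two checks above can be approximated with a short Python sketch. The function name and the 3 second timeout are assumptions for illustration; the sketch only confirms name resolution and that a TCP connection to port 808 can be opened, nothing StoreFront-specific.

```python
import socket


def check_sync_prerequisites(host: str, port: int = 808, timeout: float = 3.0) -> dict:
    """Check that host resolves and that a TCP connection to port succeeds."""
    result = {"resolves": False, "port_open": False}
    try:
        socket.getaddrinfo(host, port)
        result["resolves"] = True
    except socket.gaierror:
        return result  # no point trying to connect to an unresolvable name
    try:
        with socket.create_connection((host, port), timeout=timeout):
            result["port_open"] = True
    except OSError:
        pass  # refused, timed out, or unreachable
    return result
```

For the servers in this example, calling `check_sync_prerequisites("hostname1.citrix.com")` and `check_sync_prerequisites("hostname2.citrix.com")` from each side should return both values as True before continuing.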

In this example, the server names are hostname1.citrix.com and hostname2.citrix.com. The servers are not in the same group.

Note: When we started looking at internal mechanisms like the Credential Wallet and testing complex replication schemes with dozens of StoreFront nodes in the same group, we also made some interesting discoveries. We came to the conclusion that it is best to limit the number of StoreFront nodes in a group to 5 servers. Now, keep in mind there is no technical limit or cap that we enforce – it’s just similar to the guidance we’ve provided regarding zones and IMA in the past (fewer is better and we really don’t recommend more than 5 zones in a farm, but there is actually no upper limit).

For each server, add the servers from the other cluster to the CitrixSubscriptionsSyncUsers group within Computer Management.


Important: In multiple server deployments, use only one server at a time to make changes to the configuration of the server group. Ensure that the Citrix StoreFront management console is not running on any of the other servers in the deployment. When complete, propagate your configuration changes to the server group so that other servers in the deployment are updated.

Note: When establishing your subscription synchronization, the configured Delivery Controllers must be named identically between the synchronized Stores. The Delivery Controller names are case sensitive. Failing to duplicate the Delivery Controller names exactly may lead to users having different subscriptions across the synchronized stores.

To configure periodic synchronization of users’ application subscriptions between stores in different StoreFront deployments, execute the following Windows PowerShell commands:

  1. Run PowerShell as an Administrator on Windows 2012 Server. Right-click on the tile and select Run as administrator.


  2. Type cd \ to go to the root.


  3. Type Import-Module "installationlocation\Management\Cmdlets\UtilsModule.psm1", which on the server is:

    Import-Module 'C:\Program Files\Citrix\Receiver StoreFront\Management\Cmdlets\UtilsModule.psm1'

    Note: When typing in Windows PowerShell, press the Tab key after a few characters and it will auto populate the directory. For example, typing “Pro” and pressing tab key causes the “Program Files” to appear.

  4. Type Import-Module 'C:\Program Files\Citrix\Receiver StoreFront\Management\Cmdlets\SubscriptionSyncModule.psm1'


    To specify the remote StoreFront deployment containing the store to be synchronized, type the following command:

    Add-DSSubscriptionsRemoteSyncCluster -clusterName deploymentname -clusterAddress deploymentaddress

    For example, Add-DSSubscriptionsRemoteSyncCluster -clusterName citrixsync -clusterAddress hostname1.citrix.com

    The cluster name is citrixsync and hostname1.citrix.com is the server on the secondary datacenter.

  5. Sync the store.

    Validate that the Storefront servers have the same name.

    If you have more than one store, execute the following command for each store:

    Add-DSSubscriptionsRemoteSyncStore -clusterName deploymentname -storeName storename

    For example, Add-DSSubscriptionsRemoteSyncStore -clusterName citrixsync -storeName citrixstore

  6. To configure synchronization to occur at a particular time every day, execute the following command:

    Add-DSSubscriptionsSyncSchedule -scheduleName synchronizationname -startTime hh:mm

    For example, Add-DSSubscriptionsSyncReoccuringSchedule -scheduleName timesync -startTime 10:20:00 -repeatMinutes 5

    The schedule timesync starts at 10:20 AM and repeats every 5 minutes.

    Note: The synchronizationname helps you to identify the schedule you are creating. Use the -startTime setting to specify the time of day at which you want to synchronize subscriptions between the stores.

    Configure additional schedules to run synchronization at other times throughout the day.

  7. To configure regular synchronization at a specific interval, execute the following command:

    Add-DSSubscriptionsSyncReoccuringSchedule -scheduleName synchronizationname -startTime hh:mm:ss -repeatMinutes interval


    The re-occurring schedule “periodicsync” is different from the sync schedule. In this example, it repeats every 60 minutes and runs 2 or 3 times a day instead of every hour.

    Note: When configuring the synchronization schedules for your StoreFront deployments, ensure the schedules do not lead to a situation where the deployments are attempting to synchronize simultaneously. In the example, the re-occurring schedule “periodicsync” start time is at 10:45 on the server and 10:50 on the other cluster.

  8. Follow Steps 1 to 7 on the second server that you want to synchronize. Remember to change the server Fully Qualified Name when running the “Add-DSSubscriptionsRemoteSyncCluster -clusterName deploymentname -clusterAddress deploymentaddress” command (deploymentaddress will be the remote server). Set the synchronization to start at a different time in the “-startTime” portion of the command.

  9. To start synchronizing users’ application subscriptions between the stores, restart the subscription store service on both the local and remote deployments. At the Windows PowerShell command prompt on a server in each deployment, execute the following command:

    Restart-DSSubscriptionsStoreSubscriptionService

    The synchronization activity is then recorded in the Event Viewer logs.
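
The note in step 7 about keeping the two deployments from synchronizing at the same moment can be checked before the schedules are created. The Python sketch below is an illustration only: the function names are hypothetical, it assumes a recurring schedule fires at its start time and then every repeatMinutes until the end of the same day, and it compares start minutes only, not how long a synchronization run takes.

```python
def run_minutes(start: str, repeat_minutes: int) -> set:
    """Minutes past midnight at which a recurring schedule fires (same day only)."""
    hours, minutes = (int(part) for part in start.split(":")[:2])
    first = hours * 60 + minutes
    return set(range(first, 24 * 60, repeat_minutes))


def clashing_times(local: tuple, remote: tuple) -> list:
    """Return HH:MM times at which both schedules would start together."""
    shared = run_minutes(*local) & run_minutes(*remote)
    return [f"{m // 60:02d}:{m % 60:02d}" for m in sorted(shared)]


# Start times 10:45 and 10:50 with a 60 minute repeat, as in the note in step 7.
print(clashing_times(("10:45", 60), ("10:50", 60)))
# A 5 minute schedule starting at 10:20 does collide with the 10:45 hourly one.
print(clashing_times(("10:20:00", 5), ("10:45", 60))[:3])
```

An empty list for the first pair indicates the staggered start times never coincide; the second pair shows how a short repeat interval on one side can still land on the other side's start times.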

Related:

Clean up free buffer space on Guardium Collector

Hi,

Does anyone know how to clean up the free buffer space on a Guardium Collector?

Recently one of our Collectors filled up and we cleaned it up completely. Everything is now set, but the “Free Buffer Space” parameter shows a value of 0 in the Buffer Usage Monitor report, which makes the overall utilization high.

Related:

NetScaler High Availability File Sync Fails When “internaluserlogin” is Disabled

When internaluserlogin is disabled on a NetScaler high availability configuration and SSH keys are not configured, file synchronization will not work. This is because NetScaler high availability sync authentication is performed using SSH keys. If SSH keys are configured and synchronization still does not work, refer to CTX214822, High Availability Synchronization Fails with “-internaluserlogin disabled” Command.

If synchronization already worked

High availability was correctly configured and, after some successful synchronization, the user disabled internaluserlogin without configuring an SSH key. The following error message appears in /var/log/nsfsyncd.log:

protocol version mismatch -- is your shell clean?

(see the rsync man page for an explanation)

rsync error: protocol incompatibility (code 2) at

If synchronization never worked

The following is an excerpt from the /var/log/nsfsyncd.log:

rsync: connection unexpectedly closed (0 bytes received so far) [receiver]

rsync error: unexplained error (code 255) at io.c(600) [receiver=3.0.6]

Files in the /nsconfig/ssl path might fail to synchronize. This could impact the configuration. However, issuing the command force ha sync might report success.

Related: