3 – Report sync status needs to be error-free
Below is an example of a Report Synchronization status (on Linux, use ndsrepair -E).
Collecting replica synchronization status
Start: Wednesday, October 15, 2008 13:35:11 Local Time
Retrieve replica status
Partition: .[Root].
Replica on server: .doublevision.servers.novell
Replica: .doublevision.servers.novell 10-15-2008 13:34:21
Replica on server: .sled-vh1.servers.novell
Replica: .sled-vh1.servers.novell 10-15-2008 13:34:22
Replica on server: .linx-vh1.servers.novell
Replica: .linx-vh1.servers.novell 10-15-2008 13:34:21
All servers synchronized up to time: 10-15-2008 13:34:21
Finish: Wednesday, October 15, 2008 13:35:11 Local Time
Total errors: 0
For the partition that is being checked, the total number of errors must be 0.
If errors are listed, the synchronization process cannot finish, which means that no obituary processing can take place.
Obituary processing can only start once the synchronization process has finished without errors.
(In a dstrace, “all processed = Yes” is visible for the partition if synchronization for that partition is successful.)
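If the report is saved to a text file, the check above can be automated. The following is a minimal sketch, assuming the report looks like the example in this step; the function name total_errors is only an illustration and is not part of any Novell tool.

```python
import re

def total_errors(report_text):
    """Return the number after 'Total errors:', or None if the line is missing."""
    match = re.search(r"^\s*Total errors:\s*(\d+)", report_text, re.MULTILINE)
    return int(match.group(1)) if match else None

# Shortened copy of the report shown in this step.
sample = """\
Partition: .[Root].
All servers synchronized up to time: 10-15-2008 13:34:21
Total errors: 0
"""

errors = total_errors(sample)
if errors == 0:
    print("Sync report is error-free")
else:
    print("Sync report needs attention:", errors)
```

A result of None is treated as a problem here, because a missing summary line usually means the report is incomplete.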
4 – All servers in the replica ring must show as synchronized within one hour of the current time
The start time can be seen at the beginning of the log. Compare the start time to the time indicated for each replica.
If a server is not listed with errors but its replica time is well over one hour old, or even days old, eDirectory on that server may need a restart. On Linux, as root, type rcndsd restart. On NetWare, unload ds and then load ds. On Windows, stop ds.dlm and start it again. On Solaris, type /etc/init.d/ndsd stop and then /etc/init.d/ndsd start.
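The comparison between the start time and each replica time can be scripted. This is a sketch only, assuming the line layout of the sample report in step 3 and English month names; stale_replicas is a hypothetical helper, and the sample below was altered so that one replica is two days behind.

```python
import re
from datetime import datetime, timedelta

START_LINE = re.compile(
    r"^\s*Start:\s+\w+, (\w+ \d{1,2}, \d{4} \d{2}:\d{2}:\d{2})", re.MULTILINE)
REPLICA_LINE = re.compile(
    r"^\s*Replica:\s+(\S+)\s+(\d{2}-\d{2}-\d{4} \d{2}:\d{2}:\d{2})\s*$",
    re.MULTILINE)

def stale_replicas(report_text, max_lag=timedelta(hours=1)):
    """Return (server, lag) pairs for replicas older than max_lag at report start."""
    start_match = START_LINE.search(report_text)
    if start_match is None:
        raise ValueError("no 'Start:' line found in report")
    start = datetime.strptime(start_match.group(1), "%B %d, %Y %H:%M:%S")
    stale = []
    for server, stamp in REPLICA_LINE.findall(report_text):
        synced = datetime.strptime(stamp, "%m-%d-%Y %H:%M:%S")
        lag = start - synced
        if lag > max_lag:
            stale.append((server, lag))
    return stale

# Sample based on step 3, with the sled-vh1 time changed for illustration.
sample_report = """\
Start: Wednesday, October 15, 2008 13:35:11 Local Time
Replica on server: .doublevision.servers.novell
Replica: .doublevision.servers.novell 10-15-2008 13:34:21
Replica on server: .sled-vh1.servers.novell
Replica: .sled-vh1.servers.novell 10-13-2008 13:34:22
"""

for server, lag in stale_replicas(sample_report):
    print(server, "is behind by", lag)
```

Any server the sketch reports is a candidate for the eDirectory restart described above.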
5 – All servers in the Tree must be reachable, up and running
Any server in the tree could potentially need to be contacted for the obituary process.
The reason is that when a client logs in to a server and requests information about an object for which that server holds no replica, the server looks up (treewalks) the information on a server that does have the replica and creates an external reference object in its own database.
The external reference is basically an empty object that points to the server that has the real object. The next time the information is requested, the external reference object already holds a pointer to the server that needs to be contacted, so no treewalking is needed.
The external reference object also causes a backlink attribute to be created on the real object in the replica, to keep track of the servers that know about the object.
When the object is moved or deleted, the backlink attribute is used to make sure that servers without a replica also know what to do with their external reference object. This is done by the obituary process.
6 – Gather the external reference log using dsrepair/ndsrepair
NetWare: Load dsrepair -a -> Advanced options menu -> Check external references
The dsrepair.log will be located in sys:system
Windows: From ndscons, load dsrepair.dlm with “-a” in the startup parameter line -> Repair -> Check External References
The dsrepair.log will be located in c:\Novell\NDS\DIBFiles
Linux: as root type: ndsrepair -C -Ad -A
The default location for ndsrepair.log is /var/opt/novell/eDirectory/log/
(If the default is not used, type ndsconfig get; the n4u.server.vardir variable shows the location.)
7 – Checking what partition(s) should be looked at
Below is a piece of an external reference log.
The line that starts with “Found obituary” indicates which object has the obituary; in this case it is CN=upuser.OU=test.O=novell.T=NOVELLWS
The path reveals which partition the object belongs to.
For example, if ou=test is a partition, CN=upuser belongs to that partition.
(1) Found obituary for: EID: 0000a798, DN: CN=upuser.OU=test.O=novell.T=NOVELLWS
Value CTS : 10-21-2008 11:56:38 R = 0001 E = 0001
Value MTS = 10-21-2008 11:56:38 R = 0001 E = 0001, Type = 0001 DEAD,
Flags = 0000
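The partition lookup described in this step can be sketched as a "longest matching suffix" search. This is an illustration under assumptions: the list of partition roots must be supplied by you (for example from your own partition overview), owning_partition is a hypothetical helper, and escaped dots inside object names are not handled.

```python
import re

def dn_from_obituary_line(line):
    """Extract the DN from a 'Found obituary for:' line, or None if absent."""
    match = re.search(r"Found obituary for: EID: [0-9a-fA-F]+, DN: (\S+)", line)
    return match.group(1) if match else None

def owning_partition(dn, partition_roots):
    """Return the deepest partition root whose DN is a suffix of dn."""
    parts = [p.lower() for p in dn.split(".")]
    best, best_depth = None, 0
    for root in partition_roots:
        root_parts = [p.lower() for p in root.split(".")]
        depth = len(root_parts)
        # The partition root must match the tail of the object's path.
        if depth <= len(parts) and parts[-depth:] == root_parts:
            if depth > best_depth:
                best, best_depth = root, depth
    return best

# Hypothetical partition roots for the NOVELLWS example tree.
roots = ["T=NOVELLWS", "OU=test.O=novell.T=NOVELLWS"]

line = "(1) Found obituary for: EID: 0000a798, DN: CN=upuser.OU=test.O=novell.T=NOVELLWS"
dn = dn_from_obituary_line(line)
print(dn, "belongs to partition", owning_partition(dn, roots))
```

The deepest match wins because an object below ou=test also sits below the tree root, but it is stored in the ou=test partition.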
8 – Checking backlink obituaries for problems
Below is an example of an external reference log.
A backlink obituary can be identified by the following: “Type = 0006 BACKLINK”
If backlink obituaries are the reason the obituaries are not progressing, the log will likely look similar to our example below.
The Flags are the steps through which the process needs to go (0000, 0001, 0002 and 0004).
Check the backlink obituaries that belong to the same object (tip: look for the same EID number) and find one whose Flags value is a step behind the other backlink obituaries. The server that obituary points to may be unreachable or may not be correctly backlinked.
Find the server that belongs to that backlink obituary in the log. It is listed just below the obituary.
In this example we see one backlink obituary that is still at flags = 0000 while the other backlink obituary is already at flags = 0001.
We can see in the example that the backlink obituary that is not going forward points to server CN=doublevision.OU=servers.O=novell.T=NOVELLWS
Possible causes are:
1) The server is physically no longer in use (fix: remove its NCP Server object; this will clean up the backlinks that point to that server)
2) The server is experiencing a problem and may need a restart of eDirectory or even of the server itself
3) The backlink is no longer valid on the server that has the external reference object (e.g. it may be pointing to the wrong server)
In this case a “-xk3” repair is required on the server that holds the external reference object, so that it verifies and corrects any wrong backlinks it may have.
NetWare: Load dsrepair -XK3 -> Advanced options menu -> Repair Local DS database -> F10 to start the repair
When done, type on the console: set dstrace=*b to start the backlink process (give it some time to finish)
Linux: as root type: ndsrepair -R -Ad -XK3
When done, type: ndstrace
A screen appears in which you can type “set dstrace=*b”
To exit, type “exit” (give it some time to finish)
Windows: From ndscons, load dsrepair.dlm with “-xk3” in the startup parameter line -> Repair -> Local Database Repair… click Repair.
When done, from ndscons highlight ds.dlm and click Configure -> Triggers -> Backlinker
(give it some time to finish)
Example:
Repair utility for Novell eDirectory 8.8 – 8.8 SP2 v20213.08
DS Version 20216.62 Tree name: NOVELLWS
Server name: .linx-vh1.servers.novell
Size of /var/opt/novell/eDirectory/log/ndsrepair.log = 34420 bytes.
Preparing Log File “/var/opt/novell/eDirectory/log/ndsrepair.log”
Please Wait…
External Reference Check
External Reference Check
Start: Tuesday, October 21, 2008 11:58:38 Local Time
(1) Found obituary for: EID: 0000a798, DN: CN=upuser.OU=test.O=novell.T=NOVELLWS
Value CTS : 10-21-2008 11:56:38 R = 0001 E = 0001
Value MTS = 10-21-2008 11:56:38 R = 0001 E = 0001, Type = 0001 DEAD,
Flags = 0000
(2) Found obituary for: EID: 0000a798, DN: CN=upuser.OU=test.O=novell.T=NOVELLWS
Value CTS : 10-21-2008 11:56:38 R = 0001 E = 0002
Value MTS = 10-21-2008 11:57:57 R = 0001 E = 0003, Type = 0006 BACKLINK,
Flags = 0001
NOTIFIED
Backlink: Type = 00000001 DEAD, RemoteID = ffffffff,
ServerID = 00008043, CN=sled-vh1.OU=servers.O=novell.T=NOVELLWS
(3) Found obituary for: EID: 0000a798, DN: CN=upuser.OU=test.O=novell.T=NOVELLWS
Value CTS : 10-21-2008 11:56:38 R = 0001 E = 0003
Value MTS = 10-21-2008 11:57:57 R = 0001 E = 0004, Type = 0006 BACKLINK,
Flags = 0000
Backlink: Type = 00000001 DEAD, RemoteID = ffffffff,
ServerID = 0000807d, CN=doublevision.OU=servers.O=novell.T=NOVELLWS
(4) Found obituary for: EID: 0000a798, DN: CN=upuser.OU=test.O=novell.T=NOVELLWS
Value CTS : 10-21-2008 11:56:38 R = 0001 E = 0004
Value MTS = 10-21-2008 11:57:57 R = 0002 E = 0001, Type = 000c USED_BY,
Flags = 0002
OK_TO_PURGE
Used by: Resource type = 00000000, Event type = 00000003, Resource ID = 00008026, T=NOVELLWS
Checked 0 external references
Found: 4 total obituaries in this DIB,
2 Unprocessed obits, 0 Purgeable obits,
1 OK_To_Purge obits, 1 Notified obits
Total errors: 0
NDSRepair process completed.
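On a large log it helps to let a script find the backlink obituary that lags behind. The following sketch assumes the log layout of the example above; parse_obituaries and lagging_backlinks are hypothetical helper names, and the sample holds only entries (2) and (3) from the example.

```python
import re
from collections import defaultdict

HEADER = re.compile(r"Found obituary for: EID: ([0-9a-fA-F]+), DN: (\S+)")
OBIT_TYPE = re.compile(r"Type = [0-9a-fA-F]{4} (\w+)")
FLAGS = re.compile(r"Flags = ([0-9a-fA-F]{4})")
SERVER = re.compile(r"ServerID = [0-9a-fA-F]+, (\S+)")

def parse_obituaries(log_text):
    """Turn an external reference log into a list of obituary records."""
    obits, current = [], None
    for raw in log_text.splitlines():
        line = raw.strip()
        header = HEADER.search(line)
        if header:
            current = {"eid": header.group(1), "dn": header.group(2),
                       "type": None, "flags": None, "server": None}
            obits.append(current)
        elif current is not None:
            # Only the 'Value MTS' line carries the obituary type.
            if line.startswith("Value MTS"):
                found = OBIT_TYPE.search(line)
                if found:
                    current["type"] = found.group(1)
            elif line.startswith("Flags"):
                found = FLAGS.search(line)
                if found:
                    current["flags"] = found.group(1)
            elif line.startswith("ServerID"):
                found = SERVER.search(line)
                if found:
                    current["server"] = found.group(1)
    return obits

def lagging_backlinks(obits):
    """Return BACKLINK obituaries whose flags trail others with the same EID."""
    by_eid = defaultdict(list)
    for obit in obits:
        if obit["type"] == "BACKLINK" and obit["flags"] is not None:
            by_eid[obit["eid"]].append(obit)
    lagging = []
    for group in by_eid.values():
        furthest = max(int(o["flags"], 16) for o in group)
        lagging.extend(o for o in group if int(o["flags"], 16) < furthest)
    return lagging

SAMPLE_LOG = """\
(2) Found obituary for: EID: 0000a798, DN: CN=upuser.OU=test.O=novell.T=NOVELLWS
Value CTS : 10-21-2008 11:56:38 R = 0001 E = 0002
Value MTS = 10-21-2008 11:57:57 R = 0001 E = 0003, Type = 0006 BACKLINK,
Flags = 0001
NOTIFIED
Backlink: Type = 00000001 DEAD, RemoteID = ffffffff,
ServerID = 00008043, CN=sled-vh1.OU=servers.O=novell.T=NOVELLWS
(3) Found obituary for: EID: 0000a798, DN: CN=upuser.OU=test.O=novell.T=NOVELLWS
Value CTS : 10-21-2008 11:56:38 R = 0001 E = 0003
Value MTS = 10-21-2008 11:57:57 R = 0001 E = 0004, Type = 0006 BACKLINK,
Flags = 0000
Backlink: Type = 00000001 DEAD, RemoteID = ffffffff,
ServerID = 0000807d, CN=doublevision.OU=servers.O=novell.T=NOVELLWS
"""

for obit in lagging_backlinks(parse_obituaries(SAMPLE_LOG)):
    print("Lagging backlink:", obit["server"], "flags =", obit["flags"])
```

The sketch only points at the server to investigate; which of the causes listed in this step applies still has to be checked by hand.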
9 – Inhibit_move obituaries and how to get them progressed
First the explanation:
When an object is moved from one location to another in the database (for example from ou=accounting.o=novell to ou=users.o=novell), the “old” location of the object gets a MOVED obituary and the “new” location receives an INHIBIT_MOVE obituary.
The obituary process takes place just as it would when deleting an object; however, if the new location is in a different partition, another server may need to be contacted to negotiate the process.
(The server holding the master replica for a partition needs to do this, so if 2 partitions are involved, 2 different servers may be needed to progress the obits.)
In this process we sometimes see that the MOVED obituary is processed just fine along with its backlink obituaries, but the INHIBIT_MOVE obituary is not progressed and remains at flags = 0000
We call this an “orphaned INHIBIT_MOVE” obituary.
Before we think about any fix, we need to verify that this is truly the case. Check all servers that hold master replicas for any MOVED obituary, to make sure we do not break the system when we try to fix this.
Once we have verified that no MOVED obituary exists anywhere for the object that has the INHIBIT_MOVE obituary, we can proceed with the fix.
The fix is: TID 3908200
P.S. If the object holds not only an INHIBIT_MOVE but also a DEAD obituary, you will need to contact Novell Technical Support.
10 – Master server is clean but obituaries are still seen on servers that have a read/write replica
Even if this document has been followed and no more obituaries are seen on the server holding the master replica, one or more of the servers holding a read/write replica may still show obituaries for the partition being worked on.
To get these progressed, you need to timestamp the obituaries so that they are sent to the master replica of the partition for processing.
Preferably, do this on the server holding the read/write replica that shows the most obituaries.
You can do this by running a -OT repair:
NetWare: Load dsrepair -OT -> Advanced options menu -> Repair Local DS database -> F10 to start the repair
Linux: as root type: ndsrepair -R -Ad -OT
Windows: From ndscons, load dsrepair.dlm with “-OT” in the startup parameter line -> Repair -> Local Database Repair… click Repair.
Repeat step 10 until all replicas are clean and no longer show the obituaries.
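To keep track of which replica servers still need another pass, the summary line of each server's external reference log (see step 6) can be compared. This is a sketch under assumptions: it expects the “Found: N total obituaries in this DIB” line from the example in step 8, assumes a clean server reports 0 on that line, and servers_still_dirty is a hypothetical helper fed with log text you collected yourself.

```python
import re

SUMMARY = re.compile(r"Found:\s*(\d+)\s+total obituaries in this DIB")

def obituary_total(log_text):
    """Return the obituary count from the summary line, or None if absent."""
    match = SUMMARY.search(log_text)
    return int(match.group(1)) if match else None

def servers_still_dirty(logs_by_server):
    """Map each server that is not confirmed clean to its obituary count."""
    dirty = {}
    for server, text in logs_by_server.items():
        total = obituary_total(text)
        # A missing summary line is reported too, since it proves nothing.
        if total is None or total > 0:
            dirty[server] = total
    return dirty

# Hypothetical log excerpts keyed by the servers from the example tree.
logs = {
    ".linx-vh1.servers.novell": "Found: 4 total obituaries in this DIB,",
    ".sled-vh1.servers.novell": "Found: 0 total obituaries in this DIB,",
}

for server, count in servers_still_dirty(logs).items():
    print(server, "still shows", count, "obituaries")
```

When the result is empty, all checked replicas report no obituaries and the loop in this step can stop.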