Event ID 1066 — Cluster Storage Functionality

Updated: November 25, 2009

Applies To: Windows Server 2008 R2

In a failover cluster, most clustered services or applications use at least one disk, also called a disk resource, that you assign when you configure the clustered service or application. Clients can use the clustered service or application only when the disk is functioning correctly.

Event Details

Product: Windows Operating System
ID: 1066
Source: Microsoft-Windows-FailoverClustering
Version: 6.1
Symbolic Name: RES_DISK_CORRUPT_DISK
Message: Cluster disk resource ‘%1’ indicates corruption for volume ‘%2’. Chkdsk is being run to repair problems. The disk will be unavailable until Chkdsk completes. Chkdsk output will be logged to file ‘%3’. Chkdsk may also write information to the Application Event Log.
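When you collect these events for review, it can help to pull the three inserted values (disk resource, volume, and Chkdsk log file) out of the message text. The following is a minimal sketch based only on the message template above. The function name parse_event_1066 is hypothetical, and the quote characters that surround each value in a rendered event are an assumption, so the pattern accepts both straight and curly single quotes.

```python
import re

# Quote characters around each inserted value. Which ones appear in a
# rendered event is an assumption, so straight and curly are both accepted.
_Q = "['\u2018\u2019]"

# Pattern built from the Event ID 1066 message template in this topic.
EVENT_1066 = re.compile(
    rf"Cluster disk resource {_Q}(?P<resource>.+?){_Q} indicates corruption "
    rf"for volume {_Q}(?P<volume>.+?){_Q}\..*?"
    rf"logged to file {_Q}(?P<log_file>.+?){_Q}\.",
    re.DOTALL,
)


def parse_event_1066(message):
    """Return the resource, volume, and Chkdsk log path, or None if no match."""
    match = EVENT_1066.search(message)
    return match.groupdict() if match else None
```

The log_file value is the path to open after Chkdsk completes.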

Resolve
Confirm volume integrity

If Chkdsk has been started automatically, we recommend that you allow it to run so that it can correct any problems with the file system. Chkdsk output will be logged to the <systemroot>\Cluster\Reports folder. Note that file-system errors might indicate that the hardware is deteriorating. For information about other ways of gathering information about a clustered disk that appears to have file system errors, see “Using logs and the clustering validation wizard to gather information about the state of a clustered disk” and “Running a disk maintenance tool such as Chkdsk on a clustered disk.”
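To find the most recent Chkdsk output quickly, you can list the newest files in the reports folder. The following is a minimal sketch: the helper names are hypothetical, the fallback path C:\Windows is an assumption for when the SystemRoot environment variable is not set, and no particular log file name is assumed.

```python
import os
from pathlib import Path


def cluster_reports_dir(systemroot=None):
    """Build <systemroot>/Cluster/Reports; defaults to the SystemRoot variable."""
    # The C:\Windows fallback is an assumption, used only if SystemRoot is unset.
    root = systemroot or os.environ.get("SystemRoot", r"C:\Windows")
    return Path(root) / "Cluster" / "Reports"


def newest_reports(folder, limit=5):
    """Return up to `limit` files in `folder`, most recently modified first."""
    files = [p for p in Path(folder).iterdir() if p.is_file()]
    files.sort(key=lambda p: p.stat().st_mtime, reverse=True)
    return files[:limit]
```

For example, on a cluster node, newest_reports(cluster_reports_dir()) returns the five most recently written files in the folder.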

To view or change the current setting for the triggers for running Chkdsk on a clustered disk, see “Viewing or changing the setting for the triggers that cause Chkdsk to run on a clustered disk,” later in this topic.

If you do not currently have Event Viewer open, see “Opening Event Viewer and viewing events related to failover clustering.”

To perform the following procedures, you must be a member of the local Administrators group on each clustered server, and the account you use must be a domain account, or you must have been delegated the equivalent authority.

Using logs and the clustering validation wizard to gather information about the state of a clustered disk

To use logs and the clustering validation wizard to gather information about the state of a clustered disk:

  1. Scan appropriate event logs for errors that are related to the disk.
  2. Check cables and any related devices on the bus.
  3. To open the failover cluster snap-in, click Start, click Administrative Tools, and then click Failover Cluster Management. If the User Account Control dialog box appears, confirm that the action it displays is what you want, and then click Continue.
  4. In the Failover Cluster Management snap-in, in the console tree, make sure Failover Cluster Management is selected and then, under Management, click Validate a Configuration.
  5. Follow the instructions in the wizard to specify the cluster you want to test.
  6. On the Testing Options page, select Run only tests I select.
  7. On the Test Selection page, clear the check boxes for Network and System Configuration. This leaves only the tests for Storage and Inventory. You can run all these tests, or you can select only the specific tests that appear relevant to your situation.

    Important   If a clustered service or application is using a disk when you start the wizard, the wizard will prompt you about whether to take that clustered service or application offline for the purposes of testing. If you choose to take a clustered service or application offline, it will remain offline until the tests finish.

  8. Follow the instructions in the wizard to run the tests.
  9. On the Summary page, click View Report.

Running a disk maintenance tool such as Chkdsk on a clustered disk

If you need to run a disk maintenance tool such as Chkdsk on a clustered disk, use maintenance mode to prevent the disk maintenance tool from triggering failover.

To run a disk maintenance tool such as Chkdsk on a clustered disk:

  1. To open the failover cluster snap-in, click Start, click Administrative Tools, and then click Failover Cluster Management. If the User Account Control dialog box appears, confirm that the action it displays is what you want, and then click Continue.
  2. In the Failover Cluster Management snap-in, if the cluster is not displayed, in the console tree, right-click Failover Cluster Management, click Manage a Cluster, and select or specify the cluster you want.
  3. If the console tree is collapsed, expand the tree under the cluster that uses the disk on which you want to run a disk maintenance tool.
  4. In the console tree, click Storage.
  5. In the center pane, click the disk on which you want to run the disk maintenance tool.
  6. Under Actions, click More Actions, and then click Turn On Maintenance Mode for this disk.
  7. Run the disk maintenance tool on the disk.

    When maintenance mode is on, the disk remains online in the cluster, but the disk maintenance tool can finish running without triggering a failover.

  8. When the disk maintenance tool finishes running, with the disk still selected, under Actions, click More Actions, and then click Turn Off Maintenance Mode for this disk.

Viewing or changing the setting for the triggers that cause Chkdsk to run on a clustered disk

Disk resources in a cluster have a private property setting called DiskRunChkDsk that specifies the triggers that will cause Chkdsk to run on the disk.

To view or change the setting for the triggers that cause Chkdsk to run on a clustered disk:

  1. To open an elevated Command Prompt window, on a node in the cluster, click Start, point to All Programs, click Accessories, right-click Command Prompt, and then click Run as administrator.
  2. Type:

    CLUSTER RESOURCE /STATUS

  3. In the list of resources, find the name of the disk for which you want to view the setting.
  4. Type the following command, which requests a list of the private properties of the resource:

    CLUSTER RESOURCE "DiskResourceName" /PRIV

  5. In the resulting display, find the value (from 0 through 5) for DiskRunChkDsk.
  6. Compare the value to the following list:
    • 0   The default. If a bit has been set on the disk indicating possible file system inconsistency, or a quick check equivalent to a dir command at the root returns a file system error, runs Chkdsk /f /x.
    • 1   If a bit has been set on the disk indicating possible file system inconsistency, or a check equivalent to a dir /s command at the root returns a file system error, runs Chkdsk /f /x.
    • 2   Whenever the disk is mounted, runs Chkdsk /f /x.
    • 3   When the disk is mounted, runs Chkdsk, unless a bit has been set on the disk indicating possible file system inconsistency, or a quick check equivalent to a dir command at the root returns a file system error. In the latter cases, runs Chkdsk /f /x.
    • 4   No action; Chkdsk is never triggered by the cluster software.
    • 5   Runs a check equivalent to a dir /s command at the root and if a file system error is returned, does not mount the disk. Otherwise, no action is taken.
  7. To change the setting for the disk, run a command of the following form, replacing n with the value you choose from the preceding list:

    CLUSTER RESOURCE "DiskResourceName" /PRIV DISKRUNCHKDSK=n

  8. To confirm the setting, type the following command again:

    CLUSTER RESOURCE "DiskResourceName" /PRIV
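If you script this procedure, a small lookup table can guard against setting a value outside the documented range. The following is a minimal sketch that summarizes the list in step 6 and builds the command from step 7. The names DISK_RUN_CHKDSK, describe_disk_run_chkdsk, and set_command are hypothetical, and the descriptions are shortened paraphrases of the list above.

```python
# Shortened summaries of the DiskRunChkDsk values listed in this topic.
DISK_RUN_CHKDSK = {
    0: "Default: run Chkdsk /f /x if the inconsistency bit is set "
       "or a quick root check (dir) returns an error",
    1: "Run Chkdsk /f /x if the inconsistency bit is set "
       "or a recursive root check (dir /s) returns an error",
    2: "Run Chkdsk /f /x whenever the disk is mounted",
    3: "Run Chkdsk on mount; run Chkdsk /f /x if the inconsistency bit is set "
       "or a quick root check (dir) returns an error",
    4: "No action; Chkdsk is never triggered by the cluster software",
    5: "Run a dir /s check; do not mount the disk if it returns an error",
}


def describe_disk_run_chkdsk(value):
    """Return the summary for a DiskRunChkDsk value, rejecting unknown values."""
    if value not in DISK_RUN_CHKDSK:
        raise ValueError(f"DiskRunChkDsk must be 0 through 5, got {value!r}")
    return DISK_RUN_CHKDSK[value]


def set_command(resource_name, value):
    """Build the command line from step 7 for the given disk resource."""
    describe_disk_run_chkdsk(value)  # validates the value before building
    return f'CLUSTER RESOURCE "{resource_name}" /PRIV DISKRUNCHKDSK={value}'
```

The sketch only builds the command text. Run the resulting command from an elevated Command Prompt window as described in step 1.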

Opening Event Viewer and viewing events related to failover clustering

To open Event Viewer and view events related to failover clustering:

  1. If Server Manager is not already open, click Start, click Administrative Tools, and then click Server Manager. If the User Account Control dialog box appears, confirm that the action it displays is what you want, and then click Continue.
  2. In the console tree, expand Diagnostics, expand Event Viewer, expand Windows Logs, and then click System.
  3. To filter the events so that only events with a Source of FailoverClustering are shown, in the Actions pane, click Filter Current Log. On the Filter tab, in the Event sources box, select FailoverClustering. Select other options as appropriate, and then click OK.
  4. To sort the displayed events by date and time, in the center pane, click the Date and Time column heading.
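The same filter can be applied from a command line with the wevtutil tool. The following is a minimal sketch that builds the query arguments, assuming the provider name shown under Event Details (Microsoft-Windows-FailoverClustering). The helper names event_xpath and wevtutil_query are hypothetical, and the default of 20 events is an arbitrary choice.

```python
def event_xpath(provider, event_id=None):
    """Build an XPath filter for events from `provider`, optionally one event ID."""
    conditions = [f"Provider[@Name='{provider}']"]
    if event_id is not None:
        conditions.append(f"(EventID={int(event_id)})")
    return f"*[System[{' and '.join(conditions)}]]"


def wevtutil_query(log, provider, event_id=None, count=20):
    """Return a wevtutil argument list: newest `count` matching events as text."""
    return [
        "wevtutil", "qe", log,
        f"/q:{event_xpath(provider, event_id)}",
        "/f:text",   # plain-text output
        "/rd:true",  # newest events first
        f"/c:{count}",
    ]
```

For example, wevtutil_query("System", "Microsoft-Windows-FailoverClustering", 1066) returns arguments that can be passed to subprocess.run on a cluster node.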

Verify

Confirm that the disk resource can come online. If there have been recent problems with writing to the disk, it can be appropriate to monitor event logs and monitor the function of the corresponding clustered service or application, to confirm that the problems have been resolved.

To perform the following procedures, you must be a member of the local Administrators group on each clustered server, and the account you use must be a domain account, or you must have been delegated the equivalent authority.

Confirming that a disk resource can come online

To confirm that a disk resource can come online:

  1. To open the failover cluster snap-in, click Start, click Administrative Tools, and then click Failover Cluster Management. If the User Account Control dialog box appears, confirm that the action it displays is what you want, and then click Continue.
  2. In the Failover Cluster Management snap-in, if the cluster you want to manage is not displayed, in the console tree, right-click Failover Cluster Management, click Manage a Cluster, and then select or specify the cluster that you want.
  3. If the console tree is collapsed, expand the tree under the cluster you want to manage, and then expand Services and Applications.
  4. In the console tree, click a clustered service or application.
  5. In the center pane, expand the listing for the disk resource. View the status of the resource.
  6. If a disk resource is offline, to bring it online, right-click the resource and then click Bring this resource online.

To perform a quick check on the status of a resource, you can run the following command.

Using a command to check the status of a resource in a failover cluster

To use a command to check the status of a resource in a failover cluster:

  1. On a node in the cluster, click Start, point to All Programs, click Accessories, right-click Command Prompt, and then click Run as administrator.
  2. Type:

    CLUSTER RESOURCE ResourceName /STATUS

    If you run the preceding command without specifying a resource name, status is displayed for all resources in the cluster.

Related Management Information

Cluster Storage Functionality

Failover Clustering

Related:

Event ID 525 — Backup Set Integrity

Updated: January 27, 2011

Applies To: Windows Server 2008

When you run a backup operation, Windows Server Backup runs checks for consistency and hardware and software corruption to determine the integrity of the backup set.

Event Details

Product: Windows Operating System
ID: 525
Source: Microsoft-Windows-Backup
Version: 6.0
Symbolic Name: ADMIN_BACKED_UP_BAD_CLUSTERS_EVENT
Message: Backup completed with warning(s) – Volume ‘%2’ has developed new bad clusters. This may be an indication of problems with your hardware. %3 bytes have not been backed up as they could not be read. Please run chkdsk /R on ‘%2’ and rerun the backup.

Resolve
Find bad clusters on source volumes and create a new backup

When you run a backup operation, Windows Server Backup checks the source volumes (volumes being backed up) for bad clusters. If bad clusters are found, the operation will complete, but with warnings. To resolve this issue, follow these general steps:

  1. Run chkdsk /r on the volume of concern. Depending on the number of bad clusters found, you may want to replace the disk containing the volume.
  2. Re-run the backup.
  3. Check for Event ID 4, which indicates that the backup completed with no errors.

To create a one-time backup or work with events, you must have membership in Backup Operators or Administrators, or you must have been delegated the appropriate authority. To run chkdsk, you must be a member of the Administrators group, or you must have been delegated the appropriate authority.

Find bad clusters using chkdsk

To look for bad clusters on the volume:

  1. On the computer that contains the volume that may have bad clusters, open a command prompt. Click Start, and then click Command Prompt.
  2. At the prompt, type: chkdsk /r. This command will look for bad sectors on the disk and recover any readable information. 

Create a one-time backup using the command line

Make sure that the backup storage location specified by -backupTarget is online.

To perform a one-time backup:

  1. Open an elevated Command Prompt window. Click Start, point to All Programs, click Accessories, right-click Command Prompt, and then click Run as administrator.
  2. At the prompt, type: wbadmin start backup. Use parameters, as needed. (To view the parameters and help for this command, at a command line, type: wbadmin start backup /?).

    For example, to create a backup that will be stored on drive f, of volumes e:, d:\mountpoint, and \\?\Volume{cc566d14-44a0-11d9-9d93-806e6f6e6963}, type: wbadmin start backup -backupTarget:f: -include:e:,d:\mountpoint,\\?\Volume{cc566d14-44a0-11d9-9d93-806e6f6e6963}\.
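If you build the backup command in a script, the main detail to get right is that -include takes a single comma-delimited list. The following is a minimal sketch that assembles the arguments used in the preceding example. The function name wbadmin_backup_args is hypothetical, and only the two parameters shown in this topic are handled.

```python
def wbadmin_backup_args(backup_target, include):
    """Build the argument list for a one-time backup of the `include` volumes."""
    if not include:
        raise ValueError("specify at least one volume or mount point to include")
    return [
        "wbadmin", "start", "backup",
        f"-backupTarget:{backup_target}",
        # -include takes one comma-delimited list with no spaces
        f"-include:{','.join(include)}",
    ]
```

Passing the result to subprocess.run as a list avoids quoting problems with paths such as \\?\Volume{...}\.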

Confirm that a backup completed with no errors

To confirm that a backup operation completed with no errors:

  1. Open Event Viewer. Click Start, click Administrative Tools, and then click Event Viewer.
  2. In the left pane, double-click Applications and Service Logs, double-click Microsoft, double-click Windows, double-click Backup, and then click Operational.
  3. In the Event ID column, look for event 4.
  4. For this event, confirm that the value in the Source column is Backup.

Verify

To verify that a backup set is complete and will be able to be used for recovery, you should do the following:

  • Verify that the backup operation to create the backup set completed with no errors.
  • Verify that the global catalog has information about the backup set.
  • Verify that the local catalog has information about the backup set.
  • Verify that the backup set itself is not corrupted by performing a recovery with the backup set.

To perform these procedures, you must have membership in Backup Operators or Administrators, or you must have been delegated the appropriate authority.

Verify that a backup completed with no errors

To verify that a backup operation completed with no errors:

  1. Open Event Viewer. Click Start, click Administrative Tools, and then click Event Viewer.
  2. In the left pane, double-click Applications and Service Logs, double-click Microsoft, double-click Windows, double-click Backup, and then click Operational.
  3. In the Event ID column, look for event 4.
  4. For this event, confirm that the value in the Source column is Backup.

Verify the global catalog

To verify that the global catalog has information about the backup set:

  1. Open an elevated Command Prompt window. Click Start, point to All Programs, click Accessories, right-click Command Prompt, and then click Run as administrator.
  2. At the prompt, type: wbadmin get versions.
  3. If the command output shows information about backups, then the global catalog is intact.

Verify the local catalog

To verify that the local catalog has information about the backup set:

  1. Open an elevated Command Prompt window. Click Start, point to All Programs, click Accessories, right-click Command Prompt, and then click Run as administrator.
  2. At the prompt, type: wbadmin get versions -backuptarget:<VolumeName>.
  3. If backup versions are listed, then the local backup catalog is intact.

Verify that a backup works for recovery

To verify that a backup will work for recovery, you should try recovering something from the backup.

Note: Make sure that you do not mistakenly overwrite newer data. To avoid this, you can perform a recovery to a different volume than was backed up as part of the backup set. You will receive a message that any data on the destination volume will be lost when you perform the recovery. Make sure that the destination volume is empty or does not contain information that you will need later.

To perform a recovery:

  1. Open an elevated Command Prompt window. Click Start, point to All Programs, click Accessories, right-click Command Prompt, and then click Run as administrator.
  2. At the prompt, type: wbadmin start recovery. Use parameters, as needed. (To view the parameters and help for this command, at a command line, type: wbadmin start recovery /?).

    For example, to run a recovery of the backup from March 1, 2005, taken at 9:00 A.M. of the d:\folder and its sub-folders, type: wbadmin start recovery -version:03/01/2005-09:00 -itemType:File -items:d:\folder -recursive.

  3. Review the items that you recovered to make sure that they were recovered as you expected.
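The -version parameter expects a version identifier in MM/DD/YYYY-HH:MM form, with zero-padded month, day, and hour. The following is a minimal sketch that formats a backup time that way and assembles the file recovery arguments from the preceding example. The function names are hypothetical, and only the parameters shown in this topic are handled.

```python
from datetime import datetime


def version_identifier(when):
    """Format a backup time as a wbadmin version identifier (MM/DD/YYYY-HH:MM)."""
    return when.strftime("%m/%d/%Y-%H:%M")


def wbadmin_recovery_args(when, items, recursive=True):
    """Build the argument list for a file recovery of `items`."""
    args = [
        "wbadmin", "start", "recovery",
        f"-version:{version_identifier(when)}",
        "-itemType:File",
        f"-items:{','.join(items)}",
    ]
    if recursive:
        args.append("-recursive")  # include sub-folders
    return args
```

For the backup taken on March 1, 2005, at 9:00 A.M., version_identifier(datetime(2005, 3, 1, 9, 0)) gives 03/01/2005-09:00. To confirm the exact identifier of a backup, use the values listed by wbadmin get versions.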

Related Management Information

Backup Set Integrity

File Services

Related:

Insufficient memory resources to complete the operation. Try the operation again.

Details
Product: Windows Operating System
Event ID: 6011
Source: MacFile
Version: 5.0
Component: Application Event Log
Symbolic Name: AFPMACFILEMSG_BufferSize
Message: Insufficient memory resources to complete the operation. Try the operation again.
   
Explanation

The server might be running out of memory.

   
User Action

Check the server’s available memory. If you find that the server has sufficient memory, run Chkdsk to determine whether there is a problem with the server’s disk.

Related:

A defective sector on drive %1 has been replaced (hotfixed). No data was lost. You should run CHKDSK soon to restore full performance and replenish the volume’s spare sector pool. The hotfix occurred while processing a remote request.

Details
Product: Windows Operating System
Event ID: 3181
Source: System
Version: 5.0
Symbolic Name: NELOG_HotFix
Message: A defective sector on drive %1 has been replaced (hotfixed). No data was lost. You should run CHKDSK soon to restore full performance and replenish the volume’s spare sector pool. The hotfix occurred while processing a remote request.
   
Explanation

This message should occur only on a workstation. Any action to correct the problem should be performed on that computer. The fault-tolerant system found a bad disk sector and rerouted data to a good sector (this is called hotfixing).

   
User Action

Run CHKDSK soon to ensure that enough good disk sectors are available for hotfixing.

Related:

WARNING: Because of a lazy-write error, drive %1 now contains some corrupted data. The cache is stopped.

Details
Product: Windows Operating System
Event ID: 3180
Source: System
Version: 5.0
Symbolic Name: NELOG_Lazy_Write_Err
Message: WARNING: Because of a lazy-write error, drive %1 now contains some corrupted data. The cache is stopped.
   
Explanation

This message should occur only on a workstation. Any action to correct the problem should be performed on that computer. An error occurred when the lazy-write process tried to write to the specified hard disk.

   
User Action

Run CHKDSK on the specified drive to check for problems with the disk or the files affected by the lazy-write process.

Related:

A fatal internal MTA error occurred. Unable to initialize because queue could not be created. Contact Microsoft Technical Support. [ ] (16)

Details
Product: Exchange
Event ID: 3051
Source: MSExchangeMTA
Version: 6.5.6940.0
Component: Microsoft Exchange Message Transfer Agent
Message: A fatal internal MTA error occurred. Unable to initialize because queue <name> could not be created. Contact Microsoft Technical Support. [<value> <value> <value> <value>] (16)
   
Explanation

The Microsoft Exchange Server disk may be full or a disk error may have occurred. The message transfer agent (MTA) service will stop if the disk is full.

   
User Action

If the Microsoft Exchange Server disk is full, delete any unnecessary files. If you suspect a disk error, run Chkdsk.

Verify that the \Exchsrvr\Mtadata directory exists. Stop and restart the MTA service. If the error persists, run the Mtacheck.exe utility. Stop the MTA service, open a command prompt, and type Mtacheck /v /f c:\mta.txt.

If Mtacheck removes objects from a database queue, it places each damaged object in a Db*.dat file in \Exchsrvr\Mtadata\Mtacheck.out. Contact Microsoft Product Support Services for assistance.

Related:

A fatal system error occurred while initializing the MTA. Reboot the computer. If that does not work, Contact Microsoft Technical Support. [ ] (16)

Details
Product: Exchange
Event ID: 3050
Source: MSExchangeMTA
Version: 6.5.6940.0
Component: Microsoft Exchange Message Transfer Agent
Message: A fatal system error occurred while initializing the MTA. Reboot the computer. If that does not work, Contact Microsoft Technical Support. [<value> <value> <value> <value>] (16)
   
Explanation

The Microsoft Exchange Server disk may be full or a disk error may have occurred. The message transfer agent (MTA) service will stop if the disk is full.

   
User Action

If the Microsoft Exchange Server disk is full, delete any unnecessary files. If you suspect a disk error, run Chkdsk.

Verify that the \Exchsrvr\Mtadata directory exists. Stop and restart the MTA service. If the error persists, run the Mtacheck.exe utility. Stop the MTA service, open a command prompt, and type Mtacheck /v /f c:\mta.txt.

If Mtacheck removes objects from a database queue, it places each damaged object in a Db*.dat file in \Exchsrvr\Mtadata\Mtacheck.out. Contact Microsoft Product Support Services for assistance.

Related:

A fatal MTA database server error was encountered. [ ] (16)

Details
Product: Exchange
Event ID: 3048
Source: MSExchangeMTA
Version: 6.5.6940.0
Component: Microsoft Exchange Message Transfer Agent
Message: A fatal MTA database server error was encountered. [<value> <value> <value> <value> <value>] (16)
   
Explanation

The Microsoft Exchange Server disk may be full or a disk error may have occurred. The message transfer agent (MTA) service will stop if the disk is full.

   
User Action

If the Microsoft Exchange Server disk is full, delete any unnecessary files. If you suspect a disk error, run Chkdsk.

Verify that the \Exchsrvr\Mtadata directory exists. Stop and restart the MTA service. If the error persists, run the Mtacheck.exe utility. Stop the MTA service, open a command prompt, and type Mtacheck /v /f c:\mta.txt.

If Mtacheck removes objects from a database queue, it places each damaged object in a Db*.dat file in \Exchsrvr\Mtadata\Mtacheck.out. Contact Microsoft Product Support Services for assistance.

Related:

The following message tracking log file is corrupted: ‘%1’. The corrupted record won’t be included in the search results.

Details
Product: Exchange
Event ID: 7012
Source: MSExchangeTransportLogSearch
Version: 8.0
Symbolic Name: LogSearchLogFileCorrupted
Message: The following message tracking log file is corrupted: ‘%1’. The corrupted record won’t be included in the search results.
   
Explanation

This Error event indicates that the Message Tracking tool detected a damaged record in the message tracking log referenced in the event description. The record may be corrupted, or one or more of its fields may be missing. Therefore, the record was excluded from the search results.

Message tracking records the Simple Mail Transfer Protocol (SMTP) transport activity of all messages that are transferred to and from a computer running Microsoft® Exchange Server 2007 that has the Hub Transport, Mailbox, or Edge Transport server role installed. You can use message tracking logs for message forensics, mail flow analysis, reporting, and troubleshooting.

For more information, see Managing Message Tracking.

   
User Action

To avoid this error in the future, follow one or more of these steps:

  • Do not manually modify message tracking log files. By default, these files are located in the following directory: C:\Program Files\Microsoft\Exchange Server\TransportRoles\Logs\MessageTracking.

  • Make sure that your file-based antivirus software is configured to exclude file directories that contain message tracking log files.

  • Review the System log for disk-drive-related events. Use the information in those events to determine whether the disk that stored the log files has any hardware failures. You may have to use the CHKDSK tool to check a disk and display a status report. For more information about CHKDSK, type CHKDSK /? at a command prompt.

If you are not already doing so, consider running the tools that Microsoft Exchange offers to help administrators analyze and troubleshoot their Exchange environment. These tools can help you make sure that your configuration is in line with Microsoft best practices. They can also help you identify and resolve performance issues, improve mail flow, and better manage disaster recovery scenarios. Go to the Toolbox node of the Exchange Management Console to run these tools now. For more information about these tools, see Toolbox in the Exchange Server 2007 Help.

Related: