DELL EMC NetWorker NAS backups with NDMP


Before we jump into NDMP and how NetWorker handles it, let us first cover some basics about NAS and the different types of backups.



Network Attached Storage (NAS)

NAS systems.jpg







A NAS system is a dedicated, high-performance file server combined with a storage system. It provides file-level data access and sharing. File sharing refers to storing and accessing files over a network. NAS uses network and file-sharing protocols, including TCP/IP for data transfer and CIFS and NFS for remote file services. Using NFS or CIFS, remote clients gain access over TCP/IP to all or a portion of a file system on a file server. The owner of a file can set the required type of access, such as read-only or read-write, for a particular user or group of users and control changes to the file. When multiple users try to access a shared file at the same time, a protection scheme is required to maintain data integrity while still making the sharing possible.



File Copy Backup

The simplest backup method uses file copy, such as an operating system's copy utility. In this type of copy, the metadata includes the names and characteristics of all files, so the level of granularity for recovery is the individual file. The performance of a file copy backup is directly affected by the number of files, their sizes, and the general characteristics of the file system being backed up.

Raw Device Backup

Backup can also occur at the raw device level. This means the file system has to be unmounted so that the copy can take place. The backup application can then use "dump" utilities, such as UNIX's dd, to copy from the raw device to the backup device. This type of backup is usually faster than a file copy but reduces restore granularity.

NAS Head Systems

The use of NAS heads imposes a new set of considerations on the backup and recovery strategy in NAS environments. NAS heads use a proprietary operating system and file system structure supporting multiple file-sharing protocols. In application server-based backup, the NAS head retrieves data from storage over the network and transfers it to the backup client running on the application server. The backup client sends this data to a storage node, which in turn writes the data to the backup device. This results in overloading the network with the backup data and the use of production server resources to move backup data.

Serverless Backup

In a serverless backup, the network share is mounted directly on the storage node. This avoids overloading the network during the backup process and eliminates the need to use resources on the production server. In this scenario, the storage node, which is also a backup client, reads the data from the NAS head and writes it to the backup device without involving the application server. Compared to the previous solution, this eliminates one network hop.

As the industry adopted NAS devices, several challenges became apparent.

Proprietary Operating Systems

Most NAS devices run proprietary operating systems designed for very specific functionality and therefore do not generally support "open system" management software applications for control. In addition, data storage formats differ between storage arrays.

Network File Systems

Security structures differ between the two most common network file system protocols, NFS and CIFS. A backup taken via one of these protocols cannot effectively back up the security attributes of data on the NAS device that was accessed via the other protocol. For example, a CIFS LAN backup, when restored, cannot restore NFS file attributes, and vice versa. Dual-accessed (NFS and CIFS) file systems raised a further concern: if the file system became corrupted and there was no formal, protocol-independent methodology for recovering it, then the permissions and rights of the file system could be compromised on recovery, since neither protocol understands the other's schema. Therefore, when pre-NDMP backups were performed, the image on tape was that of the specific protocol used to perform the backup.

Network Data Management Protocol (NDMP)

NAS backup challenges are addressed with NDMP, which is both a mechanism and a protocol used on a network infrastructure to enable the control of backup, recovery, and transfer of other data between NDMP-enabled primary and secondary storage devices. TCP/IP is the transport protocol, and XDR (External Data Representation) is the data encoding standard: all data is written to and read back from disparate operating systems and hardware platforms without losing integrity. NFS, for example, uses XDR to describe its data format. By implementing this standard, a NAS device's proprietary operating system ensures that the data storage format conforms to XDR and therefore allows the data to be backed up and restored without loss of file system structure, including the distinct rights and permission structures of dual-accessed file systems.
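To make the XDR idea concrete, here is a minimal Python sketch of two basic XDR encoding rules: integers are 4 bytes, big-endian, and strings are length-prefixed and padded to a 4-byte boundary. The function names are illustrative, not part of any NDMP library.

```python
import struct

def xdr_uint(n):
    # XDR encodes unsigned integers as 4 bytes, big-endian,
    # regardless of the host platform's native byte order.
    return struct.pack(">I", n)

def xdr_string(s):
    # XDR strings: a 4-byte length, then the data, padded with
    # zero bytes to the next 4-byte boundary.
    data = s.encode()
    pad = (4 - len(data) % 4) % 4
    return xdr_uint(len(data)) + data + b"\x00" * pad

# Any platform decoding these bytes recovers the same values,
# which is how file metadata stays portable across OSes.
encoded = xdr_uint(755) + xdr_string("backup.tar")
```

Because the byte layout is fixed by the standard rather than by the host, a NAS running one proprietary OS can write metadata that any other NDMP-compliant system can decode.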

Some History and facts about NDMP

Co-invented by NetApp and PDC Software (aka Intelliguard) – now Legato/EMC – in the early 1990s, with the first commercial deployments of NDMP-enabled systems as early as 1996.

Since its inception, the protocol has gone through multiple versions, designed and standardized by the NDMP consortium (www.ndmp.org) and providing varying degrees of functionality and interoperability.

The current version of the NDMP protocol is version 4, and it is supported by all enterprise NAS and DMA (Data Management Application) vendors.

NDMP version 5 has been in the works for a number of years but has not become standardized.

Some of the proposed NDMP v.5 features are already supported independently by a few NAS and DMA vendors (e.g. Token Based Backup, Checkpoint Restartable Backup, Snapshot Management Extension, and Cluster Aware Backup).

NDMP is purpose-built for NAS backup and recovery, so it is the most efficient method for this task and removes the technical barriers of traditional methods.

With NDMP, the NAS performs its own backup and recovery. The DMA only sends the commands to the NAS and maintains the device configuration and catalog.

Overall, NDMP provides the following benefits:

  • Reduces complexity
  • Provides interoperability
  • Allows the NAS device to be "backup ready"
  • Allows faster backups
  • Allows NAS and DMA vendors to focus on core competency and compatibility
  • It is a cooperative open standard initiative

Additionally, the ability to back up the file system from a block-level representation can provide a significant performance benefit, particularly in the case of dense file systems.



NDMP Standard

NDMP (Network Data Management Protocol) is an open protocol used to control data backup and recovery communications between primary and secondary storage in a heterogeneous network environment. Compliance with NDMP standards ensures that the data layout format, interface to storage, management software controls, and tape format are common irrespective of the device and software being used. Refer to the NDMP organization for more details on the protocol and implementation, at http://www.ndmp.org



NDMP Operation on NAS Devices

When NDMP is implemented on a NAS device, the device responds to backup software requests for backup and recovery functions. Unlike traditional backup methods, NDMP backups use the LAN only for metadata; the actual backup data is transferred directly to the local backup device by the NAS device.

NDMP Components in a NetWorker Environment



Three main components support NDMP data operations with the NetWorker software:

  • NDMP Data Server
  • NDMP Tape Server
  • DMA

NDMP Components.jpg

Control Station: The Control Station is the management host interface into the entire NAS system.

Data Mover: The Data Mover is the host machine that owns all the NAS resources.



NDMP Configuration Models

There are several NDMP configuration models: Direct-NDMP and NDMP-DSA (also called three-way) backups.

Each model targets specific user needs and applications. In all scenarios, the backup server and the NAS device are NDMP-compliant; the backup application controls the backup/restore process and handles file and scheduling information.

Supported NDMP Backup Types

ndmp backup types.jpg

NDMP Optional Features

Depending on the backup software vendor, two additional NDMP features may be supported:

• Direct Access Recovery or DAR

• Dynamic Drive Sharing or DDS

Direct Access Recovery (DAR)

DAR is the ability to keep track of the tape position of individual files in NDMP backups so that the tape server can seek directly to a file during restore. Without DAR support, a single-file restore requires reading sequentially through the entire save set. Another form of DAR is Directory DAR, or DDAR, an improved version that supports directory-level DAR by restoring all the content under a particular directory.
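The difference can be modeled with a toy Python sketch (the names and record format are invented, not NetWorker code): File History records map each file to its offset in the save set, so a DAR restore seeks straight to it, while a non-DAR restore must scan from the beginning.

```python
def build_fh_index(files):
    # File History (FH) records produced during backup: map each
    # file name to its (offset, size) within the save set image.
    index, offset = {}, 0
    for name, size in files:
        index[name] = (offset, size)
        offset += size
    return index

def restore_with_dar(saveset, index, name):
    # DAR: seek directly to the recorded offset, read only the file.
    offset, size = index[name]
    return saveset[offset:offset + size]

def restore_without_dar(saveset, files, name):
    # No DAR: walk the save set sequentially from the start until
    # the requested file is reached.
    offset = 0
    for fname, size in files:
        if fname == name:
            return saveset[offset:offset + size]
        offset += size
    return None
```

Both paths return the same file; the cost difference is the sequential scan, which for a large save set means reading most of the tape to recover one file.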

Dynamic Drive Sharing (DDS)

DDS enables tape drives within individual tape libraries to be shared between multiple NAS devices and/or storage nodes in a SAN. By allowing storage nodes to write data to all available drives, more drives can be assigned per backup group than in an environment where drives are dedicated to specific servers. As a result, DDS maximizes library utilization, enables backups and recoveries to complete sooner, and increases library ROI.

NDMP key Installation binaries with NetWorker

NDMP Binaries.jpg

Direct-NDMP

The DMA (NetWorker) controls the NDMP connection and manages the metadata.

The NAS backs up the data directly, over Fibre Channel, to a locally attached NDMP TAPE device.



  • Direct-NDMP writes and reads to/from TAPE in serial fashion, one save set at a time.

  • Direct-NDMP does not support multiplexing of save sets on the same volume.

  • Client parallelism must not exceed the total number of NDMP devices available to the NAS.

  • Direct-NDMP has the advantage of backing up data directly to tape from the NAS.

  • Typically fast, but it cannot take advantage of tape multiplexing.

  • It may be a better choice for best performance when a few large file systems require backup and throughput is more relevant than multiplexing.
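Since Direct-NDMP writes one stream per device with no multiplexing, the effective parallelism cap can be expressed as a one-liner (a trivial illustration, not a NetWorker configuration tool):

```python
def effective_direct_ndmp_parallelism(client_parallelism, ndmp_devices):
    # Direct-NDMP: one save set per device, no multiplexing, so no
    # more streams can run than there are NDMP tape devices.
    return min(client_parallelism, ndmp_devices)
```

Setting client parallelism higher than the device count gains nothing; the extra save sets simply queue until a device frees up.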



How it works

  • nsrndmp_save runs on the NetWorker Server. It is invoked by a workflow or manually via the command line.

  • nsrndmp_save establishes a TCP connection with the NAS on port 10000 and sends backup parameters to the NAS. The NAS backs up directly to NDMP TAPE over FC SAN.

  • nsrndmp_save spawns ‘nsrndmp_2fh’ which receives and sorts File History (FH) messages from the NAS for building the backup index.

  • Once all File History is received, ‘nsrndmp_2fh’ exits. ‘nsrndmp_save’ then spawns ‘nsrdmpix’ which converts the FH to NetWorker index format and passes it to ‘nsrindexd’ for index commit on the NetWorker Server.

Once both the backup and index processing are completed (not necessarily at the same time), ‘nsrndmp_save’ reports the backup status and exits.

If either the backup or index generation fails, ‘nsrndmp_save’ reports that the backup has failed.

NOTE: If the backup fails, the index generation also fails. But if only index generation fails, the backup is likely successful and recoverable.

A separate ‘nsrndmp_save’ process is issued and running for each save set; the same applies to ‘nsrndmp_2fh’ and ‘nsrdmpix’.
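The control flow above — a TCP control session carrying backup parameters one way and File History the other — can be mocked with plain sockets. This is a toy sketch: an ephemeral local port stands in for the NDMP control port 10000, and the message format is invented, not the actual NDMP wire protocol.

```python
import socket
import threading

def fake_nas(server_sock, results):
    # Toy NAS: accept the DMA control connection, read the backup
    # parameters, then stream File History (FH) records back.
    conn, _ = server_sock.accept()
    results["params"] = conn.recv(1024).decode()
    for name in ("/fs/file1", "/fs/file2"):
        conn.sendall((name + "\n").encode())
    conn.close()

def run_dma():
    # Toy DMA: connect to the NAS control port, send backup
    # parameters, and collect FH messages for index generation
    # (the role nsrndmp_2fh plays in NetWorker).
    srv = socket.socket()
    srv.bind(("127.0.0.1", 0))   # ephemeral port stands in for 10000
    srv.listen(1)
    port = srv.getsockname()[1]
    results = {}
    t = threading.Thread(target=fake_nas, args=(srv, results))
    t.start()
    dma = socket.create_connection(("127.0.0.1", port))
    dma.sendall(b"BACKUP type=dump path=/fs level=0")
    fh = b""
    while True:
        chunk = dma.recv(1024)
        if not chunk:
            break
        fh += chunk
    dma.close()
    t.join()
    srv.close()
    return results["params"], fh.decode().split()
```

The key point the mock illustrates: only commands and file history cross the control connection; in Direct-NDMP the bulk data never touches this path, going straight from NAS to tape over FC.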







NDMP-DSA (Data Server Agent)

NetWorker controls the NDMP connection and manages the metadata. NetWorker is also responsible for the data backup to a NetWorker device (non-NDMP).

The NAS sends the data over TCP/IP to NetWorker for backup to the Server’s or Storage Node’s configured device.

NDMP-DSA supports all NetWorker device types (Tape, AFTD, DD-Boost).

NDMP-DSA is sometimes also called “Three-Way NDMP”.



  • NDMP-DSA writes and reads data over the TCP/IP network to the NetWorker server or Storage Node.

  • Client parallelism can be set based on the capabilities of the NAS as opposed to the number of available devices.

  • NDMP-DSA has the advantage of leveraging NetWorker’s tape multiplexing capability to back up more than one save set to the same tape at the same time.

  • With NDMP-DSA, the data has to be sent over the TCP/IP network to NetWorker, which may cause throughput contention.

  • NDMP-DSA may be a better choice for best performance when many small file systems require backup and greater parallelism and multiplexing are more relevant than network bandwidth.
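The multiplexing advantage can be illustrated with a toy round-robin interleaver (illustrative only; NetWorker's actual tape record format is not modeled): records from several save-set streams land on the same volume concurrently instead of one save set at a time.

```python
def multiplex(streams):
    # Round-robin interleave of records from several save-set
    # streams onto a single volume (toy model of tape multiplexing).
    volume = []
    iters = [iter(s) for s in streams]
    while iters:
        for it in list(iters):   # copy: we may remove exhausted iters
            try:
                volume.append(next(it))
            except StopIteration:
                iters.remove(it)
    return volume
```

Interleaving keeps the drive streaming even when each individual save set is slow, which is why many small file systems favor NDMP-DSA over Direct-NDMP's one-save-set-per-device model.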



How it works



  • ‘nsrndmp_save’ starts via ‘savegrp’ on the NetWorker Server. It can also be started manually from a command line.

  • ‘nsrndmp_save’ connects to the NAS over TCP port 10000, authenticates, and passes the backup parameters to the NAS.

  • ‘nsrndmp_save’ communicates with the Storage Node (Server or Remote) via ‘nsrexecd’ and spawns the ‘nsrdsa_save’ process.

  • ‘nsrndmp_save’ passes the DSA hostname information to the NAS so it can connect to the ‘nsrdsa_save’ process and start the backup.

  • ‘nsrdsa_save’ communicates with the NAS over any of the available NetWorker TCP Service ports configured on the Server or Storage Node.
  • ‘nsrdsa_save’ receives the backup data from the NAS and passes it to the ‘nsrmmd’ for backup to the NetWorker device.

  • If “Client Direct” is set in the NDMP Client, ‘nsrdsa_save’ communicates with the backup device directly.

  • In parallel, ‘nsrndmp_save’ spawns the ‘nsrndmp_2fh’ process on the NetWorker Server which receives FH messages for index generation.

  • After ‘nsrndmp_2fh’ is done, ‘nsrndmp_save’ spawns ‘nsrdmpix’ for index processing to ‘nsrindexd’.

  • Once the backup is done, the NAS closes the DSA connection and the ‘nsrdsa_save’ process exits. If/when index processing is complete, ‘nsrndmp_save’ exits and reports the status of the backup.

If either the backup or index generation fails, ‘nsrndmp_save’ reports that the backup has failed.

NOTE: If the backup fails, the index generation also fails. But if only index generation fails, the backup is likely successful and recoverable.



  • A separate ‘nsrdsa_save’ process is issued and running for each save set being backed up concurrently.



NDMP-DSA Client Direct



NDMP “Client Direct” applies solely to the communication between the ‘nsrdsa_save’ process and the backup device (AFTD or DD-Boost). Contrary to core NetWorker clients, NDMP clients cannot communicate directly with the NDMP-DSA backup device (that is, the NAS always has to send the data to ‘nsrdsa_save’ on the NetWorker host). Thus, the DSA host is always the acting client as far as NDMP “Client Direct” is concerned. With “Client Direct” set, ‘nsrdsa_save’ communicates with the backup device directly; this is called “Immediate Save”. Without “Client Direct” set, ‘nsrdsa_save’ communicates with the ‘nsrmmd’ process rather than the device directly; this is called “Non-Immediate Save”.

Client Direct NDMP workflow.jpg



More information can be found in the NetWorker NDMP guide:



https://support.emc.com/docu89901_NetWorker_18.1_Network_Data_Management_Protocol_(NDMP)_User_Guide.pdf?language=en_US



NDMP Index Processing (Technical Notes)



https://support.emc.com/docu39128_NDMP-Index-Processing:-A-Performance-Case-Study-Technical-Notes.pdf?language=en_US



Tip: Interested in setting up an Isilon simulator in the lab? Here is the information you need.



Download VMware Player.



Download the Isilon Simulator (162 MB) from the following link:



https://download.emc.com/downloads/DL89604_Isilon_OneFS_8.1.0.4_Simulator.zip?language=en_US&source=Coveo



Each node will require 2 GB of memory.



Remember, an Isilon cluster should have at least 3 nodes; the maximum is 144.



Follow the video below and have fun.



Technical Demo: EMC Isilon OneFS Simulator | ID.TV – YouTube





Post-install, if you want to explore the GUI:



Open a browser and enter one of the external IP addresses with port number 8080.

Have fun, and I hope this helps!


