Once upon a time, enterprise network engineers had to provide network access and sufficient bandwidth to various connected servers, applications and end devices. From an OSI model perspective, the focus was on Layers 1-4 only. Upper OSI layers were more or less ignored, as all traffic and data flows running across a network shared all bandwidth and queuing resources.
As time went on, network equipment became sophisticated to the point where different data flows could be identified and treated differently on the network. Various quality of service (QoS) and application-level traffic shaping techniques can be used to accomplish this goal. Additionally, the ever-increasing reliance on business-critical applications has forced network engineers to understand upper layers of the OSI model so they can help to identify any inefficiencies or problems related to the network, server OS, virtualization software and applications themselves. But in order to do that, a tool is needed to identify such problems.
In many cases, network performance monitoring tools evolved from more traditional and less sophisticated network monitoring software. Those earlier tools commonly used ICMP ping and Simple Network Management Protocol (SNMP) polling and traps to verify the health of a network. More modern additions include the ability to monitor, baseline and intelligently analyze possible issues all the way up to the application itself. Most modern network performance monitoring tools can perform five functions.
Depending on the network performance monitoring vendor, these tasks are performed with varying levels of granularity. And the more precise they are, the more complex implementation and management can be. So, it’s important to understand exactly what your organization needs — and properly gauge the tradeoff between granularity and complexity. That being said, let’s further explore the five functions today’s network performance monitoring tools commonly offer.
Network and application monitoring
As mentioned earlier, today’s network performance monitoring tools evolved from network monitoring that relied on ICMP ping and SNMP. The network monitoring server sent routine pings to the networks, servers and other end devices that required monitoring. If a monitored device stopped responding to the ping requests, the monitoring tool would mark the device as “down” and alert support staff.
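The up/down logic described above can be sketched as a small state tracker. This is only an illustration: the names `PingState` and `record_ping`, and the `down_after` threshold of three missed replies, are assumptions made for the example and do not come from any particular product. Sending the actual ICMP requests is left out; the sketch only consumes their results.

```python
from dataclasses import dataclass


@dataclass
class PingState:
    """Tracks reachability of one monitored device."""
    consecutive_failures: int = 0
    status: str = "up"


def record_ping(state: PingState, replied: bool, down_after: int = 3) -> bool:
    """Update the state with one ping result; return True if an alert should fire.

    down_after is a hypothetical threshold: the number of consecutive
    missed replies before the device is marked "down".
    """
    if replied:
        state.consecutive_failures = 0
        state.status = "up"
        return False
    state.consecutive_failures += 1
    if state.consecutive_failures >= down_after and state.status == "up":
        state.status = "down"
        return True  # alert only on the up -> down transition
    return False


state = PingState()
results = [True, False, False, False, False, True]
alerts = [record_ping(state, r) for r in results]
print(alerts)  # [False, False, False, True, False, False]
```

Alerting only on the transition, rather than on every missed ping, is one way to keep support staff from receiving a notification for each polling cycle during an outage.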
SNMP collects and organizes various types of data from network and server components capable of supporting the protocol.
For network devices, this commonly means monitoring specific device interface states and data throughput rates over time. It can also monitor hardware health, including power supplies, fans and memory utilization, among others.
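As a rough illustration of how throughput is derived from SNMP data, the sketch below turns two successive readings of an interface octet counter (such as ifInOctets) into an average bit rate. The function name `throughput_bps` and the sample values are assumptions for the example, and the SNMP polling itself is not shown. The 32-bit default reflects the fact that the standard interface octet counters wrap at 2^32.

```python
COUNTER32_MAX = 2**32  # standard interface octet counters are 32-bit and wrap


def throughput_bps(prev_octets: int, curr_octets: int, interval_s: float,
                   modulus: int = COUNTER32_MAX) -> float:
    """Average bits per second between two polls of an octet counter."""
    if interval_s <= 0:
        raise ValueError("interval_s must be positive")
    # Modular subtraction gives the right delta even if the counter
    # wrapped once between the two polls.
    delta = (curr_octets - prev_octets) % modulus
    return delta * 8 / interval_s


# Two polls taken 300 seconds apart (sample values).
print(throughput_bps(1_000_000, 4_750_000, 300))  # 100000.0
```

On fast links a 32-bit counter can wrap more than once between polls, which this calculation cannot detect; shorter polling intervals or 64-bit counters (passing `modulus=2**64`) avoid that.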
Some network performance monitoring tools are also capable of collecting and triggering from various syslog messages. Syslog is a common standard for infrastructure device log messages. The messages are sent to the centralized network monitoring tool to be stored, analyzed and used to notify support engineers in the event of a system malfunction.
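To show how a monitoring tool might decide which syslog messages warrant a notification, here is a minimal sketch that decodes the priority value at the start of a message. In the syslog standard, that value equals facility times 8 plus severity, and lower severity numbers are more urgent. The function names, the notification threshold and the sample message are assumptions made for illustration.

```python
import re

SEVERITIES = ["emergency", "alert", "critical", "error",
              "warning", "notice", "informational", "debug"]
PRI_RE = re.compile(r"^<(\d{1,3})>(.*)$", re.DOTALL)


def parse_priority(line: str):
    """Return (facility, severity, rest_of_message), or None if no valid PRI."""
    match = PRI_RE.match(line)
    if not match:
        return None
    pri = int(match.group(1))
    if pri > 191:  # 23 facilities * 8 + severity 7 is the largest valid value
        return None
    facility, severity = divmod(pri, 8)
    return facility, severity, match.group(2)


def should_notify(line: str, max_severity: int = 3) -> bool:
    """True when the message severity is 'error' (3) or more urgent."""
    parsed = parse_priority(line)
    return parsed is not None and parsed[1] <= max_severity


# Illustrative message, not taken from a real device log.
sample = "<187>Interface GigabitEthernet0/1 changed state to down"
facility, severity, text = parse_priority(sample)
print(facility, SEVERITIES[severity], should_notify(sample))  # 23 error True
```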
Modern network monitoring tools have also gained the ability to monitor availability and performance statistics all the way up to the application level. This type of monitoring usually relies on software plug-ins or OS settings configured to send monitoring data back to the centralized monitoring server.
Virtualization and OS problem detection
Issues also can — and do — arise between the network and the application. This includes problems at the virtualization level, server operating system and any middleware the application relies on to operate. Virtualization hypervisors can be individually monitored for performance problems that can cause slowdowns at the application level. The same is also true for monitoring the host OS and middleware that orchestrates communication over distributed systems. Network performance monitoring vendors use differing methods to monitor these types of problems and some support a greater variety of hypervisors, operating systems and middleware software than others.
Automated network troubleshooting
In addition to providing simple up/down status and utilization information, network performance monitoring products can perform more sophisticated and automated network troubleshooting. This includes routing protocol monitoring and alerting when unscheduled routing protocol changes occur. Additionally, some products possess intelligence to understand how various WAN technologies, virtual overlays and QoS features operate. They, too, can be set to automatically alert when problems occur and even take automated actions to resolve issues.
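One simple way to picture routing change alerting is as a comparison of two routing table snapshots. The sketch below assumes each snapshot has already been collected as a mapping of prefix to next hop; how a product actually gathers that data is not covered here, and the function name `diff_routes`, the `in_maintenance` flag and the sample addresses are hypothetical.

```python
def diff_routes(before: dict, after: dict):
    """Compare two snapshots that map prefix -> next hop."""
    added = {p: after[p] for p in after.keys() - before.keys()}
    removed = {p: before[p] for p in before.keys() - after.keys()}
    changed = {p: (before[p], after[p])
               for p in before.keys() & after.keys()
               if before[p] != after[p]}
    return added, removed, changed


def unscheduled_change(before: dict, after: dict, in_maintenance: bool) -> bool:
    """Alert only when routes differ outside a scheduled maintenance window."""
    added, removed, changed = diff_routes(before, after)
    return bool(added or removed or changed) and not in_maintenance


# Sample snapshots using documentation address ranges.
before = {"10.1.0.0/16": "192.0.2.1", "10.2.0.0/16": "192.0.2.2"}
after = {"10.1.0.0/16": "192.0.2.9", "10.3.0.0/16": "192.0.2.2"}

print(diff_routes(before, after))
print(unscheduled_change(before, after, in_maintenance=False))  # True
```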
Application data and flow capture analysis
The most important duties of modern network performance monitoring tools revolve around data and flow capture analysis. There are a few different methods to capture data packets on various parts of the network to be used for automated and/or manual analysis. Among the most common:
Deployment of distributed data collection agents throughout critical parts of the network
The ability to leverage packet capture functionality built into certain router/switch hardware
The ability to examine packets to perform more granular application analysis is a growing need in many enterprise organizations. By using deep packet inspection, network administrators can identify more application-related communication problems that would otherwise go unnoticed.
Network flow collection gathers IP traffic statistics as data enters and exits network interfaces. Once this data is exported to a centralized server and analyzed with flow analysis tools, network support administrators can identify traffic source and destination information, as well as the QoS policies the traffic encounters as it traverses the network. Ultimately, the data can be used to identify configuration issues or congestion along the network paths between devices.
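The kind of analysis described above can be sketched with a few flow records. The `Flow` record layout, the function names and the sample data below are simplified assumptions for illustration and do not mirror any specific flow export format. The DSCP field stands in for the QoS marking carried by the traffic (46 is the Expedited Forwarding value commonly used for voice, 0 is best effort).

```python
from collections import Counter
from typing import NamedTuple


class Flow(NamedTuple):
    """A simplified flow record (hypothetical layout)."""
    src: str
    dst: str
    dscp: int
    octets: int


def top_talkers(flows, n=3):
    """Rank source/destination pairs by total bytes transferred."""
    totals = Counter()
    for flow in flows:
        totals[(flow.src, flow.dst)] += flow.octets
    return totals.most_common(n)


def octets_by_dscp(flows):
    """Total bytes per DSCP marking, to see how traffic maps to QoS classes."""
    totals = Counter()
    for flow in flows:
        totals[flow.dscp] += flow.octets
    return dict(totals)


flows = [
    Flow("192.0.2.10", "198.51.100.5", 46, 1200),
    Flow("192.0.2.10", "198.51.100.5", 46, 800),
    Flow("192.0.2.20", "198.51.100.9", 0, 5000),
]

print(top_talkers(flows, 1))   # [(('192.0.2.20', '198.51.100.9'), 5000)]
print(octets_by_dscp(flows))   # {46: 2000, 0: 5000}
```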
Root cause analysis
The various events collected and analyzed by a network performance monitoring tool can also be combined to produce an automated root cause analysis. If an issue on the network triggers events on multiple components, many network performance monitoring tools use artificial intelligence to correlate the events and determine a likely root cause of the problem. This is one of the trickier functions to configure, since it requires that all devices and monitoring systems be configured correctly. For example, if device clocks are not synchronized using the Network Time Protocol, event timestamps will be incorrect, which can reduce the accuracy of the root cause analysis engine. But once set up and properly maintained, automated root cause analysis tools can save a tremendous amount of troubleshooting time.
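To make the idea of event correlation concrete, the sketch below uses a simple rule-based heuristic, not the artificial intelligence a commercial product might apply. It groups events that occur close together in time and then picks the alarmed device that sits upstream of the most other alarmed devices. The `Event` layout, the 30-second window, the `parent_of` topology map and the device names are all assumptions made for the example.

```python
from typing import NamedTuple


class Event(NamedTuple):
    timestamp: float  # seconds; assumes device clocks are synchronized
    device: str
    message: str


def group_by_window(events, window_s=30.0):
    """Cluster events that fall within window_s of a cluster's first event."""
    groups = []
    for ev in sorted(events):  # tuples sort by timestamp first
        if groups and ev.timestamp - groups[-1][0].timestamp <= window_s:
            groups[-1].append(ev)
        else:
            groups.append([ev])
    return groups


def upstream_chain(device, parent_of):
    """List the devices that sit upstream of the given device."""
    chain, seen = [], {device}
    while device in parent_of:
        device = parent_of[device]
        if device in seen:  # guard against loops in the topology map
            break
        seen.add(device)
        chain.append(device)
    return chain


def likely_root(group, parent_of):
    """Pick the alarmed device upstream of the most other alarmed devices.

    Ties are broken in favor of the earliest event.
    """
    alarmed = {ev.device for ev in group}

    def score(ev):
        return sum(1 for d in alarmed if ev.device in upstream_chain(d, parent_of))

    return max(group, key=lambda ev: (score(ev), -ev.timestamp)).device


# Hypothetical topology: servers hang off an access switch.
parent_of = {"server-a": "access-sw1", "server-b": "access-sw1",
             "access-sw1": "dist-sw1"}
events = [
    Event(100.0, "server-a", "ping timeout"),
    Event(99.5, "access-sw1", "uplink down"),
    Event(101.0, "server-b", "ping timeout"),
    Event(500.0, "dist-sw1", "fan failure"),
]

groups = group_by_window(events)
print(likely_root(groups[0], parent_of))  # access-sw1
```

The dependence on accurate time is easy to see here: if the clock on access-sw1 were two minutes fast, its event would land in a different group and the heuristic would blame one of the servers instead.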