In-Band Network Telemetry: Next Frontier in Network Visualization with Analytics and Why Enterprise Customers Care
By: Gautam Chanda, Global Product Line Manager DC Networking Analytics, HPE
Let’s first answer the important question: Why do we need Network Visualization and Analytics?
Data center networks have reached cloud scale, and deployments of hyper-converged networks are increasing. Telecom networks delivering 5G wireless services will bring faster, higher-bandwidth connectivity everywhere. All of these next-generation networks not only require much higher bandwidth, but they also require real-time telemetry to deliver services with good Quality of Experience (QoE).
A network with detailed real-time visibility enables better reliability and real-time control. Here are key reasons customers need Network Visualization and Analytics now even more than before:
- Ability to Pinpoint Traffic Patterns for Dynamic Applications: Data centers now have increasingly complex network deployments: network virtualization and overlay/tunnel technologies, SDN/NFV, silicon programmability, multi-tenancy, growing application volume, mobility, hybrid cloud, bare-metal and virtualized servers (VMs/containers), vSwitches, NIC virtualization, orchestration, and the list goes on. The result is increasingly complicated traffic patterns, and network operators want greater visibility into those patterns to understand whether their DC network infrastructure is performing optimally.
- Security Challenges: Complicated IT scenarios raise more security concerns, regulatory compliance requirements are becoming stricter, and cybersecurity attacks from both inside and outside the data center are increasing. Defending against these attacks, amid complex traffic patterns both inside and outside the data center, is critical.
- Intent-Based Network
- Network Analytics (Visibility, Validation, Optimization & Upgrade, Troubleshooting, Policy Enforcement) is increasingly important for modern DC and Cloud deployments.
Older network management tools such as SNMP are not up to the task in these very high-speed networks as we move from 10G to 25G to 100G and beyond in short order.
The figure below demonstrates very well the need for Network Visualization and Analytics:
This brings us to In-Band Network Telemetry (INT).
Let’s pause for a minute:
- Let’s assume you’re interested in the behaviour of your live user-data traffic.
- What is the best source of information?
- Well… probably the live user-data traffic itself.
- Let’s add meta-data to all interesting live user-data traffic.
This is the essence of In-Band Network Telemetry.
The figure below contrasts the two approaches. In traditional network monitoring, an application polls the host CPU to gather aggregated telemetry every few seconds or minutes, which doesn’t scale well in next-generation networks. In-Band Network Telemetry, however, enables packet-level telemetry by having key details related to packet processing added to the data plane packets without consuming any host CPU resources:
Figure 2: Traditional vs New Way
In-Band Network Telemetry (INT) is a sophisticated and flexible telemetry feature, usually supported in hardware within network devices. As explained above, INT allows the data plane to collect and report detailed latency, congestion, and network state information without requiring intervention or work by the control plane. INT-enabled devices insert this valuable metadata in-band, without affecting network performance, and it can then be extracted and interpreted later by a collector/sink/network management software such as HPE IMC.
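To make the mechanism concrete, here is a minimal Python sketch of the idea, not of any real INT header format. The names `HopMetadata`, `Packet`, `transit_hop`, and `sink`, as well as the metadata fields, are hypothetical and chosen only for illustration; actual devices do this work in hardware, and the on-wire encoding is defined by the relevant INT specification.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class HopMetadata:
    """Hypothetical per-hop record; real INT field layouts differ."""
    switch_id: int
    ingress_ts_ns: int   # time the packet entered the device
    egress_ts_ns: int    # time the packet left the device
    queue_depth: int     # queue occupancy seen by the packet


@dataclass
class Packet:
    payload: bytes
    int_stack: List[HopMetadata] = field(default_factory=list)


def transit_hop(packet: Packet, hop: HopMetadata) -> Packet:
    """Each INT-enabled device appends its own metadata to the packet."""
    packet.int_stack.append(hop)
    return packet


def sink(packet: Packet) -> Tuple[Packet, List[HopMetadata]]:
    """The last INT device strips the metadata and hands it to a collector."""
    report = list(packet.int_stack)
    return Packet(payload=packet.payload), report


# Usage: a packet crosses two hops, then the sink extracts the report.
pkt = Packet(payload=b"user data")
transit_hop(pkt, HopMetadata(1, 1000, 1400, 3))
transit_hop(pkt, HopMetadata(2, 1900, 4900, 250))
delivered, report = sink(pkt)
```

The point of the sketch is that the user data reaches its destination unchanged, while the telemetry travels with the packet and is separated from it only at the sink.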
INT enables a number of very useful customer use cases, such as:
- Network troubleshooting
- When packets enter/exit networks
- Which path was taken by individual flows associated with Specific Applications
- How long packets spend at each hop
- How long packets spend on each link
- Which switches are seeing congestion
- Microburst detection
- Real-time control or feedback loops:
- Collector might use the INT data plane information to feed back control information to traffic sources, which could in turn use this information to make changes to traffic engineering or packet forwarding. (Explicit congestion notification schemes are an example of these types of feedback loops).
- Network Event Detection:
- If the collected path state indicates a condition that requires immediate attention or resolution (such as severe congestion or violation of certain dataplane invariances), the Collector could generate immediate actions to respond to the network events, forming a feedback control loop either in a centralized or a fully decentralized fashion (a la TCP).
- And the list goes on.
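As a rough illustration of the troubleshooting questions above (time spent at each hop, time spent on each link, which switches see congestion), the sketch below derives those values from a list of per-hop records. The record keys, the sample values, and the queue threshold are all assumptions made for this example. The link calculation also assumes that device clocks are synchronized, which a real deployment would have to ensure.

```python
from typing import Dict, List, Tuple

Hop = Dict[str, int]


def hop_latencies(hops: List[Hop]) -> Dict[int, int]:
    """Time spent inside each device: egress minus ingress timestamp."""
    return {h["switch_id"]: h["egress_ts_ns"] - h["ingress_ts_ns"] for h in hops}


def link_latencies(hops: List[Hop]) -> List[Tuple[int, int, int]]:
    """Time spent on each link: next hop's ingress minus this hop's egress.

    Assumes synchronized clocks across devices.
    """
    return [
        (a["switch_id"], b["switch_id"], b["ingress_ts_ns"] - a["egress_ts_ns"])
        for a, b in zip(hops, hops[1:])
    ]


def congested_switches(hops: List[Hop], queue_threshold: int) -> List[int]:
    """Switches whose reported queue depth exceeds a chosen threshold."""
    return [h["switch_id"] for h in hops if h["queue_depth"] > queue_threshold]


# Made-up sample: metadata collected for one packet across three switches.
hops = [
    {"switch_id": 1, "ingress_ts_ns": 1000, "egress_ts_ns": 1400, "queue_depth": 3},
    {"switch_id": 2, "ingress_ts_ns": 1900, "egress_ts_ns": 4900, "queue_depth": 250},
    {"switch_id": 3, "ingress_ts_ns": 5300, "egress_ts_ns": 5600, "queue_depth": 5},
]

print(hop_latencies(hops))            # {1: 400, 2: 3000, 3: 300}
print(link_latencies(hops))           # [(1, 2, 500), (2, 3, 400)]
print(congested_switches(hops, 100))  # [2]
```

In this made-up sample, switch 2 stands out on both hop latency and queue depth, which is the kind of signal a collector could use for congestion triage or as input to a feedback loop.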
The figure below shows an end-to-end INT customer use case in a data center:
Figure 3: End To End INT
Figure 3 above shows how In-Band Network Telemetry is used to “Track in Real Time Path and Latency of Packets and Flows Associated with Specific Applications”:
- Collect the physical path and hop latencies hop-by-hop for every packet.
- Can be initiated, transited, or terminated by either a switch or a NIC (Network Interface Card) in a host such as a server.
- INT metadata is encapsulated and exported to the collector (e.g. HPE IMC).
- Case 1a: Real-time fault detection and isolation or alert: Congested/oversubscribed links and devices, imbalanced links (LAG, ECMP), loop.
- Case 1b: Interactive analysis & troubleshooting: On-demand path visualization; Traffic matrix generation; Triage incidents of congestion.
- Case 1c: Path Verification of bridging/routing, SLA, and configuration effects.
- Enhanced visibility for all your Network traffic
- Network provided telemetry data gathered and added to live data
- Complement out-of-band OAM tools like SNMP, ping, and traceroute
- Path / Service chain verification
- Record the packet’s trip as meta-data within the packet
- Record path and node (i/f, time, app-data) specific data hop-by-hop and end to end
- Export telemetry data via Netflow/IPFIX/Kafka to Controller/Apps
- In-band Network Telemetry can be implemented without forwarding performance degradation
- Network ASIC vendors have started to add INT as a built-in function within their newest ASICs
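The path verification and loop detection cases listed above can be sketched in a few lines once a collector holds the hop-by-hop record for a packet. This is only an illustrative sketch: the function names, the `switch_id` key, and the expected path are hypothetical, and a real collector would take the expected path from its own topology or policy data.

```python
from typing import Dict, List, Sequence


def observed_path(hops: List[Dict[str, int]]) -> List[int]:
    """Ordered list of switch IDs the packet actually traversed."""
    return [h["switch_id"] for h in hops]


def verify_path(hops: List[Dict[str, int]], expected: Sequence[int]) -> Dict[str, object]:
    """Compare the recorded path with the intended one and flag loops."""
    path = observed_path(hops)
    return {
        "path": path,
        "matches_expected": path == list(expected),
        # A switch ID that appears more than once indicates a loop.
        "has_loop": len(set(path)) != len(path),
    }


# Usage with made-up records: one healthy path, one looping path.
good = [{"switch_id": 1}, {"switch_id": 2}, {"switch_id": 3}]
looped = [{"switch_id": 1}, {"switch_id": 2}, {"switch_id": 1}]

good_result = verify_path(good, expected=[1, 2, 3])
loop_result = verify_path(looped, expected=[1, 2, 3])
```

Because the path is recorded by the packet itself, the check reflects what the data plane actually did rather than what the configuration says it should do.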
HPE FlexFabric Network Analytics solution is leading the way towards this next frontier in Network Visualization and Analytics.