Unreal-to- Real

Saturday, May 25, 2013

BIDIRECTIONAL FORWARDING DETECTION (BFD)


Fast Failure Detection to Speed Network Convergence

OVERVIEW

In both Enterprise and Service Provider networks, the convergence of business-critical applications onto a common IP infrastructure is becoming more common. Given the criticality of the data, these networks are typically constructed with a high degree of redundancy. While such redundancy is desirable, its effectiveness depends on the ability of individual network devices to quickly detect failures and reroute traffic to an alternate path.

This detection is now typically accomplished via hardware detection mechanisms. However, the signals from these mechanisms are not always conveyed directly to the upper protocol layers. When the hardware mechanisms do not exist (e.g., Ethernet) or when the signaling does not reach the upper protocol layers, the protocols must rely on their own, much slower, strategies to detect failures. The detection times in existing protocols are typically greater than one second, and sometimes much longer. For some applications, this is too long to be useful.

Bi-directional Forwarding Detection (BFD) provides rapid failure detection times between forwarding engines, while maintaining low overhead. It also provides a single, standardized method of link/device/protocol failure detection at any protocol layer and over any media.

THE PROBLEM WITH CONVERGENCE

The process of network convergence can be broken up into a set of discrete events:

·         Failure detection: the speed with which a device on the network can detect and react to a failure of one of its own components, or the failure of a component in a routing protocol peer.

·         Information dissemination: the speed with which the failure in the previous stage can be communicated to other devices in the network

·         Repair: the speed with which all devices on the network—having been notified of the failure—can calculate an alternate path through which data can flow.

An improvement in any one of these stages provides an improvement in overall convergence.

The first of these stages—failure detection—can be the most problematic and inconsistent.

·         Different routing protocols use varying methods and timers to detect the loss of a routing adjacency with a peer
·         Link-layer failure detection times can vary widely depending on the physical media and the Layer 2 encapsulation used
·         Intervening devices (e.g., an Ethernet switch) can hide link-layer failures from routing protocol peers

Packet over SONET (POS) tends to have the best failure detection time amongst the different Layer 1/2 media choices. It can typically detect and react to media or protocol failures in ~50 milliseconds. This has become the benchmark against which other protocols are measured. BFD can provide fast failure detection times for all media types, encapsulations, topologies, and routing protocols. In the best-case scenario, it can provide fast failure detection similar to that found in POS. 

A secondary benefit of BFD, in addition to fast failure detection, is that it provides network administrators with a consistent method of detecting failures. Thus, one availability methodology could be used, irrespective of the Interior Gateway Protocol (IGP) or the topology of the target network. This eases network profiling and planning, because reconvergence time should be consistent and predictable.

Common BFD applications include:

·         Control plane liveness detection
·         Tunnel endpoint liveness detection
·         A trigger mechanism for IP/MPLS Fast ReRoute
·         MPLS Label Switched Path (LSP) data plane failure detection

BFD Packet Formats

BFD control packets are always sent as unicast packets to the BFD peer.

Figure 1. BFD Control Packet Payload
 
Field Description

Vers: The version number of the protocol. The initial protocol version is 0.

Diag: A diagnostic code specifying the local system's reason for the last transition of the session from Up to some other state. Possible values are:
0: No Diagnostic
1: Control Detection Time Expired
2: Echo Function Failed
3: Neighbor Signaled Session Down
4: Forwarding Plane Reset
5: Path Down
6: Concatenated Path Down
7: Administratively Down

H Bit: The “I Hear You” bit. This bit is set to 0 if the transmitting system either is not receiving BFD packets from the remote system, or is in the process of tearing down the BFD session for some reason. Otherwise, during normal operation, it is set to 1.

D Bit: The “Demand Mode” bit. If set, the transmitting system wishes to operate in Demand Mode.

P Bit: The Poll bit. If set, the transmitting system is requesting verification of connectivity, or of a parameter change.

F Bit: The Final bit. If set, the transmitting system is responding to a received BFD Control packet that had the Poll (P) bit set.

Rsvd: Reserved bits. These bits must be zero on transmit, and ignored on receipt.

Detect Mult: Detect time multiplier. The negotiated transmit interval, multiplied by this value, provides the detection time for the transmitting system in Asynchronous mode.

If the reader is familiar with IGP HELLO protocol mechanisms, this is analogous to the hello-multiplier in IS-IS, which is used to determine the hold-timer: (hello-interval) * (hello-multiplier) = hold-timer. If a HELLO is not received within the hold-timer, a failure has occurred.

Similarly, in BFD: (transmit interval) * (detect multiplier) = detect-timer. If a BFD control packet is not received from the remote system within the detect-timer, a failure has occurred.

Length: Length of the BFD Control packet, in bytes.

My Discriminator: A unique, nonzero discriminator value generated by the transmitting system, used to demultiplex multiple BFD sessions between the same pair of systems.

Your Discriminator: The discriminator received from the corresponding remote system. This field reflects back the received value of My Discriminator, or is zero if that value is unknown.

Desired Min TX Interval: The minimum interval, in microseconds, that the local system would like to use when transmitting BFD Control packets.

Required Min RX Interval: The minimum interval, in microseconds, between received BFD Control packets that this system is capable of supporting.

Required Min Echo RX Interval: The minimum interval, in microseconds, between received BFD Echo packets that this system is capable of supporting. If this value is zero, the transmitting system does not support the receipt of BFD Echo packets.
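To make the field list concrete, here is a minimal Python sketch that packs and parses a control packet with the fields described above. Because Figure 1 is not reproduced in this post, the exact bit positions are an assumption (3 bits of Vers and 5 bits of Diag in the first byte, then the H, D, P, and F flags in the top bits of the second byte). The class name, attribute names, and the default Detect Mult of 3 are hypothetical and chosen only for illustration.

```python
import struct
from dataclasses import dataclass

# Assumed wire layout (Figure 1 is not reproduced in the text):
#   byte 0: Vers (3 bits) | Diag (5 bits)
#   byte 1: H, D, P, F flags in the top 4 bits | Rsvd (4 bits, always zero)
#   byte 2: Detect Mult, byte 3: Length
#   then five 32-bit words: two discriminators and three intervals
_LAYOUT = struct.Struct("!BBBBIIIII")  # network byte order, 24 bytes total


@dataclass
class BfdControlPacket:
    vers: int = 0
    diag: int = 0
    h: bool = False
    d: bool = False
    p: bool = False
    f: bool = False
    detect_mult: int = 3  # illustrative default, not taken from the text
    my_discriminator: int = 0
    your_discriminator: int = 0
    desired_min_tx_us: int = 0
    required_min_rx_us: int = 0
    required_min_echo_rx_us: int = 0

    def pack(self) -> bytes:
        byte0 = ((self.vers & 0x07) << 5) | (self.diag & 0x1F)
        # Reserved bits (low nibble) stay zero on transmit.
        flags = (self.h << 7) | (self.d << 6) | (self.p << 5) | (self.f << 4)
        return _LAYOUT.pack(
            byte0, flags, self.detect_mult, _LAYOUT.size,
            self.my_discriminator, self.your_discriminator,
            self.desired_min_tx_us, self.required_min_rx_us,
            self.required_min_echo_rx_us,
        )

    @classmethod
    def unpack(cls, data: bytes) -> "BfdControlPacket":
        (byte0, flags, mult, _length, my_d, your_d,
         tx, rx, echo) = _LAYOUT.unpack(data[:_LAYOUT.size])
        return cls(
            vers=byte0 >> 5, diag=byte0 & 0x1F,
            h=bool(flags & 0x80), d=bool(flags & 0x40),
            p=bool(flags & 0x20), f=bool(flags & 0x10),  # reserved bits ignored
            detect_mult=mult, my_discriminator=my_d, your_discriminator=your_d,
            desired_min_tx_us=tx, required_min_rx_us=rx,
            required_min_echo_rx_us=echo,
        )


pkt = BfdControlPacket(h=True, my_discriminator=1, your_discriminator=2,
                       desired_min_tx_us=50_000, required_min_rx_us=50_000)
wire = pkt.pack()
print(len(wire), BfdControlPacket.unpack(wire) == pkt)  # 24 True
```

The intervals are stored in microseconds, matching the field descriptions, so 50 ms appears on the wire as 50,000.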

BFD Initial Session Setup

To better understand how BFD is implemented, consider an example. Imagine two routers, each of which runs EIGRP, connected over a common medium. Both routers have just started up, so no BFD session has been established. In each router, EIGRP informs the BFD process of the IP address of the neighbor that it needs to monitor. It is important to note that BFD does not discover its peers dynamically. It relies on the configured routing protocols to tell it which IP addresses to use and which peer relationships to form.

BFD on each router will form a BFD control packet. These packets are sent at intervals of at least one second until a BFD session is established. They may cross in transmission, although BFD is designed to adapt to this condition.

The initial packets from either side will be very similar: Vers, Diag, and the H, D, P, and F bits will all be set to zero. My Discriminator will be set to a value which is unique on the transmitting router; Your Discriminator is set to zero, because the BFD session has yet to be established. The values of the TX and RX timers will be set to the values found in the configuration of the device.

After the remote router receives a BFD control packet during the session initiation phase, it will copy the value of the “My Discriminator” field into its own “Your Discriminator” field and set the H (“I Hear You”) bit for any subsequent BFD control packets it transmits. Once both systems see their own Discriminators in each other’s control packets, the session is “officially” established. Both systems will continue to send at intervals of at least one second until they see the appropriate Discriminators in each other’s BFD control packets.

The Discriminator values can also be used to multiplex/demultiplex sessions if there are multiple BFD connections between a pair of BFD peers, or to allow the changing of an IP address on a BFD interface without causing the BFD session to be reset.
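The discriminator exchange described above can be sketched as a small simulation. This is only an illustration of the handshake logic, not an implementation of the protocol: the `Peer` class, the dictionary used as a stand-in for a control packet, and the `establish` helper are all hypothetical names, and both routers are assumed to transmit at the same instant so that their first packets cross.

```python
from dataclasses import dataclass


@dataclass
class Peer:
    name: str
    my_disc: int             # unique, nonzero value chosen locally
    your_disc: int = 0       # zero until the remote discriminator is learned
    session_up: bool = False

    def build_packet(self) -> dict:
        # The H bit is set once we have heard from the remote system.
        return {"my_disc": self.my_disc,
                "your_disc": self.your_disc,
                "h": self.your_disc != 0}

    def receive(self, pkt: dict) -> None:
        # Copy the sender's My Discriminator into our Your Discriminator.
        self.your_disc = pkt["my_disc"]
        # Seeing our own discriminator reflected back completes our side.
        if pkt["your_disc"] == self.my_disc:
            self.session_up = True


def establish(a: Peer, b: Peer, max_rounds: int = 10) -> int:
    """Exchange packets until both sides are up; return the rounds needed."""
    for round_no in range(1, max_rounds + 1):
        pkt_a, pkt_b = a.build_packet(), b.build_packet()  # packets cross
        b.receive(pkt_a)
        a.receive(pkt_b)
        if a.session_up and b.session_up:
            return round_no
    raise RuntimeError("session did not come up")


print(establish(Peer("A", my_disc=11), Peer("B", my_disc=22)))  # 2
```

In the first round each side only learns the other's discriminator; in the second round each side sees its own value reflected back, so the session comes up after two exchanges.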

Figure 2 illustrates the initial BFD session setup.

 



Concurrent with the exchange of control packets to establish the BFD session, BFD timers are also negotiated. The negotiation of the initial BFD timers is somewhat anomalous, because—unlike the subsequent timer changes—it occurs without the exchange of Poll and Final (P and F) bits. The P and F bits are used to ensure that the remote device received the packet requesting the timer change. However, this exchange is not required during initial session setup, as the fact that the remote device changed the value of “Your Discriminator” and set the H bit in subsequent packets is sufficient to ensure that it received the currently requested timer values. The next section of this document will discuss the details of timer negotiation.

BFD Timer Negotiation

The process of BFD timer negotiation between two BFD devices is a very simple one, and occurs in a few steps. A device needs to ensure three things before it can negotiate a BFD timer:

·         That its peer device saw the packet containing the local device’s proposed timers
·         That it never sends BFD control packets faster than the peer is willing to receive them
·         That the peer never sends BFD control packets faster than the local system is willing to receive them

As mentioned earlier, the setting of “Your Discriminator” and the H bit are sufficient to allow the local device to know that the remote device has seen its packets during initial timer exchange. Once these timers have been negotiated, they can be renegotiated at any time during the session without causing a session reset. The existing timers are maintained during the negotiation period, and the new timers do not take effect until they are acknowledged via a Poll bit and Final bit exchange.

The device that changed its timers will set the P bit on all subsequent BFD control packets, until it receives a BFD control packet with the F bit set from the remote system. This exchange of bits guards against packets that might otherwise be lost in transit. It is extremely important to note that the setting of the F bit by the remote system does not imply that it accepts the newly proposed timers. It merely indicates that the remote system has seen the packets in which the timers were changed.

How, then, are the timers actually negotiated? Each system, upon receiving a BFD control packet, compares the “Required Min RX Interval” in that packet to its own “Desired Min TX Interval” and uses the greater (slower) of the two values as the transmission interval for its BFD packets. Thus, the slower of the two systems determines the transmission rate.
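The comparison rule amounts to taking a maximum. A minimal sketch, with a hypothetical function name and illustrative values in microseconds (the unit used by the packet fields):

```python
def negotiated_tx_interval_us(local_desired_min_tx_us: int,
                              remote_required_min_rx_us: int) -> int:
    """Return the interval at which the local system transmits.

    The slower (larger) of the local Desired Min TX Interval and the
    peer's Required Min RX Interval wins.
    """
    return max(local_desired_min_tx_us, remote_required_min_rx_us)


# A peer that wants to send every 1 ms is held back by a receiver
# that can only accept a packet every 50 ms.
print(negotiated_tx_interval_us(1_000, 50_000))    # 50000
# A peer that prefers to send slowly keeps its own, larger interval.
print(negotiated_tx_interval_us(100_000, 50_000))  # 100000
```

Each peer evaluates this independently using the values it received from the other side.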

Because this comparison is performed independently by each peer, it is possible to have asymmetric transmission rates on the link. That is, one peer may send BFD control packets more frequently than the other peer sends them in the opposite direction. Figure 3 illustrates both Poll/Final bit usage and timer negotiation:



 Figure 3. BFD Timer Negotiation

Figure 3 represents the “worst-case scenario” for BFD because Router A proposes radically different timers than already exist; moreover, it loses a packet when it suggests the change. Here is what occurs during this scenario:

·         Router A and Router B both start in a steady state, with agreed-upon timers of 50 ms in both directions.

·         Router A wishes to change its timers to transmit at 25 ms and receive at 150 ms. It sends a BFD control packet with the P bit set. Unfortunately, this packet is lost in transit.

·         Router B would have continued to send BFD control packets at 50 ms intervals during this exchange. This is not illustrated in Figure 3.

·         After another 50 ms, Router A resends its request to change the timers. Again it sets the Poll bit. Remember, because the new timers are not in effect yet, Router A must continue to honor the existing timers. The retransmission thus occurs at the 50 ms interval.

·         Router B sees the packet this time and compares the requested RX interval to its own TX interval. The requested RX interval is larger, so Router B throttles back to sending BFD control packets at 150 ms intervals.

·         Router A receives the packet with the F bit set. Router B's advertised timers are still set at 50 ms and 50 ms. Router A compares Router B's Required Min RX Interval (50 ms) to its own Desired Min TX Interval of 25 ms. The Required Min RX Interval is larger, so Router A continues to send at 50 ms intervals.

·         The timer negotiation is complete: Router A sends at 50 ms intervals, while Router B sends at 150 ms intervals.
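The scenario above can be replayed with a few lines of Python. This is a sketch of the arithmetic only; the `Timers` class and both helper functions are hypothetical, and the list of delivery outcomes simply encodes "first Poll lost, second Poll delivered".

```python
from dataclasses import dataclass


@dataclass
class Timers:
    desired_min_tx_ms: int
    required_min_rx_ms: int


def tx_interval_ms(local: Timers, remote: Timers) -> int:
    # The slower of local desired TX and remote required RX wins.
    return max(local.desired_min_tx_ms, remote.required_min_rx_ms)


def polls_until_final(delivered: list[bool]) -> int:
    """Count Poll packets sent until one reaches the peer (which answers with F)."""
    for count, arrived in enumerate(delivered, start=1):
        if arrived:
            return count
    raise RuntimeError("no Poll packet reached the peer")


old_a, old_b = Timers(50, 50), Timers(50, 50)   # steady state
new_a = Timers(desired_min_tx_ms=25, required_min_rx_ms=150)

polls = polls_until_final([False, True])        # first Poll is lost
# Retransmissions honor the old timers, since the new ones are not in effect.
retransmit_delay_ms = (polls - 1) * tx_interval_ms(old_a, old_b)

a_tx = tx_interval_ms(new_a, old_b)             # max(25, 50)
b_tx = tx_interval_ms(old_b, new_a)             # max(50, 150)
print(polls, retransmit_delay_ms, a_tx, b_tx)   # 2 50 50 150
```

The result matches the final bullet: Router A ends up sending every 50 ms and Router B every 150 ms, with one 50 ms retransmission delay caused by the lost packet.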

While the ability to negotiate timers does provide some configuration flexibility, it is anticipated that initial BFD deployments will use identical timer configurations on BFD peers sharing the same media types. Still, timer negotiation does provide some protection against misconfiguration.

Even if one peer sets an absurdly low TX or RX timer, the value will be negotiated upwards by a correctly configured peer.

It is also worth noting that—even though the timers have been negotiated to new values—the actual values in the BFD packets remain at the locally configured settings. For example, although Router B is transmitting at 150ms, an inspection of Router B’s BFD control packet would show its Desired Min TX Interval still set to 50ms. Only an internal timer on the device has changed.

The Detect Multiplier is also communicated in the BFD control packets, but is not negotiated, so it is possible to have different detect-timer values at either side of the BFD session.

BFD Failure Detection 

Once the BFD session and appropriate timers have been negotiated, the BFD peers will send BFD control packets to each other at the negotiated interval. As previously mentioned, this assumes BFD Asynchronous mode; BFD Demand mode functions differently. These control packets function as a heartbeat, very similar to an IGP HELLO protocol, except at a more accelerated rate. As long as each BFD peer receives a BFD control packet within the detect-timer period, the BFD session remains up and any routing protocol associated with BFD maintains its adjacencies. If a BFD peer does not receive a control packet within the detect interval, it informs any clients of that BFD session (i.e. any routing protocols) about the failure. It is up to the routing protocol to determine the appropriate response to that information. The typical response will be to terminate the routing protocol peering session and reconverge, bypassing the failed peer.

The preceding information brings up three important points:

·         BFD is a liveness detection protocol; it does not, in itself, determine the correct reaction to a detected failure.

·         BFD can be used at any protocol layer. It could, for example, detect Physical or Data Link layer failures, if the existing mechanisms did not provide sufficiently speedy detection.

If a BFD device fails to receive a BFD control packet within the detect-timer [(Required Minimum RX Interval) * (Detect Multiplier)], then it informs its client protocol that a failure has occurred. Each time a BFD device successfully receives a BFD control packet on a BFD session, the detect timer for that session is reset to zero. Thus, the failure detection is dependent upon received packets, and is independent of when the receiver last transmitted a packet.

This is illustrated in Figure 4: 



In its next BFD control packet, Router A will set the diagnostic field to a value which indicates why the session was taken down. In this case, the diagnostic will be 1: Control Detection Time Expired. Diagnostics are useful to differentiate between real failures and administrative actions. For example, if the network administrator disabled BFD for this session, the diagnostic would be 7: Administratively Down.
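The detect-timer behavior and the resulting diagnostic can be sketched as follows. This is an illustrative model, not vendor code: the `SessionMonitor` class is hypothetical, time is passed in explicitly as milliseconds so the example is deterministic, and the 50 ms interval with a multiplier of 3 is an assumed configuration.

```python
DIAG_NO_DIAGNOSTIC = 0
DIAG_CONTROL_DETECTION_TIME_EXPIRED = 1


class SessionMonitor:
    """Tracks received control packets for one BFD session."""

    def __init__(self, interval_ms: int, detect_mult: int, now_ms: int = 0):
        self.detect_time_ms = interval_ms * detect_mult
        self.last_rx_ms = now_ms
        self.up = True
        self.diag = DIAG_NO_DIAGNOSTIC

    def packet_received(self, now_ms: int) -> None:
        # Every received control packet restarts the detect timer.
        self.last_rx_ms = now_ms

    def check(self, now_ms: int) -> bool:
        """Return True while the session is still considered up."""
        if self.up and now_ms - self.last_rx_ms >= self.detect_time_ms:
            self.up = False
            self.diag = DIAG_CONTROL_DETECTION_TIME_EXPIRED
        return self.up


monitor = SessionMonitor(interval_ms=50, detect_mult=3)  # 150 ms detect time
for t in (50, 100):                # two heartbeats arrive, then silence
    monitor.packet_received(t)

print(monitor.check(200))          # True: only 100 ms since the last packet
print(monitor.check(250))          # False: 150 ms have elapsed
print(monitor.diag)                # 1
```

Note that the monitor never looks at when the local side last transmitted; only received packets influence the timer.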

BFD DEPLOYMENT: Deployment Alternatives

When deploying any protocol or IP functionality, it is appropriate to consider all the alternatives, and be aware of any trade-offs being made. The closest alternative to BFD in conventional EIGRP deployments is the use of modified hello and hold timers. By setting EIGRP hello and hold timers to their absolute minimums, EIGRP can reduce its failure detection time to within the 1-2 second range.

There are several advantages to BFD over the reduced timer mechanism:

Because BFD is not tied to any particular routing protocol, it can be used as a generic and consistent failure detection mechanism for OSPF, IS-IS, EIGRP, and BGP.

Because some parts of BFD can be distributed to the data plane, it can be less CPU-intensive than reduced timers, which exist wholly at the control plane.

Reduced EIGRP timers have an absolute minimum detection timer of 1-2 seconds; BFD can provide sub-second failure detection.

BFD also shares some common caveats with reduced EIGRP timers:

BFD can potentially generate false alarms—signaling a link failure when one does not exist. Because the timers used for BFD are so tight, a brief interval of data corruption or queue congestion could potentially cause BFD to miss enough control packets to allow the detect-timer to expire. While the transmission of BFD control packets is managed by giving them the highest possible queue priority, little can be done about prioritizing incoming BFD control packets.

BFD will consume some CPU resources, although many optimizations have been made to ensure the CPU usage is minimal. On non-distributed platforms, in-house testing has shown a minor 2% CPU increase (above baseline) when supporting one hundred concurrent BFD sessions. On distributed platforms, there is no impact on the main Route Processor CPU, except during BFD session setup and teardown. It is important to note that, because of this accelerated handling of BFD control packets, all output features are bypassed. Users cannot, for example, filter or apply Quality of Service (QoS) to transmitted BFD packets.

Using BFD as Part of a Network Redundancy Plan

BFD can be an important part of an overall network redundancy plan, including other innovations like Nonstop Forwarding (NSF) and the Hot Standby Router Protocol (HSRP). BFD should be deployed in those sections of the network where subsecond failure detection is required, but cannot be provided by traditional Layer 2 mechanisms.

Figure 5 illustrates an example of BFD usage.

 



Even though there is an alternate path available between Routers A and F, a failure on Router E will be hidden from its neighbors (Routers C, D, and F) by the presence of the Layer 2 switches. The switches maintain Layer 2 connectivity for the neighbor routers, which causes them to fall back on using the timers in the Layer 3 HELLO protocol to detect the failure. At the EIGRP default timer settings, this could take up to 15 seconds. In contrast, if Routers C, D, E, and F were all running BFD, all of Router E’s neighbors could detect Router E’s failure in less than a second, and immediately begin the reconvergence to the A->B->D->F path.

Although BFD can be used in other deployment scenarios, the L2 switch example is one of the most common and difficult to solve without BFD. It is worth noting that EIGRP can provide nearly instantaneous convergence through the use of feasible successor routes, which are essentially precomputed backup routes. However, this solves a different problem from the initial failure detection provided by BFD. So, while EIGRP is the fastest converging of all the IGPs, BFD still provides significant value.
