Internet-Draft | EVPN Network Layer Fault Management | September 2024 |
Govindan, et al. | Expires 29 March 2025 | [Page] |
This document specifies proactive, in-band network layer OAM (RFC 9062) mechanisms to detect loss of continuity faults that affect unicast and multi-destination paths (used by Broadcast, Unknown Unicast, and Multicast traffic) in an Ethernet VPN (EVPN, RFC 7432bis) network. The mechanisms specified in this document use the widely adopted Bidirectional Forwarding Detection (RFC 5880) protocol.¶
This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79.¶
Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet-Drafts is at https://datatracker.ietf.org/drafts/current/.¶
Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."¶
This Internet-Draft will expire on 29 March 2025.¶
Copyright (c) 2024 IETF Trust and the persons identified as the document authors. All rights reserved.¶
This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (https://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Revised BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Revised BSD License.¶
[RFC9062] outlines the OAM requirements of Ethernet VPN (EVPN) [rfc7432bis]. This document specifies mechanisms for proactive fault detection at the network (overlay) layer of EVPN, that is to say between Provider Edge (PE) nodes, as described in Section 2.3 of [RFC9062]. The mechanisms specified in this document use the widely adopted Bidirectional Forwarding Detection (BFD, [RFC5880] [RFC5881] [RFC5882] [RFC5883] [RFC5884]) protocol, which is a lightweight in-band protocol using fixed length messages suitable for implementation in hardware, and other protocols as necessary. EVPN service restoration mechanisms (redundancy and recovery/convergence) are the most logical clients, in the [RFC5882] sense, for BFD sessions specified herein.¶
EVPN fault detection mechanisms need to consider unicast traffic separately from Broadcast, Unknown Unicast, and Multicast (BUM) traffic: the two map to different Forwarding Equivalence Classes (FECs) in EVPN and may therefore follow different paths. Hence this document specifies different continuity fault detection mechanisms, depending on the type of traffic and the type of tunnel used, as follows (see also Section 2.3 of [RFC9062]):¶
Packet loss and packet delay measurement are out of scope for this document. See [ietf-bmwg-evpntest] for EVPN benchmarking guidance.¶
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in BCP 14 [RFC2119] [RFC8174] when, and only when, they appear in all capitals, as shown here.¶
The following acronyms are used in this document.¶
This document specifies BFD-based mechanisms for proactive fault detection at the Network Layer (as specified in Section 2.3 of [RFC9062]) for MPLS based EVPN (as specified in [rfc7432bis]) and also for EVPN using VXLAN encapsulation [RFC8365]. Specifically, this document covers the following:¶
This document does not discuss BFD mechanisms for:¶
This document specifies procedures for BFD asynchronous mode. BFD demand mode is outside the scope of this specification except as it is used in [RFC8563]. The use of the BFD Echo function is outside the scope of this specification.¶
The following considerations motivated the use of BFD at the network layer of the OAM model for EVPN (Section 2.3 of [RFC9062]):¶
BFD testing between EVPN PE nodes does not guarantee that the EVPN service is functioning. Service operation can instead be monitored at the service level, that is, CE (Customer Edge) to CE (Section 2.2 of [RFC9062]), as shown in Figure 1, which is taken from Figure 1 of [RFC9062]. For example, an egress EVPN PE could recognize the EVPN labeling received and correctly process BFD packets but switch data to incorrect interfaces. However, BFD testing in the EVPN Network Layer does provide additional confidence that data transported using those tunnels will reach the expected egress node. When BFD testing in the EVPN overlay fails, that can be used as an indication of a Loss-of-Connectivity defect in the EVPN underlay that would cause EVPN service failure.¶
The mechanisms specified in BFD for MPLS LSPs [RFC5884] [RFC7726] and BFD for VXLAN [RFC8971] are, except as otherwise provided herein, applied to test loss of continuity for unicast EVPN traffic. Note that this includes the following provision of [RFC5884]:¶
Note that once the BFD session for the MPLS LSP is UP, either end of the BFD session MUST NOT change the source IP address and the local discriminator values of the BFD Control packets it generates, unless it first brings down the session.¶
The MPLS control plane can be verified against the data plane as specified in [RFC8029]. When the discriminators required for de-multiplexing the BFD sessions are not otherwise available, for example by configuration, they can be advertised through BGP using the BFD Discriminator Attribute [RFC9026]. Discriminators are needed for MPLS since the label stack does not contain enough information to identify the sender of the packet.¶
The usage of different MPLS entropy labels [RFC6790] or different VXLAN source ports takes care of the requirement to monitor various paths of the multi-path provider network. Each unique realizable path between the participating PE nodes MAY be monitored separately when such entropy is used. At least one path of multi-path connectivity between two PE nodes MUST be tracked with BFD; if only one path is tracked, the granularity of fault detection will be coarser.¶
To support unicast fault management with BFD packets sent to a PE node, that PE node MUST allocate or be configured with a BFD discriminator to be used as Your Discriminator (Section 4.1 of [RFC5880]) in the BFD messages to it. By default, a PE node advertises this discriminator with BGP using the BFD Discriminator Attribute [RFC9026] with BFD Mode TBD2 in an EVPN Ethernet Autodiscovery Route [rfc7432bis] or MAC/IP Advertisement Route as long as it advertises it in at least one route. It extracts its peer's discriminator from such an attribute. However, these discriminators MAY be exchanged out-of-band or through some other mechanism outside the scope of this document.¶
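To illustrate where the advertised discriminator ends up on the wire, the following minimal Python sketch packs the 24-byte mandatory section of a BFD Control packet (Section 4.1 of [RFC5880]) with the peer's advertised value in the Your Discriminator field. The function name and the default timer values are assumptions made for illustration; they are not defined by this document.

```python
import struct

BFD_VERSION = 1
STATE_DOWN = 1  # RFC 5880 state codes: 0 AdminDown, 1 Down, 2 Init, 3 Up


def pack_bfd_control(my_disc, your_disc, state=STATE_DOWN, detect_mult=3,
                     desired_min_tx_us=1_000_000,
                     required_min_rx_us=1_000_000):
    """Pack the mandatory section of a BFD Control packet (no auth section)."""
    vers_diag = (BFD_VERSION << 5) | 0   # version in top 3 bits, diagnostic 0
    state_flags = state << 6             # P, F, C, A, D, M bits all clear
    length = 24                          # mandatory section only
    required_min_echo_rx_us = 0          # Echo function is not used here
    return struct.pack("!BBBBIIIII", vers_diag, state_flags, detect_mult,
                       length, my_disc, your_disc, desired_min_tx_us,
                       required_min_rx_us, required_min_echo_rx_us)


# Hypothetical values: 0x11 is the local discriminator, 0x22 is the value
# the peer advertised (for example in a BFD Discriminator Attribute).
packet = pack_bfd_control(my_disc=0x11, your_disc=0x22)
```

The peer's discriminator occupies bytes 8 through 11 of the packet, which is the field the receiving PE uses to de-multiplex the session.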
Once a PE node knows a unicast route and discriminator for another PE node and if it is the higher priority of the two PE nodes to initiate BFD and is configured to do so, it endeavors to bring UP and maintain a BFD session to that other PE node. The BFD session is brought down if a PE node is no longer configured to maintain it or if a route and discriminator are no longer available.¶
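The conditions in the preceding paragraph can be sketched as two predicates. This is a minimal Python sketch under stated assumptions: the PeerState record and its field names are hypothetical, and the outcome of the priority comparison is taken as an input because the comparison rule itself is not modeled here.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PeerState:
    """Hypothetical per-peer record kept by a PE node."""
    configured: bool                   # local configuration enables BFD to this peer
    has_unicast_route: bool            # a unicast route to the peer is known
    peer_discriminator: Optional[int]  # learned via BGP or out of band
    local_has_priority: bool           # result of the priority comparison (assumed input)


def prerequisites_met(peer: PeerState) -> bool:
    """Configuration, route, and discriminator are all available."""
    return (peer.configured and peer.has_unicast_route
            and peer.peer_discriminator is not None)


def should_initiate(peer: PeerState) -> bool:
    """True if this node should try to bring the session UP."""
    return prerequisites_met(peer) and peer.local_has_priority


def must_bring_down(peer: PeerState) -> bool:
    """True if an existing session has to be brought down."""
    return not prerequisites_met(peer)
```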
Section 5.1 below discusses BUM traffic fault detection for P2P and MP2P tunnels using ingress replication and Section 5.2 discusses such fault detection for P2MP tunnels. In both cases the following provision of [RFC5884] applies:¶
Note that once the BFD session for the MPLS LSP is UP, either end of the BFD session MUST NOT change the source IP address and the local discriminator values of the BFD Control packets it generates, unless it first brings down the session.¶
Ingress replication (see Section 11 of [rfc7432bis]) uses separate P2P or MP2P tunnels for transporting BUM traffic from the ingress PE (head) to a set of one or more egress PEs (tails). The fault detection mechanism specified by this document takes advantage of the fact that the head makes a unique copy for each tail.¶
Another key aspect to be considered in EVPN is the advertisement of the Inclusive Multicast Ethernet Tag Route (see Section 7.3 of [rfc7432bis]). The BUM traffic flows from a head node to a particular tail only after the head receives such an inclusive multicast route from the tail. This route contains the BUM EVPN MPLS label (downstream allocated) corresponding to the MP2P tunnel for MPLS encapsulation and contains the IP address of the PE originating the inclusive multicast route for use in VXLAN encapsulation. It also contains a BFD Discriminator Attribute [RFC9026] with BFD Mode TBD2 giving the BFD discriminator that will be used by the tail unless this information has been otherwise distributed. This is the P2P mode BFD Discriminator Attribute since a P2P BFD session is used in both the P2P and MP2P cases with ingress replication.¶
There MAY exist multiple BFD sessions between a head PE and an individual tail due to (1) the usage of MPLS entropy labels [RFC6790] or VXLAN source port numbers for an inclusive multicast FEC and (2) multiple MP2P tunnels indicated by different tail labels for MPLS or different IP addresses for VXLAN encapsulation. If a PE node is configured to do so, once it knows a multicast route and discriminator for another PE node, it endeavors to bring UP and maintain a BFD session to that other PE node. The BFD session is brought down if a PE node is no longer configured to maintain it or if a route and discriminator are no longer available.¶
Fault detection for BUM traffic distributed using a P2MP tunnel uses BFD Multipoint Active Tails [RFC8563] in one of the three methods providing head notification. Which method is used depends on the local configuration. Sections 5.2.2 and 5.2.3 of [RFC8563] describe two of these methods ("Head Notification and Tail Solicitation with Multipoint Polling" and "Head Notification with Composite Polling"). The third method ("Head Notification without Polling") is touched on in Section 5.2.1 of [RFC8563] and fully specified in [ietf-mpls-p2mp-bfd]. All these three modes assume the existence of a unicast return path from each tail to the head. In addition, Head Notification with Composite Polling assumes a head to tail unicast path disjoint from the path used by the P2MP tunnel.¶
The BUM traffic flows from a head node to the tails after the head transmits an Inclusive Multicast Ethernet Tag Route [rfc7432bis] if local configuration so directs. This route contains the BUM EVPN MPLS label (upstream allocated) corresponding to the P2MP tunnel for MPLS encapsulation. The route also includes a BFD Discriminator Attribute [RFC9026] with the BFD Mode set to 1 and a Source IP Address TLV, which gives the address associated with the MultiPoint Head of the P2MP session. This BFD discriminator advertised by the head in the Inclusive Multicast route or otherwise configured at or communicated to a tail MUST be used in any reverse BFD control message as Your Discriminator so the head can determine the tail of which P2MP BFD session is responding. If a PE node is configured to do so, once it knows a P2MP multicast route and the needed discriminators, it brings UP and maintains a P2MP BFD active tails session to the tails. The BFD session is brought down if a PE node is no longer configured to maintain it or the multicast route and discriminators are no longer available.¶
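The role of Your Discriminator in reverse messages can be sketched as a lookup at the head. The following Python sketch is illustrative only: the HeadDemux class, its method names, and the use of the tail's source address as a key are assumptions, not structures defined by this document or by [RFC8563].

```python
from typing import Dict, List, Set


class HeadDemux:
    """Hypothetical head-side table: advertised discriminator -> responding tails."""

    def __init__(self) -> None:
        self._tails: Dict[int, Set[str]] = {}

    def add_session(self, head_discriminator: int) -> None:
        """Register a P2MP BFD session by the discriminator the head advertised."""
        self._tails.setdefault(head_discriminator, set())

    def on_tail_packet(self, your_discriminator: int, tail_address: str) -> bool:
        """Match a reverse BFD Control packet to a session; False if unknown."""
        tails = self._tails.get(your_discriminator)
        if tails is None:
            return False  # no session advertised with this discriminator
        tails.add(tail_address)
        return True

    def tails_of(self, head_discriminator: int) -> List[str]:
        """Return the tails that have responded for a session, sorted."""
        return sorted(self._tails.get(head_discriminator, set()))


demux = HeadDemux()
demux.add_session(0xA1)  # hypothetical discriminator value
accepted = demux.on_tail_packet(0xA1, "192.0.2.2")
rejected = demux.on_tail_packet(0xB2, "192.0.2.3")
```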
For MPLS encapsulation of the head to tails BFD, Label Switched Multicast is used. For VXLAN encapsulation, BFD is delivered to the tails through underlay multicast using an outer multicast IP address.¶
The following subsections describe the MPLS and VXLAN encapsulations of BFD for EVPN network layer fault management:¶
This section describes use of the Generic Associated Channel Label (GAL, [RFC5586]) for BFD encapsulation in MPLS-based EVPN network layer fault management. Since the use of BFD specified in this document is encapsulated between PEs, it is treated as single hop and uses the single hop BFD port number [RFC5881].¶
As shown in Figure 2, the packet contains the following labels in the order given: LSP label (transport), optionally an entropy label, the EVPN Unicast label, and then the Generic Associated Channel label with the G-ACh type set to TBD1. The G-ACh payload of the packet MUST contain the destination L2 header (in overlay space) followed by the IP header that encapsulates the BFD packet. The source MAC address of the inner packet can be used to validate the <EVI, MAC> in the receiving node.¶
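The label order described above can be sketched in Python as follows. The 4-byte label stack entry layout and the GAL value 13 come from the MPLS specifications and [RFC5586]; the function names, the TTL and traffic class values, and the channel type passed in (a stand-in for the unassigned TBD1) are placeholders chosen for illustration. The Entropy Label Indicator that [RFC6790] places before an entropy label is not modeled.

```python
import struct
from typing import Optional

GAL = 13  # Generic Associated Channel Label (RFC 5586)


def label_entry(label: int, bottom: bool, ttl: int = 255, tc: int = 0) -> bytes:
    """Encode one MPLS label stack entry: label(20) | TC(3) | S(1) | TTL(8)."""
    if not 0 <= label < (1 << 20):
        raise ValueError("label must fit in 20 bits")
    return struct.pack("!I", (label << 12) | (tc << 9) | (int(bottom) << 8) | ttl)


def unicast_bfd_stack(transport: int, evpn_unicast: int, channel_type: int,
                      entropy: Optional[int] = None) -> bytes:
    """Build the label stack plus the 4-byte Associated Channel Header."""
    labels = [transport]
    if entropy is not None:
        labels.append(entropy)        # optional entropy label (ELI omitted)
    labels += [evpn_unicast, GAL]     # GAL is last, so it carries the S bit
    stack = b"".join(label_entry(value, bottom=(i == len(labels) - 1))
                     for i, value in enumerate(labels))
    # ACH: first nibble 0001, version 0, reserved 0, 16-bit channel type
    ach = struct.pack("!BBH", 0x10, 0x00, channel_type)
    return stack + ach


# All label values and the channel type 0x1234 are hypothetical.
header = unicast_bfd_stack(transport=1000, evpn_unicast=2000, channel_type=0x1234)
```

The destination L2 header, IP header, and BFD packet would follow these bytes as the G-ACh payload.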
When ingress replication is used, a packet contains the following labels in the order given: LSP label (transport), optionally an entropy label, the BUM label, and the split horizon label [rfc7432bis] (where applicable). The G-ACh type is set to TBD1. The G-ACh payload of the packet is as described in Section 6.1.1 except that the destination MAC address, if not that of the destination PE node, is the dedicated multicast MAC TBD4.¶
When Label Switched Multicast is used, the encapsulation is the same as in Section 6.1.2 for ingress replication except that the transport label identifies the P2MP tunnel, in effect the set of tail PEs, rather than identifying a single destination PE at the end of an MP2P tunnel.¶
This section describes the use of VXLAN [RFC7348] [RFC8365] for BFD encapsulation in VXLAN-based EVPN fault management.¶
Figure 3 below shows the unicast VXLAN encapsulation on the wire on an Ethernet link. The outer and inner IP headers have a unicast source and destination IP address, both IPv4 or both IPv6 in each header, that are the addresses of the PE nodes that are the BFD message source and destination. The source port number MAY be varied as a source of entropy. If the BFD source has multiple IP addresses, whether multiple IPv4 addresses, multiple IPv6 addresses, or a mixture thereof, entropy MAY be further obtained by using any of those addresses assuming the destination has a same version IP address and the source is prepared for responses directed to the IP address used.¶
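As a rough illustration of varying the source port for entropy, the Python sketch below builds the 8-byte VXLAN header of [RFC7348] and maps a path index to a UDP source port in the dynamic range 49152-65535. The helper names and the index-to-port mapping are assumptions; which underlay path a given source port actually selects depends on how the provider network hashes flows.

```python
import struct

VXLAN_UDP_PORT = 4789  # IANA-assigned UDP destination port for VXLAN


def vxlan_header(vni: int) -> bytes:
    """Build the VXLAN header: flags with the I bit set, then the 24-bit VNI."""
    if not 0 <= vni < (1 << 24):
        raise ValueError("VNI must fit in 24 bits")
    return struct.pack("!II", 0x08 << 24, vni << 8)


def entropy_source_port(path_index: int, base: int = 49152,
                        span: int = 16384) -> int:
    """Map a hypothetical path index to a UDP source port in 49152-65535."""
    return base + (path_index % span)


# One (source port, destination port) pair per monitored path; VNI 100 and
# the four path indexes are illustrative values only.
header = vxlan_header(100)
port_pairs = [(entropy_source_port(i), VXLAN_UDP_PORT) for i in range(4)]
```

Each distinct source port would correspond to a separate BFD session if each path is monitored individually.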
When VXLAN encapsulated ingress replication is used, the BFD packet construction is as given in Section 6.2.1 except as follows:¶
When VXLAN head-to-tails (P2MP) is used, the encapsulation is as given in Section 6.2.2 except as follows:¶
The mechanisms specified by this document could affect the packet load on the network and its elements especially when supporting configurations involving a large number of EVIs. The option of decreasing or increasing BFD timer values can be used by an administrator or a network management entity to maintain the overhead incurred due to fault monitoring at an acceptable level.¶
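The trade-off between overhead and detection speed can be estimated with simple arithmetic. The sketch below is a simplification under stated assumptions: a single negotiated interval per session, no transmit jitter, and function names invented for illustration. Detection time follows the asynchronous mode rule of [RFC5880], the detection multiplier times the agreed interval.

```python
def bfd_tx_rate_pps(sessions: int, tx_interval_ms: float) -> float:
    """Aggregate BFD Control packets per second sent by one node."""
    return sessions * 1000.0 / tx_interval_ms


def detection_time_ms(detect_mult: int, interval_ms: float) -> float:
    """Time without packets after which a session is declared down."""
    return detect_mult * interval_ms


# Hypothetical deployment: 1000 sessions, detection multiplier 3.
fast_rate = bfd_tx_rate_pps(1000, 100)      # 100 ms interval
slow_rate = bfd_tx_rate_pps(1000, 1000)     # 1 s interval
fast_detect = detection_time_ms(3, 100)
slow_detect = detection_time_ms(3, 1000)
```

With these illustrative numbers, lengthening the interval from 100 ms to 1 s cuts the transmit load from 10000 to 1000 packets per second while stretching detection time from 300 ms to 3 s.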
The following IANA Actions are requested.¶
IANA is requested to assign a channel type from the "Pseudowire Associated Channel Types" registry in [RFC4385] as follows.¶
Value   Description    Reference
-----   ------------   ---------------
TBD1    BFD-EVPN OAM   [this document]
IANA is requested to assign a value from the IETF Review range in the BFD Mode sub-registry on the Border Gateway Protocol Parameters Registry web page as follows:¶
Value   Description       Reference
-----   ---------------   ---------------
TBD2    P2P BFD Session   [this document]
IANA is requested to assign parallel multicast and unicast MAC addresses under the IANA OUI [0x01005E900101 and 0x00005E900101 suggested] as follows:¶
IANA Multicast 48-bit MAC Addresses

Address   Usage                    Reference
-------   ----------------------   ---------------
TBD3      EVPN Network Layer OAM   [this document]

IANA Unicast 48-bit MAC Addresses

Address   Usage                    Reference
-------   ----------------------   ---------------
TBD4      EVPN Network Layer OAM   [this document]
Security considerations discussed in [RFC5880], [RFC5883], and [RFC8029] apply.¶
MPLS security considerations [RFC5920] apply to BFD Control packets encapsulated in an MPLS label stack. When BFD Control packets are routed, the authentication considerations discussed in [RFC5883] should be followed.¶
VXLAN BFD security considerations in [RFC8971] apply to BFD packets encapsulated in VXLAN.¶
The authors wish to thank the following for their comments and suggestions:¶
Mach Chen, Jorge Rabadan, Alexander Vainshtein, and Mohammed Boucadair¶