Internet-Draft | taxonomy | July 2024 |
Joung, et al. | Expires 7 January 2025 | [Page] |
This draft is to facilitate the understanding of the data plane enhancement solutions, which are suggested currently or can be suggested in the future, for deterministic networking. This draft provides criteria for classifying data plane solutions. Examples of each category are listed, along with reasons where necessary. Strengths and limitations of the categories are described. The suitability of the solutions for various services of deterministic networking is also briefly mentioned. Reference topologies for evaluation of the solutions are given as well.¶
This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79.¶
Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet-Drafts is at https://datatracker.ietf.org/drafts/current/.¶
Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."¶
This Internet-Draft will expire on 7 January 2025.¶
Copyright (c) 2024 IETF Trust and the persons identified as the document authors. All rights reserved.¶
This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (https://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Revised BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Revised BSD License.¶
This draft is to facilitate the understanding of the data plane enhancement solutions, which are suggested currently or can be suggested in the future, for deterministic networking.¶
An enhancement solution can be a combination of multiple data plane functional entities, such as regulators, queues, and schedulers. A solution can also include functional entities across network nodes, e.g. traffic enforcement or regulation functions at the edge. A regulator, or equivalently a shaper, is defined as a functional entity that makes the arrival process of a flow conform to a predefined process. A packet scheduler, or simply a scheduler, is a functional entity that determines when a packet is transmitted.¶
We use the term taxonomy to mean a set of criteria for classifying the solutions. A category is a subset of solutions that a taxonomy classifies into a single group. This draft provides several taxonomies for classifying data plane solutions. These taxonomies are orthogonal to each other.¶
Examples of the categories are listed, along with reasons where necessary. Strengths and limitations of the categories are described.¶
The suitability of the solutions for various services of deterministic networking is also briefly mentioned. The services can be classified according to the flow characteristics and the performance requirements. For example, Requirements for Reliable Wireless Industrial Services [I-D.ietf-detnet-raw-industrial-req] characterizes the services by the latency bound, the burst size, the burst transmission period, the number of nodes, etc. This document adopts this characterization rule, and classifies each service as tight or loose latency, large or small burst, periodic or non-periodic, and large or small scale. For example, the display information service defined in Section 4.4 of [I-D.ietf-detnet-raw-industrial-req] is a loose latency, large burst, non-periodic, and small scale service.¶
The taxonomies described in this draft can be applied for the solutions of other standardization bodies, such as IEEE 802.1 TSN TG.¶
In this draft, the candidate solutions currently being proposed in the DetNet WG are simply listed without description. The details of the solutions are intentionally omitted. Interested readers may refer to the corresponding drafts. When necessary, solutions from the IEEE TSN TG or other well-known ones are used as examples to clarify the taxonomy and the derived categories.¶
The mechanisms raised in the DetNet WG are not entirely new concepts but rather variations of existing mechanisms. These deliberate approaches aim to address the scalability requirements defined in [I-D.ietf-detnet-scaling-requirements] while ensuring a degree of continuity and compatibility with the current practices. The taxonomy in this draft reflects how new mechanisms extend existing ones to address scalability challenges.¶
For instance, Cycle Specified Queuing and Forwarding (CSQF) [I-D.chen-detnet-sr-based-bounded-latency], Tagged Cyclic Queuing and Forwarding (TCQF) [I-D.eckert-detnet-tcqf], and IEEE 802.1Qdv Enhanced CQF (ECQF) are enhancements built upon the foundation of Cyclic Queuing and Forwarding (CQF). Similarly, Work Conserving Stateless Core Fair Queuing (C-SCORE) [I-D.joung-detnet-stateless-fair-queuing] is an extension of Fair Queuing (FQ). Timeslot Queuing and Forwarding (TQF) [I-D.peng-detnet-packet-timeslot-mechanism] is an extension of IEEE 802.1Qbv, also known as Time Aware Shaper (TAS). Earliest Deadline First (EDF) [I-D.peng-detnet-deadline-based-forwarding], as proposed to the DetNet WG, is a variation of the well-known mechanism of the same name. Other well-known mechanisms that could provide bounded latency are also covered, for example Deficit Round Robin (DRR) and Asynchronous Traffic Shaping (ATS) [IEEE_802.1Qcr].¶
Reference topologies (RTs) are also listed in this document. An RT consists of a network topology and flows' characteristics. The RTs listed in this document cover various topologies such as ring, mesh, and hybrid. The purpose of listing the RTs is to evaluate how the data plane solutions perform in real networks, in terms of E2E latency bound and jitter bound.¶
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in BCP 14 [RFC2119] [RFC8174] when, and only when, they appear in all capitals, as shown here.¶
Taxonomy based on the performance, such as E2E latency bounds and jitter bounds, is helpful to understand the solutions. The performance should be exhibited as a mathematical expression with the network and traffic parameters.¶
One possible taxonomy would be based on the per hop dominant factor for the latency bound. The dominant factor is defined as the largest sum term in the expression, when the network and traffic conditions are the worst. The worst condition typically means high network utilization, large packet and burst sizes, and large number of hops. Any existing solution can be put into one of three categories.¶
Category 1 (Max Packet Length/Service Rate): FQ and its variations like C-SCORE fall into this category, where the latency bound is primarily influenced by the ratio of a flow's maximum packet size to its allocated service rate. This category emphasizes individual flow isolation. Consequently, the E2E latency bound of a flow varies little as other flows join and leave. Therefore, this category performs well with dynamic flows. This category also fits services with large bursts, since the burst sizes of flows are not the dominant factor of the latency bound.¶
Category 2 (Sum of Max Packet Lengths/Capacity): Solutions like DRR belong here, where the dominant factor is the sum of maximum packet lengths of all DetNet flows over the total allocated bandwidth. This category typically has less implementation complexity than Category 1 but can impact individual flow isolation. The other flows' max packet lengths affect the latency bound, which can be altered as flows join and leave.¶
Category 3 (Sum of Max Burst Sizes/Capacity): CQF, TAS, their variations (including CSQF, TCQF, ECQF, TQF), and EDF fall into this category. The key influence on latency here is the total burst size of all DetNet flows relative to the network capacity. This category prioritizes bounded latency guarantees but may require tighter burst control mechanisms. If the burst is reduced to the level of a packet length, for example by an extremely strict regulation, then this category may be indistinguishable from Category 2. This category fits services for static flows with small bursts.¶
As an example, assuming the capacities and maximum packet lengths are identical in all the links along the path of a flow under observation, the E2E latency bound of the flow by FQ is given as the following [STILIADIS-LRS].¶
(B-L)/r + H(Lh/Rh + L/r), (1)¶
where B, L, and r are the maximum burst size of, the maximum packet length of, and the allocated service rate to the flow, respectively; H is the number of hops; Lh and Rh are the maximum packet length and the capacity of all the links.¶
In this example, the term (Lh/Rh + L/r) can be seen as the per hop latency, because the max burst size, B, appears only once. The service rate of a flow, r, is likely to be much less than the link capacity, Rh, while the maximum lengths L and Lh would not differ too much. Therefore, the dominant factor here is L/r.¶
The dominant factor determines the level of flow isolation, as well as the level of E2E latency bound value.¶
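As a rough illustration of expression (1), the following Python sketch evaluates the bound and compares the two per hop terms. The function name and the parameter values are assumptions chosen for illustration only (a 20 Mbps flow over 1 Gbps links); they are not taken from [STILIADIS-LRS].

```python
def fq_e2e_latency_bound(B, L, r, H, Lh, Rh):
    """Evaluate expression (1): (B - L)/r + H * (Lh/Rh + L/r), in seconds.

    B, L, and Lh are in bits; r and Rh are in bits per second; H is a hop count.
    """
    burst_term = (B - L) / r      # the burst is paid only once, independent of H
    per_hop = Lh / Rh + L / r     # per hop latency term
    return burst_term + H * per_hop


# Illustrative values only (assumed for this sketch).
B, L, r = 20_000, 2_000, 20e6     # 20 Kbit burst, 2 Kbit packet, 20 Mbps rate
H, Lh, Rh = 7, 10_000, 1e9        # 7 hops, 10 Kbit max packet, 1 Gbps links

bound = fq_e2e_latency_bound(B, L, r, H, Lh, Rh)
print(f"E2E latency bound: {bound * 1e3:.3f} ms")
print(f"L/r = {L / r * 1e6:.1f} us, Lh/Rh = {Lh / Rh * 1e6:.1f} us")
```

With these assumed values, L/r is an order of magnitude larger than Lh/Rh, which matches the observation that L/r is the dominant factor when r is much less than Rh.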
A taxonomy based on functional characteristics is key to understanding the solutions. The taxonomies listed in this section are orthogonal to each other, unless explicitly stated otherwise.¶
If a solution transmits packets in a periodic pattern, in which a packet is assigned to a time slot based on a predefined rule and a set of consecutive time slots repeated periodically, then the solution is periodic. Otherwise, the solution is non-periodic.¶
The set of consecutive time slots is called a period. Note that here we use the term period to avoid confusion with the term cycle used in CQF, which is equivalent to the time slot defined in this draft.¶
According to the above definition, IEEE 802.1Qbv TAS is a periodic solution. A finite Gate Control List (GCL) of TAS contains multiple gate control entries. Each entry represents a time slot with an assigned set of flows. A set of consecutive time slots forming a GCL is repeated periodically. Time slots can be overlapped with each other, as in ECQF.¶
TAS based solutions and CQF based solutions, for example CSQF, TCQF, ECQF, and TQF, are periodic solutions.¶
Periodic solutions may fit well to periodic services, and vice versa.¶
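The periodic repetition of time slots can be sketched as follows. This is a minimal illustration, assuming non-overlapping slots of known duration (the overlapping case mentioned for ECQF is not modeled); the function name and the slot durations are hypothetical.

```python
from bisect import bisect_right
from itertools import accumulate


def active_slot(t, slot_durations):
    """Return the index of the time slot that is active at time t.

    slot_durations lists the lengths of the consecutive, non-overlapping
    time slots that make up one period; the period repeats indefinitely.
    """
    ends = list(accumulate(slot_durations))  # end offset of each slot in a period
    period = ends[-1]
    offset = t % period                      # position of t inside its period
    return bisect_right(ends, offset)


# Three slots of 10, 20, and 30 microseconds: the period is 60 microseconds.
slots_us = [10, 20, 30]
print([active_slot(t, slots_us) for t in (0, 10, 59, 60, 125)])
```

Integer time units are used here to avoid floating point rounding at slot boundaries.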
According to whether network synchronization is required, a solution can be classified as either phase synchronous, frequency synchronous, or asynchronous.¶
Phase synchronous solutions require network nodes to be both phase and frequency synchronized. These solutions can be called strictly synchronous. TAS and CQF are in this category.¶
Frequency synchronous solutions require network nodes to be only frequency synchronized. Such nodes are often called syntonized. CQF variations and TAS variations are in this category, for example CSQF, TCQF, ECQF, TQF and so on.¶
Asynchronous solutions, for example ATS and EDF, may still require loose phase and frequency synchronization.¶
In non-synchronized networks, it has been shown that ignoring the timing inaccuracies can lead to network instability due to unbounded delay in per-flow or interleaved regulators [THOMAS-Sync]. However, the level of synchronization required is not high. The problem can be solved by adjusting the regulator parameters conservatively, even when loosely synchronized clocks are used. Thus, the solutions that require regulators such as ATS are categorized into asynchronous solutions.¶
The criteria to distinguish between synchronous and asynchronous solutions should be the level of required synchronization precision. One indicator suitable to such criteria would be the allowable Maximum Time Interval Error (MTIE). MTIE is usually calculated as the difference between the largest and smallest time differences in the ensemble of measurements. With this definition, a device that has an arbitrarily large and constant time difference with the standard reference has an MTIE value of 0, because MTIE is a measure of the evolution of the time difference, not the magnitude of the time difference itself. In this respect, the MTIE statistic is really a measure of the frequency offset between the device under test and the standard reference.¶
Therefore, the allowable MTIE value can be applied equivalently, for the precision level evaluation, to both phase synchronous and frequency synchronous solutions.¶
In a distributed system, the typical MTIE can be kept at the nanosecond level. However, the exact value of the allowable MTIE as an indicator for synchronous solutions is for further study. It is expected to be within tens of nanoseconds.¶
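The MTIE calculation described above can be sketched as follows. This follows the simplified description in this section (the spread of the measured time differences over a single ensemble), not a full windowed MTIE computation; the function name and sample values are assumptions.

```python
def mtie(time_differences):
    """Difference between the largest and smallest time differences.

    time_differences holds the measured offsets between the device under
    test and the reference, all in the same unit (nanoseconds here).
    """
    return max(time_differences) - min(time_differences)


# A large but constant offset of 5 ms (in ns): the offset does not evolve.
constant_offset_ns = [5_000_000] * 4
# A small offset that grows by 10 ns per measurement (frequency offset).
drifting_offset_ns = [0, 10, 20, 30]

print(mtie(constant_offset_ns))
print(mtie(drifting_offset_ns))
```

The constant offset yields an MTIE of 0, while the drifting clock yields a non-zero value, which illustrates why MTIE reflects the frequency offset rather than the magnitude of the time difference.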
Note that the taxonomy of network synchronization is closely related to the taxonomy of periodicity. However, these two can be used independently of each other.¶
This draft categorizes data plane solutions based on the granularity of their traffic control target, which refers to the size and specificity of the traffic entity they handle. Three granularity levels exist.¶
Flow level: Each packet is controlled based on its specific flow, which can usually be identified by the 5-tuple. Examples include FQ and its variations such as C-SCORE, which offer precise service differentiation but potentially require complex implementation.¶
Flow aggregate level: Flows are grouped by shared characteristics like traffic specification, service requirement, or routing path. This coarser level simplifies control but may offer less precise differentiation. Examples include interleaved regulators in ATS.¶
Class level: Flows are further grouped by similar service requirements, regardless of specific path or traffic details. This coarsest level simplifies control and accommodates traffic fluctuations but provides the least individual flow differentiation. Typically, time or time based information could be used for classification, such as in EDF, CQF and its variations.¶
For a solution at any level, packets within the same traffic entity receive the same treatment. For example, if a solution is flow aggregate level, then the packets within the same flow aggregate are treated identically, regardless of the flows they belong to.¶
There are cases in which a single solution consists of multiple functional entities that treat packets according to multiple traffic entities of different granularities. In such cases, it is defined that the functional entity with the coarsest granularity is dominant, thus the whole solution belongs to the coarsest granularity category.¶
For example, ATS consists of interleaved regulators (IRs) and a strict priority scheduler. An IR has a queue dedicated to a flow aggregate having the same class and the same input port. The regulation function itself is based on a flow. According to the definition above, IR is a flow aggregate level solution. On the other hand, the strict priority scheduler in ATS is class-based. Therefore, ATS as a whole is class level.¶
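The rule that the coarsest functional entity determines the granularity of the whole solution can be sketched as follows. The enumeration and function names are hypothetical; the ATS entries restate the classification given above.

```python
from enum import IntEnum


class Granularity(IntEnum):
    """Traffic granularity levels, ordered from finest to coarsest."""
    FLOW = 1
    FLOW_AGGREGATE = 2
    CLASS = 3


def solution_granularity(entities):
    """Return the coarsest granularity among the functional entities."""
    return max(entities.values())


ats = {
    "interleaved regulator": Granularity.FLOW_AGGREGATE,
    "strict priority scheduler": Granularity.CLASS,
}
print(solution_granularity(ats).name)
```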
A finer granularity solution has the benefit of more accurate service differentiation among flows. Its limitation is the greater implementation complexity. It fits services whose flows have various independent latency bound values.¶
Periodic solutions can further be categorized based on the traffic granularity. A time slot can be assigned per flow, per flow aggregate, or per class.¶
Note that TAS in 802.1Qbv is a scheduling mechanism defined in an output port with eight queues. The queues are controlled by a GCL and its gate control entries. Each queue can serve a class. In an entry, queues can be either open or closed. Thus, TAS can be seen as a class level solution. However, in many cases TAS is understood as a scheduling mechanism in which the number of queues is not limited to 8. There could be a natural extension, such as TQF, which enables Qbv to allocate one queue to each flow or flow aggregate.¶
Finer granularity periodic solutions have more strengths in jitter control. They also fit services with many periodic flows of independent period values.¶
A work conserving solution never idles when there is a packet to send [Fedorova].¶
A non-work conserving solution can idle even if there is a packet to send in the queue.¶
A solution can be a combination of multiple data plane functional entities, and each functional entity has its own attribute of work conserving or non-work conserving. A solution is non-work conserving, as long as any of the functional entities included in the solution has the non-work conserving attribute.¶
FIFO, round robin schedulers, FQ and its variations like C-SCORE are examples of work conserving solutions. TAS, CQF, ATS, and their variations, for example CSQF, TCQF, ECQF, and TQF, are non-work conserving solutions. EDF can be operated either as work conserving or non-work conserving.¶
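The composition rule above (a solution is non-work conserving as soon as one of its functional entities is) can be sketched as follows. The function name and the entity labels are hypothetical; the attribute values for ATS assume that its regulator is the non-work conserving entity and that its strict priority scheduler is work conserving.

```python
def is_work_conserving(entities):
    """A solution is work conserving only if every functional entity is."""
    return all(entities.values())


# Maps each functional entity to True if it is work conserving.
fifo = {"FIFO queue": True}
ats = {
    "interleaved regulator": False,     # may hold a packet until it is eligible
    "strict priority scheduler": True,
}

print(is_work_conserving(fifo))
print(is_work_conserving(ats))
```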
Work conserving solutions have strengths in terms of average delay. They usually show smaller observed maximum latencies than the theoretical latency bound expressions suggest. They also benefit from the statistical multiplexing gain without any wasted capacity, thus leaving more room for best effort traffic.¶
Non-work conserving solutions are good at avoiding burst accumulation and are also beneficial for jitter control. The burst size of a flow can be kept similar to, or the same as, its initial burst size. Therefore, the necessary buffer size is typically smaller than that in work conserving solutions. This also simplifies the latency evaluation process.¶
Data plane solutions can be categorized as "on-time" or "in-time" based on how closely they adhere to predefined target transmission times for packets.¶
On-time solutions strive to transmit packets as close as possible to their target times without ever exceeding them. This ensures tight control over both latency and jitter, but it can sometimes lead to higher average latency.¶
In-time solutions allow more flexibility, transmitting packets without a specified target transmission time. FQ and its variations are in-time solutions.¶
ATS, which includes the interleaved regulator, is an in-time solution. A regulator determines an eligible time for a packet to be transmitted. Packets are always transmitted at or later than their eligible times. An eligible time is not a target transmission time. Note that ATS is a non-work conserving but in-time solution.¶
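The notion of an eligible time can be illustrated with a generic single-flow token bucket regulator. This is a minimal sketch under stated assumptions, not the algorithm specified in [IEEE_802.1Qcr]; the function name, parameters, and sample values are hypothetical.

```python
def eligible_times(packets, rate, burst):
    """Compute eligible times for one flow shaped by a token bucket.

    packets: list of (arrival_time_s, length_bits) in arrival order.
    rate: token rate in bits per second; burst: bucket depth in bits.
    A packet is eligible once the bucket holds enough tokens for it and
    all earlier packets of the flow have become eligible.
    """
    tokens = burst   # assumption: the bucket starts full
    last = 0.0       # time at which `tokens` was last updated
    result = []
    for arrival, length in packets:
        if length > burst:
            raise ValueError("packet longer than the bucket depth")
        start = max(arrival, last)                 # keep FIFO order in the flow
        tokens = min(burst, tokens + (start - last) * rate)
        wait = max(0.0, (length - tokens) / rate)  # time until enough tokens
        eligible = start + wait
        tokens = min(burst, tokens + wait * rate) - length
        last = eligible
        result.append(eligible)
    return result


# Three 1000-bit packets arriving together; 1000 bit/s rate, 2000-bit bucket.
arrivals = [(0.0, 1000), (0.0, 1000), (0.0, 1000)]
times = eligible_times(arrivals, rate=1000.0, burst=2000.0)
print(times)
```

In this sketch the first two packets are eligible immediately and the third must wait for tokens. An eligible time is only a lower limit on the transmission time, which is why a regulator based solution remains in-time.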
TAS, CQF, and their variations are on-time solutions. A time slot of TAS, within which a packet should be transmitted, can be seen as the target interval. EDF can be operated either as in-time or on-time.¶
The on-time/in-time taxonomy here is about the scheduling decision, which determines when a packet is transmitted. It is not about the consequence of the scheduling, whether the jitter bound is also guaranteed or not.¶
On-time solutions typically control the jitter as well as latency, but suffer from larger average latency. In-time solutions have limitations on controlling jitter. In-time solutions may have to handle the jitter with additional mechanisms.¶
Data plane solutions prioritize packets from different flows using various decision rules, categorized as follows.¶
Rate-based: Packets are ordered based on the allocated service rate of their flows or flow aggregates. Examples include FQ and its variations like C-SCORE, and DRR.¶
Time-based: Packets are prioritized based on their allowed delay or deadline. Examples are CQF, TAS, their variations, and EDF.¶
Arrival-based: Packets are served in the order they arrive. FIFO is an example.¶
Priority-based: Packets are ordered based on assigned priorities.¶
A solution can determine the service order of the packets from different flows, based on a rule which considers the rate allocated to a flow or a flow aggregate, the delay a packet is allowed, the packet arrival time, or the packet priority. A rule may also be constructed with a combination of these characteristics. Note that the service order within a flow cannot be altered, thus is already decided. We focus only on the service order among packets from different flows.¶
According to its primary service order decision rule, a solution can be categorized into either rate-based, time-based, arrival-based, or priority-based. Any solution can also use the packet arrival time as a secondary decision rule.¶
Strict priority scheduler uses primarily the priority of a packet. It also uses the arrival times among packets of the same priority. In this case it is categorized as priority-based.¶
ATS has IRs and a strict priority scheduler. The service order among packets at an IR is arrival-based. The order among packets from different input ports is decided at the strict priority scheduler. Thus, ATS is priority-based.¶
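The primary and secondary decision rules can be sketched as sort keys applied to a batch of queued packets from different flows. The Packet type, its fields, and the convention that a smaller priority value means higher priority are assumptions made for this illustration.

```python
from dataclasses import dataclass


@dataclass
class Packet:
    flow: str
    arrival: float    # arrival time in seconds
    priority: int     # assumption: smaller value means higher priority
    deadline: float   # absolute deadline in seconds


def strict_priority_order(packets):
    """Priority-based: priority first, arrival time as the secondary rule."""
    return sorted(packets, key=lambda p: (p.priority, p.arrival))


def edf_order(packets):
    """Time-based: earliest deadline first, arrival time as the secondary rule."""
    return sorted(packets, key=lambda p: (p.deadline, p.arrival))


def fifo_order(packets):
    """Arrival-based: packets are served in the order they arrive."""
    return sorted(packets, key=lambda p: p.arrival)


queued = [
    Packet("a", arrival=0.0, priority=1, deadline=5.0),
    Packet("b", arrival=1.0, priority=0, deadline=9.0),
    Packet("c", arrival=2.0, priority=1, deadline=3.0),
]
for rule in (strict_priority_order, edf_order, fifo_order):
    print(rule.__name__, [p.flow for p in rule(queued)])
```

The same three packets are served in a different order under each rule, which shows that the category is determined by the primary key and not by the arrival time used to break ties.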
Rate-based solutions have a simple admission condition check process that is dependent only on the service rates of flows. They benefit from the "pay burst only once" property, by which the maximum burst size of a flow contributes to the E2E latency bound only once, without being multiplied by the hop count. Rate-based solutions typically fit well to services with large burst and large scale services, without a need for overprovisioning, or additional burst control mechanisms.¶
Time-based solutions have strengths in precise delay control for packets or flows. The services with tight latency, small burst, and small scale services may fit this category.¶
Priority-based and arrival-based solutions benefit from the implementation simplicity. The latency and jitter differentiation among flows can be coarse, however. The services with loose latency, small burst, and non-periodic services may fit this category.¶
The purpose of listing the reference topologies (RTs) is to evaluate how the data plane solutions perform in real networks, in terms of E2E latency bound and jitter bound. Given a data plane solution and its parameter choices in implementation practice, it should be possible to calculate exactly the E2E latency bound and jitter bound of any flow.¶
An RT consists of a network topology and flows' characteristics. A network topology in this document specifies the abstract locations of source, destination, relay nodes and their interconnections. A flow characteristic is composed of its path, requested specifications (RSpec), and traffic specifications (TSpec). The requested specification includes the E2E latency and jitter bounds. The traffic specification includes the maximum burst size and the average rate, as if the traffic had been shaped by a token bucket. Alternatively, the traffic of a flow can be specified by the period, the phase, and the maximum burst size. In this case, the maximum burst is transmitted at a certain fixed phase within a period of time.¶
By specifying the above information, other parameters such as the diameter and the maximum utilization of a network can be derived.¶
The RTs listed in this document cover various topologies such as ring, mesh, hybrid etc.¶
Some aspects of the RTs are derived from use cases, in order to reflect the current or future network deployment examples.¶
Based on the RTs, it is also possible to check whether a data plane solution can address the scalability issues, e.g. those specified in [I-D.ietf-detnet-scaling-requirements]. The network diameter and the utilization level in RTs are set to examine the scalability.¶
A reference network topology, the grid, is shown in Figure 1. It represents a general network of partial mesh or grid topology, without considering a specific use case. A partial mesh is a common topology that can be seen in many real deployments, including datacenter networks.¶
In Figure 1, arrowed links indicate the direction that any traffic route must follow. For example, from Node 2, the only possible next nodes are Node 1 and Node 3.¶
The capacity of all the links in the topology is 1Gbps. While real deployments easily exceed 1Gbps link capacity, this RT represents a rather scaled down example in terms of the link capacity and the number of nodes.¶
In-vehicle network (IVN) is an example network which demands deterministic networking. [Buffered_Network] summarizes the flows that require deterministic networking services in IVNs as in Table 1.¶
Flow type | Maximum burst size | Maximum Packet length | Arrival rate | Required maximum latency |
---|---|---|---|---|
Audio | 2Kbit | 2Kbit | 1.6Mbps | 5ms |
Video | 360Kbit | 12Kbit | 11Mbps | 10ms |
Command and Control | 2.4Kbit | 2.4Kbit | 480Kbps | 5ms |
To simplify performance analysis, the flows in Table 1 are abstracted as shown in Table 2. Flows of the same type are aggregated into a single flow. For example, ten command and control (CC) flows that share the same E2E path can be considered to be a single type C flow as in Table 2. The maximum burst size and the average rate of a type C flow are about ten times those of one CC flow, in this case. Each flow type has specific destination nodes. For example, type A flows are destined only to destination 1 or 6.¶
Flow type | Maximum burst size | Maximum Packet length | Arrival rate | Required maximum latency | Destination in Figure 1 |
---|---|---|---|---|---|
A | 20Kbit | 2Kbit | 20Mbps | 5ms | Dst 1, Dst 6 |
B | 4000Kbit | 10Kbit | 100Mbps | 10ms | Dst 3, Dst 4 |
C | 20Kbit | 2Kbit | 5Mbps | 5ms | Dst 2, Dst 5 |
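The aggregation of Table 1 flows into a Table 2 flow type can be checked with simple arithmetic. The sketch below assumes that the aggregate burst and rate are the sums of the individual values (all bursts arriving together), which is an assumption of this illustration rather than a rule stated in [Buffered_Network].

```python
def aggregate(burst_bits, rate_bps, count):
    """Burst size and rate of `count` identical flows sharing one path."""
    return count * burst_bits, count * rate_bps


# One command and control (CC) flow from Table 1: 2.4 Kbit burst, 480 Kbps.
cc_burst_bits, cc_rate_bps = 2_400, 480_000
agg_burst_bits, agg_rate_bps = aggregate(cc_burst_bits, cc_rate_bps, 10)

print(f"burst = {agg_burst_bits / 1e3} Kbit, rate = {agg_rate_bps / 1e6} Mbps")
```

Ten CC flows give 24 Kbit and 4.8 Mbps, close to the 20 Kbit and 5 Mbps listed for type C in Table 2, in line with the statement that the type C values are about ten times those of one CC flow.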
A source creates one flow to each destination for a total of 6 flows. 36 flows are created throughout the network. Table 2 describes characteristics of the three different flow types used in the simulations.¶
The links are unidirectional as specified in Figure 1. All the flows must follow the direction of the arrows in every link. For example, a flow from source 1 to destination 5 follows the path of Src1-1-4-5-2-3-6-Dst5.¶
If there is more than one possible route to the destination, then the shortest path is selected. If there is more than one shortest path, then the following rules are applied.¶
Note that there are at most two outgoing links from a node to select. If both choices give the same distance to the destination, the node closer to the destination is selected as the next node. For example, from Src 5 to Dst 4, the selection from node 8 is to node 9, not to node 7, because node 9 is closer to Dst 4. When the above rule does not break the tie, i.e. the possible next nodes are within the same distance to the destination, then the node closer to the source is selected as the next node. For example, from Src 4 to Dst 5, the selection from node 5 is to node 8, not to node 2, because node 8 is closer to Src 4.¶
The above rules generate a unique route for every source and destination pair.¶
The reason for introducing unidirectional links is to make the network diameter large. With this configuration, the network diameter is 7 hops, which is relatively large considering a small number of nodes.¶
The destination of a flow decides the flow type. For example, all the flows destined to node 1 are of type A. There are 6 flows for each destination. There are 12 flows for each type. The flows with longest paths within the same flow type are of interest. Table 3 shows the path of the flows with longest paths for each flow type. For all the flow types, the number of hops in the longest paths is the same. The utilizations may differ for different links.¶
Flow type | Longest path |
---|---|
A | Src5-8-7-4-5-2-1-Dst1 |
A | Src2-2-3-6-5-8-9-Dst6 |
B | Src5-8-9-6-5-2-3-Dst4 |
B | Src2-2-1-4-5-8-7-Dst3 |
C | Src3-3-6-5-2-1-4-Dst2 |
C | Src6-9-6-5-8-7-4-Dst2 |
C | Src1-1-4-5-2-3-6-Dst5 |
C | Src4-7-4-5-8-9-6-Dst5 |
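The hop counts of the longest paths can be cross-checked with a breadth-first search. In the sketch below, the directed links and the attachment points of sources and destinations are ASSUMED: they are reconstructed from the paths in Table 3, not read from Figure 1. The tie-breaking rules for route selection are not modeled, since only path lengths are computed.

```python
from collections import deque

# Directed links between relay nodes (assumed, inferred from Table 3 paths).
EDGES = {
    1: [4], 2: [1, 3], 3: [6], 4: [5], 5: [2, 8],
    6: [5], 7: [4], 8: [7, 9], 9: [6],
}
# Relay node to which each source and destination attaches (assumed as well).
SRC_NODE = {1: 1, 2: 2, 3: 3, 4: 7, 5: 8, 6: 9}
DST_NODE = {1: 1, 2: 4, 3: 7, 4: 3, 5: 6, 6: 9}


def hops_between(src, dst):
    """Shortest hop count from Src<src> to Dst<dst>, including access links."""
    start, goal = SRC_NODE[src], DST_NODE[dst]
    dist = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            return dist[node] + 2   # add the source and destination links
        for nxt in EDGES[node]:
            if nxt not in dist:
                dist[nxt] = dist[node] + 1
                queue.append(nxt)
    raise ValueError("destination unreachable")


diameter = max(hops_between(s, d) for s in SRC_NODE for d in DST_NODE)
print("Src5 -> Dst1:", hops_between(5, 1), "hops; diameter:", diameter)
```

Under these assumptions the longest shortest path has 7 hops, which agrees with the network diameter stated for this topology.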
The network utilization is defined as the maximum link utilization over all the links. The topology achieves network utilization around 60%. The bottleneck links, e.g. the link 2-3, have one type A flow, five type B flows, and two type C flows. The scalability of a solution can be properly evaluated with this topology.¶
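The load on a bottleneck link can be estimated from the flow counts given above and the arrival rates in Table 2. This is a back-of-the-envelope sketch; the variable names are hypothetical, and only the arrival rates are summed.

```python
# Arrival rate per flow type in Mbps, from Table 2.
RATE_MBPS = {"A": 20, "B": 100, "C": 5}
# Number of flows of each type on the bottleneck link 2-3, as stated above.
FLOWS_ON_LINK = {"A": 1, "B": 5, "C": 2}
LINK_CAPACITY_MBPS = 1000

load_mbps = sum(RATE_MBPS[t] * n for t, n in FLOWS_ON_LINK.items())
utilization = load_mbps / LINK_CAPACITY_MBPS
print(f"load = {load_mbps} Mbps, utilization = {utilization:.2f}")
```

The sum of the arrival rates gives 530 Mbps, i.e. a utilization of 0.53 on a 1 Gbps link, which is of the same order as the approximate figure quoted above.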
There might be matters that require IANA considerations associated with metadata. If necessary, relevant text will be added in a later version.¶
This section will be described later.¶