NeoTec                                                          Y. Zhang
Internet-Draft                                                    C. Xie
Intended status: Informational                             China Telecom
Expires: 23 April 2025                                   20 October 2024


           A Use Case of Network Operation for Telecom Cloud
                  draft-zx-neotec-net4cloud-usecase-00

Abstract

This document discusses the network operation issues of telecom cloud.
It first introduces a typical use case of network for telecom cloud,
then illustrates the requirements for network operation for telecom
cloud from the perspective of TSP, and proposes a general network
operation model for telecom cloud.  It also analyzes the gap based on
the current technological status.

Status of This Memo

This Internet-Draft is submitted in full conformance with the
provisions of BCP 78 and BCP 79.

Internet-Drafts are working documents of the Internet Engineering Task
Force (IETF).  Note that other groups may also distribute working
documents as Internet-Drafts.  The list of current Internet-Drafts is
at https://datatracker.ietf.org/drafts/current/.

Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any
time.  It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress."

This Internet-Draft will expire on 23 April 2025.

Copyright Notice

Copyright (c) 2024 IETF Trust and the persons identified as the
document authors.  All rights reserved.

This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents
(https://trustee.ietf.org/license-info) in effect on the date of
publication of this document.  Please review these documents
carefully, as they describe your rights and restrictions with respect
to this document.  Code Components extracted from this document must
include Revised BSD License text as described in Section 4.e of the
Trust Legal Provisions and are provided without warranty as described
in the Revised BSD License.

Zhang & Xie               Expires 23 April 2025                 [Page 1]

Internet-Draft              net4cloud usecase               October 2024

Table of Contents

1.  Introduction
2.  Terminology
3.  Use Case of Network for Telecom Cloud
4.  Requirements
5.  Network Operation Model for Telecom Cloud
6.  Gap analysis
7.  Security Considerations
8.  Normative References
Authors' Addresses

1.  Introduction

Nowadays, the resources required for Internet services and
applications mainly exist in the form of cloud, which is distributed,
decentralized, and highly dynamic.  As a result, network planning and
operation methods need to be adjusted to meet the demands of cloud.

Telecom cloud refers to a cloud computing system owned and operated by
a Telecom Service Provider (TSP).  In general, TSPs own and operate
global wide area networks as well as 5G and other cellular networks.
TSPs have also developed Telecom Cloud, which is a cloud environment
optimized for service delivery through the integration of network and
cloud infrastructure.

To support cloud computing services, TSPs build overlay networks for
corporate customers on the underlay network and provide differentiated
network services in terms of bandwidth, latency, and security
isolation to meet the flexibility needs of customers and applications.

Overlay networks and traditional underlay networks differ
significantly in planning and operation.  An overlay network can be
quickly created and iteratively upgraded by TSPs, but it has poor
awareness of the underlying network's operational status and potential
failures.
A traditional underlay network focuses on traffic and connectivity, on
physical pipeline planning between cloud resource pools, and on
network solutions for individual cloud network products, but not on
application scenarios.  Network operation for telecom cloud therefore
faces many challenges: it lacks a customer-centric or service-centric
approach, and it cannot fully meet the demands of cloud services in
terms of flexibility, openness, and response speed.

To illustrate network operation for telecom cloud, this document
provides a typical use case of a telecom cloud network.  It first
introduces the architecture, then lists the requirements for operating
the network for telecom cloud from the perspective of TSPs.
Additionally, it proposes a network operation model for telecom cloud
and analyzes the current gaps in conjunction with the use case.

2.  Terminology

The following terms are used in this document:

CE:      Customer Edge
IOAM:    In Situ Operations, Administration, and Maintenance
MPLS:    Multiprotocol Label Switching
OTN:     Optical Transport Network
PE:      Provider Edge
POP:     Point of Presence
PW:      Pseudowire
SD-WAN:  Software-Defined Wide Area Network
SRv6:    Segment Routing over IPv6
TSP:     Telecom Service Provider
VLAN:    Virtual Local Area Network
VPC:     Virtual Private Cloud

3.  Use Case of Network for Telecom Cloud

To provision cloud services to its customers, a TSP deploys its
large-scale network in a distributed manner, with its backbone
underlay network using MPLS or SRv6+EVPN, as shown in Figure 1.  PE
devices are located at the edge of the backbone network and are
responsible for connecting CE devices.  The Backbone Network connects
cloud resource pools that are distributed over a wide geographic area,
so its users can access both the local telecom cloud and remote
telecom clouds.
On the access side, CE devices are located at the edge of customer
sites and are used to establish connections with PE devices.  When
customers need to access the cloud, they can choose among multiple
access methods, such as IP RAN, PON, direct fiber connection, OTN,
SD-WAN, and 5G slicing dedicated line.

Telecom cloud nodes can be deployed on the operator's Backbone Network
or at the network edge.  Operators provide cloud services to
enterprise customers, including elastic computing, storage,
networking, and security infrastructure resources, which help
enterprises quickly build and deploy application systems and rapidly
activate Internet services.

                              +------------+ +------------+
                              | Core Cloud | | Core Cloud |
                              +------------+ +------------+
                                     |              |
+------------+   +----+   +----+   +------------------+
| Edge Cloud |---| CE |---| PE |---|                  |
+------------+   +----+   +----+   |                  |
    +-----------------+   +----+   |                  |   +--------------+
    | Customer Site 1 |---| PE |---| Backbone Network |---| 3rd-Party    |
    +-----------------+   +----+   |                  |   | Public Cloud |
    +-----------------+   +----+   |                  |   +--------------+
    | Customer Site 2 |---| PE |---|                  |
    +-----------------+   +----+   +------------------+

       Figure 1.  Diagram of a typical network for Telecom Cloud

In scenarios such as medical education and game streaming, a customer
uploads data to the edge cloud through a campus network and private
cloud.  Important data, or data that has been preliminarily processed
by the edge cloud, is then uploaded to the core cloud POP.  At the
same time, the core cloud also transmits data to the edge cloud.
Ultimately, customers can obtain the required data from either the
edge cloud or the core cloud.  Throughout, the operation of the cloud
service depends on underlay network support.

The cloud network service carrying methods for telecom cloud are
mainly of two types:

(1) Cloud Access Service

Cloud access services allow customers to seamlessly connect their IT
resources or data centers to telecom cloud resource pools via the
telecom cloud management/control system.  These resource pool sites
are typically deployed in proximity to customers to minimize data
transmission latency and deliver a more localized service experience.
This method not only provides network connection services but also
integrates cloud computing resources.

The current activation methods for Cloud Access Service include six
types of network-side access: IP RAN, SD-WAN, PON, direct fiber
connection, 5G slicing dedicated line, and OTN.  Each of these
requires its own network OSS to enable access services for cloud
applications.

For example, customers can directly access telecom cloud through the
IP RAN network.  IP RAN supports multiple service types and ensures
low latency and high reliability of data transmission, which makes it
suitable for enterprise applications with higher network performance
requirements.  It can dynamically adjust network resources according
to enterprise customer needs.

In this architecture, customers reach the local operator's Backbone
Network edge equipment (PE) through the IP RAN network.  The
customer's edge equipment (CE) establishes a Layer 2 channel with the
ASBR/B-LEAF device through VLAN+PW.  The CE connects to the local IP
RAN U equipment in single or dual uplink mode over a Layer 2 VLAN.  A
PW is established between the IP RAN U equipment that the customer
accesses and the ASBR/B-LEAF device to carry the customer's Layer 2
channel.
The ASBR device is interconnected with the local Backbone Network PE
equipment within the local metropolitan area.  The CE acts as the
Layer 3 gateway of its own internal network and peers with the
Backbone Network PE equipment at Layer 3.  The ASBR/B-LEAF devices
reuse the IP RAN cloud dedicated line to provide customers with
complete end-to-end services.

+----------+   +----+   +---+   +-------+   +------+   +----+   +----------+
| Customer |---| CE |---| U |---| IPRAN |---| ASBR |---| PE |---| Backbone |
| Site     |   +----+   +---+   +-------+   +------+   +----+   | Network  |
+----------+                                                    +----------+

          Figure 2.  Diagram of Cloud Access Service network

In this architecture, the cloud resource pool's cloud dedicated line
switch acts as the CE and connects to the PE provided by the Backbone
Network in the resource pool's city through a pre-configured
high-bandwidth relay link.  Customers can conveniently access services
through the nearest access point.  The connection between the cloud
dedicated line switch and the Backbone Network PE uses VLAN
sub-interfaces to allow flexible network configuration.  In the entire
dedicated line service, the cloud dedicated line switch provides
access functions, while the inter-cloud high-speed network handles
networking and traffic scheduling.  Different customers within the
cloud resource pool are isolated from one another through their own
independent virtual routing and forwarding instances, which ensures
security and privacy.

(2) Inter-Cloud Connection Service

To meet customers' needs for networking between different cloud data
centers, TSPs use cloud resource pools distributed in different
regions to construct an overlay network, such as SD-WAN or VPN, on top
of the underlay network, as shown in Figure 3.
The overlay network can quickly adjust configurations according to
each customer's and each service's needs, in most cases without being
constrained by physical network elements.  The overlay network
directly serves customers on the cloud, providing customer-level and
application-level virtual network connection, management, and
monitoring services.

The overlay network includes intra-cloud and inter-cloud parts.  The
intra-cloud network mainly refers to the VPC: a specific VPC acts as
the core and links all intra-cloud network services.  A VPC allows
customers to build a secure, reliable, configurable, and manageable
virtual network environment in the cloud, while also supporting
external network connections through various native cloud gateways
and access to third-party public clouds.

The underlay network provides highly reliable, low-latency network
connection services for the overlay network on the cloud.  To support
the overlay network, the underlay network's central controller must
open basic network capabilities to the cloud management system and
reserve resources such as VPNs for the overlay network.  However, the
underlay network is not aware of customers and applications.

An overlay network can be established through manual configuration or
automated deployment.  Enterprise customers can manually configure the
parameters of network devices (such as routers and switches) according
to their actual needs, including tunnel endpoints and encapsulation
protocols.  This method works only for small-scale network
environments; for large enterprises, manual configuration is
cumbersome and error-prone, so automated deployment through network
orchestrators is needed.
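
The scaling problem behind automated deployment can be illustrated
with a short sketch.  It assumes a full mesh of tunnels between overlay
sites, which this document does not mandate; the Site class, the
build_full_mesh function, the site names, and the addresses (taken from
the 192.0.2.0/24 documentation range) are all hypothetical and are not
part of any TSP system or orchestrator API.

```python
from dataclasses import dataclass
from ipaddress import ip_address
from itertools import combinations

# Encapsulations used here for illustration only (an assumption, not an
# exhaustive or normative list).
SUPPORTED_ENCAPS = {"vxlan", "ipsec"}


@dataclass(frozen=True)
class Site:
    """A hypothetical overlay endpoint, such as a cloud POP or a CPE."""
    name: str
    tunnel_ip: str


def build_full_mesh(sites, encap):
    """Validate the inputs, then emit one tunnel record per site pair."""
    if encap not in SUPPORTED_ENCAPS:
        raise ValueError(f"unsupported encapsulation: {encap}")
    names = [s.name for s in sites]
    if len(set(names)) != len(names):
        raise ValueError("duplicate site name")
    for s in sites:
        ip_address(s.tunnel_ip)  # raises ValueError on a malformed address
    return [
        {
            "local": a.name,
            "remote": b.name,
            "local_ip": a.tunnel_ip,
            "remote_ip": b.tunnel_ip,
            "encap": encap,
        }
        for a, b in combinations(sites, 2)
    ]


sites = [
    Site("edge-cloud", "192.0.2.1"),
    Site("municipal-cloud", "192.0.2.2"),
    Site("core-cloud", "192.0.2.3"),
]
tunnels = build_full_mesh(sites, "vxlan")
for t in tunnels:
    print(f"{t['local']} <-> {t['remote']} ({t['encap']})")
```

Under the full-mesh assumption, n sites need n(n-1)/2 tunnels, so
three sites need three tunnels while fifty sites would need 1225.
Generating and validating the records from one site list is one way an
orchestrator could avoid the per-device errors of manual configuration.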

                         +-------+              VXLAN              +-------+  Overlay Network
                         |       |<------------------------------->|       |
                         | Edge  |        +-----------+            | Cloud |
           +-----+ IPsec | Cloud | VXLAN  | Municipal |   VXLAN    |       |
           |     |<----->|       |<------>|   Cloud   |<---------->|       |
+------+   |     |       +-------+        +-----------+            +-------+
| USER |---| CPE |       |  POP  |        |    POP    |            |  POP  |
+------+   |     |   +-----+   +--------------+   +------------------+     |
           |     |---|     |---|    IP RAN    |---| Backbone Network |-----+
           +-----+   | MAN |   | +----------+ |   |   +----------+   |
                     |     |   | | MPLS VPN | |   |   | MPLS VPN |   |  Underlay Network
                     +-----+   | +----------+ |   |   +----------+   |
                               +--------------+   +------------------+

 Figure 3.  Constructing an overlay network on the underlay network

4.  Requirements

From the perspective of TSPs, the network operation for telecom cloud
should meet the following requirements:

1) Agile provisioning: Rapidly deploy and configure cloud and network
   resources to promptly meet cloud service needs.

2) Flexible adjustment of bandwidth and other resources: Dynamically
   adjust network bandwidth and other resources according to actual
   needs within minutes, ensuring efficient resource utilization.

3) High stability and reliability: For new cloud services, such as AI
   computing, network failures could lead to training task
   interruptions or data loss.  The network needs high stability and
   reliability to ensure continuous and complete data transmission.

4) Network performance monitoring: The network operation system should
   have a unified interface and scheduling ability so that maintenance
   is operable and manageable.  The system should be able to monitor
   the network's operating status in real time and to quickly detect
   and locate network problems at the various computing nodes.  It
   should also support correlation analysis between application
   traffic and network status, providing customers with self-healing
   capabilities for overlay networks.
   Through monitoring collaboration between cloud and network, the
   system realizes prediction and planning functions for the network.

5) Centralized control: Connect to DC, metropolitan area networks,
   backbone networks, and other network domains, and apply SDN/NFV and
   other new technologies to achieve centralized control and dynamic
   management of network elements.

6) Scenario visualization: Efficiently and promptly perceive network
   operational data, sense the real-time operating status of the
   network, and automatically discover changes in the network in a
   timely manner.  Realize panoramic visibility of end-to-end service
   paths and SLA visibility in the network.

5.  Network Operation Model for Telecom Cloud

Telecom cloud must provide enterprises with flexible cloud computing
resources, supporting the deployment and operation of various cloud
applications and services, while also offering customers flexible
network configuration and deployment services.  Therefore, its network
operations need to be optimized.  This section introduces the network
operation model for telecom cloud.

The top layer is the Telecom Cloud Service Management/Control System,
which includes components such as the orchestrator, controllers, fault
operations, and resource management.  The Telecom Cloud Service
Management/Control System is responsible for comprehensively managing
telecom cloud and for building and operating the overlay network.  At
the same time, through interfaces with the Network Management/Control
System, it invokes the capabilities of the underlay network to meet
the demands of the cloud services above it.  This model also supports
interoperability with third-party public clouds.
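
The exchange between the two systems can be pictured as typed
messages.  The following is a minimal sketch, assuming the instruction
categories that this section enumerates for the interface; the class
names, message fields, return strings, and the in-memory controller
are hypothetical stand-ins, not a defined API or data model.

```python
from dataclasses import dataclass, field
from enum import Enum


class Instruction(Enum):
    """Instruction categories sent from the cloud side to the network side."""
    NETWORK_ESTABLISHMENT = "network-establishment"
    CLOUD_DATA_ANNOUNCEMENT = "cloud-service-operation-data-announcement"
    RESOURCE_SCALING = "network-resource-scaling"
    PERFORMANCE_RETRIEVAL = "network-performance-data-retrieval"
    USAGE_RETRIEVAL = "network-resource-usage-data-retrieval"
    TRAFFIC_SCHEDULING = "network-traffic-scheduling"
    NETWORK_TERMINATION = "network-termination"


@dataclass
class Request:
    """A hypothetical message carrying one instruction."""
    instruction: Instruction
    network_id: str
    params: dict = field(default_factory=dict)


class ToyNetworkController:
    """In-memory stand-in for the Network Management/Control System."""

    def __init__(self):
        self.networks = {}  # network_id -> bandwidth in Mbit/s

    def handle(self, req):
        if req.instruction is Instruction.NETWORK_ESTABLISHMENT:
            self.networks[req.network_id] = req.params["bandwidth_mbps"]
            return "established"
        if req.network_id not in self.networks:
            return "unknown-network"
        if req.instruction is Instruction.RESOURCE_SCALING:
            self.networks[req.network_id] = req.params["bandwidth_mbps"]
            return "scaled"
        if req.instruction is Instruction.NETWORK_TERMINATION:
            del self.networks[req.network_id]
            return "terminated"
        # Remaining categories are left unimplemented in this sketch.
        return "not-implemented"


ctl = ToyNetworkController()
results = [
    ctl.handle(Request(Instruction.NETWORK_ESTABLISHMENT, "vpn-1",
                       {"bandwidth_mbps": 100})),
    ctl.handle(Request(Instruction.RESOURCE_SCALING, "vpn-1",
                       {"bandwidth_mbps": 500})),
    ctl.handle(Request(Instruction.NETWORK_TERMINATION, "vpn-1")),
    ctl.handle(Request(Instruction.RESOURCE_SCALING, "vpn-1",
                       {"bandwidth_mbps": 50})),
]
print(results)
# ['established', 'scaled', 'terminated', 'unknown-network']
```

The last request fails because the network was already terminated,
which shows why both systems need a shared view of network lifecycle
state.  A real interface would more likely be expressed as a data
model and protocol than as in-process calls.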

                             +----------------------------+
                             |        Application         |
                             +----------------------------+
                                  ^                  ^
                                  |                  |
                                  v                  v
                   +---------------------+    +---------------------+
                   |    Telecom Cloud    |    |                     |
                   | +-----------------+ |    |    Telecom Cloud    |
                   | |       VPC       | |<-->| Service Management/ |
                   | |-----------------| |    |   Control System    |
                   | | Overlay Network | |    |                     |
                   | +-----------------+ |    |                     |
                   +---------------------+    +---------------------+
                              ^                          ^
                              |                          |  Interface
                              v                          v
+--------------+     +------------------+     +---------------------+
| 3rd-Party    |<--->| Underlay Network |<--->| Network Management/ |
| Public Cloud |     |                  |     |   Control System    |
+--------------+     +------------------+     +---------------------+

         Figure 4.  Network Operation Model for Telecom Cloud

In this model, unified interfaces need to be established between the
Telecom Cloud Service Management/Control System and the Network
Management/Control System to achieve seamless data exchange and
command transmission.  This includes standardized API calls, data
formats, and communication protocols.  The main interface instructions
sent by the Telecom Cloud Service Management/Control System to the
Network Management/Control System include:

*  Network establishment

*  Cloud service operation data announcement

*  Network resource scaling of connections, bandwidth, latency, paths,
   etc.

*  Retrieval of network performance data

*  Retrieval of network resource usage data

*  Network traffic scheduling

*  Network termination

Similarly, the Network Management/Control System can report back to
the Telecom Cloud Service Management/Control System.  Its reports
include:

*  Proactive network fault reporting

*  Proactive network performance reporting

*  Proactive network resource usage reporting

*  Proactive underlay network operation reporting

6.  Gap analysis

Although telecom cloud providers have advantages in providing
cloud-network integrated services, most of their networks, while
extensive, are rigid in operation and difficult to adapt fully to
modern cloud service needs.  As telecom cloud evolves towards
multi-cloud and edge cloud models, new challenges in configuration,
automation, and operation emerge.  Moreover, interoperability becomes
a key issue, as TSPs typically rely on equipment from multiple
vendors, while public cloud providers can standardize on a single
vendor ecosystem.  Telecom cloud services must seamlessly integrate
various software and hardware systems, which creates both challenges
and opportunities for collaboration, especially within the IETF.
Addressing these challenges requires collaborative efforts to develop
standardized solutions that ensure flexibility, scalability, and
interoperability in the telecom cloud environment.

1) Lack of definition for key interfaces and data models: Standardize
   the abstraction and interface definition of cloud and network
   resources for management across different vendors, in order to
   achieve interoperability and seamless integration of cloud and
   network services.

2) Lack of definition for cloud resource models: Abstract and define
   cloud resources, creating standardized resource models.  Establish
   unified interfaces and protocol standards to achieve flexible
   combination and scheduling of virtual network resources, enhancing
   resource sharing and collaborative management between cloud and
   network.

3) Lack of unified In Situ Operations, Administration, and Maintenance
   (IOAM): Extend IOAM to enhance real-time performance monitoring in
   cloud-integrated networks.  Develop standardized methods to manage
   cross-domain nested slices, enhancing visualization capabilities
   from the application layer to the network layer.

4) Lack of a unified cloud-network orchestration mechanism: Develop a
   standardized orchestration layer that connects Kubernetes Custom
   Resource Definitions (CRDs) with NETCONF/YANG to improve
   coordination between computing resources and network resources.

7.  Security Considerations

TBD.

8.  Normative References

[RFC8466]  Wen, B., Fioccola, G., Ed., Xie, C., and L. Jalil, "A YANG
           Data Model for Layer 2 Virtual Private Network (L2VPN)
           Service Delivery", RFC 8466, DOI 10.17487/RFC8466,
           October 2018.

[RFC8986]  Filsfils, C., Ed., Camarillo, P., Ed., Leddy, J., Voyer, D.,
           Matsushima, S., and Z. Li, "Segment Routing over IPv6
           (SRv6) Network Programming", RFC 8986,
           DOI 10.17487/RFC8986, February 2021.

Authors' Addresses

Yue Zhang
China Telecom
Beiqijia Town, Changping District
Beijing
102209
China
Email: zhangy390@chinatelecom.cn

Chongfeng Xie
China Telecom
Beiqijia Town, Changping District
Beijing
102209
China
Email: xiechf@chinatelecom.cn