Internet-Draft | Taxonomy of Consolidation | October 2024
McFadden | Expires 22 April 2025
This document contributes to the ongoing discussion surrounding Internet consolidation. At recent IETF meetings discussions about Internet consolidation revealed that different perspectives gave completely different views of what consolidation means. While we use the term consolidation to refer to the process of increasing control over Internet infrastructure and services by a small set of organizations, it is clear that that control is expressed through economic, network traffic and protocol concerns. As a contribution to the discussion surrounding consolidation, this document attempts to provide a taxonomy of Internet consolidation with the goal of adding clarity to a complex discussion.¶
The discussion of this draft and issues related to centralization and consolidation often takes place in the Decentralization of the Internet Research Group (DINRG).¶
This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79.¶
Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet-Drafts is at https://datatracker.ietf.org/drafts/current/.¶
Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."¶
This Internet-Draft will expire on 22 April 2025.¶
Copyright (c) 2024 IETF Trust and the persons identified as the document authors. All rights reserved.¶
This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (https://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Revised BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Revised BSD License.¶
Internet consolidation has been under discussion for the last several years. The Internet Society's 2019 "Global Internet Report: Consolidation and the Internet Economy" highlighted issues on this topic and kick-started a series of discussions and publications around consolidation. The DINRG Workshop on Centralization took place in June of 2021 and was reported on at IETF 114.[DINRG] Furthermore, a draft for the Internet Architecture Board (IAB) discussed issues of economic and technical consolidation.[IABwrkshp] Despite community interest, the draft expired without further work or publication.¶
Several other contributions have focused on responses to consolidation. However, the report from the DINRG Workshop makes clear that there are different "categories of centralization." This draft does not attempt to propose responses to centralization, instead it attempts to build upon the Workshop's summary of "categories of centralization" as it relates to the openness and interoperability of Internet architecture.¶
Concerns have been expressed about the "definition" of centralization for good reason: there have been many attempts to define the term. In a now-expired Internet-Draft, one definition was "the process of increasing control over internet infrastructure and services by a small set of organizations."[Arrko1] A more recent RFC defines "centralization" as "the ability of a single entity or a small group of them to observe, capture, control, or extract rent from the operation or use of an Internet function exclusively."[RFC9518]¶
This is not simply a recent trend. Dmytri Kleiner views the situation as primarily economic and notes that the dot com boom "was characterized by a rush to own infrastructure, to consolidate independent internet service providers and take control of the network." He describes the environment in the early 2000s as a sort of economic land rush where investors tried to replace the smaller service vendors with larger ones on every scale, from low level telecommunications infrastructure to high level services such as news aggregation, e-mail and video.[Kleiner]¶
Taking as a starting point the centralization of DNS services, there have also been studies of what centralization means in the context of key Internet services. For instance, one academic study found that about 12,000 name servers used by websites in the Alexa top 1 million used exactly the same third-party infrastructure.[DNSmeasure] This finding led to the proposal of a metric for measuring the market share of organizations that provide DNS resolvers.[Entropy] In this case, the authors implicitly provide definitions of "centralization" based on external metrics.¶
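A concentration metric of the kind proposed for DNS resolver market share can be sketched with normalized Shannon entropy. This is an illustration, not the exact metric from the cited study; the function name and the sample share figures are hypothetical.¶

```python
from math import log2

def normalized_entropy(market_shares):
    """Shannon entropy of provider market shares, normalized to [0, 1].

    A value near 1 means traffic is spread evenly across providers;
    a value near 0 means it is concentrated in very few of them.
    """
    shares = [s for s in market_shares if s > 0]
    if len(shares) <= 1:
        return 0.0  # a single provider is maximal concentration
    total = sum(shares)
    probs = [s / total for s in shares]
    h = -sum(p * log2(p) for p in probs)
    return h / log2(len(probs))  # divide by maximum possible entropy

# Hypothetical resolver markets: one dominant provider vs. an even split.
print(normalized_entropy([70, 10, 10, 10]))  # concentrated: well below 1
print(normalized_entropy([25, 25, 25, 25]))  # evenly shared: exactly 1.0
```

Lower values indicate the kind of third-party concentration the DNS measurement study observed.¶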
A completely different approach to the definition of centralization can be seen by looking at research that focuses on the consolidation of data instead of economic measures.[DataConsolidation] In one study the authors state, "as users get used to using their services, the users' generated content and the data about their online behaviors are concentrated in such companies. This phenomenon, called "data consolidation", has become a serious problem, which makes the Internet society seek to decentralize the current Internet. The decentralized Internet aims to prevent the concentration of user data in a few giant companies."[ISOCconsolidation] This approach to centralization isn't an isolated research activity. Another study surveys the ongoing research activities on "data centralization" from The Internet Society, EU Horizon-2020, and Decentralized web.¶
The key problem is that "centralization" means different things in different contexts. Research takes on various topics including economic, network and data centralization.[JCP] As a result the guidance from the DINRG workshop seems to have an important message for those who would like to see a single definition of centralization: there are "categories" of centralization, each with its own definition.[DINRG]¶
Note that this draft does not attempt to define the concept of intermediation.[RFC3724] In general, an intermediary is a communications participant that is neither the primary initiator of a communication nor its intended recipient. Intermediation is familiar from the example of search engines: a search engine is an important tool for scaling the Internet, helping users find and identify relevant content in an ever-growing ocean of information. As an intermediary, the search engine places itself between the user and the rest of the Internet. The intermediary can be accessed directly by the user, or embedded structurally - for instance, through algorithms that sift, filter or rate content on behalf of the user. Such intermediation gives power to those who operate it - with content and information flowing from both sides of the connection.¶
It is possible to argue that today's Internet is impossible without intermediaries (for instance, proxies in HTTP or SIP).[RFC8517] However, controlling the level of participation and access that intermediaries have is a fundamental security issue. Lack of proper controls on intermediaries in protocols is the source of significant security problems. As an example, users have almost no control over where their data or information comes from. If a CDN provides the data, the data may be served locally by a third party even though the user thinks it is coming from http://www.example.com. There are very good reasons for local distribution of content, but the choices about access to data are not left up to the user. Instead, the intermediaries make non-transparent choices without possible intervention by the consumer of the data.¶
It is also important to note that metrics for intermediation and centralization show that, while there is a high degree of variability globally for centralization, research suggests a highly concentrated market of third-party providers: three providers, globally, serve an average of 92 percent of all websites, and Google, by itself, serves an average of 70 percent of recently surveyed websites.[Pace]¶
In addition to not attempting to define the concept of intermediation, this draft also does not attempt to suggest ways in which the balance of power between users and intermediaries could be rebalanced. It is possible to imagine a variety of controls on intermediaries - for instance, regulatory, voluntary, protocol-based and others. This draft does not consider the advantages or disadvantages of possible mechanisms for control of intermediaries.¶
These three terms are sometimes used interchangeably. However, there is a subtle difference which should be recognized in any consideration of centralization.¶
Consolidation is the process of limiting choices. Centralization is the process of moving the power to execute change from the edge of the network to its core. Concentration is the market effect of having a small number of players either dominate or monopolize a marketplace. What is clear is that there is significant overlap between these three concepts. A common theme is the concentration of power in the hands of a few organizations. That concentration of power is particularly evident in the case of economic centralization (as we will see in Section 4 below).¶
Internet consolidation is defined in one setting as "the process of increasing control over internet infrastructure and services by a small set of organizations."[Arrko1] Economies of scale are the driving force behind consolidation: markets naturally consolidate when economies of scale come into play. As noted in Section 2.1 above, a common theme here is that almost all features of centralization include a concentration of economic, market or surveillance power.¶
The DINRG Workshop Report notes that:¶
The current consolidation and centralization of control and operation of Internet infrastructure and services was not an original design goal. In fact, RFC1958 says:¶
and later,¶
In fact RFC8280[RFC8280] explicitly asks key questions about centralization of protocols in the context of human rights:¶
The notion of centrality is essential to a consideration of any potential taxonomy. In the realm of graph theory and network analysis, centrality indicators are pivotal tools that quantitatively assess the importance of nodes within a graph based on their position in the network. These indicators assign numerical values or rankings to nodes, facilitating the identification of influential entities across diverse applications. Examples include pinpointing key figures in social networks, critical nodes in infrastructural networks in urban systems, pivotal individuals in disease transmission dynamics, and nodes crucial to understanding brain network functionality. A similar notion of centrality for the Internet can be formulated based on the quantitative analysis of particular intermediaries in Internet sessions.¶
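The graph-theoretic notion of centrality can be made concrete with a small sketch. Degree centrality is only the simplest of the indicators mentioned, and the toy topology below (an intermediary node every session traverses) is a hypothetical illustration, not measured Internet data.¶

```python
def degree_centrality(edges):
    """Degree centrality for an undirected graph given as (u, v) pairs.

    Each node's score is its number of neighbors divided by (n - 1),
    so a node directly connected to every other node scores 1.0.
    """
    adjacency = {}
    for u, v in edges:
        adjacency.setdefault(u, set()).add(v)
        adjacency.setdefault(v, set()).add(u)
    n = len(adjacency)
    return {node: len(nbrs) / (n - 1) for node, nbrs in adjacency.items()}

# Toy topology: all client sessions pass through one intermediary ("cdn").
edges = [("cdn", "client1"), ("cdn", "client2"),
         ("cdn", "client3"), ("origin", "cdn")]
scores = degree_centrality(edges)
print(scores["cdn"])      # 1.0: the intermediary touches every other node
print(scores["client1"])  # 0.25: an edge node touches only the intermediary
```

A quantitative analysis of Internet sessions along these lines would assign the highest centrality to the dominant intermediaries.¶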
Much of the discussion surrounding consolidation focuses on Internet services, applications and data. However, contributions to the discussion of consolidation have also addressed economic, traffic and architectural consolidation. The concentration of Internet traffic, infrastructure, services, and users on a handful of providers is a growing concern because of the economic, political, and reliability implications of these consolidation trends. In recognition of this, a taxonomy with four main categories is proposed:¶
The Internet Society Report on Internet Consolidation suggests that the Internet's economy is defined as the economic activities that either support the Internet or are fundamentally dependent on the Internet's existence.[ISOCconsolidation]¶
As a result, economic consolidation on the Internet refers to the effects of market consolidation on competition and the economic power of a small set of companies that dominate economic activity in the Internet. There are two aspects to economic consolidation on the Internet.¶
Economic consolidation means that a small number of companies dominate the marketplace and hence, the revenues gathered from the use of the Internet.¶
Economic consolidation also means that a small number of companies control the flow of capital among enterprises that provide services on the Internet.¶
The Internet Society also argues that there are two views of market consolidation in the Internet. In the first, the metric is the concentration of providers in a given marketplace. In the second, it is the jurisdiction of providers in a given market. It uses a pair of metrics (the Gini coefficient and the Herfindahl-Hirschman Index (HHI)) to measure market consolidation.¶
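Both metrics are straightforward to compute. The sketch below uses hypothetical market shares, not figures from the Internet Society report.¶

```python
def hhi(market_shares_percent):
    """Herfindahl-Hirschman Index: the sum of squared percentage shares.

    Ranges from near 0 (many tiny players) to 10000 (pure monopoly);
    values above roughly 2500 are conventionally 'highly concentrated'.
    """
    return sum(s ** 2 for s in market_shares_percent)

def gini(values):
    """Gini coefficient: 0 = perfect equality, approaching 1 as a
    single entity holds everything."""
    xs = sorted(values)
    n = len(xs)
    weighted = sum((i + 1) * x for i, x in enumerate(xs))
    return (2 * weighted) / (n * sum(xs)) - (n + 1) / n

# Hypothetical market: four providers, one dominant.
shares = [70, 10, 10, 10]
print(hhi(shares))              # 5200: highly concentrated
print(round(gini(shares), 2))   # 0.45: unequal distribution
```

Either metric captures the same intuition: as one provider's share grows, the index moves toward its concentrated extreme.¶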
One of the two aspects of economic consolidation is the generation of revenue by a small number of enterprises. As an example, Amazon accounts for more than 45% of all online retail spending in the United States. Alibaba is estimated to have 60% of the electronic commerce market in China. Meta - including Facebook, Messenger, WhatsApp and Instagram - dominates social media and messaging, holding four of the world's top six social media platforms.¶
In each of these cases, a very small number of companies operate extremely popular services, concentrating revenue generation into those companies. In fact, the popularity of these services is so great that value is created by adding other, complementary services onto the base they provide. The dominant services thus control the foundation upon which other revenue streams are built.[EmpiricalView]¶
The impact of revenue consolidation is enhanced by the presence of network effects in digital media and services. A network effect is a market feature where the benefit to the user or consumer of a product or service increases as the number of other users increases. First-mover advantages play a role in defining the marketplace, but as a product or service gains traction and sees its market grow, the user base becomes the foundation for future growth.¶
An example of this would be the dominance of the portable document format, originally introduced by Adobe. As the number of users grew, it became the dominant format for the exchange of documents readable on a variety of devices. While the network effect does not imply immediate or inevitable market dominance, it does seem to be tightly correlated with market leaders in digital, Internet-based services.¶
We have started to see governments take action in regards to this dominance. As an example, the Digital Services Act (DSA) in Europe identifies "Very Large Online Platforms" and attempts to provide specific rules for those platforms. The platforms are defined as those reaching more than 10% of the 450 million online consumers in Europe (at the time of writing, there were 20 such platforms). The DSA attempts to provide more effective safeguards for users and a series of transparency measures for online platforms. The DSA is an example of a regulatory instrument designed to change the balance of power between extremely large platforms and their users.¶
In addition to revenue generation, the very large application layer companies (Alphabet, Amazon, Tencent, Meta and Alibaba) control how money and capital moves through enterprises providing services on the Internet.¶
We have seen that embedded intermediaries that have substantial power can implement platforms that provide services to downstream consumers as well as upstream sources of content and applications. As the controlling intermediary for those services and applications, these large companies are also able to dictate how economic flows move between consumers and providers of applications and services.¶
Regulators and policy makers often are concerned about the enormous market power that these huge intermediaries have, but refrain from imposing controls or sanctions on the grounds that consumers get significant benefits when platform operators use upstream revenues to subsidize downstream services.¶
That control is often influenced by the fact that an enormous amount of the economic flow related to the Internet occurs largely out of public sight. Advertising is a critical part of the economic flow of the Internet. Advertisers want their message to reach the broadest community of potential consumers possible. As a result, advertising platforms need to be built to provide services to the greatest number of potential advertisers. The greater the reach of the advertising platform, the greater the value to the advertiser. The result, on the Internet, is a trend to collect the advertising tools in a consolidated group of global, dominant enterprises.¶
The end user sees free tools, services and information. However, the cost of those services is paid for, largely, by advertising revenue that makes up a vast proportion of the economic flow of the Internet. This economic model is mainly based on the collection of surveillance-based consumer data, which has considerable implications for the rights and expectations of user privacy. While an in-depth discussion of the privacy implications of centralization and consolidation is out of scope for this draft, it cannot be separated from the discussion of how the advertising model has affected economic flow consolidation.¶
A significant majority of the Internet's traffic is delivered from very large content services including Google, Amazon and Facebook. Because these companies naturally attempt to provide the most competitive service for their customers - including perceived speed of content delivery - these content services seek to establish connections directly with the companies providing access to the network. The result is a "flattening" of the Internet's traditional topology.¶
In fact, a recent study shows that these large services can reach more than 76% of the Internet without having to traverse traditional Tier 1 and Tier 2 ISPs. Besides bringing benefits of low latency and higher security to their users, these large-scale networks are also able to implement improvements and innovations in protocol design by having far greater control over the elements of the infrastructure being used to deliver services.¶
An empirical view of this consolidation in February of 2022 shows that the number of webpages hosted on these networks increased from 2015 to 2020 at a rate exceeding 80%. In looking at data sources, including TLD datasets and Alexa Top 1M datasets, only a small number of content delivery networks host the vast majority of landing pages.¶
Centralization of this sort makes traffic filtering easier, since forcing a content network to block specific content (or worse, blocking the content network entirely) would make a large amount of content unavailable. As these networks begin to migrate other services to HTTP (for instance, DNS over HTTPS), more than the web is affected by the impacts of filtering by centralized content services. In fact, blocking a content network entirely would block all the content of the network, not just the content that was the target of the filtering. It is clear that the implementation of this type of filtering has widespread consequences beyond just the technical functioning of the Internet.¶
This happens at all layers. As an example at the application layer, in 2021 Google and Apple were forced to remove applications created by the Russian political opposition from their stores. The ability of a government to influence the content network means that centralization can lead to a reduction in the diversity of information or services on the Internet.¶
As the content networks grow in scale, the networks themselves grow to support the required network capacity. This sets up a feedback loop that drives market concentration toward the infrastructure provided by the content networks. As these networks grow larger, it becomes difficult for smaller networks and infrastructure providers to compete with the economies of scale from which the large networks benefit.¶
A third category of consolidation is the evolution of the Internet's architecture to meet contemporary use cases and requirements. Early descriptions of the Internet's architecture described heterogeneous endpoints connected by neutral transports. The end-to-end principle suggested that the transport of data between endpoints was provided without much intervention. [RFC1958]¶
Two developments have led to architectural consolidation: the emergence of intermediary services and the movement of transport related code to the application layer.¶
In the first case, technologies like CDNs are built into the network for the efficient delivery of content and services. The consumer is largely unaware that the service or application is being hosted by an organization other than the one they think they contacted. Instead, content delivery is pushed as close to the consumer as possible to ensure the best possible end-user experience.¶
The result is a series of security, economic and policy concerns associated with the small number of very large providers of these intermediary services. However, in this section we only want to consider the architectural issues specific to the use of intermediaries.¶
The second case is vertical architectural consolidation. This is where the companies that control the applications attempt to control all aspects of the communication. For instance, the provider of the browser may be the organization that the browser connects to. The advantage of this kind of architectural consolidation is that it allows the largest players to introduce technological innovation more quickly than if multiple layers of the stack required innovation in parallel.¶
With tools like DNS over HTTPS, we see applications taking control of the infrastructure of transport in addition to providing an application or service. Applications essentially provide their own ecosystem (from centralized control of DNS services all the way to the end-user experience).¶
Others have rightly observed that, in the current environment, development of protocols and standards for the Internet is largely confined to a small number of participants from a small number of organizations. One trend is that the giant enterprises on the Internet also dominate the development of protocols.¶
Having a small number of organizations controlling the infrastructure of the Internet also means that innovative technologies can be implemented quickly and at large scale. In 2022, a study in the ACM Transactions on Internet Technology found Google accounting for 60% of all TLS 1.3-secured resources. Some other large CDNs use TLS 1.3 almost exclusively. QUIC is also an example of a new technology that benefits from consolidation. Large-scale intermediaries can facilitate the deployment and adoption of new standards because adoption decisions propagate across the infrastructure rather than through user adoption or feature updates.¶
Services and applications are those tools that users see when they interact with the Internet. They take advantage of the infrastructure (and, access) parts of the Internet's ecosystem. According to the Internet Society's report on consolidation, "a small number of companies operating some of the Internet's most popular services dominate this market. Many of these companies act as multi-sided markets or platforms, meaning they offer a base upon which other applications, processes or technologies can be developed."¶
By itself, Google holds 90% of the global search market, the number one mobile operating system (Android), the top-user-generated video platform and has more than 1.5 billion active users of its Gmail email service. Google also has a map service, a public DNS resolver service, a cloud service and a document store.¶
While twenty years ago an application would simply rely on the underlying operating system to provide its communications and transport services, now applications and services do this for themselves. This is a case of the intermediary or platform that provides the application integrating all the necessary components for providing a service on the Internet.¶
Some extremely useful research (from a variety of academic institutions including the University of Illinois, the University of California at Berkeley, the Oxford Internet Institute and the London School of Economics) has documented that there is extreme centralization in the area of Internet jurisdiction.¶
In the first case, numerous national laws are designed to have extraterritorial impact, extending their reach to individuals or corporations beyond the enacting state's borders. This practice has a long history predating the Internet, but its implications are amplified by the inherently cross-border nature of digital networks. Additionally, some nations are increasingly motivated to assert control over the Internet, further intensifying these effects. The intersection of global connectivity and national regulatory ambitions creates a complex legal landscape, where the extraterritorial application of laws can lead to significant jurisdictional challenges and conflicts. As countries strive to enforce their legal frameworks in the digital realm, the tension between national sovereignty and the borderless nature of the Internet becomes more pronounced, highlighting the need for international cooperation and dialogue to navigate these issues effectively.¶
In the second case, the landscape for jurisdiction for dispute resolution is often dictated by the provider of the service. Research shows that, despite the desire of many governments to pursue national sovereignty, jurisdiction for dispute resolution is dominated by a single nation. Rather than have the jurisdiction be in the country of the consumer of the service, the jurisdiction is determined by the provider.¶
In both of these cases, the risk is that policy makers will use this inequity as a reason to impose rules that spill over onto the Internet elsewhere, with potential unintended effects.¶
It has also been observed that there is a paradox concerning the physical infrastructure of the Internet. The physical infrastructure of the Internet has never been more diverse: the kinds and numbers of devices connected continues to grow at exponential rates. However, the network management infrastructure of the Internet continues to be significantly centralized. This paradox is, perhaps, a feature of architectural consolidation discussed above.¶
This memo includes no request to IANA.¶
While this document does not describe a specific protocol, it does discuss the evolving architecture of the Internet. Changes to the Internet's architecture have direct and indirect implications for the Internet's threat model. Specifically, the changes to the end-to-end model (see section 6 above) have inserted new interfaces which must be reflected in security considerations for new protocols.¶
This document seeks to rekindle the discussion on consolidation. As argued above, Internet consolidation is happening at different places and different layers of the Internet. Though there has been interest in Internet consolidation in the past, now is the time to start the discussions again.¶
The author is particularly grateful for comments received on this draft in the wake of a presentation at IETF 118.¶
Many thanks to all who discussed this with us in DINRG in 2021, 2022, 2023, and 2024. Special thanks go to the providers of detailed comments: Sheetal Kumar, Jessamine Pacis, Winthrop Yap Yu, Lisa Garcia and Michaela Nakayama Shapiro.¶