Load-balanced switch

A load-balanced switch is a switch architecture that guarantees 100% throughput (the equivalent of perfect arbitration) with no central arbitration at all, at the cost of sending each packet across the crossbar twice. As of 2005, load-balanced switches are a subject of research for large routers scaled past the point of practical central arbitration.

Introduction

Internet routers are typically built from line cards connected together by a switch. Routers supporting moderate total bandwidth may use a bus as their switch, but high-bandwidth routers typically use some sort of crossbar interconnection. In a crossbar, each output connects to one input at a time, so that information can flow through every output simultaneously. Crossbars used for packet switching are typically reconfigured tens of millions of times per second. The schedule of these configurations is determined by a central arbiter (for example, a Wavefront arbiter) in response to requests by the line cards to send information to one another.

Perfect arbitration would result in throughput limited only by the maximum throughput of each crossbar input or output. For example, if all traffic coming into line cards A and B is destined for line card C, then the maximum traffic that cards A and B can process together is limited by the rate of C. Perfect arbitration has been shown to require a massive amount of computation, which scales up much faster than the number of ports on the crossbar. Practical systems use imperfect arbitration heuristics (e.g. iSLIP) that can be computed in a reasonable amount of time.
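
The output-limited bound in the example above can be made concrete with a small calculation. The following is a minimal Python sketch, not taken from the source: the function name `max_admissible_scale` and the traffic-matrix representation are illustrative assumptions. It computes the largest factor by which an offered traffic matrix can be scaled before some input or output line card exceeds its rate, which is the most that even a perfect arbiter could sustain.

```python
def max_admissible_scale(demand, rate):
    """Return the largest factor by which `demand` can be scaled
    so that no input (row) or output (column) exceeds `rate`.

    demand[i][j] is the offered rate from line card i to line card j.
    Hypothetical helper for illustration only.
    """
    n = len(demand)
    input_load = [sum(row) for row in demand]
    output_load = [sum(demand[i][j] for i in range(n)) for j in range(n)]
    peak = max(input_load + output_load)
    return float("inf") if peak == 0 else rate / peak


# Cards A and B (rows 0 and 1) send everything to card C (column 2).
demand = [
    [0.0, 0.0, 1.0],
    [0.0, 0.0, 1.0],
    [0.0, 0.0, 0.0],
]
scale = max_admissible_scale(demand, rate=1.0)
```

Here the load offered to C is twice its rate, so `scale` is 0.5: A and B together can deliver only half of what they offer, however good the arbiter is.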

A load-balanced switch is not related to a load balancing switch, which refers to a kind of router used as a front end to a farm of web servers to spread requests to a single website across many servers.

Basic architecture

As shown in the figure to the right, a load-balanced switch has N input line cards, each of rate R, each connected to N buffers by a link of rate R/N. Those buffers are in turn each connected to N output line cards, each of rate R, by links of rate R/N. The buffers in the center are partitioned into N virtual output queues.

Each input line card spreads its packets evenly to the N buffers, something it can clearly do without contention. Each buffer writes these packets into a single buffer-local memory at a combined rate of R. Simultaneously, each buffer sends packets at the head of each virtual output queue to each output line card, again at rate R/N to each card. The output line card can clearly forward these packets out the line with no contention.
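
The two-stage operation described above can be sketched as a slot-based toy simulation. This is only an illustration under stated assumptions, not the implementation of any particular switch: each rate-R/N link is modeled as one packet every N time slots, the connection pattern is a fixed round-robin that ignores traffic, and the class and function names (`LoadBalancedSwitch`, `run_permutation_traffic`) are hypothetical.

```python
from collections import deque


class LoadBalancedSwitch:
    """Slot-based toy model of a two-stage load-balanced switch.

    Assumption: each rate-R/N link is modeled as one packet every N slots,
    using a fixed round-robin connection pattern that ignores traffic.
    """

    def __init__(self, n):
        self.n = n
        self.slot = 0
        self.inputs = [deque() for _ in range(n)]
        # voq[k][j] holds packets in buffer k waiting for output j.
        self.voq = [[deque() for _ in range(n)] for _ in range(n)]
        self.delivered = [[] for _ in range(n)]

    def arrive(self, inp, out, label):
        self.inputs[inp].append((out, label))

    def step(self):
        n, t = self.n, self.slot
        # Second mesh: buffer k serves output (k + t) mod n this slot.
        for k in range(n):
            j = (k + t) % n
            if self.voq[k][j]:
                self.delivered[j].append(self.voq[k][j].popleft())
        # First mesh: input i hands one packet to buffer (i + t) mod n.
        for i in range(n):
            if self.inputs[i]:
                out, label = self.inputs[i].popleft()
                self.voq[(i + t) % n][out].append(label)
        self.slot += 1


def run_permutation_traffic(n, slots):
    """Each input i sends one packet per slot to output i, then the switch drains."""
    sw = LoadBalancedSwitch(n)
    for t in range(slots):
        for i in range(n):
            sw.arrive(i, i, (i, t))
        sw.step()
    for _ in range(2 * n):  # no new arrivals; let the buffers empty
        sw.step()
    return sw


sw = run_permutation_traffic(n=4, slots=40)
delivered_total = sum(len(d) for d in sw.delivered)
```

In this run every input is loaded at full line rate toward a single output, yet all offered packets reach their outputs without any component inspecting the traffic pattern. The sketch makes no attempt to keep packets in order.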

Each buffer in a load-balanced switch acts as a shared-memory switch, and a load-balanced switch is essentially a way to scale up a shared-memory switch, at the cost of additional latency associated with forwarding packets at rate R/N twice.

The Stanford group investigating load-balanced switches is concentrating on implementations where the number of buffers is equal to the number of line cards. One buffer is placed on each line card, and the two interconnection meshes are actually the same mesh, supplying rate 2R/N between every pair of line cards. But the basic load-balanced switch architecture does not require that the buffers be placed on the line cards, or that there be the same number of buffers and line cards.

One interesting property of a load-balanced switch is that, although the mesh connecting line cards to buffers is required to connect every line card to every buffer, there is no requirement that the mesh act as a non-blocking crossbar, nor that the connections be responsive to any traffic pattern. Such a connection is far simpler than a centrally arbitrated crossbar.

Keeping packets in-order

If two packets destined for the same output arrive back-to-back at one line card, they will be spread to two different buffers, which could have two different occupancies, and so the packets could be reordered by the time they are delivered to the output. Although reordering is legal, it is typically undesirable because TCP does not perform well with reordered packets.
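
The reordering scenario above can be illustrated with a few lines of Python. This is a deliberately simplified sketch: it assumes each virtual output queue is served once every N slots (a rate-R/N link) and ignores the phase of the service schedule, and the helper name `departure_slot` and the chosen occupancies are hypothetical.

```python
def departure_slot(arrival_slot, packets_ahead, n):
    """Slot at which a packet leaves its buffer, assuming its virtual output
    queue is served once every n slots and ignoring the phase of the
    service schedule. Hypothetical simplification."""
    return arrival_slot + (packets_ahead + 1) * n


n = 4
# Two back-to-back packets for the same output land in different buffers.
packets = [
    ("first", departure_slot(arrival_slot=0, packets_ahead=2, n=n)),   # busy buffer
    ("second", departure_slot(arrival_slot=1, packets_ahead=0, n=n)),  # empty buffer
]
output_order = [name for name, slot in sorted(packets, key=lambda p: p[1])]
```

Because the first packet waits behind two others while the second finds an empty queue, `output_order` comes out as second, then first, even though the packets arrived in the opposite order.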

By adding yet more latency and buffering, the load-balanced switch can maintain packet order within flows using only local information. One such algorithm is FOFF (Fully Ordered Frames First). FOFF has the additional benefits of removing any vulnerability to pathological traffic patterns, and providing a mechanism for implementing priorities.

Implementations

Single chip crossbar plus load-balancing arbiter

The Stanford University Tiny Tera project (see Abrizio) introduced a switch architecture that required at least two chip designs for the switching fabric itself (the crossbar slice and the arbiter). Upgrading the arbiter to include load-balancing and combining these devices could have reliability, cost and throughput advantages.

Single global router

Since the line cards in a load-balanced switch do not need to be physically near one another, one possible implementation is to use an entire continent- or global-sized backbone network as the interconnection mesh, and core routers as the "line cards". Such an implementation suffers from having all latencies increased to twice the worst-case transmission latency. But it has a number of intriguing advantages:

* Large backbone packet networks typically have massive overcapacity (10x or more) to deal with imperfect capacity planning, congestion, and other problems. A load-balanced switch backbone can deliver 100% throughput with an overcapacity of just 2x, as measured across the whole system.

* The underpinnings of large backbone networks are usually optical channels that cannot be quickly switched. These map well to the constant-rate 2R/N channels of the load-balanced switch's mesh.

* No route tables need be changed based on global congestion information, because there is no global congestion.

* Rerouting in the case of a node failure does require changing the configuration of the optical channels. But the reroute can be precomputed (there are only a finite number of nodes that can fail), and the reroute causes no congestion that would then require further route table changes.
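
The 2x overcapacity figure in the first point above can be checked with a short calculation. This is a sketch that assumes, following the Basic architecture section, that the combined mesh supplies rate 2R/N between every ordered pair of the N line cards (counting a card's link to its own buffer); the symbols C_mesh and C_line are introduced here only for illustration.

```latex
C_{\text{mesh}} = N \cdot N \cdot \frac{2R}{N} = 2NR,
\qquad
C_{\text{line}} = N R,
\qquad
\frac{C_{\text{mesh}}}{C_{\text{line}}} = 2 .
```

The factor of two reflects each packet crossing the mesh twice, once to reach a buffer and once to reach its output.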

External references

* Optimal Load-Balancing. I. Keslassy, C. Chang, N. McKeown, and D. Lee. http://tiny-tera.stanford.edu/~nickm/papers/PID48684.pdf
* Scaling Internet Routers Using Optics. I. Keslassy, S. Chuang, K. Yu, D. Miller, M. Horowitz, O. Solgaard, and N. McKeown. http://tiny-tera.stanford.edu/~nickm/papers/sigcomm2003.pdf


Wikimedia Foundation. 2010.
