Cascade failure

A cascade failure is a series of events on the internet in which network traffic to or between large sections of the internet is severely impaired or halted because hardware or software has failed or been disconnected. Like the more generic cascading failure found in, for instance, electrical systems, a cascade failure can affect large groups of people and systems.

Causes

The usual cause of a cascade failure is the overloading of a single, crucial router or node. The node goes down, even if only briefly, and its traffic is rerouted to or through an alternative path. That alternative path becomes overloaded in turn and goes down as well, and so on. The failure also affects systems that depend on the node for regular operation.

A cascade failure can also be triggered by taking a node down for maintenance or upgrades.
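
The overload-and-reroute mechanism described above can be illustrated with a small simulation. This is only a sketch under simplifying assumptions that are not part of the article: the four-node topology, the capacities and loads, and the rule that a failed node's traffic is split evenly among its live neighbours are all hypothetical. Real routing protocols redistribute traffic in more complicated ways.

```python
from collections import deque

def simulate_cascade(neighbors, capacity, load, first_failure):
    """Return the order in which nodes fail after `first_failure` goes down.

    Assumption: a failed node's load is split evenly among its live neighbours.
    """
    load = dict(load)                  # work on a copy of the caller's data
    down = set()
    failed_order = []
    queue = deque([first_failure])
    while queue:
        node = queue.popleft()
        if node in down:
            continue                   # already handled
        down.add(node)
        failed_order.append(node)
        alive = [n for n in neighbors[node] if n not in down]
        if not alive:
            continue                   # no alternative path: traffic is dropped
        share = load[node] / len(alive)
        load[node] = 0
        for n in alive:
            load[n] += share
            if load[n] > capacity[n]:  # the alternative path is now overloaded
                queue.append(n)
    return failed_order

# Hypothetical topology: A is the crucial node, B and C are alternative paths.
neighbors = {"A": ["B", "C"], "B": ["A", "D"], "C": ["A", "D"], "D": ["B", "C"]}
capacity = {"A": 100, "B": 60, "C": 60, "D": 200}
load = {"A": 80, "B": 40, "C": 40, "D": 50}

print(simulate_cascade(neighbors, capacity, load, "A"))
```

With these made-up numbers the failure of A takes down every other node. Raising the capacity of B and C to 100 stops the cascade at A, which shows why spare capacity on alternative paths matters.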

Symptoms

The symptoms of a cascade failure are easy to see: packet loss and high network latency, not just to single systems but to whole sections of a network or the internet. Both are caused by nodes that have stopped operating because of congestion collapse. Such nodes are still present in the network, but little or no useful communication passes through them. As a result, routes can still be considered valid even though they no longer provide communication.
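
The difference between a node that is present and a node that is useful can be made concrete with a small check. The sketch below is an illustration only: the function name, its parameters, and the 50% threshold are assumptions and do not come from any monitoring tool or standard.

```python
def classify_node(responds_to_probe, packets_offered, packets_delivered,
                  min_delivery_ratio=0.5):
    """Classify a node from a reachability probe and simple traffic counters.

    `min_delivery_ratio` is an arbitrary, hypothetical threshold.
    """
    if not responds_to_probe:
        return "down"
    if packets_offered == 0:
        return "idle"
    ratio = packets_delivered / packets_offered
    # A node that answers probes but delivers little traffic still looks
    # like a valid route, which is the situation described above.
    return "healthy" if ratio >= min_delivery_ratio else "congested"

print(classify_node(True, 1000, 20))
```

A check that looks only at `responds_to_probe` would report the node in this example as available, although only 2% of the offered packets get through.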

If enough routes go down because of a cascade failure, a complete section of the network or internet can become unreachable. Although undesirable, this can help speed up recovery: connections time out and other nodes give up trying to establish connections to the sections that have been cut off, which decreases the load on the nodes involved.

A common sight during a cascade failure is a walking failure, in which one section goes down and causes the next section to fail, after which the first section comes back up. This ripple can make several passes through the same sections or connecting nodes before stability is restored.

History

Cascade failures are a relatively recent development, a consequence of the massive increase in traffic and the high interconnectivity between systems and networks. The term was first applied in this context in the late 1990s by a Dutch IT professional and has slowly become a relatively common term for this kind of large-scale failure.

Example

As an example, consider what happens when a connecting node between a local ISP and its Internet backbone is overloaded. Initially, the traffic that would normally go through the node stops, and systems and users get errors about not being able to reach hosts. Usually, the redundant systems of the ISP respond very quickly and choose another path through a different backbone. This alternative route is longer, with more hops, and so passes through more systems that do not normally process the amount of traffic they are suddenly offered. This can cause one or more systems along the alternative route to go down, causing similar problems of their own.
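
The rerouting step in this example can be sketched with a breadth-first search that finds the route with the fewest hops, first with every node up and then with the overloaded node removed. The topology and node names below are hypothetical, and fewest-hops selection is a simplification of how real routing protocols choose paths.

```python
from collections import deque

def shortest_path(links, src, dst, down=frozenset()):
    """Return the fewest-hops path from src to dst, skipping nodes in `down`."""
    if src in down or dst in down:
        return None
    previous = {src: None}
    queue = deque([src])
    while queue:
        node = queue.popleft()
        if node == dst:
            path = []
            while node is not None:    # walk back to the source
                path.append(node)
                node = previous[node]
            return path[::-1]
        for nxt in links.get(node, ()):
            if nxt not in down and nxt not in previous:
                previous[nxt] = node
                queue.append(nxt)
    return None                        # destination unreachable

# Hypothetical network: the ISP reaches "host" through one of two backbones.
links = {
    "isp": ["edge1", "edge2"],
    "edge1": ["isp", "backbone_a"],
    "edge2": ["isp", "relay"],
    "relay": ["edge2", "backbone_b"],
    "backbone_a": ["edge1", "host"],
    "backbone_b": ["relay", "host"],
    "host": ["backbone_a", "backbone_b"],
}

normal = shortest_path(links, "isp", "host")
detour = shortest_path(links, "isp", "host", down={"edge1"})
print(len(normal) - 1, "hops normally,", len(detour) - 1, "hops after the failure")
```

In this made-up network the normal route takes three hops through `backbone_a`, while the detour takes four hops through `relay` and `backbone_b`, nodes that now carry traffic they do not normally handle.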

Related systems are affected as well. For example, DNS resolution might fail, and the loss of a service that normally keeps systems interconnected can break connections that do not directly involve the systems that went down. This, in turn, may cause seemingly unrelated nodes to develop problems, which can trigger another cascade failure of their own.

See also

* Chain reaction
* Congestion collapse


Wikimedia Foundation. 2010.
