Centrality
In graph theory and network analysis, centrality measures quantify the relative importance of a vertex within a graph (for example, how important a person is within a social network, or, in the theory of space syntax, how important a room is within a building or how well-used a road is within an urban network). Many centrality concepts were first developed in social network analysis, and many of the terms used to measure centrality reflect their sociological origin.[1]
There are four measures of centrality that are widely used in network analysis: degree centrality, betweenness, closeness, and eigenvector centrality. For a review as well as generalizations to weighted networks, see Opsahl et al. (2010).[2]
Degree centrality
Main article: Degree (graph theory)
The first, and simplest, is degree centrality. Degree centrality is defined as the number of links incident upon a node (i.e., the number of ties that a node has). Degree is often interpreted in terms of the immediate risk of a node for catching whatever is flowing through the network (such as a virus, or some information). If the network is directed (meaning that ties have direction), then we usually define two separate measures of degree centrality, namely indegree and outdegree. Indegree is a count of the number of ties directed to the node, and outdegree is the number of ties that the node directs to others. For positive relations such as friendship or advice, indegree is often interpreted as a form of popularity, and outdegree as gregariousness.
For a graph $G := (V, E)$ with $n$ vertices, the degree centrality $C_D(v)$ for vertex $v$ is:
- $C_D(v) = \deg(v)$
Calculating degree centrality for all nodes in a graph takes $\Theta(V^2)$ time with a dense adjacency matrix representation of the graph, and $\Theta(E)$ time with a sparse matrix representation, where $V$ and $E$ here stand for the numbers of vertices and edges.
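As a minimal sketch of the definitions above, assume an undirected graph is stored as a Python dict that maps each vertex to the set of its neighbours, and a directed graph as a list of (source, target) pairs. The function names `degree_centrality` and `in_out_degree` are illustrative and do not come from any library.

```python
def degree_centrality(adj):
    """Return {vertex: degree} for an undirected graph.

    `adj` is assumed to map each vertex to the set of its neighbours.
    """
    return {v: len(neighbours) for v, neighbours in adj.items()}


def in_out_degree(vertices, edges):
    """Return (indegree, outdegree) dicts for a directed graph.

    `edges` is assumed to be an iterable of (source, target) pairs.
    """
    indegree = {v: 0 for v in vertices}
    outdegree = {v: 0 for v in vertices}
    for source, target in edges:
        outdegree[source] += 1  # tie that `source` directs to another node
        indegree[target] += 1   # tie directed to `target`
    return indegree, outdegree


# A star graph: vertex 0 is tied to every other vertex.
star = {0: {1, 2, 3}, 1: {0}, 2: {0}, 3: {0}}
star_degrees = degree_centrality(star)
```

With an adjacency structure like this, each edge is touched a constant number of times, which matches the sparse-representation cost mentioned above.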
The definition of centrality on the node level can be extended to the whole graph. Let $v^*$ be the node with the highest degree centrality in $G$. Let $X := (Y, Z)$ be the $n$-node connected graph that maximizes the following quantity (with $y^*$ being the node with the highest degree centrality in $X$):
- $H = \sum_{j=1}^{|Y|} \left[ C_D(y^*) - C_D(y_j) \right]$
Then the degree centrality of the graph $G$ is defined as follows:
- $C_D(G) = \frac{\sum_{i=1}^{|V|} \left[ C_D(v^*) - C_D(v_i) \right]}{H}$
$H$ is maximized when the graph $X$ contains one node that is connected to all other nodes and all other nodes are connected only to this one central node (a star graph). In this case
- $H = (n - 1)(n - 2)$
so the degree centrality of $G$ reduces to:
- $C_D(G) = \frac{\sum_{i=1}^{|V|} \left[ C_D(v^*) - C_D(v_i) \right]}{(n - 1)(n - 2)}$
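The graph-level quantity can be sketched in a few lines: sum the differences between the largest degree and every node's degree, then divide by (n − 1)(n − 2), the value of H for a star graph. The function name `degree_centralization` and the dict-of-sets graph format are assumptions made for this illustration.

```python
def degree_centralization(adj):
    """Graph-level degree centrality, normalised by H = (n - 1)(n - 2).

    `adj` is assumed to map each vertex of an undirected graph to the
    set of its neighbours.
    """
    n = len(adj)
    if n < 3:
        # (n - 1)(n - 2) is zero for n < 3, so the ratio is undefined.
        raise ValueError("need at least three vertices")
    degrees = [len(neighbours) for neighbours in adj.values()]
    highest = max(degrees)
    sum_of_differences = sum(highest - d for d in degrees)
    return sum_of_differences / ((n - 1) * (n - 2))


star = {0: {1, 2, 3}, 1: {0}, 2: {0}, 3: {0}}
cycle = {0: {1, 3}, 1: {0, 2}, 2: {1, 3}, 3: {2, 0}}
star_value = degree_centralization(star)    # every leaf differs from the hub
cycle_value = degree_centralization(cycle)  # all degrees are equal
```

The star graph attains the maximum value and a cycle, where no node stands out, attains the minimum.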
Betweenness centrality
Main article: Betweenness Centrality
Betweenness is a centrality measure of a vertex within a graph (there is also edge betweenness, which is not discussed here). It was introduced by Linton Freeman[3] as a measure for quantifying the control of a human on the communication between other humans in a social network. In his conception, vertices that have a high probability of occurring on a randomly chosen shortest path between two randomly chosen nodes have a high betweenness.
For a graph $G := (V, E)$ with $n$ vertices, the betweenness $C_B(v)$ for vertex $v$ is computed as follows:
1. For each pair of vertices (s,t), compute all shortest paths between them.
2. For each pair of vertices (s,t), determine the fraction of shortest paths that pass through the vertex in question (here, vertex v).
3. Sum this fraction over all pairs of vertices (s,t).
Or, more succinctly[4]:
- $C_B(v) = \sum_{s \neq v \neq t \in V} \frac{\sigma_{st}(v)}{\sigma_{st}}$
where $\sigma_{st}$ is the number of shortest paths from $s$ to $t$, and $\sigma_{st}(v)$ is the number of shortest paths from $s$ to $t$ that pass through the vertex $v$. This may be normalised by dividing by the number of pairs of vertices not including $v$, which is $(n - 1)(n - 2)$ for directed graphs and $(n - 1)(n - 2) / 2$ for undirected graphs. For example, in an undirected star graph, the center vertex (which lies on every shortest path between two leaves) would have a betweenness of $(n - 1)(n - 2) / 2$ (1, if normalised), while the leaves (which lie on no shortest path between other vertices) would have a betweenness of 0.
Calculating the betweenness and closeness centralities of all the vertices in a graph involves calculating the shortest paths between all pairs of vertices on a graph. This takes $\Theta(V^3)$ time with the Floyd–Warshall algorithm, modified to not only find one but count all shortest paths between two nodes. On a sparse graph, Johnson's algorithm may be more efficient, taking $O(V^2 \log V + VE)$ time. On unweighted graphs, calculating betweenness centrality takes $O(VE)$ time using Brandes' algorithm[4].
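The following is a compact sketch in the style of Brandes' algorithm for unweighted graphs: one breadth-first search per source vertex counts shortest paths, and a backward pass accumulates each vertex's share of them. The dict-of-sets graph format and the function name `betweenness_centrality` are assumptions of this illustration, not the reference implementation from the cited paper.

```python
from collections import deque


def betweenness_centrality(adj, directed=False):
    """Unnormalised betweenness for an unweighted graph.

    `adj` is assumed to map each vertex to an iterable of its neighbours.
    """
    betweenness = {v: 0.0 for v in adj}
    for s in adj:
        # Breadth-first search from s, counting shortest paths (sigma).
        order = []                        # vertices in order of discovery
        preds = {v: [] for v in adj}      # predecessors on shortest paths
        sigma = {v: 0 for v in adj}
        dist = {v: -1 for v in adj}
        sigma[s], dist[s] = 1, 0
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if dist[w] < 0:           # w is seen for the first time
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:  # v precedes w on a shortest path
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Backward pass: accumulate dependencies, farthest vertices first.
        delta = {v: 0.0 for v in adj}
        while order:
            w = order.pop()
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1.0 + delta[w])
            if w != s:
                betweenness[w] += delta[w]
    if not directed:
        # Each undirected pair (s, t) was visited from both endpoints.
        for v in betweenness:
            betweenness[v] /= 2.0
    return betweenness


# Star graph with n = 5: the centre should score (n - 1)(n - 2) / 2 = 6.
star = {0: {1, 2, 3, 4}, 1: {0}, 2: {0}, 3: {0}, 4: {0}}
star_scores = betweenness_centrality(star)
```

The star graph reproduces the values stated earlier: the centre lies on every leaf-to-leaf shortest path and the leaves lie on none.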
In calculating betweenness and closeness centralities of all vertices in a graph, it is assumed that graphs are undirected and connected with the allowance of loops and multiple edges. When specifically dealing with network graphs, graphs are often without loops or multiple edges to maintain simple relationships (where edges represent connections between two people or vertices). In this case, using Brandes' algorithm will divide final centrality scores by 2 to account for each shortest path being counted twice.[4]
Closeness centrality
In topology and related areas in mathematics, closeness is one of the basic concepts in a topological space. Intuitively we say two sets are close if they are arbitrarily near to each other. The concept can be defined naturally in a metric space where a notion of distance between elements of the space is defined, but it can be generalized to topological spaces where we have no concrete way to measure distances.
In graph theory, closeness is a centrality measure of a vertex within a graph. Vertices that are 'shallow' to other vertices (that is, those that tend to have short geodesic distances to other vertices within the graph) have higher closeness. Closeness is preferred in network analysis to mean shortest-path length, as it gives higher values to more central vertices, and so is usually positively associated with other measures such as degree.
In network theory, closeness is a measure of centrality defined as the mean geodesic distance (i.e., the shortest-path length) between a vertex $v$ and all other vertices reachable from it:
- $\frac{\sum_{t \in V \setminus v} d_G(v, t)}{n - 1}$
where $n \geq 2$ is the size of the network's 'connectivity component' $V$ reachable from $v$. Closeness can be regarded as a measure of how long it will take information to spread from a given vertex to other reachable vertices in the network.[5]
Some define closeness to be the reciprocal of this quantity, but either way the information communicated is the same (this time estimating the speed instead of the timespan). The closeness $C_C(v)$ for a vertex $v$ is the reciprocal of the sum of geodesic distances to all other vertices of $V$[6]:
- $C_C(v) = \frac{1}{\sum_{t \in V \setminus v} d_G(v, t)}$
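Both variants described above (the mean geodesic distance and the reciprocal of the summed distances) can be sketched with a breadth-first search on an unweighted graph. The helper names `bfs_distances`, `mean_distance` and `closeness`, and the dict-of-sets graph format, are assumptions of this example; only vertices reachable from the start vertex are counted.

```python
from collections import deque


def bfs_distances(adj, source):
    """Geodesic distances from `source` to every vertex reachable from it."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        v = queue.popleft()
        for w in adj[v]:
            if w not in dist:
                dist[w] = dist[v] + 1
                queue.append(w)
    return dist


def mean_distance(adj, v):
    """Mean geodesic distance from v to the other reachable vertices."""
    dist = bfs_distances(adj, v)
    n = len(dist)  # size of the component reachable from v, including v
    if n < 2:
        raise ValueError("vertex has no reachable neighbours")
    return sum(dist.values()) / (n - 1)


def closeness(adj, v):
    """Reciprocal of the sum of geodesic distances from v."""
    total = sum(bfs_distances(adj, v).values())
    if total == 0:
        raise ValueError("vertex has no reachable neighbours")
    return 1.0 / total


# Path graph 0 - 1 - 2: the middle vertex is closest to the others.
path = {0: {1}, 1: {0, 2}, 2: {1}}
middle = closeness(path, 1)
end = closeness(path, 0)
```

On the three-vertex path, the middle vertex has summed distance 2 and an end vertex has summed distance 3, so the middle vertex receives the higher closeness.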
Different methods and algorithms can be introduced to measure closeness, like the random-walk centrality introduced by Noh and Rieger (2003) that is a measure of the speed with which randomly walking messages reach a vertex from elsewhere in the network—a sort of random-walk version of closeness centrality.[7]
The information centrality of Stephenson and Zelen (1989) is another closeness measure, which bears some similarity to that of Noh and Rieger. In essence it measures the harmonic mean length of paths ending at a vertex i, which is smaller if i has many short paths connecting it to other vertices.[8]
Dangalchev (2006), in order to measure network vulnerability, modifies the definition of closeness so that it can be used for disconnected graphs and the total closeness is easier to calculate[9]:
- $C_C(v) = \sum_{t \in V \setminus v} 2^{-d_G(v, t)}$
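A small sketch of this variant, assuming the form in which each reachable vertex t contributes 2 raised to the power of minus its distance from v, and unreachable vertices contribute nothing. The function name `dangalchev_closeness` and the dict-of-sets graph format are illustrative assumptions.

```python
from collections import deque


def dangalchev_closeness(adj, v):
    """Sum of 2 ** (-distance) over all vertices reachable from v.

    Unreachable vertices are treated as infinitely far away, so they
    add zero; this is what makes the measure usable on disconnected
    graphs.
    """
    dist = {v: 0}
    queue = deque([v])
    while queue:  # breadth-first search for geodesic distances
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return sum(2.0 ** -d for t, d in dist.items() if t != v)


# Path 0 - 1 - 2 plus an isolated vertex 3 (a disconnected graph).
graph = {0: {1}, 1: {0, 2}, 2: {1}, 3: set()}
values = {v: dangalchev_closeness(graph, v) for v in graph}
```

Unlike the reciprocal-of-sum definition, this version stays finite and well defined for the isolated vertex, which simply scores zero.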
An extension to networks with disconnected components has been proposed by Opsahl (2010).[10]
Eigenvector centrality
Eigenvector centrality is a measure of the importance of a node in a network. It assigns relative scores to all nodes in the network based on the principle that connections to high-scoring nodes contribute more to the score of the node in question than equal connections to low-scoring nodes. Google's PageRank is a variant of the eigenvector centrality measure.[11]
Using the adjacency matrix to find eigenvector centrality
Let $x_i$ denote the score of the $i$th node. Let $A = (a_{i,j})$ be the adjacency matrix of the network. Hence $a_{i,j} = 1$ if the $i$th node is linked to the $j$th node, and $a_{i,j} = 0$ otherwise. More generally, the entries in $A$ can be real numbers representing connection strengths, as in a stochastic matrix.
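For the $i$th node, let the centrality score be proportional to the sum of the scores of all nodes which are connected to it. Hence
- $x_i = \frac{1}{\lambda} \sum_{j \in M(i)} x_j = \frac{1}{\lambda} \sum_{j=1}^{N} a_{i,j} x_j$
where $M(i)$ is the set of nodes that are connected to the $i$th node, $N$ is the total number of nodes and $\lambda$ is a constant. In vector notation this can be rewritten as
- $\mathbf{x} = \frac{1}{\lambda} A \mathbf{x}$, or as the eigenvector equation $A \mathbf{x} = \lambda \mathbf{x}$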
In general, there will be many different eigenvalues λ for which an eigenvector solution exists. However, the additional requirement that all the entries in the eigenvector be positive implies (by the Perron–Frobenius theorem) that only the greatest eigenvalue results in the desired centrality measure.[12] The ith component of the related eigenvector then gives the centrality score of the ith node in the network. Power iteration is one of many eigenvalue algorithms that may be used to find this dominant eigenvector.[11]
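Power iteration can be sketched in plain Python as follows. One design choice here is an assumption of this example rather than part of the text above: the iteration is applied to A + I instead of A. The shifted matrix has the same eigenvectors, and the shift avoids the oscillation that plain power iteration shows on bipartite graphs such as a star. The function name and parameters are illustrative.

```python
def eigenvector_centrality(A, max_iter=1000, tol=1e-10):
    """Approximate the dominant eigenvector of a non-negative matrix A.

    `A` is assumed to be a square list of lists (an adjacency matrix).
    Scores are scaled so that the largest entry equals 1.
    """
    n = len(A)
    x = [1.0] * n  # strictly positive starting vector
    for _ in range(max_iter):
        # One step of power iteration on (A + I): y = x + A x.
        y = [x[i] + sum(A[i][j] * x[j] for j in range(n)) for i in range(n)]
        scale = max(y)  # positive, because y >= x > 0 entrywise
        y = [value / scale for value in y]
        if max(abs(a - b) for a, b in zip(x, y)) < tol:
            return y
        x = y
    return x  # best estimate if the tolerance was not reached


# Triangle 0-1-2 with an extra vertex 3 attached to vertex 0.
triangle_with_tail = [
    [0, 1, 1, 1],
    [1, 0, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
]
scores = eigenvector_centrality(triangle_with_tail)
```

In this small graph, vertex 0 scores highest, the two other triangle vertices tie, and the pendant vertex scores lowest, since its only connection is to a single node.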
Definition and Characterization of Centrality Indices
Besides the classic centrality indices named above, there are dozens of other, more specialized centrality indices. Although the notion of centrality is intuitive, there is not yet a definition or characterization of centrality indices that captures all of them[13]. A very loose definition of a centrality index is the following:
A centrality index is a real-valued function on the nodes of a graph. It is a structural index, i.e., if G and H are two isomorphic graphs and Φ is the mapping from the vertex set V(G) of G to V(H), then the centrality of a vertex v of G must be the same as the centrality of Φ(v) in H. Conventionally, the higher the centrality index of a node, the higher its perceived centrality in the graph[14]. This definition comprises all classic centrality measures but not all measures that fulfill this definition would be accepted as centrality indices.
Borgatti and Everett summarize that centrality indices measure the position of a node along a predefined set of walks. They characterize centrality indices along four dimensions: the set of walks, whether the length or the number of these walks is considered, the position of the node on the walks (at the start=radial; in the middle=medial), and how the numbers assigned to the paths are summarized in the measure (average, median, weighted sum, ...)[13]. This leads to a characterization by the way a centrality index is calculated. In a different characterization, Borgatti differentiates the centrality indices by what type of paths they consider and which type of network flow they imply[15]. The latter characterizes the centrality indices by the quality with which they predict which node is most central for a given network flow process. This characterization thus provides guidance on when to use which centrality index.
Centralization
The centralization of any network is a measure of how central its most central node is in relation to how central all the other nodes are.[16] The general definition of centralization for non-weighted networks was proposed by Linton Freeman (1979). Centralization measures then (a) calculate the sum of differences in centrality between the most central node in a network and all other nodes; and (b) divide this quantity by the theoretically largest such sum of differences in any network of the same size.[16] Thus, every centrality measure can have its own centralization measure. Defined formally, if $C_x(p_i)$ is any centrality measure of point $i$, if $C_x(p^*)$ is the largest such measure in the network, and if $\max \sum_{i=1}^{N} \left[ C_x(p^*) - C_x(p_i) \right]$ is the largest sum of differences in point centrality $C_x$ for any graph with the same number of nodes, then the centralization of the network is:[16]
- $C_x = \frac{\sum_{i=1}^{N} \left[ C_x(p^*) - C_x(p_i) \right]}{\max \sum_{i=1}^{N} \left[ C_x(p^*) - C_x(p_i) \right]}$
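Steps (a) and (b) can be sketched generically for any centrality measure. The theoretical maximum in step (b) is not derived here; instead the caller passes the scores of a reference graph that is assumed to attain it. In the example, a star graph of the same size plays that role for betweenness, which is an assumption of this sketch. The function name `centralization` is illustrative.

```python
def centralization(scores, reference_scores):
    """Centralization of a network for an arbitrary centrality measure.

    `scores` holds the centrality of every node in the network.
    `reference_scores` holds the scores of a graph of the same size that
    is ASSUMED to maximise the sum of differences (for example a star).
    """
    def sum_of_differences(values):
        highest = max(values)
        return sum(highest - value for value in values)

    largest = sum_of_differences(reference_scores)
    if largest == 0:
        raise ValueError("reference graph has no centrality differences")
    return sum_of_differences(scores) / largest


# Unnormalised betweenness scores for two undirected graphs on 5 nodes.
path_scores = [0, 3, 4, 3, 0]  # path graph 0 - 1 - 2 - 3 - 4
star_scores = [6, 0, 0, 0, 0]  # star graph: centre has (n - 1)(n - 2) / 2
path_centralization = centralization(path_scores, star_scores)
star_centralization = centralization(star_scores, star_scores)
```

The star graph scores 1 against itself by construction, while the path graph, whose betweenness is spread over several interior nodes, scores well below 1.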
See also
- Distance in graphs
- Alpha centrality
Notes and references
- ^ Newman, M.E.J. 2010. Networks: An Introduction. Oxford, UK: Oxford University Press.
- ^ Opsahl, Tore; Agneessens, Filip; Skvoretz, John (2010). "Node centrality in weighted networks: Generalizing degree and shortest paths". Social Networks 32 (3): 245. doi:10.1016/j.socnet.2010.03.006. http://toreopsahl.com/2010/04/21/article-node-centrality-in-weighted-networks-generalizing-degree-and-shortest-paths/.
- ^ Freeman, Linton (1977). "A set of measures of centrality based upon betweenness". Sociometry 40: 35-41.
- ^ a b c Brandes, Ulrik (2001). "A faster algorithm for betweenness centrality" (PDF). Journal of Mathematical Sociology 25: 163-177. http://citeseer.ist.psu.edu/viewdoc/summary?doi=10.1.1.11.2024. Retrieved 10.11.2011.
- ^ Newman, MEJ, 2003, Arxiv preprint cond-mat/0309045.
- ^ Sabidussi, G. (1966). The centrality index of a graph. Psychometrika 31, 581–603.
- ^ J. D. Noh and H. Rieger, Phys. Rev. Lett. 92, 118701 (2004).
- ^ Stephenson, K. A. and Zelen, M., 1989. Rethinking centrality: Methods and examples. Social Networks 11, 1–37.
- ^ Dangalchev, Ch., Residual Closeness in Networks, Physica A 365, 556 (2006).
- ^ Tore Opsahl. Closeness centrality in networks with disconnected components. http://toreopsahl.com/2010/03/20/closeness-centrality-in-networks-with-disconnected-components/.
- ^ a b http://www.ams.org/samplings/feature-column/fcarc-pagerank
- ^ M. E. J. Newman (PDF). The mathematics of networks. http://www-personal.umich.edu/~mejn/papers/palgrave.pdf. Retrieved 2006-11-09.
- ^ a b Borgatti, Stephen P.; Everett, Martin G. (2005). "A Graph-Theoretic Perspective on Centrality". Social Networks (Elsevier) 28: 466-484. doi:10.1016/j.socnet.2005.11.005.
- ^ Koschützki, Dirk; Katharina A. Lehmann, Leon Peeters, Stefan Richter, Dagmar Tenfelde-Podehl, Oliver Zlotowski (2005). "Centrality Indices". In Ulrik Brandes, Thomas Erlebach. Network Analysis - Methodological Foundations. LNCS 3418. Springer Verlag, Heidelberg, Germany. pp. 16-60. ISBN 978-3-540-24979-5.
- ^ Stephen P. Borgatti (2005). "Centrality and Network Flow". Social Networks (Elsevier) 27: 55-71.
- ^ a b c Freeman, L. C. (1979). Centrality in social networks: Conceptual clarification. Social Networks, 1(3), 215-239.
Further reading
- Freeman, L. C. (1979). Centrality in social networks: Conceptual clarification. Social Networks, 1(3), 215-239.
- Sabidussi, G. (1966). The centrality index of a graph. Psychometrika, 31 (4), 581-603.
- Freeman, L. C. (1977) A set of measures of centrality based on betweenness. Sociometry 40, 35-41.
- Koschützki, D.; Lehmann, K. A.; Peeters, L.; Richter, S.; Tenfelde-Podehl, D. and Zlotowski, O. (2005) Centrality Indices. In Brandes, U. and Erlebach, T. (Eds.) Network Analysis: Methodological Foundations, pp. 16–61, LNCS 3418, Springer-Verlag.
- Bonacich, P.(1987) Power and Centrality: A Family of Measures, The American Journal of Sociology, 92 (5), pp 1170–1182
Wikimedia Foundation. 2010.