Color-coding

For other uses, see Color code.

In computer science and graph theory, the method of color-coding^[1]^[2] efficiently finds k-vertex simple paths, k-vertex cycles, and other small subgraphs within a given graph using probabilistic algorithms, which can then be derandomized and turned into deterministic algorithms. This method shows that many subcases of the subgraph isomorphism problem (an NP-complete problem) can in fact be solved in polynomial time.

The theory and analysis of the color-coding method were proposed in 1994 by Noga Alon, Raphael Yuster, and Uri Zwick.

Contents

1 Results

2 The method

2.1 Example

3 Derandomization

4 Applications

5 References

Results

The following results can be obtained through the method of color-coding:

For every fixed constant $k$, if a graph $G = (V, E)$ contains a simple cycle of size $k$, then such a cycle can be found in:

$O(V^\omega)$ expected time, or

$O(V^\omega \log V)$ worst-case time, where $\omega$ is the exponent of matrix multiplication.^[3]

For every fixed constant $k$, and every graph $G = (V, E)$ that is in any nontrivial minor-closed graph family (e.g., a planar graph), if $G$ contains a simple cycle of size $k$, then such a cycle can be found in:

$O(V)$ expected time, or

$O(V \log V)$ worst-case time.

If a graph $G = (V, E)$ contains a subgraph isomorphic to a bounded treewidth graph which has $O(\log V)$ vertices, then such a subgraph can be found in polynomial time.

The method

To find a subgraph $H = (V_H, E_H)$ in a given graph $G = (V, E)$, where $H$ can be a path, a cycle, or any bounded treewidth graph with $|V_H| = O(\log V)$, the method of color-coding begins by randomly coloring each vertex of $G$ with one of $k = |V_H|$ colors, and then tries to find a colorful copy of $H$ in the colored $G$. Here, a graph is colorful if every vertex in it is colored with a distinct color. The method repeats two steps, (1) randomly coloring the graph and (2) searching for a colorful copy of the target subgraph, and, if $G$ contains a copy of $H$, that copy is found with high probability once the process has been repeated sufficiently many times.
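The two basic ingredients, a uniformly random coloring and the colorfulness test, can be sketched in Python as follows. This is a minimal illustration, not code from the original paper: the function names `random_coloring` and `is_colorful` are hypothetical, colors are numbered from 0 to k - 1 for convenience, and the graph is represented only by its vertex collection.

```python
import random

def random_coloring(vertices, k, rng=None):
    """Assign each vertex a color drawn uniformly at random from 0..k-1."""
    rng = rng or random.Random()
    return {v: rng.randrange(k) for v in vertices}

def is_colorful(vertex_subset, coloring):
    """Return True when no two vertices of the subset share a color."""
    colors = [coloring[v] for v in vertex_subset]
    return len(set(colors)) == len(colors)
```

For instance, `is_colorful(["a", "b"], {"a": 0, "b": 1})` holds, while the same call with both vertices mapped to color 0 does not.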

Suppose a copy of $H$ in $G$ becomes colorful with some non-zero probability $p$. It immediately follows that if the random coloring is repeated $\tfrac{1}{p}$ times, then this copy is expected to become colorful once. Note that although $p$ is small, it can be shown that if $|V_H| = O(\log V)$, then $p$ is only polynomially small. Suppose further that there exists an algorithm that, given a graph $G$ and a coloring which maps each vertex of $G$ to one of the $k$ colors, finds a colorful copy of $H$, if one exists, within some runtime $O(r)$. Then the expected time to find a copy of $H$ in $G$, if one exists, is $O(\tfrac{r}{p})$.
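The repetition argument can be made concrete with a small driver loop. The sketch below is an illustration under stated assumptions: `find_colorful` is a hypothetical callback standing in for the $O(r)$ colorful-subgraph algorithm, and `trials_needed` uses the standard bound $(1 - p)^t \le e^{-pt}$ to pick a number of repetitions that pushes the failure probability below a chosen threshold.

```python
import math
import random

def trials_needed(p, failure_prob=0.01):
    """A number of trials t that suffices for (1 - p)**t <= failure_prob,
    derived from the bound (1 - p)**t <= exp(-p * t)."""
    return math.ceil(math.log(1.0 / failure_prob) / p)

def color_coding_search(vertices, k, find_colorful, trials, seed=None):
    """Repeat: color the vertices at random, then ask find_colorful(coloring)
    for a colorful copy. Return the first copy found, or None."""
    rng = random.Random(seed)
    vertices = list(vertices)
    for _ in range(trials):
        coloring = {v: rng.randrange(k) for v in vertices}
        found = find_colorful(coloring)
        if found is not None:
            return found
    return None
```

A return value of `None` only means that no colorful copy appeared in the colorings that were tried; it does not prove that $G$ contains no copy of $H$.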

Sometimes it is also desirable to use a more restricted version of colorfulness. For example, in the context of finding cycles in planar graphs, it is possible to develop an algorithm that finds well-colored cycles. Here, a cycle is well-colored if its vertices are colored by consecutive colors.
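One possible reading of this definition is sketched below. It assumes, as an interpretation rather than something stated in the text, that "consecutive colors" means that each step around the cycle increases the color by one modulo $k$, in one of the two directions of traversal. The function name `is_well_colored` and the 0-based colors are likewise illustrative.

```python
def is_well_colored(cycle, coloring, k):
    """cycle: list of k vertices in cyclic order; colors are 0..k-1.
    Assumed meaning: walking around the cycle in some direction, every
    step moves to the next color modulo k."""
    if len(cycle) != k:
        return False
    colors = [coloring[v] for v in cycle]
    forward = all((colors[(i + 1) % k] - colors[i]) % k == 1 for i in range(k))
    backward = all((colors[i] - colors[(i + 1) % k]) % k == 1 for i in range(k))
    return forward or backward
```

Under this reading every well-colored cycle is colorful, but not every colorful cycle is well-colored: a 4-cycle whose vertices carry the colors 0, 2, 1, 3 in cyclic order is colorful yet fails the test.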

Example

An example would be finding a simple cycle of length $k$ in a graph $G = (V, E)$.

By applying the random coloring method, each simple cycle has a probability of $k!/k^k > \tfrac{1}{e^k}$ to become colorful, since there are $k^k$ ways of coloring the $k$ vertices on the cycle, among which there are $k!$ colorful occurrences. Then an algorithm (described below) of runtime $O(V^\omega)$ can be adopted to find colorful cycles in the randomly colored graph $G$. Therefore, it takes $e^k\cdot O(V^\omega)$ expected overall time to find a simple cycle of length $k$ in $G$.
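The counting argument can be checked numerically for small $k$. The following sketch (function names are illustrative) computes $k!/k^k$ directly and also counts the colorful colorings of $k$ vertices by brute force.

```python
import math
from itertools import product

def colorful_probability(k):
    """Probability that k fixed vertices receive k distinct colors: k!/k^k."""
    return math.factorial(k) / k**k

def count_colorful_colorings(k):
    """Brute-force count of colorings of k vertices that use every color once."""
    return sum(1 for colors in product(range(k), repeat=k)
               if len(set(colors)) == k)
```

For $k = 4$ there are $4^4 = 256$ colorings, of which $4! = 24$ are colorful, so the probability is $24/256 \approx 0.094$, which exceeds $e^{-4} \approx 0.018$.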

The colorful cycle-finding algorithm works by first finding all pairs of vertices in $V$ that are connected by a colorful simple path of length $k - 1$, and then checking whether the two vertices in each pair are adjacent. Given a coloring function $c: V\rightarrow \{1, \dots, k\}$ of the graph $G$, enumerate all partitions of the color set $\{1, \dots, k\}$ into two subsets $C_1$, $C_2$ of size $k/2$ each. Note that $V$ can be divided accordingly into $V_1$ and $V_2$ (the vertices whose colors lie in $C_1$ and $C_2$, respectively), and let $G_1$ and $G_2$ denote the subgraphs induced by $V_1$ and $V_2$ respectively. Then recursively find colorful paths of length $k/2 - 1$ in each of $G_1$ and $G_2$. Suppose the boolean matrices $A_1$ and $A_2$ represent the connectivity of each pair of vertices in $G_1$ and $G_2$ by a colorful path, respectively, and let $B$ be the matrix describing the adjacency relations between vertices of $V_1$ and those of $V_2$. Then the boolean product $A_1 B A_2$ gives all pairs of vertices in $V$ that are connected by a colorful path of length $k - 1$. Thus, the recursive relation of matrix multiplications is $t(k) \le 2^k\cdot t(k/2)$, which yields a runtime of $2^{O(k)}\cdot V^\omega \in O(V^\omega)$ for constant $k$. Although this algorithm finds only the end points of the colorful path, another algorithm by Alon and Naor^[4] that finds colorful paths themselves can be incorporated into it.
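For experimentation, the colorful-cycle test can also be carried out with a simpler technique: a dynamic programme over sets of used colors, which replaces the matrix-multiplication recursion described above and does not achieve the $O(V^\omega)$ bound. The sketch below assumes an undirected graph given as a dictionary of neighbor sets and 0-based colors; the name `has_colorful_cycle` is illustrative. It relies on the fact that a walk whose vertices all have distinct colors is automatically a simple path.

```python
def has_colorful_cycle(adj, coloring, k):
    """adj: dict mapping each vertex to the set of its neighbors (undirected).
    coloring: dict mapping each vertex to a color in 0..k-1.
    Return True if some simple cycle on k vertices uses each color once."""
    if k < 3:
        return False
    full = (1 << k) - 1
    for s in adj:
        # States (end vertex, bitmask of colors used) of colorful paths from s.
        frontier = {(s, 1 << coloring[s])}
        for _ in range(k - 1):
            extended = set()
            for v, mask in frontier:
                for w in adj[v]:
                    bit = 1 << coloring[w]
                    if not mask & bit:  # color of w is still unused
                        extended.add((w, mask | bit))
            frontier = extended
        # A colorful path on k vertices closes into a cycle if its end meets s.
        if any(mask == full and s in adj[v] for v, mask in frontier):
            return True
    return False
```

Each start vertex costs at most $2^k$ color sets per end vertex, so this version runs in time $2^{O(k)}$ times a polynomial in the size of the graph.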

Derandomization

The derandomization of color-coding involves enumerating possible colorings of a graph $G$, such that the randomness of coloring $G$ is no longer required. For the target subgraph $H$ in $G$ to be discoverable, the enumeration has to include at least one coloring in which $H$ is colorful. To achieve this, enumerating a $k$-perfect family $F$ of hash functions from $\{1, 2, \dots, |V|\}$ to $\{1, 2, \dots, k\}$ is sufficient. By definition, $F$ is $k$-perfect if for every subset $S$ of $\{1, 2, \dots, |V|\}$ where $|S| = k$, there exists a hash function $h\in F$ such that $h: S \rightarrow \{1, 2, \dots, k\}$ is perfect, that is, one-to-one on $S$. In other words, for any given $k$ vertices there must exist a hash function in $F$ that colors them with $k$ distinct colors.
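The definition of a $k$-perfect family can be checked by brute force on small instances. In the sketch below, which is only an illustration, each hash function is represented as a sequence `h` with `h[i]` the color of element `i`, and both elements and colors are numbered from 0 rather than from 1; the name `is_k_perfect` is hypothetical.

```python
from itertools import combinations

def is_k_perfect(family, n, k):
    """family: collection of sequences h of length n with values in 0..k-1.
    Return True if every k-subset of range(n) is mapped one-to-one
    by at least one member of the family."""
    family = list(family)
    return all(
        any(len({h[x] for x in subset}) == k for h in family)
        for subset in combinations(range(n), k)
    )
```

For $n = 4$ and $k = 2$, the two functions `[0, 1, 0, 1]` and `[0, 0, 1, 1]` already form a 2-perfect family, whereas `[0, 1, 0, 1]` alone does not, because it gives elements 0 and 2 the same color.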

There are several approaches to construct such a $k$-perfect hash family:

The best explicit construction is by Moni Naor, Leonard J. Schulman, and Aravind Srinivasan,^[5] where a family of size $e^k k^{O(\log k)} \log |V|$ can be obtained. This construction does not require the target subgraph to exist in the original subgraph finding problem.

Another explicit construction by Jeanette P. Schmidt and Alan Siegel^[6] yields a family of size $2^{O(k)} \log^2 |V|$.

Another construction that appears in the original paper of Noga Alon et al.^[2] can be obtained by first building a $k$-perfect family that maps $\{1, 2, \dots, |V|\}$ to $\{1, 2, \dots, k^2\}$, followed by building another $k$-perfect family that maps $\{1, 2, \dots, k^2\}$ to $\{1, 2, \dots, k\}$. In the first step, it is possible to construct such a family with $2n\log k$ random bits that are almost $2\log k$-wise independent,^[7]^[8] where $n = |V|$, and the sample space needed for generating those random bits can be as small as $k^{O(1)} \log |V|$. In the second step, it has been shown by Jeanette P. Schmidt and Alan Siegel^[6] that the size of such a $k$-perfect family can be $2^{O(k)}$. Consequently, by composing the $k$-perfect families from both steps, a $k$-perfect family of size $2^{O(k)} \log |V|$ that maps from $\{1, 2, \dots, |V|\}$ to $\{1, 2, \dots, k\}$ can be obtained.

In the case of derandomizing well-coloring, where each vertex on the subgraph is colored consecutively, a $k$-perfect family of hash functions from $\{1, 2, \dots, |V|\}$ to $\{1, 2, \dots, k!\}$ is needed. A sufficient $k$-perfect family which maps from $\{1, 2, \dots, |V|\}$ to $\{1, 2, \dots, k^k\}$ can be constructed in a way similar to the first step of the third approach above. In particular, it is done by using $nk\log k$ random bits that are almost $k\log k$-wise independent, and the size of the resulting $k$-perfect family will be $k^{O(k)} \log |V|$.

The derandomization of the color-coding method can be easily parallelized, yielding efficient NC algorithms.

Applications

Color-coding has attracted much attention in the field of bioinformatics. One example is the detection of signaling pathways in protein-protein interaction (PPI) networks. Another example is to discover and to count the number of motifs in PPI networks. Studying both signaling pathways and motifs allows a deeper understanding of the similarities and differences of many biological functions, processes, and structures among organisms.

Due to the huge amount of gene data that can be collected, searching for pathways or motifs can be highly time-consuming. However, by exploiting the color-coding method, the motifs or signaling pathways with $k = O(\log n)$ vertices in a network $G$ with $n$ vertices can be found in polynomial time. This makes it possible to explore more complex or larger structures in PPI networks. More details can be found in the literature.^[9]^[10]

References

[1] Alon, N., Yuster, R., and Zwick, U. 1994. Color-coding: a new method for finding simple paths, cycles and other small subgraphs within large graphs. In Proceedings of the Twenty-Sixth Annual ACM Symposium on Theory of Computing (Montreal, Quebec, Canada, May 23–25, 1994). STOC '94. ACM, New York, NY, 326–335. DOI= http://doi.acm.org/10.1145/195058.195179

[2] Alon, N., Yuster, R., and Zwick, U. 1995. Color-coding. J. ACM 42, 4 (Jul. 1995), 844–856. DOI= http://doi.acm.org/10.1145/210332.210337

[3] Coppersmith–Winograd Algorithm

[4] Alon, N. and Naor, M. 1994. Derandomization, Witnesses for Boolean Matrix Multiplication and Construction of Perfect Hash Functions. Technical Report. UMI Order Number: CS94-11., Weizmann Science Press of Israel.

[5] Naor, M., Schulman, L. J., and Srinivasan, A. 1995. Splitters and near-optimal derandomization. In Proceedings of the 36th Annual Symposium on Foundations of Computer Science (October 23–25, 1995). FOCS. IEEE Computer Society, Washington, DC, 182.

[6] Schmidt, J. P. and Siegel, A. 1990. The spatial complexity of oblivious k-probe Hash functions. SIAM J. Comput. 19, 5 (Sep. 1990), 775–786. DOI= http://dx.doi.org/10.1137/0219054

[7] Naor, J. and Naor, M. 1990. Small-bias probability spaces: efficient constructions and applications. In Proceedings of the Twenty-Second Annual ACM Symposium on Theory of Computing (Baltimore, Maryland, United States, May 13–17, 1990). H. Ortiz, Ed. STOC '90. ACM, New York, NY, 213–223. DOI= http://doi.acm.org/10.1145/100216.100244

[8] Alon, N., Goldreich, O., Hastad, J., and Peralta, R. 1990. Simple construction of almost k-wise independent random variables. In Proceedings of the 31st Annual Symposium on Foundations of Computer Science (October 22–24, 1990). SFCS. IEEE Computer Society, Washington, DC, 544–553 vol.2. DOI= http://dx.doi.org/10.1109/FSCS.1990.89575

[9] Alon, N., Dao, P., Hajirasouliha, I., Hormozdiari, F., and Sahinalp, S. C. 2008. Biomolecular network motif counting and discovery by color coding. Bioinformatics 24, 13 (Jul. 2008), i241–i249. DOI= http://dx.doi.org/10.1093/bioinformatics/btn163

[10] Hüffner, F., Wernicke, S., and Zichner, T. 2008. Algorithm Engineering for Color-Coding with Applications to Signaling Pathway Detection. Algorithmica 52, 2 (Aug. 2008), 114–132. DOI= http://dx.doi.org/10.1007/s00453-007-9008-7
