Confusion matrix

In the field of artificial intelligence, a confusion matrix is a specific table layout that allows visualization of the performance of an algorithm, typically a supervised learning one (in unsupervised learning it is usually called a matching matrix). Each column of the matrix represents the instances in a predicted class, while each row represents the instances in an actual class. The name stems from the fact that it makes it easy to see if the system is confusing two classes (i.e. commonly mislabeling one as another).

Example

If a classification system has been trained to distinguish between cats, dogs and rabbits, a confusion matrix summarizes the results of testing the algorithm for further inspection. Assuming a sample of 27 animals (8 cats, 6 dogs, and 13 rabbits), the resulting confusion matrix could look like the table below:

                  Predicted class
Actual class      Cat    Dog    Rabbit
Cat                 5      3         0
Dog                 2      3         1
Rabbit              0      2        11

In this confusion matrix, of the 8 actual cats, the system predicted that 3 were dogs, and of the 6 actual dogs, it predicted that 1 was a rabbit and 2 were cats. The matrix shows that the system has trouble distinguishing between cats and dogs, but separates rabbits from the other animals fairly well. All correct predictions lie on the diagonal of the table, so errors are easy to spot by eye: they are the non-zero values off the diagonal.
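As a minimal sketch of how such a matrix can be tallied and read, the Python below counts (actual, predicted) pairs and then separates the diagonal from the off-diagonal cells. The helper `confusion_matrix` and the per-animal label lists are illustrative assumptions: the example only gives the counts, so the lists are rebuilt to be consistent with them.

```python
LABELS = ["cat", "dog", "rabbit"]

# Counts from the example table: rows are actual classes, columns are predicted.
COUNTS = [
    [5, 3, 0],   # actual cats
    [2, 3, 1],   # actual dogs
    [0, 2, 11],  # actual rabbits
]


def confusion_matrix(actual, predicted, labels):
    """Tally (actual, predicted) label pairs into a square table."""
    index = {label: i for i, label in enumerate(labels)}
    table = [[0] * len(labels) for _ in labels]
    for a, p in zip(actual, predicted):
        table[index[a]][index[p]] += 1
    return table


# Assumption: rebuild per-animal label lists that are consistent with COUNTS.
actual, predicted = [], []
for a, row in zip(LABELS, COUNTS):
    for p, n in zip(LABELS, row):
        actual += [a] * n
        predicted += [p] * n

matrix = confusion_matrix(actual, predicted, LABELS)

# Correct predictions sit on the diagonal; every other non-zero cell is an error.
correct = sum(matrix[i][i] for i in range(len(LABELS)))
errors = [
    (LABELS[i], LABELS[j], matrix[i][j])
    for i in range(len(LABELS))
    for j in range(len(LABELS))
    if i != j and matrix[i][j] > 0
]

print(correct, "of", len(actual), "correct")
for a, p, n in errors:
    print(f"{n} actual {a}(s) predicted as {p}")
```

With these counts, 19 of the 27 animals fall on the diagonal, and most of the off-diagonal mass lies in the cat and dog cells.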

Table of confusion

In predictive analytics, a table of confusion (sometimes also called a confusion matrix) is a table with two rows and two columns that reports the number of false positives, false negatives, true positives, and true negatives. This allows more detailed analysis than the mere proportion of correct guesses (accuracy). Accuracy on its own is not a reliable measure of a classifier's real performance, because it yields misleading results if the data set is unbalanced (that is, when the number of samples in different classes varies greatly). For example, if there were 95 cats and only 5 dogs in the data set, the classifier could easily be biased into classifying all the samples as cats. The overall accuracy would be 95%, but in practice the classifier would have a 100% recognition rate for the cat class and a 0% recognition rate for the dog class.
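The 95/5 example can be checked with a few lines of Python. This is only a sketch: the function name `recognition_rate` is an illustrative choice, and "recognition rate" is taken here to mean the fraction of a class's samples that are labeled correctly.

```python
# Unbalanced data set from the example: 95 cats, 5 dogs.
actual = ["cat"] * 95 + ["dog"] * 5
# A biased classifier that labels every sample as a cat.
predicted = ["cat"] * len(actual)


def recognition_rate(actual, predicted, cls):
    """Fraction of samples of class `cls` that were predicted as `cls`."""
    hits = sum(1 for a, p in zip(actual, predicted) if a == cls and p == cls)
    total = sum(1 for a in actual if a == cls)
    return hits / total


accuracy = sum(a == p for a, p in zip(actual, predicted)) / len(actual)

print(f"accuracy: {accuracy:.0%}")
print(f"cat recognition rate: {recognition_rate(actual, predicted, 'cat'):.0%}")
print(f"dog recognition rate: {recognition_rate(actual, predicted, 'dog'):.0%}")
```

The single accuracy figure hides the fact that no dog is ever recognized, which is what the per-class rates expose.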

Assuming the confusion matrix above, its corresponding table of confusion, for the cat class, would be:

5 true positives (actual cats that were correctly classified as cats)
2 false positives (dogs that were incorrectly labeled as cats)
3 false negatives (cats that were incorrectly marked as dogs)
17 true negatives (all the remaining animals, correctly classified as non-cats)
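The four cat-class counts can be derived mechanically from the three-class matrix by treating "cat" as the positive class and everything else as negative. The sketch below assumes rows are actual classes and columns are predicted classes, as in the example; `table_of_confusion` is a hypothetical helper name.

```python
# Example matrix: rows are actual classes, columns are predicted classes.
LABELS = ["cat", "dog", "rabbit"]
MATRIX = [
    [5, 3, 0],
    [2, 3, 1],
    [0, 2, 11],
]


def table_of_confusion(matrix, k):
    """Collapse a multi-class matrix into TP/FP/FN/TN counts for class index k."""
    total = sum(sum(row) for row in matrix)
    tp = matrix[k][k]                        # class k predicted as class k
    fn = sum(matrix[k]) - tp                 # rest of row k: class k predicted as something else
    fp = sum(row[k] for row in matrix) - tp  # rest of column k: other classes predicted as k
    tn = total - tp - fn - fp                # everything outside row k and column k
    return {"TP": tp, "FP": fp, "FN": fn, "TN": tn}


cat_table = table_of_confusion(MATRIX, LABELS.index("cat"))
print(cat_table)
```

For the cat class this reproduces the counts listed above: 5 true positives, 2 false positives, 3 false negatives and 17 true negatives.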

The final table of confusion for the classifier as a whole would contain the values averaged over all classes.
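One way to read this is as a plain arithmetic mean of the per-class tables. That interpretation is an assumption, since the text does not say which kind of average is meant. Under it, a sketch using exact fractions might look like this (`one_vs_rest` is a hypothetical helper name):

```python
from fractions import Fraction

# Example matrix: rows are actual classes, columns are predicted classes.
MATRIX = [
    [5, 3, 0],   # cat
    [2, 3, 1],   # dog
    [0, 2, 11],  # rabbit
]


def one_vs_rest(matrix, k):
    """TP/FP/FN/TN counts when class index k is treated as the positive class."""
    total = sum(sum(row) for row in matrix)
    tp = matrix[k][k]
    fn = sum(matrix[k]) - tp
    fp = sum(row[k] for row in matrix) - tp
    return {"TP": tp, "FP": fp, "FN": fn, "TN": total - tp - fn - fp}


# One table per class, then an unweighted mean of each cell (assumed averaging rule).
tables = [one_vs_rest(MATRIX, k) for k in range(len(MATRIX))]
average = {
    key: Fraction(sum(t[key] for t in tables), len(tables))
    for key in ("TP", "FP", "FN", "TN")
}

for key, value in average.items():
    print(key, value, f"(about {float(value):.2f})")
```

Because every per-class table sums to the 27 samples, the averaged table does as well.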

Wikimedia Foundation. 2010.
