- F1 score
statistics, the F1 score (also F-score or F-measure) is a measure of a test's accuracy. It considers both the precision "p" and the recall "r" of the test to compute the score: "p" is the number of correct results divided by the number of all returned results and "r" is the number of correct results divided by the number of results that should have been returned. The F1 score can be interpreted as a weighted average of the precision and recall, where an F1 score reaches its best value at 1 and worst score at 0.
The traditional F-measure or balanced F-score (F1 score) is the harmonic mean of precision and recall:
The general formula for non-negative real β is::
The formula in terms of
Type I and type II errors::
Two other commonly used F measures are the measure, which weights recall twice as much as precision, and the measure, which weights precision twice as much as recall.
The F-measure was derived by van Rijsbergen (1979) so that "measures the effectiveness of retrieval with respect to a user who attaches β times as much importance to recall as precision". It is based on van Rijsbergen's effectiveness measure . Their relationship is where .
Wikimedia Foundation. 2010.