


  • a number of ground-truth boxes (GTs),


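  • Determine TPs and FPs from the IoU threshold (the FP count is not actually needed in the computation); if several Dets match the same GT, one of them is the TP and the rest are FPs;
  • Sort all Dets by confidence score in descending order and label each one 1 if it is a TP and 0 otherwise, which gives the table T; the length of T equals the number of Dets.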
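The labeling steps above can be sketched in Python as follows. This is a minimal single-image, single-class sketch, not a reference implementation: the `(x1, y1, x2, y2)` box format, the helper names `iou` and `label_detections`, the default threshold of 0.5, and the choice to let the highest-scoring detection claim a GT first are all assumptions (the text only says that one of the matching Dets becomes the TP).

```python
from typing import List, Sequence, Tuple

# Assumed box format: (x1, y1, x2, y2) corner coordinates.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def label_detections(
    gts: Sequence[Box],
    dets: Sequence[Tuple[Box, float]],
    iou_thr: float = 0.5,
) -> List[int]:
    """Return the table T: 1 for TP, 0 for FP, in descending score order."""
    order = sorted(range(len(dets)), key=lambda i: dets[i][1], reverse=True)
    matched = [False] * len(gts)
    table = []
    for i in order:
        box, _score = dets[i]
        # Find the ground truth that overlaps this detection the most.
        best_iou, best_j = 0.0, -1
        for j, gt in enumerate(gts):
            overlap = iou(box, gt)
            if overlap > best_iou:
                best_iou, best_j = overlap, j
        # TP only if the overlap is large enough and the GT is still free.
        if best_j >= 0 and best_iou >= iou_thr and not matched[best_j]:
            matched[best_j] = True
            table.append(1)
        else:
            table.append(0)
    return table


gts = [(0, 0, 10, 10), (20, 20, 30, 30)]
dets = [((0, 0, 10, 10), 0.9), ((1, 1, 10, 10), 0.8),
        ((50, 50, 60, 60), 0.7), ((20, 20, 30, 31), 0.6)]
print(label_detections(gts, dets))  # [1, 0, 0, 1]
```

The second detection overlaps the first GT well but is labeled 0 because a higher-scoring detection has already claimed that GT.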

Computing Precision and Recall

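  • Traverse T from the start; at step $k$ (after the first $k$ entries of T):

$$\text{Precision}_k = \frac{TP_k}{k}, \qquad \text{Recall}_k = \frac{TP_k}{N_{GT}}$$

  • where $TP_k$ is the number of TPs seen up to the current step and $N_{GT}$ is the total number of GTs;
  • PR curve: Recall on the horizontal axis, Precision on the vertical axis.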
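A minimal sketch of the traversal above, assuming T is a list of 0/1 flags already sorted by descending score. The names `precision_recall` and `num_gt` are illustrative and do not come from the original text.

```python
from typing import List, Sequence, Tuple


def precision_recall(table: Sequence[int], num_gt: int) -> Tuple[List[float], List[float]]:
    """Accumulate precision and recall while walking the TP/FP table."""
    precisions, recalls = [], []
    tp = 0
    for k, flag in enumerate(table, start=1):
        tp += flag                      # TPs seen in the first k detections
        precisions.append(tp / k)       # k = TP + FP at this step
        recalls.append(tp / num_gt)     # num_gt = total number of GTs
    return precisions, recalls


p, r = precision_recall([1, 0, 1, 1, 0], num_gt=4)
print([round(x, 3) for x in p])  # [1.0, 0.5, 0.667, 0.75, 0.6]
print(r)                         # [0.25, 0.25, 0.5, 0.75, 0.75]
```

Precision divides by the step count, so the FP count never has to be tracked explicitly.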

Computing mAP


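  • PASCAL VOC
    • 11-point interpolation: Recall takes 11 discrete values [0, 0.1, ..., 1]; for each value, look to the right on the PR curve and take the maximum Precision; the average of the 11 resulting Precision values is the AP.
    • All-point interpolation: Recall ranges over every point in the interval [0, 1]; for each point, look to the right on the PR curve and take the maximum Precision; integrating this interpolated Precision over Recall gives the AP, which approximates the area under the PR curve.
  • COCO
    • 101-point interpolation: COCO evaluates the interpolated Precision at 101 Recall levels [0, 0.01, ..., 1].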
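The 11-point and 101-point variants differ only in the number of recall levels, so both can be sketched with one function. This is a simplified illustration with a hypothetical name (`n_point_ap`); the official VOC and COCO tools differ in implementation details, so results may not match them digit for digit.

```python
from typing import Sequence


def n_point_ap(precisions: Sequence[float], recalls: Sequence[float], n_points: int = 11) -> float:
    """Average the interpolated precision over n_points equally spaced recall levels."""
    total = 0.0
    for i in range(n_points):
        level = i / (n_points - 1)
        # "Look to the right": best precision among points with recall >= level.
        candidates = [p for p, r in zip(precisions, recalls) if r >= level]
        total += max(candidates, default=0.0)  # 0 if the curve never reaches this recall
    return total / n_points


# Toy PR points (assumed values for illustration only).
precisions = [1.0, 0.5, 2 / 3, 0.75, 0.6]
recalls = [0.25, 0.25, 0.5, 0.75, 0.75]
print(round(n_point_ap(precisions, recalls, 11), 4))   # 0.6136
print(round(n_point_ap(precisions, recalls, 101), 4))  # 0.6287
```

mAP would then be the mean of such per-class AP values.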



Precision x Recall curve

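The Precision x Recall curve is a good way to evaluate the performance of an object detector as the confidence threshold is varied, with one curve plotted per object class. A detector for a particular class is considered good if its precision stays high as recall increases, which means that precision and recall both remain high when the confidence threshold changes. Another way to identify a good object detector is to look for one that identifies only relevant objects (0 false positives = high precision) while finding all ground truth objects (0 false negatives = high recall).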

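A poor object detector has to increase the number of detected objects (more false positives = lower precision) in order to retrieve all ground truth objects (high recall). That is why the Precision x Recall curve usually starts with high precision values that decrease as recall increases. An example of a Precision x Recall curve appears in the next topic (Average Precision).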

Average Precision

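Another way to compare the performance of object detectors is to calculate the area under the curve (AUC) of the Precision x Recall curve. Because Precision x Recall curves are often zigzag curves that go up and down, comparing different curves (different detectors) in the same plot is usually not easy, since the curves tend to cross each other frequently. That is why Average Precision (AP), a single numerical metric, can also help us compare different detectors. In practice, AP is the precision averaged across all recall values between 0 and 1.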

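Since 2010, the way the PASCAL VOC challenge computes AP has changed. The interpolation currently performed by the PASCAL VOC challenge uses all data points, rather than interpolating only 11 equally spaced points as stated in its paper.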

  • 11-point interpolation

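The 11-point interpolation tries to summarize the shape of the Precision x Recall curve by averaging the precision at a set of eleven equally spaced recall levels [0, 0.1, 0.2, ... , 1]:

$$AP = \frac{1}{11} \sum_{r \in \{0, 0.1, \ldots, 1\}} \rho_{\text{interp}}(r)$$

with

$$\rho_{\text{interp}}(r) = \max_{\tilde{r} : \tilde{r} \geq r} \rho(\tilde{r})$$

where $\rho(\tilde{r})$ is the measured precision at recall $\tilde{r}$.

Instead of using the precision observed at each point, the AP is obtained by interpolating the precision only at the 11 levels $r$, taking the maximum precision whose recall value is greater than or equal to $r$.

  • Interpolating all points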




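$$AP = \sum_{n} (r_{n+1} - r_n)\, \rho_{\text{interp}}(r_{n+1})$$

with

$$\rho_{\text{interp}}(r_{n+1}) = \max_{\tilde{r} : \tilde{r} \geq r_{n+1}} \rho(\tilde{r})$$

where $\rho(\tilde{r})$ is the measured precision at recall $\tilde{r}$.

In this case, instead of using the precision observed at only a few points, the AP is obtained by interpolating the precision at each recall level $r_{n+1}$, taking the maximum precision whose recall value is greater than or equal to $r_{n+1}$. This way we calculate the estimated area under the curve.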
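The all-point interpolation can be sketched as below. The approach (pad the curve with sentinel points, take a right-to-left running maximum, then sum rectangles) follows the commonly used VOC-style computation, but the function name `all_point_ap` and the toy inputs are assumptions. It expects recalls in non-decreasing order, which holds for cumulative recall.

```python
from typing import Sequence


def all_point_ap(precisions: Sequence[float], recalls: Sequence[float]) -> float:
    """Area under the interpolated (monotonically decreasing) PR curve."""
    # Sentinels: the curve starts at recall 0 and ends at recall 1 with precision 0.
    mrec = [0.0] + list(recalls) + [1.0]
    mpre = [0.0] + list(precisions) + [0.0]
    # Right-to-left running max: each point gets the best precision at recall >= its own.
    for i in range(len(mpre) - 2, -1, -1):
        mpre[i] = max(mpre[i], mpre[i + 1])
    # Sum rectangle areas (r_{n+1} - r_n) * rho_interp(r_{n+1}).
    ap = 0.0
    for i in range(1, len(mrec)):
        ap += (mrec[i] - mrec[i - 1]) * mpre[i]
    return ap


# Toy PR points (assumed values for illustration only).
precisions = [1.0, 0.5, 2 / 3, 0.75, 0.6]
recalls = [0.25, 0.25, 0.5, 0.75, 0.75]
print(all_point_ap(precisions, recalls))  # 0.625
```

Steps where recall does not change contribute zero width, so only the points where recall increases add area.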



An example helps us better understand the concept of interpolated average precision. Consider the detections below:

There are 7 images with 15 ground truth objects represented by the green bounding boxes and 24 detected objects represented by the red bounding boxes. Each detected object has a confidence level and is identified by a letter (A,B,...,Y).

The following table shows the bounding boxes with their corresponding confidences. The last column identifies each detection as TP or FP. In this example a detection is considered a TP if IOU >= 30%; otherwise it is an FP. By looking at the images above we can roughly tell whether the detections are TP or FP.

In some images more than one detection overlaps a ground truth (Images 2, 3, 4, 5, 6 and 7). In those cases the first detection is considered a TP while the others are FPs. This rule is applied by the PASCAL VOC 2012 metric: "e.g. 5 detections (TP) of a single object is counted as 1 correct detection and 4 false detections".

The Precision x Recall curve is plotted by calculating the precision and recall values of the accumulated TP or FP detections. For this, first we need to order the detections by their confidences, then we calculate the precision and recall for each accumulated detection as shown in the table below:

Plotting the precision and recall values we have the following Precision x Recall curve:

As mentioned before, there are two different ways to measure the interpolated average precision: 11-point interpolation and interpolating all points. Below we make a comparison between them:

Calculating the 11-point interpolation

The idea of the 11-point interpolated average precision is to average the precisions at a set of 11 recall levels (0,0.1,...,1). The interpolated precision values are obtained by taking the maximum precision whose recall value is greater than its current recall value as follows:

By applying the 11-point interpolation, we have:
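The figure with the interpolated values is not reproduced in this text. As a hedged reconstruction, assuming the interpolated precisions at the 11 recall levels are 1, 0.6666, 0.4285, 0.4285, 0.4285 and 0 for the six remaining levels (assumed values, chosen to be consistent with the 26.84% reported for this example), the arithmetic would be:

$$AP = \frac{1}{11}\left(1 + 0.6666 + 0.4285 + 0.4285 + 0.4285 + 0 + 0 + 0 + 0 + 0 + 0\right) = \frac{2.9521}{11} \approx 26.84\%$$

The zeros come from recall levels 0.5 and above, which the curve never reaches because its highest recall is 0.4666.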

Calculating the interpolation performed in all points

By interpolating all points, the Average Precision (AP) can be interpreted as an approximated AUC of the Precision x Recall curve. The intention is to reduce the impact of the wiggles in the curve. By applying the equations presented before, we can obtain the areas as demonstrated here. We can also find the interpolated precision points visually by looking at the recalls from the highest (0.4666) down to 0 (reading the plot from right to left) and, as the recall decreases, collecting the highest precision values seen so far, as shown in the image below:

Looking at the plot above, we can divide the AUC into 4 areas (A1, A2, A3 and A4):

Calculating the total area, we have the AP:
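The per-area figures are not reproduced in this text. As a hedged reconstruction, assuming the four rectangles end at recalls 0.0666, 0.1333, 0.4 and 0.4666 with interpolated precisions 1, 0.6666, 0.4285 and 0.3043 (assumed values, chosen to be consistent with the 24.56% reported for this example), the areas would be:

$$
\begin{aligned}
A_1 &= (0.0666 - 0) \times 1 = 0.0666 \\
A_2 &= (0.1333 - 0.0666) \times 0.6666 = 0.04446222 \\
A_3 &= (0.4 - 0.1333) \times 0.4285 = 0.11428095 \\
A_4 &= (0.4666 - 0.4) \times 0.3043 = 0.02026638 \\
AP &= A_1 + A_2 + A_3 + A_4 = 0.24560955 \approx 24.56\%
\end{aligned}
$$

Each area is the width of a recall interval times the interpolated precision at its right end, matching the all-point formula.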


The two interpolation methods give slightly different results: 24.56% with all-point interpolation and 26.84% with 11-point interpolation.



