Object detectors output detection bounding boxes along with confidence scores. The higher the score, the more confident the model is that the associated bounding box is a correct detection.
When used in applications (like this one), the user typically sets a confidence threshold: every detection scoring above that threshold is treated as a positive detection, and the rest are discarded. The choice of threshold can be arbitrary or (somewhat) principled.
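As a rough illustration of the thresholding step described above, here is a minimal Python sketch. The `Detection` class, the `filter_detections` helper, the box layout, and the example scores are all hypothetical and made up for illustration; they are not taken from any particular detector or from this application.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Detection:
    # Hypothetical box layout: (x_min, y_min, x_max, y_max) in pixels.
    box: Tuple[float, float, float, float]
    # Confidence score reported by the detector, assumed to lie in [0, 1].
    score: float


def filter_detections(detections: List[Detection], threshold: float) -> List[Detection]:
    """Keep detections scoring strictly above the threshold; discard the rest."""
    return [d for d in detections if d.score > threshold]


# Made-up detections, purely for illustration.
detections = [
    Detection((10, 20, 50, 80), 0.92),
    Detection((30, 40, 70, 90), 0.55),
    Detection((5, 5, 25, 25), 0.12),
]

# With a threshold of 0.5, the first two detections are kept as positives.
kept = filter_detections(detections, threshold=0.5)
```

Raising the threshold keeps fewer, higher-scoring boxes, and lowering it keeps more. Which value is appropriate depends on how the scores are meant to be read.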
How should one interpret the "prediction score"?