Evaluate Detected Events

A tool to compare two sets of events, such as detections versus expert annotations, to evaluate the performance of a detector. This tool can also be used to evaluate the concordance between the scoring of two experts.

Definition of the evaluation metrics

Let’s first define the variables :

  • The event from the expert : e

  • The detection : d

  • True Positive (TP) : Correct detection (e and d are the same).

  • False Positive (FP) : Incorrect detection (d does not match any e).

  • False negatives (FN) : Event missed (e not detected)

Evalutation metrics :

  • Precision : TP/(TP+FP) : Fraction of detections that are correct

  • Recall : TP/(TP+FN) : Fraction of events found

  • F1 score = 2 x (precision x recall)/(precision + recall)

  • kappa = (2 * (tp*tn - fn*fp))/((tp + fp)*(fp + tn) + (tp + fn)*(fn + tn))

    Warning

    kappa is considered a conservative agreement because the expected agreement is removed from the score.

Metrics are computed in the samples domain, therefore the list of events e and d are sampled at 100 Hz and the units of TP, TN, FP, FN are samples.

i.e. TP-samples=500 means 500 samples from the expert events are correctly detected.

  • Pro : the performance evaluation is conservative (strict)

  • Con : many shorter d can match a longer e without significant penalty, therefore not suited for event density.

Metrics are also computed in the events domain with the use of the Jaccord index.

Jaccord index : (intersection between e and d) / (union of e and d)

To considere a d as a TP, the jaccord index must exceed a certain threshold. Only one d can match a e, the one with the highest Jaccord index.

  • Pro : Suited for event density.

  • Con : Need to define a Jaccord index threshold.

Steps

1 - Input Files

Start by opening your PSG files (.edf, .sts or .eeg).

  • European Data Format (EDF) :

    The corresponding .tsv file is required with .edf. Both files must be saved in the same directory and share the exact same filename.

  • Stellate format (up to version 6.2) :

    The corresponding .sig file is required with the .sts. Both files must be saved in the same directory and share the exact same filename.

  • NATUS format (version 9.1) :

    (CEAMS users only) The entire NATUS subject folder is required.

For more details on accepted formats, see Polysomnography file format.

2 - Expert Annotation

Select for each PSG file the expert events as gold standard.

3 - Detection Event

Select for each PSG file the detections to be compared against the expert events.

4 - Output Files

Select the sleep stages to perform the comparison in. (I.e. N2 for sleep spindles.)

Define the jaccord index threhold to compute the performance evaluation.

Jaccord index : (intersection between e and d) / (union of e and d)

The output performance file is written in the same directory as the PSG file. The output file is named as the PSG file with an additional suffix “_perf” and the extension .tsv. One evaluation file per PSG file is generated.

Version History

  • v2.1.0Distributed with CEAMS package version 7.2.0 — Snooz beta 2.0.1
    • Initial release of the tool.

  • v2.2.0Distributed with CEAMS package version 7.3.0 — Snooz beta 3.0.0
    • UI improvements for consistent tool and input file descriptions.