Bio-Fingerprinting applied to polysomnographs
Student
Lucchini Marta
Supervisor
Faraci Francesca Dalia
Correlator
Fiorillo Luigi
Customer
Faraci Francesca Dalia
Course
Computer Engineering
Module
M00009 Progetto di diploma
Year
2018/2019
Date
September 10, 2019
Contents
Abstract
1 Assigned project
1.1 Description
1.2 Tasks
1.3 Targets
1.4 Technologies
1.5 Changes in progress
2 Introduction
3 Used technologies
4 Algorithm
4.1 A7 algorithm
4.2 YASA
4.3 Detection of the spindles
4.3.1 Threshold 1: Relative power in the sigma band
4.3.2 Threshold 2: Moving correlation
4.3.3 Threshold 3: Moving RMS
4.3.4 Decision function
4.3.5 Additional information
5 Preliminary analysis
6 Analysis taking the rms peak of every spindle
6.1 Bar-plots for rms peaks
6.2 Scatter-plots for rms peaks
6.3 Range computed with logarithmic functions
7 Analysis to compute personalized thresholds
7.1 Bar-plots for rms inflections
7.2 Scatter-plots for rms inflections
7.3 Range computed with mathematical functions
8 Analysis on some rms inflections during the night
9 Confidence interval on the mean to compute the thresholds
9.1 F1-score as a measure of performance
9.2 Personalize the algorithm knowing only some spindles for patient
10 Preliminary analysis for adaptive method to compute thresholds
10.1 Exploration of the relationship between parameters
10.2 Polynomial regression on spindle inflections
11 Adaptive method to compute thresholds
12 Conclusions
List of Figures
1.1 Assigned project
5.1 EEG data and relative power
5.2 EEG data and correlation
5.3 EEG data and rms
5.4 Parameters in spindle with rms peak of 22.90
5.5 Parameters in spindle with rms peak of 12.17
5.6 Parameters in spindle with rms peak of 9.40
5.7 Parameters in spindle with rms peak of 11.57
5.8 Parameters in spindle with rms peak of 9.50
5.9 Parameters in spindle with rms peak of 10.44
5.10 Parameters in EEG with no spindle
5.11 Parameters in EEG with no spindle
5.12 Parameters in EEG with no spindle
5.13 Parameters in EEG pre-spindle
5.14 Parameters in EEG pre-spindle
5.15 Parameters in EEG pre-spindle
5.16 Parameters in EEG post-spindle
5.17 Parameters in EEG post-spindle
5.18 Parameters in EEG post-spindle
6.1 Rms peaks: rms bar-plot
6.2 Rms peaks: correlation bar-plot
6.3 Rms peaks: relative power bar-plot
6.4 Scatter plot relative power-correlation of spindle peaks
6.5 Scatter plot relative power-rms of spindle peaks
6.6 Scatter plot correlation-rms of spindle peaks
6.7 Rms peaks: logarithmic relation and range between relative power and correlation
7.1 Inflection in rms
7.2 Rms inflections: rms bar-plot
7.3 Rms inflections: correlation bar-plot
7.4 Rms inflections: relative power bar-plot
7.5 Scatter plot relative power-correlation of spindle inflections
7.6 Scatter plot relative power-rms of spindle inflections
7.7 Scatter plot correlation-rms of spindle inflections
7.8 Rms inflections: exponential relation and range between correlation and rms
7.9 Rms inflections: linear relation and range between relative power and correlation
7.10 Rms inflections: linear relation and range between relative power and rms
8.1 Scatter plot relative power-correlation of spindle inflections
8.2 Scatter plot relative power-correlation of no-spindle inflections
8.3 Scatter plot relative power-rms of spindle inflections
8.4 Scatter plot relative power-rms of no-spindle inflections
8.5 Scatter plot correlation-rms of spindle inflections
8.6 Scatter plot correlation-rms of no-spindle inflections
List of Tables
5.1 Results initial YASA algorithm
6.1 Rms peaks: rms bar-plot
6.2 Rms peaks: correlation bar-plot
6.3 Rms peaks: relative power bar-plot
6.4 Rms peaks: results with modified thresholds
6.5 Rms peaks: results with logarithmic function
7.1 Rms inflections: rms bar-plot
7.2 Rms inflections: correlation bar-plot
7.3 Rms inflections: relative power bar-plot
7.4 Rms peaks: results with modified thresholds
7.5 Rms inflections: results with exponential relation between correlation and rms
7.6 Rms inflections: results with linear relation between relative power and correlation
7.7 Rms inflections: results with linear relation between relative power and correlation
9.1 Initial YASA algorithm results with F1-score
9.2 Results using confidence interval
9.3 Results using confidence interval with 10/20 spindles in input
10.1 Rms inflections and minimum: R2 in polynomial regression
10.2 Rms inflections and minimum separated: R2 in polynomial regression
10.3 Total spindle durations: R2 in polynomial regression
11.1 Relations between the parameters: R2 in regression
11.2 Adaptive method: correlation fixed
11.3 Adaptive method: rms fixed
11.4 Adaptive method: rms fixed with upper limit
11.5 Adaptive method: relative power fixed
11.6 Adaptive method: relative power fixed with upper limit
Thanks
To my mother and my father, for their psychological and economic support, which has al-
lowed me to conclude this important journey with serenity.
A special thanks also to my supervisor Francesca Dalia Faraci and my correlator Luigi Fior-
illo for their help and availability, to all my professors for their teachings and to my colleagues
for the great moments spent together in these three years and for their help and support.
Abstract
Polysomnography (PSG) is a multi-parametric test used in the study of sleep and as a diag-
nostic tool for sleep disorders. The first step in the quantitative analysis of polysomnographic
recordings is the classification of sleep stages. It is possible to distinguish between wake,
REM sleep, NREM sleep stages 1 to 4 and movement time.
To classify sleep stages, it is important to identify where certain patterns occur, such as
sleep spindles. A sleep spindle is an electroencephalography (EEG) pattern defined as a
train of distinct waves with frequency 11–16 Hz (most commonly 12-14 Hz) with duration ≥0.5 s. Spindles are a characteristic of stage 2 sleep as they define the transition from stage
N1 (non-rapid eye movement, NREM1) to stage N2 (NREM2).
Sleep stage scoring relies heavily on visual pattern recognition by a human expert and is
time consuming and subjective. Thus, there is a need for automatic classification. Some
automatic detectors already exist, but they are not accurate.
The aim of this project is to demonstrate that the performance of an existing sleep-spindle
detector can be improved by modifying the algorithm so that it adapts to the characteristics
of each patient.
In this project several analyses were performed, demonstrating that a spindle detection
algorithm can be customized for each patient. For the patient on whom I tested
the best method found, the F1-score increased from 0.48, obtained with the initial algorithm,
to 0.55, obtained using 20 spindles as input.
Chapter 1
Assigned project
Figure 1.1: Assigned project
1.1 Description
Polysomnography is an examination performed to investigate possible sleep disorders. Several
biological signals are recorded (EEG, EOG, EMG, ECG). The project consists of developing
a system that performs pattern recognition on specific EEG signals recorded during sleep,
adapting to the patient’s characteristics.
This study project is integrated into the European project E! SPAS (Sleep Physician Assistant
System), in collaboration with two European companies and the Inselspital in Bern.
1.2 Tasks
• Brief analysis of the state of the art in relation to PSG pattern recognition
• Development of automatic recognition tools for characteristic sleep signals based both
on existing open source tools and on results available in the specialist literature
• Integration in the algorithm of data related to the single patient (The idea is to per-
sonalize the recognition on the patient’s data, integrating in the algorithm the initial
choices of the doctor / technician that identifies some patterns visually, from which the
algorithm will possibly improve identifications).
• Make the algorithm flexible in the identification of Arousal, Spindles, K-complexes that
are priority in sleep scoring and must be identified with maximum accuracy.
• Test the accuracy of the algorithm on PSGs where all the patterns have been recog-
nized and labeled by expert staff.
1.3 Targets
• Development of knowledge related to polysomnography
• Development of knowledge related to pattern recognition and biofingerprinting
1.4 Technologies
Mainly Python. Some parts can be developed in R.
1.5 Changes in progress
After studying the state of the art in relation to PSG pattern recognition, I decided to dedicate
the project to a specific pattern. I chose spindles because they are well documented and I
found a good spindle detection algorithm as a starting point. However, all the analyses performed
for the project can be extended to other EEG signals.
Chapter 2
Introduction
What is sleep? At the behavioral level, sleep can be defined as a reversible behavioral
state of perceptual disengagement from and unresponsiveness to environmental stimuli.
The sleep-wake cycle and the structure of sleep reflect the spontaneous activity of autoregulatory
central nervous system processes. On the basis of physiological parameters,
sleep can be divided into two separate states: non-rapid eye movement (NREM) and
rapid eye movement (REM) sleep. These differ from one another as well as from wakeful-
ness. Conventionally, NREM sleep is subdivided into four stages, distinguished from each
other principally on the basis of their different patterns of brain electrical activity, as measured
by the electroencephalogram (EEG), which is considered the core measurement
of polysomnography. Polysomnography (PSG) is a multi-parametric test used in the study
of sleep and as a diagnostic tool in sleep medicine. The test result is called a polysomno-
gram, also abbreviated PSG. Polysomnography is performed overnight, usually for 8 hours,
and is a comprehensive recording of the biophysiological changes that occur during sleep.
The PSG monitors many body functions, including brain activity (EEG), as said before, eye
movements (EOG), muscle activity or skeletal muscle activation (EMG), and heart rhythm
(ECG), during sleep. Polysomnography is used to diagnose, or rule out, many types of sleep
disorders, including narcolepsy, idiopathic hypersomnia, periodic limb movement disorder
(PLMD), REM behavior disorder, parasomnias, and sleep apnea. Although it is not directly
useful in diagnosing circadian rhythm sleep disorders, it may be used to rule out other sleep
disorders. Returning to the EEG, its pattern in NREM sleep is synchronous, with charac-
teristic waveforms such as sleep spindles, K-complexes and slow-frequency, high-amplitude
waves (delta waves). By contrast, REM sleep is defined by low-voltage EEG activation, mus-
cle atonia and episodic bursts of REMs. The rules for visual sleep scoring are provided by
the recommendations of Rechtschaffen and Kales (R&K) published in 1968 and of the American
Academy of Sleep Medicine (AASM) published in 2007. According to these manuals,
it is possible to distinguish between wake, REM sleep, NREM sleep stages 1 to 4 and move-
ment time. NREM sleep stages 1 and 2 are regarded as light sleep and stages 3 and 4 are
regarded as deep sleep or slow-wave sleep due to the predominance of slow delta waves in
the EEG. Sleep scoring is performed for time segments of 20 or 30 seconds, which are re-
ferred to as epochs. Thus, 8 hours of sleep consist of 960 30-second epochs. The plot of a
sequence of sleep stages is called a hypnogram. Human sleep starts generally with a stage
1 (N1), a very light sleep usually lasting up to a few minutes. Slow rolling eye movements are a
feature of stage 1 and contractions of the muscles as well as hypnagogic jerks may occur.
Next follows stage 2 (N2), a deeper state of sleep than stage 1, characterized by the occur-
rence of sleep spindles and K-complexes and an intermediate muscle tone. Stage 2 usually
precedes deep sleep - stages 3 and 4 (SWS, N3). The main characteristic of deep sleep is
the presence of slow oscillations (1 Hz) and delta waves (1-4 Hz) in the EEG for at least 20%
of the epoch duration. The muscle tone is low. Rapid eye movement sleep occurs periodi-
cally throughout the night and is characterized by rapid eye movements, fast low-amplitude
EEG activity like the wake EEG, and a low muscle tone (atonia). The progression of the dif-
ferent stages is not random, but rather follows a cyclic alternation of NREM and REM sleep
with a cycle duration of approximately 90 minutes. Healthy sleep consists of approximately
3-5 sleep cycles. The classification of sleep stages is the first step in the quantitative anal-
ysis of polysomnographic recordings. Sleep stage scoring relies heavily on visual pattern
recognition by a human expert and is time consuming and subjective. Thus, there is a need
for automatic classification. To classify sleep stages, it is important to identify where certain
patterns occur, such as sleep spindles. A sleep spindle is an electroencephalography (EEG)
pattern that results from specific variations in membrane potentials in the thalamocortical
network of the brain. They are defined as a train of distinct waves with frequency 11–16 Hz
(most commonly 12-14 Hz) with duration ≥ 0.5 s, usually maximal in amplitude at central
derivations. Spindles are a hallmark of stage 2 sleep as they define the transition from
stage N1 (non-rapid eye movement, NREM1) to stage N2 (NREM2). Although the function
of sleep spindles is unclear, it is believed that they actively participate in the consolidation of
overnight declarative memory, the conscious, intentional recollection of factual information,
previous experiences, and concepts, through the reconsolidation process. The density of
spindles has been shown to increase after extensive learning of declarative memory tasks
and the degree of increase in stage 2 spindle activity correlates with memory performance.
Sleep spindle activity has furthermore been found to be associated with the integration of
new information into existing knowledge as well as directed remembering and forgetting
(fast sleep spindles). Moreover, sleep spindles closely modulate interactions between the
brain and its external environment; they essentially moderate responsiveness to sensory
stimuli during sleep. Recent research has revealed that spindles distort the transmission
of auditory information to the cortex. Spindles isolate the brain from external disturbances
during sleep. During NREM sleep, the brain waves produced by people with schizophrenia
lack the normal pattern of slow and fast spindles. Loss of sleep spindles is also a feature
of familial fatal insomnia, a prion disease. Changes in spindle density are observed in
disorders such as epilepsy and autism. Although visual inspection by experts is the gold
standard of sleep spindle detection, with the rapid increase in research on sleep spindles,
various automated methods of spindle detection have been proposed to reduce subjective
biases and increase reliability and objectivity. The major advantages of automated methods
are faster, more reproducible and systematic scoring. They extract features from EEG data
and apply specific thresholds to identify features corresponding to sleep spindles. An approach
combining a standardized band-pass filter, or a custom frequency-range filter, with an amplitude threshold
has been commonly used in the research literature, with a reported sensitivity of approximately 90%.
Time-frequency analysis methods have also been applied to spindle detection, using wavelet
transformation and matching pursuit. More recently, sophisticated automatic sleep spindle
detection methods using artificial neural networks have been developed and reported high
agreement with experts (ranging from 85% to 96%). Although there is ample evidence that
many automated methods have an acceptable agreement with experts, they are known to
occasionally have problems differentiating ambiguous oscillatory
signals (e.g., alpha versus spindles). They can also be highly influenced by the algorithm
settings chosen by researchers (e.g., spindle duration, frequency, and amplitude character-
istics). Spindle density and characteristics such as mean oscillation frequency, amplitude
and duration appear to be trait-like, because they are stable over time (inter-night stability)
for the same subject but vary considerably between subjects. To make up for this limitation
and for the time required for manual spindle detection by experts, adapting the detector to the patient’s
characteristics may be another potential technique. The aim of the present project is
to demonstrate that the performance of an existing sleep-spindle detector can be improved by
modifying the algorithm so that it can be adapted to the characteristics of each patient.
Chapter 3
Used technologies
In this project I used the PyCharm development environment to program in Python,
the language in which the initial algorithm is written. To perform the analysis I used
several libraries. The main ones are: numpy and pandas for data manipulation, plotly and
matplotlib for displaying graphs, mne and scipy for the analysis part. Furthermore, I
used Microsoft Excel to collect and examine data in table form and GeoGebra to explore
functions useful for data analysis.
Chapter 4
Algorithm
The starting point of this project is a sleep-spindle detector that emulates human scoring:
an algorithm called YASA. To explain how it works, I first describe the A7
algorithm, because YASA is largely inspired by it, and then the differences
between the two methods.
4.1 A7 algorithm
The A7 algorithm runs on a single EEG channel. In my study I used the C3-M2 channel
to perform the spindle detection, since the amplitude of spindles is maximal at the central
derivations. First, the detector applies a band-pass filter between 0.3 and 30 Hz, according to standard
practice for clinical polysomnography. A second filter, between 11 and 16 Hz, isolates the trains of sigma waves
that appear when a spindle occurs. We will call these signals EEGbf and EEGσ,
respectively. To detect the spindles, the algorithm relies on four parameters:
• Absolute sigma power: used to identify trains of sigma waves. An increase of power in the
sigma band means an increase in the energy of the signal EEGσ
• Relative sigma power: used to ensure that the increase of power is specific to the sigma
band in the filtered signal EEGbf
• Sigma covariance: used to identify a high covariance between EEGσ and EEGbf. A
high sigma covariance indicates that EEGσ and EEGbf vary together
• Sigma correlation: used to identify a high correlation between EEGσ and EEGbf. A
high sigma correlation indicates that the changes in EEGσ account for the changes in
EEGbf
These four parameters are computed on a 0.3-second window every 0.1 seconds
over the whole EEG recording. This window was chosen to allow the detection of
spindles as short as 0.3 seconds. Events that last less than 0.3 seconds or more
than 2.5 seconds are discarded. The A7 algorithm detects a spindle when the four A7
sigma parameters exceed their respective thresholds. These four thresholds were
established with a training dataset.
To analyze the performance of the detector, three metrics were computed:
• Recall (sensitivity): the proportion of real spindles that are detected
• Precision: the proportion of detected events that are real spindles
• F1-score: the harmonic mean of recall and precision
To calculate these metrics, the results computed by A7 were compared to a
dataset containing the manual spindle detection performed by five human experts. The
by-event performance of the A7 spindle detector was 74% precision, 68% recall and an
F1-score of 0.70. This performance was equivalent to that of an individual human expert (average
F1-score=0.67).
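The three metrics above can be sketched in a few lines of Python. This is only an illustrative sketch, not the evaluation code of A7 or YASA: the function name `detection_scores` and its argument names are hypothetical, and the counts in the usage line are the by-event counts reported in Table 5.1 of this document (838 real spindles, 904 detected, 417 matching).

```python
def detection_scores(n_real, n_detected, n_matching):
    """Return (precision, recall, f1) from by-event counts.

    n_real      -- number of annotated (real) spindles
    n_detected  -- number of events returned by the detector
    n_matching  -- number of detected events that match a real spindle
    """
    # Precision: share of detected events that are real spindles.
    precision = n_matching / n_detected if n_detected else 0.0
    # Recall: share of real spindles that were detected.
    recall = n_matching / n_real if n_real else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    # F1-score: harmonic mean of precision and recall.
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Usage with the counts of Table 5.1 (hypothetical variable names).
precision, recall, f1 = detection_scores(n_real=838, n_detected=904, n_matching=417)
print(f"precision={precision:.2f} recall={recall:.2f} F1={f1:.2f}")
```

Working from counts rather than from the event lists keeps the sketch independent of how a detected event is matched to a real one.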
4.2 YASA
The main differences between YASA and A7 are:
• YASA uses three thresholds (relative power, root mean square and correlation)
instead of the four used by A7
• The windowed detection signals are resampled to the original time vector of the data
using cubic interpolation, thus resulting in a pointwise detection signal (= one value
at every sample). The time resolution of YASA is therefore higher than that of the A7
algorithm. This allows for more precision in detecting the beginning, end and duration of the
spindles (typically, A7 = 100 ms and YASA = 10 ms)
• The relative power in the sigma band is computed using a Short-Time Fourier
Transform
• The median frequency and absolute power of each spindle are computed using a Hilbert
transform
• YASA computes some additional spindle properties, such as the symmetry index and
the number of oscillations
• Potential sleep spindles are discarded if their duration is below 0.5 seconds or above
2 seconds. These values are respectively 0.3 and 2.5 seconds in the A7 algorithm
• YASA incorporates an automatic rejection of pseudo or fake events based on an
Isolation Forest algorithm
4.3 Detection of the spindles
To understand how the algorithm works, it is important to know the meaning of the three
parameters used to detect the spindles.
4.3.1 Threshold 1: Relative power in the sigma band
When a spindle occurs, an increase of energy is expected in the sigma frequency
range (11-16 Hz). To calculate the power in the sigma band relative to the total power in
the broadband frequency range (1-30 Hz), a Short-Time Fourier Transform (STFT) is used. It is
performed on consecutive epochs of 2 seconds with an overlap of 200 ms. To ensure
that at least 20% of the signal’s total power is contained within the sigma band, a threshold
of 0.2 has been fixed: it is exceeded whenever a sample has a relative power in the
sigma frequency range ≥ 0.2.
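The idea of a relative sigma power can be illustrated on a single 2-second epoch. The sketch below is an assumption-laden simplification: it uses a naive discrete Fourier transform from the standard library instead of the STFT used by YASA, applies no tapering window, and the function names `band_power` and `relative_sigma_power` are hypothetical. Only the band limits (11-16 Hz and 1-30 Hz), the 2-second epoch, the 125 samples per second and the 0.2 threshold come from the text.

```python
import cmath
import math


def band_power(window, sf, f_low, f_high):
    """Sum of squared DFT magnitudes for the bins with f_low <= f <= f_high."""
    n = len(window)
    total = 0.0
    for k in range(n // 2 + 1):
        freq = k * sf / n  # frequency of bin k in Hz
        if f_low <= freq <= f_high:
            coeff = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                        for i, x in enumerate(window))
            total += abs(coeff) ** 2
    return total


def relative_sigma_power(window, sf, sigma=(11.0, 16.0), broad=(1.0, 30.0)):
    """Power in the sigma band divided by the power in the broadband range."""
    broad_pow = band_power(window, sf, *broad)
    if broad_pow == 0:
        return 0.0
    return band_power(window, sf, *sigma) / broad_pow


# Synthetic 2-second epoch: a 13 Hz (sigma) wave plus a 5 Hz wave of equal amplitude.
sf = 125
epoch = [math.sin(2 * math.pi * 13 * i / sf) + math.sin(2 * math.pi * 5 * i / sf)
         for i in range(2 * sf)]
ratio = relative_sigma_power(epoch, sf)
print("relative sigma power:", round(ratio, 3), "exceeds 0.2:", ratio >= 0.2)
```

In this synthetic epoch half of the broadband power lies in the sigma band, so the 0.2 threshold is exceeded; an epoch containing only the 5 Hz component would give a ratio close to zero.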
4.3.2 Threshold 2: Moving correlation
This parameter is used to ensure that the changes in EEGσ are reflected in EEGbf. It
is calculated with a sliding window of 300 ms and a step of 100 ms. The Pearson correlation
coefficient between the EEGbf signal and the EEGσ signal is computed, and the threshold
is exceeded every time a sample has a correlation value r ≥ 0.65.
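A moving Pearson correlation with a 300 ms window and a 100 ms step could be sketched as follows. This is a plain-Python illustration, not YASA's implementation: the names `pearson` and `moving_correlation` are hypothetical, window and step are rounded to whole samples (an assumption), and the result still has one value per window, before any interpolation to one value per sample.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # correlation is undefined for a flat segment; 0.0 is a convention here
    return sxy / math.sqrt(sxx * syy)


def moving_correlation(eeg_bf, eeg_sigma, sf, window_s=0.3, step_s=0.1):
    """Correlation between the broadband and sigma signals on sliding windows."""
    win = max(2, int(round(window_s * sf)))
    step = max(1, int(round(step_s * sf)))
    return [pearson(eeg_bf[s:s + win], eeg_sigma[s:s + win])
            for s in range(0, len(eeg_bf) - win + 1, step)]


# Synthetic example: the broadband signal is the sigma wave plus a slow 2 Hz wave.
sf = 125
eeg_sigma = [math.sin(2 * math.pi * 13 * i / sf) for i in range(2 * sf)]
eeg_bf = [s + 0.5 * math.sin(2 * math.pi * 2 * i / sf) for i, s in enumerate(eeg_sigma)]
r_values = moving_correlation(eeg_bf, eeg_sigma, sf)
print(sum(r >= 0.65 for r in r_values), "of", len(r_values), "windows reach r >= 0.65")
```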
4.3.3 Threshold 3: Moving RMS
To detect an increase of energy in the sigma band, a third threshold is defined by computing a
moving root mean square (RMS) of EEGσ, with a window size of 300 ms and a step of 100
ms. The threshold is exceeded every time a sample has an RMS ≥ 1.5.
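The moving RMS can be sketched in the same way. The function name `moving_rms` is hypothetical and the rounding of window and step to whole samples is an assumption. The text does not specify how the RMS is scaled before being compared with 1.5, so comparing the raw values, as done below, is also an assumption.

```python
import math


def moving_rms(signal, sf, window_s=0.3, step_s=0.1):
    """Root mean square of `signal` on sliding windows (one value per window)."""
    win = max(1, int(round(window_s * sf)))
    step = max(1, int(round(step_s * sf)))
    values = []
    for start in range(0, len(signal) - win + 1, step):
        segment = signal[start:start + win]
        values.append(math.sqrt(sum(v * v for v in segment) / win))
    return values


# Synthetic sigma-band signal: low amplitude, then a burst of higher amplitude.
sf = 125
eeg_sigma = [(4.0 if sf <= i < 2 * sf else 0.5) * math.sin(2 * math.pi * 13 * i / sf)
             for i in range(3 * sf)]
rms = moving_rms(eeg_sigma, sf)
above = [value >= 1.5 for value in rms]  # raw comparison: an assumption, see lead-in
print(sum(above), "of", len(rms), "windows exceed the RMS threshold")
```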
4.3.4 Decision function
After each parameter has been calculated, its values are interpolated using cubic interpolation to
obtain one value per time point. To detect a spindle, at least two
of the three thresholds must be exceeded. Furthermore, spindles that are too close to each other
(less than 500 ms apart) are merged, and those that are too short (less than 0.5 s)
or too long (more than 2 s) are removed.
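The decision function described above can be sketched on three per-sample parameter lists. This is a minimal illustration of the rule as stated in this section (at least two thresholds exceeded, merging of events closer than 500 ms, removal of events outside 0.5-2 s), not the actual YASA code. The function name `detect_events`, its arguments and the sampling frequency of 100 Hz in the example are hypothetical.

```python
def detect_events(rel_pow, corr, rms, sf, thresholds=(0.2, 0.65, 1.5),
                  min_votes=2, min_gap_s=0.5, min_dur_s=0.5, max_dur_s=2.0):
    """Return a list of (start_s, end_s) events from per-sample parameters."""
    t_pow, t_corr, t_rms = thresholds
    # Count, sample by sample, how many thresholds are exceeded.
    above = [((p >= t_pow) + (c >= t_corr) + (r >= t_rms)) >= min_votes
             for p, c, r in zip(rel_pow, corr, rms)]

    # Collect contiguous runs of supra-threshold samples as [start, end) indices.
    runs, start = [], None
    for i, flag in enumerate(above):
        if flag and start is None:
            start = i
        elif not flag and start is not None:
            runs.append([start, i])
            start = None
    if start is not None:
        runs.append([start, len(above)])

    # Merge runs separated by less than min_gap_s seconds.
    merged = []
    for run in runs:
        if merged and (run[0] - merged[-1][1]) / sf < min_gap_s:
            merged[-1][1] = run[1]
        else:
            merged.append(run)

    # Keep only events whose duration lies within the accepted range.
    return [(s / sf, e / sf) for s, e in merged
            if min_dur_s <= (e - s) / sf <= max_dur_s]


# Synthetic example: relative power and rms exceed their thresholds, correlation never does.
sf = 100
active = [False] * 1000
for s, e in [(100, 160), (300, 330), (350, 390), (600, 620), (700, 950)]:
    for i in range(s, e):
        active[i] = True
rel_pow = [0.3 if a else 0.1 for a in active]
rms = [2.0 if a else 0.5 for a in active]
corr = [0.1] * len(active)
print(detect_events(rel_pow, corr, rms, sf))
```

In the example the two bursts at 3.0-3.3 s and 3.5-3.9 s are merged into one event, while the 0.2 s and 2.5 s bursts are discarded for being too short and too long.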
4.3.5 Additional information
The three parameters are computed and interpolated by the algorithm so that they assume
one value for each sample of the recording. The sampling frequency used for the project is 125
samples per second, meaning that every second has 125 values of each parameter. Since spindles usually
have a duration of 1 or 2 seconds, most spindle events contain 125 or 250 values of each
parameter.
Chapter 5
Preliminary analysis
The aim of this project is to modify the YASA algorithm so that it can be personalized for
each patient. To understand whether this is possible, it is important to know whether the performance of the
algorithm changes across patients and whether modifying the thresholds changes the results.
To do this, I collected ten electroencephalograms from ten different patients and
tested the algorithm without modifications. For every EEG, the annotations I had were
the spindles detected by the REM Logic detector, so the matching computed in
this preliminary analysis is not accurate. To count the matches between the detected spindles and
the real ones, I allowed an error of plus or minus two seconds between the beginning
of the detected spindle and the beginning of the real one. With a multi-threaded script I
tried every combination of the three thresholds that could make sense. I could not try
combinations with more than two significant figures because this would have taken months.
The performance turned out to be different for each file, meaning that each patient is sensitive
to changes in the thresholds.
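The matching criterion described above (onsets within plus or minus two seconds) can be sketched as follows. The function name `count_matches` is hypothetical, and the one-to-one pairing, where each real spindle is matched with at most one detected spindle, is an assumption that the text does not state.

```python
def count_matches(real_onsets, detected_onsets, tolerance_s=2.0):
    """Count detected spindles whose onset lies within tolerance_s of a real onset.

    Each real onset is paired with at most one detected onset (assumption).
    Onsets are given in seconds.
    """
    real = sorted(real_onsets)
    detected = sorted(detected_onsets)
    matches = 0
    i = j = 0
    while i < len(real) and j < len(detected):
        diff = detected[j] - real[i]
        if abs(diff) <= tolerance_s:
            matches += 1  # pair found: consume both onsets
            i += 1
            j += 1
        elif diff < 0:
            j += 1        # detection is too early for any remaining real spindle
        else:
            i += 1        # real spindle has no detection close enough
    return matches


# Hypothetical onsets in seconds.
real = [10.0, 20.0, 30.0]
detected = [11.5, 23.0, 29.0, 50.0]
print(count_matches(real, detected), "matching spindles")
```

The resulting count, together with the numbers of real and detected spindles, is all that is needed to compare two threshold combinations.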
Having established that different thresholds can be computed for each
patient, the important step is to find out how these parameters change in every subject and
how they relate to each other. To do this, before searching for mathematical relations, I wanted
to see how the values assumed by the three parameters during the occurrence of a spindle
look when plotted in a graph. From now on, all the analyses have been performed
on a single file, for which I know all the spindles that occurred on channel C4-M1, since
spindle activity is noticeable mainly at the central derivations. The number of spindles
annotated by the neurologist is 838. The results computed by the algorithm on this file are:
Table 5.1: Results initial YASA algorithm
Real spindles    Detected spindles    Matching    Not detected spindles
838              904                  417         421
First of all, I made three graphs, one per parameter, with the values of the
EEG data on the x axis and the corresponding values of the parameter computed by the algorithm
on the y axis. I noticed that this approach was not meaningful, because
the values were scattered throughout the graph without any apparent relation.
Figure 5.1: EEG data and relative power
Figure 5.2: EEG data and correlation
Figure 5.3: EEG data and rms
It made more sense to plot how the three parameters vary over time. As
can be seen in the following images, when a spindle occurs the rms parameter
grows very fast, and the relative power also shows a slight increase. The correlation, on the other
hand, does not seem to follow a specific trend.
Figure 5.4: Parameters in spindle with rms peak of 22.90
Figure 5.5: Parameters in spindle with rms peak of 12.17
Figure 5.6: Parameters in spindle with rms peak of 9.40
Figure 5.7: Parameters in spindle with rms peak of 11.57
Figure 5.8: Parameters in spindle with rms peak of 9.50
Figure 5.9: Parameters in spindle with rms peak of 10.44
Looking at these graphs, it may seem natural to think that identifying a sudden increase
in the rms parameter is sufficient to identify a spindle. Unfortunately, as can be seen in the
following images, which show the trend of the parameters in the absence of spindles, this
parameter undergoes abrupt changes very often during the night.
Figure 5.10: Parameters in EEG with no spindle
Figure 5.11: Parameters in EEG with no spindle
Figure 5.12: Parameters in EEG with no spindle
Figure 5.13: Parameters in EEG pre-spindle
Figure 5.14: Parameters in EEG pre-spindle
Figure 5.15: Parameters in EEG pre-spindle
Figure 5.16: Parameters in EEG post-spindle
Figure 5.17: Parameters in EEG post-spindle
Figure 5.18: Parameters in EEG post-spindle
It is interesting to understand whether the values of the rms peaks during the spindle events are
related to the values assumed by the other parameters and/or fall within a narrow range.
To do this, I first found the rms peak of each spindle and the relative
power and correlation values at the same instant. Once this was done, I carried out
the following analyses with the help of Excel.
Chapter 6
Analysis taking the rms peak of every spindle
To understand in which ranges most of the points fall, and therefore find reasonable thresholds
for this patient, I plotted a bar-plot for every parameter. To discover
possible relations between the parameters, I also plotted three scatter-plots that relate the
three variables two by two.
6.1 Bar-plots for rms peaks
The YASA algorithm detects a spindle where two of the three parameters exceed a
pre-established threshold for more than 0.5 seconds and less than 2 seconds. To improve
the performance of the detector by adapting it to the patient in question, it is important
to understand how the thresholds should be changed. The following
bar-plots show which values the three parameters assume most often at the rms peaks.
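The decision rule described above can be sketched in a few lines. This is only an illustration under stated assumptions and not YASA's actual implementation: the function name `detect_spindles`, the dictionary keys, the threshold values and the toy signals are all hypothetical, and the three parameters are assumed to be already computed sample by sample.

```python
def detect_spindles(rel_pow, corr, rms, thresholds, sf, min_dur=0.5, max_dur=2.0):
    """Return (start, end) sample indices (end exclusive) of candidate spindles.

    A sample is "active" when at least two of the three parameters exceed
    their threshold; a run of active samples is kept only if it lasts
    more than min_dur and less than max_dur seconds.
    """
    events = []
    start = None
    n = len(rms)
    for i in range(n + 1):  # one extra step closes a run that reaches the end
        if i < n:
            votes = ((rel_pow[i] > thresholds["rel_pow"])
                     + (corr[i] > thresholds["corr"])
                     + (rms[i] > thresholds["rms"]))
            active = votes >= 2
        else:
            active = False
        if active and start is None:
            start = i
        elif not active and start is not None:
            duration = (i - start) / sf
            if min_dur < duration < max_dur:
                events.append((start, i))
            start = None
    return events


# Toy example (illustrative values only): 6 s of signal sampled at 10 Hz.
sf = 10
thresholds = {"rel_pow": 0.2, "corr": 0.65, "rms": 1.5}
rel_pow = [0.0] * 60
corr = [0.0] * 60
rms = [0.0] * 60  # never above threshold, so only two parameters vote
for run_start, run_stop in [(5, 13), (20, 23), (30, 55)]:  # 0.8 s, 0.3 s, 2.5 s
    for i in range(run_start, run_stop):
        rel_pow[i] = 0.5
        corr[i] = 0.9

events = detect_spindles(rel_pow, corr, rms, thresholds, sf)
print(events)  # only the 0.8 s run satisfies the duration constraint
```

Personalizing the detector then amounts to changing the values in `thresholds`, which is what the following analyses try to do.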
Table 6.1: Rms peaks: rms bar-plot
Class Frequency
3 0
4 1
5 3
6 8
7 32
8 55
9 103
10 99
11 114
12 88
13 82
14 64
15 52
16 43
17 34
18 17
19 12
20 14
21 9
22 1
23 4
24 2
25 0
26 0
27 0
28 0
29 0
30 0
31 0
32 1
Other 0
Figure 6.1: Rms peaks: rms bar-plot
Most rms values fall in the range between 7 and 20.
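As a quick check of this statement, the share of peaks in the stated range can be computed directly from the frequencies of Table 6.1. The sketch below counts by class label; how Excel maps each label to bin edges is not stated here, so the helper `share_in_classes` (a hypothetical name) makes no assumption about it.

```python
# Class frequencies copied from Table 6.1 (class label -> number of rms peaks).
rms_peak_counts = {
    3: 0, 4: 1, 5: 3, 6: 8, 7: 32, 8: 55, 9: 103, 10: 99, 11: 114, 12: 88,
    13: 82, 14: 64, 15: 52, 16: 43, 17: 34, 18: 17, 19: 12, 20: 14, 21: 9,
    22: 1, 23: 4, 24: 2, 25: 0, 26: 0, 27: 0, 28: 0, 29: 0, 30: 0, 31: 0, 32: 1,
}


def share_in_classes(counts, low, high):
    """Count the entries whose class label lies in [low, high]."""
    total = sum(counts.values())
    inside = sum(n for label, n in counts.items() if low <= label <= high)
    return inside, total, inside / total


inside, total, share = share_in_classes(rms_peak_counts, 7, 20)
print(f"{inside} of {total} peaks ({share:.1%}) fall in classes 7 to 20")
```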
Table 6.2: Rms peaks: correlation bar-plot
Class Frequency
0.1 0
0.2 0
0.3 4
0.4 9
0.5 26
0.6 92
0.7 125
0.8 242
0.9 247
1 93
Other 0
Figure 6.2: Rms peaks: correlation bar-plot
Most correlation values fall in the range between 0.5 and 1.
Table 6.3: Rms peaks: relative power bar-plot
Class Frequency
0 0
0.1 66
0.2 154
0.3 206
0.4 163
0.5 128
0.6 66
0.7 41
0.8 12
0.9 2
1 0
Other 0
Figure 6.3: Rms peaks: relative power bar-plot
Most relative power values fall in the range between 0.1 and 0.8.
From these observations, I modified the algorithm by giving each parameter a lower and an
upper threshold. These are the results:
Table 6.4: Rms peaks: results with modified thresholds
Relative power   Correlation   Rms          Detected   Matching   Not detected
0.2 - ∞          0.65 - ∞      6.5 - 20.5   3250       786        52
0.2 - ∞          0.65 - ∞      7 - 20       5712       760        78
0.2 - ∞          0.45 - 1.05   7 - 20       2528       752        86
0.2 - ∞          0.65 - ∞      6.5 - 16     3192       740        98
0.1 - 0.8        0.65 - ∞      7 - 20       1961       617        221
Looking at these results, it can be noted that the number of detected spindles is very high,
meaning that there is a large number of false positives. Narrowing the ranges
does not improve the result: the number of false positives decreases, but the
number of matched spindles also decreases considerably, so the performance does not get better.
The next step is to try to define functions that describe the trend of the parameters.
6.2 Scatter-plots for rms peaks
Linking the parameters two by two and searching for a relation between them could be a
good way to find ranges that change as the parameter values change. To reach this goal, it
is useful to look at the following scatter-plots.
Figure 6.4: Scatter plot relative power-correlation of spindle peaks
Figure 6.5: Scatter plot relative power-rms of spindle peaks
Figure 6.6: Scatter plot correlation-rms of spindle peaks
From these images, there appears to be a logarithmic relationship between relative power
and correlation. The points in the other two graphs are instead more scattered, and I cannot
see a clear relation between the parameters.
6.3 Range computed with logarithmic functions
Starting from the hypothesis that a logarithmic link exists between relative power and
correlation, I tried to compute a range delimited by two logarithmic functions on the basis of
the scatter-plot in Figure 6.4 and the function calculated by Excel:
$$y = 0.1614 \ln(x) + 0.9515$$
where x is the relative power and y the correlation.
With the help of GeoGebra, I looked for the functions that best contain all or almost
all the points present in the graph while maintaining a logarithmic trend. I chose these two
functions:
$$y = 0.11 \ln(x) + 1.1$$
for the upper limit and
$$y = 0.23 \ln(x) + 0.8$$
for the lower limit.
Figure 6.7: Rms peaks: logarithmic relation and range between relative power and correlation
Now we can try to let the algorithm find a spindle only when these limits are respected.
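The additional check implied by these limits can be sketched as follows. The coefficients are the ones of the two logarithmic limits given above; the function name `in_log_band` is hypothetical, and how the check is wired into the detector is not described here, so this shows only the range test itself.

```python
import math


def in_log_band(rel_pow, corr):
    """Return True if (rel_pow, corr) lies between the two logarithmic limits."""
    if rel_pow <= 0:
        return False  # ln(x) is undefined for non-positive relative power
    lower = 0.23 * math.log(rel_pow) + 0.8
    upper = 0.11 * math.log(rel_pow) + 1.1
    return lower <= corr <= upper


# With relative power 0.3 the band is roughly [0.52, 0.97].
print(in_log_band(0.3, 0.75))  # inside the band
print(in_log_band(0.3, 0.40))  # below the lower limit
```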
Table 6.5: Rms peaks: results with logarithmic function
Relative power   Correlation   Rms          Detected   Matching   Not detected
0.2 - ∞          0.65 - ∞      6.5 - 20.5   1584       547        291
0.2 - ∞          0.65 - ∞      7 - 20       1423       528        310
0.2 - ∞          0.45 - 1.05   7 - 20       1691       579        259
0.2 - ∞          0.65 - ∞      6.5 - 16     1515       488        350
0.1 - 0.8        0.65 - ∞      7 - 20       1934       614        224
Looking at these results, it seems that this approach does not work. The reason is discussed
later. Given the poor results, it may be better to follow
the approach of the initial YASA algorithm and try to define thresholds personalized
for every patient. In this way, it is also possible to compare the results achieved with the new
thresholds against the initial ones without changing the algorithm drastically. Remember
that the aim of this project is to demonstrate that a spindle detector can be personalized to
obtain better results.
Chapter 7
Analysis to compute personalized thresholds
The analyses discussed in this chapter are the same as those described in the previous one,
because the principle is still to define a range for the
parameter values that improves the detection of the spindles, and to look for relations
between the parameters. What changes compared to what has been done so far is that we
no longer consider the rms peaks. Looking at the graphs representing the trend of
the parameters, we can notice that during a spindle event, before the rms reaches its peak,
there usually seems to be an inflection like the one circled in red:
Figure 7.1: Inflection in rms
We could try to use the central point of this inflection as a threshold that must be exceeded
to detect the spindle. However, not all spindles have this inflection. For the ones without it,
we could find the minimum value before the peak, that is, at the beginning of the spindle,
and take the point that lies a quarter of the way forward in time from
the minimum to the peak, so as to simulate the presence of an inflection. As before, I
took this value for rms and the corresponding values in time for relative power and
correlation, and then plotted some bar-plots and scatter-plots to observe the trend of
the parameters.
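The fallback rule for spindles without a visible inflection can be sketched as below. This is a minimal illustration, assuming the rms values of one spindle are available as a list; the function name `pseudo_inflection_index` and the toy values are hypothetical, and detecting a real inflection is not covered.

```python
def pseudo_inflection_index(rms_segment):
    """Index a quarter of the way (in time) from the pre-peak minimum to the peak."""
    indices = range(len(rms_segment))
    peak = max(indices, key=rms_segment.__getitem__)
    # Minimum of the rms before (or at) the peak, i.e. at the start of the spindle.
    low = min(range(peak + 1), key=rms_segment.__getitem__)
    return low + round((peak - low) / 4)


# Toy rms trace: minimum at index 2, peak at index 9.
rms_segment = [3, 2, 1, 2, 4, 6, 8, 9, 10, 12, 11]
idx = pseudo_inflection_index(rms_segment)
print(idx, rms_segment[idx])
```

The relative power and correlation values for the bar-plots would then be read at the same index `idx`.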
7.1 Bar-plots for rms inflections
To define a reasonable threshold, it is important to know which values the three parameters
assume most often at the rms inflections.
Table 7.1: Rms inflections: rms bar-plot
Class Frequency
0 0
1 25
2 146
3 147
4 129
5 95
6 53
7 59
8 47
9 42
10 33
11 17
12 20
13 15
14 1
15 2
16 3
17 1
18 1
19 1
20 0
Other 0
Figure 7.2: Rms inflections: rms bar-plot
Most rms values fall in the range between 1 and 13.
Table 7.2: Rms inflections: correlation bar-plot
Class Frequency
-0.7 1
-0.6 0
-0.5 0
-0.4 1
-0.3 0
-0.2 1
-0.1 15
0 31
0.1 73
0.2 118
0.3 122
0.4 112
0.5 95
0.6 67
0.7 79
0.8 69
0.9 38
1 16
Other 0
Figure 7.3: Rms inflections: correlation bar-plot
Most correlation values fall in the range between 0 and 0.9.
Table 7.3: Rms inflections: relative power bar-plot
Class Frequency
0 0
0.1 285
0.2 208
0.3 153
0.4 85
0.5 60
0.6 26
0.7 14
0.8 6
0.9 2
Other 1
Figure 7.4: Rms inflections: relative power bar-plot
Most relative power values fall in the range between 0.1 and 0.6.
Now we can try the algorithm with some ranges based on the results obtained with the
graphs. At first, it is better to define wide ranges that do not exclude too many spindles.
These are the results.
Table 7.4: Rms inflections: results with modified thresholds
Relative power   Correlation   Rms        Detected   Matching   Not detected
0.1 - ∞          0.05 - ∞      1.5 - ∞    2171       714        124
0.1 - 0.5        0.05 - 0.9    0.5 - 13   2103       657        181
The results are not very encouraging, so we could try to find some mathematical relations
between the parameters.
7.2 Scatter-plots for rms inflections
To look for some functional relations between the parameters, once again it is helpful to use
scatter plots.
Figure 7.5: Scatter plot relative power-correlation of spindle inflections
Figure 7.6: Scatter plot relative power-rms of spindle inflections
Figure 7.7: Scatter plot correlation-rms of spindle inflections
From these images, there appears to be an exponential relationship between correlation
and rms. In the other two graphs it is not easy to see a clear relation because the points are
more scattered, but we can try to assume a linear relation with many outliers. Now we can
use these functions to give some functional ranges to the algorithm.
7.3 Range computed with mathematical functions
The first relation that I have taken into consideration is the exponential one between
correlation and rms. The equation calculated by Excel is:
$$y = 1.8227\, e^{1.8834x}$$
where x is the correlation and y the rms. With the help of GeoGebra, I have defined one
exponential upper limit
$$y = 4 e^{1.9x}$$
and one exponential lower limit
$$y = e^{1.5x} - 1$$
Figure 7.8: Rms inflections: exponential relation and range between correlation and rms
Now we can test the algorithm adding these limits (first line of the table) and compare the
results with the ones without the limits (second line of the table).
Table 7.5: Rms inflections: results with exponential relation between correlation and rms
Relative power   Correlation   Rms       Detected   Matching   Not detected
0.1 - ∞          0.05 - ∞      1.5 - ∞   2139       706        132
0.1 - ∞          0.05 - ∞      1.5 - ∞   2171       714        124
Like the logarithmic relation between relative power and correlation computed for the rms peaks,
this approach does not seem to work either. We could try to add the two linear relations seen in
the scatter-plots for the other parameters. Regarding the relation between relative power
and correlation, the starting point is the function
$$y = 1.1219x + 0.1564$$
where x is the relative power and y the correlation. I took as upper limit the function
$$y = 1.12x + 0.7$$
and as lower limit
$$y = 1.12x - 0.5$$
Figure 7.9: Rms inflections: linear relation and range between relative power and correlation
Now we can test the algorithm adding these limits (first line of the table) and compare the
results with the ones with the exponential limits for correlation and rms (second line of the table).
Table 7.6: Rms inflections: results with linear relation between relative power and correlation
Relative power   Correlation   Rms       Detected   Matching   Not detected
0.1 - ∞          0.05 - ∞      1.5 - ∞   2103       686        152
0.1 - ∞          0.05 - ∞      1.5 - ∞   2139       706        132
This restriction does not provide satisfying results either. We could add the last range, based on
a linear relation between relative power and rms. The linear function computed by Excel is
$$y = 11.536x + 2.3996$$
where x is the relative power and y the rms. I took as upper limit the function
$$y = 1.12x + 0.7$$
and as lower limit
$$y = 1.12x - 0.5$$
Figure 7.10: Rms inflections: linear relation and range between relative power and rms
Now we can test the algorithm adding these limits (first line of the table) and compare the
results with the ones with the linear limits for relative power and correlation (second line of
the table) and with the exponential limits for correlation and rms (third line of the table).
Table 7.7: Rms inflections: results with linear relation between relative power and rms
Relative power   Correlation   Rms       Detected   Matching   Not detected
0.1 - ∞          0.05 - ∞      1.5 - ∞   2052       644        194
0.1 - ∞          0.05 - ∞      1.5 - ∞   2103       686        152
0.1 - ∞          0.05 - ∞      1.5 - ∞   2139       706        132
We can see that using mathematical functions as ranges does not improve the performance.
Looking at the last table, I notice that the number of detected spindles does not decrease
much, while the number of matching spindles does, as happened before with the logarithmic
function for rms peaks.
To understand why this approach does not work, the first thing I thought of was to check whether
these relations hold throughout the entire night and not just during spindle events.
Chapter 8
Analysis on some rms inflections during the night
To test the hypothesis that the relations between the parameters are maintained during
the whole night, I took 838 points of the track (to keep the same number of samples)
where the rms has an inflection and I know there is no spindle. I then plotted some
scatter-plots to compare them with the ones plotted in the presence of a spindle.
Figure 8.1: Scatter plot relative power-correlation of spindle inflections
Figure 8.2: Scatter plot relative power-correlation of no-spindle inflections
Figure 8.3: Scatter plot relative power-rms of spindle inflections
Figure 8.4: Scatter plot relative power-rms of no-spindle inflections
Figure 8.5: Scatter plot correlation-rms of spindle inflections
Figure 8.6: Scatter plot correlation-rms of no-spindle inflections
Looking at these graphs, we can see that the relations between the parameters computed
where there is a spindle event and the ones in its absence are very similar. This may be
the reason why the mathematical ranges calculated before do not produce good results.
Maybe we should focus on an automatic method to customize the thresholds, giving the
algorithm one value for each parameter that must be exceeded to consider an event
a spindle; an upper limit may also be useful. We could try to compute these
thresholds on the basis of the first ten or twenty spindles marked by the neurologist, and
use them to detect the other ones.
Chapter 9
Confidence interval on the mean to compute the thresholds
The confidence interval on the mean is a statistical term used to describe the range of values
in which the true mean is expected to fall, based on the data and the chosen confidence level.
The most commonly used confidence level is 95 percent, meaning that if the sampling were
repeated many times, about 95 percent of the intervals calculated in this way would contain
the true mean. To calculate the confidence interval, you need to know the mean of your data set,
the standard deviation, the sample size and your chosen confidence level.
After calculating the confidence interval, I executed the algorithm with the lower limits used
as thresholds for every parameter. The performance seems to improve. To quantify this
improvement, I relied on the F1-score.
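A minimal sketch of this computation is given below. It assumes the normal approximation (z quantile) for the interval, which is a choice made for this illustration because it needs only the standard library; which formula was actually used is not stated here. The function name `confidence_interval` and the sample values are hypothetical.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev


def confidence_interval(values, confidence=0.95):
    """Two-sided confidence interval on the mean (normal approximation)."""
    n = len(values)
    centre = mean(values)
    # Quantile of the standard normal for the requested two-sided level.
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    half_width = z * stdev(values) / sqrt(n)
    return centre - half_width, centre + half_width


# Hypothetical rms values taken from a few annotated spindles.
rms_values = [8, 9, 10, 11, 12]
lower_95, upper_95 = confidence_interval(rms_values, 0.95)
lower_50, upper_50 = confidence_interval(rms_values, 0.50)

# The lower limit is the value used as the threshold for the parameter.
print(f"95 %: threshold {lower_95:.3f}; 50 %: threshold {lower_50:.3f}")
```

A lower confidence level gives a narrower interval and therefore a higher lower limit, that is, a stricter threshold.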
9.1 F1-score as a measure of performance
In statistical analysis of binary classification, the F1 score (also F-score or F-measure) is a
measure of a test’s accuracy. It considers both the precision p and the recall r of the test
to compute the score: p is the number of correct positive results divided by the number of
all positive results returned by the classifier, and r is the number of correct positive results
divided by the number of all relevant samples (all samples that should have been identified
as positive). The F1 score is the harmonic mean of the precision and recall, where an F1
score reaches its best value at 1 (perfect precision and recall) and worst at 0.
$$F_1 = 2 \cdot \frac{\text{Precision} \cdot \text{Recall}}{\text{Precision} + \text{Recall}}$$
$$\text{Precision} = \frac{\text{TruePositive}}{\text{TruePositive} + \text{FalsePositive}}$$
$$\text{Recall} = \frac{\text{TruePositive}}{\text{TruePositive} + \text{FalseNegative}}$$
In this project, every event is cataloged as follows:
- True Positive (TP): spindle events detected as spindle events
- True Negative (TN): no-spindle events detected as no-spindle events
- False Positive (FP): no-spindle events detected as spindle events
- False Negative (FN): spindle events detected as no-spindle events
An event, however, is not just a point in the EEG track, so it is not easy to catalog.
For this purpose, it is useful to divide the track into windows of a few seconds. In this
way, we can classify events as follows:
- True Positive (TP): windows containing spindle events detected as windows containing spindle events
- True Negative (TN): windows containing no-spindle events detected as windows containing no-spindle events
- False Positive (FP): windows containing no-spindle events detected as windows containing spindle events
- False Negative (FN): windows containing spindle events detected as windows containing no-spindle events
As window length I chose five seconds, which seems to me a good compromise between
the duration of spindles, usually one or two seconds, and the high number of True Negatives
present in the track. In fact, considering a sleep of almost 8 hours, if we exclude the
838 real spindles and the ones improperly detected, the rest of the track is formed only by
TN. Naturally, if I change the window in which I compute the matching, the results change.
This is not a problem because the F1-score is used to compare the initial
YASA algorithm with the modified one, so it is enough to recalculate the matching for the
initial algorithm and compute its F1-score.
These are the results of the initial YASA algorithm with the new matching
Table 9.1: Initial YASA algorithm results with F1-score
Real spindles   Detected spindles   Matching   Not detected spindles   F1-score
838             904                 422        416                     0.48
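The F1-score in Table 9.1 can be reproduced from the three counts, as in the sketch below. It assumes that the matching spindles are the true positives, the detected but not matching ones the false positives, and the real but not detected ones the false negatives; the function name `f1_from_counts` is hypothetical.

```python
def f1_from_counts(real, detected, matching):
    """F1-score from the number of real, detected and matching spindles."""
    tp = matching
    fp = detected - matching  # detected events that match no real spindle
    fn = real - matching      # real spindles that were not detected
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Counts from Table 9.1 (initial YASA algorithm).
f1_initial = f1_from_counts(real=838, detected=904, matching=422)
print(round(f1_initial, 2))
```

True negatives do not enter the formula, which is convenient here because almost all of the night consists of windows without spindles.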
Now we can compare these results with the ones obtained by computing the thresholds with the
confidence interval, using different confidence levels.
Table 9.2: Results using confidence interval
Confidence level   Detected spindles   Matching   Not detected spindles   F1-score
80 %               771                 466        372                     0.58
90 %               776                 472        366                     0.58
95 %               801                 475        363                     0.58
We can notice that the F1-score has increased by 0.1. This means that calculating the thresholds
with the confidence interval on the mean is a good approach and helps us to increase
the performance of the detector.
9.2 Personalize the algorithm knowing only some spindles per patient
Now that we have found a good way to set the thresholds, it is important for the aim of the
project to try to detect the spindles by computing the thresholds from only the first 10 or 20
spindles annotated by the neurologist. The confidence interval is calculated only on these
spindles. These are the results:
Table 9.3: Results using confidence interval with 10/20 spindles in input
Input spindles   Confidence level   Detected spindles   Matching   Not detected spindles   F1-score
10               80 %               2321                761        77                      0.48
10               70 %               2080                738        100                     0.51
10               50 %               1738                701        137                     0.54
20               95 %               1942                724        114                     0.52
20               80 %               1549                659        179                     0.55
20               50 %               1268                608        230                     0.58
As we could expect, the number of detected spindles is higher than before, because the
confidence interval is computed on fewer data and so is less accurate. However, as the
confidence level decreases, the F1-score improves because, although the matching is no longer
as high, the algorithm detects fewer false positives.
Chapter 10
Preliminary analysis for adaptive method to compute thresholds
The aim of the project has been reached with the use of the confidence interval on the mean.
However, to improve the performance, further analysis can be carried out. A possible approach
could be an adaptive method to set the thresholds. The idea is to fix one threshold and,
at each point of the track where it is exceeded, compute the other two on the basis of the
value assumed at that moment by the parameter whose threshold has been fixed.
To understand whether this method could be effective, it is important to perform some extra
analysis on the parameters.
10.1 Exploration of the relationship between parameters
To compute two thresholds on the basis of the value assumed by the other parameter, there
should be a relation between the three variables. To verify the relation, I computed a
polynomial regression with relative power and correlation as independent variables and rms as
target variable. In statistics, polynomial regression is a form of regression analysis in which
the relationship between the independent variable x and the dependent variable y is
modelled as an nth degree polynomial in x. I chose a polynomial regression
of degree 2 because the F1-score does not improve with increasing degree,
while degree 1 is not very accurate. I performed the regression with and without
removing outliers. To remove them, I used the Cook's distance method. It is used in
regression analysis to identify the effects of outliers, since influential outliers are believed
to negatively affect the model. Cook's distance tries to capture this influence with respect to
the predictor variables. The distance is a measure combining the leverage and the residual of each
value; the higher the leverage and residual, the higher the Cook's distance. An
outlier is detected when
$$\text{Cook's distance} > \frac{4}{n - p}$$
where p is the number of variables and n is the dataset size.
To determine how well the model fits the data, I have used R-squared ($R^2$), also known as
the coefficient of determination. It is the proportion of the variance in the dependent variable
that is predictable from the independent variables. It is a number between 0 and 1, where
0 indicates that the model explains none of the variability of the response data around its
mean, while 1 indicates that the model explains all the variability of the response data around
its mean. So the higher the R-squared value, the better the model fits the data.
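The regression, the outlier rule and the R-squared value can be sketched together with NumPy. This is an illustration on synthetic data, not the code used for the results below: the helper names `design_matrix` and `fit_r2_and_cooks` are hypothetical, the textbook formula for Cook's distance is used, and p = 2 (the two independent variables) is an assumption about how the cut-off was applied.

```python
import numpy as np


def design_matrix(x1, x2):
    """Degree-2 polynomial features of two predictors, with intercept."""
    return np.column_stack([np.ones_like(x1), x1, x2, x1**2, x1 * x2, x2**2])


def fit_r2_and_cooks(X, y):
    """Least-squares fit; returns R-squared and Cook's distance of each point."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    n, k = X.shape  # k = number of fitted coefficients
    ss_res = float(resid @ resid)
    ss_tot = float(((y - y.mean()) ** 2).sum())
    r2 = 1.0 - ss_res / ss_tot
    # Leverage = diagonal of the hat matrix X (X'X)^-1 X'.
    leverage = np.einsum("ij,jk,ik->i", X, np.linalg.pinv(X.T @ X), X)
    mse = ss_res / (n - k)
    cooks = resid**2 / (k * mse) * leverage / (1.0 - leverage) ** 2
    return r2, cooks


# Synthetic stand-ins for relative power (x1), correlation (x2) and rms (y).
rng = np.random.default_rng(0)
x1 = rng.uniform(0.0, 1.0, 50)
x2 = rng.uniform(0.0, 1.0, 50)
y = 1 + 2 * x1 - x2 + 0.5 * x1**2 + 0.3 * x1 * x2 + 0.1 * x2**2
y = y + rng.normal(0.0, 0.01, 50)
y[0] += 50.0  # one artificial outlier

X = design_matrix(x1, x2)
r2_all, cooks = fit_r2_and_cooks(X, y)

p = 2  # number of independent variables (assumption, see lead-in)
keep = cooks <= 4.0 / (len(y) - p)
r2_clean, _ = fit_r2_and_cooks(X[keep], y[keep])
print(f"R2 with outliers: {r2_all:.2f}, without: {r2_clean:.2f}")
```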
10.2 Polynomial regression on spindle inflections
First of all, I computed the regression considering the inflections and minima of the
spindles. These are the results:
Table 10.1: Rms inflections and minimum: $R^2$ in polynomial regression
Removal outliers   Number of spindles   $R^2$
No                 838                  0.56
Yes                769                  0.57
Then, I separated the inflections from the minima and repeated the regression:
Table 10.2: Rms inflections and minimum separated: $R^2$ in polynomial regression
Type          Removal outliers   Number of spindles   $R^2$
Inflections   No                 528                  0.56
Inflections   Yes                496                  0.61
Minimum       No                 310                  0.56
Minimum       Yes                292                  0.58
Looking at these tables, we can notice that removing the outliers does not change the $R^2$
value much and that inflections and minima can be considered together in our analysis.
However, the R-squared coefficient is not very high. We could try to perform the regression
considering all the points of the spindle and not only its inflection or minimum:
Table 10.3: Total spindle durations: $R^2$ in polynomial regression
Removal outliers   Number of total points   $R^2$
No                 174125                   0.69
Yes                160726                   0.72
From these results, we can understand that the polynomial regression works better on the
whole spindle duration than on a single point of the event. The R-squared coefficient is
good, so we can conclude that a relation exists between the three parameters when
a spindle occurs. We can try to follow the approach of adapting the thresholds during the
night.
Chapter 11
Adaptive method to compute thresholds
To decide for which parameter to fix the threshold, I computed some regressions of
degree 2. I related the parameters two by two to understand which regression
fits the data best.
Table 11.1: Relations between the parameters: $R^2$ in regression
Independent variable   Dependent variable   $R^2$ with outliers   $R^2$ without outliers
relative power         rms                  0.28                  0.32
relative power         correlation          0.39                  0.47
rms                    relative power       0.30                  0.27
rms                    correlation          0.62                  0.62
correlation            relative power       0.48                  0.46
correlation            rms                  0.52                  0.54
Looking at this table, I think that the parameter that could be fixed is correlation, because
when it is used as the independent variable in the regression both R-squared coefficients are
quite good. To set the threshold, I computed the confidence interval with different confidence
levels, to see which is best. I calculated it taking as input either only the point of
inflection or minimum of the spindle, or the whole spindle, on the first
10 or 20 spindles. Then, for every point of the track, if this threshold is exceeded, I
calculated the other two thresholds with the regression coefficients computed before, and
checked whether these new thresholds are also exceeded. These are the results:
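The per-point check described above can be sketched as follows. The regression coefficients, the fixed correlation threshold and the sample points are made-up placeholders, and the function names `poly2` and `adaptive_flags` are hypothetical; only the control flow reflects the description.

```python
def poly2(coeffs, x):
    """Evaluate a degree-2 polynomial a*x^2 + b*x + c."""
    a, b, c = coeffs
    return a * x * x + b * x + c


def adaptive_flags(rel_pow, corr, rms, corr_threshold, rms_from_corr, rel_pow_from_corr):
    """Flag the samples where the fixed and the two derived thresholds are exceeded."""
    flags = []
    for rp, c, r in zip(rel_pow, corr, rms):
        if c <= corr_threshold:
            flags.append(False)  # fixed threshold not exceeded: nothing to compute
            continue
        # Thresholds for the other two parameters, derived from the current correlation.
        rms_threshold = poly2(rms_from_corr, c)
        rel_pow_threshold = poly2(rel_pow_from_corr, c)
        flags.append(r > rms_threshold and rp > rel_pow_threshold)
    return flags


# Placeholder coefficients (a, b, c); real ones would come from the regressions.
rms_from_corr = (0.0, 4.0, 0.0)
rel_pow_from_corr = (0.0, 0.0, 0.2)

flags = adaptive_flags(
    rel_pow=[0.3, 0.3, 0.1, 0.3],
    corr=[0.8, 0.4, 0.8, 0.8],
    rms=[5.0, 10.0, 5.0, 3.0],
    corr_threshold=0.5,
    rms_from_corr=rms_from_corr,
    rel_pow_from_corr=rel_pow_from_corr,
)
print(flags)
```

The flagged samples would then still have to satisfy the duration constraint of the detector before being counted as a spindle.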
Table 11.2: Adaptive method: correlation fixed
Input spindles   Inflection / all spindle   Confidence level   Detected spindles   Matching   Not detected spindles   F1-score
10               all spindle                50 %               334                 234        604                     0.40
10               all spindle                90 %               343                 239        599                     0.40
10               inflection                 50 %               682                 351        487                     0.46
10               inflection                 90 %               850                 375        463                     0.44
20               inflection                 50 %               833                 433        405                     0.52
20               inflection                 90 %               1024                471        367                     0.51
20               all spindle                90 %               579                 344        494                     0.49
Some of these F1-scores are better than that of the initial YASA algorithm, but not better than
those of the algorithm that uses thresholds computed only with the confidence interval.
Maybe correlation is not the best threshold to fix, so we can try to fix one of the other two
parameters. Moreover, an upper limit for the parameters could be introduced. We can
compute it using the confidence interval on the values of the parameters corresponding
to the rms peaks.
Table 11.3: Adaptive method: rms fixed
Input spindles   Inflection / all spindle   Confidence level   Detected spindles   Matching   Not detected spindles   F1-score
10               all spindle                90 %               673                 348        490                     0.46
10               all spindle                99 %               701                 358        480                     0.47
10               all spindle                50 %               640                 333        505                     0.45
10               inflection                 50 %               1714                481        357                     0.38
10               inflection                 90 %               1808                475        363                     0.36
20               inflection                 90 %               1568                454        384                     0.38
20               inflection                 50 %               1499                449        389                     0.38
Table 11.4: Adaptive method: rms fixed with upper limit
Input spindles   Inflection / all spindle   Confidence level   Detected spindles   Matching   Not detected spindles   F1-score
10               inflection                 90 %               994                 135        703                     0.15
20               inflection                 90 %               1542                438        400                     0.37
20               all spindle                90 %               798                 362        476                     0.44
20               all spindle                95 %               806                 362        476                     0.44
20               all spindle                50 %               777                 360        478                     0.45
Table 11.5: Adaptive method: relative power fixed
Input spindles   Inflection / all spindle   Confidence level   Detected spindles   Matching   Not detected spindles   F1-score
10               all spindle                50 %               1555                561        277                     0.47
10               all spindle                90 %               1587                563        275                     0.46
10               inflection                 90 %               3283                729        109                     0.35
10               inflection                 50 %               3079                738        100                     0.38
20               inflection                 50 %               2902                691        147                     0.37
20               inflection                 90 %               3272                700        138                     0.34
20               all spindle                90 %               1652                563        275                     0.45
20               all spindle                50 %               1617                559        279                     0.46
Table 11.6: Adaptive method: relative power fixed with upper limit
Input spindles   Inflection / all spindle   Confidence level   Detected spindles   Matching   Not detected spindles   F1-score
10               inflection                 90 %               3314                693        145                     0.33
10               inflection                 50 %               3016                625        213                     0.32
20               inflection                 50 %               2909                585        253                     0.31
Fixing the thresholds of the other parameters does not improve the situation. Sometimes
the F1-score is better using just the inflections and minima to calculate the confidence
interval, sometimes the whole spindle is necessary. When the number of detected
spindles was very high, I introduced the upper limit, but it seems to exclude too many
true positive events. We could conclude that for this patient this adaptive method does not
work so well.
Chapter 12
Conclusions
The aim of the project has been reached. I have demonstrated that a spindle detection
algorithm can be made customizable for each patient. The best method I have tested for
this purpose is to calculate the three parameter thresholds on the first 10 spindles
(or 20 for more precision) annotated by the doctor, using the
confidence interval on the mean. For the patient on which I tested this method, the
F1-score increased from 0.48, the result of the initial YASA algorithm, to 0.58, obtained using
20 input spindles and a confidence level of 50%. Unfortunately, I had only one file with
the neurologist’s notes, so I could not test this method on other patients.
However, I have performed some analyses that can be replicated on other files, and I have
shown that there is a relation between the trends of the three parameters used to
detect the spindles.