Sleep Quality Estimation Using Accelerometer Data from Thigh-Mounted Devices in Free-Living Conditions

(working title)

Esben Lykke, PhD student

26 February 2023

Background

  • Sleep plays a vital role in health, so improving the assessment of sleep–wake states outside the laboratory is critical
  • The gold standard, polysomnography (PSG), is costly and inconvenient
  • Methods for estimating sleep/wake from accelerometry exist, primarily for wrist-worn devices
  • The Cole-Kripke and Sadeh algorithms are commonly used
  • Determining in-bed time is difficult; it is usually set from sleep logs and/or by human scorers
  • Detecting wakefulness is difficult, and performance is worse in populations with sleep disorders
  • Analysis is typically done at two levels: epoch-based and summarized across night(s)
  • Zmachine-derived sleep stats

Purpose…

But Esben, what about them sleep stages!?

  • I did free-living PSG recordings of sleep but…
    • Super fragile -> shitty data
    • Cumbersome and time-consuming
    • free-living when wired up like a robot?
    • would surface skin temperature + acc be enough? Most likely needs HR

It was likely a dead end from the get-go :(

Methods

  • Data preparation: the big time-consumer is handling raw acc data
  • Only thigh data used. HSBC and others are thigh data only…
  • All ZM (Zmachine) recording time is considered in-bed (sensor problem?)
  • No sleep stages, only sleep/awake
  • Sensor problems during sleep: gaps of up to 20 consecutive epochs (200 s) are treated as sleep
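The gap rule in the last bullet can be sketched as follows. This is a minimal illustration, not the pipeline actually used: the label coding (1 = sleep, 0 = awake, None = sensor problem), the function name `fill_short_gaps`, and the requirement that a gap be bordered by sleep on both sides are all assumptions.

```python
from typing import List, Optional

SLEEP, AWAKE = 1, 0
MAX_GAP_EPOCHS = 20  # 20 epochs x 10 s = 200 s, as stated in the bullet above


def fill_short_gaps(labels: List[Optional[int]],
                    max_gap: int = MAX_GAP_EPOCHS) -> List[Optional[int]]:
    """Relabel short runs of missing epochs (None) as sleep.

    Assumption: a gap only counts as "during sleep" when the epochs
    directly before and after it are both scored as sleep.
    """
    filled = list(labels)
    n = len(labels)
    i = 0
    while i < n:
        if labels[i] is not None:
            i += 1
            continue
        # Find the end of this run of missing epochs: [i, j)
        j = i
        while j < n and labels[j] is None:
            j += 1
        bordered_by_sleep = (
            i > 0 and j < n and labels[i - 1] == SLEEP and labels[j] == SLEEP
        )
        if bordered_by_sleep and (j - i) <= max_gap:
            for k in range(i, j):
                filled[k] = SLEEP
        i = j
    return filled
```

Gaps longer than the threshold, or gaps next to an awake epoch, are left as missing so they can be handled (or excluded) explicitly later.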

Exclusion Criteria

Features

Basic Features

  • Weekday
  • Time of Day
  • Placement
  • Temperature

ACC-Derived Features

  • Mean ACC X
  • Mean ACC Y
  • Mean ACC Z
  • Standard Deviation X
  • Standard Deviation Y
  • Standard Deviation Z
  • Max Standard Deviation
  • Inclination
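A rough sketch of how the features listed above could be computed for one epoch of raw samples. The function name, the use of the population standard deviation, and the definition of inclination (angle between the device x axis and the mean acceleration vector, in degrees) are assumptions for illustration; the deck does not specify them.

```python
import math
import statistics
from typing import Dict, List, Tuple


def epoch_features(samples: List[Tuple[float, float, float]]) -> Dict[str, float]:
    """Summarize one epoch of raw (x, y, z) accelerometer samples in g."""
    xs, ys, zs = zip(*samples)
    mean_x, mean_y, mean_z = (statistics.fmean(a) for a in (xs, ys, zs))
    sd_x, sd_y, sd_z = (statistics.pstdev(a) for a in (xs, ys, zs))

    # Assumed definition: angle between the x axis and the mean acceleration
    # vector (dominated by gravity when the device is still).
    norm = math.sqrt(mean_x ** 2 + mean_y ** 2 + mean_z ** 2)
    if norm > 0:
        cos_angle = max(-1.0, min(1.0, mean_x / norm))
        inclination = math.degrees(math.acos(cos_angle))
    else:
        inclination = math.nan

    return {
        "mean_x": mean_x, "mean_y": mean_y, "mean_z": mean_z,
        "sd_x": sd_x, "sd_y": sd_y, "sd_z": sd_z,
        "sd_max": max(sd_x, sd_y, sd_z),
        "inclination": inclination,
    }
```

For example, `epoch_features([(1.0, 0.0, 0.0)] * 4)` describes a still device with gravity along x, so all standard deviations are zero and the inclination is zero degrees.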

Sensor-Independent Features

  • Clock Proxy Linear
  • Clock Proxy Cosine

Human Circadian Clock

Forger, Jewett, and Kronauer (1999): a so-called cubic van der Pol equation

\[\frac{dx_c}{dt}=\frac{\pi}{12}\left\{\mu\left(x_c-\frac{4x_c^3}{3}\right)-x\left[\left(\frac{24}{0.99669\,\tau_x}\right)^2+kB\right]\right\}\]

This thing is dependent on ambient light and body temperature!
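To get a feel for the oscillator, here is a small simulation sketch in constant darkness (light drive B = 0). Several things are assumptions not stated on the slide: the companion equation dx/dt = (π/12)(x_c + B), the parameter values μ = 0.13, k = 0.55 and τ_x = 24.2 h, the cubic term being taken in x_c, and the use of a fixed-step Runge-Kutta integrator.

```python
import math
from typing import List, Tuple

# Assumed parameter values (not given on the slide)
MU, K, TAU_X = 0.13, 0.55, 24.2


def derivatives(x: float, xc: float, b: float = 0.0) -> Tuple[float, float]:
    """Right-hand side of the two-variable oscillator; b is the light drive B."""
    dx = (math.pi / 12.0) * (xc + b)  # assumed companion equation
    dxc = (math.pi / 12.0) * (
        MU * (xc - 4.0 * xc ** 3 / 3.0)
        - x * ((24.0 / (0.99669 * TAU_X)) ** 2 + K * b)
    )
    return dx, dxc


def simulate(hours: float, dt: float = 0.05, x0: float = 1.0,
             xc0: float = 0.0, b: float = 0.0) -> List[Tuple[float, float, float]]:
    """Integrate with classical 4th-order Runge-Kutta; returns (t, x, xc) tuples."""
    x, xc = x0, xc0
    out = [(0.0, x, xc)]
    for i in range(int(round(hours / dt))):
        k1 = derivatives(x, xc, b)
        k2 = derivatives(x + 0.5 * dt * k1[0], xc + 0.5 * dt * k1[1], b)
        k3 = derivatives(x + 0.5 * dt * k2[0], xc + 0.5 * dt * k2[1], b)
        k4 = derivatives(x + dt * k3[0], xc + dt * k3[1], b)
        x += dt / 6.0 * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0])
        xc += dt / 6.0 * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1])
        out.append(((i + 1) * dt, x, xc))
    return out
```

Under these assumptions, `simulate(240.0)` settles onto a limit cycle with an amplitude near 1 and a period close to 24 hours, which is why a plain cosine is a plausible stand-in when no light data are available.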

Walch et al. (2019) incorporated this feature using step counts from the Apple Watch

But as demonstrated by Walch et al. (2019), a simple cosine function does the trick just as well :)
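A minimal sketch of what the two clock proxy features might look like. The exact definitions are not given here, so the reference hour for the linear proxy (noon) and the peak hour of the cosine (03:00) are hypothetical choices, as are the function names.

```python
import math


def clock_proxy_linear(hour_of_day: float, start_hour: float = 12.0) -> float:
    """Fraction of the 24 h cycle elapsed since start_hour, in [0, 1).

    start_hour = 12.0 (noon) is an assumed reference so that the value
    increases monotonically through the night.
    """
    return ((hour_of_day - start_hour) % 24.0) / 24.0


def clock_proxy_cosine(hour_of_day: float, peak_hour: float = 3.0) -> float:
    """Cosine with a 24 h period, peaking at peak_hour (assumed 03:00)."""
    return math.cos(2.0 * math.pi * (hour_of_day - peak_hour) / 24.0)
```

Both features depend only on the timestamp, which is what makes them sensor-independent.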

Circadian Proxy Features


Building Models

Estimate Sleep Quality Metrics


Results

Epoch-Based

  • Performance Metrics
    • F1 Score
    • Accuracy
    • Sensitivity
    • Specificity
    • ROC curves
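For reference, the threshold-based metrics in this list can all be derived from the confusion matrix of epoch labels. The sketch below uses hypothetical function names and assumes binary labels with 1 as the positive class; precision is included because F1 depends on it.

```python
import math
from typing import Dict, Sequence


def epoch_metrics(y_true: Sequence[int], y_pred: Sequence[int],
                  positive: int = 1) -> Dict[str, float]:
    """Epoch-by-epoch metrics from true and predicted binary labels."""
    tp = fp = tn = fn = 0
    for t, p in zip(y_true, y_pred):
        if p == positive:
            if t == positive:
                tp += 1
            else:
                fp += 1
        else:
            if t == positive:
                fn += 1
            else:
                tn += 1

    def ratio(num: float, den: float) -> float:
        # Undefined ratios (empty denominator) are reported as NaN
        return num / den if den else math.nan

    sensitivity = ratio(tp, tp + fn)   # recall for the positive class
    specificity = ratio(tn, tn + fp)
    precision = ratio(tp, tp + fp)
    accuracy = ratio(tp + tn, tp + tn + fp + fn)
    f1 = ratio(2 * tp, 2 * tp + fp + fn)
    return {"f1": f1, "accuracy": accuracy, "sensitivity": sensitivity,
            "precision": precision, "specificity": specificity}
```

Note that accuracy and specificity can stay high for a rare class even when sensitivity and F1 are poor, so it helps to read all of them together.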

Summarized across nights

  • Agreement With Zmachine Sleep Stats
    • Sleep Period Time
    • Total Sleep Time
    • Sleep Efficiency
    • Latency Until Persistent Sleep
    • Wake After Sleep Onset
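The night-level statistics above can be derived from the epoch sequence of one in-bed period. The sketch below is only one possible set of definitions and all of them are assumptions: SPT runs from the first to the last sleep epoch, sleep efficiency is TST divided by time in bed, WASO is wake time inside SPT, and "persistent sleep" means 10 consecutive minutes of sleep. The 10 s epoch length matches the 20 epochs = 200 s noted under Methods.

```python
import math
from typing import Dict, List


def sleep_stats(labels: List[int], epoch_sec: float = 10.0,
                persistent_min: float = 10.0) -> Dict[str, float]:
    """Night summary from in-bed epoch labels (1 = asleep, 0 = awake)."""
    n = len(labels)
    sleep_idx = [i for i, s in enumerate(labels) if s == 1]
    if not sleep_idx:
        return {"spt_hrs": 0.0, "tst_hrs": 0.0, "se_pct": 0.0,
                "lps_min": math.nan, "waso_min": 0.0}

    first, last = sleep_idx[0], sleep_idx[-1]
    spt_epochs = last - first + 1          # first to last sleep epoch
    tst_epochs = len(sleep_idx)            # all epochs scored as sleep
    waso_epochs = spt_epochs - tst_epochs  # wake epochs inside SPT

    # Latency until the first run of `persistent_min` minutes of sleep
    need = max(1, round(persistent_min * 60.0 / epoch_sec))
    run, lps_min = 0, math.nan
    for i, s in enumerate(labels):
        run = run + 1 if s == 1 else 0
        if run == need:
            lps_min = (i - need + 1) * epoch_sec / 60.0
            break

    return {
        "spt_hrs": spt_epochs * epoch_sec / 3600.0,
        "tst_hrs": tst_epochs * epoch_sec / 3600.0,
        "se_pct": 100.0 * tst_epochs / n,
        "lps_min": lps_min,
        "waso_min": waso_epochs * epoch_sec / 60.0,
    }
```

Applying the same function to the Zmachine labels and to each model's predictions gives paired values per night, which is the input needed for the agreement analysis.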

ROC Curves


Lots of Metrics

Performance of the models when predicting each class separately

| Target | Metric | Logistic Regression | Neural Network | Decision Tree | XGBoost |
|---|---|---|---|---|---|
| In-Bed | F1 Score | 90.88% | 93.69% | 93.37% | 93.77% |
| In-Bed | Accuracy | 92.87% | 94.81% | 94.46% | 94.85% |
| In-Bed | Sensitivity | 85.43% | 92.64% | 93.83% | 93.16% |
| In-Bed | Precision | 97.07% | 94.75% | 92.92% | 94.39% |
| In-Bed | Specificity | 98.17% | 96.35% | 94.91% | 96.06% |
| Sleep | F1 Score | 86.57% | 89.59% | 89.34% | 89.62% |
| Sleep | Accuracy | 90.77% | 92.41% | 92.10% | 92.39% |
| Sleep | Sensitivity | 84.65% | 92.95% | 94.20% | 93.49% |
| Sleep | Precision | 88.59% | 86.47% | 84.96% | 86.06% |
| Sleep | Specificity | 94.09% | 92.12% | 90.96% | 91.79% |

Performance of the models when predicting each combined class

| Target | Metric | Logistic Regression | Neural Network | Decision Tree | XGBoost |
|---|---|---|---|---|---|
| In-Bed Awake | F1 Score | 15.88% | 25.45% | 26.41% | 27.54% |
| In-Bed Awake | Accuracy | 92.05% | 92.95% | 93.04% | 93.26% |
| In-Bed Awake | Sensitivity | 11.67% | 18.73% | 19.44% | 19.93% |
| In-Bed Awake | Precision | 24.83% | 39.69% | 41.18% | 44.58% |
| In-Bed Awake | Specificity | 97.57% | 98.05% | 98.09% | 98.30% |
| In-Bed Sleep | F1 Score | 86.56% | 89.54% | 89.35% | 89.61% |
| In-Bed Sleep | Accuracy | 90.76% | 92.39% | 92.11% | 92.38% |
| In-Bed Sleep | Sensitivity | 84.61% | 92.69% | 94.18% | 93.45% |
| In-Bed Sleep | Precision | 88.60% | 86.60% | 84.99% | 86.07% |
| In-Bed Sleep | Specificity | 94.10% | 92.23% | 90.98% | 91.80% |

Bland-Altman Plots

Bland-Altman Analysis

| Metric | Model | Bias (95% CI) | Lower LOA (95% CI) | Upper LOA (95% CI) |
|---|---|---|---|---|
| Sleep Period Time (hrs) | Logistic Regression | -1.28 (-1.41; -1.15) | -4.08 (-4.48; -3.78) | 1.53 (1.22; 1.93) |
| Sleep Period Time (hrs) | Neural Network | -0.39 (-0.51; -0.27) | -3.09 (-3.49; -2.76) | 2.31 (1.95; 2.75) |
| Sleep Period Time (hrs) | Decision Tree | -0.19 (-0.34; -0.08) | -2.96 (-3.37; -2.63) | 2.59 (2.15; 3.03) |
| Sleep Period Time (hrs) | XGBoost | -0.37 (-0.49; -0.25) | -3 (-3.46; -2.69) | 2.27 (1.92; 2.73) |
| Total Sleep Time (hrs) | Logistic Regression | -0.59 (-0.7; -0.48) | -3.04 (-3.31; -2.83) | 1.87 (1.66; 2.12) |
| Total Sleep Time (hrs) | Neural Network | -0.04 (-0.14; 0.07) | -2.36 (-2.63; -2.15) | 2.29 (2.04; 2.55) |
| Total Sleep Time (hrs) | Decision Tree | 0.05 (-0.06; 0.15) | -2.21 (-2.42; -2) | 2.3 (2.08; 2.55) |
| Total Sleep Time (hrs) | XGBoost | -0.02 (-0.11; 0.09) | -2.31 (-2.59; -2.07) | 2.27 (2.02; 2.54) |
| Sleep Efficiency (%) | Logistic Regression | 5.76 (5.17; 6.36) | -8.48 (-10.99; -6.81) | 20.01 (18.42; 22.28) |
| Sleep Efficiency (%) | Neural Network | 3.34 (2.49; 4.17) | -14.88 (-17.32; -12.46) | 21.57 (20; 23.35) |
| Sleep Efficiency (%) | Decision Tree | 2.42 (1.55; 3.32) | -17.02 (-19.82; -14.47) | 21.86 (20.24; 23.84) |
| Sleep Efficiency (%) | XGBoost | 3.2 (2.45; 4.05) | -14.77 (-17.39; -12.52) | 21.17 (19.6; 23.27) |
| Latency Until Persistent Sleep (min) | Logistic Regression | 4.29 (0.58; 8.46) | -83.37 (-117.62; -62.4) | 91.95 (68.85; 127.44) |
| Latency Until Persistent Sleep (min) | Neural Network | -2.63 (-6.9; 1.79) | -91.37 (-120.36; -71.83) | 86.12 (64.1; 114.28) |
| Latency Until Persistent Sleep (min) | Decision Tree | 0.21 (-4.83; 5.5) | -88.71 (-117.7; -65.86) | 89.13 (62.66; 122.78) |
| Latency Until Persistent Sleep (min) | XGBoost | -2.91 (-6.33; -0.1) | -68.68 (-101.27; -53.33) | 62.86 (48.66; 87.14) |
| Wake After Sleep Onset (min) | Logistic Regression | -13.61 (-16.43; -11.16) | -78.31 (-92.92; -69.58) | 51.08 (44.03; 60.52) |
| Wake After Sleep Onset (min) | Neural Network | -10.39 (-13.51; -7.51) | -79.51 (-92.72; -71.68) | 58.72 (51.88; 69.27) |
| Wake After Sleep Onset (min) | Decision Tree | -6.66 (-9.98; -3.68) | -81.36 (-94.24; -71.45) | 68.04 (59.35; 78.93) |
| Wake After Sleep Onset (min) | XGBoost | -8.57 (-11.71; -6.06) | -76.69 (-88.54; -68.07) | 59.55 (52.69; 69.57) |

Bootstrapped mixed-effects limits of agreement with multiple observations per subject (Parker et al. 2016)
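As a point of comparison, the classical Bland-Altman quantities are easy to compute by hand. The sketch below is the simple version (mean difference ± 1.96 SD) and is not the bootstrapped mixed-effects method cited above: it treats every night as independent and so ignores repeated observations per subject. The function name and argument order are assumptions.

```python
import statistics
from typing import Dict, Sequence


def bland_altman(predicted: Sequence[float],
                 reference: Sequence[float]) -> Dict[str, float]:
    """Bias and 95% limits of agreement for paired night-level values.

    Simplification: nights are treated as independent observations.
    """
    diffs = [p - r for p, r in zip(predicted, reference)]
    bias = statistics.fmean(diffs)
    sd = statistics.stdev(diffs)  # sample SD of the differences
    return {
        "bias": bias,
        "lower_loa": bias - 1.96 * sd,
        "upper_loa": bias + 1.96 * sd,
    }
```

With differences taken as model minus reference, a negative bias means the model underestimates the metric, for example a shorter predicted total sleep time.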

In-bed classification flow

Sleep classification flow

Discussion

  • heteroscedasticity
  • Cheung 2018, Table 4: actigraphy provides a sufficiently narrow range of possible mean differences (95% CI) relative to clinically significant thresholds
  • It could be interesting to build models on thigh and hip data combined
  • Multiclass vs. multilabel classification
  • In-bed awake/sleep is highly imbalanced -> maybe train a new classifier that accounts for imbalanced data (SMOTE)
  • Model the combined predictions instead?
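Two of the points above can be made concrete with a small sketch: turning the two binary outputs (in-bed, sleep) into one combined class, and computing inverse-frequency class weights as a simpler alternative to SMOTE for the rare in-bed awake class. The class names, the handling of the contradictory "asleep but not in bed" case, and the weighting formula n / (k × count) are assumptions for illustration.

```python
from collections import Counter
from typing import Dict, List


def combine_labels(in_bed: int, asleep: int) -> str:
    """Map two binary labels onto one combined class.

    Assumption: "asleep but not in bed" is treated as out of bed.
    """
    if in_bed and asleep:
        return "in_bed_asleep"
    if in_bed:
        return "in_bed_awake"
    return "out_of_bed"


def balanced_class_weights(labels: List[str]) -> Dict[str, float]:
    """Inverse-frequency weights: n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n, k = len(labels), len(counts)
    return {cls: n / (k * c) for cls, c in counts.items()}
```

For example, with eight out-of-bed epochs, ten in-bed asleep epochs and two in-bed awake epochs, the in-bed awake class gets the largest weight, so misclassifying it costs more during training.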

References

Forger, D. B., M. E. Jewett, and R. E. Kronauer. 1999. “A Simpler Model of the Human Circadian Pacemaker.” Journal of Biological Rhythms 14 (6): 532–37. https://doi.org/10.1177/074873099129000867.
Hirshkowitz, Max, Kaitlyn Whiton, Steven M Albert, Cathy Alessi, Oliviero Bruni, Lydia DonCarlos, Nancy Hazen, et al. 2015. “National Sleep Foundation’s Sleep Time Duration Recommendations: Methodology and Results Summary.” Sleep Health, 4.
Skotte, Jørgen, Mette Korshøj, Jesper Kristiansen, Christiana Hanisch, and Andreas Holtermann. 2014. “Detection of Physical Activity Types Using Triaxial Accelerometers.” Journal of Physical Activity and Health 11 (1): 76–84. https://doi.org/10.1123/jpah.2011-0347.
Walch, Olivia, Yitong Huang, Daniel Forger, and Cathy Goldstein. 2019. “Sleep Stage Prediction with Raw Acceleration and Photoplethysmography Heart Rate Data Derived from a Consumer Wearable Device.” Sleep 42 (12): zsz180. https://doi.org/10.1093/sleep/zsz180.