A sample of the actigraphy data analysis tutorials.
Condor Instruments - Complete sleep analysis demonstration
Julius A. P. P. de Paula
jp@condorinst.com.br
1) Package installation and upgrade
!pip install wget # installs library for file download
!pip install xgboost --upgrade # upgrades package used in offwrist algorithm
Requirement already satisfied: wget in /home/julius/.local/lib/python3.8/site-packages (3.2)
Collecting xgboost
Downloading xgboost-1.6.2-py3-none-manylinux2014_x86_64.whl (255.9 MB)
Requirement already satisfied, skipping upgrade: scipy in /home/julius/.local/lib/python3.8/site-packages (from xgboost) (1.8.1)
Requirement already satisfied, skipping upgrade: numpy in /home/julius/.local/lib/python3.8/site-packages (from xgboost) (1.23.1)
Installing collected packages: xgboost
Attempting uninstall: xgboost
Found existing installation: xgboost 1.6.1
Uninstalling xgboost-1.6.1:
Successfully uninstalled xgboost-1.6.1
Successfully installed xgboost-1.6.2
2) Importing packages
# these packages are required for obtaining the path to the current file
import sys
import inspect
import os
root = os.path.dirname(os.path.abspath(inspect.getfile(inspect.currentframe()))) # path to "this" directory
import numpy as np # mathematics library
import pandas as pd # data science library
# dependency download
import wget
URL = "https://github.com/Condor-Instruments/actigraphy-tutorials-sample/blob/master/demo-dependencies.zip?raw=true"
response = wget.download(URL, "demo-dependencies.zip")
# file unzip
import zipfile
with zipfile.ZipFile("demo-dependencies.zip", 'r') as zip_ref:
    zip_ref.extractall(root)
from crespo_wrapper import crespo_wrapper # algorithm for main sleep period detection
from nap_wrapper import nap_wrapper # algorithm for secondary sleep period detection
from logread import LogRead as lr # class for log file reading
from offwrist_wrapper import offwrist_wrapper # algorithm for offwrist detection
from colekripke import ColeKripke as ck # algorithm for WASO detection
from nights_df import nights_df # helper algorithm for daily processing
from simple_actogram import actigraphy_single_plot_actogram
100% [......................................................] 4001755 / 4001755
3) Reading files
For this demonstration we've made 3 files available: input0.txt, input1.txt and input2.txt
file = "input1.txt" # file subject to analysis
df = lr(file).data # the LogRead class reads the file into a pandas DataFrame
# only the first third of the recording is kept for this demonstration
npindex = df.index.to_numpy()
df = df[pd.Timestamp(npindex[0]):pd.Timestamp(npindex[len(npindex)//3])]
4) Preparing the input DataFrame
# columns need to be renamed before being fed to the algorithms
df = df.rename(columns={"DATE/TIME":"datetime",
                        "PIM":"activity",
                        "TEMPERATURE":"int_temp",
                        "EXT TEMPERATURE":"ext_temp"})
# state-related columns are kept separate for better visualization; all of them are initially filled with zeros
df["state"] = np.zeros(len(df))
df["offwrist"] = np.zeros(len(df))
df["sleep"] = np.zeros(len(df))
# temperatures are clipped to the [0, 42] range
int_temp = df["int_temp"].to_numpy()
ext_temp = df["ext_temp"].to_numpy()
int_temp = np.where(int_temp > 0, int_temp, 0)
ext_temp = np.where(ext_temp > 0, ext_temp, 0)
int_temp = np.where(int_temp < 42 , int_temp, 42)
ext_temp = np.where(ext_temp < 42 , ext_temp, 42)
df["int_temp"] = int_temp
df["ext_temp"] = ext_temp
# pre-scaled temperatures will be used for plotting
scale = np.max([np.max(ext_temp),np.max(int_temp)])
df["int_temp_"] = int_temp/scale
df["ext_temp_"] = ext_temp/scale
print(df)
datetime MS EVENT int_temp ext_temp \
datetime
2021-05-27 11:10:15 2021-05-27 11:10:15 0 0 24.24 23.94
2021-05-27 11:11:15 2021-05-27 11:11:15 0 0 24.36 24.06
2021-05-27 11:12:15 2021-05-27 11:12:15 0 0 24.42 23.94
2021-05-27 11:13:15 2021-05-27 11:13:15 0 0 24.41 23.88
2021-05-27 11:14:15 2021-05-27 11:14:15 0 0 24.36 23.81
... ... .. ... ... ...
2021-06-14 03:17:44 2021-06-14 03:17:44 0 0 32.36 31.38
2021-06-14 03:18:44 2021-06-14 03:18:44 0 0 32.39 31.44
2021-06-14 03:19:44 2021-06-14 03:19:44 0 0 32.40 31.44
2021-06-14 03:20:44 2021-06-14 03:20:44 0 0 32.43 31.44
2021-06-14 03:21:44 2021-06-14 03:21:44 0 0 32.46 31.50
ORIENTATION activity PIMn TAT TATn ... \
datetime ...
2021-05-27 11:10:15 65 4856 0.000003 278 1.713810e-07 ...
2021-05-27 11:11:15 68 4483 74.716700 285 4.750000e+00 ...
2021-05-27 11:12:15 68 425 7.083330 40 6.666670e-01 ...
2021-05-27 11:13:15 68 873 14.550000 56 9.333330e-01 ...
2021-05-27 11:14:15 68 413 6.883330 31 5.166670e-01 ...
... ... ... ... ... ... ...
2021-06-14 03:17:44 80 0 0.000000 0 0.000000e+00 ...
2021-06-14 03:18:44 80 0 0.000000 0 0.000000e+00 ...
2021-06-14 03:19:44 80 0 0.000000 0 0.000000e+00 ...
2021-06-14 03:20:44 80 0 0.000000 0 0.000000e+00 ...
2021-06-14 03:21:44 80 0 0.000000 0 0.000000e+00 ...
BLUE LIGHT IR LIGHT UVA LIGHT UVB LIGHT STATE state \
datetime
2021-05-27 11:10:15 4.55 5.63 0.0 0.00 0 0.0
2021-05-27 11:11:15 2.06 6.32 0.0 0.00 0 0.0
2021-05-27 11:12:15 6.20 13.79 0.0 1.43 0 0.0
2021-05-27 11:13:15 6.50 14.23 0.0 1.90 0 0.0
2021-05-27 11:14:15 5.11 11.09 0.0 1.43 0 0.0
... ... ... ... ... ... ...
2021-06-14 03:17:44 0.00 0.00 0.0 0.00 1 0.0
2021-06-14 03:18:44 0.00 0.00 0.0 0.00 1 0.0
2021-06-14 03:19:44 0.00 0.00 0.0 0.00 1 0.0
2021-06-14 03:20:44 0.00 0.00 0.0 0.00 1 0.0
2021-06-14 03:21:44 0.00 0.00 0.0 0.00 1 0.0
offwrist sleep int_temp_ ext_temp_
datetime
2021-05-27 11:10:15 0.0 0.0 0.676528 0.668155
2021-05-27 11:11:15 0.0 0.0 0.679877 0.671504
2021-05-27 11:12:15 0.0 0.0 0.681552 0.668155
2021-05-27 11:13:15 0.0 0.0 0.681273 0.666481
2021-05-27 11:14:15 0.0 0.0 0.679877 0.664527
... ... ... ... ...
2021-06-14 03:17:44 0.0 0.0 0.903154 0.875802
2021-06-14 03:18:44 0.0 0.0 0.903991 0.877477
2021-06-14 03:19:44 0.0 0.0 0.904270 0.877477
2021-06-14 03:20:44 0.0 0.0 0.905107 0.877477
2021-06-14 03:21:44 0.0 0.0 0.905945 0.879152
[25399 rows x 26 columns]
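The temperature preparation above (clipping to [0, 42], then dividing by the joint maximum) can also be written with `np.clip`. The following is a minimal sketch under that reading of the cell; the helper name `clip_and_scale` is hypothetical and not part of the tutorial's dependencies. One difference worth keeping in mind: `np.clip` propagates NaN values, whereas the `np.where` chain used above maps them to 0.

```python
import numpy as np

def clip_and_scale(int_temp, ext_temp, low=0.0, high=42.0):
    """Clip both temperature series to [low, high] and divide by their joint maximum.

    Hypothetical helper, written only to illustrate the preparation step.
    """
    int_c = np.clip(np.asarray(int_temp, dtype=float), low, high)
    ext_c = np.clip(np.asarray(ext_temp, dtype=float), low, high)
    scale = max(int_c.max(), ext_c.max())
    if scale == 0:  # every reading was clipped to 0: nothing to scale
        return int_c, ext_c, int_c, ext_c
    return int_c, ext_c, int_c / scale, ext_c / scale

# Illustrative readings, not taken from the input files
int_c, ext_c, int_s, ext_s = clip_and_scale([-5.0, 24.24, 50.0], [23.94, 31.5, 10.0])
print(int_c)  # [ 0.   24.24 42.  ]
print(ext_s)  # scaled to the 0..1 range for plotting
```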
5) Offwrist
The offwrist detection algorithm excludes from the analysis the periods when the subject is not wearing the actigraph. It is based on the Gradient Boosting algorithm provided by the XGBoost library. From the PIM activity measure and the internal and external temperatures, new auxiliary variables are computed, and all of them are fed to the algorithm to generate a classification.
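To give an idea of what "auxiliary variables" can look like, here is a minimal sketch that builds a feature matrix from activity and the two temperatures. The features chosen here (a trailing mean of activity and the internal/external temperature difference) are assumptions made for illustration only; the variables actually computed inside `offwrist_wrapper` are not described in this tutorial. The names `rolling_mean` and `build_features` are hypothetical.

```python
import numpy as np

def rolling_mean(x, window):
    """Trailing moving average; early samples average over the data available so far."""
    x = np.asarray(x, dtype=float)
    csum = np.cumsum(np.insert(x, 0, 0.0))
    idx = np.arange(1, len(x) + 1)
    start = np.maximum(idx - window, 0)
    return (csum[idx] - csum[start]) / (idx - start)

def build_features(activity, int_temp, ext_temp, window=3):
    """Stack raw and derived variables column-wise (illustrative features only)."""
    activity = np.asarray(activity, dtype=float)
    int_temp = np.asarray(int_temp, dtype=float)
    ext_temp = np.asarray(ext_temp, dtype=float)
    return np.column_stack([
        activity,
        int_temp,
        ext_temp,
        rolling_mean(activity, window),  # assumed feature: smoothed activity
        int_temp - ext_temp,             # assumed feature: temperature gap
    ])

X = build_features([1, 2, 3, 4], [30, 31, 32, 33], [25, 25, 25, 25], window=2)
print(X.shape)  # (4, 5)
```

A matrix of this shape, one row per epoch, is the kind of input a gradient boosting classifier expects.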
out = offwrist_wrapper(df) # offwrist detection
# column updates
df["state"] = out
df["offwrist"] = 0.25*out
# we'll use actograms for visualizing the data
fig = actigraphy_single_plot_actogram(df, ["activity", "int_temp_","ext_temp_","sleep","offwrist"], [False, True, True, True, True], 12, dt = "datetime")
fig.show()
/home/julius/Dropbox (Condor Instruments)/Julius/condor-demo/functions.py:14: RuntimeWarning: invalid value encountered in long_scalars
6) Main sleep periods
The main sleep period detection is based on an implementation of the Crespo algorithm, first described in the scientific literature by Crespo et al. in 2012. The algorithm delimits the periods of high and low activity in the time series through a percentile-based thresholding operation. After this initial delimitation, a refinement procedure takes place using metrics that we've developed.
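The percentile-based thresholding step can be sketched as follows. This is only an illustration of the idea, not the code inside `crespo_wrapper`: the median smoothing, the window size, the percentile value and the function name `low_activity_mask` are all assumptions of this sketch, and the refinement procedure is not reproduced.

```python
import numpy as np

def low_activity_mask(activity, percentile=33.0, window=5):
    """Flag epochs whose smoothed activity is at or below a percentile of the series."""
    activity = np.asarray(activity, dtype=float)
    # Median smoothing with edge padding, so an isolated spike does not split a rest period
    half = window // 2
    padded = np.pad(activity, half, mode="edge")
    smoothed = np.array([np.median(padded[i:i + window]) for i in range(len(activity))])
    threshold = np.percentile(smoothed, percentile)
    return smoothed <= threshold

# Illustrative activity counts: active, resting (with one spike of 300), active again
activity = [900, 850, 800, 5, 0, 300, 0, 4, 0, 870, 910, 880]
mask = low_activity_mask(activity, percentile=50.0, window=3)
print(mask.astype(int))  # [0 0 0 1 1 1 1 1 1 0 0 0]
```

In this toy series the spike in the middle of the rest period is absorbed by the smoothing, so the low-activity period comes out as one contiguous block.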
out = crespo_wrapper(df) # main sleep period detection (bed time and getup time)
df["state"] = out
df["sleep"] = np.where(out == 4,0,out)
fig = actigraphy_single_plot_actogram(df, ["activity", "int_temp_","ext_temp_","sleep","offwrist"], [False, True, True, True, True], 12, dt = "datetime")
fig.show()
7) Secondary sleep periods (Naps)
The algorithm used for nap detection is the same Crespo algorithm described above, but with different parameters. The principle is the same: we aim to detect periods of low movement. Here, however, we search only the periods previously defined as Awake and make the algorithm more sensitive to smaller variations in movement amplitude, so that it finds these relatively short sleep periods. The same applies to the refinement stage of the algorithm, where we feed in different parameters specific to this problem.
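The idea of searching only the Awake periods for short stretches of low movement can be illustrated with a simple run-length search. This is a sketch under assumed inputs, not the logic of `nap_wrapper`: the function name `short_rest_runs`, the fixed threshold and the length bounds are hypothetical.

```python
import numpy as np

def short_rest_runs(activity, awake_mask, threshold, min_len, max_len):
    """Return (start, stop) index pairs of low-activity runs inside awake periods.

    stop is exclusive; only runs with min_len <= length <= max_len are kept.
    """
    activity = np.asarray(activity, dtype=float)
    candidate = (activity <= threshold) & np.asarray(awake_mask, dtype=bool)
    # Pad with zeros so every run has both a rising and a falling edge
    edges = np.diff(np.concatenate(([0], candidate.astype(int), [0])))
    starts = np.flatnonzero(edges == 1)
    stops = np.flatnonzero(edges == -1)
    return [(int(s), int(e)) for s, e in zip(starts, stops) if min_len <= e - s <= max_len]

# Illustrative data: the last five epochs already belong to a main sleep period
activity = [50, 2, 1, 0, 60, 0, 70, 1, 1, 1, 1, 1]
awake = [True] * 7 + [False] * 5
print(short_rest_runs(activity, awake, threshold=5, min_len=2, max_len=4))  # [(1, 4)]
```

The single quiet epoch at index 5 is discarded as too short, and the quiet epochs at the end are ignored because they are not marked as awake.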
# some of the parameters that differ between our implementations relate to the length of the sleep period we wish to detect
# (here we look for short periods); others are inputs to the refinement logic, which improves the boundaries of the sleep period
out = nap_wrapper(df) # secondary sleep period detection (nap bed time and getup time)
df["state"] = out
df["sleep"] = np.where(out == 7,1,df["sleep"].to_numpy())
fig = actigraphy_single_plot_actogram(df, ["activity", "int_temp_","ext_temp_","sleep","offwrist"], [False, True, True, True, True], 12, dt = "datetime")
fig.show()
8) WASO
The Wakefulness After Sleep Onset (WASO) detection uses our implementation of an algorithm described in the scientific literature by Cole et al. in 1992. It consists of a weighted-sum rolling window operation followed by a thresholding operation. Our implementation uses a different window size and different weights, chosen based on the results of optimization studies carried out with AI and other statistical techniques.
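A weighted-sum rolling window followed by a threshold can be sketched as below. This is not the `ColeKripke` class shipped with the dependencies: the function name `weighted_window_score`, the zero padding at the edges, the convention that the last entry of `weights_before` applies to the current epoch, and the toy weights are all assumptions of this sketch. The rule that a scaled score below 1 means sleep follows the original Cole et al. formulation.

```python
import numpy as np

def weighted_window_score(counts, weights_before, weights_after, scale):
    """Score each epoch as scale * (weighted sum of neighbouring counts).

    Returns the scores and a 0/1 array where 1 marks epochs scored as wake.
    """
    counts = np.asarray(counts, dtype=float)
    # Assumption: the last entry of weights_before is the weight of the current epoch
    n_before, n_after = len(weights_before) - 1, len(weights_after)
    padded = np.pad(counts, (n_before, n_after), mode="constant")
    kernel = np.concatenate([weights_before, weights_after]).astype(float)
    scores = scale * np.array(
        [np.dot(kernel, padded[i:i + len(kernel)]) for i in range(len(counts))]
    )
    return scores, (scores >= 1.0).astype(int)

# Toy example: a single burst of movement in an otherwise still night
scores, wake = weighted_window_score([0, 0, 8, 0, 0], [1, 2], [1], scale=0.125)
print(scores)  # [0. 1. 2. 1. 0.]
print(wake)    # [0 1 1 1 0]
```

Because neighbouring epochs contribute to each score, one burst of movement marks the epochs around it as wake as well.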
onwrist = np.where(out == 4, False, True) # actigraph on the wrist mask
# input variables are read only in the on-wrist periods
stamps = df["datetime"].to_numpy()[onwrist]
zcm = df["ZCMn"].to_numpy()[onwrist]
n = len(zcm) # time-series length
state = np.zeros(n) # auxiliary array to compute new states
in_bed = out[onwrist] # this array acts as a sleep journal, we get the information of whether or not the subject is in bed
nights = nights_df(stamps,in_bed,wake_thresh=240,search_gap=False) # night segregation
num_nights = len(nights) # number of nights present in the time series
# nightly sleep statistics arrays
waso = np.nan*np.ones(num_nights)
tbt = np.nan*np.ones(num_nights)
tst = np.nan*np.ones(num_nights)
sol = np.nan*np.ones(num_nights)
soi = np.nan*np.ones(num_nights)
nw = np.nan*np.ones(num_nights)
eff = np.nan*np.ones(num_nights)
bts = []
gts = []
for i in range(num_nights):
    bt = nights.at[i,"bt"] # bed time index
    gt = nights.at[i,"gt"] # getup time index
    bts.append(stamps[bt]) # bed time
    gts.append(stamps[gt]) # getup time
    zcm_night = zcm[bt:gt] # ZCM activity of the current night
    cole = ck(zcm_night, # WASO computations are carried out nightly
              P=0.000464,
              weights_before=[34.5,133,529,375,408,400.5,1074,2048.5,2424.5],
              weights_after=[1920,149.5,257.5,125,111.5,120,69,40.5],
              )
    cole.model(np.zeros(gt-bt))
    cpred = cole.filtered_weighted # states on the current night
    if i == 3: # intermediate results of the fourth night are saved to a CSV file
        save = pd.DataFrame([],columns=["stamps","zcm","cole","state"])
        save["zcm"] = zcm_night
        save["cole"] = cole.weighted
        save["state"] = cpred
        save["stamps"] = stamps[bt:gt]
        save.to_csv("save.csv",sep=';',header=True,index=False, index_label=None)
    # SOL computation: number of wake epochs before the first sleep epoch
    latency = 0
    while cpred[latency] > 0:
        latency += 1
    # SOI computation: number of wake epochs after the last sleep epoch
    inertia = len(cpred)-1
    while cpred[inertia] > 0:
        inertia -= 1
    # Computing the number of wake periods during the night
    edges = np.diff(cpred)
    num_awake = np.sum(np.where(edges>0,1,0))
    sol[i] = latency
    soi[i] = len(cpred)-1-inertia
    waso[i] = np.sum(cpred[latency:inertia])
    nw[i] = num_awake
    tbt[i] = gt-bt
    tst[i] = tbt[i]-waso[i]-soi[i]-sol[i]
    eff[i] = tst[i]/tbt[i]
    state[bt:gt] = 1-cpred
nights["tbt"] = tbt
nights["waso"] = waso
nights["sol"] = sol
nights["soi"] = soi
nights["tst"] = tst
nights["nw"] = nw
nights["eff"] = eff
nights.insert(0,"gts",gts)
nights.insert(0,"bts",bts)
out[onwrist] = state
df["state"] = out
df["sleep"] = np.where(out == 4,0,out)
print(nights)
bts gts bt gt tbt waso sol \
0 2021-05-27 23:42:15 2021-05-28 06:54:15 752 1184 432.0 17.0 8.0
1 2021-05-28 23:30:15 2021-05-29 06:56:15 2180 2626 446.0 20.0 8.0
2 2021-05-30 09:49:15 2021-05-30 12:54:15 3969 4154 185.0 0.0 0.0
3 2021-05-30 22:18:15 2021-05-31 07:55:15 4718 5295 577.0 28.0 0.0
4 2021-05-31 15:16:15 2021-05-31 16:47:15 5703 5794 91.0 5.0 0.0
5 2021-05-31 23:24:15 2021-06-01 07:22:15 6191 6669 478.0 19.0 0.0
6 2021-06-02 00:09:15 2021-06-02 07:21:15 7676 8108 432.0 23.0 0.0
7 2021-06-02 23:36:15 2021-06-03 07:47:15 8794 9285 491.0 8.0 0.0
8 2021-06-04 00:00:54 2021-06-04 07:43:54 10223 10686 463.0 10.0 0.0
9 2021-06-04 23:36:54 2021-06-06 06:20:54 11639 12521 882.0 134.0 8.0
10 2021-06-07 00:13:54 2021-06-07 06:59:54 13594 14000 406.0 6.0 0.0
11 2021-06-07 23:53:54 2021-06-08 08:54:54 15014 15555 541.0 12.0 0.0
12 2021-06-08 23:25:54 2021-06-09 07:19:54 16426 16900 474.0 28.0 0.0
13 2021-06-10 00:27:54 2021-06-10 11:07:54 17641 18281 640.0 65.0 0.0
14 2021-06-11 00:11:44 2021-06-11 06:34:44 19047 19430 383.0 4.0 0.0
15 2021-06-12 00:06:44 2021-06-12 08:33:44 20401 20908 507.0 29.0 0.0
16 2021-06-12 23:39:44 2021-06-13 08:09:44 21814 22324 510.0 22.0 0.0
17 2021-06-13 23:53:44 2021-06-14 03:18:44 23268 23473 205.0 0.0 0.0
soi tst nw eff
0 1.0 406.0 8.0 0.939815
1 0.0 418.0 5.0 0.937220
2 0.0 185.0 0.0 1.000000
3 0.0 549.0 8.0 0.951473
4 0.0 86.0 1.0 0.945055
5 0.0 459.0 6.0 0.960251
6 0.0 409.0 6.0 0.946759
7 1.0 482.0 2.0 0.981670
8 3.0 450.0 4.0 0.971922
9 0.0 740.0 10.0 0.839002
10 0.0 400.0 2.0 0.985222
11 0.0 529.0 4.0 0.977819
12 0.0 446.0 5.0 0.940928
13 0.0 575.0 14.0 0.898438
14 1.0 378.0 2.0 0.986945
15 0.0 478.0 4.0 0.942801
16 2.0 486.0 6.0 0.952941
17 0.0 205.0 0.0 1.000000
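The nightly statistics in the table follow directly from the per-epoch wake/sleep array of each night. The sketch below restates the loop above as a standalone function operating on a 0/1 array (1 = awake); the function name `night_statistics` and the handling of a night with no sleep epoch are assumptions of this sketch, not part of the tutorial's dependencies. All durations are expressed in epochs.

```python
import numpy as np

def night_statistics(wake):
    """Compute nightly sleep statistics from a 0/1 array (1 = awake), one value per epoch."""
    wake = np.asarray(wake, dtype=int)
    tbt = len(wake)                       # total bed time
    asleep = np.flatnonzero(wake == 0)
    if asleep.size == 0:                  # assumed convention: no sleep epoch at all
        return {"tbt": tbt, "sol": tbt, "soi": 0, "waso": 0, "nw": 0, "tst": 0, "eff": 0.0}
    first, last = int(asleep[0]), int(asleep[-1])
    sol = first                           # wake epochs before the first sleep epoch
    soi = tbt - 1 - last                  # wake epochs after the last sleep epoch
    waso = int(wake[first:last].sum())    # wake epochs between sleep onset and final awakening
    nw = int(np.sum(np.diff(wake) > 0))   # sleep-to-wake transitions, as counted in the loop above
    tst = tbt - sol - soi - waso          # total sleep time
    return {"tbt": tbt, "sol": sol, "soi": soi, "waso": waso, "nw": nw,
            "tst": tst, "eff": tst / tbt}

stats = night_statistics([1, 1, 0, 0, 1, 0, 0, 0, 1])
print(stats["sol"], stats["soi"], stats["waso"], stats["tst"])  # 2 1 1 5
```

The same arithmetic can be checked against the first row of the table: with tbt = 432, waso = 17, sol = 8 and soi = 1, the total sleep time is 432 - 17 - 8 - 1 = 406 and the efficiency is 406 / 432 ≈ 0.9398.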
fig = actigraphy_single_plot_actogram(df, ["activity", "int_temp_","ext_temp_","sleep","offwrist"], [False, True, True, True, True], 12, dt = "datetime")
fig.show()