Performance Tests and Expected Accuracy
2-state (Sleep/Wake) algorithm for Mice
In 2018, our sleep-wake detection algorithm was tested internally using 24 hours of simultaneously recorded electroencephalography (EEG), electromyography (EMG), and piezo system data from 42 mice. All animal experiments were conducted at the University of Kentucky with prior approval from the university’s Institutional Animal Care and Use Committee (IACUC). Two human sleep scorers used the EEG and EMG data to label the 24 hours of data as Sleep or Wake over 4-second intervals. The assessment was performed only on segments where both human scorers agreed (~95% of the data). The disagreement between the PiezoSleep System and human scoring was 7% for all sleep-labeled data and 7% for all wake-labeled data.
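The consensus-only assessment described above can be sketched as follows. This is a minimal illustration, not the evaluation code used in the study: the function name `disagreement_by_class`, the `"S"`/`"W"` label encoding, and the toy data are assumptions made for this example.

```python
def disagreement_by_class(scorer_a, scorer_b, piezo):
    """Per-class disagreement between piezo scores and human consensus.

    Intervals where the two human scorers differ are skipped, so the
    rates are computed only over consensus intervals.
    Labels are assumed to be "S" (sleep) or "W" (wake).
    """
    counts = {"S": [0, 0], "W": [0, 0]}  # label -> [consensus total, piezo mismatches]
    for a, b, p in zip(scorer_a, scorer_b, piezo):
        if a != b:
            continue  # no human consensus: excluded from the assessment
        counts[a][0] += 1
        if p != a:
            counts[a][1] += 1
    return {
        label: (wrong / total if total else float("nan"))
        for label, (total, wrong) in counts.items()
    }


# Toy data (hypothetical): 8 intervals, scorers disagree on intervals 3 and 7.
scorer_a = list("SSSSWWWW")
scorer_b = list("SSSWWWWS")
piezo = list("SSWSWSWW")
rates = disagreement_by_class(scorer_a, scorer_b, piezo)
print(rates)
```

Reporting the rate separately for sleep-labeled and wake-labeled intervals, as the text does, keeps a large class from masking errors in a smaller one.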
2-state (Sleep/Wake) algorithm for Rats
There are 2 published third-party studies that examine the agreement between the current PiezoSleep system (2.X8r and higher) and EEG/EMG scoring for rats. One study, published in 2022, used 9 rats (I. Topchiy et al. 2022), and another, published in 2021, used 14 rats for 7 days (T. Vanneau et al. 2021). This section summarizes these results and presents results from internal testing of 8 rats by Signal Solutions.
The study by Topchiy et al. used a corded system on 4 rats for 3 days, and a telemetry system on 5 rats for 16 days. Automatic sleep and wake scoring of the EEG/EMG was performed using NeuroScore (3.0.7077 DSI) on 4-second intervals, and, for the 4 rats on the corded EEG/EMG system, 10-second intervals were scored by human assessment of the signals. In comparisons of results from the 3 scoring methods, there was no significant difference in total sleep time or hourly sleep time. There was, however, a trend for NeuroScore to score more sleep than PiezoSleep and the human-scored EEG/EMG results. Compared over 4-second intervals, the sleep and wake scores agreed 85.6% of the time for the corded system and 80.8% for the telemetry system.
The study by Vanneau et al. tested the PiezoSleep system with 14 rats for 7 days using 10-second scoring intervals. Two light-dark cycles were used, where 7 rats were tested with LD12:12 and the other 7 rats with LD16:8. EEG/EMG sleep-wake was scored by human assessment of the signals over 10-second intervals. For both light cycles and both methods, there was no significant difference in total sleep time or hourly sleep percentages. The accuracy for sleep-wake scoring on 10-second intervals was 81.9% for LD12:12 and 84.9% for LD16:8.
In 2019, the PiezoSleep system was tested at Signal Solutions on 8 rats using 24-hour recordings of electroencephalography (EEG) and electromyography (EMG), made simultaneously with the PiezoSleep signals. Data were collected, and EEG/EMG signals were scored automatically on 4-second intervals with NeuroScore, at the Center for Sleep and Health Research, located at the University of Illinois in Chicago. In this evaluation, adjacent scores from PiezoSleep, which are given every 2 seconds, were combined so they could be compared to the 4-second EEG/EMG scored intervals. The 4-second intervals whose two PiezoSleep scores were split between sleep and wake were not used in the evaluation, since the NeuroScore results did not match this resolution. In addition, potential misalignments between the scored time series were compensated for by computing a correlation between the NeuroScore and 4-second interval PiezoSleep results. The misalignments could result from slight offsets between the clocks used for the EEG/EMG and the piezo system, and from differences in filtering/buffering delay between the outputs of the 2 systems. The shift corresponding to the maximum correlation was applied to align the EEG/EMG scores with the PiezoSleep scores. The adjustment was typically no more than 3 sample points. This resulted in slightly higher accuracy than in the studies mentioned above. The accuracies for the 8 rats ranged from 85% to 91% with an average of 89%. The percent error was 8.3% during sleep and 14.7% during wake. The total sleep time was 59.4% for the EEG/EMG scoring and 60.4% for the PiezoSleep System scoring.
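The two preprocessing steps described above (merging 2-second scores into 4-second scores, then shifting one series to the lag of maximum correlation) can be sketched as follows. This is an illustrative sketch only: the function names, the 1 = sleep / 0 = wake encoding, the treatment of a split pair as a sleep/wake mix marked `None`, the ±3-sample search range, and the toy data are all assumptions, not the code used in the evaluation.

```python
def combine_pairs(scores_2s):
    """Merge adjacent 2-second scores into 4-second scores.

    A pair that mixes sleep and wake is marked None (a "split" interval)
    so that it can be left out of the comparison.
    """
    merged = []
    for i in range(0, len(scores_2s) - 1, 2):
        first, second = scores_2s[i], scores_2s[i + 1]
        merged.append(first if first == second else None)
    return merged


def overlap(reference, piezo, lag):
    """Pairs (reference[i], piezo[i + lag]), skipping split piezo intervals."""
    pairs = []
    for i, ref in enumerate(reference):
        j = i + lag
        if 0 <= j < len(piezo) and piezo[j] is not None:
            pairs.append((ref, piezo[j]))
    return pairs


def pearson(x, y):
    """Pearson correlation; returns -inf when it is undefined."""
    n = len(x)
    if n == 0:
        return float("-inf")
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return float("-inf")
    return sxy / (sxx * syy) ** 0.5


def best_lag(reference, piezo, max_lag=3):
    """Lag (in 4-second samples) that maximizes the correlation.

    A positive lag means the piezo series trails the reference.
    """
    def corr_at(lag):
        pairs = overlap(reference, piezo, lag)
        return pearson([r for r, _ in pairs], [p for _, p in pairs])
    return max(range(-max_lag, max_lag + 1), key=corr_at)


def accuracy(reference, piezo, lag):
    """Fraction of compared intervals with matching scores at a given lag."""
    pairs = overlap(reference, piezo, lag)
    return sum(r == p for r, p in pairs) / len(pairs)


# Toy data (hypothetical): the piezo series trails the reference by one
# 4-second sample, and its sixth pair of 2-second scores is split.
reference = [0, 0, 1, 1, 1, 0, 0, 1, 1, 0, 1, 1]
piezo_2s = [0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 0,
            0, 0, 0, 0, 1, 1, 1, 1, 0, 0, 1, 1]
piezo_4s = combine_pairs(piezo_2s)
lag = best_lag(reference, piezo_4s)
print(lag, accuracy(reference, piezo_4s, 0), accuracy(reference, piezo_4s, lag))
```

In this toy case the unaligned comparison understates agreement, which illustrates why a small alignment shift can raise the reported accuracy.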
3-state (Wake/NREM/REM) algorithm for Mice
The data set used to validate the 2-state performance was also used to develop an ensemble decision tree algorithm for classifying wake, NREM, and REM vigilance states. After a decision tree classifier was developed based on piezo signal features, the system was sent to an independent laboratory for validation (Paul Franken, Centre Intégratif de Génomique, Bâtiment Le Génopode, Université de Lausanne, Lausanne, Switzerland). At this laboratory, simultaneous EEG/EMG and piezo system recordings were made for 4 mice over a 24-hour period. The human-scored EEG/EMG at 4-second intervals was up-sampled to match the decision tree algorithm classification at 2-second intervals. The resulting performance based on matching 2-second intervals was consistent with the training and testing in the development stage. Since wake was detected by the 2-state algorithm, the performance for wake states is the same as it was in the 2-state algorithm. For the sleep detections, the decision tree algorithm was applied to classify REM and NREM detections. For all human-scored NREM intervals (based on EEG/EMG scoring), there was a 12% disagreement with the automatic piezo-based algorithm scoring, and for REM, there was a 52% disagreement. The low agreement for REM suggests that the patterns picked up from pressure on the cage-floor sensor (from breathing and weight shifting) do not correlate as strongly with the vigilance states as the sleep and wake behaviors do. In addition, since REM states typically occur only about 6% of the time, even a few misclassifications involving NREM and wake states increase the REM disagreement rate significantly.
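The up-sampling step and the sensitivity of the REM disagreement rate to a small number of errors can be illustrated with a short sketch. The function names, the `"W"`/`"N"`/`"R"` label encoding, and the toy data are assumptions made for this example; it is not the validation code.

```python
def upsample(labels_4s):
    """Repeat each 4-second label twice to get 2-second labels."""
    return [label for label in labels_4s for _ in range(2)]


def per_state_disagreement(human_2s, piezo_2s):
    """Disagreement rate for each human-scored state.

    For every state, the denominator is the number of intervals the
    human scorer assigned to that state.
    """
    totals, wrong = {}, {}
    for h, p in zip(human_2s, piezo_2s):
        totals[h] = totals.get(h, 0) + 1
        if h != p:
            wrong[h] = wrong.get(h, 0) + 1
    return {state: wrong.get(state, 0) / totals[state] for state in totals}


# Toy data (hypothetical): the piezo REM bout starts one 2-second
# interval early and ends one interval early.
human_4s = ["W", "N", "N", "N", "N", "R", "N", "W"]
human_2s = upsample(human_4s)
piezo_2s = list("WWNNNNNNNRRNNNWW")
rates = per_state_disagreement(human_2s, piezo_2s)
print(rates)
```

Here the same one-interval offset produces one NREM error and one REM error, but the REM rate is five times larger because REM has far fewer intervals in the denominator.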
The REM patterns typically detected by the piezo-based algorithm are respiration patterns that become more irregular during detected sleep. Compared with NREM, cage-floor pressure changes from thorax motion (due to respiration) during REM exhibit more rate changes over short time intervals and more low-level amplitude variation. This pattern does not have a strong one-to-one correlation to the human-scored REM intervals from the EEG/EMG signals; however, piezo-based detections typically increase around and during REM events. The figure below shows an example of the pressure waveform overlaid with the human and piezo system scoring states. Within the blocks of human-scored REM states, piezo system detections increase; however, they are not contiguous over the block, because the patterns of irregularity are not maintained over every short interval in the block. In addition, false positives typically occur around transitions from wake to sleep and sleep to wake, as these are usually accompanied by more irregular breathing. This accounts for most of the disagreements between the human and piezo-based systems. To see the impact of disagreements at transition points, the disagreement rate computation was modified to count a REM detection as consistent if a human-scored REM occurrence is within 1 detection interval (2 seconds) of the piezo system detection. With this rule, the disagreement drops to 38% overall.

Piezo system pressure signal overlaid with human EEG/EMG scoring and piezo system scoring.
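The relaxed matching rule described above, where a human-scored REM interval counts as matched when a piezo REM detection falls within 1 detection interval of it, can be sketched as follows. The function name, the label encoding, the choice to apply the tolerance to every human-scored REM interval, and the toy data are assumptions for illustration, not the computation used to obtain the reported figures.

```python
def rem_disagreement(human, piezo, tolerance=0):
    """Fraction of human-scored REM intervals with no nearby piezo REM.

    `tolerance` is the number of 2-second detection intervals on either
    side within which a piezo REM detection still counts as a match.
    """
    rem_indices = [i for i, label in enumerate(human) if label == "R"]
    if not rem_indices:
        return float("nan")
    missed = 0
    for i in rem_indices:
        start = max(0, i - tolerance)
        stop = min(len(piezo), i + tolerance + 1)
        if "R" not in piezo[start:stop]:
            missed += 1
    return missed / len(rem_indices)


# Toy data (hypothetical): piezo detections cluster at the edges of a
# human-scored REM block but are not contiguous across it.
human = list("NNRRRRRNN")
piezo = list("NRRNNNRNN")
strict = rem_disagreement(human, piezo, tolerance=0)
relaxed = rem_disagreement(human, piezo, tolerance=1)
print(strict, relaxed)
```

The relaxed rule forgives detections that are offset by one interval, while intervals in the middle of a block with no nearby detection still count as disagreements.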
To see how the algorithm can track trends in REM behavior over longer periods, the percent REM was computed over overlapping 1-hour intervals, advanced in 12-minute increments, to create the percent REM plots over the 24 hours shown in the figures below. This was done for both the human-scored and piezo-system-scored results, which were plotted together for comparison. Figure B2a was the best case of the validation set. There is a slight overprediction of the percent REM for most of the intervals, but the percentages were within a few percent over most of the scoring period. Figure B2b was the worst case of the validation set. There is almost a 10% difference for some of the early REM increases (peaks). In both cases, however, the increases and decreases in percent REM over time show good agreement.

a) Comparison of percent REM sleep over 1-hour intervals for human-scored EEG/EMG and automatic piezo system scoring: Best match in validation set.

b) Comparison of percent REM sleep over 1-hour intervals for human-scored EEG/EMG and automatic piezo system scoring: Worst match in validation set.
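The percent REM traces compared in these figures come from a sliding-window computation: a 1-hour window advanced in 12-minute steps. A minimal sketch is given below, assuming 2-second labels with `"R"` for REM; the function name, parameter names, and synthetic data are hypothetical and chosen only for this illustration.

```python
def percent_rem(labels, interval_s=2, window_s=3600, step_s=720):
    """Percent REM in overlapping windows.

    Defaults correspond to 2-second labels, 1-hour windows, and a
    12-minute step between window starts.
    """
    window = window_s // interval_s  # intervals per window (1800)
    step = step_s // interval_s      # intervals per step (360)
    # Prefix sums make each window count an O(1) lookup.
    prefix = [0]
    for label in labels:
        prefix.append(prefix[-1] + (1 if label == "R" else 0))
    return [
        100.0 * (prefix[start + window] - prefix[start]) / window
        for start in range(0, len(labels) - window + 1, step)
    ]


# Synthetic 24-hour recording (hypothetical): all NREM except a
# 6-minute REM bout at the start of the second hour.
labels = ["N"] * 43200
labels[1800:1980] = ["R"] * 180
trace = percent_rem(labels)
print(len(trace), trace[:8])
```

Running the same function on the human-scored and piezo-scored label sequences would give two traces that can be plotted against each other, as in the figures above.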
In conclusion, the REM detection feature has value in detecting changes in patterns of REM-like behavior (irregularity in breathing during sleep) over time; however, it is limited in comparing quantitative amounts of REM sleep from animal to animal. These results were obtained using Signal Solutions’ custom cages with film sensors. Since subtle changes in the breathing signal are an important factor in the automatic 3-state algorithm, environments and/or cage systems with noise levels on the order of the mouse breathing signal can significantly impact the algorithm performance. If you are unsure of the noise levels on your system, we recommend contacting Signal Solutions to analyze a recording.
References
(I. Topchiy et al. 2022) Topchiy I, Fink AM, Maki KA, Calik MW. Validation of PiezoSleep Scoring Against EEG/EMG Sleep Scoring in Rats. Nat Sci Sleep. 2022 Oct 20;14:1877-1886. doi:10.2147/NSS.S381367. PMID: 36300015; PMCID: PMC9590343.
(T. Vanneau et al. 2021) Vanneau T, Quiquempoix M, Trignol A, et al. Determination of the sleep-wake pattern and feasibility of NREM/REM discrimination using the non-invasive piezoelectric system in rats. J Sleep Res. 2021;30(6):e13373. doi:10.1111/jsr.13373.
