
How we measure your dreams

The science, craft, art, and magic of sleep phases detection

Executive summary

For years we have been skeptical about REM estimates from actigraphy (the movement-based method of sleep tracking that can be used at home). At first glance, this task seems like an attempt to measure your IQ through your running speed. But after analyzing a large PSG (polysomnography) data set, we found that there are indeed quite interesting regularities in sleep phases which give us a good chance (around 40%, with a minimum of false REM) to find a REM stage onset. This allows us to introduce a brand new REM estimation algorithm into Sleep as Android. Read on to find out how we did it.

Attentive users of Sleep as Android have certainly noticed that their hypnograms with REM phases look very different in the new release.

Indeed, we have revisited our sleep phase detection algorithms and implemented some substantial improvements, based on the analysis of a large medical data set. It has always been a topic of vivid interest among our users – what the sleep phases mean, how to interpret them, and how accurate they are or can be. Let’s discuss the issue and the underlying principles in more detail and shed some light on this esoteric area.

Sleep phases

Nothing had been known about the structure of sleep until about the middle of the last century. Well, sleep looks … just like sleep to a regular observer.

Then the early experiments with brain activity measurement (using EEG – recording tiny fluctuations of electric potential on the scalp) revealed the existence of several distinct kinds of sleep alternating throughout the night. Further research showed that the phases characterized by specific EEG patterns correspond to other sleep qualities, such as the likelihood of dreams, or how easy it is to wake up.

Four major phases are usually recognized today.

  • N1 (abbreviation for non-REM-1) – a short transitional stage between sleep and wakefulness.
  • N2 – mostly light dreamless sleep occupying about 60% of the night.
  • N3 – the deepest sleep phase. The body is totally relaxed, and the EEG displays slow regular waves. This stage is believed to play a crucial role in regenerative processes. Dreaming (so-called non-REM dreaming) can occur but is not as common as in REM.
  • REM – high likelihood of vivid dreams, muscle paralysis, bursts of rapid eye movements.

The phases alternate in a typical sequence called the sleep cycle – from a short N1, via N2, to the deep N3, then a shorter transitory N2, followed by REM, often ending with a brief awakening. The cycle repeats several times throughout the night. The picture below shows a typical hypnogram, capturing sleep phases during one ordinary night. However, the pattern is highly variable. The lengths and exact ordering of the phases differ in each individual case.

Sleep Hypnogram

A lot has been found about various external and subjective manifestations of the individual sleep phases. Little is known, however, about the actual mechanics how they contribute to the recovery and cleansing processes – what is happening “under the hood”. The only thing we can say for sure is that enough sleep, including the full sleep cycle, is essential for all aspects of physical and mental health.

Polysomnography

Today’s gold standard for clinical sleep monitoring is called polysomnography (PSG). The patient spends one or more nights in a laboratory, trying to sleep as normally as possible, with dozens of electrodes attached to his body. They monitor his brain activity (EEG), eye movements (EOG), muscle tension (EMG), and several more quantities.

Polysomnography-equipped patient

Typically, a human specialist watches the sensors and evaluates the records manually. Part of his task is an assessment of sleep phases (N1, N2, N3, REM) by spotting typical activity patterns on the sensors, mainly EEG.

While providing a detailed record of many variables, polysomnography is a laborious and expensive procedure. Current sleep research is often limited to a small data set due to the measurement costs. It is unrealistic to use polysomnography for large, population-wide studies, or many nights per patient. Furthermore, sleep on a polysomnograph is hardly fully natural. Even though the lab operators try to make things as comfortable as possible for the test subject, he still has dozens of electrodes attached to his body, cannot move freely in the bed, the room is unfamiliar. Nobody knows how much these factors affect the results and what the true outcomes would be in a natural situation.

There has been a lot of research on how to monitor sleep with cheaper devices that anyone can have at home and operate on his own. The question is especially relevant today when everyone owns a little box loaded with all kinds of sophisticated sensors – a smartphone.

Activity-based sleep phases

Let me emphasize one more time that the aforementioned sleep phases are defined mostly by EEG patterns, and can be accurately identified only from EEG readings. Let’s call them EEG-phases in the following text, in order to avoid confusion.

There is, obviously, no EEG in a smartphone or common wearables. However, we can monitor a sleeper with sensors that are available in these devices, and it makes perfect sense to analyze the measurements and see if they display any patterns. In Sleep as Android, we focus mostly on detailed monitoring of body movement throughout the night, using a wide range of available sensors (integrated accelerometer, sonar, infrared sensor).

And indeed, there are typically distinct phases of very low physical activity, when the body is completely relaxed, and periods of high activity, when the person is tossing and turning on the bed, rolling, twitching, and jerking. See a typical example showing the amount of activity during one night.

Thereby we can define two sleep phases – light sleep and deep sleep. These phases are shown in hypnograms in Sleep as Android and are used by the app for triggering smart alarms. Let’s call them ACT-phases, as they are based purely on the short-term average amount of physical activity. It’s an objective phenomenon, just like the EEG-phases. We can imagine them as a picture of the same underlying process (sleep) from a different angle. Either picture (EEG-phases, ACT-phases) captures a valid aspect of reality.
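In code, the idea behind ACT-phases can be sketched as simple thresholding of a short-term moving average of movement intensity. The function name, window size, and threshold below are illustrative placeholders, not the actual parameters used by Sleep as Android:

```python
# Sketch of activity-based ("ACT") phase classification: epochs whose
# smoothed movement level falls below a threshold are labelled deep sleep,
# the rest light sleep. All parameters here are made up for illustration.

def classify_act_phases(activity, window=5, threshold=0.2):
    """activity: one movement-intensity sample per epoch (e.g. per minute)."""
    n = len(activity)
    phases = []
    for i in range(n):
        lo = max(0, i - window // 2)
        hi = min(n, i + window // 2 + 1)
        avg = sum(activity[lo:hi]) / (hi - lo)  # short-term moving average
        phases.append("deep" if avg < threshold else "light")
    return phases

# One toy night: restless start, a calm stretch, restless end.
night = [0.8, 0.5, 0.1, 0.05, 0.0, 0.02, 0.3, 0.7, 0.9, 0.6]
print(classify_act_phases(night))
```

The moving average matters: a single quiet minute inside a restless period should not flip the label, and vice versa.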

How do the two kinds of phases relate?

We used a different input than polysomnographists, and defined our own sleep phases, reflecting an objective aspect of sleep, and easy to measure with common devices. One naturally needs to ask whether there is any relationship between the EEG-phases and our ACT-phases.

Fortunately, several research teams raised similar questions before (See this one, or this, or this, or this). They measured a bunch of people on a traditional polysomnograph and recorded their physical activity at the same time. (By filming them and then counting the movements manually, or by using accelerometer readings). The published analyses show that there indeed is a significant statistical relationship between EEG-phases and body movements.

Unsurprisingly, the N3 phase displays the least amount of physical activity. Conversely, N1 and REM are the phases when the body moves the most. The activity in REM might look surprising at first glance. It is known that large muscles are paralyzed during REM, as vivid dreams typically occur at this stage, and the body must be prevented from acting them out: jumping out of bed, riding a spaceship, fighting space villains. The paralysis is not perfect, though, and limbs or trunk often randomly jerk and twitch.

The phase N2, in which the sleeper spends about half of the night, is somewhere in between. Sometimes the physical activity is low, especially when N2 precedes a deep sleep phase, sometimes the activity is higher, before or after N1 or REM phases.

Generally, the deep phases detected by our app roughly correspond to N3 and partly N2 EEG-phases, and light sleep corresponds to REM, N1, and partly N2.

However, the amount of movements varies greatly, based on age, gender, individual specifics, health, mental state, etc. There is no exact correspondence between such and such movement frequency and a specific sleep phase. The only overall reliable principle is that relatively low activity intervals are mostly N3 or N2, and relatively high activity intervals are likely to be REM, N1, or N2. Any attempt to pinpoint the exact beginning of a sleep phase is subject to a high degree of error and guesswork.

Nonetheless, this is still a useful approach, providing valuable insights from home-made sleep recordings. Large amounts of data can be collected cheaply for population-wide studies. Individual sleep enthusiasts may discover their own sleep patterns and possibly devise their own personalized sleep phenomenology.

How about the REM phase, finally?

REM phases are of special interest to our users. Some sleep enthusiasts experiment with lucid dreaming – using our gentle alert triggered during a possible REM phase to become partly conscious and self-aware and get actively engaged in the dream. Others are just curious to see their REM phases on the chart.

Now, wait a minute, in the light of the previous paragraphs, how dare we draw REM phases into our hypnograms? They are based on pure guesswork, right? Well, yes and no. Again, building on contemporary research and our own data analysis, we can produce meaningful results even from activity data alone.

We already know that REM occurs during periods of higher physical activity. But such periods still make up about 65% of sleep time for an average user. Can we pinpoint it better?

The idealized sleep cycle, as depicted in textbooks, lasts about 90 minutes, and REM occurs soon after the N3 phase (mostly corresponding to our deep sleep ACT-phase). How reliable is this pattern in reality, though, and how much of it is just a textbook cartoon?

Luckily, there is a publicly available database of tens of thousands of anonymized records from sleep labs. A large portion of them was annotated by human operators and individual sleep phases were marked on them. We acquired the data and ran some quantitative analyses on them to get a better insight into typical sequences and duration of sleep phases. The data confirm the idealized story to a great extent. Indeed, the pattern [N3 – short N2 – REM] is very common in the live data.

For example, 10 minutes after the end of an N3 phase, there is a 40% chance that the sleeper is in REM. 40% – less than half, right? But the total amount of REM sleep is about 15% in this dataset, so this simple criterion alone increases the probability of correct REM detection nearly three times compared to a random guess.
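The "nearly three times" claim is just a ratio of two probabilities, sometimes called lift. A quick sanity check with the figures from the text plugged in:

```python
# Back-of-the-envelope check of the "nearly three times better than
# chance" claim, using the probabilities quoted in the article.
p_rem_after_n3 = 0.40  # P(REM | 10 min after the end of N3), from the dataset
p_rem_baseline = 0.15  # overall fraction of REM sleep in the dataset

lift = p_rem_after_n3 / p_rem_baseline
print(f"lift over random guessing: {lift:.2f}x")
```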

We ended up with a simple criterion: when there is a long enough deep sleep period, followed by a long enough light sleep period, we mark the first half of the light sleep (after a short initial offset) as REM. There is no way to tell if there actually was REM at any particular moment, based solely on aggregate activity data. But the approach is statistically valid. On average, the overlap of the actual REM with the REM marked on our charts is much better than random, and we are really getting close to the limits of what can be inferred from smartphone inputs.
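A minimal sketch of such a rule, assuming per-minute deep/light labels as input. The duration thresholds and the offset are made-up placeholders, not the app’s real values:

```python
# Sketch of the REM-marking rule described above: after a long enough deep
# sleep period followed by a long enough light sleep period, mark the first
# half of the light period (past a short initial offset) as REM.

def mark_rem(phases, min_deep=20, min_light=20, offset=5):
    """phases: per-minute labels "deep"/"light"; returns labels incl. "rem"."""
    result = list(phases)

    # Run-length encode the label sequence into (label, start, length) runs.
    runs, start = [], 0
    for i in range(1, len(phases) + 1):
        if i == len(phases) or phases[i] != phases[start]:
            runs.append((phases[start], start, i - start))
            start = i

    # For each qualifying deep->light transition, relabel part of the light run.
    for (label, _, n), (label2, s2, n2) in zip(runs, runs[1:]):
        if label == "deep" and n >= min_deep and label2 == "light" and n2 >= min_light:
            for j in range(s2 + offset, s2 + n2 // 2):
                result[j] = "rem"
    return result
```

For example, 25 minutes of deep sleep followed by 30 minutes of light sleep would get minutes 5 through 14 of the light period relabelled as REM.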

A natural question arises now – how do the new REMs, marked within the light sleep intervals, relate to the former REMs, marked at the end of deep sleep?

Our former approach to showing REM phase estimates in the hypnogram was based simply on known average REM durations per sleep stage. To put it simply: for every sleep cycle (deep sleep + light sleep pair) that satisfied a duration threshold, a percentage of the cycle was marked as REM.

The thresholds were set quite high, which led to less REM being depicted than in the new approach, which is more in line with the relative REM amount seen in the large PSG data set.

As not much was known about the actual position of the REM stage, we depicted it at the transition from deep sleep into light sleep.

Although the actual length of the estimated REM phase is still blurry, and it seems unrelated to other parameters of the sleep cycle, in the new approach we know more about the most probable position of REM. This is why we now depict the REM phase a few minutes after light sleep starts.

Formerly, REM was depicted between deep and light sleep. This has now changed: we show REM with higher bars, between light sleep and awake, which is consistent with how PSG hypnograms are plotted.

Coming next

This is not the end. We always strive to make our app better, and we already have some ideas on how to improve REM detection, even with common smartphone sensors.

Some studies show that the nature of movements in REM and non-REM phases tends to be different. The movements in non-REM are more voluntary and purposeful, typically slower – like rolling over in bed or scratching. On the contrary, movements during REM, when muscles are mostly paralyzed, tend to be quick random jerks and twitches. We need to do some more data analysis to see if we can detect these patterns in activity records from mobile phone sensors. We hope that it will allow us to mark REM phases with better accuracy.
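If such a difference shows up in the data, one conceivable feature would be the duration of contiguous movement bursts: short isolated bursts would hint at REM-like twitches, longer runs at voluntary non-REM movements. A purely speculative sketch, not anything implemented in the app:

```python
# Speculative feature for telling twitches from voluntary movements:
# the lengths of contiguous runs where activity exceeds a threshold.

def burst_durations(activity, threshold=0.1):
    """Lengths (in samples) of contiguous runs with activity > threshold."""
    durations = []
    run = 0
    for a in activity:
        if a > threshold:
            run += 1
        elif run:
            durations.append(run)
            run = 0
    if run:
        durations.append(run)
    return durations

# A short twitch followed by a longer roll-over:
print(burst_durations([0, 0.5, 0, 0, 0.3, 0.4, 0.6, 0.2, 0]))
```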

Another direction, so far in an early prototyping stage, is adding infrared sensors to our sleep mask. Then we might be able to monitor eye movements directly and hence recognize the REM phase quite accurately.

In the meantime, sleep long and prosper, now with a better understanding of your sleep phases and the meaning of our hypnograms.


32 thoughts on “How we measure your dreams”

  1. Hi Jan
    Very interesting feedback regarding heart rate monitoring. How does this apply to your stresslocator hrm and more accurate heart rate monitors, such as the Mio Slice and chest straps? I would have thought that more accurate heart rate measurements could help confirm REM stage onset and duration? My very basic understanding is that heart rates tend to be substantially higher during REM than light sleep. What are your thoughts on this?

    Thanks
    Michael

  2. Hello Michael,

    That’s a broad and difficult question.

    One thing is the physical limitation of the sensor itself. All these mass-market devices are usually based on photoplethysmography (https://en.wikipedia.org/wiki/Photoplethysmogram). The sensor needs to stick firmly to the skin in order for the data to be precise. Whenever the sensor moves on the skin, it generates artifacts like short sudden periods of unrealistically high or low heart rates. Generally, it is much easier for a smart watch to move on your wrist, even if you move only a little, than for the finger-strap sensor that we use with the stresslocator. But if you, for example, have your palm under your body and move, then it may break the stresslocator data too. As for chest straps, I have not properly tested any of them, but I suppose, as some of them are designed for runners, they might be quite reliable. The only way to be sure is to get an app that is able to record raw data from your device and see for yourself.

    The second step is the processing of the sensor output in the device. Some devices detect these unrealistic peaks and remove them, some do not. Some devices provide detailed output of the sensor, so our app can inspect the data and clean them; other devices provide only aggregated, averaged heartbeats (e.g. once per minute), and there is very little that we can do with those.

    The third step is processing in our app. We generally attempt to detect these unrealistic peaks and remove them, and the data that you see in the chart are averaged (over 5 minutes, if I remember correctly).

    You are right, there are PSG studies confirming that heart rates in REM are higher on average. These measurements are done on a known device in a controlled environment, so they use detailed sensor input and clean it properly. And still, the result is that heart rates and sleep phases correspond only to some extent; a prediction from heart rates alone would be burdened with a large error.

    When we add to that another large (and uncharted) error coming from differences between hundreds of different wearables, their sensors, and their internal workings, we decided to omit this input for now.

    Best Regards

    Jan
