I’ve been wearing both an Oura ring (ring finger on dominant hand) and a Whoop strap (non-dominant wrist) for several months now. I imported the data into a Jupyter notebook and graphed some comparison metrics. Here’s how they look:
Note: data updated 22nd November 2022, total of 193 days
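For anyone wanting to reproduce this kind of comparison, the alignment step could look roughly like the sketch below. It is not the notebook code behind the graphs. It assumes each export has already been reduced to a date-to-value mapping, and the function name `compare_daily` is made up for illustration. Apart from the June 14th pair mentioned further down, the sample values are placeholders, not real readings.

```python
from datetime import date
from statistics import mean


def compare_daily(whoop, oura):
    """Align two {date: value} series on shared dates and summarise them."""
    shared = sorted(set(whoop) & set(oura))  # only days both devices recorded
    if not shared:
        raise ValueError("no overlapping days to compare")
    gaps = {d: whoop[d] - oura[d] for d in shared}  # Whoop minus Oura
    return {
        "days": len(shared),
        "whoop_mean": mean([whoop[d] for d in shared]),
        "oura_mean": mean([oura[d] for d in shared]),
        "mean_gap": mean(list(gaps.values())),
        "largest_gap_day": max(shared, key=lambda d: abs(gaps[d])),
    }


# Placeholder HRV values in ms (illustrative only, apart from June 14th).
whoop_hrv = {date(2022, 6, 13): 64, date(2022, 6, 14): 100, date(2022, 6, 15): 70}
oura_hrv = {date(2022, 6, 13): 62, date(2022, 6, 14): 78, date(2022, 6, 16): 66}

summary = compare_daily(whoop_hrv, oura_hrv)
print(summary)
```

Only days present in both series are compared, so a night where one device was charging or not worn does not skew either average.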
Whoop vs Oura: Heart Rate Variability
- Whoop Average HRV (ms): 66.7
- Oura Average HRV (ms): 63.4
In most cases the Whoop HRV reading is above Oura’s, though only by a few ms. The exception is June 14th, when Whoop reported my HRV at 100ms and Oura at 78ms. Whoop measures HRV during slow wave (deep) sleep, which was shorter than average on this night at 1hr 16. The only other difference from an average night is that light sleep was up at 64%, which is well above my typical range.
Update: August involved 12 days with a tooth infection, which left my HRV very low.
Whoop vs Oura: Resting Heart Rate
- Whoop Average RHR (bpm): 53.5
- Oura Average RHR (bpm): 54.1
Very good alignment here. Last year Oura had my resting heart rate in the low 40s, so I’m not sure if the uptick since my November 2021 change to the Oura 3.0 ring is a change in my body or a change in the technology.
Whoop vs Oura: Daily Kcal Burn
I should point out that both profiles are filled out with height equal to 176cm and weight at 71kg, meaning the calculation of Basal Metabolic Rate (calories burned at rest) should be identical.
- Whoop Average Kcal burn: 2252
- Oura Average Kcal burn: 2415
On the Whoop podcast they mentioned that any wearable is likely to only get this value to within +/- 10%, meaning if my true burn was around 2250 kcal then both would be plausible. I record my daily intake with Cronometer, and over the same period my average intake was 2340 kcal. My weight has remained stable during that time, so my true burn should be close to my intake. That puts Oura (75 kcal above) marginally closer than Whoop (88 kcal below), though both sit well within the 10% margin.
Update: Since September I’ve been training harder, and it seems that at low levels of activity Whoop underestimates calorie burn relative to Oura, while at moderate activity they are more aligned.
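The arithmetic behind that comparison can be checked with a few lines of Python. This is a sketch that treats average intake as a stand-in for true burn, which only holds under the assumption that stable weight means intake and expenditure balance. The helper name `within_tolerance` is made up for illustration.

```python
def within_tolerance(estimate, reference, tolerance=0.10):
    """True if estimate lies within +/- tolerance (a fraction) of reference."""
    return abs(estimate - reference) <= tolerance * reference


intake_kcal = 2340  # average daily intake logged in Cronometer
estimates = {"Whoop": 2252, "Oura": 2415}  # average daily burn per device

# Signed gap between each device's estimate and intake.
gaps = {device: kcal - intake_kcal for device, kcal in estimates.items()}

for device, kcal in estimates.items():
    ok = within_tolerance(kcal, intake_kcal)
    print(f"{device}: {kcal} kcal, {gaps[device]:+d} vs intake, within 10%: {ok}")
```

Both estimates land inside the 10% band around 2340 kcal (2106 to 2574), with Whoop 88 kcal under and Oura 75 kcal over.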
Whoop vs Oura: Recovery vs Readiness
These metrics don’t claim to capture the same thing but it’s an interesting comparison. Both are shown to you in the morning as a proxy for how much activity you can do that day.
- Whoop Average Recovery Score: 63.5
- Oura Average Readiness: 81.2
As covered in my Oura Gen 2 vs Gen 3 article, the new generation 3 ring is even more optimistic when it comes to Readiness, averaging a score 2.3 higher. After a bout of food poisoning my Whoop gave a Recovery score of 43% while Oura’s Readiness score was still 78. Whoop gives more weight to HRV in its equation, so an HRV of 49ms, well below my average, shows up as a low Recovery score but not necessarily in Oura’s Readiness, which feels more tied to quality of sleep.
Update: A combination of a fever and tooth infection the day after a heavy workout meant I was given a Whoop Recovery score of 1% for the first time. The same day also produced my lowest Oura Readiness score, 35.
Whoop vs Oura: Sleep Score
- Whoop Average Sleep Score: 84.9
- Oura Average Sleep Score: 80.2
I found this graph the most interesting. They are generally well aligned but I deliberately changed my routine recently and it’s very clearly reflected in the decoupling of the two scores. On several episodes of the Whoop podcast they’ve mentioned how important sleep consistency is to quality of sleep (consistent sleep and wake times). I now have a bedtime alarm 35 minutes before I intend to sleep, after which I brush my teeth, darken the room, spend 20 minutes reading or listening to health content and then use Oura’s sleep meditation (Note: my favourite is “Letting Go and Drifting Off”). It’s plausible that sleep consistency has a higher weight in the Whoop sleep score, and so you see this jump in one but not the other.
Whoop vs Oura: Total REM Sleep
- Whoop Average REM Sleep (mins): 108.9
- Oura Average REM Sleep (mins): 70.8
Here’s where you start doubting the ability of a wearable to accurately measure sleep stages. I watched an Oura review where the user had switched from a Garmin, noting that his REM and Deep sleep durations appeared to “switch” with the Oura. That meant if his Garmin told him he got 2 hours of REM and 1 hour of Deep sleep, Oura would report roughly 2 hours of Deep sleep and 1 hour of REM.
A typical night of sleep should involve around 20-25% REM sleep and 13-23% Deep sleep. As my total sleep is around 7h30 per night, that would equate to 90-112 minutes of REM sleep and 58-104 minutes of Deep sleep. While the Whoop REM average of 108.9 minutes sits inside that range, the graph shows that individual nights don’t actually fall within 90-112 minutes too often, jumping from over 150 minutes to under 60 minutes quite frequently across the two months.
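The expected ranges above are just percentages of total sleep, and the gap between "the average is in range" and "most nights are in range" is easy to show in code. The sketch below uses the 7h30 figure and the percentage bands from the text. The function names are made up for illustration, and the nightly values are placeholders, not real readings.

```python
from statistics import mean


def stage_range_minutes(total_sleep_min, low_pct, high_pct):
    """Convert a percentage band of total sleep into (low, high) minutes."""
    return (total_sleep_min * low_pct / 100, total_sleep_min * high_pct / 100)


def share_in_range(nightly_minutes, low, high):
    """Fraction of nights whose value falls inside [low, high]."""
    if not nightly_minutes:
        raise ValueError("need at least one night")
    hits = sum(low <= m <= high for m in nightly_minutes)
    return hits / len(nightly_minutes)


total_sleep_min = 7 * 60 + 30  # 7h30 per night

rem_range = stage_range_minutes(total_sleep_min, 20, 25)   # (90.0, 112.5)
deep_range = stage_range_minutes(total_sleep_min, 13, 23)  # (58.5, 103.5)

# Placeholder nightly REM minutes (illustrative only).
sample_nights = [150, 60, 155, 65, 110]
sample_mean = mean(sample_nights)                         # 108, inside the band
sample_share = share_in_range(sample_nights, *rem_range)  # 0.2, one night in five
```

The text rounds the bands to whole minutes (90-112 and 58-104). In the placeholder series the mean lands inside the REM band even though only one night in five does, which is the same pattern the Whoop graph shows.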
Whoop vs Oura: Total Deep Sleep
- Whoop Average Deep Sleep (mins): 96.3
- Oura Average Deep Sleep (mins): 127.8
As with the REM sleep graph, Deep sleep doesn’t appear to have good alignment between the two wearables. There’s a ~10 day spell where Oura reports over 120 minutes every night and Whoop reports less than 120 minutes. Deep sleep is physically restorative and so demand should increase with exertion, but I don’t have the background knowledge to say whether Whoop or Oura data is more plausible.
If you’re interested in more in-depth comparisons, I highly recommend The Quantified Scientist on YouTube. Here’s a clip from his channel comparing Oura Gen 3 sleep stage tracking to the gold standard.
Whoop vs Oura: Respiratory Rate
- Whoop Average Respiratory Rate (breaths per minute): 17.1
- Oura Average Respiratory Rate (breaths per minute): 15.8
Last but not least is respiratory rate. I read Wim Hof’s book, where he noted that a respiratory rate above 15 indicates stress. I also read Breath by James Nestor, which argues that a lower respiratory rate should be the goal for most people. This is something that generally stays within a small range, typically no more than +/- 1 breath per minute across a month. The useful signal is therefore when respiratory rate suddenly moves outside that range. I would say alignment is generally good, and for the purpose of identifying extremes it doesn’t really matter whether the averages differ, as each device treats its own average as my baseline.
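To illustrate why the offset between the two averages doesn’t matter for spotting extremes, here is a minimal sketch that flags nights more than 1 breath per minute away from each device’s own baseline. Using the median as the baseline is an assumption made here for robustness, not how either app defines it. The function name `flag_outliers` is made up, and the readings are placeholders, not real data.

```python
from statistics import median


def flag_outliers(readings, band=1.0):
    """Return (index, value) pairs more than `band` away from the median."""
    baseline = median(readings)
    return [(i, r) for i, r in enumerate(readings) if abs(r - baseline) > band]


# Placeholder breaths-per-minute values (illustrative only).
whoop_rr = [17.0, 17.2, 16.9, 17.1, 19.3, 17.0]
oura_rr = [15.7, 15.9, 15.6, 15.8, 18.0, 15.7]

whoop_flags = flag_outliers(whoop_rr)
oura_flags = flag_outliers(oura_rr)
print(whoop_flags, oura_flags)
```

Even though the two placeholder series sit more than a breath per minute apart, each one flags the same night (index 4), because each is only ever compared against its own baseline.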
Anything I’ve missed? Leave a comment below. Thanks for reading, and please feel free to bookmark this post as I intend to update the graphs every 4-6 weeks with the latest data.