How to Obtain and Analyse Fitbit Sleep Scores

Smartwatches and other wearable devices have gained popularity over the past couple of years and have given rise to the cultural phenomenon of the “Quantified Self”. Devices such as the Apple Watch or Fitbit have made it possible for anyone to easily self-track and thereby quantify their lives in some way. Popular self-quantifications include calories burnt, steps walked during the day or quality of sleep.

In this article, I will focus on the last of these, quality of sleep, using real-life data from approximately one year of Fitbit usage. Fitbit provides users with a Sleep Score, which is supposed to be a measure of sleep quality. I will train and test different Machine Learning models in Python in an attempt to predict the Fitbit Sleep Score as accurately as possible, while explaining how different metrics, such as minutes of REM sleep, affect the score.

The article is structured as follows:

  1. Brief introduction to Fitbit’s Sleep Score
  2. Getting the sleep data from Fitbit
  3. Data cleaning and preparation
  4. Exploratory Data Analysis
  5. Separating the data into training, validation and test set
  6. Scaling features and defining performance metrics
  7. Feature Selection using Lasso Regression
  8. Multiple Linear Regression
  9. Random Forest Regressor
  10. Extreme Gradient Boosting Regressor
  11. Cross-Validation
  12. Hyperparameter Tuning
  13. Final model evaluation
  14. Concluding comments

Because there is a lot to cover, I split the article into three parts. Part 1 covers points 1 through 4 and focuses on getting the sleep data, preprocessing and visualising it. Part 2 covers points 5 through 10, i.e. actually building the Machine Learning models based on the preprocessed data from part 1. Part 3 covers the rest and is all about improving the models from part 2 to get the most accurate predictions possible.

What exactly is the Fitbit Sleep Score?

The Fitbit Sleep Score is best described through an example, so here are two screenshots of what the App provides to its users:

Sleep statistics provided by Fitbit

In the Fitbit App, users are given a Sleep Score (78 in this case), a graphical representation of the sleep stages across the sleep window, a breakdown of these sleep stages in minutes and in percent, and an estimated oxygen variation.

This in and of itself seems fairly straightforward. Fitbit just has some algorithm: they plug in the relevant sleep statistics, such as minutes spent in REM sleep, and it spits out the Sleep Score.

To anyone with a Fitbit who has ever tried to understand patterns in their Sleep Score, it is clear that this is far from straightforward. The screenshots below show where the confusion comes from:

More sleep statistics provided by Fitbit

Comparing these sleep statistics to the first ones tells us the following:

  • Time asleep is more than an hour longer
  • Time spent in REM sleep is almost the same
  • Time spent in deep sleep is a lot longer

Based on these observations one would expect the second sleep score to be higher than the first one but it is actually the same. What is going on here? What role do the different statistics play in the calculation of the Sleep Score? Is it possible to predict Sleep Scores yourself by only looking at the sleep statistics provided?

This article answers all those questions and provides a detailed walk-through of a Machine Learning project. I hope you enjoy it!

Getting the sleep data from Fitbit

Fitbit allows users to export sleep data in CSV files through their online dashboards. This process turned out to involve a bit of manual labor because Fitbit only allows a maximum of 31 days of data to be exported at a time. A few minutes later I had all the data and quickly combined them into one CSV file.

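Combining the monthly exports takes only a few lines of pandas. The sketch below is one possible approach, not necessarily the one used for this article: the two in-memory strings stand in for hypothetical monthly export files, and the column names are assumptions.

```python
import io
import pandas as pd

# Hypothetical stand-ins for two monthly exports; with real files you would
# collect their paths instead, e.g. with glob.glob("exports/*.csv").
january = io.StringIO(
    "End Time,Minutes Asleep\n"
    "2020-01-02 7:01AM,412\n"
    "2020-01-03 6:55AM,455\n"
)
february = io.StringIO(
    "End Time,Minutes Asleep\n"
    "2020-02-02 7:10AM,398\n"
    "2020-02-03 6:40AM,430\n"
)

# Read each export and stack the rows into one DataFrame.
monthly_frames = [pd.read_csv(f) for f in (january, february)]
combined = pd.concat(monthly_frames, ignore_index=True)

# Write the result to a single CSV (here into memory rather than to disk).
output = io.StringIO()
combined.to_csv(output, index=False)
```

Passing ignore_index=True gives the combined DataFrame a fresh 0..n-1 index instead of repeating each file's own row numbers.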

There was one problem. The manually exported CSV files included all of the sleep statistics (Minutes Asleep, Minutes Awake, Minutes REM Sleep, etc.) but did not include the actual Sleep Score. What the hell?!

After some digging, I discovered that there was another export option called “Lifetime Export”, which exports all the data Fitbit has collected on you ever since you started wearing their watch. You have to request this export from Fitbit before being able to download it and once approved you can download a zip folder with all sorts of different files. Included in that zip folder is a CSV file with additional sleep statistics, including the Sleep Score.

I saved the CSV file containing the sleep statistics as sleep_stats.csv and the CSV file containing the Sleep Scores as sleep_score.csv. Let’s move on to Python.

Data cleaning and preparation

This section explains how to get from the CSV file to a DataFrame that is ready to be used in Machine Learning models. In the process, I encountered some common problems that can arise when importing data into Python and I explain how to deal with them in order to end up with a neatly preprocessed data set.

After importing all the relevant libraries (see the full notebook for the libraries) the first step is to import the sleep data from the CSV files into Python using the pd.read_csv() function:

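A minimal sketch of this import step is shown here. The in-memory strings are hypothetical stand-ins for sleep_stats.csv and sleep_score.csv, and the column names (timestamp, overall_score, composition_score) and the extra title line are assumptions about the export layout rather than something confirmed by the article.

```python
import io
import pandas as pd

# Hypothetical stand-ins for the two exports; with the real files you would
# pass the paths "sleep_stats.csv" and "sleep_score.csv" instead.
stats_file = io.StringIO(
    "Sleep,,\n"
    "Start Time,End Time,Minutes Asleep\n"
    "2020-03-01 11:12PM,2020-03-02 7:01AM,412\n"
)
score_file = io.StringIO(
    "timestamp,overall_score,composition_score\n"
    "2020-03-02T07:01:30Z,78,19\n"
)

# Default settings: the first line of the file is treated as the header.
sleep_stats_data = pd.read_csv(stats_file)

# usecols=[0, 1] keeps only the first two columns (date and score).
sleep_score_data = pd.read_csv(score_file, usecols=[0, 1])
```

Because the stand-in statistics file starts with a title line, the real column names end up in the first data row of sleep_stats_data.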

I only import the first two columns of sleep_score.csv, as they are the ones that contain the date and the actual sleep score; all other relevant data is found in sleep_stats.csv. Let’s have a look at the first five rows in sleep_stats_data:

This is the first common problem: because of the way the CSV file is structured, the column names are in the first row. Here is one way to fix this problem:

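One way to do this is to promote the first row to the header and drop it from the data. The following is a sketch on a toy DataFrame; the helper name promote_first_row and the toy values are hypothetical.

```python
import pandas as pd

# Toy frame mimicking the problem: the real column names sit in row 0.
raw = pd.DataFrame(
    [
        ["Start Time", "Minutes Asleep"],
        ["2020-03-01 11:12PM", "412"],
        ["2020-03-02 10:48PM", "455"],
    ],
    columns=["Sleep", "Unnamed: 1"],
)


def promote_first_row(df):
    """Use the first row as column names and drop it from the data."""
    fixed = df.iloc[1:].reset_index(drop=True)
    fixed.columns = df.iloc[0].tolist()
    return fixed


sleep_stats_data = promote_first_row(raw)
```

An alternative is to avoid the problem at import time by telling pd.read_csv() which line holds the column names, for example with header=1.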

Using the .info() function we can obtain a high-level summary of the data in the DataFrame, which in our case looks like this:

Here, we encounter the second common problem: there are NaN values (missing data) in the last three columns. This is indicated by the fact that the above information summary tells us that there are 322 entries (rows) but for the last three columns the non-null count is 287. Let’s have a look at the rows that contain missing data using the following code:

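A common pandas idiom for this is a boolean mask over the rows. The sketch below uses made-up values and assumed column names, with one nap-like row that lacks sleep stage data.

```python
import numpy as np
import pandas as pd

# Toy data: the second row has no sleep stage information.
sleep_stats_data = pd.DataFrame({
    "Minutes Asleep": [412, 38, 455],
    "Minutes REM Sleep": [95, np.nan, 110],
    "Minutes Deep Sleep": [70, np.nan, 82],
})

# Boolean mask: True for every row with at least one missing value.
has_missing = sleep_stats_data.isnull().any(axis=1)
rows_with_missing = sleep_stats_data[has_missing]
```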

If we look at the column Minutes Asleep or Start and End Time it becomes clear that these rows refer to afternoon naps that Fitbit recorded. Naps are too short for Fitbit to be able to reliably measure important sleep statistics and therefore we will drop all these rows from the data set:

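Dropping those rows is a one-liner with dropna(). This is again a sketch on toy data with assumed column names, not the original notebook code.

```python
import numpy as np
import pandas as pd

# Toy data: the second row is a nap without sleep stage information.
sleep_stats_data = pd.DataFrame({
    "Minutes Asleep": [412, 38, 455],
    "Minutes REM Sleep": [95, np.nan, 110],
})

# Remove every row that contains a missing value and renumber the index.
sleep_stats_data = sleep_stats_data.dropna().reset_index(drop=True)
```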

In the above data summary, we also encounter the third common problem, which is related to the first: all columns are of data type “object” but columns with index 2 to 8 should clearly be numerical, i.e. either of data type “int” or “float”. The reason these columns are of data type “object” is most likely because the column headers were initially placed in the first row, thereby causing the entire column to be classified as “object”. Let’s convert these columns to data type “float”:

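The conversion can be sketched with astype(float) on a slice of the columns. The toy DataFrame below has only four columns, so the numerical ones start at index 2 and run to the end; on the full data the slice would be columns[2:9] to cover indices 2 to 8. Column names and values are assumptions.

```python
import pandas as pd

# Toy data: numbers stored as strings, so pandas does not treat them as numeric.
sleep_stats_data = pd.DataFrame({
    "Start Time": ["2020-03-01 11:12PM", "2020-03-02 10:48PM"],
    "End Time": ["2020-03-02 7:01AM", "2020-03-03 6:55AM"],
    "Minutes Asleep": ["412", "455"],
    "Minutes Awake": ["57", "49"],
})

# Select the columns that should be numerical and convert them to float.
numeric_cols = sleep_stats_data.columns[2:]
sleep_stats_data[numeric_cols] = sleep_stats_data[numeric_cols].astype(float)
```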

Let’s now have a look at the first few rows and the summary of sleep_score_data:

This DataFrame looks a lot better: the column headers were automatically recognised and there are no missing values.

For the purpose of further analyses, I would like to combine the two DataFrames into one, meaning I want to merge them. To ensure that the Sleep Scores end up in the row with the corresponding sleep statistics I need a column that is identical in both DataFrames, which will be used as the column to merge on.

In our case, both DataFrames have a column with some sort of timestamp. The sleep statistics DataFrame has a start and an end time and the Sleep Score DataFrame has a timestamp. Because a sleep score is always provided after awakening, the date that is relevant in the sleep statistics DataFrame is the end time and we can drop the start time. But there is one more issue: the format of the end time in the sleep statistics DataFrame is different from the format of the timestamp in the Sleep Score DataFrame. If we tried to merge the DataFrames on these columns, the rows would not be matched up. My solution was to create a “Date” column in both DataFrames that contains only the date, merge the DataFrames on those columns, drop the redundant columns and drop one row that contained a missing value after the merge. The following code accomplishes this:

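The steps described above can be sketched as follows. The column names (timestamp, overall_score), both timestamp formats and all values are assumptions made for illustration; the real export may use different formats, in which case the format string passed to pd.to_datetime() needs to be adjusted.

```python
import pandas as pd

# Toy data; the third night has no matching Sleep Score.
sleep_stats_data = pd.DataFrame({
    "Start Time": ["2020-03-01 11:12PM", "2020-03-02 10:48PM", "2020-03-03 11:30PM"],
    "End Time": ["2020-03-02 7:01AM", "2020-03-03 6:55AM", "2020-03-04 7:20AM"],
    "Minutes Asleep": [412.0, 455.0, 430.0],
})
sleep_score_data = pd.DataFrame({
    "timestamp": ["2020-03-02T07:01:30Z", "2020-03-03T06:55:00Z"],
    "overall_score": [78, 85],
})

# Reduce both timestamps to a plain calendar date so the keys match.
sleep_stats_data["Date"] = pd.to_datetime(
    sleep_stats_data["End Time"], format="%Y-%m-%d %I:%M%p"
).dt.date
sleep_score_data["Date"] = pd.to_datetime(sleep_score_data["timestamp"]).dt.date

# Merge on the date, drop the redundant columns, then drop unmatched rows.
sleep_data = (
    sleep_stats_data.merge(sleep_score_data, on="Date", how="left")
    .drop(columns=["Start Time", "End Time", "timestamp", "Date"])
    .dropna()
    .reset_index(drop=True)
)
```

Merging on the date alone only works if there is at most one sleep record per date, which is another reason to remove the naps first.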

The resulting combined DataFrame looks like this:

Merged and preprocessed data

I dropped all columns related to dates because this is not a time series analysis and we do not need the dates going forward. The number of awakenings is not provided by the Fitbit app, and because I want to predict Sleep Scores using only data that is provided in the app, I dropped it as well.

With the combined and preprocessed DataFrame we can move on to some Exploratory Data Analysis.

Exploratory Data Analysis (EDA)

In this section I will use visualisations to provide a better understanding of the underlying data. These initial insights will be the foundation for later analyses.

First let’s have a look at the distribution of the Sleep Scores:

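A histogram like this can be drawn with matplotlib. The sketch below uses a handful of made-up scores rather than the real data, so the numbers are purely illustrative; the skew() call gives a quick numerical check of the direction of the skew.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line in a notebook
import matplotlib.pyplot as plt
import pandas as pd

# Made-up Sleep Scores for illustration only.
scores = pd.Series(
    [62, 71, 75, 78, 80, 82, 83, 85, 86, 88, 90], name="overall_score"
)

fig, ax = plt.subplots(figsize=(8, 4))
ax.hist(scores, bins=10, edgecolor="black")
ax.axvline(scores.mean(), linestyle="--", label="mean")
ax.set_xlabel("Sleep Score")
ax.set_ylabel("Number of nights")
ax.legend()

# A negative skewness value indicates a longer tail on the left.
skewness = scores.skew()
```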

The distribution of Sleep Scores is skewed to the left, which makes sense because bad nights of sleep are more likely to occur than exceptionally good ones, for reasons such as staying out late or having to get up extremely early. In addition, the average Sleep Score is already relatively high at 82 (out of 100), so it is basically impossible to have many outliers that lie far above the mean.

Let’s also have a look at the relationship that each individual feature has with the Sleep Score to get a sense of which features might be important and what their relationships to the Sleep Score are. I have defined a function that takes as inputs a DataFrame that contains the target variable in the last column as well as the number of columns to be contained in the entire plot. The number of columns determines how many subplots there are in each row. Here is the function:

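A function matching this description could look like the sketch below. It is not the original implementation: the name plot_relationships, the use of scatter plots and the toy data are all assumptions.

```python
import math

import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line in a notebook
import matplotlib.pyplot as plt
import pandas as pd


def plot_relationships(df, num_cols):
    """Scatter every feature against the target (the last column of df)."""
    target = df.columns[-1]
    features = df.columns[:-1]
    num_rows = math.ceil(len(features) / num_cols)
    fig, axes = plt.subplots(
        num_rows, num_cols, figsize=(5 * num_cols, 4 * num_rows), squeeze=False
    )
    flat_axes = axes.ravel()
    for ax, feature in zip(flat_axes, features):
        ax.scatter(df[feature], df[target], alpha=0.6)
        ax.set_xlabel(feature)
        ax.set_ylabel(target)
    # Hide any unused panels in the last row.
    for ax in flat_axes[len(features):]:
        ax.set_visible(False)
    fig.tight_layout()
    return fig


# Toy data with four features and the target in the last column.
toy = pd.DataFrame({
    "Minutes Asleep": [380, 412, 455, 430],
    "Minutes Awake": [60, 57, 49, 52],
    "Minutes REM Sleep": [70, 95, 110, 100],
    "Minutes Deep Sleep": [60, 70, 82, 75],
    "overall_score": [72, 78, 85, 82],
})
fig = plot_relationships(toy, num_cols=3)
```

With four features and num_cols=3, the grid has two rows and the two unused panels in the second row are hidden.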

Calling this function with the sleep_data DataFrame and num_cols=3 as inputs results in the following plots:

Taken by themselves, Minutes Asleep and Minutes REM Sleep seem to have the strongest positive relationship with Sleep Score. Generally speaking this makes sense because more time asleep should be a positive thing when thinking about sleep quality and therefore Sleep Score. The same is true for more time spent in REM sleep.

To complete the picture about the relationships between the different features and Sleep Score let’s have a look at the correlation matrix:

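The correlation matrix itself comes straight from DataFrame.corr(). The sketch below uses made-up values and assumed column names, so the resulting coefficients say nothing about the real data; it only shows how the matrix and the ranking of features by their correlation with the score can be obtained.

```python
import pandas as pd

# Made-up values for illustration only.
toy = pd.DataFrame({
    "Minutes Asleep": [380, 412, 455, 430, 470, 395],
    "Minutes REM Sleep": [70, 95, 110, 100, 120, 80],
    "Minutes Awake": [60, 57, 49, 52, 45, 63],
    "overall_score": [72, 78, 85, 82, 88, 74],
})

# Pairwise Pearson correlations between all columns.
corr = toy.corr()

# Correlation of each feature with the Sleep Score, strongest positive first.
score_corr = (
    corr["overall_score"].drop("overall_score").sort_values(ascending=False)
)
```

High off-diagonal values between the features themselves are the first hint of the multicollinearity issue mentioned in the text.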

Indeed, Sleep Score has the highest correlation with Minutes REM Sleep, closely followed by Minutes Asleep. Another important thing to note is that many of the features are highly correlated. This makes sense because more time asleep should lead to more time spent in all stages of sleep and the features will tend to move together. While this may be an inevitable by-product of the nature of the features included here it could lead to multicollinearity issues down the road. More on this later.

Part 2 builds on the preprocessed data and the insights from the Exploratory Data Analysis to build a couple of different Machine Learning models that predict Sleep Scores. Part 2 can be found here:

Translated from: https://towardsdatascience.com/how-to-obtain-and-analyse-fitbit-sleep-scores-a739d7c8df85
