Paper title: Human microbiome aging clocks based on deep learning and tandem of permutation feature importance and accumulated local effects

Scholar citations: 3

Pages: 16

Published: 2019.01. This article is a preprint and has not been certified by peer review.

Authors: Fedor Galkin, Aleksandr Aliper, ..., Alex Zhavoronkov

Abstract:

The human gut microbiome is a complex ecosystem that both affects and is affected by its host status. Previous analyses of gut microflora revealed associations between specific microbes and host health and disease status, genotype and diet. Here, we developed a method of predicting biological age of the host based on the microbiological profiles of gut microbiota using a curated dataset of 1,165 healthy individuals (3,663 microbiome samples). Our predictive model, a human microbiome clock, has an architecture of a deep neural network and achieves the accuracy of 3.94 years mean absolute error in cross-validation. The performance of the deep microbiome clock was also evaluated on several additional populations. We further introduce a platform for biological interpretation of individual microbial features used in age models, which relies on permutation feature importance and accumulated local effects. This approach has allowed us to define two lists of 95 intestinal biomarkers of human aging. We further show that this list can be reduced to 39 taxa that convey the most information on their host’s aging. Overall, we show that (a) microbiological profiles can be used to predict human age; and (b) microbial features selected by models are age-related.

Paper outline:

1. Introduction

2. Methods

2.1 Data acquisition

2.2 Abundance calculation

2.3 Neural networks training

2.4 Oversampling

2.5 Feature importance

3. Results

3.1 Age prediction using machine learning

3.2 Microbiological influence on age prediction

3.3 Age bracket prediction with DNN

3.4 Host-based age prediction

4. Discussion

5. Conclusion

Excerpts from the main text:

1. Biological Problem: What biological problems have been solved in this paper?

  • predicting biological age

2. Main discoveries: What are the main discoveries in this paper?

  • Our predictive model, a human microbiome clock, has an architecture of a deep neural network and achieves the accuracy of 3.94 years mean absolute error in cross-validation.
  • This approach has allowed us to define two lists of 95 intestinal biomarkers of human aging. We further show that this list can be reduced to 39 taxa that convey the most information on their host’s aging.
  • Although surprising at first glance, bacterial influence on age prediction is not determined by whether it is beneficial to the host or not.
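
The headline result above is reported as a mean absolute error (MAE), and the XGB comparison later in these notes also quotes R2. As a minimal sketch of what those two numbers measure, the snippet below computes both on made-up ages; the function names and the sample values are illustrative assumptions, not code or data from the paper.

```python
def mean_absolute_error(y_true, y_pred):
    """Average absolute gap between true and predicted ages, in years."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Made-up ages, for illustration only (not data from the paper).
true_ages = [25, 40, 55, 70]
pred_ages = [29, 37, 59, 65]

print(mean_absolute_error(true_ages, pred_ages))  # (4 + 3 + 4 + 5) / 4 = 4.0
print(round(r2_score(true_ages, pred_ages), 3))   # 1 - 66 / 1125 = 0.941
```

An MAE of 3.94 years therefore means the predicted age is, on average, about four years away from the recorded age, regardless of the direction of the error.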

3. ML(Machine Learning) Methods: What are the ML methods applied in this paper?

  • we developed a method of predicting biological age of the host based on the microbiological profiles of gut microbiota using a curated dataset of 1,165 healthy individuals (3,663 microbiome samples).
  • We also developed a method for microbiological feature selection and annotation. It combines two-fold feature importance assessment using PFI and ALE approaches upon training a DNN.
  • We applied multiple methods to build a regressor that takes in profiles containing abundances for all 1,673 taxa reliably detected in at least 0.13% of samples, including random forest, support vector machine, elastic net, gradient boosting (XGB) and deep neural network (DNN). However, only the latter two models achieved the predictions better than random.
  • build a predictor of age with whole genome sequencing (WGS) data aggregated from multiple sources and various machine learning techniques and use it to examine patterns of incessant microflora succession.
  • we report a method to estimate a host’s age based on their microflora taxonomic profile, assess the importance of specific taxa in organismal aging, and suggest candidate geroprotective microbiological interventions.
  • The best performing model architecture was determined in the sample-based setting. It contains three hidden layers with 512 nodes in each, with PReLU activation function, Adam optimizer, dropout fraction 0.5 at each layer, and 0.001 learning rate.
  • Age classifier models were trained using a subset of either 95 features or 39 features.
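
To get a feel for the size of the network described above, the sketch below counts the dense-layer parameters for 1,673 input taxa and three hidden layers of 512 nodes, and shows what the PReLU activation does to a single value. The single output node (one predicted age) and the helper names are assumptions made for illustration; the PReLU slope value 0.25 is arbitrary, and PReLU slopes are left out of the count because their number depends on the framework's defaults.

```python
def prelu(x, alpha):
    """PReLU: identity for positive inputs, learnable slope `alpha` for negative ones."""
    return x if x > 0 else alpha * x


def dense_param_count(layer_sizes):
    """Count weights and biases of fully connected layers of the given widths."""
    total = 0
    for n_in, n_out in zip(layer_sizes, layer_sizes[1:]):
        total += n_in * n_out + n_out  # weight matrix + bias vector
    return total


# 1,673 input taxa and three hidden layers of 512 nodes come from the text;
# the single output node (predicted age) is an assumption for a regressor.
LAYER_SIZES = [1673, 512, 512, 512, 1]

print(dense_param_count(LAYER_SIZES))       # 1382913, excluding PReLU slopes
print(prelu(-2.0, 0.25), prelu(3.0, 0.25))  # -0.5 3.0
```

With roughly 1.4 million trainable weights and only 3,663 samples, the dropout fraction of 0.5 quoted above is a plausible guard against overfitting, although the notes do not state the authors' reasoning.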

4. ML Advantages: Why are these ML methods better than the traditional methods in these biological problems?

  • To verify the results obtained with DNN, we implemented random forest, support vector machine and elastic net regressor.
  • All of these methods performed poorly compared to the DNN approach with the mean absolute errors exceeding 11 years.
  • Apart from them, we trained a gradient boosting (XGB) regressor with accuracy comparable to the DNN model (MAE = 4.69 years, R2 = 0.81)
  • According to Permutation Feature Importance (PFI) scores, DNN regressor is more sensitive to highly abundant species, while XGB regressor contains some minor taxa among its most important features. We consider this an indication of DNN’s increased robustness compared to other methods.
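
Permutation feature importance, mentioned in the last point, can be sketched in a few lines: shuffle one feature column, re-score the model, and record how much the error grows. The version below is a generic illustration using MAE as the score and a hypothetical `toy_predict` function standing in for a trained regressor; it is not the authors' implementation, and the synthetic data are made up.

```python
import random


def mean_absolute_error(y_true, y_pred):
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def permutation_importance(predict, X, y, feature, n_repeats=10, seed=0):
    """Mean increase in MAE after shuffling one feature column."""
    rng = random.Random(seed)
    baseline = mean_absolute_error(y, [predict(row) for row in X])
    increases = []
    for _ in range(n_repeats):
        column = [row[feature] for row in X]
        rng.shuffle(column)
        # Rebuild every row with the shuffled value in place of the original one.
        X_perm = [row[:feature] + [v] + row[feature + 1:] for row, v in zip(X, column)]
        permuted = mean_absolute_error(y, [predict(row) for row in X_perm])
        increases.append(permuted - baseline)
    return sum(increases) / n_repeats


# Hypothetical stand-in for a trained regressor: age depends on taxon 0 only.
def toy_predict(row):
    return 20.0 + 100.0 * row[0]


rng = random.Random(42)
X = [[rng.random(), rng.random()] for _ in range(200)]  # synthetic abundances
y = [toy_predict(row) for row in X]

pfi_taxon0 = permutation_importance(toy_predict, X, y, feature=0)
pfi_taxon1 = permutation_importance(toy_predict, X, y, feature=1)
print(pfi_taxon0, pfi_taxon1)
```

Shuffling the taxon the toy model relies on raises the error sharply, while shuffling the ignored taxon leaves it unchanged. Ranking taxa by this increase is the general idea behind a PFI-based feature list.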

5. Biological Significance: What is the biological significance of these ML methods’ results?

  • We further introduce a platform for biological interpretation of individual microbial features used in age models, which relies on permutation feature importance and accumulated local effects.
  • Despite great performance of XGB (MAE = 4.69 years) and DNN models (MAE = 3.94 years), extracting biologically relevant information from them presents a major challenge.
  • We implemented ALEs approach using DNN regressor as a reference and its 95 most important features to see how changes in abundance affect the predictions. ALE is a technique that theoretically surpasses PFI as it takes into account intrinsic interdependence of microbiological features.
  • According to our ALE analysis, only 39/95 features could change the average predicted age by more than 1 year.
  • Interestingly, reducing the number of features by 59% caused only a 5% drop in F-score for the age bracket classification task. This suggests that the ALEs technique succeeded in selecting only the most relevant microbial features.
  • A weighted F1-score was selected as the target metric to assess model performance.
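
The accumulated local effects idea referred to above can be illustrated with a first-order ALE curve for a single feature: split the feature's range into quantile bins, measure how the prediction changes when the feature moves across each bin for the samples that actually fall in it, then accumulate and centre those changes. The code below is a generic sketch with a hypothetical `toy_predict` model and synthetic data. Reading the paper's "more than 1 year" criterion as the span of the ALE curve is an assumption, not something these notes confirm.

```python
import random


def ale_1d(predict, X, feature, n_bins=5):
    """First-order ALE curve of one feature, centred so its data-weighted mean is 0."""
    values = sorted(row[feature] for row in X)
    # Quantile-based edges so each bin holds a similar number of samples.
    edges = sorted({values[round(k * (len(values) - 1) / n_bins)] for k in range(n_bins + 1)})
    if len(edges) < 2:  # constant feature: no local effect to accumulate
        return edges, [0.0] * len(edges)
    ale, counts = [0.0], []
    for k, (lo, hi) in enumerate(zip(edges, edges[1:])):
        in_bin = [r for r in X if lo < r[feature] <= hi or (k == 0 and r[feature] == lo)]
        # Local effect: prediction change when the feature moves across the bin,
        # with all other features kept at each sample's own observed values.
        diffs = [
            predict(r[:feature] + [hi] + r[feature + 1:])
            - predict(r[:feature] + [lo] + r[feature + 1:])
            for r in in_bin
        ]
        ale.append(ale[-1] + sum(diffs) / len(diffs))
        counts.append(len(in_bin))
    # Centre the curve using the sample count of each bin as its weight.
    mean = sum(c * (a + b) / 2 for c, a, b in zip(counts, ale, ale[1:])) / sum(counts)
    return edges, [a - mean for a in ale]


# Hypothetical stand-in for a trained regressor: only taxon 0 moves the prediction.
def toy_predict(row):
    return 30.0 + 10.0 * row[0]


rng = random.Random(0)
X = [[rng.random(), rng.random()] for _ in range(200)]  # synthetic abundances

for feature in (0, 1):
    _, curve = ale_1d(toy_predict, X, feature)
    span = max(curve) - min(curve)
    print(feature, round(span, 2), span > 1.0)  # keep features whose span exceeds 1 year
```

Because each difference is evaluated only on samples that really lie in the bin, ALE avoids scoring unrealistic feature combinations, which is the usual argument for preferring it over permutation-based measures when features are correlated.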

6. Prospect: What are the potential applications of these machine learning methods in biological science?

  • To our best knowledge, we present the first method to predict human chronological age using gut microbiota abundance profiles.
  • Overall, we show that (a) microbiological profiles can be used to predict human age; and (b) microbial features selected by models are age-related.
  • The identified biomarkers include species whose abundance is positively or negatively correlated with predicted age. These species may be further investigated deeply by the community to improve our understanding of human aging and its relationship with the gut microbiome.

7. My Questions (Optional)
