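Notes:
MIMIC is a well-known clinical database. Its schema, and the way data are pulled from it with SQL, make a good template for building a clinical database of your own. To compute the Charlson score, this official SQL script looks up a range of chronic diseases. The SQL is followed by a preliminary analysis (in Python) of the query result, which gives a first picture of how often each chronic disease appears in the database.
Takeaways:
MIMIC stores diagnoses in long format (a narrow table); the script shows how to turn that into a wide table in a single step;
one disease maps to many codes;
what WITH temporary tables (CTEs) are used for.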
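
The long-to-wide step and the "one disease, many codes" point can be sketched in plain Python. This is only an illustration under stated assumptions: the rows below are invented rather than taken from MIMIC, `flags_for` and `to_wide` are hypothetical helper names, and only a subset of the congestive heart failure codes is checked. Slicing (`code[:3]`) plays the role of `SUBSTR(code, 1, 3)`, and taking the maximum per admission plays the role of `MAX(...) ... GROUP BY hadm_id`.

```python
# Toy long-format rows: (hadm_id, icd_version, icd_code).
# All values are invented for illustration; they are not MIMIC data.
rows = [
    (1, 9, "41071"),
    (1, 9, "4280"),
    (2, 10, "I5022"),
    (3, 10, "I252"),
    (3, 9, "25000"),
]

DISEASES = ("myocardial_infarct", "congestive_heart_failure")


def flags_for(icd_version, icd_code):
    """Return 0/1 flags for one diagnosis row (prefix match, like SUBSTR)."""
    icd9 = icd_code if icd_version == 9 else ""
    icd10 = icd_code if icd_version == 10 else ""
    return {
        "myocardial_infarct": int(
            icd9[:3] in ("410", "412")
            or icd10[:3] in ("I21", "I22")
            or icd10[:4] == "I252"
        ),
        # Subset of the heart failure codes, kept short on purpose.
        "congestive_heart_failure": int(
            icd9[:3] == "428"
            or icd10[:3] in ("I43", "I50")
        ),
    }


def to_wide(long_rows):
    """Collapse many rows per admission into one dict of flags per admission."""
    wide = {}
    for hadm_id, version, code in long_rows:
        current = wide.setdefault(hadm_id, {name: 0 for name in DISEASES})
        for name, flag in flags_for(version, code).items():
            # Same idea as MAX(CASE WHEN ... THEN 1 ELSE 0 END) per hadm_id.
            current[name] = max(current[name], flag)
    return wide


wide = to_wide(rows)
for hadm_id, flags in sorted(wide.items()):
    print(hadm_id, flags)
```

Each admission ends up as one record with one column per disease, which is the wide layout the SQL below produces.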

-- THIS SCRIPT IS AUTOMATICALLY GENERATED. DO NOT EDIT IT DIRECTLY.
--DROP TABLE IF EXISTS charlson; CREATE TABLE charlson AS
-- ------------------------------------------------------------------
-- This query extracts Charlson Comorbidity Index (CCI) based on the recorded ICD-9 and ICD-10 codes.
--
-- Reference for CCI:
-- (1) Charlson ME, Pompei P, Ales KL, MacKenzie CR. (1987) A new method of classifying prognostic
-- comorbidity in longitudinal studies: development and validation. J Chronic Dis; 40(5):373-83.
--
-- (2) Charlson M, Szatrowski TP, Peterson J, Gold J. (1994) Validation of a combined comorbidity
-- index. J Clin Epidemiol; 47(11):1245-51.
--
-- Reference for ICD-9-CM and ICD-10 Coding Algorithms for Charlson Comorbidities:
-- (3) Quan H, Sundararajan V, Halfon P, et al. Coding algorithms for defining Comorbidities in ICD-9-CM
-- and ICD-10 administrative data. Med Care. 2005 Nov; 43(11): 1130-9.
-- ------------------------------------------------------------------
WITH diag AS -- first temporary table (CTE): diagnosis codes split by ICD version
(
    SELECT
        hadm_id -- hospital admission ID
        , CASE WHEN icd_version = 9 THEN icd_code ELSE NULL END AS icd9_code
        , CASE WHEN icd_version = 10 THEN icd_code ELSE NULL END AS icd10_code
    FROM mimic_hosp.diagnoses_icd diag
)
, com AS -- second temporary table, built on the previous one
(
    SELECT
        ad.hadm_id

        -- Myocardial infarction
        , MAX(CASE WHEN
            SUBSTR(icd9_code, 1, 3) IN ('410','412') -- SUBSTR(string, start, length)
            OR
            SUBSTR(icd10_code, 1, 3) IN ('I21','I22') -- first 3 characters; code meanings are in the d_icd_diagnoses table
            OR
            SUBSTR(icd10_code, 1, 4) = 'I252'
            THEN 1
            ELSE 0 END) AS myocardial_infarct

        -- Congestive heart failure
        , MAX(CASE WHEN
            SUBSTR(icd9_code, 1, 3) = '428'
            OR
            SUBSTR(icd9_code, 1, 5) IN ('39891','40201','40211','40291','40401','40403','40411','40413','40491','40493')
            OR
            SUBSTR(icd9_code, 1, 4) BETWEEN '4254' AND '4259'
            OR
            SUBSTR(icd10_code, 1, 3) IN ('I43','I50')
            OR
            SUBSTR(icd10_code, 1, 4) IN ('I099','I110','I130','I132','I255','I420','I425','I426','I427','I428','I429','P290')
            THEN 1
            ELSE 0 END) AS congestive_heart_failure

        -- Peripheral vascular disease
        , MAX(CASE WHEN
            SUBSTR(icd9_code, 1, 3) IN ('440','441')
            OR
            SUBSTR(icd9_code, 1, 4) IN ('0930','4373','4471','5571','5579','V434')
            OR
            SUBSTR(icd9_code, 1, 4) BETWEEN '4431' AND '4439'
            OR
            SUBSTR(icd10_code, 1, 3) IN ('I70','I71')
            OR
            SUBSTR(icd10_code, 1, 4) IN ('I731','I738','I739','I771','I790','I792','K551','K558','K559','Z958','Z959')
            THEN 1
            ELSE 0 END) AS peripheral_vascular_disease

        -- Cerebrovascular disease
        , MAX(CASE WHEN
            SUBSTR(icd9_code, 1, 3) BETWEEN '430' AND '438'
            OR
            SUBSTR(icd9_code, 1, 5) = '36234'
            OR
            SUBSTR(icd10_code, 1, 3) IN ('G45','G46')
            OR
            SUBSTR(icd10_code, 1, 3) BETWEEN 'I60' AND 'I69'
            OR
            SUBSTR(icd10_code, 1, 4) = 'H340'
            THEN 1
            ELSE 0 END) AS cerebrovascular_disease

        -- Dementia
        , MAX(CASE WHEN
            SUBSTR(icd9_code, 1, 3) = '290'
            OR
            SUBSTR(icd9_code, 1, 4) IN ('2941','3312')
            OR
            SUBSTR(icd10_code, 1, 3) IN ('F00','F01','F02','F03','G30')
            OR
            SUBSTR(icd10_code, 1, 4) IN ('F051','G311')
            THEN 1
            ELSE 0 END) AS dementia

        -- Chronic pulmonary disease
        , MAX(CASE WHEN
            SUBSTR(icd9_code, 1, 3) BETWEEN '490' AND '505'
            OR
            SUBSTR(icd9_code, 1, 4) IN ('4168','4169','5064','5081','5088')
            OR
            SUBSTR(icd10_code, 1, 3) BETWEEN 'J40' AND 'J47'
            OR
            SUBSTR(icd10_code, 1, 3) BETWEEN 'J60' AND 'J67'
            OR
            SUBSTR(icd10_code, 1, 4) IN ('I278','I279','J684','J701','J703')
            THEN 1
            ELSE 0 END) AS chronic_pulmonary_disease

        -- Rheumatic disease
        , MAX(CASE WHEN
            SUBSTR(icd9_code, 1, 3) = '725'
            OR
            SUBSTR(icd9_code, 1, 4) IN ('4465','7100','7101','7102','7103','7104','7140','7141','7142','7148')
            OR
            SUBSTR(icd10_code, 1, 3) IN ('M05','M06','M32','M33','M34')
            OR
            SUBSTR(icd10_code, 1, 4) IN ('M315','M351','M353','M360')
            THEN 1
            ELSE 0 END) AS rheumatic_disease

        -- Peptic ulcer disease
        , MAX(CASE WHEN
            SUBSTR(icd9_code, 1, 3) IN ('531','532','533','534')
            OR
            SUBSTR(icd10_code, 1, 3) IN ('K25','K26','K27','K28')
            THEN 1
            ELSE 0 END) AS peptic_ulcer_disease

        -- Mild liver disease
        , MAX(CASE WHEN
            SUBSTR(icd9_code, 1, 3) IN ('570','571')
            OR
            SUBSTR(icd9_code, 1, 4) IN ('0706','0709','5733','5734','5738','5739','V427')
            OR
            SUBSTR(icd9_code, 1, 5) IN ('07022','07023','07032','07033','07044','07054')
            OR
            SUBSTR(icd10_code, 1, 3) IN ('B18','K73','K74')
            OR
            SUBSTR(icd10_code, 1, 4) IN ('K700','K701','K702','K703','K709','K713','K714','K715','K717','K760','K762','K763','K764','K768','K769','Z944')
            THEN 1
            ELSE 0 END) AS mild_liver_disease

        -- Diabetes without chronic complication
        , MAX(CASE WHEN
            SUBSTR(icd9_code, 1, 4) IN ('2500','2501','2502','2503','2508','2509')
            OR
            SUBSTR(icd10_code, 1, 4) IN ('E100','E101','E106','E108','E109','E110','E111','E116','E118','E119','E120','E121','E126','E128','E129','E130','E131','E136','E138','E139','E140','E141','E146','E148','E149')
            THEN 1
            ELSE 0 END) AS diabetes_without_cc

        -- Diabetes with chronic complication
        , MAX(CASE WHEN
            SUBSTR(icd9_code, 1, 4) IN ('2504','2505','2506','2507')
            OR
            SUBSTR(icd10_code, 1, 4) IN ('E102','E103','E104','E105','E107','E112','E113','E114','E115','E117','E122','E123','E124','E125','E127','E132','E133','E134','E135','E137','E142','E143','E144','E145','E147')
            THEN 1
            ELSE 0 END) AS diabetes_with_cc

        -- Hemiplegia or paraplegia
        , MAX(CASE WHEN
            SUBSTR(icd9_code, 1, 3) IN ('342','343')
            OR
            SUBSTR(icd9_code, 1, 4) IN ('3341','3440','3441','3442','3443','3444','3445','3446','3449')
            OR
            SUBSTR(icd10_code, 1, 3) IN ('G81','G82')
            OR
            SUBSTR(icd10_code, 1, 4) IN ('G041','G114','G801','G802','G830','G831','G832','G833','G834','G839')
            THEN 1
            ELSE 0 END) AS paraplegia

        -- Renal disease
        , MAX(CASE WHEN
            SUBSTR(icd9_code, 1, 3) IN ('582','585','586','V56')
            OR
            SUBSTR(icd9_code, 1, 4) IN ('5880','V420','V451')
            OR
            SUBSTR(icd9_code, 1, 4) BETWEEN '5830' AND '5837'
            OR
            SUBSTR(icd9_code, 1, 5) IN ('40301','40311','40391','40402','40403','40412','40413','40492','40493')
            OR
            SUBSTR(icd10_code, 1, 3) IN ('N18','N19')
            OR
            SUBSTR(icd10_code, 1, 4) IN ('I120','I131','N032','N033','N034','N035','N036','N037','N052','N053','N054','N055','N056','N057','N250','Z490','Z491','Z492','Z940','Z992')
            THEN 1
            ELSE 0 END) AS renal_disease

        -- Any malignancy, including lymphoma and leukemia, except malignant neoplasm of skin
        , MAX(CASE WHEN
            SUBSTR(icd9_code, 1, 3) BETWEEN '140' AND '172'
            OR
            SUBSTR(icd9_code, 1, 4) BETWEEN '1740' AND '1958'
            OR
            SUBSTR(icd9_code, 1, 3) BETWEEN '200' AND '208'
            OR
            SUBSTR(icd9_code, 1, 4) = '2386'
            OR
            SUBSTR(icd10_code, 1, 3) IN ('C43','C88')
            OR
            SUBSTR(icd10_code, 1, 3) BETWEEN 'C00' AND 'C26'
            OR
            SUBSTR(icd10_code, 1, 3) BETWEEN 'C30' AND 'C34'
            OR
            SUBSTR(icd10_code, 1, 3) BETWEEN 'C37' AND 'C41'
            OR
            SUBSTR(icd10_code, 1, 3) BETWEEN 'C45' AND 'C58'
            OR
            SUBSTR(icd10_code, 1, 3) BETWEEN 'C60' AND 'C76'
            OR
            SUBSTR(icd10_code, 1, 3) BETWEEN 'C81' AND 'C85'
            OR
            SUBSTR(icd10_code, 1, 3) BETWEEN 'C90' AND 'C97'
            THEN 1
            ELSE 0 END) AS malignant_cancer

        -- Moderate or severe liver disease
        , MAX(CASE WHEN
            SUBSTR(icd9_code, 1, 4) IN ('4560','4561','4562')
            OR
            SUBSTR(icd9_code, 1, 4) BETWEEN '5722' AND '5728'
            OR
            SUBSTR(icd10_code, 1, 4) IN ('I850','I859','I864','I982','K704','K711','K721','K729','K765','K766','K767')
            THEN 1
            ELSE 0 END) AS severe_liver_disease

        -- Metastatic solid tumor
        , MAX(CASE WHEN
            SUBSTR(icd9_code, 1, 3) IN ('196','197','198','199')
            OR
            SUBSTR(icd10_code, 1, 3) IN ('C77','C78','C79','C80')
            THEN 1
            ELSE 0 END) AS metastatic_solid_tumor

        -- AIDS/HIV
        , MAX(CASE WHEN
            SUBSTR(icd9_code, 1, 3) IN ('042','043','044')
            OR
            SUBSTR(icd10_code, 1, 3) IN ('B20','B21','B22','B24')
            THEN 1
            ELSE 0 END) AS aids
    FROM mimic_core.admissions ad
    LEFT JOIN diag
        ON ad.hadm_id = diag.hadm_id
    GROUP BY ad.hadm_id
)
, ag AS
(
    SELECT
        hadm_id
        , age
        , CASE WHEN age <= 40 THEN 0
            WHEN age <= 50 THEN 1
            WHEN age <= 60 THEN 2
            WHEN age <= 70 THEN 3
            ELSE 4 END AS age_score
    FROM mimic_derived.age
)
-- The main query starts here
SELECT
    ad.subject_id
    , ad.hadm_id
    , ag.age_score
    , myocardial_infarct
    , congestive_heart_failure
    , peripheral_vascular_disease
    , cerebrovascular_disease
    , dementia
    , chronic_pulmonary_disease
    , rheumatic_disease
    , peptic_ulcer_disease
    , mild_liver_disease
    , diabetes_without_cc
    , diabetes_with_cc
    , paraplegia
    , renal_disease
    , malignant_cancer
    , severe_liver_disease
    , metastatic_solid_tumor
    , aids
    -- Calculate the Charlson Comorbidity Score using the original
    -- weights from Charlson, 1987.
    , age_score
    + myocardial_infarct + congestive_heart_failure + peripheral_vascular_disease
    + cerebrovascular_disease + dementia + chronic_pulmonary_disease
    + rheumatic_disease + peptic_ulcer_disease
    + GREATEST(mild_liver_disease, 3*severe_liver_disease)
    + GREATEST(2*diabetes_with_cc, diabetes_without_cc)
    + GREATEST(2*malignant_cancer, 6*metastatic_solid_tumor)
    + 2*paraplegia + 2*renal_disease + 6*aids
    AS charlson_comorbidity_index
FROM mimic_core.admissions ad
LEFT JOIN com -- join the com temporary table
    ON ad.hadm_id = com.hadm_id
LEFT JOIN ag -- join the ag temporary table
    ON com.hadm_id = ag.hadm_id
;
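
To make the scoring arithmetic in the main query concrete, here is a minimal Python sketch of the same sum. The function names `age_score` and `charlson_index` and the example patient are assumptions made for illustration; the age bands and weights are copied from the SQL above, with Python's `max` standing in for `GREATEST`.

```python
def age_score(age):
    """Age bands as in the ag temporary table."""
    if age <= 40:
        return 0
    if age <= 50:
        return 1
    if age <= 60:
        return 2
    if age <= 70:
        return 3
    return 4


def charlson_index(age, flags):
    """Sum the weighted 0/1 comorbidity flags; missing flags count as 0."""
    def f(name):
        return flags.get(name, 0)

    return (
        age_score(age)
        + f("myocardial_infarct") + f("congestive_heart_failure")
        + f("peripheral_vascular_disease") + f("cerebrovascular_disease")
        + f("dementia") + f("chronic_pulmonary_disease")
        + f("rheumatic_disease") + f("peptic_ulcer_disease")
        # Paired conditions: only the heavier of the two weights is counted.
        + max(f("mild_liver_disease"), 3 * f("severe_liver_disease"))
        + max(2 * f("diabetes_with_cc"), f("diabetes_without_cc"))
        + max(2 * f("malignant_cancer"), 6 * f("metastatic_solid_tumor"))
        + 2 * f("paraplegia") + 2 * f("renal_disease") + 6 * f("aids")
    )


# Hypothetical admission: age 65, heart failure, both diabetes flags,
# and both a primary malignancy and a metastatic tumor.
example = {
    "congestive_heart_failure": 1,
    "diabetes_without_cc": 1,
    "diabetes_with_cc": 1,
    "malignant_cancer": 1,
    "metastatic_solid_tumor": 1,
}
score = charlson_index(65, example)
print(score)  # 3 (age) + 1 (CHF) + 2 (diabetes) + 6 (metastatic) = 12
```

The `max` pairs are what stop a patient from being counted twice for the mild and severe form of the same disease.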

Preliminary analysis

import pandas as pd
df = pd.read_csv('charlson_comm.csv')
df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 523740 entries, 0 to 523739
Data columns (total 21 columns):
 #   Column                       Non-Null Count   Dtype
---  ------                       --------------   -----
 0   subject_id                   523740 non-null  int64
 1   hadm_id                      523740 non-null  int64
 2   age_score                    523740 non-null  int64
 3   myocardial_infarct           523740 non-null  int64
 4   congestive_heart_failure     523740 non-null  int64
 5   peripheral_vascular_disease  523740 non-null  int64
 6   cerebrovascular_disease      523740 non-null  int64
 7   dementia                     523740 non-null  int64
 8   chronic_pulmonary_disease    523740 non-null  int64
 9   rheumatic_disease            523740 non-null  int64
 10  peptic_ulcer_disease         523740 non-null  int64
 11  mild_liver_disease           523740 non-null  int64
 12  diabetes_without_cc          523740 non-null  int64
 13  diabetes_with_cc             523740 non-null  int64
 14  paraplegia                   523740 non-null  int64
 15  renal_disease                523740 non-null  int64
 16  malignant_cancer             523740 non-null  int64
 17  severe_liver_disease         523740 non-null  int64
 18  metastatic_solid_tumor       523740 non-null  int64
 19  aids                         523740 non-null  int64
 20  charlson_comorbidity_index   523740 non-null  int64
dtypes: int64(21)
memory usage: 83.9 MB
# age_score plus the 17 comorbidity flags (skip the two ID columns and the final index)
columns = df.columns[2:-1]
ls_patients_num = []
for col in columns:
    # Count how many rows take each value (0/1 for flags, 0-4 for age_score)
    value = df[col].value_counts()
    ls_patients_num.append(value)
# print(ls_patients_num)
df2 = pd.DataFrame(ls_patients_num)
print(df2[:])

                                    0        1        2        3         4
age_score                    155046.0  56457.0  84097.0  89482.0  138658.0
myocardial_infarct           487040.0  36700.0      NaN      NaN       NaN
congestive_heart_failure     455882.0  67858.0      NaN      NaN       NaN
peripheral_vascular_disease  492644.0  31096.0      NaN      NaN       NaN
cerebrovascular_disease      494102.0  29638.0      NaN      NaN       NaN
dementia                     511610.0  12130.0      NaN      NaN       NaN
chronic_pulmonary_disease    439659.0  84081.0      NaN      NaN       NaN
rheumatic_disease            510475.0  13265.0      NaN      NaN       NaN
peptic_ulcer_disease         517983.0   5757.0      NaN      NaN       NaN
mild_liver_disease           489477.0  34263.0      NaN      NaN       NaN
diabetes_without_cc          440969.0  82771.0      NaN      NaN       NaN
diabetes_with_cc             487871.0  35869.0      NaN      NaN       NaN
paraplegia                   516222.0   7518.0      NaN      NaN       NaN
renal_disease                454858.0  68882.0      NaN      NaN       NaN
malignant_cancer             469985.0  53755.0      NaN      NaN       NaN
severe_liver_disease         512935.0  10805.0      NaN      NaN       NaN
metastatic_solid_tumor       501534.0  22206.0      NaN      NaN       NaN
aids                         520348.0   3392.0      NaN      NaN       NaN

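Note: for the disease rows, column 1 is the number of records with the condition and column 0 the number without it. Because the query returns one row per hadm_id, these are counts of hospital admissions, so a patient admitted several times is counted once per admission. For age_score, columns 0 to 4 are the five age bands.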
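
The raw counts are easier to compare as percentages. The sketch below copies the column 1 counts from the table above and divides them by the 523740 rows reported by `df.info()`; the variable names are made up for this example.

```python
# Row count reported by df.info(); one row per hospital admission.
total_rows = 523740

# "Column 1" counts copied from the printed table above.
positive_counts = {
    "myocardial_infarct": 36700,
    "congestive_heart_failure": 67858,
    "peripheral_vascular_disease": 31096,
    "cerebrovascular_disease": 29638,
    "dementia": 12130,
    "chronic_pulmonary_disease": 84081,
    "rheumatic_disease": 13265,
    "peptic_ulcer_disease": 5757,
    "mild_liver_disease": 34263,
    "diabetes_without_cc": 82771,
    "diabetes_with_cc": 35869,
    "paraplegia": 7518,
    "renal_disease": 68882,
    "malignant_cancer": 53755,
    "severe_liver_disease": 10805,
    "metastatic_solid_tumor": 22206,
    "aids": 3392,
}

# Share of rows carrying each flag, in percent.
prevalence = {name: 100 * n / total_rows for name, n in positive_counts.items()}

# Print from most to least frequent.
for name, pct in sorted(prevalence.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name:<28}{pct:6.2f}%")

most_common = max(positive_counts, key=positive_counts.get)
least_common = min(positive_counts, key=positive_counts.get)
```

On these numbers the most frequent flag is chronic_pulmonary_disease and the least frequent is aids.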
