Analysis of the Kaggle Intracranial Hemorrhage Competition

The post comes from [1].

The original poster wrote:

With se_resnext50 you can achieve 0.066 on LB. Here are the tips.

5 folds
512x512
3 epochs
hflip, crop, brightness, contrast, rotate (the details of the augmentations are in [3])
three types of windows, brain, blood and soft tissues.
tta (5 times)
lb 0.070 to 0.072 for each fold (with tta)
cv 0.071 to 0.074 for each fold (without tta)
lb 0.066 is achievable by averaging folds
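
The last three tips (TTA, per-fold scores, fold averaging) can be sketched as follows. This is a minimal illustration, not the code from [2]: `random_hflip`, `predict_with_tta` and `ensemble_folds` are hypothetical names, and each fold model is assumed to be a callable that maps an image to a vector of class probabilities.

```python
import numpy as np


def random_hflip(image, rng):
    """Toy augmentation (assumption): flip left-right with probability 0.5."""
    return image[:, ::-1] if rng.random() < 0.5 else image


def predict_with_tta(predict_fn, image, augment_fn, n_tta=5, rng=None):
    """Average the predictions of one model over n_tta random augmentations."""
    if rng is None:
        rng = np.random.default_rng(0)
    preds = [predict_fn(augment_fn(image, rng)) for _ in range(n_tta)]
    return np.mean(preds, axis=0)


def ensemble_folds(fold_predict_fns, image, augment_fn, n_tta=5):
    """Average the TTA predictions of every fold model."""
    fold_preds = [
        predict_with_tta(fn, image, augment_fn, n_tta) for fn in fold_predict_fns
    ]
    return np.mean(fold_preds, axis=0)


# Usage with two stand-in "fold models" that return fixed probabilities.
fold_models = [
    lambda x: np.array([0.2, 0.4]),
    lambda x: np.array([0.4, 0.8]),
]
averaged = ensemble_folds(fold_models, np.zeros((4, 4)), random_hflip, n_tta=5)
```

Averaging probabilities is only one way to combine folds; the post does not say which averaging was used.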

I've also shared the source code on github in case you are interested [2].

The github repo lets you train a basic single model as a baseline which probably scores 0.070 to 0.072 on public LB. This baseline model takes 20 hours to train with a single 1080ti.
If you have any questions please ask. Have fun.

#--------------------------------------------Replies below---------------------------------------------------------------------

Could you please shed some light on how you preprocess the images? In particular, when you apply windowing, what do the values that you subtract from the image array (0, -20, -150) mean? Are they just image.min()?
image1 = (image1 - 0) / 80
image2 = (image2 - (-20)) / 200
image3 = (image3 - (-150)) / 380
Also, in your code, if policy==1 you subtract image.min() and divide by (image3.max()-image3.min()). Why do you divide by the max only if policy==2? I am just trying to grasp the intuition behind this, thank you.

Reply
It does the same thing. After windowing with fixed values you know the theoretical minimum and maximum values and I used these values for min-max normalization.
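
The reply can be made concrete with a small sketch. It assumes the input is already in Hounsfield units and that each quoted line means "clip to a fixed window, then subtract the window's lower bound and divide by its width". Mapping the three pairs of constants to the names brain, blood and soft tissue follows the order of the tips and is also an assumption; `apply_window` and `three_window_image` are hypothetical helpers, not functions from [2].

```python
import numpy as np

# (lower bound, width), read off the three quoted lines.
# The window names are assumed from the order in the original tips.
WINDOWS = {
    "brain": (0, 80),
    "blood": (-20, 200),
    "soft_tissue": (-150, 380),
}


def apply_window(hu, lower, width):
    """Clip to [lower, lower + width], then min-max normalize to [0, 1].

    After clipping, the theoretical minimum is `lower` and the theoretical
    maximum is `lower + width`, so (x - lower) / width is exactly min-max
    normalization with known bounds.
    """
    clipped = np.clip(hu, lower, lower + width)
    return (clipped - lower) / width


def three_window_image(hu):
    """Stack the three windowed copies as channels of one image."""
    channels = [apply_window(hu, lower, width) for lower, width in WINDOWS.values()]
    return np.stack(channels, axis=-1)


hu = np.array([[-1000.0, 0.0, 40.0], [80.0, 180.0, 3000.0]])
image = three_window_image(hu)  # shape (2, 3, 3), values in [0, 1]
```

Because the bounds are fixed, every image is scaled the same way, which is not the case when image.min() and image.max() are computed per image.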

#-----------------------------------------------------------------------------------------------------------------

Thank you for sharing. I have a question about the part of your code that builds the adjacent labels. Your code is as follows:
for j, id in enumerate(group.ID):
    if j == 0:
        left = labels[j-1]
    else:
        left = ''
    if j+1 == len(labels):
        right = ''
    else:
        right = labels[j+1]
As written, the left label is '' for every j except j == 0, but it should be '' if and only if j == 0. Would it be better to modify the code as follows?
for j, id in enumerate(group.ID):
    if j == 0:
        left = ''
    else:
        left = labels[j-1]
    if j+1 == len(labels):
        right = ''
    else:
        right = labels[j+1]

Reply
Hi, thank you very much for pointing this out.
I made a simple mistake and your implementation is correct.
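
For reference, the corrected logic can be written as a small standalone function. This is a sketch: `adjacent_labels` is a hypothetical name, and `labels` is assumed to be a list of label strings for one series, ordered by slice position.

```python
def adjacent_labels(labels):
    """Return (left, current, right) for every slice in one series.

    The first slice has no left neighbour and the last slice has no right
    neighbour, so those positions get an empty string.
    """
    rows = []
    for j, label in enumerate(labels):
        left = labels[j - 1] if j > 0 else ''
        right = labels[j + 1] if j + 1 < len(labels) else ''
        rows.append((left, label, right))
    return rows


rows = adjacent_labels(['a', 'b', 'c'])
```

In the original version, `labels[j-1]` with j == 0 evaluates to `labels[-1]`, so the first slice silently received the label of the last slice as its left neighbour.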

LeftLabel and RightLabel are actually not used in the code and do not affect the model. However, this feature could be used to improve the training process by giving some score to the adjacent images, which is why I left them there even though they are not used.

If the target is something like
[1 1 0 1 0 0 1 0 0]

You can spread the score like this
[1 1 0.4 1 0.2 0.2 1 0.2 0]

just an idea.
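
One rule that reproduces the example above is: keep positive slices at 1, and give every negative slice 0.2 for each directly adjacent positive slice. The author does not state the rule, so this is inferred from the two vectors; `spread_labels` and `neighbor_score` are hypothetical names.

```python
def spread_labels(labels, neighbor_score=0.2):
    """Soften 0/1 slice labels using the labels of the adjacent slices."""
    soft = []
    for i, y in enumerate(labels):
        if y == 1:
            soft.append(1.0)
            continue
        # Count the direct neighbours (left and right) that are positive.
        positive_neighbors = 0
        if i > 0 and labels[i - 1] == 1:
            positive_neighbors += 1
        if i + 1 < len(labels) and labels[i + 1] == 1:
            positive_neighbors += 1
        soft.append(neighbor_score * positive_neighbors)
    return soft


soft = spread_labels([1, 1, 0, 1, 0, 0, 1, 0, 0])
# soft == [1.0, 1.0, 0.4, 1.0, 0.2, 0.2, 1.0, 0.2, 0.0]
```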
#----------------------------------------------------------------------------------------------------------------
Single fold PB: 0.074

#----------------------------------------------------------------------------------------------------------------

If you run it without looking at the code, you probably get 0.4, but if you study the code, you may learn a lot.

#----------------------------------------------------------------------------------------------------------------

Mentioned in the comments:

self.df = apply_dataset_policy(self.df, self.cfg.dataset_policy)
# self.df = self.df.sample(560)  # used to test whether the whole pipeline runs end to end

#----------------------------------------------------------------------------------------------------------------

For reproducibility, the following setting needs to be True; if you do not care about reproducing results exactly, it can be set to False.
torch.backends.cudnn.deterministic = True
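
This flag is usually set together with fixed random seeds. The sketch below shows one common arrangement; `seed_everything` is a hypothetical helper, not a function from [2], and NumPy and PyTorch are imported lazily so the function still runs where they are not installed. Turning off `torch.backends.cudnn.benchmark` alongside the deterministic flag is an assumption about typical usage, not something the post states.

```python
import random


def seed_everything(seed=42, deterministic=True):
    """Seed the common random number generators and set the cuDNN flags."""
    random.seed(seed)
    try:
        import numpy as np
        np.random.seed(seed)
    except ImportError:
        pass
    try:
        import torch
    except ImportError:
        return
    torch.manual_seed(seed)
    torch.cuda.manual_seed_all(seed)
    # Deterministic cuDNN kernels can be slower; set deterministic=False
    # if exact reproducibility is not needed.
    torch.backends.cudnn.deterministic = deterministic
    torch.backends.cudnn.benchmark = not deterministic


seed_everything(123)
```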

Reference:

[1]https://www.kaggle.com/c/rsna-intracranial-hemorrhage-detection/discussion/112819#latest-652212

[2]https://github.com/appian42/kaggle-rsna-intracranial-hemorrhage

[3]https://github.com/appian42/kaggle-rsna-intracranial-hemorrhage/blob/master/conf/model001.py
