[Thinking][Simulation] Scholomance Academy: 45th ICPC Regional Contest, Shenyang Site, Problem K
Problem Description
As a student of the Scholomance Academy, you are studying a course called Machine Learning. You are currently working on your course project: training a binary classifier.
A binary classifier is an algorithm that predicts the classes of instances, which may be positive ($+$) or negative ($-$). A typical binary classifier consists of a scoring function $S$ that gives a score for every instance and a threshold $\theta$ that determines the category. Specifically, if the score of an instance $S(x) \geq \theta$, then the instance $x$ is classified as positive; otherwise, it is classified as negative. Clearly, choosing different thresholds may yield different classifiers.
Of course, a binary classifier may have misclassification: it could either classify a positive instance as negative (false negative) or classify a negative instance as positive (false positive).
Given a dataset and a classifier, we may define the true positive rate ($\mathrm{TPR}$) and the false positive rate ($\mathrm{FPR}$) as follows:
$$\mathrm{TPR} = \frac{\#\mathrm{TP}}{\#\mathrm{TP} + \#\mathrm{FN}}, \quad \mathrm{FPR} = \frac{\#\mathrm{FP}}{\#\mathrm{TN} + \#\mathrm{FP}}$$
where $\#\mathrm{TP}$ is the number of true positives in the dataset; $\#\mathrm{FP}$, $\#\mathrm{TN}$, $\#\mathrm{FN}$ are defined likewise.
Now you have trained a scoring function, and you want to evaluate the performance of your classifier. The classifier may exhibit different TPR and FPR if we change the threshold $\theta$. Let $\mathrm{TPR}(\theta)$, $\mathrm{FPR}(\theta)$ be the TPR and FPR when the threshold is $\theta$, and define the area under curve ($\mathrm{AUC}$) as
$$\mathrm{AUC} = \int_{0}^{1} \max_{\theta \in \mathbb{R}} \{\mathrm{TPR}(\theta) \mid \mathrm{FPR}(\theta) \leq r\} \, dr$$
where the integrand, called the receiver operating characteristic (ROC), is the maximum possible $\mathrm{TPR}$ given that $\mathrm{FPR} \leq r$.
Given the actual classes and predicted scores of the instances in a dataset, can you compute the $\mathrm{AUC}$ of your classifier?
For example, consider the third sample. If we set the threshold $\theta = 30$, there are 3 true positives, 2 false positives, 2 true negatives, and 1 false negative; hence, $\mathrm{TPR}(30) = 0.75$ and $\mathrm{FPR}(30) = 0.5$. Also, as $\theta$ varies, we may plot the ROC curve and compute the AUC accordingly, as shown in Figure 1.
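To make the definitions concrete, here is a minimal sketch that counts the four confusion-matrix entries for one threshold and derives both rates from them. The `Instance` struct and the functions `rates` and `sample3` are names introduced here for illustration only; they do not appear in the solution further down.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Hypothetical helper type: one labelled instance of the dataset.
struct Instance {
    bool positive;    // actual class: true for '+', false for '-'
    long long score;  // predicted score S(x)
};

// Returns {TPR, FPR} for the given threshold theta.
// Assumes the data contains at least one instance of each class.
std::pair<double, double> rates(const std::vector<Instance>& data, long long theta) {
    long long tp = 0, fp = 0, tn = 0, fn = 0;
    for (const Instance& d : data) {
        bool predictedPositive = d.score >= theta;
        if (d.positive) (predictedPositive ? tp : fn)++;
        else            (predictedPositive ? fp : tn)++;
    }
    assert(tp + fn > 0 && tn + fp > 0);
    return {double(tp) / (tp + fn), double(fp) / (tn + fp)};
}

// The third sample from the statement.
std::vector<Instance> sample3() {
    return {{true, 34}, {true, 33}, {true, 26}, {false, 34},
            {false, 38}, {true, 39}, {false, 7}, {false, 27}};
}
```

Calling `rates(sample3(), 30)` reproduces the values quoted in the statement: a TPR of 0.75 and an FPR of 0.5.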
Input Description:
The first line contains a single integer $n$ ($2 \leq n \leq 10^6$), the number of instances in the dataset. Then follow $n$ lines, each line containing a character $c \in \{+, -\}$ and an integer $s$ ($1 \leq s \leq 10^9$), denoting the actual class and the predicted score of an instance. It is guaranteed that there is at least one instance of either class.
Output Description:
Print the $\mathrm{AUC}$ of your classifier within an absolute error of no more than $10^{-9}$.
Sample 1
Input
3
+ 2
- 3
- 1
Output
0.5
Sample 2
Input
6
+ 7
- 2
- 5
+ 4
- 2
+ 6
Output
0.888888888888889
Sample 3
Input
8
+ 34
+ 33
+ 26
- 34
- 38
+ 39
- 7
- 27
Output
0.5625
Note
ROC and AUC of the third sample data.
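Summary: The statement is extremely long and really tests one's patience. There is a classifier that, given a threshold θ, labels each instance + or -: an instance whose score is at least θ is labelled +, and one whose score is below θ is labelled -. We are given the scores of n instances together with their true classes. Let FPR be the number of truly negative instances that the classifier labels + divided by the number of truly negative instances, and let TPR be the number of truly positive instances that the classifier labels + divided by the number of truly positive instances. Clearly FPR and TPR are functions of θ. Letting θ range over the real numbers yields a set of points (FPR(θ), TPR(θ)) in the plane whose axes are FPR and TPR. Define f(r) as the maximum TPR among the points whose FPR is at most r, and find the integral of f over [0, 1].
Analysis: Every point with a nonzero coordinate is obtained when θ equals one of the instance scores, so enumerating the scores as thresholds yields all of them; the only other point, (0, 0) for θ above every score, has TPR 0 and adds no area. By its definition f is a step function: it is piecewise constant, each piece is a horizontal segment, and f is non-decreasing. The integral is therefore a sum of rectangle areas, so a for loop over the breakpoints that accumulates width times height is enough.
The code is as follows: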
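The rectangle summation described above can also be sketched as a single sweep over the instances in descending score order, which avoids both the binary searches and the map. This is an illustrative alternative, not the solution given in this post; the function name `aucBySweep` and the `{score, isPositive}` pair layout are assumptions made for the example.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <functional>
#include <utility>
#include <vector>

// Each instance is {score, isPositive}.
// Assumes at least one instance of each class.
long double aucBySweep(std::vector<std::pair<long long, bool>> data) {
    // Lowering the threshold admits instances in descending score order.
    std::sort(data.begin(), data.end(),
              std::greater<std::pair<long long, bool>>());
    long long tp = 0, fp = 0, area = 0;
    std::size_t i = 0;
    while (i < data.size()) {
        std::size_t j = i;
        long long newTp = 0, newFp = 0;
        // Instances with equal scores cross the threshold together.
        while (j < data.size() && data[j].first == data[i].first) {
            if (data[j].second) ++newTp; else ++newFp;
            ++j;
        }
        // Over the FP range [fp, fp + newFp) the best reachable TP
        // count is still the old tp, so add a rectangle of that height.
        area += newFp * tp;
        tp += newTp;
        fp += newFp;
        i = j;
    }
    // tp and fp now equal the class totals; normalise both axes to [0, 1].
    assert(tp > 0 && fp > 0);
    return (long double)area / tp / fp;
}
```

For the third sample the sweep accumulates an area of 9 in count units, and dividing by 4 × 4 gives 0.5625. Grouping equal scores matters: handling the tied pair (+ 34, - 34) one instance at a time could credit the negative with a positive that is not yet reachable at a smaller FPR.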
#include<cstdio>
#include<cstring>
#include<algorithm>
#include<iostream>
#include<queue>
#include<map>
#define int long long
#define double long double
using namespace std;
const int N = 1e6+10;
typedef pair<int,int> PII;
map<int,int> mp;            // mp[x]: largest #TP among thresholds with #FP == x
int a[N];
int p[N],ne[N],cnt1,cnt2;   // scores and counts of the + / - instances
signed main()
{
    int n;
    cin >> n;
    char t[2];
    for(int i = 1; i <= n; i++){
        scanf("%s%lld", t, &a[i]);
        if(t[0] == '+') p[cnt1++] = a[i];
        else ne[cnt2++] = a[i];
    }
    sort(p,p+cnt1);
    sort(ne,ne+cnt2);
    double pp = 0;
    if(cnt2 == 0){          // cannot happen: the input guarantees both classes
        printf("%.9Lf\n",pp);
        return 0;
    }
    // Use every score as the threshold: x = #FP, y = #TP (scores >= threshold).
    for(int i=0;i<cnt2;i++){
        int x = cnt2 - (lower_bound(ne,ne+cnt2,ne[i]) - ne);
        int y = cnt1 - (lower_bound(p,p+cnt1,ne[i]) - p);
        mp[x] = max(mp[x],y);
    }
    for(int i=0;i<cnt1;i++){
        int x = cnt2 - (lower_bound(ne,ne+cnt2,p[i]) - ne);
        int y = cnt1 - (lower_bound(p,p+cnt1,p[i]) - p);
        mp[x] = max(mp[x],y);
    }
    // Sum the rectangles under the step function.
    // mp[0] defaults to 0, which is the point (0, 0).
    int xl = 0,y = mp[0],ans = 0;
    for(map<int,int>::iterator it = mp.begin();it != mp.end();it++){
        int xr = it->first;
        ans += (xr-xl)*y;   // height y holds on [xl, xr)
        y = it->second;
        xl = xr;
    }
    // Convert both axes from counts to rates.
    printf("%.9Lf\n",(double)ans/cnt1/cnt2);
    return 0;
}
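For small inputs the result can be cross-checked with a brute-force count. Under this problem's step-function definition, each negative instance contributes a rectangle of unit width whose height is the number of positives with a strictly higher score, so the AUC equals the fraction of (positive, negative) pairs in which the positive scores strictly higher. The function `aucByPairs` below is a hypothetical checker written for this note, not part of the submitted solution, and its quadratic running time makes it unsuitable for the full limits.

```cpp
#include <cassert>
#include <vector>

// Hypothetical brute-force checker: counts (positive, negative) pairs in
// which the positive instance has a strictly higher score. Ties contribute
// nothing, because an equal-scored negative crosses the threshold together
// with the positive.
long double aucByPairs(const std::vector<long long>& pos,
                       const std::vector<long long>& neg) {
    assert(!pos.empty() && !neg.empty());
    long long good = 0;
    for (long long p : pos)
        for (long long q : neg)
            if (p > q) ++good;
    return (long double)good / pos.size() / neg.size();
}
```

On the third sample, `aucByPairs({34, 33, 26, 39}, {34, 38, 7, 27})` counts 9 of the 16 pairs, which agrees with the expected output 0.5625.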