Problem Description

As a student of the Scholomance Academy, you are studying a course called Machine Learning. You are currently working on your course project: training a binary classifier.

A binary classifier is an algorithm that predicts the classes of instances, which may be positive ($+$) or negative ($-$). A typical binary classifier consists of a scoring function $S$ that gives a score for every instance and a threshold $\theta$ that determines the category. Specifically, if the score of an instance $S(x) \geq \theta$, then the instance $x$ is classified as positive; otherwise, it is classified as negative. Clearly, choosing different thresholds may yield different classifiers.

Of course, a binary classifier may have misclassification: it could either classify a positive instance as negative (false negative) or classify a negative instance as positive (false positive).

Given a dataset and a classifier, we may define the true positive rate ($TPR$) and the false positive rate ($FPR$) as follows:

$$TPR = \frac{\#TP}{\#TP + \#FN}, \quad FPR = \frac{\#FP}{\#TN + \#FP}$$

where $\#TP$ is the number of true positives in the dataset; $\#FP$, $\#TN$, $\#FN$ are defined likewise.

Now you have trained a scoring function, and you want to evaluate the performance of your classifier. The classifier may exhibit different TPR and FPR if we change the threshold $\theta$. Let $TPR(\theta)$, $FPR(\theta)$ be the $TPR$, $FPR$ when the threshold is $\theta$, and define the area under curve ($AUC$) as

$$AUC = \int_{0}^{1} \max_{\theta \in \mathbb{R}} \{TPR(\theta) \mid FPR(\theta) \leq r\} \, dr$$

where the integrand, called the receiver operating characteristic (ROC), means the maximum possible $TPR$ given that $FPR \leq r$.

Given the actual classes and predicted scores of the instances in a dataset, can you compute the $AUC$ of your classifier?

For example, consider the third test data. If we set threshold $\theta = 30$, there are 3 true positives, 2 false positives, 2 true negatives, and 1 false negative; hence, $TPR(30) = 0.75$ and $FPR(30) = 0.5$. Also, as $\theta$ varies, we may plot the ROC curve and compute the AUC accordingly, as shown in Figure 1.
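The counting in this example can be sketched in a few lines of C++. This is only an illustration of the definitions above: the `Instance` struct and the `ratesAt` function are hypothetical names introduced here, not part of the original statement or solution.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// One labelled instance (hypothetical helper type):
// positive == true means the actual class is '+'.
struct Instance {
    bool positive;
    long long score;
};

// Returns {TPR, FPR} for threshold theta, where an instance is
// predicted positive exactly when score >= theta.
std::pair<double, double> ratesAt(const std::vector<Instance>& data, long long theta) {
    long long tp = 0, fp = 0, tn = 0, fn = 0;
    for (const Instance& x : data) {
        bool predictedPositive = x.score >= theta;
        if (x.positive) {
            if (predictedPositive) ++tp; else ++fn;
        } else {
            if (predictedPositive) ++fp; else ++tn;
        }
    }
    // The statement guarantees at least one instance of each class.
    assert(tp + fn > 0 && tn + fp > 0);
    return {double(tp) / double(tp + fn), double(fp) / double(tn + fp)};
}
```

Calling `ratesAt` on the eight instances of the third sample with `theta = 30` should give the pair (0.75, 0.5) quoted above.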

Input Description:

The first line contains a single integer $n$ ($2 \leq n \leq 10^6$), the number of instances in the dataset. Then follow $n$ lines, each line containing a character $c \in \{+, -\}$ and an integer $s$ ($1 \leq s \leq 10^9$), denoting the actual class and the predicted score of an instance. It is guaranteed that there is at least one instance of either class.

Output Description:

Print the $AUC$ of your classifier within an absolute error of no more than $10^{-9}$.

Example 1

Input

3
+ 2
- 3
- 1

Output

0.5
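
As a hand check of this sample (a worked sketch, not part of the original statement): the positive instance has score 2 and the negatives have scores 3 and 1, so the possible thresholds fall into four ranges. The ROC value is 0 for $r < 1/2$ and 1 for $r \geq 1/2$, which gives the printed answer.

```latex
\begin{aligned}
\theta > 3 &: \quad (FPR, TPR) = (0, 0) \\
2 < \theta \leq 3 &: \quad (FPR, TPR) = (1/2, 0) \\
1 < \theta \leq 2 &: \quad (FPR, TPR) = (1/2, 1) \\
\theta \leq 1 &: \quad (FPR, TPR) = (1, 1)
\end{aligned}
\qquad\Longrightarrow\qquad
AUC = \int_{0}^{1/2} 0 \, dr + \int_{1/2}^{1} 1 \, dr = \frac{1}{2}
```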

Example 2

Input

6
+ 7
- 2
- 5
+ 4
- 2
+ 6

Output

0.888888888888889

Example 3

Input

8
+ 34
+ 33
+ 26
- 34
- 38
+ 39
- 7
- 27

Output

0.5625

Note

Figure 1: ROC and $AUC$ of the third sample data.

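Problem summary: The statement is extremely long and really tests one's patience. There is a classifier that labels each instance as + or - according to a chosen threshold θ: an instance whose score is at least θ is labelled +, and one whose score is less than θ is labelled -. We are given the scores of n instances together with their actual classes. Define FPR as (the number of instances whose actual class is - but which the classifier labels +) / (the number of instances whose actual class is -), and TPR as (the number of instances whose actual class is + and which the classifier labels +) / (the number of instances whose actual class is +). Clearly FPR and TPR are functions of θ. Letting θ range over all real numbers yields a set of pairs (FPR(θ), TPR(θ)), that is, a set of scattered points in the plane whose axes are FPR and TPR. Define f(r) as the maximum TPR among the points whose FPR is at most r, and compute the integral of f over [0, 1].

Analysis: Every scattered point except (0, 0) is reached by setting θ to one of the instance scores, so enumerating the scores yields all of them; the point (0, 0) corresponds to a θ larger than every score and only contributes TPR 0. By its definition, f is a piecewise function whose pieces are horizontal segments, and f is non-decreasing. The integral is therefore a sum of rectangle areas: loop over the breakpoints and accumulate.

The code is as follows: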
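
The rectangle sum above can also be read as a pair count, which gives a shorter sketch. With P positives and N negatives, for r strictly between (k-1)/N and k/N the threshold must exceed the k-th largest negative score, so f(r) equals the fraction of positives strictly above that score. Summing over k, the integral becomes (number of (positive, negative) pairs in which the positive score is strictly larger) / (P * N). This reformulation is an assumption of this sketch rather than the method used in the original code, and the function name `aucByPairs` is hypothetical; it does reproduce the three sample answers.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// AUC as defined in the statement, computed by counting pairs
// (positive, negative) whose positive score is strictly larger.
// Ties count as 0 because the ROC here is a step function.
long double aucByPairs(std::vector<long long> pos, std::vector<long long> neg) {
    // The statement guarantees at least one instance of each class.
    assert(!pos.empty() && !neg.empty());
    std::sort(pos.begin(), pos.end());
    std::sort(neg.begin(), neg.end());

    long long pairs = 0;    // at most 2.5e11 for n = 1e6, fits in long long
    std::size_t below = 0;  // negatives with score < current positive score
    for (long long s : pos) {
        // pos is sorted, so the pointer into neg only moves forward.
        while (below < neg.size() && neg[below] < s) ++below;
        pairs += static_cast<long long>(below);
    }
    return static_cast<long double>(pairs) / pos.size() / neg.size();
}
```

For the second sample, the positives {7, 4, 6} beat the negatives {2, 5, 2} in 3 + 2 + 3 = 8 of the 9 pairs, matching the expected 0.888888888888889. After sorting, the two-pointer sweep is linear, so the whole sketch runs in O(n log n).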

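#include <cstdio>
#include <algorithm>
#include <map>
using namespace std;
typedef long long ll;
const int N = 1e6 + 10;
map<ll, ll> mp;   // mp[fp] = largest TP count among thresholds with exactly fp false positives
ll p[N], ne[N];   // scores of the positive / negative instances
ll cnt1, cnt2;    // number of positive / negative instances

int main()
{
    int n;
    scanf("%d", &n);
    char c[2];
    for (int i = 1; i <= n; i++) {
        ll s;
        scanf("%1s%lld", c, &s);
        if (c[0] == '+') p[cnt1++] = s;
        else ne[cnt2++] = s;
    }
    sort(p, p + cnt1);
    sort(ne, ne + cnt2);
    long double pp = 0;
    if (cnt2 == 0) {  // cannot happen under the input guarantee
        printf("%.9Lf\n", pp);
        return 0;
    }
    // Use every negative score as the threshold.
    for (int i = 0; i < cnt2; i++) {
        ll fp = cnt2 - (lower_bound(ne, ne + cnt2, ne[i]) - ne);  // negatives >= threshold
        ll tp = cnt1 - (lower_bound(p, p + cnt1, ne[i]) - p);     // positives >= threshold
        mp[fp] = max(mp[fp], tp);
    }
    // Use every positive score as the threshold.
    for (int i = 0; i < cnt1; i++) {
        ll fp = cnt2 - (lower_bound(ne, ne + cnt2, p[i]) - ne);
        ll tp = cnt1 - (lower_bound(p, p + cnt1, p[i]) - p);
        mp[fp] = max(mp[fp], tp);
    }
    // Sum the rectangles between consecutive breakpoints. Reading mp[0]
    // creates the key 0 with value 0 if no threshold gave zero false positives.
    ll xl = 0, y = mp[0], ans = 0;
    for (map<ll, ll>::iterator it = mp.begin(); it != mp.end(); it++) {
        ll xr = it->first;
        ans += (xr - xl) * y;  // width times the height to the left of xr
        y = it->second;
        xl = xr;
    }
    // ans is measured in units of 1 / (cnt1 * cnt2).
    printf("%.9Lf\n", (long double)ans / cnt1 / cnt2);
    return 0;
}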

[Thinking][Simulation] Scholomance Academy, Problem K of the 45th ICPC Regional Contest, Shenyang site
