Denied a University Place by an Algorithm

Data Science

Nina was close to tears when she accused Nick Gibb of ruining her life. Nina is an 18-year-old about to leave school and go on to higher education; Gibb is the UK government’s schools minister.

The occasion was a live BBC radio programme called “Any Questions”, a popular show where politicians and others discuss questions posed by the public.

Nina was on track to be accepted by the most prestigious veterinary college in the country; she just needed to get the right grades in her A-level exams and her teachers predicted that she would indeed get those grades.

But, because of the pandemic, the exams never took place. Instead, the government sanctioned the use of an algorithm to assign grades to students.

This algorithm reduced the grades for nearly 40% of students. Nina’s were so much lower than the ones predicted for her that, not only would she not be able to get into her preferred college, but no other college would accept her, either.

Mr Gibb was reassuring and told the audibly upset Nina that this was a mistake that would be put right.

And it was. A few days later, after much argument, protest and controversy, and to the great relief of many students who were in a similar situation to Nina, the algorithm was dumped in favour of student grades predicted by schools.

So how did the government get itself into this mess?

The concern was that just using predicted grades would result in grade inflation because teachers tend to be more optimistic about their students’ abilities than is borne out in actual exam results.

So an algorithm was devised that was based not so much on the academic record of the individual student as on the record of the school. A results profile was constructed for the school (how many As previous cohorts achieved, how many Bs, Cs and so on) and the individual students were positioned on that profile. The ‘exam results’ were then calculated for each student depending on their position in the profile.
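
The positioning step described above can be sketched roughly as follows. This is a minimal illustration and not the algorithm that was actually used: the function name `assign_grades`, the rounding rule and the example profile are all assumptions made here for clarity. It takes students already ranked from strongest to weakest, plus a historical grade profile for the school (shares that sum to 1), and hands out grades in rank order.

```python
from typing import Dict, List


def assign_grades(ranked_students: List[str],
                  profile: Dict[str, float]) -> Dict[str, str]:
    """Assign grades by position in the school's historical profile.

    ranked_students: names ordered strongest to weakest (hypothetical input).
    profile: grade -> share of past cohorts awarded it, best grade first.
    """
    n = len(ranked_students)
    grades = list(profile)  # dicts preserve insertion order
    result: Dict[str, str] = {}
    cumulative = 0.0
    start = 0
    for i, grade in enumerate(grades):
        cumulative += profile[grade]
        # The last grade absorbs any leftover students caused by rounding.
        end = n if i == len(grades) - 1 else round(cumulative * n)
        for student in ranked_students[start:end]:
            result[student] = grade
        start = max(start, end)
    return result


# Invented example: ten students, a school that historically got
# 20% As, 30% Bs and 50% Cs.
ranked = [f"s{i}" for i in range(10)]
print(assign_grades(ranked, {"A": 0.2, "B": 0.3, "C": 0.5}))
```

In this sketch a student's own record matters only in so far as it fixes their rank; the number of each grade available is set entirely by the school's past results.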

Using statistics from previous years to predict future results is not an unusual thing to do. Indeed, statistical models like these are very useful for government and corporate planning.

However, while the predicted results may have been in line with previous years, and may well have reflected what the overall outcome of this year’s exams would have been, there are at least two glaring problems.

The first is that some schools are getting better, so judging their performance on the previous year would produce a worse result than is fair. This would affect all of the students at that school.

Secondly, not all cohorts are the same. There will always be a number of high flyers in a school but this number will vary. If there are fewer in a particular school this year, then the statistical approach taken would artificially promote some more mediocre students into that category. They probably won’t complain.

But if there are a larger number of gifted students in this year’s cohort then they will be unfairly downgraded and, like Nina, be awarded grades below those that they could have achieved in a real exam.
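
To make the downgrade concrete, here is a small worked example using invented numbers: a hypothetical school whose past cohorts earned two A grades per ten students, and a hypothetical cohort this year in which teachers predict four As. The helper `count_downgraded` is an illustrative name, not part of any real system, and the fixed quota is a simplifying assumption.

```python
def count_downgraded(predicted: list, historical_a_share: float) -> int:
    """Count students predicted an A who would lose it under a fixed quota.

    predicted: teacher-predicted grades for one cohort (invented data).
    historical_a_share: fraction of past cohorts awarded an A (assumed).
    """
    a_quota = round(historical_a_share * len(predicted))
    predicted_as = sum(1 for g in predicted if g == "A")
    # Only a_quota students can receive an A, whatever their ability.
    return max(0, predicted_as - a_quota)


this_year = ["A", "A", "A", "A", "B", "B", "C", "C", "C", "C"]
print(count_downgraded(this_year, 0.2))  # prints 2
```

With these made-up figures, two of the four students predicted an A would be moved down simply because the school's history leaves room for only two.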

These additional gifted students may well be cheated out of their place at a good university or college.

In a further twist, the algorithm was not applied to small cohorts because the statistics for these smaller groups are not reliable. For small classes, which are often found in private schools, the teacher-predicted grades were used. Thus smaller private schools got the benefit of generous teacher assessment while the larger state schools did not. Not an equitable situation.

So should we chuck out algorithms altogether? No. There are many situations where the use of algorithms and statistics is entirely suitable for prediction purposes. How many beds should a hospital leave free during the flu season? Previous numbers of flu cases can inform the calculation. When Netflix or Amazon recommend a movie to watch or a product to buy, they are using statistics from people who they think are like you. But if they get it wrong, you end up watching a movie that you don’t enjoy or rejecting a product that you don’t want to buy. This is not life changing.

When statistics are used to make decisions that are profound, using someone else’s data just won’t do. It might produce an overall pattern that is satisfying to the designers, or policy makers, but it can unfairly disadvantage individuals.

Making decisions about someone’s personal life by using data from people from a similar background is wrong and unethical. Nina’s grades were calculated by looking at students who were similar to her. But they were not her. If such important decisions are to be made about someone’s life then it is only that person’s data that should be taken into account.

And maybe it should not be an algorithm that makes that decision.

Translated from: https://medium.com/swlh/denied-a-university-place-by-an-algorithm-ba3449a5d414
