This article is also a part of the Tech in Policy publication. TIP focuses on technology being used for good and shines a light on its more malicious or neglectful implementations.

Whether or not you're aware of it, your Google searches, questions posed to Siri, and Facebook timeline all rely on artificial intelligence (AI) to perform effectively. Artificial intelligence is the simulation of human intelligence processes by machines, and its goal is to build models that can perform specific tasks as intelligently as humans can, if not better. Much of the AI you encounter on a daily basis uses a technique known as machine learning, which uses predictive modeling to generate accurate predictions from large quantities of data. Because predictive models are built to find relational patterns in data, they learn to favor efficiency over fairness, and models trained on unrepresentative data cement that bias into the algorithms built on top of them.

Most datasets used to inform artificial intelligence models contain little to no genetic variation; in fact, starting datasets are 81 percent European on average. This lack of diversity in mass data collection has contributed to racial disparities in cancer-specific mortality, to treatments tailored to European Americans, and to disparities in dermatological diagnoses. While the scarcity of representative data and of government regulation on automated decision-making has fueled race-related controversy in mainstream media, algorithmic bias has also entrenched racial disparities in healthcare accessibility in the United States.

United States federal law defines protected characteristics that cannot be used in decision-making processes. These protected attributes include, but are not limited to, sex, race, disability, and religion. In machine learning, however, bias arises when a combination of training features that closely tracks a protected class is fed into the model's algorithms. Even if the protected classes themselves are excluded from the model's training data, other attributes can combine into a "close proxy" for any given protected class. The United States has no equivalent of the European Union's General Data Protection Regulation, which holds firms accountable for bias and discrimination in automated decisions, and bias in algorithm-driven healthcare contributes to less adequate care for marginalized patients.
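
The proxy effect is easy to reproduce. Below is a minimal, hypothetical sketch in Python (synthetic data, not any deployed system): the protected attribute is never shown to the model, but a correlated feature such as income is, and the model's decisions still split sharply along group lines.

```python
# Minimal synthetic sketch of a "close proxy": the protected attribute is
# withheld from training, yet a correlated feature reproduces the disparity.
# All variable names and numbers here are illustrative assumptions.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 10_000

group = rng.integers(0, 2, size=n)                          # hypothetical protected attribute
income = rng.normal(loc=40 + 20 * group, scale=5, size=n)   # proxy feature correlated with group

# Label the model is trained on (e.g., "high prior spending"), driven by income
# rather than by true medical need.
label = (income + rng.normal(0, 5, size=n) > 50).astype(int)

# Train WITHOUT the protected attribute; only the proxy feature is available.
model = LogisticRegression().fit(income.reshape(-1, 1), label)
pred = model.predict(income.reshape(-1, 1))

for g in (0, 1):
    print(f"group {g}: flagged {pred[group == g].mean():.1%} of patients")
```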

Suppose a machine-learning algorithm designed to decide who should receive additional health services uses the patient's previous healthcare expenditures as a benchmark. While direct information about race is not fed into the system, income is strongly correlated with race. Because of historic inequities and systemic racism in the United States, people of color are more likely to have lower incomes and to use medical services less frequently, in part due to a lack of trust in healthcare providers. Even when Black and white patients spent the same amount on healthcare, the two groups did not have the same level of need: Black patients typically paid more for active interventions such as emergency visits for diabetes or hypertension complications. Returning to the machine-learning algorithm, altering it to remedy this bias would increase the percentage of Black patients receiving additional help from 17.7 percent to 46.5 percent. By using prior spending, which tracks income, a close proxy for race, the system indirectly learns to base its eligibility decisions on race.
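
The label-choice problem can be illustrated with a small simulation. The sketch below uses synthetic numbers (it does not reproduce the study's 17.7 and 46.5 percent figures): two groups have the same distribution of medical need, but one spends less on care at the same level of need, so ranking patients by cost rather than by need under-selects that group for the extra-care program.

```python
# Synthetic illustration of cost used as a proxy for need; all numbers are assumptions.
import numpy as np

rng = np.random.default_rng(1)
n = 10_000
group = rng.integers(0, 2, size=n)              # 1 = hypothetical marginalized group
need = rng.gamma(shape=2.0, scale=1.0, size=n)  # true medical need, identical by group

# The marginalized group spends less on care at the same level of need
# (access barriers, distrust of providers, etc.).
spend = need * np.where(group == 1, 0.7, 1.0) + rng.normal(0, 0.1, size=n)

k = int(0.05 * n)                    # slots in the extra-care program
by_cost = np.argsort(-spend)[:k]     # enrollment ranked by spending (the biased benchmark)
by_need = np.argsort(-need)[:k]      # enrollment ranked by actual need

print(f"marginalized share, ranked by cost: {group[by_cost].mean():.1%}")
print(f"marginalized share, ranked by need: {group[by_need].mean():.1%}")
```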

Algorithmic biases present marginalized patients as healthier than they are, disqualifying them from the specialty care they need. An online breast cancer risk-assessment tool calculates a lower risk for Black and Latinx patients than for white patients with identical health-risk factors. The biased algorithm deters patients of color from seeking additional screening even though they are at much higher risk for the disease. In another example, patients visiting the emergency room with back or side pain are rated on a thirteen-point scale to determine whether their pain is related to kidney stones; a higher score means the pain is less likely to be due to kidney stones, and if you happen to be a Black patient, the assessment's algorithm adds three points to your score. To date, there is no empirical evidence suggesting that Black people's pain is less likely to indicate the presence of kidney stones. In a final disturbing example, an algorithm developed by the American Heart Association to predict mortality in patients with acute heart failure assigns non-Black patients three additional risk points, and those deemed at higher risk receive more care. The result is that non-Black patients are consistently more likely to be referred to specialized care. Remarkably, many algorithms widely used in the healthcare system continue to have a substantial negative impact on historically underrepresented people.
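
To make the mechanism concrete, here is a generic, hypothetical point-score sketch (it is not the actual kidney stone or heart failure formula, and the point values and cutoff are assumptions): a fixed race-based adjustment can push otherwise identical patients across the threshold at which their symptoms are dismissed.

```python
# Hypothetical point score with a fixed race adjustment; not a real clinical tool.
UNLIKELY_CUTOFF = 10  # scores at or above this are read as "condition unlikely"

def adjusted_score(base_points: int, is_black: bool, race_points: int = 3) -> int:
    """Return the score after the race-based adjustment described above."""
    return base_points + (race_points if is_black else 0)

for base in (8, 9):
    for is_black in (False, True):
        score = adjusted_score(base, is_black)
        verdict = "unlikely, no workup" if score >= UNLIKELY_CUTOFF else "likely, refer"
        print(f"base={base}, Black={is_black}: score={score} -> {verdict}")
```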

AI bias issues are in no way new to the computer science community; however, it is unclear whether the medical community even recognizes the problem. Because the United States federal government has addressed neither the lack of diversity in big data nor the absence of regulation on automated decision-making, algorithm-driven patient care continues to deliver less adequate care to the very people striving to push back against the tide of systemic racism. In contrast, European data regulations state that developers of machine automation must use appropriate mathematical and statistical techniques to ensure both that the risk of error is minimized and that discriminatory effects are prevented. Regardless of geography, it is undeniable that managing these prejudices requires careful attention to data, the use of artificial intelligence to help detect bias, and the building of diverse teams. The federal government must make those behind the automation ethically and legally obliged to keep AI on the side of fairness.

Translated from: https://medium.com/swlh/machine-learning-biases-might-define-minority-health-outcomes-a60f8f800fc8

