Scorecard Model Development (8) -- Master Scale Design and Model Validation

Reposted from: https://blog.csdn.net/lll1528238733/article/details/76601930

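The credit risk scorecard developed in the previous step assigns scores to customers of different risk levels. We still need to link those scores to default probabilities and rating symbols so that the securities firm can manage customers across its credit-risk-exposed businesses in a differentiated way. This requires one consistent master scale for the individual customers in all of those businesses. The most intuitive and practical approach is to divide default probability, from low to high, into intervals, which is like marking graduations on a ruler. With this ruler, individual customers in the different businesses that carry credit risk exposure can be assigned to credit grades, so that differences in grade distribution and in the level of credit risk across businesses are visible at a glance. This mapping between default probability and credit grade is called the master scale.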

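From the logistic regression equation, a customer's default probability is p = Odds/(1+Odds), that is, Odds = p/(1-p). Combining this with the scoring formula Score = A - B*log(Odds), which ties the score to Odds and therefore to the default probability, we can compute the default probability that corresponds to each score.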
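The conversion can be sketched in R as follows. This is a minimal illustration, not the author's code: the function name `score_to_pd` is hypothetical, the values A = 20 and B = 10 are placeholders (the real scaling constants come from the scorecard built in the previous step), and `log` is assumed to be the natural logarithm.

```r
# Hypothetical scaling constants, for illustration only.
A <- 20
B <- 10

# Invert Score = A - B * log(Odds) to recover the odds,
# then apply p = Odds / (1 + Odds).
score_to_pd <- function(score, A, B) {
  odds <- exp((A - score) / B)
  odds / (1 + odds)
}

# Default probabilities at the lowest score, a mid score and the highest score.
pd <- score_to_pd(c(-41, 20, 89), A, B)
print(pd)
```

With this formula a higher score always maps to lower odds and therefore to a lower default probability, which is the ordering the master scale relies on.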

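According to the standard credit risk scorecard, the highest possible score is 89 and the lowest is -41. We can therefore compute the default probability for every score in that range: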

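As Table 3.22 shows, we can simply map every 10 points to one credit grade and take the arithmetic mean of the default probabilities at adjacent scores as the average default probability of that grade. (Default probabilities computed this way are suitable only for risk ranking; they are not customers' true default probabilities.) This yields the final master scale and its internal credit grade mapping, Table 3.23: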
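One way to build such a table is sketched below. Everything specific in it is an assumption made for illustration: the scaling constants A and B, the helper `score_to_pd`, and the choice to place band edges on round tens from -50 to 90 so that the range -41 to 89 is covered. The band edges and grade symbols of the actual Table 3.23 may differ.

```r
# Hypothetical scaling constants, for illustration only.
A <- 20
B <- 10

# p = Odds / (1 + Odds) with Odds = exp((A - Score) / B)
score_to_pd <- function(score, A, B) {
  odds <- exp((A - score) / B)
  odds / (1 + odds)
}

# Assumed band edges: every 10 points, covering the score range -41..89.
breaks <- seq(-50, 90, by = 10)
pd_at_breaks <- score_to_pd(breaks, A, B)

# Average default probability of a band = arithmetic mean of the
# default probabilities at its two edges.
master_scale <- data.frame(
  score_low  = head(breaks, -1),
  score_high = tail(breaks, -1),
  avg_pd     = (head(pd_at_breaks, -1) + tail(pd_at_breaks, -1)) / 2
)

# Ordinal rank only (1 = highest-scoring band); not the real rating symbols.
master_scale$grade_rank <- rev(seq_len(nrow(master_scale)))
print(master_scale)
```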

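Once the master scale and the internal credit grades are fixed, the next task is model validation: discriminatory power, predictive accuracy and stability. Recall that during development we used random sampling to split the data into a development sample and a test sample, built the model on the development sample, and reserved the test sample for validation. For validation, therefore, we first use the finished model to rate every record in the test sample, and then compute discriminatory power and predictive accuracy from the rating results. The code that re-rates every record in the test sample with the finished model is as follows: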

# Drop column 21, then recode credit_risk as 0 (good) / 1 (bad)
tmp1 <- test_kfolddata[, -21]
credit_risk1 <- ifelse(test_kfolddata[, "credit_risk"] == "good", 0, 1)
data_tmp <- as.matrix(cbind(tmp1, credit_risk1))
## Reduce the levels of purpose (same merging for the test sample as in development) ##
for (i in 1:nrow(data_tmp)) {
  p <- as.character(data_tmp[i, "purpose"])
  # Merge car (new) and car (used)
  if (p %in% c("car (new)", "car (used)")) {
    data_tmp[i, "purpose"] <- "car(new/used)"
  }
  # Merge radio/television and furniture/equipment
  if (p %in% c("radio/television", "furniture/equipment")) {
    data_tmp[i, "purpose"] <- "radio/television/furniture/equipment"
  }
  # Merge others, repairs and business
  if (p %in% c("others", "repairs", "business")) {
    data_tmp[i, "purpose"] <- "others/repairs/business"
  }
  # Merge retraining and education
  if (p %in% c("retraining", "education")) {
    data_tmp[i, "purpose"] <- "retraining/education"
  }
}
## End of purpose level reduction ##
### Scorecard implemented in R ###
data1 <- as.data.frame(data_tmp)
tot <- nrow(data1)
score <- numeric(tot)
for (i in 1:tot) {
  lst <- as.matrix(data1[i, ])
  # as.matrix() yields character values, so convert the numeric fields back
  # before comparing; otherwise "12" <= 8 is evaluated as a string comparison.
  duration <- as.numeric(lst[, "duration"])
  amount <- as.numeric(lst[, "amount"])
  age <- as.numeric(lst[, "age"])
  installment_rate <- as.numeric(lst[, "installment_rate"])

  # duration
  score_duration <- NA
  if (duration <= 8) {
    score_duration <- 14
  } else if (duration > 8 & duration <= 33) {
    score_duration <- 1
  } else if (duration > 33) {
    score_duration <- -7
  }

  # amount
  score_amount <- NA
  if (amount <= 3913) {
    score_amount <- 3
  } else if (amount > 3913 & amount <= 9283) {
    score_amount <- -5
  } else if (amount > 9283) {
    score_amount <- -14
  }

  # age
  score_age <- NA
  if (age <= 34) {
    score_age <- -2
  } else if (age > 34) {
    score_age <- 3
  }

  # installment_rate
  score_installment_rate <- NA
  if (installment_rate == 1) {
    score_installment_rate <- 2
  } else if (installment_rate == 2) {
    score_installment_rate <- 5
  } else if (installment_rate == 3) {
    score_installment_rate <- -1
  } else if (installment_rate == 4) {
    score_installment_rate <- -6
  }

  # status
  score_status <- NA
  if (lst[, "status"] == "... < 100 DM") {
    score_status <- -10
  } else if (lst[, "status"] == "0 <= ... < 200 DM") {
    score_status <- -5
  } else if (lst[, "status"] == "... >= 200 DM / salary for at least 1 year") {
    score_status <- 5
  } else if (lst[, "status"] == "no checking account") {
    score_status <- 14
  }

  # credit_history
  score_credit_history <- NA
  if (lst[, "credit_history"] == "critical account/other credits existing") {
    score_credit_history <- 8
  } else if (lst[, "credit_history"] == "existing credits paid back duly till now") {
    score_credit_history <- -1
  } else if (lst[, "credit_history"] == "all credits at this bank paid back duly") {
    score_credit_history <- -10
  } else if (lst[, "credit_history"] == "delay in paying off in the past") {
    score_credit_history <- 0
  } else if (lst[, "credit_history"] == "no credits taken/all credits paid back duly") {
    score_credit_history <- -16
  }

  # savings
  score_savings <- NA
  if (lst[, "savings"] == "... < 100 DM") {
    score_savings <- -3
  } else if (lst[, "savings"] == "... >= 1000 DM") {
    score_savings <- 13
  } else if (lst[, "savings"] == "500 <= ... < 1000 DM") {
    score_savings <- 9
  } else if (lst[, "savings"] == "unknown/no savings account") {
    score_savings <- 9
  } else if (lst[, "savings"] == "100 <= ... < 500 DM") {
    score_savings <- -2
  }

  # property
  score_property <- NA
  if (lst[, "property"] == "unknown/no property") {
    score_property <- -4
  } else if (lst[, "property"] == "real estate") {
    score_property <- 3
  } else if (lst[, "property"] == "building society savings agreement/life insurance") {
    score_property <- -1
  } else if (lst[, "property"] == "car or other") {
    score_property <- 1
  }

  # purpose
  score_purpose <- NA
  if (lst[, "purpose"] == "domestic appliances") {
    score_purpose <- 6
  } else if (lst[, "purpose"] == "radio/television/furniture/equipment") {
    score_purpose <- -3
  } else if (lst[, "purpose"] == "car(new/used)") {
    score_purpose <- -1
  } else if (lst[, "purpose"] == "retraining/education") {
    score_purpose <- -5
  } else if (lst[, "purpose"] == "others/repairs/business") {
    score_purpose <- -1
  }

  # Total score = base score 20 plus the points of every characteristic
  score[i] <- sum(20, score_duration, score_amount, score_age,
                  score_installment_rate, score_status, score_credit_history,
                  score_savings, score_property, score_purpose)
  rm(lst)
}
### End of scorecard implementation ###
# Attach the scores to the test sample and write the result to a CSV file
score_M <- unlist(score)
score_data <- cbind(data1, score_M = score_M)
score_risk <- score_data[, c("credit_risk1", "score_M")]
write.csv(score_risk, "C:/Users/ZL/Desktop/creditcard_model/2.csv")
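The row-by-row loop above can also be written in vectorized form. The sketch below shows the idea for two characteristics only, using the same cut-offs and points as the scorecard. The helper names `score_duration`, `status_points` and `score_status` are hypothetical, and the input values are made up for illustration.

```r
# Numeric characteristic: cut() uses right-closed intervals by default, so the
# bins are (-Inf, 8], (8, 33], (33, Inf), matching the if/else chain above.
score_duration <- function(duration) {
  c(14, 1, -7)[cut(duration, breaks = c(-Inf, 8, 33, Inf), labels = FALSE)]
}

# Categorical characteristic: look each level up in a named vector of points.
status_points <- c(
  "... < 100 DM" = -10,
  "0 <= ... < 200 DM" = -5,
  "... >= 200 DM / salary for at least 1 year" = 5,
  "no checking account" = 14
)
score_status <- function(status) unname(status_points[as.character(status)])

# Made-up inputs, for illustration only.
duration <- c(6, 8, 9, 33, 34)
status <- c("no checking account", "... < 100 DM", "0 <= ... < 200 DM",
            "no checking account", "... < 100 DM")

partial_score <- score_duration(duration) + score_status(status)
print(partial_score)
```

A level that is missing from the lookup vector yields NA, which mirrors the NA default in the loop version and makes unexpected categories easy to spot.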

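In theory, a credit rating cannot say whether an entity will default; it can only give the probability of default, and a rating symbol corresponds to the average default probability of the entities that carry it. In practice, however, users of ratings do question whether a rating was "accurate". Typically, if an entity rated investment grade (BBB and above) defaults, the rating is regarded as "inaccurate" or a "misjudgment". If an entity rated speculative grade (BB and below) defaults, the rating is regarded as an "accurate prediction". If an entity rated speculative grade does not default (not every speculative-grade entity will), the outcome can be explained in terms of probability: a high-probability event does not necessarily happen, and a low-probability event does not necessarily fail to happen. We use ROC (here, the area under the ROC curve) as the measure of the model's discriminatory power and AR (accuracy ratio) as the measure of its predictive accuracy; the two are related by AR = 2×ROC - 1. Validating model stability requires several years of historical data and is omitted here because such data is not available.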
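For reference, the area under the ROC curve and the corresponding AR can be computed directly from scores and observed outcomes. The sketch below uses the rank-based (Mann-Whitney) formulation in base R; the function name `auc_rank` and the six sample records are made up for illustration and are not taken from the test sample.

```r
# Area under the ROC curve: the probability that a randomly chosen
# non-defaulter scores higher than a randomly chosen defaulter.
# default: 1 = defaulted, 0 = did not default.
auc_rank <- function(score, default) {
  n_good <- sum(default == 0)
  n_bad  <- sum(default == 1)
  r <- rank(score)  # average ranks, so tied scores count as half
  (sum(r[default == 0]) - n_good * (n_good + 1) / 2) / (n_good * n_bad)
}

# Made-up scores and outcomes, for illustration only.
score   <- c(35, 28, 22, 15, 10, 5)
default <- c(0, 0, 1, 0, 1, 1)

auc <- auc_rank(score, default)
ar  <- 2 * auc - 1   # AR = 2 x ROC - 1
print(c(auc = auc, ar = ar))
```

In this toy sample, 8 of the 9 good/bad pairs are ordered correctly, so the area under the curve is 8/9 and AR is 7/9.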

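From the mapping between the internal grades and the master scale, the boundary between investment grade and speculative grade is 20 points. An entity that scores above 20 and defaults is counted as a "misjudgment", and an entity that scores below 20 and does not default is also counted as a "misjudgment". According to the data in the chart, 50 entities were misjudged in total, so AR = 1 - 50/200 = 0.75 and ROC = (1+AR)/2 = 0.875. Both the predictive accuracy and the discriminatory power of the model reach a good level, so the model can be deployed.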
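The counting rule and the arithmetic above can be reproduced with a few lines of R. This is a sketch under stated assumptions: the function name `count_misjudged` and the six sample records are hypothetical, while the figures 50 and 200 are the ones quoted in the text.

```r
# Count misjudgments at a score cutoff, following the rule in the text:
# above the cutoff and defaulted, or below the cutoff and not defaulted.
# default: 1 = defaulted, 0 = did not default.
count_misjudged <- function(score, default, cutoff = 20) {
  sum((score > cutoff & default == 1) | (score < cutoff & default == 0))
}

# Made-up records, for illustration only.
score   <- c(35, 28, 22, 15, 10, 5)
default <- c(0, 0, 1, 0, 1, 1)
n_toy <- count_misjudged(score, default)

# Figures quoted in the text: 50 misjudged entities out of 200.
ar_text  <- 1 - 50 / 200        # 0.75
roc_text <- (1 + ar_text) / 2   # 0.875
print(c(n_toy = n_toy, ar = ar_text, roc = roc_text))
```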

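The validation above rates every record in the test sample with the generated scorecard. Alternatively, the logistic regression equation on the WOE variables can be used directly as the rating model. In that case, the WOE of every model variable must first be computed for each record in the test sample and then substituted into the logistic regression equation.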
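A minimal sketch of this alternative for a single customer is shown below. The intercept, coefficients and WOE values are invented for illustration, as are the scaling constants A and B; in practice they come from the fitted regression and from the WOE tables built during development. The sketch also assumes the regression models the log odds of default.

```r
# Hypothetical intercept and coefficients of a WOE logistic regression.
beta0 <- -0.85
beta  <- c(duration = 0.9, amount = 0.7, status = 1.1)

# Hypothetical WOE values of one test-sample customer for the same variables.
woe <- c(duration = -0.3, amount = 0.2, status = -0.8)

# Linear predictor = log(Odds); match variables by name before multiplying.
log_odds <- beta0 + sum(beta * woe[names(beta)])

# p = Odds / (1 + Odds), written in its numerically equivalent logistic form.
p <- 1 / (1 + exp(-log_odds))

# The same log odds give the score through Score = A - B * log(Odds).
A <- 20   # hypothetical
B <- 10   # hypothetical
score <- A - B * log_odds
print(c(log_odds = log_odds, p = p, score = score))
```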
