Orange Data Analysis

Objective: To analyse several factors that influence the recruitment of students and to extract information through plots.

Description: The following analysis presents plots that attempt to link students' placement prospects, shaped by how recruiting organisations perceive students, to academic parameters such as the percentage obtained in secondary school, higher secondary school, the undergraduate degree and the postgraduate degree.

Miscellaneous factors such as the gender of the candidate, the board of education and the stream chosen in secondary and higher secondary school, the undergraduate degree specialisation and the postgraduate degree specialisation have also been taken into account to predict placement status as well as the salary offered.

Several colleges offer employability tests, which help employers evaluate candidates, analyse and judge their skills, and hence recruit the right talent. Thus, the performance of students in such college-run tests and their previous work experience have also been analysed to deduce their relation to recruitment opportunities.

Hypothesis: Students with better scores in secondary education and in their undergraduate degree have better prospects of getting placed.

Understanding the Project:

Going through the analysis, a reader should be able to infer:

  1. How the choice of board of education influences placement prospects.

  2. The relative importance of scores obtained in various degrees and streams in the campus recruitment procedure.

  3. The relation of gender and work experience to the salary offered by companies in campus placements.

Acknowledgements:

My teammate Prafful Chauhan and I, Ruchika Parag Barman, created this notebook/blog as part of the coursework for the “Pandas, bamboolib & Orange workshop” at Suven, under the mentorship of Rocky Jagtiani.

Learned from https://datascience.suvenconsultants.com

Mentored by Rocky Jagtiani.

Dataset:

This data set consists of placement data for students at an XYZ campus. It includes secondary and higher secondary school percentages and specialisations. It also includes degree specialisation and type, work experience, and the salary offered to placed students.

We have taken 60 observations (number of rows), from which we extract information through exploratory data analysis and visualisation. There are 8 categorical features and 6 numerical features.
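For readers who want to check the feature counts outside the Orange GUI, here is a minimal sketch in plain Python. It is not part of the original workflow. The three-row CSV excerpt and its column names (`gender`, `ssc_p`, `ssc_b`, `degree_p`, `status`, `salary`) are assumptions made up for illustration, and the rule "a column is numerical if every non-empty value parses as a number" is a simplification.

```python
import csv
import io

# Hypothetical three-row excerpt; the column names are assumptions,
# not the dataset's actual headers.
SAMPLE_CSV = """gender,ssc_p,ssc_b,degree_p,status,salary
M,67.0,Others,58.0,Placed,270000
F,56.0,Central,52.0,Not Placed,
M,79.3,Central,77.5,Placed,200000
"""


def is_number(text):
    """Return True if the text parses as a float."""
    try:
        float(text)
        return True
    except ValueError:
        return False


def split_feature_types(rows):
    """Split column names into (categorical, numerical) lists.

    A column counts as numerical when every non-empty value parses as a float.
    """
    categorical, numerical = [], []
    for name in rows[0]:
        values = [row[name] for row in rows if row[name] != ""]
        if values and all(is_number(v) for v in values):
            numerical.append(name)
        else:
            categorical.append(name)
    return categorical, numerical


rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
categorical, numerical = split_feature_types(rows)
print(len(rows), "rows")
print("categorical:", categorical)
print("numerical:", numerical)
```

Run against the full file, the same function should report the 8 categorical and 6 numerical features mentioned above, provided the missing salaries of unplaced students are stored as empty cells.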

Histograms:

Inference: Male students get more placements than female students; the ratio of male to female placements is about 2:1.

Inference: With respect to high school education, Central board students have a wider range of salaries than students from other boards, but the placement ratio of Central board students to the others is less than 1.

Inference: With respect to secondary education, Central board students have a wider range of salaries than students from other boards.

Inference: Commerce and Arts students have a wider range of salaries, and more of them are placed, compared with Science or other streams.

From the above graphs, one can gather that gender plays quite an important role in whether or not a candidate is hired. A male candidate is more likely to get placed at a company than a female candidate. Similarly, the board of education and the stream chosen also affect the salary offered. Students who opted for Commerce and Management studies have been offered higher pay.
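The male-to-female placement ratio read off the histogram can also be computed directly. The sketch below is only an illustration: the eight `(gender, status)` records are invented, and the labels `"M"`, `"F"` and `"Placed"` are assumed encodings.

```python
from collections import Counter

# Hypothetical (gender, status) records, for illustration only.
records = [
    ("M", "Placed"), ("M", "Placed"), ("M", "Not Placed"), ("M", "Placed"),
    ("F", "Placed"), ("F", "Not Placed"), ("M", "Placed"), ("F", "Placed"),
]


def placement_ratio(records, group_a="M", group_b="F"):
    """Return the placed count of group_a divided by the placed count of group_b."""
    placed = Counter(gender for gender, status in records if status == "Placed")
    return placed[group_a] / placed[group_b]


ratio = placement_ratio(records)
print(f"male:female placements = {ratio:.1f}:1")
```

The function assumes at least one placed student in the second group; with real data it is worth checking that before dividing.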

Correlations:

The correlations table gives us the following ideas:

  1. Students who have scored well in their secondary education are very likely to perform well in their undergraduate degree also.

  2. Students who have scored well in their high school education also tend to perform well in their secondary education.

  3. Again, students who have scored well in their high school education are very likely to perform well in their undergraduate degree also.

  4. Most students who have had a good academic record in their high school education also score high in their MBA degree.
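The pairwise relations listed above are what a correlation coefficient measures. As a rough sketch, assuming the table reports Pearson's r (the article does not say which coefficient was used), the value for one pair of columns can be computed as follows. The two score lists are made-up numbers, not rows from the dataset.

```python
import math


def pearson(xs, ys):
    """Return Pearson's correlation coefficient for two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


# Hypothetical secondary-school and degree percentages for five students.
secondary_pct = [67.0, 79.3, 65.0, 56.0, 85.8]
degree_pct = [58.0, 77.5, 64.0, 52.0, 73.3]

r = pearson(secondary_pct, degree_pct)
print(f"r = {r:.3f}")
```

A value close to +1 means students who score high in one exam tend to score high in the other; a value near 0 means no linear relation.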

Boxplots:

Inference: The above boxplot shows the relation between the percentage obtained in the undergraduate degree and placement status. Students who get placed score higher than those who do not. For placed students, the mean score is 68.6925, the standard deviation is 6.189, the 2nd quartile (median) is 69.25, the 1st quartile is 64.50 and the 3rd quartile is 72.1150.

By contrast, for students who were not placed, the mean percentage is 60.8670, the standard deviation is 7.045, the 2nd quartile (median) is 61.00, the 1st quartile is 56.65 and the 3rd quartile is 64.00.

From this analysis, undergraduate students/freshers can prioritise and prepare for their degree examinations keeping in mind the average score, mentioned above, that companies generally consider worthy of a placement.

Inference: Male candidates get higher pay than female candidates. For placed male students, the mean salary is 302608.70, the standard deviation is 144726.4, the 2nd quartile (median) is 264000, the 1st quartile is 240000 and the 3rd quartile is 300000.

On the other hand, for placed female students, the mean salary is 267571.43, the standard deviation is 41776.1, the 2nd quartile (median) is 250000, the 1st quartile is 240000 and the 3rd quartile is 300000.

Thus, we can see that not only is the placement rate of female candidates lower than that of male candidates, but the salary offered to placed female candidates is also relatively lower.
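The figures a boxplot reports (mean, standard deviation and the three quartiles) can be reproduced per group with Python's `statistics` module. This is a sketch under assumptions: the scores below are invented, and the choice of the sample standard deviation and the "inclusive" quartile method is mine, so the results may differ slightly from what Orange displays.

```python
import statistics


def box_summary(values):
    """Return the mean, sample standard deviation and quartiles of a list."""
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return {
        "mean": statistics.mean(values),
        "std": statistics.stdev(values),
        "q1": q1,
        "median": median,
        "q3": q3,
    }


# Hypothetical degree percentages grouped by placement status.
degree_pct_by_status = {
    "Placed": [60, 64, 68, 72, 76],
    "Not Placed": [52, 56, 60, 62, 66],
}

for status, values in degree_pct_by_status.items():
    print(status, box_summary(values))
```

Comparing the two summaries side by side gives the same reading as the boxplot: the whole distribution of placed students sits higher.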

Pivot Table:

Inference: As more students opt for Commerce and Management, the numbers of both placed and not-placed students are much higher in it than in Science and other streams. Even the ratio of placed to not-placed students is higher in Commerce and Management than in Science.

Readers can see that there are relatively more job opportunities for students who opt for Commerce and Management than for other streams.
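A pivot table of this kind is a count of students per (stream, status) pair. The sketch below rebuilds one in plain Python on invented records; the stream labels `"Comm&Mgmt"` and `"Sci&Tech"` are assumed names, not necessarily those used in the dataset.

```python
from collections import Counter

# Hypothetical (stream, status) records, for illustration only.
records = (
    [("Comm&Mgmt", "Placed")] * 4
    + [("Comm&Mgmt", "Not Placed")] * 2
    + [("Sci&Tech", "Placed")] * 2
    + [("Sci&Tech", "Not Placed")] * 2
)


def pivot_counts(records):
    """Return {stream: Counter({status: count})}."""
    table = {}
    for stream, status in records:
        table.setdefault(stream, Counter())[status] += 1
    return table


table = pivot_counts(records)
for stream, counts in table.items():
    ratio = counts["Placed"] / counts["Not Placed"]
    print(f"{stream}: {dict(counts)} placed/not placed = {ratio:.2f}")
```

The placed-to-not-placed ratio in the last column is the quantity the inference compares across streams.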

Scatterplots:

For the scatterplots, we have used 60% of the data provided. The first scatterplot plots salary against the percentage obtained in the degree examination. Here, the points are coloured by stream, as shown in the legend.

Inference: The higher salaries have been offered to students whose scores lie in the range 64–74. Moreover, in terms of stream, most of the students who have been offered pay higher than 300,000 belong to Commerce and Management. Very few Science students, and even fewer students from other streams, have crossed the 300,000 threshold.

Inference: Students who specialise in Marketing and Finance and those in Marketing and HR score similarly in MBA percentage. However, the highest-paid students generally have scores in the range of approximately 62–70. Very few students have been offered pay higher than 400,000. The majority of students are offered salaries in the range of 250,000 to 350,000.

We can conclude that maintaining an average score in the above-mentioned range should suffice for a decently paying placement.
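Two steps behind the scatterplots can be sketched in code: drawing a 60% sample of the rows, and counting how many students per stream cross the 300,000 salary mark. This is an illustration only. The article does not say how the 60% was selected, so random sampling without replacement is an assumption, and the five `(stream, degree_pct, salary)` rows are invented.

```python
import random
from collections import Counter


def sample_fraction(rows, fraction=0.6, seed=0):
    """Return a random subset holding `fraction` of the rows, without replacement."""
    rng = random.Random(seed)
    k = round(len(rows) * fraction)
    return rng.sample(rows, k)


def high_paid_by_stream(rows, threshold=300_000):
    """Count rows per stream whose salary is strictly above the threshold."""
    return Counter(stream for stream, _pct, salary in rows if salary > threshold)


# 60 row indices stand in for the 60 observations.
sample = sample_fraction(list(range(60)))
print(len(sample), "rows sampled")

# Hypothetical (stream, degree_pct, salary) rows.
rows = [
    ("Comm&Mgmt", 66.0, 360000),
    ("Comm&Mgmt", 72.0, 320000),
    ("Sci&Tech", 70.0, 310000),
    ("Others", 61.0, 240000),
    ("Comm&Mgmt", 58.0, 250000),
]
print(high_paid_by_stream(rows))
```

Fixing the seed keeps the sample reproducible, which matters when the same subset is reused for several plots.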

Mosaic Plot:

Other than academic parameters, recruiting companies may also consider some other factors for placement. Employability tests conducted by colleges are key for establishing appropriate labour market linkages and ascertaining that the workforce is industry ready.

Inference: From the plot above, we can see that, of all the students who did not get placed, very few scored above 83.5. Most of the candidates who were not placed scored below 83.5.

Moreover, the plot suggests that students with prior work experience are considered more deserving than freshers. Nearly all the sections for students who were not placed show no prior work experience, whereas those with work experience appear in the placed section on the right.

From this, students can see that having experience in a work environment before campus recruitment proves to be beneficial. Thus, they can plan and prepare accordingly for their future.
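In a mosaic plot, the area of each tile is proportional to the share of rows that fall into that combination of categories. The sketch below computes those shares for work experience, a score bin and placement status. It assumes that the 83.5 cut-off quoted above refers to the employability test score, and the five rows are invented.

```python
from collections import Counter

THRESHOLD = 83.5  # cut-off quoted in the text; assumed to be a test score


def mosaic_shares(rows, threshold=THRESHOLD):
    """Return {(work_exp, score_bin, status): share of all rows}."""
    counts = Counter(
        (work_exp, "above" if score > threshold else "at or below", status)
        for work_exp, score, status in rows
    )
    total = sum(counts.values())
    return {cell: n / total for cell, n in counts.items()}


# Hypothetical (work experience, test score, status) rows.
rows = [
    ("No", 55.0, "Not Placed"),
    ("No", 86.5, "Placed"),
    ("Yes", 91.3, "Placed"),
    ("No", 67.0, "Not Placed"),
    ("Yes", 75.0, "Placed"),
]

shares = mosaic_shares(rows)
for cell, share in sorted(shares.items()):
    print(cell, f"{share:.0%}")
```

A combination that never occurs has no entry in the result, which corresponds to a missing tile in the plot.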

Classification Tree:

This classification tree has placement status (placed) as the target. It has the following parameters:

It is an induced binary tree.

Minimum number of instances in leaves: 2.

Do not split subsets smaller than: 5.

Limit the maximal tree depth to: 100.

Classification stops when the majority reaches 95%.

Students can get a detailed analysis of how placement depends on the various academic and other factors, based on the data provided. This tree gives a clear explanation of how the different attributes of a particular student influence their placement status.
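The tree parameters above are stopping rules: they decide when a node is left as a leaf instead of being split further. The sketch below spells out one reading of them in plain Python. It is not Orange's implementation; the function names and the exact comparisons (for example, treating "majority reaches 95%" as "at least 95%") are assumptions.

```python
from collections import Counter


def should_stop(labels, depth, min_split=5, max_depth=100, majority=0.95):
    """Return True if a node holding these class labels should become a leaf."""
    if len(labels) < min_split:  # do not split subsets smaller than 5
        return True
    if depth >= max_depth:  # limit the maximal tree depth to 100
        return True
    top_count = Counter(labels).most_common(1)[0][1]
    return top_count / len(labels) >= majority  # stop when majority reaches 95%


def split_allowed(left, right, min_leaf=2):
    """Return True if both children keep at least min_leaf instances."""
    return len(left) >= min_leaf and len(right) >= min_leaf


mixed_node = ["Placed"] * 6 + ["Not Placed"] * 4
nearly_pure_node = ["Placed"] * 19 + ["Not Placed"]
print(should_stop(mixed_node, depth=0))
print(should_stop(nearly_pure_node, depth=0))
print(split_allowed(["Placed"], ["Placed", "Not Placed"]))
```

With only 60 observations, the depth limit of 100 is unlikely to ever be reached; the leaf-size and majority rules are the ones that actually shape the tree.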

This classification tree has the salary offered as the target. It has the following parameters:

It is an induced binary tree.

Minimum number of instances in leaves: 2.

Do not split subsets smaller than: 5.

Limit the maximal tree depth to: 100.

Classification stops when the majority reaches 95%.

Students can get a detailed analysis of how the salary offered to a candidate depends on the various academic and other factors. This tree gives a clear explanation of how the different attributes of a particular student influence their pay.

Vote of Thanks:

I would like to humbly and sincerely thank my mentor Rocky Jagtiani. He is more of a friend to me than a mentor. The data analytics he taught, and the various assignments we did and are still doing, are the best way to learn and build skills in the Data Science field.

Recommended: https://datascience.suvenconsultants.com/

Translated from: https://medium.com/@ruchikaparag18/placement-outcomes-data-analysis-using-orange-gui-1884aa3ac0c2
