
Machine Learning, Opinion

Before going further, let us first understand what binary classification is, using a very simple example. You are at home and it’s lunchtime. Your mom asks if you are hungry and want to have lunch, and your answer will be either “yes” or “no”. You have only two options to reply with, i.e. binary options. As another example, a student who has just received his grade 12 result will have either “passed” or “failed”. Both examples fall under binary classification because they have exactly two possible outcomes. Now let’s see how expert systems, or rule-based systems, are used for binary classification.


The expert system in binary classification:

Suppose the task of the machine is to check whether an image given to it contains text or not.


To understand this, let’s first see how humans make decisions.


Consider the health domain: suppose a doctor is examining a patient and needs to check whether the patient has dengue or not.


The doctor looks at several factors, such as skin rash, fever, headache, cold cough and vomiting, and decides whether the person has dengue or not.


Factors looked up by doctors - Image by Author

The doctor examined one patient with the symptoms shown and did not find dengue.


Symptoms of a patient with no dengue - Image by Author

The doctor then examined another patient and found most of the symptoms present, which was enough for the doctor to declare that the patient has dengue.


Symptoms of a patient with dengue - Image by Author

So now the question arises: how is the doctor making all these decisions?


The answer is historical data. The doctor has encountered several patients and, based on that experience, now decides whether a patient has dengue or not.


The image below contains the historical data of four patients.


Historical records of patients - Image by Author

As machines understand only numbers, let’s take ‘CROSS’ as ‘0’ and ‘TICK’ as ‘1’, so the historical data will appear as follows:


Historical data in numerical form for machine understanding - Image by Author
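The TICK/CROSS encoding described above can be sketched in a few lines of Python. This is only an illustration: the symptom names come from the article, but the `encode` helper and the sample patient record are hypothetical and are not the data shown in the image.

```python
# Symptom names follow the article; the order is an arbitrary but fixed choice.
SYMPTOMS = ["skin_rash", "fever", "headache", "cold_cough", "vomiting"]


def encode(record):
    """Map each TICK to 1 and each CROSS to 0, in a fixed symptom order."""
    mapping = {"TICK": 1, "CROSS": 0}
    return [mapping[record[symptom]] for symptom in SYMPTOMS]


# Hypothetical patient record, made up for this example.
patient = {
    "skin_rash": "TICK",
    "fever": "TICK",
    "headache": "CROSS",
    "cold_cough": "CROSS",
    "vomiting": "TICK",
}

encoded = encode(patient)
print(encoded)  # [1, 1, 0, 0, 1]
```

Keeping the symptom order fixed matters: every patient must be encoded the same way so that position 0 always means skin rash, position 1 always means fever, and so on.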

So what are the semantics of a doctor’s decision making?


Two things play a role here: features and rules.


Features are the inputs, here the symptoms we are considering: skin rash, fever, headache, cold cough and vomiting.


Rules are the combinations of symptoms going on in the doctor’s head that decide whether dengue is present, e.g. if 2 or more symptoms are positive, or 3 or more are positive, then the person has dengue.


Features and rules - Image by Author

Now we want to outsource these rules to a machine in the form of a program, so that the machine can execute the program on some input and give us the output. We’ll write if-else conditions, and the program will give its output based on them.


Outsourcing the rules to the machine through a program - Image by Author
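As a rough sketch of what such an if-else program might look like, here is one possible version in Python. The function name `has_dengue` and the default threshold of 3 are assumptions made for illustration; the article only says the rule could be "2 or more" or "3 or more" positive symptoms, and a real doctor's rules would be more nuanced.

```python
def has_dengue(skin_rash, fever, headache, cold_cough, vomiting, threshold=3):
    """Return 1 (dengue) if at least `threshold` symptoms are present, else 0.

    Each symptom is expected to be 0 (CROSS) or 1 (TICK).
    The threshold is an assumed value, not a medical rule.
    """
    positives = skin_rash + fever + headache + cold_cough + vomiting
    if positives >= threshold:
        return 1
    else:
        return 0


# Three of five symptoms present: flagged under the assumed threshold.
print(has_dengue(1, 1, 0, 0, 1))
# Only one symptom present: not flagged.
print(has_dengue(0, 1, 0, 0, 0))
```

The whole "expertise" lives in the hand-written condition. Changing the doctor's rule means changing the program.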

Limitations of an expert system:

Let’s consider a situation: suppose you have a company named Data Science Arena, and your company wants to hire a person for a job.


To hire someone, the hiring manager is going to look at various parameters, such as 10th-grade marks, 12th-grade marks, whether the candidate is a graduate, what the CGPA was, which projects have been done, and some other parameters. The hiring manager will weigh different combinations of these parameters in his or her mind and then decide whether to select or reject a candidate.


Hiring scenario - Image by Author

But if the number of hiring parameters is large, it becomes very difficult for any human to work through all their combinations, so the company would want to prepare a program that a machine can run to process the inputs and give the output.


Thus comes the first limitation: if the number of features (the hiring parameters here) is large, it is not possible to come up with all the different combinations easily, and so it is difficult to write an if-else program.
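To get a feel for how quickly this blows up, here is a small sketch that counts the possible input combinations. It assumes, purely for simplicity, that every feature is a yes/no value; real hiring parameters such as marks or CGPA take many more values, which only makes things worse. The helper name `count_combinations` is hypothetical.

```python
from itertools import product


def count_combinations(n_features):
    """Number of distinct input combinations for n yes/no features."""
    return 2 ** n_features


# For a small number of features we can still list every combination.
combos = list(product([0, 1], repeat=3))
print(len(combos))  # 8

# For more features, the count doubles with each feature added.
for n in (5, 10, 20, 30):
    print(n, count_combinations(n))
```

With 5 yes/no features there are 32 cases to think about, and with 20 there are already more than a million, so writing one if-else branch per case is clearly not practical.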


Also, there would be a large amount of historical data on the employees who were hired and on those who have left the company. Looking at all the different scenarios in such a large history, it would not be easy for a human to write a program from it.


The data looks as follows:


Historical data of 4 employees - Image by Author

Looking at the data, one can see that the rules for forming the combinations would be too complex.


Thus comes the second limitation: even if you somehow came up with the rules, it would not be possible to remember them all, as they would be far too complex.


Also, sometimes the rules are inexpressible. Suppose the hiring manager hired a person based on his honesty. How can we express this as a rule?


Honesty can’t be expressed in any quantitative form, which makes this the third limitation.


So to overcome the above limitations we use machine learning. It takes the inputs and learns a function f(x1, x2, …, xn) that captures all our combinations and, based on the input provided, gives us the output. The learned function can be linear or of any higher degree, depending on the inputs.
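To make the idea of learning f(x1, x2, …, xn) from data concrete, here is a minimal sketch using a perceptron, one of the simplest algorithms for learning a linear function. The article does not name a specific algorithm, so the perceptron is my choice for illustration only. The training data is also hypothetical: every combination of the 5 binary symptoms, labelled with an assumed "3 or more symptoms" rule that the program is never told about.

```python
from itertools import product


def predict(weights, bias, x):
    """Linear function f(x1..xn): output 1 if the weighted sum is positive."""
    score = sum(w * xi for w, xi in zip(weights, x)) + bias
    return 1 if score > 0 else 0


def train_perceptron(samples, labels, max_epochs=1000):
    """Learn weights and bias from (input, label) pairs with perceptron updates."""
    weights = [0.0] * len(samples[0])
    bias = 0.0
    for _ in range(max_epochs):
        mistakes = 0
        for x, y in zip(samples, labels):
            error = y - predict(weights, bias, x)  # -1, 0 or +1
            if error != 0:
                mistakes += 1
                # Nudge the weights towards the correct answer.
                weights = [w + error * xi for w, xi in zip(weights, x)]
                bias += error
        if mistakes == 0:  # every training example is classified correctly
            break
    return weights, bias


# Hypothetical historical data: all 32 symptom combinations,
# labelled by an assumed hidden rule (3 or more symptoms means dengue).
samples = list(product([0, 1], repeat=5))
labels = [1 if sum(x) >= 3 else 0 for x in samples]

weights, bias = train_perceptron(samples, labels)
print(weights, bias)
print(predict(weights, bias, (1, 1, 1, 0, 0)))
```

Nobody wrote an if-else rule here. The program found weights that reproduce the labels just from the examples, which is exactly the shift from explicit programming to learning. A linear function is enough in this toy case because the assumed rule is a simple count; more complex rules would need a more expressive function.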


Thanks for reading. I hope this blog gave you some insight into how machine learning is taking over from explicit programming.


