
This article explores the SQL COUNT DISTINCT operator, which counts values after eliminating duplicates from the result set.

Developers often need to get data from a SQL table under multiple conditions. Sometimes we want all rows in a table except those with NULL values. At other times we want only distinct records, for example the distinct customers who placed an order last year.

Let’s start with a quick overview of the SQL COUNT function.

SQL Count Function

We use the SQL COUNT aggregate function to get the number of rows in the output. Suppose we have a product table that holds records for all products sold by a company, and we want to know the count of products sold during the last quarter. We can use the SQL COUNT function to return the number of rows that satisfy the specified condition.
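As a minimal sketch of that idea, the query below counts the rows that fall inside one quarter. The ProductSales name, its columns, and the three rows are assumptions made up for illustration; they do not come from the article's data.

```sql
-- Hypothetical sample data: table name, columns, and rows are illustrative only.
WITH ProductSales (ProductName, SoldOn) AS (
    SELECT v.ProductName, CAST(v.SoldOn AS date)
    FROM (VALUES ('Keyboard', '20230115'),
                 ('Mouse',    '20230220'),
                 ('Monitor',  '20230402')) AS v (ProductName, SoldOn)
)
-- Count only the rows sold in the first quarter of 2023.
SELECT COUNT(*) AS ProductsSoldInQuarter   -- 2 (Keyboard and Mouse)
FROM ProductSales
WHERE SoldOn >= '20230101'
  AND SoldOn <  '20230401';
```

The half-open date range (greater than or equal to the first day, less than the first day of the next quarter) avoids having to know the last day of the quarter.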

The syntax of the SQL COUNT function:
COUNT ([ALL | DISTINCT] expression);


By default, the SQL Server COUNT function uses the ALL keyword. It means that SQL Server evaluates the expression for every row and counts each non-NULL value, including rows that have duplicate values.

Let’s create a sample table and insert a few records into it.


CREATE TABLE ##TestTable (Id int identity(1,1), Col1 char(1) NULL);

In this table, we have duplicate values and NULL values as well.


Comparing COUNT(*) with COUNT(Col1) on this table, we can note that:


  • Count (*) includes duplicate values as well as NULL values
  • Count (Col1) includes duplicate values but does not include NULL values
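The two points above can be sketched with a small temporary table. The #CountDemo name and its five rows are assumptions for illustration, since the article's original insert statements are not shown here.

```sql
-- Hypothetical sample data: the table name and rows are illustrative only.
CREATE TABLE #CountDemo (Id int IDENTITY(1,1), Col1 char(1) NULL);

INSERT INTO #CountDemo (Col1)
VALUES ('A'), ('A'), ('B'), (NULL), (NULL);

SELECT COUNT(*)        AS AllRows,          -- 5: every row, NULLs and duplicates included
       COUNT(Col1)     AS NonNullValues,    -- 3: NULLs skipped, duplicate 'A' still counted
       COUNT(ALL Col1) AS NonNullValuesAll  -- 3: ALL is the default, same as COUNT(Col1)
FROM #CountDemo;

DROP TABLE #CountDemo;
```

Writing COUNT(ALL Col1) next to COUNT(Col1) makes the default explicit: both return the same number.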

Suppose we want to know the number of distinct values available in the table. We can use SQL COUNT DISTINCT to do so.

SELECT COUNT(DISTINCT Col1)
FROM ##TestTable;

The query returns a count of only 2. SQL COUNT DISTINCT eliminates both duplicate and NULL values before it counts.

Let’s look at another example. In this example, we have a location table that consists of two columns, City and State.

CREATE TABLE Location
(City  VARCHAR(30), State VARCHAR(20)
);
INSERT INTO Location VALUES ('Gurgaon','Haryana');
INSERT INTO Location VALUES ('Gurgaon','Rajasthan');
INSERT INTO Location VALUES ('Jaipur','Rajasthan');
INSERT INTO Location VALUES ('Jaipur','Haryana');

Now, execute the following query to find the count of distinct cities in the table.


SELECT COUNT(DISTINCT City)
FROM Location;

It returns 2, the number of unique cities (Gurgaon and Jaipur) in the table.


If we look at the data, the same city name appears in more than one state. Each combination of city and state is unique, and we do not want those unique combinations to be eliminated from the output.

We can apply SQL DISTINCT to a combination of columns as well. It compares the combined values and removes a row only if the same combination already appears in the output.

SELECT DISTINCT City, State
FROM Location;

It does not remove the duplicate city names from the output because each combination of City and State is unique.


Let’s insert one more row into the location table.


Insert into location values('Gurgaon','Haryana')

We have 5 records in the location table. In the data, you can see we have one combination of city and state that is not unique.


Rerun the SELECT DISTINCT query, and it should return only 4 rows this time.

We cannot use SQL COUNT DISTINCT directly with multiple columns. SQL Server rejects an expression such as COUNT(DISTINCT City, State) with an error message.

We can store the output of SELECT DISTINCT in a temporary table and then use COUNT(*) to check the row count.

SELECT DISTINCT City, State
INTO #Temp
FROM Location;
SELECT COUNT(*) FROM #Temp;

We get the row count 4 in the output.
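As an alternative sketch that avoids the temporary table, the distinct rows can be counted through a derived table in a single statement. The LocationSample name is an assumption; its five rows mirror the inserts shown earlier so that the block runs on its own.

```sql
-- LocationSample is a hypothetical stand-in for the Location table,
-- loaded with the same five rows inserted earlier in the article.
WITH LocationSample (City, State) AS (
    SELECT City, State
    FROM (VALUES ('Gurgaon', 'Haryana'),
                 ('Gurgaon', 'Rajasthan'),
                 ('Jaipur',  'Rajasthan'),
                 ('Jaipur',  'Haryana'),
                 ('Gurgaon', 'Haryana')) AS v (City, State)
)
-- The inner query removes the repeated (Gurgaon, Haryana) row;
-- the outer query counts what is left.
SELECT COUNT(*) AS DistinctCityState   -- 4
FROM (SELECT DISTINCT City, State FROM LocationSample) AS d;
```

Against the real table, the same pattern would read FROM (SELECT DISTINCT City, State FROM Location) AS d, with no temporary object to clean up afterwards.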


If we use a combination of columns to get distinct values and any of the columns contains a NULL value, SQL Server treats that combination as a unique combination as well.

To verify this, let’s insert two more records into the location table. The first stores an empty string as the state. The second does not specify any state, so the State column is stored as NULL.

Insert into location values('Gurgaon','')
Insert into location(city)values('Gurgaon')

Let’s look at the location table data.


Re-run the query to get distinct rows from the location table.


SELECT   distinct City, State
FROM Location;

In the output, we can see it does not eliminate the combination of City and State with the blank or NULL values.


Similarly, counting the distinct City and State combinations through the temporary table approach returns a row count of 6.
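The sketch below contrasts the two behaviors on the seven rows inserted so far: distinct combinations keep the blank and NULL states as separate rows, while COUNT(DISTINCT State) on the single column ignores the NULL. The LocationSample name is an assumption used to keep the block self-contained.

```sql
-- LocationSample is a hypothetical stand-in for the Location table,
-- loaded with the seven rows inserted so far in the article.
WITH LocationSample (City, State) AS (
    SELECT City, State
    FROM (VALUES ('Gurgaon', 'Haryana'),
                 ('Gurgaon', 'Rajasthan'),
                 ('Jaipur',  'Rajasthan'),
                 ('Jaipur',  'Haryana'),
                 ('Gurgaon', 'Haryana'),
                 ('Gurgaon', ''),
                 ('Gurgaon', NULL)) AS v (City, State)
)
SELECT
    -- 6: (Gurgaon, '') and (Gurgaon, NULL) each count as their own combination
    (SELECT COUNT(*)
     FROM (SELECT DISTINCT City, State FROM LocationSample) AS d) AS DistinctCityState,
    -- 3: Haryana, Rajasthan, and the empty string; the NULL state is ignored
    (SELECT COUNT(DISTINCT State) FROM LocationSample) AS DistinctStates;
```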

Difference between SELECT COUNT, COUNT(*) and SQL COUNT DISTINCT




  • SELECT COUNT (on a column, such as COUNT(Col1)): It returns the total number of rows after satisfying the conditions specified in the WHERE clause. It gives the count of rows and does not eliminate duplicate values. It eliminates the NULL values in the output.
  • COUNT(*): It returns the total number of rows after satisfying the conditions specified in the WHERE clause. It considers all rows regardless of any duplicate or NULL values. It does not eliminate the NULL values in the output.
  • COUNT DISTINCT: It returns the distinct number of rows after satisfying the conditions specified in the WHERE clause. It gives a distinct number of rows after eliminating NULL and duplicate values. It eliminates the NULL values in the output.






Execution Plan of SQL COUNT DISTINCT function

Let’s look at the Actual Execution Plan of the SQL COUNT DISTINCT function. You need to enable the Actual Execution Plan from the SSMS menu bar (Query > Include Actual Execution Plan).

Execute the query to get the actual execution plan. In this execution plan, you can see the top resource-consuming operators:

  • Sort (Distinct Sort) – Cost 78%
  • Table Scan – Cost 22%

You can hover the mouse over the sort operator, and it opens a tool-tip with the operator details.


In the Properties window, we also get more details about the sort operator, including memory allocation, statistics, and the number of rows.


In a table with millions of records, SQL COUNT DISTINCT might cause performance issues because the distinct sort is a costly operator in the actual execution plan.

SQL Server 2019 offers a faster alternative to SQL COUNT DISTINCT through the new Approx_Count_Distinct function. This function returns an approximate count of the distinct values, so its output might differ slightly from the output of SQL COUNT DISTINCT.

You can replace SQL COUNT DISTINCT with Approx_Count_Distinct in the query to use this function in SQL Server 2019 and later.

SELECT APPROX_COUNT_DISTINCT(City)
FROM Location;
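To compare the two functions side by side, the sketch below runs both over the same rows. It assumes SQL Server 2019 or later, and the LocationSample name is a hypothetical stand-in for the Location table. On a set this small the two results would be expected to agree; the approximation only matters on large inputs.

```sql
-- Requires SQL Server 2019 or later (APPROX_COUNT_DISTINCT is not available before that).
-- LocationSample is a hypothetical stand-in for the Location table.
WITH LocationSample (City, State) AS (
    SELECT City, State
    FROM (VALUES ('Gurgaon', 'Haryana'),
                 ('Gurgaon', 'Rajasthan'),
                 ('Jaipur',  'Rajasthan'),
                 ('Jaipur',  'Haryana')) AS v (City, State)
)
SELECT COUNT(DISTINCT City)        AS ExactDistinctCities,   -- exact count: 2
       APPROX_COUNT_DISTINCT(City) AS ApproxDistinctCities   -- approximate count of non-NULL distinct values
FROM LocationSample;
```

Microsoft documents the function as guaranteeing up to a 2% error rate within a 97% probability, so it suits dashboards and exploratory queries over very large tables rather than reports that need an exact figure.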

You can explore this function further in the article "The new SQL Server 2019 function Approx_Count_Distinct".


In this article, we explored the SQL COUNT function with various examples. We also covered the new Approx_Count_Distinct function, available from SQL Server 2019. I would suggest evaluating both in your own environment. If you have any comments or questions, feel free to leave them in the comments below.

