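ClickHouse in Practice

Preface

ClickHouse features
A true column-oriented DBMS
Efficient data compression
Data stored on disk
Parallel processing on multiple cores
Distributed processing across multiple servers
SQL support
Vectorized engine
Real-time data updates
Indexes
Suitable for online queries
Support for approximate calculations
Support for nested data structures
Support for arrays as a data type
Support for query-complexity restrictions and quotas
Data replication and data integrity support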

Drawbacks
No transactions: that is the price of speed.
Aggregation results must fit in the memory of a single machine: rarely a real problem.
No full-featured UPDATE/DELETE.

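Use cases
Well suited to second-level OLAP analysis over large wide tables holding tens of billions of detail rows.

Rich SQL support
Very well suited to large wide tables
JOIN performance is relatively weak
Supports exact deduplication (typically 2-5 s) and approximate deduplication (typically about 1 s)
Supports geo functions (geofencing)
Supports aggregation as well as select * (detail queries)
Supports pagination (limit offset)
Supports large group by
Supports tens of thousands of columns (we have only used a little over 200 in practice)
Supports many kinds of materialized views (Druid, by comparison, only has time-based rollup)
Handles queries over large data sets (hundreds of GB)
Tables are not required to contain a time column
Excellent write performance (50-200 MB/s per CK node)
Supports direct JDBC writes (insert in batches, for example 100,000 rows per insert; the longer the interval between inserts and the larger each batch, the better the performance)
Supports primary-key deduplication (asynchronous; takes effect in 1-5 minutes; the latency is currently not great and is being improved)
Supports update/delete (asynchronous; takes effect in 1-5 minutes; the latency is currently not great and is being improved)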

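I. Multi-table join queries

When to use

Use this when a detail table has to be joined with another table in a query, for example joining the user order table with the marketing campaign-user table to select the orders of users who took part in a given campaign.

It generally fits cases where the detail table is large and the joined small table, after filtering, has fewer than 100 million rows.

Note: a pure dimension table, such as a city dimension table, does not need a join; write the dimensions into the wide detail table at ingestion time. When a dimension's cardinality is at most 10,000, the LowCardinality type can be used. In general a join is only needed when the relationship with the detail table is many-to-many.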

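Usage

1. IN subquery

If the goal is to select a list of ids from the small table and use it to filter the detail table, prefer an IN subquery (see the official wiki).

Take counting the UV of users who took part in a marketing campaign as an example: the detail table is table a and the campaign table is table b.

Recommended example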

SELECT uniq(UserID) FROM a WHERE xxx = xx ... AND UserID GLOBAL IN (SELECT UserID FROM b WHERE huodongID = 34)

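Notes

1. Put the small table in the IN subquery and query the large table normally.

2. The GLOBAL keyword is recommended: the small-table result is collected into an in-memory temporary table and then sent to every data node of the large table.

3. If the id list returned by the subquery is not deduplicated, use DISTINCT id.

4. Put filters that reduce the data volume inside the subquery's WHERE clause, so that the subquery returns less data.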
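
As a rough sketch of notes 3 and 4, the recommended query can deduplicate and pre-filter inside the subquery. Tables a and b and the columns UserID and huodongID come from the example above; the `status` column is a hypothetical filter added only for illustration.

```sql
-- Sketch only: `status` is an assumed column on table b, not part of the original example.
SELECT uniq(UserID)
FROM a
WHERE UserID GLOBAL IN
(
    SELECT DISTINCT UserID   -- note 3: deduplicate the id list
    FROM b
    WHERE huodongID = 34
      AND status = 1         -- note 4: filter inside the subquery (hypothetical condition)
);
```

Any additional filters on the detail table itself stay in the outer WHERE clause.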

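2. JOIN
Use this when a large table is joined many-to-many with a small table.

INNER JOIN is recommended; a bare JOIN defaults to INNER JOIN.

Take counting the UV of several marketing campaigns as an example: the detail table is table a and the campaign table is table b.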

SELECT b.huodongID, uniq(a.UserID) FROM a GLOBAL JOIN b ON a.UserID = b.UserID WHERE b.huodongID IN (12, 34, 56, ...) AND a.xxx = xx ... GROUP BY b.huodongID

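Notes

1. Large table on the left, small table on the right: the rows filtered from the small table are sent to the large table's data nodes and merged there.

2. The GLOBAL keyword is recommended: the small-table result is collected into an in-memory temporary table and then sent to every data node of the large table.

3. Use suitable filters to reduce the data returned from the small table, and suitable sorting-key columns to reduce the data scanned in the large table.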
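
One way to apply note 3 is to shrink the right-hand side in a subquery before it is shipped to the data nodes. This is a sketch using the same tables a and b as above; the `uv` alias and the campaign ids are illustrative.

```sql
-- Sketch: filter and deduplicate the small table first, then join.
SELECT b.huodongID, uniq(a.UserID) AS uv
FROM a
GLOBAL INNER JOIN
(
    SELECT DISTINCT UserID, huodongID
    FROM b
    WHERE huodongID IN (12, 34, 56)
) AS b ON a.UserID = b.UserID
GROUP BY b.huodongID;
```

Only the rows of the three campaigns are collected into the temporary table, so less data is sent to each node of the large table.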

II. Query optimization: what to do when GROUP BY exceeds max_memory_usage

bigdata-clickhouse-test01.nmg01 :) set max_memory_usage=6000000000;
SET max_memory_usage = 6000000000
Ok.
0 rows in set. Elapsed: 0.001 sec.
bigdata-clickhouse-test01.nmg01 :) select uid, count() as cnt from ods_gsdriverloc_d group by uid order by cnt desc limit 10;
SELECT uid, count() AS cnt
FROM ods_gsdriverloc_d
GROUP BY uid
ORDER BY cnt DESC
LIMIT 10
Progress: 8.35 billion rows, 66.76 GB (586.77 million rows/s., 4.69 GB/s.) 52%
Received exception from server (version 18.14.12):
Code: 241. DB::Exception: Received from localhost:9000, 127.0.0.1. DB::Exception: Memory limit (for query) exceeded: would use 5.59 GiB (attempt to allocate chunk of 786432 bytes), maximum: 5.59 GiB.
0 rows in set. Elapsed: 14.274 sec. Processed 8.35 billion rows, 66.76 GB (584.65 million rows/s., 4.68 GB/s.)

Add the following settings:

set max_memory_usage=6000000000;
set max_bytes_before_external_group_by=3000000000;
set group_by_two_level_threshold=1;
set distributed_aggregation_memory_efficient=1;

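max_bytes_before_external_group_by is usually set to half of max_memory_usage; once the aggregation state reaches this threshold, it is spilled to temporary files on disk.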
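
The same limits can also be attached to a single query instead of the whole session. The sketch below assumes a ClickHouse version that accepts a SETTINGS clause on SELECT; the values are the ones used above.

```sql
-- Sketch: per-query settings, so other queries in the session are unaffected.
SELECT uid, count() AS cnt
FROM ods_gsdriverloc_d
GROUP BY uid
ORDER BY cnt DESC
LIMIT 10
SETTINGS
    max_memory_usage = 6000000000,
    max_bytes_before_external_group_by = 3000000000,
    distributed_aggregation_memory_efficient = 1;

-- Inspect the values currently in effect for the session.
SELECT name, value, changed
FROM system.settings
WHERE name IN ('max_memory_usage', 'max_bytes_before_external_group_by');
```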

bigdata-clickhouse-test01.nmg01 :) select uid, count() as cnt from ods_gsdriverloc_d group by uid order by cnt desc limit 10;
SELECT uid, count() AS cnt
FROM ods_gsdriverloc_d
GROUP BY uid
ORDER BY cnt DESC
LIMIT 10
┌─────────────uid─┬────cnt─┐
│ 566396445599120 │ 141196 │
│ 566480446494363 │ 117578 │
│ 565840488437543 │ 117456 │
│ 563640420016128 │ 116781 │
│ 567950008428471 │  86332 │
│ 565847723680819 │  78105 │
│ 567950090883334 │  75749 │
│ 565998218449677 │  73502 │
│ 567950184114741 │  66629 │
│ 563443520770048 │  66010 │
└─────────────────┴────────┘
10 rows in set. Elapsed: 28.341 sec. Processed 15.71 billion rows, 125.67 GB (554.30 million rows/s., 4.43 GB/s.)

Check the server log, which shows aggregation data being written to temporary files:

2018.11.20 14:51:26.527758 [ 1668 ] {7bb46332-69bc-4469-a9dd-8bf6f4b24d7d} <Debug> Aggregator: Writing part of aggregation data into temporary file /data1/clickhouse/tmp/tmp11422clvgaa.
2018.11.20 14:51:26.528478 [ 1655 ] {7bb46332-69bc-4469-a9dd-8bf6f4b24d7d} <Debug> Aggregator: Writing part of aggregation data into temporary file /data1/clickhouse/tmp/tmp11422dlvgaa.
2018.11.20 14:51:26.529131 [ 1671 ] {7bb46332-69bc-4469-a9dd-8bf6f4b24d7d} <Debug> Aggregator: Writing part of aggregation data into temporary file /data1/clickhouse/tmp/tmp11422elvgaa.
2018.11.20 14:51:26.529469 [ 1663 ] {7bb46332-69bc-4469-a9dd-8bf6f4b24d7d} <Debug> Aggregator: Writing part of aggregation data into temporary file /data1/clickhouse/tmp/tmp11422flvgaa.
2018.11.20 14:51:26.529593 [ 1673 ] {7bb46332-69bc-4469-a9dd-8bf6f4b24d7d} <Debug> Aggregator: Writing part of aggregation data into temporary file /data1/clickhouse/tmp/tmp11422glvgaa.
2018.11.20 14:51:26.529670 [ 1662 ] {7bb46332-69bc-4469-a9dd-8bf6f4b24d7d} <Debug> Aggregator: Writing part of aggregation data into temporary file /data1/clickhouse/tmp/tmp11422hlvgaa.

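III. Query optimization: choosing a distinct-count function

ClickHouse (CK) provides several distinct-count functions (see the official function reference).

When the business requires exact deduplication and can accept slower queries, use uniqExact.

When the business requires fast responses and can accept approximate values, use uniqHLL12.

Below are tests of several distinct-count functions on the same test data, for reference.

Each result shows the distinct count for a single time slice, the error percentage relative to the exact count, and the query time.

On this test data, uniqHLL12 was by far the fastest (0.301 s versus 2.0-2.5 s for the other functions) and had the smallest error of the approximate functions (0.201%).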

SELECT uniqExact(driver_id), toStartOfMinute(toDateTime(timestamp / 1000)) AS minSpan
FROM insight_chaos_realtime_driver_v2_detail_jdbc
WHERE toYYYYMMDD(toDateTime(timestamp / 1000)) = 20201124
and city_id in ('1', '2', '3', '4')
GROUP BY toStartOfMinute(toDateTime(timestamp / 1000))
┌─uniqExact(driver_id)─┬─────────────minSpan─┐
│                84539 │ 2020-11-24 18:55:00 │
378 rows in set. Elapsed: 2.027 sec. Processed 26.38 million rows, 350.99 MB (13.01 million rows/s., 173.14 MB/s.)

SELECT uniqHLL12(driver_id), toStartOfMinute(toDateTime(timestamp / 1000)) AS minSpan
FROM insight_chaos_realtime_driver_v2_detail_jdbc
WHERE toYYYYMMDD(toDateTime(timestamp / 1000)) = 20201124
and city_id in ('1', '2', '3', '4')
GROUP BY toStartOfMinute(toDateTime(timestamp / 1000));
┌─uniqHLL12(driver_id)─┬─────────────minSpan─┐
│                84369 │ 2020-11-24 18:55:00 │
Error: 0.201%
378 rows in set. Elapsed: 0.301 sec. Processed 26.38 million rows, 350.99 MB (87.71 million rows/s., 1.17 GB/s.)

SELECT uniqCombined(driver_id), toStartOfMinute(toDateTime(timestamp / 1000)) AS minSpan
FROM insight_chaos_realtime_driver_v2_detail_jdbc
WHERE toYYYYMMDD(toDateTime(timestamp / 1000)) = 20201124
and city_id in ('1', '2', '3', '4')
GROUP BY toStartOfMinute(toDateTime(timestamp / 1000));
┌─uniqCombined(driver_id)─┬─────────────minSpan─┐
│                   84246 │ 2020-11-24 18:55:00 │
Error: 0.3465%
378 rows in set. Elapsed: 2.124 sec. Processed 26.38 million rows, 350.99 MB (12.42 million rows/s., 165.22 MB/s.)

SELECT uniq(driver_id), toStartOfMinute(toDateTime(timestamp / 1000)) AS minSpan
FROM insight_chaos_realtime_driver_v2_detail_jdbc
WHERE toYYYYMMDD(toDateTime(timestamp / 1000)) = 20201124
and city_id in ('1', '2', '3', '4')
GROUP BY toStartOfMinute(toDateTime(timestamp / 1000))
┌─uniq(driver_id)─┬─────────────minSpan─┐
│           84986 │ 2020-11-24 18:55:00 │
Error: 0.52875%
378 rows in set. Elapsed: 2.502 sec. Processed 26.38 million rows, 350.99 MB (10.54 million rows/s., 140.26 MB/s.)
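
The error percentages above are |approximate - exact| / exact, for example |84369 - 84539| / 84539 ≈ 0.201%. As a sketch, the four functions can be compared in a single pass over the same table and filter; the column aliases are illustrative. Timings still have to be measured by running each function separately as above.

```sql
-- Sketch: compute all four counts per minute and the relative error of each estimate.
SELECT
    toStartOfMinute(toDateTime(timestamp / 1000)) AS minSpan,
    uniqExact(driver_id)    AS exact,
    uniqHLL12(driver_id)    AS hll12,
    uniqCombined(driver_id) AS combined,
    uniq(driver_id)         AS approx,
    -- cast to Int64 first so the subtraction cannot wrap around on unsigned values
    round(abs(toInt64(hll12)    - toInt64(exact)) / exact * 100, 3) AS hll12_err_pct,
    round(abs(toInt64(combined) - toInt64(exact)) / exact * 100, 3) AS combined_err_pct,
    round(abs(toInt64(approx)   - toInt64(exact)) / exact * 100, 3) AS uniq_err_pct
FROM insight_chaos_realtime_driver_v2_detail_jdbc
WHERE toYYYYMMDD(toDateTime(timestamp / 1000)) = 20201124
  AND city_id IN ('1', '2', '3', '4')
GROUP BY minSpan
ORDER BY minSpan;
```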

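IV. Query optimization: WHERE clause

1. Remove useless conditions

SQL is sometimes assembled by a program; take care not to generate useless conditions. The condition below, for example, covers the entire value range and therefore filters nothing. Removing it gave a tenfold performance improvement.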

(channel < 40000 or channel >= 50000) or (channel >= 40000 and channel < 50000)

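2. Specify the time partition explicitly

When the table is partitioned by date and the query targets a single day, specify the date partition explicitly instead of using a time range.

# Date used as the partition key
`timestamp`         DateTime COMMENT 'timestamp',
...
partition_index toYYYYMMDD(toDateTime(timestamp))
# Do: query the date partition explicitly
where ... and toYYYYMMDD(toDateTime(timestamp)) = 20201126
# Don't: filter by a time range
where ... AND (timestamp >= toDateTime('2020-11-26 00:00:00') AND timestamp <= toDateTime('2020-11-26 23:59:59'))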
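
To see which partitions exist and to check how much data a filter actually touches, something like the following can help. This is a sketch: `my_db` and `my_table` are placeholder names, and the `EXPLAIN indexes = 1` form is assumed to be available, which is only the case on reasonably recent ClickHouse versions.

```sql
-- List the active partitions of a table and their sizes (placeholder names).
SELECT partition, count() AS parts, sum(rows) AS total_rows
FROM system.parts
WHERE database = 'my_db'
  AND table = 'my_table'
  AND active
GROUP BY partition
ORDER BY partition;

-- On versions that support it, show which parts and granules the filter selects.
EXPLAIN indexes = 1
SELECT count()
FROM my_db.my_table
WHERE toYYYYMMDD(toDateTime(timestamp)) = 20201126;
```

Comparing the EXPLAIN output of the explicit-partition filter with that of the range filter shows whether the two forms prune the same parts on a given version.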

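V. Query optimization: sorting key

A good sorting key can cut query time by an order of magnitude!

Example

Consider the CREATE TABLE statements and the query below. Statement 1 uses the default timestamp column as the sorting key; statement 2 chooses the sorting key according to the query pattern of the business.

The query takes 0.69 s on table 1 and only 0.091 s on table 2.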

# CREATE TABLE statement 1
CREATE TABLE insight_data.insight_chaos_realtime_order_cx_detail_jdbc_local
(
    `city_id`           Int16 COMMENT 'city ID',
    `carpool_type`      Array(LowCardinality(String)) COMMENT 'carpool type',
    `channel`           Int64 COMMENT 'channel',
    `cnt_distinct`      Int8 COMMENT 'deduplicated order count',
    `cnt_weighted`      Float64 COMMENT 'weighted order count',
    `combo_type`        Array(LowCardinality(String)) COMMENT 'order scenario',
    `complete_type`     LowCardinality(String) COMMENT 'order completion type',
    `county`            Int32 COMMENT 'county/district ID',
    `distance_category` Int16 COMMENT 'order distance: driven distance for completed orders, estimated distance otherwise',
    `driver_tag_id`     String COMMENT 'driver segment ID',
    `optype`            LowCardinality(String) COMMENT 'binlog operation type',
    `order_status`      Int8 COMMENT 'order status',
    `passenger_id`      Int64 COMMENT 'passenger id',
    `passenger_tag_id`  String COMMENT 'passenger segment id',
    `polygon_id`        String COMMENT 'geofence id',
    `product_category`  Array(LowCardinality(String)) COMMENT 'business identifier: 68 means A+',
    `product_id`        Array(LowCardinality(String)) COMMENT 'product line',
    `require_level`     Array(LowCardinality(String)) COMMENT 'requested vehicle class',
    `scene_level1`      String COMMENT 'order scenario',
    `source_type`       Int8 COMMENT 'whether the order was reassigned',
    `table_filter`      LowCardinality(String) COMMENT 'source binlog table',
    `time_category`     Int8 COMMENT 'time range',
    `t_order_status`    Int8 COMMENT 'order status changed to (t means to, as opposed to f for from)',
    `type`              Int8 COMMENT 'order type',
    `timestamp`         DateTime COMMENT 'timestamp',
    `_sys_insert_time`  DateTime MATERIALIZED now(),
    INDEX partition_index toYYYYMMDD(toDateTime(timestamp)) TYPE minmax GRANULARITY 3
) ENGINE = ReplicatedMergeTree('/clickhouse/tables/{shard}/insight_data/insight_chaos_realtime_order_cx_detail_jdbc_local', '{replica}')
PARTITION BY toYYYYMMDD(toDateTime(timestamp))
ORDER BY timestamp
TTL _sys_insert_time + toIntervalDay(90)
SETTINGS index_granularity = 8192, storage_policy = 'all_disk';

# CREATE TABLE statement 2
CREATE TABLE insight_data.insight_chaos_realtime_order_cx2_detail_jdbc_local
(
    `city_id`           Int16 COMMENT 'city ID',
    `carpool_type`      Array(LowCardinality(String)) COMMENT 'carpool type',
    `channel`           Int64 COMMENT 'channel',
    `cnt_distinct`      Int8 COMMENT 'deduplicated order count',
    `cnt_weighted`      Float64 COMMENT 'weighted order count',
    `combo_type`        Array(LowCardinality(String)) COMMENT 'order scenario',
    `complete_type`     LowCardinality(String) COMMENT 'order completion type',
    `county`            Int32 COMMENT 'county/district ID',
    `distance_category` Int16 COMMENT 'order distance: driven distance for completed orders, estimated distance otherwise',
    `driver_tag_id`     String COMMENT 'driver segment ID',
    `optype`            LowCardinality(String) COMMENT 'binlog operation type',
    `order_status`      Int8 COMMENT 'order status',
    `passenger_id`      Int64 COMMENT 'passenger id',
    `passenger_tag_id`  String COMMENT 'passenger segment id',
    `polygon_id`        String COMMENT 'geofence id',
    `product_category`  Array(LowCardinality(String)) COMMENT 'business identifier: 68 means A+',
    `product_id`        Array(LowCardinality(String)) COMMENT 'product line',
    `require_level`     Array(LowCardinality(String)) COMMENT 'requested vehicle class',
    `scene_level1`      String COMMENT 'order scenario',
    `source_type`       Int8 COMMENT 'whether the order was reassigned',
    `table_filter`      LowCardinality(String) COMMENT 'source binlog table',
    `time_category`     Int8 COMMENT 'time range',
    `t_order_status`    Int8 COMMENT 'order status changed to (t means to, as opposed to f for from)',
    `type`              Int8 COMMENT 'order type',
    `timestamp`         DateTime COMMENT 'timestamp',
    `_sys_insert_time`  DateTime MATERIALIZED now(),
    INDEX partition_index toYYYYMMDD(toDateTime(timestamp)) TYPE minmax GRANULARITY 3
) ENGINE = ReplicatedMergeTree('/clickhouse/tables/{shard}/insight_data/insight_chaos_realtime_order_cx2_detail_jdbc_local', '{replica}')
PARTITION BY toYYYYMMDD(toDateTime(timestamp))
ORDER BY (city_id, optype, t_order_status, table_filter, timestamp)
TTL _sys_insert_time + toIntervalDay(90)
SETTINGS index_granularity = 8192, storage_policy = 'all_disk';

# Query
SELECT sum(cnt_distinct),
toStartOfMinute(timestamp) AS minSpan
FROM insight_data.insight_chaos_realtime_order_cx2_detail_jdbc_local
WHERE toYYYYMMDD(toDateTime(timestamp)) = 20201126
and hasAny(product_id, ['1', '2', '3', '4', '5', '6', '7', '8', '9']) = 1
and source_type in (1)
and optype = 'i'
and table_filter = 'd_order_base'
and driver_tag_id = '-1'
AND passenger_tag_id = '-1'
AND city_id IN (1, 2, 3, 4)
and polygon_id = '-1'
GROUP BY toStartOfMinute(timestamp);

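What the sorting key means

The sorting key (the ORDER BY clause of the CREATE TABLE statement) also defines the default primary key, which is not a unique key. As the name suggests, rows are stored sorted by the columns of the sorting key, in the order listed.

In ClickHouse the sorting key determines the physical order of the stored data. Because several columns are stored in sorted order, a query can use the index metadata held in memory to quickly skip the many rows that cannot match.

A well-chosen sorting key significantly reduces the amount of data a SQL query has to scan and therefore its latency, for example by 10 to 100 times.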

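Sorting key best practices

ClickHouse usually groups rows into blocks of 8192 (index_granularity); a good sorting key lets the engine skip a large number of blocks using only the in-memory metadata.
Take the combination (eventId, passengerId): roughly speaking, rows with the same eventId are stored together after sorting, and within the same eventId, rows with the same passengerId are stored together.

When the query conditions always include eventId and passengerId, for example eventId=1234 and passengerId=2, eventId may narrow the range to 50 million rows, and passengerId may narrow it further to 5,000 rows.

If the filter then adds osType='ios', osType cannot noticeably reduce the number of blocks to scan, because 5,000 rows is already less than one block. The sorting key therefore does not need to be (eventId, passengerId, osType).

In short, choose the sorting key according to how the business queries the data, so that each successive column skips a large share of the remaining rows.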
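
The (eventId, passengerId) discussion can be sketched as a table definition. The table name `event_demo`, the `ts` column, and the column types are assumptions made for illustration; only the sorting key and the filter come from the text.

```sql
-- Hypothetical table: the sorting key stops at passengerId because osType
-- would not skip any further 8192-row blocks.
CREATE TABLE event_demo
(
    eventId     UInt32,
    passengerId UInt64,
    osType      LowCardinality(String),
    ts          DateTime
)
ENGINE = MergeTree()
PARTITION BY toYYYYMMDD(ts)
ORDER BY (eventId, passengerId)
SETTINGS index_granularity = 8192;

-- eventId and passengerId narrow the scan through the primary index;
-- osType is then checked only on the few rows that remain.
SELECT count()
FROM event_demo
WHERE eventId = 1234 AND passengerId = 2 AND osType = 'ios';
```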

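VI. ReplacingMergeTree deduplicating tables

ClickHouse can provide non-real-time deduplicated tables through the ReplacingMergeTree family of engines. Automatic deduplication generally lags by about 10 seconds; manual deduplication takes around 5 seconds of processing, depending on the data volume.

ReplacingMergeTree table example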

-- Create the table (the primary key must be a prefix of the sorting key)
CREATE TABLE test_tbl_replacing
(
    id UInt16,
    create_time Date,
    comment Nullable(String)
) ENGINE = ReplacingMergeTree()
PARTITION BY create_time
ORDER BY (id, create_time)
PRIMARY KEY (id)
TTL create_time + INTERVAL 1 MONTH
SETTINGS index_granularity=8192;

-- Insert rows with duplicate keys
insert into test_tbl_replacing values(0, '2019-12-12', null);
insert into test_tbl_replacing values(0, '2019-12-12', null);
insert into test_tbl_replacing values(1, '2019-12-13', null);
insert into test_tbl_replacing values(1, '2019-12-13', null);
insert into test_tbl_replacing values(2, '2019-12-14', null);

-- Query: before compaction (merge), the rows with duplicate keys are still present.
select count(*) from test_tbl_replacing;
┌─count()─┐
│       5 │
└─────────┘

select * from test_tbl_replacing;
┌─id─┬─create_time─┬─comment─┐
│  0 │  2019-12-12 │ ᴺᵁᴸᴸ    │
└────┴─────────────┴─────────┘
┌─id─┬─create_time─┬─comment─┐
│  0 │  2019-12-12 │ ᴺᵁᴸᴸ    │
└────┴─────────────┴─────────┘
┌─id─┬─create_time─┬─comment─┐
│  1 │  2019-12-13 │ ᴺᵁᴸᴸ    │
└────┴─────────────┴─────────┘
┌─id─┬─create_time─┬─comment─┐
│  1 │  2019-12-13 │ ᴺᵁᴸᴸ    │
└────┴─────────────┴─────────┘
┌─id─┬─create_time─┬─comment─┐
│  2 │  2019-12-14 │ ᴺᵁᴸᴸ    │
└────┴─────────────┴─────────┘

-- Force a background compaction:
optimize table test_tbl_replacing final;

-- Query again: the rows with duplicate keys are gone.
select count(*) from test_tbl_replacing;
┌─count()─┐
│       3 │
└─────────┘

select * from test_tbl_replacing;
┌─id─┬─create_time─┬─comment─┐
│  2 │  2019-12-14 │ ᴺᵁᴸᴸ    │
└────┴─────────────┴─────────┘
┌─id─┬─create_time─┬─comment─┐
│  1 │  2019-12-13 │ ᴺᵁᴸᴸ    │
└────┴─────────────┴─────────┘
┌─id─┬─create_time─┬─comment─┐
│  0 │  2019-12-12 │ ᴺᵁᴸᴸ    │
└────┴─────────────┴─────────┘

-- Note: the Distributed table over a deduplicating engine must use the intHash64() sharding strategy,
-- so that duplicate rows are written to the same shard and partition directory and can be deduplicated there
CREATE TABLE test_tbl_replacing_dist ON CLUSTER cluster01
(
    id UInt16,
    create_time Date,
    comment Nullable(String)
)
ENGINE = Distributed(cluster01, test, test_tbl_replacing, intHash64(id));
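
Because background merges are asynchronous, a query that needs deduplicated results before the merge has happened can ask for deduplication at read time with the FINAL modifier. This is a sketch against the test_tbl_replacing table above; FINAL merges rows while reading, so it is generally slower than a plain SELECT.

```sql
-- Deduplicated count without waiting for (or forcing) a merge.
SELECT count() FROM test_tbl_replacing FINAL;

-- Deduplicated rows, one per sorting-key value within each partition.
SELECT id, create_time, comment
FROM test_tbl_replacing FINAL
ORDER BY id;
```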

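VII. Audience selection with bitmaps (user profiling)

Audience selection means finding the list of users that satisfy several conditions at once, that is, an intersection. User information may be spread across several tables, for example a geography table, a shopping table, a behavior table, and so on.

The usual approach is to query each table with group by userId and then join the results to obtain the users that satisfy all conditions. The downside is that joining several large tables consumes a great deal of resources and performs poorly.

Bitmaps can replace this kind of join; for how bitmaps work, see the illustrated guide "图解bitmap".

An example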

-- Create two test tables
create table test_bitmap(userId UInt32, type String) engine=MergeTree() order by tuple();
create table test_bitmap2(userId UInt32, type String) engine=MergeTree() order by tuple();
-- Load 100 million rows into each, with only 10 million distinct users
insert into test_bitmap select rand()%10000000, arrayElement(['a','b','c'], rand()%3 + 1) from numbers(100000000);
insert into test_bitmap2 select rand()%10000000, arrayElement(['a','b','c'], rand()%3 + 1) from numbers(100000000);
-- For each table, collect the matching userIds into a bitmap, join the two results on the other condition (type),
-- intersect the two bitmaps, and take the cardinality of the intersection, i.e. the number of users present in both.
-- bitmapCardinality can also be replaced with bitmapToArray to output the user ids in the intersection
select a.type, bitmapCardinality(bitmapAnd(a.users, b.users)) from (select groupBitmapState(userId) as users, type from test_bitmap group by type) a
global join (select groupBitmapState(userId) as users, type from test_bitmap2 group by type) b on a.type = b.type

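Note, however, that bitmaps (in the version used here) only support Int32/UInt32; Int64/UInt64 integers produce serious errors.
If only the number of users in the intersection matters, a 64-bit id can first be reduced to 32 bits with xxHash32(userId64). Hashing always produces a small number of collisions, so the intersection count then carries a small error as well.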
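
A sketch of the hashing workaround, assuming two hypothetical tables `test_bitmap64` and `test_bitmap64_2` that mirror the test tables above but store a 64-bit `userId64`:

```sql
-- Hypothetical 64-bit variants of the test tables.
create table test_bitmap64(userId64 UInt64, type String) engine=MergeTree() order by tuple();
create table test_bitmap64_2(userId64 UInt64, type String) engine=MergeTree() order by tuple();

-- Hash each 64-bit id down to 32 bits before building the bitmap.
-- The result is an estimate: distinct ids that collide under xxHash32 are counted once.
select a.type, bitmapCardinality(bitmapAnd(a.users, b.users)) as common_users
from (select groupBitmapState(xxHash32(userId64)) as users, type from test_bitmap64 group by type) a
join (select groupBitmapState(xxHash32(userId64)) as users, type from test_bitmap64_2 group by type) b on a.type = b.type;
```

bitmapToArray would return hash values rather than user ids here, so this variant is only useful for counting.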

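Pros and cons

Pros:

Column-oriented DBMS
Data compression
Data stored on disk
Parallel processing on multiple cores
Distributed processing across multiple nodes
NULL is supported only through Nullable column types
JOIN support
Vectorized engine: when the root operator of a query plan pulls data from its child operators, each child returns a whole batch of rows at once. Other engines usually return one tuple at a time, which wastes CPU cycles looping over the operators. The downside is that returning batches requires each upper-level operator to buffer data, which increases storage overhead; but because columnar storage compresses so well, the gap between disk I/O throughput and compute throughput narrows.
Real-time data updates
Indexes: with an index, only the data parts covering the requested time range are returned
Data replication and data integrity support, using asynchronous multi-master replication
Exact deduplication by choosing an engine from the MergeTree family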

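Cons:

No transactions.
For aggregations, the query result must fit in the memory of a single server; the amount of source data, however, can be arbitrarily large.
No full-featured UPDATE / DELETE implementation.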

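Performance:

Throughput of a single large query: a single server processes 2-10 GB/s, so the row rate is the byte rate divided by the bytes per row (for example, 2 GB/s at 100 bytes per row is about 20 million rows/s).
Latency of short queries: for a primary-key lookup returning fewer than roughly ten thousand rows, latency is typically around 50 ms.
Throughput of many short queries: about 100 QPS per server is recommended.
Data insertion: insert in batches of at least 1000 rows, with no more than one insert request per second. Inserts into MergeTree tables run at 50-200 MB/s, and running several insert requests in parallel scales performance linearly.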
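
The batching advice can be illustrated with a small sketch. The table `insert_demo` is hypothetical, and numbers() stands in for a real batch of application rows.

```sql
-- Hypothetical table used only for this illustration.
CREATE TABLE insert_demo (id UInt64, payload String) ENGINE = MergeTree() ORDER BY id;

-- One request carrying a whole batch (100,000 generated rows here),
-- rather than 100,000 single-row requests.
INSERT INTO insert_demo
SELECT number AS id, toString(number) AS payload
FROM numbers(100000);

-- Every INSERT creates at least one data part that later has to be merged,
-- so the number of active parts shows how fragmented the writes have been.
SELECT count() AS active_parts, sum(rows) AS total_rows
FROM system.parts
WHERE database = currentDatabase() AND table = 'insert_demo' AND active;
```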

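Optimization:

Disable swap: exchanging data between physical memory and swap slows queries down.
Add the join_use_nulls setting to every account. Without it, when a row of the left table has no match in the right table, the right table's columns return the default value of their data type instead of NULL as in standard SQL.
In a JOIN, always put the smaller table on the right. Whether it is a LEFT JOIN, RIGHT JOIN or INNER JOIN, ClickHouse loads the right table into memory and matches the rows of the left table against it, so the right table must be the small one.
When writing in batches, limit the number of partitions each batch touches, and preferably sort the data before importing it. Unsorted data or batches spanning too many partitions prevent ClickHouse from merging newly imported data in time, which hurts query performance.
Keep both sides of a JOIN as small as possible; where necessary, pre-aggregate one of the tables to reduce its row count. GROUP BY before JOIN is sometimes faster than JOIN before GROUP BY.
Distributed tables are less cost-effective in performance than local (physical) tables. Keep the number of distinct partition-key values low when creating tables, so that the disk does not fill up during data import.
Queries start to fluctuate at around 50% CPU usage, and widespread query timeouts appear at 70%. CPU is the most critical metric and deserves close monitoring.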
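
The effect of join_use_nulls can be seen with a self-contained sketch that needs no tables; the inline subqueries and aliases are made up for illustration.

```sql
-- join_use_nulls = 0 (the default): unmatched right-side columns get the
-- type's default value, so r.name is an empty string for ids 0 and 2.
SELECT l.id, r.name
FROM (SELECT number AS id FROM numbers(3)) AS l
LEFT JOIN (SELECT toUInt64(1) AS id, 'one' AS name) AS r ON l.id = r.id
ORDER BY l.id
SETTINGS join_use_nulls = 0;

-- join_use_nulls = 1: unmatched right-side columns are NULL, as in standard SQL.
SELECT l.id, r.name
FROM (SELECT number AS id FROM numbers(3)) AS l
LEFT JOIN (SELECT toUInt64(1) AS id, 'one' AS name) AS r ON l.id = r.id
ORDER BY l.id
SETTINGS join_use_nulls = 1;
```

Setting it in each account's profile, as recommended above, avoids having to repeat the SETTINGS clause on every query.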