Note:
Hive version 2.1.1

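Contents

  • Overview of Hive job optimization
  • 1. Parallel execution
  • 2. Local execution
  • 3. Merging small input files
  • 4. Merging small output files
  • 5. Controlling the number of map/reduce tasks
    • 5.1 Controlling the number of map tasks in a Hive job
      • 5.1.1 Merging small files to reduce the number of map tasks
      • 5.1.2 Increasing the number of map tasks where appropriate
    • 5.2 Controlling the number of reduce tasks in a Hive job
  • References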

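Overview of Hive job optimization

In day-to-day development, Hive SQL often runs slowly: the job information shows that the job is still running, yet the result takes a long time to arrive.

A Hive SQL job can be optimized from the following angles: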

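  1. Parallel execution
    By default, the MR jobs that Hive generates run sequentially. Jobs that do not depend on each other can run in parallel.
    set hive.exec.parallel=true;

  2. Local execution
    Hive can use MR to process large-scale data, but in some scenarios the data volume is very small and the query can run locally without being submitted to the cluster.
    Related parameters:
    set hive.exec.mode.local.auto=true;
    hive.exec.mode.local.auto.inputbytes.max (default 128 MB)
    hive.exec.mode.local.auto.input.files.max (default 4)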

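  3. Merging small input files
    If a job's input consists of many small files, too many map tasks are created, which hurts efficiency.
    -- merge small files before the map phase runs
    set hive.input.format=org.apache.hadoop.hive.ql.io.CombineHiveInputFormat;
    -- maximum input size of each map
    set mapred.max.split.size=256000000;
    -- minimum size of a split on one node
    set mapred.min.split.size.per.node=100000000;
    -- minimum size of a split within one rack
    set mapred.min.split.size.per.rack=100000000;

  4. Merging small output files
    -- merge small files at the end of a map-only job
    set hive.merge.mapfiles=true;
    -- merge small files written by the reduce side
    set hive.merge.mapredfiles=true;
    -- when the average output file size is below this value, launch a new job to merge the files
    set hive.merge.smallfiles.avgsize=256000000;
    -- size of each file after merging
    set hive.merge.size.per.task=64000000;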

  5. Controlling the number of map/reduce tasks
    The number of map and reduce tasks determines the degree of parallelism of the job.

Num_Map_tasks = $inputsize / max($mapred.min.split.size, min($dfs.block.size, $mapred.max.split.size))
Num_Reduce_tasks = min($hive.exec.reducers.max, $inputsize / $hive.exec.reducers.bytes.per.reducer)
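As a rough illustration of the two formulas above, the following Python sketch plugs in sample numbers. The function names, the 10 GiB input size, the sample parameter values and the rounding up to whole tasks are assumptions made for this example; the actual counts also depend on the input format and on how the files are laid out.

```python
def estimate_map_tasks(input_size, min_split, block_size, max_split):
    """Map-count formula from the overview, rounded up to whole tasks."""
    split_size = max(min_split, min(block_size, max_split))
    return -(-input_size // split_size)  # integer ceiling division


def estimate_reduce_tasks(input_size, bytes_per_reducer, max_reducers):
    """Reduce-count formula from the overview, with at least one reducer."""
    wanted = max(1, -(-input_size // bytes_per_reducer))
    return min(max_reducers, wanted)


# Hypothetical sample values (not taken from the test cluster).
input_size = 10 * 1024 ** 3          # 10 GiB of input
maps = estimate_map_tasks(
    input_size,
    min_split=1,                     # mapred.min.split.size
    block_size=128 * 1024 ** 2,      # dfs.block.size
    max_split=256_000_000,           # mapred.max.split.size
)
reduces = estimate_reduce_tasks(
    input_size,
    bytes_per_reducer=256_000_000,   # hive.exec.reducers.bytes.per.reducer
    max_reducers=1009,               # hive.exec.reducers.max
)
print(maps, reduces)
```

Raising the split size lowers the map count, and raising the bytes-per-reducer value lowers the reduce count, which is the lever the later sections use.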

1. Parallel execution

By default, the MR jobs that Hive generates run sequentially. Jobs that do not depend on each other can run in parallel.
set hive.exec.parallel=true;

Code:

set hive.exec.parallel=false;
select count(*) from ods_fact_sale_orc
union all
select count(*) from ods_fact_sale_partion
;

set hive.exec.parallel=true;
-- maximum number of jobs that run in parallel; the default is 8
set hive.exec.parallel.thread.number=8;
select count(*) from ods_fact_sale_orc
union all
select count(*) from ods_fact_sale_partion
;

My local test environment has limited resources and cannot run MR jobs in parallel, so the test record is omitted here.
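To make the idea of "jobs with no dependencies can run together" concrete, here is a small Python sketch that groups stages into batches that could run at the same time. It is a simplified model, not Hive's scheduler: the stage names and the dependency graph for the union query are hypothetical, and `max_parallel` only mimics the role of hive.exec.parallel.thread.number.

```python
def parallel_batches(deps, max_parallel=8):
    """Group stages into batches whose members have no unmet dependencies.

    deps maps a stage name to the stages it depends on.
    max_parallel caps how many stages share one batch.
    """
    remaining = {stage: set(d) for stage, d in deps.items()}
    batches = []
    while remaining:
        ready = sorted(s for s, d in remaining.items() if not d)
        if not ready:
            raise ValueError("dependency cycle or unknown stage")
        for i in range(0, len(ready), max_parallel):
            batches.append(ready[i:i + max_parallel])
        for stage in ready:
            del remaining[stage]
        for d in remaining.values():
            d.difference_update(ready)
    return batches


# Hypothetical plan: two independent count(*) stages feeding a union stage.
plan = {"Stage-1": [], "Stage-3": [], "Stage-2": ["Stage-1", "Stage-3"]}
print(parallel_batches(plan))                  # both counts share a batch
print(parallel_batches(plan, max_parallel=1))  # behaves like sequential execution
```

With `max_parallel=1` every stage gets its own batch, which corresponds to the sequential behavior of hive.exec.parallel=false.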

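2. Local execution

Hive can use MR to process large-scale data, but in some scenarios the data volume is very small and the query can run locally without being submitted to the cluster.
Related parameters:

set hive.exec.mode.local.auto=true;
hive.exec.mode.local.auto.inputbytes.max (default 128 MB)
hive.exec.mode.local.auto.input.files.max (default 4)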
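The way these thresholds interact can be sketched in a few lines of Python. This is a minimal model, not Hive's code: the function name is made up, the "at most one reducer" condition comes from the Hive documentation rather than from this article, and treating the limits as inclusive is an assumption.

```python
def eligible_for_local_mode(input_bytes, input_files, num_reducers,
                            max_bytes=128 * 1024 ** 2, max_files=4):
    """Return True when a job is small enough to run in local mode.

    max_bytes stands for hive.exec.mode.local.auto.inputbytes.max and
    max_files for hive.exec.mode.local.auto.input.files.max.
    The inclusive comparisons are an assumption of this sketch.
    """
    return (input_bytes <= max_bytes
            and input_files <= max_files
            and num_reducers <= 1)


# A tiny table stored as two small files with a map-only query.
print(eligible_for_local_mode(input_bytes=101, input_files=2, num_reducers=0))
# A 200 MB input exceeds the default byte limit.
print(eligible_for_local_mode(input_bytes=200_000_000, input_files=2, num_reducers=0))
```

Raising `max_files` (for example to 50) makes jobs with more input files eligible, as long as the total size stays under the byte limit.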

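Code:

set hive.exec.mode.local.auto=false;
select * from emp where empno = 7369;

-- enable automatic local-mode MR
set hive.exec.mode.local.auto=true;
-- maximum input size for local MR; local mode is used when the input is smaller than this value
set hive.exec.mode.local.auto.inputbytes.max=50000000;
-- older name of the input-file limit; Hive 0.9 and later use hive.exec.mode.local.auto.input.files.max instead
set hive.exec.mode.local.auto.tasks.max=10;
-- maximum number of input files for local MR; local mode is used when the file count is smaller than this value
set hive.exec.mode.local.auto.input.files.max=50;
select * from emp where empno = 7369;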

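Test record:
For this simple query, submitting to the cluster takes about 15 seconds, while local execution takes less than 4 seconds, a substantial speedup.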

hive> set hive.exec.mode.local.auto=false;
hive> select * from emp where empno = 7369;
Query ID = root_20210115095845_16ac6734-b0c2-457a-9c71-6c34655dd84e
Total jobs = 1
Launching Job 1 out of 1
Number of reduce tasks is set to 0 since there's no reduce operator
Starting Job = job_1610015767041_0038, Tracking URL = http://hp1:8088/proxy/application_1610015767041_0038/
Kill Command = /opt/cloudera/parcels/CDH-6.3.1-1.cdh6.3.1.p0.1470567/lib/hadoop/bin/hadoop job  -kill job_1610015767041_0038
Hadoop job information for Stage-1: number of mappers: 2; number of reducers: 0
2021-01-15 09:58:53,165 Stage-1 map = 0%,  reduce = 0%
2021-01-15 09:58:59,419 Stage-1 map = 100%,  reduce = 0%, Cumulative CPU 6.78 sec
MapReduce Total cumulative CPU time: 6 seconds 780 msec
Ended Job = job_1610015767041_0038
MapReduce Jobs Launched:
Stage-Stage-1: Map: 2   Cumulative CPU: 6.78 sec   HDFS Read: 13532 HDFS Write: 232 HDFS EC Read: 0 SUCCESS
Total MapReduce CPU Time Spent: 6 seconds 780 msec
OK
7369    smith   clerk   7902    1980-12-17      800.00  NULL    20
Time taken: 15.377 seconds, Fetched: 1 row(s)
hive> set hive.exec.mode.local.auto=true;
hive> set hive.exec.mode.local.auto.inputbytes.max=50000000;
hive> set hive.exec.mode.local.auto.tasks.max=10;
hive> set hive.exec.mode.local.auto.input.files.max = 50;
hive> select * from emp where empno = 7369;
Automatically selecting local only mode for query
Query ID = root_20210115095924_96fedfe1-cd3e-4ec5-aea9-723ec60e416b
Total jobs = 1
Launching Job 1 out of 1
Number of reduce tasks is set to 0 since there's no reduce operator
21/01/15 09:59:27 INFO mapred.LocalDistributedCacheManager: Creating symlink: /tmp/hadoop-root/mapred/local/1610675964612/3.0.0-cdh6.3.1-mr-framework.tar.gz <- /root/mr-framework
21/01/15 09:59:27 INFO mapred.LocalDistributedCacheManager: Localized hdfs://nameservice1/user/yarn/mapreduce/mr-framework/3.0.0-cdh6.3.1-mr-framework.tar.gz as file:/tmp/hadoop-root/mapred/local/1610675964612/3.0.0-cdh6.3.1-mr-framework.tar.gz
21/01/15 09:59:27 INFO mapred.LocalDistributedCacheManager: Creating symlink: /tmp/hadoop-root/mapred/local/1610675964613/libjars <- /root/libjars/*
21/01/15 09:59:27 WARN mapred.LocalDistributedCacheManager: Failed to create symlink: /tmp/hadoop-root/mapred/local/1610675964613/libjars <- /root/libjars/*
21/01/15 09:59:27 INFO mapred.LocalDistributedCacheManager: Localized file:/tmp/hadoop/mapred/staging/root1720005872/.staging/job_local1720005872_0002/libjars as file:/tmp/hadoop-root/mapred/local/1610675964613/libjars
Job running in-process (local Hadoop)
21/01/15 09:59:27 INFO mapred.LocalJobRunner: OutputCommitter set in config org.apache.hadoop.hive.ql.io.HiveFileFormatUtils$NullOutputCommitter
21/01/15 09:59:27 INFO mapred.LocalJobRunner: OutputCommitter is org.apache.hadoop.hive.ql.io.HiveFileFormatUtils$NullOutputCommitter
21/01/15 09:59:27 INFO mapred.LocalJobRunner: Waiting for map tasks
21/01/15 09:59:27 INFO mapred.LocalJobRunner: Starting task: attempt_local1720005872_0002_m_000000_0
21/01/15 09:59:27 INFO mapred.LocalJobRunner:
21/01/15 09:59:27 INFO mapred.LocalJobRunner: hdfs://nameservice1/user/hive/warehouse/test.db/emp/000000_0_copy_9:0+53
21/01/15 09:59:27 INFO mapred.LocalJobRunner: Finishing task: attempt_local1720005872_0002_m_000000_0
21/01/15 09:59:27 INFO mapred.LocalJobRunner: Starting task: attempt_local1720005872_0002_m_000001_0
21/01/15 09:59:27 INFO mapred.LocalJobRunner:
21/01/15 09:59:27 INFO mapred.LocalJobRunner: hdfs://nameservice1/user/hive/warehouse/test.db/emp/000000_0_copy_8:0+48
21/01/15 09:59:27 INFO mapred.LocalJobRunner: Finishing task: attempt_local1720005872_0002_m_000001_0
21/01/15 09:59:27 INFO mapred.LocalJobRunner: map task executor complete.
2021-01-15 09:59:28,168 Stage-1 map = 100%,  reduce = 0%
Ended Job = job_local1720005872_0002
MapReduce Jobs Launched:
Stage-Stage-1:  HDFS Read: 940220630 HDFS Write: 564850399 HDFS EC Read: 0 SUCCESS
Total MapReduce CPU Time Spent: 0 msec
OK
7369    smith   clerk   7902    1980-12-17      800.00  NULL    20
Time taken: 3.794 seconds, Fetched: 1 row(s)
hive>

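3. Merging small input files

If a job's input consists of many small files, too many map tasks are created, which hurts efficiency.
-- merge small files before the map phase runs
set hive.input.format=org.apache.hadoop.hive.ql.io.CombineHiveInputFormat;
-- maximum input size of each map
set mapred.max.split.size=256000000;
-- minimum size of a split on one node
set mapred.min.split.size.per.node=100000000;
-- minimum size of a split within one rack
set mapred.min.split.size.per.rack=100000000;

The system default for hive.input.format is already org.apache.hadoop.hive.ql.io.CombineHiveInputFormat,
so we start tuning from the mapred.max.split.size parameter.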
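The effect of mapred.max.split.size on the number of map tasks can be illustrated with a simplified packing model in Python. This sketch only packs bytes greedily into splits of a maximum size; it ignores node and rack locality and HDFS block boundaries, so it is an approximation of the idea rather than the CombineFileInputFormat algorithm. The function name and the file sizes are hypothetical.

```python
def combine_splits(file_sizes, max_split_size):
    """Greedily pack file bytes into splits of at most max_split_size bytes.

    Small files are merged into one split; a file larger than the limit
    is cut across several splits. Returns the list of split sizes.
    """
    splits = []
    current = 0
    for size in file_sizes:
        remaining = size
        while remaining > 0:
            take = min(remaining, max_split_size - current)
            current += take
            remaining -= take
            if current == max_split_size:
                splits.append(current)
                current = 0
    if current > 0:
        splits.append(current)
    return splits


# Hypothetical input: 1000 files of 1 MB each.
small_files = [1_000_000] * 1000
print(len(combine_splits(small_files, 256_000_000)))    # splits with a 256 MB limit
print(len(combine_splits(small_files, 1_024_000_000)))  # splits with a 1 GB limit
```

Each split becomes the input of one map task, so a larger limit means fewer, larger map tasks.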

Code:

set mapred.max.split.size=256000000;
select count(*) from ods_fact_sale;
set mapred.max.split.size=1024000000;
select count(*) from ods_fact_sale;

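Test record:
Raising the maximum input size per map packs more small files into each split: the number of mappers drops from 117 to 30 and the run time from 478 seconds to 237 seconds, roughly twice as fast.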

hive> set mapred.max.split.size=256000000;
hive> select count(*) from ods_fact_sale;
Query ID = root_20210108095302_fc928195-30a0-4956-a201-06a81dfeb155
Total jobs = 1
Launching Job 1 out of 1
Number of reduce tasks determined at compile time: 1
In order to change the average load for a reducer (in bytes):
  set hive.exec.reducers.bytes.per.reducer=<number>
In order to limit the maximum number of reducers:
  set hive.exec.reducers.max=<number>
In order to set a constant number of reducers:
  set mapreduce.job.reduces=<number>
Starting Job = job_1610015767041_0001, Tracking URL = http://hp1:8088/proxy/application_1610015767041_0001/
Kill Command = /opt/cloudera/parcels/CDH-6.3.1-1.cdh6.3.1.p0.1470567/lib/hadoop/bin/hadoop job  -kill job_1610015767041_0001
Hadoop job information for Stage-1: number of mappers: 117; number of reducers: 1
2021-01-08 09:53:13,719 Stage-1 map = 0%,  reduce = 0%
2021-01-08 09:53:24,058 Stage-1 map = 2%,  reduce = 0%, Cumulative CPU 13.17 sec
2021-01-08 09:53:30,232 Stage-1 map = 3%,  reduce = 0%, Cumulative CPU 19.27 sec
2021-01-08 09:53:37,447 Stage-1 map = 4%,  reduce = 0%, Cumulative CPU 31.55 sec
2021-01-08 09:53:38,475 Stage-1 map = 5%,  reduce = 0%, Cumulative CPU 37.6 sec
2021-01-08 09:53:43,625 Stage-1 map = 6%,  reduce = 0%, Cumulative CPU 43.7 sec
2021-01-08 09:53:45,681 Stage-1 map = 7%,  reduce = 0%, Cumulative CPU 49.85 sec
2021-01-08 09:53:49,801 Stage-1 map = 8%,  reduce = 0%, Cumulative CPU 55.96 sec
2021-01-08 09:53:52,881 Stage-1 map = 9%,  reduce = 0%, Cumulative CPU 62.14 sec
2021-01-08 09:53:59,036 Stage-1 map = 10%,  reduce = 0%, Cumulative CPU 74.19 sec
2021-01-08 09:54:03,149 Stage-1 map = 11%,  reduce = 0%, Cumulative CPU 80.24 sec
2021-01-08 09:54:06,224 Stage-1 map = 12%,  reduce = 0%, Cumulative CPU 86.19 sec
2021-01-08 09:54:10,339 Stage-1 map = 13%,  reduce = 0%, Cumulative CPU 92.07 sec
2021-01-08 09:54:13,430 Stage-1 map = 14%,  reduce = 0%, Cumulative CPU 98.09 sec
2021-01-08 09:54:16,511 Stage-1 map = 15%,  reduce = 0%, Cumulative CPU 104.02 sec
2021-01-08 09:54:23,690 Stage-1 map = 16%,  reduce = 0%, Cumulative CPU 116.06 sec
2021-01-08 09:54:25,738 Stage-1 map = 17%,  reduce = 0%, Cumulative CPU 122.13 sec
2021-01-08 09:54:30,859 Stage-1 map = 18%,  reduce = 0%, Cumulative CPU 128.32 sec
2021-01-08 09:54:31,882 Stage-1 map = 19%,  reduce = 0%, Cumulative CPU 133.84 sec
2021-01-08 09:54:36,999 Stage-1 map = 20%,  reduce = 0%, Cumulative CPU 139.87 sec
2021-01-08 09:54:39,041 Stage-1 map = 21%,  reduce = 0%, Cumulative CPU 145.88 sec
2021-01-08 09:54:45,187 Stage-1 map = 22%,  reduce = 0%, Cumulative CPU 158.01 sec
2021-01-08 09:54:50,312 Stage-1 map = 23%,  reduce = 0%, Cumulative CPU 164.21 sec
2021-01-08 09:54:51,337 Stage-1 map = 24%,  reduce = 0%, Cumulative CPU 170.44 sec
2021-01-08 09:54:57,483 Stage-1 map = 25%,  reduce = 0%, Cumulative CPU 176.53 sec
2021-01-08 09:54:58,502 Stage-1 map = 26%,  reduce = 0%, Cumulative CPU 182.63 sec
2021-01-08 09:55:05,640 Stage-1 map = 27%,  reduce = 0%, Cumulative CPU 194.87 sec
2021-01-08 09:55:10,749 Stage-1 map = 28%,  reduce = 0%, Cumulative CPU 200.93 sec
2021-01-08 09:55:11,775 Stage-1 map = 29%,  reduce = 0%, Cumulative CPU 206.88 sec
2021-01-08 09:55:16,879 Stage-1 map = 30%,  reduce = 0%, Cumulative CPU 212.95 sec
2021-01-08 09:55:17,903 Stage-1 map = 31%,  reduce = 0%, Cumulative CPU 218.94 sec
2021-01-08 09:55:24,025 Stage-1 map = 32%,  reduce = 0%, Cumulative CPU 225.02 sec
2021-01-08 09:55:30,166 Stage-1 map = 33%,  reduce = 0%, Cumulative CPU 237.62 sec
2021-01-08 09:55:31,189 Stage-1 map = 34%,  reduce = 0%, Cumulative CPU 243.58 sec
2021-01-08 09:55:36,312 Stage-1 map = 35%,  reduce = 0%, Cumulative CPU 249.74 sec
2021-01-08 09:55:38,363 Stage-1 map = 36%,  reduce = 0%, Cumulative CPU 255.66 sec
2021-01-08 09:55:43,481 Stage-1 map = 37%,  reduce = 0%, Cumulative CPU 261.76 sec
2021-01-08 09:55:45,526 Stage-1 map = 38%,  reduce = 0%, Cumulative CPU 267.83 sec
2021-01-08 09:55:52,680 Stage-1 map = 39%,  reduce = 0%, Cumulative CPU 280.1 sec
2021-01-08 09:55:57,794 Stage-1 map = 40%,  reduce = 0%, Cumulative CPU 286.35 sec
2021-01-08 09:55:58,820 Stage-1 map = 41%,  reduce = 0%, Cumulative CPU 292.32 sec
2021-01-08 09:56:04,955 Stage-1 map = 42%,  reduce = 0%, Cumulative CPU 298.4 sec
2021-01-08 09:56:05,977 Stage-1 map = 43%,  reduce = 0%, Cumulative CPU 304.46 sec
2021-01-08 09:56:11,079 Stage-1 map = 44%,  reduce = 0%, Cumulative CPU 310.53 sec
2021-01-08 09:56:17,212 Stage-1 map = 45%,  reduce = 0%, Cumulative CPU 322.43 sec
2021-01-08 09:56:18,238 Stage-1 map = 46%,  reduce = 0%, Cumulative CPU 328.44 sec
2021-01-08 09:56:23,351 Stage-1 map = 47%,  reduce = 0%, Cumulative CPU 334.45 sec
2021-01-08 09:56:25,398 Stage-1 map = 48%,  reduce = 0%, Cumulative CPU 340.55 sec
2021-01-08 09:56:29,489 Stage-1 map = 49%,  reduce = 0%, Cumulative CPU 346.59 sec
2021-01-08 09:56:31,544 Stage-1 map = 50%,  reduce = 0%, Cumulative CPU 352.58 sec
2021-01-08 09:56:38,709 Stage-1 map = 51%,  reduce = 0%, Cumulative CPU 363.88 sec
2021-01-08 09:56:41,781 Stage-1 map = 52%,  reduce = 0%, Cumulative CPU 370.0 sec
2021-01-08 09:56:45,864 Stage-1 map = 53%,  reduce = 0%, Cumulative CPU 376.15 sec
2021-01-08 09:56:48,926 Stage-1 map = 54%,  reduce = 0%, Cumulative CPU 382.14 sec
2021-01-08 09:56:53,009 Stage-1 map = 55%,  reduce = 0%, Cumulative CPU 388.37 sec
2021-01-08 09:56:55,055 Stage-1 map = 56%,  reduce = 0%, Cumulative CPU 394.4 sec
2021-01-08 09:57:02,263 Stage-1 map = 57%,  reduce = 0%, Cumulative CPU 406.9 sec
2021-01-08 09:57:06,356 Stage-1 map = 58%,  reduce = 0%, Cumulative CPU 412.94 sec
2021-01-08 09:57:08,406 Stage-1 map = 59%,  reduce = 0%, Cumulative CPU 418.95 sec
2021-01-08 09:57:13,534 Stage-1 map = 61%,  reduce = 0%, Cumulative CPU 431.11 sec
2021-01-08 09:57:19,695 Stage-1 map = 62%,  reduce = 0%, Cumulative CPU 437.12 sec
2021-01-08 09:57:26,861 Stage-1 map = 64%,  reduce = 0%, Cumulative CPU 455.35 sec
2021-01-08 09:57:33,019 Stage-1 map = 65%,  reduce = 0%, Cumulative CPU 461.72 sec
2021-01-08 09:57:34,048 Stage-1 map = 66%,  reduce = 0%, Cumulative CPU 467.72 sec
2021-01-08 09:57:40,215 Stage-1 map = 68%,  reduce = 0%, Cumulative CPU 480.15 sec
2021-01-08 09:57:47,404 Stage-1 map = 69%,  reduce = 0%, Cumulative CPU 492.39 sec
2021-01-08 09:57:53,562 Stage-1 map = 70%,  reduce = 0%, Cumulative CPU 498.58 sec
2021-01-08 09:57:54,594 Stage-1 map = 71%,  reduce = 0%, Cumulative CPU 504.62 sec
2021-01-08 09:58:00,740 Stage-1 map = 73%,  reduce = 0%, Cumulative CPU 516.89 sec
2021-01-08 09:58:06,887 Stage-1 map = 74%,  reduce = 0%, Cumulative CPU 529.17 sec
2021-01-08 09:58:13,036 Stage-1 map = 75%,  reduce = 0%, Cumulative CPU 535.33 sec
2021-01-08 09:58:14,062 Stage-1 map = 76%,  reduce = 0%, Cumulative CPU 541.35 sec
2021-01-08 09:58:19,220 Stage-1 map = 77%,  reduce = 0%, Cumulative CPU 547.59 sec
2021-01-08 09:58:21,266 Stage-1 map = 78%,  reduce = 0%, Cumulative CPU 553.51 sec
2021-01-08 09:58:26,398 Stage-1 map = 79%,  reduce = 0%, Cumulative CPU 559.61 sec
2021-01-08 09:58:33,570 Stage-1 map = 80%,  reduce = 0%, Cumulative CPU 571.67 sec
2021-01-08 09:58:34,597 Stage-1 map = 81%,  reduce = 0%, Cumulative CPU 577.74 sec
2021-01-08 09:58:39,711 Stage-1 map = 82%,  reduce = 0%, Cumulative CPU 583.74 sec
2021-01-08 09:58:45,860 Stage-1 map = 83%,  reduce = 0%, Cumulative CPU 589.8 sec
2021-01-08 09:58:49,965 Stage-1 map = 83%,  reduce = 28%, Cumulative CPU 590.58 sec
2021-01-08 09:58:52,015 Stage-1 map = 84%,  reduce = 28%, Cumulative CPU 596.77 sec
2021-01-08 09:58:59,208 Stage-1 map = 85%,  reduce = 28%, Cumulative CPU 603.02 sec
2021-01-08 09:59:12,545 Stage-1 map = 86%,  reduce = 28%, Cumulative CPU 615.23 sec
2021-01-08 09:59:13,573 Stage-1 map = 86%,  reduce = 29%, Cumulative CPU 615.3 sec
2021-01-08 09:59:18,696 Stage-1 map = 87%,  reduce = 29%, Cumulative CPU 621.24 sec
2021-01-08 09:59:25,866 Stage-1 map = 88%,  reduce = 29%, Cumulative CPU 627.3 sec
2021-01-08 09:59:30,975 Stage-1 map = 89%,  reduce = 29%, Cumulative CPU 632.86 sec
2021-01-08 09:59:37,112 Stage-1 map = 90%,  reduce = 29%, Cumulative CPU 638.91 sec
2021-01-08 09:59:38,137 Stage-1 map = 90%,  reduce = 30%, Cumulative CPU 638.96 sec
2021-01-08 09:59:44,277 Stage-1 map = 91%,  reduce = 30%, Cumulative CPU 644.97 sec
2021-01-08 09:59:58,593 Stage-1 map = 92%,  reduce = 30%, Cumulative CPU 657.22 sec
2021-01-08 10:00:02,681 Stage-1 map = 92%,  reduce = 31%, Cumulative CPU 657.28 sec
2021-01-08 10:00:05,748 Stage-1 map = 93%,  reduce = 31%, Cumulative CPU 663.32 sec
2021-01-08 10:00:11,877 Stage-1 map = 94%,  reduce = 31%, Cumulative CPU 669.46 sec
2021-01-08 10:00:18,009 Stage-1 map = 95%,  reduce = 31%, Cumulative CPU 675.49 sec
2021-01-08 10:00:20,055 Stage-1 map = 95%,  reduce = 32%, Cumulative CPU 675.55 sec
2021-01-08 10:00:24,152 Stage-1 map = 96%,  reduce = 32%, Cumulative CPU 681.46 sec
2021-01-08 10:00:30,288 Stage-1 map = 97%,  reduce = 32%, Cumulative CPU 687.97 sec
2021-01-08 10:00:43,594 Stage-1 map = 98%,  reduce = 32%, Cumulative CPU 701.09 sec
2021-01-08 10:00:44,619 Stage-1 map = 98%,  reduce = 33%, Cumulative CPU 701.15 sec
2021-01-08 10:00:50,744 Stage-1 map = 99%,  reduce = 33%, Cumulative CPU 707.46 sec
2021-01-08 10:00:57,906 Stage-1 map = 100%,  reduce = 33%, Cumulative CPU 714.02 sec
2021-01-08 10:00:58,932 Stage-1 map = 100%,  reduce = 100%, Cumulative CPU 716.38 sec
MapReduce Total cumulative CPU time: 11 minutes 56 seconds 380 msec
Ended Job = job_1610015767041_0001
MapReduce Jobs Launched:
Stage-Stage-1: Map: 117  Reduce: 1   Cumulative CPU: 716.38 sec   HDFS Read: 31436897886 HDFS Write: 109 HDFS EC Read: 0 SUCCESS
Total MapReduce CPU Time Spent: 11 minutes 56 seconds 380 msec
OK
767830000
Time taken: 478.119 seconds, Fetched: 1 row(s)
hive> set mapred.max.split.size=1024000000;
hive> select count(*) from ods_fact_sale;
Query ID = root_20210108100115_bf06e3b5-6ff1-4c29-ab4a-25c8b1559791
Total jobs = 1
Launching Job 1 out of 1
Number of reduce tasks determined at compile time: 1
In order to change the average load for a reducer (in bytes):
  set hive.exec.reducers.bytes.per.reducer=<number>
In order to limit the maximum number of reducers:
  set hive.exec.reducers.max=<number>
In order to set a constant number of reducers:
  set mapreduce.job.reduces=<number>
Starting Job = job_1610015767041_0002, Tracking URL = http://hp1:8088/proxy/application_1610015767041_0002/
Kill Command = /opt/cloudera/parcels/CDH-6.3.1-1.cdh6.3.1.p0.1470567/lib/hadoop/bin/hadoop job  -kill job_1610015767041_0002
Hadoop job information for Stage-1: number of mappers: 30; number of reducers: 1
2021-01-08 10:01:23,394 Stage-1 map = 0%,  reduce = 0%
2021-01-08 10:01:38,815 Stage-1 map = 3%,  reduce = 0%, Cumulative CPU 13.38 sec
2021-01-08 10:01:39,837 Stage-1 map = 7%,  reduce = 0%, Cumulative CPU 27.8 sec
2021-01-08 10:01:52,127 Stage-1 map = 10%,  reduce = 0%, Cumulative CPU 40.53 sec
2021-01-08 10:01:53,151 Stage-1 map = 13%,  reduce = 0%, Cumulative CPU 52.79 sec
2021-01-08 10:02:05,431 Stage-1 map = 17%,  reduce = 0%, Cumulative CPU 65.41 sec
2021-01-08 10:02:06,458 Stage-1 map = 20%,  reduce = 0%, Cumulative CPU 78.34 sec
2021-01-08 10:02:18,762 Stage-1 map = 23%,  reduce = 0%, Cumulative CPU 91.39 sec
2021-01-08 10:02:19,787 Stage-1 map = 27%,  reduce = 0%, Cumulative CPU 104.75 sec
2021-01-08 10:02:31,062 Stage-1 map = 30%,  reduce = 0%, Cumulative CPU 117.2 sec
2021-01-08 10:02:32,093 Stage-1 map = 33%,  reduce = 0%, Cumulative CPU 130.27 sec
2021-01-08 10:02:44,385 Stage-1 map = 37%,  reduce = 0%, Cumulative CPU 142.72 sec
2021-01-08 10:02:45,410 Stage-1 map = 40%,  reduce = 0%, Cumulative CPU 155.18 sec
2021-01-08 10:02:57,675 Stage-1 map = 43%,  reduce = 0%, Cumulative CPU 168.91 sec
2021-01-08 10:02:58,701 Stage-1 map = 47%,  reduce = 0%, Cumulative CPU 181.56 sec
2021-01-08 10:03:03,809 Stage-1 map = 50%,  reduce = 0%, Cumulative CPU 186.53 sec
2021-01-08 10:03:10,962 Stage-1 map = 53%,  reduce = 0%, Cumulative CPU 198.51 sec
2021-01-08 10:03:18,128 Stage-1 map = 57%,  reduce = 0%, Cumulative CPU 211.03 sec
2021-01-08 10:03:24,256 Stage-1 map = 60%,  reduce = 0%, Cumulative CPU 224.14 sec
2021-01-08 10:03:33,442 Stage-1 map = 63%,  reduce = 0%, Cumulative CPU 240.3 sec
2021-01-08 10:03:37,521 Stage-1 map = 67%,  reduce = 0%, Cumulative CPU 252.98 sec
2021-01-08 10:03:47,734 Stage-1 map = 70%,  reduce = 0%, Cumulative CPU 266.97 sec
2021-01-08 10:03:50,800 Stage-1 map = 73%,  reduce = 0%, Cumulative CPU 279.74 sec
2021-01-08 10:04:01,021 Stage-1 map = 77%,  reduce = 0%, Cumulative CPU 292.45 sec
2021-01-08 10:04:04,097 Stage-1 map = 80%,  reduce = 0%, Cumulative CPU 305.39 sec
2021-01-08 10:04:14,330 Stage-1 map = 83%,  reduce = 0%, Cumulative CPU 318.52 sec
2021-01-08 10:04:17,400 Stage-1 map = 87%,  reduce = 0%, Cumulative CPU 331.82 sec
2021-01-08 10:04:28,637 Stage-1 map = 87%,  reduce = 29%, Cumulative CPU 332.58 sec
2021-01-08 10:04:29,657 Stage-1 map = 90%,  reduce = 29%, Cumulative CPU 345.6 sec
2021-01-08 10:04:34,759 Stage-1 map = 90%,  reduce = 30%, Cumulative CPU 345.75 sec
2021-01-08 10:04:42,925 Stage-1 map = 93%,  reduce = 30%, Cumulative CPU 358.71 sec
2021-01-08 10:04:47,016 Stage-1 map = 93%,  reduce = 31%, Cumulative CPU 358.79 sec
2021-01-08 10:04:55,206 Stage-1 map = 97%,  reduce = 31%, Cumulative CPU 371.67 sec
2021-01-08 10:04:59,301 Stage-1 map = 97%,  reduce = 32%, Cumulative CPU 371.73 sec
2021-01-08 10:05:09,527 Stage-1 map = 100%,  reduce = 32%, Cumulative CPU 385.23 sec
2021-01-08 10:05:10,559 Stage-1 map = 100%,  reduce = 100%, Cumulative CPU 387.27 sec
MapReduce Total cumulative CPU time: 6 minutes 27 seconds 270 msec
Ended Job = job_1610015767041_0002
MapReduce Jobs Launched:
Stage-Stage-1: Map: 30  Reduce: 1   Cumulative CPU: 387.27 sec   HDFS Read: 31436522876 HDFS Write: 109 HDFS EC Read: 0 SUCCESS
Total MapReduce CPU Time Spent: 6 minutes 27 seconds 270 msec
OK
767830000
Time taken: 237.197 seconds, Fetched: 1 row(s)
hive>

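4. Merging small output files

After a regular map-reduce job finishes, Hive can launch a merge job to combine the files written by the reduce side. Enabling this is recommended.

  1. hive.merge.smallfiles.avgsize (default 16 MB)

For a non-partitioned table, a merge job is launched when the average size of the table's output files is below this value. For a partitioned table, the average file size is computed for each partition separately, and only partitions whose average is below this value are merged. The setting only takes effect when hive.merge.mapfiles or hive.merge.mapredfiles is set to true.

  2. hive.exec.reducers.bytes.per.reducer (default 1 GB before Hive 0.14 and 256 MB since then; my test environment defaults to 64 MB)

If the user does not set mapred.reduce.tasks explicitly, Hive computes the input summary size of all files read from the input directories and divides it by this value to get the number of reducers:

   reducers = (int) ((totalInputFileSize + bytesPerReducer - 1) / bytesPerReducer);
   reducers = Math.max(1, reducers);
   reducers = Math.min(maxReducers, reducers);

  3. hive.merge.size.per.task (default 256 MB)

This is the target size (targetSize) of each file after the merge job. Dividing the total size of the previous job's output files by this value determines the number of reducers of the merge job. The map side of the merge job is effectively an identity map; the data is then shuffled to the reducers, and each reducer dumps one file. This is how the number and size of the output files are controlled.

  4. mapred.max.split.size (default 256 MB), mapred.min.split.size.per.node (default 1 byte), mapred.min.split.size.per.rack (default 1 byte)

These three parameters are used by CombineFileInputFormat. Hive's default InputFormat is CombineHiveInputFormat, and all of its calls (including the most important ones, getSplits and getRecordReader) are translated into CombineFileInputFormat calls, so it can be seen as a wrapper around it.

CombineFileInputFormat can merge many small files into the input of a single map. If a file is very large, it can also cut it into the inputs of several maps. One CombineFileSplit corresponds to the input of one map and contains a group of paths (a list of HDFS paths), start offsets, lengths and locations (a list of hostnames where the files reside).

mapred.max.split.size is the maximum size of a split, mapred.min.split.size.per.node is the minimum size of a split on one node (datanode), and mapred.min.split.size.per.rack is the minimum size of a split within the same rack (rack locality). Together these three values determine how the series of CombineFileSplit objects is formed. Users can reduce the number of map tasks by increasing mapred.max.split.size.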
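The interplay of hive.merge.smallfiles.avgsize and hive.merge.size.per.task described in this section can be sketched as follows. This Python model only reproduces the arithmetic in the text (average-size check, then total size divided by the target size); the function name, the rounding up and the sample file sizes are assumptions for illustration.

```python
def plan_merge(file_sizes, avg_size_threshold=16_000_000,
               size_per_task=256_000_000):
    """Return the number of files a merge job would aim for, or 0 if no merge.

    avg_size_threshold stands for hive.merge.smallfiles.avgsize and
    size_per_task for hive.merge.size.per.task.
    """
    if not file_sizes:
        return 0
    total = sum(file_sizes)
    average = total / len(file_sizes)
    if average >= avg_size_threshold:
        return 0  # files are large enough on average, no merge job
    # total size divided by the target size, rounded up, at least one file
    return max(1, -(-total // size_per_task))


# Hypothetical output of a job with many reducers: 491 files of 1 MB each.
print(plan_merge([1_000_000] * 491))
# Four files of 100 MB each are already above the average-size threshold.
print(plan_merge([100_000_000] * 4))
```

For a partitioned table the same check would be applied to the files of each partition separately, as described above.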

Code:

set hive.exec.reducers.bytes.per.reducer = 64000000;

CREATE TABLE merge_test1(
prod_name      string,
max_sale_nums       int,
min_sale_nums        int
)
STORED AS textfile;

insert into merge_test1
select prod_name,max(sale_nums),min(sale_nums)
from ods_fact_sale
group by prod_name;

set hive.exec.reducers.bytes.per.reducer = 1024000000;

CREATE TABLE merge_test2(
prod_name      string,
max_sale_nums       int,
min_sale_nums        int
)
STORED AS textfile;

insert into merge_test2
select prod_name,max(sale_nums),min(sale_nums)
from ods_fact_sale
group by prod_name;

-- this job has not only many reducers but also many mappers; also raise the map split size and compare
set hive.exec.reducers.bytes.per.reducer = 1024000000;
set mapred.max.split.size=1024000000;

CREATE TABLE merge_test3(
prod_name      string,
max_sale_nums       int,
min_sale_nums        int
)
STORED AS textfile;

insert into merge_test3
select prod_name,max(sale_nums),min(sale_nums)
from ods_fact_sale
group by prod_name;

Test record:
The number of reducers drops from 491 to 31 and the reduce time falls sharply, which improves execution efficiency.

hive> set hive.exec.reducers.bytes.per.reducer = 64000000;
hive> CREATE TABLE merge_test1(
    > prod_name      string,
    > max_sale_nums       int,
    > min_sale_nums        int
    > )
    > STORED AS textfile ;
OK
Time taken: 0.44 seconds
hive> insert into merge_test1
    > select prod_name,max(sale_nums),min(sale_nums)
    > from ods_fact_sale
    > group by prod_name;
Query ID = root_20210114190128_0ad3fd5e-a9be-48e6-a473-f2ceea28753b
Total jobs = 1
Launching Job 1 out of 1
Number of reduce tasks not specified. Estimated from input data size: 491
In order to change the average load for a reducer (in bytes):
  set hive.exec.reducers.bytes.per.reducer=<number>
In order to limit the maximum number of reducers:
  set hive.exec.reducers.max=<number>
In order to set a constant number of reducers:
  set mapreduce.job.reduces=<number>
Starting Job = job_1610015767041_0032, Tracking URL = http://hp1:8088/proxy/application_1610015767041_0032/
Kill Command = /opt/cloudera/parcels/CDH-6.3.1-1.cdh6.3.1.p0.1470567/lib/hadoop/bin/hadoop job  -kill job_1610015767041_0032
Hadoop job information for Stage-1: number of mappers: 117; number of reducers: 491
2021-01-14 19:01:40,009 Stage-1 map = 0%,  reduce = 0%
2021-01-14 19:01:51,572 Stage-1 map = 2%,  reduce = 0%, Cumulative CPU 20.26 sec
2021-01-14 19:02:00,900 Stage-1 map = 3%,  reduce = 0%, Cumulative CPU 29.61 sec
2021-01-14 19:02:10,192 Stage-1 map = 4%,  reduce = 0%, Cumulative CPU 48.25 sec
2021-01-14 19:02:11,215 Stage-1 map = 5%,  reduce = 0%, Cumulative CPU 57.84 sec
2021-01-14 19:02:19,444 Stage-1 map = 6%,  reduce = 0%, Cumulative CPU 67.39 sec
2021-01-14 19:02:20,474 Stage-1 map = 7%,  reduce = 0%, Cumulative CPU 76.67 sec
2021-01-14 19:02:28,691 Stage-1 map = 8%,  reduce = 0%, Cumulative CPU 86.07 sec
2021-01-14 19:02:29,715 Stage-1 map = 9%,  reduce = 0%, Cumulative CPU 95.53 sec
2021-01-14 19:02:38,952 Stage-1 map = 10%,  reduce = 0%, Cumulative CPU 113.56 sec
2021-01-14 19:02:47,187 Stage-1 map = 11%,  reduce = 0%, Cumulative CPU 122.89 sec
2021-01-14 19:02:48,213 Stage-1 map = 12%,  reduce = 0%, Cumulative CPU 131.94 sec
2021-01-14 19:02:55,421 Stage-1 map = 13%,  reduce = 0%, Cumulative CPU 140.98 sec
2021-01-14 19:02:56,448 Stage-1 map = 14%,  reduce = 0%, Cumulative CPU 149.91 sec
2021-01-14 19:03:04,672 Stage-1 map = 15%,  reduce = 0%, Cumulative CPU 158.94 sec
2021-01-14 19:03:13,917 Stage-1 map = 16%,  reduce = 0%, Cumulative CPU 177.18 sec
2021-01-14 19:03:14,943 Stage-1 map = 17%,  reduce = 0%, Cumulative CPU 186.2 sec
2021-01-14 19:03:23,142 Stage-1 map = 18%,  reduce = 0%, Cumulative CPU 195.49 sec
2021-01-14 19:03:24,169 Stage-1 map = 19%,  reduce = 0%, Cumulative CPU 204.71 sec
2021-01-14 19:03:32,392 Stage-1 map = 21%,  reduce = 0%, Cumulative CPU 222.8 sec
2021-01-14 19:03:41,632 Stage-1 map = 22%,  reduce = 0%, Cumulative CPU 240.59 sec
2021-01-14 19:03:50,882 Stage-1 map = 24%,  reduce = 0%, Cumulative CPU 259.31 sec
2021-01-14 19:04:00,123 Stage-1 map = 26%,  reduce = 0%, Cumulative CPU 277.48 sec
2021-01-14 19:04:09,362 Stage-1 map = 27%,  reduce = 0%, Cumulative CPU 295.12 sec
2021-01-14 19:04:17,569 Stage-1 map = 28%,  reduce = 0%, Cumulative CPU 304.25 sec
2021-01-14 19:04:18,598 Stage-1 map = 29%,  reduce = 0%, Cumulative CPU 313.34 sec
2021-01-14 19:04:26,809 Stage-1 map = 31%,  reduce = 0%, Cumulative CPU 331.84 sec
2021-01-14 19:04:36,035 Stage-1 map = 32%,  reduce = 0%, Cumulative CPU 350.14 sec
2021-01-14 19:04:45,285 Stage-1 map = 34%,  reduce = 0%, Cumulative CPU 368.4 sec
2021-01-14 19:04:54,527 Stage-1 map = 36%,  reduce = 0%, Cumulative CPU 386.85 sec
2021-01-14 19:05:02,755 Stage-1 map = 38%,  reduce = 0%, Cumulative CPU 404.95 sec
2021-01-14 19:05:11,991 Stage-1 map = 39%,  reduce = 0%, Cumulative CPU 423.57 sec
2021-01-14 19:05:21,196 Stage-1 map = 41%,  reduce = 0%, Cumulative CPU 441.83 sec
2021-01-14 19:05:30,419 Stage-1 map = 43%,  reduce = 0%, Cumulative CPU 460.06 sec
2021-01-14 19:05:39,645 Stage-1 map = 44%,  reduce = 0%, Cumulative CPU 478.47 sec
2021-01-14 19:05:47,844 Stage-1 map = 46%,  reduce = 0%, Cumulative CPU 496.52 sec
2021-01-14 19:05:57,057 Stage-1 map = 48%,  reduce = 0%, Cumulative CPU 514.75 sec
2021-01-14 19:06:06,287 Stage-1 map = 50%,  reduce = 0%, Cumulative CPU 533.26 sec
2021-01-14 19:06:16,555 Stage-1 map = 51%,  reduce = 0%, Cumulative CPU 550.72 sec
2021-01-14 19:06:22,700 Stage-1 map = 52%,  reduce = 0%, Cumulative CPU 560.75 sec
2021-01-14 19:06:25,764 Stage-1 map = 53%,  reduce = 0%, Cumulative CPU 570.45 sec
2021-01-14 19:06:32,931 Stage-1 map = 54%,  reduce = 0%, Cumulative CPU 579.77 sec
2021-01-14 19:06:34,980 Stage-1 map = 55%,  reduce = 0%, Cumulative CPU 589.18 sec
2021-01-14 19:06:42,160 Stage-1 map = 56%,  reduce = 0%, Cumulative CPU 598.64 sec
2021-01-14 19:06:50,355 Stage-1 map = 57%,  reduce = 0%, Cumulative CPU 617.02 sec
2021-01-14 19:06:52,395 Stage-1 map = 58%,  reduce = 0%, Cumulative CPU 626.5 sec
2021-01-14 19:06:59,573 Stage-1 map = 59%,  reduce = 0%, Cumulative CPU 635.5 sec
2021-01-14 19:07:02,654 Stage-1 map = 60%,  reduce = 0%, Cumulative CPU 644.2 sec
2021-01-14 19:07:08,811 Stage-1 map = 61%,  reduce = 0%, Cumulative CPU 653.34 sec
2021-01-14 19:07:11,884 Stage-1 map = 62%,  reduce = 0%, Cumulative CPU 662.47 sec
2021-01-14 19:07:21,104 Stage-1 map = 63%,  reduce = 0%, Cumulative CPU 680.52 sec
2021-01-14 19:07:26,230 Stage-1 map = 64%,  reduce = 0%, Cumulative CPU 689.71 sec
2021-01-14 19:07:29,303 Stage-1 map = 65%,  reduce = 0%, Cumulative CPU 698.82 sec
2021-01-14 19:07:35,443 Stage-1 map = 66%,  reduce = 0%, Cumulative CPU 708.11 sec
2021-01-14 19:07:38,520 Stage-1 map = 67%,  reduce = 0%, Cumulative CPU 717.54 sec
2021-01-14 19:07:44,681 Stage-1 map = 68%,  reduce = 0%, Cumulative CPU 726.63 sec
2021-01-14 19:07:53,910 Stage-1 map = 69%,  reduce = 0%, Cumulative CPU 744.91 sec
2021-01-14 19:07:56,984 Stage-1 map = 70%,  reduce = 0%, Cumulative CPU 753.6 sec
2021-01-14 19:08:02,103 Stage-1 map = 71%,  reduce = 0%, Cumulative CPU 762.52 sec
2021-01-14 19:08:05,183 Stage-1 map = 72%,  reduce = 0%, Cumulative CPU 771.53 sec
2021-01-14 19:08:11,343 Stage-1 map = 73%,  reduce = 0%, Cumulative CPU 780.5 sec
2021-01-14 19:08:14,411 Stage-1 map = 74%,  reduce = 0%, Cumulative CPU 789.43 sec
2021-01-14 19:08:23,626 Stage-1 map = 75%,  reduce = 0%, Cumulative CPU 807.77 sec
2021-01-14 19:08:29,773 Stage-1 map = 76%,  reduce = 0%, Cumulative CPU 816.92 sec
2021-01-14 19:08:32,855 Stage-1 map = 77%,  reduce = 0%, Cumulative CPU 826.07 sec
2021-01-14 19:08:39,012 Stage-1 map = 78%,  reduce = 0%, Cumulative CPU 835.04 sec
2021-01-14 19:08:42,085 Stage-1 map = 79%,  reduce = 0%, Cumulative CPU 844.31 sec
2021-01-14 19:08:50,282 Stage-1 map = 80%,  reduce = 0%, Cumulative CPU 862.28 sec
2021-01-14 19:08:56,435 Stage-1 map = 81%,  reduce = 0%, Cumulative CPU 871.3 sec
2021-01-14 19:08:59,517 Stage-1 map = 82%,  reduce = 0%, Cumulative CPU 880.27 sec
2021-01-14 19:09:08,745 Stage-1 map = 83%,  reduce = 0%, Cumulative CPU 889.25 sec
2021-01-14 19:09:17,980 Stage-1 map = 84%,  reduce = 0%, Cumulative CPU 899.29 sec
2021-01-14 19:09:27,209 Stage-1 map = 85%,  reduce = 0%, Cumulative CPU 908.46 sec
2021-01-14 19:09:44,642 Stage-1 map = 86%,  reduce = 0%, Cumulative CPU 926.91 sec
2021-01-14 19:09:53,881 Stage-1 map = 87%,  reduce = 0%, Cumulative CPU 936.23 sec
2021-01-14 19:10:03,098 Stage-1 map = 88%,  reduce = 0%, Cumulative CPU 945.51 sec
2021-01-14 19:10:12,330 Stage-1 map = 89%,  reduce = 0%, Cumulative CPU 954.95 sec
2021-01-14 19:10:21,552 Stage-1 map = 90%,  reduce = 0%, Cumulative CPU 964.34 sec
2021-01-14 19:10:29,749 Stage-1 map = 91%,  reduce = 0%, Cumulative CPU 973.66 sec
2021-01-14 19:10:48,196 Stage-1 map = 92%,  reduce = 0%, Cumulative CPU 992.05 sec
2021-01-14 19:10:57,412 Stage-1 map = 93%,  reduce = 0%, Cumulative CPU 1001.08 sec
2021-01-14 19:11:06,644 Stage-1 map = 94%,  reduce = 0%, Cumulative CPU 1010.73 sec
2021-01-14 19:11:14,839 Stage-1 map = 95%,  reduce = 0%, Cumulative CPU 1019.97 sec
2021-01-14 19:11:24,067 Stage-1 map = 96%,  reduce = 0%, Cumulative CPU 1029.3 sec
2021-01-14 19:11:33,310 Stage-1 map = 97%,  reduce = 0%, Cumulative CPU 1038.48 sec
2021-01-14 19:11:50,765 Stage-1 map = 98%,  reduce = 0%, Cumulative CPU 1057.35 sec
2021-01-14 19:12:00,020 Stage-1 map = 99%,  reduce = 0%, Cumulative CPU 1066.54 sec
2021-01-14 19:12:09,255 Stage-1 map = 100%,  reduce = 0%, Cumulative CPU 1075.51 sec
2021-01-14 19:12:15,439 Stage-1 map = 100%,  reduce = 1%, Cumulative CPU 1082.5 sec
2021-01-14 19:12:25,678 Stage-1 map = 100%,  reduce = 2%, Cumulative CPU 1095.86 sec
2021-01-14 19:12:34,879 Stage-1 map = 100%,  reduce = 3%, Cumulative CPU 1109.58 sec
2021-01-14 19:12:45,106 Stage-1 map = 100%,  reduce = 4%, Cumulative CPU 1122.74 sec
2021-01-14 19:12:55,334 Stage-1 map = 100%,  reduce = 5%, Cumulative CPU 1136.11 sec
2021-01-14 19:13:05,582 Stage-1 map = 100%,  reduce = 6%, Cumulative CPU 1149.12 sec
2021-01-14 19:13:13,776 Stage-1 map = 100%,  reduce = 7%, Cumulative CPU 1159.81 sec
2021-01-14 19:13:23,005 Stage-1 map = 100%,  reduce = 8%, Cumulative CPU 1172.96 sec
2021-01-14 19:13:33,248 Stage-1 map = 100%,  reduce = 9%, Cumulative CPU 1186.41 sec
2021-01-14 19:13:43,498 Stage-1 map = 100%,  reduce = 10%, Cumulative CPU 1199.92 sec
2021-01-14 19:13:53,741 Stage-1 map = 100%,  reduce = 11%, Cumulative CPU 1213.52 sec
2021-01-14 19:14:02,954 Stage-1 map = 100%,  reduce = 12%, Cumulative CPU 1226.76 sec
2021-01-14 19:14:13,196 Stage-1 map = 100%,  reduce = 13%, Cumulative CPU 1239.9 sec
2021-01-14 19:14:23,428 Stage-1 map = 100%,  reduce = 14%, Cumulative CPU 1253.19 sec
2021-01-14 19:14:33,671 Stage-1 map = 100%,  reduce = 15%, Cumulative CPU 1266.5 sec
2021-01-14 19:14:43,901 Stage-1 map = 100%,  reduce = 16%, Cumulative CPU 1279.88 sec
2021-01-14 19:14:53,123 Stage-1 map = 100%,  reduce = 17%, Cumulative CPU 1293.37 sec
2021-01-14 19:15:01,323 Stage-1 map = 100%,  reduce = 18%, Cumulative CPU 1303.97 sec
2021-01-14 19:15:11,564 Stage-1 map = 100%,  reduce = 19%, Cumulative CPU 1316.96 sec
2021-01-14 19:15:21,782 Stage-1 map = 100%,  reduce = 20%, Cumulative CPU 1330.34 sec
2021-01-14 19:15:32,004 Stage-1 map = 100%,  reduce = 21%, Cumulative CPU 1343.36 sec
2021-01-14 19:15:42,233 Stage-1 map = 100%,  reduce = 22%, Cumulative CPU 1357.14 sec
2021-01-14 19:15:51,448 Stage-1 map = 100%,  reduce = 23%, Cumulative CPU 1370.58 sec
2021-01-14 19:16:01,687 Stage-1 map = 100%,  reduce = 24%, Cumulative CPU 1383.78 sec
2021-01-14 19:16:11,950 Stage-1 map = 100%,  reduce = 25%, Cumulative CPU 1397.07 sec
2021-01-14 19:16:22,194 Stage-1 map = 100%,  reduce = 26%, Cumulative CPU 1410.58 sec
2021-01-14 19:16:31,415 Stage-1 map = 100%,  reduce = 27%, Cumulative CPU 1423.93 sec
2021-01-14 19:16:41,656 Stage-1 map = 100%,  reduce = 28%, Cumulative CPU 1437.45 sec
2021-01-14 19:16:49,855 Stage-1 map = 100%,  reduce = 29%, Cumulative CPU 1448.27 sec
2021-01-14 19:17:00,092 Stage-1 map = 100%,  reduce = 30%, Cumulative CPU 1461.38 sec
2021-01-14 19:17:10,324 Stage-1 map = 100%,  reduce = 31%, Cumulative CPU 1475.12 sec
2021-01-14 19:17:19,521 Stage-1 map = 100%,  reduce = 32%, Cumulative CPU 1488.25 sec
2021-01-14 19:17:29,759 Stage-1 map = 100%,  reduce = 33%, Cumulative CPU 1501.32 sec
2021-01-14 19:17:40,008 Stage-1 map = 100%,  reduce = 34%, Cumulative CPU 1514.62 sec
2021-01-14 19:17:50,259 Stage-1 map = 100%,  reduce = 35%, Cumulative CPU 1528.17 sec
2021-01-14 19:17:59,490 Stage-1 map = 100%,  reduce = 36%, Cumulative CPU 1541.34 sec
2021-01-14 19:18:09,743 Stage-1 map = 100%,  reduce = 37%, Cumulative CPU 1554.44 sec
2021-01-14 19:18:19,982 Stage-1 map = 100%,  reduce = 38%, Cumulative CPU 1568.05 sec
2021-01-14 19:18:30,207 Stage-1 map = 100%,  reduce = 39%, Cumulative CPU 1581.4 sec
2021-01-14 19:18:38,393 Stage-1 map = 100%,  reduce = 40%, Cumulative CPU 1592.03 sec
2021-01-14 19:18:47,619 Stage-1 map = 100%,  reduce = 41%, Cumulative CPU 1605.4 sec
2021-01-14 19:18:57,872 Stage-1 map = 100%,  reduce = 42%, Cumulative CPU 1618.69 sec
2021-01-14 19:19:08,116 Stage-1 map = 100%,  reduce = 43%, Cumulative CPU 1632.45 sec
2021-01-14 19:19:18,345 Stage-1 map = 100%,  reduce = 44%, Cumulative CPU 1645.95 sec
2021-01-14 19:19:28,590 Stage-1 map = 100%,  reduce = 45%, Cumulative CPU 1659.07 sec
2021-01-14 19:19:37,817 Stage-1 map = 100%,  reduce = 46%, Cumulative CPU 1672.16 sec
2021-01-14 19:19:48,050 Stage-1 map = 100%,  reduce = 47%, Cumulative CPU 1685.57 sec
2021-01-14 19:19:58,291 Stage-1 map = 100%,  reduce = 48%, Cumulative CPU 1698.8 sec
2021-01-14 19:20:08,544 Stage-1 map = 100%,  reduce = 49%, Cumulative CPU 1711.89 sec
2021-01-14 19:20:18,773 Stage-1 map = 100%,  reduce = 50%, Cumulative CPU 1725.25 sec
2021-01-14 19:20:26,955 Stage-1 map = 100%,  reduce = 51%, Cumulative CPU 1736.15 sec
2021-01-14 19:20:36,155 Stage-1 map = 100%,  reduce = 52%, Cumulative CPU 1749.24 sec
2021-01-14 19:20:46,404 Stage-1 map = 100%,  reduce = 53%, Cumulative CPU 1762.65 sec
2021-01-14 19:20:56,657 Stage-1 map = 100%,  reduce = 54%, Cumulative CPU 1775.99 sec
2021-01-14 19:21:06,879 Stage-1 map = 100%,  reduce = 55%, Cumulative CPU 1789.3 sec
2021-01-14 19:21:16,104 Stage-1 map = 100%,  reduce = 56%, Cumulative CPU 1802.41 sec
2021-01-14 19:21:26,332 Stage-1 map = 100%,  reduce = 57%, Cumulative CPU 1815.6 sec
2021-01-14 19:21:36,583 Stage-1 map = 100%,  reduce = 58%, Cumulative CPU 1828.81 sec
2021-01-14 19:21:46,819 Stage-1 map = 100%,  reduce = 59%, Cumulative CPU 1842.02 sec
2021-01-14 19:21:57,058 Stage-1 map = 100%,  reduce = 60%, Cumulative CPU 1855.19 sec
2021-01-14 19:22:06,269 Stage-1 map = 100%,  reduce = 61%, Cumulative CPU 1868.47 sec
2021-01-14 19:22:14,467 Stage-1 map = 100%,  reduce = 62%, Cumulative CPU 1879.02 sec
2021-01-14 19:22:24,703 Stage-1 map = 100%,  reduce = 63%, Cumulative CPU 1892.26 sec
2021-01-14 19:22:34,925 Stage-1 map = 100%,  reduce = 64%, Cumulative CPU 1905.62 sec
2021-01-14 19:22:45,166 Stage-1 map = 100%,  reduce = 65%, Cumulative CPU 1919.13 sec
2021-01-14 19:22:54,395 Stage-1 map = 100%,  reduce = 66%, Cumulative CPU 1932.36 sec
2021-01-14 19:23:04,643 Stage-1 map = 100%,  reduce = 67%, Cumulative CPU 1945.42 sec
2021-01-14 19:23:14,872 Stage-1 map = 100%,  reduce = 68%, Cumulative CPU 1958.65 sec
2021-01-14 19:23:25,109 Stage-1 map = 100%,  reduce = 69%, Cumulative CPU 1971.72 sec
2021-01-14 19:23:34,345 Stage-1 map = 100%,  reduce = 70%, Cumulative CPU 1985.23 sec
2021-01-14 19:23:45,620 Stage-1 map = 100%,  reduce = 71%, Cumulative CPU 1998.72 sec
2021-01-14 19:23:54,836 Stage-1 map = 100%,  reduce = 72%, Cumulative CPU 2011.85 sec
2021-01-14 19:24:03,048 Stage-1 map = 100%,  reduce = 73%, Cumulative CPU 2022.39 sec
2021-01-14 19:24:15,337 Stage-1 map = 100%,  reduce = 74%, Cumulative CPU 2038.56 sec
2021-01-14 19:24:22,497 Stage-1 map = 100%,  reduce = 75%, Cumulative CPU 2049.15 sec
2021-01-14 19:24:34,747 Stage-1 map = 100%,  reduce = 76%, Cumulative CPU 2065.24 sec
2021-01-14 19:24:42,947 Stage-1 map = 100%,  reduce = 77%, Cumulative CPU 2076.07 sec
2021-01-14 19:24:55,234 Stage-1 map = 100%,  reduce = 78%, Cumulative CPU 2092.16 sec
2021-01-14 19:25:03,437 Stage-1 map = 100%,  reduce = 79%, Cumulative CPU 2102.85 sec
2021-01-14 19:25:14,690 Stage-1 map = 100%,  reduce = 80%, Cumulative CPU 2119.54 sec
2021-01-14 19:25:22,873 Stage-1 map = 100%,  reduce = 81%, Cumulative CPU 2130.15 sec
2021-01-14 19:25:35,166 Stage-1 map = 100%,  reduce = 82%, Cumulative CPU 2146.6 sec
2021-01-14 19:25:43,363 Stage-1 map = 100%,  reduce = 83%, Cumulative CPU 2157.26 sec
2021-01-14 19:25:51,543 Stage-1 map = 100%,  reduce = 84%, Cumulative CPU 2168.07 sec
2021-01-14 19:26:02,807 Stage-1 map = 100%,  reduce = 85%, Cumulative CPU 2184.15 sec
2021-01-14 19:26:10,993 Stage-1 map = 100%,  reduce = 86%, Cumulative CPU 2195.08 sec
2021-01-14 19:26:23,266 Stage-1 map = 100%,  reduce = 87%, Cumulative CPU 2211.35 sec
2021-01-14 19:26:32,495 Stage-1 map = 100%,  reduce = 88%, Cumulative CPU 2222.08 sec
2021-01-14 19:26:42,736 Stage-1 map = 100%,  reduce = 89%, Cumulative CPU 2235.54 sec
2021-01-14 19:26:51,971 Stage-1 map = 100%,  reduce = 90%, Cumulative CPU 2248.66 sec
2021-01-14 19:27:03,245 Stage-1 map = 100%,  reduce = 91%, Cumulative CPU 2261.92 sec
2021-01-14 19:27:12,488 Stage-1 map = 100%,  reduce = 92%, Cumulative CPU 2275.14 sec
2021-01-14 19:27:22,730 Stage-1 map = 100%,  reduce = 93%, Cumulative CPU 2288.47 sec
2021-01-14 19:27:31,945 Stage-1 map = 100%,  reduce = 94%, Cumulative CPU 2301.79 sec
2021-01-14 19:27:40,164 Stage-1 map = 100%,  reduce = 95%, Cumulative CPU 2312.43 sec
2021-01-14 19:27:51,448 Stage-1 map = 100%,  reduce = 96%, Cumulative CPU 2325.67 sec
2021-01-14 19:28:00,688 Stage-1 map = 100%,  reduce = 97%, Cumulative CPU 2338.72 sec
2021-01-14 19:28:10,933 Stage-1 map = 100%,  reduce = 98%, Cumulative CPU 2352.3 sec
2021-01-14 19:28:20,151 Stage-1 map = 100%,  reduce = 99%, Cumulative CPU 2365.62 sec
2021-01-14 19:28:36,544 Stage-1 map = 100%,  reduce = 100%, Cumulative CPU 2384.26 sec
MapReduce Total cumulative CPU time: 39 minutes 44 seconds 260 msec
Ended Job = job_1610015767041_0032
Loading data to table default.merge_test1
MapReduce Jobs Launched:
Stage-Stage-1: Map: 117  Reduce: 491   Cumulative CPU: 2384.26 sec   HDFS Read: 31439317219 HDFS Write: 22465 HDFS EC Read: 0 SUCCESS
Total MapReduce CPU Time Spent: 39 minutes 44 seconds 260 msec
OK
Time taken: 1631.392 seconds
hive> set hive.exec.reducers.bytes.per.reducer = 1024000000;
hive> CREATE TABLE merge_test2(
    > prod_name        string,
    > max_sale_nums    int,
    > min_sale_nums    int
    > )
    > STORED AS textfile ;
OK
Time taken: 0.078 seconds
hive> insert into merge_test2
    > select prod_name,max(sale_nums),min(sale_nums)
    > from ods_fact_sale
    > group by prod_name;
Query ID = root_20210114192908_41f36835-549c-46bd-a897-c6a08aafab8f
Total jobs = 1
Launching Job 1 out of 1
Number of reduce tasks not specified. Estimated from input data size: 31
In order to change the average load for a reducer (in bytes):
  set hive.exec.reducers.bytes.per.reducer=<number>
In order to limit the maximum number of reducers:
  set hive.exec.reducers.max=<number>
In order to set a constant number of reducers:
  set mapreduce.job.reduces=<number>
Starting Job = job_1610015767041_0033, Tracking URL = http://hp1:8088/proxy/application_1610015767041_0033/
Kill Command = /opt/cloudera/parcels/CDH-6.3.1-1.cdh6.3.1.p0.1470567/lib/hadoop/bin/hadoop job  -kill job_1610015767041_0033
Hadoop job information for Stage-1: number of mappers: 117; number of reducers: 31
2021-01-14 19:29:16,912 Stage-1 map = 0%,  reduce = 0%
2021-01-14 19:29:29,259 Stage-1 map = 2%,  reduce = 0%, Cumulative CPU 19.35 sec
2021-01-14 19:29:38,504 Stage-1 map = 3%,  reduce = 0%, Cumulative CPU 37.67 sec
2021-01-14 19:29:46,710 Stage-1 map = 5%,  reduce = 0%, Cumulative CPU 55.21 sec
2021-01-14 19:29:55,943 Stage-1 map = 7%,  reduce = 0%, Cumulative CPU 73.27 sec
2021-01-14 19:30:05,169 Stage-1 map = 9%,  reduce = 0%, Cumulative CPU 91.12 sec
2021-01-14 19:30:14,390 Stage-1 map = 10%,  reduce = 0%, Cumulative CPU 109.31 sec
2021-01-14 19:30:22,567 Stage-1 map = 12%,  reduce = 0%, Cumulative CPU 127.55 sec
2021-01-14 19:30:31,776 Stage-1 map = 14%,  reduce = 0%, Cumulative CPU 145.45 sec
2021-01-14 19:30:40,993 Stage-1 map = 15%,  reduce = 0%, Cumulative CPU 163.87 sec
2021-01-14 19:30:50,204 Stage-1 map = 17%,  reduce = 0%, Cumulative CPU 181.82 sec
2021-01-14 19:30:59,400 Stage-1 map = 19%,  reduce = 0%, Cumulative CPU 199.95 sec
2021-01-14 19:31:07,590 Stage-1 map = 21%,  reduce = 0%, Cumulative CPU 217.98 sec
2021-01-14 19:31:16,793 Stage-1 map = 22%,  reduce = 0%, Cumulative CPU 236.08 sec
2021-01-14 19:31:25,977 Stage-1 map = 24%,  reduce = 0%, Cumulative CPU 254.43 sec
2021-01-14 19:31:35,171 Stage-1 map = 26%,  reduce = 0%, Cumulative CPU 272.32 sec
2021-01-14 19:31:44,396 Stage-1 map = 27%,  reduce = 0%, Cumulative CPU 290.2 sec
2021-01-14 19:31:53,607 Stage-1 map = 28%,  reduce = 0%, Cumulative CPU 299.28 sec
2021-01-14 19:31:54,632 Stage-1 map = 29%,  reduce = 0%, Cumulative CPU 308.24 sec
2021-01-14 19:32:02,821 Stage-1 map = 31%,  reduce = 0%, Cumulative CPU 326.33 sec
2021-01-14 19:32:11,010 Stage-1 map = 32%,  reduce = 0%, Cumulative CPU 335.17 sec
2021-01-14 19:32:20,232 Stage-1 map = 33%,  reduce = 0%, Cumulative CPU 352.9 sec
2021-01-14 19:32:21,249 Stage-1 map = 34%,  reduce = 0%, Cumulative CPU 361.87 sec
2021-01-14 19:32:29,417 Stage-1 map = 35%,  reduce = 0%, Cumulative CPU 370.96 sec
2021-01-14 19:32:30,439 Stage-1 map = 36%,  reduce = 0%, Cumulative CPU 379.86 sec
2021-01-14 19:32:38,604 Stage-1 map = 37%,  reduce = 0%, Cumulative CPU 388.81 sec
2021-01-14 19:32:39,629 Stage-1 map = 38%,  reduce = 0%, Cumulative CPU 397.52 sec
2021-01-14 19:32:47,806 Stage-1 map = 39%,  reduce = 0%, Cumulative CPU 415.38 sec
2021-01-14 19:32:55,995 Stage-1 map = 40%,  reduce = 0%, Cumulative CPU 424.22 sec
2021-01-14 19:32:57,018 Stage-1 map = 41%,  reduce = 0%, Cumulative CPU 433.11 sec
2021-01-14 19:33:05,196 Stage-1 map = 42%,  reduce = 0%, Cumulative CPU 442.05 sec
2021-01-14 19:33:06,219 Stage-1 map = 43%,  reduce = 0%, Cumulative CPU 450.8 sec
2021-01-14 19:33:14,396 Stage-1 map = 44%,  reduce = 0%, Cumulative CPU 459.74 sec
2021-01-14 19:33:23,578 Stage-1 map = 45%,  reduce = 0%, Cumulative CPU 477.34 sec
2021-01-14 19:33:24,602 Stage-1 map = 46%,  reduce = 0%, Cumulative CPU 486.52 sec
2021-01-14 19:33:32,783 Stage-1 map = 48%,  reduce = 0%, Cumulative CPU 504.15 sec
2021-01-14 19:33:41,972 Stage-1 map = 50%,  reduce = 0%, Cumulative CPU 522.16 sec
2021-01-14 19:33:52,197 Stage-1 map = 51%,  reduce = 0%, Cumulative CPU 539.2 sec
2021-01-14 19:33:58,336 Stage-1 map = 52%,  reduce = 0%, Cumulative CPU 548.73 sec
2021-01-14 19:34:02,427 Stage-1 map = 53%,  reduce = 0%, Cumulative CPU 557.99 sec
2021-01-14 19:34:07,545 Stage-1 map = 54%,  reduce = 0%, Cumulative CPU 567.16 sec
2021-01-14 19:34:11,632 Stage-1 map = 55%,  reduce = 0%, Cumulative CPU 576.08 sec
2021-01-14 19:34:16,749 Stage-1 map = 56%,  reduce = 0%, Cumulative CPU 585.06 sec
2021-01-14 19:34:25,935 Stage-1 map = 57%,  reduce = 0%, Cumulative CPU 603.52 sec
2021-01-14 19:34:30,026 Stage-1 map = 58%,  reduce = 0%, Cumulative CPU 612.49 sec
2021-01-14 19:34:35,140 Stage-1 map = 59%,  reduce = 0%, Cumulative CPU 621.77 sec
2021-01-14 19:34:38,210 Stage-1 map = 60%,  reduce = 0%, Cumulative CPU 630.7 sec
2021-01-14 19:34:43,317 Stage-1 map = 61%,  reduce = 0%, Cumulative CPU 639.91 sec
2021-01-14 19:34:47,410 Stage-1 map = 62%,  reduce = 0%, Cumulative CPU 648.89 sec
2021-01-14 19:34:56,615 Stage-1 map = 63%,  reduce = 0%, Cumulative CPU 666.84 sec
2021-01-14 19:35:01,727 Stage-1 map = 64%,  reduce = 0%, Cumulative CPU 675.6 sec
2021-01-14 19:35:05,819 Stage-1 map = 65%,  reduce = 0%, Cumulative CPU 684.69 sec
2021-01-14 19:35:10,933 Stage-1 map = 66%,  reduce = 0%, Cumulative CPU 693.59 sec
2021-01-14 19:35:15,021 Stage-1 map = 67%,  reduce = 0%, Cumulative CPU 702.23 sec
2021-01-14 19:35:20,133 Stage-1 map = 68%,  reduce = 0%, Cumulative CPU 711.21 sec
2021-01-14 19:35:28,322 Stage-1 map = 69%,  reduce = 0%, Cumulative CPU 729.04 sec
2021-01-14 19:35:32,415 Stage-1 map = 70%,  reduce = 0%, Cumulative CPU 738.03 sec
2021-01-14 19:35:37,526 Stage-1 map = 71%,  reduce = 0%, Cumulative CPU 747.06 sec
2021-01-14 19:35:41,611 Stage-1 map = 72%,  reduce = 0%, Cumulative CPU 756.06 sec
2021-01-14 19:35:46,719 Stage-1 map = 73%,  reduce = 0%, Cumulative CPU 765.09 sec
2021-01-14 19:35:50,806 Stage-1 map = 74%,  reduce = 0%, Cumulative CPU 774.04 sec
2021-01-14 19:36:00,012 Stage-1 map = 75%,  reduce = 0%, Cumulative CPU 791.77 sec
2021-01-14 19:36:05,123 Stage-1 map = 76%,  reduce = 0%, Cumulative CPU 800.75 sec
2021-01-14 19:36:09,214 Stage-1 map = 77%,  reduce = 0%, Cumulative CPU 809.81 sec
2021-01-14 19:36:14,328 Stage-1 map = 78%,  reduce = 0%, Cumulative CPU 818.69 sec
2021-01-14 19:36:18,415 Stage-1 map = 79%,  reduce = 0%, Cumulative CPU 827.72 sec
2021-01-14 19:36:26,584 Stage-1 map = 80%,  reduce = 0%, Cumulative CPU 845.87 sec
2021-01-14 19:36:31,694 Stage-1 map = 81%,  reduce = 0%, Cumulative CPU 854.89 sec
2021-01-14 19:36:35,785 Stage-1 map = 82%,  reduce = 0%, Cumulative CPU 863.86 sec
2021-01-14 19:36:44,993 Stage-1 map = 83%,  reduce = 0%, Cumulative CPU 872.92 sec
2021-01-14 19:36:47,043 Stage-1 map = 83%,  reduce = 1%, Cumulative CPU 873.65 sec
2021-01-14 19:36:54,202 Stage-1 map = 84%,  reduce = 1%, Cumulative CPU 882.71 sec
2021-01-14 19:37:02,383 Stage-1 map = 85%,  reduce = 1%, Cumulative CPU 891.72 sec
2021-01-14 19:37:20,770 Stage-1 map = 86%,  reduce = 1%, Cumulative CPU 909.77 sec
2021-01-14 19:37:29,964 Stage-1 map = 87%,  reduce = 1%, Cumulative CPU 918.97 sec
2021-01-14 19:37:39,160 Stage-1 map = 88%,  reduce = 1%, Cumulative CPU 928.27 sec
2021-01-14 19:37:48,352 Stage-1 map = 89%,  reduce = 1%, Cumulative CPU 937.24 sec
2021-01-14 19:37:56,530 Stage-1 map = 90%,  reduce = 1%, Cumulative CPU 946.29 sec
2021-01-14 19:38:05,734 Stage-1 map = 91%,  reduce = 1%, Cumulative CPU 955.18 sec
2021-01-14 19:38:24,140 Stage-1 map = 92%,  reduce = 1%, Cumulative CPU 973.53 sec
2021-01-14 19:38:33,352 Stage-1 map = 93%,  reduce = 1%, Cumulative CPU 982.47 sec
2021-01-14 19:38:41,529 Stage-1 map = 94%,  reduce = 1%, Cumulative CPU 991.47 sec
2021-01-14 19:38:50,735 Stage-1 map = 95%,  reduce = 1%, Cumulative CPU 1000.3 sec
2021-01-14 19:38:59,949 Stage-1 map = 96%,  reduce = 1%, Cumulative CPU 1009.3 sec
2021-01-14 19:39:09,158 Stage-1 map = 97%,  reduce = 1%, Cumulative CPU 1018.28 sec
2021-01-14 19:39:27,573 Stage-1 map = 98%,  reduce = 1%, Cumulative CPU 1036.53 sec
2021-01-14 19:39:35,741 Stage-1 map = 99%,  reduce = 1%, Cumulative CPU 1045.53 sec
2021-01-14 19:39:44,943 Stage-1 map = 100%,  reduce = 1%, Cumulative CPU 1054.51 sec
2021-01-14 19:39:45,966 Stage-1 map = 100%,  reduce = 3%, Cumulative CPU 1056.5 sec
2021-01-14 19:39:49,040 Stage-1 map = 100%,  reduce = 6%, Cumulative CPU 1059.11 sec
2021-01-14 19:39:50,063 Stage-1 map = 100%,  reduce = 10%, Cumulative CPU 1061.79 sec
2021-01-14 19:39:53,133 Stage-1 map = 100%,  reduce = 13%, Cumulative CPU 1064.52 sec
2021-01-14 19:39:54,154 Stage-1 map = 100%,  reduce = 16%, Cumulative CPU 1067.05 sec
2021-01-14 19:39:57,223 Stage-1 map = 100%,  reduce = 19%, Cumulative CPU 1069.71 sec
2021-01-14 19:39:58,246 Stage-1 map = 100%,  reduce = 23%, Cumulative CPU 1072.25 sec
2021-01-14 19:40:01,296 Stage-1 map = 100%,  reduce = 26%, Cumulative CPU 1074.89 sec
2021-01-14 19:40:02,318 Stage-1 map = 100%,  reduce = 29%, Cumulative CPU 1077.5 sec
2021-01-14 19:40:05,376 Stage-1 map = 100%,  reduce = 32%, Cumulative CPU 1080.12 sec
2021-01-14 19:40:06,400 Stage-1 map = 100%,  reduce = 35%, Cumulative CPU 1082.7 sec
2021-01-14 19:40:09,466 Stage-1 map = 100%,  reduce = 39%, Cumulative CPU 1085.39 sec
2021-01-14 19:40:10,490 Stage-1 map = 100%,  reduce = 42%, Cumulative CPU 1087.95 sec
2021-01-14 19:40:13,548 Stage-1 map = 100%,  reduce = 45%, Cumulative CPU 1090.61 sec
2021-01-14 19:40:14,572 Stage-1 map = 100%,  reduce = 48%, Cumulative CPU 1093.35 sec
2021-01-14 19:40:17,633 Stage-1 map = 100%,  reduce = 52%, Cumulative CPU 1095.98 sec
2021-01-14 19:40:18,650 Stage-1 map = 100%,  reduce = 55%, Cumulative CPU 1098.62 sec
2021-01-14 19:40:21,720 Stage-1 map = 100%,  reduce = 58%, Cumulative CPU 1101.33 sec
2021-01-14 19:40:22,737 Stage-1 map = 100%,  reduce = 61%, Cumulative CPU 1103.94 sec
2021-01-14 19:40:25,791 Stage-1 map = 100%,  reduce = 65%, Cumulative CPU 1106.75 sec
2021-01-14 19:40:26,815 Stage-1 map = 100%,  reduce = 68%, Cumulative CPU 1109.47 sec
2021-01-14 19:40:28,856 Stage-1 map = 100%,  reduce = 71%, Cumulative CPU 1112.24 sec
2021-01-14 19:40:30,909 Stage-1 map = 100%,  reduce = 74%, Cumulative CPU 1115.08 sec
2021-01-14 19:40:32,951 Stage-1 map = 100%,  reduce = 77%, Cumulative CPU 1117.95 sec
2021-01-14 19:40:33,977 Stage-1 map = 100%,  reduce = 81%, Cumulative CPU 1120.63 sec
2021-01-14 19:40:37,038 Stage-1 map = 100%,  reduce = 84%, Cumulative CPU 1123.47 sec
2021-01-14 19:40:38,061 Stage-1 map = 100%,  reduce = 87%, Cumulative CPU 1126.17 sec
2021-01-14 19:40:41,128 Stage-1 map = 100%,  reduce = 90%, Cumulative CPU 1128.84 sec
2021-01-14 19:40:42,149 Stage-1 map = 100%,  reduce = 94%, Cumulative CPU 1131.45 sec
2021-01-14 19:40:45,220 Stage-1 map = 100%,  reduce = 97%, Cumulative CPU 1134.14 sec
2021-01-14 19:40:46,243 Stage-1 map = 100%,  reduce = 100%, Cumulative CPU 1136.69 sec
MapReduce Total cumulative CPU time: 18 minutes 56 seconds 690 msec
Ended Job = job_1610015767041_0033
Loading data to table default.merge_test2
MapReduce Jobs Launched:
Stage-Stage-1: Map: 117  Reduce: 31   Cumulative CPU: 1136.69 sec   HDFS Read: 31437081538 HDFS Write: 1765 HDFS EC Read: 0 SUCCESS
Total MapReduce CPU Time Spent: 18 minutes 56 seconds 690 msec
OK
Time taken: 699.009 seconds
hive> set hive.exec.reducers.bytes.per.reducer = 1024000000;
hive> set mapred.max.split.size=1024000000;
hive> CREATE TABLE merge_test3(
    > prod_name        string,
    > max_sale_nums    int,
    > min_sale_nums    int
    > )
    > STORED AS textfile ;
OK
Time taken: 0.076 seconds
hive> insert into merge_test3
    > select prod_name,max(sale_nums),min(sale_nums)
    > from ods_fact_sale
    > group by prod_name;
Query ID = root_20210114194538_e8f2b367-cdcd-42c4-825b-53a5595a14a1
Total jobs = 1
Launching Job 1 out of 1
Number of reduce tasks not specified. Estimated from input data size: 31
In order to change the average load for a reducer (in bytes):
  set hive.exec.reducers.bytes.per.reducer=<number>
In order to limit the maximum number of reducers:
  set hive.exec.reducers.max=<number>
In order to set a constant number of reducers:
  set mapreduce.job.reduces=<number>
Starting Job = job_1610015767041_0034, Tracking URL = http://hp1:8088/proxy/application_1610015767041_0034/
Kill Command = /opt/cloudera/parcels/CDH-6.3.1-1.cdh6.3.1.p0.1470567/lib/hadoop/bin/hadoop job  -kill job_1610015767041_0034
Hadoop job information for Stage-1: number of mappers: 30; number of reducers: 31
2021-01-14 19:45:45,306 Stage-1 map = 0%,  reduce = 0%
2021-01-14 19:46:02,766 Stage-1 map = 1%,  reduce = 0%, Cumulative CPU 15.31 sec
2021-01-14 19:46:03,793 Stage-1 map = 2%,  reduce = 0%, Cumulative CPU 30.5 sec
2021-01-14 19:46:08,920 Stage-1 map = 3%,  reduce = 0%, Cumulative CPU 36.79 sec
2021-01-14 19:46:09,945 Stage-1 map = 4%,  reduce = 0%, Cumulative CPU 42.99 sec
2021-01-14 19:46:10,968 Stage-1 map = 5%,  reduce = 0%, Cumulative CPU 45.27 sec
2021-01-14 19:46:11,996 Stage-1 map = 7%,  reduce = 0%, Cumulative CPU 47.59 sec
2021-01-14 19:46:26,330 Stage-1 map = 8%,  reduce = 0%, Cumulative CPU 62.73 sec
2021-01-14 19:46:27,359 Stage-1 map = 9%,  reduce = 0%, Cumulative CPU 77.85 sec
2021-01-14 19:46:32,466 Stage-1 map = 10%,  reduce = 0%, Cumulative CPU 84.08 sec
2021-01-14 19:46:33,491 Stage-1 map = 11%,  reduce = 0%, Cumulative CPU 90.32 sec
2021-01-14 19:46:34,516 Stage-1 map = 12%,  reduce = 0%, Cumulative CPU 92.67 sec
2021-01-14 19:46:35,542 Stage-1 map = 13%,  reduce = 0%, Cumulative CPU 95.09 sec
2021-01-14 19:46:48,834 Stage-1 map = 14%,  reduce = 0%, Cumulative CPU 110.26 sec
2021-01-14 19:46:49,859 Stage-1 map = 16%,  reduce = 0%, Cumulative CPU 125.56 sec
2021-01-14 19:46:56,009 Stage-1 map = 17%,  reduce = 0%, Cumulative CPU 137.94 sec
2021-01-14 19:46:57,026 Stage-1 map = 19%,  reduce = 0%, Cumulative CPU 139.66 sec
2021-01-14 19:46:58,050 Stage-1 map = 20%,  reduce = 0%, Cumulative CPU 141.64 sec
2021-01-14 19:47:12,378 Stage-1 map = 21%,  reduce = 0%, Cumulative CPU 156.7 sec
2021-01-14 19:47:13,397 Stage-1 map = 22%,  reduce = 0%, Cumulative CPU 171.78 sec
2021-01-14 19:47:18,504 Stage-1 map = 23%,  reduce = 0%, Cumulative CPU 177.95 sec
2021-01-14 19:47:19,523 Stage-1 map = 24%,  reduce = 0%, Cumulative CPU 184.12 sec
2021-01-14 19:47:20,547 Stage-1 map = 25%,  reduce = 0%, Cumulative CPU 186.18 sec
2021-01-14 19:47:21,571 Stage-1 map = 27%,  reduce = 0%, Cumulative CPU 188.04 sec
2021-01-14 19:47:34,873 Stage-1 map = 28%,  reduce = 0%, Cumulative CPU 203.22 sec
2021-01-14 19:47:35,891 Stage-1 map = 29%,  reduce = 0%, Cumulative CPU 218.4 sec
2021-01-14 19:47:41,020 Stage-1 map = 30%,  reduce = 0%, Cumulative CPU 224.63 sec
2021-01-14 19:47:42,043 Stage-1 map = 31%,  reduce = 0%, Cumulative CPU 230.9 sec
2021-01-14 19:47:43,068 Stage-1 map = 32%,  reduce = 0%, Cumulative CPU 232.75 sec
2021-01-14 19:47:44,090 Stage-1 map = 33%,  reduce = 0%, Cumulative CPU 234.97 sec
2021-01-14 19:47:58,399 Stage-1 map = 34%,  reduce = 0%, Cumulative CPU 250.1 sec
2021-01-14 19:47:59,422 Stage-1 map = 36%,  reduce = 0%, Cumulative CPU 265.39 sec
2021-01-14 19:48:05,548 Stage-1 map = 37%,  reduce = 0%, Cumulative CPU 277.83 sec
2021-01-14 19:48:06,569 Stage-1 map = 39%,  reduce = 0%, Cumulative CPU 279.85 sec
2021-01-14 19:48:07,593 Stage-1 map = 40%,  reduce = 0%, Cumulative CPU 281.98 sec
2021-01-14 19:48:20,881 Stage-1 map = 41%,  reduce = 0%, Cumulative CPU 297.11 sec
2021-01-14 19:48:22,927 Stage-1 map = 42%,  reduce = 0%, Cumulative CPU 312.34 sec
2021-01-14 19:48:27,019 Stage-1 map = 43%,  reduce = 0%, Cumulative CPU 318.57 sec
2021-01-14 19:48:28,040 Stage-1 map = 44%,  reduce = 0%, Cumulative CPU 324.73 sec
2021-01-14 19:48:29,060 Stage-1 map = 45%,  reduce = 0%, Cumulative CPU 326.87 sec
2021-01-14 19:48:31,106 Stage-1 map = 47%,  reduce = 0%, Cumulative CPU 329.14 sec
2021-01-14 19:48:38,246 Stage-1 map = 50%,  reduce = 0%, Cumulative CPU 336.81 sec
2021-01-14 19:48:44,370 Stage-1 map = 51%,  reduce = 0%, Cumulative CPU 352.0 sec
2021-01-14 19:48:50,502 Stage-1 map = 52%,  reduce = 0%, Cumulative CPU 358.16 sec
2021-01-14 19:48:52,552 Stage-1 map = 53%,  reduce = 0%, Cumulative CPU 360.16 sec
2021-01-14 19:48:53,575 Stage-1 map = 54%,  reduce = 0%, Cumulative CPU 375.37 sec
2021-01-14 19:48:59,710 Stage-1 map = 55%,  reduce = 0%, Cumulative CPU 381.54 sec
2021-01-14 19:49:01,748 Stage-1 map = 57%,  reduce = 0%, Cumulative CPU 384.38 sec
2021-01-14 19:49:07,885 Stage-1 map = 58%,  reduce = 0%, Cumulative CPU 399.4 sec
2021-01-14 19:49:17,082 Stage-1 map = 60%,  reduce = 0%, Cumulative CPU 408.73 sec
2021-01-14 19:49:18,104 Stage-1 map = 61%,  reduce = 0%, Cumulative CPU 423.95 sec
2021-01-14 19:49:23,212 Stage-1 map = 62%,  reduce = 0%, Cumulative CPU 430.2 sec
2021-01-14 19:49:26,275 Stage-1 map = 63%,  reduce = 0%, Cumulative CPU 432.97 sec
2021-01-14 19:49:31,380 Stage-1 map = 64%,  reduce = 0%, Cumulative CPU 448.18 sec
2021-01-14 19:49:37,508 Stage-1 map = 65%,  reduce = 0%, Cumulative CPU 454.34 sec
2021-01-14 19:49:39,545 Stage-1 map = 67%,  reduce = 0%, Cumulative CPU 456.52 sec
2021-01-14 19:49:41,590 Stage-1 map = 68%,  reduce = 0%, Cumulative CPU 471.7 sec
2021-01-14 19:49:47,722 Stage-1 map = 69%,  reduce = 0%, Cumulative CPU 477.89 sec
2021-01-14 19:49:49,767 Stage-1 map = 70%,  reduce = 0%, Cumulative CPU 480.0 sec
2021-01-14 19:49:54,880 Stage-1 map = 71%,  reduce = 0%, Cumulative CPU 495.11 sec
2021-01-14 19:50:00,997 Stage-1 map = 72%,  reduce = 0%, Cumulative CPU 501.3 sec
2021-01-14 19:50:02,020 Stage-1 map = 73%,  reduce = 0%, Cumulative CPU 503.17 sec
2021-01-14 19:50:05,084 Stage-1 map = 74%,  reduce = 0%, Cumulative CPU 518.31 sec
2021-01-14 19:50:10,196 Stage-1 map = 75%,  reduce = 0%, Cumulative CPU 524.53 sec
2021-01-14 19:50:12,241 Stage-1 map = 77%,  reduce = 0%, Cumulative CPU 526.52 sec
2021-01-14 19:50:17,349 Stage-1 map = 78%,  reduce = 0%, Cumulative CPU 541.63 sec
2021-01-14 19:50:23,493 Stage-1 map = 79%,  reduce = 0%, Cumulative CPU 547.87 sec
2021-01-14 19:50:25,531 Stage-1 map = 80%,  reduce = 0%, Cumulative CPU 549.97 sec
2021-01-14 19:50:27,571 Stage-1 map = 81%,  reduce = 0%, Cumulative CPU 565.14 sec
2021-01-14 19:50:33,695 Stage-1 map = 82%,  reduce = 0%, Cumulative CPU 571.29 sec
2021-01-14 19:50:35,732 Stage-1 map = 83%,  reduce = 0%, Cumulative CPU 573.29 sec
2021-01-14 19:50:40,828 Stage-1 map = 84%,  reduce = 0%, Cumulative CPU 588.33 sec
2021-01-14 19:50:46,963 Stage-1 map = 85%,  reduce = 0%, Cumulative CPU 594.58 sec
2021-01-14 19:50:49,007 Stage-1 map = 87%,  reduce = 0%, Cumulative CPU 596.57 sec
2021-01-14 19:50:51,050 Stage-1 map = 87%,  reduce = 1%, Cumulative CPU 597.28 sec
2021-01-14 19:51:03,318 Stage-1 map = 88%,  reduce = 1%, Cumulative CPU 612.73 sec
2021-01-14 19:51:12,513 Stage-1 map = 90%,  reduce = 1%, Cumulative CPU 621.48 sec
2021-01-14 19:51:26,818 Stage-1 map = 91%,  reduce = 1%, Cumulative CPU 636.67 sec
2021-01-14 19:51:32,952 Stage-1 map = 92%,  reduce = 1%, Cumulative CPU 643.0 sec
2021-01-14 19:51:34,994 Stage-1 map = 93%,  reduce = 1%, Cumulative CPU 644.94 sec
2021-01-14 19:51:50,317 Stage-1 map = 94%,  reduce = 1%, Cumulative CPU 660.14 sec
2021-01-14 19:51:55,433 Stage-1 map = 95%,  reduce = 1%, Cumulative CPU 666.44 sec
2021-01-14 19:51:57,482 Stage-1 map = 97%,  reduce = 1%, Cumulative CPU 668.28 sec
2021-01-14 19:52:12,821 Stage-1 map = 98%,  reduce = 1%, Cumulative CPU 683.5 sec
2021-01-14 19:52:18,953 Stage-1 map = 99%,  reduce = 1%, Cumulative CPU 689.74 sec
2021-01-14 19:52:20,999 Stage-1 map = 100%,  reduce = 2%, Cumulative CPU 691.85 sec
2021-01-14 19:52:22,021 Stage-1 map = 100%,  reduce = 3%, Cumulative CPU 693.77 sec
2021-01-14 19:52:25,090 Stage-1 map = 100%,  reduce = 6%, Cumulative CPU 696.17 sec
2021-01-14 19:52:26,114 Stage-1 map = 100%,  reduce = 10%, Cumulative CPU 698.63 sec
2021-01-14 19:52:29,177 Stage-1 map = 100%,  reduce = 13%, Cumulative CPU 701.07 sec
2021-01-14 19:52:30,200 Stage-1 map = 100%,  reduce = 16%, Cumulative CPU 703.66 sec
2021-01-14 19:52:33,257 Stage-1 map = 100%,  reduce = 19%, Cumulative CPU 706.24 sec
2021-01-14 19:52:34,281 Stage-1 map = 100%,  reduce = 23%, Cumulative CPU 708.46 sec
2021-01-14 19:52:37,345 Stage-1 map = 100%,  reduce = 26%, Cumulative CPU 710.83 sec
2021-01-14 19:52:38,367 Stage-1 map = 100%,  reduce = 29%, Cumulative CPU 713.2 sec
2021-01-14 19:52:41,428 Stage-1 map = 100%,  reduce = 32%, Cumulative CPU 715.63 sec
2021-01-14 19:52:42,453 Stage-1 map = 100%,  reduce = 35%, Cumulative CPU 717.97 sec
2021-01-14 19:52:44,497 Stage-1 map = 100%,  reduce = 39%, Cumulative CPU 720.34 sec
2021-01-14 19:52:46,538 Stage-1 map = 100%,  reduce = 42%, Cumulative CPU 722.84 sec
2021-01-14 19:52:48,576 Stage-1 map = 100%,  reduce = 45%, Cumulative CPU 725.32 sec
2021-01-14 19:52:49,600 Stage-1 map = 100%,  reduce = 48%, Cumulative CPU 727.7 sec
2021-01-14 19:52:52,664 Stage-1 map = 100%,  reduce = 52%, Cumulative CPU 730.23 sec
2021-01-14 19:52:53,686 Stage-1 map = 100%,  reduce = 55%, Cumulative CPU 732.5 sec
2021-01-14 19:52:56,769 Stage-1 map = 100%,  reduce = 58%, Cumulative CPU 734.87 sec
2021-01-14 19:52:57,791 Stage-1 map = 100%,  reduce = 61%, Cumulative CPU 737.09 sec
2021-01-14 19:53:00,854 Stage-1 map = 100%,  reduce = 65%, Cumulative CPU 739.72 sec
2021-01-14 19:53:01,877 Stage-1 map = 100%,  reduce = 68%, Cumulative CPU 742.27 sec
2021-01-14 19:53:04,939 Stage-1 map = 100%,  reduce = 71%, Cumulative CPU 744.84 sec
2021-01-14 19:53:05,958 Stage-1 map = 100%,  reduce = 74%, Cumulative CPU 747.31 sec
2021-01-14 19:53:09,013 Stage-1 map = 100%,  reduce = 77%, Cumulative CPU 749.87 sec
2021-01-14 19:53:10,030 Stage-1 map = 100%,  reduce = 81%, Cumulative CPU 752.43 sec
2021-01-14 19:53:13,095 Stage-1 map = 100%,  reduce = 84%, Cumulative CPU 755.09 sec
2021-01-14 19:53:14,115 Stage-1 map = 100%,  reduce = 87%, Cumulative CPU 757.62 sec
2021-01-14 19:53:17,176 Stage-1 map = 100%,  reduce = 90%, Cumulative CPU 760.0 sec
2021-01-14 19:53:18,192 Stage-1 map = 100%,  reduce = 94%, Cumulative CPU 762.38 sec
2021-01-14 19:53:21,260 Stage-1 map = 100%,  reduce = 97%, Cumulative CPU 764.78 sec
2021-01-14 19:53:22,282 Stage-1 map = 100%,  reduce = 100%, Cumulative CPU 767.18 sec
MapReduce Total cumulative CPU time: 12 minutes 47 seconds 180 msec
Ended Job = job_1610015767041_0034
Loading data to table default.merge_test3
MapReduce Jobs Launched:
Stage-Stage-1: Map: 30  Reduce: 31   Cumulative CPU: 767.18 sec   HDFS Read: 31436674813 HDFS Write: 1765 HDFS EC Read: 0 SUCCESS
Total MapReduce CPU Time Spent: 12 minutes 47 seconds 180 msec
OK
Time taken: 465.782 seconds
hive>

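5. Controlling the Number of Map/Reduce Tasks

When developing with Hive, you will often run into jobs that launch too many map or reduce tasks, which makes the job run too long and perform poorly.
In that case, you can optimize the Hive job by controlling the number of map and reduce tasks.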

5.1 控制Hive job中的map数

通常情况下,作业会通过input的目录产生一个或者多个map任务。
主要的决定因素有: input的文件总个数,input的文件大小,集群设置的文件块大小(目前为128M, 可在hive中通过set dfs.block.size;命令查看到,该参数不能自定义修改);

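Examples:

  1. Suppose the input directory contains one file a of 780 MB. Hadoop splits a into 7 blocks (six 128 MB blocks and one 12 MB block), which produces 7 map tasks.
  2. Suppose the input directory contains three files a, b and c of 10 MB, 20 MB and 130 MB. Hadoop splits them into 4 blocks (10 MB, 20 MB, 128 MB and 2 MB), which produces 4 map tasks.

In other words, a file larger than the block size (128 MB) is split, and a file smaller than the block size is treated as a single block.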
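
The rule above can be checked with a few lines of Python. This is a minimal sketch of the simplified model described in the text (one map per block, no small-file merging); the function names blocks_for_file and map_count are made up for illustration, and real split computation in Hadoop and Hive may differ from this model.

```python
def blocks_for_file(size_mb, block_mb=128):
    """Return the block sizes (in MB) one file is cut into, using the simplified rule from the text."""
    full_blocks, remainder = divmod(size_mb, block_mb)
    return [block_mb] * full_blocks + ([remainder] if remainder else [])


def map_count(file_sizes_mb, block_mb=128):
    """One map task per block, summed over all input files (simplified model)."""
    return sum(len(blocks_for_file(size, block_mb)) for size in file_sizes_mb)


if __name__ == "__main__":
    print(blocks_for_file(780))       # [128, 128, 128, 128, 128, 128, 12]
    print(map_count([780]))           # 7
    print(map_count([10, 20, 130]))   # 4
```

Both results match the two examples: 7 maps for the single 780 MB file and 4 maps for the three smaller files.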

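Is a larger number of maps always better?
No. If a job has many small files (far smaller than the 128 MB block size), each small file is treated as a block and handled by its own map task. When starting and initializing a map task takes far longer than the actual processing, a lot of resources are wasted. The number of maps that can run at the same time is also limited.

Is it enough to make sure each map processes a block of close to 128 MB?
Not necessarily. Take a 127 MB file that would normally be handled by one map: if it has only one or two small columns but tens of millions of rows, and the map-side logic is complex, a single map task will still take a long time.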

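5.1.1 Merging small files to reduce the number of maps

Suppose we have this SQL job:
select count(*) from ods_fact_sale;

The job's input directory contains 117 files, many of them far smaller than 128 MB, with a total size of 31 GB. A normal run uses 117 map tasks.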

I used the following settings to merge small files before the map phase and reduce the number of maps:
set hive.input.format=org.apache.hadoop.hive.ql.io.CombineHiveInputFormat;  -- merge small files before the map phase
set mapred.max.split.size=1024000000;  -- maximum input size of each map
set mapred.min.split.size.per.node=100000000;  -- minimum split size on a single node
set mapred.min.split.size.per.rack=100000000;  -- minimum split size under a single rack (switch)

Running the statement again used 30 map tasks.
For a simple SQL job like this one the elapsed time may be about the same, but more than half of the compute resources are saved.
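
As a rough cross-check of the map count after merging, the total input size can be divided by the maximum split size. This is only a back-of-the-envelope sketch: estimate_combined_maps is a hypothetical helper, the input size is taken from the "HDFS Read" figure in the job log above, and the real split planner also groups blocks by node and rack, so the actual count can differ.

```python
def estimate_combined_maps(total_input_bytes, max_split_bytes):
    """Rough map count when small files are packed into splits of about max_split_bytes."""
    # Integer ceiling division; at least one map is always needed.
    return max(1, (total_input_bytes + max_split_bytes - 1) // max_split_bytes)


if __name__ == "__main__":
    total = 31_436_674_813  # "HDFS Read" value from the job log, assumed to approximate the input size
    print(estimate_combined_maps(total, 1_024_000_000))  # 31
    print(estimate_combined_maps(total, 256_000_000))    # 123
```

The estimate of 31 is close to the 30 maps the job actually used with mapred.max.split.size=1024000000.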

5.1.2 Increasing the number of maps where appropriate

When a single map has to process many rows with complex logic, the data can be redistributed into more files so that more maps share the work.

Code:

-- Build a test table holding one day of data
create table ods_factsale_tmp1 as
select * from ods_fact_sale where sale_date = '2011-08-16 00:00:00.0';

-- Baseline: aggregate over the original table
select prod_name,
       count(*),
       count(distinct id),
       min(sale_date),
       max(sale_date),
       sum(case when id > 100000 then 1 else 0 end),
       sum(case when prod_name = 'PROD4' then 1 else 0 end),
       sum(case when prod_name = 'PROD5' then 1 else 0 end),
       sum(case when prod_name = 'PROD6' then 1 else 0 end),
       sum(case when prod_name = 'PROD7' then 1 else 0 end),
       sum(case when prod_name = 'PROD8' then 1 else 0 end),
       sum(case when prod_name = 'PROD9' then 1 else 0 end),
       sum(case when prod_name = 'PROD10' then 1 else 0 end)
  from ods_factsale_tmp1
 group by prod_name;

-- Rewrite the data through 10 reducers so that it lands in 10 files
set mapred.reduce.tasks=10;
create table ods_factsale_tmp2 as
select * from ods_factsale_tmp1 distribute by rand(123);

set mapred.reduce.tasks=-1;  -- restore the default value

-- Same aggregation over the redistributed table
select prod_name,
       count(*),
       count(distinct id),
       min(sale_date),
       max(sale_date),
       sum(case when id > 100000 then 1 else 0 end),
       sum(case when prod_name = 'PROD4' then 1 else 0 end),
       sum(case when prod_name = 'PROD5' then 1 else 0 end),
       sum(case when prod_name = 'PROD6' then 1 else 0 end),
       sum(case when prod_name = 'PROD7' then 1 else 0 end),
       sum(case when prod_name = 'PROD8' then 1 else 0 end),
       sum(case when prod_name = 'PROD9' then 1 else 0 end),
       sum(case when prod_name = 'PROD10' then 1 else 0 end)
  from ods_factsale_tmp2
 group by prod_name;
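
The effect of distribute by rand(123) with 10 reducers can be pictured with a small simulation. This is an illustrative sketch only: distribute_rows is a hypothetical helper, Python's random module stands in for Hive's rand(), and the row count 854898 is the sum of the per-product counts in the test record.

```python
import random
from collections import Counter


def distribute_rows(n_rows, n_reducers, seed=123):
    """Assign each row to a reducer using a random key and count the rows per reducer."""
    rng = random.Random(seed)
    counts = Counter(rng.randrange(n_reducers) for _ in range(n_rows))
    return [counts[r] for r in range(n_reducers)]


if __name__ == "__main__":
    rows_per_file = distribute_rows(n_rows=854_898, n_reducers=10)
    print(rows_per_file)       # ten counts of roughly equal size
    print(sum(rows_per_file))  # 854898, no rows are lost
```

Each reducer writes one output file, so the table ends up as 10 files of similar size, which lets the following query be read by more than one map.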

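Test record:
After redistributing the table's data into 10 files, the number of mappers went from 1 to 2 and the run time dropped from about 34 seconds to about 28 seconds.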

hive> select prod_name,
    >        count(*),
    >        count(distinct id),
    >        min(sale_date),
    >        max(sale_date),
    >        sum(case when id > 100000 then 1 else 0 end),
    >        sum(case when prod_name = 'PROD4' then 1 else 0 end),
    >        sum(case when prod_name = 'PROD5' then 1 else 0 end),
    >        sum(case when prod_name = 'PROD6' then 1 else 0 end),
    >        sum(case when prod_name = 'PROD7' then 1 else 0 end),
    >        sum(case when prod_name = 'PROD8' then 1 else 0 end),
    >        sum(case when prod_name = 'PROD9' then 1 else 0 end),
    >        sum(case when prod_name = 'PROD10' then 1 else 0 end)
    >   from ods_factsale_tmp1
    >  group by prod_name;
Query ID = root_20210115160619_cc2f311c-e387-4030-bf62-8934aa00f7e4
Total jobs = 1
Launching Job 1 out of 1
Number of reduce tasks not specified. Estimated from input data size: 1
In order to change the average load for a reducer (in bytes):
  set hive.exec.reducers.bytes.per.reducer=<number>
In order to limit the maximum number of reducers:
  set hive.exec.reducers.max=<number>
In order to set a constant number of reducers:
  set mapreduce.job.reduces=<number>
Starting Job = job_1610015767041_0094, Tracking URL = http://hp1:8088/proxy/application_1610015767041_0094/
Kill Command = /opt/cloudera/parcels/CDH-6.3.1-1.cdh6.3.1.p0.1470567/lib/hadoop/bin/hadoop job  -kill job_1610015767041_0094
Hadoop job information for Stage-1: number of mappers: 1; number of reducers: 1
2021-01-15 16:06:29,326 Stage-1 map = 0%,  reduce = 0%
2021-01-15 16:06:43,798 Stage-1 map = 100%,  reduce = 0%, Cumulative CPU 13.9 sec
2021-01-15 16:06:53,086 Stage-1 map = 100%,  reduce = 100%, Cumulative CPU 22.71 sec
MapReduce Total cumulative CPU time: 22 seconds 710 msec
Ended Job = job_1610015767041_0094
MapReduce Jobs Launched:
Stage-Stage-1: Map: 1  Reduce: 1   Cumulative CPU: 22.71 sec   HDFS Read: 34999959 HDFS Write: 962 HDFS EC Read: 0 SUCCESS
Total MapReduce CPU Time Spent: 22 seconds 710 msec
OK
PROD10  94470   94470   2011-08-16 00:00:00.0   2011-08-16 00:00:00.0   94456   0       0       0       0       0       0       94470
PROD2   94495   94495   2011-08-16 00:00:00.0   2011-08-16 00:00:00.0   94479   0       0       0       0       0       0       0
PROD3   96743   96743   2011-08-16 00:00:00.0   2011-08-16 00:00:00.0   96730   0       0       0       0       0       0       0
PROD4   94378   94378   2011-08-16 00:00:00.0   2011-08-16 00:00:00.0   94366   94378   0       0       0       0       0       0
PROD5   96994   96994   2011-08-16 00:00:00.0   2011-08-16 00:00:00.0   96980   0       96994   0       0       0       0       0
PROD6   91746   91746   2011-08-16 00:00:00.0   2011-08-16 00:00:00.0   91735   0       0       91746   0       0       0       0
PROD7   95815   95815   2011-08-16 00:00:00.0   2011-08-16 00:00:00.0   95804   0       0       0       95815   0       0       0
PROD8   95109   95109   2011-08-16 00:00:00.0   2011-08-16 00:00:00.0   95087   0       0       0       0       95109   0       0
PROD9   95148   95148   2011-08-16 00:00:00.0   2011-08-16 00:00:00.0   95138   0       0       0       0       0       95148   0
Time taken: 34.765 seconds, Fetched: 9 row(s)
hive> set mapred.reduce.tasks=10;
hive> create table ods_factsale_tmp2 as select * from ods_factsale_tmp1 distribute by rand(123);
Query ID = root_20210115160707_f02b478a-2754-43e0-83fa-59378a8c56f9
Total jobs = 1
Launching Job 1 out of 1
Number of reduce tasks not specified. Defaulting to jobconf value of: 10
In order to change the average load for a reducer (in bytes):
  set hive.exec.reducers.bytes.per.reducer=<number>
In order to limit the maximum number of reducers:
  set hive.exec.reducers.max=<number>
In order to set a constant number of reducers:
  set mapreduce.job.reduces=<number>
Starting Job = job_1610015767041_0095, Tracking URL = http://hp1:8088/proxy/application_1610015767041_0095/
Kill Command = /opt/cloudera/parcels/CDH-6.3.1-1.cdh6.3.1.p0.1470567/lib/hadoop/bin/hadoop job  -kill job_1610015767041_0095
Hadoop job information for Stage-1: number of mappers: 1; number of reducers: 10
2021-01-15 16:07:15,539 Stage-1 map = 0%,  reduce = 0%
2021-01-15 16:07:23,798 Stage-1 map = 100%,  reduce = 0%, Cumulative CPU 6.66 sec
2021-01-15 16:07:29,972 Stage-1 map = 100%,  reduce = 10%, Cumulative CPU 10.39 sec
2021-01-15 16:07:30,998 Stage-1 map = 100%,  reduce = 20%, Cumulative CPU 14.03 sec
2021-01-15 16:07:33,053 Stage-1 map = 100%,  reduce = 30%, Cumulative CPU 17.55 sec
2021-01-15 16:07:35,107 Stage-1 map = 100%,  reduce = 40%, Cumulative CPU 21.14 sec
2021-01-15 16:07:38,190 Stage-1 map = 100%,  reduce = 50%, Cumulative CPU 25.41 sec
2021-01-15 16:07:40,248 Stage-1 map = 100%,  reduce = 60%, Cumulative CPU 29.55 sec
2021-01-15 16:07:43,324 Stage-1 map = 100%,  reduce = 70%, Cumulative CPU 34.19 sec
2021-01-15 16:07:44,349 Stage-1 map = 100%,  reduce = 80%, Cumulative CPU 37.74 sec
2021-01-15 16:07:47,436 Stage-1 map = 100%,  reduce = 90%, Cumulative CPU 42.06 sec
2021-01-15 16:07:48,464 Stage-1 map = 100%,  reduce = 100%, Cumulative CPU 45.61 sec
MapReduce Total cumulative CPU time: 45 seconds 610 msec
Ended Job = job_1610015767041_0095
Moving data to directory hdfs://nameservice1/user/hive/warehouse/test.db/ods_factsale_tmp2
MapReduce Jobs Launched:
Stage-Stage-1: Map: 1  Reduce: 10   Cumulative CPU: 45.61 sec   HDFS Read: 35027561 HDFS Write: 34984665 HDFS EC Read: 0 SUCCESS
Total MapReduce CPU Time Spent: 45 seconds 610 msec
OK
Time taken: 42.892 seconds
hive> set mapred.reduce.tasks=-1;
hive> select prod_name,
    >        count(*),
    >        count(distinct id),
    >        min(sale_date),
    >        max(sale_date),
    >        sum(case when id > 100000 then 1 else 0 end),
    >        sum(case when prod_name = 'PROD4' then 1 else 0 end),
    >        sum(case when prod_name = 'PROD5' then 1 else 0 end),
    >        sum(case when prod_name = 'PROD6' then 1 else 0 end),
    >        sum(case when prod_name = 'PROD7' then 1 else 0 end),
    >        sum(case when prod_name = 'PROD8' then 1 else 0 end),
    >        sum(case when prod_name = 'PROD9' then 1 else 0 end),
    >        sum(case when prod_name = 'PROD10' then 1 else 0 end)
    >   from ods_factsale_tmp2
    >  group by prod_name;
Query ID = root_20210115160934_909dd909-4761-4fe2-bb76-ba7b15b55b76
Total jobs = 1
Launching Job 1 out of 1
Number of reduce tasks not specified. Estimated from input data size: 1
In order to change the average load for a reducer (in bytes):
  set hive.exec.reducers.bytes.per.reducer=<number>
In order to limit the maximum number of reducers:
  set hive.exec.reducers.max=<number>
In order to set a constant number of reducers:
  set mapreduce.job.reduces=<number>
Starting Job = job_1610015767041_0097, Tracking URL = http://hp1:8088/proxy/application_1610015767041_0097/
Kill Command = /opt/cloudera/parcels/CDH-6.3.1-1.cdh6.3.1.p0.1470567/lib/hadoop/bin/hadoop job  -kill job_1610015767041_0097
Hadoop job information for Stage-1: number of mappers: 2; number of reducers: 1
2021-01-15 16:09:41,196 Stage-1 map = 0%,  reduce = 0%
2021-01-15 16:09:49,419 Stage-1 map = 50%,  reduce = 0%, Cumulative CPU 7.88 sec
2021-01-15 16:09:53,541 Stage-1 map = 100%,  reduce = 0%, Cumulative CPU 21.59 sec
2021-01-15 16:10:01,760 Stage-1 map = 100%,  reduce = 100%, Cumulative CPU 29.81 sec
MapReduce Total cumulative CPU time: 29 seconds 810 msec
Ended Job = job_1610015767041_0097
MapReduce Jobs Launched:
Stage-Stage-1: Map: 2  Reduce: 1   Cumulative CPU: 29.81 sec   HDFS Read: 35008642 HDFS Write: 962 HDFS EC Read: 0 SUCCESS
Total MapReduce CPU Time Spent: 29 seconds 810 msec
OK
PROD10  94470   94470   2011-08-16 00:00:00.0   2011-08-16 00:00:00.0   94456   0       0       0       0       0       0       94470
PROD2   94495   94495   2011-08-16 00:00:00.0   2011-08-16 00:00:00.0   94479   0       0       0       0       0       0       0
PROD3   96743   96743   2011-08-16 00:00:00.0   2011-08-16 00:00:00.0   96730   0       0       0       0       0       0       0
PROD4   94378   94378   2011-08-16 00:00:00.0   2011-08-16 00:00:00.0   94366   94378   0       0       0       0       0       0
PROD5   96994   96994   2011-08-16 00:00:00.0   2011-08-16 00:00:00.0   96980   0       96994   0       0       0       0       0
PROD6   91746   91746   2011-08-16 00:00:00.0   2011-08-16 00:00:00.0   91735   0       0       91746   0       0       0       0
PROD7   95815   95815   2011-08-16 00:00:00.0   2011-08-16 00:00:00.0   95804   0       0       0       95815   0       0       0
PROD8   95109   95109   2011-08-16 00:00:00.0   2011-08-16 00:00:00.0   95087   0       0       0       0       95109   0       0
PROD9   95148   95148   2011-08-16 00:00:00.0   2011-08-16 00:00:00.0   95138   0       0       0       0       0       95148   0
Time taken: 28.758 seconds, Fetched: 9 row(s)
hive>

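5.2 Controlling the number of reducers in a Hive job

How Hive determines the number of reducers:
The number of reducers has a large impact on job efficiency. When it is not specified, Hive estimates it from the following two settings:
hive.exec.reducers.bytes.per.reducer (amount of data handled by each reduce task; the default was 1000^3 = 1 GB before Hive 0.14 and is 256 MB from Hive 0.14 on)
hive.exec.reducers.max (maximum number of reducers per job; the default was 999 before Hive 0.14 and is 1009 from Hive 0.14 on)
Distributions may ship their own defaults, so check the effective values with set hive.exec.reducers.bytes.per.reducer; and set hive.exec.reducers.max;.
The formula is simple: N = min(parameter 2, total input size / parameter 1)
That is, if the total input size does not exceed the hive.exec.reducers.bytes.per.reducer value, there is only one reduce task.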

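How to adjust the number of reducers:

  1. Change the value of hive.exec.reducers.bytes.per.reducer:
    set hive.exec.reducers.bytes.per.reducer=500000000; (500 MB)
  2. Set the number of reducers directly:
    set mapred.reduce.tasks = 15; (the job log refers to the same setting as mapreduce.job.reduces)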
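
The estimate and the effect of the first adjustment can be sketched as follows. This is a minimal illustration of the formula N = min(parameter 2, total input size / parameter 1) with the division rounded up; estimate_reducers is a hypothetical helper, and the parameter values are passed in explicitly because the effective defaults depend on the Hive version and distribution.

```python
def estimate_reducers(total_input_bytes, bytes_per_reducer, max_reducers):
    """N = min(max_reducers, ceil(total_input_bytes / bytes_per_reducer)), never below 1."""
    wanted = (total_input_bytes + bytes_per_reducer - 1) // bytes_per_reducer
    return max(1, min(max_reducers, wanted))


if __name__ == "__main__":
    ten_gb = 10_000_000_000
    print(estimate_reducers(ten_gb, 1_000_000_000, 999))                # 10
    print(estimate_reducers(ten_gb, 500_000_000, 999))                  # 20
    print(estimate_reducers(800_000_000, 1_000_000_000, 999))           # 1
    print(estimate_reducers(5_000_000_000_000, 1_000_000_000, 999))     # 999
```

Halving hive.exec.reducers.bytes.per.reducer doubles the estimate, while hive.exec.reducers.max caps it no matter how large the input is.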

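Is a larger number of reducers always better?
As with maps, starting and initializing reducers costs time and resources.
In addition, there are as many output files as there are reducers. If many small files are produced and then used as the input of the next job, that job runs into the small-file problem as well.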

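When is there only one reducer?
You will often find that a job runs with a single reduce task no matter how large the data is and no matter whether you have set the parameters that adjust the number of reducers.
Apart from the case where the data volume is smaller than the hive.exec.reducers.bytes.per.reducer value, a single reduce task has the following causes:

  1. An aggregation without group by, for example writing select count(*) from ods_fact_sale_orc where sale_date = '2011-08-16 00:00:00.0'; instead of select sale_date,count(*) from ods_fact_sale_orc where sale_date = '2011-08-16 00:00:00.0' group by sale_date;
    This is very common; rewrite such queries where possible.
  2. Order by is used.
  3. There is a Cartesian product.

In these cases, apart from finding workarounds to avoid them, I have no good solution for now: these operations are global, so Hadoop has to complete them with a single reducer.

Likewise, keep these two principles in mind when setting the number of reducers: use a suitable number of reducers for a large data volume, and let each reduce task process a suitable amount of data.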
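
Adding reducers only helps when there are enough distinct grouping keys to spread across them. The sketch below illustrates this with a deterministic hash; reducer_for and busy_reducers are hypothetical helpers, and zlib.crc32 is only a stand-in for the hash Hive actually uses to route keys to reducers.

```python
import zlib


def reducer_for(key, n_reducers):
    """Pick a reducer for a grouping key with a deterministic hash (stand-in for Hive's partitioner)."""
    return zlib.crc32(key.encode("utf-8")) % n_reducers


def busy_reducers(keys, n_reducers):
    """Return the set of reducers that receive at least one grouping key."""
    return {reducer_for(key, n_reducers) for key in keys}


if __name__ == "__main__":
    one_day = ["2011-08-16 00:00:00.0"]
    print(len(busy_reducers(one_day, 33)))    # 1: the other 32 reducers receive no rows

    many_days = [f"2011-08-{day:02d} 00:00:00.0" for day in range(1, 32)]
    print(len(busy_reducers(many_days, 33)))  # with many keys, several reducers share the work
```

With a filter that leaves a single sale_date value, every row of the group goes to the same reducer, so the remaining reducers only add startup cost.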

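Test record:
Dropping the group by took the number of reducers from 33 to 1 and the run time from about 121 seconds to about 54 seconds. The filter keeps a single sale_date value, so there is only one group and the extra reducers have little or nothing to do; in this test the single-reducer form is the faster one.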

hive> select sale_date,count(*) from ods_fact_sale_orc where sale_date = '2011-08-16 00:00:00.0' group by sale_date;
Query ID = root_20210115171744_3ddbe697-6889-4f8b-9126-4fc59cff6c26
Total jobs = 1
Launching Job 1 out of 1
Number of reduce tasks not specified. Estimated from input data size: 33
In order to change the average load for a reducer (in bytes):
  set hive.exec.reducers.bytes.per.reducer=<number>
In order to limit the maximum number of reducers:
  set hive.exec.reducers.max=<number>
In order to set a constant number of reducers:
  set mapreduce.job.reduces=<number>
Starting Job = job_1610015767041_0100, Tracking URL = http://hp1:8088/proxy/application_1610015767041_0100/
Kill Command = /opt/cloudera/parcels/CDH-6.3.1-1.cdh6.3.1.p0.1470567/lib/hadoop/bin/hadoop job  -kill job_1610015767041_0100
Hadoop job information for Stage-1: number of mappers: 9; number of reducers: 33
2021-01-15 17:17:52,451 Stage-1 map = 0%,  reduce = 0%
2021-01-15 17:18:03,759 Stage-1 map = 22%,  reduce = 0%, Cumulative CPU 18.51 sec
2021-01-15 17:18:10,940 Stage-1 map = 33%,  reduce = 0%, Cumulative CPU 24.4 sec
2021-01-15 17:18:14,017 Stage-1 map = 44%,  reduce = 0%, Cumulative CPU 33.6 sec
2021-01-15 17:18:20,167 Stage-1 map = 56%,  reduce = 0%, Cumulative CPU 43.26 sec
2021-01-15 17:18:23,241 Stage-1 map = 67%,  reduce = 0%, Cumulative CPU 52.4 sec
2021-01-15 17:18:29,388 Stage-1 map = 78%,  reduce = 0%, Cumulative CPU 61.63 sec
2021-01-15 17:18:31,432 Stage-1 map = 89%,  reduce = 0%, Cumulative CPU 70.53 sec
2021-01-15 17:18:38,601 Stage-1 map = 100%,  reduce = 0%, Cumulative CPU 79.59 sec
2021-01-15 17:18:40,648 Stage-1 map = 100%,  reduce = 3%, Cumulative CPU 82.66 sec
2021-01-15 17:18:42,691 Stage-1 map = 100%,  reduce = 6%, Cumulative CPU 85.6 sec
2021-01-15 17:18:44,740 Stage-1 map = 100%,  reduce = 9%, Cumulative CPU 88.2 sec
2021-01-15 17:18:46,793 Stage-1 map = 100%,  reduce = 12%, Cumulative CPU 91.17 sec
2021-01-15 17:18:48,841 Stage-1 map = 100%,  reduce = 15%, Cumulative CPU 94.05 sec
2021-01-15 17:18:50,885 Stage-1 map = 100%,  reduce = 18%, Cumulative CPU 97.02 sec
2021-01-15 17:18:52,931 Stage-1 map = 100%,  reduce = 21%, Cumulative CPU 100.09 sec
2021-01-15 17:18:54,975 Stage-1 map = 100%,  reduce = 24%, Cumulative CPU 103.16 sec
2021-01-15 17:18:57,018 Stage-1 map = 100%,  reduce = 27%, Cumulative CPU 106.04 sec
2021-01-15 17:18:59,068 Stage-1 map = 100%,  reduce = 30%, Cumulative CPU 108.96 sec
2021-01-15 17:19:01,117 Stage-1 map = 100%,  reduce = 33%, Cumulative CPU 111.83 sec
2021-01-15 17:19:03,162 Stage-1 map = 100%,  reduce = 36%, Cumulative CPU 114.92 sec
2021-01-15 17:19:04,183 Stage-1 map = 100%,  reduce = 39%, Cumulative CPU 117.84 sec
2021-01-15 17:19:06,230 Stage-1 map = 100%,  reduce = 42%, Cumulative CPU 120.77 sec
2021-01-15 17:19:08,293 Stage-1 map = 100%,  reduce = 45%, Cumulative CPU 123.67 sec
2021-01-15 17:19:10,335 Stage-1 map = 100%,  reduce = 48%, Cumulative CPU 126.6 sec
2021-01-15 17:19:12,383 Stage-1 map = 100%,  reduce = 52%, Cumulative CPU 129.5 sec
2021-01-15 17:19:14,429 Stage-1 map = 100%,  reduce = 55%, Cumulative CPU 132.47 sec
2021-01-15 17:19:16,481 Stage-1 map = 100%,  reduce = 58%, Cumulative CPU 135.11 sec
2021-01-15 17:19:18,520 Stage-1 map = 100%,  reduce = 61%, Cumulative CPU 138.08 sec
2021-01-15 17:19:20,571 Stage-1 map = 100%,  reduce = 64%, Cumulative CPU 140.95 sec
2021-01-15 17:19:22,616 Stage-1 map = 100%,  reduce = 67%, Cumulative CPU 143.83 sec
2021-01-15 17:19:24,664 Stage-1 map = 100%,  reduce = 70%, Cumulative CPU 146.8 sec
2021-01-15 17:19:26,713 Stage-1 map = 100%,  reduce = 73%, Cumulative CPU 149.69 sec
2021-01-15 17:19:28,761 Stage-1 map = 100%,  reduce = 76%, Cumulative CPU 152.46 sec
2021-01-15 17:19:30,802 Stage-1 map = 100%,  reduce = 79%, Cumulative CPU 155.36 sec
2021-01-15 17:19:32,846 Stage-1 map = 100%,  reduce = 82%, Cumulative CPU 158.32 sec
2021-01-15 17:19:34,894 Stage-1 map = 100%,  reduce = 85%, Cumulative CPU 161.2 sec
2021-01-15 17:19:36,937 Stage-1 map = 100%,  reduce = 88%, Cumulative CPU 164.13 sec
2021-01-15 17:19:38,977 Stage-1 map = 100%,  reduce = 91%, Cumulative CPU 166.99 sec
2021-01-15 17:19:41,024 Stage-1 map = 100%,  reduce = 94%, Cumulative CPU 169.92 sec
2021-01-15 17:19:43,069 Stage-1 map = 100%,  reduce = 97%, Cumulative CPU 172.8 sec
2021-01-15 17:19:45,116 Stage-1 map = 100%,  reduce = 100%, Cumulative CPU 175.63 sec
MapReduce Total cumulative CPU time: 2 minutes 55 seconds 630 msec
Ended Job = job_1610015767041_0100
MapReduce Jobs Launched:
Stage-Stage-1: Map: 9  Reduce: 33   Cumulative CPU: 175.63 sec   HDFS Read: 1170223064 HDFS Write: 2912 HDFS EC Read: 0 SUCCESS
Total MapReduce CPU Time Spent: 2 minutes 55 seconds 630 msec
OK
2011-08-16 00:00:00.0   854898
Time taken: 121.334 seconds, Fetched: 1 row(s)
hive> select count(*) from ods_fact_sale_orc where sale_date = '2011-08-16 00:00:00.0';
Query ID = root_20210115172041_b4730546-abca-4c9c-8032-bb70888fb011
Total jobs = 1
Launching Job 1 out of 1
Number of reduce tasks determined at compile time: 1
In order to change the average load for a reducer (in bytes):
  set hive.exec.reducers.bytes.per.reducer=<number>
In order to limit the maximum number of reducers:
  set hive.exec.reducers.max=<number>
In order to set a constant number of reducers:
  set mapreduce.job.reduces=<number>
Starting Job = job_1610015767041_0101, Tracking URL = http://hp1:8088/proxy/application_1610015767041_0101/
Kill Command = /opt/cloudera/parcels/CDH-6.3.1-1.cdh6.3.1.p0.1470567/lib/hadoop/bin/hadoop job  -kill job_1610015767041_0101
Hadoop job information for Stage-1: number of mappers: 9; number of reducers: 1
2021-01-15 17:20:48,562 Stage-1 map = 0%,  reduce = 0%
2021-01-15 17:20:59,860 Stage-1 map = 11%,  reduce = 0%, Cumulative CPU 8.93 sec
2021-01-15 17:21:00,887 Stage-1 map = 22%,  reduce = 0%, Cumulative CPU 17.94 sec
2021-01-15 17:21:06,027 Stage-1 map = 33%,  reduce = 0%, Cumulative CPU 23.75 sec
2021-01-15 17:21:08,081 Stage-1 map = 44%,  reduce = 0%, Cumulative CPU 32.82 sec
2021-01-15 17:21:15,257 Stage-1 map = 56%,  reduce = 0%, Cumulative CPU 42.24 sec
2021-01-15 17:21:17,297 Stage-1 map = 67%,  reduce = 0%, Cumulative CPU 51.22 sec
2021-01-15 17:21:24,447 Stage-1 map = 78%,  reduce = 0%, Cumulative CPU 60.42 sec
2021-01-15 17:21:26,497 Stage-1 map = 89%,  reduce = 0%, Cumulative CPU 69.58 sec
2021-01-15 17:21:33,661 Stage-1 map = 100%,  reduce = 0%, Cumulative CPU 78.48 sec
2021-01-15 17:21:34,685 Stage-1 map = 100%,  reduce = 100%, Cumulative CPU 81.46 sec
MapReduce Total cumulative CPU time: 1 minutes 21 seconds 460 msec
Ended Job = job_1610015767041_0101
MapReduce Jobs Launched:
Stage-Stage-1: Map: 9  Reduce: 1   Cumulative CPU: 81.46 sec   HDFS Read: 1170021233 HDFS Write: 106 HDFS EC Read: 0 SUCCESS
Total MapReduce CPU Time Spent: 1 minutes 21 seconds 460 msec
OK
854898
Time taken: 54.258 seconds, Fetched: 1 row(s)
hive>

References

  1. http://lxw1234.com/archives/2015/04/92.htm
  2. https://blog.csdn.net/panfelix/article/details/107583723
  3. http://lxw1234.com/archives/2015/04/15.htm
