Table creation statement

create table test_over
(
  user_id    string,
  login_date string
) COMMENT 'For testing functions; can be deleted'
row format delimited
fields terminated by '\t';

over execution plan

spark

spark-sql> explain select
         >   user_id
         >   ,login_date
         >   ,lag(login_date,1,'0001-01-01') over(partition by user_id order by login_date) prev_date
         > from test_over;
22/03/10 10:55:50 INFO [main] CodeGenerator: Code generated in 9.641436 ms
== Physical Plan ==
Window [lag(login_date#34, 1, 0001-01-01) windowspecdefinition(user_id#33, login_date#34 ASC NULLS FIRST, specifiedwindowframe(RowFrame, -1, -1)) AS prev_date#30], [user_id#33], [login_date#34 ASC NULLS FIRST]
+- *(1) Sort [user_id#33 ASC NULLS FIRST, login_date#34 ASC NULLS FIRST], false, 0
   +- Exchange hashpartitioning(user_id#33, 200)
      +- Scan hive default.test_over [user_id#33, login_date#34], HiveTableRelation `default`.`test_over`, org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe, [user_id#33, login_date#34]
Time taken: 0.098 seconds, Fetched 1 row(s)
22/03/10 10:55:50 INFO [main] SparkSQLCLIDriver: Time taken: 0.098 seconds, Fetched 1 row(s)

spark-sql> explain
         > select
         >   user_id
         >   ,login_date
         >   ,first_value(login_date) over(partition by user_id ) prev_date
         > from test_over;
== Physical Plan ==
Window [first(login_date#39, false) windowspecdefinition(user_id#38, specifiedwindowframe(RowFrame, unboundedpreceding$(), unboundedfollowing$())) AS prev_date#35], [user_id#38]
+- *(1) Sort [user_id#38 ASC NULLS FIRST], false, 0
   +- Exchange hashpartitioning(user_id#38, 200)
      +- Scan hive default.test_over [user_id#38, login_date#39], HiveTableRelation `default`.`test_over`, org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe, [user_id#38, login_date#39]
Time taken: 0.077 seconds, Fetched 1 row(s)
22/03/10 10:57:34 INFO [main] SparkSQLCLIDriver: Time taken: 0.077 seconds, Fetched 1 row(s)

spark-sql> explain select
         >   user_id
         >   ,login_date
         >   ,max(login_date) over(partition by user_id ) prev_date
         > from test_over;
== Physical Plan ==
Window [max(login_date#45) windowspecdefinition(user_id#44, specifiedwindowframe(RowFrame, unboundedpreceding$(), unboundedfollowing$())) AS prev_date#41], [user_id#44]
+- *(1) Sort [user_id#44 ASC NULLS FIRST], false, 0
   +- Exchange hashpartitioning(user_id#44, 200)
      +- Scan hive default.test_over [user_id#44, login_date#45], HiveTableRelation `default`.`test_over`, org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe, [user_id#44, login_date#45]
Time taken: 0.081 seconds, Fetched 1 row(s)
22/03/10 10:58:15 INFO [main] SparkSQLCLIDriver: Time taken: 0.081 seconds, Fetched 1 row(s)

hive

hive> explain select
    >   user_id
    >   ,login_date
    >   ,lag(login_date,1,'0001-01-01') over(partition by user_id order by login_date) prev_date
    > from test_over;
OK
STAGE DEPENDENCIES:
  Stage-1 is a root stage
  Stage-0 depends on stages: Stage-1

STAGE PLANS:
  Stage: Stage-1
    Map Reduce
      Map Operator Tree:
          TableScan
            alias: test_over
            Statistics: Num rows: 1 Data size: 0 Basic stats: PARTIAL Column stats: NONE
            Reduce Output Operator
              key expressions: user_id (type: string), login_date (type: string)
              sort order: ++
              Map-reduce partition columns: user_id (type: string)
              Statistics: Num rows: 1 Data size: 0 Basic stats: PARTIAL Column stats: NONE
      Execution mode: vectorized
      Reduce Operator Tree:
        Select Operator
          expressions: KEY.reducesinkkey0 (type: string), KEY.reducesinkkey1 (type: string)
          outputColumnNames: _col0, _col1
          Statistics: Num rows: 1 Data size: 0 Basic stats: PARTIAL Column stats: NONE
          PTF Operator
            Function definitions:
                Input definition
                  input alias: ptf_0
                  output shape: _col0: string, _col1: string
                  type: WINDOWING
                Windowing table definition
                  input alias: ptf_1
                  name: windowingtablefunction
                  order by: _col1 ASC NULLS FIRST
                  partition by: _col0
                  raw input shape:
                  window functions:
                      window function definition
                        alias: lag_window_0
                        arguments: _col1, 1, '0001-01-01'
                        name: lag
                        window function: GenericUDAFLagEvaluator
                        window frame: ROWS PRECEDING(MAX)~FOLLOWING(MAX)
                        isPivotResult: true
            Statistics: Num rows: 1 Data size: 0 Basic stats: PARTIAL Column stats: NONE
            Select Operator
              expressions: _col0 (type: string), _col1 (type: string), lag_window_0 (type: string)
              outputColumnNames: _col0, _col1, _col2
              Statistics: Num rows: 1 Data size: 0 Basic stats: PARTIAL Column stats: NONE
              File Output Operator
                compressed: false
                Statistics: Num rows: 1 Data size: 0 Basic stats: PARTIAL Column stats: NONE
                table:
                    input format: org.apache.hadoop.mapred.SequenceFileInputFormat
                    output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat
                    serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe

  Stage: Stage-0
    Fetch Operator
      limit: -1
      Processor Tree:
        ListSink

Time taken: 0.211 seconds, Fetched: 61 row(s)
hive> explain select
    >   user_id
    >   ,login_date
    >   ,max(login_date) over(partition by user_id ) prev_date
    > from test_over;
OK
STAGE DEPENDENCIES:
  Stage-1 is a root stage
  Stage-0 depends on stages: Stage-1

STAGE PLANS:
  Stage: Stage-1
    Map Reduce
      Map Operator Tree:
          TableScan
            alias: test_over
            Statistics: Num rows: 1 Data size: 0 Basic stats: PARTIAL Column stats: NONE
            Reduce Output Operator
              key expressions: user_id (type: string)
              sort order: +
              Map-reduce partition columns: user_id (type: string)
              Statistics: Num rows: 1 Data size: 0 Basic stats: PARTIAL Column stats: NONE
              value expressions: login_date (type: string)
      Execution mode: vectorized
      Reduce Operator Tree:
        Select Operator
          expressions: KEY.reducesinkkey0 (type: string), VALUE._col0 (type: string)
          outputColumnNames: _col0, _col1
          Statistics: Num rows: 1 Data size: 0 Basic stats: PARTIAL Column stats: NONE
          PTF Operator
            Function definitions:
                Input definition
                  input alias: ptf_0
                  output shape: _col0: string, _col1: string
                  type: WINDOWING
                Windowing table definition
                  input alias: ptf_1
                  name: windowingtablefunction
                  order by: _col0 ASC NULLS FIRST
                  partition by: _col0
                  raw input shape:
                  window functions:
                      window function definition
                        alias: max_window_0
                        arguments: _col1
                        name: max
                        window function: GenericUDAFMaxEvaluator
                        window frame: ROWS PRECEDING(MAX)~FOLLOWING(MAX)
            Statistics: Num rows: 1 Data size: 0 Basic stats: PARTIAL Column stats: NONE
            Select Operator
              expressions: _col0 (type: string), _col1 (type: string), max_window_0 (type: string)
              outputColumnNames: _col0, _col1, _col2
              Statistics: Num rows: 1 Data size: 0 Basic stats: PARTIAL Column stats: NONE
              File Output Operator
                compressed: false
                Statistics: Num rows: 1 Data size: 0 Basic stats: PARTIAL Column stats: NONE
                table:
                    input format: org.apache.hadoop.mapred.SequenceFileInputFormat
                    output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat
                    serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe

  Stage: Stage-0
    Fetch Operator
      limit: -1
      Processor Tree:
        ListSink

Time taken: 3.278 seconds, Fetched: 61 row(s)
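
The plans above only describe how the window is computed: rows are shuffled by user_id, sorted, and then the window function runs once per partition. To see what the lag and max queries return, here is a minimal sketch that runs the same two statements on a few made-up rows. It assumes SQLite (through Python's sqlite3 module, SQLite 3.25 or later) as a stand-in for Hive and Spark, so it shows only the result semantics, not the engines' plans. The sample rows and the helper name run_window_queries are hypothetical and do not come from the original post.

```python
import sqlite3

# Hypothetical sample rows; the original post does not show any table data.
ROWS = [
    ("u1", "2022-03-01"),
    ("u1", "2022-03-02"),
    ("u1", "2022-03-05"),
    ("u2", "2022-03-03"),
]


def run_window_queries():
    """Run the lag and max window queries from the post on an in-memory table."""
    conn = sqlite3.connect(":memory:")
    # SQLite has no string type keyword like Hive; text is the closest match.
    conn.execute("create table test_over (user_id text, login_date text)")
    conn.executemany("insert into test_over values (?, ?)", ROWS)

    # lag looks exactly one row back inside each user_id partition,
    # falling back to the default '0001-01-01' on the first row.
    lag_rows = conn.execute(
        """
        select user_id, login_date,
               lag(login_date, 1, '0001-01-01')
                   over (partition by user_id order by login_date) as prev_date
        from test_over
        order by user_id, login_date
        """
    ).fetchall()

    # Without order by, the frame covers the whole partition,
    # so every row of a user gets that user's latest login_date.
    max_rows = conn.execute(
        """
        select user_id, login_date,
               max(login_date) over (partition by user_id) as prev_date
        from test_over
        order by user_id, login_date
        """
    ).fetchall()

    conn.close()
    return lag_rows, max_rows


lag_rows, max_rows = run_window_queries()
for row in lag_rows:
    print("lag:", row)
for row in max_rows:
    print("max:", row)
```

The first query matches the Spark frame specifiedwindowframe(RowFrame, -1, -1): each row reads only its predecessor. The second matches the unbounded frame shown for max and first_value, where each row reads its whole partition.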

lateral view execution plan

spark

spark-sql> explain select
         >   user_id
         >   ,login_date
         >   ,single_num
         > from test_over
         > lateral view explode(split(login_date,'-')) tmp as single_num;
== Physical Plan ==
Generate explode(split(login_date#58, -)), [user_id#57, login_date#58], false, [single_num#59]
+- Scan hive default.test_over [user_id#57, login_date#58], HiveTableRelation `default`.`test_over`, org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe, [user_id#57, login_date#58]
Time taken: 0.103 seconds, Fetched 1 row(s)
22/03/10 14:39:38 INFO [main] SparkSQLCLIDriver: Time taken: 0.103 seconds, Fetched 1 row(s)

hive

hive> explain select
    >   user_id
    >   ,login_date
    >   ,single_num
    > from test_over
    > lateral view explode(split(login_date,'-')) tmp as single_num;
OK
STAGE DEPENDENCIES:
  Stage-0 is a root stage

STAGE PLANS:
  Stage: Stage-0
    Fetch Operator
      limit: -1
      Processor Tree:
        TableScan
          alias: test_over
          Statistics: Num rows: 1 Data size: 0 Basic stats: PARTIAL Column stats: NONE
          Lateral View Forward
            Statistics: Num rows: 1 Data size: 0 Basic stats: PARTIAL Column stats: NONE
            Select Operator
              expressions: user_id (type: string), login_date (type: string)
              outputColumnNames: user_id, login_date
              Statistics: Num rows: 1 Data size: 0 Basic stats: PARTIAL Column stats: NONE
              Lateral View Join Operator
                outputColumnNames: _col0, _col1, _col5
                Statistics: Num rows: 2 Data size: 0 Basic stats: PARTIAL Column stats: NONE
                Select Operator
                  expressions: _col0 (type: string), _col1 (type: string), _col5 (type: string)
                  outputColumnNames: _col0, _col1, _col2
                  Statistics: Num rows: 2 Data size: 0 Basic stats: PARTIAL Column stats: NONE
                  ListSink
            Select Operator
              expressions: split(login_date, '-') (type: array<string>)
              outputColumnNames: _col0
              Statistics: Num rows: 1 Data size: 0 Basic stats: PARTIAL Column stats: NONE
              UDTF Operator
                Statistics: Num rows: 1 Data size: 0 Basic stats: PARTIAL Column stats: NONE
                function name: explode
                Lateral View Join Operator
                  outputColumnNames: _col0, _col1, _col5
                  Statistics: Num rows: 2 Data size: 0 Basic stats: PARTIAL Column stats: NONE
                  Select Operator
                    expressions: _col0 (type: string), _col1 (type: string), _col5 (type: string)
                    outputColumnNames: _col0, _col1, _col2
                    Statistics: Num rows: 2 Data size: 0 Basic stats: PARTIAL Column stats: NONE
                    ListSink

Time taken: 0.081 seconds, Fetched: 41 row(s)
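
Neither lateral view plan contains a shuffle: Spark uses a single Generate node on top of the scan, and Hive forwards each input row down two branches (the original columns and the UDTF output) and joins them again in the Lateral View Join Operator. The per-row behaviour can be sketched in plain Python as follows. This is only an illustration of the semantics, not of either engine's implementation. The function name lateral_view_explode and the sample row are hypothetical, and the None handling assumes a lateral view without the outer keyword, which drops rows for which explode produces nothing.

```python
def lateral_view_explode(rows, sep="-"):
    """Emulate: select user_id, login_date, single_num
    from test_over lateral view explode(split(login_date, sep)) tmp as single_num.
    """
    result = []
    for user_id, login_date in rows:
        if login_date is None:
            # Assumption: explode yields no rows for a null array, so a
            # non-outer lateral view drops this input row entirely.
            continue
        # One output row per element produced by split, each joined back
        # to the columns of the row it came from.
        for single_num in login_date.split(sep):
            result.append((user_id, login_date, single_num))
    return result


# Hypothetical sample input; the original post shows no table data.
exploded = lateral_view_explode([("u1", "2022-03-10"), ("u2", None)])
for row in exploded:
    print(row)
```

Each input row is handled on its own, which is consistent with the map-only Hive plan (a single Fetch stage) and the exchange-free Spark plan shown above.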

