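(1) Common dt and date conversions in a Hive data warehouse

Below is a summary of the date conversions I use most often at work. They are typically used to control the time granularity and the statistical period of reports. Throughout, dt is a partition value in yyyyMMdd format (for example 20190330) and a date is a yyyy-MM-dd string.

Date conversions: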
(1) dt to date
to_date(from_unixtime(unix_timestamp('${dt}','yyyyMMdd')))
(2) date to dt
regexp_replace('${date}','-','')
(3) dt to the 1st of the current month
to_date(from_unixtime(unix_timestamp(concat(substr('${dt}',1,6),'01'),'yyyyMMdd')))
trunc(to_date(from_unixtime(unix_timestamp('${dt}','yyyyMMdd'))),'MM')
-- the 1st of next month
trunc(add_months(to_date(from_unixtime(unix_timestamp('${dt}','yyyyMMdd'))),1),'MM')
(4) dt to the Monday of the current week
next_day(date_add(to_date(from_unixtime(unix_timestamp('${dt}','yyyyMMdd'))), -7), 'Mo')
date_sub(next_day(to_date(from_unixtime(unix_timestamp('${dt}','yyyyMMdd'))),'MO'),7)
-- Monday of next week
next_day(to_date(from_unixtime(unix_timestamp('${dt}','yyyyMMdd'))),'MO')
(5) date six days before dt (when dt is a Sunday, this is the Monday of the same week)
date_add(to_date(from_unixtime(unix_timestamp('${dt}','yyyyMMdd'))), -6)
(6) dt to the first day of the current quarter
if(length(floor(substr('${dt}',5,2)/3.1)*3+1)=1,concat(substr('${dt}',1,4),'-0',floor(substr('${dt}',5,2)/3.1)*3+1,'-01'),concat(substr('${dt}',1,4),'-',floor(substr('${dt}',5,2)/3.1)*3+1,'-01'))
(7) dt to the first day of the current half-year
if(length(floor(substr('${dt}',5,2)/6.1)*6+1)=1,concat(substr('${dt}',1,4),'-0',floor(substr('${dt}',5,2)/6.1)*6+1,'-01'),concat(substr('${dt}',1,4),'-',floor(substr('${dt}',5,2)/6.1)*6+1,'-01'))
(8) dt to January 1st of the current year
concat(substr('${dt}',1,4),'-01-01')
(9) When day, week and month granularities coexist, watch the time range of the data: the first calendar week of a month can span two months. For example, the first week of March 2019 runs from 20190225 to 20190303.
where agent_business_date between date_add_day('${dt}',-31) and to_date(from_unixtime(unix_timestamp('${dt}','yyyyMMdd')))
where dt between regexp_replace(date_add_day('${dt}',-31),'-','') and '${dt}'
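
The quarter and half-year expressions above divide the month by 3.1 and 6.1 and then branch on the length of the result to decide whether to zero-pad. The same values can be obtained with integer arithmetic on (month - 1) plus lpad. The query below is only a sketch: the literal '20190330' stands in for ${dt}, the column aliases are made up for illustration, and it assumes a Hive version that has date_format and trunc (Hive 1.2.0 or later, to my knowledge).

```sql
-- Hive SQL sketch; '20190330' is an illustrative stand-in for ${dt}.
SELECT
    d                          AS dt_date,        -- dt as yyyy-MM-dd
    date_format(d, 'yyyyMMdd') AS date_as_dt,     -- back to yyyyMMdd
    trunc(d, 'MM')             AS month_start,    -- 1st of the month
    -- months 1-3 -> 01, 4-6 -> 04, 7-9 -> 07, 10-12 -> 10
    concat(cast(year(d) AS string), '-',
           lpad(cast(floor((month(d) - 1) / 3) * 3 + 1 AS string), 2, '0'),
           '-01')              AS quarter_start,
    -- months 1-6 -> 01, 7-12 -> 07
    concat(cast(year(d) AS string), '-',
           lpad(cast(floor((month(d) - 1) / 6) * 6 + 1 AS string), 2, '0'),
           '-01')              AS half_year_start,
    trunc(d, 'YY')             AS year_start      -- January 1st of the year
FROM (
    SELECT to_date(from_unixtime(unix_timestamp('20190330', 'yyyyMMdd'))) AS d
) t;
```

For 20190330 this should give 2019-03-30, 20190330, 2019-03-01, and 2019-01-01 for the quarter, half-year and year starts. lpad removes the need for the if(length(...)=1, ...) branch.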
-- Date dimension table structure: edw_public.dim_esf_edw_pub_date
------------------------------------------------------------------------------------------
col_name                    data_type   comment
------------------------------------------------------------------------
calendar_date               string      date, in "YYYY-MM-DD" format
week_english_name           string      English name of the weekday
week_chinese_name           string      Chinese name of the weekday
day_of_week_number          int         day number within the week
calendar_month_code         string      month the date belongs to, in "YYYY-MM" format
calendar_month_number       int         month number
month_english_name          string      English name of the month
month_chinese_name          string      Chinese name of the month
day_of_month_number         int         day number within the month
calendar_quater_code        string      quarter the date belongs to, in "YYYY-QT" format
calendar_quater_number      int         quarter number
day_of_quater_number        int         day number within the quarter
calendar_half_year_code     string      half-year the date belongs to, in "YYYY-HY" format
calendar_half_year_number   int         half-year number: 1 = first half, 2 = second half
calendar_year_code          string      year the date belongs to, in "YYYY" format
day_of_year_number          int         day number within the year
work_day_flag               string      working-day flag: Y = yes / N = no
holiday_flag                string      holiday flag: Y = yes / N = no

-- Using the date dimension table
-- Current date
SELECT calendar_date
FROM edw_public.dim_esf_edw_pub_date
WHERE calendar_date = regexp_replace('${dt}','(\\d{4})(\\d{2})(\\d{2})','$1-$2-$3')

-- FineReport: algorithm for the period-end date at day/week/month/quarter/half-year/year granularity
-- 粒度 (granularity): 1 = day, 2 = week, 3 = month, 4 = quarter, 5 = year, 6 = half-year
-- 开始时间 / 结束时间 are the start-date / end-date report parameters
select
${if(粒度 == 1," case when date(max(calendar_date))>=date(date_add('day',-1,current_date)) then date(date_add('day',-1,current_date)) else date(max(calendar_date)) end as period_end_date","")}
${if(粒度 == 2," distinct case when day_of_week_number = 1 and date_add('day',6,date(calendar_date)) >=date(date_add('day',-1,current_date)) then date(date_add('day',-1,current_date))  when day_of_week_number = 7 and date(calendar_date) >=date(date_add('day',-1,current_date)) then date(date_add('day',-1,current_date)) when day_of_week_number = 1 then date_add('day',6,date(calendar_date))  when day_of_week_number = 7 then date(calendar_date) else date(calendar_date) end as period_end_date ","")}
${if(粒度 == 3," case when date(max(calendar_date))>=date(date_add('day',-1,current_date)) then date(date_add('day',-1,current_date)) else date(max(calendar_date)) end as period_end_date ","")}
${if(粒度 == 4," case when date(max(calendar_date))>=date(date_add('day',-1,current_date)) then date(date_add('day',-1,current_date)) else date(max(calendar_date)) end as period_end_date ","")}
${if(粒度 == 5," case when date(max(calendar_date))>=date(date_add('day',-1,current_date)) then date(date_add('day',-1,current_date)) else date(max(calendar_date)) end as period_end_date ","")}
${if(粒度 == 6," case when date(max(calendar_date))>=date(date_add('day',-1,current_date)) then date(date_add('day',-1,current_date)) else date(max(calendar_date)) end as period_end_date ","")}
from edw_public.dim_esf_edw_pub_date
where calendar_date >= '${开始时间}' and calendar_date <= '${结束时间}'
${if(粒度 == 1," group by calendar_date ","")}
${if(粒度 == 2," and day_of_week_number in (1,7) ","")}
${if(粒度 == 3," group by calendar_month_code  ","")}
${if(粒度 == 4," group by calendar_quater_code  ","")}
${if(粒度 == 5," group by calendar_year_code  ","")}
${if(粒度 == 6," group by calendar_half_year_code  ","")}

-- FineReport: algorithm for the period-start and period-end dates at day/week/month/quarter/half-year/year granularity
-- (with this method, if the current date is 20190330 and the input date range is 2019-03-01 to 2019-03-28, the monthly period comes out ending on 2019-03-29)
select
${if(粒度 == 1,"date(calendar_date) as period_start_date, date(calendar_date) as period_end_date ","")}
${if(粒度 == 2,"case when day_of_week_number = 1 then  date(calendar_date) when day_of_week_number = 7 then date_add('day',-6, date(calendar_date)) end as period_start_date, case  when day_of_week_number = 1 and date_add('day',6, date(calendar_date)) >=date(date_add('day',-1,current_date)) then date(date_add('day',-1,current_date)) when day_of_week_number = 7 and date(calendar_date)>=date(date_add('day',-1,current_date)) then date(date_add('day',-1,current_date)) when day_of_week_number = 1 then date_add('day',6, date(calendar_date))  when day_of_week_number = 7 then date(calendar_date)  end as period_end_date ","")}
${if(粒度 == 3,"date(calendar_date) as period_start_date, case when date_add('day',-day(date(calendar_date)),date_add('month',1,(date(calendar_date))))>=date(date_add('day',-1,current_date)) then date(date_add('day',-1,current_date)) else date_add('day',-day(date(calendar_date)),date_add('month',1,(date(calendar_date)))) end as period_end_date ","")}
${if(粒度 == 4,"calendar_date as period_start_date,date_add('day',-1,date_add('month',1,date(substr(calendar_date,1,4)||'-'||cast(cast(floor(cast(substr(calendar_date,6,2) as int)/3.1)*3+3 as int) as varchar)||'-01')))  as  period_end_date ","")}
${if(粒度 == 5,"date(concat(substr(calendar_date,1,4),'-01','-01')) as period_start_date,case when date(concat(substr(calendar_date,1,4),'-12','-31'))>= date(date_add('day',-1,current_date)) then date(date_add('day',-1,current_date)) else date(concat(substr(calendar_date,1,4),'-12','-31')) end as  period_end_date","")}
${if(粒度 == 6,"date(min(calendar_date)) as period_start_date,case when date(max(calendar_date))>= date(date_add('day',-1,current_date)) then date(date_add('day',-1,current_date)) else date(max(calendar_date)) end as period_end_date","")}
from edw_public.dim_esf_edw_pub_date
where  calendar_date >= '${开始时间}' and calendar_date <= '${结束时间}'
${if(粒度 == 1," and 1 = 1 ","")}
${if(粒度 == 2," and day_of_week_number in (1,7) ","")}
${if(粒度 == 3," and day_of_month_number = 1","")}
${if(粒度 == 4," and day_of_quater_number = 1","")}
${if(粒度 == 5," and day_of_year_number = 1","")}
${if(粒度 == 6," group by calendar_half_year_code ","")}
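
The "case when X >= yesterday then yesterday else X end" pattern repeated in the templates above is simply the smaller of two dates. Below is a sketch of the month granularity written with least() and the min/max grouping that the half-year branch already uses. It assumes the report runs on Presto (the date_add('day', ...) and || syntax suggest this, but the source does not say so), and the literal dates stand in for ${开始时间} and ${结束时间}.

```sql
-- Presto-style sketch; the two literal dates are illustrative stand-ins
-- for the ${开始时间} / ${结束时间} report parameters.
SELECT
    calendar_month_code,
    date(min(calendar_date)) AS period_start_date,
    -- period end = the earlier of "last selected day in the month" and "yesterday"
    least(date(max(calendar_date)),
          date_add('day', -1, current_date)) AS period_end_date
FROM edw_public.dim_esf_edw_pub_date
WHERE calendar_date >= '2019-03-01'
  AND calendar_date <= '2019-03-28'
GROUP BY calendar_month_code
```

Because max() is taken over the filtered rows, the period end in this variant cannot exceed the requested end date. With a current date of 20190330 it would return 2019-03-28, whereas the month branch of the template derives the calendar month end and so returns yesterday, 2019-03-29.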
-- Compute the period-end dates for the input time range
------------------------------------------------------------------------------------------------
select t1.*
from
-- Statistics at day, week, month, quarter, year and half-year granularity are each stored in their own table
edw_reports.adm_xf_edw_house_sub_project_report_00${dtype}ly_di t1 -- daily report
join
(
-- day
SELECT calendar_date
FROM edw_public.dim_esf_edw_pub_date
WHERE calendar_date BETWEEN '${bdt}' AND '${edt}'
AND '${dtype}' = '1_dai'
UNION
-- month (grouped by the YYYY-MM code so the same month of different years is not merged)
SELECT MAX(calendar_date) AS calendar_date
FROM edw_public.dim_esf_edw_pub_date
WHERE calendar_date BETWEEN '${bdt}' AND '${edt}'
AND '${dtype}' = '2_dai'
GROUP BY calendar_month_code
UNION
-- week
SELECT calendar_date
FROM edw_public.dim_esf_edw_pub_date
WHERE calendar_date BETWEEN '${bdt}' AND '${edt}'
AND day_of_week_number = 7
AND '${dtype}' = '3_dai'
UNION
-- quarter
SELECT MAX(calendar_date) AS calendar_date
FROM edw_public.dim_esf_edw_pub_date
WHERE calendar_date BETWEEN '${bdt}' AND '${edt}'
AND '${dtype}' = '4_dai'
GROUP BY calendar_quater_code
UNION
-- year
SELECT MAX(calendar_date) AS calendar_date
FROM edw_public.dim_esf_edw_pub_date
WHERE calendar_date BETWEEN '${bdt}' AND '${edt}'
AND '${dtype}' = '5_dai'
GROUP BY calendar_year_code
UNION
-- half-year
SELECT MAX(calendar_date) AS calendar_date
FROM edw_public.dim_esf_edw_pub_date
WHERE calendar_date BETWEEN '${bdt}' AND '${edt}'
AND '${dtype}' = '6_dai'
GROUP BY calendar_half_year_code
UNION
-- last date of the input range (this branch has no ${dtype} filter, so it applies to every granularity)
SELECT MAX(calendar_date) AS calendar_date
FROM edw_public.dim_esf_edw_pub_date
WHERE calendar_date BETWEEN '${bdt}' AND '${edt}'
ORDER BY calendar_date
) t2
on t1.statistic_date = t2.calendar_date
where
statistic_date between '${bdt}' and '${edt}'
${if(len(tenant_name) == 0,"","and house_sub_project_organization_short_name = '" + tenant_name + "'")}
${if(len(status) == 0,"","and house_sub_project_cooperation_status_code = " + status)}
${if(len(tenant_type) == 0,"","and house_sub_project_organization_business_type_code= " + tenant_type)}
${if(len(project_type) == 0,"","and house_sub_project_cooperation_type_code= " + project_type)}
order by statistic_date
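
To see what the week branch plus the final unfiltered branch of the subquery select, the sketch below reproduces them in Hive SQL without the dimension table. Everything concrete in it is an assumption made for illustration: the inline calendar generated with posexplode replaces edw_public.dim_esf_edw_pub_date, the literal range 2019-03-01 to 2019-03-28 replaces ${bdt} and ${edt}, and 2018-01-07 is used as a known Sunday to detect Sundays. It assumes a Hive version that supports UNION without ALL (Hive 1.2.0 or later, to my knowledge).

```sql
-- Hive SQL sketch with an inline calendar and illustrative literal dates.
WITH cal AS (
    -- one row per day from 2019-03-01 to 2019-03-28
    SELECT date_add('2019-03-01', e.pos) AS calendar_date
    FROM (SELECT 1 AS dummy) s
    LATERAL VIEW posexplode(
        split(space(datediff('2019-03-28', '2019-03-01')), ' ')
    ) e AS pos, val
)
-- week ends: every Sunday inside the range (2018-01-07 is a Sunday)
SELECT calendar_date
FROM cal
WHERE pmod(datediff(calendar_date, '2018-01-07'), 7) = 0
UNION
-- last date of the range, so a trailing partial week still gets a row
SELECT max(calendar_date) AS calendar_date
FROM cal;
```

For this range the result should be the four Sundays 2019-03-03, 2019-03-10, 2019-03-17 and 2019-03-24, plus 2019-03-28 for the incomplete last week.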

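(2) Hive: computing which day of the week a given date falls on, and the date of a given weekday within that date's week

Note that you first need to settle whether the week starts on Monday or on Sunday. The dayofweek function treats Sunday as the first day of the week. In addition, dayofweek is only supported from Hive 2.2.0 onward; lower Hive versions do not have it and need another approach. See my blog post "Hive和sparksql中的dayofweek" (dayofweek in Hive and Spark SQL).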
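
One possible workaround for Hive versions without dayofweek is to count the days from a known Sunday (or a known Monday) and take the remainder modulo 7. This is a sketch, not necessarily the method described in the blog post mentioned above; the anchor dates 2018-01-07 (a Sunday) and 2018-01-01 (a Monday) and the column aliases are my own choices.

```sql
-- Hive SQL sketch: day-of-week numbers without dayofweek().
SELECT
    day,
    -- same numbering as dayofweek(): 1 = Sunday ... 7 = Saturday
    pmod(datediff(day, '2018-01-07'), 7) + 1 AS dw_sunday_first,
    -- Monday-first numbering: 1 = Monday ... 7 = Sunday
    pmod(datediff(day, '2018-01-01'), 7) + 1 AS dw_monday_first
FROM (
    SELECT '2018-11-01' AS day UNION ALL
    SELECT '2018-11-04' AS day
) t;
```

2018-11-01 is a Thursday, so it should yield 5 and 4; 2018-11-04 is a Sunday, so it should yield 1 and 7. pmod keeps the remainder non-negative for dates before the anchor as well.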

-- Compute the first and last day of the week that contains a given date
select day
     , dayofweek(day)                                                                    as dw1                -- day of week, Sunday = 1
     , date_add(day,1 - dayofweek(day))                                                  as Su_s               -- week start (Sunday)
     , date_add(day,7 - dayofweek(day))                                                  as Sa_e               -- week end (Saturday)
     , case when dayofweek(day) = 1 then 7 else dayofweek(day) - 1 end                   as dw2                -- day of week, Monday = 1
     , date_add(day,1 - case when dayofweek(day) = 1 then 7 else dayofweek(day) - 1 end) as Mo_s               -- week start (Monday)
     , date_add(day,7 - case when dayofweek(day) = 1 then 7 else dayofweek(day) - 1 end) as Su_e               -- week end (Sunday)
     , trunc(day,'YY')                                                                   as yearly_first_day   -- January 1st of the year
     , trunc(day,'MM')                                                                   as monthly_first_day  -- 1st of the month
     , last_day(day)                                                                     as monthly_last_day   -- last day of the month
     , date_add(next_day(day,'MO'), -7)                                                  as weekly_first_day   -- Monday of the week
     , next_day(date_add(day, -7),'MO')                                                  as weekly_first_day_2 -- Monday of the week (alternative form)
     , case when (7 - datediff(next_day(day,'SU'),day)) <> 0 then next_day(day,'SU') else day end as weekly_end_day -- Sunday of the week
from (
select '2018-11-01' as day union all select '2018-11-02' as day union all select '2018-11-03' as day union all select '2018-11-04' as day union all select '2018-11-05' as day union all
select '2018-11-06' as day union all select '2018-11-07' as day union all select '2018-11-08' as day union all select '2018-11-09' as day union all select '2018-11-10' as day union all
select '2018-11-11' as day union all select '2018-11-12' as day union all select '2018-11-13' as day union all select '2018-11-14' as day union all select '2018-11-15' as day union all
select '2018-11-16' as day union all select '2018-11-17' as day union all select '2018-11-18' as day union all select '2018-11-19' as day union all select '2018-11-20' as day union all
select '2018-11-21' as day union all select '2018-11-22' as day union all select '2018-11-23' as day union all select '2018-11-24' as day union all select '2018-11-25' as day union all
select '2018-11-26' as day union all select '2018-11-27' as day union all select '2018-11-28' as day union all select '2018-11-29' as day union all select '2018-11-30' as day
) t1
;

Other references:

"Hive 时间日期处理总结" (a summary of date and time handling in Hive)

Reposted from: https://www.cnblogs.com/shujuxiong/p/10001437.html
