A Plan for Importing One Billion Rows into Oracle

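Plan: The data set is very large and contains duplicate rows, so loading it straight into the target table would be extremely slow. Instead, load it into a staging table first, then deduplicate while inserting into the final table. Both the staging table and the final table are partitioned and created without primary keys or indexes, and both are set to NOLOGGING with parallelism. SQL*Loader (sqlldr) in direct-path, parallel mode loads the data into the staging table. The columns of the future primary key are then indexed on the staging table as a local index (to make the later queries faster), each staging partition is deduplicated and inserted into the final table, the primary key and indexes are built on the final table, the final table is switched back to LOGGING and NOPARALLEL, and the staging table is dropped.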

Steps:

1. Create the staging table

-- Create the staging table
create table ESB_CUSTOMERNO_RELATION_OLD
(
  CUSTOMERNO  VARCHAR2(40) not null,
  IDTYPE      VARCHAR2(10) not null,
  CUSTOMERID  VARCHAR2(20) not null,
  CREDATE     DATE,
  UPDDATE     DATE,
  CUSTOMERSEQ VARCHAR2(2) not null
)
partition by hash (CUSTOMERID)
(
  partition P01 tablespace TCBUCC_DATA_P01,
  ......
  partition P42 tablespace TCBUCC_DATA_P42
);

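2. Load the data into the staging table
2.1 The txt files are pipe-delimited, with the fields in the order CUSTOMERNO|IDTYPE|CUSTOMERID|CUSTOMERSEQ|CREDATE|UPDDATE: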

10000201044324534026|a|320223196301195428|1|20180504120000|20180504120000
10000201020567704042|a|320222198002261889|1|20180504120000|20180504120000
......
10000201012050396509|a|640221197309130625|1|20180504120000|20180504120000

2.2 The ctl (control) file is as follows:

load  data
CHARACTERSET AL32UTF8
infile 'esb_customerno_relation_1_01.txt'
......
infile 'esb_customerno_relation_1_55.txt'
append
into  table  esb_customerno_relation_old
FIELDS TERMINATED BY '|' TRAILING NULLCOLS
(CUSTOMERNO,
IDTYPE,
CUSTOMERID,
CUSTOMERSEQ,
CREDATE date 'yyyymmddhh24miss',
UPDDATE date 'yyyymmddhh24miss'
)

To default IDTYPE to a when it is empty, change IDTYPE to IDTYPE "nvl(:IDTYPE,'a')"

To set CUSTOMERSEQ to "3" for every row, change CUSTOMERSEQ to CUSTOMERSEQ "3"

2.3 sqlldr command

nohup sqlldr root/123456@127.0.0.1:1521/root control=esb_customerno_relation.ctl direct=true parallel=true errors=99999999 readsize=209715200 bindsize=33554432 > esb_customerno_relation.out 2>&1 &

Loading the 960 million rows took 80 minutes in total.
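Because the sqlldr command above tolerates a very large number of errors (errors=99999999), rejected rows do not stop the load, so it is worth checking the sqlldr log and .bad files and confirming the row count before moving on. The queries below are a small sketch of such a check. They assume that the key columns of the final table, (CUSTOMERID, CUSTOMERNO, CUSTOMERSEQ), define what counts as a duplicate, and the parallel degree of 16 in the hints is likewise an assumption.

```sql
-- Total number of rows that reached the staging table
select /*+ parallel(s, 16) */ count(*) as total_rows
from esb_customerno_relation_old s;

-- Number of key values that occur more than once
select count(*) as duplicated_keys
from (
  select /*+ parallel(s, 16) */ s.customerid, s.customerno, s.customerseq
  from esb_customerno_relation_old s
  group by s.customerid, s.customerno, s.customerseq
  having count(*) > 1
);
```

If total_rows is lower than the number of lines in the source files, the missing rows should appear in the .bad file together with a reason in the log.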

3. Create the final table

-- Create the final table
create table ESB_CUSTOMERNO_RELATION
(
  CUSTOMERNO  VARCHAR2(40) not null,
  IDTYPE      VARCHAR2(10) not null,
  CUSTOMERID  VARCHAR2(20) not null,
  CREDATE     DATE,
  UPDDATE     DATE,
  CUSTOMERSEQ VARCHAR2(2) not null
)
partition by hash (CUSTOMERID)
(
  partition P01 tablespace TCBUCC_DATA_P01,
  ......
  partition P42 tablespace TCBUCC_DATA_P42
);

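4. Set the staging table and the final table to NOLOGGING and parallel, and create a local index on the staging table

4.1 Check the CPU information. Run this in a command window (show parameters is a SQL*Plus command). As a rule of thumb, the degree of parallelism should not exceed the number of CPU cores; the result here was 16 cores.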

show parameters cpu;
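In a client that does not understand SQL*Plus commands, the same information can be read from the parameter view. This is a minimal sketch; it assumes the account has been granted SELECT on v$parameter, and the choice of the three parameter names is only a suggestion for what is relevant to picking a degree of parallelism.

```sql
-- CPU count seen by the instance and the limits on parallel execution servers
select name, value
from v$parameter
where name in ('cpu_count', 'parallel_max_servers', 'parallel_threads_per_cpu');
```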

4.2 Set NOLOGGING and parallel

alter table ESB_CUSTOMERNO_RELATION nologging parallel 16;
alter table ESB_CUSTOMERNO_RELATION_OLD nologging parallel 16;

4.3 Create a local index on the staging table; (CUSTOMERID, CUSTOMERNO, CUSTOMERSEQ) is the composite primary key of the final table
create index IDX_ESB_CUSTOMERNO_OLD_01 on ESB_CUSTOMERNO_RELATION_OLD (CUSTOMERID, CUSTOMERNO, CUSTOMERSEQ) local nologging parallel 16;

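Problem encountered: the index build ran out of undo and temporary tablespace.

Solution: increase the size of the undo tablespace and the temporary tablespace.

Add space to the undo tablespace (an undo tablespace uses data files, so the clause is add datafile, not add tempfile):

alter tablespace UNDOTBS1 add datafile '/oracle/oradata/undotbs02.dbf' size 67m reuse autoextend on next 100m maxsize 32767m;

Add space to the temporary tablespace:

alter tablespace temp add tempfile '/oracle/oradata/temp02.dbf' size 67m reuse autoextend on next 100m maxsize 32767m;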
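To see how much room is available before and after adding files, the dictionary views can be queried. The sketch below assumes DBA-level access to the dba_ views and the tablespace name UNDOTBS1 used above; dba_temp_free_space is available from Oracle 11g onward.

```sql
-- Temporary tablespaces: total size and space still free
select tablespace_name,
       round(tablespace_size / 1024 / 1024) as size_mb,
       round(free_space / 1024 / 1024)      as free_mb
from dba_temp_free_space;

-- Undo tablespace: current file sizes and autoextend limits
select file_name,
       round(bytes / 1024 / 1024)    as size_mb,
       autoextensible,
       round(maxbytes / 1024 / 1024) as max_mb
from dba_data_files
where tablespace_name = 'UNDOTBS1';
```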

Time taken: 16 minutes

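5. Deduplicate each partition and insert it into the final table

Both tables are hash partitioned on CUSTOMERID, so all rows that share a key land in the same partition and deduplicating one partition at a time is sufficient.

declare
  insert_sql varchar2(4000); -- the generated statement is longer than 200 characters
  cursor cur_dupl is
    select s.partition_name
    from user_tab_partitions s
    where s.table_name = 'ESB_CUSTOMERNO_RELATION_OLD';
begin
  for partition_record in cur_dupl loop
    -- keep one row per (customerid, customerno, customerseq) within the partition
    insert_sql := 'insert into esb_customerno_relation p
      (customerno, customerid, customerseq, idtype, credate, upddate)
      select ss.customerno, ss.customerid, ss.customerseq, ss.idtype, ss.credate, ss.upddate
      from (select s.customerno, s.customerid, s.customerseq, s.idtype, s.credate, s.upddate,
                   row_number() over(partition by s.customerid, s.customerno, s.customerseq order by rownum) as rn
            from esb_customerno_relation_old partition(' || partition_record.partition_name || ') s) ss
      where rn = 1';
    dbms_output.put_line('Executing: ' || insert_sql);
    execute immediate insert_sql;
    commit;
    dbms_output.put_line('Partition ' || partition_record.partition_name || ' inserted');
  end loop;
exception
  when others then
    dbms_output.put_line('sqlerrm-->' || sqlerrm);
    rollback;
    raise; -- re-raise so that a failure is not silently swallowed
end;
/

Time taken: 160 minutes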
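One point to keep in mind: NOLOGGING only reduces redo for direct-path operations, and a table-level PARALLEL setting parallelizes only the SELECT side of an INSERT ... SELECT unless parallel DML is enabled in the session. The following is a minimal sketch of a direct-path variant for a single partition. The partition name P01 stands in for whatever name the loop would substitute, and the sketch assumes nothing else writes to the final table during the load. It is an alternative to consider, not the statement that produced the timing reported above.

```sql
-- Allow the INSERT itself, not just the query, to run in parallel
alter session enable parallel dml;

-- Direct-path insert of one deduplicated staging partition
insert /*+ append */ into esb_customerno_relation
  (customerno, customerid, customerseq, idtype, credate, upddate)
select ss.customerno, ss.customerid, ss.customerseq,
       ss.idtype, ss.credate, ss.upddate
from (
  select s.customerno, s.customerid, s.customerseq,
         s.idtype, s.credate, s.upddate,
         row_number() over (
           partition by s.customerid, s.customerno, s.customerseq
           order by rownum
         ) as rn
  from esb_customerno_relation_old partition (P01) s
) ss
where ss.rn = 1;

-- Required before this session touches the table again
commit;
```

After a direct-path or parallel insert, the same session cannot read or modify the table until it commits or rolls back (otherwise ORA-12838 is raised), which is why the commit follows each statement.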

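6. Create the primary key and indexes on the final table

6.1 Create the primary key and indexes on the final table. Partitioning the global index is recommended; it was left unpartitioned here for certain reasons.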

create unique index PK_ESB_CUSTOMERNO_RELATION on ESB_CUSTOMERNO_RELATION(CUSTOMERID, CUSTOMERNO, CUSTOMERSEQ) local nologging parallel 16;
alter table ESB_CUSTOMERNO_RELATION add constraint PK_ESB_CUSTOMERNO_RELATION primary key (CUSTOMERID, CUSTOMERNO, CUSTOMERSEQ);
create index IDX_ESB_CUSTOMERNO_RELATION_01 on ESB_CUSTOMERNO_RELATION (UPDDATE) tablespace tcbucc_index nologging parallel 16;

Time taken: 26 minutes

6.2 Switch the final table and its indexes back to LOGGING and NOPARALLEL

alter index PK_ESB_CUSTOMERNO_RELATION logging noparallel;
alter index IDX_ESB_CUSTOMERNO_RELATION_01 logging noparallel;
alter table ESB_CUSTOMERNO_RELATION logging noparallel;
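The result can be confirmed from the data dictionary. The queries below are a small sketch using the object names from this article; for a partitioned index, user_indexes reports STATUS as N/A, so the per-partition view is checked as well.

```sql
-- Index-level settings: degree should be 1 after the change
select index_name, uniqueness, status, degree, logging
from user_indexes
where table_name = 'ESB_CUSTOMERNO_RELATION';

-- Per-partition state of the local primary key index
select index_name, partition_name, status, logging
from user_ind_partitions
where index_name = 'PK_ESB_CUSTOMERNO_RELATION'
order by partition_position;

-- The primary key constraint and the index that backs it
select constraint_name, status, validated, index_name
from user_constraints
where table_name = 'ESB_CUSTOMERNO_RELATION'
  and constraint_type = 'P';
```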

7. Drop the staging table

drop table ESB_CUSTOMERNO_RELATION_OLD;

Total time: close to 5 hours

8. Some useful SQL statements

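-- Partitioned indexes
select * from user_part_indexes;
-- Partitioned tables
select * from user_part_tables;
-- Index partition details
select * from user_ind_partitions;
-- Table partition details
select * from user_tab_partitions;
-- Segments
select * from user_segments;
-- Temporary tablespace extent usage
select * from v$temp_extent_pool;
-- Temporary tablespace files (physical paths)
select * from dba_temp_files;
-- Tablespaces
select * from dba_tablespaces;
-- Free space in tablespaces
select * from dba_free_space;
-- Tablespace data files (physical paths)
select * from dba_data_files;
-- Parallel execution server processes
select * from v$px_process;
-- System privileges of the current user
select * from user_sys_privs;
-- Roles granted to the current user
select * from user_role_privs;
-- Degree of parallelism of each table
select table_name,degree from user_tables;
-- Generate an execution plan:
explain plan for ...
-- Display the execution plan:
select * from table(dbms_xplan.display);
-- CPU parameters
show parameters cpu;
-- Parallel execution parameters
show parameter parallel;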

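Closing note: I work as a Java developer, not a professional DBA, so some parts may not be handled as well as they could be. If you spot any problems, corrections and criticism are welcome. Thank you!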
