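Use Sqoop to import the data in the us_order table into Hive. The Hive database is exam_ods and the table is ods_us_order. Import the data into partitions according to the date in order_date, and package the steps as a script.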

Solution

#!/bin/bash
# First drop the existing table in Hive so the import starts clean
/usr/local/hive/bin/hive -e "drop table if exists exam_ods.ods_us_order" > /dev/null 2>&1
# Collect the distinct order dates; -N drops the column-name header so only dates are returned
ALL_DATE=`/usr/bin/mysql -h mypc01 -uroot -p123456 -s -N -e 'select date(order_date) from sz.us_order group by date(order_date)'`
# Variables used below
SQOOP_HOME=/usr/local/sqoop
MYSQL_CONNECT=jdbc:mysql://mypc01:3306/sz
MYSQL_USERNAME=root
MYSQL_PWD=123456
for ONE_DAY in ${ALL_DATE}
do
${SQOOP_HOME}/bin/sqoop import \
--connect ${MYSQL_CONNECT}  \
--username ${MYSQL_USERNAME} \
--password ${MYSQL_PWD}  \
--table 'us_order' \
--where "date(order_date)='${ONE_DAY}'" \
--hive-import \
--hive-overwrite \
--hive-table 'exam_ods.ods_us_order' \
--fields-terminated-by ',' \
--hive-partition-key 'dt' \
--hive-partition-value "${ONE_DAY}" \
--num-mappers 3
done
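
Before running the import against a real cluster, it can help to preview the exact sqoop command generated for one day. The following is a minimal sketch, not part of the original script: the DRY_RUN switch and the run and import_one_day helpers are hypothetical names introduced here, while the sqoop flags are the ones used in the script above.

```shell
#!/bin/bash
# Assumption: DRY_RUN is a hypothetical switch of this sketch, not a sqoop feature.
DRY_RUN=${DRY_RUN:-1}
SQOOP_HOME=/usr/local/sqoop

# Print the command when DRY_RUN=1, otherwise execute it.
run() {
    if [ "${DRY_RUN}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Build the import for one day with the same flags as the script above.
import_one_day() {
    local one_day=$1
    run "${SQOOP_HOME}/bin/sqoop" import \
        --connect "jdbc:mysql://mypc01:3306/sz" \
        --username root \
        --password 123456 \
        --table us_order \
        --where "date(order_date)='${one_day}'" \
        --hive-import \
        --hive-overwrite \
        --hive-table exam_ods.ods_us_order \
        --fields-terminated-by ',' \
        --hive-partition-key dt \
        --hive-partition-value "${one_day}" \
        --num-mappers 3
}

# Preview the command for a single partition.
preview=$(import_one_day 2019-07-14)
echo "${preview}"
```

Running the sketch with DRY_RUN=0 would execute the command instead of printing it, which assumes sqoop is installed under SQOOP_HOME.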

Notes

  • hive -e executes a single SQL statement passed on the command line. The other commonly used hive command-line options are:
usage: hive
 -d,--define <key=value>          Variable subsitution to apply to hive commands. e.g. -d A=B or --define A=B
    --database <databasename>     Specify the database to use
 -e <quoted-query-string>         SQL from command line
 -f <filename>                    SQL from files
 -H,--help                        Print help information
    --hiveconf <property=value>   Use value for given property
    --hivevar <key=value>         Variable subsitution to apply to hive commands. e.g. --hivevar A=B
 -i <filename>                    Initialization SQL file
 -S,--silent                      Silent mode in interactive shell
 -v,--verbose                     Verbose mode (echo executed SQL to the console)
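  • > /dev/null 2>&1 redirects standard output to /dev/null (the "black hole") and then points standard error at the same place, so nothing from the command is shown on the console.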
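  • /usr/bin/mysql -h mypc01 -uroot -p123456 -s -e 'select date(order_date) from sz.us_order group by date(order_date)' connects to the MySQL server on host mypc01 and executes one SQL statement.
  • From the help text: -h is followed by the host name of the MySQL server, -e is followed by the SQL statement, and -s selects silent output (tab-separated rows without the table borders). The column-name header is suppressed by -N (--skip-column-names).
    The mysql client has many other options that can be used from the Linux command line, for example: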
# mysql --help
Usage: mysql [OPTIONS] [database]
  -?, --help          Display this help and exit.
  -I, --help          Synonym for -?
  --auto-rehash       Enable automatic rehashing. One doesn't need to use 'rehash' to get table and field completion, but startup and reconnecting may take a longer time. Disable with --disable-auto-rehash. (Defaults to on; use --skip-auto-rehash to disable.)
  -A, --no-auto-rehash  No automatic rehashing. One has to use 'rehash' to get table and field completion. This gives a quicker start of mysql and disables rehashing on reconnect.
  --auto-vertical-output  Automatically switch to vertical output mode if the result is wider than the terminal width.
  -B, --batch         Don't use history file. Disable interactive behavior. (Enables --silent.)
  --bind-address=name IP address to bind to.
  --binary-as-hex     Print binary data as hex
  --character-sets-dir=name  Directory for character set files.
  --column-type-info  Display column type information.
  -c, --comments      Preserve comments. Send comments to the server. The default is --skip-comments (discard comments), enable with --comments.
  -C, --compress      Use compression in server/client protocol.
  -#, --debug[=#]     This is a non-debug version. Catch this and exit.
  --debug-check       This is a non-debug version. Catch this and exit.
  -T, --debug-info    This is a non-debug version. Catch this and exit.
  -D, --database=name Database to use.
  --default-character-set=name  Set the default character set.
  --delimiter=name    Delimiter to be used.
  --enable-cleartext-plugin  Enable/disable the clear text authentication plugin.
  -e, --execute=name  Execute command and quit. (Disables --force and history file.)
  -E, --vertical      Print the output of a query (rows) vertically.
  -f, --force         Continue even if we get an SQL error.
  --histignore=name   A colon-separated list of patterns to keep statements from getting logged into syslog and mysql history.
  -G, --named-commands  Enable named commands. Named commands mean this program's internal commands; see mysql> help . When enabled, the named commands can be used from any line of the query, otherwise only from the first line, before an enter. Disable with --disable-named-commands. This option is disabled by default.
  -i, --ignore-spaces Ignore space after function names.
  --init-command=name SQL Command to execute when connecting to MySQL server. Will automatically be re-executed when reconnecting.
  --local-infile      Enable/disable LOAD DATA LOCAL INFILE.
  -b, --no-beep       Turn off beep on error.
  -h, --host=name     Connect to host.
  -H, --html          Produce HTML output.
  -X, --xml           Produce XML output.
  --line-numbers      Write line numbers for errors. (Defaults to on; use --skip-line-numbers to disable.)
  -L, --skip-line-numbers  Don't write line number for errors.
  -n, --unbuffered    Flush buffer after each query.
  --column-names      Write column names in results. (Defaults to on; use --skip-column-names to disable.)
  -N, --skip-column-names  Don't write column names in results.
  --sigint-ignore     Ignore SIGINT (CTRL-C).
  -o, --one-database  Ignore statements except those that occur while the default database is the one named at the command line.
  --pager[=name]      Pager to use to display results. If you don't supply an option, the default pager is taken from your ENV variable PAGER. Valid pagers are less, more, cat [> filename], etc. See interactive help (\h) also. This option does not work in batch mode. Disable with --disable-pager. This option is disabled by default.
  -p, --password[=name]  Password to use when connecting to server. If password is not given it's asked from the tty.
  -P, --port=#        Port number to use for connection or 0 for default to, in order of preference, my.cnf, $MYSQL_TCP_PORT, /etc/services, built-in default (3306).
  --prompt=name       Set the mysql prompt to this value.
  --protocol=name     The protocol to use for connection (tcp, socket, pipe, memory).
  -q, --quick         Don't cache result, print it row by row. This may slow down the server if the output is suspended. Doesn't use history file.
  -r, --raw           Write fields without conversion. Used with --batch.
  --reconnect         Reconnect if the connection is lost. Disable with --disable-reconnect. This option is enabled by default. (Defaults to on; use --skip-reconnect to disable.)
  -s, --silent        Be more silent. Print results with a tab as separator, each row on new line.
  -S, --socket=name   The socket file to use for connection.
  --ssl-mode=name     SSL connection mode.
  --ssl               Deprecated. Use --ssl-mode instead. (Defaults to on; use --skip-ssl to disable.)
  --ssl-verify-server-cert  Deprecated. Use --ssl-mode=VERIFY_IDENTITY instead.
  --ssl-ca=name       CA file in PEM format.
  --ssl-capath=name   CA directory.
  --ssl-cert=name     X509 cert in PEM format.
  --ssl-cipher=name   SSL cipher to use.
  --ssl-key=name      X509 key in PEM format.
  --ssl-crl=name      Certificate revocation list.
  --ssl-crlpath=name  Certificate revocation list path.
  --tls-version=name  TLS version to use, permitted values are: TLSv1, TLSv1.1, TLSv1.2
  --server-public-key-path=name  File path to the server public RSA key in PEM format.
  --get-server-public-key  Get server public key
  -t, --table         Output in table format.
  --tee=name          Append everything into outfile. See interactive help (\h) also. Does not work in batch mode. Disable with --disable-tee. This option is disabled by default.
  -u, --user=name     User for login if not current user.
  -U, --safe-updates  Only allow UPDATE and DELETE that uses keys.
  -U, --i-am-a-dummy  Synonym for option --safe-updates, -U.
  -v, --verbose       Write more. (-v -v -v gives the table output format).
  -V, --version       Output version information and exit.
  -w, --wait          Wait and retry if connection is down.
  --connect-timeout=# Number of seconds before connection timeout.
  --max-allowed-packet=#  The maximum packet length to send to or receive from server.
  --net-buffer-length=#  The buffer size for TCP/IP and socket communication.
  --select-limit=#    Automatic limit for SELECT when using --safe-updates.
  --max-join-size=#   Automatic limit for rows in a join when using --safe-updates.
  --secure-auth       Refuse client connecting to server if it uses old (pre-4.1.1) protocol. Deprecated. Always TRUE
  --server-arg=name   Send embedded server this as a parameter.
  --show-warnings     Show warnings after every statement.
  -j, --syslog        Log filtered interactive commands to syslog. Filtering of commands depends on the patterns supplied via histignore option besides the default patterns.
  --plugin-dir=name   Directory for client-side plugins.
  --default-auth=name Default authentication client-side plugin to use.
  --binary-mode       By default, ASCII '\0' is disallowed and '\r\n' is translated to '\n'. This switch turns off both features, and also turns off parsing of all clientcommands except \C and DELIMITER, in non-interactive mode (for input piped to mysql or loaded using the 'source' command). This is necessary when processing output from mysqlbinlog that may contain blobs.
  --connect-expired-password  Notify the server that this client is prepared to handle expired password sandbox mode.
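
The effect of the > /dev/null 2>&1 redirection described in the notes above can be checked without Hive or MySQL. The sketch below uses a hypothetical helper, noisy, that writes one line to each stream; it is only an illustration of how the order of the two redirections matters.

```shell
#!/bin/bash
# Hypothetical helper that writes one line to stdout and one to stderr.
noisy() {
    echo "to stdout"
    echo "to stderr" >&2
}

# stdout is sent to /dev/null first, then stderr is pointed at the same place,
# so nothing is left to capture.
silenced=$(noisy > /dev/null 2>&1)

# Reversed order: stderr is first pointed at the current stdout (the capture),
# and only then is stdout discarded, so the stderr line survives.
only_err=$(noisy 2>&1 > /dev/null)

echo "silenced=[${silenced}]"
echo "only_err=[${only_err}]"
```

The first capture is empty and the second contains only the stderr line, which is why the script writes the redirections in the order > /dev/null 2>&1.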

Example

[root@mypc01 openresty]# /usr/bin/mysql -h mypc01 -uroot -p123456 -e 'select date(order_date) from sz.us_order group by date(order_date)'
+------------------+
| date(order_date) |
+------------------+
| 2019-07-14       |
| 2019-07-15       |
| 2019-07-16       |
+------------------+

How can the result of the query be assigned to a shell variable? Wrap the command in backticks (command substitution):

ALL_DATE=`/usr/bin/mysql -h mypc01 -uroot -p123456  -e 'select date(order_date) from sz.us_order group by date(order_date)'`
echo $ALL_DATE

The output is:

date(order_date) 2019-07-14 2019-07-15 2019-07-16

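To drop the first item, which is the column-name header, add -N (--skip-column-names) to the mysql command. The -s option only switches to tab-separated output without the table borders.

The $( ) form of command substitution works as well: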

ALL_DATE=$(/usr/bin/mysql -h mypc01 -uroot -p123456  -e 'select date(order_date) from sz.us_order group by date(order_date)')
echo $ALL_DATE
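
If the header line is still present in the captured output, it can also be stripped on the shell side. The following sketch runs without a database: fake_mysql is a hypothetical stand-in that prints the same lines as the query above, and tail -n +2 is used to skip the first line.

```shell
#!/bin/bash
# Hypothetical stand-in for the mysql call, so the sketch runs without a database.
fake_mysql() {
    printf '%s\n' 'date(order_date)' '2019-07-14' '2019-07-15' '2019-07-16'
}

# tail -n +2 starts printing at line 2, which drops the header line.
ALL_DATE=$(fake_mysql | tail -n +2)

# Unquoted expansion joins the captured lines with spaces.
echo $ALL_DATE
```

With the real client, replacing fake_mysql by the mysql command from the script should behave the same way, assuming the header is the first line of its output.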

Next is the shell loop syntax. The loop in the script follows the same pattern as this simple example:

for i in $(seq 1 10)
do
echo $i
done
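
Applied to the dates, the same loop pattern might look like the sketch below. The sample value of ALL_DATE and the date-format check are assumptions added for illustration; the check skips any token that is not of the form YYYY-MM-DD, such as a stray column header, before it could reach sqoop.

```shell
#!/bin/bash
# Sample value standing in for the mysql query result (assumed for this sketch).
ALL_DATE="date(order_date)
2019-07-14
2019-07-15
2019-07-16"

count=0
# Unquoted expansion splits the value on whitespace, one token per iteration.
for ONE_DAY in ${ALL_DATE}
do
    # Skip anything that does not look like YYYY-MM-DD.
    if [[ ! ${ONE_DAY} =~ ^[0-9]{4}-[0-9]{2}-[0-9]{2}$ ]]; then
        echo "skip: ${ONE_DAY}" >&2
        continue
    fi
    echo "would import partition dt=${ONE_DAY}"
    count=$((count + 1))
done
echo "partitions: ${count}"
```

In the real script the echo inside the loop would be replaced by the sqoop import command for that day.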
