[Spark][Python] Example: accessing MySQL from Spark and generating a DataFrame:

mydf001=sqlContext.read.format("jdbc").option("url","jdbc:mysql://localhost/loudacre")\
.option("dbtable","accounts").option("user","training").option("password","training").load()
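The same read can be written with the connection settings collected in one place. What follows is a minimal sketch rather than the original author's code: the helper names `jdbc_options` and `load_table` are hypothetical, and it assumes the MySQL JDBC driver is already on Spark's classpath and that `sqlContext` is the one the PySpark shell provides.

```python
def jdbc_options(host, database, table, user, password):
    """Build the option dict for Spark's JDBC data source (hypothetical helper)."""
    return {
        "url": "jdbc:mysql://{0}/{1}".format(host, database),
        "dbtable": table,
        "user": user,
        "password": password,
    }


def load_table(sql_context, options):
    """Return a DataFrame backed by the table described in `options`."""
    # DataFrameReader.options(**kwargs) sets several options in one call.
    return sql_context.read.format("jdbc").options(**options).load()


# Same values as in the example above.
opts = jdbc_options("localhost", "loudacre", "accounts", "training", "training")
print(opts["url"])  # jdbc:mysql://localhost/loudacre

# In the PySpark shell, where sqlContext already exists:
# mydf001 = load_table(sqlContext, opts)
```

Keeping the options in a dict makes it easy to reuse the same connection for other tables by changing only `dbtable`.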

In [10]: mydf001=sqlContext.read.format("jdbc").option("url","jdbc:mysql://localhost/loudacre")\
....: .option("dbtable","accounts").option("user","training").option("password","training").load()

17/10/03 05:59:53 INFO hive.HiveContext: default warehouse location is /user/hive/warehouse

17/10/03 05:59:53 INFO hive.HiveContext: Initializing metastore client version 1.1.0 using Spark classes.

17/10/03 05:59:53 INFO client.ClientWrapper: Inspected Hadoop version: 2.6.0-cdh5.7.0

17/10/03 05:59:53 INFO client.ClientWrapper: Loaded org.apache.hadoop.hive.shims.Hadoop23Shims for Hadoop version 2.6.0-cdh5.7.0

17/10/03 05:59:56 INFO hive.metastore: Trying to connect to metastore with URI thrift://localhost.localdomain:9083

17/10/03 05:59:56 INFO hive.metastore: Opened a connection to metastore, current connections: 1

17/10/03 05:59:56 INFO hive.metastore: Connected to metastore.

17/10/03 05:59:56 INFO session.SessionState: Created local directory: /tmp/c2d22d09-7425-4bb3-94c3-39cb32267c7d_resources

17/10/03 05:59:56 INFO session.SessionState: Created HDFS directory: /tmp/hive/training/c2d22d09-7425-4bb3-94c3-39cb32267c7d

17/10/03 05:59:56 INFO session.SessionState: Created local directory: /tmp/training/c2d22d09-7425-4bb3-94c3-39cb32267c7d

17/10/03 05:59:56 INFO session.SessionState: Created HDFS directory: /tmp/hive/training/c2d22d09-7425-4bb3-94c3-39cb32267c7d/_tmp_space.db

17/10/03 05:59:56 INFO session.SessionState: No Tez session required at this point. hive.execution.engine=mr.

In [11]:

In [11]: type(mydf001)

Out[11]: pyspark.sql.dataframe.DataFrame

In [12]: mydf001.count()

17/10/03 06:00:29 INFO spark.SparkContext: Starting job: count at NativeMethodAccessorImpl.java:-2

17/10/03 06:00:29 INFO scheduler.DAGScheduler: Registering RDD 2 (count at NativeMethodAccessorImpl.java:-2)

17/10/03 06:00:29 INFO scheduler.DAGScheduler: Got job 0 (count at NativeMethodAccessorImpl.java:-2) with 1 output partitions

17/10/03 06:00:29 INFO scheduler.DAGScheduler: Final stage: ResultStage 1 (count at NativeMethodAccessorImpl.java:-2)

17/10/03 06:00:29 INFO scheduler.DAGScheduler: Parents of final stage: List(ShuffleMapStage 0)

17/10/03 06:00:29 INFO scheduler.DAGScheduler: Missing parents: List(ShuffleMapStage 0)

17/10/03 06:00:29 INFO scheduler.DAGScheduler: Submitting ShuffleMapStage 0 (MapPartitionsRDD[2] at count at NativeMethodAccessorImpl.java:-2), which has no missing parents

17/10/03 06:00:30 INFO storage.MemoryStore: Block broadcast_0 stored as values in memory (estimated size 11.0 KB, free 11.0 KB)

17/10/03 06:00:31 INFO storage.MemoryStore: Block broadcast_0_piece0 stored as bytes in memory (estimated size 5.2 KB, free 16.1 KB)

17/10/03 06:00:31 INFO storage.BlockManagerInfo: Added broadcast_0_piece0 in memory on localhost:36793 (size: 5.2 KB, free: 208.8 MB)

17/10/03 06:00:31 INFO spark.SparkContext: Created broadcast 0 from broadcast at DAGScheduler.scala:1006

17/10/03 06:00:31 INFO scheduler.DAGScheduler: Submitting 1 missing tasks from ShuffleMapStage 0 (MapPartitionsRDD[2] at count at NativeMethodAccessorImpl.java:-2)

17/10/03 06:00:31 INFO scheduler.TaskSchedulerImpl: Adding task set 0.0 with 1 tasks

17/10/03 06:00:31 INFO scheduler.TaskSetManager: Starting task 0.0 in stage 0.0 (TID 0, localhost, partition 0,PROCESS_LOCAL, 1911 bytes)

17/10/03 06:00:31 INFO executor.Executor: Running task 0.0 in stage 0.0 (TID 0)

17/10/03 06:00:32 INFO codegen.GenerateMutableProjection: Code generated in 425.82589 ms

17/10/03 06:00:32 INFO codegen.GenerateUnsafeProjection: Code generated in 78.278589 ms

17/10/03 06:00:33 INFO codegen.GenerateMutableProjection: Code generated in 84.676206 ms

17/10/03 06:00:33 INFO codegen.GenerateUnsafeRowJoiner: Code generated in 60.144399 ms

17/10/03 06:00:33 INFO codegen.GenerateUnsafeProjection: Code generated in 95.977074 ms

17/10/03 06:00:34 INFO jdbc.JDBCRDD: closed connection

17/10/03 06:00:34 INFO executor.Executor: Finished task 0.0 in stage 0.0 (TID 0). 1334 bytes result sent to driver

17/10/03 06:00:34 INFO scheduler.TaskSetManager: Finished task 0.0 in stage 0.0 (TID 0) in 3081 ms on localhost (1/1)

17/10/03 06:00:34 INFO scheduler.TaskSchedulerImpl: Removed TaskSet 0.0, whose tasks have all completed, from pool

17/10/03 06:00:34 INFO scheduler.DAGScheduler: ShuffleMapStage 0 (count at NativeMethodAccessorImpl.java:-2) finished in 3.163 s

17/10/03 06:00:34 INFO scheduler.DAGScheduler: looking for newly runnable stages

17/10/03 06:00:34 INFO scheduler.DAGScheduler: running: Set()

17/10/03 06:00:34 INFO scheduler.DAGScheduler: waiting: Set(ResultStage 1)

17/10/03 06:00:34 INFO scheduler.DAGScheduler: failed: Set()

17/10/03 06:00:34 INFO scheduler.DAGScheduler: Submitting ResultStage 1 (MapPartitionsRDD[5] at count at NativeMethodAccessorImpl.java:-2), which has no missing parents

17/10/03 06:00:34 INFO storage.MemoryStore: Block broadcast_1 stored as values in memory (estimated size 12.1 KB, free 28.3 KB)

17/10/03 06:00:34 INFO storage.MemoryStore: Block broadcast_1_piece0 stored as bytes in memory (estimated size 5.6 KB, free 33.9 KB)

17/10/03 06:00:34 INFO storage.BlockManagerInfo: Added broadcast_1_piece0 in memory on localhost:36793 (size: 5.6 KB, free: 208.8 MB)

17/10/03 06:00:34 INFO spark.SparkContext: Created broadcast 1 from broadcast at DAGScheduler.scala:1006

17/10/03 06:00:34 INFO scheduler.DAGScheduler: Submitting 1 missing tasks from ResultStage 1 (MapPartitionsRDD[5] at count at NativeMethodAccessorImpl.java:-2)

17/10/03 06:00:34 INFO scheduler.TaskSchedulerImpl: Adding task set 1.0 with 1 tasks

17/10/03 06:00:34 INFO scheduler.TaskSetManager: Starting task 0.0 in stage 1.0 (TID 1, localhost, partition 0,NODE_LOCAL, 1999 bytes)

17/10/03 06:00:34 INFO executor.Executor: Running task 0.0 in stage 1.0 (TID 1)

17/10/03 06:00:34 INFO storage.ShuffleBlockFetcherIterator: Getting 1 non-empty blocks out of 1 blocks

17/10/03 06:00:34 INFO storage.ShuffleBlockFetcherIterator: Started 0 remote fetches in 32 ms

17/10/03 06:00:35 INFO codegen.GenerateMutableProjection: Code generated in 52.636353 ms

17/10/03 06:00:35 INFO codegen.GenerateMutableProjection: Code generated in 49.757505 ms

17/10/03 06:00:35 INFO executor.Executor: Finished task 0.0 in stage 1.0 (TID 1). 1666 bytes result sent to driver

17/10/03 06:00:35 INFO scheduler.DAGScheduler: ResultStage 1 (count at NativeMethodAccessorImpl.java:-2) finished in 0.795 s

17/10/03 06:00:35 INFO scheduler.TaskSetManager: Finished task 0.0 in stage 1.0 (TID 1) in 789 ms on localhost (1/1)

17/10/03 06:00:35 INFO scheduler.TaskSchedulerImpl: Removed TaskSet 1.0, whose tasks have all completed, from pool

17/10/03 06:00:35 INFO scheduler.DAGScheduler: Job 0 finished: count at NativeMethodAccessorImpl.java:-2, took 6.451521 s

Out[12]: 129761

In [13]:
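The log above shows the JDBC scan (ShuffleMapStage 0) running as a single task, so the whole table was read through one connection. Spark's JDBC source can split the read when the options `partitionColumn`, `lowerBound`, `upperBound` and `numPartitions` are supplied together. The sketch below only builds those options; the column name `acct_num`, the bounds and the partition count are assumptions for illustration and do not come from the original post.

```python
def partitioned_options(base, column, lower, upper, num_partitions):
    """Copy `base` and add Spark's JDBC partitioning options (hypothetical helper)."""
    opts = dict(base)  # leave the caller's dict untouched
    opts.update({
        "partitionColumn": column,            # must be a numeric column
        "lowerBound": str(lower),             # bounds set the partition stride only
        "upperBound": str(upper),
        "numPartitions": str(num_partitions),
    })
    return opts


base = {
    "url": "jdbc:mysql://localhost/loudacre",
    "dbtable": "accounts",
    "user": "training",
    "password": "training",
}

# "acct_num" and the numbers below are assumed values, not taken from the post.
part_opts = partitioned_options(base, "acct_num", 1, 130000, 4)
print(part_opts["numPartitions"])  # 4

# In the PySpark shell:
# mydf002 = sqlContext.read.format("jdbc").options(**part_opts).load()
```

The bounds do not filter rows; values outside the range still land in the first or last partition.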

Source: http://www.cnblogs.com/gaojian/p/7624493.html
