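Creating the Scala project

Create the project in IntelliJ IDEA.

Set the coordinates to <groupId>cn.ac.iie.spark</groupId> and <artifactId>sql</artifactId>.
The pom.xml, including the dependencies, is as follows: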

<project xmlns="http://maven.apache.org/POM/4.0.0"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/maven-v4_0_0.xsd">
  <modelVersion>4.0.0</modelVersion>
  <groupId>cn.ac.iie.spark</groupId>
  <artifactId>sql</artifactId>
  <version>1.0-SNAPSHOT</version>
  <inceptionYear>2008</inceptionYear>
  <properties>
    <scala.version>2.11.12</scala.version>
    <spark.version>2.4.5</spark.version>
  </properties>
  <dependencies>
    <!-- Scala -->
    <dependency>
      <groupId>org.scala-lang</groupId>
      <artifactId>scala-library</artifactId>
      <version>${scala.version}</version>
    </dependency>
    <!-- Spark -->
    <dependency>
      <groupId>org.apache.spark</groupId>
      <artifactId>spark-sql_2.11</artifactId>
      <version>${spark.version}</version>
    </dependency>
  </dependencies>
  <build>
    <sourceDirectory>src/main/scala</sourceDirectory>
    <testSourceDirectory>src/test/scala</testSourceDirectory>
    <plugins>
      <plugin>
        <groupId>org.scala-tools</groupId>
        <artifactId>maven-scala-plugin</artifactId>
        <version>2.15.0</version>
        <executions>
          <execution>
            <goals>
              <goal>compile</goal>
              <goal>testCompile</goal>
            </goals>
            <configuration>
              <args>
                <arg>-dependencyfile</arg>
                <arg>${project.build.directory}/.scala_dependencies</arg>
              </args>
            </configuration>
          </execution>
        </executions>
      </plugin>
      <plugin>
        <groupId>org.apache.maven.plugins</groupId>
        <artifactId>maven-surefire-plugin</artifactId>
        <version>2.6</version>
        <configuration>
          <useFile>false</useFile>
          <disableXmlReport>true</disableXmlReport>
          <!-- If you have classpath issue like NoDefClassError,... -->
          <!-- useManifestOnlyJar>false</useManifestOnlyJar -->
          <includes>
            <include>**/*Test.*</include>
            <include>**/*Suite.*</include>
          </includes>
        </configuration>
      </plugin>
      <plugin>
        <artifactId>maven-assembly-plugin</artifactId>
        <configuration>
          <archive>
            <manifest>
              <mainClass></mainClass>
            </manifest>
          </archive>
          <descriptorRefs>
            <descriptorRef>jar-with-dependencies</descriptorRef>
          </descriptorRefs>
        </configuration>
      </plugin>
    </plugins>
  </build>
</project>

Creating the object

Create an object with the following content:

package cn.ac.iie.spark

import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.SQLContext

/**
 * Using SQLContext.
 * Note: IDEA runs locally while the test data lives on the server.
 * Can we still develop and test locally? Yes.
 */
object SQLContextApp {
  def main(args: Array[String]): Unit = {
    val path = args(0)

    // 1. Create the contexts
    val sparkConf = new SparkConf()
    // In production or test environments, the app name and master are set by the submit script
    sparkConf.setAppName("SQLContextApp")
    sparkConf.setMaster("local[2]")
    val sc = new SparkContext(sparkConf)
    val sqlContext = new SQLContext(sc)

    // 2. Processing: load the JSON file
    val people = sqlContext.read.format("json").load(path)
    people.printSchema()
    people.show()

    // 3. Release resources
    sc.stop()
  }
}
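
The pom above pulls in Spark 2.4.5, where SparkSession is the unified entry point and the SQLContext constructor has been deprecated since 2.0 in favor of SparkSession.builder. A minimal sketch of the same job written against SparkSession follows. The object name SparkSessionApp and the helper loadPeople are assumptions for illustration and are not part of the original project.

```scala
package cn.ac.iie.spark

import org.apache.spark.sql.{DataFrame, SparkSession}

// Hypothetical counterpart to SQLContextApp, written against the Spark 2.x API.
object SparkSessionApp {

  // Reads a JSON-lines file (one JSON object per line) into a DataFrame.
  def loadPeople(spark: SparkSession, path: String): DataFrame =
    spark.read.format("json").load(path)

  def main(args: Array[String]): Unit = {
    val path = args(0)

    // The builder creates the underlying SparkContext as well.
    val spark = SparkSession.builder()
      .appName("SparkSessionApp")
      .master("local[2]")
      .getOrCreate()

    val people = loadPeople(spark, path)
    people.printSchema()
    people.show()

    // Stops the session and its SparkContext.
    spark.stop()
  }
}
```

The reading code is unchanged from the SQLContext version; only the way the entry point is constructed differs.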

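Because the program reads the input path from its first argument, set that path in the "Program arguments" field under Edit Configurations.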
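
Since main reads args(0) directly, running without a program argument fails with an ArrayIndexOutOfBoundsException. A small guard, sketched below, turns that into a readable usage message. The object name Args, the helper name inputPath, and the message text are assumptions, not part of the original code.

```scala
// Hypothetical helper: validates the program arguments before Spark starts.
object Args {

  // Returns the input path, or a usage message when it is missing or blank.
  def inputPath(args: Array[String]): Either[String, String] =
    args.headOption
      .map(_.trim)
      .filter(_.nonEmpty)
      .toRight("Usage: SQLContextApp <path-to-json>")

  def main(args: Array[String]): Unit =
    inputPath(args) match {
      case Right(path) => println(s"Input path: $path")
      case Left(usage) =>
        System.err.println(usage)
        sys.exit(1)
    }
}
```

In SQLContextApp, the same check could replace the bare val path = args(0) line.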

The input data is as follows:

{"name":"Michael", "salary":3000}
{"name":"Andy", "salary":4500}
{"name":"Justin", "salary":3500}
{"name":"Berta", "salary":4000}
{"name":"vincent", "salary":90000}

The console output is as follows:

root
 |-- name: string (nullable = true)
 |-- salary: long (nullable = true)

+-------+------+
|   name|salary|
+-------+------+
|Michael|  3000|
|   Andy|  4500|
| Justin|  3500|
|  Berta|  4000|
|vincent| 90000|
+-------+------+

The schema and the data are both printed correctly.
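
Once the DataFrame is loaded, the same SQLContext can also run SQL against it after the DataFrame is registered as a temporary view. The sketch below filters on salary. The object name SQLQueryApp, the helper highEarners, the view name people, and the threshold 4000 are assumptions chosen for illustration. Note that createOrReplaceTempView is the Spark 2.x name; in Spark 1.x the equivalent call was registerTempTable.

```scala
package cn.ac.iie.spark

import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.{DataFrame, SQLContext}

// Hypothetical follow-up to SQLContextApp: query the loaded JSON with SQL.
object SQLQueryApp {

  // Registers the DataFrame as a temporary view named "people" and selects
  // the rows whose salary is strictly greater than minSalary.
  def highEarners(sqlContext: SQLContext, people: DataFrame, minSalary: Long): DataFrame = {
    people.createOrReplaceTempView("people")
    sqlContext.sql(s"SELECT name, salary FROM people WHERE salary > $minSalary")
  }

  def main(args: Array[String]): Unit = {
    val path = args(0)
    val sparkConf = new SparkConf().setAppName("SQLQueryApp").setMaster("local[2]")
    val sc = new SparkContext(sparkConf)
    val sqlContext = new SQLContext(sc)

    val people = sqlContext.read.format("json").load(path)
    highEarners(sqlContext, people, 4000L).show()

    sc.stop()
  }
}
```

Against the sample data above, only Andy and vincent have a salary above 4000.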
From the project directory, package the application with Maven: mvn clean package -DskipTests

Submitting to the environment

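In production or test environments, the app name and master are usually specified by the submit script. Values set directly on SparkConf take precedence over spark-submit flags, so comment out sparkConf.setAppName("SQLContextApp") and sparkConf.setMaster("local[2]") before packaging.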
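
An alternative to commenting the two calls out is to apply them only as defaults, so that the same jar runs both from the IDE and under spark-submit. This sketch relies on SparkConf.setIfMissing and on spark-submit passing --name and --master through as the spark.app.name and spark.master properties. The object name ConfDefaults and the helper withLocalDefaults are assumptions.

```scala
package cn.ac.iie.spark

import org.apache.spark.SparkConf

// Hypothetical helper: local defaults that spark-submit settings override.
object ConfDefaults {

  // Sets the app name and master only when they are not already present,
  // e.g. when the job is started from the IDE rather than by spark-submit.
  def withLocalDefaults(conf: SparkConf): SparkConf =
    conf
      .setIfMissing("spark.app.name", "SQLContextApp")
      .setIfMissing("spark.master", "local[2]")

  def main(args: Array[String]): Unit = {
    // new SparkConf() picks up any spark.* properties supplied by spark-submit.
    val conf = withLocalDefaults(new SparkConf())
    println(conf.get("spark.app.name") + " @ " + conf.get("spark.master"))
  }
}
```

With this approach, SQLContextApp would build its SparkContext from withLocalDefaults(new SparkConf()) and need no edits between local and server runs.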
Submit the job on the server:

./spark-submit --name SQLContextApp --class cn.ac.iie.spark.SQLContextApp \
  --master local[2] \
  /home/iie4bu/lib/sql-1.0-SNAPSHOT.jar \
  /home/iie4bu/app/spark-2.4.5-bin-2.6.0-cdh5.15.1/examples/src/main/resources/people.json

