1.  Why use Livy
  • Have long running SparkContexts that can be used for multiple Spark jobs, by multiple clients
  • Share cached RDDs or Dataframes across multiple jobs and clients
  • Multiple SparkContexts can be managed simultaneously, and they run on the cluster (YARN/Mesos) instead of the Livy Server for good fault tolerance and concurrency
  • Jobs can be submitted as precompiled jars, snippets of code, or via Java/Scala client API
  • Ensure security via secure authenticated communication
  • Apache License, 100% open source
Then we upload the Spark example jar /usr/lib/spark/lib/spark-examples.jar to HDFS and point to it. If you are running Livy in local mode rather than YARN mode, just keep the local path /usr/lib/spark/lib/spark-examples.jar.
It is strongly recommended to configure Spark to submit applications in YARN cluster mode. That makes sure that user sessions have their resources properly accounted for in the YARN cluster, and that the host running the Livy server doesn't become overloaded when multiple user sessions are running. (In short: when there are multiple sessions, deploying in YARN mode is recommended to reduce the load on the Livy server.)
3. REST API
curl -X POST --data '{"file": "/opt/jars/testLivy.jar", "className": "com.testLivy.TestLivyJob"}' -H "Content-Type: application/json" localhost:8998/batches
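The same submission can be scripted. Below is a minimal sketch using only the Python standard library, assuming the Livy server is reachable at localhost:8998 as in the curl command. The helper names build_batch_request and submit_batch are hypothetical and not part of Livy.

```python
import json
import urllib.request

LIVY_URL = "http://localhost:8998"  # assumption: same host and port as the curl example


def build_batch_request(file, class_name, base_url=LIVY_URL):
    """Build (without sending) the POST /batches request for a precompiled jar."""
    body = json.dumps({"file": file, "className": class_name}).encode("utf-8")
    return urllib.request.Request(
        base_url + "/batches",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def submit_batch(file, class_name, base_url=LIVY_URL):
    """Send the request and return Livy's JSON reply as a dict (needs a running server)."""
    request = build_batch_request(file, class_name, base_url)
    with urllib.request.urlopen(request) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

Keeping request construction separate from sending makes it possible to inspect the URL, headers, and body without a live server.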
2. Check the status (possible states include not_started, starting, idle, running, busy, shutting_down, error, dead, and success)
localhost:8998/batches/3  Result: "id": 3,  "state": "dead"
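Checking the status by hand can be replaced with a small polling loop. This is a sketch under stated assumptions: the server is at localhost:8998, and success, dead, and error are treated as final states while every other state means the job is still in progress. The names fetch_batch_state and wait_for_batch are hypothetical helpers.

```python
import json
import time
import urllib.request

# Assumption: these states are final; any other state means "still in progress".
TERMINAL_STATES = {"success", "dead", "error"}


def fetch_batch_state(batch_id, base_url="http://localhost:8998"):
    """GET /batches/<id> and return its "state" field (needs a running server)."""
    with urllib.request.urlopen(f"{base_url}/batches/{batch_id}") as resp:
        return json.loads(resp.read().decode("utf-8"))["state"]


def wait_for_batch(batch_id, fetch=fetch_batch_state, interval=2.0, max_polls=300):
    """Poll until a terminal state is seen or max_polls is used up; return the last state."""
    state = None
    for _ in range(max_polls):
        state = fetch(batch_id)
        if state in TERMINAL_STATES:
            break
        time.sleep(interval)
    return state
```

The fetch parameter is injectable so the loop can be exercised with a stub instead of a real server.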
4. Changing Livy parameters
(1) The default port is 8998; it can be changed with the livy.server.port config option in livy.conf.
(2) livy.yarn.jar : this config has been replaced by separate configs listing specific archives for different Livy features. Refer to the default  livy.conf  file shipped with Livy for instructions.
livy.repl.enableHiveContext = true
livy.impersonation.enabled = true
livy.server.session.timeout = 1h
"args":["2016-10-10 22:00:00"],
"jars":["/opt/jars/jar/ficus_2.10-1.0.1.jar","/opt/jars/jar/mysql-connector-java-5.1.39.jar"], // issue: Livy dependency jars located on HDFS
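The "args" and "jars" fragments can be assembled into one request body. The sketch below assumes they belong to the same batch request as the "file" and "className" values from the earlier curl command, which the notes do not state explicitly. The helper batch_payload is hypothetical. Standard JSON has no comment syntax, so the // note must not be sent as part of the body.

```python
import json


def batch_payload(file, class_name, args=(), jars=()):
    """Assemble a POST /batches body; empty optional lists are left out."""
    payload = {"file": file, "className": class_name}
    if args:
        payload["args"] = list(args)
    if jars:
        payload["jars"] = list(jars)
    return payload


# Assumption: file and className are reused from the earlier curl example.
body = json.dumps(
    batch_payload(
        "/opt/jars/testLivy.jar",
        "com.testLivy.TestLivyJob",
        args=["2016-10-10 22:00:00"],
        jars=[
            "/opt/jars/jar/ficus_2.10-1.0.1.jar",
            "/opt/jars/jar/mysql-connector-java-5.1.39.jar",
        ],
    )
)
```

The resulting body string can be passed to curl via --data or sent from any HTTP client.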
Keyword parameters accepted by Livy

(16 known properties: "executorCores", "className", "conf", "driverMemory", "name", "driverCores", "pyFiles", "archives", "queue", "executorMemory", "files", "jars", "proxyUser", "numExecutors", "file" [truncated]]
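A message of this shape usually accompanies a request that contained an unrecognized field, so it can help to check the keys before sending. The sketch below is an assumption-laden illustration: the set holds only the names visible in the message plus "args", which the request fragment above uses. Because the message is truncated, the set may be incomplete. The helper unknown_keys is hypothetical.

```python
# Names visible in the (truncated) message, plus "args" from the request fragment.
# Assumption: this set may be incomplete because the message is cut off.
VISIBLE_PROPERTIES = {
    "executorCores", "className", "conf", "driverMemory", "name",
    "driverCores", "pyFiles", "archives", "queue", "executorMemory",
    "files", "jars", "proxyUser", "numExecutors", "file", "args",
}


def unknown_keys(payload):
    """Return the payload keys that are not in VISIBLE_PROPERTIES, sorted."""
    return sorted(set(payload) - VISIBLE_PROPERTIES)
```

Property names are case-sensitive here, so a key such as "classname" would be reported while "className" would not.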


