JanusGraph Bulk Data Import

  • Notes
  • 1. Importing JSON into a local TinkerGraph
    • 1.1 Configuration
    • 1.2 Sample JSON
    • 1.3 Code
    • 1.4 File check
  • 2. Importing CSV into a local TinkerGraph
    • 2.1 Configuration
    • 2.2 Sample CSV
    • 2.3 Code
    • 2.4 File check
  • 3. Importing JSON into distributed storage (berkeleyje-es)
    • 3.1 Configuration
    • 3.2 Sample JSON
    • 3.3 Code
    • 3.4 Verification

Notes

The code in this article was demonstrated on JanusGraph 0.3.1, and the data files are the samples bundled with the JanusGraph distribution. All snippets below are run from the Gremlin Console shipped with JanusGraph (bin/gremlin.sh).

1. Importing JSON into a local TinkerGraph

1.1 Configuration

conf/hadoop-graph/hadoop-load-json.properties is configured as follows:

#
# Hadoop Graph Configuration
#
gremlin.graph=org.apache.tinkerpop.gremlin.hadoop.structure.HadoopGraph
gremlin.hadoop.graphReader=org.apache.tinkerpop.gremlin.hadoop.structure.io.graphson.GraphSONInputFormat
gremlin.hadoop.graphWriter=org.apache.hadoop.mapreduce.lib.output.NullOutputFormat
gremlin.hadoop.inputLocation=./data/grateful-dead.json
gremlin.hadoop.outputLocation=output
gremlin.hadoop.jarsInDistributedCache=true

#
# SparkGraphComputer Configuration
#
spark.master=local[*]
spark.executor.memory=1g
spark.serializer=org.apache.spark.serializer.KryoSerializer
spark.kryo.registrator=org.apache.tinkerpop.gremlin.spark.structure.io.gryo.GryoRegistrator

1.2 Sample JSON

{"id":1,"label":"song","inE":{"followedBy":[{"id":3059,"outV":153,"properties":{"weight":1}},{"id":276,"outV":5,"properties":{"weight":2}},{"id":3704,"outV":3,"properties":{"weight":2}},{"id":4383,"outV":62,"pr
operties":{"weight":1}}]},"outE":{"followedBy":[{"id":0,"inV":2,"properties":{"weight":1}},{"id":1,"inV":3,"properties":{"weight":2}},{"id":2,"inV":4,"properties":{"weight":1}},{"id":3,"inV":5,"properties":{"we
ight":1}},{"id":4,"inV":6,"properties":{"weight":1}}],"sungBy":[{"id":7612,"inV":340}],"writtenBy":[{"id":7611,"inV":527}]},"properties":{"name":[{"id":0,"value":"HEY BO DIDDLEY"}],"songType":[{"id":2,"value":"
cover"}],"performances":[{"id":1,"value":5}]}}
{"id":2,"label":"song","inE":{"followedBy":[{"id":0,"outV":1,"properties":{"weight":1}},{"id":323,"outV":34,"properties":{"weight":1}}]},"outE":{"followedBy":[{"id":6190,"inV":123,"properties":{"weight":1}},{"i
d":6191,"inV":50,"properties":{"weight":1}}],"sungBy":[{"id":7666,"inV":525}],"writtenBy":[{"id":7665,"inV":525}]},"properties":{"name":[{"id":3,"value":"IM A MAN"}],"songType":[{"id":5,"value":"cover"}],"perfo
rmances":[{"id":4,"value":1}]}}
s

1.3 Code

readGraph = GraphFactory.open('conf/hadoop-graph/hadoop-load-json.properties')
writeGraphConf = new BaseConfiguration()
writeGraphConf.setProperty("gremlin.graph", "org.apache.tinkerpop.gremlin.tinkergraph.structure.TinkerGraph")
writeGraphConf.setProperty("gremlin.tinkergraph.graphFormat", "gryo")
writeGraphConf.setProperty("gremlin.tinkergraph.graphLocation", "/tmp/csv-graph.kryo")
blvp = BulkLoaderVertexProgram.build().bulkLoader(OneTimeBulkLoader).writeGraph(writeGraphConf).create(readGraph)
readGraph.compute(SparkGraphComputer).workers(1).program(blvp).submit().get()
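After the job completes, a quick sanity check (a minimal sketch, reusing the writeGraphConf already defined above) is to reopen the gryo file and count what was loaded:

// Reopen the TinkerGraph persisted at /tmp/csv-graph.kryo and count elements
g = GraphFactory.open(writeGraphConf).traversal()
g.V().count()   // the bundled grateful-dead dataset contains 808 vertices
g.E().count()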

1.4 File check

The newly generated file:

[root@vm03 data]# ls -l /tmp/csv-graph.kryo
-rw-r--r--. 1 root root 726353 May 29 04:09 /tmp/csv-graph.kryo

2. Importing CSV into a local TinkerGraph

2.1 Configuration

conf/hadoop-graph/hadoop-load-csv.properties is configured as follows:

#
# Hadoop Graph Configuration
#
gremlin.graph=org.apache.tinkerpop.gremlin.hadoop.structure.HadoopGraph
gremlin.hadoop.graphReader=org.apache.tinkerpop.gremlin.hadoop.structure.io.script.ScriptInputFormat
gremlin.hadoop.graphWriter=org.apache.tinkerpop.gremlin.hadoop.structure.io.graphson.GraphSONOutputFormat
gremlin.hadoop.inputLocation=./data/grateful-dead.txt
gremlin.hadoop.outputLocation=output
gremlin.hadoop.jarsInDistributedCache=true
gremlin.hadoop.scriptInputFormat.script=./data/script-input-grateful-dead.groovy

#
# SparkGraphComputer Configuration
#
spark.master=local[*]
spark.executor.memory=1g
spark.serializer=org.apache.spark.serializer.KryoSerializer
spark.kryo.registrator=org.apache.tinkerpop.gremlin.spark.structure.io.gryo.GryoRegistrator

2.2 Sample CSV

Each line describes one vertex and its edges in three tab-separated fields: the vertex (id,label,properties), its out-edges, and its in-edges. Edges are separated by | and written as label,otherVertexId[,weight].

1,song,HEY BO DIDDLEY,cover,5   followedBy,2,1|followedBy,3,2|followedBy,4,1|followedBy,5,1|followedBy,6,1|sungBy,340|writtenBy,527     followedBy,3,2|followedBy,5,2|followedBy,62,1|followedBy,153,1
2,song,IM A MAN,cover,1 followedBy,50,1|followedBy,123,1|sungBy,525|writtenBy,525       followedBy,1,1|followedBy,34,1
3,song,NOT FADE AWAY,cover,531  followedBy,81,1|followedBy,86,5|followedBy,127,10|followedBy,59,1|followedBy,83,3|followedBy,103,2|followedBy,68,1|followedBy,134,2|followedBy,131,1|followedBy,151,1|followedBy,3

2.3 Code

The parser script data/script-input-grateful-dead.groovy is shown below. ScriptInputFormat calls parse(line) once per input line; the script builds that line's vertex along with its incident edges and returns the vertex:

def parse(line) {
    def (vertex, outEdges, inEdges) = line.split(/\t/, 3)
    def (v1id, v1label, v1props) = vertex.split(/,/, 3)
    def v1 = graph.addVertex(T.id, v1id.toInteger(), T.label, v1label)
    switch (v1label) {
        case "song":
            def (name, songType, performances) = v1props.split(/,/)
            v1.property("name", name)
            v1.property("songType", songType)
            v1.property("performances", performances.toInteger())
            break
        case "artist":
            v1.property("name", v1props)
            break
        default:
            throw new Exception("Unexpected vertex label: ${v1label}")
    }
    [[outEdges, true], [inEdges, false]].each { def edges, def out ->
        edges.split(/\|/).grep().each { def edge ->
            def parts = edge.split(/,/)
            def otherV, eLabel, weight = null
            if (parts.size() == 2) {
                (eLabel, otherV) = parts
            } else {
                (eLabel, otherV, weight) = parts
            }
            def v2 = graph.addVertex(T.id, otherV.toInteger())
            def e = out ? v1.addOutEdge(eLabel, v2) : v1.addInEdge(eLabel, v2)
            if (weight != null) e.property("weight", weight.toInteger())
        }
    }
    return v1
}
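To try the parser interactively before submitting the Spark job, one option is to run it against a StarGraph, the graph type that ScriptInputFormat binds to the graph variable (a sketch under that assumption; the bindings below are ours, not part of the original script):

// Bind a fresh StarGraph as 'graph', then parse one sample line
graph = org.apache.tinkerpop.gremlin.structure.util.star.StarGraph.open()
line = "2,song,IM A MAN,cover,1\tfollowedBy,50,1|followedBy,123,1|sungBy,525|writtenBy,525\tfollowedBy,1,1|followedBy,34,1"
v = parse(line)       // returns the star vertex for id 2
v.values("name")      // ==> IM A MAN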

The JanusGraph loading code:

readGraph = GraphFactory.open('conf/hadoop-graph/hadoop-load-csv.properties')
writeGraphConf = new BaseConfiguration()
writeGraphConf.setProperty("gremlin.graph", "org.apache.tinkerpop.gremlin.tinkergraph.structure.TinkerGraph")
writeGraphConf.setProperty("gremlin.tinkergraph.graphFormat", "gryo")
writeGraphConf.setProperty("gremlin.tinkergraph.graphLocation", "/tmp/csv-graph2.kryo")
blvp = BulkLoaderVertexProgram.build().bulkLoader(OneTimeBulkLoader).writeGraph(writeGraphConf).create(readGraph)
readGraph.compute(SparkGraphComputer).workers(1).program(blvp).submit().get()
g = GraphFactory.open(writeGraphConf).traversal()
g.V().valueMap(true)

2.4 File check

The newly generated file:

[root@vm03 data]# ls -l /tmp/csv-graph2.kryo
-rw-r--r--. 1 root root 339939 May 29 04:56 /tmp/csv-graph2.kryo

3. Importing JSON into distributed storage (berkeleyje-es)

3.1 Configuration

conf/hadoop-graph/hadoop-load-json-ber-es.properties is configured as follows:

#
# Hadoop Graph Configuration
#
gremlin.graph=org.apache.tinkerpop.gremlin.hadoop.structure.HadoopGraph
gremlin.hadoop.graphReader=org.apache.tinkerpop.gremlin.hadoop.structure.io.graphson.GraphSONInputFormat
gremlin.hadoop.graphWriter=org.apache.hadoop.mapreduce.lib.output.NullOutputFormat
gremlin.hadoop.inputLocation=./data/grateful-dead.json
gremlin.hadoop.outputLocation=output
gremlin.hadoop.jarsInDistributedCache=true

#
# SparkGraphComputer Configuration
#
spark.master=local[*]
spark.executor.memory=1g
spark.serializer=org.apache.spark.serializer.KryoSerializer
spark.kryo.registrator=org.apache.tinkerpop.gremlin.spark.structure.io.gryo.GryoRegistrator

./conf/janusgraph-berkeleyje-es-bulkload.properties is configured as follows:

gremlin.graph=org.janusgraph.core.JanusGraphFactory
storage.backend=berkeleyje
storage.directory=../db/berkeley
index.search.backend=elasticsearch
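One hedged addition: if Elasticsearch is not running on the same machine, JanusGraph also needs to be told where to reach it. The option below is standard JanusGraph configuration but is not part of the original file:

# Assumption: only needed when Elasticsearch is not on localhost
index.search.hostname=127.0.0.1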

3.2 Sample JSON

{"id":1,"label":"song","inE":{"followedBy":[{"id":3059,"outV":153,"properties":{"weight":1}},{"id":276,"outV":5,"properties":{"weight":2}},{"id":3704,"outV":3,"properties":{"weight":2}},{"id":4383,"outV":62,"pr
operties":{"weight":1}}]},"outE":{"followedBy":[{"id":0,"inV":2,"properties":{"weight":1}},{"id":1,"inV":3,"properties":{"weight":2}},{"id":2,"inV":4,"properties":{"weight":1}},{"id":3,"inV":5,"properties":{"we
ight":1}},{"id":4,"inV":6,"properties":{"weight":1}}],"sungBy":[{"id":7612,"inV":340}],"writtenBy":[{"id":7611,"inV":527}]},"properties":{"name":[{"id":0,"value":"HEY BO DIDDLEY"}],"songType":[{"id":2,"value":"
cover"}],"performances":[{"id":1,"value":5}]}}
{"id":2,"label":"song","inE":{"followedBy":[{"id":0,"outV":1,"properties":{"weight":1}},{"id":323,"outV":34,"properties":{"weight":1}}]},"outE":{"followedBy":[{"id":6190,"inV":123,"properties":{"weight":1}},{"i
d":6191,"inV":50,"properties":{"weight":1}}],"sungBy":[{"id":7666,"inV":525}],"writtenBy":[{"id":7665,"inV":525}]},"properties":{"name":[{"id":3,"value":"IM A MAN"}],"songType":[{"id":5,"value":"cover"}],"perfo
rmances":[{"id":4,"value":1}]}}
s

3.3 Code

Unlike the TinkerGraph examples above, writeGraph() is passed the path of the JanusGraph properties file, and no explicit bulk loader is set, so the default incremental loader is used:

outputGraphConfig = './conf/janusgraph-berkeleyje-es-bulkload.properties'
readGraph = GraphFactory.open('conf/hadoop-graph/hadoop-load-json-ber-es.properties')
blvp = BulkLoaderVertexProgram.build().writeGraph(outputGraphConfig).create(readGraph)
readGraph.compute(SparkGraphComputer).workers(1).program(blvp).submit().get()
g = GraphFactory.open(outputGraphConfig).traversal()
g.V().valueMap(true)

3.4 Verification

Verify by serving the graph through Gremlin Server:

  1. Create a Gremlin Server config file (gremlin-server-berkeleyje-bulkload.yaml), similar to gremlin-server-berkeleyje.yaml, with the graph location adjusted as follows:
     graph: conf/janusgraph-berkeleyje-es-bulkload.properties
  2. Start the server: ./gremlin-server.sh conf/gremlin-server/gremlin-server-berkeleyje-bulkload.yaml
  3. Query through graphexp, or from the Gremlin Console as shown below.
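A minimal console check, assuming Gremlin Server listens on its default port (8182) and conf/remote.yaml points at it (both assumptions, not part of the original article):

// Connect the console to the running Gremlin Server and query the bulk-loaded graph
:remote connect tinkerpop.server conf/remote.yaml
:remote console
g.V().count()                                        // vertices written by the bulk load
g.V().has('name', 'HEY BO DIDDLEY').valueMap(true)   // spot-check one imported song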
