pyspark AttributeError: 'NoneType' object has no attribute 'setCallSite'
PySpark raises:
AttributeError: 'NoneType' object has no attribute 'setCallSite'
Damn, this is a PySpark bug. Workaround:
print("Approximately joining on distance smaller than 0.6:")
distance_min = model.approxSimilarityJoin(imsi_proc_df, imsi_proc_df, 1e6, distCol="JaccardDistance") \
    .select(col("datasetA.id").alias("idA"),
            col("datasetB.id").alias("idB"),
            col("JaccardDistance"))  # .filter("idA=idB")
print(distance_min.show())
print("*" * 88)
print(imsi_proc_df.show())
key = Vectors.sparse(53, [1, 3], [1.0, 1.0])
print(model.approxNearestNeighbors(imsi_proc_df, key, 2).show())
print("start calculate find botnet!")
print("*" * 99)
print("time start:", time.time())
print(type(distance_min), dir(distance_min))
print(dir(distance_min.toLocalIterator))
############################################## add these lines to solve
distance_min.sql_ctx.sparkSession._jsparkSession = spark_app._jsparkSession
distance_min._sc = spark_app._sc
#############################################
similarity_val_rdd = distance_min.toLocalIterator  # .collect()
print("time end:", time.time())
print(similarity_val_rdd)
print("*" * 99)
try:
    G = ConnectedGraph()
    ddos_ue_list = []
    for item in similarity_val_rdd():
        imsi, imsi2, jacard_similarity_val = item["idA"], item["idB"], item["JaccardDistance"]
        print("???", imsi, imsi2, jacard_similarity_val)
Description
Reproducing the bug with the example from the documentation:
import pyspark
from pyspark.ml.linalg import Vectors
from pyspark.ml.stat import Correlation
spark = pyspark.sql.SparkSession.builder.getOrCreate()
dataset = [[Vectors.dense([1, 0, 0, -2])],[Vectors.dense([4, 5, 0, 3])],[Vectors.dense([6, 7, 0, 8])],[Vectors.dense([9, 0, 0, 1])]]
dataset = spark.createDataFrame(dataset, ['features'])
df = Correlation.corr(dataset, 'features', 'pearson')
df.collect()
This produces the following stack trace:
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-92-e7889fa5d198> in <module>()
     11 dataset = spark.createDataFrame(dataset, ['features'])
     12 df = Correlation.corr(dataset, 'features', 'pearson')
---> 13 df.collect()

/opt/spark/python/pyspark/sql/dataframe.py in collect(self)
    530         [Row(age=2, name=u'Alice'), Row(age=5, name=u'Bob')]
    531         """
--> 532         with SCCallSiteSync(self._sc) as css:
    533             sock_info = self._jdf.collectToPython()
    534         return list(_load_from_socket(sock_info, BatchedSerializer(PickleSerializer())))

/opt/spark/python/pyspark/traceback_utils.py in __enter__(self)
     70     def __enter__(self):
     71         if SCCallSiteSync._spark_stack_depth == 0:
---> 72             self._context._jsc.setCallSite(self._call_site)
     73         SCCallSiteSync._spark_stack_depth += 1
     74

AttributeError: 'NoneType' object has no attribute 'setCallSite'
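The last frame shows the mechanism: on entry, the context manager dereferences `self._context._jsc`, and that attribute is `None`. A minimal sketch of the same failure, using stand-in classes only (`FakeContext`, `CallSiteSync`, and `try_collect` are hypothetical names that mimic the shape of the frame above; this is not PySpark's implementation):

```python
class FakeContext:
    """Hypothetical stand-in for a SparkContext whose JVM handle is missing."""

    def __init__(self, jsc=None):
        self._jsc = jsc


class CallSiteSync:
    """Simplified stand-in for the context manager seen in the traceback."""

    def __init__(self, context, call_site="collect at <stdin>:1"):
        self._context = context
        self._call_site = call_site

    def __enter__(self):
        # Same attribute chain as the failing line in the traceback:
        # if _jsc is None, this attribute lookup raises AttributeError.
        self._context._jsc.setCallSite(self._call_site)
        return self

    def __exit__(self, exc_type, exc, tb):
        return False


def try_collect(context):
    """Return the error message raised on entry, or None if entry succeeds."""
    try:
        with CallSiteSync(context):
            return None
    except AttributeError as err:
        return str(err)


message = try_collect(FakeContext(jsc=None))
print(message)
```

So the error says nothing about the data itself; it only means the context object attached to the DataFrame has no JVM handle.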
Analysis:
Somehow the DataFrame property `df.sql_ctx.sparkSession._jsparkSession` does not match `spark._jsparkSession`, the one held by the active Spark session. Judging by the stack trace, the same appears to be true of `df._sc`, whose `_jsc` is `None`.
The following code fixes the problem (I hope this helps you narrow down the root cause):
df.sql_ctx.sparkSession._jsparkSession = spark._jsparkSession
df._sc = spark._sc
df.collect()
>>> [Row(pearson(features)=DenseMatrix(4, 4, [1.0, 0.0556, nan, 0.4005, 0.0556, 1.0, nan, 0.9136, nan, nan, 1.0, nan, 0.4005, 0.9136, nan, 1.0], False))]
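The two assignments can be wrapped in a small helper so the workaround lives in one place. This is only a sketch: `rebind_to_session` and `has_live_context` are hypothetical names, and the attributes they touch (`sql_ctx.sparkSession._jsparkSession`, `_sc`, `_jsc`) are private PySpark internals taken from the snippet and traceback above, so they may not exist in other versions. The stand-in objects at the bottom are there only so the sketch runs without a cluster.

```python
from types import SimpleNamespace


def has_live_context(df):
    """True if df._sc exists and still holds a JVM handle in _jsc."""
    context = getattr(df, "_sc", None)
    return getattr(context, "_jsc", None) is not None


def rebind_to_session(df, spark):
    """Point df's private session/context references at the active session.

    Mirrors the two assignments from the workaround; it relies on private
    attributes, which is an assumption about PySpark internals.
    """
    df.sql_ctx.sparkSession._jsparkSession = spark._jsparkSession
    df._sc = spark._sc
    return df


# Stand-ins that only mimic the attribute layout (not real Spark objects).
live_spark = SimpleNamespace(
    _jsparkSession=object(),
    _sc=SimpleNamespace(_jsc=object()),
)
stale_df = SimpleNamespace(
    sql_ctx=SimpleNamespace(sparkSession=SimpleNamespace(_jsparkSession=None)),
    _sc=SimpleNamespace(_jsc=None),
)

before = has_live_context(stale_df)
rebind_to_session(stale_df, live_spark)
after = has_live_context(stale_df)
print(before, after)
```

With a real session the call would look like `rebind_to_session(df, spark).collect()`, applied only when `has_live_context(df)` is false.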
Reprinted from: https://www.cnblogs.com/bonelee/p/10976253.html